137105 Commits

Author SHA1 Message Date
Mark Johnston
f161d294b9 Add missing sockaddr length and family validation to various protocols
Several protocol methods take a sockaddr as input.  In some cases the
sockaddr lengths were not being validated, or were validated after some
out-of-bounds accesses could occur.  Add requisite checking to various
protocol entry points, and convert some existing checks to assertions
where appropriate.

Reported by:	syzkaller+KASAN
Reviewed by:	tuexen, melifaro
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29519
2021-05-03 13:35:19 -04:00
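The kind of check this commit describes can be sketched in userland C. This is a minimal illustration only: the `ex_` types, the `EX_AF_INET` constant, and the helper names are hypothetical stand-ins, not the kernel's definitions or the code from the change.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for BSD-style socket addresses. */
struct ex_sockaddr {
	uint8_t sa_len;      /* total length claimed by the caller */
	uint8_t sa_family;   /* address family */
	char    sa_data[14];
};

struct ex_sockaddr_in {
	uint8_t  sin_len;
	uint8_t  sin_family;
	uint16_t sin_port;
	uint32_t sin_addr;
	char     sin_zero[8];
};

#define EX_AF_INET 2     /* illustrative value */

/* Entry point: validate family and length before reading protocol fields. */
static int
ex_check_sockaddr_in(const struct ex_sockaddr *sa)
{
	if (sa->sa_family != EX_AF_INET)
		return (EAFNOSUPPORT);
	if (sa->sa_len != sizeof(struct ex_sockaddr_in))
		return (EINVAL);
	return (0);
}

/* Internal helper: the entry point already validated, so only assert. */
static uint16_t
ex_port_of(const struct ex_sockaddr *sa)
{
	struct ex_sockaddr_in sin;

	assert(ex_check_sockaddr_in(sa) == 0);
	memcpy(&sin, sa, sizeof(sin));
	return (sin.sin_port);
}
```

The split mirrors the message: externally reachable entry points return an error, while helpers that can only be reached after validation assert the invariant instead of re-checking it.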
Elliott Mitchell
a3c7da3d08 kern/intr: declare interrupt vectors unsigned
These should never get values large enough for sign to matter, but one
of them becoming negative could cause problems.

MFC after:	1 week
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D29327
2021-05-03 13:24:30 -04:00
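One way a negative vector could cause problems is an upper-bound-only range check. The sketch below is an assumption about the general hazard, not code from the commit; `EX_NIRQ` and both helpers are hypothetical.

```c
#include <assert.h>   /* used by the checks in the note below */
#include <stdbool.h>

#define EX_NIRQ 16    /* hypothetical interrupt table size */

/* With a signed vector, an upper-bound check alone accepts negative values. */
static bool
ex_in_range_signed(int vec)
{
	return (vec < EX_NIRQ);
}

/* With an unsigned vector, a "negative" value wraps to a huge one and fails. */
static bool
ex_in_range_unsigned(unsigned int vec)
{
	return (vec < EX_NIRQ);
}
```

Here `assert(ex_in_range_signed(-1))` holds, while `assert(!ex_in_range_unsigned((unsigned int)-1))` shows the unsigned variant rejecting the same bit pattern.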
Mark Johnston
243b324f96 devfs: Avoid comparison with an uninitialized var in devfs_fp_check()
devvn_refthread() will initialize *devp only if it succeeds, so check for
success before comparing with fp->f_data.  Other devvn_refthread()
callers are careful to do this.

Reported by:	KMSAN
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30068
2021-05-03 13:24:30 -04:00
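The pattern described here, an output parameter written only on success, can be sketched as follows. `ex_refdev()` and `ex_fp_check()` are hypothetical names that loosely model the roles of devvn_refthread() and devfs_fp_check(); they are not the kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

static int ex_device = 42;   /* placeholder object standing in for a device */

/* Hypothetical lookup: writes *devp only when it succeeds. */
static int
ex_refdev(int key, int **devp)
{
	assert(devp != NULL);
	if (key < 0)
		return (ENXIO);      /* *devp is deliberately left untouched */
	*devp = &ex_device;
	return (0);
}

/* Check for success first; only then compare the returned pointer. */
static bool
ex_fp_check(int key, const int *expected)
{
	int *dev;                /* uninitialized until ex_refdev() succeeds */

	if (ex_refdev(key, &dev) != 0)
		return (false);      /* never read dev on the failure path */
	return (dev == expected);
}
```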
Mark Johnston
2b2d77e720 VOP_STAT: Provide a default value for va_gen
Some filesystems, e.g., pseudofs and the NFSv3 client, do not provide
one.

Reviewed by:	kib
Reported by:	KMSAN
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30091
2021-05-03 13:24:30 -04:00
Mark Johnston
cdfcfc607a smp: Initialize arg->cpus sooner in smp_rendezvous_cpus_retry()
Otherwise, if !smp_started is true, then smp_rendezvous_cpus_done() will
harmlessly perform an atomic RMW on an uninitialized variable.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-05-03 13:24:30 -04:00
Konstantin Belousov
7cb40543e9 filt_timerexpire: do not iterate over the interval
User-supplied data might make this loop too time-consuming. Divide
directly, and handle both the possibility that we were woken up earlier,
and arithmetic overflows/underflows from the calculation.

Reported and tested by:	pho (previous version)
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30069
2021-05-03 19:49:54 +03:00
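A minimal sketch of replacing the loop with a division, assuming all times share one tick unit. `ex_expirations()` is a hypothetical helper and does not reproduce the actual filt_timerexpire() logic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many expirations have occurred by 'now' for a periodic timer
 * whose next scheduled expiry is 'next', including the one at 'next' itself.
 * Returns 0 if we were woken before 'next'.
 */
static uint64_t
ex_expirations(int64_t now, int64_t next, int64_t interval)
{
	uint64_t delta, n;

	assert(interval > 0);
	if (now < next)
		return (0);          /* woken early: nothing has expired yet */

	/* Unsigned subtraction cannot overflow the way now - next could. */
	delta = (uint64_t)now - (uint64_t)next;
	n = delta / (uint64_t)interval;

	/* Saturate rather than wrap when adding the expiry at 'next'. */
	return (n == UINT64_MAX ? n : n + 1);
}
```

The cost is constant regardless of how small the user-supplied interval is, whereas a loop advancing `next` by `interval` would take a number of iterations proportional to the result.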
Konstantin Belousov
87a64872cd Add ptrace(PT_COREDUMP)
It writes the core of a live stopped process to the file descriptor
provided as an argument.

Based on the initial version from https://reviews.freebsd.org/D29691,
submitted by Michał Górny <mgorny@gentoo.org>.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:18:26 +03:00
Konstantin Belousov
68d311b666 ptracestop: mark threads suspended there with the new TDB_SSWITCH flag
This way threads in ptracestop can be discovered by the debugger.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:18:25 +03:00
Konstantin Belousov
9ebf9100ba ptrace: do not allow for parallel ptrace requests
Set a new P2_PTRACEREQ flag around the request.  Wait for the target
process P2_PTRACEREQ flag to clear before setting ours.

Otherwise, we rely on the moment that the process lock is not dropped
until the stopped target state is important.  This is going to be no
longer true after some future change.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:16:30 +03:00
Konstantin Belousov
54c8baa021 kern_ptrace(): extract code to determine ptrace eligibility into helper
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:48 +03:00
Konstantin Belousov
2bd0506c8d kern_ptrace: change type of proctree_locked to bool
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:48 +03:00
Konstantin Belousov
af928fded0 Add thread_run_flash() helper
It unsuspends a single suspended thread, passed as the argument.
It is up to the caller to arrange for the target thread to suspend later,
since the state of the process is not changed from stopped.  In particular,
the unsuspended thread must not return to userspace, since the boundary code
is not prepared for this situation.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:47 +03:00
Konstantin Belousov
15465a2c25 Add sleepq_remove_nested()
The helper removes the thread from a sleep queue, assuming that it would
need to sleep. The sleepq_remove_nested() function is intended for a quite
special case, where a suspended thread from a traced stopped process is
temporarily unsuspended to do some work on behalf of the debugger in the
target context, and this work might require sleep.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:47 +03:00
Konstantin Belousov
86ffb3d1a0 ELF coredump: define several useful flags for the coredump operations
- SVC_ALL requests dumping all map entries, including those marked as
  non-dumpable
- SVC_NOCOMPRESS disallows compressing the dump regardless of the sysctl
  policy
- SVC_PC_COREDUMP is provided for future use by userspace core dump
  request

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:47 +03:00
Konstantin Belousov
5bc3c61780 imgact_elf: consistently pass flags from coredump down to helper functions
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:47 +03:00
Edward Tomasz Napierala
7818653fd6 cam: fix integer overflow during inquiry
From my understanding this could happen with iSCSI LUNs with
unusually long names.  The bug would make CAM fail to retrieve
the full inquiry data.  Instead of bumping the size of the local
variable, just use a macro.

Reviewed By:	imp, mav
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
X-NetApp-PR:	#50
Differential Revision:	https://reviews.freebsd.org/D29991
2021-05-03 15:20:17 +01:00
Jose Luis Duran
8f1562430f Add Apollo Lake SIO/LPSS UARTs PCI IDs
Add PCI IDs for Intel Apollo Lake Series HSUARTs:

    # pciconf -ll
    drv   selector      class    rev  hdr  vendor device subven subdev
    uart0@pci0:0:24:0:  118000   0b   00   8086   5abc   8086   7270
    uart1@pci0:0:24:1:  118000   0b   00   8086   5abe   8086   7270
    uart2@pci0:0:24:2:  118000   0b   00   8086   5ac0   8086   7270
    uart3@pci0:0:24:3:  118000   0b   00   8086   5aee   8086   7270

NB (Intel Document Number 336256-004US):
1. The E3900 and A3900 Series Processors support four LPSS_UART ports,
   while the N- and J- Series Processors support only LPSS_UART [2:1]
   ports.
2. The LPSS_UART1 port is dedicated for discrete Global Navigation
   Satellite System (GNSS).  This port can be used for generic UART
   functionality if GNSS is not used.
3. The LPSS_UART2 port is dedicated for host OS debug.
4. The LPSS_UART0 and LPSS_UART3 ports are for generic UART functionality.
5. Only UART [1:0] ports support DMA.

PR:	255556
Submitted by:	Jose Luis Duran <jlduran@gmail.com>
MFC after:	1 week
2021-05-03 14:38:52 +03:00
Jose Luis Duran
5b8b6b26e4 uart_bus_pci.c: Style
Wrap long lines, use tab instead of spaces.

PR:	255556
Submitted by:	Jose Luis Duran <jlduran@gmail.com>
MFC after:	1 week
2021-05-03 14:38:52 +03:00
Jose Luis Duran
0ea8a7f36d ifconfig: Minor documentation fix
Fix what appears to have been a small copy/paste typo in ifconfig(8)'s
documentation (man page and header file).

Not that it matters anymore.

Reference: Table I-2 in IEEE Std 802.1Q-2014.

PR:	255557
Submitted by:	Jose Luis Duran <jlduran@gmail.com>
MFC after:	1 week
2021-05-03 14:38:52 +03:00
Andrew Turner
0ec205197b Also enable IPIs on 32-bit arm
This was missed in 2420f6a

Reported by:	tuexen, imp
2021-05-03 08:36:57 +00:00
Michael Tuexen
8b3d0f6439 sctp: improve address list scanning
If the alternate address has to be removed, force the stack to
find a new one, if it is still needed.

MFC after:	3 days
2021-05-03 02:50:05 +02:00
Michael Tuexen
a89481d328 sctp: improve restart handling
This fixes in particular a possible use after free bug reported by
Anatoly Korniltsev and Taylor Brandstetter for the userland stack.

MFC after:	3 days
2021-05-03 02:20:24 +02:00
Alexander Motin
655c200cc8 Fix build after 5f2e183505.
2021-05-02 20:07:38 -04:00
Alexander Motin
2760658b21 Improve UMA cache reclamation.
When estimating working set size, measure only allocation batches, not free
batches.  Allocation and free patterns can be very different.  For example,
ZFS on vm_lowmem event can free to UMA few gigabytes of memory in one call,
but it does not mean it will request the same amount back that fast too, in
fact it won't.

Update working set size on every reclamation call, shrinking caches faster
under pressure.  Lack of this caused repeating vm_lowmem events squeezing
more and more memory out of real consumers only to make it stuck in UMA
caches.  I saw ZFS drop its ARC size by half before the previous algorithm,
after a periodic WSS update, decided to reclaim UMA caches.

Introduce voluntary reclamation of UMA caches not used for a long time. For
each zdom track longterm minimal cache size watermark, freeing some unused
items every UMA_TIMEOUT after first 15 minutes without cache misses. Freed
memory can get better use by other consumers.  For example, ZFS won't grow
its ARC unless it sees free memory, since it does not know it is not really
used.  And even if memory is not really needed, periodic free during
inactivity periods should reduce its fragmentation.

Reviewed by:	markj, jeff (previous version)
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D29790
2021-05-02 19:45:23 -04:00
Michael Tuexen
5f2e183505 sctp: improve error handling in INIT/INIT-ACK processing
When processing INIT and INIT-ACK information, also during
COOKIE processing, delete the current association, when it
would end up in an inconsistent state.

MFC after:	3 days
2021-05-02 22:41:35 +02:00
Rick Macklem
4f592683c3 copy_file_range(2): improve copying of a large hole to EOF
PR#255523 reported that a file copy for a file with a large hole
to EOF on ZFS ran slowly over NFSv4.2.
The problem was that vn_generic_copy_file_range() would
loop around reading the hole's data and then see it is all
0s. It was coded this way since UFS always allocates a data
block near the end of the file, such that a hole to EOF never exists.

This patch modifies vn_generic_copy_file_range() to check for a
ENXIO returned from VOP_IOCTL(..FIOSEEKDATA..) and handle that
case as a hole to EOF. asomers@ confirms that it works for his
ZFS test case.

PR:	255523
Tested by:	asomers
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30076
2021-05-02 16:04:27 -07:00
Andrew Turner
2420f6aed9 Enable IPIs on CPU 0 on arm and arm64
Not all interrupt controllers enable IPIs by default, as the Arm
GIC specs make it an implementation-defined option. At least two
hypervisors have also previously masked the IPIs on boot.

As we already enable these IPIs on the non-boot CPUs it is expected
this is a safe operation.

Differential Revision:	https://reviews.freebsd.org/D26975
2021-05-02 07:43:34 +00:00
Andrew Turner
fe38224977 Implement bus_map_resource on arm64
This will allow us to allocate an unmapped memory resource, then
later map it with a specific memory attribute.

This is also needed for virtio with the modern PCI attachment.

Reviewed by:	kib (via D29723)
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D29694
2021-05-02 07:35:16 +00:00
Justin Hibbits
be48fe6000 powerpc/xive: Remove POWER9 DD1 IRQ bits
The OPAL_XIVE_*_VIA_IFW flags are used only for POWER9 DD1, which we
don't support.

Noticed while perusing Linux and skiboot git logs.
2021-05-01 16:18:02 -05:00
Andrew Turner
c78ad207ba Switch the EFI virtual address to a uint64_t
It is defined as a uint64_t in the UEFI spec. As it's not used as a
pointer by the kernel follow this and define it as the same in the
kernel.

Reviewed by:	kib, manu, imp
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D29759
2021-05-01 06:01:20 +00:00
Andrew Turner
2abd4f8581 Add a way to map arm64 non-posted device memory
On arm64 we currently use a non-posted write for device memory, however
we should move to use posted writes. This is expected to work on most
hardware, however we will need to support a non-posted option for some
broken hardware.

Reviewed by:	imp, manu, bcr (manpage)
Differential Revision:	https://reviews.freebsd.org/D29722
2021-05-01 06:01:20 +00:00
Justin Hibbits
a6ca7519f8 powerpc64: Optimize radix trap handling a little more
Summary:
Since PCPU can live in a GPR for a while longer, let it, rather than
re-getting it in yet another register.  MFSPR is an expensive operation,
12 clock latency on POWER9, so the fewer operations we need, the better.

Since the check is tightly coupled to the fetch, by reducing the number
of fetch+check, we reduce the stalls, and improve the performance
marginally.  Buildworld was measured at a ~5-7% improvement on a single
run.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D30003
2021-04-30 19:58:11 -05:00
Marcin Wojtas
e245ee2774 gicv3_its: Flush cache after allocating ITT memory
It has to be zeroed before committing it to the device.
We do that by allocating it with M_ZERO, but there was no
memory barrier or cache flush to ensure the device sees it zeroed.
This fixes MSIX on LS1028A SoC.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30033
2021-05-01 00:58:26 +02:00
Eric van Gyzen
2f32a971b7 Wait longer for a previous IPI to be sent
When sending an IPI, if a previous IPI is still pending delivery,
native_lapic_ipi_vectored() waits for the previous IPI to be sent.
We've seen a few inexplicable panics with the current timeout of 50 ms.
Increase the timeout to 1 second and make it tunable.

No hardware specification mentions a timeout in this case; I checked
the Intel SDM, Intel MP spec, and Intel x2APIC spec.  Linux and illumos
wait forever.  In Linux, see __default_send_IPI_shortcut() in
arch/x86/kernel/apic/ipi.c.  In illumos, see apic_send_ipi() in
usr/src/uts/i86pc/io/pcplusmp/apic_common.c.  However, misbehaving hardware
could hang the system if we wait forever.

Reviewed by:	mav kib
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D29942
2021-04-30 13:32:29 -05:00
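A bounded wait with a tunable budget might look like the sketch below. Everything here is hypothetical: `ex_ipi_wait_us`, the polling step, and the callbacks stand in for the real tunable, delay primitive, and APIC status read, none of which are shown in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tunable: total wait budget in microseconds (1 second). */
static uint64_t ex_ipi_wait_us = 1000000;

#define EX_POLL_STEP_US 5    /* illustrative polling granularity */

/*
 * Poll 'pending' until the previous IPI is reported sent or the budget is
 * exhausted.  Returns true if the previous IPI was sent in time.
 */
static bool
ex_wait_ipi_idle(bool (*pending)(void *), void (*delay_us)(unsigned int),
    void *ctx)
{
	assert(pending != NULL && delay_us != NULL);
	for (uint64_t waited = 0; waited <= ex_ipi_wait_us;
	    waited += EX_POLL_STEP_US) {
		if (!pending(ctx))
			return (true);
		delay_us(EX_POLL_STEP_US);
	}
	return (false);          /* caller decides how to handle the timeout */
}

/* Fake hardware for experimentation: stays pending for *ctx more polls. */
static bool
ex_fake_pending(void *ctx)
{
	int *left = ctx;

	if (*left == 0)
		return (false);
	(*left)--;
	return (true);
}

static void
ex_fake_delay(unsigned int us)
{
	(void)us;                /* no real delay in the fake */
}
```

A finite budget keeps misbehaving hardware from hanging the system, which is the trade-off the message contrasts with waiting forever.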
Konstantin Belousov
619fe09586 ioccom: define ioctl cmd value that can never be valid
Its use is for cases where some filler is needed for cmd, or we need an
indication that no cmd was supplied, and so on.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29935
2021-04-30 17:43:45 +03:00
Konstantin Belousov
2082565798 O_PATH: disable kqfilter for fifos
A filter on fifos is a real filter for the object, and not a filesystem
events filter like EVFILT_VNODE.

Reported by:	markj using syzkaller
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-04-30 17:43:45 +03:00
Konstantin Belousov
72a42ec63b amd64: disable LA57 by default for now
Testing on real hardware uncovered an issue, and since I do not have
access to the machine, disable until the bug can be fixed.

Reported by:	"Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-04-30 17:43:45 +03:00
Konstantin Belousov
21fc6a2a10 amd64: invalidate TLB between page table update and access
When setting up trampoline mapping for LA57 switcher, it is possible
that TLB still has some random mapping at that address.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-04-30 17:43:45 +03:00
Michael Tuexen
e010d20032 sctp: update the vtag for INIT and INIT-ACK chunks
This is needed in case of responding with an ABORT to an INIT-ACK.
2021-04-30 13:33:16 +02:00
Marcin Wojtas
cd945dc08a iflib: Take iri_pad into account when processing small frames
Drivers can specify padding of received frames with iri_pad field.
This can be used to enforce ip alignment by hardware.
Iflib ignored that padding when processing small frames,
which rendered this feature inoperable.
I found it while writing a driver for a NIC that can ip align
received packets. Note that this doesn't change behavior of existing
drivers as they all set iri_pad to 0.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: gallatin
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30009
2021-04-30 12:46:17 +02:00
Michael Tuexen
eb79855920 sctp: fix SCTP_PEER_ADDR_PARAMS socket option
Ignore spp_pathmtu if it is 0, when setting the IPPROTO_SCTP level
socket option SCTP_PEER_ADDR_PARAMS as required by RFC 6458.

MFC after:	1 week
2021-04-30 12:31:09 +02:00
Kristof Provost
055c55abef pf: Fix IP checksum on reassembly
If we reassemble a packet we modify the IP header (to set the length and
remove the fragment offset information), but we failed to update the
checksum. On certain setups (mostly where we did not re-fragment again
afterwards) this could lead to us sending out packets with incorrect
checksums.

PR:		255432
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30026
2021-04-30 08:19:46 +02:00
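The header fixup this message describes can be sketched with the RFC 1071 Internet checksum. This is an illustration under stated assumptions: a 20-byte IPv4 header without options, accessed as raw bytes, with hypothetical helper names. It is not pf's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 checksum over 'len' bytes of big-endian 16-bit words. */
static uint16_t
ex_in_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;

	assert(len % 2 == 0);
	for (size_t i = 0; i < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)        /* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/*
 * After reassembly: set the total length, clear the MF bit and fragment
 * offset, then recompute the checksum over the modified header.
 */
static void
ex_fixup_reassembled(uint8_t hdr[20], uint16_t total_len)
{
	uint16_t ck;

	hdr[2] = (uint8_t)(total_len >> 8);
	hdr[3] = (uint8_t)(total_len & 0xff);
	hdr[6] &= 0xc0;          /* keep reserved and DF bits, drop MF + offset */
	hdr[7] = 0;
	hdr[10] = 0;             /* checksum field must be zero while summing */
	hdr[11] = 0;
	ck = ex_in_cksum(hdr, 20);
	hdr[10] = (uint8_t)(ck >> 8);
	hdr[11] = (uint8_t)(ck & 0xff);
}
```

Skipping the final recomputation leaves the old checksum in place over a changed length and offset, which matches the symptom described when the packet is not re-fragmented afterwards.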
Michael Tuexen
eecdf5220b sctp: use RTO.Initial of 1 second as specified in RFC 4960bis
2021-04-30 00:45:56 +02:00
Michael Tuexen
9de7354bb8 sctp: improve consistency in handling chunks with wrong size
Just skip the chunk, if no other handling is required by the
specification.
2021-04-28 18:11:06 +02:00
Mark Johnston
20e3b9d8bd kasan: Use vm_offset_t for the first parameter to kasan_shadow_map()
No functional change intended.

Sponsored by:	The FreeBSD Foundation
2021-04-29 11:39:02 -04:00
Yinlong Lu
ee8b757a94 ipmi: support getting address from EFI
The original implementation only supports getting the address from legacy
BIOS (by searching for the SMBIOS_SIG pattern in a fixed address space).

Try to get the SMBIOS table from EFI through efirt (EFI Runtime Services)
first.  Continue to search in the legacy BIOS if a NULL address is
returned from EFI.

This way the ipmi function supports both legacy BIOS and UEFI systems.

Reviewed by:	dab, vangyzen
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D30007
2021-04-29 05:20:58 -05:00
Kristof Provost
eaabed8ac4 pf: Trivial typo fix
PV -> PF

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-04-29 15:25:07 +02:00
Navdeep Parhar
b9820bca18 cxgbe(4): Do not panic when tx is called with invalid checksum requests.
There is no need to panic in if_transmit if the checksums requested are
inconsistent with the frame being transmitted.  This typically indicates
that the kernel and driver were built with different INET/INET6 options,
or there is some other kernel bug.  The driver should just throw away
the requests that it doesn't understand and move on.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-04-28 14:04:53 -07:00
Alexander V. Chernikov
41ce0e34ea [fib algo] Update fib_gen counter under FIB_MOD_LOCK.
MFC after:	3 days
2021-04-28 20:23:03 +00:00
Mateusz Guzik
074abaccfa cache: remove incomplete lockless lockout support during resize
This is already properly handled thanks to 2 step hash replacement.
2021-04-28 19:53:25 +00:00