267803 Commits

Author SHA1 Message Date
Mark Johnston
693c9516fa busdma: Add KMSAN integration
Sanitizer instrumentation of course cannot automatically update shadow
state when devices write to host memory.  KMSAN thus hooks into busdma,
both to update shadow state after a device write, and to verify that the
kernel does not publish uninitialized bytes to devices.

To implement this, when KMSAN is configured, each dmamap embeds a memory
descriptor describing the region currently loaded into the map.
bus_dmamap_sync() uses the operation flags to determine whether to
validate the loaded region or to mark it as initialized in the shadow
map.

Note that in cases where the amount of data written is less than the
buffer size, the entire buffer is marked initialized even when it is
not.  For example, if a NIC writes a 128B packet into a 2KB buffer, the
entire buffer will be marked initialized, but subsequent accesses past
the first 128 bytes are likely caused by bugs.
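
The sync-time behaviour described above can be sketched as a small
user-space model.  This is only an illustration under stated
assumptions, not the kernel implementation: struct toy_dmamap, the
TOY_SYNC_* flags and the toy_*() functions are hypothetical names, and
the shadow here uses one byte per data byte for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for busdma sync operations; not the real flags. */
#define TOY_SYNC_POSTREAD 0x1 /* a device has finished writing host memory */
#define TOY_SYNC_PREWRITE 0x2 /* the host is about to hand memory to a device */

#define TOY_BUFSZ 2048

/* Toy map: embeds a description of the loaded region plus its shadow. */
struct toy_dmamap {
	unsigned char shadow[TOY_BUFSZ]; /* 0 = initialized, nonzero = not */
	size_t loaded_len;               /* size of the currently loaded region */
};

/* Load a fresh, entirely uninitialized buffer of len bytes into the map. */
void
toy_load(struct toy_dmamap *map, size_t len)
{
	assert(len <= TOY_BUFSZ);
	memset(map->shadow, 0xff, sizeof(map->shadow));
	map->loaded_len = len;
}

/*
 * Use the operation flags to decide whether to validate the loaded
 * region or to mark it initialized.  Returns false if the sync would
 * publish uninitialized bytes to the device.
 */
bool
toy_sync(struct toy_dmamap *map, int ops)
{
	size_t i;

	if (ops & TOY_SYNC_PREWRITE) {
		for (i = 0; i < map->loaded_len; i++) {
			if (map->shadow[i] != 0)
				return (false);
		}
	}
	if (ops & TOY_SYNC_POSTREAD) {
		/* The whole region is marked, however little was written. */
		memset(map->shadow, 0, map->loaded_len);
	}
	return (true);
}
```

In this model a 2KB region that receives a 128B packet ends up fully
initialized in the shadow after toy_sync(map, TOY_SYNC_POSTREAD), which
is the imprecision the note above describes.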

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31338
2021-08-10 21:27:54 -04:00
Mark Johnston
3a1802fef4 busdma: Add an internal BUS_DMA_FORCE_MAP flag to x86 bounce_busdma
Use this flag to indicate that busdma should allocate a map structure
even if no bouncing is required to satisfy the tag's constraints.  This
will be used for KMSAN.

Also fix a memory leak that can occur if the kernel fails to allocate
bounce pages in bounce_bus_dmamap_create().

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31338
2021-08-10 21:27:54 -04:00
Mark Johnston
1263022e39 Add a GENERIC-KMSAN kernel configuration for amd64
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Mark Johnston
ce2609947c kmsan: Add a manual page
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Mark Johnston
b0f71f1bc5 amd64: Add MD bits for KMSAN
Interrupt and exception handlers must call kmsan_intr_enter() prior to
calling any C code.  This is because the KMSAN runtime maintains some
TLS in order to track initialization state of function parameters and
return values across function calls.  Then, to ensure that this state is
kept consistent in the face of asynchronous kernel-mode exceptions, the
runtime uses a stack of TLS blocks, and kmsan_intr_enter() and
kmsan_intr_leave() push and pop that stack, respectively.

Use these functions in amd64 interrupt and exception handlers.  Note
that handlers for user->kernel transitions need not be annotated.

Also ensure that trap frames pushed by the CPU and by handlers are
marked as initialized before they are used.
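
The push/pop discipline can be illustrated with a user-space sketch.
It assumes a fixed-depth stack and a tiny per-block parameter shadow;
struct toy_thread, struct toy_tls_block and the toy_intr_*() functions
are hypothetical names and do not reflect the real runtime's layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NPARAMS  4 /* shadow slots for function parameters */
#define TOY_MAXDEPTH 8 /* maximum nesting of interrupted contexts */

/* Shadow state tracked across function calls. */
struct toy_tls_block {
	unsigned char param_shadow[TOY_NPARAMS];
};

/* Per-thread stack of blocks; stack[depth] is the active one. */
struct toy_thread {
	struct toy_tls_block stack[TOY_MAXDEPTH];
	size_t depth;
};

struct toy_tls_block *
toy_tls_current(struct toy_thread *td)
{
	return (&td->stack[td->depth]);
}

/* Push: the handler gets a clean block; the interrupted one is preserved. */
void
toy_intr_enter(struct toy_thread *td)
{
	assert(td->depth + 1 < TOY_MAXDEPTH);
	td->depth++;
	memset(toy_tls_current(td), 0, sizeof(struct toy_tls_block));
}

/* Pop: the interrupted context's block becomes active again. */
void
toy_intr_leave(struct toy_thread *td)
{
	assert(td->depth > 0);
	td->depth--;
}
```

Whatever the handler writes into its own block is discarded on leave,
so the interrupted code sees its parameter shadow unchanged.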

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31467
2021-08-10 21:27:53 -04:00
Mark Johnston
8978608832 amd64: Populate the KMSAN shadow maps and integrate with the VM
- During boot, allocate PDP pages for the shadow maps.  The region above
  KERNBASE is currently not shadowed.
- Create a dummy shadow for the vm page array.  For now, this array is
  not protected by the shadow map to help reduce kernel memory usage.
- Grow shadows when growing the kernel map.
- Increase the default kernel stack size when KMSAN is enabled.  As with
  KASAN, sanitizer instrumentation appears to create stack frames large
  enough that the default value is not sufficient.
- Disable UMA's use of the direct map when KMSAN is configured.  KMSAN
  cannot validate the direct map.
- Disable unmapped I/O when KMSAN is configured.
- Lower the limit on paging buffers when KMSAN is configured.  Each
  buffer has a static MAXPHYS-sized allocation of KVA, which in turn
  eats 2*MAXPHYS of space in the shadow map.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31295
2021-08-10 21:27:53 -04:00
Mark Johnston
5dda15adbc kern: Ensure that thread-local KMSAN state is available
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Mark Johnston
a422084abb Add the KMSAN runtime
KMSAN enables the use of LLVM's MemorySanitizer in the kernel.  This
enables precise detection of uses of uninitialized memory.  As with
KASAN, this feature has substantial runtime overhead and is intended to
be used as part of some automated testing regime.

The runtime maintains a pair of shadow maps.  One is used to track the
state of memory in the kernel map at bit-granularity: a bit in the
kernel map is initialized when the corresponding shadow bit is clear,
and is uninitialized otherwise.  The second shadow map stores
information about the origin of uninitialized regions of the kernel map,
simplifying debugging.

KMSAN relies on being able to intercept certain functions which cannot
be instrumented by the compiler.  KMSAN thus implements interceptors
which manually update shadow state and in some cases explicitly check
for uninitialized bytes.  For instance, all calls to copyout() are
subject to such checks.

The runtime exports several functions which can be used to verify the
shadow map for a given buffer.  Helpers provide the same functionality
for a few structures commonly used for I/O, such as CAM CCBs, BIOs and
mbufs.  These are handy when debugging a KMSAN report whose
proximate and root causes are far away from each other.
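
As a rough illustration of the bit-granular shadow described above,
the following user-space sketch keeps one shadow bit per data bit,
with a clear bit meaning "initialized".  It is a toy model under
stated assumptions: struct toy_kmsan and the toy_*() functions are
hypothetical names, there is no origin map, and toy_check() merely
mimics the kind of test an interceptor such as the one for copyout()
might perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEMSZ 64

struct toy_kmsan {
	uint8_t mem[TOY_MEMSZ];    /* stands in for the kernel map */
	uint8_t shadow[TOY_MEMSZ]; /* set bit = uninitialized data bit */
};

/* A fresh allocation: arbitrary contents, every bit uninitialized. */
void
toy_alloc(struct toy_kmsan *k)
{
	memset(k->mem, 0xaa, sizeof(k->mem));
	memset(k->shadow, 0xff, sizeof(k->shadow));
}

/* Store only the bits selected by mask and clear their shadow bits. */
void
toy_store(struct toy_kmsan *k, size_t off, uint8_t val, uint8_t mask)
{
	assert(off < TOY_MEMSZ);
	k->mem[off] = (uint8_t)((k->mem[off] & (uint8_t)~mask) | (val & mask));
	k->shadow[off] &= (uint8_t)~mask;
}

/* Explicit check: true only if every bit in [off, off + len) is initialized. */
bool
toy_check(const struct toy_kmsan *k, size_t off, size_t len)
{
	size_t i;

	assert(off <= TOY_MEMSZ && len <= TOY_MEMSZ - off);
	for (i = 0; i < len; i++) {
		if (k->shadow[off + i] != 0)
			return (false);
	}
	return (true);
}
```

Writing only the low nibble of a byte leaves its high nibble flagged,
so a later check of that byte still fails until the remaining bits are
stored.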

Obtained from:	NetBSD
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Mark Johnston
f95f780ea4 amd64: Define KVA regions for KMSAN shadow maps
KMSAN requires two shadow maps, each one-to-one with the kernel map.
Allocate regions of the kernel's PML4 page for them.  Add functions to
create mappings in the shadow map regions; these will be used by the
KMSAN runtime.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31295
2021-08-10 21:27:52 -04:00
Mark Johnston
30d00832d7 conf: Add a KMSAN kernel option
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:22:12 -04:00
Mark Johnston
4fd450a87d amd64 pmap: Pre-set PG_M on 2MB KASAN shadow map entries
Also remove a redundant assertion in pmap_kasan_enter().

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31295
2021-08-10 21:22:07 -04:00
Mark Johnston
805c3af898 build.7: Describe the default value for LOCAL_MODULES
Suggested by:	jhb
MFC after:	1 week
2021-08-10 21:18:34 -04:00
Alexander Motin
b776de6796 Mark some sysctls as CTLFLAG_MPSAFE.
MFC after:	2 weeks
2021-08-10 20:44:27 -04:00
Alexander Motin
c2da954203 geom(4): Mark all sysctls as CTLFLAG_MPSAFE.
This code has not used the Giant lock for a very long time.

MFC after:	2 weeks
2021-08-10 20:18:46 -04:00
Alexander Motin
303477d325 cam(4): Mark all sysctls as CTLFLAG_MPSAFE.
This code has not used the Giant lock for a very long time.

MFC after:	2 weeks
2021-08-10 20:07:19 -04:00
Alexander Motin
94feb1f1eb ntb_hw_intel(4): Add CTLFLAG_MPSAFE flags.
I should have added those in 50f16247a1.

MFC after:	2 weeks
2021-08-10 20:07:19 -04:00
Warner Losh
9339e7c0bd rtsx: Fix wakeup race similar to sdhci one fixed in 35547df5c7
rtsx copied code from sdhci, and has the same wakeup race bug that was
fixed in 35547df5c7, so apply a similar fix here.

Sponsored by:		Netflix
2021-08-10 17:10:25 -06:00
Scott Long
bd9e461cf7 Address the reported mmc serialization issue.
Reset the mmc owner before calling the bridge release host callback.

Some people are hitting the "mmc: host bridge didn't serialize us." panic as
the bridge is released before the mmc owner is reset.

Submitted by: luiz
Sponsored by:   Rubicon Communications, LLC ("Netgate")
2021-08-10 22:41:23 +00:00
Scott Long
35547df5c7 Call wakeup() with the lock held to avoid missed wakeup races.
Submitted by: luiz
Sponsored by: Rubicon Communications, LLC ("Netgate")
2021-08-10 22:36:38 +00:00
Warner Losh
5dedd2517d devmatch: Ignore the pnp fields tagged as ignore ('#')
When matching entries, we should ignore those with a name of '#'.  This
is the standard way to skip elements; such entries need to be present so
that the fields that are observed have the proper offsets.  No bus has a
pnp attribute of '#', and that is now disallowed for future buses.
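
The matching rule can be sketched in user space as follows.  This is a
simplified illustration, not devmatch's actual code: struct toy_field
and the toy_*() functions are hypothetical, and all values are compared
as plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One name/value pair, used for both table entries and device attributes. */
struct toy_field {
	const char *name;
	const char *value;
};

/* Return the device's value for the named attribute, or NULL if absent. */
const char *
toy_lookup(const struct toy_field *attrs, size_t nattrs, const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < nattrs; i++) {
		if (strcmp(attrs[i].name, name) == 0)
			return (attrs[i].value);
	}
	return (NULL);
}

/* An entry matches if every field not named '#' agrees with the device. */
bool
toy_entry_matches(const struct toy_field *entry, size_t nentry,
    const struct toy_field *dev, size_t ndev)
{
	const char *v;
	size_t i;

	for (i = 0; i < nentry; i++) {
		/* '#' fields only keep the offsets right; never compare them. */
		if (strcmp(entry[i].name, "#") == 0)
			continue;
		v = toy_lookup(dev, ndev, entry[i].name);
		if (v == NULL || strcmp(v, entry[i].value) != 0)
			return (false);
	}
	return (true);
}
```

An entry containing a '#' placeholder between its vendor and device
fields therefore still matches a device that has no '#' attribute.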

Sponsored by:		Netflix
Reviewed by:		kbowling
Differential Revision:	https://reviews.freebsd.org/D31482
2021-08-10 15:47:55 -06:00
John Baldwin
c7bb0f47f7 nfs tls: Update for SSL_OP_ENABLE_KTLS.
Upstream OpenSSL (and the KTLS backport) have switched to an opt-in
option (SSL_OP_ENABLE_KTLS) in place of opt-out modes
(SSL_MODE_NO_KTLS_TX and SSL_MODE_NO_KTLS_RX) for controlling kernel
TLS.

Reviewed by:	rmacklem
Sponsored by:	Netflix
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31445
2021-08-10 14:18:43 -07:00
Ed Maste
38911b3c2c ar: provide error exit status upon failure
Previously ar and ranlib returned with exit status 0 (success) in the
case of a missing file or other error.  Update to use error handling
similar to that added by ELF Tool Chain after that project forked
FreeBSD's ar.

PR:		PR257599 [exp-run]
Reported by:	Shawn Webb, gehmehgeh (on HardenedBSD IRC)
Reviewed by:	markj
Obtained from:	elftoolchain
MFC after:	2 months
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31402
2021-08-10 17:08:10 -04:00
Alexander Motin
c6902e7796 ntb_transport(4): Mark callouts MP-safe.
The only thing around NTB using Giant lock is NewBus, and these callouts
have nothing to do with it.

MFC after:	2 weeks
2021-08-10 16:37:21 -04:00
Alexander Motin
50f16247a1 ntb_hw_intel(4): Remove CTLFLAG_NEEDGIANT flags.
Most of the sysctls just read hardware registers.  They don't need
any locking.

MFC after:	2 weeks
2021-08-10 16:37:21 -04:00
Mitchell Horne
d78896e46f pmc(3): remove Pentium-related man pages and references
Support for Pentium events was removed completely in e92a1350b5.

Don't bump .Dd where we are just removing xrefs.

Reviewed by:	emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31423
2021-08-10 17:19:58 -03:00
Kevin Bowling
12e8addd32 e1000: rctl/srrctl buffer size init, rfctl fix
Simplify the setup of srrctl.BSIZEPKT on igb class NICs.
Improve the setup of rctl.BSIZE on lem and em class NICs.
Don't try to touch rfctl on lem class NICs.
Manipulate rctl.BSEX correctly on lem and em class NICs.

Approved by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31457
2021-08-10 12:47:22 -07:00
Tony Nguyen
6bc61d22c4 Run arc_evict thread at higher priority
Run the arc_evict thread at a higher priority, nice=0, to give it more
CPU time, which can improve performance for workloads with heavy ARC
eviction activity.

On mixed read/write and sequential read workloads, I've seen between
10-40% better performance.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #12397
2021-08-10 11:36:26 -06:00
Mark Johnston
e7a13643b1 build.7: Document LOCAL_MODULES and LOCAL_MODULES_DIR
Reviewed by:	0mp, imp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31461
2021-08-10 11:43:02 -04:00
Piotr Kubaj
8f642f797a share/man/man7/arch.7: bump date
2021-08-10 17:04:51 +02:00
Piotr Kubaj
ecb3f90dd1 share/man/man7/arch.7: powerpc64 support appeared in 9.0-RELEASE
Differential revision:	https://reviews.freebsd.org/D31013
Approved by:	eadler, 0mp
2021-08-10 16:33:29 +02:00
Alan Somers
518e697f2a [skip ci] unix(4): LOCAL_PEERCRED works on SOCK_SEQPACKET, too.
MFC after:	2 weeks
Reviewed By:	dchagin
Differential Revision: https://reviews.freebsd.org/D31456
2021-08-10 07:31:09 -06:00
Konstantin Belousov
4a5a67fe67 ip(4): Mention IP_IPSEC_POLICY ip-level socket option
Text is literally taken from NetBSD ip(4).

Sponsored by:	NVIDIA Networking
MFC after:	3 days
2021-08-10 03:46:49 +03:00
Konstantin Belousov
ba3896e169 ipsec_set_policy(3): fix sentence
Sponsored by:	NVIDIA Networking
MFC after:	3 days
2021-08-10 03:46:35 +03:00
Konstantin Belousov
8b000bf5bc netipsec/key.c: Use ANSI C definition for key_random()
Sponsored by:	NVIDIA Networking
MFC after:	3 days
2021-08-10 03:46:24 +03:00
Konstantin Belousov
fd4751b389 netipsec/keydb.h: fix typo
Sponsored by:	NVIDIA Networking
MFC after:	3 days
2021-08-10 03:45:36 +03:00
Kevin Bowling
015075f383 e1000: Fix lem/em UDP rx csum offload
Rebase on igb code and unify lem/em implementations.

PR:		257642
Reported by:	Nick Reilly <nreilly@blackberry.com>
Reviewed by:	karels, emaste
Tested by:	Nick Reilly <nreilly@blackberry.com>
Approved by:	grehan
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31449
2021-08-09 14:29:31 -07:00
Jessica Clarke
98138bbde0 riscv: Fix pmap_alloc_l2 when it should allocate a new L1 entry
The current code checks the RWX bits are 0 but does not check the V bit
is non-zero, meaning not-yet-allocated L1 entries that are still zero
are regarded as being allocated. This is likely due to copying the arm64
code that checks ATTR_DESC_MASK is L1_TABLE, which encompasses both the
type and the validity in a single field, and erroneously translating
that to a check of just PTE_RWX being 0 to indicate non-leaf, forgetting
about the V bit. This then results in the following panic:

    panic: Fatal page fault at 0xffffffc0005cf292: 0x00000000000050
    cpuid = 1
    time = 1628379581
    KDB: stack backtrace:
    db_trace_self() at db_trace_self
    db_trace_self_wrapper() at db_trace_self_wrapper+0x38
    kdb_backtrace() at kdb_backtrace+0x2c
    vpanic() at vpanic+0x148
    panic() at panic+0x2a
    page_fault_handler() at page_fault_handler+0x1ba
    do_trap_supervisor() at do_trap_supervisor+0x7a
    cpu_exception_handler_supervisor() at
    cpu_exception_handler_supervisor+0x70
    --- exception 13, tval = 0x50
    pmap_enter_l2() at pmap_enter_l2+0xb2
    pmap_enter_object() at pmap_enter_object+0x15e
    vm_map_pmap_enter() at vm_map_pmap_enter+0x228
    vm_map_insert() at vm_map_insert+0x4ec
    vm_map_find() at vm_map_find+0x474
    vm_map_find_min() at vm_map_find_min+0x52
    vm_mmap_object() at vm_mmap_object+0x1ba
    vn_mmap() at vn_mmap+0xf8
    kern_mmap() at kern_mmap+0x4c4
    sys_mmap() at sys_mmap+0x38
    do_trap_user() at do_trap_user+0x208
    cpu_exception_handler_user() at cpu_exception_handler_user+0x72
    --- exception 8, tval = 0x1dd

Instead, we should just check the V bit, as on amd64, and assert that
any valid L1 entries are not leaves, since an L1 leaf would render the
entire range allocated and thus we should not have attempted to map that
VA in the first place.
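
The difference between the two checks can be shown with a small
sketch.  The TOY_PTE_* values mirror the V, R, W and X bit positions
from the RISC-V privileged specification; the function names are
hypothetical and this is not the pmap code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTE_V   0x001 /* valid */
#define TOY_PTE_R   0x002 /* readable */
#define TOY_PTE_W   0x004 /* writable */
#define TOY_PTE_X   0x008 /* executable */
#define TOY_PTE_RWX (TOY_PTE_R | TOY_PTE_W | TOY_PTE_X)

/* Old logic: "RWX == 0 means a table pointer", which a zero entry passes. */
bool
toy_l1_is_table_buggy(uint64_t l1e)
{
	return ((l1e & TOY_PTE_RWX) == 0);
}

/* A leaf is any entry with at least one of R, W or X set. */
bool
toy_l1_is_leaf(uint64_t l1e)
{
	return ((l1e & TOY_PTE_RWX) != 0);
}

/*
 * Fixed logic: check only the V bit to decide whether a new L2 table
 * must be allocated, and assert that a valid entry is not a leaf.
 */
bool
toy_l1_needs_alloc(uint64_t l1e)
{
	if ((l1e & TOY_PTE_V) == 0)
		return (true);
	assert(!toy_l1_is_leaf(l1e));
	return (false);
}
```

For an all-zero entry the old check reports an existing table, while
the V-bit check correctly asks for an allocation.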

Reported by:	David Gilbert <dgilbert@daveg.ca>
MFC after:	1 week
Reviewed by:	markj, mhorne
Differential Revision:	https://reviews.freebsd.org/D31460
2021-08-09 20:28:37 +01:00
Scott Long
27b8dd594d Remove duplicate entry for arm/mv/armada38x/armada38x_rtc.c
Sponsored by:   Rubicon Communications, LLC ("Netgate")
2021-08-09 18:57:50 +00:00
Mark Johnston
41335c6b7f vmm: Make iommu ops tables const
While here, use designated initializers and rename some AMD iommu method
implementations to match the corresponding op names.  No functional
change intended.
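
For readers unfamiliar with the pattern, a const ops table built with
designated initializers looks roughly like the sketch below.  The
struct layout, method names and toy_iommu_start() are hypothetical and
are not the vmm definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; each member is one method of the backend. */
struct toy_iommu_ops {
	int	(*init)(void);
	void	(*cleanup)(void);
	void	(*enable)(void);
};

static int toy_enabled;

/* Implementations are named after the op they fill in. */
static int
toy_amd_iommu_init(void)
{
	return (0);
}

static void
toy_amd_iommu_enable(void)
{
	toy_enabled = 1;
}

/* const: the table can live in read-only data and cannot be patched. */
const struct toy_iommu_ops toy_iommu_ops_amd = {
	.init = toy_amd_iommu_init,
	.enable = toy_amd_iommu_enable,
	/* .cleanup is not named, so it is implicitly NULL. */
};

/* Callers only ever see a pointer to const. */
int
toy_iommu_start(const struct toy_iommu_ops *ops)
{
	int error;

	assert(ops != NULL);
	if (ops->init == NULL || ops->enable == NULL)
		return (-1);
	error = ops->init();
	if (error != 0)
		return (error);
	ops->enable();
	return (0);
}

int
toy_iommu_is_enabled(void)
{
	return (toy_enabled);
}
```

Naming each member in the initializer keeps the table correct even if
the order of members in the struct later changes.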

Reviewed by:	grehan
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31462
2021-08-09 13:28:27 -04:00
Mark Johnston
e54ae8258d amd64: Fix output operand specs for the stmxcsr and vmread intrinsics
This does not appear to affect code generation, at least with the
default toolchain.

Noticed because incorrect output specifications lead to false positives
from KMSAN, as the instrumentation uses them to update shadow state for
output operands.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31466
2021-08-09 13:28:08 -04:00
Mark Johnston
e0cc566035 kasan.9: Note the header required for kasan_mark()
Sponsored by:	The FreeBSD Foundation
2021-08-09 13:27:52 -04:00
Mark Johnston
663428ea17 nd6: Mark several callouts as MPSAFE
The use of Giant here is vestigial and does not provide any useful
synchronization.  Furthermore, non-MPSAFE callouts can cause the
softclock threads to block waiting for long-running newbus operations to
complete.

Reported by:	mav
Reviewed by:	bz
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31470
2021-08-09 13:27:52 -04:00
Mark Johnston
8ee0826f75 in6: Enter the net epoch in in6_tmpaddrtimer()
We need to do so to safely traverse the ifnet list.

Reviewed by:	bz
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31470
2021-08-09 13:27:52 -04:00
Mark Johnston
eca9ac5a32 vfs: Avoid a comparison with an uninitialized field in setutimes()
Some filesystems, e.g., devfs, do not populate va_birthtime in their
GETATTR implementations.  To handle this, make sure that va_birthtime is
initialized to the quasi-standard value of { VNOVAL, 0 } before calling
VOP_GETATTR.
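
A user-space sketch of the pattern follows.  It assumes the comparison
in question is "is the new modification time earlier than the birth
time"; struct toy_vattr, struct toy_timespec, TOY_VNOVAL and the
toy_*() functions are hypothetical stand-ins, not the VFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_VNOVAL (-1) /* stand-in for the "no value" marker */

struct toy_timespec {
	long long tv_sec;
	long tv_nsec;
};

struct toy_vattr {
	struct toy_timespec va_birthtime;
};

typedef int toy_getattr_t(struct toy_vattr *);

/* Like devfs in the text: succeeds but never fills in va_birthtime. */
int
toy_devfs_getattr(struct toy_vattr *vap)
{
	(void)vap;
	return (0);
}

/* A filesystem that does report a birth time. */
int
toy_ufs_getattr(struct toy_vattr *vap)
{
	vap->va_birthtime.tv_sec = 200;
	vap->va_birthtime.tv_nsec = 0;
	return (0);
}

/* True if mtime is earlier than the birth time reported by getattr. */
bool
toy_mtime_predates_birth(const struct toy_timespec *mtime,
    toy_getattr_t *getattr)
{
	struct toy_vattr va;

	assert(mtime != NULL && getattr != NULL);
	/* Pre-initialize so the comparison never reads garbage. */
	va.va_birthtime.tv_sec = TOY_VNOVAL;
	va.va_birthtime.tv_nsec = 0;
	if (getattr(&va) != 0)
		return (false);
	if (mtime->tv_sec != va.va_birthtime.tv_sec)
		return (mtime->tv_sec < va.va_birthtime.tv_sec);
	return (mtime->tv_nsec < va.va_birthtime.tv_nsec);
}
```

With the pre-initialization in place, a filesystem that leaves the
field alone yields a well-defined "not earlier" result for any
non-negative time instead of a comparison against stack garbage.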

Reported by:	KMSAN
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31468
2021-08-09 13:27:20 -04:00
Brad Davis
83952a5baa release: allow VM_EXTRA_PACKAGES to be specified in the environment
This is useful for adding extra packages to the build of an AMI.
For example:
	env VM_EXTRA_PACKAGES="zsh" make -C release ec2ami

Approved by:	gjb
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-09 10:31:51 -06:00
Brad Davis
be2bc82f18 release: fix copypasta
Approved by:	gjb
MFC after:	1 week
X-MFC-With:	fd17ea8c18
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-09 10:23:18 -06:00
Brad Davis
fd17ea8c18 release: make pkg installs more robust
Currently pkg(8) will fail to install any package if one is missing, so
make this a loop to prevent one missing package from preventing the rest
from installing.  Seen while building an AWS AMI for aarch64 on main,
where ebsnvme-id is not available in the repo at the moment.

Approved by:	gjb
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-09 09:54:08 -06:00
Michael Tuexen
3808ab732e sctp: remove some set, but unused variables
Thanks to pkasting for submitting the patch for the userland stack.

MFC after:	3 days
2021-08-09 15:58:46 +02:00
Gordon Bergling
6bddade611 mkimg(1): Correct a typo in the usage output
- s/partion/partition/

MFC after:  5 days
2021-08-09 13:53:30 +02:00
Gordon Bergling
8b9f6d62f7 nanobsd: Correct a typo in a comment
- s/partion/partition/

MFC after:	3 days
2021-08-09 13:45:10 +02:00