- In g_disk_start(), verify that the data to be written is initialized
according to KMSAN shadow state.
- In g_disk_done(), verify that the block driver updated shadow state as
expected, so as to catch sources of false positives early.
Sponsored by: The FreeBSD Foundation
When cloning a ktls session (which is needed when we need to
switch output NICs for a NIC TLS session), we need to also
init the reset task, like we do when creating a new tls session.
Reviewed by: jhb
Sponsored by: Netflix
Make kdb_thr_first() and kdb_thr_next() return sane values if the
allproc list and pidhashtbl haven't been initialized yet. This can
happen if the debugger is entered very early on, for example with the
'-d' boot flag.
This allows remote gdb to attach at such a time, and fixes some ddb
commands like 'show threads'.
Be explicit about the static initialization of these variables. This
part has no functional change.
Reviewed by: markj, imp (previous version)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D31495
We use the midr_el1 register to decode which CPU type we are booting
from. Read it on the secondary CPUs before waiting for the boot CPU
to release us as it will need to use it before the release.
Sponsored by: The FreeBSD Foundation
Summary:
When running on a CPU that supports the arm64 sha256 intrinsics use them
to improve perfromance of sha256 calculations.
With this changethe following improvement has been seen on an Apple M1
with FreeBS running under Parallels, with similar results on a
Neoverse-N1 r3p1.
x sha256.orig
+ sha256.arm64
+--------------------------------------------------------------------+
|++ x x|
|+++ xxx|
||A |A||
+--------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 3.41 3.5 3.46 3.458 0.042661458
+ 5 0.47 0.54 0.5 0.504 0.027018512
Difference at 95.0% confidence
-2.954 +/- 0.0520768
-85.4251% +/- 0.826831%
(Student's t, pooled s = 0.0357071)
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31284
Some simulators don't implement arbitrary sized memory access to the
virtio PCI registers. Follow Linux and use single byte accesses to read
and write to these registers.
Reviewed by: bryanv, emaste (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31424
This includes a style fix around ioflag checking as well.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib, bcr
Differential Revision: https://reviews.freebsd.org/D31505
This does not appear to change generated code with the default
toolchain. However, KMSAN makes use of output operand specifications to
instrument inline asm, and with incorrect specifications we get false
positives in code that uses the CK_(S)LIST macros.
This was submitted upstream:
https://github.com/concurrencykit/ck/pull/175
The commit applies the same change locally to make KMSAN usable until
something equivalent is merged upstream.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
For now, just hook the allocation path: upon allocation, items are
marked as initialized (absent M_ZERO). Some zones are exempted from
this when it would otherwise raise false positives.
Use kmsan_orig() to update the origin map for UMA and malloc(9)
allocations. This allows KMSAN to print the return address when an
uninitialized UMA item is implicated in a report. For example:
panic: MSan: Uninitialized UMA memory from m_getm2+0x7fe
Sponsored by: The FreeBSD Foundation
Sanitizer instrumentation of course cannot automatically update shadow
state when devices write to host memory. KMSAN thus hooks into busdma,
both to update shadow state after a device write, and to verify that the
kernel does not publish uninitalized bytes to devices.
To implement this, when KMSAN is configured, each dmamap embeds a memory
descriptor describing the region currently loaded into the map.
bus_dmamap_sync() uses the operation flags to determine whether to
validate the loaded region or to mark it as initialized in the shadow
map.
Note that in cases where the amount of data written is less than the
buffer size, the entire buffer is marked initialized even when it is
not. For example, if a NIC writes a 128B packet into a 2KB buffer, the
entire buffer will be marked initialized, but subsequent accesses past
the first 128 bytes are likely caused by bugs.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31338
Use this flag to indicate that busdma should allocate a map structure
even no bouncing is required to satisfy the tag's constraints. This
will be used for KMSAN.
Also fix a memory leak that can occur if the kernel fails to allocate
bounce pages in bounce_bus_dmamap_create().
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31338
Interrupt and exception handlers must call kmsan_intr_enter() prior to
calling any C code. This is because the KMSAN runtime maintains some
TLS in order to track initialization state of function parameters and
return values across function calls. Then, to ensure that this state is
kept consistent in the face of asynchronous kernel-mode excpeptions, the
runtime uses a stack of TLS blocks, and kmsan_intr_enter() and
kmsan_intr_leave() push and pop that stack, respectively.
Use these functions in amd64 interrupt and exception handlers. Note
that handlers for user->kernel transitions need not be annotated.
Also ensure that trap frames pushed by the CPU and by handlers are
marked as initialized before they are used.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31467
- During boot, allocate PDP pages for the shadow maps. The region above
KERNBASE is currently not shadowed.
- Create a dummy shadow for the vm page array. For now, this array is
not protected by the shadow map to help reduce kernel memory usage.
- Grow shadows when growing the kernel map.
- Increase the default kernel stack size when KMSAN is enabled. As with
KASAN, sanitizer instrumentation appears to create stack frames large
enough that the default value is not sufficient.
- Disable UMA's use of the direct map when KMSAN is configured. KMSAN
cannot validate the direct map.
- Disable unmapped I/O when KMSAN configured.
- Lower the limit on paging buffers when KMSAN is configured. Each
buffer has a static MAXPHYS-sized allocation of KVA, which in turn
eats 2*MAXPHYS of space in the shadow map.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31295
KMSAN enables the use of LLVM's MemorySanitizer in the kernel. This
enables precise detection of uses of uninitialized memory. As with
KASAN, this feature has substantial runtime overhead and is intended to
be used as part of some automated testing regime.
The runtime maintains a pair of shadow maps. One is used to track the
state of memory in the kernel map at bit-granularity: a bit in the
kernel map is initialized when the corresponding shadow bit is clear,
and is uninitialized otherwise. The second shadow map stores
information about the origin of uninitialized regions of the kernel map,
simplifying debugging.
KMSAN relies on being able to intercept certain functions which cannot
be instrumented by the compiler. KMSAN thus implements interceptors
which manually update shadow state and in some cases explicitly check
for uninitialized bytes. For instance, all calls to copyout() are
subject to such checks.
The runtime exports several functions which can be used to verify the
shadow map for a given buffer. Helpers provide the same functionality
for a few structures commonly used for I/O, such as CAM CCBs, BIOs and
mbufs. These are handy when debugging a KMSAN report whose
proximate and root causes are far away from each other.
Obtained from: NetBSD
Sponsored by: The FreeBSD Foundation
KMSAN requires two shadow maps, each one-to-one with the kernel map.
Allocate regions of the kernels PML4 page for them. Add functions to
create mappings in the shadow map regions, these will be used by the
KMSAN runtime.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31295
Also remove a redundant assertion in pmap_kasan_enter().
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31295
Reset the mmc owner before calling the bridge release host callback.
Some people are hitting the "mmc: host bridge didn't serialize us." panic as
the bridge is released before the mmc owner is reset.
Submitted by: luiz
Sponsored by: Rubicon Communications, LLC ("Netgate")
Simplify the setup of srrctl.BSIZEPKT on igb class NICs.
Improve the setup of rctl.BSIZE on lem and em class NICs.
Don't try to touch rfctl on lem class NICs.
Manipulate rctl.BSEX correctly on lem and em class NICs.
Approved by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31457
The current code checks the RWX bits are 0 but does not check the V bit
is non-zero, meaning not-yet-allocated L1 entries that are still zero
are regarded as being allocated. This is likely due to copying the arm64
code that checks ATTR_DESC_MASK is L1_TABLE, which emcompasses both the
type and the validity in a single field, and erroneously translating
that to a check of just PTE_RWX being 0 to indicate non-leaf, forgetting
about the V bit. This then results in the following panic:
panic: Fatal page fault at 0xffffffc0005cf292: 0x00000000000050
cpuid = 1
time = 1628379581
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x148
panic() at panic+0x2a
page_fault_handler() at page_fault_handler+0x1ba
do_trap_supervisor() at do_trap_supervisor+0x7a
cpu_exception_handler_supervisor() at
cpu_exception_handler_supervisor+0x70
--- exception 13, tval = 0x50
pmap_enter_l2() at pmap_enter_l2+0xb2
pmap_enter_object() at pmap_enter_object+0x15e
vm_map_pmap_enter() at vm_map_pmap_enter+0x228
vm_map_insert() at vm_map_insert+0x4ec
vm_map_find() at vm_map_find+0x474
vm_map_find_min() at vm_map_find_min+0x52
vm_mmap_object() at vm_mmap_object+0x1ba
vn_mmap() at vn_mmap+0xf8
kern_mmap() at kern_mmap+0x4c4
sys_mmap() at sys_mmap+0x38
do_trap_user() at do_trap_user+0x208
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0x1dd
Instead, we should just check the V bit, as on amd64, and assert that
any valid L1 entries are not leaves, since an L1 leaf would render the
entire range allocated and thus we should not have attempted to map that
VA in the first place.
Reported by: David Gilbert <dgilbert@daveg.ca>
MFC after: 1 week
Reviewed by: markj, mhorne
Differential Revision: https://reviews.freebsd.org/D31460
While here, use designated initializers and rename some AMD iommu method
implementations to match the corresponding op names. No functional
change intended.
Reviewed by: grehan
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31462
This does not appear to affect code generation, at least with the
default toolchain.
Noticed because incorrect output specifications lead to false positives
from KMSAN, as the instrumentation uses them to update shadow state for
output operands.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31466
The use of Giant here is vestigal and does not provide any useful
synchronization. Furthermore, non-MPSAFE callouts can cause the
softclock threads to block waiting for long-running newbus operations to
complete.
Reported by: mav
Reviewed by: bz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31470
We need to do so to safely traverse the ifnet list.
Reviewed by: bz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31470
Some filesystems, e.g., devfs, do not populate va_birthtime in their
GETATTR implementations. To handle this, make sure that va_birthtime is
initialized to the quasi-standard value of { VNOVAL, 0 } before calling
VOP_GETATTR.
Reported by: KMSAN
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31468
This was added early in the development of the arm64 port when
cpu_switch was just a stub. It should have been removed when cpu_switch
was implemented, however this didn't seem to be the case, and the '%p'
was added.
As this hasn't been needed in 7 years we can remove it.
Sponsored by: The FreeBSD Foundation
When exiting to userspace the code is similar to the restore_registers
macro in exception.S. Rework it to remove most of the non-style
differences.
Sponsored by: The FreeBSD Foundation
Fix getsockopt() for the IPPROTO_IPV6 level socket options with the
following names: IPV6_HOPOPTS, IPV6_RTHDR, IPV6_RTHDRDSTOPTS,
IPV6_DSTOPTS, and IPV6_NEXTHOP.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D31458
When the device name is provided, we can simply run strncmp() for each
line to quickly skip unrelated ones, that is much faster than sscanf()
and only then strcmp().
MFC after: 2 weeks