It is currently allowed to fchownat(2), fchmodat(2), fchflagsat(2),
utimensat(2), fstatat(2), and linkat(2).
For linkat(2), PRIV_VFS_FHOPEN privilege is required to exercise the flag.
It allows to link any open file.
Requested by: trasz
Tested by: pho, trasz
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29111
Without the -w option, Windows guests crash on boot. This is caused by a rdmsr
of MSR_IA32_FEATURE_CONTROL. Windows checks this MSR to determine enabled VMX
features. This MSR isn't emulated in bhyve, so a #GP exception is injected
which causes Windows to crash.
Fix by returning a rdmsr of MSR_IA32_FEATURE_CONTROL with Lock Bit set and
VMX disabled to informWindows that VMX isn't available.
Reviewed by: jhb, grehan (bhyve)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29665
When quotas are enabled with the quotaon(8) command, it sets the
MNT_QUOTA flag in the mount structure mnt_flag field. The mount
structure holds a cached copy of the filesystem statfs structure
in mnt_stat that includes a copy of the mnt_flag field in
mnt_stat.f_flags. The mnt_stat structure may not be updated for
hours. Since the mount command requests mount details using the
MNT_NOWAIT option, it gets the mount's mnt_stat statfs structure
whose f_flags field does not yet show the MNT_QUOTA flag being set
in mnt_flag.
The fix is to have quotaon(8) set the MNT_QUOTA flag in both mnt_flag
and in mnt_stat.f_flags so that it will be immediately visible to
callers of statfs(2).
Reported by: Christos Chatzaras
Tested by: Christos Chatzaras
PR: 254682
MFC after: 3 days
Sponsored by: Netflix
Make it possible to reclaim items from a specific NUMA domain.
- Add uma_zone_reclaim_domain() and uma_reclaim_domain().
- Permit parallel reclamations. Use a counter instead of a flag to
synchronize with zone_dtor().
- Use the zone lock to protect cache_shrink() now that parallel reclaims
can happen.
- Add a sysctl that can be used to trigger reclamation from a specific
domain.
Currently the new KPIs are unused, so there should be no functional
change.
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29685
Note that the per-domain variant does not shrink the target bucket size.
No functional change intended.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Add global definitions for first-touch and interleave policies. The
former may be useful for UMA, which implements a similar policy without
using domainset iterators.
No functional change intended.
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29104
In 22bd0c9731 ossl(4) was ported to arm64. The manual page was
adapted, but never installed since the ossl(4) manual page was
i386 / amd64 only.
Reviewed by: mhorne
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29762
This was briefly broken, so ensure that we can read and clear rules
counters.
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29728
After the migration to libpfctl for rule retrieval we accidentally lost
support for clearing the rules counters.
Introduce a get_clear variant of pfctl_get_rule() which allows rules
counters to be cleared.
MFC after: 4 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29727
Notable upstream pull request merges:
#11742 When specifying raidz vdev name, parity count should match
#11744 Use a helper function to clarify gang block size
#11771 Support running FreeBSD buildworld on Arm-based macOS hosts
This is the last update that will be MFCed into stable/13.
From now on, the tracking of OpenZFS branches will be different:
- main continues tracking openzfs/zfs/master
- stable/13 is going to track openzfs/zfs/zfs-2.1-release
Obtained from: OpenZFS
MFC after: 1 week
Found by: syzkaller
Reported and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29746
It is fine to drop the process lock there, process cannot exit until its
timers are cleared.
Found by: syzkaller
Reported and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29746
and length should be not less than SBUF_MINSIZE
Reported and tested by: pho
Noted and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29752
Notable upstream pull request merges:
#11742 When specifying raidz vdev name, parity count should match
#11744 Use a helper function to clarify gang block size
#11771 Support running FreeBSD buildworld on Arm-based macOS hosts
Our current processor was identified as trusting cert not explicitly
marked for SERVER_AUTH, as well as certs that were tagged with
DISTRUST_AFTER.
Update the script to handle both scenarios. This patch was originally
authored by mandree@ for ports, and it was subsequently ported to base
caroot.
MFC after: 3 days
- Reuse some REDZONE bits to keep track of the requested and allocated
sizes, and use that to provide red zones.
- As in UMA, disable memory trashing to avoid unnecessary CPU overhead.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29461
We cache mapped execve argument buffers to avoid the overhead of TLB
shootdowns. Mark them invalid when they are freed to the cache.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29460
vnodes are a bit special in that they may exist on per-CPU lists even
while free. Add a KASAN-only destructor that poisons regions of each
vnode that are not expected to be accessed after a free.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29459
Memory allocated with kmem_* is unmapped upon free, so KASAN doesn't
provide a lot of benefit, but since allocations are always a multiple of
the page size we can create a redzone when the allocation request size
is not a multiple of the page size.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29458
We allocate kernel stacks using a UMA cache zone. Cache zones have
KASAN disabled by default, but in this case it makes sense to enable it.
Reviewed by: andrew
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29457
- Add a UMA_ZONE_NOKASAN flag to indicate that items from a particular
zone should not be sanitized. This is applied implicitly for NOFREE
and cache zones.
- Add KASAN call backs which get invoked:
1) when a slab is imported into a keg
2) when an item is allocated from a zone
3) when an item is freed to a zone
4) when a slab is freed back to the VM
In state transitions 1 and 3, memory is poisoned so that accesses will
trigger a panic. In state transitions 2 and 4, memory is marked
valid.
- Disable trashing if KASAN is enabled. It just adds extra CPU overhead
to catch problems that are detected by KASAN.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29456
- Initialize KASAN before executing SYSINITs.
- Add a GENERIC-KASAN kernel config, akin to GENERIC-KCSAN.
- Increase the kernel stack size if KASAN is enabled. Some of the
ASAN instrumentation increases stack usage and it's enough to
trigger stack overflows in ZFS.
- Mark the trapframe as valid in interrupt handlers if it is
assigned to td_intr_frame. Otherwise, an interrupt in a function
which creates a poisoned alloca region can trigger false positives.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29455
The idea behind KASAN is to use a region of memory to track the validity
of buffers in the kernel map. This region is the shadow map. The
compiler inserts calls to the KASAN runtime for every emitted load
and store, and the runtime uses the shadow map to decide whether the
access is valid. Various kernel allocators call kasan_mark() to update
the shadow map.
Since the shadow map tracks only accesses to the kernel map, accesses to
other kernel maps are not validated by KASAN. UMA_MD_SMALL_ALLOC is
disabled when KASAN is configured to reduce usage of the direct map.
Currently we have no mechanism to completely eliminate uses of the
direct map, so KASAN's coverage is not comprehensive.
The shadow map uses one byte per eight bytes in the kernel map. In
pmap_bootstrap() we create an initial set of page tables for the kernel
and preloaded data.
When pmap_growkernel() is called, we call kasan_shadow_map() to extend
the shadow map. kasan_shadow_map() uses pmap_kasan_enter() to allocate
memory for the shadow region and map it.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29417
KASAN enables the use of LLVM's AddressSanitizer in the kernel. This
feature makes use of compiler instrumentation to validate memory
accesses in the kernel and detect several types of bugs, including
use-after-frees and out-of-bounds accesses. It is particularly
effective when combined with test suites or syzkaller. KASAN has high
CPU and memory usage overhead and so is not suited for production
environments.
The runtime and pmap maintain a shadow of the kernel map to store
information about the validity of memory mapped at a given kernel
address.
The runtime implements a number of functions defined by the compiler
ABI. These are prefixed by __asan. The compiler emits calls to
__asan_load*() and __asan_store*() around memory accesses, and the
runtime consults the shadow map to determine whether a given access is
valid.
kasan_mark() is called by various kernel allocators to update state in
the shadow map. Updates to those allocators will come in subsequent
commits.
The runtime also defines various interceptors. Some low-level routines
are implemented in assembly and are thus not amenable to compiler
instrumentation. To handle this, the runtime implements these routines
on behalf of the rest of the kernel. The sanitizer implementation
validates memory accesses manually before handing off to the real
implementation.
The sanitizer in a KASAN-configured kernel can be disabled by setting
the loader tunable debug.kasan.disable=1.
Obtained from: NetBSD
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29416
LLVM support for enabling KASAN has not yet landed so the option is not
yet usable, but hopefully this will change soon.
Reviewed by: imp, andrew
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29454
These comments were copied from dbg_monitor_enter(), but the intended
modifications weren't made. Update them to reflect what this code
actually does.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
This is both intuitive and required, as any previous breakpoint settings
may not be applicable to the new process.
Reported by: arichardson
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29672
During device destruction it is possible that open() succeed, but
fdevname() return NULL, that can't be assigned to string variable.
Fix that by adding explicit NULL check.
Also while there switch from fdevname() to fdevname_r().
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
Since 7763814fc9 nfsrpc_setclient() uses mem_alloc() that is macro
around malloc(M_RPC). M_RPC is provided by xdr.ko.
Reviewed by: rmacklem
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week