freebsd-nq

Author	SHA1	Message	Date
Mark Johnston	3388bf06d7	Generalize sanitizer interceptors for memory and string routines Similar to commit `3ead60236f` ("Generalize bus_space(9) and atomic(9) sanitizer interceptors"), use a more generic scheme for interposing sanitizer implementations of routines like memcpy(). No functional change intended. Sponsored by: The FreeBSD Foundation (cherry picked from commit `ec8f1ea8d5`)	2021-11-01 10:20:50 -04:00
Mark Johnston	bf0986b742	Generalize bus_space(9) and atomic(9) sanitizer interceptors Make it easy to define interceptors for new sanitizer runtimes, rather than assuming KCSAN. Lay a bit of groundwork for KASAN and KMSAN. When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions in atomic_san.h are used by default instead of the inline implementations in the platform's atomic.h. These definitions are implemented in the sanitizer runtime, which includes machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual implementations. No functional change intended. Sponsored by: The FreeBSD Foundation (cherry picked from commit `3ead60236f`)	2021-11-01 10:16:39 -04:00
Mark Johnston	252b6ae3e6	KASAN: Disable checking before triggering a panic KASAN hooks will not generate reports if panicstr != NULL, but then there is a window after the initial panic() call where another report may be raised. This can happen if a false positive occurs; to simplify debugging of such problems, avoid recursing. Sponsored by: The FreeBSD Foundation (cherry picked from commit `ea3fbe0707`)	2021-11-01 10:07:45 -04:00
Mark Johnston	224a01a342	KASAN: Implement __asan_unregister_globals() It will be called during KLD unload to unpoison the redzones following global variables. Otherwise, virtual address ranges previously used for a KLD may be left tainted, triggering false positives when they are recycled. Reported by: pho Sponsored by: The FreeBSD Foundation (cherry picked from commit `588c7a06df`)	2021-11-01 10:07:13 -04:00
Mark Johnston	28c338b342	realloc: Fix KASAN(9) shadow map updates When copying from the old buffer to the new buffer, we don't know the requested size of the old allocation, but only the size of the allocation provided by UMA. This value is "alloc". Because the copy may access bytes in the old allocation's red zone, we must mark the full allocation valid in the shadow map. Do so using the correct size. Reported by: kp Tested by: kp Sponsored by: The FreeBSD Foundation (cherry picked from commit `9a7c2de364`)	2021-11-01 10:05:22 -04:00
Mark Johnston	9710b74dd0	malloc: Add state transitions for KASAN - Reuse some REDZONE bits to keep track of the requested and allocated sizes, and use that to provide red zones. - As in UMA, disable memory trashing to avoid unnecessary CPU overhead. Sponsored by: The FreeBSD Foundation (cherry picked from commit `06a53ecf24`)	2021-11-01 10:03:36 -04:00
Mark Johnston	2748ecec95	execve: Mark exec argument buffers We cache mapped execve argument buffers to avoid the overhead of TLB shootdowns. Mark them invalid when they are freed to the cache. Sponsored by: The FreeBSD Foundation (cherry picked from commit `f1c3adefd9`)	2021-11-01 10:03:28 -04:00
Mark Johnston	75306778f1	vfs: Add KASAN state transitions for vnodes vnodes are a bit special in that they may exist on per-CPU lists even while free. Add a KASAN-only destructor that poisons regions of each vnode that are not expected to be accessed after a free. Sponsored by: The FreeBSD Foundation (cherry picked from commit `b261bb4057`)	2021-11-01 10:03:19 -04:00
Mark Johnston	a3d4c8e21d	amd64: Implement a KASAN shadow map The idea behind KASAN is to use a region of memory to track the validity of buffers in the kernel map. This region is the shadow map. The compiler inserts calls to the KASAN runtime for every emitted load and store, and the runtime uses the shadow map to decide whether the access is valid. Various kernel allocators call kasan_mark() to update the shadow map. Since the shadow map tracks only accesses to the kernel map, accesses to other kernel maps are not validated by KASAN. UMA_MD_SMALL_ALLOC is disabled when KASAN is configured to reduce usage of the direct map. Currently we have no mechanism to completely eliminate uses of the direct map, so KASAN's coverage is not comprehensive. The shadow map uses one byte per eight bytes in the kernel map. In pmap_bootstrap() we create an initial set of page tables for the kernel and preloaded data. When pmap_growkernel() is called, we call kasan_shadow_map() to extend the shadow map. kasan_shadow_map() uses pmap_kasan_enter() to allocate memory for the shadow region and map it. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29417 (cherry picked from commit `6faf45b34b`)	2021-11-01 09:57:30 -04:00
Mark Johnston	48d2c7cc30	Add the KASAN runtime KASAN enables the use of LLVM's AddressSanitizer in the kernel. This feature makes use of compiler instrumentation to validate memory accesses in the kernel and detect several types of bugs, including use-after-frees and out-of-bounds accesses. It is particularly effective when combined with test suites or syzkaller. KASAN has high CPU and memory usage overhead and so is not suited for production environments. The runtime and pmap maintain a shadow of the kernel map to store information about the validity of memory mapped at a given kernel address. The runtime implements a number of functions defined by the compiler ABI. These are prefixed by __asan. The compiler emits calls to __asan_load() and __asan_store() around memory accesses, and the runtime consults the shadow map to determine whether a given access is valid. kasan_mark() is called by various kernel allocators to update state in the shadow map. Updates to those allocators will come in subsequent commits. The runtime also defines various interceptors. Some low-level routines are implemented in assembly and are thus not amenable to compiler instrumentation. To handle this, the runtime implements these routines on behalf of the rest of the kernel. The sanitizer implementation validates memory accesses manually before handing off to the real implementation. The sanitizer in a KASAN-configured kernel can be disabled by setting the loader tunable debug.kasan.disable=1. Obtained from: NetBSD Sponsored by: The FreeBSD Foundation (cherry picked from commit `38da497a4d`)	2021-11-01 09:56:31 -04:00
Mark Johnston	bb5c81812f	timecounter: Lock the timecounter list Timecounter registration is dynamic, i.e., there is no requirement that timecounters must be registered during single-threaded boot. Loadable drivers may in principle register timecounters (which can be switched to automatically). Timecounters cannot be unregistered, though this could be implemented. Registered timecounters belong to a global linked list. Add a mutex to synchronize insertions and the traversals done by (mpsafe) sysctl handlers. No functional change intended. Reviewed by: imp, kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `621fd9dcb2`)	2021-11-01 09:20:11 -04:00
Mark Johnston	943421bdf7	signal: Add SIG_FOREACH and refactor issignal() Add a SIG_FOREACH macro that can be used to iterate over a signal set. This is a bit cleaner and more efficient than calling sig_ffs() in a loop. The implementation is based on BIT_FOREACH_ISSET(), except that the bitset limbs are always 32 bits wide, and signal sets are 1-indexed rather than 0-indexed like bitset(9) sets. issignal() cannot really be modified to use SIG_FOREACH() directly. Take this opportunity to split the function into two explicit loops. I've always found this function hard to read and think that this change is an improvement. Remove sig_ffs(), nothing uses it now. Reviewed by: kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `81f2e9063d`)	2021-11-01 09:20:11 -04:00
Gordon Bergling	6ad1c6a826	jail(8): Fix a few common typos in source code comments - s/phyiscal/physical/ (cherry picked from commit `70de1003da`)	2021-10-30 09:48:43 +02:00
Konstantin Belousov	c3c880be15	uipc_shm: silence warnings about write-only variables in largepage code (cherry picked from commit `3b5331dd8d`)	2021-10-27 03:24:41 +03:00
Konstantin Belousov	17c83b7670	sig_ast_checksusp(): mark the local p as __diagused (cherry picked from commit `3d2778515a`)	2021-10-27 03:24:40 +03:00
Konstantin Belousov	ec235e162a	subr_firmware.c::unloadentry(): remove write-only variable (cherry picked from commit `6776747a0e`)	2021-10-27 03:24:40 +03:00
Konstantin Belousov	485cc5549c	procctl: stop using SA_*LOCKED, define local enum (cherry picked from commit `c7f38a2df1`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	59447a02f1	kern_procctl: skip zombies for process group operations (cherry picked from commit `49db81aa05`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	8589a3470d	kern_procctl.c: use td->td_proc instead of curproc (cherry picked from commit `3692877a6c`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	c802b970a5	procctl: actually require debug privileges over target (cherry picked from commit `f5bb6e5a6d`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	c7d4bd7477	procctl: make it possible to specify that some operations require debug privilege over the target (cherry picked from commit `1c4dbee5dd`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	84722e8171	sys_procctl(): zero the data buffer once, on syscall entry (cherry picked from commit `32026f5983`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	a89f144b0d	sys_procctl(): use table data to do copyin/copyout (cherry picked from commit `56d5323b4d`)	2021-10-26 05:26:27 +03:00
Konstantin Belousov	38506cebc1	kern_procctl_single(): convert to use table data (cherry picked from commit `68dc5b381a`)	2021-10-26 05:26:26 +03:00
Konstantin Belousov	3c7f03c25f	procctl: convert PDEATHSIG_CTL/STATUS to regular kern_procctl_single() cases (cherry picked from commit `34f39a8c0e`)	2021-10-26 05:26:26 +03:00
Konstantin Belousov	2e69ba48b9	procctl(2): add consistent shortcut P_ID:0 as curproc (cherry picked from commit `f833ab9dd1`)	2021-10-26 05:26:26 +03:00
Konstantin Belousov	19eec36599	kern_procctl(): convert the function to be table-driven (cherry picked from commit `7ae879b14a`)	2021-10-26 05:26:26 +03:00
Konstantin Belousov	1d72df1c3d	sys_procctl(2): remove sysproto and argused (cherry picked from commit `31faa565ed`)	2021-10-26 05:26:26 +03:00
Andrew Turner	f803dd1e24	Add pmap_change_prot on arm64 Support changing the protection of preloaded kernel modules by implementing pmap_change_prot on arm64 and calling it from preload_protect. Reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32026 (cherry picked from commit `a85ce4ad72`)	2021-10-25 14:46:44 +01:00
Jessica Clarke	af818612a5	riscv: Implement pmap_mapdev_attr This is needed for LinuxKPI's _ioremap_attr. This reuses the generic implementation introduced for aarch64, and itself requires implementing pmap_kenter, which is trivial to do given riscv currently treats all mapping attributes the same due to the Svpbmt extension not yet being ratified and in hardware. Reviewed by: markj, mhorne MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D32445 (cherry picked from commit `682c00a6ce`)	2021-10-24 19:51:10 +01:00
Alexander Motin	1e7091ac7c	sched_ule(4): Fix possible significance loss. Before this change kern.sched.interact sysctl setting above 32 gave all interactive threads identical priority of PRI_MIN_INTERACT due to ((PRI_MAX_INTERACT - PRI_MIN_INTERACT + 1) / sched_interact) turning zero. Setting the sysctl lower reduced the range of used priority levels up to half, that is not great either. Change of the operations order should fix the issue, always using full range of priorities, while overflow is impossible there since both score and priority values are small. While there, make the variables unsigned as they really are. MFC after: 1 month (cherry picked from commit `1c119e173d`)	2021-10-21 18:24:36 -04:00
Alexander Motin	11f14b3362	sched_ule(4): Fix hang with steal_thresh < 2. `e745d729be` caused infinite loop with interrupts disabled in load stealing code if steal_thresh set below 2. Such configuration should not generally be used, but appeared some people are using it to workaround some problems. To fix the problem explicitly pass to sched_highest() minimum number of transferrable threads, supported by the caller, instead of guessing. MFC after: 25 days (cherry picked from commit `08063e9f98`)	2021-10-21 18:24:36 -04:00
Alexander Motin	b5919ea4e6	x86: Add NUMA nodes into CPU topology. Depending on hardware, NUMA nodes may match last level caches, or they may be above them (AMD Zen 2/3) or below (Intel Xeon w/ SNC). This information is provided by ACPI instead of CPUID, and it is provided for each CPU individually instead of mask widths, but this code should be able to properly handle all the above cases. This change should immediately allow idle stealing in sched_ule(4) to prefer load from NUMA-local CPUs to remote ones when the node does not match LLC. Later we may think of how to better handle it on sched_pickcpu() side. MFC after: 1 month (cherry picked from commit `ef50d5fbc3`)	2021-10-21 18:24:36 -04:00
Alexander Motin	a3d50144cc	Fix build without SMP. MFC after: 1 month (cherry picked from commit `8db1669959`)	2021-10-21 18:24:35 -04:00
Alexander Motin	4808bab7fa	sched_ule(4): Improve long-term load balancer. Before this change long-term load balancer was unable to migrate running threads, only ones waiting on run queues. But with growing number of CPU cores it is quite typical now for system to not have many waiting threads. But same time if due to some coincidence two long-running CPU-bound threads ended up sharing same physical CPU core, they could suffer from the SMT penalty indefinitely, and the load balancer couldn't help. Improve that by teaching the load balancer to hint running threads to migrate by marking them with TDF_NEEDRESCHED and new TDF_PICKCPU flag, making sched_pickcpu() to search for better CPU later, when it is convenient. Fix CPU search logic when balancing to limit round-robin migrations in case of almost equal load to the group of physical cores. The previous code bounced threads across all the system, that should be pretty bad for caches and NUMA affinity, while additional fairness was almost invisible, diminishing with number of cores in the group. MFC after: 1 month (cherry picked from commit `e745d729be`)	2021-10-21 18:24:35 -04:00
Alexander Motin	fa226878a5	sbuf(9): Microoptimize sbuf_put_byte() This function is actively used by sbuf_vprintf(), so this simple inlining in half reduces time of kern.geom.confxml generation. MFC after: 2 weeks Sponsored by: iXsystem, Inc. (cherry picked from commit `7835b2cb4a`)	2021-10-21 18:24:29 -04:00
John Baldwin	58d69f4ecf	crypto: Add a new type of crypto buffer for a single mbuf. This is intended for use in KTLS transmit where each TLS record is described by a single mbuf that is itself queued in the socket buffer. Using the existing CRYPTO_BUF_MBUF would result in bus_dmamap_load_crp() walking additional mbufs in the socket buffer that are not relevant, but generating a S/G list that potentially exceeds the limit of the tag (while also wasting CPU cycles). Reviewed by: markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30136 (cherry picked from commit `883a0196b6`)	2021-10-21 08:51:26 -07:00
John Baldwin	60b9ce7245	sglist: Add sglist_append_single_mbuf(). This function appends the contents of a single mbuf to an sglist rather than an entire mbuf chain. Reviewed by: gallatin, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30135 (cherry picked from commit `6663f8a23e`)	2021-10-21 08:51:26 -07:00
John Baldwin	da557f2fe6	Rename m_unmappedtouio() to m_unmapped_uiomove(). This function doesn't only copy data into a uio but instead is a variant of uiomove() similar to uiomove_fromphys(). Reviewed by: gallatin, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30444 (cherry picked from commit `aa341db39b`)	2021-10-21 08:51:26 -07:00
John Baldwin	8efc88d0d6	Extend m_copyback() to support unmapped mbufs. Reviewed by: gallatin, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30133 (cherry picked from commit `3f9dac85cc`)	2021-10-21 08:51:25 -07:00
John Baldwin	2ba824366c	Extend m_apply() to support unmapped mbufs. m_apply() invokes the callback function separately on each segment of an unmapped mbuf: the TLS header, individual pages, and the TLS trailer. Reviewed by: markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30132 (cherry picked from commit `3c7a01d773`)	2021-10-21 08:51:25 -07:00
Mark Johnston	348fc38fd5	mount: Check for !VDIR mount points before handling -o emptydir To implement -o emptydir, vfs_emptydir() checks that the passed directory is empty. This should be done after checking whether the vnode is of type VDIR, though, or vfs_emptydir() may end up calling VOP_READDIR on a non-directory. Reported by: syzbot+4006732c69fb0f792b2c@syzkaller.appspotmail.com Reviewed by: kib, imp Sponsored by: The FreeBSD Foundation (cherry picked from commit `03d5820f73`)	2021-10-19 20:53:33 -04:00
John Baldwin	59a5099ec1	Document kern.log_wakeups_per_second. PR: 148680 (cherry picked from commit `c51e4962a3`)	2021-10-19 16:53:26 -07:00
Brooks Davis	3b55b61371	selsocket: handle sopoll() errors correctly Without this change, unmounting smbfs filesystems with an INVARIANTS kernel would panic after `10e64782ed`. PR: 253079 Found by: markj Reviewed by: markj, jhb Obtained from: CheriBSD Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D32492 (cherry picked from commit `04c91ac48a`)	2021-10-20 00:19:57 +01:00
Brooks Davis	fe388671ac	makesyscalls.lua: add a CAPENABLED flag The CAPENABLED flag indicates that the syscall can be used in capsicum capability mode. It is intended to replace capabilities.conf. Reviewed by: kevans, emaste Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D31349 (cherry picked from commit `6945df3fff`)	2021-10-20 00:19:56 +01:00
Brooks Davis	81184e92e0	makesyscalls.lua: Add a new syscall type: RESERVED RESERVED syscall numbers are reserved for local/vendor use. RESERVED is identical to UNIMPL except that comments are ignored. Reviewed by: kevans Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D27988 (cherry picked from commit `119fa6ee8a`)	2021-10-20 00:19:56 +01:00
Mark Johnston	54a01b5326	vfs: Permit unix sockets to be opened with O_PATH As with FIFOs, a path descriptor for a unix socket cannot be used with kevent(). In principle connectat(2) and bindat(2) could be modified to support an AT_EMPTY_PATH-like mode which operates on the socket referenced by an O_PATH fd referencing a unix socket. That would eliminate the path length limit imposed by sockaddr_un. Update O_PATH tests. Reviewed by: kib Sponsored by: The FreeBSD Foundation (cherry picked from commit `2bd9826995`)	2021-10-17 17:15:44 -04:00
Mark Johnston	66f5f95864	timecounter: Let kern.timecounter.stepwarnings be set as a tunable (cherry picked from commit `fa9da1f590`)	2021-10-16 09:31:19 -04:00
Greg V	1625e2db22	O_PATH: allow vfs_extattr syscalls (cherry picked from commit `98dae405de`)	2021-10-16 16:01:47 +03:00
Konstantin Belousov	f824a0d090	Style (cherry picked from commit `1adebca1fc`)	2021-10-15 23:39:07 +03:00

18493 Commits