freebsd-dev

Author	SHA1	Message	Date
Alex Richardson	fa2528ac64	Use atomic loads/stores when updating td->td_state KCSAN complains about racy accesses in the locking code. Those races are fine since they are inside a TD_SET_RUNNING() loop that expects the value to be changed by another CPU. Use relaxed atomic stores/loads to indicate that this variable can be written/read by multiple CPUs at the same time. This will also prevent the compiler from doing unexpected re-ordering. Reported by: GENERIC-KCSAN Test Plan: KCSAN no longer complains, kernel still runs fine. Reviewed By: markj, mjg (earlier version) Differential Revision: https://reviews.freebsd.org/D28569	2021-02-18 14:02:48 +00:00
Mateusz Guzik	b83e94be53	thread: staticize thread_reap and move td_allocdomain thread_init is a much better fit as the value is constant after initialization.	2020-11-26 06:59:27 +00:00
Mateusz Guzik	598f2b8116	dtrace: stop using eventhandlers for the part compiled into the kernel Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D27311	2020-11-23 18:27:21 +00:00
Mateusz Guzik	a9568cd2bc	thread: stash domain id to work around vtophys problems on ppc64 Adding to zombie list can be performed by idle threads, which on ppc64 leads to panics as it requires a sleepable lock. Reported by: alfredo Reviewed by: kib, markj Fixes: r367842 ("thread: numa-aware zombie reaping") Differential Revision: https://reviews.freebsd.org/D27288	2020-11-23 18:26:47 +00:00
Mateusz Guzik	d116b9f1ad	thread: numa-aware zombie reaping The current global list is a significant problem, in particular induces a lot of cross-domain thread frees. When running poudriere on a 2 domain box about half of all frees were of that nature. Patch below introduces per-domain thread data containing zombie lists and domain-aware reaping. By default it only reaps from the current domain, only reaping from others if there is free TID shortage. A dedicated callout is introduced to reap lingering threads if there happens to be no activity. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D27185	2020-11-19 10:00:48 +00:00
Conrad Meyer	85078b8573	Split out cwd/root/jail, cmask state from filedesc table No functional change intended. Tracking these structures separately for each proc enables future work to correctly emulate clone(2) in linux(4). __FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof. Reviewed by: kib Discussed with: markj, mjg Differential Revision: https://reviews.freebsd.org/D27037	2020-11-17 21:14:13 +00:00
Mateusz Guzik	19d3e47dca	select: call seltdfini on process and thread exit Since thread_zone is marked NOFREE the thread_fini callback is never executed, meaning memory allocated by seltdinit is never released. Adding the call to thread_dtor is not sufficient as exiting processes cache the main thread.	2020-11-16 03:12:21 +00:00
Mateusz Guzik	f34a2f56c3	thread: batch credential freeing	2020-11-14 19:22:02 +00:00
Mateusz Guzik	fb8ab68084	thread: batch resource limit free calls	2020-11-14 19:21:46 +00:00
Mateusz Guzik	5ef7b7a0f3	thread: rework tid batch to use helpers	2020-11-14 19:20:58 +00:00
Mateusz Guzik	d1ca25be49	thread: pad tid lock On a kernel with other changes this bumps 104-way thread creation/destruction from 0.96 mln ops/s to 1.1 mln ops/s.	2020-11-14 19:19:27 +00:00
Mateusz Guzik	62dbc992ad	thread: move nthread management out of tid_alloc While this adds more work single-threaded, it also enables SMP-related speed ups.	2020-11-12 00:29:23 +00:00
Mateusz Guzik	755341df4f	thread: batch tid_free calls in thread_reap This eliminates the highly pessimal pattern of relocking from multiple CPUs in quick succession. Note this is still globally serialized.	2020-11-11 18:45:06 +00:00
Mateusz Guzik	c5315f5196	thread: lockless zombie list manipulation This gets rid of the most contended spinlock seen when creating/destroying threads in a loop. (modulo kstack) Tested by: alfredo (ppc64), bdragon (ppc64)	2020-11-11 18:43:51 +00:00
Mateusz Guzik	26007fe37c	thread: add more fine-grained tidhash locking Note this still does not scale but is enough to move it out of the way for the foreseeable future. In particular a trivial benchmark spawning/killing threads stops contesting on tidhash.	2020-11-11 08:51:04 +00:00
Mateusz Guzik	aae3547be3	thread: rework tidhash vs proc lock interaction Apart from minor clean up this gets rid of proc unlock/lock cycle on thread exit to work around LOR against tidhash lock.	2020-11-11 08:50:04 +00:00
Mateusz Guzik	cf31cadeb6	thread: fix thread0 tid allocation Startup code hardcodes the value instead of allocating it. The first spawned thread would then be a duplicate. Pointy hat: mjg	2020-11-11 08:48:43 +00:00
Mateusz Guzik	5c100123a3	thread: retire thread_find tdfind should be used instead.	2020-11-10 01:57:48 +00:00
Mateusz Guzik	94275e3e69	threads: remove the unused TID_BUFFER_SIZE macro	2020-11-10 01:31:06 +00:00
Mateusz Guzik	934e7e5ec9	thread: adds newer bits for r367537 The committed patch was an older version.	2020-11-10 01:13:58 +00:00
Mateusz Guzik	35bb59edc5	threads: reimplement tid allocation on top of a bitmap There are workloads with very bursty tid allocation and since unr tries very hard to have small-sized bitmaps it keeps reallocating memory. Just doing buildkernel gives almost 150k calls to free coming from unr. This also gets rid of the hack which tried to postpone TID reuse. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D27101	2020-11-09 23:05:28 +00:00
Mateusz Guzik	1bd3cf5de5	threads: introduce a limit for total number The intent is to replace the current id allocation method and a known upper bound will be useful. Reviewed by: kib (previous version), markj (previous version) Tested by: pho Differential Revision: https://reviews.freebsd.org/D27100	2020-11-09 23:04:30 +00:00
Edward Tomasz Napierala	1e2521ffae	Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead. Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26458	2020-09-27 18:47:06 +00:00
Mateusz Guzik	936c24faba	cred: add more asserts for td_realucred == td_ucred	2020-08-01 16:02:32 +00:00
Mateusz Guzik	7cd4443fb1	Short-circuit tdfind when looking for the calling thread. Common occurrence with cpuset and other places.	2020-07-18 00:14:43 +00:00
Mateusz Guzik	1724c563e6	cred: distribute reference count per thread This avoids dirtying creds in the common case, see the comment in kern_prot.c for details. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D24007	2020-06-09 23:03:48 +00:00
Mark Johnston	4ee964d6b6	Fix up i386 thread structure layout assertions after r360354. Reported by: Jenkins	2020-04-26 22:04:43 +00:00
Mark Johnston	f13fa9df05	Use a single VM object for kernel stacks. Previously we allocated a separate VM object for each kernel stack. However, fully constructed kernel stacks are cached by UMA, so there is no harm in using a single global object for all stacks. This reduces memory consumption and makes it easier to define a memory allocation policy for kernel stack pages, with the aim of reducing physical memory fragmentation. Add a global kstack_object, and use the stack KVA address to index into the object like we do with kernel_object. Reviewed by: kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24473	2020-04-26 20:08:57 +00:00
Rick Macklem	8de97f394e	Remove the old NFS lock device driver that uses Giant. This NFS lock device driver was replaced by the kernel NLM around FreeBSD7 and has not normally been used since then. To use it, the kernel had to be built without "options NFSLOCKD" and the nfslockd.ko had to be deleted as well. Since it uses Giant and is no longer used, this patch removes it. With this device driver removed, there is now a lot of unused code in the userland rpc.lockd. That will be removed on a future commit. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D22933	2020-04-09 14:44:46 +00:00
John Baldwin	59838c1a19	Retire procfs-based process debugging. Modern debuggers and process tracers use ptrace() rather than procfs for debugging. ptrace() has a superset of functionality available via procfs and new debugging features are only added to ptrace(). While the two debugging services share some fields in struct proc, they each use dedicated fields and separate code. This results in extra complexity to support a feature that hasn't been enabled in the default install for several years. PR: 244939 (exp-run) Reviewed by: kib, mjg (earlier version) Relnotes: yes Differential Revision: https://reviews.freebsd.org/D23837	2020-04-01 19:22:09 +00:00
Mark Johnston	5aa5420ff2	Ensure that arm64 thread structures are allocated from the direct map. Otherwise we can fail to handle translation faults on curthread, leading to a panic. Reviewed by: alc, rlibby Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23895	2020-02-29 18:41:48 +00:00
Konstantin Belousov	04869b812b	Add td_pflags2, yet another thread-private flags word. There are no more free bits in td_pflags. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-02-22 20:43:04 +00:00
Konstantin Belousov	146fc63fce	Add a way to manage thread signal mask using shared word, instead of syscall. A new syscall sigfastblock(2) is added which registers a uint32_t variable as containing the count of blocks for signal delivery. Its content is read by kernel on each syscall entry and on AST processing, non-zero count of blocks is interpreted same as the signal mask blocking all signals. The biggest downside of the feature that I see is that memory corruption that affects the registered fast sigblock location, would cause quite strange application misbehavior. For instance, the process would be immune to ^C (but killable by SIGKILL). With consumers (rtld and libthr added), benchmarks do not show a slow-down of the syscalls in micro-measurements, and macro benchmarks like buildworld do not demonstrate a difference. Part of the reason is that buildworld time is dominated by compiler, and clang already links to libthr. On the other hand, small utilities typically used by shell scripts have the total number of syscalls cut by half. The syscall is not exported from the stable libc version namespace on purpose. It is intended to be used only by our C runtime implementation internals. Tested by: pho Discussed with: cem, emaste, jilles Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D12773	2020-02-09 11:53:12 +00:00
Konstantin Belousov	300b525d29	Correct the function name in the comment. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2020-02-08 15:06:06 +00:00
Mateusz Guzik	b52d50cf69	vfs: prealloc vnodes in getnewvnode_reserve Having a reserved vnode count does not guarantee that getnewvnodes won't block later. Said blocking partially defeats the purpose of reserving in the first place. Preallocate instead. The only consumer was always passing "1" as count and never nesting reservations.	2020-01-11 22:58:14 +00:00
Konstantin Belousov	478ca4b004	Rename umtxq_check_susp() to thread_check_susp() and make it usable outside of kern_umtx.c. To be used in several future changes. Discussed with: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week	2020-01-02 22:13:59 +00:00
Jeff Roberson	686bcb5c14	schedlock 4/4 Don't hold the scheduler lock while doing context switches. Instead we unlock after selecting the new thread and switch within a spinlock section leaving interrupts and preemption disabled to prevent local concurrency. This means that mi_switch() is entered with the thread locked but returns without. This dramatically simplifies scheduler locking because we will not hold the schedlock while spinning on blocked lock in switch. This change has not been made to 4BSD but in principle it would be more straightforward. Discussed with: markj Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D22778	2019-12-15 21:26:50 +00:00
Jeff Roberson	61a74c5ccd	schedlock 1/4 Eliminate recursion from most thread_lock consumers. Return from sched_add() without the thread_lock held. This eliminates unnecessary atomics and lock word loads as well as reducing the hold time for scheduler locks. This will eventually allow for lockless remote adds. Discussed with: kib Reviewed by: jhb Tested by: pho Differential Revision: https://reviews.freebsd.org/D22626	2019-12-15 21:11:15 +00:00
Konstantin Belousov	5e921ff49e	amd64: move pcb out of kstack to struct thread. This saves 320 bytes of the precious stack space. The only negative aspect of the change I can think of is that the struct thread increased by 320 bytes obviously, and that 320 bytes are not swapped out anymore. I believe the freed stack space is much more important than that. Also, current struct thread size is 1392 bytes on amd64, so UMA will allocate two thread structures per (4KB) slab, which leaves a space for pcb without increasing zone memory use. Reviewed by: alc, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D22138	2019-10-25 20:09:42 +00:00
Gleb Smirnoff	279b9aabe3	Remove epoch tracker from struct thread. It was an ugly crutch to emulate locking semantics for if_addr_rlock() and if_maddr_rlock().	2019-10-21 18:19:32 +00:00
Gleb Smirnoff	f6eccf96a0	Since EPOCH_TRACE had been moved to opt_global.h, we don't need to waste extra space in struct thread.	2019-10-14 04:17:56 +00:00
Gleb Smirnoff	dd902d015a	Add debugging facility EPOCH_TRACE that checks that epochs entered are properly nested and warns about recursive entrances. Unlike with locks, there is nothing fundamentally wrong with such use, the intent of tracer is to help to review complex epoch-protected code paths, and we mean the network stack here. Reviewed by: hselasky Sponsored by: Netflix Pull Request: https://reviews.freebsd.org/D21610	2019-09-25 18:26:31 +00:00
John Baldwin	1af9474b26	Always set td_errno to the error value of a system call. Early errors prior to a system call did not set td_errno. This commit sets td_errno for all errors during syscallenter(). As a result, syscallret() can now always use td_errno without checking TDP_NERRNO. Reviewed by: kib MFC after: 1 month Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D20898	2019-07-15 21:16:01 +00:00
Konstantin Belousov	4d3b28bcdc	amd64 pmap: rework delayed invalidation, removing global mutex. For machines having cmpxchg16b instruction, i.e. everything but very early Athlons, provide lockless implementation of delayed invalidation. The implementation maintains lock-less single-linked list with the trick from the T.L. Harris article about volatile mark of the elements being removed. Double-CAS is used to atomically update both link and generation. New thread starting DI appends itself to the end of the queue, setting the generation to the generation of the last element +1. On DI finish, thread donates its generation to the previous element. The generation of the fake head of the list is the last passed DI generation. Basically, the implementation is a queued spinlock but without spinlock. Many thanks both to Peter Holm and Mark Johnson for keeping with me while I produced intermediate versions of the patch. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 month MFC note: td_md.md_invl_gen should go to the end of struct thread Differential revision: https://reviews.freebsd.org/D19630	2019-05-16 13:28:48 +00:00
John Baldwin	83bf5ec367	Remove p_code from struct proc. Contrary to the comments, it was never used by core dumps or debuggers. Instead, it used to hold the signal code of a pending signal, but that was replaced by the 'ksi_code' member of ksiginfo_t when signal information was reworked in 7.0. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D20047	2019-04-25 18:42:07 +00:00
Konstantin Belousov	6f1fe3305a	amd64: Add md process flags and first P_MD_PTI flag. PTI mode for the process pmap on exec is activated iff P_MD_PTI is set. On exec, the existing vmspace can be reused only if pti mode of the pmap matches the P_MD_PTI flag of the process. Add MD cpu_exec_vmspace_reuse() callback for exec_new_vmspace() which can veto reuse of the existing vmspace. MFC note: md_flags change struct proc KBI. Reviewed by: jhb, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D19514	2019-03-16 11:31:01 +00:00
Andrew Turner	be860eae0f	Fix the check for the offset of td_frame and td_emuldata in struct thread. Pointy hat: andrew Sponsored by: DARPA, AFRL	2019-01-12 20:41:57 +00:00
Andrew Turner	b3c0d957a2	Add support for the Clang Coverage Sanitizer in the kernel (KCOV). When building with KCOV enabled the compiler will insert function calls to probes allowing us to trace the execution of the kernel from userspace. These probes are on function entry (trace-pc) and on comparison operations (trace-cmp). Userspace can enable the use of these probes on a single kernel thread with an ioctl interface. It can allocate space for the probe with KIOSETBUFSIZE, then mmap the allocated buffer and enable tracing with KIOENABLE, with the trace mode being passed in as the int argument. When complete KIODISABLE is used to disable tracing. The first item in the buffer is the number of trace events that have happened. Userspace can write 0 to this to reset the tracing, and is expected to do so on first use. The format of the buffer depends on the trace mode. When in PC tracing just the return address of the probe is stored. Under comparison tracing the comparison type, the two arguments, and the return address are traced. The former method uses one entry per trace event, while the latter uses 4. As such they are incompatible so only a single mode may be enabled. KCOV is expected to help fuzzing the kernel, and while in development has already found a number of issues. It is required for the syzkaller system call fuzzer [1]. Other kernel fuzzers could also make use of it, either with the current interface, or by extending it with new modes. A man page is currently being worked on and is expected to be committed soon, however having the code in the kernel now is useful for other developers to use. [1] https://github.com/google/syzkaller Submitted by: Mitchell Horne <mhorne063@gmail.com> (Earlier version) Reviewed by: kib Testing by: tuexen Sponsored by: DARPA, AFRL Sponsored by: The FreeBSD Foundation (Mitchell Horne) Differential Revision: https://reviews.freebsd.org/D14599	2019-01-12 11:21:28 +00:00
Konstantin Belousov	94dd54b9a2	Free bootstacks after AP startup. Bootstacks are unused after APs executed sched_throw() in init_secondary_tail() and started executing on proper idle thread stack. Add sysinit that detects that the idle thread for each CPU was scheduled at least once, and free corresponding bootstack. Slight addition of the code (~200 bytes) is compensated by the saving, because even on typical small modern desktop CPU we leak 128K of memory otherwise (4 pages x 8 threads). Reviewed by: jhb MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18486	2018-12-11 02:54:36 +00:00
Konstantin Belousov	f5cf758998	Provide storage for the process feature control flags in struct proc. The flags are cleared on exec, it is up to the image activator to set them. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2018-11-23 23:07:57 +00:00

414 Commits