Commit Graph

447 Commits

Author SHA1 Message Date
Brooks Davis
5c274b3622 whitespace: rewrap to match case directly above
It's easier to visually diff the two case blocks if there aren't
gratutious whitespace differences.

Sponsored by:	DARPA
2023-02-03 00:37:31 +00:00
Mark Johnston
2c10be9e06 arm64: Handle translation faults for thread structures
The break-before-make requirement poses a problem when promoting or
demoting mappings containing thread structures: a CPU may raise a
translation fault while accessing curthread, and data_abort() accesses
the thread again before pmap_fault() can translate the address and
return.

Normally this isn't a problem because we have a hack to ensure that
slabs used by the thread zone are always accessed via the direct map,
where promotions and demotions are rare.  However, this hack doesn't
work properly with UMA_MD_SMALL_ALLOC disabled, as is the case with
KASAN configured (since our KASAN implementation does not shadow the
direct map and so tries to force the use of the kernel map wherever
possible).

Fix the problem by modifying data_abort() to handle translation faults
in the kernel map without dereferencing "td", i.e., curthread, and
without enabling interrupts.  pmap_klookup() has special handling for
translation faults which makes it safe to call in this context.  Then,
revert the aforementioned hack.

Reviewed by:	kevans, alc, kib, andrew
MFC after:	1 month
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D37231
2022-11-02 13:46:25 -04:00
Konstantin Belousov
f829268bcc Remove TDF_DOING_SA
We cannot see a thread with the flag set in unsuspend, after we stopped
doing SINGLE_ALLPROC from user processes.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:34:30 +03:00
Konstantin Belousov
5e5675cb4b Remove struct proc p_singlethr member
It does not serve any purpose after we stopped doing
thread_single(SINGLE_ALLPROC) from stoppable user processes.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:34:30 +03:00
Konstantin Belousov
cc29f221aa ksiginfo_alloc(): change to directly take M_WAITOK/NOWAIT flags
Also style, and remove unneeded cast.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:33:17 +03:00
Konstantin Belousov
c6d31b8306 AST: rework
Make most AST handlers dynamically registered.  This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it.  For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.

Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit.  For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.

The dynamic registration also allows third-party modules to register AST
handlers if needed.  There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it.  In fact, this
is already present behavior for hwpmc.ko and ufs.ko.  I do not think it
is worth the efforts and the runtime overhead to try to fix it.

Reviewed by:	markj
Tested by:	emaste (arm64), pho
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35888
2022-08-02 21:11:09 +03:00
Cy Schubert
d781401512 kern_thread.c: Fix i386 build
Chase 4493a13e3b by updating static
assertions of struct proc.
2022-06-13 19:35:33 -07:00
Mark Johnston
2d5ef216b6 thread_single_end(): consistently maintain p_boundary_count for ALLPROC mode
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 week
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
1b4701fe1e thread_unsuspend(): do not unuspend the suspended leader thread doing SINGLE_ALLPROC
markj wrote:
tdsendsignal() may unsuspend a target thread. I think there is at least
one bug there: suppose thread T is suspended in
thread_single(SINGLE_ALLPROC) when trying to kill another process with
REAP_KILL. Suppose a different thread sends SIGKILL to T->td_proc. Then,
tdsendsignal() calls thread_unsuspend(T, T->td_proc). thread_unsuspend()
incorrectly decrements T->td_proc->p_suspcount to -1.

Later, when T->td_proc exits, it will wait forever in
thread_single(SINGLE_EXIT) since T->td_proc->p_suspcount never reaches 1.

Since the thread suspension is bounded by time needed to do
thread_single(), skipping the thread_unsuspend_one() call there should
not affect signal delivery if this thread is selected as target.

Reported by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
b9009b1789 thread_single(): remove already checked conditional expression
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
4493a13e3b Do not single-thread itself when the process single-threaded some another process
Since both self single-threading and remote single-threading rely on
suspending the thread doing thread_single(), it cannot be mixed: thread
doing thread_suspend_switch() might be subject to thread_suspend_one()
and vice versa.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
dd883e9a7e weed_inhib(): correct the condition to re-suspend a thread
suspended for SINGLE_ALLPROC mode.  There is no need to check for
boundary state.  It is only required to see that the suspension comes
from the ALLPROC mode.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
b9893b3533 weed_inhib(): do not double-suspend already suspended thread if the loop reiterates
In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:02 +03:00
Konstantin Belousov
d7a9e6e740 thread_single: wait for P_STOPPED_SINGLE to pass
to avoid ALLPROC mode to try to race with any other single-threading
mode.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:02 +03:00
Mateusz Guzik
29ee49f66b thread: remove dead store from thread_cow_update 2022-02-13 13:07:08 +00:00
Mateusz Guzik
32114b639f Add PROC_COW_CHANGECOUNT and thread_cow_synced
Combined they can be used to avoid a proc lock/unlock cycle in the
syscall handler for curthread, see upcoming examples.
2022-02-11 11:44:07 +00:00
Mateusz Guzik
8a0cb04df4 Add lim_cowsync, similar to crcowsync 2022-02-11 11:44:07 +00:00
Konstantin Belousov
4d675b80f0 i386: fix struct proc layout asserts after 351d5f7fc5
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-28 21:56:21 +03:00
Konstantin Belousov
351d5f7fc5 exec: store parent directory and hardlink name of the binary in struct proc
While doing it, also move all the code to resolve pathnames and obtain
text vp and dvp, into single place.   Besides simplifying the code, it
avoids spurious vnode relocks and validates the explanation why
a transient text reference on the script vnode is not harmful.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32611
2021-10-28 20:49:56 +03:00
Konstantin Belousov
bd9e0f5df6 amd64: eliminate td_md.md_fpu_scratch
For signal send, copyout from the user FPU save area directly.

For sigreturn, we are in sleepable context and can do temporal
allocation of the transient save area.  We cannot copying from userspace
directly to user save area because XSAVE state needs to be validated,
also partial copyins can corrupt it.

Requested by:	jhb
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
df8dd6025a amd64: stop using top of the thread' kernel stack for FPU user save area
Instead do one more allocation at the thread creation time.  This frees
a lot of space on the stack.

Also do not use alloca() for temporal storage in signal delivery sendsig()
function and signal return syscall sys_sigreturn().  This saves equal
amount of space, again by the cost of one more allocation at the thread
creation time.

A useful experiment now would be to reduce KSTACK_PAGES.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
f575573ca5 Remove PT_GET_SC_ARGS_ALL
Reimplement bdf0f24bb1 by checking for the caller' ABI in
the implementation of PT_GET_SC_ARGS, and copying out everything if
it is Linuxolator.

Also fix a minor information leak: if PT_GET_SC_ARGS_ALL is done on the
thread reused after other process, it allows to read some number of that
thread last syscall arguments. Clear td_sa.args in thread_alloc().

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31968
2021-09-16 20:11:27 +03:00
Mark Johnston
5dda15adbc kern: Ensure that thread-local KMSAN state is available
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Mark Johnston
a422084abb Add the KMSAN runtime
KMSAN enables the use of LLVM's MemorySanitizer in the kernel.  This
enables precise detection of uses of uninitialized memory.  As with
KASAN, this feature has substantial runtime overhead and is intended to
be used as part of some automated testing regime.

The runtime maintains a pair of shadow maps.  One is used to track the
state of memory in the kernel map at bit-granularity: a bit in the
kernel map is initialized when the corresponding shadow bit is clear,
and is uninitialized otherwise.  The second shadow map stores
information about the origin of uninitialized regions of the kernel map,
simplifying debugging.

KMSAN relies on being able to intercept certain functions which cannot
be instrumented by the compiler.  KMSAN thus implements interceptors
which manually update shadow state and in some cases explicitly check
for uninitialized bytes.  For instance, all calls to copyout() are
subject to such checks.

The runtime exports several functions which can be used to verify the
shadow map for a given buffer.  Helpers provide the same functionality
for a few structures commonly used for I/O, such as CAM CCBs, BIOs and
mbufs.  These are handy when debugging a KMSAN report whose
proximate and root causes are far away from each other.

Obtained from:	NetBSD
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Dmitry Chagin
af29f39958 umtx: Split umtx.h on two counterparts.
To prevent umtx.h polluting by future changes split it on two headers:
umtx.h - ABI header for userspace;
umtxvar.h - the kernel staff.

While here fix umtx_key_match style.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31248
MFC after:		2 weeks
2021-07-29 12:41:29 +03:00
David Chisnall
cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00
Dmitry Chagin
5d9f790191 Eliminate p_elf_machine from struct proc.
Instead of p_elf_machine use machine member of the Elf_Brandinfo which is now
cached in the struct proc at p_elf_brandinfo member.

Note to MFC: D30918, KBI

Reviewed by:		kib, markj
Differential Revision:	https://reviews.freebsd.org/D30926
MFC after:		2 weeks
2021-06-29 20:18:29 +03:00
Dmitry Chagin
615f22b2fb Add a link to the Elf_Brandinfo into the struc proc.
To allow the ABI to make a dicision based on the Brandinfo add a link
to the Elf_Brandinfo into the struct proc. Add a note that the high 8 bits
of Elf_Brandinfo flags is private to the ABI.

Note to MFC: it breaks KBI.

Reviewed by:		kib, markj
Differential Revision:	https://reviews.freebsd.org/D30918
MFC after:		2 weeks
2021-06-29 20:15:08 +03:00
Konstantin Belousov
d3f7975fcb thread_reap_barrier(): remove unused variable
Noted by:	alc
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-05-31 23:03:42 +03:00
Konstantin Belousov
f62c7e54e9 Add thread_reap_barrier()
Reviewed by:	hselasky,markj
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30468
2021-05-31 18:09:22 +03:00
Konstantin Belousov
845d77974b kern_thread.c: wrap too long lines
Reviewed by:	hselasky, markj
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30468
2021-05-31 18:09:22 +03:00
Konstantin Belousov
1762f674cc ktrace: pack all ktrace parameters into allocated structure ktr_io_params
Ref-count the ktr_io_params structure instead of vnode/cred.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30257
2021-05-22 15:16:08 +03:00
Konstantin Belousov
af928fded0 Add thread_run_flash() helper
It unsuspends single suspended thread, passed as the argument.
It is up to the caller to arrange the target thread to suspend later,
since the state of the process is not changed from stopped.  In particular,
the unsuspended thread must not leave to userspace, since boundary code
is not prepared to this situation.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:13:47 +03:00
Alex Richardson
fa2528ac64 Use atomic loads/stores when updating td->td_state
KCSAN complains about racy accesses in the locking code. Those races are
fine since they are inside a TD_SET_RUNNING() loop that expects the value
to be changed by another CPU.

Use relaxed atomic stores/loads to indicate that this variable can be
written/read by multiple CPUs at the same time. This will also prevent
the compiler from doing unexpected re-ordering.

Reported by:	GENERIC-KCSAN
Test Plan:	KCSAN no longer complains, kernel still runs fine.
Reviewed By:	markj, mjg (earlier version)
Differential Revision: https://reviews.freebsd.org/D28569
2021-02-18 14:02:48 +00:00
Mateusz Guzik
b83e94be53 thread: staticize thread_reap and move td_allocdomain
thread_init is a much better fit as the the value is constant after
initialization.
2020-11-26 06:59:27 +00:00
Mateusz Guzik
598f2b8116 dtrace: stop using eventhandlers for the part compiled into the kernel
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D27311
2020-11-23 18:27:21 +00:00
Mateusz Guzik
a9568cd2bc thread: stash domain id to work around vtophys problems on ppc64
Adding to zombie list can be perfomed by idle threads, which on ppc64 leads to
panics as it requires a sleepable lock.

Reported by:	alfredo
Reviewed by:	kib, markj
Fixes:	r367842 ("thread: numa-aware zombie reaping")
Differential Revision:	https://reviews.freebsd.org/D27288
2020-11-23 18:26:47 +00:00
Mateusz Guzik
d116b9f1ad thread: numa-aware zombie reaping
The current global list is a significant problem, in particular induces a lot
of cross-domain thread frees. When running poudriere on a 2 domain box about
half of all frees were of that nature.

Patch below introduces per-domain thread data containing zombie lists and
domain-aware reaping. By default it only reaps from the current domain, only
reaping from others if there is free TID shortage.

A dedicated callout is introduced to reap lingering threads if there happens
to be no activity.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D27185
2020-11-19 10:00:48 +00:00
Conrad Meyer
85078b8573 Split out cwd/root/jail, cmask state from filedesc table
No functional change intended.

Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).

__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.

Reviewed by:	kib
Discussed with:	markj, mjg
Differential Revision:	https://reviews.freebsd.org/D27037
2020-11-17 21:14:13 +00:00
Mateusz Guzik
19d3e47dca select: call seltdfini on process and thread exit
Since thread_zone is marked NOFREE the thread_fini callback is never
executed, meaning memory allocated by seltdinit is never released.

Adding the call to thread_dtor is not sufficient as exiting processes
cache the main thread.
2020-11-16 03:12:21 +00:00
Mateusz Guzik
f34a2f56c3 thread: batch credential freeing 2020-11-14 19:22:02 +00:00
Mateusz Guzik
fb8ab68084 thread: batch resource limit free calls 2020-11-14 19:21:46 +00:00
Mateusz Guzik
5ef7b7a0f3 thread: rework tid batch to use helpers 2020-11-14 19:20:58 +00:00
Mateusz Guzik
d1ca25be49 thread: pad tid lock
On a kernel with other changes this bumps 104-way thread creation/destruction
from 0.96 mln ops/s to 1.1 mln ops/s.
2020-11-14 19:19:27 +00:00
Mateusz Guzik
62dbc992ad thread: move nthread management out of tid_alloc
While this adds more work single-threaded, it also enables SMP-related
speed ups.
2020-11-12 00:29:23 +00:00
Mateusz Guzik
755341df4f thread: batch tid_free calls in thread_reap
This eliminates the highly pessimal pattern of relocking from multiple
CPUs in quick succession. Note this is still globally serialized.
2020-11-11 18:45:06 +00:00
Mateusz Guzik
c5315f5196 thread: lockless zombie list manipulation
This gets rid of the most contended spinlock seen when creating/destroying
threads in a loop. (modulo kstack)

Tested by:	alfredo (ppc64), bdragon (ppc64)
2020-11-11 18:43:51 +00:00
Mateusz Guzik
26007fe37c thread: add more fine-grained tidhash locking
Note this still does not scale but is enough to move it out of the way
for the foreseable future.

In particular a trivial benchmark spawning/killing threads stops contesting
on tidhash.
2020-11-11 08:51:04 +00:00
Mateusz Guzik
aae3547be3 thread: rework tidhash vs proc lock interaction
Apart from minor clean up this gets rid of proc unlock/lock cycle on thread
exit to work around LOR against tidhash lock.
2020-11-11 08:50:04 +00:00
Mateusz Guzik
cf31cadeb6 thread: fix thread0 tid allocation
Startup code hardcodes the value instead of allocating it.
The first spawned thread would then be a duplicate.

Pointy hat:	mjg
2020-11-11 08:48:43 +00:00