freebsd-skq/sys
kevans 857a4b0ab0 kern: cpuset: properly rebase when attaching to a jail
The current logic is a fine choice for a system administrator modifying
process cpusets or a process creating a new cpuset(2), but not ideal for
processes attaching to a jail.

Currently, when a process attaches to a jail, it does exactly what any other
process does and loses any mask it might have applied in the process of
doing so because cpuset_setproc() is entirely based around the assumption
that non-anonymous cpusets in the process can be replaced with the new
parent set.

This approach slightly improves the jail attach integration by modifying
cpuset_setproc() callers to indicate if they should rebase their cpuset to
the indicated set or not (i.e. cpuset_setproc_update_set).

If we're rebasing and the process currently has a cpuset assigned that is
not the containing jail's root set, then we will now create a new base set
for it hanging off the jail's root with the existing mask applied instead of
using the jail's root set as the new base set.
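
The rebasing rule above can be sketched as a small user-space model. This is
only an illustration under stated assumptions, not the kernel code: struct
toy_set, toy_rebase() and all of their parameters are hypothetical names, a CPU
mask is modelled as a 64-bit word, and "existing mask applied" is assumed to
mean intersecting the old mask with the new jail root's mask. Locking,
reference counting and error handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a cpuset; this is not the kernel's struct cpuset. */
struct toy_set {
	int		 id;		/* set number, as in the diagrams */
	uint64_t	 mask;		/* one bit per CPU */
	struct toy_set	*parent;	/* NULL for a root set */
};

/*
 * Choose the base set for a process that is attaching to a jail.
 *
 * cur      - the process's current base set
 * cur_root - root set of the jail the process is currently in
 * new_root - root set of the jail being attached to
 * spare    - caller-provided storage, used only if a new set is needed
 * new_id   - number to give the new set (assumption: caller picks it)
 */
struct toy_set *
toy_rebase(struct toy_set *cur, struct toy_set *cur_root,
    struct toy_set *new_root, struct toy_set *spare, int new_id)
{
	assert(cur != NULL && cur_root != NULL);
	assert(new_root != NULL && spare != NULL);

	/* No set of its own: the new jail's root becomes the base set. */
	if (cur == cur_root)
		return (new_root);

	/* Otherwise hang a new set off the new root, keeping the old mask. */
	spare->id = new_id;
	spare->mask = cur->mask & new_root->mask;
	spare->parent = new_root;
	return (spare);
}
```

With the numbering used in the diagrams below, a process in set 3 would end up
in a new set 5 whose parent is set 4, while a process in set 1 would simply
move to set 4.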

Note that the common case will be that the process doesn't have a cpuset
within the jail root, but the system root can freely assign a cpuset from
a jail to a process outside of the jail with no restriction. We assume that
that may have happened or that it could happen due to a race when we drop
the proc lock, so we must recheck both within the loop to gather up
sufficient freed cpusets and after the loop.

To recap, here's how it worked before in all cases:

0     4 <-- jail              0      4 <-- jail / process
|                             |
1                 ->          1
|
3 <-- process

Here's how it works now:

0     4 <-- jail             0       4 <-- jail
|                            |       |
1                 ->         1       5 <-- process
|
3 <-- process

or

0     4 <-- jail             0       4 <-- jail / process
|                            |
1 <-- process     ->         1

More importantly, in both cases, the attaching process either retains the
mask it had prior to attaching, or the attach fails with EDEADLK if the
process would be left with no CPUs to run on or the domain policy is
incompatible. The author of this patch considers this almost a security
feature: a MAC policy could grant PRIV_JAIL_ATTACH to an unprivileged user
who is restricted to some subset of the available CPUs, and without this
change, attaching to a jail with a wider mask might lift that user's
restrictions.
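
The mask check described above can be illustrated with a minimal sketch. It
assumes that the process's effective mask after attaching is the intersection
of its prior mask and the jail's mask; toy_attach_mask() and its parameters are
hypothetical, masks are modelled as 64-bit words, and the domain policy check
is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: compute the mask a process
 * would run with after attaching to a jail.  Returns 0 and stores the
 * result in *out on success, or EDEADLK if no CPU would be left.
 */
int
toy_attach_mask(uint64_t proc_mask, uint64_t jail_mask, uint64_t *out)
{
	uint64_t m;

	assert(out != NULL);

	/* Assumption: the prior mask is kept, limited to the jail's CPUs. */
	m = proc_mask & jail_mask;
	if (m == 0)
		return (EDEADLK);	/* nothing left to run on */

	*out = m;	/* *out is written only on success */
	return (0);
}
```

In this model, a user restricted to CPU 1 (mask 0x2) who attaches to a jail
allowing CPUs 0-3 (mask 0xF) keeps mask 0x2 rather than gaining the wider
mask, and a process whose mask does not overlap the jail's fails to attach.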

In most cases, it's anticipated that admins will use this to run, for
example, `cpuset -c -l 1 jail -c path=/ command=/long/running/cmd`,
and avoid the need for contortions to spawn a command inside a jail with a
more limited cpuset than the jail.

Reviewed by:	jamie
MFC after:	1 month (maybe)
Differential Revision:	https://reviews.freebsd.org/D27298
2020-11-25 03:14:25 +00:00
amd64 Pull the check for VM ownership into ppt_find(). 2020-11-24 23:56:33 +00:00
arm arm: Remove old amlogic support 2020-11-24 17:51:10 +00:00
arm64 arm64: Check if we have a map before checking the flags 2020-11-24 14:05:35 +00:00
bsm bsm: add AUE_CLOSERANGE 2020-04-24 01:27:25 +00:00
cam Do not truncate the last character from serial number. 2020-11-24 21:14:36 +00:00
cddl [cddl] Fix lz4 function definitions to not trip up compile. 2020-11-17 17:11:07 +00:00
compat Linuxolator: Replace use of eventhandlers by sysent hooks. 2020-11-23 18:18:16 +00:00
conf Port rtsx(4) driver for Realtek SD card reader from OpenBSD. 2020-11-24 21:28:44 +00:00
contrib Adjust ENA driver files to latest ena-com changes 2020-11-18 14:59:22 +00:00
crypto Check cipher key lengths during probesession. 2020-11-05 23:31:58 +00:00
ddb db_search_symbol: prevent pollution from bogus symbols 2020-10-26 16:42:53 +00:00
dev Remove more legacy of parallel SCSI. 2020-11-24 22:43:27 +00:00
dts Brand our DTS with the Linux version it was imported from 2020-10-10 07:18:51 +00:00
fs msdosfs: suspend around unmount or remount rw->ro. 2020-11-20 15:19:30 +00:00
gdb gdb(4): Don't escape GDB special characters at application layer 2020-09-30 14:55:54 +00:00
geom gbde: replace malloc_last_fail with a kludge 2020-11-12 20:20:57 +00:00
gnu Brand our DTS with the Linux version it was imported from 2020-10-10 07:18:51 +00:00
i386 Port rtsx(4) driver for Realtek SD card reader from OpenBSD. 2020-11-24 21:28:44 +00:00
isa
kern kern: cpuset: properly rebase when attaching to a jail 2020-11-25 03:14:25 +00:00
kgssapi State kgssapi dependency on xdr. 2020-09-17 22:29:38 +00:00
libkern arc4random(9): Integrate with RANDOM_FENESTRASX push-reseed 2020-10-10 21:48:06 +00:00
mips Fix octeon_pmc post-r334827 2020-11-18 17:37:01 +00:00
modules Port rtsx(4) driver for Realtek SD card reader from OpenBSD. 2020-11-24 21:28:44 +00:00
net Refactor rib iterator functions. 2020-11-22 20:21:10 +00:00
net80211 net80211: fix a typo 2020-11-04 12:07:33 +00:00
netgraph ng_nat: unbreak ABI 2020-11-10 02:26:44 +00:00
netinet Fix two occurrences of a typo in a comment introduced in r367530. 2020-11-23 10:13:56 +00:00
netinet6 Refactor rib iterator functions. 2020-11-22 20:21:10 +00:00
netipsec Trigger soft lifetime expiration on sequence number 2020-10-16 11:27:01 +00:00
netpfil pf: Make tag hashing more robust 2020-11-24 16:18:47 +00:00
netsmb net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
nfs nfs: clean up empty lines in .c and .h files 2020-09-01 21:25:39 +00:00
nfsclient nfs: clean up empty lines in .c and .h files 2020-09-01 21:25:39 +00:00
nfsserver nfs: Mark unused statistics variable as reserved 2020-11-18 04:35:49 +00:00
nlm nlm: clean up empty lines in .c and .h files 2020-09-01 22:14:52 +00:00
ofed Fix for referencing file via its vnode in ibcore. 2020-11-02 10:44:29 +00:00
opencrypto Remove the cloned file descriptors for /dev/crypto. 2020-11-25 00:10:54 +00:00
powerpc [POWERPC] print uprintf_signal 'type' field in hex 2020-11-20 18:52:37 +00:00
riscv riscv: always initialize the static kernel environment 2020-11-20 15:21:10 +00:00
rpc Fix a potential memory leak in the NFS over TLS handling code. 2020-09-05 00:50:52 +00:00
security pipe: allow for lockless pipe_stat 2020-11-19 06:30:25 +00:00
sys Remove the cloned file descriptors for /dev/crypto. 2020-11-25 00:10:54 +00:00
teken Do a sweep and remove most WARNS=6 settings 2020-10-01 01:10:51 +00:00
tests Add small tool to invoke kernel test framework tests. 2020-09-02 09:20:40 +00:00
tools Brand our DTS with the Linux version it was imported from 2020-10-10 07:18:51 +00:00
ufs Handle LoR in flush_pagedep_deps(). 2020-11-14 05:30:10 +00:00
vm Wrap a long line in vm_pqbatch_process_page() 2020-11-19 15:41:42 +00:00
x86 Add device_t member to struct iommu. 2020-11-16 15:29:52 +00:00
xdr xdr: clean up empty lines in .c and .h files 2020-09-01 22:13:28 +00:00
xen xen: clean up empty lines in .c and .h files 2020-09-01 21:21:55 +00:00
Makefile