Commit Graph

17979 Commits

Author SHA1 Message Date
Mateusz Guzik
d48c2b8d29 vfs: correctly predict last fdrop on failed open
Arguably since the count is guaranteed to be 1 the code should be modified
to avoid the work.
2020-12-13 21:28:15 +00:00
Konstantin Belousov
203affb291 Fix TDP_WAKEUP/thr_wake(curthread->td_tid) after r366428.
Reported by:	arichardson
Reviewed by:	arichardson, markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27597
2020-12-13 19:45:42 +00:00
Konstantin Belousov
0b459854bc Correct indent.
Sponsored by:	The FreeBSD Foundation
2020-12-13 19:43:45 +00:00
Mateusz Guzik
edcdcefb88 fd: fix fdrop prediction when closing a fd
Most of the time this is the last reference, contrary to typical fdrop use.
2020-12-13 18:06:24 +00:00
Ryan Libby
d3bbf8af68 cache_fplookup: quiet gcc -Wreturn-type
Reviewed by:	markj, mjg
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D27555
2020-12-11 22:51:44 +00:00
Mateusz Guzik
0ecce93dca fd: make serialization in fdescfree_fds conditional on hold count
p_fd nullification in fdescfree serializes against new threads transitioning
the count 1 -> 2, meaning that fdescfree_fds observing the count of 1 can
safely assume there is nobody else using the table. Losing the race and
observing > 1 is harmless.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27522
2020-12-10 17:17:22 +00:00
Mark Johnston
3309fa7403 Plug a race between fd table teardown and several loops
To export information from fd tables we have several loops which do
this:

FILDESC_SLOCK(fdp);
for (i = 0; fdp->fd_refcount > 0 && i <= lastfile; i++)
	<export info for fd i>;
FILDESC_SUNLOCK(fdp);

Before r367777, fdescfree() acquired the fd table exclusive lock between
decrementing fdp->fd_refcount and freeing table entries.  This
serialized with the loop above, so the file at descriptor i would remain
valid until the lock is dropped.  Now there is no serialization, so the
loops may race with teardown of file descriptor tables.

Acquire the exclusive fdtable lock after releasing the final table
reference to provide a barrier synchronizing with these loops.

Reported by:	pho
Reviewed by:	kib (previous version), mjg
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27513
2020-12-09 14:05:08 +00:00
Mark Johnston
4c1c90ea95 Use refcount_load(9) to load fd table reference counts
No functional change intended.

Reviewed by:	kib, mjg
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27512
2020-12-09 14:04:54 +00:00
Kyle Evans
f1b18a668d cpuset_set{affinity,domain}: do not allow empty masks
cpuset_modify() would not currently catch this, because it only checks that
the new mask is a subset of the root set and circumvents the EDEADLK check
in cpuset_testupdate().

This change both directly validates the mask coming in since we can
trivially detect an empty mask, and it updates cpuset_testupdate to catch
stuff like this going forward by always ensuring we don't end up with an
empty mask.

The check_mask argument has been renamed because the 'check' verbiage does
not imply to me that it's actually doing a different operation. We're either
augmenting the existing mask, or we are replacing it entirely.

Reported by:	syzbot+4e3b1009de98d2fabcda@syzkaller.appspotmail.com
Discussed with:	andrew
Reviewed by:	andrew, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27511
2020-12-08 18:47:22 +00:00
Kyle Evans
b2780e8537 kern: cpuset: resolve race between cpuset_lookup/cpuset_rel
The race plays out like so between threads A and B:

1. A ref's cpuset 10
2. B does a lookup of cpuset 10, grabs the cpuset lock and searches
   cpuset_ids
3. A rel's cpuset 10 and observes the last ref, waits on the cpuset lock
   while B is still searching and not yet ref'd
4. B ref's cpuset 10 and drops the cpuset lock
5. A proceeds to free the cpuset out from underneath B

Resolve the race by only releasing the last reference under the cpuset lock.
Thread A now picks up the spinlock and observes that the cpuset has been
revived, returning immediately for B to deal with later.

Reported by:	syzbot+92dff413e201164c796b@syzkaller.appspotmail.com
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27498
2020-12-08 18:45:47 +00:00
Kyle Evans
9c83dab96c kern: cpuset: plug a unr leak
cpuset_rel_defer() is supposed to be functionally equivalent to
cpuset_rel() but with anything that might sleep deferred until
cpuset_rel_complete -- this setup is used specifically for cpuset_setproc.

Add in the missing unr free to match cpuset_rel. This fixes a leak that
was observed when I wrote a small userland application to try and debug
another issue, which effectively did:

cpuset(&newid);
cpuset(&scratch);

newid gets leaked when scratch is created; it's off the list, so there's
no mechanism for anything else to relinquish it. A more realistic reproducer
would likely be a process that inherits some cpuset that it's the only ref
for, but it creates a new one to modify. Alternatively, administratively
reassigning a process' cpuset that it's the last ref for will have the same
effect.

Discovered through D27498.

MFC after:	1 week
2020-12-08 18:44:06 +00:00
Mateusz Guzik
8fcfd0e222 vfs: add cleanup on error missed in r368375
Noted by:	jrtc27
2020-12-06 19:24:38 +00:00
Mateusz Guzik
60e2a0d9a4 vfs: factor buffer allocation/copyin out of namei 2020-12-06 04:59:24 +00:00
Mateusz Guzik
0c23d26230 vfs: keep bad ops on vnode reclaim
They were only modified to accomodate a redundant assertion.

This runs into problems as lockless lookup can still try to use the vnode
and crash instead of getting an error.

The bug was only present in kernels with INVARIANTS.

Reported by:	kevans
2020-12-05 05:56:23 +00:00
Konstantin Belousov
be2535b0a6 Add kern_ntp_adjtime(9).
Reviewed by:	brooks, cy
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D27471
2020-12-04 18:56:44 +00:00
Kyle Evans
34af05ead3 kern: soclose: don't sleep on SO_LINGER w/ timeout=0
This is a valid scenario that's handled in the various protocol layers where
it makes sense (e.g., tcp_disconnect and sctp_disconnect). Given that it
indicates we should immediately drop the connection, it makes little sense
to sleep on it.

This could lead to panics with INVARIANTS. On non-INVARIANTS kernels, this
could result in the thread hanging until a signal interrupts it if the
protocol does not mark the socket as disconnected for whatever reason.

Reported by:	syzbot+e625d92c1dd74e402c81@syzkaller.appspotmail.com
Reviewed by:	glebius, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27407
2020-12-04 04:39:48 +00:00
Mark Johnston
b957b18594 Always use 64-bit physical addresses for dump_avail[] in minidumps
As of r365978, minidumps include a copy of dump_avail[].  This is an
array of vm_paddr_t ranges.  libkvm walks the array assuming that
sizeof(vm_paddr_t) is equal to the platform "word size", but that's not
correct on some platforms.  For instance, i386 uses a 64-bit vm_paddr_t.

Fix the problem by always dumping 64-bit addresses.  On platforms where
vm_paddr_t is 32 bits wide, namely arm and mips (sometimes), translate
dump_avail[] to an array of uint64_t ranges.  With this change, libkvm
no longer needs to maintain a notion of the target word size, so get rid
of it.

This is a no-op on platforms where sizeof(vm_paddr_t) == 8.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27082
2020-12-03 17:12:31 +00:00
Oleksandr Tymoshenko
18ce865a4f Add support for hw.physmem tunable for ARM/ARM64/RISC-V platforms
hw.physmem tunable allows to limit number of physical memory available to the
system. It's handled in machdep files for x86 and PowerPC. This patch adds
required logic to the consolidated physmem management interface that is used by
ARM, ARM64, and RISC-V.

Submitted by:	Klara, Inc.
Reviewed by:	mhorne
Sponsored by:	Ampere Computing
Differential Revision:	https://reviews.freebsd.org/D27152
2020-12-03 05:39:27 +00:00
Mateusz Guzik
10e64782ed select: make sure there are no wakeup attempts after selfdfree returns
Prior to the patch returning selfdfree could still be racing against doselwakeup
which set sf_si = NULL and now locks stp to wake up the other thread.

A sufficiently unlucky pair can end up going all the way down to freeing
select-related structures before the lock/wakeup/unlock finishes.

This started manifesting itself as crashes since select data started getting
freed in r367714.
2020-12-02 00:48:15 +00:00
Konstantin Belousov
6814c2dac5 lio_listio(2): send signal even if number of jobs is zero.
Right now, if lio registered zero jobs, syscall frees lio job
structure, cleaning up queued ksi.  As result, the realtime signal is
dequeued and never delivered.

Fix it by allowing sendsig() to copy ksi when job count is zero.

PR: 220398
Reported and reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D27421
2020-12-01 22:53:33 +00:00
Konstantin Belousov
2933165666 vfs_aio.c: style.
Mostly re-wrap conditions to split after binary ops.

Reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D27421
2020-12-01 22:46:51 +00:00
Konstantin Belousov
5c5005ec20 vfs_aio.c: correct comment.
Reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D27421
2020-12-01 22:30:32 +00:00
Mark Johnston
dad22308a1 vmem: Revert r364744
A pair of bugs are believed to have caused the hangs described in the
commit log message for r364744:

1. uma_reclaim() could trigger reclamation of the reserve of boundary
   tags used to avoid deadlock.  This was fixed by r366840.
2. The loop in vmem_xalloc() would in some cases try to allocate more
   boundary tags than the expected upper bound of BT_MAXALLOC.  The
   reserve is sized based on the value BT_MAXMALLOC, so this behaviour
   could deplete the reserve without guaranteeing a successful
   allocation, resulting in a hang.  This was fixed by r366838.

PR:		248008
Tested by:	rmacklem
2020-12-01 16:06:31 +00:00
Alexander V. Chernikov
8db8bebf1f Move inner loop logic out of sysctl_sysctl_next_ls().
Refactor sysctl_sysctl_next_ls():
* Move huge inner loop out of sysctl_sysctl_next_ls() into a separate
 non-recursive function, returning the next step to be taken.
* Update resulting node oid parts only on successful lookup
* Make sysctl_sysctl_next_ls() return boolean success/failure instead of errno,
 slightly simplifying logic

Reviewed by:	freqlabs
Differential Revision:	https://reviews.freebsd.org/D27029
2020-11-30 21:59:52 +00:00
Toomas Soome
93b18e3730 vt: if loader did pass the font via metadata, use it
The built in 8x16 font may be way too small with large framebuffer
resolutions, to improve readability, use loader provied font.
2020-11-30 11:45:47 +00:00
Toomas Soome
a4a10b37d4 Add VT driver for VBE framebuffer device
Implement vt_vbefb to support Vesa Bios Extensions (VBE) framebuffer with VT.
vt_vbefb is built based on vt_efifb and is assuming similar data for
initialization, use MODINFOMD_VBE_FB to identify the structure vbe_fb
in kernel metadata.

struct vbe_fb, is populated by boot loader, and is passed to kernel via
metadata payload.

Differential Revision:	https://reviews.freebsd.org/D27373
2020-11-30 08:22:40 +00:00
Matt Macy
2338da0373 Import kernel WireGuard support
Data path largely shared with the OpenBSD implementation by
Matt Dunwoodie <ncon@nconroy.net>

Reviewed by:	grehan@freebsd.org
MFC after:	1 month
Sponsored by:	Rubicon LLC, (Netgate)
Differential Revision:	https://reviews.freebsd.org/D26137
2020-11-29 19:38:03 +00:00
Konstantin Belousov
a9d4fe977a bio aio: Destroy ephemeral mapping before unwiring page.
Apparently some architectures, like ppc in its hashed page tables
variants, account mappings by pmap_qenter() in the response from
pmap_is_page_mapped().

While there, eliminate useless userp variable.

Noted and reviewed by:	alc (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27409
2020-11-29 10:30:56 +00:00
Alexander Motin
83f6b50123 Remove alignment requirements for KVA buffer mapping.
After r368124 pbuf_zone has extra page to handle this particular case.
2020-11-29 01:30:17 +00:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Kyle Evans
e07e3fa3c9 kern: cpuset: drop the lock to allocate domainsets
Restructure the loop a little bit to make it a little more clear how it
really operates: we never allocate any domains at the beginning of the first
iteration, and it will run until we've satisfied the amount we need or we
encounter an error.

The lock is now taken outside of the loop to make stuff inside the loop
easier to evaluate w.r.t. locking.

This fixes it to not try and allocate any domains for the freelist under the
spinlock, which would have happened before if we needed any new domains.

Reported by:	syzbot+6743fa07b9b7528dc561@syzkaller.appspotmail.com
Reviewed by:	markj
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D27371
2020-11-28 01:21:11 +00:00
Mark Johnston
0c56925bc2 callout(9): Remove some leftover APM BIOS support
This code is obsolete since r366546.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27267
2020-11-27 20:46:02 +00:00
Konstantin Belousov
99c66d3acf vn_read_from_obj(): fix handling of doomed vnodes.
There is no reason why vp->v_object cannot be NULL. If it is, it's
fine, handle it by delegating to VOP_READ().

Tested by:	pho
Reviewed by:	markj, mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27327
2020-11-26 18:13:33 +00:00
Konstantin Belousov
164438a7b9 More careful handling of the mount failure.
- VFS_UNMOUNT() requires vn_start_write() around it [*].
- call VFS_PURGE() before unmount.
- do not destroy mp if cleanup unmount did not succeed.
- set MNTK_UNMOUNT, and indicate forced unmount with MNTK_UNMOUNTF
  for VFS_UNMOUNT() in cleanup.

PR:	251320 [*]
Reported by:	Tong Zhang <ztong0001@gmail.com>
Reviewed by:	markj, mjg
Discussed with:	rmacklem
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27327
2020-11-26 18:08:42 +00:00
Konstantin Belousov
3b1f974bfb Make max ticks for pause in vn_lock_pair() adjustable at runtime.
Reduce default value from hz / 10 to hz / 100.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2020-11-26 18:00:26 +00:00
Mateusz Guzik
b83e94be53 thread: staticize thread_reap and move td_allocdomain
thread_init is a much better fit as the the value is constant after
initialization.
2020-11-26 06:59:27 +00:00
Mateusz Guzik
2e51c2bfd1 pipe: follow up cleanup to previous
The commited patch was incomplete.

- add back missing goto retry, noted by jhb
- 'if (error)'  -> 'if (error != 0)'
- consistently do:

if (error != 0)
    break;
continue;

instead of:

if (error != 0)
    break;
else
    continue;

This adds some 'continue' uses which are not needed, but line up with the
rest of pipe_write.
2020-11-25 22:53:21 +00:00
Mateusz Guzik
c8df8543fd pipe: drop spurious pipeunlock/pipelock cycle on write 2020-11-25 21:41:23 +00:00
Kyle Evans
d431dea5ac kern: cpuset: properly rebase when attaching to a jail
The current logic is a fine choice for a system administrator modifying
process cpusets or a process creating a new cpuset(2), but not ideal for
processes attaching to a jail.

Currently, when a process attaches to a jail, it does exactly what any other
process does and loses any mask it might have applied in the process of
doing so because cpuset_setproc() is entirely based around the assumption
that non-anonymous cpusets in the process can be replaced with the new
parent set.

This approach slightly improves the jail attach integration by modifying
cpuset_setproc() callers to indicate if they should rebase their cpuset to
the indicated set or not (i.e. cpuset_setproc_update_set).

If we're rebasing and the process currently has a cpuset assigned that is
not the containing jail's root set, then we will now create a new base set
for it hanging off the jail's root with the existing mask applied instead of
using the jail's root set as the new base set.

Note that the common case will be that the process doesn't have a cpuset
within the jail root, but the system root can freely assign a cpuset from
a jail to a process outside of the jail with no restriction. We assume that
that may have happened or that it could happen due to a race when we drop
the proc lock, so we must recheck both within the loop to gather up
sufficient freed cpusets and after the loop.

To recap, here's how it worked before in all cases:

0     4 <-- jail              0      4 <-- jail / process
|                             |
1                 ->          1
|
3 <-- process

Here's how it works now:

0     4 <-- jail             0       4 <-- jail
|                            |       |
1                 ->         1       5 <-- process
|
3 <-- process

or

0     4 <-- jail             0       4 <-- jail / process
|                            |
1 <-- process     ->         1

More importantly, in both cases, the attaching process still retains the
mask it had prior to attaching or the attach fails with EDEADLK if it's
left with no CPUs to run on or the domain policy is incompatible. The
author of this patch considers this almost a security feature, because a MAC
policy could grant PRIV_JAIL_ATTACH to an unprivileged user that's
restricted to some subset of available CPUs the ability to attach to a jail,
which might lift the user's restrictions if they attach to a jail with a
wider mask.

In most cases, it's anticipated that admins will use this to be able to,
for example, `cpuset -c -l 1 jail -c path=/ command=/long/running/cmd`,
and avoid the need for contortions to spawn a command inside a jail with a
more limited cpuset than the jail.

Reviewed by:	jamie
MFC after:	1 month (maybe)
Differential Revision:	https://reviews.freebsd.org/D27298
2020-11-25 03:14:25 +00:00
Kyle Evans
30b7c6f977 kern: cpuset: rename _cpuset_create() to cpuset_init()
cpuset_init() is better descriptor for what the function actually does. The
name was previously taken by a sysinit that setup cpuset_zero's mask
from all_cpus, it was removed in r331698 before stable/12 branched.

A comment referencing the removed sysinit has now also been removed, since
the setup previously done was moved into cpuset_thread0().

Suggested by:	markj
MFC after:	1 week
2020-11-25 02:12:24 +00:00
Kyle Evans
29d04ea8c3 kern: cpuset: allow cpuset_create() to take an allocated *setp
Currently, it must always allocate a new set to be used for passing to
_cpuset_create, but it doesn't have to. This is purely kern_cpuset.c
internal and it's sparsely used, so just change it to use *setp if it's
not-NULL and modify the two consumers to pass in the address of a NULL
cpuset.

This paves the way for consumers that want the unr allocation without the
possibility of sleeping as long as they've done their due diligence to
ensure that the mask will properly apply atop the supplied parent
(i.e. avoiding the free_unr() in the last failure path).

Reviewed by:	jamie, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27297
2020-11-25 01:42:32 +00:00
Kyle Evans
c7ef3490e2 kern: never restart syscalls calling closefp(), e.g. close(2)
All paths leading into closefp() will either replace or remove the fd from
the filedesc table, and closefp() will call fo_close methods that can and do
currently sleep without regard for the possibility of an ERESTART. This can
be dangerous in multithreaded applications as another thread could have
opened another file in its place that is subsequently operated on upon
restart.

The following are seemingly the only ones that will pass back ERESTART
in-tree:
- sockets (SO_LINGER)
- fusefs
- nfsclient

Reviewed by:	jilles, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27310
2020-11-25 01:08:57 +00:00
Cy Schubert
e5a307c6ac Fix a typo in a comment.
MFC after:	3 days
2020-11-24 06:42:32 +00:00
Mateusz Guzik
f90d57b808 locks: push lock_delay_arg_init calls down
Minor cleanup to skip doing them when recursing on locks and so that
they can act on found lock value if need be.
2020-11-24 03:49:37 +00:00
Mateusz Guzik
094c148b7a sx: drop spurious volatile keyword 2020-11-24 03:48:44 +00:00
Mateusz Guzik
598f2b8116 dtrace: stop using eventhandlers for the part compiled into the kernel
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D27311
2020-11-23 18:27:21 +00:00
Mateusz Guzik
a9568cd2bc thread: stash domain id to work around vtophys problems on ppc64
Adding to zombie list can be perfomed by idle threads, which on ppc64 leads to
panics as it requires a sleepable lock.

Reported by:	alfredo
Reviewed by:	kib, markj
Fixes:	r367842 ("thread: numa-aware zombie reaping")
Differential Revision:	https://reviews.freebsd.org/D27288
2020-11-23 18:26:47 +00:00
Konstantin Belousov
87a9b18d22 Provide ABI modules hooks for process exec/exit and thread exit.
Exec and exit are same as corresponding eventhandler hooks.

Thread exit hook is called somewhat earlier, while thread is still
owned by the process and enough context is available.  Note that the
process lock is owned when the hook is called.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27309
2020-11-23 17:29:25 +00:00
Edward Tomasz Napierala
9c8c797c1a Remove the 'wantparent' variable, unused since r145004.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D27193
2020-11-23 12:47:23 +00:00
Kyle Evans
dac521ebcf cpuset_setproc: use the appropriate parent for new anonymous sets
As far as I can tell, this has been the case since initially committed in
2008.  cpuset_setproc is the executor of cpuset reassignment; note this
excerpt from the description:

* 1) Set is non-null.  This reparents all anonymous sets to the provided
*    set and replaces all non-anonymous td_cpusets with the provided set.

However, reviewing cpuset_setproc_setthread() for some jail related work
unearthed the error: if tdset was not anonymous, we were replacing it with
`set`. If it was anonymous, then we'd rebase it onto `set` (i.e. copy the
thread's mask over and AND it with `set`) but give the new anonymous set
the original tdset as the parent (i.e. the base of the set we're supposed to
be leaving behind).

The primary visible consequences were that:

1.) cpuset_getid() following such assignment returns the wrong result, the
    setid that we left behind rather than the one we joined.
2.) When a process attached to the jail, the base set of any anonymous
    threads was a set outside of the jail.

This was initially bundled in D27298, but it's a minor fix that's fairly
easy to verify the correctness of.

A test is included in D27307 ("badparent"), which demonstrates the issue
with, effectively:

osetid = cpuset_getid()
newsetid = cpuset()
cpuset_setaffinity(thread)
cpuset_setid(osetid)
cpuset_getid(thread) -> observe that it matches newsetid instead of osetid.

MFC after:	1 week
2020-11-23 02:49:53 +00:00