Commit Graph

19078 Commits

Author SHA1 Message Date
Konstantin Belousov
6fe78ad434 subr_unit.c: make userspace tests buildable
by defining a placeholder for UNR_NO_MTX

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-04-28 03:00:14 +03:00
Konstantin Belousov
709783373e Fix another race between fork(2) and PROC_REAP_KILL subtree
where we might not yet see a new child when signalling a process.
Ensure that this cannot happen by stopping the whole reaping subtree,
which ensures that the child is not inside a syscall, in particular
fork(2).

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00
Konstantin Belousov
39794d80ad Fix a race between fork(2) and PROC_REAP_KILL subtree
by repeating iteration over the subtree until there are no new processes
to signal.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00
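The "repeat until nothing new" idea in 39794d80ad can be sketched in userspace C. This is a toy model, not the kernel code: `toy_proc`, `signal_pass()`, `kill_until_quiet()` and the `late_fork()` hook (which stands in for a concurrent fork(2)) are all hypothetical names invented for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a process table slot; not a kernel structure. */
struct toy_proc {
	bool present;	/* slot holds a live process */
	bool signalled;	/* process has already been signalled */
};

/* One pass over the table; returns how many processes were newly signalled. */
static int
signal_pass(struct toy_proc *tbl, size_t n)
{
	int hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].present && !tbl[i].signalled) {
			tbl[i].signalled = true;
			hits++;
		}
	}
	return (hits);
}

/*
 * Repeat passes until one finds nothing new.  The optional hook runs
 * between passes and simulates children appearing concurrently.
 * Returns the number of passes performed.
 */
static int
kill_until_quiet(struct toy_proc *tbl, size_t n,
    void (*between)(struct toy_proc *, size_t, int))
{
	int passes = 0;

	for (;;) {
		int hits = signal_pass(tbl, n);

		passes++;
		if (hits == 0)
			break;
		if (between != NULL)
			between(tbl, n, passes);
	}
	return (passes);
}

/* Simulated fork: a new child shows up in the last slot after pass 1. */
static void
late_fork(struct toy_proc *tbl, size_t n, int pass)
{
	if (pass == 1 && n > 0)
		tbl[n - 1].present = true;
}

/* Returns the pass count, or -1 if the late child was missed. */
static int
demo_passes(void)
{
	struct toy_proc t[4] = {
		{ true, false }, { true, false }, { false, false }, { false, false }
	};
	int passes = kill_until_quiet(t, 4, late_fork);

	return (t[3].signalled ? passes : -1);
}
```

In this model a single pass would miss the late child; the loop catches it on the second pass and stops after a third, empty one. As 709783373e notes, repetition alone does not close every window in the real kernel.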
Konstantin Belousov
d1df347368 kern_procctl: add possibility to take stop_all_proc_block() around exec
stop_all_proc_block() must be taken before proctree_lock.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00
Konstantin Belousov
2e7595ef2f Add stop_all_proc_block(9)
It allows more than one consumer of thread_single(SINGLE_ALLPROC) by
serializing them.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00
Konstantin Belousov
54a11adbd9 reap_kill(): split children and subtree killers into helpers
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:34 +03:00
Konstantin Belousov
134529b11b reap_kill(): rename the reap variable to reaper
Suggested and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:34 +03:00
Konstantin Belousov
e4ce431e2a reap_kill(): de-inline LIST_FOREACH(), twice
Suggested and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:34 +03:00
Konstantin Belousov
b9294a3e15 reaper_abandon_children(): upgrade proctree_lock assert to exclusive
p_reapsibling linkage is protected by proctree_lock, and it is modified
there.

Suggested and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:34 +03:00
Konstantin Belousov
e59b940dcb unr(9): allow to avoid internal locking
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:34 +03:00
Konstantin Belousov
c4be460e84 init_unrhdr(): make it usable by initializing everything
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:34 +03:00
John Baldwin
1431239494 Add a __witness_used for variables only used under #ifdef WITNESS.
__diagused is now solely used for variables only used under INVARIANTS.

Reviewed by:	mjg
Differential Revision:	https://reviews.freebsd.org/D35085
2022-04-27 11:46:16 -07:00
Dmitry Chagin
4a700f3c32 sigtimedwait: Prevent timeout math overflows.
Our kern_sigtimedwait() calculates absolute sleep timo value as 'uptime+timeout'.
So, when the user specifies a big timeout value (LONG_MAX), the calculated
timo can be less than the current uptime value.
In that case kern_sigtimedwait() returns EAGAIN instead of EINTR, if
an unblocked signal was caught.

While here switch to a high-precision sleep method.

Reviewed by:		mav, kib
In collaboration with:	mav
Differential revision:	https://reviews.freebsd.org/D34981
MFC after:		2 weeks
2022-04-25 10:23:15 +03:00
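The overflow described in 4a700f3c32 can be reproduced with plain integers. The sketch below is an assumption-laden illustration, not the committed fix: a signed 64-bit count stands in for the kernel's time type, and `deadline_saturating()` is a hypothetical helper that clamps the sum instead of letting it wrap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute an absolute deadline from the current uptime and a relative
 * timeout.  A naive "uptime + timeout" wraps for huge timeouts and can
 * land before the current uptime; clamping to INT64_MAX avoids that.
 */
static int64_t
deadline_saturating(int64_t uptime, int64_t timeout)
{
	/* This sketch only handles non-negative inputs. */
	assert(uptime >= 0 && timeout >= 0);

	if (timeout > INT64_MAX - uptime)
		return (INT64_MAX);	/* would overflow: treat as "forever" */
	return (uptime + timeout);
}
```

With this guard the computed deadline is never earlier than `uptime`, so a caller comparing the two cannot mistake a very long timeout for one that has already expired.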
Dmitry Chagin
91e7bdcdcf Add timespecvalid_interval macro and use it.
Reviewed by:		jhb, imp (early rev)
Differential revision:	https://reviews.freebsd.org/D34848
MFC after:		2 weeks
2022-04-25 10:20:54 +03:00
John Baldwin
a4c5d490f6 KTLS: Move OCF function pointers out of ktls_session.
Instead, create a switch structure private to ktls_ocf.c and store a
pointer to the switch in the ocf_session.  This will permit adding an
additional function pointer needed for NIC TLS RX without further
bloating ktls_session.

Reviewed by:	hselasky
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D35011
2022-04-22 15:52:12 -07:00
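The layout change in a4c5d490f6 follows a common C pattern: keep function pointers in a shared, constant "switch" table and store only one pointer to it per session. A minimal sketch, with every name (`toy_switch`, `toy_session`, the XOR "cipher") invented for illustration and unrelated to the real ktls_ocf.c definitions:

```c
#include <stddef.h>

struct toy_session;

/* Operations table shared by all sessions that use the same algorithm. */
struct toy_switch {
	int (*encrypt)(struct toy_session *, int);
	int (*decrypt)(struct toy_session *, int);
	/* A further hook would grow this table, not every session. */
};

struct toy_session {
	const struct toy_switch *sw;	/* one pointer, however many hooks */
	int key;
};

static int
xor_encrypt(struct toy_session *s, int v)
{
	return (v ^ s->key);
}

static int
xor_decrypt(struct toy_session *s, int v)
{
	return (v ^ s->key);
}

static const struct toy_switch xor_switch = {
	.encrypt = xor_encrypt,
	.decrypt = xor_decrypt,
};

/* Encrypt and decrypt one value by dispatching through the switch. */
static int
toy_roundtrip(int key, int value)
{
	struct toy_session s = { .sw = &xor_switch, .key = key };

	return (s.sw->decrypt(&s, s.sw->encrypt(&s, value)));
}
```

The per-session cost stays at one pointer regardless of how many operations the table later gains, which is the property the commit message relies on.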
John Baldwin
92e40a9b92 busdma_bounce: Batch bounce page free operations when possible.
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D34968
2022-04-21 12:01:55 -07:00
John Baldwin
d4ab3a8d4f busdma_bounce: Add free_bounce_pages helper function.
Deduplicate code to iterate over the bpages list in a bus_dmamap_t
freeing bounce pages during bus_dmamap_unload.

Reviewed by:	imp
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D34967
2022-04-21 10:42:14 -07:00
John Baldwin
10fe9a1fb4 busdma_bounce: Make the map waiting list per-bounce-zone.
When pages are freed to a bounce zone, only maps waiting for pages for
that zone can make forward progress.  If a map for a different bounce
zone is at the head of the global list, then requests that could
otherwise make forward progress will be stalled waiting on the other
bounce zone.  If bounce zones shared bounce pages then a global list
would still make sense to prevent "later" requests from starving an
earlier request but that is not a concern with per-zone bounce page
pools.

Reviewed by:	imp
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D34966
2022-04-21 10:41:09 -07:00
John Baldwin
d11f5d4762 busdma_bounce: Use a simple kproc to invoke deferred requests.
Rather than using a software interrupt with a single handler, just
create a dedicated kernel process woken up with a simple wakeup().

Reviewed by:	imp
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D34965
2022-04-21 10:40:35 -07:00
John Baldwin
c7aa0304d5 Run softclock threads at a hardware ithread priority.
Add a new PI_SOFTCLOCK for use by softclock threads.  Currently this
maps to PI_AV which is the second-highest ithread priority.

Reviewed by:	mav, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33693
2022-04-21 10:40:01 -07:00
John Baldwin
3d7e90fc20 cpufreq_curr_sysctl: Use devclass_find to lookup cpufreq devclass.
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D35002
2022-04-21 10:29:14 -07:00
Kristof Provost
a879e40ca2 callout: fix using shared rmlocks
15b1eb142c changed the callout code to store the CALLOUT_SHAREDLOCK flag
in c_iflags (where it used to be c_flags), but failed to update the
check in softclock_call_cc(). This resulted in the callout code always
taking the write lock, even if a read lock had been requested (with
the CALLOUT_SHAREDLOCK flag in callout_init_rm()).

Reviewed by:	markj
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D34959
2022-04-20 13:06:50 +02:00
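The bug fixed by a879e40ca2 is a flag that moved to a different field while one reader kept testing the old field. The sketch below models only that mismatch; `toy_callout`, `TOY_SHAREDLOCK` and its value are placeholders chosen for the example, not the kernel's definitions.

```c
#include <stdbool.h>

#define TOY_SHAREDLOCK	0x0020	/* illustrative value only */

struct toy_callout {
	int c_flags;	/* where the flag used to be stored */
	int c_iflags;	/* where the flag is stored now */
};

/* Stale check: still looks in the old field, so it never sees the flag. */
static bool
wants_shared_buggy(const struct toy_callout *c)
{
	return ((c->c_flags & TOY_SHAREDLOCK) != 0);
}

/* Corrected check: looks in the field the flag was moved to. */
static bool
wants_shared_fixed(const struct toy_callout *c)
{
	return ((c->c_iflags & TOY_SHAREDLOCK) != 0);
}

/* A callout initialized with a shared (read) lock request. */
static const struct toy_callout demo_callout = {
	.c_flags = 0,
	.c_iflags = TOY_SHAREDLOCK,
};
```

With the stale check the request for a read lock is silently ignored and the caller falls back to the exclusive path, which matches the symptom described in the commit message.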
John Baldwin
5bdea8826b devclass_add_driver: Permit NULL to be passed in dcp.
This permits a driver module structure that doesn't want to store a
pointer to the new driver's devclass.

Reviewed by:	imp
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D34962
2022-04-19 10:43:50 -07:00
Mateusz Guzik
c5c981d443 signals: plug a set-but-not-used var
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-04-19 12:45:57 +00:00
John Baldwin
d139909d6e destroy_dev_sched*: Don't hold Giant for all deferred destroy_dev.
Rather than using taskqueue_swi_giant which holds Giant for all
deferred destroy_dev calls, create a separate queue for destroyed
devices with D_NEEDGIANT set in the corresponding cdevsw.  The task
for this queue holds Giant while destroying deferred devices while the
task for the default queue does not hold Giant.

In addition, switch to taskqueue_thread for destroy_dev_sched.
Deferred destroy_dev requests don't need to run at an SWI priority.

Reviewed by:	imp, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D34915
2022-04-18 12:04:30 -07:00
Konstantin Belousov
362ff9867e Revert rest of a5970a529c: use vrefact() when working on fp->f_vnode
Now that O_PATH-opened file descriptors take use references instead
of hold references, the vrefact() changes from that revision can be
reverted.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34906
2022-04-15 16:56:20 +03:00
Ed Maste
f99cc5a389 sysent: regen after 52a1d90c8b, posix_fadvise in capmode
2022-04-14 15:17:36 -04:00
Ed Maste
52a1d90c8b Allow posix_fadvise in capability mode
posix_fadvise operates only on a provided fd.  Noted by
Mathieu <sigsys@gmail.com> in review D34761.

No new CAP_ rights are added for posix_fadvise(), as 'advice' in
general only influences when I/O happens; the fd must have existing
CAP_ rights for actual data access.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34903
2022-04-14 15:11:21 -04:00
Konstantin Belousov
bf13db086b Mostly revert a5970a529c: Make files opened with O_PATH to not block non-forced unmount
The problem is that open(O_PATH) on nullfs -o nocache is then broken,
because there is no reference on the vnode after the open syscall exits.

Reported and tested by:	ambrisko
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-04-14 02:47:04 +03:00
John Baldwin
36fb372264 kern: Move variables only used for MAC under #ifdef MAC.
2022-04-13 16:08:23 -07:00
John Baldwin
4aec198420 sched_ule: Inline value of ts in sched_thread_priority.
This avoids a set but unused warning in kernels without SMP where
TDQ_CPU() doesn't use its argument.
2022-04-13 16:08:23 -07:00
John Baldwin
8758ac757f sched_4bsd: ts is only used in sched_bind for SMP.
2022-04-13 16:08:22 -07:00
John Baldwin
72ff256c51 sched_4bsd: Remove unused variables.
2022-04-12 14:58:59 -07:00
John Baldwin
dbd51c416a realloc(9): Move slab and zone under #ifndef DEBUG_REDZONE.
2022-04-12 14:58:59 -07:00
Mark Johnston
d769609620 tty: Remove an incorrect assertion from ttyinq_line_iterate()
We may legitimately have tib == NULL if we're at the very end of the
queue.

PR:		215373
Reported by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-04-12 17:30:04 -04:00
Tom Jones
1ea833a572 kdb: set kdb_why when entered via reboot and panic
Reviewed by:	jhb
Sponsored by:   NetApp, Inc.
Sponsored by:   Klara, Inc.
X-NetApp-PR:    #74
Differential Revision:	https://reviews.freebsd.org/D34551
2022-04-12 10:34:40 +01:00
Dmitry Chagin
c6487446d7 getdirentries: return ENOENT for unlinked but still open directory.
To be more compatible with IEEE Std 1003.1-2008 (“POSIX.1”).

Reviewed by:		mjg, Pau Amma (doc)
Differential revision:  https://reviews.freebsd.org/D34680
MFC after:		2 weeks
2022-04-11 23:30:16 +03:00
Konstantin Belousov
eca39864f7 Add sysctl KERN_LOCKF
reporting the snapshot of the active advisory locks.

A new VFS ops method vfs_report_lockf is provided in the mount point
op table.  If it is NULL, as it is currently for all existing
filesystems, vfs_report_lockf() function is used, which gathers
information from the standard implementation inside kern/kern_lockf.c.

Filesystems implementing their own locking (NFSv4, for example) can provide
a custom implementation.

Reviewed by:	markj, rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34756
2022-04-10 00:43:53 +03:00
Konstantin Belousov
147e4fe3f1 kern_lockf.c: remove no longer needed UFS headers
Reviewed by:	markj, rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34756
2022-04-10 00:43:53 +03:00
Konstantin Belousov
59e85819be lockf: remove lf_inode from struct lockf_entry
The UFS-specific struct inode cannot be used in generic advisory lock
code.  It was probably used as a shortcut for debugging, as the
remnants of the code around it indicate.

Use somewhat more verbose and less concentrated, but universal,
VOP_PRINT(), where needed.

Reviewed by:	markj, rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34756
2022-04-10 00:43:53 +03:00
Gordon Bergling
f171938cd6 jail: Remove a double word in a source code comment
- s/a a/a/

MFC after:	3 days
2022-04-09 14:19:17 +02:00
Gordon Bergling
c3721292e3 kern: Remove a double word in a source code comment
- s/for for/for/

MFC after:	3 days
2022-04-09 10:50:04 +02:00
Gordon Bergling
768f9b8b8b kern: Fix a typo in a source code comment
- s/is is/is/

MFC after:	3 days
2022-04-09 09:14:14 +02:00
Andrew Turner
41e6d2091c Enable subr_physmem_test on supported architectures
Only build where it's supported.

While here add support for amd64 to help with testing.

Sponsored by:	The FreeBSD Foundation
2022-04-07 14:31:51 +01:00
Andrew Turner
d8bff5b67c Handle non-page aligned/sized memory in physmem
In some configurations the firmware may pass memory regions that are
not page sized or aligned, e.g. when using 16k pages on arm64. If this
is the case we will calculate many small regions because the alignment
is applied before being inserted. As we round the start up and end down
this will leave a 1 page hole between what should have been a single
region.

Fix by keeping the original alignment until we are just about to insert
the region into the avail array.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34694
2022-04-06 14:13:29 +01:00
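The one-page hole described in d8bff5b67c can be checked with a little arithmetic. The sketch below assumes 16k pages and two adjacent firmware regions whose shared boundary is not page aligned; `toy_region`, the helper names and the addresses are all made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	0x4000ULL	/* 16k pages, as in the arm64 example */

struct toy_region {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

static uint64_t
round_up(uint64_t a)
{
	return ((a + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1));
}

static uint64_t
round_down(uint64_t a)
{
	return (a & ~(TOY_PAGE_SIZE - 1));
}

/* Whole pages left once start is rounded up and end is rounded down. */
static uint64_t
usable_pages(struct toy_region r)
{
	uint64_t s = round_up(r.start), e = round_down(r.end);

	return (e > s ? (e - s) / TOY_PAGE_SIZE : 0);
}

/* Old behaviour: align each region before it is recorded. */
static uint64_t
pages_align_early(struct toy_region a, struct toy_region b)
{
	return (usable_pages(a) + usable_pages(b));
}

/* New behaviour: merge adjacent regions first, align only at the end. */
static uint64_t
pages_align_late(struct toy_region a, struct toy_region b)
{
	struct toy_region merged = { a.start, b.end };

	assert(a.end == b.start);	/* sketch only handles adjacent regions */
	return (usable_pages(merged));
}

/* Two adjacent regions split at 0x5000, which is not 16k aligned. */
static const struct toy_region fw_a = { 0x0000, 0x5000 };
static const struct toy_region fw_b = { 0x5000, 0x10000 };
```

Aligning early yields 1 + 2 = 3 pages because the page spanning 0x4000 to 0x8000 is dropped from both sides; aligning late keeps all 4 pages of the combined range.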
Andrew Turner
8c99dfed54 Port subr_physmem to userspace and add tests
These give us some confidence we haven't broken anything in early
boot code that may be running before the console.

Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34691
2022-04-06 14:13:05 +01:00
Mitchell Horne
eb9d205fa6 livedump: add event handler hooks
Add three hooks to the livedump process: before, after, and for each
block of dumped data. This allows, for example, quiescing the system
before the dump begins or protecting data of interest to ensure its
consistency in the final output.

Reviewed by:	markj, kib (previous version)
Reviewed by:	debdrup (manpages)
Reviewed by:	Pau Amma <pauamma@gundo.com> (manpages)
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34067
2022-04-05 15:35:05 -03:00
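The three hook points named in eb9d205fa6 (before the dump, per block, after the dump) can be pictured with a small callback table. This is only a userspace sketch of the call ordering; the kernel uses its own event handler mechanism, and `toy_hooks`, `toy_livedump()` and the counting callbacks are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table: any member may be NULL. */
struct toy_hooks {
	void (*start)(void *arg);
	void (*block)(void *arg, const unsigned char *data, size_t len);
	void (*finish)(void *arg);
};

/* Walk the buffer in blksz-sized blocks, firing hooks around the work. */
static void
toy_livedump(const struct toy_hooks *h, void *arg,
    const unsigned char *mem, size_t len, size_t blksz)
{
	assert(blksz > 0);
	if (h->start != NULL)
		h->start(arg);		/* e.g. quiesce the system */
	for (size_t off = 0; off < len; off += blksz) {
		size_t n = (len - off < blksz) ? len - off : blksz;

		if (h->block != NULL)
			h->block(arg, mem + off, n);
		/* a real dumper would write the block out here */
	}
	if (h->finish != NULL)
		h->finish(arg);		/* e.g. resume normal operation */
}

struct toy_counts {
	int starts, blocks, finishes;
	size_t bytes;
};

static void
count_start(void *arg)
{
	((struct toy_counts *)arg)->starts++;
}

static void
count_block(void *arg, const unsigned char *data, size_t len)
{
	struct toy_counts *c = arg;

	(void)data;
	c->blocks++;
	c->bytes += len;
}

static void
count_finish(void *arg)
{
	((struct toy_counts *)arg)->finishes++;
}

static const struct toy_hooks counting_hooks = {
	count_start, count_block, count_finish
};

/* Dump a 10-byte buffer in 4-byte blocks and report what the hooks saw. */
static struct toy_counts
toy_demo(void)
{
	unsigned char mem[10] = { 0 };
	struct toy_counts c = { 0, 0, 0, 0 };

	toy_livedump(&counting_hooks, &c, mem, sizeof(mem), 4);
	return (c);
}
```

In the demo the start and finish hooks each fire once and the per-block hook fires three times (4 + 4 + 2 bytes).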
Mitchell Horne
c9114f9f86 Add new vnode dumper to support live minidumps
This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by:	markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with:	kib
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D33813
2022-04-05 15:35:05 -03:00
Mitchell Horne
59c27ea18c Split out dumper allocation from list insertion
Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() now calls this function and retains the existing
behaviour.

This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.

free_single_dumper() is made public and renamed to dumper_destroy().

Reviewed by:	kib, markj
MFC after:	1 week
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34068
2022-04-05 15:35:05 -03:00
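The split described in 59c27ea18c, allocation separate from list insertion, can be sketched in a few lines. The `toy_` names below only mirror the shape of dumper_create(), dumper_insert() and dumper_destroy(); the structure, signatures and list handling are assumptions made for the illustration.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_dumper {
	int id;
	struct toy_dumper *next;
};

/* Stand-in for the global list of panic-time dumpers. */
static struct toy_dumper *panic_list;

/* Allocate and configure a dumper without publishing it anywhere. */
static struct toy_dumper *
toy_dumper_create(int id)
{
	struct toy_dumper *d = calloc(1, sizeof(*d));

	if (d != NULL)
		d->id = id;
	return (d);
}

static void
toy_dumper_destroy(struct toy_dumper *d)
{
	free(d);
}

/* Existing behaviour: create, then link into the panic-time list. */
static int
toy_dumper_insert(int id)
{
	struct toy_dumper *d = toy_dumper_create(id);

	if (d == NULL)
		return (-1);
	d->next = panic_list;
	panic_list = d;
	return (0);
}

static int
toy_list_len(void)
{
	int n = 0;

	for (struct toy_dumper *d = panic_list; d != NULL; d = d->next)
		n++;
	return (n);
}

/* Tear down the whole list. */
static void
toy_list_clear(void)
{
	while (panic_list != NULL) {
		struct toy_dumper *d = panic_list;

		panic_list = d->next;
		toy_dumper_destroy(d);
	}
}

/* A live dump uses a private dumper; returns the list length meanwhile. */
static int
demo_live_dump(void)
{
	struct toy_dumper *live = toy_dumper_create(7);
	int len;

	if (live == NULL)
		return (-1);
	len = toy_list_len();	/* the private dumper is not on the list */
	toy_dumper_destroy(live);
	return (len);
}
```

A dumper obtained from the create step alone never appears on the list, so code that walks the list at panic time cannot pick it up by accident.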
Mateusz Guzik
b7262756e2 vfs: fixup WANTIOCTLCAPS on open
In some cases vn_open_cred overwrites cn_flags, effectively nullifying
initialisation done in NDINIT. This will have to be fixed.

In the meantime make sure the flag is passed.

Reported by:	jenkins
Noted by:	Mathieu <sigsys@gmail.com>
2022-04-02 20:49:01 +02:00