19021 Commits

Author SHA1 Message Date
Andrew Turner
f461b95561 Fix a sign mismatch warning in the physmem code
Make sure both sides of a comparison are unsigned. As the values being
compared are size_t make the value in the for loop size_t too.

Sponsored by:	The FreeBSD Foundation
2022-03-28 11:51:09 +01:00
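A minimal sketch of the pattern this message describes, not the actual physmem code: when the loop bound is a size_t, the index is declared size_t as well, so both sides of the comparison are unsigned. The function and parameter names are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical example: count the non-zero entries of a table whose
 * length is a size_t.  Declaring the index as size_t keeps both sides
 * of "i < count" unsigned, which avoids sign-compare warnings.
 */
static size_t
count_nonzero(const unsigned long *table, size_t count)
{
	size_t i, n;

	n = 0;
	for (i = 0; i < count; i++) {
		if (table[i] != 0)
			n++;
	}
	return (n);
}
```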
Mateusz Guzik
2533b5dc82 vfs: add missing bits to vdropl_impl
This completes the patch which was originally meant to go in.

Spotted by:	mhorne
Fixes: c35ec1efdc ("vfs: [1/2] fix stalls in vnode reclaim by not
requeieing from vnlru")
2022-03-27 14:35:37 +00:00
Mateusz Guzik
a4032e2a69 vfs: assorted tidy ups to lookup
No functional changes.
2022-03-26 17:06:09 +00:00
Alexander Leidinger
aeb91e95cf Log euid, rgid and jail on listen queue overflow
If you have numerous jails with multiple similar services running,
this helps to narrow down which services this log is referring to.
2022-03-26 11:17:55 +01:00
Eric van Gyzen
aca2a7faca stack_zero is not needed before stack_save
The man page was recently clarified to commit to this contract.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2022-03-25 20:10:38 -05:00
Eric van Gyzen
863070bbf6 ksiginfo_alloc: pass M_WAITOK or M_NOWAIT to uma_zalloc
It expects exactly one of those flags.  A future commit will assert this.

Reviewed by:	rstone
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D34451
2022-03-25 20:10:37 -05:00
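The "exactly one of those flags" rule can be sketched as a small predicate. This is an illustration only: EX_NOWAIT and EX_WAITOK are hypothetical stand-ins for the allocator's wait flags, and the check is not the assertion that the future commit adds.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the allocator's wait flags. */
#define	EX_NOWAIT	0x0001
#define	EX_WAITOK	0x0002

/* True when exactly one of the two wait flags is set in "flags". */
static bool
exactly_one_wait_flag(int flags)
{
	int wait = flags & (EX_NOWAIT | EX_WAITOK);

	return (wait == EX_NOWAIT || wait == EX_WAITOK);
}
```

Other bits in "flags" are ignored, so a caller may still combine one wait flag with unrelated options.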
Mateusz Guzik
0f60088399 vfs: set cn_namelen when handling degenerate lookups
Turns out execve looks at it to store binary name, but in order to
trigger the problem one has to be trying to exec '/'. As is the value
would be left uninitialized (or rather set to -1 on debug kernels).

Fixes:	56244d3574 ("vfs: hoist degenerate path lookups out of the
loop")
2022-03-25 18:19:36 +00:00
Mateusz Guzik
4ef6e56ae8 vfs: hoist trailing slash handling out of the loop
2022-03-24 14:36:31 +00:00
Mateusz Guzik
3b6792d28a vfs: factor symlink traversal out of namei
The intent down the road is to eliminate the loop to begin with,
pushing traversal down to vfs_lookup, all while not allocating the
extra buffer.
2022-03-24 13:11:22 +00:00
Mateusz Guzik
d9ea7e2b1e vfs: factor FAILIFEXISTS handling out of vfs_lookup
2022-03-24 11:22:20 +00:00
Mateusz Guzik
56244d3574 vfs: hoist degenerate path lookups out of the loop
2022-03-24 11:22:12 +00:00
Mateusz Guzik
bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)
2022-03-24 10:20:51 +00:00
Mark Johnston
1babcad6bc elf: Avoid dumping uninitialized bytes in PRSTATUS core dump notes
elf_prstatus_t contains pad space.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34606
2022-03-23 12:53:49 -04:00
Mark Johnston
7524994da0 callout: Remove the CS_EXECUTING flag
It is now unused.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34626
2022-03-23 12:37:02 -04:00
Mark Johnston
b319171861 setitimer: Fix exit race
We use the p_itcallout callout, interlocked by the proc lock, to
schedule timeouts for the setitimer(2) system call.  When a process
exits, the callout must be stopped before the process struct is
recycled.

Currently we attempt to stop the callout in exit1() with the call
_callout_stop_safe(&p->p_itcallout, CS_EXECUTING).  If this call returns
0, then we sleep in order to drain the callout.  However, this happens
only if the callout is not scheduled at all.  If the callout thread is
blocked on the proc lock, then exit1() will not block and the callout
may execute after the process has fully exited, typically resulting in a
panic.

I cannot see a reason to use the CS_EXECUTING flag here.  Instead, use
the regular callout_stop()/callout_drain() dance to halt the callout.

Reported by:	ler
Tested by:	ler, pho
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34625
2022-03-23 12:36:12 -04:00
Alexander Motin
fd6ca665d2 Fix umtxq_sleep() regression caused by 56070dd2e4.
umtxq_requeue() moves the queue to a different hash chain and different
lock, so we can't rely on msleep_sbt() reacquiring the same old lock.
We have to use PDROP and update the queue chain, and so the lock pointer.

PR:		262587
MFC after:	2 weeks
2022-03-21 19:55:55 -04:00
firk
bb53dd56c3 kern_tc.c/cputick2usec() (which is used to calculate cputime from
cpu ticks) has some imprecision and, worse, a huge timestep (about
20 minutes on a 4GHz CPU) near 53.4 days of elapsed time.

kern_time.c/cputick2timespec() (used by clock_gettime() to query the
cpu time consumed by a process or thread) uses cputick2usec() and then
needlessly converts usec to nsec, obviously losing precision even with
a fixed cputick2usec().

kern_time.c/kern_clock_getres() uses some weird (anyway wrong)
formula for getting cputick resolution.

PR:		262215
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D34558
2022-03-21 09:33:46 -04:00
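At 4 GHz, 2^64 / 1000 ticks is roughly 53.4 days of elapsed time, which is consistent with an intermediate multiplication overflowing 64 bits around that point; that reading is an assumption about the old code, not something the message states. The sketch below shows one overflow-safe way to convert ticks to microseconds by handling whole seconds and the remainder separately. It is not the committed fix, and ticks_to_usec is a hypothetical name.

```c
#include <stdint.h>

/*
 * Convert a tick count to microseconds without a large intermediate
 * product.  Whole seconds and the sub-second remainder are converted
 * separately, so the remainder term is bounded by rate * 1000000,
 * which fits in 64 bits for any realistic tick rate.
 * "rate" is ticks per second and must be non-zero.
 */
static uint64_t
ticks_to_usec(uint64_t ticks, uint64_t rate)
{
	uint64_t sec = ticks / rate;
	uint64_t rem = ticks % rate;

	return (sec * 1000000 + rem * 1000000 / rate);
}
```

The result still truncates to whole microseconds, so a nanosecond consumer would want a direct tick-to-nanosecond conversion rather than scaling this value.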
Andrew Turner
cab496e16c Make SHMMAXPGS an unsigned long
This is used to calculate sizes that are then stored in unsigned long
fields. Make this unsigned long so the calculations use this type and
not an int that can lead to an integer overflow with a large PAGE_SIZE.

This allows building this on arm64 with PAGE_SIZE of 16k. Further work
will be needed if a 32-bit architecture tries to use a similar sized
page.

Sponsored by:	The FreeBSD Foundation
2022-03-21 10:27:35 +00:00
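The overflow can be illustrated with plain arithmetic. The page count and page size below are hypothetical values picked for the example, and the helper names are not from the tree. The product is computed in unsigned long, as the message describes, and then compared against INT_MAX to show that an int could not have held it.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical values chosen for illustration only. */
#define	EX_MAXPGS	131072UL	/* page count, typed unsigned long */
#define	EX_PAGE_16K	16384UL		/* 16k page size */

/* Byte size computed in unsigned long rather than int. */
static unsigned long
ex_shm_bytes(unsigned long pages, unsigned long page_size)
{
	return (pages * page_size);
}

/* True when the byte size would not be representable in an int. */
static bool
ex_exceeds_int(unsigned long pages, unsigned long page_size)
{
	return (ex_shm_bytes(pages, page_size) > (unsigned long)INT_MAX);
}
```

With these example values the product is 2^31 bytes, one more than a 32-bit int can represent, while the same page count with 4k pages still fits.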
Colin Percival
2406867f5b tslog: Add CTLFLAG_SKIP to sysctls
The timestamp logs are quite large (often much larger than all the
other sysctls combined) so it's unlikely anyone will want to have
them displayed by `sysctl -a`.

MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D34616
2022-03-20 11:31:16 -07:00
Mateusz Guzik
6ff3e8a316 cache: add a comment about a realpath bug
2022-03-19 15:11:25 +00:00
Mateusz Guzik
eb574ba0b6 vfs: replace VFS_NOTIFY_UPPER_* macros with an enum
2022-03-19 13:15:55 +00:00
Mateusz Guzik
cceb91b025 vfs: add missing flags to db show mount
2022-03-19 12:04:44 +00:00
Mateusz Guzik
93a0ba8f49 vfs: retire the no longer used MNTK_LOOKUP_EXCL_DOTDOT flag
Reviewed by:	markj
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D34466
2022-03-19 10:47:29 +00:00
Mateusz Guzik
1cb0045c97 vfs: add MNTK_UNLOCKED_INSMNTQUE
Can be used when the fs at hand can synchronize insmntque with other
means than the vnode lock.

Reviewed by:	markj
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D34466
2022-03-19 10:46:40 +00:00
firk
28d08dc7d0 clock_gettime: Fix CLOCK_THREAD_CPUTIME_ID race
Use a spinlock section instead of a critical section to synchronize with
statclock().  Otherwise the CLOCK_THREAD_CPUTIME_ID clock can appear to
go backwards.

PR:		262273
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34568
2022-03-17 15:39:00 -04:00
Mark Johnston
fc7e121d88 file: Move FILEDESC_FOREACH macros to kern_descrip.c
They are only used in kern_descrip.c, so make them private.  No
functional change intended.

Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
2022-03-17 15:39:00 -04:00
Mark Johnston
c702242292 file: Avoid a read-after-free of fd tables in sysctl handlers
Some loops access the fd table of a different process, and drop the
filedesc lock while iterating, so they check the table's refcount.
However, we access the table before the first iteration, in order to get
the number of table entries, and this access can be a use-after-free.

Fix the problem by checking the refcount before we start iterating.

Reported by:	pho
Reviewed by:	mjg
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34575
2022-03-17 15:39:00 -04:00
Mateusz Guzik
0134bbe56f vfs: prefix lookup and relookup with vfs_
Reviewed by:	imp, mckusick
Differential Revision:		https://reviews.freebsd.org/D34530
2022-03-13 14:44:39 +00:00
Mateusz Guzik
02fc4e319c cache: use flexible array member
... instead of 0-sizing the array
2022-03-13 14:43:35 +00:00
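For readers unfamiliar with the idiom, here is a minimal sketch of a C99 flexible array member replacing a zero-sized trailing array. The struct and function names are hypothetical and unrelated to the name cache's own types.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; the name buffer is a flexible array member. */
struct ex_entry {
	size_t	len;
	char	name[];		/* instead of "char name[0]" */
};

/* Allocate an entry with room for "name" plus its terminating NUL. */
static struct ex_entry *
ex_entry_alloc(const char *name)
{
	size_t len = strlen(name);
	struct ex_entry *e;

	e = malloc(offsetof(struct ex_entry, name) + len + 1);
	if (e == NULL)
		return (NULL);
	e->len = len;
	memcpy(e->name, name, len + 1);
	return (e);
}
```

The zero-length form is a compiler extension, whereas the empty-bracket form is standard C and lets the compiler reason about the trailing storage.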
John Baldwin
6b71405bfe Store core dump notes for all valid register sets for FreeBSD processes.
In particular, use a generic wrapper around struct regset rather than
requiring per-regset helpers.  This helper replaces the MI
__elfN(note_prstatus) and __elfN(note_fpregset) helpers.  It also
removes the need to explicitly dump NT_ARM_ADDR_MASK in the arm64
__elfN(dump_thread).

Reviewed by:	markj, emaste
Sponsored by:	University of Cambridge, Google, Inc.
Differential Revision:	https://reviews.freebsd.org/D34446
2022-03-10 15:40:19 -08:00
Kornel Duleba
b344de4d0d Extend device_get_property API
In order to support various types of data stored in device
tree properties or ACPI _DSD packages, create a new enum so
the caller can specify the expected type of a property they
want to read, according to the binding. The bus logic will use
that information to process the underlying data.

For example in DT all integer properties are stored in BE format.
In order to get constant results across different platforms we
need to convert its endianness to match the host.

Another example is ACPI_TYPE_INTEGER properties stored
as uint64_t. Before this patch the ACPI logic would refuse
to read them if the provided buffer was smaller than 8 bytes.
Now this can be handled by using DEVICE_PROP_UINT32 type.

Modify the existing consumers of this API to reflect the changes
and update the man pages accordingly.

Reviewed by: mw
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33457
2022-03-10 12:11:32 +01:00
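The two examples in this message can be sketched in portable C. Both helpers are hypothetical and are not part of the device_get_property API. The first decodes a big-endian 32-bit cell regardless of host endianness. The second narrows a 64-bit value into a 32-bit destination, and rejecting out-of-range values there is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 32-bit big-endian cell, independent of host endianness. */
static uint32_t
ex_be32_decode(const uint8_t cell[4])
{
	return (((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	    ((uint32_t)cell[2] << 8) | (uint32_t)cell[3]);
}

/*
 * Narrow a 64-bit integer property into a 32-bit destination.
 * Returns false, leaving *out untouched, if the value does not fit.
 */
static bool
ex_narrow_u32(uint64_t value, uint32_t *out)
{
	if (value > UINT32_MAX)
		return (false);
	*out = (uint32_t)value;
	return (true);
}
```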
Kornel Duleba
206dc82bc3 bus_if: Add a default implementation of get_property
There are multiple buses that pretend to be ofw compatible,
e.g. ofw_pci, mii_fdt. We now need to provide an implementation
of BUS_GET_PROPERTY for every one of them. Instead of modifying
them one by one it's better to just provide a default
implementation that simply traverses up the device tree.
Remove the now unneeded BUS_GET_PROPERTY implementation in mii_fdt.

Reviewed by: andrew, bz
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34031
2022-03-10 12:11:32 +01:00
Mateusz Guzik
3a4c5dab92 vfs: [2/2] fix stalls in vnode reclaim by only counting attempts
... and ignoring if they succeeded, which matches historical behavior.

Reported by:	pho
2022-03-10 09:41:50 +00:00
Mateusz Guzik
c35ec1efdc vfs: [1/2] fix stalls in vnode reclaim by not requeieing from vnlru
Reported by:	pho
2022-03-10 09:41:50 +00:00
Ed Maste
080b4e8a0c kcov: use __func__ in KASSERT instead of old function name
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:47:27 -05:00
Mark Johnston
afb44cb010 rmlock: Temporarily revert commit c84bb8cd77
It appears to have introduced a regression on arm64, possibly due to the
fact that the pcpu pointer is reloaded outside of the critical section
in _rm_rlock().  Until this is resolved one way or another, let's
revert.

Reported by:	Ronald Klop <ronald-lists@klop.ws>
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:43:19 -05:00
Mark Johnston
8dbae4ce32 linker: Permit CTFv3 containers
Reviewed by:	Domagoj Stolfa
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34362
2022-03-07 10:43:19 -05:00
Mark Johnston
cab9382a2c linker: Simplify CTF container handling
Use sys/ctf.h to provide various definitions required to parse the CTF
header.  No functional change intended.

Reviewed by:	Domagoj Stolfa, emaste
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34359
2022-03-07 10:43:18 -05:00
Konstantin Belousov
1fb00c8f10 buf_alloc(): Stop using LK_NOWAIT, use LK_NOWITNESS
Although the buffer is taken from the cache or free list, it can still be
locked, due to 'lockless lookup' in getblkx() potentially operating on
the freed buffers.  The lock is transient, but prevents the use of
LK_NOWAIT there for the goal of neutralizing WITNESS.

Just use LK_NOWITNESS.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-03-06 10:29:31 -05:00
Alexander Motin
56070dd2e4 Improve timeout precision of pthread_cond_timedwait().
This code was not touched when all other user-space sleep functions were
switched to sbintime_t and decoupled from hardclock.  When it is possible,
convert supplied times into sbinuptime to supply directly to msleep_sbt()
with C_ABSOLUTE.  This provides a timeout resolution of a few microseconds
instead of 2 milliseconds, and avoids a few clock reads and conversions.

Reviewed by:	vangyzen
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D34163
2022-03-03 22:03:09 -05:00
John Baldwin
0b25cbc79d Fix the size returned for NT_FPREGSET.
Sponsored by:	University of Cambridge, Google, Inc.
2022-03-03 17:53:06 -08:00
John Baldwin
9af41803cb Use vnsz2log directly in assertion on its relation to sizeof(struct vnode).
This reduces the size of diffs required to support different values of
vnsz2log.  In CheriBSD, kernels for CHERI architectures have vnodes
larger than 512 bytes and require a value of 9.

Reviewed by:	mjg
Obtained from:	CheriBSD
Sponsored by:	University of Cambridge, Google, Inc.
Differential Revision:	https://reviews.freebsd.org/D34418
2022-03-03 17:52:07 -08:00
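A compile-time check tying a log2 constant to a structure's size might look like the sketch below. The struct, the constant name ex_sz2log, and the exact form of the assertion are assumptions made for illustration; only the idea of using the constant directly in the assertion comes from the message.

```c
#include <stddef.h>

/* Hypothetical object type padded to 600 bytes for the example. */
struct ex_obj {
	char	pad[600];
};

/* Log2 of the object size, rounded down; 600 lies in [512, 1024). */
#define	ex_sz2log	9

/*
 * Tie the constant to the type: the build fails if the struct grows or
 * shrinks out of the range [2^ex_sz2log, 2^(ex_sz2log + 1)).
 */
_Static_assert(sizeof(struct ex_obj) >= ((size_t)1 << ex_sz2log) &&
    sizeof(struct ex_obj) < ((size_t)1 << (ex_sz2log + 1)),
    "ex_sz2log needs to be updated");

/* Example consumer: drop the low bits that are constant across objects. */
static size_t
ex_hash_index(size_t addr)
{
	return (addr >> ex_sz2log);
}
```

Because the bounds are derived from the constant rather than written as literals, supporting a different value only requires changing the one definition.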
Mateusz Guzik
afb08a6d07 cache: hide hash stats behind DEBUG_CACHE
They take a long time to dump and hinder sysctl -a when used with
DIAGNOSTIC.
2022-03-03 17:21:58 +00:00
Mateusz Guzik
f3f3e3c44d fd: add close_range(..., CLOSE_RANGE_CLOEXEC)
For compatibility with Linux.

MFC after:	3 days
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D34424
2022-03-03 17:21:58 +00:00
Mark Johnston
879b0604a8 proc: Remove assertion that P_WEXIT is not set in proc_rwmem()
exit1() sets P_WEXIT before waiting for holding threads to finish,
rather than after, so this assertion is racy.

Fixes:	12fb39ec3e ("proc: Relax proc_rwmem()'s assertion on the process hold count")
Reported by:	Jenkins
2022-03-01 15:09:45 -05:00
Mark Johnston
12fb39ec3e proc: Relax proc_rwmem()'s assertion on the process hold count
This reference ensures that the process and its associated vmspace will
not be destroyed while proc_rwmem() is executing.  If, however, the
calling thread belongs to the target process, then it is unnecessary to
hold the process.  In particular, fasttrap - a module which enables
userspace dtrace - may frequently call proc_rwmem(), and we'd prefer to
avoid the overhead of locking and bumping the hold count when possible.

Thus, make the assertion conditional on "p != curproc".  Also assert
that the process is not already exiting.  No functional change intended.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-03-01 12:40:35 -05:00
Warner Losh
b36bd3a906 bus: Create dev_wired_cache
A simple cache to cache different locators to the same device.

Sponsored by:		Netflix
Changes Suggested by:	jhb
Differential Revision:	https://reviews.freebsd.org/D32783
2022-03-01 08:06:41 -07:00
Warner Losh
cae7d9ec83 bus: Add ACPI locator support
Add support for printing ACPI paths. This is a bit of a degenerate case
for this interface since it's always just the device handle if the
device has one. But it is illustrative of how to do this for a few nodes
in the tree.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32748
2022-03-01 08:06:41 -07:00
Warner Losh
38e942a345 devctl: Add DEV_GET_PATH
DEV_GET_PATH will get the path to a device based on different locators.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32745
2022-03-01 08:06:41 -07:00
Warner Losh
e19db70769 bus: Introduce the bus interface get_device_path
This returns the full path of the child device requested. Since
there are different ways to reckon the entire path, include a 'locator'
method. The default 'FreeBSD' method uses a filesystem-like path name
with each device to the root node separated by /. Other locators will be
UEFI, ACPI and fdt, though others are possible in the future. Make the
locator a string to allow maximum flexibility.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32744
2022-03-01 08:06:40 -07:00