Commit Graph

18992 Commits

Author SHA1 Message Date
John Baldwin
6b71405bfe Store core dump notes for all valid register sets for FreeBSD processes.
In particular, use a generic wrapper around struct regset rather than
requiring per-regset helpers.  This helper replaces the MI
__elfN(note_prstatus) and __elfN(note_fpregset) helpers.  It also
removes the need to explicitly dump NT_ARM_ADDR_MASK in the arm64
__elfN(dump_thread).

Reviewed by:	markj, emaste
Sponsored by:	University of Cambridge, Google, Inc.
Differential Revision:	https://reviews.freebsd.org/D34446
2022-03-10 15:40:19 -08:00
Kornel Duleba
b344de4d0d Extend device_get_property API
In order to support various types of data stored in device
tree properties or ACPI _DSD packages, create a new enum so
the caller can specify the expected type of a property they
want to read, according to the binding. The bus logic will use
that information to process the underlying data.

For example in DT all integer properties are stored in BE format.
In order to get constant results across different platforms we
need to convert its endianness to match the host.

Another example are ACPI_TYPE_INTEGER properties stored
as uint64_t. Before this patch the ACPI logic would refuse
to read them if the provided buffer was smaller than 8 bytes.
Now this can be handled by using DEVICE_PROP_UINT32 type.

Modify the existing consumers of this API to reflect the changes
and update the man pages accordingly.

Reviewed by: mw
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33457
2022-03-10 12:11:32 +01:00
Kornel Duleba
206dc82bc3 bus_if: Add a default implementation of get_property
There are multiple buses that pretend to be ofw compatible,
e.g ofw_pci, mii_fdt. We now need to provide an implementation
of BUS_GET_PROPERTY for every one of them. Instead of modifying
them one by one it's better to just provide a default
implementation that simply traverses up the device tree.
Remove the now unneeded BUS_GET_PROPERTY implementation in mii_fdt.

Reviewed by: andrew, bz
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34031
2022-03-10 12:11:32 +01:00
Mateusz Guzik
3a4c5dab92 vfs: [2/2] fix stalls in vnode reclaim by only counting attempts
... and ignoring if they succeded, which matches historical behavior.

Reported by:	pho
2022-03-10 09:41:50 +00:00
Mateusz Guzik
c35ec1efdc vfs: [1/2] fix stalls in vnode reclaim by not requeieing from vnlru
Reported by:	pho
2022-03-10 09:41:50 +00:00
Ed Maste
080b4e8a0c kcov: use __func__ in KASSERT instead of old function name
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:47:27 -05:00
Mark Johnston
afb44cb010 rmlock: Temporarily revert commit c84bb8cd77
It appears to have introduced a regression on arm64, possibly due to the
fact that the pcpu pointer is reloaded outside of the critical section
in _rm_rlock().  Until this is resolved one way or another, let's
revert.

Reported by:	Ronald Klop <ronald-lists@klop.ws>
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:43:19 -05:00
Mark Johnston
8dbae4ce32 linker: Permit CTFv3 containers
Reviewed by:	Domagoj Stolfa
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34362
2022-03-07 10:43:19 -05:00
Mark Johnston
cab9382a2c linker: Simplify CTF container handling
Use sys/ctf.h to provide various definitions required to parse the CTF
header.  No functional change intended.

Reviewed by:	Domagoj Stolfa, emaste
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34359
2022-03-07 10:43:18 -05:00
Konstantin Belousov
1fb00c8f10 buf_alloc(): Stop using LK_NOWAIT, use LK_NOWITNESS
Despite the buffer taken from cache or free list, it still can be
locked, due to 'lockless lookup' in getblkx() potentially operating on
the freed buffers.  The lock is transient, but prevents the use of
LK_NOWAIT there for the goal of neutralizing WITNESS.

Just use LK_NOWITNESS.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-03-06 10:29:31 -05:00
Alexander Motin
56070dd2e4 Improve timeout precision of pthread_cond_timedwait().
This code was not touched when all other user-space sleep functions were
switched to sbintime_t and decoupled from hardclock.  When it is possible,
convert supplied times into sbinuptime to supply directly to msleep_sbt()
with C_ABSOLUTE.  This provides the timeout resolution of few microseconds
instead of 2 milliseconds, plus avoids few clock reads and conversions.

Reviewed by:	vangyzen
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D34163
2022-03-03 22:03:09 -05:00
John Baldwin
0b25cbc79d Fix the size returned for NT_FPREGSET.
Sponsored by:	University of Cambridge, Google, Inc.
2022-03-03 17:53:06 -08:00
John Baldwin
9af41803cb Use vnsz2log directly in assertion on its relation to sizeof(struct vnode).
This reduces the size of diffs required to support different values of
vnsz2log.  In CheriBSD, kernels for CHERI architectures have vnodes
larger than 512 bytes and require a value of 9.

Reviewed by:	mjg
Obtained from:	CheriBSD
Sponsored by:	University of Cambridge, Google, Inc.
Differential Revision:	https://reviews.freebsd.org/D34418
2022-03-03 17:52:07 -08:00
Mateusz Guzik
afb08a6d07 cache: hide hash stats behind DEBUG_CACHE
They take a long time to dump and hinder sysctl -a when used with
DIAGNOSTIC.
2022-03-03 17:21:58 +00:00
Mateusz Guzik
f3f3e3c44d fd: add close_range(..., CLOSE_RANGE_CLOEXEC)
For compatibility with Linux.

MFC after:	3 days
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D34424
2022-03-03 17:21:58 +00:00
Mark Johnston
879b0604a8 proc: Remove assertion that P_WEXIT is not set in proc_rwmem()
exit1() sets P_WEXIT before waiting for holding threads to finish,
rather than after, so this assertion is racy.

Fixes:	12fb39ec3e ("proc: Relax proc_rwmem()'s assertion on the process hold count")
Reported by:	Jenkins
2022-03-01 15:09:45 -05:00
Mark Johnston
12fb39ec3e proc: Relax proc_rwmem()'s assertion on the process hold count
This reference ensures that the process and its associated vmspace will
not be destroyed while proc_rwmem() is executing.  If, however, the
calling thread belongs to the target process, then it is unnecessary to
hold the process.  In particular, fasttrap - a module which enables
userspace dtrace - may frequently call proc_rwmem(), and we'd prefer to
avoid the overhead of locking and bumping the hold count when possible.

Thus, make the assertion conditional on "p != curproc".  Also assert
that the process is not already exiting.  No functional change intended.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-03-01 12:40:35 -05:00
Warner Losh
b36bd3a906 bus: Create dev_wired_cache
A simple cache to cache differnet locators to the same device.

Sponsored by:		Netflix
Changes Suggested by:	jhb
Differential Revision:	https://reviews.freebsd.org/D32783
2022-03-01 08:06:41 -07:00
Warner Losh
cae7d9ec83 bus: Add ACPI locator support
Add support for printing ACPI paths. This is a bit of a degenerate case
for this interface since it's always just the device handle if the
device has one. But it is illustrtive of how to do this for a few nodes
in the tree.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32748
2022-03-01 08:06:41 -07:00
Warner Losh
38e942a345 devctl: Add DEV_GET_PATH
DEV_GET_PATH will get the path to a device based on different locators.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32745
2022-03-01 08:06:41 -07:00
Warner Losh
e19db70769 bus: Introduce the bus interface get_device_path
This returns the full path of a the child device requested. Since
there's different ways to recon the entire path, include a 'locator'
method. The default 'FreeBSD' method uses a filesystem-like path name
with each device to the root node separated by /. Other locators will be
UEFI, ACPI and fdt, though others are possible in the future. Make the
locator a string to allow maximum flexibility.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32744
2022-03-01 08:06:40 -07:00
Warner Losh
78408171bd devctl2: Change to 644 protections
We make sure that we check for device privs (usually meaning root or
better) for everything. To allow other functions that don't require
this, default to 644 protection.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D32863
2022-03-01 08:06:40 -07:00
Mark Johnston
89ae8eb74e rmlock: Add required compiler barriers to _rm_runlock()
Also remove excessive whitespace in _rm_rlock().

Reviewed by:	jah, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34381
2022-03-01 09:38:45 -05:00
Warner Losh
9891cb1e76 Eliminate curlen, it's set but never used
Sponsored by:		Netflix
2022-02-27 09:02:45 -07:00
Mark Johnston
c84bb8cd77 rmlock: Micro-optimize read locking
Use get_pcpu() instead of an open-coded pcpu_find(td->td_oncpu).  This
eliminates some memory accesses and results in a shorter instruction
sequence.  Note that get_pcpu() didn't exist when rmlocks were added.

Reviewed by:	jah, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34377
2022-02-25 13:55:24 -05:00
Marvin Ma
1517b8d5a7 vfs_unregister: fix error handling
Due to misplaced braces, an error from vfs_uninit() in the VFCF_SBDRY
case was ignored.

Reported by:	Anton Rang <rang@acm.org>
Reviewed by:	Anton Rang <rang@acm.org>, markj
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D34375
2022-02-25 12:19:14 -06:00
Warner Losh
2bfdc1ee9b cons: Use bool for boolean variables
MFC After:		3 days
Sponsored by:		Netflix
2022-02-24 10:58:07 -07:00
Jamie Gritton
d7c4ea7d72 posixshm: Allow jails to use kern.ipc.posix_shm_list
PR:		257554
Reported by:	grembo@
2022-02-24 09:30:49 -08:00
Gleb Smirnoff
d2b3a0ed31 sendto: don't clear transient errors for atomic protocols
The changeset 65572cade3 uncovered the fact that top layer of sendto(2)
would clear a transient error code if some data was copied out of uio.
The clearing of the error makes sense for non-atomic protocols, since
they have sent some data.  The atomic protocols send all or nothing.

The current implementation of unix/dgram uses sosend_generic(), which
would always copyout and only then it may fail to deliver a message.
The sosend_dgram(), currently used by UDP only, also has same behavior.

Reported by:		pho
Reviewed by:		pho, markj
Differential revision:	https://reviews.freebsd.org/D34309
2022-02-23 10:24:14 -08:00
Andrew Turner
d58b79d1ce Built all KCSAN atomic interceptors on arm64
These atomic functions are now supported. Add them to KCSAN.

Sponsored by:	The FreeBSD Foundation
2022-02-23 14:45:47 +00:00
Mateusz Guzik
f17ef28674 fd: rename fget*_locked to fget*_noref
This gets rid of the error prone naming where fget_unlocked returns with
a ref held, while fget_locked requires a lock but provides nothing in
terms of making sure the file lives past unlock.

No functional changes.
2022-02-22 18:53:43 +00:00
Robert Wing
0a2f498234 tty: fix a panic with INVARIANTS
watch'ing a tty triggers a refcount wraparound panic, take a reference
on fp after fget_cap_locked() to fix.

Reported by:    Michael Jung <mikej_at_paymentallianceintl.com>
Reviewed by:	hselasky, mjg
Fixes:          f40dd6c803 ("tty: switch ttyhook_register to use fget_cap_locked")
Differential Revision:	https://reviews.freebsd.org/D34335
2022-02-22 09:37:13 -09:00
Mitchell Horne
5a8fceb3bd boottrace: trace annotations for startup and shutdown
Add trace events for execution of SYSINITs (both static and dynamically
loaded), and to the various steps in the shutdown/panic/reboot paths.

Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
X-NetApp-PR:	#23
Differential Revision:	https://reviews.freebsd.org/D30187
2022-02-21 20:15:57 -04:00
Mitchell Horne
0aa9ffcd9c init_main.c: sort includes
This is preferred by style(9). Do this ahead of adding another include.

Reviewed by:	imp, kevans, allanjude
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30186
2022-02-21 20:15:51 -04:00
Mitchell Horne
877eea429b kern_linker.c: sort includes
This is preferred by style(9). Do this ahead of adding another include.

Reviewed by:	imp, kevans
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30185
2022-02-21 20:15:51 -04:00
Mitchell Horne
da5b7e90e7 boottrace: a simple boot and shutdown-time tracing facility
Boottrace is a facility for capturing trace events during boot and
shutdown. This includes kernel initialization, as well as rc. It has
been used by NetApp internally for several years, for catching and
diagnosing slow devices or subsystems. It is driven from userspace by
sysctl interface, and the output is a human-readable log of events
(kern.boottrace.log).

This commit adds the core boottrace functionality implementing these
interfaces. Adding the trace annotations themselves to kernel and
userland will happen in follow-up commits. A future commit will also add
a boottrace(4) man page.

For now, boottrace is unconditionally compiled into the kernel but
disabled by default. It can be enabled by setting the
kern.boottrace.enabled tunable to 1 in loader.conf(5).

There is an existing boot-time event tracing facility, which can be
compiled into the kernel with 'options TSLOG'. While there is some
functional overlap between this and boottrace, they are distinct. TSLOG
is suitable for generating detailed timing information and flamegraphs,
and has been used to great success by cperciva@ to diagnose and reduce
the overall system boot time. Boottrace aims to more quickly provide an
overview of timing and resource usage of the boot (and shutdown) process
to a sysadmin who requires this knowledge.

Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
X-NetApp-PR:	#23
Differential Revision:	https://reviews.freebsd.org/D30184
2022-02-21 20:15:45 -04:00
Mateusz Guzik
e68a5225e8 fd: add fde_copy
To dedup handrolled memcpy. This will be used later to make fd code
atomic-clean.
2022-02-15 17:51:08 +00:00
Mateusz Guzik
ec12b4f4ff fd: add missing seqc to dupfdopen 2022-02-15 17:51:08 +00:00
Mateusz Guzik
c9a995994b seqc: rename seqc_consistent_nomb to seqc_consistent_no_fence
For more consistency with other primitives.
2022-02-15 17:51:07 +00:00
John Baldwin
becaf6433b Use vmspace->vm_stacktop in place of sv_usrstack in more places.
Reviewed by:	markj
Obtained from:	CheriBSD
Differential Revision:	https://reviews.freebsd.org/D34174
2022-02-14 10:57:30 -08:00
Gleb Smirnoff
65572cade3 unix/dgram: return EAGAIN instead of ENOBUFS when O_NONBLOCK set
This is behavior what some programs expect and what Linux does.  For
example nginx expects EAGAIN when sending messages to /var/run/log,
which it connects to with O_NONBLOCK.  Particularly with nginx the
problem is magnified by the fact that a ENOBUFS on send(2) is also
logged, so situation creates a log-bomb - a failed log message
triggers another log message.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D34187
2022-02-14 09:21:55 -08:00
Mark Johnston
852ff943b9 sleepqueue: Annotate sleepq_max_depth as static
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-02-14 10:06:47 -05:00
Mark Johnston
893be9d8ac sleepqueue: Address a lock order reversal
After commit 74cf7cae4d ("softclock: Use dedicated ithreads for
running callouts."), there is a lock order reversal between the per-CPU
callout lock and the scheduler lock.  softclock_thread() locks callout
lock then the scheduler lock, when preparing to switch off-CPU, and
sleepq_remove_thread() stops the timed sleep callout while potentially
holding a scheduler lock.  In the latter case, it's the thread itself
that's locked, and if the thread is sleeping then its lock will be a
sleepqueue lock, but if it's still in the process of going to sleep
it'll be a scheduler lock.

We could perhaps change softclock_thread() to try to acquire locks in
the opposite order, but that'd require dropping and re-acquiring the
callout lock, which seems expensive for an operation that will happen
quite frequently.  We can instead perhaps avoid stopping the
td_slpcallout callout if the thread is still going to sleep, which is
what this patch does.  This will result in a spurious call to
sleepq_timeout(), but some counters suggest that this is very rare.

PR:		261198
Fixes:		74cf7cae4d ("softclock: Use dedicated ithreads for running callouts.")
Reported and tested by:	thj
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34204
2022-02-14 10:06:47 -05:00
Mateusz Guzik
6aa246e605 vfs: convert vnsz2log to a macro 2022-02-13 13:07:08 +00:00
Mateusz Guzik
5c31025060 fd: use FILEDESC_FOREACH_{FDE,FP} where appropriate 2022-02-13 13:07:08 +00:00
Mateusz Guzik
809f3121be fd: assign fd_freefile early when copying
This is to simplify an upcomming change.
2022-02-13 13:07:08 +00:00
Mateusz Guzik
893d20c95a fd: move fd table sizing out of fdinit
now it is placed with the rest of actual initialisation
2022-02-13 13:07:08 +00:00
Mateusz Guzik
4103c3cd5b fd: drop volatile keyword from refcounts
While here move a comment where it belongs and do small whitespace clean up.
2022-02-13 13:07:08 +00:00
Mateusz Guzik
b53133a778 proc: load/store p_cowgen using atomic primitives 2022-02-13 13:07:08 +00:00
Mateusz Guzik
29ee49f66b thread: remove dead store from thread_cow_update 2022-02-13 13:07:08 +00:00