Commit Graph

19035 Commits

Author SHA1 Message Date
Andrew Turner
41e6d2091c Enable subr_physmem_test on supported architectures
Only build where it's supported.

While here add support for amd64 to help with testing.

Sponsored by:	The FreeBSD Foundation
2022-04-07 14:31:51 +01:00
Andrew Turner
d8bff5b67c Handle non-page aligned/sized memory in physmem
In some configurations the firmware may pass memory regions that are
not page sized or aligned, e.g. when using 16k pages on arm64. If this
is the case we will calculate many small regions because the alignment
is applied before being inserted. As we round the start up and end down
this will leave a 1 page hole between what should have been a single
region.

Fix by keeping the original alignment until we are just about to insert
the region into the avail array.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34694
2022-04-06 14:13:29 +01:00
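The alignment problem described in d8bff5b67c can be illustrated with a small userspace sketch. It is not the subr_physmem code: the `sk_` names, the fixed 16k page size, and the assumption that input regions arrive sorted are all made up for illustration. It compares aligning each region before merging with merging first and aligning only at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE	0x4000u	/* assumed 16k pages, as in the arm64 case */
#define SK_MAX_REGIONS	8	/* arbitrary limit for this sketch */

struct sk_region {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
};

static uint64_t
sk_round_up(uint64_t x)
{
	return ((x + SK_PAGE_SIZE - 1) & ~(uint64_t)(SK_PAGE_SIZE - 1));
}

static uint64_t
sk_round_down(uint64_t x)
{
	return (x & ~(uint64_t)(SK_PAGE_SIZE - 1));
}

/* Round each start up and each end down; drop regions that become empty. */
static size_t
sk_align(const struct sk_region *in, size_t n, struct sk_region *out)
{
	size_t cnt = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t s = sk_round_up(in[i].start);
		uint64_t e = sk_round_down(in[i].end);

		if (s < e) {
			out[cnt].start = s;
			out[cnt].end = e;
			cnt++;
		}
	}
	return (cnt);
}

/* Merge regions that touch exactly; input is assumed sorted by start. */
static size_t
sk_merge(const struct sk_region *in, size_t n, struct sk_region *out)
{
	size_t cnt = 0;

	for (size_t i = 0; i < n; i++) {
		if (in[i].start >= in[i].end)
			continue;
		if (cnt > 0 && out[cnt - 1].end == in[i].start)
			out[cnt - 1].end = in[i].end;
		else
			out[cnt++] = in[i];
	}
	return (cnt);
}

/* Early alignment: align first, then merge. */
static size_t
sk_align_then_merge(const struct sk_region *in, size_t n, struct sk_region *out)
{
	struct sk_region tmp[SK_MAX_REGIONS];

	assert(n <= SK_MAX_REGIONS);
	return (sk_merge(tmp, sk_align(in, n, tmp), out));
}

/* Late alignment: merge with the original bounds, align just before use. */
static size_t
sk_merge_then_align(const struct sk_region *in, size_t n, struct sk_region *out)
{
	struct sk_region tmp[SK_MAX_REGIONS];

	assert(n <= SK_MAX_REGIONS);
	return (sk_align(tmp, sk_merge(in, n, tmp), out));
}
```

With two adjacent firmware regions [0x0000, 0x5000) and [0x5000, 0xC000), early alignment yields two regions separated by the page at 0x4000, while late alignment yields the single region [0x0000, 0xC000).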
Andrew Turner
8c99dfed54 Port subr_physmem to userspace and add tests
These give us some confidence we haven't broken anything in early
boot code that may be running before the console.

Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34691
2022-04-06 14:13:05 +01:00
Mitchell Horne
eb9d205fa6 livedump: add event handler hooks
Add three hooks to the livedump process: before, after, and for each
block of dumped data. This allows, for example, quiescing the system
before the dump begins or protecting data of interest to ensure its
consistency in the final output.

Reviewed by:	markj, kib (previous version)
Reviewed by:	debdrup (manpages)
Reviewed by:	Pau Amma <pauamma@gundo.com> (manpages)
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34067
2022-04-05 15:35:05 -03:00
Mitchell Horne
c9114f9f86 Add new vnode dumper to support live minidumps
This dumper can instantiate and write the dump's contents to a
file-backed vnode.

Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.

As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.

A future change to savecore(8) will add an option to save a live dump.

Reviewed by:	markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with:	kib
MFC after:	3 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D33813
2022-04-05 15:35:05 -03:00
Mitchell Horne
59c27ea18c Split out dumper allocation from list insertion
Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() will call this function and retains the existing
behaviour.

This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.

free_single_dumper() is made public and renamed to dumper_destroy().

Reviewed by:	kib, markj
MFC after:	1 week
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34068
2022-04-05 15:35:05 -03:00
Mateusz Guzik
b7262756e2 vfs: fixup WANTIOCTLCAPS on open
In some cases vn_open_cred overwrites cn_flags, effectively nullifying
initialisation done in NDINIT. This will have to be fixed.

In the meantime make sure the flag is passed.

Reported by:	jenkins
Noted by:	Mathieu <sigsys@gmail.com>
2022-04-02 20:49:01 +02:00
Gordon Bergling
c9b04ee4f8 kern: Fix two typos in source code comments
- s/accomodate/accommodate/

MFC after:	3 days
2022-04-02 14:52:49 +02:00
Gordon Bergling
7181887e82 kern: Fix two typos in source code comments
- s/measurment/measurement/

MFC after:	3 days
2022-04-02 14:15:27 +02:00
Mateusz Guzik
0c805718cb vfs: fix memory leak on lookup with fds with ioctl caps
Reviewed by:	markj
PR:		262515
Noted by:	firk@cantconnect.ru
Differential Revision:	https://reviews.freebsd.org/D34667
2022-04-02 12:09:07 +00:00
Gordon Bergling
669d5ea4e3 kern: Fix a typo in a source code comment
- s/paniced/panicked/

MFC after:	3 days
2022-04-02 10:15:02 +02:00
Ed Maste
e5821a2156 syscalls.master: remove obsolete comment about compatibility tables
Compatibility ABIs no longer use a separate syscalls.master.

Fixes:		be67ea40c5 ("freebsd32: generate from ...")
Sponsored by:	The FreeBSD Foundation
2022-03-30 11:07:00 -04:00
Brooks Davis
8601fca789 sysent: regen for syscallarg_t
2022-03-28 19:43:03 +01:00
Brooks Davis
b1ad6a9000 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-03-28 19:43:03 +01:00
Andrew Turner
f461b95561 Fix a sign mismatch warning in the physmem code
Make sure both sides of a comparison are unsigned. As the values being
compared are size_t make the value in the for loop size_t too.

Sponsored by:	The FreeBSD Foundation
2022-03-28 11:51:09 +01:00
Mateusz Guzik
2533b5dc82 vfs: add missing bits to vdropl_impl
This completes the patch which was originally meant to go in.

Spotted by:	mhorne
Fixes: c35ec1efdc ("vfs: [1/2] fix stalls in vnode reclaim by not
requeueing from vnlru")
2022-03-27 14:35:37 +00:00
Mateusz Guzik
a4032e2a69 vfs: assorted tidy ups to lookup
No functional changes.
2022-03-26 17:06:09 +00:00
Alexander Leidinger
aeb91e95cf Log euid, rgid and jail on listen queue overflow
If you have numerous jails with multiple similar services running,
this helps to narrow down which services this log is referring to.
2022-03-26 11:17:55 +01:00
Eric van Gyzen
aca2a7faca stack_zero is not needed before stack_save
The man page was recently clarified to commit to this contract.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2022-03-25 20:10:38 -05:00
Eric van Gyzen
863070bbf6 ksiginfo_alloc: pass M_WAITOK or M_NOWAIT to uma_zalloc
It expects exactly one of those flags.  A future commit will assert this.

Reviewed by:	rstone
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D34451
2022-03-25 20:10:37 -05:00
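The "exactly one of those flags" rule from 863070bbf6 could be checked along the following lines. This is only a sketch: the `SK_` flag names and their values are placeholders chosen for illustration, not the kernel's definitions, and `sk_checked_flags()` is a hypothetical wrapper.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag values for this sketch only. */
#define SK_NOWAIT	0x0001
#define SK_WAITOK	0x0002
#define SK_ZERO		0x0100

/* True when exactly one of the two wait flags is set. */
static bool
sk_wait_flags_valid(int flags)
{
	int wait = flags & (SK_NOWAIT | SK_WAITOK);

	return (wait == SK_NOWAIT || wait == SK_WAITOK);
}

/* Hypothetical allocator entry point that asserts the contract. */
static int
sk_checked_flags(int flags)
{
	assert(sk_wait_flags_valid(flags));
	return (flags);
}
```

Passing neither flag or both flags fails the check; unrelated flags such as the placeholder `SK_ZERO` do not affect it.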
Mateusz Guzik
0f60088399 vfs: set cn_namelen when handling degenerate lookups
Turns out execve looks at it to store binary name, but in order to
trigger the problem one has to be trying to exec '/'. As is the value
would be left uninitialized (or rather set to -1 on debug kernels).

Fixes:	56244d3574 ("vfs: hoist degenerate path lookups out of the
loop")
2022-03-25 18:19:36 +00:00
Mateusz Guzik
4ef6e56ae8 vfs: hoist trailing slash handling out of the loop
2022-03-24 14:36:31 +00:00
Mateusz Guzik
3b6792d28a vfs: factor symlink traversal out of namei
The intent down the road is to eliminate the loop to begin with,
pushing traversal down to vfs_lookup, all while not allocating the
extra buffer.
2022-03-24 13:11:22 +00:00
Mateusz Guzik
d9ea7e2b1e vfs: factor FAILIFEXISTS handling out of vfs_lookup
2022-03-24 11:22:20 +00:00
Mateusz Guzik
56244d3574 vfs: hoist degenerate path lookups out of the loop
2022-03-24 11:22:12 +00:00
Mateusz Guzik
bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)
2022-03-24 10:20:51 +00:00
Mark Johnston
1babcad6bc elf: Avoid dumping uninitialized bytes in PRSTATUS core dump notes
elf_prstatus_t contains pad space.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34606
2022-03-23 12:53:49 -04:00
Mark Johnston
7524994da0 callout: Remove the CS_EXECUTING flag
It is now unused.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34626
2022-03-23 12:37:02 -04:00
Mark Johnston
b319171861 setitimer: Fix exit race
We use the p_itcallout callout, interlocked by the proc lock, to
schedule timeouts for the setitimer(2) system call.  When a process
exits, the callout must be stopped before the process struct is
recycled.

Currently we attempt to stop the callout in exit1() with the call
_callout_stop_safe(&p->p_itcallout, CS_EXECUTING).  If this call returns
0, then we sleep in order to drain the callout.  However, this happens
only if the callout is not scheduled at all.  If the callout thread is
blocked on the proc lock, then exit1() will not block and the callout
may execute after the process has fully exited, typically resulting in a
panic.

I cannot see a reason to use the CS_EXECUTING flag here.  Instead, use
the regular callout_stop()/callout_drain() dance to halt the callout.

Reported by:	ler
Tested by:	ler, pho
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34625
2022-03-23 12:36:12 -04:00
Alexander Motin
fd6ca665d2 Fix umtxq_sleep() regression caused by 56070dd2e4.
umtxq_requeue() moves the queue to a different hash chain and different
lock, so we can't rely on msleep_sbt() reacquiring the same old lock.
We have to use PDROP and update the queue chain and thus the lock pointer.

PR:		262587
MFC after:	2 weeks
2022-03-21 19:55:55 -04:00
firk
bb53dd56c3 kern_tc.c/cputick2usec() (which is used to calculate cputime from
cpu ticks) has some imprecision and, worse, huge timestep (about
20 minutes on 4GHz CPU) near 53.4 days of elapsed time.

kern_time.c/cputick2timespec() (it is used for clock_gettime() for
querying process or thread consumed cpu time) uses cputick2usec()
and then needlessly converts usec to nsec, obviously losing
precision even with fixed cputick2usec().

kern_time.c/kern_clock_getres() uses some weird (anyway wrong)
formula for getting cputick resolution.

PR:		262215
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D34558
2022-03-21 09:33:46 -04:00
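The message for bb53dd56c3 does not say where the 53.4-day figure comes from. One reading, offered here as an assumption, is that a 64-bit tick count multiplied by 1000 stops fitting once the count passes 2^64 / 1000, which at 4 GHz is about 4.6 million seconds, or roughly 53.4 days. The sketch below computes that bound and shows one overflow-safe way to convert ticks to microseconds or nanoseconds by splitting off whole seconds first. The `sk_` helpers are hypothetical and are not the kernel functions named in the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds of runtime after which ticks * 1000 no longer fits in 64 bits. */
static uint64_t
sk_mul1000_limit_seconds(uint64_t rate_hz)
{
	assert(rate_hz > 0);
	return ((UINT64_MAX / 1000u) / rate_hz);
}

/*
 * Convert ticks to units of 1/scale seconds (scale = 1000000 for usec,
 * 1000000000 for nsec). Whole seconds and the sub-second remainder are
 * scaled separately, so the only multiplication on the remainder is
 * bounded by rate_hz * scale. The final result is assumed to fit in
 * 64 bits.
 */
static uint64_t
sk_ticks_scale(uint64_t ticks, uint64_t rate_hz, uint64_t scale)
{
	assert(rate_hz > 0 && scale > 0);
	assert(rate_hz <= UINT64_MAX / scale);

	uint64_t sec = ticks / rate_hz;
	uint64_t rem = ticks % rate_hz;		/* always < rate_hz */

	return (sec * scale + rem * scale / rate_hz);
}

static uint64_t
sk_ticks_to_usec(uint64_t ticks, uint64_t rate_hz)
{
	return (sk_ticks_scale(ticks, rate_hz, 1000000u));
}

/* Converting directly avoids the precision lost by going through usec. */
static uint64_t
sk_ticks_to_nsec(uint64_t ticks, uint64_t rate_hz)
{
	return (sk_ticks_scale(ticks, rate_hz, 1000000000u));
}
```

Converting ticks straight to nanoseconds keeps sub-microsecond detail that a ticks-to-usec-to-nsec chain would discard, which is the precision loss the commit message points at.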
Andrew Turner
cab496e16c Make SHMMAXPGS an unsigned long
This is used to calculate sizes that are then stored in unsigned long
fields. Make this unsigned long so the calculations use this type and
not an int that can lead to an integer overflow with a large PAGE_SIZE.

This allows building this on arm64 with PAGE_SIZE of 16k. Further work
will be needed if a 32-bit architecture tries to use a similar sized
page.

Sponsored by:	The FreeBSD Foundation
2022-03-21 10:27:35 +00:00
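The overflow that cab496e16c avoids can be shown with plain arithmetic. The sketch assumes a 32-bit int and uses a page count of 131072 purely as an illustrative value; the `sk_` helpers are hypothetical. The check is done in a wider type so the example itself never performs a signed overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True if pages * page_size would not fit in an int. */
static bool
sk_int_product_overflows(int pages, int page_size)
{
	assert(pages >= 0 && page_size > 0);
	return ((int64_t)pages * page_size > INT_MAX);
}

/* The same product carried out in unsigned long, as the commit does. */
static unsigned long
sk_shm_bytes(unsigned long pages, unsigned long page_size)
{
	return (pages * page_size);
}
```

With 4k pages the product is 2^29 and fits in an int; with 16k pages it is 2^31, one past INT_MAX, while the unsigned long version still holds the value.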
Colin Percival
2406867f5b tslog: Add CTLFLAG_SKIP to sysctls
The timestamp logs are quite large (often much larger than all the
other sysctls combined) so it's unlikely anyone will want to have
them displayed by `sysctl -a`.

MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D34616
2022-03-20 11:31:16 -07:00
Mateusz Guzik
6ff3e8a316 cache: add a comment about a realpath bug
2022-03-19 15:11:25 +00:00
Mateusz Guzik
eb574ba0b6 vfs: replace VFS_NOTIFY_UPPER_* macros with an enum
2022-03-19 13:15:55 +00:00
Mateusz Guzik
cceb91b025 vfs: add missing flags to db show mount
2022-03-19 12:04:44 +00:00
Mateusz Guzik
93a0ba8f49 vfs: retire the no longer used MNTK_LOOKUP_EXCL_DOTDOT flag
Reviewed by:	markj
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D34466
2022-03-19 10:47:29 +00:00
Mateusz Guzik
1cb0045c97 vfs: add MNTK_UNLOCKED_INSMNTQUE
Can be used when the fs at hand can synchronize insmntque with other
means than the vnode lock.

Reviewed by:	markj
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D34466
2022-03-19 10:46:40 +00:00
firk
28d08dc7d0 clock_gettime: Fix CLOCK_THREAD_CPUTIME_ID race
Use a spinlock section instead of a critical section to synchronize with
statclock().  Otherwise the CLOCK_THREAD_CPUTIME_ID clock can appear to
go backwards.

PR:		262273
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34568
2022-03-17 15:39:00 -04:00
Mark Johnston
fc7e121d88 file: Move FILEDESC_FOREACH macros to kern_descrip.c
They are only used in kern_descrip.c, so make them private.  No
functional change intended.

Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
2022-03-17 15:39:00 -04:00
Mark Johnston
c702242292 file: Avoid a read-after-free of fd tables in sysctl handlers
Some loops access the fd table of a different process, and drop the
filedesc lock while iterating, so they check the table's refcount.
However, we access the table before the first iteration, in order to get
the number of table entries, and this access can be a use-after-free.

Fix the problem by checking the refcount before we start iterating.

Reported by:	pho
Reviewed by:	mjg
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34575
2022-03-17 15:39:00 -04:00
Mateusz Guzik
0134bbe56f vfs: prefix lookup and relookup with vfs_
Reviewed by:	imp, mckusick
Differential Revision:		https://reviews.freebsd.org/D34530
2022-03-13 14:44:39 +00:00
Mateusz Guzik
02fc4e319c cache: use flexible array member
... instead of 0-sizing the array
2022-03-13 14:43:35 +00:00
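For readers unfamiliar with the idiom in 02fc4e319c, a C99 flexible array member looks like the sketch below. The structure and helper are hypothetical and unrelated to the actual name cache layout; the older zero-length form `char name[0]` is a compiler extension rather than standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length record, for illustration only. */
struct sk_name_entry {
	size_t	len;
	char	name[];		/* flexible array member, must be last */
};

/* Allocate the header and the string storage in one block. */
static struct sk_name_entry *
sk_name_entry_new(const char *s)
{
	assert(s != NULL);

	size_t len = strlen(s);
	struct sk_name_entry *e = malloc(sizeof(*e) + len + 1);

	if (e == NULL)
		return (NULL);
	e->len = len;
	memcpy(e->name, s, len + 1);	/* includes the terminating NUL */
	return (e);
}
```

`sizeof(struct sk_name_entry)` does not count the array, so the allocation adds the payload size explicitly.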
John Baldwin
6b71405bfe Store core dump notes for all valid register sets for FreeBSD processes.
In particular, use a generic wrapper around struct regset rather than
requiring per-regset helpers.  This helper replaces the MI
__elfN(note_prstatus) and __elfN(note_fpregset) helpers.  It also
removes the need to explicitly dump NT_ARM_ADDR_MASK in the arm64
__elfN(dump_thread).

Reviewed by:	markj, emaste
Sponsored by:	University of Cambridge, Google, Inc.
Differential Revision:	https://reviews.freebsd.org/D34446
2022-03-10 15:40:19 -08:00
Kornel Duleba
b344de4d0d Extend device_get_property API
In order to support various types of data stored in device
tree properties or ACPI _DSD packages, create a new enum so
the caller can specify the expected type of a property they
want to read, according to the binding. The bus logic will use
that information to process the underlying data.

For example in DT all integer properties are stored in BE format.
In order to get constant results across different platforms we
need to convert its endianness to match the host.

Another example is ACPI_TYPE_INTEGER properties, which are stored
as uint64_t. Before this patch the ACPI logic would refuse
to read them if the provided buffer was smaller than 8 bytes.
Now this can be handled by using DEVICE_PROP_UINT32 type.

Modify the existing consumers of this API to reflect the changes
and update the man pages accordingly.

Reviewed by: mw
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33457
2022-03-10 12:11:32 +01:00
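The endianness conversion mentioned in b344de4d0d can be sketched as follows. This is not the device_get_property implementation: `sk_prop_read_u32()` and its return convention are hypothetical, and the sketch only covers the device tree case where each integer is a 32-bit big-endian cell.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell, independent of host byte order. */
static uint32_t
sk_be32_decode(const uint8_t *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/*
 * Hypothetical helper: copy a raw property made of big-endian cells into
 * host-order integers. Returns the number of cells, or -1 if the raw
 * length is not a multiple of 4 or the output buffer is too small.
 */
static int
sk_prop_read_u32(const uint8_t *raw, size_t raw_len, uint32_t *out,
    size_t out_cells)
{
	assert(raw != NULL && out != NULL);

	if (raw_len % 4 != 0 || raw_len / 4 > out_cells)
		return (-1);
	for (size_t i = 0; i < raw_len / 4; i++)
		out[i] = sk_be32_decode(raw + 4 * i);
	return ((int)(raw_len / 4));
}
```

Decoding byte by byte gives the same result on little-endian and big-endian hosts, which is the consistency across platforms the commit message asks for.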
Kornel Duleba
206dc82bc3 bus_if: Add a default implementation of get_property
There are multiple buses that pretend to be ofw compatible,
e.g. ofw_pci, mii_fdt. We now need to provide an implementation
of BUS_GET_PROPERTY for every one of them. Instead of modifying
them one by one it's better to just provide a default
implementation that simply traverses up the device tree.
Remove the now unneeded BUS_GET_PROPERTY implementation in mii_fdt.

Reviewed by: andrew, bz
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34031
2022-03-10 12:11:32 +01:00
Mateusz Guzik
3a4c5dab92 vfs: [2/2] fix stalls in vnode reclaim by only counting attempts
... and ignoring if they succeeded, which matches historical behavior.

Reported by:	pho
2022-03-10 09:41:50 +00:00
Mateusz Guzik
c35ec1efdc vfs: [1/2] fix stalls in vnode reclaim by not requeueing from vnlru
Reported by:	pho
2022-03-10 09:41:50 +00:00
Ed Maste
080b4e8a0c kcov: use __func__ in KASSERT instead of old function name
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:47:27 -05:00
Mark Johnston
afb44cb010 rmlock: Temporarily revert commit c84bb8cd77
It appears to have introduced a regression on arm64, possibly due to the
fact that the pcpu pointer is reloaded outside of the critical section
in _rm_rlock().  Until this is resolved one way or another, let's
revert.

Reported by:	Ronald Klop <ronald-lists@klop.ws>
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:43:19 -05:00