In some configurations the firmware may pass memory regions that are
not page sized or aligned, e.g. when using 16k pages on arm64. If this
is the case we will calculate many small regions because the alignment
is applied before the regions are inserted. As we round the start up and
the end down, this leaves a one-page hole in what should have been a
single region.
Fix by keeping the original alignment until we are just about to insert
the region into the avail array.
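
The effect can be illustrated with a small userland sketch. It is not
the kernel code: struct region, SKETCH_PAGE_SIZE and both build_*
functions are hypothetical names, and the input is assumed to be sorted
by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x4000ULL	/* 16k pages, as in the arm64 case */

/* Half-open range [start, end); a hypothetical type for this sketch. */
struct region {
	uint64_t start;
	uint64_t end;
};

/* Round the start up and the end down to a page boundary. */
static struct region
region_align(struct region r)
{
	r.start = (r.start + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	r.end &= ~(SKETCH_PAGE_SIZE - 1);
	return (r);
}

/* Old order: align every region first, then merge touching regions. */
static size_t
build_align_first(const struct region *in, size_t n, struct region *out)
{
	size_t cnt = 0;

	assert(in != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		struct region r = region_align(in[i]);

		if (r.start >= r.end)
			continue;	/* nothing left after alignment */
		if (cnt > 0 && out[cnt - 1].end == r.start)
			out[cnt - 1].end = r.end;
		else
			out[cnt++] = r;
	}
	return (cnt);
}

/* New order: merge with the original bounds, align just before use. */
static size_t
build_align_last(const struct region *in, size_t n, struct region *out)
{
	size_t cnt = 0, kept = 0;

	assert(in != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		if (in[i].start >= in[i].end)
			continue;
		if (cnt > 0 && out[cnt - 1].end == in[i].start)
			out[cnt - 1].end = in[i].end;
		else
			out[cnt++] = in[i];
	}
	for (size_t i = 0; i < cnt; i++) {
		struct region r = region_align(out[i]);

		if (r.start < r.end)
			out[kept++] = r;
	}
	return (kept);
}
```

With two contiguous 4k-aligned inputs, [0x0, 0x5000) and
[0x5000, 0xc000), aligning first yields two regions with a one-page
hole between them, while aligning last yields a single region.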
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34694
These give us some confidence we haven't broken anything in early
boot code that may be running before the console.
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34691
Add three hooks to the livedump process: before, after, and for each
block of dumped data. This allows, for example, quiescing the system
before the dump begins or protecting data of interest to ensure its
consistency in the final output.
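
The shape of the three hooks can be sketched in userland C. This does
not reproduce the kernel's actual hook mechanism: struct dump_hooks,
dump_with_hooks() and the stats_* callbacks are hypothetical names
chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table: before, per-block, and after callbacks. */
struct dump_hooks {
	void (*before)(void *arg);
	void (*block)(void *arg, const void *data, size_t len);
	void (*after)(void *arg);
	void *arg;
};

/*
 * Walk a buffer in blocks of at most blksz bytes, calling the hooks
 * around the walk and once per block.  Returns the number of blocks.
 */
static size_t
dump_with_hooks(const unsigned char *buf, size_t len, size_t blksz,
    const struct dump_hooks *h)
{
	size_t nblocks = 0;

	assert(h != NULL && blksz > 0);
	if (h->before != NULL)
		h->before(h->arg);	/* e.g. quiesce the system */
	for (size_t off = 0; off < len; off += blksz) {
		size_t n = len - off < blksz ? len - off : blksz;

		if (h->block != NULL)
			h->block(h->arg, buf + off, n);
		nblocks++;
	}
	if (h->after != NULL)
		h->after(h->arg);	/* e.g. resume normal operation */
	return (nblocks);
}

/* Example consumer that only counts what it sees. */
struct hook_stats {
	int before;
	int after;
	size_t blocks;
	size_t bytes;
};

static void
stats_before(void *arg)
{
	((struct hook_stats *)arg)->before++;
}

static void
stats_block(void *arg, const void *data, size_t len)
{
	struct hook_stats *s = arg;

	(void)data;
	s->blocks++;
	s->bytes += len;
}

static void
stats_after(void *arg)
{
	((struct hook_stats *)arg)->after++;
}
```

A consumer that needs to protect data of interest would do its work in
the per-block callback; the counting callbacks here only show when each
hook fires.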
Reviewed by: markj, kib (previous version)
Reviewed by: debdrup (manpages)
Reviewed by: Pau Amma <pauamma@gundo.com> (manpages)
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34067
This dumper can instantiate and write the dump's contents to a
file-backed vnode.
Unlike existing disk or network dumpers, the vnode dumper should not be
invoked during a system panic, and therefore is not added to the global
dumper_configs list. Instead, the vnode dumper is constructed ad-hoc
when a live dump is requested using the new ioctl on /dev/mem. This is
similar in spirit to a kgdb session against the live system via
/dev/mem.
As described briefly in the mem(4) man page, live dumps are not
guaranteed to result in a usable output file, but offer some debugging
value where forcefully panicking a system to dump its memory is not
desirable/feasible.
A future change to savecore(8) will add an option to save a live dump.
Reviewed by: markj, Pau Amma <pauamma@gundo.com> (manpages)
Discussed with: kib
MFC after: 3 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D33813
Add a new function, dumper_create(), to allocate a dumper.
dumper_insert() now calls this function and retains the existing
behaviour.
This is desirable for performing live dumps of the system. Here, there
is a need to allocate and configure a dumper structure that is invoked
outside of the typical debugger context. Therefore, it should be
excluded from the list of panic-time dumpers.
free_single_dumper() is made public and renamed to dumper_destroy().
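
The split between allocating a dumper and registering it can be
sketched in userland C. The sk_* names and the structure layout are
hypothetical and much simpler than the kernel's; only the division of
labour follows the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified dumper. */
struct sk_dumper {
	char name[16];
	struct sk_dumper *next;
};

/* Stand-in for the list of panic-time dumpers. */
static struct sk_dumper *sk_panic_list;

/* Allocate and configure a dumper without registering it anywhere. */
static struct sk_dumper *
sk_dumper_create(const char *name)
{
	struct sk_dumper *d;

	assert(name != NULL);
	d = calloc(1, sizeof(*d));
	if (d == NULL)
		return (NULL);
	/* calloc() zeroed the buffer, so the name stays NUL-terminated. */
	strncpy(d->name, name, sizeof(d->name) - 1);
	return (d);
}

/* Release a dumper obtained from sk_dumper_create(). */
static void
sk_dumper_destroy(struct sk_dumper *d)
{
	free(d);
}

/* Existing behaviour: create a dumper and add it to the panic list. */
static int
sk_dumper_insert(const char *name)
{
	struct sk_dumper *d;

	d = sk_dumper_create(name);
	if (d == NULL)
		return (-1);
	d->next = sk_panic_list;
	sk_panic_list = d;
	return (0);
}

/* Count the registered panic-time dumpers. */
static size_t
sk_panic_list_len(void)
{
	size_t n = 0;

	for (struct sk_dumper *d = sk_panic_list; d != NULL; d = d->next)
		n++;
	return (n);
}
```

A live-dump caller would use the create/destroy pair directly, so its
dumper never appears on the panic-time list.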
Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34068
In some cases vn_open_cred overwrites cn_flags, effectively nullifying
initialisation done in NDINIT. This will have to be fixed.
In the meantime make sure the flag is passed.
Reported by: jenkins
Noted by: Mathieu <sigsys@gmail.com>
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).
Obtained from: CheriBSD
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D33780
Make sure both sides of a comparison are unsigned. As the values being
compared are size_t, make the value in the for loop size_t too.
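
A generic illustration of the pitfall, not the code touched by this
commit: both function names are hypothetical, and the first assumes the
usual case where size_t is at least as wide as int.

```c
#include <assert.h>
#include <stddef.h>

/*
 * What a mixed int/size_t comparison effectively does: the int operand
 * is converted to size_t first, so a negative value becomes huge.
 */
static int
converted_less(int a, size_t b)
{
	return ((size_t)a < b);
}

/* Loop index and bound share one unsigned type, so nothing converts. */
static size_t
count_at_least(const size_t *vals, size_t n, size_t threshold)
{
	size_t hits = 0;

	assert(vals != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (vals[i] >= threshold)
			hits++;
	}
	return (hits);
}
```

Here converted_less(-1, 1) is false even though -1 is mathematically
less than 1, which is the kind of surprise that keeping both sides
size_t avoids.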
Sponsored by: The FreeBSD Foundation
This completes the patch which was originally meant to go in.
Spotted by: mhorne
Fixes: c35ec1efdc ("vfs: [1/2] fix stalls in vnode reclaim by not
requeieing from vnlru")
It expects exactly one of those flags. A future commit will assert this.
Reviewed by: rstone
MFC after: 1 month
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D34451
Turns out execve looks at it to store the binary name, but in order to
trigger the problem one has to be trying to exec '/'. As is, the value
would be left uninitialized (or rather set to -1 on debug kernels).
Fixes: 56244d3574 ("vfs: hoist degenerate path lookups out of the
loop")
We use the p_itcallout callout, interlocked by the proc lock, to
schedule timeouts for the setitimer(2) system call. When a process
exits, the callout must be stopped before the process struct is
recycled.
Currently we attempt to stop the callout in exit1() with the call
_callout_stop_safe(&p->p_itcallout, CS_EXECUTING). If this call returns
0, then we sleep in order to drain the callout. However, this happens
only if the callout is not scheduled at all. If the callout thread is
blocked on the proc lock, then exit1() will not block and the callout
may execute after the process has fully exited, typically resulting in a
panic.
I cannot see a reason to use the CS_EXECUTING flag here. Instead, use
the regular callout_stop()/callout_drain() dance to halt the callout.
Reported by: ler
Tested by: ler, pho
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34625
umtxq_requeue() moves the queue to a different hash chain and different
lock, so we can't rely on msleep_sbt() reacquiring the same old lock.
We have to use PDROP and then update the queue chain, and with it the
lock pointer.
PR: 262587
MFC after: 2 weeks
cpu ticks) has some imprecision and, worse, a huge timestep (about
20 minutes on a 4GHz CPU) near 53.4 days of elapsed time.
kern_time.c/cputick2timespec() (used by clock_gettime() to query the
CPU time consumed by a process or thread) uses cputick2usec() and then
needlessly converts usec to nsec, obviously losing precision even with
a fixed cputick2usec().
kern_time.c/kern_clock_getres() uses a strange (and in any case wrong)
formula for getting cputick resolution.
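
One way to convert ticks straight to nanoseconds without going through
microseconds is to split off whole seconds first. This is a sketch of
that general technique, not necessarily what the commit implements;
sk_timespec and sk_cputick2timespec() are hypothetical names, and the
tick rate is assumed to stay below roughly 18.4 GHz so the remainder
multiplication cannot overflow 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct timespec. */
struct sk_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/*
 * Convert a tick count to seconds and nanoseconds.  Whole seconds are
 * divided out first, so only the remainder (always below rate) is
 * scaled by 10^9.
 */
static struct sk_timespec
sk_cputick2timespec(uint64_t ticks, uint64_t rate)
{
	struct sk_timespec ts;

	assert(rate > 0 && rate <= UINT64_MAX / 1000000000ULL);
	ts.tv_sec = (int64_t)(ticks / rate);
	ts.tv_nsec = (long)((ticks % rate) * 1000000000ULL / rate);
	return (ts);
}
```

At 4 GHz, 1000 ticks convert to 250 ns here, whereas a conversion that
rounds to whole microseconds first would report 0.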
PR: 262215
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D34558
This is used to calculate sizes that are then stored in unsigned long
fields. Make this unsigned long so the calculations use this type and
not an int that can lead to an integer overflow with a large PAGE_SIZE.
This allows building this on arm64 with PAGE_SIZE of 16k. Further work
will be needed if a 32-bit architecture tries to use a similar sized
page.
Sponsored by: The FreeBSD Foundation
The timestamp logs are quite large (often much larger than all the
other sysctls combined) so it's unlikely anyone will want to have
them displayed by `sysctl -a`.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D34616
Can be used when the fs at hand can synchronize insmntque by means
other than the vnode lock.
Reviewed by: markj
Tested by: pho (previous version)
Differential Revision: https://reviews.freebsd.org/D34466
Use a spinlock section instead of a critical section to synchronize with
statclock(). Otherwise the CLOCK_THREAD_CPUTIME_ID clock can appear to
go backwards.
PR: 262273
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34568
Some loops access the fd table of a different process, and drop the
filedesc lock while iterating, so they check the table's refcount.
However, we access the table before the first iteration, in order to get
the number of table entries, and this access can be a use-after-free.
Fix the problem by checking the refcount before we start iterating.
Reported by: pho
Reviewed by: mjg
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34575
In particular, use a generic wrapper around struct regset rather than
requiring per-regset helpers. This helper replaces the MI
__elfN(note_prstatus) and __elfN(note_fpregset) helpers. It also
removes the need to explicitly dump NT_ARM_ADDR_MASK in the arm64
__elfN(dump_thread).
Reviewed by: markj, emaste
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34446
In order to support various types of data stored in device
tree properties or ACPI _DSD packages, create a new enum so
the caller can specify the expected type of a property they
want to read, according to the binding. The bus logic will use
that information to process the underlying data.
For example in DT all integer properties are stored in BE format.
In order to get constant results across different platforms we
need to convert its endianness to match the host.
Another example is ACPI_TYPE_INTEGER properties, which are stored
as uint64_t. Before this patch the ACPI logic would refuse
to read them if the provided buffer was smaller than 8 bytes.
Now this can be handled by using DEVICE_PROP_UINT32 type.
Modify the existing consumers of this API to reflect the changes
and update the man pages accordingly.
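
The two conversions described above can be sketched in portable C. The
sk_* helpers are hypothetical and are not the bus API itself; in
particular, failing when a 64-bit value does not fit in 32 bits is an
assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell, whatever the host byte order. */
static uint32_t
sk_be32_decode(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	    (uint32_t)p[2] << 8 | (uint32_t)p[3]);
}

/*
 * Read 'count' big-endian cells from a raw property into host-order
 * integers.  Returns the number of cells read, or -1 if the raw data
 * is too short.
 */
static int
sk_prop_read_u32(const uint8_t *raw, size_t rawlen, uint32_t *out,
    size_t count)
{
	assert(raw != NULL && out != NULL);
	if (rawlen / 4 < count)
		return (-1);
	for (size_t i = 0; i < count; i++)
		out[i] = sk_be32_decode(raw + 4 * i);
	return ((int)count);
}

/* Hand a 64-bit integer to a caller that asked for 32 bits. */
static int
sk_u64_to_u32(uint64_t v, uint32_t *out)
{
	assert(out != NULL);
	if (v > UINT32_MAX)
		return (-1);	/* assumption: reject rather than truncate */
	*out = (uint32_t)v;
	return (0);
}
```

Decoding byte by byte gives the same result on little-endian and
big-endian hosts, which is the property the caller relies on.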
Reviewed by: mw
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D33457
There are multiple buses that pretend to be ofw compatible,
e.g. ofw_pci, mii_fdt. We now need to provide an implementation
of BUS_GET_PROPERTY for every one of them. Instead of modifying
them one by one it's better to just provide a default
implementation that simply traverses up the device tree.
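
The idea of a default method that passes the request to the parent bus
can be sketched in userland C. The sk_* names and the toy device
structure are hypothetical; the real bus methods have different
signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sk_device;

typedef int (*sk_get_prop_fn)(struct sk_device *bus,
    struct sk_device *child, const char *name, int *val);

/* Toy device: a parent link, an optional method, and one property. */
struct sk_device {
	struct sk_device *parent;
	sk_get_prop_fn get_property;	/* NULL means "use the default" */
	const char *prop_name;
	int prop_val;
};

static int sk_bus_get_property(struct sk_device *bus,
    struct sk_device *child, const char *name, int *val);

/* Default implementation: hand the request to the next bus up. */
static int
sk_generic_get_property(struct sk_device *bus, struct sk_device *child,
    const char *name, int *val)
{
	if (bus->parent == NULL)
		return (-1);
	return (sk_bus_get_property(bus->parent, child, name, val));
}

/* Dispatch to the bus's own method, or fall back to the default. */
static int
sk_bus_get_property(struct sk_device *bus, struct sk_device *child,
    const char *name, int *val)
{
	assert(bus != NULL && child != NULL && name != NULL && val != NULL);
	if (bus->get_property != NULL)
		return (bus->get_property(bus, child, name, val));
	return (sk_generic_get_property(bus, child, name, val));
}

/* A bus that actually knows how to look up a child's property. */
static int
sk_root_get_property(struct sk_device *bus, struct sk_device *child,
    const char *name, int *val)
{
	(void)bus;
	if (child->prop_name != NULL && strcmp(child->prop_name, name) == 0) {
		*val = child->prop_val;
		return (0);
	}
	return (-1);
}
```

Intermediate buses leave get_property unset and so need no code of
their own; the request reaches the first ancestor that implements the
method.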
Remove the now unneeded BUS_GET_PROPERTY implementation in mii_fdt.
Reviewed by: andrew, bz
Obtained from: Semihalf
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34031
It appears to have introduced a regression on arm64, possibly due to the
fact that the pcpu pointer is reloaded outside of the critical section
in _rm_rlock(). Until this is resolved one way or another, let's
revert.
Reported by: Ronald Klop <ronald-lists@klop.ws>
Sponsored by: The FreeBSD Foundation