This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).
Obtained from: CheriBSD
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D33780
Prior to this commit, the TSC and local APIC frequencies were calibrated
at boot time by measuring the clocks before and after a one-second sleep.
This was simple and effective, but had the disadvantage of *requiring a
one-second sleep*.
Rather than making two clock measurements (before and after sleeping) we
now perform many measurements; and rather than simply subtracting the
starting count from the ending count, we calculate a best-fit regression
between the target clock and the reference clock (for which the current
best available timecounter is used). While we do this, we keep track
of an estimate of the uncertainty in the regression slope (aka. the ratio
of clock speeds), and stop measuring when we believe the uncertainty is
less than 1 PPM.
In order to avoid the risk of aliasing resulting from the data-gathering
loop synchronizing with (a multiple of) the frequency of the reference
clock, we add some additional spinning depending upon the iteration number.
For numerical stability and simplicity of implementation, we make use of
floating-point arithmetic for the statistical calculations.
On the author's Dell laptop, this reduces the time spent in calibration
from 2000 ms to 29 ms; on an EC2 c5.xlarge instance, it is reduced from
2000 ms to 2.5 ms.
Reviewed by: bde (previous version), kib
MFC after: 1 month
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D33802
When detaching, truss(1) sends SIGSTOP to the traced process to ensure
that it is detaching in the steady state. But it is possible, for
multithreaded process, that wait() call returns event other than our
SIGSTOP notification. As result, SIGSTOP might sit in some thread'
sigqueue, which makes SIGCONT a nop. Then, the process is stopped when
the queued SIGSTOP is acted upon.
To handle this, loop until we drain everything before SIGSTOP,
and see that the process is stopped.
Note that the earlier fix makes it safe to have some more debugging
events longering after SIGSTOP is acted upon. They will be ignored
after PT_DETACH.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33861
vm_reserv.c uses its own bitstring implemenation for popmaps. Using
the bitstring_t type from a standard header eliminates the code
duplication, allows some bit-at-a-time operations to be replaced with
more efficient bitstring range operations, and, in
vm_reserv_test_contig, allows bit_ffc_area_at to more efficiently
search for a big-enough set of consecutive zero-bits.
Make bitstring changes improve the vm_reserv code. Define a bit_ntest
method to test whether a range of bits is all set, or all clear.
Define bit_ff_at and bit_ff_area_at to implement the ffs and ffc
versions with a parameter to choose between set- and clear- bits.
Improve the area_at implementation. Modify the bit_nset and
bit_nclear implementations to allow code optimization in the cases
when start or end are multiples of _BITSTR_BITS.
Add a few new cases to bitstring_test.
Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision: https://reviews.freebsd.org/D33312
Pointer authentication allows userspace to add instructions to insert
a Pointer Authentication Code (PAC) into a register based on an address
and modifier and check if the PAC is correct. If the check fails it will
either return an invalid address or fault to the kernel.
As many of these instructions are a NOP when disabled and in earlier
revisions of the architecture this can be used, for example, to sign
the return address before pushing it to the stack making Return-oriented
programming (ROP) attack more difficult on hardware that supports them.
The kernel manages five 128 bit signing keys: 2 instruction keys, 2 data
keys, and a generic key. The instructions then use one of these when
signing the registers. Instructions that use the first four store the
PAC in the register being signed, however the instructions that use the
generic key store the PAC in a separate register.
Currently all userspace threads share all the keys within a process
with a new set of userspace keys being generated when executing a new
process. This means a forked child will share its keys with its parent
until it calls an appropriate exec system call.
In the kernel we allow the use of one of the instruction keys, the ia
key. This will be used to sign return addresses in function calls.
Unlike userspace each kernel thread has its own randomly generated.
Thread0 has a static key as does the early code on secondary CPUs.
This should be safe as there is minimal user interaction with these
threads, however we could generate random keys when the Armv8.5
Random number generation instructions are present.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31261
The USB controller drivers assume they can cast a NULL pointer to a
struct and find the address of a member. KUBSan complains about this so
replace with the __offsetof and __containerof macros that use either a
builtin function where available, or the same NULL pointer on older
compilers without the builtin.
Reviewers: hselasky
Subscribers: imp
Reviewed by: hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33865
sddadump has been derived from sddastart.
mmc_sim interface has grown a new method, cam_poll, to support polled
operation.
mmc_sim code has been changed to provide a sim_poll hook only if the
controller implements the new method. The hooks is implemented in terms
of the new mmc_sim_cam_poll method.
Additionally, in-progress CCB-s now have CAM_REQ_INPROG status to
satisfy xpt_pollwait().
mmc_sim_cam_poll method has been implemented in dwmmc host controller.
Reviewed by: manu, mav, imp
MFC after: 2 weeks
Relnotes: perhaps
Differential Revision: https://reviews.freebsd.org/D33843
Add some test cases, based on the existing nullfs10 scenario, to
ensure that unionfs write references are propagated between the
unionfs and underlying vnodes, including unionfs copy-on-write
operations
Reviewed by: kib (prior version), markj, pho
Differential Revision: https://reviews.freebsd.org/D33729
do_execve() will hold the vnode lock shared when it calls VOP_OPEN(),
but unionfs_open() requires the lock to be held exclusive to
correctly synchronize node status updates. This requirement is
asserted in unionfs_get_node_status().
Change unionfs_open() to temporarily upgrade the lock as is already
done in unionfs_close(). Related to this, fix various cases throughout
unionfs in which vnodes are not checked for reclamation following lock
upgrades that may have temporarily dropped the lock. Also fix another
related issue in which unionfs_lock() can incorrectly add LK_NOWAIT
during a downgrade operation, which trips a lockmgr assertion.
Reviewed by: kib (prior version), markj, pho
Reported by: pho
Differential Revision: https://reviews.freebsd.org/D33729
This better reflects the variables purpose and matches other functions
in this file.
Requested by: markj
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33755
These callbacks allow multiple contiguous blocks to be manipulated in
a single call. Note that any trailing partial block for a stream
cipher must still be passed to encrypt/decrypt_last.
While here, document the setkey and reinit hooks and reorder the hooks
in 'struct enc_xform' to better reflect the life cycle.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33529
This cipher is a wrapper around the ChaCha20-Poly1305 AEAD cipher
which accepts a larger nonce. Part of the nonce is used along with
the key as an input to HChaCha20 to generate a derived key used for
ChaCha20-Poly1305.
This cipher is used by WireGuard.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33523
This was added to 'device crypto' in the kernel in
bbb7a2c7c329494e0148026f8568c0da4d8db085 but was missing from the
module.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33522
Entries for foo.debug files matching an existing entry in OLD_FILES or
OLD_LIBS are unnecessary as they are auto-generated.
Reviewed by: imp, emaste
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33777
These targets generate a raw list of the candidate old files roughly
corresponding to the values of OLD_DIRS, OLD_FILES, and OLD_LIBS.
Currently list-old-files also includes uncompressed manpages in
addition to compressed manpages.
Use these targets in the implementation of check-old-* and
delete-old-* to replace duplicated logic.
Reviewed by: imp, emaste
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33327
The UFS and ZFS file systems only support Allow/Deny ACEs
in the NFSv4 ACLs. This patch does not allow the server
to parse Audit/Alarm ACEs. The NFSv4 client is still
allowed to pase Audit/Alarm ACEs, since non-FreeBSD NFSv4
servers may use them.
This patch should not have a significant effect, since the
UFS and ZFS file systems will not handle these ACEs anyhow.
It simply serves as an additional "safety belt" for the
NFSv4 server.
MFC after: 2 weeks
For consistancy with the kernel linker script also use ${MACHINE} for
finding the kernel module linker script. As we currently only use this
for amd64 and i386 this is a no-op, but I'm planning on using this with
arm64 where ${MACHINE} != ${MACHINE_ARCH}.
Reviewed by: markj, kib, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33841
This reverts commit 0fa074b53e7c22157dcb41aaa25a33abc8118f26.
I now see that the implementation of the "dacl" operation
requires that the NFSv4 server to "automatic inheritance"
and I do not plan on doing this. As such, this patch is
harmless, but unneeded.
This reverts commit f10dc28ec21db60cf1faa3c4b445c4065e760dba.
The client should still be able to getfacl
audit and alarm ACEs, for non-FreeBSD NFSv4 servers.
A patch that only disables audit/alarm for the server
side will be committed to replace this patch.
The "bg" option does not go background until the initial mount
attempt fails, which can take 60+ seconds.
This new "bgnow" option goes background immediately, avoiding
the 60+ second delay, if the NFS server is not yet available.
The man page update is a content change.
Tested by: jwb
Reviewed by: debdrup, emaste
PR: 260764
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33733
Suspend/Resume of Win10 leads that CPU0 is busy on handling interrupts.
Win10 does not use LAPIC timer to often and in most cases, and I see it
is disabled by writing 0 to Initial Count Register (for Timer).
During resume, restart timer only for enabled LAPIC and enabled timer
for that LAPIC.
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33448
Remove always-false checks for UMA zone creation failure. No functional
change intended.
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33809
Flip dwwdt_prevent_restart to false. What's the use of a watchdog if it
does not restart a hung system?
Add a knob for panic-ing on the first timeout, resetting on the second
one. This can be useful if interrupts can still work, otherwise a reset
recovers a system without any aid for debugging the hang.
The change also doubles the timeout that's programmed into the hardware.
The previous version of the code always had the interrupt on the first
timeout enabled, but it took no action on it. Only the second timeout
could be configured to reset the system. So, the hardware timeout was
set to a half of the user requested timeout. But now,we can take a
corrective action on the first timeout, so we use the user requested
timeout.
While here, define boolean sysctl-s as such.
Reviewed by: manu
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D33534
We had a hardcoded limit of 1/128-th of physical memory that was further
subdivided between all CPUs as principal buffers are allocated on the
per-CPU basis. Actually, the buffers could use up 1/64-th of the
memmory because with the default switch policy there are two buffers per
CPU.
This commit allows to change that limit.
Note that the discussed limit is per dtrace command invocation.
The idea is to limit the size of a single malloc(9) call, not the total
memory size used by DTrace buffers.
Reviewed by: markj
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D33648
It was set to the start of the buffer and that can be different from the
start of teh first record because of a misalignment.
This change follows the example of dt_realloc_buf().
Reviewed by: tsoome, markj
MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D33649
Those can be returned by CHECK POWER MODE command (0xe5).
Note that some of the definitions duplicate definitions for Extended
Power Conditions.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33646