It seems that in r354857 I got more than one thing wrong.
Convert the SYSCTL_OPAQUE to a SYSCTL_PROC to properly export the these
days allocated and not longer static per-vnet viftable array.
This fixes a problem with netstat -g which would show bogus information
for the IPv4 Virtual Interface Table.
PR: 246626
Reported by: Ozkan KIRIK (ozkan.kirik gmail.com)
MFC after: 3 days
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled. Otherwise, there is no
functional change. The mechanism can be disabled with
debug.fxrng_vdso_enable=0.
arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time. Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed. On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.
This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator. That change is
left for future discussion/work.
Reviewed by: kib (previous version)
Approved by: csprng (me -- only touching FXRNG here)
Differential Revision: https://reviews.freebsd.org/D22839
There is no functional change for the existing Fortuna random(4)
implementation, which remains the default in GENERIC.
In the FenestrasX model, when the root CSPRNG is reseeded from pools due to
an (infrequent) timer, child CSPRNGs can cheaply detect this condition and
reseed. To do so, they just need to track an additional 64-bit value in the
associated state, and compare it against the root seed version (generation)
on random reads.
This revision integrates arc4random(9) into that model without substantially
changing the design or implementation of arc4random(9). The motivation is
that arc4random(9) is immediately reseeded when the backing random(4)
implementation has additional entropy. This is arguably most important
during boot, when fenestrasX is reseeding at 1, 3, 9, 27, etc., second
intervals. Today, arc4random(9) has a hardcoded 300 second reseed window.
Without this mechanism, if arc4random(9) gets weak entropy during initial
seed (and arc4random(9) is used early in boot, so this is quite possible),
it may continue to emit poorly seeded output for 5 minutes. The FenestrasX
push-reseed scheme corrects consumers, like arc4random(9), as soon as
possible.
Reviewed by: markm
Approved by: csprng (markm)
Differential Revision: https://reviews.freebsd.org/D22838
Fortuna remains the default; no functional change to GENERIC.
Big picture:
- Scalable entropy generation with per-CPU, buffered local generators.
- "Push" system for reseeding child generators when root PRNG is
reseeded. (Design can be extended to arc4random(9) and userspace
generators.)
- Similar entropy pooling system to Fortuna, but starts with a single
pool to quickly bootstrap as much entropy as possible early on.
- Reseeding from pooled entropy based on time schedule. The time
interval starts small and grows exponentially until reaching a cap.
Again, the goal is to have the RNG state depend on as much entropy as
possible quickly, but still periodically incorporate new entropy for
the same reasons as Fortuna.
Notable design choices in this implementation that differ from those
specified in the whitepaper:
- Blake2B instead of SHA-2 512 for entropy pooling
- Chacha20 instead of AES-CTR DRBG
- Initial seeding. We support more platforms and not all of them use
loader(8). So we have to grab the initial entropy sources in kernel
mode instead, as much as possible. Fortuna didn't have any mechanism
for this aside from the special case of loader-provided previous-boot
entropy, so most of these sources remain TODO after this commit.
Reviewed by: markm
Approved by: csprng (markm)
Differential Revision: https://reviews.freebsd.org/D22837
DTS must be synced with the kernel, add a freebsd,dts-version string in
the root node of each DTS that we compile so we can later in the kernel
check that it contain a correct value.
Reviewed by: imp, mmel
Differential Revision: https://reviews.freebsd.org/D26724
r366344 corrected the optimized amd64 skein assembly implementation, so
we can now enable it again.
Also add a dependency on this Makefile for the skein_block object, so
that it will be rebuit (similar to r366362).
PR: 248221
Sponsored by: The FreeBSD Foundation
r362163 upgraded mountd so that it could handle MAX_NGROUPS
groups for the anonymous user credentials (the ones provided by
-maproot and -mapall exports options).
The problem is that this resulted in every export structure growing by
about 4Kbytes, because the cr_groups field went from 16->MAX_NGROUPS.
This patch fixes this by only including a small 32 element cr_groups in the
structure and then malloc()'ng cr_groups when a larger one is needed.
The value of SMALLNGROUPS is arbitrarily set to 32, assuming most users
used by -maproot or -mapall will be in <= 32 groups.
Reviewed by: kib, freqlabs
Differential Revision: https://reviews.freebsd.org/D26521
r365732 was the first attempt to get an accurate count but it was
writing to some read-only registers to clear them and that obviously
didn't work. Instead, note the counter's value when it is supposed to
be cleared and subtract it from future readings.
dev.<port>.stats.rx_fcs_error should not be serviced from the MPS
register for T6.
The stats.* sysctls should all use T5_PORT_REG for T5 and above. This
must have been missed in the initial T5 support years ago. Fix it while
here.
MFC after: 3 days
Sponsored by: Chelsio Communications
Truncating requires an exclusive lock, but it was not taken if the
filesystem indicates support for shared writes. This only concerns
ZFS.
In particular fixes cp of files which have trailing holes.
Reported by: bdrewery
The module handler code invokes a MOD_UNLOAD event immediately if
MOD_LOAD fails. The result was that if seminit() failed, semunload()
was invoked twice. semunload() is not idempotent however and would
try to remove it's process_exit eventhandler twice resulting in a
panic.
Reviewed by: kib, markj
Obtained from: CheriBSD
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26696
Use of dead_vnodeops would result in a panic instead of returning the intended
EOPNOTSUPP error.
While here make sure to abort, not just try to return a partial result.
The former allows the regular lookup to restart from scratch, while the latter
makes it stuck with an unusable vnode.
Reported by: kevans
Single quotes interfere with the workaround put in with r335753 and
aren't necessary in this case. I believe that all the underling issues
with r335753 have been corrected, but need to do more extensive
followup before reverting it as a bad idea.
PR: 240411
MFC After: 2 days (to give it time to get into 12.2)
- When flushing extra lines after all input has been processed, make
sure that local state is reinitialized correctly.
- When -f is specified, make sure to end output with a full newline.
- Fix some style issues and update comments.
- Add some regression tests.
PR: 249308
Submitted by: Yang Zhong <yzhong@freebsdfoundation.org>
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26536
- whitespace at end of input line
- skipping paragraph macro: Pp at the end of Sh
- new sentence, new line
- consider using OS macro: Fx
- AUTHORS section without An macro
- skipping paragraph macro: Pp before Ss
Create the RISC-V NOTES and LINT files. As of r366559, LINT configs are
no longer generated but checked in to the tree.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D26502
Allow the DSCP codepoint also to be configurable
for the traffic in the direction from the initiator
to the target, such that writes and any requests
are also treated in the appropriate QoS class.
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26714
- no blank before trailing delimiter
- whitespace at end of input line
- sections out of conventional order
- normalizing date format
- AUTHORS section without An macro
Consider the currently in-use TCP options when
calculating the amount of new data to be injected during
SACK loss recovery. That addresses the effect that very small
(new) segments could be injected on partial ACKs while
still performing a SACK loss recovery.
Reported by: Liang Tian
Reviewed by: tuexen, chengc_netapp.com
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26446
This adds a new IP_PROTO / IPV6_PROTO setsockopt (getsockopt)
option IP(V6)_VLAN_PCP, which can be set to -1 (interface
default), or explicitly to any priority between 0 and 7.
Note that for untagged traffic, explicitly adding a
priority will insert a special 801.1Q vlan header with
vlan ID = 0 to carry the priority setting
Reviewed by: gallatin, rrs
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26409
`cpuset -g -x N` along with requested information always prints
message `cpuset: getdomain: Invalid argument'. The EINVAL is returned
from kern_cpuset_getdomain(), since it doesn't expect CPU_LEVEL_WHICH
and CPU_WHICH_IRQ parameters.
To fix the error, do not call cpuset_getdomain() when `-x' is specified.
MFC after: 1 week
Extend netstat to display TCP stack and detailed congestion state
Adding the "-c" option used to show detailed per-connection
congestion control state for TCP sessions.
This is one summary patch, which adds the relevant variables into
xtcpcb. As previous "spare" space is used, these changes are ABI
compatible.
Reviewed by: tuexen
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26518
Adding the "-c" option used to show detailed per-connection
congestion control state for TCP sessions.
This is one summary patch, which adds the relevant variables into
xtcpcb. As previous "spare" space is used, these changes are ABI
compatible.
Reviewed by: tuexen
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26518
makeLINT.mk isn't needed or used anymore, remove it and all the files
it uses.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
Now that config(8) has supported include for 19 years, transition to
including the NOTES files. include support didn't exist at the time,
nor did the envvar stuff recently added. Now that it does, eliminate
the building of LINT files by just including everything you need.
Note: This may cause conflicts with updating in some cases.
find sys -name LINT\* -rm
is suggested across this commit to remove the generated LINT
files.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
Without this patch, when vn_generic_copy_file_range() is
doing a large copy, it will remain in the function for a
considerable amount of time, delaying handling of any
outstanding signals until the copy completes.
This patch adds checks for signals that need to be
processed after each successful data copy cycle.
When sig_intr() returns non-zero, vn_generic_copy_file_range()
will return.
The check "if (len < savlen)" ensures that some data
has been copied, so that progress will be made.
Note that, since copy_file_range(2) is allowed to
return fewer bytes copied than requested, it
will never return EINTR/ERESTART when sig_intr()
returns non-zero.
Reviewed by: kib, asomers
Differential Revision: https://reviews.freebsd.org/D26620
We're going to check these files in shortly since we don't need to
generate them anymore. Generated files cause issues for different work
flows anyway.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
LINT config files are about to be checked in directly. Eliminate
building them by hand here from NOTES files.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
Too many version of UEFI firmware (so far only confirmed on amd64)
don't really support efibootmgr selection of boot. That's the most
reliable, when it works, since there's no guesswork. However, many do
not save, unmolested, the variables that efibootmgr sets, so as a
fallback we also install loader.efi as bootXXX.efi (where XXX is
either aa64 or x64) if it doesn't already exist in /efi/boot on the
ESP. The standard only defines this for removable devices, but it's
almost ubiquitously used as a fallback. Many BIOSes implement a drive
selection feature that takes over the efibootmgr protocol, rendinering
it useless (either generally, or for those vendors not on the short
list). bootxxx.efi works around this. However, we don't install it
unconditionally there, as that breaks some popular multi-boot setups.
MFC After: 1 week
Differential Revision: https://reviews.freebsd.org/D26428
The precedence of the '&' operator is less than of '+'. Added braces
do change the order of evaluation into the natural one, in my opinion.
On the other hand, the value of the expression should not change since
all elements should have page-aligned values.
This fixes a gcc warning reported.
Reported by: adrian
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Normally when a buffer with B_BARRIER is written, the flag is cleared
by g_vfs_strategy() when creating bio. But in some cases FFS buffer
might not reach g_vfs_strategy(), for instance when copy-on-write
reports an error like ENOSPC. In this case buffer is returned to
dirty queue and might be written later by other means. Among then
bdwrite() reasonably asserts that B_BARRIER is not set.
In fact, the only current use of B_BARRIER is for lazy inode block
initialization, where write of the new inode block is fenced against
cylinder group write to mark inode as used. The situation could be
seen that we break dependency by updating cg without written out
inode. Practically since CoW was not able to find space for a copy of
inode block, for the same reason cg group block write should fail.
Reported by: pho
Discussed with: chs, imp, mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26511
Check td_flags for relevant AST requests lock-less. This opens the
race slightly wider where sig_intr() returns false negative, but might
be it is worth it.
Requested by: mjg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Specifically, if lookup() returned any error and the topping directory
was not latched, which means that (non-existent) path did not returned
to the topping location, give ENOTCAPABLE a priority over the lookup()
error.
PR: 249960
Reviewed by: emaste, ngie
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26695