269740 Commits

Author SHA1 Message Date
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP, read the note below if your kernel config does not include GENERIC!!

This patch does a bit of cleanup on the TCP congestion control modules. There were some rather
interesting surprises one could get: for example, if you used a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas), you could in theory hit
a memory allocation failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass preallocated memory in to the init function. The CC init is expected *not* to fail
in this case, but if a module does break the "no fail with memory given" contract,
we fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities so they can be shared among other
CC modules, instead of those modules reaching into newreno and executing
what they think is a "common and understood" function. Let's put these functions in
cc.c; that way we have a common place that is easy to find for future developers or
bug fixers. This also allows newreno to evolve and grow support for its own features (e.g. ABE
and HYSTART++) without having to jump through hoops for other CC modules. Instead,
both newreno and the other modules just call into the common functions if they desire
that behavior, or roll their own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. You also need to define a default: GENERIC adds 'options CC_DEFAULT=\"newreno\"',
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
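The allocate-first switch described in b8d60729de can be sketched in userland C as follows. This is only an illustration of the ordering, not the kernel code: every name here (`cc_algo_sketch`, `conn_sketch`, `cc_switch_sketch`, the three sample modules) is hypothetical, and plain `malloc()` stands in for the kernel's `malloc(..., M_WAITOK)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a congestion control module. */
struct cc_algo_sketch {
    const char *name;
    size_t (*data_sz)(void);         /* size of the module's private state */
    int (*init)(void *preallocated); /* expected not to fail when given memory */
};

/* Hypothetical per-connection state. */
struct conn_sketch {
    const struct cc_algo_sketch *algo;
    void *cc_data;
};

/*
 * Switch c to newalgo. Memory is obtained before the old module is touched,
 * so a failure leaves the connection on the module it already had.
 * Returns 0 on success, -1 if the previous module was kept.
 */
static int
cc_switch_sketch(struct conn_sketch *c, const struct cc_algo_sketch *newalgo)
{
    size_t sz;
    void *mem;

    assert(c != NULL && newalgo != NULL);
    sz = newalgo->data_sz();
    mem = malloc(sz != 0 ? sz : 1);  /* userland malloc can fail; M_WAITOK sleeps instead */
    if (mem == NULL)
        return (-1);
    if (newalgo->init(mem) != 0) {   /* module broke the "no fail" contract */
        free(mem);
        return (-1);                 /* keep the module that was in place */
    }
    free(c->cc_data);                /* only now release the old state */
    c->cc_data = mem;
    c->algo = newalgo;
    return (0);
}

/* Sample modules for experimenting with the sketch. */
static size_t sketch_sz(void) { return (16); }
static int sketch_init_ok(void *p) { (void)p; return (0); }
static int sketch_init_broken(void *p) { (void)p; return (1); }

static const struct cc_algo_sketch cubic_sketch = { "cubic", sketch_sz, sketch_init_ok };
static const struct cc_algo_sketch vegas_sketch = { "vegas", sketch_sz, sketch_init_ok };
static const struct cc_algo_sketch broken_sketch = { "broken", sketch_sz, sketch_init_broken };
```

Switching a connection from `cubic_sketch` to `vegas_sketch` succeeds, while a later switch to `broken_sketch` returns -1 and leaves the connection on `vegas_sketch`, never on a third module.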
Peter Holm
7e3c4b09a0 stress2: Added two test scenarios for future gunion(8) 2021-11-11 10:11:49 +01:00
Felix Johnson
c5e0492ae8 module(9): Document that evhand can be NULL
PR:		192250
MFC after:	3 days
Reported by:	ngie
2021-11-11 01:32:54 -05:00
Navdeep Parhar
448bcd01dc cxgbe(4): internal knob for flexible control over FEC selection.
Recent firmwares have support for autonomous FEC selection and a "force"
knob to let the driver control this behavior (or not) in a fine grained
manner. This change adds a driver knob so that all the different ways of
configuring the link FEC can be exercised. Note that this controls the
internal driver/firmware interaction for link configuration and is not
meant for general use.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-11-10 15:16:53 -08:00
Navdeep Parhar
f6a2e1100f cxgbe(4): separate sysctls for user-requested and in-use FEC.
Recent firmwares have more leeway in FEC selection and there is a need
to track the FECs requested by the driver separately from the FEC in use
on the link. The existing dev.<port>.<inst>.fec sysctl can read both but
its behavior depends on the link state and it is sometimes hard to find
out what was requested when the link is up.

Split the fec sysctl into two (requested_fec and link_fec) to get access
to both pieces of information regardless of the link state.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-11-10 15:04:37 -08:00
Mark Johnston
ac2b544417 mbuf: Fix an offset calculation in m_apply_extpg_one()
We were not including the requested starting offset in the page offset.

Reviewed by:	jhb
Fixes:		3c7a01d773 ("Extend m_apply() to support unmapped mbufs.")
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32922
2021-11-10 16:57:12 -05:00
Felix Johnson
5504d83942 sockstat(1): Update Synopsis section
Update the sockstat(1) manpage so the Synopsis section includes -q (silent
mode) and the -j argument name is consistent.

PR:		256795
MFC after:	3 days
Reported by:	Nick Reilly <nreilly@blackberry.com>
2021-11-10 15:22:06 -05:00
Konstantin Belousov
439c3d9563 Regen 2021-11-10 21:18:54 +02:00
Konstantin Belousov
43e6f07b06 sched.h: add CPU_EQUAL() for better compatibility with Linux
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
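A minimal sketch of how `CPU_EQUAL()` from 43e6f07b06 is typically used, written against the Linux-style interface the commit aims to be compatible with. The helper name `same_single_cpu_set` is hypothetical. The feature-test defines are assumptions about the build environment: glibc needs `_GNU_SOURCE`, and the `_WITH_CPU_SET_T` define comes from the "Add real sched.h" commit message in this log.

```c
#define _GNU_SOURCE      /* exposes the CPU_* macros on glibc */
#define _WITH_CPU_SET_T  /* requests the cpu_set_t typedef, per 160b4b922b */
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/*
 * Build two single-CPU sets and compare them with CPU_EQUAL().
 * Returns true only when both sets contain exactly the same CPUs.
 */
static bool
same_single_cpu_set(int cpu_a, int cpu_b)
{
    cpu_set_t a, b;

    assert(cpu_a >= 0 && cpu_a < CPU_SETSIZE);
    assert(cpu_b >= 0 && cpu_b < CPU_SETSIZE);
    CPU_ZERO(&a);
    CPU_ZERO(&b);
    CPU_SET(cpu_a, &a);
    CPU_SET(cpu_b, &b);
    return (CPU_EQUAL(&a, &b) != 0);
}
```

Here `same_single_cpu_set(1, 1)` is true and `same_single_cpu_set(0, 1)` is false.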
Konstantin Belousov
f239545591 x86: provide userspace implementation of sched_getcpu() where possible
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
Konstantin Belousov
77b2c2f814 Add sched_getcpu()
for compatibility with Linux.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
Konstantin Belousov
43736b71dd Add sched_get/setaffinity(3)
for compatibility with Linux.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
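The Linux-compatible calls added in 43736b71dd and 77b2c2f814 might be combined as in the sketch below. The helper `current_cpu_in_mask` is hypothetical, and the two defines at the top are assumptions about the build environment (`_GNU_SOURCE` for glibc, `_WITH_CPU_SET_T` as mentioned in the "Add real sched.h" commit message).

```c
#define _GNU_SOURCE      /* exposes sched_getcpu() and cpu_set_t on glibc */
#define _WITH_CPU_SET_T  /* requests the cpu_set_t typedef, per 160b4b922b */
#include <assert.h>
#include <sched.h>

/*
 * Fetch the caller's affinity mask into *mask_out and report whether the
 * CPU it is currently running on is part of that mask.
 * Returns 1 if it is, 0 if it is not, -1 if either call failed.
 */
static int
current_cpu_in_mask(cpu_set_t *mask_out)
{
    int cpu;

    assert(mask_out != NULL);
    CPU_ZERO(mask_out);
    if (sched_getaffinity(0, sizeof(*mask_out), mask_out) != 0)  /* pid 0: the caller */
        return (-1);
    cpu = sched_getcpu();
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return (-1);
    return (CPU_ISSET(cpu, mask_out) ? 1 : 0);
}
```

The result is only a snapshot: the scheduler may move the thread to another CPU right after `sched_getcpu()` returns.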
Konstantin Belousov
160b4b922b Add real sched.h
It is required by IEEE Std 1003.1-2008 AKA POSIX.

Put some Linux compatibility stuff under the BSD_VISIBLE namespace, in
particular the sys/cpuset.h definitions.  Also, if the user really wants
Linux compatibility, she can request the cpu_set_t typedef with the
_WITH_CPU_SET_T define.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:53 +02:00
Konstantin Belousov
e5adb145f0 Remove arm/linux from sysent toplevel target
Sponsored by:	The FreeBSD Foundation
Fixes:	65e485014b
2021-11-10 21:18:53 +02:00
Hans Petter Selasky
ad8f078f66 ifconfig(8): Don't set network interface capabilities when there is no change.
A quick grep through the kernel code shows network drivers compute the
changed bits of network capabilities after a SIOCSIFCAP IOCTL(2) by
using the bitwise exclusive or operation. When the set capabilities
are equal to the already read capabilities, no action will be taken.

Let ifconfig(8) predict this case and skip the SIOCSIFCAP IOCTL(2)
system call.

Discussed with:	kib@ (revert change in case of issues)
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-11-10 15:50:52 +01:00
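The check described in ad8f078f66 reduces to an exclusive-or of the current and requested capability bits. The sketch below shows that arithmetic only; `setifcap_needed` and its parameters are hypothetical names, not the ifconfig(8) source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * current:   capability bits as already read from the interface
 * setbits:   bits the user asked to enable
 * clearbits: bits the user asked to disable
 * Returns true when the request would change something, i.e. when issuing
 * the capability-setting ioctl is worthwhile.
 */
static bool
setifcap_needed(uint32_t current, uint32_t setbits, uint32_t clearbits)
{
    uint32_t requested;

    assert((setbits & clearbits) == 0);  /* a bit cannot be set and cleared at once */
    requested = (current | setbits) & ~clearbits;
    return ((current ^ requested) != 0); /* XOR yields exactly the changed bits */
}
```

Enabling a bit that is already set, or clearing one that is already clear, gives an XOR of zero, so the system call can be skipped.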
Martin Matuska
81b22a9892 zfs: merge openzfs/zfs@6c8f03232 (master) into main
Notable upstream pull request merges:
  #12333: Creating gang ABDs for Raidz optional IOs
  #12668: FreeBSD: Catch up with recent VFS changes
  #12687: Skip spacemaps reading in case of pool readonly import
  #12704: Fix some FreeBSD VOPs to synchronize properly with teardown
  #12724: Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency

Obtained from:	OpenZFS
OpenZFS commit:	6c8f03232a
2021-11-10 14:22:37 +01:00
Kristof Provost
2de49deeca pf tests: Test PR259689
We didn't populate dyncnt/tblcnt, so `pfctl -sr -vv` might not have the
table element count.

PR:		259689
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32893
2021-11-10 11:27:22 +01:00
Kristof Provost
218a8a491c pf: ensure we populate dyncnt/tblcnt in struct pf_addr_wrap
PR:		259689
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32892
2021-11-10 11:27:22 +01:00
Bjoern A. Zeeb
3987e50611 USB dwc3 controller: add quirk snps,dis_rxdet_inp3_quirk
Add support for the "snps,dis_rxdet_inp3_quirk" quirk needed
at least on SolidRun's HoneyComb.

Reviewed by:	manu, mw
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D32921
2021-11-10 09:44:44 +00:00
Bjoern A. Zeeb
fad51d34f2 arm64/gicv3: improve a panic message
Print the device/unit in the panic message for which we cannot get
the MSI device ID to have a clue where to start looking.
While here use __func__ instead of hardcoding the function name.

Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D32917
2021-11-10 09:41:57 +00:00
Peter Holm
6ffad483ff stress2: Added a new zfs test scenario 2021-11-10 10:27:44 +01:00
Dimitri John Ledkov
6c8f03232a Upgrade to libabigail 2.0.0
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Closes #12722
Closes #12739
2021-11-09 17:36:42 -08:00
Tony Hutter
ae70d628ff zed: Control NVMe fault LEDs
The ZED code currently can only turn on the fault LED for
a faulted disk in a JBOD enclosure.  This extends support
for faulted NVMe disks as well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12648
Closes #12695
2021-11-09 16:50:18 -08:00
Fedor Uporov
e39fe05b69 Skip spacemaps reading in case of pool readonly import
Only the zdb utility needs to read metaslab-related data during
read-only pool import, because of spacemaps validation. Add a global
variable which allows zdb to read spacemaps in readonly
import mode.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9095
Closes #12687
2021-11-09 12:50:39 -08:00
Brian Atkinson
345196be18 Single IO issue for raidz writes with skip sector
In order to reduce contention on the vq_lock, optional skip sectors
for Raidz writes can be placed into a single IO request. This is done by
padding out the linear ABD for a parity column to contain the skip
sector and by creating gang ABD to contain the data and skip sector for
data columns.

The vdev_raidz_map_alloc() function now contains specific functions for
both reads and writes to allocate the ABDs that will be issued down to
the VDEV children.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-By: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #12333
2021-11-09 12:51:33 -07:00
Brian Behlendorf
453c63e9b7 Linux 5.16 compat: submit_bio()
The submit_bio() prototype has changed again.  The version in 5.16
still only expects a single argument but the return type has changed
to void.  Since we never used the returned value before, update the
configure check to detect both single arg versions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725
2021-11-09 11:27:46 -08:00
Brian Behlendorf
1e7d634867 Linux 5.16 compat: linux/elevator.h
Commit https://github.com/torvalds/linux/commit/2e9bc346 moved
the elevator.h header under the block/ directory as part of some
refactoring.  This turns out not to be a problem since there's
no longer anything we need from the header.  This has been the
case for some time.  This change removes the elevator.h include
and replaces it with a major.h include.

Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725
2021-11-09 11:26:35 -08:00
Edward Tomasz Napierala
65e485014b Remove unfinished ARM Linuxulator support
This was never made to work, and given that 32-bit ARM support
is quickly becoming irrelevant, there is no point in keeping it
in the tree.

Sponsored By:	EPSRC
Reviewed By:	emaste
Differential Revision:	https://reviews.freebsd.org/D32918
2021-11-09 14:40:44 +00:00
Dries Michiels
e641c29a00 UPDATING: Change update procedure to use etcupdate(8) over mergemaster(8)
This commit aligns the steps in UPDATING with the steps from the
handbook which already prefers etcupdate(8). While here also remove a
dubious comment.

PR:			252417
Reviewed by:		ceri
Approved by:		philip (mentor), imp
Differential Revision:	https://reviews.freebsd.org/D28062
2021-11-10 09:18:42 +01:00
Kyle Evans
4c14980baa grep: fix/remove references to -P
-P in gnugrepland means PCRE, which we do not support.  We may eventually
support it if onigmo ends up getting imported as a more performant regex
implementation, and we can re-add it properly in these places (and more)
when that time comes.

The optstr change is a functional nop; the case was not explicitly handled,
thus ending in usage() anyways.

Reported by:	Vladimir Misev (via twitter)
2021-11-10 00:42:42 -06:00
Navdeep Parhar
d99b1d83b9 cxgbe(4): sysctl to track the last L1_CFG32 requested by the driver.
dev.<port>.<inst>.rcaps

 # sysctl dev.cc | grep rcaps
 dev.cc.1.rcaps: 581107776
 dev.cc.0.rcaps: 582156414

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-11-09 15:41:20 -08:00
Rick Macklem
b2bf1a5787 VOP_ALLOCATE: Update man page for Commit f0c9847a6c
Commit f0c9847a6c added the ioflag and cred arguments to
VOP_ALLOCATE() for NFSv4.2 server support. This patch updates
the man page for these arguments.

Reviewed by:	khng, gbe
Differential Revision:	https://reviews.freebsd.org/D32898
2021-11-09 15:13:15 -08:00
Bjoern A. Zeeb
8e902c1d21 mii: update URL for OUIs
Update the URL for OUIs, as the old one now returns a 404, not even a 301 anymore.
2021-11-09 21:48:02 +00:00
Hans Petter Selasky
808108da32 service(8): Bump date after commit 66d795ec19 .
Differential revision:  https://reviews.freebsd.org/D32582
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-11-09 22:33:04 +01:00
Hans Petter Selasky
66d795ec19 service(8): Fix typo in man page.
Differential revision:  https://reviews.freebsd.org/D32582
Submitted by:   christos@
MFC after:      1 week
Sponsored by:   NVIDIA Networking
2021-11-09 22:12:19 +01:00
Hans Petter Selasky
337c814316 kldstat(8): Fix indentation, whitespace to tabs.
No functional change intended.

Differential revision:  https://reviews.freebsd.org/D32502
Submitted by:   christos@
MFC after:      1 week
Sponsored by:   NVIDIA Networking
2021-11-09 22:12:19 +01:00
Hans Petter Selasky
4c537df51a echo(1): Replace errexit() with err(3)
Differential revision:	https://reviews.freebsd.org/D32501
Submitted by:	christos@
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-11-09 22:12:19 +01:00
Hans Petter Selasky
11f09b17fe snd_uaudio(4): Fix string index computations for iFeature.
This allows the iFeature strings to be properly read by the snd_uaudio(4) driver,
when parsing the audio feature unit descriptors.

Submitted by:	Zhichao1.Li@dell.com
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-11-09 22:11:25 +01:00
Rene Ladan
0752d078df share/misc: update portmgr membership
After five years of service, adamw steps down from portmgr.
Also please welcome tcberner to portmgr.
2021-11-09 21:25:14 +01:00
John Baldwin
442ad83e38 crypto: Don't assert on valid IV length for Chacha20-Poly1305.
The assertion checking for valid IV lengths added in 1833d6042c
was not properly updated to permit an IV length of 8 in commit
42dcd39528.

Reported by:	syzbot+f0c0559b8be1d6eb28c7@syzkaller.appspotmail.com
Reviewed by:	markj
Fixes:		42dcd39528 crypto: Support Chacha20-Poly1305 with a nonce size of 8 bytes.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32860
2021-11-09 10:52:30 -08:00
John Baldwin
e3ba94d4f3 Don't require the socket lock for sorele().
Previously, sorele() always required the socket lock and dropped the
lock if the released reference was not the last reference.  Many
callers locked the socket lock just before calling sorele() resulting
in a wasted lock/unlock when not dropping the last reference.

Move the previous implementation of sorele() into a new
sorele_locked() function and use it instead of sorele() for various
places in uipc_socket.c that called sorele() while already holding the
socket lock.

The sorele() macro now uses refcount_release_if_not_last() to try to drop
the socket reference without locking the socket.  If that shortcut
fails, it locks the socket and calls sorele_locked().

Reviewed by:	kib, markj
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D32741
2021-11-09 10:50:12 -08:00
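The fast path described in e3ba94d4f3 can be sketched with C11 atomics. This is a userland illustration under stated assumptions, not the kernel implementation: `sock_sketch`, `release_if_not_last`, `sorele_locked_sketch` and `sorele_sketch` are hypothetical names, and the socket lock is modelled by a plain flag plus a counter so that lock usage is observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sock_sketch {
    atomic_uint refs;
    bool locked;          /* stand-in for the socket lock */
    unsigned lock_count;  /* how many times the lock was taken */
    bool freed;           /* set when the last reference is dropped */
};

static void
sock_sketch_init(struct sock_sketch *so, unsigned refs)
{
    atomic_init(&so->refs, refs);
    so->locked = false;
    so->lock_count = 0;
    so->freed = false;
}

/* Drop one reference unless it is the last one; returns true if dropped. */
static bool
release_if_not_last(atomic_uint *count)
{
    unsigned old = atomic_load(count);

    assert(old > 0);
    while (old > 1) {
        /* On failure the CAS reloads old, so the loop re-checks it. */
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return (true);
    }
    return (false);
}

/* Slow path: caller holds the lock, which is released before returning. */
static void
sorele_locked_sketch(struct sock_sketch *so)
{
    assert(so->locked);
    if (atomic_fetch_sub(&so->refs, 1) == 1)
        so->freed = true;  /* last reference; real code would free the socket */
    so->locked = false;
}

static void
sorele_sketch(struct sock_sketch *so)
{
    if (release_if_not_last(&so->refs))
        return;            /* common case: no lock/unlock at all */
    so->locked = true;     /* possibly the last reference: take the lock */
    so->lock_count++;
    sorele_locked_sketch(so);
}
```

With two references held, the first `sorele_sketch()` call leaves `lock_count` at 0; only the second, final release takes the lock.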
Ed Maste
e818178e3a tests: do not build ktls_test if WITHOUT_OPENSSL
ktls_test requires libcrypto to build, and fails if it is not available
(which is the case when building WITHOUT_OPENSSL).

Reported by:	Michael Dexter, Build Option Survey
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32895
2021-11-09 13:47:20 -05:00
Mark Johnston
1f960e646b pci: Implement pci_bar_enabled() for SR-IOV VFs
In a VF's configuration space, "memory space enable" is hard-wired to 0,
so the existing implementation always returns false.  We need to read
the SR-IOV control register from the PF device to get the value of the
MSE bit.

Fix pci_bar_enabled() to read this register instead for VFs.  I don't
see any way to access the PF's config space without a backpointer in the
pci device ivars, so I added one.

This fixes a regression where bhyve(8) fails to map the MSI-X table
after commit 7fa2335347 ("bhyve: Map the MSI-X table unconditionally
for passthrough") when a VF is passed through, since with that commit we
use PCIOCBARMMAP to map the table and that ioctl always fails for VFs
without this change.  As a bonus, pciconf(8) now correctly reports the
enablement of BARs for VFs.

Reported and tested by:	Raúl Muñoz <raul.munoz@custos.es>
Reviewed by:	rstone, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32839
2021-11-09 13:13:36 -05:00
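The decision made in 1f960e646b can be sketched as below, for memory BARs only. The structure and function names are hypothetical and the registers are modelled as plain fields. The bit values follow the PCI command register layout (memory space enable is bit 1) and the SR-IOV control register layout (VF MSE is bit 3), but the constant names are local to this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_CMD_MEMEN     0x0002  /* "memory space enable" in the command register */
#define SK_SRIOV_VF_MSE  0x0008  /* "VF MSE" in the PF's SR-IOV control register */

struct pci_dev_sketch {
    uint16_t command;                 /* PCI command register */
    uint16_t sriov_ctl;               /* SR-IOV control register (meaningful on a PF) */
    const struct pci_dev_sketch *pf;  /* back-pointer to the PF; NULL unless a VF */
};

/* Report whether memory BARs of dev are enabled. */
static bool
mem_bar_enabled_sketch(const struct pci_dev_sketch *dev)
{
    assert(dev != NULL);
    if (dev->pf != NULL) {
        /* VF: its own MSE bit is hard-wired to 0, so consult the PF. */
        return ((dev->pf->sriov_ctl & SK_SRIOV_VF_MSE) != 0);
    }
    return ((dev->command & SK_CMD_MEMEN) != 0);
}
```

A VF whose own command register reads 0 is still reported as enabled as long as the PF's VF MSE bit is set, which is the case the old implementation got wrong.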
John Baldwin
57093f9366 vfs: Consistently validate AT_* flags in kern_* functions.
Some syscalls checked for invalid AT_* flags in sys_* and others in
kern_*.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D32864
2021-11-09 09:42:12 -08:00
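The kind of validation that 57093f9366 makes consistent usually takes the form below. This is a generic sketch, not the FreeBSD code: `check_at_flags_sketch` is a hypothetical helper, and the choice of `AT_SYMLINK_NOFOLLOW` and `AT_REMOVEDIR` as examples is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes the AT_* constants under strict C modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* The two example flags must be distinct bits for the test below to mean anything. */
static_assert((AT_SYMLINK_NOFOLLOW & AT_REMOVEDIR) == 0,
    "example AT_* flags overlap");

/*
 * Reject any flag bit outside the set a given call accepts.
 * Returns 0 when flags is acceptable, EINVAL otherwise.
 */
static int
check_at_flags_sketch(int flags, int allowed)
{
    if ((flags & ~allowed) != 0)
        return (EINVAL);
    return (0);
}
```

Doing this check in one layer means every caller of that layer gets the same EINVAL behaviour, whether it arrives through the syscall entry point or from elsewhere in the kernel.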
John Baldwin
3225fd22b2 kern_utimensat: Update name of last arg in prototype.
The last argument is a mask of AT_* flags, not a namei cnp flag as
'int follow' implies in other kern_* functions.

Obtained from:	CheriBSD
Sponsored by:	The University of Cambridge, Google Inc.
2021-11-09 09:41:17 -08:00
Mike Karels
a2e7dfca86 systat: clean up code assuming network classes
Similar to netstat, clean up code that uses inet_lnaof() to check for
binding to "host 0" (lowest host on network) as a "network" bind.
Such things don't happen, and current networks are seldom if ever
found in /etc/networks.

MFC after:	1 month
Reviewers:	tuexen
Differential Revision: https://reviews.freebsd.org/D32720
2021-11-09 09:35:16 -06:00
Mike Karels
64acb29b7d sockstat: change check for wildcard sockets to avoid historical classes
sockstat was checking whether a bound address was "host 0", the lowest
host on a network, using inet_lnaof().  This only works for class A/B/C.
However, it isn't useful to bind such an address unless it is really
the unspecified address INADDR_ANY.  Change the check to use that.

MFC after:	1 month
Reviewed by:	tuexen
Differential Revision: https://reviews.freebsd.org/D32715
2021-11-09 09:34:44 -06:00
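The replacement check described in 64acb29b7d amounts to a direct comparison with INADDR_ANY. The sketch below is an illustration, not the sockstat source; `is_wildcard_v4` and `text_is_wildcard_v4` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* True only for the unspecified address 0.0.0.0. */
static bool
is_wildcard_v4(const struct in_addr *addr)
{
    assert(addr != NULL);
    return (addr->s_addr == htonl(INADDR_ANY));
}

/* Convenience wrapper for dotted-quad text; false on parse failure. */
static bool
text_is_wildcard_v4(const char *text)
{
    struct in_addr a;

    if (inet_pton(AF_INET, text, &a) != 1)
        return (false);
    return (is_wildcard_v4(&a));
}
```

An address such as 10.0.0.0 is "host 0" of a historical class A network, so a host-part test would flag it, whereas the explicit comparison treats only 0.0.0.0 as a wildcard.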
Mike Karels
bd27c71c45 netstat: reduce use of historical Internet classes
When attempting to characterize bound addresses, netstat was checking
for host 0 on a (historical) net using inet_lnaof().  Such addresses
are not normally bound, as they would not work, with the exception
of the unspecified address, INADDR_ANY.  Check for that explicitly.
Similarly, don't check bound addresses for a match to a network name.

MFC after:	1 month
Reviewed by:	tuexen
Differential Revision: https://reviews.freebsd.org/D32714
2021-11-09 09:34:22 -06:00
Mike Karels
92aebdeaff mountd: deprecate exports to a network without mask
The exports file format allows export to a network using an explicit
mask or prefix length (CIDR).  It also allows a network with just
a dotted address, in which case the historical mask was used.
Deprecate this usage, and warn when it is used.  Document that this
is deprecated.

MFC after:	1 month
Reviewed by:	rmacklem, bcr, #manpages
Differential Revision: https://reviews.freebsd.org/D32713
2021-11-09 09:34:06 -06:00
Mike Karels
0bf7f99b2a res_init: remove unused inet_makeaddr with IN_LOOPBACKNET
Remove code that is ifdefed out on USELOOPBACK, which uses historical
class.  No functional change intended.

MFC after:	1 month
Differential Revision: https://reviews.freebsd.org/D32712
2021-11-09 09:33:48 -06:00