Commit Graph

269930 Commits

Author SHA1 Message Date
Konstantin Belousov
d032cda0d0 DEBUG_VFS_LOCKS: stop excluding devfs and doomed vnode from asserts
We do not require the devvp vnode to be locked for metadata I/O.  The
lock is typically not needed, since correctness of the file system using
the corresponding block device ensures that there are no incorrect or
racy manipulations.

But right now DEBUG_VFS_LOCKS option excludes both character device
vnodes and completely destroyed (VBAD) vnodes from asserts.  This is not
too bad since WITNESS still ensures that we do not leak locks.  On the
other hand, asserts do not mean what they should, to the reader, and
reliance on them being enforced might result in wrong code.

Note that ASSERT_VOP_LOCKED() still silently accepts NULLVP; I think it
is worth fixing as well, in the next round.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:02:42 +02:00
Konstantin Belousov
47b248ac65 Make locking assertions for VOP_FSYNC() and VOP_FDATASYNC() more correct
For devfs vnodes, it is fine to not lock vnodes for VOP_FSYNC().
Otherwise vnode must be locked exclusively, except for MNT_SHARED_WRITES()
where the shared lock is enough.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:02:13 +02:00
Konstantin Belousov
d1d675cb30 freevnode(): lock the freeing vnode around destroy_vpollinfo()
to satisfy locking requirements of knlist manipulations.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:01:02 +02:00
Konstantin Belousov
eede22d66d ffs_snapshot: do not assert that um_devvp is locked
It is not, and the lock is not needed there

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:54 +02:00
Konstantin Belousov
25809a018d mntfs: lock mntfs pseudo devfs vnode properly
Require devvp locked for mntfs_freevp(), to have it locked around
vgone().  Make that true for ffs, which is the only consumer of
the interface.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:41 +02:00
Konstantin Belousov
76b05e3e39 ffs: Remove assertions about locked um_devvp in several places
Namely, ffs_blkfree_cg(), and ffs_flushfiles().

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:33 +02:00
Konstantin Belousov
a7b4a54d2c getblk(): do not require devvp vnodes to be locked
Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:24 +02:00
Konstantin Belousov
8db7d16526 geom_vfs: lock devvp in g_vfs_close()
It is needed for g_vfs_close() invalidating the buffers.  We rely on the
vnode lock for correctness.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:13 +02:00
Michael Tuexen
df07bfda67 tcp: Fix a locking issue
INP_WLOCK_RECHECK_CLEANUP() and INP_WLOCK_RECHECK() might return
from the function, so any locks held must be released.

Reported by:		syzbot+b1a888df08efaa7b4bf1@syzkaller.appspotmail.com
Reviewed by:		markj
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D32975
2021-11-12 22:13:50 +01:00
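
Illustration for df07bfda67: a minimal userspace sketch of why a macro
that may return from the enclosing function needs a cleanup hook.  The
names demo_lock, DEMO_RECHECK_CLEANUP and demo_op are hypothetical and
are not the kernel's inpcb macros; the sketch only assumes the behaviour
stated in the commit message (the macro might return early).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lock; not the kernel's inpcb lock. */
struct demo_lock {
	int held;
};

static void
demo_lock_acquire(struct demo_lock *l)
{
	l->held++;
}

static void
demo_lock_release(struct demo_lock *l)
{
	l->held--;
}

/*
 * Hypothetical macro that may return from the enclosing function.
 * The cleanup statement runs first, so the caller can release any
 * additional locks it still holds.
 */
#define DEMO_RECHECK_CLEANUP(stale, cleanup) do {	\
	if (stale) {					\
		cleanup;				\
		return (-1);				\
	}						\
} while (0)

/* Returns 0 on success, -1 if the recheck bailed out early. */
static int
demo_op(struct demo_lock *extra, bool stale)
{
	demo_lock_acquire(extra);
	DEMO_RECHECK_CLEANUP(stale, demo_lock_release(extra));
	/* ... work that needs the extra lock would go here ... */
	demo_lock_release(extra);
	return (0);
}
```

On both paths through demo_op() the held count returns to its starting
value; without the cleanup argument the early-return path would leave
the lock held.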
Gleb Smirnoff
6913bf4c3d tests/divert: fix after 2ce85919bb (IP source address validation)
Just make the test packet more legitimate.

Reviewed by:	melifaro
2021-11-12 11:20:06 -08:00
Mark Johnston
034a924009 tcp: Ensure that vnets have an initialized V_default_cc_ptr
This causes new vnets to inherit the cc algorithm from vnet0. This is a
temporary patch to fix vnet jail creation.

With encouragement from: glebius
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Differential Revision: https://reviews.freebsd.org/D32970
2021-11-12 12:18:12 -07:00
Warner Losh
7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overridden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Reviewed by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
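
Illustration for 7e3c9ec906: the usual preprocessor idiom for a
compile-time default that a config file may override.  This is a sketch
of the idiom only; the actual code in cc.c may differ, and the helper
demo_cc_is_default() is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/*
 * If the build did not define CC_DEFAULT (for example through a kernel
 * config "options" line), fall back to "newreno".
 */
#ifndef CC_DEFAULT
#define CC_DEFAULT "newreno"
#endif

/* Hypothetical helper: is the given name the compiled-in default? */
static bool
demo_cc_is_default(const char *name)
{
	return (strcmp(name, CC_DEFAULT) == 0);
}
```

Compiled without any -DCC_DEFAULT override, demo_cc_is_default("newreno")
is true and demo_cc_is_default("cubic") is false.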
Andrew Turner
ae062ff269 Move KHELP_DECLARE_MOD_UMA later in the boot
Both KHELP_DECLARE_MOD_UMA and the kernel linker SYSINIT to find
in-kernel modules run at SI_SUB_KLD, SI_ORDER_ANY. As the former
depends on the latter running first move it later in the boot,
to the new SI_SUB_KHELP. This ensures KHELP_DECLARE_MOD_UMA
module SYSINIT functions will be after the kernel linker.

Previously we may have received a panic similar to the following if
the order was incorrect:

panic: module_register_init: module named ertt not found

Reported by:	bob prohaska <fbsd AT www.zefox.net>
Discussed with:	imp, jhb
Sponsored by:	The FreeBSD Foundation
2021-11-12 18:56:58 +00:00
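
Illustration for ae062ff269: a userspace sketch of ordering init
entries by (subsystem, order).  When two entries share both keys their
relative order is not defined, which is the situation the commit
describes; giving one a later subsystem makes the order deterministic.
The numeric values and all demo_* names are hypothetical and are not
the kernel's SI_SUB_* constants.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values; only their relative order matters here. */
#define DEMO_SUB_KLD	100u
#define DEMO_SUB_KHELP	101u
#define DEMO_ORDER_ANY	0xffffu

struct demo_sysinit {
	const char *name;
	unsigned sub;	/* subsystem: primary sort key */
	unsigned order;	/* order within subsystem: secondary key */
};

static int
demo_cmp(const void *a, const void *b)
{
	const struct demo_sysinit *x = a, *y = b;

	if (x->sub != y->sub)
		return (x->sub < y->sub ? -1 : 1);
	if (x->order != y->order)
		return (x->order < y->order ? -1 : 1);
	return (0);	/* tie: relative order is unspecified */
}

static int
demo_index_of(const struct demo_sysinit *v, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(v[i].name, name) == 0)
			return ((int)i);
	return (-1);
}

/* Returns 1 if the linker entry sorts before the khelp entry. */
static int
demo_linker_runs_first(unsigned khelp_sub)
{
	struct demo_sysinit v[] = {
		{ "khelp_module", khelp_sub, DEMO_ORDER_ANY },
		{ "kernel_linker", DEMO_SUB_KLD, DEMO_ORDER_ANY },
	};

	qsort(v, 2, sizeof(v[0]), demo_cmp);
	return (demo_index_of(v, 2, "kernel_linker") <
	    demo_index_of(v, 2, "khelp_module"));
}
```

With khelp_sub set to DEMO_SUB_KHELP the linker entry always sorts
first.  With khelp_sub set to DEMO_SUB_KLD the two entries tie and the
result depends on the sort implementation.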
Gleb Smirnoff
1817be481b Add net.inet6.ip6.source_address_validation
Drop packets arriving from the network that have our source IPv6
address.  If maliciously crafted they can create evil effects
like an RST exchange between two of our listening TCP ports.
Such packets just can't be legitimate.  Enable the tunable
by default.  Long time due for a modern Internet host.

Reviewed by:		melifaro, donner, kp
Differential revision:	https://reviews.freebsd.org/D32915
2021-11-12 09:01:40 -08:00
Gleb Smirnoff
2ce85919bb Add net.inet.ip.source_address_validation
Drop packets arriving from the network that have our source IP
address.  If maliciously crafted they can create evil effects
like an RST exchange between two of our listening TCP ports.
Such packets just can't be legitimate.  Enable the tunable
by default.  Long time due for a modern Internet host.

Reviewed by:		donner, melifaro
Differential revision:	https://reviews.freebsd.org/D32914
2021-11-12 09:00:33 -08:00
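
Illustration for 2ce85919bb and 1817be481b: a minimal IPv4 sketch of
source address validation, assuming a fixed table of local addresses.
The table, the function names and the from_network flag are
hypothetical; the kernel uses its own address lookup (see
in_localip_fib() in 9c89392f12) rather than a linear scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of addresses configured on this host. */
static const uint32_t demo_local_addrs[] = {
	0xc0000201u,	/* 192.0.2.1 */
	0xc6336401u,	/* 198.51.100.1 */
};

static bool
demo_is_local(uint32_t addr)
{
	size_t n = sizeof(demo_local_addrs) / sizeof(demo_local_addrs[0]);

	for (size_t i = 0; i < n; i++)
		if (demo_local_addrs[i] == addr)
			return (true);
	return (false);
}

/*
 * Returns true if the packet should be dropped: validation is enabled,
 * the packet arrived from the network, and it claims one of our own
 * addresses as its source.
 */
static bool
demo_should_drop(uint32_t src, bool from_network, bool validation_enabled)
{
	return (validation_enabled && from_network && demo_is_local(src));
}
```

A packet from 192.0.2.1 arriving from the network is dropped when
validation is enabled; the same source generated locally, or any
non-local source, is accepted.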
Gleb Smirnoff
9c89392f12 Add in_localip_fib(), in6_localip_fib().
Check if given address/FIB exists locally.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D32913
2021-11-12 08:59:42 -08:00
Gleb Smirnoff
81674f121e ip_input: packet filters shall not modify m_pkthdr.rcvif
Quick review confirms that they do not, also IPv6 doesn't expect
such a change in mbuf.  In IPv4 this appeared in 0aade26e6d,
which doesn't seem to have a valid explanation why.

Reviewed by:		donner, kp, melifaro
Differential revision:	https://reviews.freebsd.org/D32913
2021-11-12 08:58:27 -08:00
Gleb Smirnoff
94df3271d6 Rename net.inet.ip.check_interface to rfc1122_strong_es and document it.
This very questionable feature was enabled in FreeBSD for a very short
time.  It was disabled very soon upon merging to RELENG_4 - 23d7f14119.
And in HEAD was also disabled pretty soon - 4bc37f9836.

The tunable has a very vague name. Check interface for what? Given that
it was never documented and almost never enabled, I think it is fine
to rename it together with documenting it.

Also, count packets dropped by this tunable as ips_badaddr, otherwise
they fall down to ips_cantforward counter, which is misleading, as
packet was not supposed to be forwarded, it was destined locally.

Reviewed by:		donner, kp
Differential revision:	https://reviews.freebsd.org/D32912
2021-11-12 08:57:06 -08:00
Warner Losh
eb6b6622fe Update MINIMAL to have CC options
The MINIMAL configs were overlooked. They are compiled as part of
universe, so this broke universe builds. Add the same defaults as for
GENERIC.

Sponsored by:		Netflix
2021-11-12 09:33:18 -07:00
Mateusz Guzik
0359e7a5e4 net: sprinkle __predict_false in ip_input on error conditions
While here rearrange the RSVP check to inspect proto first and avoid
evaluating V_rsvp in the common case to begin with (most notably avoid
the expensive read).

Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32929
2021-11-12 15:40:28 +00:00
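
Illustration for 0359e7a5e4: a sketch of checking the cheap condition
first so the expensive read is skipped in the common case.  It assumes a
GCC or Clang compiler for __builtin_expect; demo_read_rsvp() and its
counter are hypothetical stand-ins for the costly V_rsvp read, and
DEMO_IPPROTO_RSVP mirrors the RSVP protocol number 46.

```c
#include <stdbool.h>

#ifndef __predict_false
#define __predict_false(exp)	__builtin_expect((exp), 0)
#endif

#define DEMO_IPPROTO_RSVP	46

static int demo_rsvp_reads;		/* counts expensive reads */
static int demo_rsvp_enabled = 1;

/* Hypothetical stand-in for an expensive per-vnet variable read. */
static int
demo_read_rsvp(void)
{
	demo_rsvp_reads++;
	return (demo_rsvp_enabled);
}

/*
 * The protocol comparison is evaluated first; because && short-circuits,
 * demo_read_rsvp() only runs for the rare RSVP packet.
 */
static bool
demo_is_rsvp_packet(int proto)
{
	return (__predict_false(proto == DEMO_IPPROTO_RSVP) &&
	    demo_read_rsvp() != 0);
}
```

Calling demo_is_rsvp_packet(6) for a TCP packet leaves demo_rsvp_reads
untouched; only a packet with protocol 46 triggers the read.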
Sergey A. Osokin
b39a93b18e ktls.4: fix openssl-devel port name
PR:	259630
2021-11-12 09:31:48 -05:00
Cy Schubert
c9516b83c1 wpa: Fix WITHOUT_WPA_SUPPLICANT_EAPOL build
Reported by:	FreeBSD Build Option Survey
		https://callfortesting.org/results/bos-2021-11-04/
Fixes:		c1d255d3ff
MFC after:	1 week
2021-11-11 19:03:05 -08:00
Cy Schubert
ba5de3c2b3 wpa: Fix WITHOUT_OPENSSL build
PR:		259517
Reported by:	emaste, FreeBSD Build Option Survey
		https://callfortesting.org/results/bos-2021-11-04/
Fixes:		c1d255d3ff
MFC after:	1 week
2021-11-11 19:03:05 -08:00
Cy Schubert
96e2ac9c48 Revert "wpa: Fix WITHOUT_CRYPT build"
This reverts commit a30e8044aa.
WITHOUT_OPENSSL build is a subset of WITHOUT_CRYPT build. It was
incorrect to label this patch as fixing WITHOUT_CRYPT when in fact
it fixes WITHOUT_OPENSSL. The build failure will be addressed in a
fix for WITHOUT_OPENSSL build.

MFC after:	1 week
2021-11-11 19:03:05 -08:00
Cy Schubert
3332f1b444 wpa: Remove duplicate options definitions
Global options are defined in usr.sbin/wpa/Makefile.inc. Those in
usr.sbin/wpa/src/crypto/Makefile are duplicates of those found above.
Remove them.

MFC after:	1 week
2021-11-11 19:03:05 -08:00
Rick Macklem
44744f7538 nfscl: Add a LayoutError RPC for NFSv4.2 pNFS mounts
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing.  For the Linux
client, it then sends a LayoutError RPC to the MDS server to
tell it about the error.  This patch adds the same to the
FreeBSD NFSv4.2 pNFS client, to maintain Linux compatible
behaviour, particularly for non-FreeBSD pNFS servers.

MFC after:	2 weeks
2021-11-11 15:43:58 -08:00
John Baldwin
522a2aa761 Drop "All rights reserved" from a Netflix copyright.
Reviewed by:	imp
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32778
2021-11-11 14:41:16 -08:00
Kirk McKusick
e38717c128 Fix regression to verbose behavior introduced in 68bff4a07e.
Reported by:    Brad Davis (brd)
Reviewed by:    Kristof Provost (kp)
MFC after:      1 week
Differential Revision:  https://reviews.freebsd.org/D32736
Sponsored by:   Netflix
2021-11-11 12:11:25 -08:00
Mark Johnston
811d05449b vm_page_alloc.9: Document VM_ALLOC_NORECLAIM
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-11-11 14:52:00 -05:00
Mark Johnston
e4bdb6857a vm_page: Handle VM_ALLOC_NORECLAIM in the contiguous page allocator
We added _NORECLAIM to request that kmem_alloc_contig_pages() not spend
time scanning physical memory for candidates to reclaim.  In some
situations the scanning can induce large amounts of undesirable latency,
and it's less important that the request be satisfied than it is that we
not spend many milliseconds scanning.

The problem extends to vm_reserv_reclaim_contig(), which unlike
vm_reserv_reclaim() may have to scan the entire list of partially
populated reservations.  Use VM_ALLOC_NORECLAIM to request that this
scan not be executed.[1]

As a side effect, this fixes a regression in 02fb0585e7 ("vm_page:
Drop handling of VM_ALLOC_NOOBJ in vm_page_alloc_contig_domain()")
where VM_ALLOC_CONTIG was not included in VPAC_FLAGS or VPANC_FLAGS even
though it is not masked by kmem_alloc_contig_pages().[2]

Reported by:	gallatin [1], glebius [2]
Reviewed by:	alc, glebius, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32899
2021-11-11 14:26:41 -05:00
Mateusz Piotrowski
4042b356a0 bsdinstall: Fix mirror selection
This is a follow-up to 2697622687,
which fixed 2 out of 3 broken uses of the mirrorselect script.

Reviewed by:	emaste
Approved by:	emaste (src)
MFC after:	7 days
Differential Revision:	https://reviews.freebsd.org/D32927
2021-11-11 16:18:36 +01:00
Randall Stewart
d695386338 Add in the commit revision to the UPDATING file for the CC changes
The UPDATING file had an xxx for the CC changes commit; now that it's
committed, fix that.
2021-11-11 06:39:20 -05:00
Randall Stewart
26cbd0028c tcp: Rack may still calculate long RTT on persists probes.
When a persists probe is lost, we will end up calculating a long
RTT based on the initial probe and when the response comes from the
second probe (or third etc). This means we have a minimum
confidence level of 3 on an incorrect probe. This commit will change it
so that we have one of two options
a) Just not count RTT of probes where we had a loss
<or>
b) Count them still but degrade the confidence to 0.

I have set in this the default being to just not measure them, but I am open
to having the default be otherwise.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32897
2021-11-11 06:35:51 -05:00
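
Illustration for 26cbd0028c: a sketch of the two options (a) and (b)
listed in the message.  The enum, struct and function names are
hypothetical and do not correspond to identifiers in the Rack stack;
only the two policies themselves come from the commit message.

```c
#include <stdbool.h>

enum demo_probe_policy {
	DEMO_SKIP_RTT,		/* (a) do not count the RTT at all */
	DEMO_ZERO_CONFIDENCE	/* (b) count it, but with confidence 0 */
};

struct demo_rtt_sample {
	bool counted;		/* does the sample feed the RTT estimate? */
	int confidence;		/* confidence attached to the sample */
};

/*
 * Decide how to treat an RTT measured from a persist probe.  If no
 * probe was lost the sample is used with its normal confidence.
 */
static struct demo_rtt_sample
demo_probe_rtt(bool probe_was_lost, enum demo_probe_policy policy,
    int normal_confidence)
{
	struct demo_rtt_sample s = { true, normal_confidence };

	if (probe_was_lost) {
		s.confidence = 0;
		if (policy == DEMO_SKIP_RTT)
			s.counted = false;
	}
	return (s);
}
```

With DEMO_SKIP_RTT, which matches the default described in the message,
a sample taken after a lost probe is not counted.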
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Let's put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll their own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default; GENERIC adds 'options CC_DEFAULT=\"newreno\"',
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
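
Illustration for b8d60729de: a userspace sketch of the "preallocate,
then init" pattern described above.  The new algorithm reports its data
size, memory is allocated before the switch, and if init still fails
the connection stays on the algorithm that was in place.  All demo_*
names are hypothetical; in the kernel the allocation uses M_WAITOK and
cannot fail, whereas userspace malloc() can, so the sketch checks it.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_cc_algo {
	const char *name;
	size_t (*data_sz)(void);	/* bytes of per-connection state */
	int (*init)(void *mem);		/* 0 on success */
};

struct demo_conn {
	const struct demo_cc_algo *algo;
	void *data;
};

static size_t demo_sz(void) { return (16); }
static int demo_init_ok(void *mem) { (void)mem; return (0); }
static int demo_init_fail(void *mem) { (void)mem; return (1); }

static const struct demo_cc_algo demo_newreno =
    { "newreno", demo_sz, demo_init_ok };
static const struct demo_cc_algo demo_cubic =
    { "cubic", demo_sz, demo_init_ok };
/* A module that breaks the "no fail with memory given" contract. */
static const struct demo_cc_algo demo_broken =
    { "broken", demo_sz, demo_init_fail };

/* Returns 0 after switching, -1 if the old algorithm was kept. */
static int
demo_cc_switch(struct demo_conn *c, const struct demo_cc_algo *next)
{
	void *mem = malloc(next->data_sz());

	if (mem == NULL)
		return (-1);
	if (next->init(mem) != 0) {
		free(mem);	/* keep c->algo and c->data untouched */
		return (-1);
	}
	free(c->data);
	c->algo = next;
	c->data = mem;
	return (0);
}
```

Switching a connection from demo_newreno to demo_cubic succeeds; a later
switch to demo_broken fails and the connection remains on demo_cubic
rather than silently landing on some other algorithm.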
Peter Holm
7e3c4b09a0 stress2: Added two test scenarios for future gunion(8)
2021-11-11 10:11:49 +01:00
Felix Johnson
c5e0492ae8 module(9): Document that evhand can be NULL
PR:		192250
MFC after:	3 days
Reported by:	ngie
2021-11-11 01:32:54 -05:00
Navdeep Parhar
448bcd01dc cxgbe(4): internal knob for flexible control over FEC selection.
Recent firmwares have support for autonomous FEC selection and a "force"
knob to let the driver control this behavior (or not) in a fine grained
manner. This change adds a driver knob so that all the different ways of
configuring the link FEC can be exercised. Note that this controls the
internal driver/firmware interaction for link configuration and is not
meant for general use.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-11-10 15:16:53 -08:00
Navdeep Parhar
f6a2e1100f cxgbe(4): separate sysctls for user-requested and in-use FEC.
Recent firmwares have more leeway in FEC selection and there is a need
to track the FECs requested by the driver separately from the FEC in use
on the link. The existing dev.<port>.<inst>.fec sysctl can read both but
its behavior depends on the link state and it is sometimes hard to find
out what was requested when the link is up.

Split the fec sysctl into two (requested_fec and link_fec) to get access
to both pieces of information regardless of the link state.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-11-10 15:04:37 -08:00
Mark Johnston
ac2b544417 mbuf: Fix an offset calculation in m_apply_extpg_one()
We were not including the requested starting offset in the page offset.

Reviewed by:	jhb
Fixes:		3c7a01d773 ("Extend m_apply() to support unmapped mbufs.")
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32922
2021-11-10 16:57:12 -05:00
Felix Johnson
5504d83942 sockstat(1): Update Synopsis section
Update sockstat(1) manpage so the Synopsis section includes q (silent
mode) and the -j argument name is consistent.

PR:		256795
MFC after:	3 days
Reported by:	Nick Reilly <nreilly@blackberry.com>
2021-11-10 15:22:06 -05:00
Konstantin Belousov
439c3d9563 Regen
2021-11-10 21:18:54 +02:00
Konstantin Belousov
43e6f07b06 sched.h: add CPU_EQUAL() for better compatibility with Linux
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
Konstantin Belousov
f239545591 x86: provide userspace implementation of sched_getcpu() where possible
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
Konstantin Belousov
77b2c2f814 Add sched_getcpu()
for compatibility with Linux.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
Konstantin Belousov
43736b71dd Add sched_get/setaffinity(3)
for compatibility with Linux.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:54 +02:00
Konstantin Belousov
160b4b922b Add real sched.h
It is required by IEEE Std 1003.1-2008 AKA POSIX.

Put some Linux compatibility stuff under BSD_VISIBLE namespace, in
particular, sys/cpuset.h definitions.  Also, if the user really wants
Linux compatibility, she can request cpu_set_t typedef with
_WITH_CPU_SET_T define.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32901
2021-11-10 21:18:53 +02:00
Konstantin Belousov
e5adb145f0 Remove arm/linux from sysent toplevel target
Sponsored by:	The FreeBSD Foundation
Fixes:	65e485014b
2021-11-10 21:18:53 +02:00
Hans Petter Selasky
ad8f078f66 ifconfig(8): Don't set network interface capabilities when there is no change.
A quick grep through the kernel code shows network drivers compute the
changed bits of network capabilities after a SIOCSIFCAP IOCTL(2) by
using the bitwise exclusive or operation. When the set capabilities
are equal to the already read capabilities, no action will be taken.

Let ifconfig(8) predict this case and skip the SIOCSIFCAP IOCTL(2)
system call.

Discussed with:	kib@ (revert change in case of issues)
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-11-10 15:50:52 +01:00
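
Illustration for ad8f078f66: a sketch of the exclusive-or test described
above.  If the requested capability bits equal the current ones, the
XOR is zero and the SIOCSIFCAP call can be skipped.  The function names
and bit values are hypothetical and are not taken from ifconfig(8).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits that differ between the current and the requested capabilities. */
static uint32_t
demo_changed_caps(uint32_t current, uint32_t requested)
{
	return (current ^ requested);
}

/* True only if at least one capability bit would actually change. */
static bool
demo_need_ioctl(uint32_t current, uint32_t requested)
{
	return (demo_changed_caps(current, requested) != 0);
}
```

For example, demo_need_ioctl(0x3, 0x3) is false, while
demo_need_ioctl(0x3, 0x1) is true because bit 0x2 would be cleared.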
Martin Matuska
81b22a9892 zfs: merge openzfs/zfs@6c8f03232 (master) into main
Notable upstream pull request merges:
  #12333: Creating gang ABDs for Raidz optional IOs
  #12668: FreeBSD: Catch up with recent VFS changes
  #12687: Skip spacemaps reading in case of pool readonly import
  #12704: Fix some FreeBSD VOPs to synchronize properly with teardown
  #12724: Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency

Obtained from:	OpenZFS
OpenZFS commit:	6c8f03232a
2021-11-10 14:22:37 +01:00
Kristof Provost
2de49deeca pf tests: Test PR259689
We didn't populate dyncnt/tblcnt, so `pfctl -sr -vv` might not have the
table element count.

PR:		259689
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32893
2021-11-10 11:27:22 +01:00