Commit Graph

280079 Commits

Author SHA1 Message Date
Andrew Gallatin
ac4e3a27ab Unbreak the build when MAC is not defined
7a2c93b86e removed the use of "error" when MAC was not
defined, resulting in an unused variable error.

Sponsored by: Netflix
Reviewed by: jhb
2022-12-14 17:39:25 -05:00
Randall Stewart
2e2a1c3139 Oops, take out a stray left-behind printf that was
for debugging. Sorry.
2022-12-14 16:11:39 -05:00
Randall Stewart
e2e088ae86 Rack cannot be loaded without cc_newreno compiled into the kernel.
Right now rack will fail to load due to its hack in accessing symbol names
in cc_newreno. This was fine when newreno was always compiled into the
kernel but now ... not so much. Instead let's fix up rack to use the socket
option queries to get the information it wants and set the parameters. We
also fix the CC parameters so they are always settable.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D37622
2022-12-14 15:37:48 -05:00
Alexander V. Chernikov
80f03e63d6 netlink: improve interface handling
* Separate interface creation from interface modification code
* Support setting some interface attributes (ifdescr, mtu, up/down, promisc)
* Improve interaction with the cloners requiring to parse/write custom
 interface attributes
* Add bitmask-based way of checking if the attribute is present in the
message
* Don't use multipart RTM_GETLINK replies when searching for the
specific interface names
* Use ENODEV instead of ENOENT in case of failed RTM_GETLINK search
* Add python netlink test helpers
* Add some netlink interface tests

Differential Revision: https://reviews.freebsd.org/D37668
2022-12-14 19:52:35 +00:00
Andrew Gallatin
1cac76c93f vm: reduce lock contention when processing vm batchqueues
Rather than waiting until the batchqueue is full to acquire the lock &
process the queue, we now start trying to acquire the lock using trylocks
when the batchqueue is 1/2 full. This removes almost all contention on the
vm pagequeue mutex for our busy sendfile() based web workload.
It also greatly reduces the amount of time a network driver ithread
remains blocked on a mutex, and eliminates some packet drops under
heavy load.

So that the system does not lose the benefit of processing large
batchqueues, I've doubled the size of the batchqueues. This way, when
there is no contention, we process the same batch size as before.

This has been run for several months on a busy Netflix server, as well
as on my personal desktop.

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37305
2022-12-14 14:34:07 -05:00
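
The locking policy described in 1cac76c93f can be sketched in userland C. This is only an illustration under stated assumptions: `BATCH_SIZE`, `struct batchqueue`, `struct pagequeue` and `batch_enqueue()` are hypothetical names, and a POSIX mutex stands in for the vm pagequeue mutex. It is not the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BATCH_SIZE 16	/* hypothetical capacity, i.e. an old size of 8 doubled */

struct batchqueue {
	int	items[BATCH_SIZE];
	size_t	count;
};

struct pagequeue {
	pthread_mutex_t	lock;
	long		processed;	/* stand-in for the real per-page work */
};

/* Drain the whole batch. The caller must hold pq->lock. */
static void
batch_drain_locked(struct pagequeue *pq, struct batchqueue *bq)
{
	pq->processed += (long)bq->count;
	bq->count = 0;
}

/*
 * Queue one item. Below half full, just batch. From half full onward,
 * try the lock opportunistically. Only when completely full, block.
 * Returns 1 if the batch was drained, 0 otherwise.
 */
static int
batch_enqueue(struct pagequeue *pq, struct batchqueue *bq, int item)
{
	assert(bq->count < BATCH_SIZE);
	bq->items[bq->count++] = item;

	if (bq->count == BATCH_SIZE) {
		pthread_mutex_lock(&pq->lock);		/* full: must wait */
	} else if (bq->count >= BATCH_SIZE / 2) {
		if (pthread_mutex_trylock(&pq->lock) != 0)
			return (0);			/* contended: keep batching */
	} else {
		return (0);
	}
	batch_drain_locked(pq, bq);
	pthread_mutex_unlock(&pq->lock);
	return (1);
}
```

In this sketch, an uncontended caller drains at 8 items (the assumed old batch size), while a contended caller keeps accumulating up to 16 before it has to block.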
Andrew Gallatin
c4a4b2633d allocate inpcb aligned to cachelines
The inpcb struct is one of the most heavily utilized in the kernel
on a busy network server.  By aligning it to a cacheline
boundary, we can ensure that closely related fields in the inpcb
and tcpcb can be predictably located on the same cacheline.  rrs
has already done a lot of this work to put related fields on the
same line for the tcpcb.

In combination with a forthcoming patch to align the start of the tcpcb,
we see a roughly 3% reduction in CPU use on a busy web server serving
traffic over roughly 50,000 TCP connections.

Reviewed by: glebius, markj, tuexen
Differential Revision: https://reviews.freebsd.org/D37687
Sponsored by: Netflix
2022-12-14 14:19:35 -05:00
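
As a rough userland illustration of the idea in c4a4b2633d: when the start of a structure is aligned to a cacheline boundary, the position of each field within a line is fixed by its offset, so hot fields grouped at the front share a line predictably. The 64-byte line size, `struct conn_pcb` and both helpers are assumptions for this sketch; the kernel allocates the inpcb through its own allocator, not `aligned_alloc()`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64	/* assumed line size; real hardware may differ */

/* Hypothetical control block: hot fields first, cold data after them. */
struct conn_pcb {
	_Alignas(CACHE_LINE) uint32_t laddr;
	uint32_t	faddr;
	uint16_t	lport;
	uint16_t	fport;
	uint64_t	rx_packets;
	char		rarely_used[200];
};

/* aligned_alloc() wants a size that is a multiple of the alignment. */
static_assert(sizeof(struct conn_pcb) % CACHE_LINE == 0,
    "size must be a multiple of the cacheline size");

static struct conn_pcb *
conn_pcb_alloc(void)
{
	return (aligned_alloc(CACHE_LINE, sizeof(struct conn_pcb)));
}

/* True if both addresses fall within the same cacheline. */
static int
same_cacheline(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE == (uintptr_t)b / CACHE_LINE);
}
```

With an unaligned allocation, whether `laddr` and `fport` share a line would depend on where the allocator happened to place the object; with the aligned one it follows from the layout alone.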
Gleb Smirnoff
7a2c93b86e sockets: provide sousrsend() that does socket specific error handling
Sockets have special handling for EPIPE on a write, that was spread out
into several places.  Treating transient errors is also special: if the
protocol is atomic, then we should ignore any changes to uio_resid, since a
transient error means the write has failed completely (see d2b3a0ed31).

- Provide sousrsend() that expects a valid uio, and leave sosend() for
  kernel consumers only.  Do all special error handling right here.
- In dofilewrite() don't do special handling of error for DTYPE_SOCKET.
- For send(2), write(2) and aio_write(2) call into sousrsend() and remove
  error handling for kern_sendit(), soo_write() and soaio_process_job().

PR:			265087
Reported by:            rz-rpi03 at h-ka.de
Reviewed by:            markj
Differential revision:	https://reviews.freebsd.org/D35863
2022-12-14 10:02:44 -08:00
Gleb Smirnoff
eaabc93764 tcp: retire TCPDEBUG
This subsystem is superseded by modern debugging facilities,
e.g. DTrace probes and TCP black box logging.

We intentionally leave SO_DEBUG in place, as many utilities may
set it on a socket.  Also the tcp::debug DTrace probes look at
this flag on a socket.

Reviewed by:		gnn, tuexen
Discussed with:		rscheff, rrs, jtl
Differential revision:	https://reviews.freebsd.org/D37694
2022-12-14 09:54:06 -08:00
Mark Johnston
ab8b2d108c sys/conf: Remove an unneeded flag variable
After commit fac6dee9eb ("Remove tests for obsolete compilers in the
build system"), we always set -fdebug-prefix-map, so there's no point in
defining and testing _MAP_DEBUG_PREFIX.  No functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-12-14 09:32:17 -05:00
Mark Johnston
57cc96f49e pf: Fix definitions of pf_pfil_*_hooked
This use of "volatile" in the vnet definitions doesn't have any effect.
VNET_DEFINE_STATE(volatile int, ...) should work, but let's avoid using
"volatile" altogether and convert to atomic_load/atomic_store.  Also
convert to bool while here.

Reviewed by:	kp, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37684
2022-12-14 09:29:59 -05:00
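
The conversion described in 57cc96f49e, from a "volatile" int flag to a bool accessed through explicit atomic loads and stores, looks roughly like this in portable C11. The kernel uses its own atomic primitives and vnet macros; `<stdatomic.h>` and the names `hooks_enabled`, `hooks_set()` and `hooks_get()` are stand-ins assumed for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag; static storage starts out as false. */
static atomic_bool hooks_enabled;

/* Writers publish the new state with an explicit atomic store. */
static void
hooks_set(bool on)
{
	atomic_store_explicit(&hooks_enabled, on, memory_order_relaxed);
}

/* Readers use an explicit atomic load instead of relying on "volatile". */
static bool
hooks_get(void)
{
	return (atomic_load_explicit(&hooks_enabled, memory_order_relaxed));
}
```

Relaxed ordering is chosen here only because the sketch treats the flag as a standalone hint; whether stronger ordering is needed depends on what the flag guards.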
Kristof Provost
1596d28026 if_ovpn: fix LINT-NOIP build
Reported by:	mjg
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-12-14 10:53:03 +01:00
Kristof Provost
654e8d84ec pf tests: check that we clean up unused kifs
The previous commit fixed a memory leak, where we'd fail to clean up
removed groups (and interfaces).
Check that we now clean those up as expected.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37570
2022-12-14 10:19:01 +01:00
Nick Reilly
bfeef0d32a pf: fix pfi_ifnet leak on interface removal
The detach of the interface and group were leaving pfi_ifnet memory
behind. Check if the kif still has references, and clean it up if it
doesn't.

On interface detach, the group deletion was notified first and then a
change notification was sent. This would recreate the group in the kif
layer. Reorder the change to before the delete.

PR:		257218
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D37569
2022-12-14 10:19:01 +01:00
Mateusz Guzik
e6fc01f6be tcp: whack the stale declaration of rack_timer_stop
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-12-14 08:48:52 +00:00
Kristof Provost
a002c839ec if_ovpn: cleanup offsetof() use
Move the use of the `offsetof(struct ovpn_counters, fieldname) /
sizeof(uint64_t)` construct into a macro.
This removes a fair bit of code duplication and should make things a
little easier to read.

Reviewed by:	zlei
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37607
2022-12-14 06:48:59 +01:00
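
A minimal sketch of the kind of macro a002c839ec describes, assuming a counters structure made up entirely of uint64_t fields. `struct peer_counters`, its field names, `COUNTER_IDX` and `counter_add()` are hypothetical; the real structure is `struct ovpn_counters`, whose fields are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters struct: every member is a uint64_t. */
struct peer_counters {
	uint64_t	pkt_in;
	uint64_t	pkt_out;
	uint64_t	bytes_in;
	uint64_t	bytes_out;
};

static_assert(sizeof(struct peer_counters) % sizeof(uint64_t) == 0,
    "struct must be an array of uint64_t slots");

/* Turn a field name into its slot index, hiding the offsetof() division. */
#define COUNTER_IDX(field) \
	(offsetof(struct peer_counters, field) / sizeof(uint64_t))
#define COUNTER_COUNT (sizeof(struct peer_counters) / sizeof(uint64_t))

/* Add to one slot of a flat counter array. */
static void
counter_add(uint64_t *slots, size_t idx, uint64_t value)
{
	assert(idx < COUNTER_COUNT);
	slots[idx] += value;
}
```

A call site then reads `counter_add(slots, COUNTER_IDX(bytes_in), len)` rather than repeating the offsetof() arithmetic each time.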
Kristof Provost
c357bf397f if_ovpn: include peer counters in a OVPN_NOTIF_DEL_PEER message
When we remove a peer, userspace can no longer retrieve its counters. To
ensure that userspace can get a full count of the entire session we now
include the counters in the deletion message.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37606
2022-12-14 06:48:59 +01:00
Kristof Provost
92f0cf77db if_ovpn: allow peer lookup by vpn4/vpn6 address
Introduce two more RB_TREEs so that we can look up peers by their peer
id (already present) or vpn4 or vpn6 address.
This removes the last linear scan of the peer list.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37605
2022-12-14 06:48:59 +01:00
Kristof Provost
8b630fa9ef if_ovpn: implement OVPN_GET_PEER_STATS
Allow userspace to retrieve per-peer traffic stats.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37604
2022-12-14 06:48:58 +01:00
Kristof Provost
18a30fd39b if_ovpn: start tracking per-peer packets/bytes in/out
OpenVPN will introduce a mechanism to retrieve per-peer statistics.
Start tracking those so we can return them to userspace when queried.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37603
2022-12-14 06:48:58 +01:00
Kristof Provost
66de89d4c2 if_ovpn: remove OVPN_SEND_PKT
OpenVPN userspace no longer uses the ioctl interface to send control
packets. It instead uses the socket directly.
The use of OVPN_SEND_PKT was never released, so we can remove this
without worrying about compatibility.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37602
2022-12-14 06:48:58 +01:00
Konstantin Belousov
9e0d976d95 sizeof(7): miscellaneous edits
Suggested by:	pstef
Reviewed by:	imp, pstef (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37683
2022-12-14 07:44:04 +02:00
Jan Schaumann
57bee0817f sizeof(7): remove "All rights reserved"
Requested by:	pstef
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37674
2022-12-14 07:43:04 +02:00
Gleb Smirnoff
1b5895c624 tcp: remove a 4.4BSD relic
The actual code to modify this counter was disabled in 2c37256e5a
and later removed in d0390e0570.
2022-12-13 20:21:45 -08:00
Li-Wen Hsu
d969aeab73 Completely retire cp(4)
Sponsored by:	The FreeBSD Foundation
2022-12-14 11:42:36 +08:00
Li-Wen Hsu
611c6b233d Completely retire cp(4)
And fix the LINT build.

Sponsored by:	The FreeBSD Foundation

Fixes:	895992bb66
2022-12-14 11:38:55 +08:00
Gleb Smirnoff
5050df3f4a tcp: fix counter leak for SYN_RCVD state when syncache_socket() fails
The SYN_RCVD state count is tricky here due to default code path and TFO
being so different.  In the default case the count is incremented when a
syncache entry is added to the database in syncache_insert().  Later
when the connection transitions from a syncache entry to a socket in
syncache_expand(), this counter is inherited by the tcpcb.  If socket or
tcpcb allocation fails in syncache_socket(), syncache_expand()
is responsible for the decrement.  In the TFO case the syncache entry is not
inserted into database and count of SYN_RCVD is first incremented in the
syncache_tfo_expand() after successful socket allocation.  Thus, inside
syncache_socket() we can't tell whether we need to decrement in a case of
a failure or not.  The caller is responsible for this book keeping.

Fixes:	07285bb4c2
Differential revision:	https://reviews.freebsd.org/D37610
2022-12-13 19:31:05 -08:00
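
The bookkeeping rule in 5050df3f4a (the shared helper never touches the count; each caller undoes only what it has itself counted) can be modeled in a few lines. Everything here is a simplified, hypothetical stand-in: `make_socket()` plays the role of syncache_socket(), `entry_insert()` and `entry_expand()` the default path, and `tfo_expand()` the TFO path.

```c
#include <assert.h>
#include <stdbool.h>

static int syn_rcvd;	/* connections currently counted as SYN_RCVD */

/* Shared helper: may fail, and deliberately never touches syn_rcvd. */
static bool
make_socket(bool should_fail)
{
	return (!should_fail);
}

/* Default path, step 1: the entry is counted when it is inserted. */
static void
entry_insert(void)
{
	syn_rcvd++;
}

/* Default path, step 2: on failure the caller undoes that count. */
static bool
entry_expand(bool should_fail)
{
	if (!make_socket(should_fail)) {
		assert(syn_rcvd > 0);
		syn_rcvd--;
		return (false);
	}
	return (true);	/* the count carries over to the new connection */
}

/* TFO path: nothing is counted until the socket exists. */
static bool
tfo_expand(bool should_fail)
{
	if (!make_socket(should_fail))
		return (false);	/* nothing to undo */
	syn_rcvd++;
	return (true);
}
```

If `make_socket()` decremented on failure, the TFO path would undercount; if neither it nor its callers did, the default path would leak a count, which is the bug the commit message describes.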
Kyle Evans
54d65fdd56 diff: restyle loop a bit
This is a bit more readable, and this loop is probably unlikely to gain
any `continue` or `break`s.

Suggested by:	pstef
Differential Revision:	https://reviews.freebsd.org/D37676
2022-12-13 19:31:21 -06:00
Kyle Evans
8bf187f35b diff: fix side-by-side output with tabbed input
The previous logic conflated some things... in this block:
- j: input characters rendered so far
- nc: number of characters in the line
- col: columns rendered so far
- hw: column width ((h)ard (w)idth?)

Comparing j to hw or col to nc is naturally wrong, as col and hw are
limits on their respective counters and nc is already brought down to hw
if the input line should be truncated to start with.

Right now, we end up easily truncating lines with tabs in them as we
count each tab for $tabwidth lines in the input line, but we really
should only be accounting for them in the column count.  The problem is
most easily demonstrated by the two input files added for the tests,
the two tabbed lines lose at least a word or two even though there's
plenty of space left in the row for each side.

Reviewed by:	bapt, pstef
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D37676
2022-12-13 19:31:21 -06:00
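
To make the distinction in 8bf187f35b concrete, here is a small sketch that keeps the two counters apart: `j` counts input characters and is bounded by `nc`, while `col` counts rendered columns and is bounded by `hw`. The function name, the fixed tab width of 8 and the "stop before the character that would overflow" policy are assumptions for illustration, not the actual diff(1) code.

```c
#include <assert.h>
#include <stddef.h>

#define TABWIDTH 8	/* assumed tab stop spacing */

/*
 * Return how many of the first nc input characters of line fit into
 * hw rendered columns. A tab costs one input character but as many
 * columns as it takes to reach the next tab stop.
 */
static size_t
chars_that_fit(const char *line, size_t nc, size_t hw)
{
	size_t j = 0;	/* input characters consumed so far */
	size_t col = 0;	/* columns rendered so far */

	assert(line != NULL);
	while (j < nc) {
		size_t w = 1;

		if (line[j] == '\t')
			w = TABWIDTH - (col % TABWIDTH);
		if (col + w > hw)
			break;	/* the column budget, not nc, ran out */
		col += w;
		j++;
	}
	return (j);
}
```

Charging a tab against the character count as well as the column count is what would truncate tabbed lines early, as the commit message explains.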
John Baldwin
1656007e4c ptrace_test: Remove another MIPS remnant.
2022-12-13 14:51:52 -08:00
Mateusz Guzik
e6bc24b038 kref: switch internal type to atomic_t and bring back const to kref_read
This unbreaks the drm-kmod build.

The const is part of the Linux API.

Unfortunately drm-kmod uses hand-rolled refcount* calls on a kref
object. For now go the easy route of keeping it operational by casting
stuff internally.

The general goal here is to make FreeBSD refcount API use an opaque
type, hence the ongoing removal of hand-rolled accesses.

Reported by:	emaste
2022-12-13 20:46:58 +00:00
Ed Maste
fa4d25f5b4 retire sconfig(8) ce(4)/cp(4) configuration tool
The ce(4) and cp(4) drivers have been retired.

Differential Revision:	https://reviews.freebsd.org/D33469
2022-12-13 15:25:13 -05:00
Ed Maste
895992bb66 retire cp(4) driver
Sync serial (e.g. E1/T1/G.703) interfaces are obsolete, this driver
includes obfuscated source, and has reported potential security issues.

Differential Revision:  https://reviews.freebsd.org/D33468
2022-12-13 15:24:52 -05:00
Ed Maste
76f6751844 retire ce(4) driver
Sync serial (e.g. E1/T1/G.703) interfaces are obsolete, this driver
includes obfuscated source, and has reported potential security issues.

Differential Revision:	https://reviews.freebsd.org/D33467
2022-12-13 15:24:25 -05:00
Ed Maste
20dfe27b2d Add deprecation notices to ce,cp sync serial drivers
And the related sconfig utility.  Sync serial (e.g. E1/T1) interfaces
are obsolete, and nobody responded to several inquiries on the mailing
lists about use of these drivers.

Relnotes:	Yes
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23928
2022-12-13 14:59:08 -05:00
Jose Luis Duran
47972d6dc4 Fix rcorder example to match the keyword in the description
Differential Revision: https://reviews.freebsd.org/D37686
2022-12-13 19:56:28 +00:00
Ceri Davies
cd9cdd0eaa sysctl.8: grammar nit
2022-12-13 19:52:10 +00:00
Martin Matuska
bd5e624a86 libarchive: merge from vendor branch
Libarchive 3.6.2

Important bug fixes:
  rar5 reader: fix possible garbled output with bsdtar -O (#1745)
  mtree reader: support reading mtree files with tabs (#1783)
  various small fixes for issues found by CodeQL

MFC after:	2 weeks
PR:		286306 (exp-run)
2022-12-13 20:21:13 +01:00
Mike Karels
8d0ed56646 RELNOTES: Add mention of growfs addition of swap partition.
As documented in growfs(7).
2022-12-13 08:54:43 -06:00
Ed Maste
8974fa4515 ssh: describe two additional changes present in base system ssh
Sponsored by:	The FreeBSD Foundation
2022-12-13 09:45:56 -05:00
Peter Holm
eb928778a3 stress2: Add problem found
2022-12-13 12:00:01 +01:00
Mateusz Guzik
67e628b7a6 kref: replace hand-rolled atomic ops with refcount API
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37608
2022-12-13 09:24:57 +00:00
Søren Schmidt
85e7d8e0ae Add driver for Motorcomm YT8511 GbE PHY
Partially from:	https://reviews.freebsd.org/D36093
2022-12-13 05:58:51 +00:00
Warner Losh
2e1e68cbae stand: Make ioctl declaration consistent
It typically had two args with an optional third from the userland
declaration in sys/ioccom.h. However, the function definition used a
non-optional char * argument. This mismatch is undefined behavior (but worked
due to the calling conventions of all our machines).

Instead, add a declaration for ioctl to stand.h, make the third arg
'void *' which is a better match to the ... declaration before. This
also prevents the int * -> char * conversion errors. Make the ioctl
user-space declaration truly user-space specific (omit it in the
stand-alone build).

No functional change intended.

Sponsored by:		Netflix
Reviewed by:		emaste
Differential Revision:	https://reviews.freebsd.org/D37680
2022-12-12 21:46:34 -07:00
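
A small sketch of the shape of the fix in 2e1e68cbae: the declaration and the definition agree on a fixed `void *` third argument, so callers can pass an `int *` (or any other object pointer) without a cast and without any prototype mismatch. The name `sa_ioctl()`, the request code `SA_GET_UNIT` and the value it returns are hypothetical; this is not the stand.h declaration itself.

```c
#include <assert.h>
#include <stddef.h>

#define SA_GET_UNIT 1UL	/* hypothetical request code */

/* The declaration matches the definition below exactly. */
int sa_ioctl(int fd, unsigned long cmd, void *data);

int
sa_ioctl(int fd, unsigned long cmd, void *data)
{
	(void)fd;	/* unused in this sketch */

	switch (cmd) {
	case SA_GET_UNIT:
		assert(data != NULL);
		*(int *)data = 7;	/* arbitrary demo value */
		return (0);
	default:
		return (-1);	/* unknown request */
	}
}
```

Because any object pointer converts implicitly to `void *`, a call such as `sa_ioctl(fd, SA_GET_UNIT, &unit)` with `int unit` compiles cleanly, which is the conversion the commit message says used to produce int * -> char * errors.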
Jan Schaumann
0b75997f4c Add sizeof(7) manual page
PR:	268310
Reviewed by:	kib
Discussed with:	brooks, pauamma
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37674
2022-12-13 06:43:28 +02:00
Ed Maste
94db10b2db geom: minor man page updates suggested by igor(1)
Reviewed by:	pauamma
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37681
2022-12-12 19:27:17 -05:00
Ed Maste
a752e011a8 ssh: remove note about local change to [Use]PrivilegeSeparation
We documented "[Use]PrivilegeSeparation defaults to sandbox" as one of
our modifications to ssh's server-side defaults, but this is not (any
longer) a difference from upstream.

Sponsored by:	The FreeBSD Foundation
2022-12-12 17:07:27 -05:00
Ed Maste
d181a91267 geom: add vinum as a recognized class
And note that it is deprecated.

PR:		236569
Reported by:	bcran
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37678
2022-12-12 16:19:02 -05:00
Ed Maste
f16281460a RELNOTES: Add note for llvm-objdump as objdump
2022-12-12 15:52:29 -05:00
Alan Cox
f0878da03b pmap: standardize promotion conditions between amd64 and arm64
On amd64, don't abort promotion due to a missing accessed bit in a
mapping before possibly write protecting that mapping.  Previously,
in some cases, we might not repromote after madvise(MADV_FREE) because
there was no write fault to trigger the repromotion.  Conversely, on
arm64, don't pointlessly, yet harmlessly, write protect physical pages
that aren't part of the physical superpage.

Don't count aborted promotions due to explicit promotion prohibition
(arm64) or hardware errata (amd64) as ordinary promotion failures.

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36916
2022-12-12 11:32:50 -06:00
Chuck Silvers
9dda00df7e restore: fix restore of NFS4 ACLs
Changing the mode bits on a file with an NFS4 ACL results in the
NFS4 ACL being replaced by one matching the new mode bits being set,
so when restoring a file with an NFS4 ACL, set the owner/group/mode first
and then set the NFS4 ACL, so that setting the mode does not throw away
the ACL that we just set.

Reviewed by:	mckusick
Differential Revision:  https://reviews.freebsd.org/D37618
2022-12-12 08:19:51 -08:00