Commit Graph

19182 Commits

Author SHA1 Message Date
Kristof Provost
ab91feabcc ovpn: Introduce OpenVPN DCO support
OpenVPN Data Channel Offload (DCO) moves OpenVPN data plane processing
(i.e. tunneling and cryptography) into the kernel, rather than using tap
devices.
This avoids significant copying and context switching overhead between
kernel and user space and improves OpenVPN throughput.

In my test setup throughput improved from around 660Mbit/s to around
2Gbit/s.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D34340
2022-06-28 11:33:10 +02:00
Mateusz Guzik
7388fb714a cache: drop the vfs.cache_rename_add tunable
The functionality has been in use since Jan 2021 -- long enough(tm).
2022-06-27 09:56:20 +02:00
Gleb Smirnoff
458f475df8 unix/dgram: smart socket buffers for one-to-many sockets
A one-to-many unix/dgram socket is a socket that has been bound
with bind(2) and can get multiple connections.  A typical example
is /var/run/log bound by syslogd(8) and receiving multiple
connections from libc syslog(3) API.  Until now all of these
connections shared the same receive socket buffer of the bound
socket.  This made the socket vulnerable to overflow attack.
See 240d5a9b1c for a historical attempt to workaround the problem.

This commit creates a per-connection socket buffer for every single
connected socket and eliminates the problem.  The new behavior will
optimize seldom writers over frequent writers.  See added test case
scenarios and code comments for more detailed description of the
new behavior.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35303
2022-06-24 09:09:11 -07:00
Gleb Smirnoff
1093f16487 unix/dgram: reduce mbuf chain traversals in send(2) and recv(2)
o Use m_pkthdr.memlen from m_uiotombuf()
o Modify unp_internalize() to keep track of allocated space and memory
  as well as pointer to the last buffer.
o Modify unp_addsockcred() to keep track of allocated space and memory
  as well as pointer to the last buffer.
o Record the datagram len/memlen/ctllen in the first (from) mbuf of the
  chain in uipc_sosend_dgram() and reuse it in uipc_soreceive_dgram().

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35302
2022-06-24 09:09:11 -07:00
Gleb Smirnoff
9b841b0e23 m_uiotombuf: write total memory length of the allocated chain in pkthdr
Data allocated by m_uiotombuf() usually goes into a socket buffer.
We are interested in the length of useful data to be added to sb_acc,
as well as total memory used by mbufs.  The later would be added to
sb_mbcnt.  Calculating this value at allocation time allows to save
on extra traversal of the mbuf chain.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35301
2022-06-24 09:09:11 -07:00
Gleb Smirnoff
a7444f807e unix/dgram: use minimal possible socket buffer for PF_UNIX/SOCK_DGRAM
This change fully splits away PF_UNIX/SOCK_DGRAM from other socket
buffer implementations, without any behavior changes.

Generic socket implementation is reduced down to one STAILQ and very
little code.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35300
2022-06-24 09:09:11 -07:00
Gleb Smirnoff
a4fc41423f sockets: enable protocol specific socket buffers
Split struct sockbuf into common shared fields and protocol specific
union, where protocols are free to implement whatever buffer they
want.  Such protocols should mark themselves with PR_SOCKBUF and are
expected to initialize their buffers in their pr_attach and tear
them down in pr_detach.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35299
2022-06-24 09:09:10 -07:00
Gleb Smirnoff
315167c0de unix: provide an option to return locked from unp_connectat()
Use this new version in unix/dgram socket when sending to a target
address.  This removes extra lock release/acquisition and possible
counter-intuitive ENOTCONN.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35298
2022-06-24 09:09:10 -07:00
Gleb Smirnoff
5dc8dd5f3a unix/dgram: inline sbappendaddr_locked() into uipc_sosend_dgram()
This allows to remove one M_NOWAIT allocation and also makes it
more clear what's going on.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35297
2022-06-24 09:09:10 -07:00
Gleb Smirnoff
e3fbbf965e unix/dgram: add a specific receive method - uipc_soreceive_dgram
With this second step PF_UNIX/SOCK_DGRAM has protocol specific
implementation.  This gives some possibility performance
optimizations.  However, it still operates on the same struct
socket as all other sockets do.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35296
2022-06-24 09:09:10 -07:00
Gleb Smirnoff
f384a97c83 unix/dgram: cleanup uipc_send of PF_UNIX/SOCK_DGRAM, step 2
Just remove one level of indentation as the case clause always match.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35295
2022-06-24 09:09:10 -07:00
Gleb Smirnoff
7e5b6b391e unix/dgram: cleanup uipc_send of PF_UNIX/SOCK_DGRAM, step 1
Remove the dead code.  The new uipc_sosend_dgram() handles send()
on PF_UNIX/SOCK_DGRAM in full.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35294
2022-06-24 09:09:10 -07:00
Gleb Smirnoff
3464958246 unix/dgram: add a specific send method - uipc_sosend_dgram()
This is first step towards splitting classic BSD socket
implementation into separate classes.  The first to be
split is PF_UNIX/SOCK_DGRAM as it has most differencies
to SOCK_STREAM sockets and to PF_INET sockets.

Historically a protocol shall provide two methods for sendmsg(2):
pru_sosend and pru_send.  The former is a generic send method,
e.g. sosend_generic() which would internally call the latter,
uipc_send() in our case.  There is one important exception, though,
the sendfile(2) code will call pru_send directly.  But sendfile
doesn't work on SOCK_DGRAM, so we can do the trick.  We will create
socket class specific uipc_sosend_dgram() which will carry only
important bits from sosend_generic() and uipc_send().

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35293
2022-06-24 09:09:10 -07:00
Mitchell Horne
29afffb942 subr_bus: restore bus_null_rescan()
Partially revert the previous change; we need to keep this method as a
specific override for pci_driver subclasses which should not use
pci_rescan_method() -- cardbus and ofw_pcibus. However, change the return
value to ENODEV for the same reasoning given in the original commit, and
use this as the default rescan method in bus_if.m.

Reported by:	jhb
Fixes:		36a8572ee8 ("bus_if: provide a default null rescan method")
MFC with:	36a8572ee8
2022-06-23 16:07:00 -03:00
Mitchell Horne
8701571df9 set_cputicker: use a bool
The third argument to this function indicates whether the supplied
ticker is fixed or variable, i.e. requiring calibration. Give this
argument a type and name that better conveys this purpose.

Reviewed by:	kib, markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35459
2022-06-23 15:15:11 -03:00
Mitchell Horne
36a8572ee8 bus_if: provide a default null rescan method
There is an existing helper method in subr_bus.c, but almost no drivers
know to use it. It also returns the same error as an empty method,
making it not very useful. Move this to bus_if.m and return a more
sensible error code.

This gives a slightly more meaningful error message when attempting
'devctl rescan' on buses and devices alike:
  "Device not configured" --> "Operation not supported by device"

Reviewed by:	imp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35501
2022-06-23 15:15:10 -03:00
Chuck Silvers
5bd21cbbd1 vfs: fix vfs_bio_clrbuf() for PAGE_SIZE > block size
Calculate the desired page valid mask using math that will not
overflow the types used.

Sponsored by:	Netflix

Reviewed by:	mckusick, kib, markj
Differential Revision:	https://reviews.freebsd.org/D34837
2022-06-21 17:58:52 -07:00
Mark Johnston
9553bc89db aio: Improve UMA usage
- Remove the AIO proc zone.  This zone gets one allocation per AIO
  daemon process, which isn't enough to warrant a dedicated zone.  Plus,
  unlike other AIO structures, aiops are small (32 bytes with LP64), so
  UMA doesn't provide better space efficiency than malloc(9).  Change
  one of the malloc types in vfs_aio.c to make it more general.

- Don't set the NOFREE flag on the other AIO zones.  This flag means
  that memory allocated to the AIO subsystem is never freed back to the
  VM, so it's always preferable to avoid using it when possible.  NOFREE
  was set without explanation when AIO was converted to use UMA 20 years
  ago, but it does not appear to be required; all of the structures
  allocated from UMA (per-process kaioinfo, kaiocb, and aioliojob) keep
  track of references and get freed only when none exist.  Plus, these
  structures will contain dangling pointer after they're freed (e.g.,
  the "cred", "fd_file" and "uiop" fields of struct kaiocb), so
  use-after-frees are dangerous even when the structures themselves are
  type-stable.

Reviewed by:	asomers
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35493
2022-06-20 12:48:13 -04:00
Damjan Jovanovic
8c309d48aa struct kinfo_file changes needed for lsof to work using only usermode APIs`
Add kf_pipe_buffer_[in/out/size] fields to kf_pipe, and populate them.

Add a kf_kqueue struct to the kf_un union, to allow querying kqueue state,
and populate it.

Populate the kf_sock_rcv_sb_state and kf_sock_snd_sb_state fields in
kf_sock for INET/INET6 sockets, and populate all other fields for all
transport layer protocols, not just TCP.

Bump __FreeBSD_version.

Differential revision:	https://reviews.freebsd.org/D34184
Reviewed by:	jhb, kib, se
MFC after:	1 week
2022-06-18 12:34:25 +03:00
Damjan Jovanovic
8ae7694913 KERN_LOCKF: report kl_file_fsid consistently with stat(2)
PR:	264723
Reviewed by:	kib
Discussed with:	markj
MFC after:	1 week
2022-06-18 12:34:17 +03:00
Mark Johnston
f6379f7fde socket: Fix a race between kevent(2) and listen(2)
When locking the knote list for a socket, we check whether the socket is
a listening socket in order to select the appropriate mutex; a listening
socket uses the socket lock, while data sockets use socket buffer
mutexes.

If SOLISTENING(so) is false and the knote lock routine locks a socket
buffer, then it must re-check whether the socket is a listening socket
since solisten_proto() could have changed the socket's identity while we
were blocked on the socket buffer lock.

Reported by:	syzkaller
Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35492
2022-06-16 10:20:04 -04:00
Mark Johnston
756bc3adc5 kasan: Create a shadow for the bootstack prior to hammer_time()
When the kernel is compiled with -asan-stack=true, the address sanitizer
will emit inline accesses to the shadow map.  In other words, some
shadow map accesses are not intercepted by the KASAN runtime, so they
cannot be disabled even if the runtime is not yet initialized by
kasan_init() at the end of hammer_time().

This went unnoticed because the loader will initialize all PML4 entries
of the bootstrap page table to point to the same PDP page, so early
shadow map accesses do not raise a page fault, though they are silently
corrupting memory.  In fact, when the loader does not copy the staging
area, we do get a page fault since in that case only the first and last
PML4Es are populated by the loader.  But due to another bug, the loader
always treated KASAN kernels as non-relocatable and thus always copied
the staging area.

It is not really practical to annotate hammer_time() and all callees
with __nosanitizeaddress, so instead add some early initialization which
creates a shadow for the boot stack used by hammer_time().  This is only
needed by KASAN, not by KMSAN, but the shared pmap code handles both.

Reported by:	mhorne
Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35449
2022-06-15 11:39:10 -04:00
Doug Ambrisko
ce00b11940 mount: revert the active vnode reporting feature
Revert the computing of active vnode reporting since statfs is used
by a lot of tools.  Only report the vnodes used.

Reported by:	mjg
2022-06-15 07:24:55 -07:00
Mark Johnston
7565431f30 mount: Fix an incorrect assertion in kernel_mount()
The pointer to the mount values may be null if an error occurred while
copying them in, so fix the assertion condition to reflect that
possibility.

While here, move some initialization code into the error == 0 block.  No
functional change intended.

Reported by:	syzkaller
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-06-14 12:00:59 -04:00
Mark Johnston
630f633f2a vm_object: Use the vm_object_(set|clear)_flag() helpers
... rather than setting and clearing flags inline.  No functional change
intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35469
2022-06-14 12:00:59 -04:00
Mark Johnston
e8955bd643 pipe: Use a distinct wait channel for I/O serialization
Suppose a thread tries to read from an empty pipe.  pipe_read() does the
following:

1. pipelock(), possibly sleeping
2. check for buffered data
3. pipeunlock()
4. set PIPE_WANTR and sleep
5. goto 1

pipelock() is an open-coded mutex; if a thread blocks in pipelock(), it
sleeps until the lock holder calls pipeunlock().

Both sleeps use the same wait channel.  So if there are multiple threads
in pipe_read(), a thread T1 in step 3 can wake up a thread T2 sleeping
in step 4.  Then T1 goes to sleep in step 4, and T2 acquires and
releases the pipelock, waking up T1 again.  This can go on indefinitely,
livelocking the process (and potentially starving a would-be writer).

Fix the problem by using a separate wait channel for pipelock().

Reported by:	Paul Floyd <paulf2718@gmail.com>
Reviewed by:	mjg, kib
PR:		264441
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35415
2022-06-14 12:00:59 -04:00
Cy Schubert
d781401512 kern_thread.c: Fix i386 build
Chase 4493a13e3b by updating static
assertions of struct proc.
2022-06-13 19:35:33 -07:00
Konstantin Belousov
1575804961 reap_kill_proc(): avoid singlethreading any other process if we are exiting
This is racy because curproc process lock is not used, but allows the
process to exit faster.  It is userspace issue to create such race
anyway, and not fullfilling the guarantee that all reaper descendants
are signalled should be fine.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
e0343eacf3 reap_kill_subtree(): hold the reaper when entering it into the queue to handle later
We drop proctree_lock, which allows the process to exit while memoized
in the list to proceed.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
1d4abf2cfa reap_kill_subtree_once(): handle proctree_lock unlock in reap_kill_proc()
Recorded reaper might loose its reaper status, so we should not assert
it, but check and avoid signalling if this happens.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 week
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
addf103ce6 reap_kill_proc: do not retry on thread_single() failure
The failure means that the process does single-threading itself, which
makes our action not needed.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
008b2e6544 Make stop_all_proc_block interruptible to avoid deadlock with parallel suspension
If we try to single-thread a process which thread entered
procctl(REAP_KILL_SUBTREE), and sleeping waiting for us unlocking
stop_all_proc_blocker, we must be able to finish single-threading.  This
requires the sleep to be interruptible.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Mark Johnston
2d5ef216b6 thread_single_end(): consistently maintain p_boundary_count for ALLPROC mode
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 week
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
1b4701fe1e thread_unsuspend(): do not unuspend the suspended leader thread doing SINGLE_ALLPROC
markj wrote:
tdsendsignal() may unsuspend a target thread. I think there is at least
one bug there: suppose thread T is suspended in
thread_single(SINGLE_ALLPROC) when trying to kill another process with
REAP_KILL. Suppose a different thread sends SIGKILL to T->td_proc. Then,
tdsendsignal() calls thread_unsuspend(T, T->td_proc). thread_unsuspend()
incorrectly decrements T->td_proc->p_suspcount to -1.

Later, when T->td_proc exits, it will wait forever in
thread_single(SINGLE_EXIT) since T->td_proc->p_suspcount never reaches 1.

Since the thread suspension is bounded by time needed to do
thread_single(), skipping the thread_unsuspend_one() call there should
not affect signal delivery if this thread is selected as target.

Reported by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
b9009b1789 thread_single(): remove already checked conditional expression
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
4493a13e3b Do not single-thread itself when the process single-threaded some another process
Since both self single-threading and remote single-threading rely on
suspending the thread doing thread_single(), it cannot be mixed: thread
doing thread_suspend_switch() might be subject to thread_suspend_one()
and vice versa.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
dd883e9a7e weed_inhib(): correct the condition to re-suspend a thread
suspended for SINGLE_ALLPROC mode.  There is no need to check for
boundary state.  It is only required to see that the suspension comes
from the ALLPROC mode.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:03 +03:00
Konstantin Belousov
b9893b3533 weed_inhib(): do not double-suspend already suspended thread if the loop reiterates
In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:02 +03:00
Konstantin Belousov
d7a9e6e740 thread_single: wait for P_STOPPED_SINGLE to pass
to avoid ALLPROC mode to try to race with any other single-threading
mode.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:02 +03:00
Konstantin Belousov
02a2aacbe2 issignal(): ignore signals when process is single-threading for exit
Places that will wait for curproc->p_singlethr to become zero (in the
next commit, the counter of number of external single-threading is
to be introduced), must wait for it interruptible, otherwise we
deadlock.  On the other hand, a signal delivered during this window,
if directed to the waiting thread, would cause the wait loop to become
a busy loop.

Since we are exiting, it is safe to ignore the signals.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:02 +03:00
Konstantin Belousov
d3000939c7 P2_WEXIT: avoid thread_single() for exiting process earlier
before the process itself does thread_single(SINGLE_EXIT).  We cannot
single-thread such process in ALLPROC (external) mode, and properly
detect and report the failure to do so due to the process becoming
zombie is easier to prevent than handle.

In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D35310
2022-06-13 22:30:02 +03:00
Doug Ambrisko
6468cd8e0e mount: add vnode usage per file system with mount -v
This avoids the need to drop into the ddb to figure out vnode
usage per file system.  It helps to see if they are or are not
being freed.  Suggestion to report active vnode count was from
kib@

Reviewed by:   	kib
Differential Revision: https://reviews.freebsd.org/D35436
2022-06-13 07:56:38 -07:00
Hans Petter Selasky
b8394039dc mbuf(9): Fix size of mbuf for all 32-bit platforms (i386, ARM, PowerPC and RISCV)
Do this by reducing the size of the MBUF_PEXT_MAX_PGS, causing "struct mbuf" to
be bigger than M_SIZE, and also add a missing padding field to ensure 64-bit
alignment.

Reviewed by:	gallatin@
Reported by:	Elliott Mitchell
Differential revision:	https://reviews.freebsd.org/D35339
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-06-07 22:09:10 +02:00
Hans Petter Selasky
fe8c78f0d2 ktls: Add full support for TLS RX offloading via network interface.
Basic TLS RX offloading uses the "csum_flags" field in the mbuf packet
header to figure out if an incoming mbuf has been fully offloaded or
not. This information follows the packet stream via the LRO engine, IP
stack and finally to the TCP stack. The TCP stack preserves the mbuf
packet header also when re-assembling packets after packet loss. When
the mbuf goes into the socket buffer the packet header is demoted and
the offload information is transferred to "m_flags" . Later on a
worker thread will analyze the mbuf flags and decide if the mbufs
making up a TLS record indicate a fully-, partially- or not decrypted
TLS record. Based on these three cases the worker thread will either
pass the packet on as-is or recrypt the decrypted bits, if any, or
decrypt the packet as usual.

During packet loss the kernel TLS code will call back into the network
driver using the send tag, informing about the TCP starting sequence
number of every TLS record that is not fully decrypted by the network
interface. The network interface then stores this information in a
compressed table and starts asking the hardware if it has found a
valid TLS header in the TCP data payload. If the hardware has found a
valid TLS header and the referred TLS header is at a valid TCP
sequence number according to the TCP sequence numbers provided by the
kernel TLS code, the network driver then informs the hardware that it
can resume decryption.

Care has been taken to not merge encrypted and decrypted mbuf chains,
in the LRO engine and when appending mbufs to the socket buffer.

The mbuf's leaf network interface pointer is used to figure out from
which network interface the offloading rule should be allocated. Also
this pointer is used to track route changes.

Currently mbuf send tags are used in both transmit and receive
direction, due to convenience, but may get a new name in the future to
better reflect their usage.

Reviewed by:	jhb@ and gallatin@
Differential revision:	https://reviews.freebsd.org/D32356
Sponsored by:	NVIDIA Networking
2022-06-07 12:58:09 +02:00
Hans Petter Selasky
f0fca64618 ktls: Refer send tag pointer once.
So that the asserts and the actual code see the same values.

Differential revision:	https://reviews.freebsd.org/D32356
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-06-07 12:57:03 +02:00
Hans Petter Selasky
4d88d81c31 mbuf(9): Implement a leaf network interface field in the mbuf packet header.
When packets are received they may traverse several network interfaces like
vlan(4) and lagg(9). When doing receive side offloads it is important to
know the first network interface entry point, because that is where all
offloading is taking place. This makes it possible to track receive
side route changes for multiport setups, for example when lagg(9) receives
traffic from more than one port. This avoids having to install multiple
offloading rules for the same stream.

This field works similar to the existing "rcvif" mbuf packet header field.

Submitted by:	jhb@
Reviewed by:	gallatin@ and gnn@
Differential revision:	https://reviews.freebsd.org/D35339
Sponsored by:	NVIDIA Networking
Sponsored by:	Netflix
2022-06-07 12:54:42 +02:00
Gleb Smirnoff
d97922c6c6 unix/*: rewrite unp_internalize() cmsg parsing cycle
Make it a complex, but a single for(;;) statement.  The previous cycle
with some loop logic in the beginning and some loop logic at the end
was confusing.  Both me and markj@ were misleaded to a conclusion that
some checks are unnecessary, while they actually were necessary.

While here, handle an edge case found by Mark, when on 64-bit platform
an incorrect message from userland would underflow length counter, but
return without any error.  Provide a test case for such message.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35375
2022-06-06 10:05:28 -07:00
Yuichiro NAITO
8d95f50052 smp: Use local copies of the setup function pointer and argument
No functional change intended.

PR:		264383
Reviewed by:	jhb, markj
MFC after:	1 week
2022-06-06 11:29:51 -04:00
Gleb Smirnoff
2573e6ced9 unix/dgram: rename unpdg_sendspace to unpdg_maxdgram
Matches the meaning of the variable and sysctl node name.
2022-06-03 12:55:44 -07:00
Gleb Smirnoff
a8e286bb5d sockets: use socket buffer mutexes in struct socket directly
Convert more generic socket code to not use sockbuf compat pointer.
Continuation of 4328318445.
2022-06-03 12:55:44 -07:00