4812 Commits

Author SHA1 Message Date
Gleb Smirnoff
89128ff3e4 protocols: init with standard SYSINIT(9) or VNET_SYSINIT
The historical BSD network stack loop that rolls over domains and
over protocols has no advantages over more modern SYSINIT(9).
While doing the sweep, split global and per-VNET initializers.

Getting rid of pr_init achieves several things:
o Get rid of ifdef's that protect against double foo_init() when
  both INET and INET6 are compiled in.
o Isolate initializers statically to the module they init.
o Make the code easier to understand and maintain.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33537
2022-01-03 10:15:21 -08:00
Ed Maste
818952c638 Fix kernel build without INET6
Reported by:	Gary Jennejohn
Fixes:		ff3a85d324 ("[lltable] Add per-family lltable ...")
Sponsored by:	The FreeBSD Foundation
2021-12-30 18:40:46 -05:00
Stefan Eßer
e2650af157 Make CPU_SET macros compliant with other implementations
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.

Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.

The difference between the FreeBSD versions of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-address
scheme (2 source operands and a separately passed destination).

The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.

This patch set makes it possible to unconditionally provide functions and
macros expected by 3rd party software written for GLIBC based systems, but
breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.

One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.

Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT no
longer require that option.

The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.

Reviewed by:	kib
MFC after:	2 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D33451
2021-12-30 12:20:32 +01:00
Alexander V. Chernikov
63f7f3921b routing: Add unified level-based logging support for the routing subsystem.
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33664
2021-12-29 21:30:18 +00:00
Alexander V. Chernikov
823a08d740 nhops: split nh_family into nh_upper_family and nh_neigh_family.
With IPv4 over IPv6 nexthops and IP->MPLS support, there is a need
 to distinguish the "upper" (i.e. traffic) family from the "neighbor"
 (i.e. LLE/gateway address) family. Store them explicitly in the private
 part of the nexthop data.

While here, store nhop fibnum in nhop_prip data structure to make it self-contained.

MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33663
2021-12-29 21:03:19 +00:00
Alexander V. Chernikov
ff3a85d324 [lltable] Add per-family lltable getters.
Introduce a new function, lltable_get(), to retrieve lltable pointer
 for the specified interface and family.
Use it to avoid all-iftable list traversal when adding or deleting
 ARP/ND records.

Differential Revision: https://reviews.freebsd.org/D33660
MFC after:	2 weeks
2021-12-29 20:57:15 +00:00
Vincenzo Maffione
4561c4f0ca net: iflib: sync isc_capenable to if_capenable
On SIOCSIFCAP, some bits in ifp->if_capenable may be toggled.
When this happens, apply the same change to isc_capenable, which
is the iflib private copy of if_capenable (for a subset of the
IFCAP_* bits). In this way the iflib drivers can check the bits
using isc_capenable rather than if_capenable. This is convenient
because the latter access requires an additional indirection
through the ifp, and it is also less likely to be in cache.

PR:		260068
Reviewed by:	kbowling, gallatin
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33156
2021-12-28 10:55:21 +00:00
Kristof Provost
e7809dceb5 pf: make if_pfsync.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33504
2021-12-17 12:38:35 +01:00
Kristof Provost
dc04fa802d pf: make if_pflog.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33503
2021-12-17 12:38:35 +01:00
Kristof Provost
e9167358e4 net: make if_bridgevar.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33502
2021-12-17 12:38:35 +01:00
Kristof Provost
f4096a7c8a net: make ethernet.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33501
2021-12-17 12:38:35 +01:00
Kristof Provost
c658610b92 pf: make pfvar.h self-contained
Ensure that the pfvar.h header can be included without including any
other headers.

Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33499
2021-12-17 12:38:34 +01:00
Kristof Provost
b29c145cc1 if_stf: make if_stf.h self-contained
Ensure that the if_stf.h header can be included without including any
other headers.

Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33498
2021-12-17 12:38:34 +01:00
Warner Losh
c6df6f5322 Create wrapper for Giant taken for newbus
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D31831
2021-12-09 17:04:45 -07:00
Mateusz Guzik
e735fa3212 net/if.c: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 20:39:40 +00:00
Gleb Smirnoff
7e0bba4d80 ifnet: make V_if_index static to if.c
This requires moving net.link.generic sysctl declaration from if_mib.c
to if.c.  Ideally if_mib.c needs just to be merged to if.c, but they
have different license texts.

Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
d74b7baeb0 ifnet_byindex() actually requires network epoch
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them
in epoch.  Most of the code touched remains unsafe, as the returned
pointer is being used after epoch exit.  Mark that with a comment.

Validate the index argument inside the function, reducing argument
validation requirement from the callers and making V_if_index
private to if.c.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
7b40b00fad ifnet: merge ifindex_alloc(), ifnet_setbyindex(), if_grow() and call magic
Now it is possible to just merge all this complexity into single
linear function.  Note that IFNET_WLOCK() is a sleepable lock, so
we can M_WAITOK and epoch_wait_preempt().

Reviewed by:		melifaro, bz, kp
Differential revision:	https://reviews.freebsd.org/D33262
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
6ff4cac2ee ifnet: initial if_grow() shall always succeed
So let's just call malloc() directly.  This also avoids hidden
doubling of default V_if_indexlim.

Reviewed by:		melifaro, bz, kp
Differential revision:	https://reviews.freebsd.org/D33261
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
450394af27 ifnet: use ck_pr(3) store & load setting ifnet pointer in ifindex
The lockless access to the array is protected by the network epoch.

Reviewed by:		bz, kp
Differential revision:	https://reviews.freebsd.org/D33260
2021-12-06 09:32:30 -08:00
Gleb Smirnoff
8062e5759c ifnet: allocate index at the end of if_alloc_domain()
Now that if_alloc_domain() never fails and actually doesn't
expose ifnet to outside we can eliminate IFNET_HOLD and two
step index allocation.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33259
2021-12-06 09:32:30 -08:00
Gleb Smirnoff
ad2a0aec29 nhop: hash ifnet pointer instead of if_index
Yet another problem created by VIMAGE/if_vmove/epair design that
relocates ifnet between vnets and changes if_index.  Since if_index
changes, nhop hash values also change, unlink_nhop() isn't able to
find the entry in the hash and leaks the nhop.  Since nhop references
ifnet, the latter is also leaked.  As a result, running network tests
leaks memory on every single test that creates a vnet jail.

While here, rewrite whole hash_priv() to use static initializer,
per Alexander's suggestion.

Reviewed by:	melifaro
2021-12-04 10:05:46 -08:00
Kristof Provost
6d4baa0d01 if_pflog: fix packet length
There were two issues with the new pflog packet length.
The first is that the length is expected to be a multiple of
sizeof(long), but we'd assumed it had to be a multiple of
sizeof(uint32_t).

The second is that there's some broken software out there (such as
Wireshark) that makes incorrect assumptions about the amount of padding.
That is, Wireshark assumes there's always three bytes of padding, rather
than however much is needed to get to a multiple of sizeof(long).

Fix this by adding extra padding, and a fake field to maintain
Wireshark's assumption.

Reported by:	Ozkan KIRIK <ozkan.kirik@gmail.com>
Tested by:	Ozkan KIRIK <ozkan.kirik@gmail.com>
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33236
2021-12-04 08:42:55 +01:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Gleb Smirnoff
9e93d2b335 ifnet: enable & fix if_debug build
Fixes:	ce40632a31
2021-12-02 10:59:43 -08:00
Gleb Smirnoff
93c67567e0 Remove "options PCBGROUP"
With upcoming changes to the inpcb synchronisation it is going to be
broken. Even its current status after the move of PCB synchronization
to the network epoch is very questionable.

This experimental feature was sponsored by Juniper but was never used
at Juniper and doesn't exist in their source tree [sjg@, stevek@,
jtl@]. In the past (AFAIK, pre-epoch times) it was tried out at Netflix
[gallatin@, rrs@] with no positive result and at Yandex [ae@, melifaro@].

I'm willing to resurrect it if there is any interest from anybody.

Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33020
2021-12-02 10:48:48 -08:00
Gleb Smirnoff
1cec1c5831 Allow compiling RSS without PCBGROUP.
Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33019
2021-12-02 10:48:48 -08:00
Zhenlei Huang
73d41cc730 if_epair: Also mark the flag of pair b with IFF_KNOWSEPOCH
Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33210
2021-12-01 15:54:23 +01:00
Kristof Provost
439da7f06d if_stf: KASAN fix
In in_stf_input() we grabbed a pointer to the IPv4 header and later did
an m_pullup() before we look at the IPv6 header. However, m_pullup()
could rearrange the mbuf chain and potentially invalidate the pointer to
the IPv4 header.

Avoid this issue by copying the IP header rather than getting a pointer
to it.

Reported by:	markj, Jenkins (KASAN job)
Reviewed by:	markj
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33192
2021-11-30 17:35:15 +01:00
Mateusz Guzik
2cedfc3f7e if_epair: ifdef vars only used with ALTQ
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:28:54 +00:00
Gleb Smirnoff
3bc40f39fd if_free: add a comment explaining why ifindex_free() is performed here
2021-11-22 19:59:27 -08:00
Gleb Smirnoff
fe499a8452 ifnet: merge if_destroy() and if_free_internal() into one
New function has more meaningful name if_free_deferred() and has
its header comment fixed to reflect reality.  NFC
2021-11-22 19:53:12 -08:00
Gleb Smirnoff
4787572d05 ifnet: make if_alloc_domain() never fail
The last consumer of if_com_alloc() is firewire.  It never fails
to allocate.  Most likely the if_com_alloc() KPI will go away
together with if_fwip(), less likely new consumers of if_com_alloc()
will be added, but they would need to follow the no fail KPI.
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
1e3ca25d92 ifnet: make if_alloc_domain() static
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
ce40632a31 ifnet: append if_debug.c to if.c
With this change if_index can become static.  There is nothing
that if_debug.c would want to isolate from if.c.  Potentially
if.c wants to share everything with if_debug.c.

Move Bjoern's copyright to if.c.

Reviewed by:	bz
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
8a6f38c8ac ifnet: garbage collect drbr_*_drv().
They were left in 62d76917b8 but after years proved not to be useful.
2021-11-22 19:49:57 -08:00
Kristof Provost
b46512f704 if_stf: add dtrace probe points
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33038
2021-11-20 19:29:01 +01:00
Kristof Provost
19dc644511 if_stf: add 6rd support
Implement IPv6 Rapid Deployment (RFC5969) on top of the existing 6to4
(RFC3056) if_stf code.

PR:		253328
Reviewed by:	hrs
Obtained from:	pfSense
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33037
2021-11-20 19:29:01 +01:00
Kristof Provost
3142d4f622 lagg: fix unused-but-set-variable
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-19 22:01:27 +01:00
Andriy Gapon
1bfdb812c7 iflib_stop: drain rx tasks to prevent any data races
iflib_stop modifies iflib data structures that are used by _task_fn_rx,
most prominently the free lists.  So, iflib_stop has to ensure that the
rx task threads are not active.

This should help to fix a crash seen when iflib_if_ioctl (e.g.,
SIOCSIFCAP) is called while there is already traffic flowing.

The crash has been seen on VMWare guests with vmxnet3 driver.

My guess is that on physical hardware the couple of 1ms delays that
iflib_stop has after disabling interrupts are enough for the queued work
to be completed before any iflib state is touched.

But on busy hypervisors the guests might not get enough CPU time to
complete the work, thus there can be a race between the taskqueue
threads and the work done to handle an ioctl, specifically in iflib_stop
and iflib_init_locked.

PR:		259458
Reviewed by:	markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D32926
2021-11-19 10:00:38 +02:00
Kristof Provost
8e492101ec pf: add COMPAT_FREEBSD13 for DIOCKEEPCOUNTERS
DIOCKEEPCOUNTERS used to overlap with DIOCGIFSPEEDV0, which has been
fixed in 14, but remains in stable/12 and stable/13.
Support the old, overlapping, call under COMPAT_FREEBSD13.

Reviewed by:	jhb
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33001
2021-11-17 03:09:20 +01:00
Mateusz Guzik
79554f2b6c net: whack "set but not used" warnings in net/rtsock.c
... except for one where the error is ignored.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:20:46 +00:00
Mateusz Guzik
c681cce925 net: whack "set but not used" warnings in net/pfil.c
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:19:58 +00:00
Mateusz Guzik
5a4e46f6ec net: whack "set but not used" warnings in net/if.c
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:15:08 +00:00
Kristof Provost
047c4e365d pf: renumber DIOCKEEPCOUNTERS
We accidentally had two ioctls use the same base number
(DIOCKEEPCOUNTERS and DIOCGIFSPEEDV{0,1}). We get away with that on most
platforms because the size of the argument structures is different.
This does break CHERI, and is generally a bad idea anyway.
Renumber to avoid this collision.

Reported by:	jhb
2021-11-14 15:36:59 +01:00
Kristof Provost
8e45fed3ae if_stf: enable use in vnet jails
The cloner must be per-vnet so that cloned interfaces get destroyed when
the vnet goes away. Otherwise we fail assertions in vnet_if_uninit():

	panic: vnet_if_uninit:475 tailq &V_ifnet=0xfffffe01665fe070 not empty
	cpuid = 19
	time = 1636107064
	KDB: stack backtrace:
	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe015d0cac60
	vpanic() at vpanic+0x187/frame 0xfffffe015d0cacc0
	panic() at panic+0x43/frame 0xfffffe015d0cad20
	vnet_if_uninit() at vnet_if_uninit+0x7b/frame 0xfffffe015d0cad30
	vnet_destroy() at vnet_destroy+0x170/frame 0xfffffe015d0cad60
	prison_deref() at prison_deref+0x9b0/frame 0xfffffe015d0cadd0
	sys_jail_remove() at sys_jail_remove+0x119/frame 0xfffffe015d0cae00
	amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe015d0caf30
	fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe015d0caf30
	--- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x8011e920a, rsp = 0x7fffffffe788, rbp = 0x7fffffffe810 ---
	KDB: enter: panic

MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32849
2021-11-09 09:39:53 +01:00
Kristof Provost
3576121c8b if_stf: style(9) pass
As stated in style(9): "Values in return statements should be enclosed
in parentheses."

MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32848
2021-11-09 09:39:53 +01:00
Kristof Provost
8ca6c11a7c if_gif: fix vnet shutdown panic
If an if_gif exists and has an address assigned inside a vnet when the
vnet is shut down we failed to clean up the address, leading to a panic
when we ip_destroy() and the V_in_ifaddrhashtbl is not empty.

This happens because of the VNET_SYS(UN)INIT order, which means we
destroy the if_gif interface before the addresses can be purged (and
if_detach() does not remove addresses, it assumes this will be done by
the stack teardown code).

Set subsystem SI_SUB_PSEUDO just like if_bridge so the cleanup
operations happen in the correct order.

MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32835
2021-11-08 12:00:00 +01:00
Wojciech Macek
acdfc09639 lagg: update capabilities on SIOCSIFMTU
Some NICs might have limited capabilities when Jumbo frames are used.
For example, some neta interfaces only support TX csum offload when the
packet size is lower than a value specified in DT.
Fix it by re-reading capabilities of children interfaces after MTU
has been successfully changed.

Found by: Jerome Tomczyk <jerome.tomczyk@stormshield.eu>
Reviewed by: jhb
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D32724
2021-11-06 10:43:08 +01:00