Commit Graph

4819 Commits

Author SHA1 Message Date
Vincenzo Maffione
e0e1240528 netmap: fix LOR in iflib_netmap_register
In iflib_device_register(), the CTX_LOCK is acquired first and then
IFNET_WLOCK is acquired by ether_ifattach(). However, in netmap_hw_reg()
we do the opposite: IFNET_RLOCK is acquired first, and then CTX_LOCK
is acquired by iflib_netmap_register(). Fix this LOR issue by wrapping
the CTX_LOCK/UNLOCK calls in iflib_device_register with an additional
IFNET_WLOCK. This is safe since the IFNET_WLOCK is recursive.

MFC after:	1 month
2022-01-14 21:09:04 +00:00
Kristof Provost
5f5e32f1b3 pf: protect the rpool from races
The roundrobin pool stores its state in the rule, which could
potentially lead to invalid addresses being returned.

For example, thread A just executed PF_AINC(&rpool->counter) and
immediately afterwards thread B executes PF_ACPY(naddr, &rpool->counter)
(i.e. after the pf_match_addr() check of rpool->counter).

Lock the rpool with its own mutex to prevent these races. The
performance impact of this is expected to be low, as each rule has its
own lock, and the lock is also only relevant when state is being created
(so only for the initial packets of a connection, not for all traffic).

See also:	https://redmine.pfsense.org/issues/12660
Reviewed by:	glebius
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33874
2022-01-14 10:30:33 +01:00
Alexander Motin
618d49f5ca Revert "iflib: Relax timer period from 0.5 to 0.5-0.75s."
I've noticed relations between iflib_timer() vs ixl_admin_timer().
Both scheduled at the same 2Hz rate, but the second is rescheduling
the first each time, so if the first get any slower, it won't be
executed at all.  Revert this until deeper investigation.

This reverts commit 90bc1cf657.
2022-01-10 09:40:38 -05:00
Alexander Motin
90bc1cf657 iflib: Relax timer period from 0.5 to 0.5-0.75s.
While there switch it from hardclock ticks to milliseconds.

MFC after:	2 weeks
2022-01-09 20:32:50 -05:00
Ryan Stone
5adea417d4 Fix ifa refcount leak in ifa_ifwithnet()
In 4f6c66cc9c, ifa_ifwithnet() was changed to no longer
ifa_ref() the returned ifaddr, and instead the caller was required
to stay in the net_epoch for as long as they wanted the ifaddr
to remain valid.  However, this missed the case where an AF_LINK
lookup would call ifaddr_byindex(), which still does ifa_ref()
the ifaddr.  This would cause a refcount leak.

Fix this by inlining the relevant parts of ifaddr_byindex() here,
with the ifa_ref() call removed.  This also avoids an unnecessary
entry and exit from the net_epoch for this case.

I've audited all in-tree consumers of ifa_ifwithnet() that could
possibly perform an AF_LINK lookup and confirmed that none of them
will expect the ifaddr to have a reference that they need to
release.

MFC after: 2 months
Sponsored by: Dell Inc
Differential Revision:	https://reviews.freebsd.org/D28705
Reviewed by: melifaro
2022-01-06 15:04:24 -05:00
Ed Maste
a6668e31aa Fix kernel build without INET and INET6
Reviewed by:	brooks, melifaro
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33718
2022-01-05 09:41:38 -05:00
Gleb Smirnoff
644ca0846d domains: make domain_init() initialize only global state
Now that each module handles its global and VNET initialization
itself, there is no VNET related stuff left to do in domain_init().

Differential revision:	https://reviews.freebsd.org/D33541
2022-01-03 10:15:22 -08:00
Gleb Smirnoff
89128ff3e4 protocols: init with standard SYSINIT(9) or VNET_SYSINIT
The historical BSD network stack loop that rolls over domains and
over protocols has no advantages over more modern SYSINIT(9).
While doing the sweep, split global and per-VNET initializers.

Getting rid of pr_init allows to achieve several things:
o Get rid of ifdef's that protect against double foo_init() when
  both INET and INET6 are compiled in.
o Isolate initializers statically to the module they init.
o Makes code easier to understand and maintain.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33537
2022-01-03 10:15:21 -08:00
Ed Maste
818952c638 Fix kernel build without INET6
Reported by:	Gary Jennejohn
Fixes:		ff3a85d324 ("[lltable] Add per-family lltable ...")
Sponsored by:	The FreeBSD Foundation
2021-12-30 18:40:46 -05:00
Stefan Eßer
e2650af157 Make CPU_SET macros compliant with other implementations
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.

Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.

The differences between the FreeBSD version of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-adderess
scheme (2 source operands and a separately passed destination).

The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.

This patch set allows to unconditionally provide functions and macros
expected by 3rd party software written for GLIBC based systems, but
breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.

One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.

Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT do
no longer require that option.

The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.

Reviewed by:	kib
MFC after:	2 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D33451
2021-12-30 12:20:32 +01:00
Alexander V. Chernikov
63f7f3921b routing: Add unified level-based logging support for the routing subsystem.
Summary: MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33664
2021-12-29 21:30:18 +00:00
Alexander V. Chernikov
823a08d740 nhops: split nh_family into nh_upper_family and nh_neigh_family.
With IPv4 over IPv6 nexthops and IP->MPLS support, there is a need
 to distingush "upper" e.g. traffic family and "neighbor" e.g. LLE/gateway
 address family. Store them explicitly in the private part of the nexthop data.

While here, store nhop fibnum in nhop_prip datastructure to make it self-contained.

MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33663
2021-12-29 21:03:19 +00:00
Alexander V. Chernikov
ff3a85d324 [lltable] Add per-family lltable getters.
Introduce a new function, lltable_get(), to retrieve lltable pointer
 for the specified interface and family.
Use it to avoid all-iftable list traversal when adding or deleting
 ARP/ND records.

Differential Revision: https://reviews.freebsd.org/D33660
MFC after:	2 weeks
2021-12-29 20:57:15 +00:00
Vincenzo Maffione
4561c4f0ca net: iflib: sync isc_capenable to if_capenable
On SIOCSIFCAP, some bits in ifp->if_capenable may be toggled.
When this happens, apply the same change to isc_capenable, which
is the iflib private copy of if_capenable (for a subset of the
IFCAP_* bits). In this way the iflib drivers can check the bits
using isc_capenable rather than if_capenable. This is convenient
because the latter access requires an additional indirection
through the ifp, and it is also less likely to be in cache.

PR:		260068
Reviewed by:	kbowling, gallatin
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33156
2021-12-28 10:55:21 +00:00
Kristof Provost
e7809dceb5 pf: make if_pfsync.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33504
2021-12-17 12:38:35 +01:00
Kristof Provost
dc04fa802d pf: make if_pflog.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33503
2021-12-17 12:38:35 +01:00
Kristof Provost
e9167358e4 net: make if_bridgevar.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33502
2021-12-17 12:38:35 +01:00
Kristof Provost
f4096a7c8a net: make ethernet.h self-contained
Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33501
2021-12-17 12:38:35 +01:00
Kristof Provost
c658610b92 pf: make pfvar.h self-contained
Ensure that the pfvar.h header can be included without including any
other headers.

Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33499
2021-12-17 12:38:34 +01:00
Kristof Provost
b29c145cc1 if_stf: make if_stf.h self-contained
Ensure that the if_stf.h header can be included without including any
other headers.

Reviewed by:	imp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33498
2021-12-17 12:38:34 +01:00
Warner Losh
c6df6f5322 Create wrapper for Giant taken for newbus
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D31831
2021-12-09 17:04:45 -07:00
Mateusz Guzik
e735fa3212 net/if.c: plug set-but-not-unused vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 20:39:40 +00:00
Gleb Smirnoff
7e0bba4d80 ifnet: make V_if_index static to if.c
This requires moving net.link.generic sysctl declaration from if_mib.c
to if.c.  Ideally if_mib.c needs just to be merged to if.c, but they
have different license texts.

Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
d74b7baeb0 ifnet_byindex() actually requires network epoch
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them
in epoch.  Most of the code touched remains unsafe, as the returned
pointer is being used after epoch exit.  Mark that with a comment.

Validate the index argument inside the function, reducing argument
validation requirement from the callers and making V_if_index
private to if.c.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D33263
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
7b40b00fad ifnet: merge ifindex_alloc(), ifnet_setbyindex(), if_grow() and call magic
Now it is possible to just merge all this complexity into single
linear function.  Note that IFNET_WLOCK() is a sleepable lock, so
we can M_WAITOK and epoch_wait_preempt().

Reviewed by:		melifaro, bz, kp
Differential revision:	https://reviews.freebsd.org/D33262
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
6ff4cac2ee ifnet: initial if_grow() shall always succeed
So let's just call malloc() directly.  This also avoids hidden
doubling of default V_if_indexlim.

Reviewed by:		melifaro, bz, kp
Differential revision:	https://reviews.freebsd.org/D33261
2021-12-06 09:32:31 -08:00
Gleb Smirnoff
450394af27 ifnet: use ck_pr(3) store & load setting ifnet pointer in ifindex
The lockless access to the array is protected by the network epoch.

Reviewed by:		bz, kp
Differential revision:	https://reviews.freebsd.org/D33260
2021-12-06 09:32:30 -08:00
Gleb Smirnoff
8062e5759c ifnet: allocate index at the end of if_alloc_domain()
Now that if_alloc_domain() never fails and actually doesn't
expose ifnet to outside we can eliminate IFNET_HOLD and two
step index allocation.

Reviewed by:		kp
Differential revision:	https://reviews.freebsd.org/D33259
2021-12-06 09:32:30 -08:00
Gleb Smirnoff
ad2a0aec29 nhop: hash ifnet pointer instead of if_index
Yet another problem created by VIMAGE/if_vmove/epair design that
relocates ifnet between vnets and changes if_index.  Since if_index
changes, nhop hash values also changes, unlink_nhop() isn't able to
find entry in hash and leaks the nhop.  Since nhop references ifnet,
the latter is also leaked.  As result running network tests leaks
memory on every single test that creates vnet jail.

While here, rewrite whole hash_priv() to use static initializer,
per Alexander's suggestion.

Reviewed by:	melifaro
2021-12-04 10:05:46 -08:00
Kristof Provost
6d4baa0d01 if_pflog: fix packet length
There were two issues with the new pflog packet length.
The first is that the length is expected to be a multiple of
sizeof(long), but we'd assumed it had to be a multiple of
sizeof(uint32_t).

The second is that there's some broken software out there (such as
Wireshark) that makes incorrect assumptions about the amount of padding.
That is, Wireshark assumes there's always three bytes of padding, rather
than however much is needed to get to a multiple of sizeof(long).

Fix this by adding extra padding, and a fake field to maintain
Wireshark's assumption.

Reported by:	Ozkan KIRIK <ozkan.kirik@gmail.com>
Tested by:	Ozkan KIRIK <ozkan.kirik@gmail.com>
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33236
2021-12-04 08:42:55 +01:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Gleb Smirnoff
9e93d2b335 ifnet: enable & fix if_debug build
Fixes:	ce40632a31
2021-12-02 10:59:43 -08:00
Gleb Smirnoff
93c67567e0 Remove "options PCBGROUP"
With upcoming changes to the inpcb synchronisation it is going to be
broken. Even its current status after the move of PCB synchronization
to the network epoch is very questionable.

This experimental feature was sponsored by Juniper but ended never to
be used in Juniper and doesn't exist in their source tree [sjg@, stevek@,
jtl@]. In the past (AFAIK, pre-epoch times) it was tried out at Netflix
[gallatin@, rrs@] with no positive result and at Yandex [ae@, melifaro@].

I'm up to resurrecting it back if there is any interest from anybody.

Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33020
2021-12-02 10:48:48 -08:00
Gleb Smirnoff
1cec1c5831 Allow to compile RSS without PCBGROUP.
Reviewed by:		rrs
Differential revision:	https://reviews.freebsd.org/D33019
2021-12-02 10:48:48 -08:00
Zhenlei Huang
73d41cc730 if_epair: Also mark the flag of pair b with IFF_KNOWSEPOCH
Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33210
2021-12-01 15:54:23 +01:00
Kristof Provost
439da7f06d if_stf: KASAN fix
In in_stf_input() we grabbed a pointer to the IPv4 header and later did
an m_pullup() before we look at the IPv6 header. However, m_pullup()
could rearrange the mbuf chain and potentially invalidate the pointer to
the IPv4 header.

Avoid this issue by copying the IP header rather than getting a pointer
to it.

Reported by:	markj, Jenkins (KASAN job)
Reviewed by:	markj
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33192
2021-11-30 17:35:15 +01:00
Mateusz Guzik
2cedfc3f7e if_epair: ifdef vars only used with ALTQ
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:28:54 +00:00
Gleb Smirnoff
3bc40f39fd if_free: add a comment explaining why ifindex_free() is performed here 2021-11-22 19:59:27 -08:00
Gleb Smirnoff
fe499a8452 ifnet: merge if_destroy() and if_free_internal() into one
New function has more meaningful name if_free_deferred() and has
its header comment fixed to reflect reality.  NFC
2021-11-22 19:53:12 -08:00
Gleb Smirnoff
4787572d05 ifnet: make if_alloc_domain() never fail
The last consumer of if_com_alloc() is firewire.  It never fails
to allocate.  Most likely the if_com_alloc() KPI will go away
together with if_fwip(), less likely new consumers of if_com_alloc()
will be added, but they would need to follow the no fail KPI.
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
1e3ca25d92 ifnet: make if_alloc_domain() static 2021-11-22 19:49:57 -08:00
Gleb Smirnoff
ce40632a31 ifnet: append if_debug.c to if.c
With this change if_index can become static.  There is nothing
that if_debug.c would want to isolate from if.c.  Potentially
if.c wants to share everything with if_debug.c.

Move Bjoern's copyright to if.c.

Reviewed by:	bz
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
8a6f38c8ac ifnet: garbage collect drbr_*_drv().
They were left in 62d76917b8 but after years proved not to be useful.
2021-11-22 19:49:57 -08:00
Kristof Provost
b46512f704 if_stf: add dtrace probe points
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33038
2021-11-20 19:29:01 +01:00
Kristof Provost
19dc644511 if_stf: add 6rd support
Implement IPv6 Rapid Deployment (RFC5969) on top of the existing 6to4
(RFC3056) if_stf code.

PR:		253328
Reviewed by:	hrs
Obtained from:	pfSense
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33037
2021-11-20 19:29:01 +01:00
Kristof Provost
3142d4f622 lagg: fix unused-but-set-variable
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-19 22:01:27 +01:00
Andriy Gapon
1bfdb812c7 iflib_stop: drain rx tasks to prevent any data races
iflib_stop modifies iflib data structures that are used by _task_fn_rx,
most prominently the free lists.  So, iflib_stop has to ensure that the
rx task threads are not active.

This should help to fix a crash seen when iflib_if_ioctl (e.g.,
SIOCSIFCAP) is called while there is already traffic flowing.

The crash has been seen on VMWare guests with vmxnet3 driver.

My guess is that on physical hardware the couple of 1ms delays that
iflib_stop has after disabling interrupts are enough for the queued work
to be completed before any iflib state is touched.

But on busy hypervisors the guests might not get enough CPU time to
complete the work, thus there can be a race between the taskqueue
threads and the work done to handle an ioctl, specifically in iflib_stop
and iflib_init_locked.

PR:		259458
Reviewed by:	markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D32926
2021-11-19 10:00:38 +02:00
Kristof Provost
8e492101ec pf: add COMPAT_FREEBSD13 for DIOCKEEPCOUNTERS
DIOCKEEPCOUNTERS used to overlap with DIOCGIFSPEEDV0, which has been
fixed in 14, but remains in stable/12 and stable/13.
Support the old, overlapping, call under COMPAT_FREEBSD13.

Reviewed by:	jhb
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33001
2021-11-17 03:09:20 +01:00
Mateusz Guzik
79554f2b6c net: whack "set but not used" warnings in net/rtsock.c
... except for one where the error is ignored.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:20:46 +00:00