4781 Commits

Author SHA1 Message Date
Gleb Smirnoff
3bc40f39fd if_free: add a comment explaining why ifindex_free() is performed here 2021-11-22 19:59:27 -08:00
Gleb Smirnoff
fe499a8452 ifnet: merge if_destroy() and if_free_internal() into one
New function has more meaningful name if_free_deferred() and has
its header comment fixed to reflect reality.  NFC
2021-11-22 19:53:12 -08:00
Gleb Smirnoff
4787572d05 ifnet: make if_alloc_domain() never fail
The last consumer of if_com_alloc() is firewire.  It never fails
to allocate.  Most likely the if_com_alloc() KPI will go away
together with if_fwip(), less likely new consumers of if_com_alloc()
will be added, but they would need to follow the no fail KPI.
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
1e3ca25d92 ifnet: make if_alloc_domain() static 2021-11-22 19:49:57 -08:00
Gleb Smirnoff
ce40632a31 ifnet: append if_debug.c to if.c
With this change if_index can become static.  There is nothing
that if_debug.c would want to isolate from if.c.  Potentially
if.c wants to share everything with if_debug.c.

Move Bjoern's copyright to if.c.

Reviewed by:	bz
2021-11-22 19:49:57 -08:00
Gleb Smirnoff
8a6f38c8ac ifnet: garbage collect drbr_*_drv().
They were left in 62d76917b8678 but after years proved not to be useful.
2021-11-22 19:49:57 -08:00
Kristof Provost
b46512f704 if_stf: add dtrace probe points
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33038
2021-11-20 19:29:01 +01:00
Kristof Provost
19dc644511 if_stf: add 6rd support
Implement IPv6 Rapid Deployment (RFC5969) on top of the existing 6to4
(RFC3056) if_stf code.

PR:		253328
Reviewed by:	hrs
Obtained from:	pfSense
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33037
2021-11-20 19:29:01 +01:00
Kristof Provost
3142d4f622 lagg: fix unused-but-set-variable
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-19 22:01:27 +01:00
Andriy Gapon
1bfdb812c7 iflib_stop: drain rx tasks to prevent any data races
iflib_stop modifies iflib data structures that are used by _task_fn_rx,
most prominently the free lists.  So, iflib_stop has to ensure that the
rx task threads are not active.

This should help to fix a crash seen when iflib_if_ioctl (e.g.,
SIOCSIFCAP) is called while there is already traffic flowing.

The crash has been seen on VMWare guests with vmxnet3 driver.

My guess is that on physical hardware the couple of 1ms delays that
iflib_stop has after disabling interrupts are enough for the queued work
to be completed before any iflib state is touched.

But on busy hypervisors the guests might not get enough CPU time to
complete the work, thus there can be a race between the taskqueue
threads and the work done to handle an ioctl, specifically in iflib_stop
and iflib_init_locked.

PR:		259458
Reviewed by:	markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D32926
2021-11-19 10:00:38 +02:00
Kristof Provost
8e492101ec pf: add COMPAT_FREEBSD13 for DIOCKEEPCOUNTERS
DIOCKEEPCOUNTERS used to overlap with DIOCGIFSPEEDV0, which has been
fixed in 14, but remains in stable/12 and stable/13.
Support the old, overlapping, call under COMPAT_FREEBSD13.

Reviewed by:	jhb
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33001
2021-11-17 03:09:20 +01:00
Mateusz Guzik
79554f2b6c net: whack "set but not used" warnings in net/rtsock.c
... except for one where the error is ignored.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:20:46 +00:00
Mateusz Guzik
c681cce925 net: whack "set but not used" warnings in net/pfil.c
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:19:58 +00:00
Mateusz Guzik
5a4e46f6ec net: whack "set but not used" warnings in net/if.c
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-14 17:15:08 +00:00
Kristof Provost
047c4e365d pf: renumber DIOCKEEPCOUNTERS
We accidentally had two ioctls use the same base number
(DIOCKEEPCOUNTERS and DIOCGIFSPEEDV{0,1}). We get away with that on most
platforms because the size of the argument structures is different.
This does break CHERI, and is generally a bad idea anyway.
Renumber to avoid this collision.

Reported by:	jhb
2021-11-14 15:36:59 +01:00
Kristof Provost
8e45fed3ae if_stf: enable use in vnet jails
The cloner must be per-vnet so that cloned interfaces get destroyed when
the vnet goes away. Otherwise we fail assertions in vnet_if_uninit():

	panic: vnet_if_uninit:475 tailq &V_ifnet=0xfffffe01665fe070 not empty
	cpuid = 19
	time = 1636107064
	KDB: stack backtrace:
	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe015d0cac60
	vpanic() at vpanic+0x187/frame 0xfffffe015d0cacc0
	panic() at panic+0x43/frame 0xfffffe015d0cad20
	vnet_if_uninit() at vnet_if_uninit+0x7b/frame 0xfffffe015d0cad30
	vnet_destroy() at vnet_destroy+0x170/frame 0xfffffe015d0cad60
	prison_deref() at prison_deref+0x9b0/frame 0xfffffe015d0cadd0
	sys_jail_remove() at sys_jail_remove+0x119/frame 0xfffffe015d0cae00
	amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe015d0caf30
	fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe015d0caf30
	--- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x8011e920a, rsp = 0x7fffffffe788, rbp = 0x7fffffffe810 ---
	KDB: enter: panic

MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32849
2021-11-09 09:39:53 +01:00
Kristof Provost
3576121c8b if_stf: style(9) pass
As stated in style(9): "Values in return statements should be enclosed
in parentheses."

MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32848
2021-11-09 09:39:53 +01:00
Kristof Provost
8ca6c11a7c if_gif: fix vnet shutdown panic
If an if_gif exists and has an address assigned inside a vnet when the
vnet is shut down we failed to clean up the address, leading to a panic
when we ip_destroy() and the V_in_ifaddrhashtbl is not empty.

This happens because of the VNET_SYS(UN)INIT order, which means we
destroy the if_gif interface before the addresses can be purged (and
if_detach() does not remove addresses, it assumes this will be done by
the stack teardown code).

Set subsystem SI_SUB_PSEUDO just like if_bridge so the cleanup
operations happen in the correct order.

MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32835
2021-11-08 12:00:00 +01:00
Wojciech Macek
acdfc09639 lagg: update capabilites on SIOCSIFMTU
Some NICs might have limited capabilities when Jumbo frames are used.
For exampe some neta interfaces only support TX csum offload when the
packet size is lower than a value specified in DT.
Fix it by re-reading capabilities of children interfaces after MTU
has been successfully changed.

Found by: Jerome Tomczyk <jerome.tomczyk@stormshield.eu>
Reviewed by: jhb
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D32724
2021-11-06 10:43:08 +01:00
Kristof Provost
76c5eecc34 pf: Introduce ridentifier
Allow users to set a number on rules which will be exposed as part of
the pflog header.
The intent behind this is to allow users to correlate rules across
updates (remember that pf rules continue to exist and match existing
states, even if they're removed from the active ruleset) and pflog.

Obtained from:	pfSense
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32750
2021-11-05 09:39:56 +01:00
Bjoern A. Zeeb
1a8f198fa6 epair: remove "All rights reserved"
Remove "All rights reserved" from The FreeBSD Foundation owned
copyrights on epair code and documentation.

Approved by:	emaste (FreeBSD Foundation)
2021-11-02 16:50:26 +00:00
Bjoern A. Zeeb
3dd5760aa5 if_epair: rework
Rework if_epair(4) to no longer use netisr and dpcpu.
Instead use mbufq and swi_net.
This simplifies the code and seems to make it work better and
no longer hang.

Work largely by bz@, with minor tweaks by kp@.

Reviewed by:	bz, kp
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D31077
2021-11-02 09:23:46 +01:00
Mateusz Guzik
8f3d786cb3 pf: remove the flags argument from pf_unlink_state
All consumers call it with PF_ENTER_LOCKED.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-01 20:59:14 +01:00
Kristof Provost
62d2dcafb7 if_epair: delete mbuf tags
Remove all (non-persistent) tags when we transmit a packet. Real network
interfaces do not carry any tags either, and leaving tags attached can
produce unexpected results.

Reviewed by:	bz, glebius
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32663
2021-10-28 10:41:16 +02:00
Mark Johnston
426682b05a bpf: Fix the write filter for detached descriptors
A BPF descriptor only has an associated interface descriptor once it is
attached to an interface, e.g., with BIOCSETIF.  Avoid dereferencing a
NULL pointer in filt_bpfwrite() if the BPF descriptor is not attached.

Reviewed by:	ae
Reported by:	syzbot+ae45d5166afe15a5a21d@syzkaller.appspotmail.com
Fixes:	ded77e0237a8 ("Allow the BPF to be select for write.")
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32561
2021-10-26 10:00:39 -04:00
Gleb Smirnoff
c8ee75f231 Use network epoch to protect local IPv4 addresses hash.
The modification to the hash are already naturally locked by
in_control_sx.  Convert the hash lists to CK lists. Remove the
in_ifaddr_rmlock. Assert the network epoch where necessary.

Most cases when the hash lookup is done the epoch is already entered.
Cover a few cases, that need entering the epoch, which mostly is
initial configuration of tunnel interfaces and multicast addresses.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D32584
2021-10-22 14:40:53 -07:00
Gleb Smirnoff
6aae3517ed Retire synchronous PPP kernel driver sppp(4).
The last two drivers that required sppp are cp(4) and ce(4).

These devices are still produced and can be purchased
at Cronyx <http://cronyx.ru/hardware/wan.html>.

Since Roman Kurakin <rik@FreeBSD.org> has quit them, they no
longer support FreeBSD officially.  Later they have dropped
support for Linux drivers to.  As of mid-2020 they don't even
have a developer to maintain their Windows driver.  However,
their support verbally told me that they could provide aid to
a FreeBSD developer with documentaion in case if there appears
a new customer for their devices.

These drivers have a feature to not use sppp(4) and create an
interface, but instead expose the device as netgraph(4) node.
Then, you can attach ng_ppp(4) with help of ports/net/mpd5 on
top of the node and get your synchronous PPP.  Alternatively
you can attach ng_frame_relay(4) or ng_cisco(4) for HDLC.
Actually, last time I used cp(4) back in 2004, using netgraph(4)
instead of sppp(4) was already the right way to do.

Thus, remove the sppp(4) related part of the drivers and enable
by default the negraph(4) part.  Further maintenance of these
drivers in the tree shouldn't be a big deal.

While doing that, remove some cruft and enable cp(4) compilation
on amd64.  The ce(4) for some unknown reason marks its internal
DDK functions with __attribute__ fastcall, which most likely is
safe to remove, but without hardware I'm not going to do that, so
ce(4) remains i386-only.

Reviewed by:		emaste, imp, donner
Differential Revision:	https://reviews.freebsd.org/D32590
See also:		https://reviews.freebsd.org/D23928
2021-10-22 11:41:36 -07:00
Gleb Smirnoff
2144431c11 Remove in_ifaddr_lock acquisiton to access in_ifaddrhead.
An IPv4 address is embedded into an ifaddr which is freed
via epoch. And the in_ifaddrhead is already a CK list. Use
the network epoch to protect against use after free.

Next step would be to CK-ify the in_addr hash and get rid of the...

Reviewed by:		melifaro
Differential Revision:	https://reviews.freebsd.org/D32434
2021-10-13 10:04:46 -07:00
Hartmut Brandt
ded77e0237 Allow the BPF to be select for write. This is needed for boost:asio
which otherwise fails to handle BPFs.
Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D31967
2021-10-10 17:03:51 +02:00
Alexander V. Chernikov
7e64580b5f routing: Use the same index space for both nexthop and nexthop groups.
This simplifies userland object handling along with kernel-level
 nexthop handling in fib algo framework.

MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D32342
2021-10-08 07:58:55 +00:00
Kristof Provost
76c2e71c4c pf: remove unused field from pf_kanchor
The 'match' field is only used in the userspace version of the struct
(pf_anchor).

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-10-07 19:50:22 +02:00
Kristof Provost
5062afff9d pfctl: userspace adaptive syncookies configration
Hook up the userspace bits to configure syncookies in adaptive mode.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32136
2021-09-29 15:11:54 +02:00
Kristof Provost
bf8637181a pf: implement adaptive mode
Use atomic counters to ensure that we correctly track the number of half
open states and syncookie responses in-flight.
This determines if we activate or deactivate syncookies in adaptive
mode.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32134
2021-09-29 15:11:54 +02:00
Kristof Provost
63b3c1c770 pf: support dummynet
Allow pf to use dummynet pipes and queues.

We re-use the currently unused IPFW_IS_DUMMYNET flag to allow dummynet
to tell us that a packet is being re-injected after being delayed. This
is needed to avoid endlessly looping the packet between pf and dummynet.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31904
2021-09-24 11:41:25 +02:00
Arnaud Ysmal
0b92a7fe47 LACP: Do not wait response for marker messages not sent
The error returned when a marker message can not be emitted on a port is not handled.

This cause the lacp to block all emissions until the timeout of 3 seconds is reached.

To fix this issue, I just clear the LACP_PORT_MARK flag when the packet could not be emitted.

Differential revision:	https://reviews.freebsd.org/D30467
Obtained from:		Stormshield
2021-09-23 10:57:11 +02:00
John Baldwin
c782ea8bb5 Add a switch structure for send tags.
Move the type and function pointers for operations on existing send
tags (modify, query, next, free) out of 'struct ifnet' and into a new
'struct if_snd_tag_sw'.  A pointer to this structure is added to the
generic part of send tags and is initialized by m_snd_tag_init()
(which now accepts a switch structure as a new argument in place of
the type).

Previously, device driver ifnet methods switched on the type to call
type-specific functions.  Now, those type-specific functions are saved
in the switch structure and invoked directly.  In addition, this more
gracefully permits multiple implementations of the same tag within a
driver.  In particular, NIC TLS for future Chelsio adapters will use a
different implementation than the existing NIC TLS support for T6
adapters.

Reviewed by:	gallatin, hselasky, kib (older version)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D31572
2021-09-14 11:43:41 -07:00
Mark Johnston
b1746faad6 debugnet: Include some required headers
Don't depend on pollution from net/vnet.h.

PR:		258496
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-14 11:02:45 -04:00
Kristof Provost
b64f7ce98f pf: qid and pqid can be uint16_t
tag2name() returns a uint16_t, so we don't need to use uint32_t for the
qid (or pqid). This reduces the size of struct pf_kstate slightly. That
in turn buys us space to add extra fields for dummynet later.

Happily these fields are not exposed to user space (there are user space
versions of them, but they can just stay uint32_t), so there's no ABI
breakage in modifying this.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31873
2021-09-10 17:07:57 +02:00
Mark Johnston
b1e6a792d6 net: Enter a net epoch around protocol if_up/down notifications
When traversing a list of interface addresses, we need to be in a net
epoch section, and protocol ctlinput routines need a stable reference to
the address.

Reported by:	syzbot+3219af764ead146a3a4e@syzkaller.appspotmail.com
Reviewed by:	kp, melifaro
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31889
2021-09-10 09:07:40 -04:00
Alexander V. Chernikov
4b631fc832 routing: fix source address selection rules for IPv4 over IPv6.
Current logic always selects an IFA of the same family from the
 outgoing interfaces. In IPv4 over IPv6 setup there can be just
 single non-127.0.0.1 ifa, attached to the loopback interface.

Create a separate rt_getifa_family() to handle entire ifa selection
 for the IPv4 over IPv6.

Differential Revision: https://reviews.freebsd.org/D31868
MFC after:	1 week
2021-09-07 21:41:05 +00:00
Kristof Provost
bb25e36e13 pf: remove unused function prototype
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-09-07 16:38:49 +02:00
Kristof Provost
312f5f8a4f altq: mark callouts as mpsafe
There's no reason to acquire the Giant lock while executing the ALTQ
callouts.

While here also remove a few backwards compatibility defines for long
obsolete FreeBSD versions.

Reviewed by:	mav
Suggested by:	mav
Differential Revision:	https://reviews.freebsd.org/D31835
2021-09-04 17:26:10 +02:00
Kristof Provost
4cab80a8df pf: Add counters for syncookies
Count when we send a syncookie, receive a valid syncookie or detect a
synflood.

Reviewed by:	kbowling
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31713
2021-09-01 12:02:19 +02:00
Alexander V. Chernikov
0a3a377aee routing: Disallow zero nexthop weights in nexthop groups.
Adding such nexthops breaks calc_min_mpath_slots() assumptions,
 thus resulting in the incorrect nexthop group creation and
 eventually leading to panic.
Reported by:	avg
MFC after:	1 week
2021-09-01 07:16:24 +00:00
Alexander V. Chernikov
639d7abec6 routing: simplify malloc flags in alloc_nhgrp().
MFC after:	1 week
2021-08-31 08:14:16 +00:00
Alexander V. Chernikov
f84c30106e routing: Fix newly-added rt_get_inet[6]_parent() api.
Correctly handle the case when no default route is present.

Reported by:	Konrad <konrad.kreciwilk at korbank.pl>
2021-08-30 21:10:37 +00:00
Alexander V. Chernikov
d98954e229 routing: Bring back the ability to specify transmit interface via its name.
Some software references outgoing interfaces by specifying name instead of
 index.

Use rti_ifp from rt_addrinfo if provided instead of always using
 address interface when constructing nexthop.

PR: 		255678
Reported by:	martin.larsson2 at gmail.com
MFC after:	1 week
2021-08-29 20:05:14 +00:00
Kristof Provost
2b10cf85f8 pf: Introduce nvlist variant of DIOCGETSTATUS
Make it possible to extend the GETSTATUS call (e.g. when we want to add
new counters, such as for syncookie support) by introducing an
nvlist-based alternative.

MFC after:	1 week
Sponsored by:   Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31694
2021-08-29 14:59:04 +02:00
Luiz Otavio O Souza
eb680a63de if_bridge: add ALTQ support
Similar to the recent addition of ALTQ support to if_vlan.

Reviewed by:	donner
Obtained from:	pfsense
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31675
2021-08-26 11:23:44 +02:00
Luiz Otavio O Souza
2e5ff01d0a if_vlan: add the ALTQ support to if_vlan.
Inspired by the iflib implementation, allow ALTQ to be used with if_vlan
interfaces.

Reviewed by:	donner
Obtained from:	pfsense
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31647
2021-08-25 08:56:45 +02:00