Commit Graph

67 Commits

Author SHA1 Message Date
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Gleb Smirnoff
a6b55ee6be net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.

This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch.  A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.

Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH.  But the tcp_lro_flush_all() asserts the presence
of network epoch.  Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.

Reviewed by:		zlei, gallatin, erj
Differential Revision:	https://reviews.freebsd.org/D39510
2023-04-17 09:08:35 -07:00
Kristof Provost
c69ae84197 if_epair: also remove vlan metadata from mbufs
We already remove mbuf tags from packets transitting an if_epair, but we
didn't remove vlan metadata.
In certain configurations this could lead to unexpected vlan tags
turning up on the rx side.

PR:		270736
Reviewed by:	markj
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D39482
2023-04-10 15:55:35 +02:00
Mark Johnston
29c9b16733 epair: Remove unneeded includes and sort some of the rest
No functional change intended.

MFC after:	1 week
2023-03-13 10:45:35 -04:00
Mark Johnston
df7bbd8c35 epair: Simplify the transmit path and address lost wakeups
epairs currently shuttle all transmitted packets through a single global
taskqueue thread.  To hand packets over to the taskqueue thread, each
epair maintains a pair of ring buffers and a lockless scheme for
notifying the thread of pending work.  The implementation can lead to
lost wakeups, causing to-be-transmitted packets to end up stuck in the
queue.

Rather than extending the existing scheme, simply replace it with a
linked list protected by a mutex, and use the mutex to synchronize
wakeups of the taskqueue thread.  This appears to give equivalent or
better throughput with >= 16 producer threads and eliminates the lost
wakeups.

Reviewed by:	kp
MFC after:	1 week
Sponsored by:	Klara, Inc.
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D38843
2023-03-06 12:49:28 -05:00
Mark Johnston
48227d1c6d epair: Avoid loading m_flags into a short
The m_flags field of struct mbuf is 24 bits wide and so gets truncated
in a couple of places in the epair code.  Instead of preserving the
entire flag set, just remember whether M_BCAST or M_MCAST is set.

MFC after:	1 week
Sponsored by:	Klara, Inc.
2023-03-06 12:39:11 -05:00
Justin Hibbits
2c2b37ad25 ifnet/API: Move struct ifnet definition to a <net/if_private.h>
Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail.  Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.

Reviewed by:	glebius
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046
2023-01-24 14:36:30 -05:00
Kristof Provost
8a299958c1 if_epair: fix build with RSS
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-10-03 17:02:55 +02:00
Alexander V. Chernikov
04a32b802e if_epair: refactor interface creation and enqueue code.
* Factor out queue selection (epair_select_queue()) and mbuf
 preparation (epair_prepare_mbuf()) from epair_menq(). It simplifies
 epair_menq() implementation and reduces the amount of dependencies
 on the neighbouring epair.
* Use dedicated epair_set_state() instead of 2-lines copy-paste
* Factor out unit selection code (epair_handle_unit()) from
 epair_clone_create(). It simplifies the clone creation logic.

Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D36689
2022-09-27 13:34:19 +00:00
Alexander V. Chernikov
91ebcbe02a if_clone: migrate some consumers to the new KPI.
Convert most of the cloner customers who require custom params
 to the new if_clone KPI.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D36636
MFC after:	2 weeks
2022-09-22 12:30:09 +00:00
Alexander V. Chernikov
12aeeb9190 epair: deduplicate interface allocation code #1
Simplify epair_clone_create() and epair_clone_destroy() by
 factoring out epair softc allocation / desctruction and
 interface setup/teardown into separate functions.

Reviewed By: kp, zlei.huang_gmail.com
Differential Revision: https://reviews.freebsd.org/D36614
MFC after:	2 weeks
2022-09-22 09:06:06 +00:00
Kristof Provost
cbbce42345 epair: unbind prior to returning to userspace
If 'options RSS' is set we bind the epair tasks to different CPUs. We
must take care to not keep the current thread bound to the last CPU when
we return to userspace.

MFC after:	1 week
Sponsored by:	Orange Business Services
2022-05-07 18:17:33 +02:00
Kristof Provost
a6b0c8d04d epair: fix set but not used warning
If 'options RSS' is set.

MFC after:	1 week
Sponsored by:	Orange Business Services
2022-05-07 18:17:32 +02:00
Kristof Provost
0bf7acd6b7 if_epair: build fix
66acf7685b failed to build on riscv (and mips). This is because the
atomic_testandset_int() (and friends) functions do not exist there.
Happily those platforms do have the long variant, so switch to that.

PR:		262571
MFC after:	3 days
2022-03-17 06:43:47 +01:00
Michael Gmelin
66acf7685b if_epair: fix race condition on multi-core systems
As an unwanted side effect of the performance improvements in
24f0bfbad5, epair interfaces stop forwarding traffic on higher
load levels when running on multi-core systems.

This happens due to a race condition in the logic that decides when to
place work in the task queue(s) responsible for processing the content
of ring buffers.

In order to fix this, a field named state is added to the epair_queue
structure. This field is used by the affected functions to signal each
other that something happened in the underlying ring buffers that might
require work to be scheduled in task queue(s), replacing the existing
logic, which relied on checking if ring buffers are empty or not.

epair_menq() does:
  - set BIT_MBUF_QUEUED
  - queue mbuf
  - if testandset BIT_QUEUE_TASK:
      enqueue task

epair_tx_start_deferred() does:
  - swap ring buffers
  - process mbufs
  - clear BIT_QUEUE_TASK
  - if testandclear BIT_MBUF_QUEUED
      enqueue task

PR:		262571
Reported by:	Johan Hendriks <joh.hendriks@gmail.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D34569
2022-03-16 23:08:55 +01:00
Santiago Martinez
52bcdc5b80 if_epair: fix build with RSS and INET or INET6 disabled
Reviewed by:	kp
MFC after:	1 week
2022-03-03 18:31:26 +01:00
Li-Wen Hsu
7442b63231
if_epair: Use ANSI C definition
This fixes -Werror=strict-prototypes from gcc9

Sponsored by:	The FreeBSD Foundation
2022-02-15 21:45:22 +08:00
Kristof Provost
24f0bfbad5 if_epair: implement fanout
Allow multiple cores to be used to process if_epair traffic. We do this
(if RSS is enabled) based on the RSS hash of the incoming packet. This
allows us to distribute the load over multiple cores, rather than
sending everything to the same one.

We also switch from swi_sched() to taskqueues, which also contributes to
better throughput.

Benchmark results:
With net.isr.maxthreads=-1

Setup A: (cc0 - bridge0 - epair0a) (epair0b - bridge1 - cc1)

Before          627 Kpps
After (no RSS)  1.198 Mpps
After (RSS)     3.148 Mpps

Setup B: (cc0 - bridge0 - epaira0) (epair0b - vnet jail - epair1a) (epair1b - bridge1 - cc1)

Before          7.705 Kpps
After (no RSS)  1.017 Mpps
After (RSS)     2.083 Mpps

MFC after:	3 weeks
Sponsored by:	Orange Business Services
Differential Revision:	https://reviews.freebsd.org/D33731
2022-02-15 09:03:24 +01:00
Zhenlei Huang
73d41cc730 if_epair: Also mark the flag of pair b with IFF_KNOWSEPOCH
Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33210
2021-12-01 15:54:23 +01:00
Mateusz Guzik
2cedfc3f7e if_epair: ifdef vars only used with ALTQ
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:28:54 +00:00
Bjoern A. Zeeb
1a8f198fa6 epair: remove "All rights reserved"
Remove "All rights reserved" from The FreeBSD Foundation owned
copyrights on epair code and documentation.

Approved by:	emaste (FreeBSD Foundation)
2021-11-02 16:50:26 +00:00
Bjoern A. Zeeb
3dd5760aa5 if_epair: rework
Rework if_epair(4) to no longer use netisr and dpcpu.
Instead use mbufq and swi_net.
This simplifies the code and seems to make it work better and
no longer hang.

Work largely by bz@, with minor tweaks by kp@.

Reviewed by:	bz, kp
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D31077
2021-11-02 09:23:46 +01:00
Kristof Provost
62d2dcafb7 if_epair: delete mbuf tags
Remove all (non-persistent) tags when we transmit a packet. Real network
interfaces do not carry any tags either, and leaving tags attached can
produce unexpected results.

Reviewed by:	bz, glebius
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D32663
2021-10-28 10:41:16 +02:00
Kristof Provost
7f883a9b5b net: Revert vnet/epair cleanup race mitigation
Revert the mitigation code for the vnet/epair cleanup race (done in r365457).
r368237 introduced a more reliable fix.

MFC after:	2 weeks
Sponsored by:	Modirum MDPay
2020-12-01 16:34:43 +00:00
Kristof Provost
a969635b83 net: mitigate vnet / epair cleanup races
There's a race where dying vnets move their interfaces back to their original
vnet, and if_epair cleanup (where deleting one interface also deletes the other
end of the epair). This is commonly triggered by the pf tests, but also by
cleanup of vnet jails.

As we've not yet been able to fix the root cause of the issue work around the
panic by not dereferencing a NULL softc in epair_qflush() and by not
re-attaching DYING interfaces.

This isn't a full fix, but makes a very common panic far less likely.

PR:		244703, 238870
Reviewed by:	lutz_donnerhacke.de
MFC after:	4 days
Differential Revision:	https://reviews.freebsd.org/D26324
2020-09-08 14:54:10 +00:00
Mateusz Guzik
662c13053f net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Kristof Provost
b02fd8b790 epair: Do not abuse params to register the second interface
if_epair used the 'params' argument to pass a pointer to the b interface
through if_clone_create().
This pointer can be controlled by userspace, which means it could be abused to
trigger a panic. While this requires PRIV_NET_IFCREATE
privileges those are assigned to vnet jails, which means that vnet jails
could panic the system.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	3 days
2020-01-28 22:44:24 +00:00
Andrew Turner
5f901c92a8 Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by:	bz
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16147
2018-07-24 16:35:52 +00:00
Eugene Grosbein
804771f553 epair(4): make sure we do not duplicate MAC addresses
in case of reused if_index.

PR:		229957
Tested by:	O. Hartmann <ohartmann@walstatt.org>
Approved by:	avg (mentor)
2018-07-23 07:11:58 +00:00
Luca Pizzamiglio
11d416666d Improve MAC address uniqueness on if_epair(4).
As reported in PR184149, it can happen that epair devices can have the same
MAC address.
This solution is based on a 32-bit hash, obtained combining the if_index of
the a interface and the hostid.
If the hostid is zero, a random number is used.

PR:		184149
Reviewed by:	wollman, eugen
Approved by:	cognet
Differential Revision:	https://reviews.freebsd.org/D15329
2018-05-23 13:10:57 +00:00
Matt Macy
46d0f824be net: fix set but not used 2018-05-19 05:27:49 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Kristof Provost
85f330e5fa epair: Fix panic on unload
The VNET_SYSUNINIT() callback is executed after the MOD_UNLOAD. That means
that netisr_unregister() has already been called when
netisr_unregister_vnet() gets calls, leading to an assertion failure.

Restore the expected order of operations by performing everything that
was done in MOD_UNLOAD to a SYSUNINIT() (that will be called after the
VNET_SYSUNINIT()).

Differential Revision:	https://reviews.freebsd.org/D12771
2017-11-01 14:27:26 +00:00
Patrick Kelsey
75580d5881 Fixed typo in comment found while reading commit email for fix of
other typo in same comment.

ned -> need

MFC after:	3 days
2017-04-08 04:50:50 +00:00
Patrick Kelsey
59f35a8290 Fixed typo in comment.
patckets -> packets

MFC after:	3 days
2017-04-08 04:45:52 +00:00
Ermal Luçi
4e950412ff Correct handling of ALTQ with epair(4) interfaces but presenting that ALTQ(9) is supported.
Approved by:	ae
MFC after:	2 weeks
2017-03-24 00:55:16 +00:00
Warner Losh
607a4c520e Back out r314471. In https://reviews.freebsd.org/D1858 it was clear
that this shouldn't go in. I was unaware when I merged the pull
request. I don't wish to upset the status quo, so backout per
project practice.

Pull Request:	https://github.com/freebsd/freebsd/pull/92
Noted by:	hrs@
2017-03-01 05:38:04 +00:00
Warner Losh
7d85b06ecf Fix VNET - DAD detected duplicate IPv6 address
Assign a hopefully unique, locally administered etheraddr. - for
epairNa & epairNb

Submitted by:	Catalin <sslevil@users.noreply.github.com>
Pull Request:	https://github.com/freebsd/freebsd/pull/92
2017-03-01 04:47:22 +00:00
Andrey V. Elsukov
fdf95c0b81 Teach netisr_get_cpuid() to limit a given value to supported by netisr.
Use netisr_get_cpuid() in netisr_select_cpuid() to limit cpuid value
returned by protocol to be sure that it is not greather than nws_count.

PR:		211836
Reviewed by:	adrian
MFC after:	3 days
2016-08-17 20:21:33 +00:00
Bjoern A. Zeeb
89856f7e2d Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by:		re (hrs)
Obtained from:		projects/vnet
Reviewed by:		gnn, jhb
Sponsored by:		The FreeBSD Foundation
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D6747
2016-06-21 13:48:49 +00:00
Bjoern A. Zeeb
484149def8 Introduce a per-VNET flag to enable/disable netisr prcessing on that VNET.
Add accessor functions to toggle the state per VNET.
The base system (vnet0) will always enable itself with the normal
registration. We will share the registered protocol handlers in all
VNETs minimising duplication and management.
Upon disabling netisr processing for a VNET drain the netisr queue from
packets for that VNET.

Update netisr consumers to (de)register on a per-VNET start/teardown using
VNET_SYS(UN)INIT functionality.

The change should be transparent for non-VIMAGE kernels.

Reviewed by:	gnn (, hiren)
Obtained from:	projects/vnet
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D6691
2016-06-03 13:57:10 +00:00
Pedro F. Giffuni
a4641f4eaa sys/net*: minor spelling fixes.
No functional change.
2016-05-03 18:05:43 +00:00
Gleb Smirnoff
8ec07310fa These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
Hiroki Sato
18e199ad72 Fix a panic which was reproducible by an infinite loop of
"ifconfig epair0 create && ifconfig epair0a destroy".

This was caused by an uninitialized function pointer in
softc->media.
2015-09-02 16:30:45 +00:00
Hiroki Sato
3c3136b1dd Virtualize if_epair(4). An if_xname check for both "a" and "b" interfaces
is added to return EEXIST when only "b" interface exists---this can happen
when epair<N>b is moved to a vnet jail and then "ifconfig epair<N> create"
is invoked there.
2014-10-10 06:45:13 +00:00
Gleb Smirnoff
3751dddb3e Mechanically convert to if_inc_counter(). 2014-09-19 10:39:58 +00:00
Gleb Smirnoff
56b61ca27a Remove ifq_drops from struct ifqueue. Now queue drops are accounted in
struct ifnet if_oqdrops.

Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops
is simply removed from them. There were no API to read this statistic.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-09-19 09:01:19 +00:00
Gleb Smirnoff
b245f96c44 Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
  notion of data (packet counters, etc) are by no means MD. And it is a
  bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
  which at modern speeds overflow within a second.

  This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bit for the ifi_datalen field. This field was provided to
  make future changes to if_data less ABI breaking. Unfortunately the
  8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped.

Discussed with:	emax
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-03-13 03:42:24 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00