Commit Graph

32 Commits

Author SHA1 Message Date
Justin Hibbits
2c2b37ad25 ifnet/API: Move struct ifnet definition to a <net/if_private.h>
Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail.  Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.

Reviewed by:	glebius
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046
2023-01-24 14:36:30 -05:00
Zhenlei Huang
de1ea2d517 if_vxlan(4): Correct the statistic for output bytes
The vxlan interface encapsulates the Ethernet frame by prepending IP/UDP
and vxlan headers. For statistics, only the payload, i.e. the
encapsulated (inner) frame should be counted.

Event:		Aberdeen Hackathon 2022
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D36855
2022-10-07 13:45:16 +02:00
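The accounting rule described in de1ea2d517 can be sketched in userland C. This is only an illustration, not the driver code: the struct, the function name, and the fixed IPv4 header sizes are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed outer header sizes for an IPv4 underlay without IP options. */
enum {
	SKETCH_OUTER_IPV4_LEN = 20,
	SKETCH_OUTER_UDP_LEN = 8,
	SKETCH_VXLAN_LEN = 8
};

/* Hypothetical stand-in for the interface output counters. */
struct sketch_tx_counters {
	uint64_t opackets;
	uint64_t obytes;
};

/*
 * Account for one transmitted frame and return the encapsulated length.
 * Only the inner Ethernet frame is added to obytes; the prepended
 * IP/UDP/VXLAN headers are not counted.
 */
static size_t
sketch_tx_account(struct sketch_tx_counters *c, size_t inner_len)
{
	size_t encap_len = inner_len + SKETCH_OUTER_IPV4_LEN +
	    SKETCH_OUTER_UDP_LEN + SKETCH_VXLAN_LEN;

	c->opackets += 1;
	c->obytes += inner_len;
	return (encap_len);
}
```

Recording the length before the headers are prepended keeps the counter independent of the underlay address family.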
Zhenlei Huang
1fc839f489 if_vxlan(4): Add missing statistic for input packets
Event:		Aberdeen hackathon 2022
Reviewed by:	bryanv, kp
Differential Revision:	https://reviews.freebsd.org/D36841
2022-10-05 12:38:30 +02:00
Zhenlei Huang
8707cb19e6 if_vxlan(4): Check the size of data available in mbuf before using them
PR:		261711
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D36794
2022-09-30 09:56:15 +02:00
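The idea behind 8707cb19e6, checking how much data is available before reading a header, might look like the following in userland C. The kernel works on mbufs rather than flat buffers, so the function name and the buffer/offset interface here are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_VXLAN_HDR_SIZE 8

/*
 * Read the 24-bit VNI from a VXLAN header located at byte offset `off`
 * in `buf`. Returns false, without touching memory, when fewer than
 * SKETCH_VXLAN_HDR_SIZE bytes are available at that offset.
 */
static bool
sketch_read_vni(const uint8_t *buf, size_t len, size_t off, uint32_t *vni)
{
	const uint8_t *p;

	/* Test off first so that len - off cannot wrap around. */
	if (off > len || len - off < SKETCH_VXLAN_HDR_SIZE)
		return (false);

	p = buf + off;
	*vni = ((uint32_t)p[4] << 16) | ((uint32_t)p[5] << 8) | p[6];
	return (true);
}
```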
Alexander V. Chernikov
91ebcbe02a if_clone: migrate some consumers to the new KPI.
Convert most of the cloner consumers that require custom params
to the new if_clone KPI.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D36636
MFC after:	2 weeks
2022-09-22 12:30:09 +00:00
Zhenlei Huang
7f7a804ae0 vxlan: Add support for socket ioctls SIOC[SG]TUNFIB
Submitted by: Luiz Amaral <email@luiz.eng.br>
PR: 244004
Differential Revision:	https://reviews.freebsd.org/D32820
MFC after:	2 weeks
2022-07-08 18:14:19 +00:00
Kristof Provost
742e7210d0 udp: allow udp_tun_func_t() to indicate it did not eat the packet
Allow udp tunnel functions to indicate they have not taken ownership of
the packet, and that normal UDP processing should continue.

This is especially useful for scenarios where the kernel has taken
ownership of a socket that was originally created by userspace. It
allows the tunnel function to pass through certain packets for userspace
processing.

The primary user of this is if_ovpn, when it receives messages from
unknown peers (which might be a new client).

Reviewed by:	tuexen
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D34883
2022-04-12 10:04:59 +02:00
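The contract described in 742e7210d0 can be sketched as a callback that reports whether it consumed the packet. The types and names below are hypothetical simplifications; the real udp_tun_func_t takes kernel arguments that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet: only the field the example needs. */
struct sketch_pkt {
	int peer_id;
};

/* Returns true if the tunnel took ownership of the packet. */
typedef bool (*sketch_tun_func_t)(struct sketch_pkt *, void *);

enum sketch_verdict {
	SKETCH_TUNNEL_CONSUMED,
	SKETCH_DELIVER_TO_SOCKET
};

/* Offer the packet to the tunnel first; fall back to normal delivery. */
static enum sketch_verdict
sketch_udp_input(struct sketch_pkt *p, sketch_tun_func_t fn, void *ctx)
{
	if (fn != NULL && fn(p, ctx))
		return (SKETCH_TUNNEL_CONSUMED);
	return (SKETCH_DELIVER_TO_SOCKET);
}

/* Example tunnel: handles one known peer, passes everything else through. */
static bool
sketch_known_peer_only(struct sketch_pkt *p, void *ctx)
{
	const int *known_peer = ctx;

	return (p->peer_id == *known_peer);
}
```

A packet from an unknown peer is left for the userspace socket, which matches the if_ovpn use case named in the commit message.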
Gordon Bergling
bef80a7285 vxlan(4): Fix two typos in sysctl descriptions
- s/fowarding/forwarding/

MFC after:	3 days
2022-03-28 19:35:34 +02:00
Aleksandr Fedorov
ceaf442ff2 if_vxlan(4): Allow netmap_generic to intercept RX packets.
Netmap (generic) intercepts the if_input method to handle RX packets.

Call ifp->if_input() instead of netisr_dispatch().
Add stricter check for incoming packet length.

This change is very useful with bhyve + vale + if_vxlan.

Reviewed by:	vmaffione (mentor), kib, np, donner
Approved by:	vmaffione (mentor), kib, np, donner
MFC after:	2 weeks
Sponsored by:	vstack.com
Differential Revision:	https://reviews.freebsd.org/D30638
2022-02-06 15:27:46 +03:00
Bjoern A. Zeeb
a3c2c06bc9 Make LINT NOINET and NOIP kernel builds warning free.
Apply #ifdef INET or #if defined(INET6) || defined(INET) to make
universe NOINET and NOIP LINT kernels warning free as well again.
2021-06-06 14:03:06 +00:00
Konstantin Belousov
baacf70137 vxlan: correct interface MTU when using hw offloads
Otherwise it breaks when offloads such as checksum or TSO are used,
because the second (encapsulated) ip_output() pass hands fragments of
the encapsulated packet down to the hardware interface.

Diagnosed by:	hselasky
Reviewed by:	np
Sponsored by:	Nvidia Networking / Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29501
2021-03-31 14:38:26 +03:00
Konstantin Belousov
e243367b64 mbuf: add a way to mark flowid as calculated from the internal headers
In some settings offload might calculate hash from decapsulated packet.
Reserve a bit in packet header rsstype to indicate that.

Add m_adj_decap() that acts similarly to m_adj(), but also either clears
the flowid if it is not marked as inner, or transfers it to the
decapsulated header, clearing the inner indicator. It depends on the
internals of m_adj(), which reuses the argument packet header for the result.

Use m_adj_decap() for decapsulating vxlan(4) and gif(4) input packets.

Reviewed by:	ae, hselasky, np
Sponsored by:	Nvidia Networking / Mellanox Technologies
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28773
2021-03-31 14:38:26 +03:00
Konstantin Belousov
994e47023a vxlan: stop checking CSUM_ENCAP_VXLAN when converting inner CSUM flags into normal, for decapsulation.
The packet, if processed at this point, was already parsed to be UDP
directed to a vxlan port.

Connect-X 4+ does not provide an easy way to infer which parser
processed the packet, so the driver cannot set the flag without
considerable effort spent only to satisfy a formal requirement.

Reviewed by:	bryanv, np
Sponsored by:	Mellanox Technologies/NVidia Networking
Differential revision:	https://reviews.freebsd.org/D27449
MFC after:	1 week
2020-12-23 10:54:06 +02:00
Navdeep Parhar
610d345953 if_vxlan(4): csum_flags_to_inner_flags takes the tunnel protocol as a parameter.
No functional change.
2020-10-22 17:05:55 +00:00
Navdeep Parhar
b092fd6c97 if_vxlan(4): add support for hardware assisted checksumming, TSO, and RSS.
This lets a VXLAN pseudo-interface take advantage of hardware checksumming (tx
and rx), TSO, and RSS if the NIC is capable of performing these operations on
inner VXLAN traffic.

A VXLAN interface inherits the capabilities of its vxlandev interface if one is
specified or of the interface that hosts the vxlanlocal address. If other
interfaces will carry traffic for that VXLAN then they must have the same
hardware capabilities.

On transmit, if_vxlan verifies that the outbound interface has the required
capabilities and then translates the CSUM_ flags to their inner equivalents.
This tells the hardware ifnet that it needs to operate on the inner frame and
not the outer VXLAN headers.

An event is generated when a VXLAN ifnet starts. This allows hardware drivers to
configure their devices to expect VXLAN traffic on the specified incoming port.

On receive, the hardware does RSS and checksum verification on the inner frame.
if_vxlan now does a direct netisr dispatch to take full advantage of RSS. It is
not very clear why it didn't do this already.

Future work:
Rx: it should be possible to avoid the first trip up the protocol stack to get
the frame to if_vxlan just so it can decapsulate and requeue for a second trip
up the stack. The hardware NIC driver could directly call an if_vxlan receive
routine for VXLAN traffic instead.

Rx: LRO. Depends on what happens with the previous item. There will have to
be a mechanism to indicate that it's time for if_vxlan to flush its LRO state.

Reviewed by:	kib@
Relnotes:	Yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25873
2020-09-18 02:37:57 +00:00
Qing Li
a5154bb2e5 Correct the mask byte order when checking for reserved bits.
Reviewed by:	gnn
Approved by:	gnn
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D26071
2020-08-15 16:48:58 +00:00
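A minimal sketch of the kind of check a5154bb2e5 refers to, assuming the header is kept in network byte order as two 32-bit words laid out per RFC 7348. The struct and macro names are made up for this example and are not the driver's identifiers.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-order constants; they must be converted before use on the wire. */
#define SKETCH_FLAG_VALID_VNI	0x08000000u	/* "I" flag, other bits reserved */
#define SKETCH_VNI_FIELD_MASK	0xFFFFFF00u	/* VNI in the upper 24 bits */

/* Both words hold network byte order, as received. */
struct sketch_vxlan_hdr {
	uint32_t flags;
	uint32_t vni;
};

/*
 * Accept the header only if the "I" flag is the sole flag bit set and
 * the reserved low byte of the VNI word is zero. The masks are
 * converted with htonl() so they line up with the wire-order fields
 * on both little- and big-endian hosts.
 */
static bool
sketch_hdr_ok(const struct sketch_vxlan_hdr *h)
{
	if (h->flags != htonl(SKETCH_FLAG_VALID_VNI))
		return (false);
	if ((h->vni & ~htonl(SKETCH_VNI_FIELD_MASK)) != 0)
		return (false);
	return (true);
}

/* Extract the VNI as a host-order integer. */
static uint32_t
sketch_hdr_vni(const struct sketch_vxlan_hdr *h)
{
	return (ntohl(h->vni) >> 8);
}
```

Applying a host-order mask directly to a network-order field would test the wrong byte on little-endian machines, which is the class of mistake the commit title describes.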
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Kirill Ponomarev
b7592822d5 Allow setting the MTU to more than 1500 bytes.
Submitted by:	Alexandr Fedorov <aleksandr.fedorov_itglobal_dot_com>
Approved by:	jhb, rgrimes
Sponsored by:	ITGlobal.com
Differential Revision:	https://reviews.freebsd.org/D19422
2019-07-24 16:10:20 +00:00
Hans Petter Selasky
59854ecf55 Convert all IPv4 and IPv6 multicast memberships into using a STAILQ
instead of a linear array.

The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.

To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
becomes fixed during their lifetime, and use-after-free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi

All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.

This patch has been tested by pho@.

Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by:	markj@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:54:41 +00:00
Kyle Evans
3c3aa8c170 net: adjust randomized address bits
Give devices that need a MAC a 16-bit allocation out of the FreeBSD
Foundation OUI range. Change the name ether_fakeaddr to ether_gen_addr now
that we're dealing with real MAC addresses with a real OUI rather than random
locally-administered addresses.

Reviewed by:	bz, rgrimes
Differential Revision:	https://reviews.freebsd.org/D19587
2019-04-17 17:18:43 +00:00
Kyle Evans
6b7e0c1cca ether: centralize fake hwaddr generation
We currently have two places with identical fake hwaddr generation --
if_vxlan and if_bridge. Lift it into if_ethersubr for reuse in other
interfaces that may also need a fake addr.

Reviewed by:	bryanv, kp, philip
Differential Revision:	https://reviews.freebsd.org/D19573
2019-03-14 17:18:00 +00:00
Matt Macy
46d0f824be net: fix set but not used
2018-05-19 05:27:49 +00:00
Bryan Venteicher
ac2b436d20 Add macro for vxlan list mutex lock and unlock
This will simplify some later VNET support.

Submitted by:	hrs
MFC after:	2 weeks
2017-12-30 19:49:40 +00:00
Bryan Venteicher
6d7bc5838b Advertise IFCAP_LINKSTAT after r326480 added link status support
MFC after:	2 weeks
2017-12-30 19:35:12 +00:00
Bryan Venteicher
33e0d8f057 Add support for IPv6 scoped addresses to vxlan
MFC after:	2 weeks
2017-12-30 04:03:53 +00:00
Andrey V. Elsukov
0e253fd12c Fix possible memory leak.
vxlan_ftable entries are sorted in ascending order. Because of the wrong
argument order, the search can stop before an existing element is found.
A new element is then allocated in vxlan_ftable_update_locked() and can
be inserted into the list a second time, or trigger an MPASS() assertion
when INVARIANTS is enabled.

PR:		224371
MFC after:	1 week
2017-12-16 14:36:21 +00:00
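The failure mode described in 0e253fd12c can be reproduced with a small sorted list. This is a sketch only: the entry type, the comparison helper, and both lookup functions are hypothetical and do not mirror the driver's data structures.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAC_LEN 6

struct sketch_fentry {
	unsigned char mac[SKETCH_MAC_LEN];
	struct sketch_fentry *next;	/* list is sorted ascending by mac */
};

/* Negative, zero, or positive as a sorts before, equal to, or after b. */
static int
sketch_mac_cmp(const unsigned char *a, const unsigned char *b)
{
	return (memcmp(a, b, SKETCH_MAC_LEN));
}

/* Correct lookup: stop once the key sorts before the current entry. */
static struct sketch_fentry *
sketch_lookup(struct sketch_fentry *head, const unsigned char *mac)
{
	for (struct sketch_fentry *e = head; e != NULL; e = e->next) {
		int d = sketch_mac_cmp(mac, e->mac);	/* key, then entry */

		if (d == 0)
			return (e);
		if (d < 0)
			break;	/* key cannot appear later in the list */
	}
	return (NULL);
}

/* Same loop with the arguments swapped: the early exit fires too soon. */
static struct sketch_fentry *
sketch_lookup_swapped(struct sketch_fentry *head, const unsigned char *mac)
{
	for (struct sketch_fentry *e = head; e != NULL; e = e->next) {
		int d = sketch_mac_cmp(e->mac, mac);	/* entry, then key */

		if (d == 0)
			return (e);
		if (d < 0)
			break;	/* wrongly stops at any smaller entry */
	}
	return (NULL);
}
```

With the swapped arguments, any key larger than the first entry is reported as missing, so a caller that inserts on a miss would add a duplicate.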
Bryan Venteicher
93a5a3b019 Add if media and link status events to vxlan
PR:		214359
MFC after:	2 weeks
2017-12-02 22:04:00 +00:00
Sepherosa Ziehau
36ad8372d4 net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties
Reviewed by:	hps, erj, tuexen
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6688
2016-06-07 04:51:50 +00:00
Randall Stewart
abb901c5d7 Complete the UDP tunneling of ICMP messages to those protocols
that tunnel over UDP and are interested in finding out about the
ICMP (tested by Michael Tuexen with SCTP, which will soon be using
this feature).

Differential Revision:	http://reviews.freebsd.org/D5875
2016-04-28 15:53:10 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but no longer has any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
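The transmit-side check described in c25290420e, testing the hash type instead of a separate flag, might be sketched as follows. The struct, the enum values, and the queue-selection helper are assumptions for illustration; the real macros live in "sys/mbuf.h" and their numeric values are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hash types; the numeric values are placeholders. */
enum sketch_hashtype {
	SKETCH_HASHTYPE_NONE = 0,	/* flowid carries no information */
	SKETCH_HASHTYPE_OPAQUE = 1	/* flowid valid, no particular type */
};

/* Hypothetical stand-in for the relevant packet header fields. */
struct sketch_pkthdr {
	uint32_t flowid;
	uint8_t rsstype;
};

/* The flowid is usable whenever the hash type is anything but NONE. */
static bool
sketch_flowid_valid(const struct sketch_pkthdr *h)
{
	return (h->rsstype != SKETCH_HASHTYPE_NONE);
}

/*
 * Pick a transmit queue: use the flowid when it is valid, otherwise a
 * caller-supplied fallback. nqueues must be non-zero.
 */
static unsigned
sketch_pick_txq(const struct sketch_pkthdr *h, unsigned nqueues,
    unsigned fallback)
{
	if (sketch_flowid_valid(h))
		return (h->flowid % nqueues);
	return (fallback % nqueues);
}
```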
Bryan Venteicher
854f7e89e6 Use the size of the Ethernet address, not the entire header, when
copying into forwarding entry.

Reported by:	Coverity
CID:		1248849
2014-10-21 05:45:57 +00:00
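The class of bug fixed in 854f7e89e6 is easy to show in isolation. In this sketch the forwarding-entry layout and the function name are invented; only the 6-byte address and 14-byte header sizes are standard Ethernet facts.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_ADDR_LEN	6	/* one MAC address */
#define SKETCH_ETHER_HDR_LEN	14	/* dst + src + ethertype */

/* Hypothetical forwarding entry: the MAC is followed by other state. */
struct sketch_fwd_entry {
	uint8_t mac[SKETCH_ETHER_ADDR_LEN];
	uint16_t flags;
};

/*
 * Copy a MAC address into the entry. Sizing the copy by the destination
 * array (the address length) keeps it inside fe->mac; sizing it by
 * SKETCH_ETHER_HDR_LEN would overrun into the fields that follow.
 */
static void
sketch_entry_set_mac(struct sketch_fwd_entry *fe, const uint8_t *mac)
{
	memcpy(fe->mac, mac, sizeof(fe->mac));
}
```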
Bryan Venteicher
007054f070 Add vxlan interface
vxlan creates a virtual LAN by encapsulating the inner Ethernet frame in
a UDP packet. This implementation is based on RFC7348.

Currently, the IPv6 support is not fully compliant with the specification:
we should be able to receive UDPv6 packets with a zero checksum, but we
need to support RFC6935 first. Patches for this should come soon.

Encapsulation protocols such as vxlan emphasize the need for the FreeBSD
network stack to support batching, GRO, and GSO. Each frame has to make
two trips through the network stack, and each frame will be at most MTU
sized. Performance suffers accordingly.

Some latest generation NICs have begun to support vxlan HW offloads that
we should also take advantage of. VIMAGE support should also be added soon.

Differential Revision:	https://reviews.freebsd.org/D384
Reviewed by:	gnn
Relnotes:	yes
2014-10-20 14:42:42 +00:00
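For orientation, the 8-byte VXLAN header that 007054f070 prepends to the inner Ethernet frame can be written out byte by byte following the layout in RFC 7348. The function and macro names below are made up for this sketch and are not the driver's.

```c
#include <stdint.h>

#define SKETCH_VXLAN_HEADER_LEN	8
#define SKETCH_VXLAN_FLAG_I	0x08	/* "VNI present" flag */

/*
 * Fill an 8-byte VXLAN header for the given 24-bit VNI:
 *   byte 0     flags (only the I bit set)
 *   bytes 1-3  reserved, zero
 *   bytes 4-6  VNI, most significant byte first
 *   byte 7     reserved, zero
 * Bits of vni above the low 24 are ignored.
 */
static void
sketch_vxlan_hdr_write(uint8_t out[SKETCH_VXLAN_HEADER_LEN], uint32_t vni)
{
	out[0] = SKETCH_VXLAN_FLAG_I;
	out[1] = 0;
	out[2] = 0;
	out[3] = 0;
	out[4] = (uint8_t)((vni >> 16) & 0xFF);
	out[5] = (uint8_t)((vni >> 8) & 0xFF);
	out[6] = (uint8_t)(vni & 0xFF);
	out[7] = 0;
}
```

On the wire this header follows the outer UDP header and precedes the inner Ethernet frame.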