freebsd-skq/sys/net
Hans Petter Selasky f3e7afe2d7 Implement kernel support for hardware rate limited sockets.
- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.

- Add support for hardware driven, Receive Side Scaling (RSS) aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API supports rates in
the range from 1 to 4Gbytes/s, which is suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.

- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().

- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.

- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.

- How rate limiting works:

1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.

2) When the network driver sees that "m->m_pkthdr.snd_tag" is not
NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to each individual driver how the
rate limit is enforced on that traffic.

3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag does not match
that of the network interface. The network adapter frees the mbuf and
returns EAGAIN, which causes ip_output() to release and clear the send
tag. On the next ip_output() call, allocation of a new "snd_tag" is attempted.

4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
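
The four steps above can be sketched as a small userland model. This is
only an illustration, not the kernel code from this commit: every toy_*
name is hypothetical, each interface is assumed to own a single rate
limited queue, and a rate change on an existing tag is modelled as an
in-place update rather than a real driver call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's ifnet/m_snd_tag/inpcb. */
struct toy_ifnet;

struct toy_snd_tag {
	struct toy_ifnet *ifp;	/* interface owning the rate limited queue */
	uint32_t rate;		/* bytes per second */
};

struct toy_ifnet {
	struct toy_snd_tag slot;	/* assumption: one queue per interface */
	int in_use;
};

struct toy_pcb {
	uint32_t wanted_rate;		/* stored by the setsockopt() path, 0 = none */
	struct toy_snd_tag *snd_tag;	/* NULL while no queue is bound */
};

/* Step 1: non-blocking allocation; fails instead of waiting. */
static struct toy_snd_tag *
toy_tag_alloc(struct toy_ifnet *ifp, uint32_t rate)
{
	if (ifp->in_use)
		return (NULL);
	ifp->in_use = 1;
	ifp->slot.ifp = ifp;
	ifp->slot.rate = rate;
	return (&ifp->slot);
}

/* Step 4 (and the release half of step 3): give the queue back. */
static void
toy_tag_free(struct toy_pcb *pcb)
{
	if (pcb->snd_tag == NULL)
		return;
	assert(pcb->snd_tag->ifp->in_use);
	pcb->snd_tag->ifp->in_use = 0;
	pcb->snd_tag = NULL;
}

/* Steps 2 and 3: the driver accepts the tag or reports a route change. */
static int
toy_if_transmit(struct toy_ifnet *ifp, const struct toy_snd_tag *tag)
{
	if (tag != NULL && tag->ifp != ifp)
		return (EAGAIN);	/* tag belongs to another interface */
	return (0);			/* queued; rate limited if tag != NULL */
}

/* Transmit path: check for rate changes, send, clear the tag on EAGAIN. */
static int
toy_output(struct toy_pcb *pcb, struct toy_ifnet *ifp)
{
	int error;

	if (pcb->wanted_rate == 0)
		toy_tag_free(pcb);
	else if (pcb->snd_tag == NULL)
		pcb->snd_tag = toy_tag_alloc(ifp, pcb->wanted_rate);
	else if (pcb->snd_tag->rate != pcb->wanted_rate)
		pcb->snd_tag->rate = pcb->wanted_rate;	/* simplified rate update */

	error = toy_if_transmit(ifp, pcb->snd_tag);
	if (error == EAGAIN)
		toy_tag_free(pcb);
	return (error);
}

/* Walks through allocate, route change, re-allocate and detach. */
static int
toy_demo(void)
{
	struct toy_ifnet a = { .in_use = 0 }, b = { .in_use = 0 };
	struct toy_pcb pcb = { .wanted_rate = 1000000, .snd_tag = NULL };

	if (toy_output(&pcb, &a) != 0 || pcb.snd_tag != &a.slot)
		return (1);
	if (toy_output(&pcb, &b) != EAGAIN || pcb.snd_tag != NULL || a.in_use)
		return (2);
	if (toy_output(&pcb, &b) != 0 || pcb.snd_tag != &b.slot)
		return (3);
	toy_tag_free(&pcb);	/* PCB detach */
	return (b.in_use ? 4 : 0);
}
```

Calling toy_demo() from a main() exercises the whole sequence: the first
send binds a queue on interface a, the send routed to b is rejected with
EAGAIN and clears the tag, the following send binds a queue on b, and the
final free models the PCB detach.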

Reviewed by:		wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision:	https://reviews.freebsd.org/D3687
Sponsored by:		Mellanox Technologies
MFC after:		3 months
2017-01-18 13:31:17 +00:00
..
altq Remove an alias if_list, use if_link consistently. 2016-10-06 00:51:27 +00:00
bpf_buffer.c Remove the sleep from the buffer allocation routine. 2015-07-31 20:25:54 +00:00
bpf_buffer.h
bpf_filter.c Implement BPF_MOD and BPF_XOR instructions. 2016-10-21 06:55:07 +00:00
bpf_jitter.c
bpf_jitter.h
bpf_zerocopy.c
bpf_zerocopy.h
bpf.c Hide the boottime and bootimebin globals, provide the getboottime(9) 2016-07-27 11:08:59 +00:00
bpf.h sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
bpfdesc.h
bridgestp.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
bridgestp.h
ethernet.h net/vlan: Shift for pri is 13 (pri mask 0xe000) not 1. 2016-09-01 06:32:35 +00:00
fddi.h
firewire.h
flowtable.c Add variable declaration missing in r302372. 2016-07-06 17:46:49 +00:00
flowtable.h
ieee8023ad_lacp.c Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
ieee8023ad_lacp.h Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
ieee_oui.h
if_arc.h
if_arcsubr.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
if_arp.h
if_atm.h sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
if_atmsubr.c sys/net* : for pointers replace 0 with NULL. 2016-04-15 17:30:33 +00:00
if_bridge.c Remove excess CTLFLAG_VNET 2016-10-26 23:40:07 +00:00
if_bridgevar.h
if_clone.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
if_clone.h Fix group membership of cloned interfaces when one is moved by 2015-03-02 20:00:03 +00:00
if_dead.c Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
if_debug.c Add more fields to if_debug.c for ddb(4) 'show ifnet'; resort 2016-06-22 12:53:10 +00:00
if_disc.c Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
if_dl.h Constantify lookup key in ifa_ifwith* functions. 2015-09-05 05:33:20 +00:00
if_edsc.c Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
if_enc.c Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
if_enc.h Overhaul if_enc(4) and make it loadable in run-time. 2015-11-25 07:31:59 +00:00
if_epair.c Teach netisr_get_cpuid() to limit a given value to supported by netisr. 2016-08-17 20:21:33 +00:00
if_ethersubr.c Extract out the various local definitions of ETHER_IS_BROADCAST() and 2016-08-07 03:48:33 +00:00
if_fddisubr.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
if_fwsubr.c This change re-adds L2 caching for TCP and UDP, as originally added in D4306 2016-06-02 17:51:29 +00:00
if_gif.c Extract out the various local definitions of ETHER_IS_BROADCAST() and 2016-08-07 03:48:33 +00:00
if_gif.h - Remove GIF_{SEND,ACCEPT}_REVETHIP. 2015-09-10 05:59:39 +00:00
if_gre.c Cleanup unnecessary semicolons from the kernel. 2016-04-10 23:07:00 +00:00
if_gre.h
if_iso88025subr.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
if_lagg.c Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
if_lagg.h Revert accidental submit of WIP as part of r297609 2016-04-06 04:58:20 +00:00
if_llatbl.c Make LLTABLE list lock private for if_llatbl.c 2016-10-11 17:41:13 +00:00
if_llatbl.h Make LLTABLE list lock private for if_llatbl.c 2016-10-11 17:41:13 +00:00
if_llc.h
if_loop.c Clear mbuf hashtype on loopback when RSS is enabled. 2016-10-20 13:48:29 +00:00
if_me.c Add IFCAP_LINKSTATE support. 2015-10-03 09:15:23 +00:00
if_media.c Make IFMEDIA_DEBUG a kernel option. 2015-04-21 10:35:23 +00:00
if_media.h [net80211] add VHT media types in the media layer. 2017-01-05 04:49:23 +00:00
if_mib.c These files were getting sys/malloc.h and vm/uma.h with header pollution 2016-02-01 17:41:21 +00:00
if_mib.h
if_pflog.h
if_pfsync.h
if_sppp.h sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
if_spppfr.c
if_spppsubr.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
if_stf.c Convert if_stf(4) to new routing api. 2015-12-16 09:18:20 +00:00
if_tap.c if_tap: correct typo in sysctl description (Enably) 2015-10-21 19:56:16 +00:00
if_tap.h
if_tapvar.h
if_tun.c Allow writing IP packets of length TUNMRU no matter if TUNSIFHEAD is set 2016-05-19 13:52:12 +00:00
if_tun.h Allow an MTU of 65535 bytes to be set via TUN[SG]IFINFO. This requires 2016-05-24 11:47:14 +00:00
if_types.h After r281643 an #ifdef IFT_FOO preprocessor directive returns false, 2015-05-02 20:37:40 +00:00
if_var.h Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
if_vlan_var.h Add support to priority code point (PCP) that is an 3-bit field 2016-06-06 09:51:58 +00:00
if_vlan.c Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
if_vxlan.c net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties 2016-06-07 04:51:50 +00:00
if_vxlan.h
if.c if: Defer the if_up until the ifnet.if_ioctl is called. 2017-01-06 05:10:49 +00:00
if.h Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
ifdi_if.m 2017 IFLIB updates in preparation for commits to e1000 and ixgbe. 2017-01-02 00:56:33 +00:00
iflib.c Fix hangs in a uniprocessor configuration (qemu, virtualbox, real hw). 2017-01-15 00:50:10 +00:00
iflib.h 2017 IFLIB updates in preparation for commits to e1000 and ixgbe. 2017-01-02 00:56:33 +00:00
ifq.h buf_ring/drbr: Add buf_ring_peek_clear_sc and use it in drbr_peek 2016-02-29 03:54:51 +00:00
iso88025.h
mp_ring.c sys/net: more spelling. 2016-05-19 16:28:05 +00:00
mp_ring.h Import the 'iflib' API library for network drivers. From the author: 2016-05-18 04:35:58 +00:00
mppc.h ng_mppc(4): Bring netgraph(3) MPPC compression support. 2016-06-07 15:07:00 +00:00
mppcc.c ng_mppc(4):: basic readability cleanups. 2016-07-09 02:33:45 +00:00
mppcd.c ng_mppc(4):: basic readability cleanups. 2016-07-09 02:33:45 +00:00
netisr_internal.h
netisr.c Teach netisr_get_cpuid() to limit a given value to supported by netisr. 2016-08-17 20:21:33 +00:00
netisr.h Introduce a per-VNET flag to enable/disable netisr prcessing on that VNET. 2016-06-03 13:57:10 +00:00
netmap_user.h remove trailing whitespace. No code changes. 2016-10-18 15:41:57 +00:00
netmap_virt.h Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail: 2016-10-27 09:46:22 +00:00
netmap.h Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail: 2016-10-27 09:46:22 +00:00
paravirt.h
pfil.c Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
pfil.h
pfkeyv2.h IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets. 2016-11-25 14:44:49 +00:00
pfvar.h Update pf(4) and pflog(4) to survive basic VNET testing, which includes 2016-06-23 21:34:38 +00:00
ppp_defs.h
radix_mpath.c radix_mpath: Don't derefence a NULL pointer in for loop iteration 2016-04-26 20:27:17 +00:00
radix_mpath.h MFP r287070,r287073: split radix implementation and route table structure. 2016-01-25 06:33:15 +00:00
radix.c radix rn_inithead: Fix minor leak in low memory conditions 2016-04-20 02:01:45 +00:00
radix.h MFP r287070,r287073: split radix implementation and route table structure. 2016-01-25 06:33:15 +00:00
raw_cb.c
raw_cb.h
raw_usrreq.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
rndis.h hyperv/hn: Define empty packet filter. 2016-10-27 04:55:19 +00:00
route_var.h Replace rw_init/rw_destroy with corresponding macros. 2016-10-06 14:42:06 +00:00
route.c Fix the typos and style(9) in comment. 2016-12-08 18:18:48 +00:00
route.h rtentry: Initialize rt_mtx with MTX_NEW 2016-08-01 23:07:31 +00:00
rss_config.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
rss_config.h Replace the printf()s with optional rate limited debugging for RSS. 2015-08-28 05:58:16 +00:00
rtsock.c Plug a lock leak in sysctl_ifmalist(). 2016-11-15 19:23:48 +00:00
sff8436.h Add SFF-8024 Extended Specification Compliance 2015-12-28 09:26:07 +00:00
sff8472.h sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
slcompress.c
slcompress.h
toeplitz.c
toeplitz.h
vnet.c Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
vnet.h Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00