freebsd-skq/sys/netinet6
Hans Petter Selasky f3e7afe2d7 Implement kernel support for hardware rate limited sockets.
- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.

- Add support for hardware-driven, Receive Side Scaling (RSS) aware,
rate-limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API supports rates in
the range from 1 to 4Gbytes/s, which is suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.

- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().

- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which indicates whether a network driver supports rate limiting.

- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.

- How rate limiting works:

1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.

2) When the network driver sees that "m->m_pkthdr.snd_tag" is not
NULL, it moves the packets into the designated rate limited sendqueue
given by the snd_tag pointer. How the traffic in that sendqueue is
actually rate limited is up to the individual driver.

3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag does not match
the driver's own network interface. The network adapter frees the mbuf
and returns EAGAIN, which causes ip_output() to release and clear the send
tag. On the next ip_output() call, allocation of a new "snd_tag" is attempted.

4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
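
As an illustration of step 1 from the application side, here is a minimal userspace sketch. It assumes the option value is an unsigned 32-bit rate in bytes per second, which is consistent with the 1 to 4Gbytes/s range stated above. The helper names pacing_rate_from_mbit() and set_pacing_rate() are hypothetical and not part of this commit.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: convert megabits per second into a bytes-per-second
 * value that fits in 32 bits. Returns 0 on success and -1 when the result
 * would fall outside the 1 .. UINT32_MAX range.
 */
int
pacing_rate_from_mbit(uint64_t mbit, uint32_t *rate)
{
	const uint64_t bytes_per_mbit = 1000000 / 8;	/* 125000 */

	if (mbit == 0 || mbit > UINT32_MAX / bytes_per_mbit)
		return (-1);
	*rate = (uint32_t)(mbit * bytes_per_mbit);
	return (0);
}

/*
 * Hypothetical helper: request a pacing rate on an already connected or
 * accepted socket. Returns the setsockopt() result, or -1 with errno set
 * to ENOPROTOOPT when the headers do not define SO_MAX_PACING_RATE.
 */
int
set_pacing_rate(int fd, uint32_t rate)
{
#ifdef SO_MAX_PACING_RATE
	return (setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
	    &rate, sizeof(rate)));
#else
	(void)fd;
	(void)rate;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

An application would typically call set_pacing_rate() right after connect() or accept() returns. Whether the limit is enforced in hardware then depends on the kernel configuration and the driver, as described in the steps above.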
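
The tag lifecycle in steps 1 to 4 can be sketched as a small userspace model. Everything below is a simplified stand-in: the sketch_* structures and functions are hypothetical, are not the kernel's "struct ifnet" or "struct m_snd_tag", and leave out mbufs, locking and rate modification. The model only shows when a tag is allocated, when a mismatch yields EAGAIN, and when the tag is released.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins, NOT the real kernel structures. */
struct sketch_ifnet {
	int	unit;
};

struct sketch_snd_tag {
	struct sketch_ifnet	*ifp;	/* interface that allocated the tag */
	uint32_t		 rate;	/* bytes per second */
};

struct sketch_pcb {
	uint32_t		 rate;	/* rate stored by setsockopt() */
	struct sketch_snd_tag	*tag;	/* NULL until a tag is allocated */
};

/* Models ifp->if_snd_tag_alloc(): may fail, never blocks. */
struct sketch_snd_tag *
sketch_tag_alloc(struct sketch_ifnet *ifp, uint32_t rate)
{
	struct sketch_snd_tag *tag = malloc(sizeof(*tag));

	if (tag != NULL) {
		tag->ifp = ifp;
		tag->rate = rate;
	}
	return (tag);
}

/* Models ifp->if_snd_tag_free() on PCB detach or after a route change. */
void
sketch_tag_free(struct sketch_pcb *pcb)
{
	free(pcb->tag);
	pcb->tag = NULL;
}

/* Models the driver's if_transmit(): reject tags from another interface. */
int
sketch_transmit(struct sketch_ifnet *ifp, const struct sketch_snd_tag *tag)
{
	if (tag != NULL && tag->ifp != ifp)
		return (EAGAIN);	/* route changed, tag is stale */
	return (0);
}

/* Models the output path: allocate lazily, drop the tag on EAGAIN. */
int
sketch_output(struct sketch_pcb *pcb, struct sketch_ifnet *ifp)
{
	int error;

	if (pcb->rate != 0 && pcb->tag == NULL)
		pcb->tag = sketch_tag_alloc(ifp, pcb->rate);
	error = sketch_transmit(ifp, pcb->tag);
	if (error == EAGAIN)
		sketch_tag_free(pcb);	/* next call allocates a new tag */
	return (error);
}
```

In this model a route change costs one failed transmit: the call that hits the stale tag returns EAGAIN, and the following call allocates a fresh tag on the new interface.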

Reviewed by:		wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision:	https://reviews.freebsd.org/D3687
Sponsored by:		Mellanox Technologies
MFC after:		3 months
2017-01-18 13:31:17 +00:00
..
dest6.c
frag6.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
icmp6.c Make ICMPv6 hard error handling for TCP consistent with the ICMPv4 2016-10-21 10:32:57 +00:00
icmp6.h
in6_cksum.c
in6_fib.c
in6_fib.h
in6_gif.c
in6_ifattach.c Lock the ND prefix list and add refcounting for prefixes. 2016-10-07 21:10:53 +00:00
in6_ifattach.h Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
in6_jail.c Move IPv4-specific jail functions to new file netinet/in_jail.c 2016-08-09 02:16:21 +00:00
in6_mcast.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
in6_pcb.c This change re-adds L2 caching for TCP and UDP, as originally added in D4306 2016-06-02 17:51:29 +00:00
in6_pcb.h
in6_pcbgroup.c
in6_proto.c Improve some of the sysctl descriptions added in r299827. 2017-01-16 19:35:19 +00:00
in6_rmx.c
in6_rss.c
in6_rss.h
in6_src.c
in6_var.h Add ip6_tryforward() - a run to completion forwarding implementation 2016-12-12 10:57:32 +00:00
in6.c Lock the ND prefix list and add refcounting for prefixes. 2016-10-07 21:10:53 +00:00
in6.h Add net.inet6.ip6.intr_queue_maxlen sysctl. It can be used to 2016-07-15 17:09:30 +00:00
ip6_ecn.h
ip6_fastfwd.c Add ip6_tryforward() - a run to completion forwarding implementation 2016-12-12 10:57:32 +00:00
ip6_forward.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
ip6_gre.c
ip6_id.c sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
ip6_input.c Add a new socket option SO_TS_CLOCK to pick from several different clock 2017-01-16 17:46:38 +00:00
ip6_ipsec.c Remove the kernel option for IPSEC_FILTERTUNNEL, which was deprecated 2016-08-21 18:55:30 +00:00
ip6_ipsec.h
ip6_mroute.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
ip6_mroute.h
ip6_output.c Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
ip6_var.h The pr_destroy field does not allow us to run the teardown code in a 2016-06-01 10:14:04 +00:00
ip6.h
ip6protosw.h sys/net*: minor spelling fixes. 2016-05-03 18:05:43 +00:00
ip_fw_nat64.h Add ipfw_nat64 module that implements stateless and stateful NAT64. 2016-08-13 16:09:49 +00:00
ip_fw_nptv6.h Add ipfw_nptv6 module that implements Network Prefix Translation for IPv6 2016-07-18 19:46:31 +00:00
mld6_var.h
mld6.c Get closer to a VIMAGE network stack teardown from top to bottom rather 2016-06-21 13:48:49 +00:00
mld6.h
nd6_nbr.c nd6_dad_timer(): don't assert that the address is tentative. 2016-10-01 01:30:34 +00:00
nd6_rtr.c Remove a bogus KASSERT from nd6_prefix_unlink(). 2016-12-19 19:21:28 +00:00
nd6.c Release the ND6 list lock before making a prefix off-link in nd6_timer(). 2017-01-08 18:46:00 +00:00
nd6.h Lock the ND prefix list and add refcounting for prefixes. 2016-10-07 21:10:53 +00:00
pim6_var.h
pim6.h
raw_ip6.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
raw_ip6.h
route6.c
scope6_var.h
scope6.c
sctp6_usrreq.c Whitespace changes. 2016-12-26 11:06:41 +00:00
sctp6_var.h Whitespace changes. 2016-12-06 10:21:25 +00:00
send.c
send.h
tcp6_var.h
udp6_usrreq.c Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead. 2016-09-15 07:41:48 +00:00
udp6_var.h