freebsd-skq/sys/netinet
Andrew Gallatin a034518ac8 Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain
In order to efficiently serve web traffic on a NUMA
machine, one must avoid as many NUMA domain crossings as
possible. With SO_REUSEPORT_LB, a number of workers can share a
listen socket. However, even if a worker sets affinity to a core
or set of cores on a NUMA domain, it will receive connections
associated with all NUMA domains in the system. This will lead to
cross-domain traffic when the server writes to the socket or
calls sendfile(), and memory is allocated on the server's local
NUMA node, but transmitted on the NUMA node associated with the
TCP connection. Similarly, when the server reads from the socket,
it will likely be reading memory allocated on the NUMA domain
associated with the TCP connection.

This change provides a new TCP socket option, TCP_REUSPORT_LB_NUMA. A
server can now tell the kernel to filter traffic so that only
incoming connections associated with the desired NUMA domain are
given to the server. (Of course, in the case where there are no
servers sharing the listen socket on some domain, then as a
fallback, traffic will be hashed as normal to all servers sharing
the listen socket regardless of domain). This allows a server to
deal only with traffic that is local to its NUMA domain, and
avoids cross-domain traffic in most cases.
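
The filter-then-fall-back behaviour described above can be modelled in
user space as follows. This is only an illustrative sketch, not the
kernel code in in_pcb.c: the names struct worker and pick_worker() are
hypothetical, and reducing the connection hash with a simple modulo is
an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one worker sharing the listen socket. */
struct worker {
	int id;          /* illustrative worker identifier */
	int numa_domain; /* domain the worker asked to be filtered on */
};

/*
 * Choose a worker for a connection that arrived on conn_domain.
 * Workers on the same domain are preferred; if there are none,
 * the connection is hashed across all workers regardless of domain.
 * Returns the worker id, or -1 when there are no workers at all.
 */
int
pick_worker(const struct worker *w, size_t n, int conn_domain, uint32_t hash)
{
	size_t i, local = 0, target;

	/* Count the workers that filter on the connection's domain. */
	for (i = 0; i < n; i++)
		if (w[i].numa_domain == conn_domain)
			local++;

	if (local > 0) {
		/* Hash only among the domain-local workers. */
		target = hash % local;
		for (i = 0; i < n; i++) {
			if (w[i].numa_domain != conn_domain)
				continue;
			if (target == 0)
				return (w[i].id);
			target--;
		}
	}

	/* Fallback: no local worker, so hash across everyone. */
	if (n == 0)
		return (-1);
	return (w[hash % n].id);
}
```

With two workers on domain 0 and one on domain 1, a connection arriving
on domain 1 always lands on the single domain-1 worker, while a
connection arriving on a domain with no workers is spread over all three.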

This patch, together with a corresponding small patch to nginx to use
TCP_REUSPORT_LB_NUMA, allows us to serve 190Gb/s of kTLS-encrypted
https media content from dual-socket Xeons with only 13% (as
measured by pcm.x) cross-domain traffic on the memory controller.
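
A worker might request the filter roughly as sketched below. This is
not the nginx patch itself: the helper name set_listen_numa_filter() is
hypothetical, and the sketch assumes the option is set at the
IPPROTO_TCP level with an int argument naming the NUMA domain, on a
listen socket that was already bound with SO_REUSEPORT_LB. On systems
whose headers do not define TCP_REUSPORT_LB_NUMA the helper simply
fails with ENOPROTOOPT.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Ask the kernel to deliver only connections associated with the
 * given NUMA domain to this listen socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
set_listen_numa_filter(int listen_fd, int domain)
{
#if defined(TCP_REUSPORT_LB_NUMA)
	return (setsockopt(listen_fd, IPPROTO_TCP, TCP_REUSPORT_LB_NUMA,
	    &domain, sizeof(domain)));
#else
	/* Option not available in these headers. */
	(void)listen_fd;
	(void)domain;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

A server pinned to cores on domain 0 would call
set_listen_numa_filter(fd, 0) after bind() and listen(); check tcp(4)
on the target release for the accepted argument values.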

Reviewed by:	jhb, bz (earlier version), bcr (man page)
Tested by: gonzo
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21636
2020-12-19 22:04:46 +00:00
..
cc TCP Cubic: improve reaction to (and rollback from) RTO 2020-10-24 16:11:46 +00:00
khelp
libalias net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
netdump Use zfree() instead of explicit_bzero() and free(). 2020-06-25 20:17:34 +00:00
tcp_stacks RFC 7323 specifies that: 2020-11-09 21:49:40 +00:00
accf_data.c Define a module version for accept filter modules. 2020-05-19 18:35:08 +00:00
accf_dns.c Define a module version for accept filter modules. 2020-05-19 18:35:08 +00:00
accf_http.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
dccp.h Add header definition for RFC4340, Datagram Congestion Control Protocol 2020-06-17 13:27:13 +00:00
icmp6.h icmp6: Count packets dropped due to an invalid hop limit 2020-10-19 17:07:19 +00:00
icmp_var.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
if_ether.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
if_ether.h
igmp_var.h igmp: convert igmpstat to use PCPU counters 2020-11-08 18:49:23 +00:00
igmp.c igmp: convert igmpstat to use PCPU counters 2020-11-08 18:49:23 +00:00
igmp.h
in_cksum.c
in_debug.c
in_fib.c Refactor fib4/fib6 functions. 2020-11-29 13:41:49 +00:00
in_fib.h Refactor fib4/fib6 functions. 2020-11-29 13:41:49 +00:00
in_gif.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
in_jail.c
in_kdtrace.c Separate out SCTP related dtrace code. 2019-10-14 20:32:11 +00:00
in_kdtrace.h Separate out SCTP related dtrace code. 2019-10-14 20:32:11 +00:00
in_mcast.c Simplify NET_EPOCH_EXIT in inp_join_group(). 2020-10-18 12:03:36 +00:00
in_pcb.c Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain 2020-12-19 22:04:46 +00:00
in_pcb.h Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain 2020-12-19 22:04:46 +00:00
in_pcbgroup.c
in_prot.c
in_proto.c Remove unused nhop_ref_any() function. 2020-09-20 21:32:52 +00:00
in_rmx.c Refactor rib iterator functions. 2020-11-22 20:21:10 +00:00
in_rss.c Implement flowid calculation for outbound connections to balance 2020-10-18 17:15:47 +00:00
in_rss.h Implement flowid calculation for outbound connections to balance 2020-10-18 17:15:47 +00:00
in_systm.h
in_var.h Simplify dom_<rtattach|rtdetach>. 2020-08-14 21:29:56 +00:00
in.c Implement SIOCGIFALIAS. 2020-10-14 09:22:54 +00:00
in.h Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow. 2020-10-09 12:06:43 +00:00
ip6.h Remove stale definitions. The removed definitions are not used right 2020-03-01 12:34:27 +00:00
ip_carp.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_carp.h carp: replace caddr_t with char * 2019-12-06 16:35:48 +00:00
ip_divert.c Add the SCTP_SUPPORT kernel option. 2020-06-18 19:32:34 +00:00
ip_divert.h
ip_dummynet.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_ecn.c
ip_ecn.h
ip_encap.c Widen NET_EPOCH coverage. 2019-10-07 22:40:05 +00:00
ip_encap.h
ip_fastfwd.c ip_fastfwd: style(9) tidy for r367628 2020-11-13 18:25:07 +00:00
ip_fw.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_gre.c Introduce NET_EPOCH_CALL() macro and use it everywhere where we free 2020-01-15 06:05:20 +00:00
ip_icmp.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_icmp.h
ip_id.c Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) 2020-02-26 14:26:36 +00:00
ip_input.c Remove RADIX_MPATH config option. 2020-11-29 19:43:33 +00:00
ip_mroute.c ip_mroute: fix the viftable export sysctl 2020-10-11 00:01:00 +00:00
ip_mroute.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_options.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_options.h
ip_output.c Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow. 2020-10-09 12:06:43 +00:00
ip_reass.c Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) 2020-02-26 14:26:36 +00:00
ip_var.h An earlier commit effectively turned out the fast forwading path 2020-11-12 21:58:47 +00:00
ip.h
pim_var.h
pim.h
raw_ip.c Implement flowid calculation for outbound connections to balance 2020-10-18 17:15:47 +00:00
sctp_asconf.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_asconf.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_auth.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_auth.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_bsd_addr.c Use __func__ instead of __FUNCTION__ for consistency. 2020-10-04 15:37:34 +00:00
sctp_bsd_addr.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_cc_functions.c Minor cleanups. 2020-10-07 15:22:48 +00:00
sctp_constants.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_crc32.c No need to include netinet/sctp_crc32.h twice. 2020-06-22 14:36:14 +00:00
sctp_crc32.h Add the SCTP_SUPPORT kernel option. 2020-06-18 19:32:34 +00:00
sctp_header.h Whitespace changes. 2020-09-24 12:26:06 +00:00
sctp_indata.c Fix a potential use-after-free bug introduced in 2020-11-09 13:12:07 +00:00
sctp_indata.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_input.c Harden the handling of outgoing streams in case of an restart or INIT 2020-12-13 23:51:51 +00:00
sctp_input.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_kdtrace.c Separate out SCTP related dtrace code. 2019-10-14 20:32:11 +00:00
sctp_kdtrace.h Separate out SCTP related dtrace code. 2019-10-14 20:32:11 +00:00
sctp_lock_bsd.h Whitespace changes. 2020-09-24 12:26:06 +00:00
sctp_module.c Provide support for building SCTP as a loadable module. 2020-07-10 14:56:05 +00:00
sctp_os_bsd.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_os.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_output.c Improve the handling of cookie life times. 2020-10-16 10:44:48 +00:00
sctp_output.h Whitespace changes. 2020-09-24 12:26:06 +00:00
sctp_pcb.c Ensure variables are initialized before used. 2020-10-06 11:29:08 +00:00
sctp_pcb.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_peeloff.c Non-functional changes due to upstream cleanup. 2020-06-11 13:34:09 +00:00
sctp_peeloff.h
sctp_ss_functions.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_structs.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_syscalls.c Cleanup, no functional change intended. 2020-07-12 18:34:09 +00:00
sctp_sysctl.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_sysctl.h Improve the handling of cookie life times. 2020-10-16 10:44:48 +00:00
sctp_timer.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_timer.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_uio.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_usrreq.c Improve the handling of cookie life times. 2020-10-16 10:44:48 +00:00
sctp_var.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp.h Improve the handling of cookie life times. 2020-10-16 10:44:48 +00:00
sctputil.c Remove dead stores reported by clang static code analysis 2020-10-06 11:08:52 +00:00
sctputil.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
siftr.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_debug.c
tcp_debug.h
tcp_fastopen.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_fastopen.h
tcp_fsm.h White space cleanup -- remove trailing tab's or spaces 2020-02-12 13:31:36 +00:00
tcp_hostcache.c Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) 2020-02-26 14:26:36 +00:00
tcp_hostcache.h
tcp_hpts.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_hpts.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_input.c Add TCP feature Proportional Rate Reduction (PRR) - RFC6937 2020-12-04 11:29:27 +00:00
tcp_log_buf.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_log_buf.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_lro.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
tcp_lro.h White space cleanup -- remove trailing tab's or spaces 2020-02-12 13:31:36 +00:00
tcp_offload.c Initial support for kernel offload of TLS receive. 2020-04-27 23:17:19 +00:00
tcp_offload.h Initial support for kernel offload of TLS receive. 2020-04-27 23:17:19 +00:00
tcp_output.c Stop sending tiny new data segments during SACK recovery 2020-10-09 12:44:56 +00:00
tcp_pcap.c Step 4.2: start divorce of M_EXT and M_EXTPG 2020-05-03 00:37:16 +00:00
tcp_pcap.h
tcp_ratelimit.c Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc(). 2020-10-29 23:28:39 +00:00
tcp_ratelimit.h Fix copyright year and eliminate the obsolete all rights reserved line. 2020-04-08 17:55:45 +00:00
tcp_reass.c Prevent premature SACK block transmission during loss recovery 2020-11-08 18:47:05 +00:00
tcp_sack.c Stop sending tiny new data segments during SACK recovery 2020-10-09 12:44:56 +00:00
tcp_seq.h
tcp_stats.c Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) 2020-02-26 14:26:36 +00:00
tcp_subr.c Save the current TCP pacing rate in t_pacing_rate. 2020-10-29 00:03:19 +00:00
tcp_syncache.c Fix two occurences of a typo in a comment introduced in r367530. 2020-11-23 10:13:56 +00:00
tcp_syncache.h Fix the following issues related to the TCP SYN-cache: 2020-08-10 20:24:48 +00:00
tcp_timer.c Improve the TCP blackhole detection. The principle is to reduce the 2020-04-14 16:35:05 +00:00
tcp_timer.h Reduce default TCP delayed ACK timeout to 40ms. 2020-04-16 15:59:23 +00:00
tcp_timewait.c Fix an issue I introuced in r367530: tcp_twcheck() can be called 2020-11-20 13:00:28 +00:00
tcp_usrreq.c Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain 2020-12-19 22:04:46 +00:00
tcp_var.h Add TCP feature Proportional Rate Reduction (PRR) - RFC6937 2020-12-04 11:29:27 +00:00
tcp.h Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain 2020-12-19 22:04:46 +00:00
tcpip.h
toecore.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
toecore.h Initial support for kernel offload of TLS receive. 2020-04-27 23:17:19 +00:00
udp_usrreq.c Implement flowid calculation for outbound connections to balance 2020-10-18 17:15:47 +00:00
udp_var.h Add a knob to allow zero UDP checksums for UDP/IPv6 traffic on the given UDP port. 2020-09-18 02:21:15 +00:00
udp.h White space cleanup -- remove trailing tab's or spaces 2020-02-12 13:31:36 +00:00
udplite.h White space cleanup -- remove trailing tab's or spaces 2020-02-12 13:31:36 +00:00