freebsd-dev/sys/netinet
Mark Johnston fdb987bebd inpcb: Split PCB hash tables
Currently we use a single hash table per PCB database for connected and
bound PCBs.  Since we started using net_epoch to synchronize hash table
lookups, there's been a bug, noted in a comment above in_pcbrehash():
connecting a socket can cause an inpcb to move between hash chains, and
this can cause a concurrent lookup to follow the wrong linkage pointers.
I believe this could cause rare, spurious ECONNREFUSED errors in the
worst case.
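
To make the failure mode concrete, here is a minimal single-threaded
sketch that replays one bad interleaving.  It is not kernel code: the
toy_* names, the plain singly-linked chains and the "reader paused
mid-chain" step are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node with ONE linkage pointer shared by every chain. */
struct toy_node {
	int id;
	struct toy_node *next;
};

/* Push n onto the head of *chain. */
static void
toy_push(struct toy_node **chain, struct toy_node *n)
{
	n->next = *chain;
	*chain = n;
}

/* Unlink n from *chain, if present. */
static void
toy_unlink(struct toy_node **chain, struct toy_node *n)
{
	struct toy_node **pp;

	for (pp = chain; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			return;
		}
	}
}

/*
 * Replay the interleaving: a reader is paused on node 2 in the first
 * chain when a writer moves node 2 to a second chain.  Returns the id
 * of the node the reader lands on next, or -1 at the end of a chain.
 */
static int
toy_replay_race(void)
{
	struct toy_node a = { 1, NULL }, b = { 2, NULL };
	struct toy_node c = { 3, NULL }, d = { 4, NULL };
	struct toy_node *first = NULL, *second = NULL, *cursor;

	toy_push(&first, &a);
	toy_push(&first, &b);
	toy_push(&first, &c);		/* first:  3 -> 2 -> 1 */
	toy_push(&second, &d);		/* second: 4 */

	cursor = first->next;		/* reader paused on 2; 1 not yet seen */

	toy_unlink(&first, &b);		/* writer: "connect" moves node 2 */
	toy_push(&second, &b);		/* second: 2 -> 4 */

	cursor = cursor->next;		/* reader resumes in the wrong chain */
	return (cursor != NULL ? cursor->id : -1);
}
```

In this replay the reader ends up on node 4 and never visits node 1,
which is the toy analogue of a lookup missing a PCB that was in the
chain all along.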

Address the problem by introducing a second hash table and adding more
linkage pointers to struct inpcb.  Now the database has one table each
for connected and unconnected sockets.

When inserting an inpcb into the hash table, in_pcbinhash() now looks at
the foreign address of the inpcb to figure out which table to use.  This
ensures that queue linkage pointers are stable until the socket is
disconnected, so the problem described above goes away.  There is also a
small benefit in that in_pcblookup_*() can now search just one of the
two possible hash buckets.
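
The table selection can be sketched roughly as follows.  This is a
simplified model under stated assumptions, not the actual in_pcb.c
code: struct toy_pcb, the toy_* functions, the XOR hash and the use of
a zero foreign address to mean "unconnected" are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NBUCKETS 16

/* Hypothetical stand-in for struct inpcb; field names are illustrative. */
struct toy_pcb {
	uint32_t faddr;			/* foreign address, 0 when unconnected */
	uint16_t fport;			/* foreign port, 0 when unconnected */
	uint16_t lport;			/* local port */
	bool inhash;			/* plays the role of INP_INHASHLIST */
	struct toy_pcb *exact_next;	/* linkage for the connected table only */
	struct toy_pcb *wild_next;	/* linkage for the unconnected table only */
};

struct toy_pcbdb {
	struct toy_pcb *exact[TOY_NBUCKETS];	/* connected PCBs */
	struct toy_pcb *wild[TOY_NBUCKETS];	/* bound, unconnected PCBs */
};

/* Placeholder hash; the real code uses a proper hash function. */
static unsigned
toy_hash(uint32_t faddr, uint16_t fport, uint16_t lport)
{
	return ((faddr ^ fport ^ lport) % TOY_NBUCKETS);
}

/* Insert p into whichever table its foreign address selects. */
static void
toy_inhash(struct toy_pcbdb *db, struct toy_pcb *p)
{
	unsigned b;

	assert(!p->inhash);
	if (p->faddr != 0) {
		b = toy_hash(p->faddr, p->fport, p->lport);
		p->exact_next = db->exact[b];
		db->exact[b] = p;
	} else {
		b = toy_hash(0, 0, p->lport);
		p->wild_next = db->wild[b];
		db->wild[b] = p;
	}
	p->inhash = true;
}

/* Exact-match lookup: walks one bucket of the connected table. */
static struct toy_pcb *
toy_lookup_exact(const struct toy_pcbdb *db, uint32_t faddr, uint16_t fport,
    uint16_t lport)
{
	struct toy_pcb *p;

	for (p = db->exact[toy_hash(faddr, fport, lport)]; p != NULL;
	    p = p->exact_next) {
		if (p->faddr == faddr && p->fport == fport &&
		    p->lport == lport)
			return (p);
	}
	return (NULL);
}

/* Wildcard lookup: walks one bucket of the unconnected table. */
static struct toy_pcb *
toy_lookup_wild(const struct toy_pcbdb *db, uint16_t lport)
{
	struct toy_pcb *p;

	for (p = db->wild[toy_hash(0, 0, lport)]; p != NULL;
	    p = p->wild_next) {
		if (p->lport == lport)
			return (p);
	}
	return (NULL);
}
```

Because each table has its own linkage field in this model, inserting
into one table never rewrites a pointer that a reader of the other
table might be following.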

I also made the "rehash" parameter of in(6)_pcbconnect() unused.  This
parameter seems confusing and it is simpler to let the inpcb code figure
out what to do using the existing INP_INHASHLIST flag.

UDP sockets pose a special problem since they can be connected and
disconnected multiple times during their lifecycle.  To handle this, the
patch plugs a hole in the inpcb structure and uses it to store an SMR
sequence number.  When an inpcb is disconnected - an operation which
requires the global PCB database hash lock - the write sequence number
is advanced, and in order to reconnect, the connecting thread must wait
for readers to drain before reusing the inpcb's hash chain linkage
pointers.
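
The sequence-number handshake can be modelled with a toy,
single-threaded sketch like the one below.  It does not use the
kernel's smr(9) interface; the toy_* names, the fixed reader array and
the lack of wraparound handling and memory barriers are all
simplifying assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAXREADERS	4
#define TOY_SEQ_IDLE	0	/* reader is not inside a read section */

/* Hypothetical model: one write sequence, one slot per reader. */
struct toy_smr {
	uint32_t wr_seq;			/* must start at 1 */
	uint32_t rd_seq[TOY_MAXREADERS];	/* wr_seq seen at entry */
};

/* Reader r enters a read section and records the current sequence. */
static void
toy_enter(struct toy_smr *s, int r)
{
	s->rd_seq[r] = s->wr_seq;
}

/* Reader r leaves its read section. */
static void
toy_exit(struct toy_smr *s, int r)
{
	s->rd_seq[r] = TOY_SEQ_IDLE;
}

/*
 * Called after unlinking a PCB on disconnect.  The returned goal is
 * what the model stores in the PCB.
 */
static uint32_t
toy_advance(struct toy_smr *s)
{
	return (++s->wr_seq);
}

/*
 * True once no active reader entered before the goal was issued, i.e.
 * nobody can still be following the PCB's old linkage pointers.  A
 * reconnecting thread would retry this until it succeeds.
 */
static bool
toy_poll(const struct toy_smr *s, uint32_t goal)
{
	int r;

	for (r = 0; r < TOY_MAXREADERS; r++) {
		if (s->rd_seq[r] != TOY_SEQ_IDLE && s->rd_seq[r] < goal)
			return (false);
	}
	return (true);
}
```

In this model a reader that entered before the disconnect blocks reuse
of the linkage until it exits, while readers that enter afterwards do
not, since they can no longer reach the unlinked PCB.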

raw_ip (ab)uses the hash table without using the corresponding
accessors.  Since there are now two hash tables, it arbitrarily uses the
"connected" table for all of its PCBs.  This will be addressed in some
way in the future.

inp iterators which specify a hash bucket will only visit connected
PCBs.  This is not really correct, but nothing in the tree uses that
functionality except raw_ip, which as mentioned above places all of its
PCBs in the "connected" table and so is unaffected.

Discussed with:	glebius
Tested by:	glebius
Sponsored by:	Klara, Inc.
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D38569
2023-04-20 12:13:06 -04:00
cc Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities. 2023-03-16 11:43:16 -04:00
khelp tcp: provide macros to access inpcb and socket from a tcpcb 2022-11-08 10:24:40 -08:00
libalias libalias: Mark set but unused variables as unused. 2023-04-10 10:35:29 -07:00
netdump IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
tcp_stacks tcp: rack the request level logging is a bit too noisy when doing point logging. 2023-04-19 14:02:12 -04:00
accf_data.c Define a module version for accept filter modules. 2020-05-19 18:35:08 +00:00
accf_dns.c Define a module version for accept filter modules. 2020-05-19 18:35:08 +00:00
accf_http.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
dccp.h Add header definition for RFC4340, Datagram Congestion Control Protocol 2020-06-17 13:27:13 +00:00
icmp6.h pf: apply the network stack's ICMP rate limiting to ICMP errors sent by pf 2022-10-14 10:36:16 +02:00
icmp_var.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
if_ether.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
if_ether.h
igmp_var.h igmp: use callout(9) directly instead of pr_slowtimo, pr_fasttimo 2022-08-17 11:50:31 -07:00
igmp.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
igmp.h
in_cksum.c netinet: Implement in_cksum_skip() using m_apply() 2021-11-24 13:31:16 -05:00
in_debug.c Use network epoch to protect local IPv4 addresses hash. 2021-10-22 14:40:53 -07:00
in_fib_algo.c Fix IPv4 fib bsearch4() lookup array construction. 2021-01-17 20:32:26 +00:00
in_fib_dxr.c fib_algo: shift / mask by constants in dxr_lookup() 2022-01-17 00:13:47 +01:00
in_fib.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in_fib.h Refactor fib4/fib6 functions. 2020-11-29 13:41:49 +00:00
in_gif.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in_jail.c jail: convert several functions from int to bool 2023-03-14 21:05:33 -06:00
in_kdtrace.c Fix dtrace SDT probe tcp:::debug-input 2021-12-20 17:15:43 -09:00
in_kdtrace.h tcp: retire TCPDEBUG 2022-12-14 09:54:06 -08:00
in_mcast.c in_mcat.c: change multicast not member condition 2023-03-03 22:25:17 -07:00
in_pcb_var.h in_pcb: use jenkins hash over the entire IPv6 (or IPv4) address 2021-12-26 10:47:28 -08:00
in_pcb.c inpcb: Split PCB hash tables 2023-04-20 12:13:06 -04:00
in_pcb.h inpcb: Split PCB hash tables 2023-04-20 12:13:06 -04:00
in_prot.c
in_proto.c netinet*: add back necessary headers 2022-10-26 08:16:44 -07:00
in_rmx.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in_rss.c in_rss: fix set but not used warning 2022-05-07 18:17:33 +02:00
in_rss.h Implement flowid calculation for outbound connections to balance 2020-10-18 17:15:47 +00:00
in_systm.h
in_var.h inet: Simplify if_multiaddrs iteration. 2022-10-08 13:10:07 -04:00
in.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in.h netinet: Remove the IP(V6)_RSS_LISTEN_BUCKET socket option 2023-02-28 15:57:21 -05:00
ip6.h tcp: use IPV6_FLOWLABEL_LEN 2023-04-11 18:53:51 +02:00
ip_carp_nl.h carp: allow commands to use interface name rather than index 2023-03-31 11:29:58 +02:00
ip_carp.c carp: allow commands to use interface name rather than index 2023-03-31 11:29:58 +02:00
ip_carp.h protosw: separate pr_input and pr_ctlinput out of protosw 2022-08-17 11:50:31 -07:00
ip_divert.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
ip_divert.h divert(4): provide statistics 2022-08-30 15:09:21 -07:00
ip_dummynet.h ipfw: use unsigned int for dummynet bandwidth 2021-08-19 10:48:53 +02:00
ip_ecn.c
ip_ecn.h
ip_encap.c
ip_encap.h
ip_fastfwd.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
ip_fw.h ipfw: add support radix tables and table lookup for MAC addresses 2022-06-04 19:12:29 +03:00
ip_gre.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
ip_icmp.c netinet: Disallow unspecified addresses in ICMP-embedded packets 2023-03-13 10:45:56 -04:00
ip_icmp.h netinet*: remove PRC_ constants and streamline ICMP processing 2022-10-03 20:53:04 -07:00
ip_id.c
ip_input.c netinet: Tighten checks for unspecified source addresses 2023-03-06 15:06:00 -05:00
ip_mroute.c mroute: partially sanitize the file 2023-02-23 13:35:44 +00:00
ip_mroute.h IPv4 multicast: fix netstat -g 2022-03-22 07:38:01 -05:00
ip_options.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
ip_options.h
ip_output.c netinet: Remove the IP(V6)_RSS_LISTEN_BUCKET socket option 2023-02-28 15:57:21 -05:00
ip_reass.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
ip_var.h ipfw: garbage collect ip_fw_chk_ptr 2023-03-03 10:30:15 -08:00
ip.h ip_reass: retire ipreass_slowtimo() in favor of per-slot callout 2022-09-08 13:49:58 -07:00
pim_var.h
pim.h
raw_ip.c inpcb: Split PCB hash tables 2023-04-20 12:13:06 -04:00
sctp_asconf.c sctp: improve consistency 2022-05-14 06:28:19 +02:00
sctp_asconf.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_auth.c net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_auth.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_bsd_addr.c sctp: Remove unused variable. 2022-04-12 14:58:59 -07:00
sctp_bsd_addr.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_cc_functions.c sctp: plug set-but-not-used vars 2022-04-19 12:45:57 +00:00
sctp_constants.h sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_crc32.c sctp: fix a signed/unsigned mismatch. 2022-02-17 22:45:57 +01:00
sctp_crc32.h sctp: fix a signed/unsigned mismatch. 2022-02-17 22:45:57 +01:00
sctp_header.h sctp: improve negotiation of zero checksum feature 2023-03-15 22:29:52 +01:00
sctp_indata.c sctp: tweak panic message 2022-08-03 17:28:15 +02:00
sctp_indata.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_input.c sctp: enforce Kahn's rule during the handshake 2023-03-16 17:40:40 +01:00
sctp_input.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_kdtrace.c
sctp_kdtrace.h
sctp_lock_bsd.h sctp: get rid of stcb send lock 2022-03-29 01:50:17 +02:00
sctp_module.c protosw: refactor protosw and domain static declaration and load 2022-08-17 11:50:32 -07:00
sctp_os_bsd.h IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
sctp_os.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_output.c sctp: fix typo in assignment 2023-03-18 23:58:50 +01:00
sctp_output.h sctp: cleanup the SCTP_MAXSEG socket option. 2021-12-27 23:40:31 +01:00
sctp_pcb.c sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_pcb.h sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_peeloff.c sctp: Remove an unused sctp_inpcb field 2021-09-07 11:19:29 -04:00
sctp_peeloff.h
sctp_ss_functions.c sctp: fix typos 2022-03-29 21:09:51 +02:00
sctp_structs.h sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_syscalls.c sctp: ansify 2023-02-13 18:17:10 +00:00
sctp_sysctl.c sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_sysctl.h sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_timer.c Fix unused variable warning in sctp_timer.c 2022-07-25 22:08:28 +02:00
sctp_timer.h net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
sctp_uio.h sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctp_usrreq.c sctp: allow disabling of SCTP_ACCEPT_ZERO_CHECKSUM socket option 2023-03-15 22:55:23 +01:00
sctp_var.h sctp: minor changes due to upstreaming of Glebs recent changes 2022-11-06 23:06:40 +01:00
sctp.h sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctputil.c sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum 2023-03-10 01:45:46 +01:00
sctputil.h sctp: more sb_cc related cleanups 2022-05-23 16:09:23 +02:00
siftr.c pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks() 2023-02-14 10:02:49 -08:00
tcp_accounting.h Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities. 2023-03-16 11:43:16 -04:00
tcp_ecn.c tcp: retire TCPDEBUG 2022-12-14 09:54:06 -08:00
tcp_ecn.h tcp: add conservative d.cep accounting algorithm 2022-11-06 12:05:22 +01:00
tcp_fastopen.c tcp: provide macros to access inpcb and socket from a tcpcb 2022-11-08 10:24:40 -08:00
tcp_fastopen.h Use stub inline functions for no-op versions of tcp_fastopen*(). 2022-04-08 17:25:13 -07:00
tcp_fsm.h tcp: remove a 4.4BSD relic 2022-12-13 20:21:45 -08:00
tcp_hostcache.c tcp(4): Fix a typo in a sysctl description 2021-11-30 07:17:30 +01:00
tcp_hpts.c tcp: Inconsistent use of hpts_calling flag 2023-04-17 17:10:26 -04:00
tcp_hpts.h tcp_hpts: plug a compiler warn 2023-04-05 14:32:13 +00:00
tcp_input.c tcp: reduce argument list to functions that pass a segment 2023-04-07 12:18:06 -07:00
tcp_log_buf.c We have a TCP_LOG_CONNEND log that should come out at the very last log of every connection. This 2023-04-19 12:54:25 -04:00
tcp_log_buf.h tcp: misc cleanup of options for rack as well as socket option logging. 2023-04-07 10:15:29 -04:00
tcp_lro.c tcp_hpts: use queue(9) STAILQ for the input queue 2023-04-17 09:07:23 -07:00
tcp_lro.h tcp_lro: Fix for undefined behaviour. 2023-01-13 11:18:19 +01:00
tcp_offload.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
tcp_offload.h Path MTU discovery hooks for offloaded TCP connections. 2021-04-21 13:00:16 -07:00
tcp_output.c tcp: use IPV6_FLOWLABEL_LEN 2023-04-11 18:53:51 +02:00
tcp_pcap.c tcp: embed inpcb into tcpcb 2022-12-07 09:00:48 -08:00
tcp_pcap.h
tcp_ratelimit.c Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities. 2023-03-16 11:43:16 -04:00
tcp_ratelimit.h rack and bbr not loading if TCP_RATELIMIT is not configured. 2023-01-05 11:59:52 -05:00
tcp_reass.c tcp: retire TCPDEBUG 2022-12-14 09:54:06 -08:00
tcp_sack.c tcp: send SACK rescue retransmission also mid-stream 2023-03-28 04:47:01 +02:00
tcp_seq.h tcp: Correctly compute the retransmit length for all 64-bit platforms. 2022-06-03 10:49:17 +02:00
tcp_stats.c tcp: add missing void keyword to tcp_stats_init 2023-02-13 18:38:04 +00:00
tcp_subr.c We have a TCP_LOG_CONNEND log that should come out at the very last log of every connection. This 2023-04-19 12:54:25 -04:00
tcp_syncache.c tcp: bbr.c is non-capable of doing ECN and sets an INP flag to fend off ECN however our syncache is not aware of that flag. 2023-04-18 12:21:56 -04:00
tcp_syncache.h tcp: Add/update AccECN related statistics and numbers 2022-02-10 00:21:31 +01:00
tcp_timer.c Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities. 2023-03-16 11:43:16 -04:00
tcp_timer.h tcp: rearrange enum and remove unused variable 2023-02-21 18:26:49 +01:00
tcp_timewait.c tcp: retire TCPDEBUG 2022-12-14 09:54:06 -08:00
tcp_usrreq.c tcp: pass tcpcb in the tfb_tcp_ctloutput() method instead of inpcb 2023-04-07 12:18:10 -07:00
tcp_var.h tcp_hpts: use queue(9) STAILQ for the input queue 2023-04-17 09:07:23 -07:00
tcp.h tcp: misc cleanup of options for rack as well as socket option logging. 2023-04-07 10:15:29 -04:00
tcpip.h
toecore.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
toecore.h Path MTU discovery hooks for offloaded TCP connections. 2021-04-21 13:00:16 -07:00
udp_usrreq.c inpcb: use family specific sockaddr argument for bind functions 2023-02-15 10:30:16 -08:00
udp_var.h udp: add protocol method declarations to udp_var.h 2022-12-07 11:51:49 -08:00
udp.h headers: make a few more headers self-contained 2022-01-03 10:12:30 +01:00
udplite.h