freebsd-dev/sys/netinet6
Mark Johnston fdb987bebd inpcb: Split PCB hash tables
Currently we use a single hash table per PCB database for connected and
bound PCBs.  Since we started using net_epoch to synchronize hash table
lookups, there's been a bug, noted in a comment above in_pcbrehash():
connecting a socket can cause an inpcb to move between hash chains, and
this can cause a concurrent lookup to follow the wrong linkage pointers.
I believe this could cause rare, spurious ECONNREFUSED errors in the
worst case.

Address the problem by introducing a second hash table and adding more
linkage pointers to struct inpcb.  Now the database has one table each
for connected and unconnected sockets.

When inserting an inpcb into the hash table, in_pcbinhash() now looks at
the foreign address of the inpcb to figure out which table to use.  This
ensures that queue linkage pointers are stable until the socket is
disconnected, so the problem described above goes away.  There is also a
small benefit in that in_pcblookup_*() can now search just one of the
two possible hash buckets.
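
As a rough illustration of the table selection described above, here is a
user-space sketch. It is not the kernel code: struct toy_pcb,
struct toy_pcbinfo, toy_hash(), toy_inhash() and toy_lookup() are all
hypothetical names, the hash function is a placeholder, and IPv4-style
32-bit addresses stand in for the real address types. It only shows the
idea that the foreign address decides which table (and which set of
linkage pointers) an entry uses, and that a lookup probes one bucket per
table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NBUCKETS	8u	/* power of two so the hash can be masked */
#define TOY_ADDR_ANY	0u	/* stands in for an unspecified address */

/* Hypothetical, heavily simplified stand-in for struct inpcb. */
struct toy_pcb {
	uint32_t laddr, faddr;
	uint16_t lport, fport;
	struct toy_pcb *next_exact;	/* linkage in the "connected" table */
	struct toy_pcb *next_wild;	/* linkage in the "unconnected" table */
};

/* Hypothetical PCB database with one table per connection state. */
struct toy_pcbinfo {
	struct toy_pcb *exact[TOY_NBUCKETS];
	struct toy_pcb *wild[TOY_NBUCKETS];
};

/* Placeholder hash; the kernel uses different hash functions. */
static unsigned
toy_hash(uint32_t faddr, uint16_t lport, uint16_t fport)
{
	return ((faddr ^ lport ^ fport) & (TOY_NBUCKETS - 1));
}

/* Pick the table from the foreign address, then push onto the chain. */
static void
toy_inhash(struct toy_pcbinfo *pi, struct toy_pcb *p)
{
	unsigned b;

	if (p->faddr != TOY_ADDR_ANY) {
		b = toy_hash(p->faddr, p->lport, p->fport);
		p->next_exact = pi->exact[b];
		pi->exact[b] = p;
	} else {
		b = toy_hash(TOY_ADDR_ANY, p->lport, 0);
		p->next_wild = pi->wild[b];
		pi->wild[b] = p;
	}
}

/* Try the connected table first, then fall back to bound-only entries. */
static struct toy_pcb *
toy_lookup(struct toy_pcbinfo *pi, uint32_t faddr, uint16_t fport,
    uint32_t laddr, uint16_t lport)
{
	struct toy_pcb *p;

	assert(faddr != TOY_ADDR_ANY);
	for (p = pi->exact[toy_hash(faddr, lport, fport)]; p != NULL;
	    p = p->next_exact)
		if (p->faddr == faddr && p->fport == fport &&
		    p->laddr == laddr && p->lport == lport)
			return (p);
	for (p = pi->wild[toy_hash(TOY_ADDR_ANY, lport, 0)]; p != NULL;
	    p = p->next_wild)
		if (p->lport == lport &&
		    (p->laddr == TOY_ADDR_ANY || p->laddr == laddr))
			return (p);
	return (NULL);
}
```

In this sketch an entry never changes chains while its foreign address is
set, which is the property the split is meant to provide.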

I also made the "rehash" parameter of in(6)_pcbconnect() unused.  This
parameter seems confusing and it is simpler to let the inpcb code figure
out what to do using the existing INP_INHASHLIST flag.

UDP sockets pose a special problem since they can be connected and
disconnected multiple times during their lifecycle.  To handle this, the
patch plugs a hole in the inpcb structure and uses it to store an SMR
sequence number.  When an inpcb is disconnected - an operation which
requires the global PCB database hash lock - the write sequence number
is advanced, and in order to reconnect, the connecting thread must wait
for readers to drain before reusing the inpcb's hash chain linkage
pointers.
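
The disconnect/reconnect ordering above can be modelled in user space
roughly as follows. This is a single-threaded toy, not the kernel's
smr(9) implementation: struct toy_smr, struct toy_udp_pcb and every
toy_*() function are hypothetical, there are no atomics or memory
barriers, and toy_try_connect() reports failure where the real code
would wait. It only illustrates that a disconnect records an advanced
write sequence number, and that the linkage pointers may be reused once
no active reader predates that number.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NREADERS	4
#define TOY_SEQ_INVALID	0u	/* "no pending grace period" / "not reading" */

/* Hypothetical toy model of a sequence-based reclamation domain. */
struct toy_smr {
	unsigned wr_seq;		/* write sequence; must start at 1 */
	unsigned rd_seq[TOY_NREADERS];	/* sequence seen on entry, 0 if idle */
};

/* Hypothetical stand-in for a UDP inpcb. */
struct toy_udp_pcb {
	bool connected;
	unsigned smr_goal;	/* sequence readers must reach before reuse */
};

/* A reader entering a lookup records the current write sequence. */
static void
toy_smr_enter(struct toy_smr *s, int r)
{
	assert(r >= 0 && r < TOY_NREADERS);
	s->rd_seq[r] = s->wr_seq;
}

static void
toy_smr_exit(struct toy_smr *s, int r)
{
	assert(r >= 0 && r < TOY_NREADERS);
	s->rd_seq[r] = TOY_SEQ_INVALID;
}

/* Advance the write sequence and return the new value as a goal. */
static unsigned
toy_smr_advance(struct toy_smr *s)
{
	return (++s->wr_seq);
}

/* True once no active reader entered before the goal was issued. */
static bool
toy_smr_poll(const struct toy_smr *s, unsigned goal)
{
	for (int r = 0; r < TOY_NREADERS; r++)
		if (s->rd_seq[r] != TOY_SEQ_INVALID && s->rd_seq[r] < goal)
			return (false);
	return (true);
}

/* Disconnect: unlink (not modelled) and remember the grace period. */
static void
toy_disconnect(struct toy_smr *s, struct toy_udp_pcb *p)
{
	p->connected = false;
	p->smr_goal = toy_smr_advance(s);
}

/* Reconnect only after older readers have drained. */
static bool
toy_try_connect(struct toy_smr *s, struct toy_udp_pcb *p)
{
	if (p->smr_goal != TOY_SEQ_INVALID) {
		if (!toy_smr_poll(s, p->smr_goal))
			return (false);	/* the kernel would wait here */
		p->smr_goal = TOY_SEQ_INVALID;
	}
	p->connected = true;
	return (true);
}
```

A reader that enters after the disconnect observes the new sequence
number and so does not delay the reconnect in this model; only readers
that might still hold the old linkage do.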

raw_ip (ab)uses the hash table without using the corresponding
accessors.  Since there are now two hash tables, it arbitrarily uses the
"connected" table for all of its PCBs.  This will be addressed in some
way in the future.

inp iterators which specify a hash bucket will only visit connected
PCBs.  This is not really correct, but nothing in the tree uses that
functionality except raw_ip, which as mentioned above places all of its
PCBs in the "connected" table and so is unaffected.

Discussed with:	glebius
Tested by:	glebius
Sponsored by:	Klara, Inc.
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D38569
2023-04-20 12:13:06 -04:00
dest6.c
frag6.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
icmp6.c netinet: Disallow unspecified addresses in ICMP-embedded packets 2023-03-13 10:45:56 -04:00
icmp6.h
in6_cksum.c
in6_fib_algo.c
in6_fib.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in6_fib.h
in6_gif.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in6_ifattach.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in6_ifattach.h
in6_jail.c jail: convert several functions from int to bool 2023-03-14 21:05:33 -06:00
in6_mcast.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in6_pcb.c inpcb: Split PCB hash tables 2023-04-20 12:13:06 -04:00
in6_pcb.h inpcb: use family specific sockaddr argument for bind functions 2023-02-15 10:30:16 -08:00
in6_proto.c net.inet6.ip6.log_interval: use ppsratecheck(9) internally 2023-03-13 16:47:06 +00:00
in6_rmx.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in6_rss.c
in6_rss.h
in6_src.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
in6_var.h IfAPI: Hide the in6m_lookup_locked() implementation. 2023-01-31 15:02:14 -05:00
in6.c inet6: protect address manipulation with a lock 2023-03-30 08:46:38 +00:00
in6.h netinet: Remove the IP(V6)_RSS_LISTEN_BUCKET socket option 2023-02-28 15:57:21 -05:00
ip6_ecn.h
ip6_fastfwd.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
ip6_forward.c pf: distinguish forwarding and output cases for pf_refragment6() 2023-03-16 10:59:04 +01:00
ip6_gre.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
ip6_id.c
ip6_input.c inet6: Include if_private.h in one more netstack file 2023-03-24 10:25:35 -04:00
ip6_mroute.c net.inet6.ip6.log_interval: use ppsratecheck(9) internally 2023-03-13 16:47:06 +00:00
ip6_mroute.h
ip6_output.c netinet: Remove the IP(V6)_RSS_LISTEN_BUCKET socket option 2023-02-28 15:57:21 -05:00
ip6_var.h net.inet6.ip6.log_interval: use ppsratecheck(9) internally 2023-03-13 16:47:06 +00:00
ip6.h
ip_fw_nat64.h
ip_fw_nptv6.h
mld6_var.h mld6: use callout(9) directly instead of pr_slowtimo, pr_fasttimo 2022-08-17 11:50:31 -07:00
mld6.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
mld6.h
nd6_nbr.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
nd6_rtr.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
nd6.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
nd6.h IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
pim6_var.h
pim6.h
raw_ip6.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
raw_ip6.h
route6.c
scope6_var.h
scope6.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
sctp6_usrreq.c sctp: minor changes due to upstreaming of Glebs recent changes 2022-11-06 23:06:40 +01:00
sctp6_var.h sctp: minor changes due to upstreaming of Glebs recent changes 2022-11-06 23:06:40 +01:00
send.c IfAPI: Explicitly include <net/if_private.h> in netstack 2023-01-31 15:02:16 -05:00
send.h
tcp6_var.h netinet*: de-void control input IP protocol methods 2022-10-03 20:53:04 -07:00
udp6_usrreq.c udp: Fix a memory leak in udp6_send() 2023-03-14 11:58:02 -04:00
udp6_var.h netinet*: de-void control input IP protocol methods 2022-10-03 20:53:04 -07:00