freebsd-dev/sys
Mark Johnston fdb987bebd inpcb: Split PCB hash tables
Currently we use a single hash table per PCB database for connected and
bound PCBs.  Since we started using net_epoch to synchronize hash table
lookups, there's been a bug, noted in a comment above in_pcbrehash():
connecting a socket can cause an inpcb to move between hash chains, and
this can cause a concurrent lookup to follow the wrong linkage pointers.
I believe this could cause rare, spurious ECONNREFUSED errors in the
worst case.

Address the problem by introducing a second hash table and adding more
linkage pointers to struct inpcb.  Now the database has one table each
for connected and unconnected sockets.

When inserting an inpcb into the hash table, in_pcbinhash() now looks at
the foreign address of the inpcb to figure out which table to use.  This
ensures that queue linkage pointers are stable until the socket is
disconnected, so the problem described above goes away.  There is also a
small benefit in that in_pcblookup_*() can now search just one of the
two possible hash buckets.
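
As a rough illustration of the table selection described above, here is a minimal user-space sketch. It is not the kernel code: every name in it (toy_pcb, toy_pcbinfo, toy_inhash, toy_lookup_exact, toy_lookup_wild, the hash functions and the table size) is hypothetical, IPv4 addresses are plain integers, and locking and epoch protection are left out. It only shows that the foreign address alone picks the table, that each table has its own linkage pointer, and that each kind of lookup walks a single bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_SIZE 8		/* power of two, so masking works */
#define TOY_ADDR_ANY  0u	/* stands in for an unset address */

/* Hypothetical, heavily simplified stand-in for struct inpcb. */
struct toy_pcb {
	uint32_t laddr, faddr;
	uint16_t lport, fport;
	struct toy_pcb *next_exact;	/* linkage in the connected table */
	struct toy_pcb *next_wild;	/* linkage in the unconnected table */
};

/* Hypothetical stand-in for the PCB database: one table per state. */
struct toy_pcbinfo {
	struct toy_pcb *exact[TOY_HASH_SIZE];	/* connected PCBs */
	struct toy_pcb *wild[TOY_HASH_SIZE];	/* unconnected PCBs */
};

static unsigned
toy_hash_exact(uint32_t faddr, uint16_t lport, uint16_t fport)
{
	return ((faddr ^ (faddr >> 16) ^ lport ^ fport) & (TOY_HASH_SIZE - 1));
}

static unsigned
toy_hash_wild(uint16_t lport)
{
	return (lport & (TOY_HASH_SIZE - 1));
}

/* Insert a PCB; the foreign address alone decides which table is used. */
void
toy_inhash(struct toy_pcbinfo *pi, struct toy_pcb *inp)
{
	struct toy_pcb **head;

	assert(pi != NULL && inp != NULL);
	if (inp->faddr != TOY_ADDR_ANY) {
		head = &pi->exact[toy_hash_exact(inp->faddr, inp->lport,
		    inp->fport)];
		inp->next_exact = *head;
	} else {
		head = &pi->wild[toy_hash_wild(inp->lport)];
		inp->next_wild = *head;
	}
	*head = inp;
}

/* Full 4-tuple match: only the connected table is searched. */
struct toy_pcb *
toy_lookup_exact(struct toy_pcbinfo *pi, uint32_t faddr, uint16_t fport,
    uint32_t laddr, uint16_t lport)
{
	struct toy_pcb *inp;

	for (inp = pi->exact[toy_hash_exact(faddr, lport, fport)];
	    inp != NULL; inp = inp->next_exact) {
		if (inp->faddr == faddr && inp->fport == fport &&
		    inp->laddr == laddr && inp->lport == lport)
			return (inp);
	}
	return (NULL);
}

/* Listener-style match: only the unconnected table is searched. */
struct toy_pcb *
toy_lookup_wild(struct toy_pcbinfo *pi, uint32_t laddr, uint16_t lport)
{
	struct toy_pcb *inp;

	for (inp = pi->wild[toy_hash_wild(lport)]; inp != NULL;
	    inp = inp->next_wild) {
		if (inp->lport == lport &&
		    (inp->laddr == TOY_ADDR_ANY || inp->laddr == laddr))
			return (inp);
	}
	return (NULL);
}
```

In this sketch a PCB never changes chains while it sits in the connected table, because its foreign address, and therefore its bucket, stays fixed until it is removed.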

I also made the "rehash" parameter of in(6)_pcbconnect() unused.  This
parameter seems confusing and it is simpler to let the inpcb code figure
out what to do using the existing INP_INHASHLIST flag.

UDP sockets pose a special problem since they can be connected and
disconnected multiple times during their lifecycle.  To handle this, the
patch plugs a hole in the inpcb structure and uses it to store an SMR
sequence number.  When an inpcb is disconnected - an operation which
requires the global PCB database hash lock - the write sequence number
is advanced, and in order to reconnect, the connecting thread must wait
for readers to drain before reusing the inpcb's hash chain linkage
pointers.
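
The disconnect and reconnect rule above can be sketched with a toy sequence counter. This is a single-threaded model under loud assumptions, not the kernel's SMR implementation: all names (toy_smr, toy_udp_pcb, toy_smr_enter, toy_smr_poll, toy_try_connect and so on) are hypothetical, there are no atomics or memory barriers, sequence wraparound is ignored, and where the real connecting thread waits, this model simply reports failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_READERS 4

/* Hypothetical toy model of a sequence-based reclamation domain. */
struct toy_smr {
	uint32_t wr_seq;			/* write sequence, starts at 1 */
	uint32_t rd_seq[TOY_MAX_READERS];	/* 0 = idle, else wr_seq at entry */
};

/* Hypothetical stand-in for a UDP inpcb. */
struct toy_udp_pcb {
	bool	 connected;	/* linked into the connected table */
	uint32_t smr_seq;	/* sequence saved at last disconnect, 0 = none */
};

void
toy_smr_init(struct toy_smr *smr)
{
	smr->wr_seq = 1;
	for (int i = 0; i < TOY_MAX_READERS; i++)
		smr->rd_seq[i] = 0;
}

/* A lookup enters a read section; returns a slot index, or -1 if full. */
int
toy_smr_enter(struct toy_smr *smr)
{
	for (int i = 0; i < TOY_MAX_READERS; i++) {
		if (smr->rd_seq[i] == 0) {
			smr->rd_seq[i] = smr->wr_seq;
			return (i);
		}
	}
	return (-1);
}

void
toy_smr_exit(struct toy_smr *smr, int slot)
{
	assert(slot >= 0 && slot < TOY_MAX_READERS);
	smr->rd_seq[slot] = 0;
}

/* True once no active reader entered before the goal sequence was issued. */
bool
toy_smr_poll(const struct toy_smr *smr, uint32_t goal)
{
	for (int i = 0; i < TOY_MAX_READERS; i++) {
		if (smr->rd_seq[i] != 0 && smr->rd_seq[i] < goal)
			return (false);
	}
	return (true);
}

/* Disconnect: unlink, advance the write sequence, remember the result. */
void
toy_disconnect(struct toy_smr *smr, struct toy_udp_pcb *inp)
{
	inp->connected = false;
	inp->smr_seq = ++smr->wr_seq;
}

/*
 * Reconnect: the linkage may be reused only after older readers drain.
 * The real code waits here; this model returns false instead.
 */
bool
toy_try_connect(struct toy_smr *smr, struct toy_udp_pcb *inp)
{
	if (inp->connected || !toy_smr_poll(smr, inp->smr_seq))
		return (false);
	inp->connected = true;
	return (true);
}
```

A reader that entered before the disconnect may still be walking the old chain, so it blocks the reconnect; a reader that entered afterwards cannot have seen the removed PCB and does not.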

raw_ip (ab)uses the hash table without using the corresponding
accessors.  Since there are now two hash tables, it arbitrarily uses the
"connected" table for all of its PCBs.  This will be addressed in some
way in the future.

inp iterators which specify a hash bucket will only visit connected
PCBs.  This is not really correct, but nothing in the tree uses that
functionality except raw_ip, which as mentioned above places all of its
PCBs in the "connected" table and so is unaffected.

Discussed with:	glebius
Tested by:	glebius
Sponsored by:	Klara, Inc.
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D38569
2023-04-20 12:13:06 -04:00
amd64 x86: initialize use_xsave once 2023-04-19 02:22:28 +03:00
arm amd64: fix PKRU and swapout interaction 2023-04-15 02:53:59 +03:00
arm64 amd64: fix PKRU and swapout interaction 2023-04-15 02:53:59 +03:00
bsm
cam Revert "cam: fix up world compilation after previous" 2023-04-15 18:25:55 -06:00
cddl dtrace: handle NOP instructions in the riscv invop handler 2023-04-10 12:14:11 -04:00
compat LinuxKPI: 802.11: improve assertion and tkip code 2023-04-20 16:07:50 +00:00
conf arm64: Use FULLKERNEL instead of .ALLSRC in .bin target 2023-04-18 11:41:57 -04:00
contrib iwlwifi: quieten more compiler warnings 2023-04-20 16:07:05 +00:00
crypto OpenSSL: Regen an assembly file for arm 2023-03-21 15:13:51 -04:00
ddb ddb: ansify 2023-02-08 00:09:23 +00:00
dev ichiic: use bool for one-bit wide bit-fields 2023-04-19 22:25:50 +02:00
dts arm64/rockchip: Remove rk3328-dwc3 overlays 2022-11-16 11:58:32 +01:00
fs tmpfs: add missing vop_fplookup ops to tmpfs_fifoop_entries 2023-04-18 18:06:30 +00:00
gdb
geom geom: use bool for one-bit wide bit-field 2023-04-17 15:43:00 -04:00
gnu bwn: eliminate dead writes in BWN_GPL_PHY 2022-05-04 09:32:59 -04:00
i386 x86: initialize use_xsave once 2023-04-19 02:22:28 +03:00
isa
kern vfs cache: fix vfs.cache.stats.* name typos 2023-04-19 18:47:38 +00:00
kgssapi nfsd: Enable the NFSD_VNET vnet front end macros 2023-02-18 14:59:36 -08:00
libkern ashldi3: Use C89-style function definition 2022-11-27 13:23:25 -07:00
modules iwlwifi: rtw88: rtw89: fix gcc warnings 2023-04-19 12:21:40 +00:00
net ifnet: factor out interface renaming into a separate function. 2023-04-20 10:23:37 +00:00
net80211 net80211: Remove double words in source code comments 2023-04-18 07:14:50 +02:00
netgraph ng_atmllc: remove 2023-03-09 18:04:21 +00:00
netinet inpcb: Split PCB hash tables 2023-04-20 12:13:06 -04:00
netinet6 inpcb: Split PCB hash tables 2023-04-20 12:13:06 -04:00
netipsec ipsec: only update lastused when it changes 2023-02-16 07:33:51 +00:00
netlink netlink: sync interface IFLA attributes 2023-04-18 12:34:05 +00:00
netpfil pf: change pf_rules_lock and pf_ioctl_lock to per-vnet locks 2023-04-19 09:50:52 +02:00
netsmb smb_smb_treedisconnect: eliminate write only variable mbp 2022-04-04 22:30:57 -06:00
nfs Allow any user to read the NFS stats, for example with nfsstat(1). 2022-12-01 22:21:14 -07:00
nfsclient
nfsserver nfs: Cleanup dead files 2021-03-17 06:16:31 +11:00
nlm nlm: only access refcounts using dedicated primitives 2022-11-24 19:46:43 +00:00
ofed ofed: Fix a logic inversion from IfAPI conversion 2023-04-19 11:56:25 -04:00
opencrypto Complete removal of opt_compat.h 2023-02-13 19:07:38 +03:00
powerpc amd64: fix PKRU and swapout interaction 2023-04-15 02:53:59 +03:00
riscv riscv: save the thread pointer in both modes 2023-04-17 09:49:52 -04:00
rpc svc_rpcsec_gss.c: Separate out the non-vnet initialization 2023-03-01 15:29:25 -08:00
security mac: Honor order when registering MAC modules. 2023-04-18 15:36:27 -04:00
sys umtx: allow to configure minimal timeout (in nanoseconds) 2023-04-19 02:22:28 +03:00
teken teken: color #3 is yellow not brown - use TC_YELLOW as the name 2022-03-12 09:17:29 -05:00
tests tests: make ktest build on ppc. 2023-04-17 13:47:07 +00:00
tools vfs: validate that vop vectors provide all or none fplookup vops 2023-04-06 15:20:41 +00:00
ufs vn_lock_pair(): allow to request shared locking 2023-04-08 01:58:26 +03:00
vm amd64: fix PKRU and swapout interaction 2023-04-15 02:53:59 +03:00
x86 xen: move common variables off of sys/x86/xen/hvm.c 2023-04-14 15:59:11 +02:00
xdr xdr: ansify 2023-02-13 18:37:31 +00:00
xen xen: move common variables off of sys/x86/xen/hvm.c 2023-04-14 15:59:11 +02:00
Makefile Remove dead code in the cscope target 2022-11-11 15:53:57 +00:00
README.md note that some arch independent code can live in dev (e.g. SMBios) 2023-03-03 01:54:07 -08:00

FreeBSD Kernel Source:

This directory contains the source files and build glue that make up the FreeBSD kernel and its modules, including both original and contributed software.

Kernel configuration files are located in the conf/ subdirectory of each architecture. GENERIC is the configuration used in release builds. NOTES contains documentation of all possible entries. LINT is a compile-only configuration used to maximize build coverage and detect regressions.

Source Roadmap:

| Directory | Description |
| --------- | ----------- |
| amd64 | AMD64 (64-bit x86) architecture support |
| arm | 32-bit ARM architecture support |
| arm64 | 64-bit ARM (AArch64) architecture support |
| cam | Common Access Method storage subsystem - cam(4) and ctl(4) |
| cddl | CDDL-licensed optional sources such as DTrace |
| conf | kernel build glue |
| compat | Linux compatibility layer, FreeBSD 32-bit compatibility |
| contrib | 3rd-party imported software such as OpenZFS |
| crypto | crypto drivers |
| ddb | interactive kernel debugger - ddb(4) |
| fs | most filesystems, excluding UFS, NFS, and ZFS |
| dev | device drivers and other arch independent code |
| gdb | kernel remote GDB stub - gdb(4) |
| geom | GEOM framework - geom(4) |
| i386 | i386 (32-bit x86) architecture support |
| kern | main part of the kernel |
| libkern | libc-like and other support functions for kernel use |
| modules | kernel module infrastructure |
| net | core networking code |
| net80211 | wireless networking (IEEE 802.11) - net80211(4) |
| netgraph | graph-based networking subsystem - netgraph(4) |
| netinet | IPv4 protocol implementation - inet(4) |
| netinet6 | IPv6 protocol implementation - inet6(4) |
| netipsec | IPsec protocol implementation - ipsec(4) |
| netpfil | packet filters - ipfw(4), pf(4), and ipfilter(4) |
| opencrypto | OpenCrypto framework - crypto(7) |
| powerpc | PowerPC/POWER (32 and 64-bit) architecture support |
| riscv | 64-bit RISC-V architecture support |
| security | security facilities - audit(4) and mac(4) |
| sys | kernel headers |
| tests | kernel unit tests |
| ufs | Unix File System - ffs(7) |
| vm | virtual memory system |
| x86 | code shared by AMD64 and i386 architectures |