freebsd-dev/sys/net
Luigi Rizzo de240d1013 merge code from ipfw3-head to reduce contention on the ipfw lock
and remove all O(N) sequences from kernel critical sections in ipfw.

In detail:

  1. introduce an IPFW_UH_LOCK to arbitrate requests from
     the upper half of the kernel. Some things, such as 'ipfw show',
     can be done holding this lock in read mode, whereas insert and
     delete require IPFW_UH_WLOCK.

  2. introduce a mapping structure to keep rules together. This replaces
     the 'next' chain currently used in ipfw rules. At the moment
     the map is a simple array (sorted by rule number and then rule_id),
     so we can find a rule quickly instead of having to scan the list.
     This reduces many expensive lookups from O(N) to O(log N).

  3. when an expensive operation (such as insert or delete) is done
     by userland, we grab IPFW_UH_WLOCK, create a new copy of the map
     without blocking the bottom half of the kernel, then acquire
     IPFW_WLOCK and quickly update pointers to the map and related info.
     After dropping IPFW_LOCK we can then continue the cleanup protected
     by IPFW_UH_LOCK. So userland still costs O(N) but the kernel side
     is only blocked for O(1).

  4. do not pass pointers to rules through dummynet, netgraph, divert etc,
     but rather pass a <slot, chain_id, rulenum, rule_id> tuple.
     We validate the slot index (in the array of #2) with chain_id,
     and if successful do an O(1) dereference; otherwise, we can find
     the rule in O(log N) through <rulenum, rule_id>.
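
The lookups in items 2 and 4 can be sketched in plain C as follows. This is
only an illustration under stated assumptions, not the code from the commit:
the struct layouts and the names find_slot() and resolve() are hypothetical,
and returning the first rule at or after the key when the exact rule is gone
is a choice made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified rule: only the two sort keys are kept. */
struct rule {
	uint16_t rulenum;
	uint32_t id;
};

/* Hypothetical chain: sorted array of rule pointers plus a generation id. */
struct chain {
	struct rule **map;	/* sorted by rulenum, then by id */
	int n_rules;
	uint32_t id;		/* assumed to change whenever map is replaced */
};

/* Reference carried with a packet instead of a raw rule pointer. */
struct rule_ref {
	uint32_t slot;
	uint32_t chain_id;
	uint16_t rulenum;
	uint32_t rule_id;
};

/* Binary search: first slot whose <rulenum, id> is >= the key, O(log N). */
static int
find_slot(const struct chain *ch, uint16_t rulenum, uint32_t id)
{
	int lo = 0, hi = ch->n_rules;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		const struct rule *r = ch->map[mid];

		if (r->rulenum < rulenum ||
		    (r->rulenum == rulenum && r->id < id))
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo);
}

/* Resolve a reference: O(1) if the map is unchanged, else O(log N). */
static struct rule *
resolve(const struct chain *ch, const struct rule_ref *ref)
{
	int slot;

	/* Fast path: the slot is trusted only if chain_id still matches. */
	if (ref->chain_id == ch->id && ref->slot < (uint32_t)ch->n_rules)
		return (ch->map[ref->slot]);

	/* Slow path: the map was replaced, search by <rulenum, rule_id>. */
	slot = find_slot(ch, ref->rulenum, ref->rule_id);
	return (slot < ch->n_rules ? ch->map[slot] : NULL);
}
```

A stale reference never dereferences the old array: it only costs one extra
binary search on the current map.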

None of the above changes the userland/kernel ABI, though there
are some disgusting casts between pointers and uint32_t.

Operation costs now are as follows:

  Function				Old	Now	  Planned
-------------------------------------------------------------------
  + skipto X, non cached		O(N)	O(log N)
  + skipto X, cached			O(1)	O(1)
XXX dynamic rule lookup			O(1)	O(log N)  O(1)
  + skipto tablearg			O(N)	O(1)
  + reinject, non cached		O(N)	O(log N)
  + reinject, cached			O(1)	O(1)
  + kernel blocked during setsockopt()	O(N)	O(1)
-------------------------------------------------------------------

The only (very small) regression is on dynamic rule lookup, and this will
be fixed in a day or two, without changing the userland/kernel ABI.

Supported by: Valeria Paoli
MFC after:	1 month
2009-12-22 19:01:47 +00:00
bpf_buffer.c Always embed pointer to BPF JIT function in BPF descriptor 2009-08-12 17:28:53 +00:00
bpf_buffer.h Introduce support for zero-copy BPF buffering, which reduces the 2008-03-24 13:49:17 +00:00
bpf_filter.c Fix the last missing parentheses for a return statement in bpf_filter.c. 2008-08-29 20:00:55 +00:00
bpf_jitter.c General style cleanup, no functional change. 2009-11-20 21:12:40 +00:00
bpf_jitter.h - Allocate scratch memory on stack instead of pre-allocating it with 2009-11-20 18:49:20 +00:00
bpf_zerocopy.c Always embed pointer to BPF JIT function in BPF descriptor 2009-08-12 17:28:53 +00:00
bpf_zerocopy.h Make sure we are clearing the ZBUF_FLAG_IMMUTABLE any time a free buffer 2008-07-05 20:11:28 +00:00
bpf.c Remove unneeded blank line from bpf_drvinit(). 2009-10-23 17:26:29 +00:00
bpf.h Sync DLTs with latest libpcap version. 2009-04-02 13:02:12 +00:00
bpfdesc.h Always embed pointer to BPF JIT function in BPF descriptor 2009-08-12 17:28:53 +00:00
bridgestp.c Rework global locks for interface list and index management, correcting 2009-08-23 20:40:19 +00:00
bridgestp.h Fix spelling. 2007-12-09 20:47:12 +00:00
ethernet.h Change if_output to take a struct route as its fourth argument in order 2009-04-16 20:30:28 +00:00
fddi.h Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
firewire.h Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
flowtable.c Verify "smp_started" is true before calling 2009-10-22 00:32:01 +00:00
flowtable.h The flow-table associates TCP/UDP flows and IP destinations with 2009-10-01 20:32:29 +00:00
ieee8023ad_lacp.c Use the flowid if its available for selecting the tx port. 2009-04-30 14:25:44 +00:00
ieee8023ad_lacp.h Remove extra semicolons. 2008-03-17 01:26:44 +00:00
if_arc.h Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
if_arcsubr.c Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
if_arp.h Add ARP statistics to the kernel and netstat. 2009-09-03 21:10:57 +00:00
if_atm.h Change if_output to take a struct route as its fourth argument in order 2009-04-16 20:30:28 +00:00
if_atmsubr.c Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC 2009-06-05 14:55:22 +00:00
if_bridge.c merge code from ipfw3-head to reduce contention on the ipfw lock 2009-12-22 19:01:47 +00:00
if_bridgevar.h Add an option to limit the number of source MACs that can be behind a bridge 2007-11-04 08:32:27 +00:00
if_clone.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_clone.h Introduce and use a sysinit-based initialization scheme for virtual 2009-07-23 20:46:49 +00:00
if_dead.c Remove if_timer/if_watchdog now that they are no longer used. The space 2009-11-30 21:25:57 +00:00
if_disc.c Change if_output to take a struct route as its fourth argument in order 2009-04-16 20:30:28 +00:00
if_dl.h
if_edsc.c
if_ef.c Take a step towards removing if_watchdog/if_timer. Don't explicitly set 2009-11-06 14:55:01 +00:00
if_enc.c Unbreak the VIMAGE build with IPSEC, broken with r197952 by 2009-10-14 11:55:55 +00:00
if_enc.h Increase statistic counters for enc0 interface when enabled 2008-08-12 09:05:01 +00:00
if_epair.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_ethersubr.c merge code from ipfw3-head to reduce contention on the ipfw lock 2009-12-22 19:01:47 +00:00
if_faith.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_fddisubr.c Break at_ifawithnet() into two variants: 2009-06-24 10:32:44 +00:00
if_fwsubr.c Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
if_gif.c Check pointer for NULL before dereferencing it, not after. 2009-10-22 06:17:04 +00:00
if_gif.h Remove unused VNET_SET() and related macros; only VNET_GET() is 2009-07-16 21:13:04 +00:00
if_gre.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_gre.h Add support for the optional key in the GRE header. 2008-06-20 17:26:34 +00:00
if_iso88025subr.c Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
if_lagg.c Use the flowid if its available for selecting the tx port. 2009-04-30 14:25:44 +00:00
if_lagg.h Change if_output to take a struct route as its fourth argument in order 2009-04-16 20:30:28 +00:00
if_llatbl.c Style fix - break too long a line in two. 2009-09-18 09:03:23 +00:00
if_llatbl.h Use locks specific to the lltable code, rather than borrow the ifnet 2009-08-25 09:52:38 +00:00
if_llc.h
if_loop.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_media.c
if_media.h Implementation of the upcoming Wireless Mesh standard, 802.11s, on the 2009-07-11 15:02:45 +00:00
if_mib.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_mib.h
if_sppp.h
if_spppfr.c
if_spppsubr.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_stf.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_stf.h
if_tap.c Change the type of uio_resid member of struct uio from int to ssize_t. 2009-06-25 18:46:30 +00:00
if_tap.h Add new TAPGIFNAME tap(4) character device ioctl. This is a 2008-09-08 22:43:55 +00:00
if_tapvar.h
if_tun.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
if_tun.h
if_types.h Remove IPX over IP tunneling support, which allows IPX routing over IP 2007-06-13 14:01:43 +00:00
if_var.h Remove commented out prototype for ifinit(). This prototype has been 2009-12-21 20:09:19 +00:00
if_vlan_var.h
if_vlan.c Compare pointer with NULL, not 0. 2009-09-09 03:36:43 +00:00
if.c Remove if_timer/if_watchdog now that they are no longer used. The space 2009-11-30 21:25:57 +00:00
if.h Revert revision 199201 for now as it has introduced a kernel vulnerability 2009-11-12 19:02:10 +00:00
iso88025.h Switch cmd argument to u_long. This matches what if_ethersubr.c does and 2009-06-21 10:29:31 +00:00
netisr.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
netisr.h Update epair(4) to the new netisr implementation and polish 2009-07-26 12:20:07 +00:00
pfil.c Clean up comments, white space, and style in pfil.c (especially new VNET 2009-10-19 15:19:14 +00:00
pfil.h Remove unused pfil_flags field in packet_filter_hook. 2009-10-18 22:54:09 +00:00
pfkeyv2.h Added support for NAT-Traversal (RFC 3948) in IPsec stack. 2009-06-12 15:44:35 +00:00
ppp_defs.h
radix_mpath.c Extend route command: 2009-04-14 23:05:36 +00:00
radix_mpath.h When RADIX_MPATH is enabled, the route selection is not rotating 2008-05-30 09:34:35 +00:00
radix.c Move the scan for max_keylen into route.c::route_init(), 2009-12-14 20:12:51 +00:00
radix.h Move the scan for max_keylen into route.c::route_init(), 2009-12-14 20:12:51 +00:00
raw_cb.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
raw_cb.h Remove unused VNET_SET() and related macros; only VNET_GET() is 2009-07-16 21:13:04 +00:00
raw_usrreq.c Merge the remainder of kern_vimage.c and vimage.h into vnet.c and 2009-08-01 19:26:27 +00:00
route.c Move the scan for max_keylen into route.c::route_init(), 2009-12-14 20:12:51 +00:00
route.h Add arp_update_event. This replaces route_arp_update_event, which 2009-09-08 21:17:17 +00:00
rtsock.c Throughout the network stack we have a few places of 2009-12-13 13:57:32 +00:00
slcompress.c
slcompress.h
vnet.c Introduce a separate sx lock for protecting lists of vnet sysinit 2009-08-28 22:30:55 +00:00
vnet.h Make VNET_DEBUG a standalone compile-time option, i.e. decouple it from 2009-08-14 22:41:39 +00:00
zlib.c
zlib.h