freebsd kernel with SKQ
csjp e58c2855d8
Over the past couple of years, there have been a number of reports linking
the use of divert sockets to deadlocks.  A number of LORs (lock order
reversals) have been reported between divert and other network subsystems,
including IPSEC, Pfil, multicast, ipfw and others.  Other deadlocks could
occur because of recursive entry into the IP stack.  This change should take
care of most, if not all, of these issues.

A summary of the changes follows:

- We disallow multicast operations on divert sockets.  It really doesn't make
  semantic sense to allow this, since typically you would set multicast
  parameters on multicast end points.

  NOTE: As a part of this change, we actually disallow multicast options on
  any socket that IS a divert socket OR IS NOT a SOCK_RAW or SOCK_DGRAM socket.

- We check to see if any socket options have been specified on the socket,
  and if there are (which is very uncommon and probably doesn't make sense
  to support), we duplicate the mbuf carrying the options.

- We then drop the INP/INFO locks over the call to ip_output().  It should be
  noted that since we no longer support multicast operations on divert sockets
  and we have duplicated any socket options, we no longer need the reference
  to the pcb to be coherent.

- Finally, we replaced the direct call to ip_input() with netisr queuing.
  This should remove the recursive entry into the IP stack from divert.
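
The NOTE in the first item packs two conditions into one sentence.  A minimal
sketch of how it could be read as a predicate follows, assuming "IS NOT a
SOCK_RAW or SOCK_DGRAM" means "is neither of the two".  struct sock_model and
mcast_opts_allowed() are hypothetical names used for illustration only; they
are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>		/* SOCK_RAW, SOCK_DGRAM, SOCK_STREAM */

/* Hypothetical model of the two socket attributes the rule looks at. */
struct sock_model {
	int	type;		/* SOCK_RAW, SOCK_DGRAM, SOCK_STREAM, ... */
	bool	is_divert;	/* true if this is a divert socket */
};

/*
 * Return true when multicast options would be accepted under the rule
 * "reject if the socket IS a divert socket OR IS NOT raw/datagram".
 */
static bool
mcast_opts_allowed(const struct sock_model *so)
{
	assert(so != NULL);

	/* Divert sockets are themselves raw sockets, so test this first. */
	if (so->is_divert)
		return (false);

	return (so->type == SOCK_RAW || so->type == SOCK_DGRAM);
}
```

Under this reading, an ordinary datagram or raw socket passes, while a divert
socket is rejected even though its type is SOCK_RAW.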

By dropping the locks over the call to ip_output() we eliminate all the lock
ordering issues above.  By switching over to netisr on the inbound path,
we can no longer recursively enter the ip_input() code via divert.
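
The "duplicate the options, then drop the locks" pattern can be illustrated
with a small user-space model.  This is only a sketch under stated
assumptions: the lock is a single-threaded toy that merely records whether it
is held, and struct pcb_model, send_unlocked() and probe_output() are
hypothetical names, not the kernel's INP/INFO locks or ip_output().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy lock: records whether it is held (single-threaded model only). */
struct lock_model {
	bool	held;
};

static void
lock_acquire(struct lock_model *l)
{
	assert(!l->held);
	l->held = true;
}

static void
lock_release(struct lock_model *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical stand-in for a pcb that carries socket options. */
struct pcb_model {
	struct lock_model lock;
	unsigned char	*opts;
	size_t		 optlen;
};

typedef int (*output_fn)(struct pcb_model *, const unsigned char *, size_t);

/*
 * Copy the options while the lock is held, release the lock, and only then
 * call the output routine with the private copy.
 */
static int
send_unlocked(struct pcb_model *pcb, output_fn output)
{
	unsigned char *copy = NULL;
	size_t len;
	int error;

	lock_acquire(&pcb->lock);
	len = pcb->optlen;
	if (len != 0) {
		copy = malloc(len);
		if (copy == NULL) {
			lock_release(&pcb->lock);
			return (-1);
		}
		memcpy(copy, pcb->opts, len);
	}
	lock_release(&pcb->lock);

	/* No lock is held here, so the callee may take its own locks. */
	error = output(pcb, copy, len);
	free(copy);
	return (error);
}

/* Sample output routine: returns 0 only if the caller dropped the lock. */
static int
probe_output(struct pcb_model *pcb, const unsigned char *opts, size_t optlen)
{
	(void)opts;
	(void)optlen;
	return (pcb->lock.held ? 1 : 0);
}
```

Because the output routine only ever sees the private copy, the pcb may change
after the lock is released without affecting the call in progress.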

I have tested this change by using the following command:

ipfwpcap -r 8000 - | tcpdump -r - -nn -v

This should exercise the input and re-injection (outbound) path, which is
very similar to the workload performed by natd(8).  Additionally, I have
run some OSPF daemons which rely heavily on raw sockets and multicast.
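
Re-injection is the step where recursive entry used to be possible.  The
following toy model, in the spirit of the netisr change described above, shows
why queuing a re-injected packet keeps the input handler from nesting.  It is
a sketch only: struct pkt_queue, enqueue(), input_model() and dispatch() are
hypothetical names, and a packet is reduced to a counter of remaining
re-injections.

```c
#include <assert.h>
#include <stddef.h>

#define	QLEN	8

/* Fixed-size FIFO standing in for a netisr-style work queue. */
struct pkt_queue {
	int	items[QLEN];
	size_t	head;
	size_t	count;
};

static int depth;	/* current nesting of input_model() calls */
static int max_depth;	/* deepest nesting observed so far */

/* Append a packet; returns -1 (packet dropped) when the queue is full. */
static int
enqueue(struct pkt_queue *q, int pkt)
{
	if (q->count == QLEN)
		return (-1);
	q->items[(q->head + q->count) % QLEN] = pkt;
	q->count++;
	return (0);
}

/*
 * Model input handler.  A packet with re-injections left is queued again
 * instead of being handed straight back to input_model().
 */
static void
input_model(struct pkt_queue *q, int reinjects_left)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	if (reinjects_left > 0)
		(void)enqueue(q, reinjects_left - 1);
	depth--;
}

/* Drain the queue; returns the number of packets handled. */
static int
dispatch(struct pkt_queue *q)
{
	int handled = 0;

	while (q->count > 0) {
		int pkt = q->items[q->head];

		q->head = (q->head + 1) % QLEN;
		q->count--;
		input_model(q, pkt);
		handled++;
	}
	return (handled);
}
```

However many times a packet is re-injected, each pass through input_model()
starts from dispatch(), so the handler never runs on top of itself.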

Approved by:	re@ (kensmith)
MFC after:	1 month
LOR:		163
LOR:		181
LOR:		202
LOR:		203
Discussed with:	julian, andre et al (on freebsd-net)
In collaboration with:	bms [1], rwatson [2]

[1] bms helped out with the multicast decisions
[2] rwatson submitted the original netisr patches and came up with some
    of the original ideas on how to combat this issue.
2007-08-06 22:06:36 +00:00
bin Take care that the input to setenv() may actually be a pointer straight 2007-07-06 04:04:58 +00:00
cddl - Reduce number of atomic operations needed to be implemented in asm by 2007-06-08 12:35:47 +00:00
compat/opensolaris Use provider's ident to handle situations when disks are moved around 2007-05-06 01:39:39 +00:00
contrib Restore historical more(1) behavior (inhibit ti/te processing) which 2007-08-04 13:16:09 +00:00
crypto s/X11R6/local/g 2007-05-24 22:04:07 +00:00
etc 1. Move the disable-empty-zone stuff down below the first 25 lines so 2007-08-02 09:18:53 +00:00
games Remove duplicate. Was that a bug? :-) 2007-06-12 09:20:31 +00:00
gnu - Bump share library version which were missed in last bump 2007-06-18 18:47:54 +00:00
include declare struct tftphdr and embedded union as beeing packed, which is 2007-08-01 11:59:09 +00:00
kerberos5 Fix generator glue to only expose extern struct units %s_units[] is 2007-05-19 03:29:37 +00:00
lib Improve error handling in libdisk while parsing the kern.geom.conftxt sysctl. 2007-08-05 16:55:40 +00:00
libexec Stop mentioning /usr/X11R6. 2007-07-24 06:41:07 +00:00
release New release notes: if_bridge(4) private ports, wlandebug(8). 2007-08-03 02:26:18 +00:00
rescue Disconnect netatm from the build as it is not MPSAFE and relies on 2007-07-14 21:49:24 +00:00
sbin Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL. 2007-08-05 16:16:15 +00:00
secure - Bump share library version which were missed in last bump 2007-06-18 18:47:54 +00:00
share Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL. 2007-08-05 16:16:15 +00:00
sys Over the past couple of years, there have been a number of reports relating 2007-08-06 22:06:36 +00:00
tools Add regression tests for flopen(3). 2007-08-03 11:29:49 +00:00
usr.bin Fix for PR bin/115033. This corrects a crash when long options 2007-08-01 03:15:35 +00:00
usr.sbin The call to init_file() needs to be moved outside the loop in statd.c, 2007-08-05 16:33:06 +00:00
COPYRIGHT Welcome to 2007 2006-12-31 16:35:29 +00:00
LOCKS Document commit constraints for RELENG_6_*. 2006-01-13 06:51:43 +00:00
MAINTAINERS Update the maintainer id for em driver. 2007-05-23 21:47:19 +00:00
Makefile Expose all of {check,delete}-old{,-dirs,-files,-libs}. 2007-05-16 08:46:35 +00:00
Makefile.inc1 Add sed(1) to cross tools. We do want newly built version during 2007-07-10 10:17:32 +00:00
ObsoleteFiles.inc Remove the last entries to fast_ipsec. 2007-08-02 08:04:48 +00:00
README Simply running ``make world'' will bomb unless you dig up the 2006-06-07 03:33:48 +00:00
UPDATING Fix typo. 2007-07-09 01:13:00 +00:00

This is the top level of the FreeBSD source directory.  This file
was last revised on:
$FreeBSD$

For copyright information, please see the file COPYRIGHT in this
directory (additional copyright information also exists for some
sources in this tree - please see the specific source directories for
more information).

The Makefile in this directory supports a number of targets for
building components (or all) of the FreeBSD source tree, the most
commonly used one being ``world'', which rebuilds and installs
everything in the FreeBSD system from the source tree except the
kernel, the kernel-modules and the contents of /etc.  The ``world''
target should only be used in cases where the source tree has not
changed from the currently running version.  See:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
for more information, including setting make(1) variables.

The ``buildkernel'' and ``installkernel'' targets build and install
the kernel and the modules (see below).  Please see the top of
the Makefile in this directory for more information on the
standard build targets and compile-time flags.

Building a kernel is a somewhat more involved process; documentation
for it can be found at:
   http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html
and in the config(8) man page.
Note: If you want to build and install the kernel with the
``buildkernel'' and ``installkernel'' targets, you might need to build
world first.  More information is available in the handbook.

The sample kernel configuration files reside in the sys/<arch>/conf
sub-directory (assuming that you've installed the kernel sources), the
file named GENERIC being the one used to build your initial installation
kernel.  The file NOTES contains entries and documentation for all possible
devices, not just those commonly used.  It is the successor of the ancient
LINT file but, in contrast to LINT, it cannot be built as a kernel; it is a
pure reference and documentation file.


Source Roadmap:
---------------
bin		System/user commands.

contrib		Packages contributed by 3rd parties.

crypto		Cryptography stuff (see crypto/README).

etc		Template files for /etc.

games		Amusements.

gnu		Various commands and libraries under the GNU General Public License.
		Please see gnu/COPYING* for more information.

include		System include files.

kerberos5	Kerberos5 (Heimdal) package.

lib		System libraries.

libexec		System daemons.

release		Release building Makefile & associated tools.

rescue		Build system for statically linked /rescue utilities.

sbin		System commands.

secure		Cryptographic libraries and commands.

share		Shared resources.

sys		Kernel sources.

tools		Utilities for regression testing and miscellaneous tasks.

usr.bin		User commands.

usr.sbin	System administration commands.


For information on synchronizing your source tree with one or more of
the FreeBSD Project's development branches, please see:

  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/synching.html