FreeBSD src
Go to file
Attilio Rao 3d7acbbabf Fix several callout migration races:
- Problem1:
   Hypothesis: thread1 is doing a callout_reset_on(), within his
   callout handler, willing to implicitly or explicitly migrate the
   callout.  thread2 is draining the callout.

   Thesys:
   * thread1 calls callout_lock() and locks the old callout cpu
   * thread1 performs the checks in the first path of the
     callout_reset_on()
   * thread1 hits this codepiece:
       /*
        * If the lock must migrate we have to check the state again as
        * we can't hold both the new and old locks simultaneously.
        */
       if (c->c_cpu != cpu) {
               c->c_cpu = cpu;
               CC_UNLOCK(cc);
               goto retry;
       }

     which means it will drop the lock and 'retry'
   * thread2 will callout_lock() and locks the new callout cpu.
     thread1 spins on the new lock and will not keep going for the
     moment.
   * thread2 checks that the callout is not pending (as callout is
     currently running) and that it is not on cc->cc_curr (because cc
     now refers to the new callout and the callout is running on the
     old callout cpu) thus it thinks it is done and returns.
   * thread1  will now acquire the lock and then adds the callout
     to the new callout cpu queue

   That seems an obvious race as callout_stop() falsely reports
   the callout stopped or worse, callout_drain() falsely returns
   while the callout is still in use.
 - Solution1:
   Fixing this problem would require, in general, to lock both
   callout cpus at once while switching the c_cpu field and avoid
   cyclic deadlocks between callout cpus locks.
   The concept of CPUBLOCK is then introduced (working more or less
   like the blocked_lock for thread_lock() function) meaning:
   "in callout_lock(), spin until the c->c_cpu is not different from
   CPUBLOCK". That way the "original" callout cpu, referred to the
   above mentioned code snippet, will remain blocked until the lock
   handover is over critical path will remain covered.

 - Problem2:
   Having the callout currently executed on a specific callout cpu
   and contemporary pending on another callout cpu (as it can happen
   with current code) breaks, at least, the assumption callout_drain()
   returns just once the callout cannot be referenced anymore.
 - Solution2:
   Callout migration is deferred if the current callout is already
   under execution.
   The best place to do that is in softclock() and new members are
   added to the callout cpu structure in order to specify a pending
   migration is requested. That is necessary because the callout
   cannot be trusted (not freed) the 100% of times after the execution
   of the callout handler.
   CPUBLOCK will prevent, in the "deferred migration" case, that the
   callout gets freed in this case, stopping any callout_stop() and
   callout_drain() possible activity until the migration is
   actually performed.

 - Problem3:
   There is a further race in callout_drain().
   In order to avoid a race between sleepqueue lock and callout cpu
   spinlock, in _callout_stop_safe(), the callout cpu lock is dropped,
   the sleepqueue lock is acquired and a new callout cpu lookup is
   performed.  Note that the channel used for locking the sleepqueue is
   obtained from the "current" callout cpu (&cc->cc_waiting).
   If the callout migrated in the meanwhile, callout_drain() will end up
   using the wrong wchan for the sleepqueue (the locked one will be the
   older, while the new one will not really be locked) leading to a
   lock leak and a race access to sleepqueue.
 - Solution3:
   It is enough to check if a migration happened between the operation
   of acquiring the sleepqueue lock and the new callout cpu lock and
   eventually unwind all those and try again.

This problems can lead to deathly races on moderate (4-ways) SMP
environment, leading to easy panic or deadlocks.
The 24-ways of the reporter, could easilly panic, with completely
normal workload, almost daily.
gianni@ kindly wrote the following prof-of-concept which can
panic a FreeBSD machine in less than one hour, in smaller SMP:
http://www.freebsd.org/~attilio/callout/test.c

Reported by:	Nicholas Esborn <nick at desert dot net>, DesertNet
In collabouration with:	gianni, pho, Nicholas Esborn
Reviewed by:	jhb
MFC after:	1 week (*)

* Usually, I would aim for a larger MFC timeout, but I really want this
  in before 8.2-RELEASE, thus re@ accepted a shorter timeout as a special
  case for this patch
2010-12-29 18:17:36 +00:00
bin sh: Don't do optimized command substitution if expansions have side effects. 2010-12-28 21:27:08 +00:00
cddl Print message with information about updating the boot code if a new 2010-12-08 13:51:25 +00:00
contrib Unbreak the build by temprorarily not using include directives in 2010-12-20 22:56:50 +00:00
crypto Merge OpenSSL 0.9.8q into head. 2010-12-03 22:59:54 +00:00
etc Add pidfile [1] 2010-12-27 22:52:47 +00:00
games factor(6): Check return values of BN_* functions. 2010-12-20 19:07:56 +00:00
gnu Switch mips architectures back to libgcc. 2010-12-29 17:12:05 +00:00
include rpc.lockd(8) WARNS cleanup 2010-12-20 21:12:18 +00:00
kerberos5 Fix a typo. 2010-01-09 18:53:03 +00:00
lib Switch mips architectures back to libgcc. 2010-12-29 17:12:05 +00:00
libexec Fix an error in the ABI in rtld_bind_start(). When passing arguments to a 2010-12-28 22:31:59 +00:00
release Fix the overflowing livefs ISO by removing man pages from the HFS part of 2010-12-15 23:43:25 +00:00
rescue Break out the rules which generate crunchgen'ed binaries into a separate 2010-11-13 03:11:27 +00:00
sbin Add support for FS_TRIM to user-mode UFS utilities. 2010-12-29 12:31:18 +00:00
secure Regenerate manual pages for OpenSSL 0.9.8q. 2010-12-03 23:07:45 +00:00
share Fix date, broken in r216667. 2010-12-22 17:12:58 +00:00
sys Fix several callout migration races: 2010-12-29 18:17:36 +00:00
tools sh: Don't do optimized command substitution if expansions have side effects. 2010-12-28 21:27:08 +00:00
usr.bin Start sentences on a new line to ease life for translators. Tweak the 2010-12-28 18:58:15 +00:00
usr.sbin Revert most of r210764, now that mdocml does the right 2010-12-28 10:08:50 +00:00
COPYRIGHT Happy New Year 2010! :-) 2009-12-31 10:00:37 +00:00
LOCKS Update LOCKS syntax. 2008-06-05 19:47:58 +00:00
MAINTAINERS Add a comment to MAINTAINERS indicating that sbin/routed is in fact 2010-04-10 12:29:09 +00:00
Makefile Redirect stderr from config to /dev/null. config -m is printing lots 2010-12-24 04:55:56 +00:00
Makefile.inc1 Do not lint code beyond necessity (with apologies to Wiliam of Ockham). 2010-11-18 16:32:52 +00:00
Makefile.mips Guard against TARGET_ABI being undefined (TARGET_ABI will go away soon) 2010-08-26 14:54:12 +00:00
ObsoleteFiles.inc libbsnmp was moved to usr/lib 2010-12-16 11:58:50 +00:00
README Import OpenSSL 0.9.8q. 2010-12-02 22:36:51 +00:00
UPDATING - Add some helper hook points to the TCP stack. The hooks allow Khelp modules to 2010-12-28 12:13:30 +00:00

This is the top level of the FreeBSD source directory.  This file
was last revised on:
$FreeBSD$

For copyright information, please see the file COPYRIGHT in this
directory (additional copyright information also exists for some
sources in this tree - please see the specific source directories for
more information).

The Makefile in this directory supports a number of targets for
building components (or all) of the FreeBSD source tree, the most
commonly used one being ``world'', which rebuilds and installs
everything in the FreeBSD system from the source tree except the
kernel, the kernel-modules and the contents of /etc.  The ``world''
target should only be used in cases where the source tree has not
changed from the currently running version.  See:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
for more information, including setting make(1) variables.

The ``buildkernel'' and ``installkernel'' targets build and install
the kernel and the modules (see below).  Please see the top of
the Makefile in this directory for more information on the
standard build targets and compile-time flags.

Building a kernel is a somewhat more involved process, documentation
for which can be found at:
   http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html
And in the config(8) man page.
Note: If you want to build and install the kernel with the
``buildkernel'' and ``installkernel'' targets, you might need to build
world before.  More information is available in the handbook.

The sample kernel configuration files reside in the sys/<arch>/conf
sub-directory (assuming that you've installed the kernel sources), the
file named GENERIC being the one used to build your initial installation
kernel.  The file NOTES contains entries and documentation for all possible
devices, not just those commonly used.  It is the successor of the ancient
LINT file, but in contrast to LINT, it is not buildable as a kernel but a
pure reference and documentation file.


Source Roadmap:
---------------
bin		System/user commands.

cddl		Various commands and libraries under the Common Development
		and Distribution License.

contrib		Packages contributed by 3rd parties.

crypto		Cryptography stuff (see crypto/README).

etc		Template files for /etc.

games		Amusements.

gnu		Various commands and libraries under the GNU Public License.
		Please see gnu/COPYING* for more information.

include		System include files.

kerberos5	Kerberos5 (Heimdal) package.

lib		System libraries.

libexec		System daemons.

release		Release building Makefile & associated tools.

rescue		Build system for statically linked /rescue utilities.

sbin		System commands.

secure		Cryptographic libraries and commands.

share		Shared resources.

sys		Kernel sources.

tools		Utilities for regression testing and miscellaneous tasks.

usr.bin		User commands.

usr.sbin	System administration commands.


For information on synchronizing your source tree with one or more of
the FreeBSD Project's development branches, please see:

  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/synching.html