Attilio Rao
3d7acbbabf
Fix several callout migration races:
- Problem1: Hypothesis: thread1 is doing a callout_reset_on(), within his callout handler, willing to implicitly or explicitly migrate the callout. thread2 is draining the callout. Thesys: * thread1 calls callout_lock() and locks the old callout cpu * thread1 performs the checks in the first path of the callout_reset_on() * thread1 hits this codepiece: /* * If the lock must migrate we have to check the state again as * we can't hold both the new and old locks simultaneously. */ if (c->c_cpu != cpu) { c->c_cpu = cpu; CC_UNLOCK(cc); goto retry; } which means it will drop the lock and 'retry' * thread2 will callout_lock() and locks the new callout cpu. thread1 spins on the new lock and will not keep going for the moment. * thread2 checks that the callout is not pending (as callout is currently running) and that it is not on cc->cc_curr (because cc now refers to the new callout and the callout is running on the old callout cpu) thus it thinks it is done and returns. * thread1 will now acquire the lock and then adds the callout to the new callout cpu queue That seems an obvious race as callout_stop() falsely reports the callout stopped or worse, callout_drain() falsely returns while the callout is still in use. - Solution1: Fixing this problem would require, in general, to lock both callout cpus at once while switching the c_cpu field and avoid cyclic deadlocks between callout cpus locks. The concept of CPUBLOCK is then introduced (working more or less like the blocked_lock for thread_lock() function) meaning: "in callout_lock(), spin until the c->c_cpu is not different from CPUBLOCK". That way the "original" callout cpu, referred to the above mentioned code snippet, will remain blocked until the lock handover is over critical path will remain covered. - Problem2: Having the callout currently executed on a specific callout cpu and contemporary pending on another callout cpu (as it can happen with current code) breaks, at least, the assumption callout_drain() returns just once the callout cannot be referenced anymore. - Solution2: Callout migration is deferred if the current callout is already under execution. The best place to do that is in softclock() and new members are added to the callout cpu structure in order to specify a pending migration is requested. That is necessary because the callout cannot be trusted (not freed) the 100% of times after the execution of the callout handler. CPUBLOCK will prevent, in the "deferred migration" case, that the callout gets freed in this case, stopping any callout_stop() and callout_drain() possible activity until the migration is actually performed. - Problem3: There is a further race in callout_drain(). In order to avoid a race between sleepqueue lock and callout cpu spinlock, in _callout_stop_safe(), the callout cpu lock is dropped, the sleepqueue lock is acquired and a new callout cpu lookup is performed. Note that the channel used for locking the sleepqueue is obtained from the "current" callout cpu (&cc->cc_waiting). If the callout migrated in the meanwhile, callout_drain() will end up using the wrong wchan for the sleepqueue (the locked one will be the older, while the new one will not really be locked) leading to a lock leak and a race access to sleepqueue. - Solution3: It is enough to check if a migration happened between the operation of acquiring the sleepqueue lock and the new callout cpu lock and eventually unwind all those and try again. This problems can lead to deathly races on moderate (4-ways) SMP environment, leading to easy panic or deadlocks. The 24-ways of the reporter, could easilly panic, with completely normal workload, almost daily. gianni@ kindly wrote the following prof-of-concept which can panic a FreeBSD machine in less than one hour, in smaller SMP: http://www.freebsd.org/~attilio/callout/test.c Reported by: Nicholas Esborn <nick at desert dot net>, DesertNet In collabouration with: gianni, pho, Nicholas Esborn Reviewed by: jhb MFC after: 1 week (*) * Usually, I would aim for a larger MFC timeout, but I really want this in before 8.2-RELEASE, thus re@ accepted a shorter timeout as a special case for this patch
…
This is the top level of the FreeBSD source directory. This file was last revised on: $FreeBSD$ For copyright information, please see the file COPYRIGHT in this directory (additional copyright information also exists for some sources in this tree - please see the specific source directories for more information). The Makefile in this directory supports a number of targets for building components (or all) of the FreeBSD source tree, the most commonly used one being ``world'', which rebuilds and installs everything in the FreeBSD system from the source tree except the kernel, the kernel-modules and the contents of /etc. The ``world'' target should only be used in cases where the source tree has not changed from the currently running version. See: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html for more information, including setting make(1) variables. The ``buildkernel'' and ``installkernel'' targets build and install the kernel and the modules (see below). Please see the top of the Makefile in this directory for more information on the standard build targets and compile-time flags. Building a kernel is a somewhat more involved process, documentation for which can be found at: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html And in the config(8) man page. Note: If you want to build and install the kernel with the ``buildkernel'' and ``installkernel'' targets, you might need to build world before. More information is available in the handbook. The sample kernel configuration files reside in the sys/<arch>/conf sub-directory (assuming that you've installed the kernel sources), the file named GENERIC being the one used to build your initial installation kernel. The file NOTES contains entries and documentation for all possible devices, not just those commonly used. It is the successor of the ancient LINT file, but in contrast to LINT, it is not buildable as a kernel but a pure reference and documentation file. Source Roadmap: --------------- bin System/user commands. cddl Various commands and libraries under the Common Development and Distribution License. contrib Packages contributed by 3rd parties. crypto Cryptography stuff (see crypto/README). etc Template files for /etc. games Amusements. gnu Various commands and libraries under the GNU Public License. Please see gnu/COPYING* for more information. include System include files. kerberos5 Kerberos5 (Heimdal) package. lib System libraries. libexec System daemons. release Release building Makefile & associated tools. rescue Build system for statically linked /rescue utilities. sbin System commands. secure Cryptographic libraries and commands. share Shared resources. sys Kernel sources. tools Utilities for regression testing and miscellaneous tasks. usr.bin User commands. usr.sbin System administration commands. For information on synchronizing your source tree with one or more of the FreeBSD Project's development branches, please see: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/synching.html
Description
Languages
C
60.1%
C++
26.1%
Roff
4.9%
Shell
3%
Assembly
1.7%
Other
3.7%