Fix a race in the sleepqueue timeout code that resulted in sleeps not
being properly cancelled by a timeout.  In general, the sleepq timeout
handler can fire while the thread is still in the process of going to
sleep.  In 6.x, this race was largely closed by sched_lock.  The only
place it was "exposed" and had
to be handled was while checking for any pending signals in
sleepq_catch_signals().

With the thread lock changes, the thread lock is dropped in between
sleepq_add() and sleepq_*wait*() opening up a new window for this race.
Thus, if the timeout fired while the sleeping thread was in between
sleepq_add() and sleepq_*wait*(), the thread would be marked as timed
out, but the thread would not be dequeued and sleepq_switch() would
still block the thread until it was awakened via some other means.  In
the case of pause(9) where there is no other wakeup, the thread would
never be awakened.

Fix this by teaching sleepq_switch() to check the TDF_TIMEOUT flag
before blocking.  If the flag is set, the sleep has already been
cancelled, so abort the sleep and dequeue the thread.
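
The window and the fix can be sketched as a small userland model.  This
is only an illustration under stated assumptions, not the kernel code:
struct model_thread and the model_*() functions are hypothetical names,
the two booleans stand in for TD_ON_SLEEPQ() and TDF_TIMEOUT, and
locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are kernel APIs. */
struct model_thread {
	bool on_sleepq;		/* stands in for TD_ON_SLEEPQ(td) */
	bool timed_out;		/* stands in for TDF_TIMEOUT in td_flags */
};

/* Models sleepq_add(): the thread is queued but has not blocked yet. */
static void
model_add(struct model_thread *td)
{
	td->on_sleepq = true;
	td->timed_out = false;
}

/*
 * Models the timeout handler finding a thread that is queued but not
 * yet asleep: it only marks the thread and does not dequeue it.
 */
static void
model_timeout(struct model_thread *td)
{
	if (td->on_sleepq)
		td->timed_out = true;
}

/* Pre-fix behaviour: returns true if the thread would block. */
static bool
model_switch_old(struct model_thread *td)
{
	if (!td->on_sleepq)
		return (false);		/* already woken up */
	return (true);			/* blocks even if timed_out is set */
}

/* Post-fix behaviour: honour a timeout that fired inside the window. */
static bool
model_switch_fixed(struct model_thread *td)
{
	if (!td->on_sleepq)
		return (false);		/* already woken up */
	if (td->timed_out) {
		assert(td->on_sleepq);	/* mirrors the MPASS() in the patch */
		td->on_sleepq = false;	/* dequeue instead of blocking */
		return (false);
	}
	return (true);
}
```

Calling model_add(), then model_timeout(), then model_switch_old()
reports that the thread would block while still queued, which is the
pause(9) hang described above.  The same sequence ending in
model_switch_fixed() dequeues the thread and does not block.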

MFC after:	3 days
Reported by:	dwhite, peter
John Baldwin 2008-01-25 02:09:38 +00:00
parent 639c89833a
commit 515594a06f
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=175654

@@ -439,17 +439,36 @@ static void
 sleepq_switch(void *wchan)
 {
 	struct sleepqueue_chain *sc;
+	struct sleepqueue *sq;
 	struct thread *td;
 	td = curthread;
 	sc = SC_LOOKUP(wchan);
 	mtx_assert(&sc->sc_lock, MA_OWNED);
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
-	/* We were removed */
+	/*
+	 * If we have a sleep queue, then we've already been woken up, so
+	 * just return.
+	 */
 	if (td->td_sleepqueue != NULL) {
 		mtx_unlock_spin(&sc->sc_lock);
 		return;
 	}
+	/*
+	 * If TDF_TIMEOUT is set, then our sleep has been timed out
+	 * already but we are still on the sleep queue, so dequeue the
+	 * thread and return.
+	 */
+	if (td->td_flags & TDF_TIMEOUT) {
+		MPASS(TD_ON_SLEEPQ(td));
+		sq = sleepq_lookup(wchan);
+		sleepq_resume_thread(sq, td, -1);
+		mtx_unlock_spin(&sc->sc_lock);
+		return;
+	}
 	thread_lock_set(td, &sc->sc_lock);
 	MPASS(td->td_sleepqueue == NULL);
@@ -790,10 +809,12 @@ sleepq_timeout(void *arg)
 		thread_unlock(td);
 		return;
 	}
 	/*
-	 * If the thread is on the SLEEPQ but not sleeping and we have it
-	 * locked it must be in sleepq_catch_signals(). Let it know we've
-	 * timedout here so it can remove itself.
+	 * If the thread is on the SLEEPQ but isn't sleeping yet, it
+	 * can either be on another CPU in between sleepq_add() and
+	 * one of the sleepq_*wait*() routines or it can be in
+	 * sleepq_catch_signals().
 	 */
 	if (TD_ON_SLEEPQ(td)) {
 		td->td_flags |= TDF_TIMEOUT | TDF_INTERRUPT;