/*-
 * Copyright (c) 2007 Attilio Rao <attilio@freebsd.org>
 * Copyright (c) 2001 Jason Evans <jasone@freebsd.org>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice(s), this list of conditions and the following disclaimer as
 *    the first lines of this file unmodified other than the possible
 *    addition of one or more copyright notices.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice(s), this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
 * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY
 * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
 * DAMAGE.
 */

/*
 * Shared/exclusive locks.  This implementation attempts to ensure
 * deterministic lock granting behavior, so that slocks and xlocks are
 * interleaved.
 *
 * Priority propagation will not generally raise the priority of lock holders,
 * so it should not be relied upon in combination with sx locks.
 */
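
/*
 * Typical consumer usage (an illustrative sketch only; "foo_lock" and
 * "foo_data" are hypothetical names, not part of this file; see sx(9)
 * for the authoritative interface):
 *
 *      struct sx foo_lock;
 *
 *      sx_init(&foo_lock, "foo lock");
 *
 *      sx_slock(&foo_lock);            read foo_data
 *      sx_sunlock(&foo_lock);
 *
 *      sx_xlock(&foo_lock);            modify foo_data
 *      sx_xunlock(&foo_lock);
 *
 *      sx_destroy(&foo_lock);
 */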

#include "opt_ddb.h"
#include "opt_hwpmc_hooks.h"
#include "opt_no_adaptive_sx.h"

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kdb.h>
#include <sys/kernel.h>
#include <sys/ktr.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <sys/sched.h>
#include <sys/sleepqueue.h>
#include <sys/sx.h>
#include <sys/smp.h>
#include <sys/sysctl.h>

#if defined(SMP) && !defined(NO_ADAPTIVE_SX)
#include <machine/cpu.h>
#endif

#ifdef DDB
#include <ddb/ddb.h>
#endif

#if defined(SMP) && !defined(NO_ADAPTIVE_SX)
#define ADAPTIVE_SX
#endif

CTASSERT((SX_NOADAPTIVE & LO_CLASSFLAGS) == SX_NOADAPTIVE);

#ifdef HWPMC_HOOKS
#include <sys/pmckern.h>
PMC_SOFT_DECLARE( , , lock, failed);
#endif

/* Handy macros for sleep queues. */
#define SQ_EXCLUSIVE_QUEUE      0
#define SQ_SHARED_QUEUE         1

/*
 * Variations on DROP_GIANT()/PICKUP_GIANT() for use in this file.  We
 * drop Giant anytime we have to sleep or if we adaptively spin.
 */
#define GIANT_DECLARE                                                   \
        int _giantcnt = 0;                                              \
        WITNESS_SAVE_DECL(Giant)                                        \

#define GIANT_SAVE() do {                                               \
        if (mtx_owned(&Giant)) {                                        \
                WITNESS_SAVE(&Giant.lock_object, Giant);                \
                while (mtx_owned(&Giant)) {                             \
                        _giantcnt++;                                    \
                        mtx_unlock(&Giant);                             \
                }                                                       \
        }                                                               \
} while (0)

#define GIANT_RESTORE() do {                                            \
        if (_giantcnt > 0) {                                            \
                mtx_assert(&Giant, MA_NOTOWNED);                        \
                while (_giantcnt--)                                     \
                        mtx_lock(&Giant);                               \
                WITNESS_RESTORE(&Giant.lock_object, Giant);             \
        }                                                               \
} while (0)
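
/*
 * GIANT_SAVE() drops Giant entirely, recording in _giantcnt how many
 * times it was recursively held; GIANT_RESTORE() reacquires it that many
 * times and restores the WITNESS state saved above.
 */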

/*
 * Returns true if an exclusive lock is recursed.  It assumes
 * curthread currently has an exclusive lock.
 */
#define sx_recursed(sx)         ((sx)->sx_recurse != 0)

static void     assert_sx(const struct lock_object *lock, int what);
#ifdef DDB
static void     db_show_sx(const struct lock_object *lock);
#endif
static void     lock_sx(struct lock_object *lock, uintptr_t how);
#ifdef KDTRACE_HOOKS
static int      owner_sx(const struct lock_object *lock, struct thread **owner);
#endif
static uintptr_t unlock_sx(struct lock_object *lock);

struct lock_class lock_class_sx = {
        .lc_name = "sx",
        .lc_flags = LC_SLEEPLOCK | LC_SLEEPABLE | LC_RECURSABLE | LC_UPGRADABLE,
        .lc_assert = assert_sx,
#ifdef DDB
        .lc_ddb_show = db_show_sx,
#endif
        .lc_lock = lock_sx,
        .lc_unlock = unlock_sx,
#ifdef KDTRACE_HOOKS
        .lc_owner = owner_sx,
#endif
};

#ifndef INVARIANTS
#define _sx_assert(sx, what, file, line)
#endif

#ifdef ADAPTIVE_SX
static u_int asx_retries = 10;
static u_int asx_loops = 10000;
static SYSCTL_NODE(_debug, OID_AUTO, sx, CTLFLAG_RD, NULL, "sxlock debugging");
SYSCTL_UINT(_debug_sx, OID_AUTO, retries, CTLFLAG_RW, &asx_retries, 0, "");
SYSCTL_UINT(_debug_sx, OID_AUTO, loops, CTLFLAG_RW, &asx_loops, 0, "");

static struct lock_delay_config __read_mostly sx_delay = {
        .initial        = 1000,
        .step           = 500,
        .min            = 100,
        .max            = 5000,
};

SYSCTL_INT(_debug_sx, OID_AUTO, delay_initial, CTLFLAG_RW, &sx_delay.initial,
    0, "");
SYSCTL_INT(_debug_sx, OID_AUTO, delay_step, CTLFLAG_RW, &sx_delay.step,
    0, "");
SYSCTL_INT(_debug_sx, OID_AUTO, delay_min, CTLFLAG_RW, &sx_delay.min,
    0, "");
SYSCTL_INT(_debug_sx, OID_AUTO, delay_max, CTLFLAG_RW, &sx_delay.max,
    0, "");

static void
sx_delay_sysinit(void *dummy)
{

        sx_delay.initial = mp_ncpus * 25;
        sx_delay.step = (mp_ncpus * 25) / 2;
        sx_delay.min = mp_ncpus * 5;
        sx_delay.max = mp_ncpus * 25 * 10;
}
LOCK_DELAY_SYSINIT(sx_delay_sysinit);
#endif
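
/*
 * The static initializers above are only boot-time placeholders;
 * sx_delay_sysinit() rescales the adaptive-spin backoff parameters by the
 * number of CPUs (mp_ncpus) when it runs via LOCK_DELAY_SYSINIT().
 */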

void
assert_sx(const struct lock_object *lock, int what)
{

        sx_assert((const struct sx *)lock, what);
}

void
lock_sx(struct lock_object *lock, uintptr_t how)
{
        struct sx *sx;

        sx = (struct sx *)lock;
        if (how)
                sx_slock(sx);
        else
                sx_xlock(sx);
}

uintptr_t
unlock_sx(struct lock_object *lock)
{
        struct sx *sx;

        sx = (struct sx *)lock;
        sx_assert(sx, SA_LOCKED | SA_NOTRECURSED);
        if (sx_xlocked(sx)) {
                sx_xunlock(sx);
                return (0);
        } else {
                sx_sunlock(sx);
                return (1);
        }
}
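
/*
 * unlock_sx() returns 1 when it released a shared hold and 0 for an
 * exclusive one; passing that value back to lock_sx() as 'how' reacquires
 * the lock in the same mode, which is how callers of the lc_lock/lc_unlock
 * class methods are expected to pair them.
 */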

#ifdef KDTRACE_HOOKS
int
owner_sx(const struct lock_object *lock, struct thread **owner)
{
        const struct sx *sx;
        uintptr_t x;

        sx = (const struct sx *)lock;
        x = sx->sx_lock;
        *owner = NULL;
        return ((x & SX_LOCK_SHARED) != 0 ? (SX_SHARERS(x) != 0) :
            ((*owner = (struct thread *)SX_OWNER(x)) != NULL));
}
#endif

void
sx_sysinit(void *arg)
{
        struct sx_args *sargs = arg;

        sx_init_flags(sargs->sa_sx, sargs->sa_desc, sargs->sa_flags);
}

void
sx_init_flags(struct sx *sx, const char *description, int opts)
{
        int flags;

        MPASS((opts & ~(SX_QUIET | SX_RECURSE | SX_NOWITNESS | SX_DUPOK |
            SX_NOPROFILE | SX_NOADAPTIVE | SX_NEW)) == 0);
        ASSERT_ATOMIC_LOAD_PTR(sx->sx_lock,
            ("%s: sx_lock not aligned for %s: %p", __func__, description,
            &sx->sx_lock));

        flags = LO_SLEEPABLE | LO_UPGRADABLE;
        if (opts & SX_DUPOK)
                flags |= LO_DUPOK;
        if (opts & SX_NOPROFILE)
                flags |= LO_NOPROFILE;
        if (!(opts & SX_NOWITNESS))
                flags |= LO_WITNESS;
        if (opts & SX_RECURSE)
                flags |= LO_RECURSABLE;
        if (opts & SX_QUIET)
                flags |= LO_QUIET;
        if (opts & SX_NEW)
                flags |= LO_NEW;

        flags |= opts & SX_NOADAPTIVE;
        lock_init(&sx->lock_object, &lock_class_sx, description, NULL, flags);
        sx->sx_lock = SX_LOCK_UNLOCKED;
        sx->sx_recurse = 0;
}
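
/*
 * Illustrative example (hypothetical lock name, not part of this file):
 *
 *      static struct sx foo_sx;
 *      sx_init_flags(&foo_sx, "foo sx", SX_RECURSE | SX_DUPOK);
 *
 * creates a lock that may be acquired exclusively in a recursive fashion
 * and that will not trigger witness duplicate-lock warnings.
 */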

void
sx_destroy(struct sx *sx)
{

        KASSERT(sx->sx_lock == SX_LOCK_UNLOCKED, ("sx lock still held"));
        KASSERT(sx->sx_recurse == 0, ("sx lock still recursed"));
        sx->sx_lock = SX_LOCK_DESTROYED;
        lock_destroy(&sx->lock_object);
}

int
_sx_slock(struct sx *sx, int opts, const char *file, int line)
{
        int error = 0;

        if (SCHEDULER_STOPPED())
                return (0);
        KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread),
            ("sx_slock() by idle thread %p on sx %s @ %s:%d",
            curthread, sx->lock_object.lo_name, file, line));
        KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
            ("sx_slock() of destroyed sx @ %s:%d", file, line));
        WITNESS_CHECKORDER(&sx->lock_object, LOP_NEWORDER, file, line, NULL);
        error = __sx_slock(sx, opts, file, line);
        if (!error) {
                LOCK_LOG_LOCK("SLOCK", &sx->lock_object, 0, 0, file, line);
                WITNESS_LOCK(&sx->lock_object, 0, file, line);
                TD_LOCKS_INC(curthread);
        }

        return (error);
}

int
sx_try_slock_(struct sx *sx, const char *file, int line)
{
        uintptr_t x;

        if (SCHEDULER_STOPPED())
                return (1);

        KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread),
            ("sx_try_slock() by idle thread %p on sx %s @ %s:%d",
            curthread, sx->lock_object.lo_name, file, line));

        for (;;) {
                x = sx->sx_lock;
                KASSERT(x != SX_LOCK_DESTROYED,
                    ("sx_try_slock() of destroyed sx @ %s:%d", file, line));
                if (!(x & SX_LOCK_SHARED))
                        break;
                if (atomic_cmpset_acq_ptr(&sx->sx_lock, x, x + SX_ONE_SHARER)) {
                        LOCK_LOG_TRY("SLOCK", &sx->lock_object, 0, 1, file, line);
                        WITNESS_LOCK(&sx->lock_object, LOP_TRYLOCK, file, line);
                        LOCKSTAT_PROFILE_OBTAIN_RWLOCK_SUCCESS(sx__acquire,
                            sx, 0, 0, file, line, LOCKSTAT_READER);
                        TD_LOCKS_INC(curthread);
                        return (1);
                }
        }

        LOCK_LOG_TRY("SLOCK", &sx->lock_object, 0, 0, file, line);
        return (0);
}
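
/*
 * In the loop above the lock word itself carries the sharer count, so a
 * successful compare-and-swap from x to x + SX_ONE_SHARER acquires the
 * lock and bumps the count in a single atomic step; the loop only retries
 * when another CPU changed the word in the meantime.
 */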

int
_sx_xlock(struct sx *sx, int opts, const char *file, int line)
{
        int error = 0;

        if (SCHEDULER_STOPPED())
                return (0);
        KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread),
            ("sx_xlock() by idle thread %p on sx %s @ %s:%d",
            curthread, sx->lock_object.lo_name, file, line));
        KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
            ("sx_xlock() of destroyed sx @ %s:%d", file, line));
        WITNESS_CHECKORDER(&sx->lock_object, LOP_NEWORDER | LOP_EXCLUSIVE, file,
            line, NULL);
        error = __sx_xlock(sx, curthread, opts, file, line);
        if (!error) {
                LOCK_LOG_LOCK("XLOCK", &sx->lock_object, 0, sx->sx_recurse,
                    file, line);
                WITNESS_LOCK(&sx->lock_object, LOP_EXCLUSIVE, file, line);
                TD_LOCKS_INC(curthread);
        }

        return (error);
}

int
sx_try_xlock_(struct sx *sx, const char *file, int line)
{
        int rval;

        if (SCHEDULER_STOPPED())
                return (1);

        KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread),
            ("sx_try_xlock() by idle thread %p on sx %s @ %s:%d",
            curthread, sx->lock_object.lo_name, file, line));
        KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
            ("sx_try_xlock() of destroyed sx @ %s:%d", file, line));

        if (sx_xlocked(sx) &&
            (sx->lock_object.lo_flags & LO_RECURSABLE) != 0) {
                sx->sx_recurse++;
                atomic_set_ptr(&sx->sx_lock, SX_LOCK_RECURSED);
                rval = 1;
        } else
                rval = atomic_cmpset_acq_ptr(&sx->sx_lock, SX_LOCK_UNLOCKED,
                    (uintptr_t)curthread);
        LOCK_LOG_TRY("XLOCK", &sx->lock_object, 0, rval, file, line);
        if (rval) {
                WITNESS_LOCK(&sx->lock_object, LOP_EXCLUSIVE | LOP_TRYLOCK,
                    file, line);
                if (!sx_recursed(sx))
                        LOCKSTAT_PROFILE_OBTAIN_RWLOCK_SUCCESS(sx__acquire,
                            sx, 0, 0, file, line, LOCKSTAT_WRITER);
                TD_LOCKS_INC(curthread);
        }

        return (rval);
}

void
_sx_sunlock(struct sx *sx, const char *file, int line)
{

        if (SCHEDULER_STOPPED())
                return;
        KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
            ("sx_sunlock() of destroyed sx @ %s:%d", file, line));
        _sx_assert(sx, SA_SLOCKED, file, line);
        WITNESS_UNLOCK(&sx->lock_object, 0, file, line);
        LOCK_LOG_LOCK("SUNLOCK", &sx->lock_object, 0, 0, file, line);
        __sx_sunlock(sx, file, line);
        TD_LOCKS_DEC(curthread);
}

void
_sx_xunlock(struct sx *sx, const char *file, int line)
{

        if (SCHEDULER_STOPPED())
                return;
        KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
            ("sx_xunlock() of destroyed sx @ %s:%d", file, line));
        _sx_assert(sx, SA_XLOCKED, file, line);
        WITNESS_UNLOCK(&sx->lock_object, LOP_EXCLUSIVE, file, line);
        LOCK_LOG_LOCK("XUNLOCK", &sx->lock_object, 0, sx->sx_recurse, file,
            line);
        __sx_xunlock(sx, curthread, file, line);
        TD_LOCKS_DEC(curthread);
}

/*
 * Try to do a non-blocking upgrade from a shared lock to an exclusive lock.
 * This will only succeed if this thread holds a single shared lock.
 * Return 1 if the upgrade succeeds, 0 otherwise.
 */
int
sx_try_upgrade_(struct sx *sx, const char *file, int line)
{
        uintptr_t x;
        int success;

        if (SCHEDULER_STOPPED())
                return (1);

        KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
            ("sx_try_upgrade() of destroyed sx @ %s:%d", file, line));
        _sx_assert(sx, SA_SLOCKED, file, line);

        /*
         * Try to switch from one shared lock to an exclusive lock.  We need
         * to maintain the SX_LOCK_EXCLUSIVE_WAITERS flag if set so that
         * we will wake up the exclusive waiters when we drop the lock.
         */
        x = sx->sx_lock & SX_LOCK_EXCLUSIVE_WAITERS;
        success = atomic_cmpset_ptr(&sx->sx_lock, SX_SHARERS_LOCK(1) | x,
            (uintptr_t)curthread | x);
        LOCK_LOG_TRY("XUPGRADE", &sx->lock_object, 0, success, file, line);
        if (success) {
                WITNESS_UPGRADE(&sx->lock_object, LOP_EXCLUSIVE | LOP_TRYLOCK,
                    file, line);
                LOCKSTAT_RECORD0(sx__upgrade, sx);
        }
        return (success);
}
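
/*
 * The compare-and-swap above only succeeds when the lock word is exactly
 * SX_SHARERS_LOCK(1), possibly with the exclusive-waiters bit set, i.e.
 * when the caller is the sole sharer; any additional shared holder makes
 * the upgrade fail, matching the comment at the top of this function.
 */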
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Downgrade an unrecursed exclusive lock into a single shared lock.
|
|
|
|
*/
|
|
|
|
void
|
2011-11-21 12:59:52 +00:00
|
|
|
sx_downgrade_(struct sx *sx, const char *file, int line)
|
2007-03-31 23:23:42 +00:00
|
|
|
{
|
|
|
|
uintptr_t x;
|
If a thread that is swapped out is made runnable, then the setrunnable()
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup(). When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup. However, this required
grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.
Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been locked. The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up. The end result is that consumers of the sleepq API such as
*sleep/wakeup, condition variables, sx locks, and lockmgr, have to wakeup
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().
Discussed with: jeff
Glanced at by: sam
Tested by: Jurgen Weber jurgen - ish com au
MFC after: 2 weeks
2008-08-05 20:02:31 +00:00
|
|
|
int wakeup_swapper;
|
2007-03-31 23:23:42 +00:00
|
|
|
|
panic: add a switch and infrastructure for stopping other CPUs in SMP case
Historical behavior of letting other CPUs merily go on is a default for
time being. The new behavior can be switched on via
kern.stop_scheduler_on_panic tunable and sysctl.
Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
state
Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set. That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts. Those locks might be held by the stopped threads and would never
be released. To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.
This change has substantial portions written and re-written by attilio
and kib at various times. Other changes are heavily based on the ideas
and patches submitted by jhb and mdf. bde has provided many insights
into the details and history of the current code.
The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console. This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection. Dumping to USB-connected disks may also be affected.
PR: amd64/139614 (at least)
In cooperation with: attilio, jhb, kib, mdf
Discussed with: arch@, bde
Tested by: Eugene Grosbein <eugen@grosbein.net>,
gnn,
Steven Hartland <killing@multiplay.co.uk>,
glebius,
Andrew Boyer <aboyer@averesystems.com>
(various versions of the patch)
MFC after: 3 months (or never)
2011-12-11 21:02:01 +00:00
|
|
|
if (SCHEDULER_STOPPED())
|
|
|
|
return;
|
|
|
|
|
2007-05-08 21:51:37 +00:00
|
|
|
KASSERT(sx->sx_lock != SX_LOCK_DESTROYED,
|
|
|
|
("sx_downgrade() of destroyed sx @ %s:%d", file, line));
|
2007-05-19 21:26:05 +00:00
|
|
|
_sx_assert(sx, SA_XLOCKED | SA_NOTRECURSED, file, line);
|
2007-03-31 23:23:42 +00:00
|
|
|
#ifndef INVARIANTS
|
|
|
|
if (sx_recursed(sx))
|
|
|
|
panic("downgrade of a recursed lock");
|
|
|
|
#endif
|
Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.
2001-03-28 09:03:24 +00:00
|
|
|
|
2007-03-31 23:23:42 +00:00
|
|
|
WITNESS_DOWNGRADE(&sx->lock_object, 0, file, line);
|
2001-03-05 19:59:41 +00:00
|
|
|
|
2007-03-31 23:23:42 +00:00
|
|
|
/*
|
|
|
|
* Try to switch from an exclusive lock with no shared waiters
|
|
|
|
* to one sharer with no shared waiters. If there are
|
|
|
|
* exclusive waiters, we don't need to lock the sleep queue so
|
|
|
|
* long as we preserve the flag. We do one quick try and if
|
|
|
|
* that fails we grab the sleepq lock to keep the flags from
|
|
|
|
* changing and do it the slow way.
|
|
|
|
*
|
|
|
|
* We have to lock the sleep queue if there are shared waiters
|
|
|
|
* so we can wake them up.
|
|
|
|
*/
|
|
|
|
x = sx->sx_lock;
|
|
|
|
if (!(x & SX_LOCK_SHARED_WAITERS) &&
|
|
|
|
atomic_cmpset_rel_ptr(&sx->sx_lock, x, SX_SHARERS_LOCK(1) |
|
|
|
|
(x & SX_LOCK_EXCLUSIVE_WAITERS))) {
|
|
|
|
LOCK_LOG_LOCK("XDOWNGRADE", &sx->lock_object, 0, 0, file, line);
|
|
|
|
return;
|
2007-02-27 06:42:05 +00:00
|
|
|
}
|
2007-03-02 07:21:20 +00:00
|
|
|
|
2001-03-05 19:59:41 +00:00
|
|
|
/*
|
2007-03-31 23:23:42 +00:00
|
|
|
* Lock the sleep queue so we can read the waiters bits
|
|
|
|
* without any races and wakeup any shared waiters.
|
2001-03-05 19:59:41 +00:00
|
|
|
*/
|
2007-03-31 23:23:42 +00:00
|
|
|
sleepq_lock(&sx->lock_object);
|
2001-03-05 19:59:41 +00:00
|
|
|
|
2007-03-31 23:23:42 +00:00
|
|
|
/*
|
|
|
|
* Preserve SX_LOCK_EXCLUSIVE_WAITERS while downgraded to a single
|
|
|
|
* shared lock. If there are any shared waiters, wake them up.
|
|
|
|
*/
|
If a thread that is swapped out is made runnable, then the setrunnable()
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup(). When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup. However, this required
grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.
Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been locked. The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up. The end result is that consumers of the sleepq API such as
*sleep/wakeup, condition variables, sx locks, and lockmgr, have to wakeup
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().
Discussed with: jeff
Glanced at by: sam
Tested by: Jurgen Weber jurgen - ish com au
MFC after: 2 weeks
2008-08-05 20:02:31 +00:00
|
|
|
wakeup_swapper = 0;
|
2007-03-31 23:23:42 +00:00
|
|
|
x = sx->sx_lock;
|
|
|
|
atomic_store_rel_ptr(&sx->sx_lock, SX_SHARERS_LOCK(1) |
|
|
|
|
(x & SX_LOCK_EXCLUSIVE_WAITERS));
|
|
|
|
if (x & SX_LOCK_SHARED_WAITERS)
|
If a thread that is swapped out is made runnable, then the setrunnable()
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup(). When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup. However, this required
grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.
Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been locked. The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up. The end result is that consumers of the sleepq API such as
*sleep/wakeup, condition variables, sx locks, and lockmgr, have to wakeup
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().
Discussed with: jeff
Glanced at by: sam
Tested by: Jurgen Weber jurgen - ish com au
MFC after: 2 weeks
2008-08-05 20:02:31 +00:00
|
|
|
wakeup_swapper = sleepq_broadcast(&sx->lock_object, SLEEPQ_SX,
|
|
|
|
0, SQ_SHARED_QUEUE);
|
2008-03-12 06:31:06 +00:00
|
|
|
sleepq_release(&sx->lock_object);
|
Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects. Each lock class specifies a
name and set of flags (or properties) shared by all locks of a given
type. Currently there are three lock classes: spin mutexes, sleep
mutexes, and sx locks. A lock object specifies properties of an
additional lock along with a lock name and all of the extra stuff needed
to make witness work with a given lock. This abstract lock stuff is
defined in sys/lock.h. The lockmgr constants, types, and prototypes have
been moved to sys/lockmgr.h. For temporary backwards compatability,
sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
locks held. By making this per-cpu, we do not have to jump through
magic hoops to deal with sched_lock changing ownership during context
switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
proc->p_sleeplocks, which is a list of held sleep locks including sleep
mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
- MTX_NOWITNESS - specifies that this lock should be ignored by witness.
This is used for the mutex that blocks a sx lock for example.
- MTX_QUIET - this is not new, but you can pass this to mtx_init() now
and no events will be logged for this lock, so that one doesn't have
to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag. Use this flag to export
a mtx_initialized() macro that can be safely called from drivers. Also,
we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
more accurate file and line numbers.
2001-03-28 09:03:24 +00:00
|
|
|
|
2007-03-31 23:23:42 +00:00
|
|
|
LOCK_LOG_LOCK("XDOWNGRADE", &sx->lock_object, 0, 0, file, line);
|
2015-07-19 22:14:09 +00:00
|
|
|
LOCKSTAT_RECORD0(sx__downgrade, sx);
|

	if (wakeup_swapper)
		kick_proc0();
}
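
/*
 * Illustrative usage sketch (kept under "#if 0" so it is never compiled):
 * the downgrade path completed above lets a writer demote its exclusive
 * hold to a shared one without ever dropping the lock.  The structure and
 * function names below are hypothetical examples, not part of this file
 * or of the sx(9) KPI.
 */
#if 0
struct example_data {
	struct sx	ed_lock;
	int		ed_value;
};

static int
example_update_then_read(struct example_data *ed)
{
	int snapshot;

	sx_xlock(&ed->ed_lock);		/* exclusive: safe to modify */
	ed->ed_value++;
	sx_downgrade(&ed->ed_lock);	/* demote to shared, never unlocked */
	snapshot = ed->ed_value;	/* still protected from writers */
	sx_sunlock(&ed->ed_lock);
	return (snapshot);
}
#endif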

/*
 * This function represents the so-called 'hard case' for sx_xlock
 * operation.  All 'easy case' failures are redirected to this.  Note
 * that ideally this would be a static function, but it needs to be
 * accessible from at least sx.h.
 */
int
_sx_xlock_hard(struct sx *sx, uintptr_t tid, int opts, const char *file,
    int line)
{
	GIANT_DECLARE;
#ifdef ADAPTIVE_SX
	volatile struct thread *owner;
	u_int i, spintries = 0;
#endif
	uintptr_t x;
#ifdef LOCK_PROFILING
	uint64_t waittime = 0;
	int contested = 0;
#endif
	int error = 0;
#if defined(ADAPTIVE_SX) || defined(KDTRACE_HOOKS)
	struct lock_delay_arg lda;
#endif
#ifdef KDTRACE_HOOKS
	uintptr_t state;
	u_int sleep_cnt = 0;
	int64_t sleep_time = 0;
	int64_t all_time = 0;
#endif

	if (SCHEDULER_STOPPED())
		return (0);

#if defined(ADAPTIVE_SX)
	lock_delay_arg_init(&lda, &sx_delay);
#elif defined(KDTRACE_HOOKS)
	lock_delay_arg_init(&lda, NULL);
#endif

	x = SX_READ_VALUE(sx);

	/* If we already hold an exclusive lock, then recurse. */
	if (__predict_false(lv_sx_owner(x) == (struct thread *)tid)) {
		KASSERT((sx->lock_object.lo_flags & LO_RECURSABLE) != 0,
		    ("_sx_xlock_hard: recursed on non-recursive sx %s @ %s:%d\n",
		    sx->lock_object.lo_name, file, line));
		sx->sx_recurse++;
		atomic_set_ptr(&sx->sx_lock, SX_LOCK_RECURSED);
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p recursing", __func__, sx);
		return (0);
	}

	if (LOCK_LOG_TEST(&sx->lock_object, 0))
		CTR5(KTR_LOCK, "%s: %s contested (lock=%p) at %s:%d", __func__,
		    sx->lock_object.lo_name, (void *)sx->sx_lock, file, line);

#ifdef KDTRACE_HOOKS
	all_time -= lockstat_nsecs(&sx->lock_object);
	state = x;
#endif
	for (;;) {
		if (x == SX_LOCK_UNLOCKED) {
			if (atomic_cmpset_acq_ptr(&sx->sx_lock, x, tid))
				break;
			x = SX_READ_VALUE(sx);
			continue;
		}
#ifdef KDTRACE_HOOKS
		lda.spin_cnt++;
#endif
#ifdef HWPMC_HOOKS
		PMC_SOFT_CALL( , , lock, failed);
#endif
		lock_profile_obtain_lock_failed(&sx->lock_object, &contested,
		    &waittime);
#ifdef ADAPTIVE_SX
		/*
		 * If the lock is write locked and the owner is
		 * running on another CPU, spin until the owner stops
		 * running or the state of the lock changes.
		 */
		if ((sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) {
			if ((x & SX_LOCK_SHARED) == 0) {
				owner = lv_sx_owner(x);
				if (TD_IS_RUNNING(owner)) {
					if (LOCK_LOG_TEST(&sx->lock_object, 0))
						CTR3(KTR_LOCK,
						    "%s: spinning on %p held by %p",
						    __func__, sx, owner);
					KTR_STATE1(KTR_SCHED, "thread",
					    sched_tdname(curthread), "spinning",
					    "lockname:\"%s\"",
					    sx->lock_object.lo_name);
					GIANT_SAVE();
					do {
						lock_delay(&lda);
						x = SX_READ_VALUE(sx);
						owner = lv_sx_owner(x);
					} while (owner != NULL &&
					    TD_IS_RUNNING(owner));
					KTR_STATE0(KTR_SCHED, "thread",
					    sched_tdname(curthread), "running");
					continue;
				}
			} else if (SX_SHARERS(x) && spintries < asx_retries) {
				KTR_STATE1(KTR_SCHED, "thread",
				    sched_tdname(curthread), "spinning",
				    "lockname:\"%s\"", sx->lock_object.lo_name);
				GIANT_SAVE();
				spintries++;
				for (i = 0; i < asx_loops; i++) {
					if (LOCK_LOG_TEST(&sx->lock_object, 0))
						CTR4(KTR_LOCK,
						    "%s: shared spinning on %p with %u and %u",
						    __func__, sx, spintries, i);
					x = sx->sx_lock;
					if ((x & SX_LOCK_SHARED) == 0 ||
					    SX_SHARERS(x) == 0)
						break;
					cpu_spinwait();
#ifdef KDTRACE_HOOKS
					lda.spin_cnt++;
#endif
				}
				KTR_STATE0(KTR_SCHED, "thread",
				    sched_tdname(curthread), "running");
				x = SX_READ_VALUE(sx);
				if (i != asx_loops)
					continue;
			}
		}
#endif

		sleepq_lock(&sx->lock_object);
		x = SX_READ_VALUE(sx);

		/*
		 * If the lock was released while spinning on the
		 * sleep queue chain lock, try again.
		 */
		if (x == SX_LOCK_UNLOCKED) {
			sleepq_release(&sx->lock_object);
			continue;
		}

#ifdef ADAPTIVE_SX
		/*
		 * The current lock owner might have started executing
		 * on another CPU (or the lock could have changed
		 * owners) while we were waiting on the sleep queue
		 * chain lock.  If so, drop the sleep queue lock and try
		 * again.
		 */
		if (!(x & SX_LOCK_SHARED) &&
		    (sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) {
			owner = (struct thread *)SX_OWNER(x);
			if (TD_IS_RUNNING(owner)) {
				sleepq_release(&sx->lock_object);
				continue;
			}
		}
#endif

		/*
		 * If an exclusive lock was released with both shared
		 * and exclusive waiters and a shared waiter hasn't
		 * woken up and acquired the lock yet, sx_lock will be
		 * set to SX_LOCK_UNLOCKED | SX_LOCK_EXCLUSIVE_WAITERS.
		 * If we see that value, try to acquire it once.  Note
		 * that we have to preserve SX_LOCK_EXCLUSIVE_WAITERS
		 * as there are other exclusive waiters still.  If we
		 * fail, restart the loop.
		 */
		if (x == (SX_LOCK_UNLOCKED | SX_LOCK_EXCLUSIVE_WAITERS)) {
			if (atomic_cmpset_acq_ptr(&sx->sx_lock,
			    SX_LOCK_UNLOCKED | SX_LOCK_EXCLUSIVE_WAITERS,
			    tid | SX_LOCK_EXCLUSIVE_WAITERS)) {
				sleepq_release(&sx->lock_object);
				CTR2(KTR_LOCK, "%s: %p claimed by new writer",
				    __func__, sx);
				break;
			}
			sleepq_release(&sx->lock_object);
			x = SX_READ_VALUE(sx);
			continue;
		}

		/*
		 * Try to set the SX_LOCK_EXCLUSIVE_WAITERS.  If we fail,
		 * then loop back and retry.
		 */
		if (!(x & SX_LOCK_EXCLUSIVE_WAITERS)) {
			if (!atomic_cmpset_ptr(&sx->sx_lock, x,
			    x | SX_LOCK_EXCLUSIVE_WAITERS)) {
				sleepq_release(&sx->lock_object);
				x = SX_READ_VALUE(sx);
				continue;
			}
			if (LOCK_LOG_TEST(&sx->lock_object, 0))
				CTR2(KTR_LOCK, "%s: %p set excl waiters flag",
				    __func__, sx);
		}

		/*
		 * Since we have been unable to acquire the exclusive
		 * lock and the exclusive waiters flag is set, we have
		 * to sleep.
		 */
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p blocking on sleep queue",
			    __func__, sx);

#ifdef KDTRACE_HOOKS
		sleep_time -= lockstat_nsecs(&sx->lock_object);
#endif
		GIANT_SAVE();
		sleepq_add(&sx->lock_object, NULL, sx->lock_object.lo_name,
		    SLEEPQ_SX | ((opts & SX_INTERRUPTIBLE) ?
		    SLEEPQ_INTERRUPTIBLE : 0), SQ_EXCLUSIVE_QUEUE);
		if (!(opts & SX_INTERRUPTIBLE))
			sleepq_wait(&sx->lock_object, 0);
		else
			error = sleepq_wait_sig(&sx->lock_object, 0);
#ifdef KDTRACE_HOOKS
		sleep_time += lockstat_nsecs(&sx->lock_object);
		sleep_cnt++;
#endif
		if (error) {
			if (LOCK_LOG_TEST(&sx->lock_object, 0))
				CTR2(KTR_LOCK,
				    "%s: interruptible sleep by %p suspended by signal",
				    __func__, sx);
			break;
		}
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p resuming from sleep queue",
			    __func__, sx);
		x = SX_READ_VALUE(sx);
	}
#ifdef KDTRACE_HOOKS
	all_time += lockstat_nsecs(&sx->lock_object);
	if (sleep_time)
		LOCKSTAT_RECORD4(sx__block, sx, sleep_time,
		    LOCKSTAT_WRITER, (state & SX_LOCK_SHARED) == 0,
		    (state & SX_LOCK_SHARED) == 0 ? 0 : SX_SHARERS(state));
	if (lda.spin_cnt > sleep_cnt)
		LOCKSTAT_RECORD4(sx__spin, sx, all_time - sleep_time,
		    LOCKSTAT_WRITER, (state & SX_LOCK_SHARED) == 0,
		    (state & SX_LOCK_SHARED) == 0 ? 0 : SX_SHARERS(state));
#endif
	if (!error)
		LOCKSTAT_PROFILE_OBTAIN_RWLOCK_SUCCESS(sx__acquire, sx,
		    contested, waittime, file, line, LOCKSTAT_WRITER);
	GIANT_RESTORE();
	return (error);
}
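
/*
 * Simplified sketch (never compiled) of the 'easy case' that the function
 * above backs up: a single compare-and-set from SX_LOCK_UNLOCKED to "owned
 * by this thread", falling back to _sx_xlock_hard() on failure.  The real
 * fast path is the inline in sys/sx.h and differs in detail (assertions,
 * WITNESS and profiling hooks); this is only an approximation for reading
 * the code, and the function name is hypothetical.
 */
#if 0
static int
example_xlock_fast_path(struct sx *sx, int opts, const char *file, int line)
{
	uintptr_t tid;

	tid = (uintptr_t)curthread;
	if (atomic_cmpset_acq_ptr(&sx->sx_lock, SX_LOCK_UNLOCKED, tid))
		return (0);	/* uncontested: we now own the lock */

	/* Contested (or recursed): take the slow path implemented above. */
	return (_sx_xlock_hard(sx, tid, opts, file, line));
}
#endif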

/*
 * This function represents the so-called 'hard case' for sx_xunlock
 * operation.  All 'easy case' failures are redirected to this.  Note
 * that ideally this would be a static function, but it needs to be
 * accessible from at least sx.h.
 */
void
_sx_xunlock_hard(struct sx *sx, uintptr_t tid, const char *file, int line)
{
	uintptr_t x;
	int queue, wakeup_swapper;

	if (SCHEDULER_STOPPED())
		return;

	MPASS(!(sx->sx_lock & SX_LOCK_SHARED));

	/* If the lock is recursed, then unrecurse one level. */
	if (sx_xlocked(sx) && sx_recursed(sx)) {
		if ((--sx->sx_recurse) == 0)
			atomic_clear_ptr(&sx->sx_lock, SX_LOCK_RECURSED);
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p unrecursing", __func__, sx);
		return;
	}
	MPASS(sx->sx_lock & (SX_LOCK_SHARED_WAITERS |
	    SX_LOCK_EXCLUSIVE_WAITERS));
	if (LOCK_LOG_TEST(&sx->lock_object, 0))
		CTR2(KTR_LOCK, "%s: %p contested", __func__, sx);

	sleepq_lock(&sx->lock_object);
	x = SX_LOCK_UNLOCKED;

	/*
	 * The wake up algorithm here is quite simple and probably not
	 * ideal.  It gives precedence to shared waiters if they are
	 * present.  For this condition, we have to preserve the
	 * state of the exclusive waiters flag.
	 * If interruptible sleeps left the shared queue empty, avoid
	 * starvation of the threads sleeping on the exclusive queue by
	 * giving them precedence and cleaning up the shared waiters bit
	 * anyway.
	 */
	if ((sx->sx_lock & SX_LOCK_SHARED_WAITERS) != 0 &&
	    sleepq_sleepcnt(&sx->lock_object, SQ_SHARED_QUEUE) != 0) {
		queue = SQ_SHARED_QUEUE;
		x |= (sx->sx_lock & SX_LOCK_EXCLUSIVE_WAITERS);
	} else
		queue = SQ_EXCLUSIVE_QUEUE;

	/* Wake up all the waiters for the specific queue. */
	if (LOCK_LOG_TEST(&sx->lock_object, 0))
		CTR3(KTR_LOCK, "%s: %p waking up all threads on %s queue",
		    __func__, sx, queue == SQ_SHARED_QUEUE ? "shared" :
		    "exclusive");
	atomic_store_rel_ptr(&sx->sx_lock, x);
	wakeup_swapper = sleepq_broadcast(&sx->lock_object, SLEEPQ_SX, 0,
	    queue);
	sleepq_release(&sx->lock_object);
	if (wakeup_swapper)
		kick_proc0();
}
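
/*
 * Usage sketch (never compiled) for the interruptible write path: a caller
 * of sx_xlock_sig() ends up in _sx_xlock_hard() with SX_INTERRUPTIBLE set
 * and gets back the error from sleepq_wait_sig() if the sleep is aborted by
 * a signal, while the matching sx_xunlock() drops into _sx_xunlock_hard()
 * above when waiters are queued.  "example_data" reuses the hypothetical
 * structure sketched earlier; the function name is equally hypothetical.
 */
#if 0
static int
example_interruptible_writer(struct example_data *ed)
{
	int error;

	error = sx_xlock_sig(&ed->ed_lock);
	if (error != 0)
		return (error);		/* interrupted; lock is not held */
	ed->ed_value = 0;
	sx_xunlock(&ed->ed_lock);	/* wakes waiters via the hard case */
	return (0);
}
#endif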

/*
 * This function represents the so-called 'hard case' for sx_slock
 * operation.  All 'easy case' failures are redirected to this.  Note
 * that ideally this would be a static function, but it needs to be
 * accessible from at least sx.h.
 */
int
_sx_slock_hard(struct sx *sx, int opts, const char *file, int line)
{
	GIANT_DECLARE;
#ifdef ADAPTIVE_SX
	volatile struct thread *owner;
#endif
#ifdef LOCK_PROFILING
	uint64_t waittime = 0;
	int contested = 0;
#endif
	uintptr_t x;
	int error = 0;
#if defined(ADAPTIVE_SX) || defined(KDTRACE_HOOKS)
	struct lock_delay_arg lda;
#endif
#ifdef KDTRACE_HOOKS
	uintptr_t state;
	u_int sleep_cnt = 0;
	int64_t sleep_time = 0;
	int64_t all_time = 0;
#endif

	if (SCHEDULER_STOPPED())
		return (0);

#if defined(ADAPTIVE_SX)
	lock_delay_arg_init(&lda, &sx_delay);
#elif defined(KDTRACE_HOOKS)
	lock_delay_arg_init(&lda, NULL);
#endif
#ifdef KDTRACE_HOOKS
	all_time -= lockstat_nsecs(&sx->lock_object);
#endif
	x = SX_READ_VALUE(sx);
#ifdef KDTRACE_HOOKS
	state = x;
#endif

	/*
	 * As with rwlocks, we don't make any attempt to try to block
	 * shared locks once there is an exclusive waiter.
	 */
	for (;;) {
		/*
		 * If no other thread has an exclusive lock then try to bump up
		 * the count of sharers.  Since we have to preserve the state
		 * of SX_LOCK_EXCLUSIVE_WAITERS, if we fail to acquire the
		 * shared lock loop back and retry.
		 */
		if (x & SX_LOCK_SHARED) {
			MPASS(!(x & SX_LOCK_SHARED_WAITERS));
			if (atomic_cmpset_acq_ptr(&sx->sx_lock, x,
			    x + SX_ONE_SHARER)) {
				if (LOCK_LOG_TEST(&sx->lock_object, 0))
					CTR4(KTR_LOCK,
					    "%s: %p succeed %p -> %p", __func__,
					    sx, (void *)x,
					    (void *)(x + SX_ONE_SHARER));
				break;
			}
			x = SX_READ_VALUE(sx);
			continue;
		}
#ifdef KDTRACE_HOOKS
		lda.spin_cnt++;
#endif

#ifdef HWPMC_HOOKS
		PMC_SOFT_CALL( , , lock, failed);
#endif
		lock_profile_obtain_lock_failed(&sx->lock_object, &contested,
		    &waittime);

#ifdef ADAPTIVE_SX
		/*
		 * If the owner is running on another CPU, spin until
		 * the owner stops running or the state of the lock
		 * changes.
		 */
		if ((sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) {
			owner = lv_sx_owner(x);
			if (TD_IS_RUNNING(owner)) {
				if (LOCK_LOG_TEST(&sx->lock_object, 0))
					CTR3(KTR_LOCK,
					    "%s: spinning on %p held by %p",
					    __func__, sx, owner);
				KTR_STATE1(KTR_SCHED, "thread",
				    sched_tdname(curthread), "spinning",
				    "lockname:\"%s\"", sx->lock_object.lo_name);
				GIANT_SAVE();
				do {
					lock_delay(&lda);
					x = SX_READ_VALUE(sx);
					owner = lv_sx_owner(x);
				} while (owner != NULL && TD_IS_RUNNING(owner));
				KTR_STATE0(KTR_SCHED, "thread",
				    sched_tdname(curthread), "running");
				continue;
			}
		}
#endif

		/*
		 * Some other thread already has an exclusive lock, so
		 * start the process of blocking.
		 */
		sleepq_lock(&sx->lock_object);
		x = SX_READ_VALUE(sx);

		/*
		 * The lock could have been released while we spun.
		 * In this case loop back and retry.
		 */
		if (x & SX_LOCK_SHARED) {
			sleepq_release(&sx->lock_object);
			continue;
		}

#ifdef ADAPTIVE_SX
		/*
		 * If the owner is running on another CPU, spin until
		 * the owner stops running or the state of the lock
		 * changes.
		 */
		if (!(x & SX_LOCK_SHARED) &&
		    (sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) {
			owner = (struct thread *)SX_OWNER(x);
			if (TD_IS_RUNNING(owner)) {
				sleepq_release(&sx->lock_object);
				x = SX_READ_VALUE(sx);
				continue;
			}
		}
#endif

		/*
		 * Try to set the SX_LOCK_SHARED_WAITERS flag.  If we
		 * fail to set it drop the sleep queue lock and loop
		 * back.
		 */
		if (!(x & SX_LOCK_SHARED_WAITERS)) {
			if (!atomic_cmpset_ptr(&sx->sx_lock, x,
			    x | SX_LOCK_SHARED_WAITERS)) {
				sleepq_release(&sx->lock_object);
				x = SX_READ_VALUE(sx);
				continue;
			}
			if (LOCK_LOG_TEST(&sx->lock_object, 0))
				CTR2(KTR_LOCK, "%s: %p set shared waiters flag",
				    __func__, sx);
		}

		/*
		 * Since we have been unable to acquire the shared lock,
		 * we have to sleep.
		 */
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p blocking on sleep queue",
			    __func__, sx);

#ifdef KDTRACE_HOOKS
		sleep_time -= lockstat_nsecs(&sx->lock_object);
#endif
		GIANT_SAVE();
		sleepq_add(&sx->lock_object, NULL, sx->lock_object.lo_name,
		    SLEEPQ_SX | ((opts & SX_INTERRUPTIBLE) ?
		    SLEEPQ_INTERRUPTIBLE : 0), SQ_SHARED_QUEUE);
		if (!(opts & SX_INTERRUPTIBLE))
			sleepq_wait(&sx->lock_object, 0);
		else
			error = sleepq_wait_sig(&sx->lock_object, 0);
#ifdef KDTRACE_HOOKS
		sleep_time += lockstat_nsecs(&sx->lock_object);
		sleep_cnt++;
#endif
		if (error) {
			if (LOCK_LOG_TEST(&sx->lock_object, 0))
				CTR2(KTR_LOCK,
				    "%s: interruptible sleep by %p suspended by signal",
				    __func__, sx);
			break;
		}
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p resuming from sleep queue",
			    __func__, sx);
		x = SX_READ_VALUE(sx);
	}
#ifdef KDTRACE_HOOKS
	all_time += lockstat_nsecs(&sx->lock_object);
	if (sleep_time)
		LOCKSTAT_RECORD4(sx__block, sx, sleep_time,
		    LOCKSTAT_READER, (state & SX_LOCK_SHARED) == 0,
		    (state & SX_LOCK_SHARED) == 0 ? 0 : SX_SHARERS(state));
	if (lda.spin_cnt > sleep_cnt)
		LOCKSTAT_RECORD4(sx__spin, sx, all_time - sleep_time,
		    LOCKSTAT_READER, (state & SX_LOCK_SHARED) == 0,
		    (state & SX_LOCK_SHARED) == 0 ? 0 : SX_SHARERS(state));
#endif
	if (error == 0)
		LOCKSTAT_PROFILE_OBTAIN_RWLOCK_SUCCESS(sx__acquire, sx,
		    contested, waittime, file, line, LOCKSTAT_READER);
	GIANT_RESTORE();
	return (error);
}
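
/*
 * Usage sketch (never compiled) for the shared ('read') side serviced by
 * _sx_slock_hard() above: any number of readers may hold the lock at once
 * and a reader only blocks while a writer owns the lock.  "example_data"
 * again refers to the hypothetical structure from the earlier sketch.
 */
#if 0
static int
example_reader(struct example_data *ed)
{
	int snapshot;

	sx_slock(&ed->ed_lock);		/* shared: concurrent readers allowed */
	snapshot = ed->ed_value;
	sx_sunlock(&ed->ed_lock);	/* may enter _sx_sunlock_hard() */
	return (snapshot);
}
#endif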

/*
 * This function represents the so-called 'hard case' for sx_sunlock
 * operation.  All 'easy case' failures are redirected to this.  Note
 * that ideally this would be a static function, but it needs to be
 * accessible from at least sx.h.
 */
void
_sx_sunlock_hard(struct sx *sx, const char *file, int line)
{
	uintptr_t x;
	int wakeup_swapper;

	if (SCHEDULER_STOPPED())
		return;

	x = SX_READ_VALUE(sx);
	for (;;) {
		/*
		 * We should never have waiting sharers while at least one
		 * thread holds a shared lock.
		 */
		KASSERT(!(x & SX_LOCK_SHARED_WAITERS),
		    ("%s: waiting sharers", __func__));

		/*
		 * See if there is more than one shared lock held.  If
		 * so, just drop one and return.
		 */
		if (SX_SHARERS(x) > 1) {
			if (atomic_cmpset_rel_ptr(&sx->sx_lock, x,
			    x - SX_ONE_SHARER)) {
				if (LOCK_LOG_TEST(&sx->lock_object, 0))
					CTR4(KTR_LOCK,
					    "%s: %p succeeded %p -> %p",
					    __func__, sx, (void *)x,
					    (void *)(x - SX_ONE_SHARER));
				break;
			}

			x = SX_READ_VALUE(sx);
			continue;
		}

		/*
		 * If there aren't any waiters for an exclusive lock,
		 * then try to drop it quickly.
		 */
		if (!(x & SX_LOCK_EXCLUSIVE_WAITERS)) {
			MPASS(x == SX_SHARERS_LOCK(1));
			if (atomic_cmpset_rel_ptr(&sx->sx_lock,
			    SX_SHARERS_LOCK(1), SX_LOCK_UNLOCKED)) {
				if (LOCK_LOG_TEST(&sx->lock_object, 0))
					CTR2(KTR_LOCK, "%s: %p last succeeded",
					    __func__, sx);
				break;
			}
			x = SX_READ_VALUE(sx);
			continue;
		}

		/*
		 * At this point, there should just be one sharer with
		 * exclusive waiters.
		 */
		MPASS(x == (SX_SHARERS_LOCK(1) | SX_LOCK_EXCLUSIVE_WAITERS));

		sleepq_lock(&sx->lock_object);

		/*
		 * Wake up semantic here is quite simple:
		 * Just wake up all the exclusive waiters.
		 * Note that the state of the lock could have changed,
		 * so if it fails loop back and retry.
		 */
		if (!atomic_cmpset_rel_ptr(&sx->sx_lock,
		    SX_SHARERS_LOCK(1) | SX_LOCK_EXCLUSIVE_WAITERS,
		    SX_LOCK_UNLOCKED)) {
			sleepq_release(&sx->lock_object);
			x = SX_READ_VALUE(sx);
			continue;
		}
		if (LOCK_LOG_TEST(&sx->lock_object, 0))
			CTR2(KTR_LOCK, "%s: %p waking up all threads on "
			    "exclusive queue", __func__, sx);
		wakeup_swapper = sleepq_broadcast(&sx->lock_object, SLEEPQ_SX,
		    0, SQ_EXCLUSIVE_QUEUE);
		sleepq_release(&sx->lock_object);
		if (wakeup_swapper)
			kick_proc0();
		break;
	}
}

#ifdef INVARIANT_SUPPORT
#ifndef INVARIANTS
#undef	_sx_assert
#endif

/*
 * In the non-WITNESS case, sx_assert() can only detect that at least
 * *some* thread owns an slock, but it cannot guarantee that *this*
 * thread owns an slock.
 */
void
_sx_assert(const struct sx *sx, int what, const char *file, int line)
{
#ifndef WITNESS
	int slocked = 0;
#endif

	if (panicstr != NULL)
		return;
	switch (what) {
	case SA_SLOCKED:
	case SA_SLOCKED | SA_NOTRECURSED:
	case SA_SLOCKED | SA_RECURSED:
#ifndef WITNESS
		slocked = 1;
		/* FALLTHROUGH */
#endif
	case SA_LOCKED:
	case SA_LOCKED | SA_NOTRECURSED:
	case SA_LOCKED | SA_RECURSED:
#ifdef WITNESS
		witness_assert(&sx->lock_object, what, file, line);
#else
		/*
		 * If some other thread has an exclusive lock or we
		 * have one and are asserting a shared lock, fail.
		 * Also, if no one has a lock at all, fail.
		 */
		if (sx->sx_lock == SX_LOCK_UNLOCKED ||
		    (!(sx->sx_lock & SX_LOCK_SHARED) && (slocked ||
		    sx_xholder(sx) != curthread)))
			panic("Lock %s not %slocked @ %s:%d\n",
			    sx->lock_object.lo_name, slocked ? "share " : "",
			    file, line);

		if (!(sx->sx_lock & SX_LOCK_SHARED)) {
			if (sx_recursed(sx)) {
				if (what & SA_NOTRECURSED)
					panic("Lock %s recursed @ %s:%d\n",
					    sx->lock_object.lo_name, file,
					    line);
			} else if (what & SA_RECURSED)
				panic("Lock %s not recursed @ %s:%d\n",
				    sx->lock_object.lo_name, file, line);
		}
#endif
		break;
	case SA_XLOCKED:
	case SA_XLOCKED | SA_NOTRECURSED:
	case SA_XLOCKED | SA_RECURSED:
		if (sx_xholder(sx) != curthread)
			panic("Lock %s not exclusively locked @ %s:%d\n",
			    sx->lock_object.lo_name, file, line);
		if (sx_recursed(sx)) {
			if (what & SA_NOTRECURSED)
				panic("Lock %s recursed @ %s:%d\n",
				    sx->lock_object.lo_name, file, line);
		} else if (what & SA_RECURSED)
			panic("Lock %s not recursed @ %s:%d\n",
			    sx->lock_object.lo_name, file, line);
		break;
	case SA_UNLOCKED:
#ifdef WITNESS
		witness_assert(&sx->lock_object, what, file, line);
#else
		/*
		 * If we hold an exclusive lock, fail.  We can't
		 * reliably check to see if we hold a shared lock or
		 * not.
		 */
		if (sx_xholder(sx) == curthread)
			panic("Lock %s exclusively locked @ %s:%d\n",
			    sx->lock_object.lo_name, file, line);
#endif
		break;
	default:
		panic("Unknown sx lock assertion: %d @ %s:%d", what, file,
		    line);
	}
}
#endif	/* INVARIANT_SUPPORT */
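
/*
 * Usage sketch (never compiled) for the assertion support above: consumers
 * annotate internal helpers with the lock state they rely on, and the check
 * compiles away in kernels built without INVARIANTS.  The structure and
 * function names are hypothetical.
 */
#if 0
static void
example_requires_xlock(struct example_data *ed)
{

	sx_assert(&ed->ed_lock, SA_XLOCKED);
	ed->ed_value++;
}
#endif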

#ifdef DDB
static void
db_show_sx(const struct lock_object *lock)
{
	struct thread *td;
	const struct sx *sx;

	sx = (const struct sx *)lock;

	db_printf(" state: ");
	if (sx->sx_lock == SX_LOCK_UNLOCKED)
		db_printf("UNLOCKED\n");
	else if (sx->sx_lock == SX_LOCK_DESTROYED) {
		db_printf("DESTROYED\n");
		return;
	} else if (sx->sx_lock & SX_LOCK_SHARED)
		db_printf("SLOCK: %ju\n", (uintmax_t)SX_SHARERS(sx->sx_lock));
	else {
		td = sx_xholder(sx);
		db_printf("XLOCK: %p (tid %d, pid %d, \"%s\")\n", td,
		    td->td_tid, td->td_proc->p_pid, td->td_name);
		if (sx_recursed(sx))
			db_printf(" recursed: %d\n", sx->sx_recurse);
	}

	db_printf(" waiters: ");
	switch (sx->sx_lock &
	    (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS)) {
	case SX_LOCK_SHARED_WAITERS:
		db_printf("shared\n");
		break;
	case SX_LOCK_EXCLUSIVE_WAITERS:
		db_printf("exclusive\n");
		break;
	case SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS:
		db_printf("exclusive and shared\n");
		break;
	default:
		db_printf("none\n");
	}
}

/*
 * Check to see if a thread that is blocked on a sleep queue is actually
 * blocked on an sx lock.  If so, output some details and return true.
 * If the lock has an exclusive owner, return that in *ownerp.
 */
int
sx_chain(struct thread *td, struct thread **ownerp)
{
	struct sx *sx;

	/*
	 * Check to see if this thread is blocked on an sx lock.
	 * First, we check the lock class.  If that is ok, then we
	 * compare the lock name against the wait message.
	 */
	sx = td->td_wchan;
	if (LOCK_CLASS(&sx->lock_object) != &lock_class_sx ||
	    sx->lock_object.lo_name != td->td_wmesg)
		return (0);

	/* We think we have an sx lock, so output some details. */
	db_printf("blocked on sx \"%s\" ", td->td_wmesg);
	*ownerp = sx_xholder(sx);
	if (sx->sx_lock & SX_LOCK_SHARED)
		db_printf("SLOCK (count %ju)\n",
		    (uintmax_t)SX_SHARERS(sx->sx_lock));
	else
		db_printf("XLOCK\n");
	return (1);
}
#endif