Commit Graph

64 Commits

Author SHA1 Message Date
jhb
2660b97507 Count the context switch when blocking on a mutex as a voluntary context
switch.  Count the context switch when preempting the current thread to let
a higher priority thread blocked on a mutex we just released run as an
involuntary context switch.

Reported by:	bde
2001-06-25 18:29:32 +00:00
jhb
21bc7f9fa7 - Move state about lock objects out of struct lock_object and into a new
struct lock_instance that is stored in the per-process and per-CPU lock
  lists.  Previously, the lock lists just kept a pointer to each lock held.
  That pointer is now replaced by a lock instance which contains a pointer
  to the lock object, the file and line of the last acquisition of a lock,
  and various flags about a lock including its recursion count.
- If we sleep while holding a sleepable lock, then mark that lock instance
  as having slept and ignore any lock order violations that occur while
  acquiring Giant when we wake up with slept locks.  This is ok because of
  Giant's special nature.
- Allow witness to differentiate between shared and exclusive locks and
  unlocks of a lock.  Witness will now detect the case when a lock is
  acquired first in one mode and then in another.  Mutexes are always
  locked and unlocked exclusively.  Witness will also now detect the case
  where a process attempts to unlock a shared lock while holding an
  exclusive lock and vice versa.
- Fix a bug in the lock list implementation where we used the wrong
  constant to detect the case where a lock list entry was full.
2001-05-04 17:15:16 +00:00
markm
bcca5847d5 Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
jhb
4448046853 Exit and re-enter the critical section while spinning for a spinlock so
that interrupts can come in while we are waiting for a lock.
2001-04-17 03:34:52 +00:00
markm
0efbb4e263 Handle a rare but fatal race invoked sometimes when SIGSTOP is
invoked.
2001-04-13 09:29:34 +00:00
jhb
0c490fd02e Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects.  Each lock class specifies a
  name and set of flags (or properties) shared by all locks of a given
  type.  Currently there are three lock classes: spin mutexes, sleep
  mutexes, and sx locks.  A lock object specifies properties of an
  additional lock along with a lock name and all of the extra stuff needed
  to make witness work with a given lock.  This abstract lock stuff is
  defined in sys/lock.h.  The lockmgr constants, types, and prototypes have
  been moved to sys/lockmgr.h.  For temporary backwards compatability,
  sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
  locks held.  By making this per-cpu, we do not have to jump through
  magic hoops to deal with sched_lock changing ownership during context
  switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
  proc->p_sleeplocks, which is a list of held sleep locks including sleep
  mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
  level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
  - MTX_NOWITNESS - specifies that this lock should be ignored by witness.
    This is used for the mutex that blocks a sx lock for example.
  - MTX_QUIET - this is not new, but you can pass this to mtx_init() now
    and no events will be logged for this lock, so that one doesn't have
    to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag.  Use this flag to export
  a mtx_initialized() macro that can be safely called from drivers.  Also,
  we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
  performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
  more accurate file and line numbers.
2001-03-28 09:03:24 +00:00
jhb
4572ff9c78 - Switch from using save/disable/restore_intr to using critical_enter/exit
and change the u_int mtx_saveintr member of struct mtx to a critical_t
  mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
  an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules.  Change all the
  _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
  parameters and rename them to use a prefix of two underscores.  Inside
  of kern_mutex.c, generate wrapper functions for
  _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
  that are called from modules.  The macros mtx_{un,}lock_{spin,}_flags()
  are mapped to the __mtx_* macros inside of the kernel to inline the
  usual case of mutex operations and map to the internal _mtx_* functions
  in the module case so that modules will use WITNESS and KTR logging if
  the kernel is compiled with support for it.
2001-03-28 02:40:47 +00:00
jhb
f108bc4208 Fix mtx_legal2block. The only time that it is bad to block on a mutex is
if we hold a spin mutex, since we can trivially get into deadlocks if we
start switching out of processes that hold spinlocks.  Checking to see if
interrupts were disabled was a sort of cheap way of doing this since most
of the time interrupts were only disabled when holding a spin lock.  At
least on the i386.  To fix this properly, use a per-process counter
p_spinlocks that counts the number of spin locks currently held, and
instead of checking to see if interrupts are disabled in the witness code,
check to see if we hold any spin locks.  Since child processes always
start up with the sched lock magically held in fork_exit(), we initialize
p_spinlocks to 1 for child processes.  Note that proc0 doesn't go through
fork_exit(), so it starts with no spin locks held.

Consulting from:	cp
2001-03-09 07:24:17 +00:00
jhb
46333fa556 - Add an extra check in priority_propagation() for UP systems to ensure we
don't end up back at ourselves which would indicate deadlock.
- Add the proc lock to the witness dup_list as we may hold more than one
  process lock at a time.
- Don't assert a mutex is owned in _mtx_unlock_sleep() as that is too late.
  We do the checks in the macros instead.
2001-03-07 02:45:15 +00:00
julian
cceb0beb35 Shuffle netgraph mutexes a bit and hold a reference on a node
from the function that is calling the destructor.
2001-02-28 18:49:09 +00:00
jake
623f06756a Sigh. Try to get priorities sorted out. Don't bother trying to
update native priority, it is diffcult to get right and likely
to end up horribly wrong.  Use an honestly wrong fixed value
that seems to work; PUSER for user threads, and the interrupt
priority for ithreads.  Set it once when the process is created
and forget about it.

Suggested by:	bde
Pointy hat:	me
2001-02-28 02:53:44 +00:00
jake
3d95b58cac Initialize native priority to PRI_MAX. It was usually 0 which made a
process's priority go through the roof when it released a (contested)
mutex.  Only set the native priority in mtx_lock if hasn't already
been set.

Reviewed by:	jhb
2001-02-26 23:27:35 +00:00
jake
df76fc11c8 Remove brackets around variables in a function that used to be
a macro.
2001-02-25 16:18:13 +00:00
julian
5f7d2c9e8b Move netgraph spimlock order entries out of
the #ifdef SMP section. They need to be there for UP too.
2001-02-25 04:56:23 +00:00
jhb
de4709b849 Grrr, s/INVARIANTS_SUPPORT/INVARIANT_SUPPORT/. 2001-02-24 21:29:32 +00:00
jhb
59969ced20 - Axe RETIP() as it was very i386 specific and unwieldy. Instead, use the
passed in filename and line number in the KTR tracepoint message.
- Even though it is #if 0'd code, change the code to detect that a process
  is an interrupt thread to check p->p_ithd against NULL rather than
  checking non-existant process flags from BSD/OS.
- Use '%p' to print pointers in KTR log messages instead of assuming
  sizeof(int) == sizeof(void *).
- Don't set p_mtxname to NULL when releasing a mutex.  It doesn't hurt
  to leave it set (we don't clear w_mesg for example) and at least at
  one time in the past, there used to be race conditions in the kernel
  that would result in setting this to NULL causing the kernel to
  dereference NULL.
- Make the _mtx_assert() function be compiled in if INVARIANTS_SUPPORT is
  defined rather than if INVARIANTS is defined so that a KLD compiled
  with INVARIANTS that uses mtx_assert() can be used with a kernel that
  just has INVARIANT_SUPPORT compiled in.
2001-02-24 19:36:13 +00:00
julian
f39a190a3f Add knowledge of the netgraph spinlocks into the Witness code.
Well, at least I think that's how it's done.
2001-02-24 14:29:47 +00:00
jhb
f871fe7250 - Use the NOCPU constant.
- Move the ithread spin locks before sched lock and clk in preparation for
  future commits to the ithread code.
2001-02-22 02:12:54 +00:00
bmilekic
0e619322a9 Change all instances of CURPROC' and CURTHD' to `curproc,' in order
to stay consistent.

Requested by: bde
2001-02-12 03:15:43 +00:00
jake
55d5108ac5 Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propogated up into
  interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propogation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propogate_priority().  It should now have the desired
  effect without needing to also propogate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propogate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propogate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at apprioriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
bmilekic
c4e7caf799 - Place back STR string declarations for lock/unlock strings used for KTR_LOCK
tracing in order to avoid duplication.
- Insert some tracepoints back into the mutex acq/rel code, thus ensuring
  that we can trace all lock acq/rel's again.
- All CURPROC != NULL checks are MPASS()es (under MUTEX_DEBUG) because they
  signify a serious mutex corruption.
- Change up some KASSERT()s to MPASS()es, and vice-versa, depending on the
  type of problem we're debugging (INVARIANTS is used here to check that
  the API is being used properly whereas MUTEX_DEBUG is used to ensure that
  something general isn't happening that will have bad impact on mutex
  locks).

Reminded by: jhb, jake, asmodai
2001-02-11 02:54:16 +00:00
jhb
6e319b65bd Unify the two sleep lock order lists to enforce the process lock ->
uidinfo lock locking order.
2001-02-09 20:52:02 +00:00
jhb
67e1fedd42 - Change the 'witness_list' ddb command to 'show mutexes'. Note that this
will only display sleep mutexes held by the current process.
- Clean up some nits in the witness_display() function and add a ddb
  command 'show witness' that dumps the hierarchy and order lists to the
  console.
- Use queue(3) macros where appropriate.
- Resort the spin lock order list so that "com" is before "sched_lock".
  Also, add appropriate #ifdef's around SMP and i386-specific mutexes.
- Add two new mutexes used to protect the ithread lists and tables to the
  order list.

Requested by:	bde (1)
2001-02-09 15:19:41 +00:00
bmilekic
f364d4ac36 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
jhb
02ab182bc1 Add a new ddb command 'witness_list' that lists the mutexes held by
curproc.

Requested by:	peter
2001-01-27 07:51:34 +00:00
jasone
8d2ec1ebc4 Convert all simplelocks to mutexes and remove the simplelock implementations. 2001-01-24 12:35:55 +00:00
jhb
233d6de215 - Don't use a union and fun tricks to shave one extra pointer off of struct
mtx right now as it makes debugging harder.  When we are in optimizing
  mode, we can revisit this.
- Fix the KTR trace messages to use %p rather than 0x%p to avoid duplicate
  0x's in KTR output.
- During witness_fixup, release Giant so that witness doesn't get confused.
  Also, grab all_mtx while walking the list of mutexes.
- Remove w_sleep and w_recurse.  Instead, perform checks on mutexes using
  the mutex's mtx_flags field.
- Allow debug.witness_ddb and debug.witness_skipspin to be set from the
  loader.
- Add Giant to the front of existing order_list entries to help ensure
  Giant is always first.
- Add an order entry for the various proc locks.  Note that this only
  helps keep proc in order mostly as the allproc and proctree mutexes are
  only obtained during a lockmgr operation on the specified mutex.
2001-01-24 10:57:01 +00:00
jasone
cf0a6d372c Print correct file name and line number in mtx_assert().
Noticed by:	jake
2001-01-22 05:56:55 +00:00
jasone
ec55088093 Move most of sys/mutex.h into kern/kern_mutex.c, thereby making the mutex
inline functions non-inlined.  Hide parts of the mutex implementation that
should not be exposed.

Make sure that WITNESS code is not executed during boot until the mutexes
are fully initialized by SI_SUB_MUTEX (the original motivation for this
commit).

Submitted by:	peter
2001-01-21 22:34:43 +00:00
jasone
c3fd76623a Make the order of the static initializer for all_mtx match the order of
fields in struct mtx.

Found by:	jake
2001-01-21 11:05:02 +00:00
jasone
24d53563ed Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.
2001-01-21 07:52:20 +00:00
jake
ea36052df5 - Make npx_intr INTR_MPSAFE and move acquiring Giant into the
function itself.
- Remove a hack to allow acquiring Giant from the npx asm trap
  vector.
2001-01-20 02:30:58 +00:00
bmilekic
37decc93f5 Implement MTX_RECURSE flag for mtx_init().
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument variable. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.

The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.

The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.

Reviewed by: jhb
2001-01-19 01:59:14 +00:00
jake
4f5d8ed825 Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables
other then curproc.
2001-01-10 04:43:51 +00:00
jhb
fd5f604e9f - Add a new flag MTX_QUIET that can be passed to the various mtx_*
functions.  If this flag is set, then no KTR log messages are issued.
  This is useful for blocking excessive logging, such as with the internal
  mutex used by the witness code.
- Use MTX_QUIET on all of the mtx_enter/exit operations on the internal
  mutex used by the witness code.
- If we are in a panic, don't do witness checks in witness_enter(),
  witness_exit(), and witness_try_enter(), just return.
2000-12-13 21:53:42 +00:00
jake
e04de3cdaa - Add code to detect if a system call returns with locks other than Giant
held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
  panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
  a system call, which were missing for alpha and ia64.
2000-12-12 01:14:32 +00:00
jhb
28288268c8 Oops, the witness mutex is a spin lock, so use MTX_SPIN in the call to
mtx_init().  Since the witness code ignores its internal mutex, this
doesn't result in any functional change.
2000-12-12 00:37:18 +00:00
dwmalone
dd75d1d73b Convert more malloc+bzero to malloc+M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
jhb
7cc6cf3ca7 Split the WITNESS and MUTEX_DEBUG options apart so that WITNESS does not
depend on MUTEX_DEBUG.  The MUTEX_DEBUG option turns on extra assertions
and checks to verify that mutexes themselves are implemented properly.
The WITNESS option uses extra checks and diagnostics to verify that other
code is using mutexes properly.
2000-12-01 00:10:59 +00:00
jhb
ab556afac4 Fix up priority propagation:
- Use a better test for determining when a process is running.
- Convert some checks to assertions.
- Remove unnecessary tests.
- Save the priority before acquiring a mutex rather than in msleep(9).
2000-11-30 00:51:16 +00:00
jhb
e96d454c32 Set p_mtxname when blocking on a mutex and clear it when waking up. 2000-11-29 20:17:15 +00:00
jhb
0640108ef5 Use an atomic operation with an appropriate memory barrier when releasing
a contested sleep mutex in the case that at least two processes are blocked
on the contested mutex.
2000-11-29 18:41:19 +00:00
jhb
42af0b6bb4 The sched_lock mutex goes after the sio mutex in the locking order since
a software interrupt can be scheduled in the sio interrupt handler while
the sio mutex is held.
2000-11-29 18:38:14 +00:00
jhb
35b8911585 Save the line number and filename of the last mtx_enter operation for
spin locks.  We already do this for sleep locks.
2000-11-29 18:37:01 +00:00
alfred
011c33c2f9 Move the #define of _KERN_MUTEX_C_ so that it's before any system headers
are included.  System headers can include sys/mutex.h and then certain
macros do not get defined.

Reviewed by: jake
2000-11-26 21:14:17 +00:00
jake
089f02872c Add uidinfo hash and uidinfo struct to the witness order list. 2000-11-26 15:05:46 +00:00
jake
f265931038 - Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
  softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
  CALLOUT_MPSAFE, which specifies that a callout handler does not
  need giant.  There is still no way to set this flag when
  regstering a callout.

Reviewed by:	-smp@, jlemon
2000-11-19 06:02:32 +00:00
jake
3a97b3e213 - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
jhb
c0bba69cbe Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
jhb
3e6befb757 Include the right headers to get the DDB #define and the db_active variable. 2000-11-15 22:08:16 +00:00