Commit Graph

244 Commits

Author SHA1 Message Date
Jeff Roberson
710eacdc5f - Placing the 'volatile' on the right side of the * in the td_lock
declaration removes the need for __DEVOLATILE().

Pointed out by:	tegge
2007-06-06 03:40:47 +00:00
Attilio Rao
d301eb10c7 Fix a problem with not-preemptive kernels caming from mis-merging of
existing code with the new thread_lock patch.
This also cleans up a bit unlock operation for mutexes.

Approved by: jhb, jeff(mentor)
2007-06-05 18:57:09 +00:00
Konstantin Belousov
b95b98b0bd Restore non-SMP build.
Reviewed by:	attilio
2007-06-05 14:20:13 +00:00
Jeff Roberson
2502c107ba Commit 3/14 of sched_lock decomposition.
- Add a per-turnstile spinlock to solve potential priority propagation
   deadlocks that are possible with thread_lock().
 - The turnstile lock order is defined as the exact opposite of the
   lock order used with the sleep locks they represent.  This allows us
   to walk in reverse order in priority_propagate and this is the only
   place we wish to multiply acquire turnstile locks.
 - Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
 - Change the turnstile interface to pass back turnstile pointers to the
   consumers.  This allows us to reduce some locking and makes it easier
   to cancel turnstile assignment while the turnstile chain lock is held.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:51:44 +00:00
John Baldwin
c91fcee75d Move lock_profile_object_{init,destroy}() into lock_{init,destroy}(). 2007-05-18 15:04:59 +00:00
John Baldwin
c0bfd70306 Teach 'show lock' to properly handle a destroyed mutex. 2007-05-08 21:50:46 +00:00
Kip Macy
70fe8436c8 move lock_profile calls out of the macros and into kern_mutex.c
add check for mtx_recurse == 0 when releasing sleep lock
2007-04-03 22:52:31 +00:00
John Baldwin
cd6e6e4e11 - Simplify the #ifdef's for adaptive mutexes and rwlocks by conditionally
defining a macro earlier in the file.
- Add NO_ADAPTIVE_RWLOCKS option to disable adaptive spinning for rwlocks.
2007-03-22 16:09:23 +00:00
John Baldwin
aa89d8cd52 Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes,
rwlocks, and sx locks to 'lock_object'.
2007-03-21 21:20:51 +00:00
John Baldwin
6e21afd40c Add two new function pointers 'lc_lock' and 'lc_unlock' to lock classes.
These functions are intended to be used to drop a lock and then reacquire
it when doing an sleep such as msleep(9).  Both functions accept a
'struct lock_object *' as their first parameter.  The 'lc_unlock' function
returns an integer that is then passed as the second paramter to the
subsequent 'lc_lock' function.  This can be used to communicate state.
For example, sx locks and rwlocks use this to indicate if the lock was
share/read locked vs exclusive/write locked.

Currently, spin mutexes and lockmgr locks do not provide working lc_lock
and lc_unlock functions.
2007-03-09 16:27:11 +00:00
John Baldwin
ae8dde30c2 Use C99-style struct member initialization for lock classes. 2007-03-09 16:04:44 +00:00
Kip Macy
c66d760608 lock stats updates need to be protected by the lock 2007-03-02 07:21:20 +00:00
Kip Macy
a5bceb77f2 Evidently I've overestimated gcc's ability to peak inside inline functions
and optimize away unused stack values. The 48 bytes that the lock_profile_object
adds to the stack evidently has a measurable performance impact on certain workloads.
2007-03-01 09:35:48 +00:00
Kip Macy
f183910b97 Further improvements to LOCK_PROFILING:
- Fix missing initialization in kern_rwlock.c causing bogus times to be collected
 - Move updates to the lock hash to after the lock is released for spin mutexes,
   sleep mutexes, and sx locks
 - Add new kernel build option LOCK_PROFILE_FAST - only update lock profiling
   statistics when an acquisition is contended. This reduces the overhead of
   LOCK_PROFILING to increasing system time by 20%-25% which on
   "make -j8 kernel-toolchain" on a dual woodcrest is unmeasurable in terms
   of wall-clock time. Contrast this to enabling lock profiling without
   LOCK_PROFILE_FAST and I see a 5x-6x slowdown in wall-clock time.
2007-02-27 06:42:05 +00:00
Kip Macy
fe68a91631 general LOCK_PROFILING cleanup
- only collect timestamps when a lock is contested - this reduces the overhead
  of collecting profiles from 20x to 5x

- remove unused function from subr_lock.c

- generalize cnt_hold and cnt_lock statistics to be kept for all locks

- NOTE: rwlock profiling generates invalid statistics (and most likely always has)
  someone familiar with that should review
2007-02-26 08:26:44 +00:00
Kip Macy
1364a812e7 - Fix some gcc warnings in lock_profile.h
- add cnt_hold cnt_lock support for spin mutexes
- make sure contested is initialized to zero to only bump contested when appropriate
- move initialization function to kern_mutex.c to avoid cyclic dependency between
  mutex.h and lock_profile.h
2006-12-16 02:37:58 +00:00
Kip Macy
61bd5e21b3 track lock class name in a way that doesn't break WITNESS 2006-11-13 05:41:46 +00:00
Kip Macy
7c0435b933 MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile
wait (time waited to acquire) and hold times for *all* kernel locks. If
the architecture has a system synchronized TSC, the profiling code will
use that - thereby minimizing profiling overhead. Large chunks of profiling
code have been moved out of line, the overhead measured on the T1 for when
it is compiled in but not enabled is < 1%.

Approved by: scottl (standing in for mentor rwatson)
Reviewed by: des and jhb
2006-11-11 03:18:07 +00:00
John Baldwin
0fa2168b19 - When spinning on a spin lock, if the debugger is active or we are in a
panic, go ahead and do the longer DELAY(1) spin wait.
- If we panic due to spinning too long, print out a few more details
  including the pointer to the mutex in question and the tid of the owning
  thread.
2006-08-15 18:26:12 +00:00
John Baldwin
764e4d54e9 Adjust td_locks for non-spin mutexes, rwlocks, and sx locks so that it is
a count of all non-spin locks, not just lockmgr locks.  This can give us a
much cheaper way to see if we have any locks held (such as when returning
to userland via userret()) without requiring WITNESS.

MFC after:	1 week
2006-07-27 21:45:55 +00:00
John Baldwin
186abbd727 Write a magic value into mtx_lock when destroying a mutex that will force
all other mtx_lock() operations to block.  Previously, when the mutex was
destroyed, it would still have a valid value in mtx_lock(): either the
unowned cookie, which would allow a subsequent mtx_lock() to succeed, or a
pointer to the thread who destroyed the mutex if the mutex was locked when
it was destroyed.

MFC after:	3 days
2006-07-27 19:58:18 +00:00
John Baldwin
49b94bfc54 Bah, fix fat finger in last. Invert the ~ on MTX_FLAGMASK as it's
non-intuitive for the ~ to be built into the mask.  All the users now
explicitly ~ the mask.  In addition, add MTX_UNOWNED to the mask even
though it technically isn't a flag.  This should unbreak mtx_owner().

Quickly spotted by:	kris
2006-06-03 21:11:33 +00:00
John Baldwin
315ce35f7b Simplify mtx_owner() so it only reads m->mtx_lock once. 2006-06-03 20:45:00 +00:00
John Baldwin
f781b5a4bb Style fix to be more like _mtx_lock_sleep(): use 'while (!foo) { ... }'
instead of 'for (;;) { if (foo) break; ... }'.
2006-06-03 20:44:01 +00:00
Poul-Henning Kamp
c40da00ca3 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
John Baldwin
73dbd3da73 Remove various bits of conditional Alpha code and fixup a few comments. 2006-05-12 05:04:46 +00:00
John Baldwin
76447e5618 Mark the thread pointer used during an adaptive spin volatile so that the
compiler doesn't decide to cache td_state.  Cachine the state would cause
the spinning thread to not notice when the owning thread stopped executing
(if it was preempted for example) which could result in livelock.
2006-04-14 19:51:50 +00:00
John Baldwin
7aa4f6852a - Add support for having both a shared and exclusive queue of threads in
each turnstile.  Also, allow for the owner thread pointer of a turnstile
  to be NULL.  This is needed for the upcoming reader/writer lock
  implementation.
- Add a new ddb command 'show turnstile' that will look up the turnstile
  associated with the given lock argument and display useful information
  like the list of threads blocked on each queue, etc.  If there isn't an
  active turnstile for a lock at the specified address, then the function
  will see if there is an active turnstile at the specified address and
  display info about it if so.
- Adjust the mutex code to handle the turnstile API changes.

Tested on:	i386 (all), alpha, amd64, sparc64 (1 and 3)
2006-01-27 22:42:12 +00:00
John Baldwin
67f7fe8c01 Whitespace fix. 2006-01-24 22:24:05 +00:00
John Baldwin
83a81bcb14 Add a new file (kern/subr_lock.c) for holding code related to struct
lock_obj objects:
- Add new lock_init() and lock_destroy() functions to setup and teardown
  lock_object objects including KTR logging and registering with WITNESS.
- Move all the handling of LO_INITIALIZED out of witness and the various
  lock init functions into lock_init() and lock_destroy().
- Remove the constants for static indices into the lock_classes[] array
  and change the code outside of subr_lock.c to use LOCK_CLASS to compare
  against a known lock class.
- Move the 'show lock' ddb function and lock_classes[] array out of
  kern_mutex.c over to subr_lock.c.
2006-01-17 16:55:17 +00:00
John Baldwin
550d1c9392 Initialize thread0.td_contested in init_turnstiles() rather than
mutex_init() as it is used by the turnstile code and is not mutex-specific.
2006-01-17 16:47:42 +00:00
Scott Long
861a23087b If destroying a spinlock, make sure that it is exited properly.
Submitted by: jhb
MFC After: 3 days
2006-01-08 00:18:34 +00:00
John Baldwin
3b783acd2a Revert an untested local change that crept in with the lo_class changes
and subsequently broke the build.  This change is supposed to fix the
case where doing a mtx_destroy() off a spin mutex while you hold it fails.
If it had been tested I would just leave it in, but it hasn't been tested
yet, so it will have to wait until later.
2006-01-07 14:03:15 +00:00
Tai-hwa Liang
75d6a87fa3 Trying to fix compilation bustage introduced in rev1.160 by converting
a missing lo_class to LO_CLASSINDEX().
2006-01-07 02:07:08 +00:00
John Baldwin
3c6decc327 Trim another pointer from struct lock_object (and thus from struct mtx and
struct sx).  Instead of storing a direct pointer to a our lock_class
struct in lock_object, reserve 4 bits in the lo_flags field to serve as an
index into a global lock_classes array that contains pointers to the lock
classes.  Only debugging code such as WITNESS or INVARIANTS checks and KTR
logging need to access the lock_class member, so this shouldn't add any
overhead to production kernels.  It might add some slight overhead to
kernels using those debug options however.

As with the previous set of changes to lock_object, this is going to
completely obliterate the kernel ABI, so be sure to recompile all your
modules.
2006-01-06 18:07:32 +00:00
John Baldwin
d272fe53a4 Add a new 'show lock' command to ddb. If the argument has a valid lock
class, then it displays various information about the lock and calls a
new function pointer in lock_class (lc_ddb_show) to dump class-specific
information about the lock as well (such as the owner of a mutex or
xlock'ed sx lock).  This is easier than staring at hex dumps of locks to
figure out who owns the lock, etc.  Note that extending lock_class doesn't
affect the ABI for any kernel modules as the only code that deals with
lock_class structures directly is kern_mutex.c, kern_sx.c, and witness.

MFC after:	1 week
2005-12-13 23:14:35 +00:00
John Baldwin
8c4b6380c7 Move the initialization of the devmtx into the mutex_init() function
called during early init before cninit().

Tested on:	i386, alpha, sparc64
Reviewed by:	phk, imp
Reported by:	Divacky Roman xdivac02 at stud dot fit dot vutbr dot cz
MFC after:	1 week
2005-10-18 18:27:44 +00:00
John Baldwin
83cece6fa1 - Add an assertion to panic if one tries to call mtx_trylock() on a spin
mutex.
- Don't panic if a spin lock is held too long inside _mtx_lock_spin() if
  panicstr is set (meaning that we are already in a panic).  Just keep
  spinning forever instead.
2005-09-02 20:21:49 +00:00
Paul Saab
1126349ae7 Ignore mutex asserts when we're dumping as well. This allows me
to panic a system from DDB when INVARIANTS is compiled into the
kernel on a scsi system.
2005-07-30 05:54:30 +00:00
John Baldwin
122eceef61 Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables.  This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after:	3 days
Tested on:	i386, alpha, sparc64
Compiled on:	ia64, powerpc, amd64
Kernel toolchain busted on:	arm
2005-07-15 18:17:59 +00:00
Gleb Smirnoff
4f20185860 Add additional newline to debug.mutex.prof.stats header, so that
column names are printed exactly above the columns.
2005-04-08 14:14:09 +00:00
John Baldwin
c6a37e8413 Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
John Baldwin
33fb8a386e Rework the optimization for spinlocks on UP to be slightly less drastic and
turn it back on.  Specifically, the actual changes are now less intrusive
in that the _get_spin_lock() and _rel_spin_lock() macros now have their
contents changed for UP vs SMP kernels which centralizes the changes.
Also, UP kernels do not use _mtx_lock_spin() and no longer include it.  The
UP versions of the spin lock functions do not use any atomic operations,
but simple compares and stores which allow mtx_owned() to still work for
spin locks while removing the overhead of atomic operations.

Tested on:	i386, alpha
2005-01-05 21:13:27 +00:00
John Baldwin
2ff0e645d1 Refine the turnstile and sleep queue interfaces just a bit:
- Add a new _lock() call to each API that locks the associated chain lock
  for a lock_object pointer or wait channel.  The _lookup() functions now
  require that the chain lock be locked via _lock() when they are called.
- Change sleepq_add(), turnstile_wait() and turnstile_claim() to lookup
  the associated queue structure internally via _lookup() rather than
  accepting a pointer from the caller.  For turnstiles, this means that
  the actual lookup of the turnstile in the hash table is only done when
  the thread actually blocks rather than being done on each loop iteration
  in _mtx_lock_sleep().  For sleep queues, this means that sleepq_lookup()
  is no longer used outside of the sleep queue code except to implement an
  assertion in cv_destroy().
- Change sleepq_broadcast() and sleepq_signal() to require that the chain
  lock is already required.  For condition variables, this lets the
  cv_broadcast() and cv_signal() functions lock the sleep queue chain lock
  while testing the waiters count.  This means that the waiters count
  internal to condition variables is no longer protected by the interlock
  mutex and cv_broadcast() and cv_signal() now no longer require that the
  interlock be held when they are called.  This lets consumers of condition
  variables drop the lock before waking other threads which can result in
  fewer context switches.

MFC after:	1 month
2004-10-12 18:36:20 +00:00
Stephan Uphoff
b9a80acadb Force MUTEX_WAKE_ALL.
A race condition in single thread wakeup may break priority inheritance.

Tested   by: pho
Reviewed by: jhb,julian
Approved by: sam (mentor)
MFC: ASAP
2004-10-12 16:28:18 +00:00
Scott Long
9923b511ed Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
John-Mark Gurney
000968010a add options MPROF_BUFFERS and MPROF_HASH_SIZE that adjust the sizes of
the mutex profiling buffers.  Document them in the man page and in NOTES.
Ensure _HASH_SIZE is larger than _BUFFERS with a cpp error.
2004-08-19 06:38:26 +00:00
John Baldwin
bdcfcf5bc4 Cache the value of curthread in the _get_sleep_lock() and _get_spin_lock()
macros and pass the value to the associated _mtx_*() functions to avoid
more curthread dereferences in the function implementations.  This provided
a very modest perf improvement in some benchmarks.

Suggested by:	rwatson
Tested by:	scottl
2004-08-04 20:18:45 +00:00
Maxime Henrion
9f1b87f106 Instead of calling ia32_pause() conditionally on __i386__ or __amd64__
being defined, define and use a new MD macro, cpu_spinwait().  It only
expands to something on i386 and amd64, so the compiled code should be
identical.

Name of the macro found by:	jhb
Reviewed by:	jhb
2004-08-03 18:44:27 +00:00
Robert Watson
a9abdce44a Add "options ADAPTIVE_GIANT" which causes Giant to also be treated in
an adaptive fashion when adaptive mutexes are enabled.  The theory
behind non-adaptive Giant is that Giant will be held for long periods
of time, and therefore spinning waiting on it is wasteful.  However,
in MySQL benchmarks which are relatively Giant-free, running Giant
adaptive makes an observable difference on SMP (5% transaction rate
improvement).  As such, make adaptive behavior on Giant an option so
it can be more widely benchmarked.
2004-07-27 16:34:48 +00:00
Peter Wemm
b09cb1027b #ifdef __i386__ -> __i386__ || __amd64__ 2004-07-20 02:15:10 +00:00
Pawel Jakub Dawidek
ece2d9891e Now we have NO_ADAPTIVE_MUTEXES option, so use it here too.
Missed by:	scottl
2004-07-18 23:27:14 +00:00
Scott Long
701f140800 Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to
NO_ADAPTIVE_MUTEXES.  This option has been enabled by default on amd64 for
quite some time, and has been extensively tested on i386 and sparc64.  It
shows measurable performance gains in many circumstances, and few negative
effects.  It would be nice in t he future if adaptive mutexes actually went
to sleep after a certain amount of spinning, but that will require quite a
bit more testing.
2004-07-18 15:59:03 +00:00
Marcel Moolenaar
2d50560abc Update for the KDB framework:
o  Make debugging code conditional upon KDB instead of DDB.
o  Call kdb_enter() instead of Debugger().
o  Call kdb_backtrace() instead of db_print_backtrace() or backtrace().

kern_mutex.c:
o  Replace checks for db_active with checks for kdb_active and make
   them unconditional.

kern_shutdown.c:
o  s/DDB_UNATTENDED/KDB_UNATTENDED/g
o  s/DDB_TRACE/KDB_TRACE/g
o  Save the TID of the thread doing the kernel dump so the debugger
   knows which thread to select as the current when debugging the
   kernel core file.
o  Clear kdb_active instead of db_active and do so unconditionally.
o  Remove backtrace() implementation.

kern_synch.c:
o  Call kdb_reenter() instead of db_error().
2004-07-10 21:36:01 +00:00
John Baldwin
0c0b25ae91 Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
John Baldwin
bf0acc273a - Change mi_switch() and sched_switch() to accept an optional thread to
switch to.  If a non-NULL thread pointer is passed in, then the CPU will
  switch to that thread directly rather than calling choosethread() to pick
  a thread to choose to.
- Make sched_switch() aware of idle threads and know to do
  TD_SET_CAN_RUN() instead of sticking them on the run queue rather than
  requiring all callers of mi_switch() to know to do this if they can be
  called from an idlethread.
- Move constants for arguments to mi_switch() and thread_single() out of
  the middle of the function prototypes and up above into their own
  section.
2004-07-02 19:09:50 +00:00
John Baldwin
535eb30962 Add a new kernel option MUTEX_WAKE_ALL that changes the mutex unlock code
to awaken all waiters when a contested mutex is released instead of just
the highest priority waiter.  If the various threads are awakened in
sequence then each thread may acquire and release the lock in question
without contention resulting in fewer expensive unlock and lock
operations.  This old behavior of waking just the highest priority is
still used if this option is specified.  Making the algorithm conditional
on a kernel option will allows us to benchmark both cases later and
determine which one should be used by default.

Requested by:	tanimura-san
2004-04-06 19:12:24 +00:00
Robert Watson
94ffb20d72 Add a reset sysctl for mutex profiling: zeros all of the mutex
profiling buffers and hash table.  This makes it a lot easier to
do multiple profiling runs without rebooting or performing
gratuitous arithmetic.  Sysctl is named debug.mutex.prof.reset.

Reviewed by:	jake
2004-01-28 22:11:53 +00:00
John Baldwin
8d768e7676 Rework witness_lock() to make it slightly more useful and flexible.
- witness_lock() is split into two pieces: witness_checkorder() and
  witness_lock().  Witness_checkorder() determines if acquiring a specified
  lock at the time it is called would result in a lock order.  It
  optionally adds a new lock order relationship as well.  witness_lock()
  updates witness's data structures to assume that a lock has been acquired
  by stick a new lock instance in the appropriate lock instance list.
- The mutex and sx lock functions now call checkorder() prior to trying to
  acquire a lock and continue to call witness_lock() after the acquire is
  completed.  This will let witness catch a deadlock before it happens
  rather than trying to do so after the threads have deadlocked (i.e. never
  actually report it).
- A new function witness_defineorder() has been added that adds a lock
  order between two locks at runtime without having to acquire the locks.
  If the lock order cannot be added it will return an error.  This function
  is available to programmers via the WITNESS_DEFINEORDER() macro which
  accepts either two mutexes or two sx locks as its arguments.
- A few simple wrapper macros were added to allow developers to call
  witness_checkorder() anywhere as a way of enforcing locking assertions
  in code that might acquire a certain lock in some situations.  The
  macros are: witness_check_{mutex,shared_sx,exclusive_sx} and take an
  appropriate lock as the sole argument.
- The code to remove a lock instance from a lock list in witness_unlock()
  was unnested by using a goto to vastly improve the readability of this
  function.
2004-01-28 20:39:57 +00:00
Jeff Roberson
29bcc4514f - Add a flags parameter to mi_switch. The value of flags may be SW_VOL or
SW_INVOL.  Assert that one of these is set in mi_switch() and propery
   adjust the rusage statistics.  This is to simplify the large number of
   users of this interface which were previously all required to adjust the
   proper counter prior to calling mi_switch().  This also facilitates more
   switch and locking optimizations.
 - Change all callers of mi_switch() to pass the appropriate paramter and
   remove direct references to the process statistics.
2004-01-25 03:54:52 +00:00
Robert Watson
8dc10be885 Add some basic support for measuring sleep mutex contention to the
mutex profiling code.  As with existing mutex profiling, measurement
is done with respect to mtx_lock() instances in the code, as opposed
to specific mutexes.  In particular, measure two things:

(1) Lock contention.  How often did this mtx_lock() call get made and
    have to sleep (or almost sleep) waiting for the lock.  This helps
    identify the "victims" of contention.

(2) Hold contention.  How often, while the lock was held by a thread
    as a result of this mtx_lock(), did another thread try to acquire
    the same mutex.  This helps identify the causes of contention.

I'm currently exploring adding measurement of "time waited for the
lock", but the current implementation has proven useful to me so far
so I figured I'd commit it so others could try it out.  Note that this
increases the size of mutexes when MUTEX_PROFILING is enabled, so you
might find you need to further bump UMA_BOOT_PAGES.  Fixes welcome.

The once over:	des, others
2004-01-25 01:59:27 +00:00
John Baldwin
eac097962f - Allow mtx_trylock() to recurse on a recursive mutex. Attempts to recurse
on a non-recursive mutex will fail but will not trigger any assertions.
- Add an assertion to mtx_lock() that one never recurses on a non-recursive
  mutex.  This is mostly useful for the non-WITNESS case.

Requested by:	deischen, julian, others (1)
2004-01-05 23:09:51 +00:00
John Baldwin
961a7b244d Add an implementation of turnstiles and change the sleep mutex code to use
turnstiles to implement blocking isntead of implementing a thread queue
directly.  These turnstiles are somewhat similar to those used in Solaris 7
as described in Solaris Internals but are also different.

Turnstiles do not come out of a fixed-sized pool.  Rather, each thread is
assigned a turnstile when it is created that it frees when it is destroyed.
When a thread blocks on a lock, it donates its turnstile to that lock to
serve as queue of blocked threads.  The queue associated with a given lock
is found by a lookup in a simple hash table.  The turnstile itself is
protected by a lock associated with its entry in the hash table.  This
means that sched_lock is no longer needed to contest on a mutex.  Instead,
sched_lock is only used when manipulating run queues or thread priorities.
Turnstiles also implement priority propagation inherently.

Currently turnstiles only support mutexes.  Eventually, however, turnstiles
may grow two queue's to support a non-sleepable reader/writer lock
implementation.  For more details, see the comments in sys/turnstile.h and
kern/subr_turnstile.c.

The two primary advantages from the turnstile code include: 1) the size
of struct mutex shrinks by four pointers as it no longer stores the
thread queue linkages directly, and 2) less contention on sched_lock in
SMP systems including the ability for multiple CPUs to contend on different
locks simultaneously (not that this last detail is necessarily that much of
a big win).  Note that 1) means that this commit is a kernel ABI breaker,
so don't mix old modules with a new kernel and vice versa.

Tested on:	i386 SMP, sparc64 SMP, alpha SMP
2003-11-11 22:07:29 +00:00
John Baldwin
4110951861 If a spin lock is held for too long and WITNESS is enabled, then call
witness_display_spinlock() to see if we can find out where the current
owner of the spin lock last acquired the lock.
2003-07-31 18:52:18 +00:00
John Baldwin
47b722c1af When complaining about a sleeping thread owning a mutex, display the
thread's pid to make debugging easier for people who don't want to have to
use the intended tool for these panics (witness).

Indirectly prodded by:	kris
2003-07-30 20:42:15 +00:00
John Baldwin
f7ee15901a - Add comments about the maintenance of the per-thread list of contested
locks held by each thread.
- Fix a bug in the original BSD/OS code where a contested lock was not
  properly handed off from the old thread to the new thread when a
  contested lock with more than one blocked thread was transferred from
  one thread to another.
- Don't use an atomic operation to write the MTX_CONTESTED value to
  mtx_lock in the aforementioned special case.  The memory barriers and
  exclusion provided by sched_lock are sufficient.

Spotted by:	alc (2)
2003-07-02 16:14:09 +00:00
David E. O'Brien
677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
Poul-Henning Kamp
b82af320cf Add "" around mutex name to make message less confusing. 2003-05-31 21:11:01 +00:00
John Baldwin
27dad03c97 Use TD_IS_RUNNING() instead of thread_running() in the adaptive mutex
code.
2003-04-17 22:28:58 +00:00
Julian Elischer
060563ec50 Move the _oncpu entry from the KSE to the thread.
The entry in the KSE still exists but it's purpose will change a bit
when we add the ability to lock a KSE to a cpu.
2003-04-10 17:35:44 +00:00
Tim J. Robbins
f949f795aa Remove unused mtx_lock_giant(), mtx_unlock_giant(), related globals
and sysctls.
2003-03-23 11:26:11 +00:00
Poul-Henning Kamp
b4b138c27f Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
John Baldwin
75d468ee12 Axe the useless MTX_SLEEPABLE flag. mutexes are not sleepable locks.
Nothing used this flag and WITNESS would have panic'd during mtx_init()
if anything had.
2003-03-11 20:02:57 +00:00
John Baldwin
1106937d99 Remove safety belt: it is now ok to do a mtx_trylock() on a mutex you
already own.  The mtx_trylock() will fail however.  Enhance the comment
at the top of the try lock function to explain this.

Requested by:	jlemon and his evil netisr locking
2003-03-04 21:32:25 +00:00
John Baldwin
5fa8dd90f9 Miscellaneous cleanups to _mtx_lock_sleep():
- Declare some local variables at the top of the function instead of in a
  nested block.
- Use mtx_owned() instead of masking off bits from mtx_lock manually.
- Read the value of mtx_lock into 'v' as a separate line rather than inside
  an if statement for clarity.  This code is hairy enough as it is.
2003-03-04 20:32:41 +00:00
John Baldwin
6b869595c5 Properly assert that mtx_trylock() is not called on a mutex we already
owned.  Previously the KASSERT would only trigger if we successfully
acquired a lock that we already held.  However, _obtain_lock() fails to
acquire locks that we already hold, so the KASSERT was never checked in
the case it was supposed to fail.
2003-03-04 20:30:30 +00:00
Mike Makonnen
0bd5f7979d Unbreak mutex profiling (at least for me).
o Always check for null when dereferencing the filename component.
	o Implement a try-and-backoff method for allocating memory to
	  dump stats to avoid a spin-lock -> sleep-lock mutex lock order
	  panic with WITNESS.

Approved by:	des, markm (mentor)
Not objected:	jhb
2003-02-25 22:28:46 +00:00
Dag-Erling Smørgrav
ecf031c9ad There's absolutely no need for a struct-within-a-struct, so move the
counters out of the inner struct and remove it.
2003-01-21 20:33:27 +00:00
Poul-Henning Kamp
fa669ab7b8 Disable the kernacc() check in mtx_validate() until such time that kernacc
does not require Giant.

This means that we may miss panics on a class of mutex programming bugs,
but only if running with a Chernobyl setting of debug-flags.

Spotted by:	Pete Carah <pete@ns.altadena.net>
2002-10-25 08:40:20 +00:00
Dag-Erling Smørgrav
f2c1ea8152 Whitespace cleanup. 2002-10-23 10:26:54 +00:00
Robert Drehmel
d08926b1f6 Change the `mutex_prof' structure to use three variables contained
in an anonymous structure as counters, instead of an array with
preprocessor-defined names for indices.  Remove the associated XXX-
comment.
2002-10-22 16:06:28 +00:00
Dag-Erling Smørgrav
6d0369001a Reduce the overhead of the mutex statistics gathering code, try to produce
shorter lines in the report, and clean up some minor style issues.
2002-10-21 18:48:28 +00:00
Jeff Roberson
b43179fbe8 - Create a new scheduler api that is defined in sys/sched.h
- Begin moving scheduler specific functionality into sched_4bsd.c
 - Replace direct manipulation of scheduler data with hooks provided by the
   new api.
 - Remove KSE specific state modifications and single runq assumptions from
   kern_switch.c

Reviewed by:	-arch
2002-10-12 05:32:24 +00:00
John Baldwin
551cf4e150 Rename the mutex thread and process states to use a more generic 'LOCK'
name instead.  (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of
TD_ON_MUTEX())  Eventually a turnstile abstraction will be added that
will be shared with mutexes and other types of locks.  SLOCK/TDI_LOCK will
be used internally by the turnstile code and will not be specific to
mutexes.  Making the change now ensures that turnstiles can be dropped
in at a later date without affecting the ABI of userland applications.
2002-10-02 20:31:47 +00:00
Julian Elischer
2735483034 uh, commit all of the patch 2002-09-29 23:28:58 +00:00
Julian Elischer
e081731767 commit the version I actually tested..
Submitted by:	davidxu
2002-09-29 23:23:25 +00:00
Julian Elischer
9eb1fdea37 Implement basic KSE loaning. This stops a hread that is blocked in BOUND mode
from stopping another thread from completing a syscall, and this allows it to
release its resources etc. Probably more related commits to follow (at least
one I know of)

Initial concept by: julian, dillon
Submitted by:	davidxu
2002-09-29 23:04:34 +00:00
Julian Elischer
71fad9fdee Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00
John Baldwin
0d975d6341 Add some KASSERT()'s to ensure that we don't perform spin mutex ops on
sleep mutexes and vice versa.  WITNESS normally should catch this but
not everyone uses WITNESS so this is a fallback to catch nasty but easy
to do bugs.
2002-09-03 18:25:16 +00:00
Ian Dowse
02bd1bcd2a Add a new KTR type KTR_CONTENTION, and use it in the mutex code to
log the start and end of periods during which mtx_lock() is waiting
to acquire a sleep mutex. The log message includes the file and
line of both the waiter and the holder.

Reviewed by:	jhb, jake
2002-08-26 18:39:38 +00:00
John Baldwin
ce39e722ec Disable optimization of spinlocks on UP kernels w/o debugging for now
since it breaks mtx_owned() on spin mutexes when used outside of
mtx_assert().  Unfortunately we currently use it in the i386 MD code
and in the sio(4) driver.

Reported by:	bde
2002-07-27 16:54:23 +00:00
Dag-Erling Smørgrav
b61860ad2d Add mtx_ prefixes to the fields used for mutex profiling, and fix a bug
where the profiling code would report the release point instead of the
acquisition point.

Requested by:	bde
2002-07-03 01:50:27 +00:00
Julian Elischer
e602ba25fd Part 1 of KSE-III
The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by:	Almost everyone who counts
	(at various times, peter, jhb, matt, alfred, mini, bernd,
	and a cast of thousands)

	NOTE: this is still Beta code, and contains lots of debugging stuff.
	expect slight instability in signals..
2002-06-29 17:26:22 +00:00
John Baldwin
6a95e08f2f Replace thread_runnable() with thread_running() as the latter is more
accurate.

Suggested by:	julian
2002-06-04 22:36:24 +00:00
John Baldwin
7fcca6096f Optimize the adaptive mutex spin a bit. Use a simple while loop with
simple reads (and on IA32, a "pause" instruction for each interation of the
loop) to spin until either the mutex owner field changes, or the lock owner
stops executing.

Suggested by:	tanimura
Tested on:	i386
2002-06-04 21:53:48 +00:00
John Baldwin
5853d37d3b Add a private thread_runnable() macro to make the code more readable and
make the KSE diff easier to maintain.
2002-06-04 21:50:02 +00:00
Dag-Erling Smørgrav
db586c8b7c Make the counters uintmax_ts, and use %ju rather than %llu. 2002-05-23 03:08:42 +00:00
John Baldwin
6b8c698908 Rename pause() to ia32_pause() so it doesn't conflict with the pause()
function defined in <unistd.h>.  I didn't #ifdef _KERNEL it because the
mutex implementation in libpthread will probably need this.
2002-05-22 20:32:39 +00:00
John Baldwin
0228ea4e0b Rename cpu_pause() to pause(). Originally I was going to make this an
MI API with empty cpu_pause() functions on other arch's, but this
functionality is definitely unique to IA-32, so I decided to leave it
as i386-only and wrap it in #ifdef's.  I should have dropped the cpu_
prefix when I made that decision.

Requested by:	bde
2002-05-22 13:19:22 +00:00
John Baldwin
703fc290fb Add appropriate IA32 "pause" instructions to improve performanec on
Pentium 4's and newer IA32 processors.  The "pause" instruction has been
verified by Intel to be a NOP on all currently existing IA32 processors
prior to the Pentium 4.
2002-05-21 22:26:35 +00:00