Commit Graph

188 Commits

Author SHA1 Message Date
rwatson
8963f5c8c0 Document, via WITNESS, that the NFS server mutex falls ahead of the socket
buffer mutexes.
2005-03-09 21:38:53 +00:00
wpaul
a72168b811 When you call MiniportInitialize() for an 802.11 driver, it will
at some point result in a status event being triggered (it should
be a link down event: the Microsoft driver design guide says you
should generate one when the NIC is initialized). Some drivers
generate the event during MiniportInitialize(), such that by the
time MiniportInitialize() completes, the NIC is ready to go. But
some drivers, in particular the ones for Atheros wireless NICs,
don't generate the event until after a device interrupt occurs
at some point after MiniportInitialize() has completed.

The gotcha is that you have to wait until the link status event
occurs one way or the other before you try to fiddle with any
settings (ssid, channel, etc...). For the drivers that set the
event sycnhronously this isn't a problem, but for the others
we have to pause after calling ndis_init_nic() and wait for the event
to arrive before continuing. Failing to wait can cause big trouble:
on my SMP system, calling ndis_setstate_80211() after ndis_init_nic()
completes, but _before_ the link event arrives, will lock up or
reset the system.

What we do now is check to see if a link event arrived while
ndis_init_nic() was running, and if it didn't we msleep() until
it does.

Along the way, I discovered a few other problems:

- Defered procedure calls run at PASSIVE_LEVEL, not DISPATCH_LEVEL.
  ntoskrnl_run_dpc() has been fixed accordingly. (I read the documentation
  wrong.)

- Similarly, the NDIS interrupt handler, which is essentially a
  DPC, also doesn't need to run at DISPATCH_LEVEL. ndis_intrtask()
  has been fixed accordingly.

- MiniportQueryInformation() and MiniportSetInformation() run at
  DISPATCH_LEVEL, and each request must complete before another
  can be submitted. ndis_get_info() and ndis_set_info() have been
  fixed accordingly.

- Turned the sleep lock that guards the NDIS thread job list into
  a spin lock. We never do anything with this lock held except manage
  the job list (no other locks are held), so it's safe to do this,
  and it's possible that ndis_sched() and ndis_unsched() can be
  called from DISPATCH_LEVEL, so using a sleep lock here is
  semantically incorrect. Also updated subr_witness.c to add the
  lock to the order list.
2005-03-07 03:05:31 +00:00
rwatson
57c91a09d8 When DDB is not defined, don't implement witness_thread_has_locks() and
witness_proc_has_locks(), as they are unused, which results in a compiler
error.  This problem was introduced with the implementation of "show
alllocks".

Spotted by:	Artem Kuchin <matrix at itlegion dot ru>
2005-01-22 21:14:21 +00:00
jhb
38cd373d81 - Up the WITNESS_COUNT macro from 200 to 1024 to support the growing number
of lock types in the kernel.  This results in an increase of witness
  data usage from ~145k to ~280k on i386 for kernels with
  'options WITNESS'.
- Remove the unused witness malloc bucket.

Submitted by:	Michal Mertl mime at traveller dot cz (1)
2004-12-28 21:21:27 +00:00
rwatson
c2459d7b3f Attempt to slightly refine the print out from "show alllocks" -- list
the process and thread numbers/names on the same line rather than on
separate lines, and print the thread pointer not just the tid.
2004-12-27 10:47:08 +00:00
rwatson
b864ac486c Add "show alllocks" command to DDB, which dumps a list of processes
and threads currently holding sleep mutexes (and spin mutexes for
curthread).  This can be quite useful in looking for a lock condition
summary for a system, as it avoids manually iterating through threads
and processes to find all the interesting locks.

NB: "alllocks" is up there with "lockedvnods" for a bad argument for
show.

MFC after:	2 weeks
2004-12-26 22:52:24 +00:00
jmg
17a80e7f16 clean up some tunables that should of been removed a while ago... 2004-11-09 06:46:14 +00:00
rwatson
5d60192ca7 Add entropy harvest mutex to hard-coded spin lock witness lock order,
remove previous entropy harvesting mutex names as they are no longer
present.  Commit to this file was ommitted when randomdev_soft.c:1.5
was made.

Feet shot:	Robert Huff <roberthuff at rcn dot com>
2004-10-11 08:26:18 +00:00
green
3a482df790 Don't "implicitly order all sleep locks before spin locks" in witness
when the spin lock in question isn't -- it's the critical_enter() that
KDB set.  No more panic in DDB for console -> syscons -> tty -> knote
operations.
2004-10-09 08:16:37 +00:00
rwatson
d4e6ebd0c9 Hard code witness lock order for BPF locks. 2004-09-09 05:01:37 +00:00
jmg
ff58a59f8f make witness it's own sysctl branch instead of using _ to do this. I have
left the old tunables in to give people a few days to transition their
loader.conf and sysctl.conf's over to the new names..

MFC after:	5 days
2004-09-06 23:27:28 +00:00
jhb
fb7bd65f3f Remove a potential deadlock on i386 SMP by changing the lazypmap ipi and
spin-wait code to use the same spin mutex (smp_tlb_mtx) as the TLB ipi
and spin-wait code snippets so that you can't get into the situation of
one CPU doing a TLB shootdown to another CPU that is doing a lazy pmap
shootdown each of which are waiting on each other.  With this change, only
one of the CPUs would do an IPI and spin-wait at a time.
2004-08-04 20:31:19 +00:00
rwatson
1fd2b90911 Add netatalk mutexes to hard-coded WITNESS lock order. 2004-07-25 20:16:51 +00:00
marcel
ccf2260d9b Update for the KDB framework:
o  Make debugging code conditional upon KDB instead of DDB.
o  s/WITNESS_DDB/WITNESS_KDB/g
o  s/witness_ddb/witness_kdb/g
o  Rename the debug.witness_ddb sysctl to debug.witness_kdb.
o  Call kdb_backtrace() instead of backtrace().
o  Call kdb_enter() instead Debugger().
o  Assert kdb_active instead of db_active.
2004-07-10 21:42:16 +00:00
jhb
761713b2ff Check the lock lists to see if they are empty directly rather than
assigning a pointer to the list and then dereferencing the pointer as a
second step.  When the first spin lock is acquired, curthread is not in
a critical section so it may be preempted and would end up using another
CPUs lock list instead of its own.

When this code was in witness_lock() this sequence was safe as curthread
was in a critical section already since witness_lock() is called after the
lock is acquired.

Tested by:	Daniel Lang dl at leo.org
2004-07-09 17:46:27 +00:00
rwatson
e3d9cae8b6 Introduce socket and UNIX domain socket locks into hard-coded lock
order definition for witness.  Send lock before receive lock, and
socket locks after accept but  before select:

  filedesc -> accept -> so_snd -> so_rcv -> sellck

All routing locks after send lock:

  so_rcv -> radix node head

All protocol locks before socket locks:

  unp -> so_snd
  udp -> udpinp -> so_snd
  tcp -> tcpinp -> so_snd
2004-06-13 00:23:03 +00:00
jhb
66f3d8ffca - Comment out NULL, NULL barrier for Unix domain sockets section as the
double NULL entries signal Witness to stop processing the array of
  order entries meaning none of the spin locks are added resulting in
  panics on boot.
- Add a missing NULL, NULL terminator to the Slip locks list to keep them
  separate from the spin locks.
2004-06-03 20:07:44 +00:00
rwatson
de0c6ecd47 Expand the hard-coded WITNESS lock order to include the following
relationships:

Sockets:    filedesc->accept->sellck
Routing:    radix node head->rtentry->ifaddr
UDP:        udp->udpinp
TCP:        tcp->tcpinp
SLIP:       slip_mtx->slip sc_mtx

Drop in a place holder section for UNIX domain sockets.  Various
sections to be expanded over the next few days.
2004-06-02 23:28:06 +00:00
alfred
fbfae479f4 Emit a traceback when witness_trace is set and witness_warn() is
called and triggers (typically caused by sleeping with a non-sleepable
lock).

Reviewed by: jhb
2004-03-23 00:32:27 +00:00
jhb
d07a9130c6 Add an implementation of a generic sleep queue abstraction that is used
to queue threads sleeping on a wait channel similar to how turnstiles are
used to queue threads waiting for a lock.  This subsystem will be used as
the backend for sleep/wakeup and condition variables initially.  Eventually
it will also be used to replace the ithread-specific iwait thread
inhibitor.

Sleep queues are also not locked by sched_lock, so this splits sched_lock
up a bit further increasing concurrency within the scheduler.  Sleep queues
also natively support timeouts on sleeps and interruptible sleeps allowing
for the reduction of a lot of duplicated code between the sleep/wakeup and
condition variable implementations.  For more details on the sleep queue
implementation, check the comments in sys/sleepqueue.h and
kern/subr_sleepqueue.c.
2004-02-27 18:33:09 +00:00
jhb
3e89d3fc1b Remove a bogus assertion.
Noticed by:	bde
Pointy hat to:	jhb
2004-02-03 15:14:27 +00:00
jhb
47cec231e3 - Assert that witness_cold is not true in enroll().
- Only check witness_watch once in enroll().

Reported by:	ru (2)
2004-02-02 22:15:17 +00:00
jhb
7c38a96e26 Rework witness_lock() to make it slightly more useful and flexible.
- witness_lock() is split into two pieces: witness_checkorder() and
  witness_lock().  Witness_checkorder() determines if acquiring a specified
  lock at the time it is called would result in a lock order.  It
  optionally adds a new lock order relationship as well.  witness_lock()
  updates witness's data structures to assume that a lock has been acquired
  by stick a new lock instance in the appropriate lock instance list.
- The mutex and sx lock functions now call checkorder() prior to trying to
  acquire a lock and continue to call witness_lock() after the acquire is
  completed.  This will let witness catch a deadlock before it happens
  rather than trying to do so after the threads have deadlocked (i.e. never
  actually report it).
- A new function witness_defineorder() has been added that adds a lock
  order between two locks at runtime without having to acquire the locks.
  If the lock order cannot be added it will return an error.  This function
  is available to programmers via the WITNESS_DEFINEORDER() macro which
  accepts either two mutexes or two sx locks as its arguments.
- A few simple wrapper macros were added to allow developers to call
  witness_checkorder() anywhere as a way of enforcing locking assertions
  in code that might acquire a certain lock in some situations.  The
  macros are: witness_check_{mutex,shared_sx,exclusive_sx} and take an
  appropriate lock as the sole argument.
- The code to remove a lock instance from a lock list in witness_unlock()
  was unnested by using a goto to vastly improve the readability of this
  function.
2004-01-28 20:39:57 +00:00
ru
0287790feb Register the uart(4)'s spin lock with witness(4). 2004-01-25 15:04:37 +00:00
markm
6a2f4748c4 Fix a major faux pas of mine. I was causing 2 very bad things to
happen in interrupt context; 1) sleep locks, and 2) malloc/free
calls.

1) is fixed by using spin locks instead.

2) is fixed by preallocating a FIFO (implemented with a STAILQ)
   and using elements from this FIFO instead. This turns out
   to be rather fast.

OK'ed by:	re (scottl)
Thanks to:	peter, jhb, rwatson, jake
Apologies to:	*
2003-11-20 15:35:48 +00:00
peter
9dedda25aa Initial landing of SMP support for FreeBSD/amd64.
- This is heavily derived from John Baldwin's apic/pci cleanup on i386.
- I have completely rewritten or drastically cleaned up some other parts.
  (in particular, bootstrap)
- This is still a WIP.  It seems that there are some highly bogus bioses
  on nVidia nForce3-150 boards.  I can't stress how broken these boards
  are.  I have a workaround in mind, but right now the Asus SK8N is broken.
  The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed.
- Most of my testing has been with SCHED_ULE.  SCHED_4BSD works.
- the apic and acpi components are 'standard'.
- If you have an nVidia nForce3-150 board, you are stuck with 'device
  atpic' in addition, because they somehow managed to forget to connect the
  8254 timer to the apic, even though its in the same silicon!  ARGH!
  This directly violates the ACPI spec.
2003-11-17 08:58:16 +00:00
bde
60cfaec287 Localized the cy driver's locking. 2003-11-16 00:55:54 +00:00
jhb
6cc1f7e330 Add an implementation of turnstiles and change the sleep mutex code to use
turnstiles to implement blocking isntead of implementing a thread queue
directly.  These turnstiles are somewhat similar to those used in Solaris 7
as described in Solaris Internals but are also different.

Turnstiles do not come out of a fixed-sized pool.  Rather, each thread is
assigned a turnstile when it is created that it frees when it is destroyed.
When a thread blocks on a lock, it donates its turnstile to that lock to
serve as queue of blocked threads.  The queue associated with a given lock
is found by a lookup in a simple hash table.  The turnstile itself is
protected by a lock associated with its entry in the hash table.  This
means that sched_lock is no longer needed to contest on a mutex.  Instead,
sched_lock is only used when manipulating run queues or thread priorities.
Turnstiles also implement priority propagation inherently.

Currently turnstiles only support mutexes.  Eventually, however, turnstiles
may grow two queue's to support a non-sleepable reader/writer lock
implementation.  For more details, see the comments in sys/turnstile.h and
kern/subr_turnstile.c.

The two primary advantages from the turnstile code include: 1) the size
of struct mutex shrinks by four pointers as it no longer stores the
thread queue linkages directly, and 2) less contention on sched_lock in
SMP systems including the ability for multiple CPUs to contend on different
locks simultaneously (not that this last detail is necessarily that much of
a big win).  Note that 1) means that this commit is a kernel ABI breaker,
so don't mix old modules with a new kernel and vice versa.

Tested on:	i386 SMP, sparc64 SMP, alpha SMP
2003-11-11 22:07:29 +00:00
jhb
1d8b4454d7 Update spin lock order list for new i386 interrupt and SMP code. 2003-11-03 22:38:30 +00:00
silby
f0e686a675 Change all SYSCTLS which are readonly and have a related TUNABLE
from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide
more useful error messages.
2003-10-21 18:28:36 +00:00
sam
23e7708b76 add fast swi taskqueue spinlock to the order_list so witness doesn't complain
Submitted by:	Tor Egge <Tor.Egge@cvsup.no.freebsd.org>
2003-09-06 21:06:08 +00:00
jhb
a69166c61f Insert cosmetic spaces.
Reported by:	kris
2003-08-04 19:24:25 +00:00
jhb
a3b9c0d553 Add a new function to look for a spinlock's instance when it is held by
another thread.  We use the td_oncpu member of the other field to locate
it's associated CPU and then search the that CPU's list of spin locks
contained in its per-CPU data.  This is not always safe and may in fact
panic or just not work, but it is useful in at least one case.
2003-07-31 18:50:58 +00:00
peter
fb79192cce unifdef -DLAZY_SWITCH and start to tidy up the associated glue. 2003-07-10 01:02:59 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
phk
d876d5ad8d Remove return after panic.
Found by:       FlexeLint
2003-05-31 20:09:42 +00:00
peter
6c537c22b4 Add __amd64__ to the ifdefs that introduce the "pcicfg" spinlock to
witness.

Approved by:  re (safe amd64 support)
2003-05-31 06:42:37 +00:00
julian
6f175a0e20 Move the _oncpu entry from the KSE to the thread.
The entry in the KSE still exists but it's purpose will change a bit
when we add the ability to lock a KSE to a cpu.
2003-04-10 17:35:44 +00:00
mike
75859ca578 o In struct prison, add an allprison linked list of prisons (protected
by allprison_mtx), a unique prison/jail identifier field, two path
  fields (pr_path for reporting and pr_root vnode instance) to store
  the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
  vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
  to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
  struct xprison instances that represent a snapshot of active jails.

Reviewed by:	rwatson, tjr
2003-04-09 02:55:18 +00:00
peter
46969da5f8 Commit a partial lazy thread switch mechanism for i386. it isn't as lazy
as it could be and can do with some more cleanup.  Currently its under
options LAZY_SWITCH.  What this does is avoid %cr3 reloads for short
context switches that do not involve another user process.  ie: we can
take an interrupt, switch to a kthread and return to the user without
explicitly flushing the tlb.  However, this isn't as exciting as it could
be, the interrupt overhead is still high and too much blocks on Giant
still.  There are some debug sysctls, for stats and for an on/off switch.

The main problem with doing this has been "what if the process that you're
running on exits while we're borrowing its address space?" - in this case
we use an IPI to give it a kick when we're about to reclaim the pmap.

Its not compiled in unless you add the LAZY_SWITCH option.  I want to fix a
few more things and get some more feedback before turning it on by default.

This is NOT a replacement for Bosko's lazy interrupt stuff.  This was more
meant for the kthread case, while his was for interrupts.  Mine helps a
little for interrupts, but his helps a lot more.

The stats are enabled with options SWTCH_OPTIM_STATS - this has been a
pseudo-option for years, I just added a bunch of stuff to it.

One non-trivial change was to select a new thread before calling
cpu_switch() in the first place.  This allows us to catch the silly
case of doing a cpu_switch() to the current process.  This happens
uncomfortably often.  This simplifies a bit of the asm code in cpu_switch
(no longer have to call choosethread() in the middle).  This has been
implemented on i386 and (thanks to jake) sparc64.  The others will come
soon.  This is actually seperate to the lazy switch stuff.

Glanced at by:  jake, jhb
2003-04-02 23:53:30 +00:00
jhb
966c72c345 - Remove witness_dead and just use witness_watch instead. If witness_watch
is set to 0, it now has the same affect as setting witness_dead used to
  have.
- Added a sysctl handler that allows root to change witness_watch from a
  non-zero value to zero to disable witness at runtime.  Note that you
  can't turn witness back on once it is off.  You can only turn it off as
  a one-way switch.
- Added a comment describing the possible values of witness_watch.
2003-03-24 21:03:53 +00:00
jhb
0c3ac305c8 Trim an extra blank line that snuck into the last commit. 2003-03-11 22:33:42 +00:00
jhb
b2bb08b487 - Change witness_displaydescendants() to accept the indentation level as
a parameter instead of using the level of a given witness.  When
  recursing, pass an indent level of indent + 1.
- Make use of the information witness_levelall() provides in
  witness_display_list() to use an O(n) algorithm instead of an O(n^2)
  algo to decide which witnesses to display hierarchies from.  Basically,
  we only display a hierarchy for witnesses with a level of 0.
- Add a new per-witness flag that is reset at the start of
  witness_display() for all witness's and is set the first time a witness
  is displayed in witness_displaydescendants().  If a witness is
  encountered more than once in the lock order tree (which happens often),
  witness_displaydescendants() marks the later occurrences with the string
  "(already displayed)" and doesn't display the subtree under that
  witness.  This avoids duplicating large amounts of the lock order tree
  in the 'show witness' output in DDB.

All these changes serve to make 'show witness' a lot more readable and
useful than it was previously.
2003-03-11 22:14:21 +00:00
jhb
736396fc05 - Split the itismychild() function into two functions: insertchild()
adds a witness to the child list of a parent witness.  rebalancetree()
  runs through the entire tree removing direct descendants of witnesses
  who already have said child witness as an indirect descendant through
  another direct descendant.  itismychild() now calls insertchild()
  followed by rebalancetree() and no longer needs the evil hack of
  having static recursed variable.
- Add a function reparentchildren() that adds all the direct descendants
  of one witness as direct descendants of another witness.
- Change the return value of itismychild() and similar functions so that
  they return 0 in the case of failure due to lack of resources instead
  of 1.  This makes the return value more intuitive.
- Check the return value of itismychild() when defining the static lock
  order in witness_initialize().
- Don't try to setup a lock instance in witness_lock() if itismychild()
  fails.  Witness is hosed anyways so no need to do any more witness
  related activity at that point.  It also makes the code flow easier to
  understand.
- Add a new depart() function as the opposite of enroll().  When the
  reference count of a witness drops to 0 in witness_destroy(), this
  function is called on that witness.  First, it runs through the
  lock order tree using reparentchildren() to reparent direct descendants
  of the departing witness to each of the witness' parents in the tree.
  Next, it releases it's own child list and other associated resources.
  Finally it calls rebalanacetree() to rebalance the lock order tree.
- Sort function prototypes into something closer to alphabetical order.

As a result of these changes, there should no longer be 'dead' witnesses
in the order tree, and repeatedly loading and unloading a module should no
longer exhaust witness of its internal resources.

Inspired by:	gallatin
2003-03-11 22:07:35 +00:00
jhb
5aeddebb51 Trim useless "../" leading strings from filenames passed into witness. 2003-03-11 21:53:12 +00:00
jhb
42e39caaab Adjust style of #ifdef's and #endif's to be more consistent and in line
with recent additions to style(9).
2003-03-11 21:38:49 +00:00
jhb
f7f4727b44 Do the lock order check skip for the LOP_TRYLOCK case after the check for
recursing on a lock instead of before.  This fixes a bug where WITNESS
could get a little confused if you did an sx_tryslock() on a sx lock that
you already had an slock on.  WITNESS would still function correctly but
it could result in weirdness in the output of 'show locks'.  This also
makes it possible for mtx_trylock() to recurse on a lock.
2003-03-11 20:54:37 +00:00
jhb
cb710fa095 Now that we have WITNESS_WARN(), we only call witness_list() from the
ddb 'show locks' command.  Thus, move witness_list() to the #ifdef DDB
section and remove extra checks for calling this function outside of
DDB.  Also, witness_list() now returns void instead of returning an int.

Reported by:	Steve Ames <steve@energistic.com>
Prodded by:	davidxu
2003-03-10 17:03:57 +00:00
jhb
7e6cfbee4e Oops, fix the double faults people were seeing with the recent changes to
witness.  Sleepable locks such as sx locks always come before all mutexes
including Giant.  However, the static lock order list placed Giant before
the proctree and allproc sx locks.  This resulted in witness creating a
cycle in its lock order "tree" (real trees don't have cycles) leading to
infinite recursion and eventually a double fault.  To fix, put Giant after
sx locks in the lock order list.
2003-03-06 17:25:06 +00:00
jhb
45fcac94f4 Bah, fix a bogon in the last commit: get the sense of a compare test right
so that we allow a sleepable lock to be acquired with Giant held rather
than allowing a sleepable lock to be acquired with anything but Giant held.
2003-03-04 22:34:07 +00:00