200 Commits

Author SHA1 Message Date
Jeff Roberson
61a74c5ccd schedlock 1/4
Eliminate recursion from most thread_lock consumers.  Return from
sched_add() without the thread_lock held.  This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks.  This will eventually allow for lockless remote adds.

Discussed with:	kib
Reviewed by:	jhb
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22626
2019-12-15 21:11:15 +00:00
Mark Johnston
2fb62b1a46 Fix the turnstile_lock() KPI.
turnstile_{lock,unlock}() were added for use in epoch.  turnstile_lock()
returned NULL to indicate that the calling thread had lost a race and
the turnstile was no longer associated with the given lock, or the lock
owner.  However, reader-writer locks may not have a designated owner,
in which case turnstile_lock() would return NULL and
epoch_block_handler_preempt() would leak spinlocks as a result.

Apply a minimal fix: return the lock owner as a separate return value.

Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21048
2019-07-24 23:04:59 +00:00
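The ambiguity described in the commit above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: model_turnstile_lock() and the struct fields are hypothetical stand-ins, not the kernel's actual signature. It shows why passing the owner back through a separate out-parameter lets a NULL owner (as with read-held reader-writer locks) be told apart from a lost race.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct thread {
	int	tid;
};

struct turnstile {
	const void	*ts_lockobj;	/* lock this turnstile is attached to */
	struct thread	*ts_owner;	/* may be NULL, e.g. read-held rwlock */
};

/*
 * Model only: return the turnstile if it is still associated with
 * 'lock', or NULL if the caller lost the race.  The owner is passed
 * back separately through 'ownerp', so an ownerless lock no longer
 * looks like a failure.
 */
struct turnstile *
model_turnstile_lock(struct turnstile *ts, const void *lock,
    struct thread **ownerp)
{
	if (ts->ts_lockobj != lock) {
		*ownerp = NULL;
		return (NULL);		/* lost the race */
	}
	*ownerp = ts->ts_owner;		/* NULL here is a valid answer */
	return (ts);
}
```

A caller of this model checks the return value for the race and inspects the owner only afterwards.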
Konstantin Belousov
a9fd669b4a subr_turnstile: Extract some common code to a helper.
Code walks the list of contested turnstiles to calculate the priority
to unlend.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-05-16 13:17:57 +00:00
Mateusz Guzik
d0a22279db Remove an unused argument to turnstile_unpend.
PR:	228694
Submitted by:	Julian Pszczołowski <julian.pszczolowski@gmail.com>
2018-06-02 22:37:53 +00:00
Matt Macy
3adccf38e3 turnstile / sleepqueue: annotate variables only used by debug builds
2018-05-19 05:00:16 +00:00
Matt Macy
06bf2a6aef Add simple preempt safe epoch API
Read locking is overused in the kernel to guarantee liveness. This API makes
it easy to provide liveness guarantees without atomics.

Includes epoch_test kernel module to stress test the API.

Documentation will follow initial use case.

Test case and improvements to preemption handling in response to discussion
with mjg@

Reviewed by:	imp@, shurd@
Approved by:	sbruno@
2018-05-10 17:55:24 +00:00
Pedro F. Giffuni
8a36da99de sys/kern: adoption of SPDX licensing ID tags.
Mainly focus on files that use the BSD 2-Clause license; however, the tool I
was using misidentified many licenses, so this was mostly a manual,
error-prone task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well-known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-27 15:20:12 +00:00
Conrad Meyer
d2e155a4f0 Remove unused declaration and update ddb.4
A follow-up to r322836.

Warnings for the unused declaration were breaking some second tier
architectures, but did not show up in Clang on x86.

Reported by:	markj (ddb.4), emaste (declaration)
Sponsored by:	Dell EMC Isilon
2017-08-24 19:16:25 +00:00
Conrad Meyer
0c1d923efb Merge print_lockchain and print_sleepchain
When debugging a deadlock, it is useful to follow the full chain of locks as
far as possible.

Reviewed by:	jhb
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12115
2017-08-24 15:12:16 +00:00
Conrad Meyer
8798ef0679 ddb(4): Add sleepchains to "show allchains"
Reported by:	markj
Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8320
2016-10-22 18:02:20 +00:00
Konstantin Belousov
86a448c3a4 Finish r173600. There is no need to test a condition if both cases
result in the same value.

Found by:	PVS-Studio
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-02-10 21:16:37 +00:00
Pedro F. Giffuni
cd508278c1 ddb: finish converting boolean values.
The replacement started at r283088 was necessarily incomplete without
replacing boolean_t with bool.  This also involved cleaning some type
mismatches and ansifying old C function declarations.

Pointed out by:	bde
Discussed with:	bde, ian, jhb
2015-05-21 15:16:18 +00:00
Andriy Gapon
d9fae5ab88 dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE
In its stead use the Solaris / illumos approach of emulating '-' (dash)
in probe names with '__' (two consecutive underscores).

Reviewed by:	markj
MFC after:	3 weeks
2013-11-26 08:46:27 +00:00
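The naming convention mentioned in the commit above (two consecutive underscores standing in for a dash) can be sketched as a plain string translation. probe_name_translate() is a hypothetical helper written for illustration; it is not part of the SDT implementation.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy 'src' into 'dst', replacing every "__"
 * pair with a single '-'.  The output is always NUL-terminated when
 * dstlen > 0 and is truncated if the buffer is too small.
 */
void
probe_name_translate(const char *src, char *dst, size_t dstlen)
{
	size_t i = 0, j = 0;

	if (dstlen == 0)
		return;
	while (src[i] != '\0' && j + 1 < dstlen) {
		if (src[i] == '_' && src[i + 1] == '_') {
			dst[j++] = '-';
			i += 2;		/* consume both underscores */
		} else {
			dst[j++] = src[i++];
		}
	}
	dst[j] = '\0';
}
```

With this model, a probe declared as lend__pri would be presented as lend-pri, while a single underscore is left untouched.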
Attilio Rao
54366c0bd7 - For kernels compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
  calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined
  version of the releasing functions for mutex, rwlock and sxlock.
  Failing to do so skips the lockstat_probe_func invocation for
  unlocking.
- Because part of the LOCKSTAT support is inlined in mutex operations, for
  kernels compiled without lock debugging options, potentially every
  consumer must be compiled including opt_kdtrace.h.
  Fix this by moving KDTRACE_HOOKS into opt_global.h and removing the
  dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
  is linked there and it is only used as a compile-time stub [0].

[0] immediately exposes a new bug: the DTrace-derived debug support in
sfxge is broken and was never really tested.  Because it did not
include opt_kdtrace.h correctly, it was never enabled, so it
stayed broken for a while.  Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by:	EMC / Isilon storage division
Discussed with:	rstone
[0] Reported by:	rstone
[1] Discussed with:	philip
2013-11-25 07:38:45 +00:00
Pawel Jakub Dawidek
b2e054b0d4 Update the comment: we do show the backtrace of misbehaving thread.
2013-02-17 21:37:32 +00:00
Attilio Rao
e3ae0dfe69 Improve check coverage about idle threads.
Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicking at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in sleepqueues is kept because idle threads are not allowed
to use tsleep() either, which could still happen.

Reviewed by:	bde, jhb, kib
MFC after:	1 week
2012-09-12 22:10:53 +00:00
John Baldwin
ba96d2d816 Mark the idle threads as non-sleepable and also assert that an idle
thread never blocks on a turnstile.
2012-08-22 20:01:38 +00:00
Ryan Stone
b3e9e682cf Implement the DTrace sched provider. This implementation aims to be
compatible with the sched provider implemented by Solaris and its open-
source derivatives.  Full documentation of the sched provider can be found
on Oracle's DTrace wiki pages.

Note that for compatibility with scripts originally written for Solaris,
several probes are defined that will never fire.  These probes are defined
to fire when Solaris-specific features perform certain actions.  As these
features are not present in FreeBSD, the probes can never fire.

Also, I have added two probes that are not defined in Solaris, lend-pri
and load-change.  These probes have been added to make it possible to
collect schedgraph data with DTrace.

Finally, a few probes are defined in Solaris to take a cpuinfo_t *
argument.  As it was not immediately clear to me how to translate that to
FreeBSD, currently those probes are passed NULL in place of a cpuinfo_t *.

Sponsored by: Sandvine Incorporated
MFC after:	2 weeks
2012-05-15 01:30:25 +00:00
Davide Italiano
99006d44f8 Fix a typo.
Approved by:	gnn (mentor)
MFC after:	2 days
2012-04-14 23:59:58 +00:00
Marius Strobl
91849f349c Fix !DDB build after r234190.
2012-04-14 11:21:24 +00:00
John Baldwin
0cc457b000 - Extend the KDB interface to add a per-debugger callback to print a
backtrace for an arbitrary thread (rather than the calling thread).
  A kdb_backtrace_thread() wrapper function uses the configured debugger
  if possible, otherwise it falls back to using stack(9) if that is
  available.
- Replace a direct call to db_trace_thread() in propagate_priority()
  with a call to kdb_backtrace_thread() instead.

MFC after:	1 week
2012-04-12 17:43:59 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
John Baldwin
f7488600c0 Always assert that the turnstile chain lock is held in turnstile_wait()
and remove a duplicate hash lookup.

MFC after:	1 week
2011-02-04 14:16:41 +00:00
Attilio Rao
f7829d0d5c Introduce the new kernel thread called "deadlock resolver".
While the name is pretentious, a good explanation of its targets is
reported in this 17 months old presentation e-mail:
http://lists.freebsd.org/pipermail/freebsd-arch/2008-August/008452.html

In order to implement it, the sq_type in sleepqueues is now mandatory and
no longer compiled only with the INVARIANTS option. Additionally, a new
sleepqueue function, sleepq_type(), is added, returning the type of the
sleepqueue linked to a wchan.
Three new sysctls are added in order to configure the thread:
debug.deadlkres.slptime_threshold
debug.deadlkres.blktime_threshold
debug.deadlkres.sleepfreq

representing the thresholds for sleep and block time that will lead to
a deadlock match (when exceeded), while sleepfreq represents the
number of seconds between two consecutive runs of the thread.
To enable the deadlock resolver thread, recompile your kernel
with the option DEADLKRES.

Reviewed by:	jeff
Tested by:	pho, Giovanni Trematerra
Sponsored by:	Nokia Incorporated, Sandvine Incorporated
MFC after:	2 weeks
2010-01-09 01:46:38 +00:00
Ed Schouten
907b48bc05 Fix indentation.
2009-12-20 22:55:27 +00:00
Sam Leffler
39297ba455 Make ddb command registration dynamic so modules can extend
the command set (only so long as the module is present):
o add db_command_register and db_command_unregister to add and remove
  commands, respectively
o replace linker sets with SYSINITs (and SYSUNINITs) that register
  commands
o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table
  for registering top-level commands, show operands, and show all operands,
  respectively

While here also:
o sort command lists
o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases
  for existing commands
o add "show all trace" as an alias for "show alltrace"
o add "show all locks" as an alias for "show alllocks"

Submitted by:	Guillaume Ballet <gballet@gmail.com> (original version)
Reviewed by:	jhb
MFC after:	1 month
2008-09-15 22:45:14 +00:00
John Baldwin
8c68f75a7c - Reduce scope of #ifdef's in uma_zcreate() call in init_turnstile0().
- Set UMA_ZONE_NOFREE so that the per-turnstile spin locks are type stable
  to avoid a race where one thread might dereference a lock in a free'd
  turnstile that was previously used by another thread.

Theorized by:	tegge (2)
MFC after:	1 week
2008-09-08 21:40:15 +00:00
Jeff Roberson
8df78c41d6 - Make SCHED_STATS more generic by adding a wrapper to create the
variables and sysctl nodes.
 - In reset walk the children of kern_sched_stats and reset the counters
   via the oid_arg1 pointer.  This allows us to add arbitrary counters to
   the tree and still reset them properly.
 - Define a set of switch types to be passed with flags to mi_switch().
   These types are named SWT_*.  These types correspond to SCHED_STATS
   counters and are automatically handled in this way.
 - Make the new SWT_ types more specific than the older switch stats.
   There are now stats for idle switches, remote idle wakeups, remote
   preemption, ithreads idling, etc.
 - Add switch statistics for ULE's pickcpu algorithm.  These stats include
   how much migration there is, how often affinity was successful, how
   often threads were migrated to the local cpu on wakeup, etc.

Sponsored by:	Nokia
2008-04-17 04:20:10 +00:00
Jeff Roberson
626ac252ea - Add THREAD_LOCKPTR_ASSERT() to assert that the thread's lock points at
the provided lock or &blocked_lock.  The thread may be temporarily
   assigned to the blocked_lock by the scheduler so a direct comparison
   cannot always be made.
 - Use THREAD_LOCKPTR_ASSERT() in the primary consumers of the scheduling
   interfaces.  The schedulers themselves still use more explicit asserts.

Sponsored by:	Nokia
2008-02-07 06:55:38 +00:00
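The check described in the commit above can be modeled in user space as a simple pointer comparison with one extra accepted value. This is a sketch only: thread_lockptr_ok() and the trimmed-down structures are hypothetical, and the real THREAD_LOCKPTR_ASSERT() is a kernel macro whose exact definition is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct mtx {
	int	dummy;
};

struct thread {
	struct mtx	*td_lock;	/* lock currently protecting the thread */
};

/* Placeholder lock the scheduler may temporarily assign to a thread. */
struct mtx blocked_lock;

/*
 * Model of the assertion's condition: the thread's lock pointer must
 * be either the lock the caller expects or &blocked_lock, because a
 * direct comparison against 'lock' alone can spuriously fail while
 * the scheduler is moving the thread between locks.
 */
bool
thread_lockptr_ok(const struct thread *td, const struct mtx *lock)
{
	const struct mtx *cur = td->td_lock;

	return (cur == lock || cur == &blocked_lock);
}
```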
Jeff Roberson
5dff04c31f Adaptive spinning in write path with readers and writer starvation avoidance.
- Move recursion checking into rwlock inlines to free a bit for use with
   adaptive spinners.
 - Clear the RW_LOCK_WRITE_SPINNERS flag whenever the lock state changes
   causing write spinners to restart their loop.
 - Write spinners are limited by a count while readers hold the lock as
   there is no way to know for certain whether readers are running still.
 - In the read path block if there are write waiters or spinners to avoid
   starving writers.  Use a new per-thread count, td_rw_rlocks, to skip
   starvation avoidance if it might cause a deadlock.
 - Remove or change invalid assertions in turnstiles.

Reviewed by:    attilio (developed parts of the patch as well)
Sponsored by:   Nokia
2008-02-06 01:02:13 +00:00
Julian Elischer
431f890614 Generally we are interested in what thread did something as
opposed to what process. Since threads by default have the name of the
process unless overwritten with more useful information, just print the
thread name instead.
2007-11-14 06:21:24 +00:00
Jeff Roberson
3036ab79e3 - Include opt_sched.h for SCHED_STATS.
2007-06-12 23:27:31 +00:00
Jeff Roberson
2502c107ba Commit 3/14 of sched_lock decomposition.
- Add a per-turnstile spinlock to solve potential priority propagation
   deadlocks that are possible with thread_lock().
 - The turnstile lock order is defined as the exact opposite of the
   lock order used with the sleep locks they represent.  This allows us
   to walk in reverse order in priority_propagate and this is the only
   place we wish to multiply acquire turnstile locks.
 - Use the turnstile_chain lock to protect assigning mutexes to turnstiles.
 - Change the turnstile interface to pass back turnstile pointers to the
   consumers.  This allows us to reduce some locking and makes it easier
   to cancel turnstile assignment while the turnstile chain lock is held.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:51:44 +00:00
Jeff Roberson
2b7e2ee7a5 - Convert turnstiles and sleepqueues to use UMA. This provides a modest
speedup and will be more useful after each gains a spinlock in the
   impending thread_lock() commit.
 - Move initialization and asserts into init/fini routines.  fini routines
   are only needed in the INVARIANTS case for now.

Submitted by:	Attilio Rao <attilio@FreeBSD.org>
Tested by:	kris, jeff
2007-05-18 06:32:24 +00:00
Jeff Roberson
f0393f063a - Remove setrunqueue and replace it with direct calls to sched_add().
setrunqueue() was mostly empty.  The few asserts and thread state
   setting were moved to the individual schedulers.  sched_add() was
   chosen to displace it for naming consistency reasons.
 - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be
   different on all three schedulers where it was only called in one place
   each.
 - Remove the long ifdef'd out remrunqueue code.
 - Remove the now redundant ts_state.  Inspect the thread state directly.
 - Don't set TSF_* flags from kern_switch.c, we were only doing this to
   support a feature in one scheduler.
 - Change sched_choose() to return a thread rather than a td_sched.  Also,
   rely on the schedulers to return the idlethread.  This simplifies the
   logic in choosethread().  Aside from the run queue links kern_switch.c
   mostly does not care about the contents of td_sched.

Discussed with:	julian

 - Move the idle thread loop into the per scheduler area.  ULE wants to
   do something different from the other schedulers.

Suggested by:	jhb

Tested on:	x86/amd64 sched_{4BSD, ULE, CORE}.
2007-01-23 08:46:51 +00:00
Xin LI
4f506694bb Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
2007-01-17 14:58:53 +00:00
John Baldwin
19c80b2652 Wrap propagate_priority() in a critical section to prevent unwanted
preemptions when adjusting the priority of a thread that is on a run
queue.  This was only observed when FULL_PREEMPTION was enabled.

Reported by:	kris
Diagnosed by:	ups
MFC after:	1 week
2007-01-11 19:13:27 +00:00
John Baldwin
462a7add8e Add a new 'show sleepchain' ddb command similar to 'show lockchain' except
that it operates on lockmgr and sx locks.  This can be useful for tracking
down vnode deadlocks in VFS for example.  Note that this command is a bit
more fragile than 'show lockchain' as we have to poke around at the
wait channel of a thread to see if it points to either a struct lock or
a condition variable inside of a struct sx.  If td_wchan points to
something unmapped, then this command will terminate early due to a fault,
but no harm will be done.
2006-08-15 18:29:01 +00:00
John Baldwin
77e662683b Rename 'show lockchain' to 'show locktree' and 'show threadchain' to
'show lockchain'.  The churn is because I'm about to add a new
'show sleepchain' similar to 'show lockchain' for sleep locks (lockmgr and
sx) and 'show threadchain' was a bit ambiguous as both commands show
a chain of thread dependencies, 'lockchain' is for non-sleepable locks
(mtx and rw) and 'sleepchain' is for sleepable locks.
2006-08-15 16:44:18 +00:00
John Baldwin
fed7988436 Honor db_pager_quit in 'show threadchain', 'show allchains', and
'show lockchain'.  This is especially helpful for the first 2 as a
threadchain could get stuck in an infinite loop during a mutex deadlock.
2006-07-12 21:25:24 +00:00
John Baldwin
ae110b53d1 Add some new commands to hopefully make it easier to diagnose lock-related
problems in ddb:
- "show threadchain [thread]" will start with the specified thread (or the
  current kdb thread by default) and show its state.  If it is blocked on
  a lock, it will find the owner of the lock and show its state, etc.
- "show allchains" will find all of the threads that are blocked on a
  lock (but do not have any threads blocked on a lock they hold) and show
  the resulting thread chain.
- "show lockchain <lock>" takes a pointer to a lock_object (such as a
  mutex or rwlock).  If there is a turnstile for that lock, then it will
  display all the threads blocked on the lock.  In addition, for each
  thread blocked on the lock, it will display any contested locks they
  hold, and recurse on those locks to show any threads blocked on those
  locks, etc.
2006-04-25 20:28:17 +00:00
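The "show threadchain" walk described in the commit above (thread, then the owner of the lock it is blocked on, and so on) can be sketched as a bounded pointer chase. The structures and chain_walk() are hypothetical simplifications, not the ddb code; the explicit bound is an assumption added here because a mutex deadlock forms a cycle that would otherwise never terminate.

```c
#include <stddef.h>

struct lock;

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct thread {
	int		tid;
	struct lock	*td_blocked;	/* lock the thread waits on, or NULL */
};

struct lock {
	struct thread	*lk_owner;	/* current owner, or NULL */
};

/*
 * Follow the chain thread -> blocking lock -> owner -> ... and record
 * each thread id in 'tids'.  At most 'max' entries are recorded, so a
 * deadlock cycle cannot make the walk spin forever.  Returns the
 * number of entries written.
 */
int
chain_walk(const struct thread *td, int *tids, int max)
{
	int n = 0;

	while (td != NULL && n < max) {
		tids[n++] = td->tid;
		if (td->td_blocked == NULL)
			break;			/* end of chain: thread runs */
		td = td->td_blocked->lk_owner;
	}
	return (n);
}
```

If the returned count equals the bound, the chain was either very long or cyclic, which in this model is the hint that a deadlock may be present.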
John Baldwin
f9ab2f134f Print td_name instead of p_comm if td_name is non-empty for
'show turnstile' and 'show sleepq'.
2006-04-21 20:40:43 +00:00
John Baldwin
f1a4b852dc - Bring back turnstile_empty() which can check to see if an individual
queue on a turnstile is empty.
- Add a turnstile_disown() function that allows a thread to give up
  ownership of a turnstile w/o waking up any waiters.
2006-04-18 18:16:54 +00:00
John Baldwin
4b3b0413d2 Always explicitly panic in propagate_priority() if we try to propagate
a lock's priority to a sleeping thread.  When we panic, dump a stack
trace of the thread that is asleep if DDB is compiled into the kernel
just before calling panic().  This is much more informative and useful
for debugging than the current behavior of getting a page fault and not
having an easy way of determining which thread caused the original problem.

MFC after:	1 week
2006-03-29 23:24:55 +00:00
John Baldwin
7aa4f6852a - Add support for having both a shared and exclusive queue of threads in
each turnstile.  Also, allow for the owner thread pointer of a turnstile
  to be NULL.  This is needed for the upcoming reader/writer lock
  implementation.
- Add a new ddb command 'show turnstile' that will look up the turnstile
  associated with the given lock argument and display useful information
  like the list of threads blocked on each queue, etc.  If there isn't an
  active turnstile for a lock at the specified address, then the function
  will see if there is an active turnstile at the specified address and
  display info about it if so.
- Adjust the mutex code to handle the turnstile API changes.

Tested on:	i386 (all), alpha, amd64, sparc64 (1 and 3)
2006-01-27 22:42:12 +00:00
John Baldwin
550d1c9392 Initialize thread0.td_contested in init_turnstiles() rather than
mutex_init() as it is used by the turnstile code and is not mutex-specific.
2006-01-17 16:47:42 +00:00
John Baldwin
3eb9cab0c6 Garbage collect turnstile_empty() since it is unused.
2006-01-17 16:40:20 +00:00
John Baldwin
b65089ccb5 Trim a couple of unneeded includes.
2005-09-29 19:13:52 +00:00
Poul-Henning Kamp
c711aea6ca Make a bunch of malloc types static.
Found by:	src/tools/tools/kernxref
2005-02-10 12:02:37 +00:00
John Baldwin
f5c157d986 Rework the interface between priority propagation (lending) and the
schedulers a bit to ensure more correct handling of priorities and fewer
priority inversions:
- Add two functions to the sched(9) API to handle priority lending:
  sched_lend_prio() and sched_unlend_prio().  The turnstile code uses these
  functions to ask the scheduler to lend a thread a set priority and to
  tell the scheduler when it thinks it is ok for a thread to stop borrowing
  priority.  The unlend case is slightly complex in that the turnstile code
  tells the scheduler what the minimum priority of the thread needs to be
  to satisfy the requirements of any other threads blocked on locks owned
  by the thread in question.  The scheduler then decides whether the thread
  can go back to normal mode (if its normal priority is high enough to
  satisfy the pending lock requests) or if it should continue to use the
  priority specified to the sched_unlend_prio() call.  This involves adding
  a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag
  for priority elevation.
- Schedulers now refuse to lower the priority of a thread that is currently
  borrowing another thread's priority.
- If a scheduler changes the priority of a thread that is currently sitting
  on a turnstile, it will call a new function turnstile_adjust() to inform
  the turnstile code of the change.  This function re-sorts the thread on
  the priority list of the turnstile if needed, and if the thread ends up
  at the head of the list (due to having the highest priority) and its
  priority was raised, then it will propagate that new priority to the
  owner of the lock it is blocked on.

Some additional fixes specific to the 4BSD scheduler include:
- Common code for updating the priority of a thread when the user priority
  of its associated kse group has been consolidated in a new static
  function resetpriority_thread().  One change to this function is that
  it will now only adjust the priority of a thread if it already has a
  time sharing priority, thus preserving any boosts from a tsleep() until
  the thread returns to userland.  Also, resetpriority() no longer calls
  maybe_resched() on each thread in the group. Instead, the code calling
  resetpriority() is responsible for calling resetpriority_thread() on
  any threads that need to be updated.
- schedcpu() now uses resetpriority_thread() instead of just calling
  sched_prio() directly after it updates a kse group's user priority.
- sched_clock() now uses resetpriority_thread() rather than writing
  directly to td_priority.
- sched_nice() now updates all the priorities of the threads after the
  group priority has been adjusted.

Discussed with:	bde
Reviewed by:	ups, jeffr
Tested on:	4bsd, ule
Tested on:	i386, alpha, sparc64
2004-12-30 20:52:44 +00:00
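The lend/unlend decision described in the commit above can be approximated with a small user-space model. Everything here is an illustrative assumption: model_lend_prio(), model_unlend_prio(), the flag value, and the trimmed struct are hypothetical, and the real sched_lend_prio() and sched_unlend_prio() live inside each scheduler. The model relies on the FreeBSD convention that a numerically lower value means a higher priority.

```c
/* Model value only; the kernel's flag bit is not reproduced here. */
#define	TDF_BORROWING	0x1

/* Hypothetical, simplified thread: lower number = higher priority. */
struct thread {
	unsigned char	td_priority;	/* effective priority */
	unsigned char	td_base_pri;	/* the thread's own priority */
	int		td_flags;
};

/* Lend 'prio' to the thread; in this model it can only raise it. */
void
model_lend_prio(struct thread *td, unsigned char prio)
{
	td->td_flags |= TDF_BORROWING;
	if (prio < td->td_priority)
		td->td_priority = prio;
}

/*
 * 'prio' is the minimum priority still required by waiters on locks
 * the thread owns.  If the thread's own priority already satisfies
 * that, it returns to normal mode; otherwise it keeps borrowing at
 * the requested level.
 */
void
model_unlend_prio(struct thread *td, unsigned char prio)
{
	if (td->td_base_pri <= prio) {
		td->td_flags &= ~TDF_BORROWING;
		td->td_priority = td->td_base_pri;
	} else {
		td->td_priority = prio;
	}
}
```

In this model a thread with base priority 120 that was lent 80 stays in borrowing mode when unlent to 100, and only drops back to 120 once the remaining waiters need no more than its own priority.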