275 Commits

Author SHA1 Message Date
Mateusz Guzik
b247fd395d locks: make trylock routines check for 'unowned' value
Since fcmpset can fail without lock contention e.g. on arm, it was possible
to get spurious failures when the caller was expecting the primitive to succeed.

Reported by:	mmel
2017-02-19 16:28:46 +00:00
Mateusz Guzik
5c5df0d99b locks: clean up trylock primitives
In particular thius reduces accesses of the lock itself.
2017-02-18 22:06:03 +00:00
Mateusz Guzik
a24c8eb847 mtx: plug the 'opts' argument when not used 2017-02-18 01:52:10 +00:00
Mateusz Guzik
cbebea4e67 mtx: get rid of file/line args from slow paths if they are unused
This denotes changes which went in by accident in r313877.

On most production kernels both said parameters are zeroed and have nothing
reading them in either __mtx_lock_sleep or __mtx_unlock_sleep. Thus this change
stops passing them by internal consumers which this is the case.

Kernel modules use _flags variants which are not affected kbi-wise.
2017-02-17 15:40:24 +00:00
Mateusz Guzik
09f1319acd mtx: restrict r313875 to kernels without LOCK_PROFILING 2017-02-17 15:34:40 +00:00
Mateusz Guzik
7640beb920 mtx: microoptimize lockstat handling in __mtx_lock_sleep
This saves a function call and multiple branches after the lock is acquired.
2017-02-17 14:55:59 +00:00
Mateusz Guzik
ffd5c94c4f locks: let primitives for modules unlock without always goging to the slsow path
It is only needed if the LOCK_PROFILING is enabled. It has to always check if
the lock is about to be released which requires an avoidable read if the option
is not specified..
2017-02-17 05:39:40 +00:00
Mateusz Guzik
afa39f7a32 locks: remove SCHEDULER_STOPPED checks from primitives for modules
They all fallback to the slow path if necessary and the check is there.

This means a panicked kernel executing code from modules will be able to
succeed doing actual lock/unlock, but this was already the case for core code
which has said primitives inlined.
2017-02-17 05:09:51 +00:00
Mateusz Guzik
3b3cf014fc locks: tidy up unlock fallback paths
Update comments to note these functions are reachable if lockstat is
enabled.

Check if the lock has any bits set before attempting unlock, which saves
an unnecessary atomic operation.
2017-02-09 08:19:30 +00:00
Mateusz Guzik
8e5a3e9a9d locks: change backoff to exponential
Previous implementation would use a random factor to spread readers and
reduce chances of starvation. This visibly reduces effectiveness of the
mechanism.

Switch to the more traditional exponential variant. Try to limit starvation
by imposing an upper limit of spins after which spinning is half of what
other threads get. Note the mechanism is turned off by default.

Reviewed by:	kib (previous version)
2017-02-07 14:49:36 +00:00
Mateusz Guzik
c1aaf63cb5 locks: fix recursion support after recent changes
When a relevant lockstat probe is enabled the fallback primitive is called with
a constant signifying a free lock. This works fine for typical cases but breaks
with recursion, since it checks if the passed value is that of the executing
thread.

Read the value if necessary.
2017-02-06 09:40:14 +00:00
Mateusz Guzik
dc0896512c mtx: fixup r313278, the assignemnt was supposed to go inside the loop 2017-02-05 09:53:13 +00:00
Mateusz Guzik
cae4ab7f37 mtx: fix up _mtx_obtain_lock_fetch usage in thread lock
Since _mtx_obtain_lock_fetch no longer sets the argument to MTX_UNOWNED,
callers have to do it on their own.
2017-02-05 09:35:17 +00:00
Mateusz Guzik
08da267775 mtx: move lockstat handling out of inline primitives
Lockstat requires checking if it is enabled and if so, calling a 6 argument
function. Further, determining whether to call it on unlock requires
pre-reading the lock value.

This is problematic in at least 3 ways:
- more branches in the hot path than necessary
- additional cacheline ping pong under contention
- bigger code

Instead, check first if lockstat handling is necessary and if so, just fall
back to regular locking routines. For this purpose a new macro is introduced
(LOCKSTAT_PROFILE_ENABLED).

LOCK_PROFILING uninlines all primitives. Fold in the current inline lock
variant into the _mtx_lock_flags to retain the support. With this change
the inline variants are not used when LOCK_PROFILING is defined and thus
can ignore its existence.

This results in:
   text	   data	    bss	    dec	    hex	filename
22259667	1303208	4994976	28557851	1b3c21b	kernel.orig
21797315	1303208	4994976	28095499	1acb40b	kernel.patched

i.e. about 3% reduction in text size.

A remaining action is to remove spurious arguments for internal kernel
consumers.
2017-02-05 08:04:11 +00:00
Mateusz Guzik
90836c3270 mtx: switch to fcmpset
The found value is passed to locking routines in order to reduce cacheline
accesses.

mtx_unlock grows an explicit check for regular unlock. On ll/sc architectures
the routine can fail even if the lock could have been handled by the inline
primitive.

Discussed with:	jhb
Tested by:	pho (previous version)
2017-02-05 03:26:34 +00:00
Mateusz Guzik
290511163d Sprinkle __read_mostly on backoff and lock profiling code.
MFC after:	1 month
2017-01-27 15:03:51 +00:00
Mateusz Guzik
391df78ad4 mtx: plug open-coded mtx_lock access missed in r311172 2017-01-04 02:25:31 +00:00
Mateusz Guzik
5e5ad162ad Reduce lock accesses in thread lock similarly to r311172. 2017-01-03 23:08:11 +00:00
Mateusz Guzik
2604eb9e17 mtx: reduce lock accesses
Instead of spuriously re-reading the lock value, read it once.

This change also has a side effect of fixing a performance bug:
on failed _mtx_obtain_lock, it was possible that re-read would find
the lock is unowned, but in this case the primitive would make a trip
through turnstile code.

This is diff reduction to a variant which uses atomic_fcmpset.

Discussed with:	jhb (previous version)
Tested by:	pho (previous version)
2017-01-03 21:36:15 +00:00
Mark Johnston
02315a6759 Use a consistent snapshot of the lock state in owner_mtx().
MFC after:	2 weeks
2016-12-10 02:59:34 +00:00
Eric van Gyzen
310ab671b8 Make no assertions about mutex state when the scheduler is stopped.
This changes the assert path to match the lock and unlock paths.

MFC after:	1 week
Sponsored by:	Dell EMC
2016-09-26 15:30:30 +00:00
Mateusz Guzik
a0d45f0fc8 locks: add backoff for spin mutexes and thread lock
Reviewed by:	jhb
2016-09-09 19:13:02 +00:00
Mateusz Guzik
fa5000a4f3 locks: fix compilation for KDTRACE_HOOKS && !ADAPTIVE_* case
Reported by:	Michael Butler <imb protected-networks.net>
2016-08-02 03:05:59 +00:00
Mateusz Guzik
1ada904147 Implement trivial backoff for locking primitives.
All current spinning loops retry an atomic op the first chance they get,
which leads to performance degradation under load.

One classic solution to the problem consists of delaying the test to an
extent. This implementation has a trivial linear increment and a random
factor for each attempt.

For simplicity, this first thouch implementation only modifies spinning
loops where the lock owner is running. spin mutexes and thread lock were
not modified.

Current parameters are autotuned on boot based on mp_cpus.

Autotune factors are very conservative and are subject to change later.

Reviewed by:	kib, jhb
Tested by:	pho
MFC after:	1 week
2016-08-01 21:48:37 +00:00
Mateusz Guzik
61852185ba locks: change sleep_cnt and spin_cnt types to u_int
Both variables are uint64_t, but they only count spins or sleeps.
All reasonable values which we can get here comfortably hit in 32-bit range.

Suggested by: kib
MFC after:	1 week
2016-07-31 12:11:55 +00:00
Konstantin Belousov
90b581f2cc Implement mtx_trylock_spin(9).
Discussed with:	bde
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D7192
2016-07-23 05:30:55 +00:00
Mark Johnston
f61d6c5a5b Ensure that spinlock sections are balanced even after a panic.
vpanic() uses spinlock_enter() to disable interrupts before dumping core.
However, when the scheduler is stopped and INVARIANTS is not configured,
thread_lock() does not acquire a spinlock section, while thread_unlock()
releases one. This can result in interrupts staying enabled while the
kernel dumps core, complicating post-mortem analysis of the crash.

Approved by:	re (gjb)
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-07-05 17:59:04 +00:00
Mateusz Guzik
fc4f686d59 Microoptimize locking primitives by avoiding unnecessary atomic ops.
Inline version of primitives do an atomic op and if it fails they fallback to
actual primitives, which immediately retry the atomic op.

The obvious optimisation is to check if the lock is free and only then proceed
to do an atomic op.

Reviewed by:	jhb, vangyzen
2016-06-01 18:32:20 +00:00
Mark Johnston
be2dfd58fe Remove the MUTEX_DEBUG kernel option.
It has no counterpart among the other lock primitives and has been a
no-op for years. Mutex consistency checks are generally done whenver
INVARIANTS is enabled.
2016-05-18 03:34:02 +00:00
Mark Johnston
5002e19502 Guard the lockstat:::thread-spin probe with KDTRACE_HOOKS.
X-MFC-With:	r300103
2016-05-18 03:23:07 +00:00
Mark Johnston
156fbc14a0 lockstat:::thread-spin should only fire after spinning for the lock.
MFC after:	1 week
2016-05-18 03:21:21 +00:00
Mark Johnston
ce1c953ee0 Don't modify curthread->td_locks unless INVARIANTS is enabled.
This field is only used in a KASSERT that verifies that no locks are held
when returning to user mode. Moreover, the td_locks accounting is only
correct when LOCK_DEBUG > 0, which is implied by INVARIANTS.

Reviewed by:	jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D3205
2015-08-02 00:03:08 +00:00
Mark Johnston
32cd0147fa Implement the lockstat provider using SDT(9) instead of the custom provider
in lockstat.ko. This means that lockstat probes now have typed arguments and
will utilize SDT probe hot-patching support when it arrives.

Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D2993
2015-07-19 22:14:09 +00:00
Mark Johnston
c6d48c8752 Fix the !KDTRACE_HOOKS build.
X-MFC-With:	r285664
2015-07-18 04:38:11 +00:00
Mark Johnston
e2b25737ee Pass the lock object to lockstat_nsecs() and return immediately if
LO_NOPROFILE is set. Some timecounter handlers acquire a spin mutex, and
we don't want to recurse if lockstat probes are enabled.

PR:		201642
Reviewed by:	avg
MFC after:	3 days
2015-07-18 00:57:30 +00:00
Andriy Gapon
076dd8eb2e several lockstat improvements
0. For spin events report time spent spinning, not a loop count.
While loop count is much easier and cheaper to obtain it is hard
to reason about the reported numbers, espcially for adaptive locks
where both spinning and sleeping can happen.
So, it's better to compare apples and apples.

1. Teach lockstat about FreeBSD rw locks.
This is done in part by changing the corresponding probes
and in part by changing what probes lockstat should expect.

2. Teach lockstat that rw locks are adaptive and can spin on FreeBSD.

3. Report lock acquisition events for successful rw try-lock operations.

4. Teach lockstat about FreeBSD sx locks.
Reporting of events for those locks completely mirrors
rw locks.

5. Report spin and block events before acquisition event.
This is behavior documented for the upstream, so it makes sense to stick
to it.  Note that because of FreeBSD adaptive lock implementations
both the spin and block events may be reported for the same acquisition
while the upstream reports only one of them.

Differential Revision:	https://reviews.freebsd.org/D2727
Reviewed by:	markj
MFC after:	17 days
Relnotes:	yes
Sponsored by:	ClusterHQ
2015-06-12 10:01:24 +00:00
Dmitry Chagin
fd07ddcf6f Add _NEW flag to mtx(9), sx(9), rmlock(9) and rwlock(9).
A _NEW flag passed to _init_flags() to avoid check for double-init.

Differential Revision:	https://reviews.freebsd.org/D1208
Reviewed by:	jhb, wblock
MFC after:	1 Month
2014-12-13 21:00:10 +00:00
Konstantin Belousov
6afb32fc67 Disable recursion for the process spinlock.
Tested by:	pho
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2014-12-01 17:36:10 +00:00
Konstantin Belousov
5c7bebf961 The process spin lock currently has the following distinct uses:
- Threads lifetime cycle, in particular, counting of the threads in
  the process, and interlocking with process mutex and thread lock.
  The main reason of this is that turnstile locks are after thread
  locks, so you e.g. cannot unlock blockable mutex (think process
  mutex) while owning thread lock.

- Virtual and profiling itimers, since the timers activation is done
  from the clock interrupt context.  Replace the p_slock by p_itimmtx
  and PROC_ITIMLOCK().

- Profiling code (profil(2)), for similar reason.  Replace the p_slock
  by p_profmtx and PROC_PROFLOCK().

- Resource usage accounting.  Need for the spinlock there is subtle,
  my understanding is that spinlock blocks context switching for the
  current thread, which prevents td_runtime and similar fields from
  changing (updates are done at the mi_switch()).  Replace the p_slock
  by p_statmtx and PROC_STATLOCK().

The split is done mostly for code clarity, and should not affect
scalability.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-11-26 14:10:00 +00:00
Warner Losh
40e6bdaf1e opt_global.h is included automatically in the build. No need to
explicitly include it in these places.

Sponsored by: Netflix
2014-11-18 17:06:56 +00:00
John Baldwin
2cba8dd301 Add a new thread state "spinning" to schedgraph and add tracepoints at the
start and stop of spinning waits in lock primitives.
2014-11-04 16:35:56 +00:00
Attilio Rao
54366c0bd7 - For kernel compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
  calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined
  version of the releasing functions for mutex, rwlock and sxlock.
  Failing to do so skips the lockstat_probe_func invokation for
  unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
  kernel compiled without lock debugging options, potentially every
  consumer must be compiled including opt_kdtrace.h.
  Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
  dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
  is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested.  As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while.  Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by:	EMC / Isilon storage division
Discussed with:	rstone
[0] Reported by:	rstone
[1] Discussed with:	philip
2013-11-25 07:38:45 +00:00
Davide Italiano
7faf4d90e8 Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) an uinptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().

Suggested by:	jhb
Reviewed by:	jhb
Approved by:	re (delphij)
2013-09-20 23:06:21 +00:00
Attilio Rao
ac6b769be9 Give mutex(9) the ability to recurse on a per-instance basis.
Now the MTX_RECURSE flag can be passed to the mtx_*_flag() calls.
This helps in cases we want to narrow down to specific calls the
possibility to recurse for some locks.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff, alc
Tested by:	pho
2013-08-09 11:24:29 +00:00
Scott Long
de925dd31f Fix r253823. Some WIP patches snuck in.
Submitted by:	zont
2013-07-30 23:50:09 +00:00
Scott Long
fc4a5f052b Create a knob, kern.ipc.sfreadahead, that allows one to tune the amount of
readahead that sendfile() will do.  Default remains the same.

Obtained from:	Netflix
MFC after:	3 days
2013-07-30 23:26:05 +00:00
John Baldwin
b5fb43e572 A few mostly cosmetic nits to aid in debugging:
- Call lock_init() first before setting any lock_object fields in
  lock init routines.  This way if the machine panics due to a duplicate
  init the lock's original state is preserved.
- Somewhat similarly, don't decrement td_locks and td_slocks until after
  an unlock operation has completed successfully.
2013-06-25 20:23:08 +00:00
Attilio Rao
cd2fe4e632 Fixup r240424: On entering KDB backends, the hijacked thread to run
interrupt context can still be idlethread. At that point, without the
panic condition, it can still happen that idlethread then will try to
acquire some locks to carry on some operations.

Skip the idlethread check on block/sleep lock operations when KDB is
active.

Reported by:	jh
Tested by:	jh
MFC after:	1 week
2012-12-22 09:37:34 +00:00
Attilio Rao
7f44c61839 Give mtx(9) the ability to crunch different type of structures, with the
only constraint that they have a lock cookie named mtx_lock.
This name, then, becames reserved from the struct that wants to use the
mtx(9) KPI and other locking primitives cannot reuse it for their
members.

Namely such structs are the current struct mtx and the new
struct mtx_padalign.  The new structure will define an object which is
the same as the same layout of a struct mtx but will be allocated in
areas aligned to the cache line size and will be as big as a cache line.

This is supposed to give higher performance for highly contented mutexes
both spin or sleep (because of the adaptive spinning), where the cache
line contention results in too much traffic on the system bus.

The struct mtx_padalign can be used in a completely transparent way
with the mtx(9) KPI.

At the moment, a possibility to MFC the patch should be carefully
evaluated because this patch breaks the low level KPI
(not its representation though).

Discussed with:	jhb
Reviewed by:	jeff, andre
Reviewed by:	mdf (earlier version)
Tested by:	jimharris
2012-10-31 13:38:56 +00:00
Attilio Rao
0a15e5d30d Remove all the checks on curthread != NULL with the exception of some MD
trap checks (eg. printtrap()).

Generally this check is not needed anymore, as there is not a legitimate
case where curthread != NULL, after pcpu 0 area has been properly
initialized.

Reviewed by:	bde, jhb
MFC after:	1 week
2012-09-13 22:26:22 +00:00