Allows for duplicate locks to be acquired without witness complaining.
Similar flags exists already for rwlock(9) and sx(9).
Reviewed by: markj
MFC after: 3 days
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
NetApp PR: 52
Differential Revision: https://reviews.freebsd.org/D29683n
for the which which definitely use membar to sync with interrupt handlers.
libc and rtld uses of __compiler_membar() seems to want compiler barriers
proper.
The barrier in sched_unpin_lite() after td_pinned decrement seems to be not
needed and removed, instead of convertion.
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28956
The tracker flags need to be loaded only after the tracker is removed
from its per-CPU queue. Otherwise, readers may fail to synchronize with
pending writers attempting to propagate priority to active readers, and
readers and writers deadlock on each other. This was observed in a
stable/12-based armv7 kernel where the compiler had reordered the load
of rmp_flags to before the stores updating the queue.
Reviewed by: rlibby, scottl
Discussed with: kib
Sponsored by: Rubicon Communications, LLC ("Netgate")
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28821
This adds a dedicated counter updated with atomics when INVARIANTS
is used. As a side effect one can reliably determine the lock is held
for reading by at least one thread, but it's still not possible to
find out whether curthread has the lock in said mode.
This should be good enough in practice.
Problem spotted by avg.
The 2 provided zones had inconsistent naming between each other
("int" and "64") and other allocator zones (which use bytes).
Follow malloc by naming them "pcpu-" + size in bytes.
This is a step towards replacing ad-hoc per-cpu zones with
general slabs.
Previously the code had one wait channel for all pending writers.
This could result in a buggy scenario where after a writer switches
the lock mode form readers to writers goes off CPU, another writer
queues itself and then the last reader wakes up the latter instead
of the former.
Use a separate channel.
While here add features to reliably detect whether curthread has
the lock write-owned. This will be used by ZFS.
To be used when like rmlocks, except when sleeping for readers needs to be
allowed. See the manpage for more information.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D22823
s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual
implementation is "and not" (or "but not"), i.e. A but not B.
Fortunately this does appear to be what all existing callers want.
Don't supply a NAND (not (A and B)) operation at this time.
Discussed with: jeff
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22791
There are two classes of rm lock, one "sleepable" and one not. But even
a "sleepable" rm lock is only sleepable in write mode, and is
non-sleepable when taken in read mode.
Warn about sleepable rm locks in read mode as non-sleepable locks. Do
this by defining a new lock operation flag, LOP_NOSLEEP, to indicate
that a lock is non-sleepable despite what the LO_SLEEPABLE flag would
indicate, and defining a new witness lock instance flag, LI_SLEEPABLE,
to track the product of LO_SLEEPABLE and LOP_NOSLEEP on the lock
instance.
Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22527
Change the assert paths in rm, rw, and sx locks to match the lock
and unlock paths. I did this for mutexes in r306346.
Reported by: Travis Lane <tlane@isilon.com>
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().
Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313
Implement get_pcpu() for amd64/sparc64/mips/powerpc, and use it to
replace pcpu_find(curcpu) in MI code.
Reviewed by: andreast, kan, lidl
Tested by: lidl(mips, sparc64), andreast(powerpc)
Differential Revision: https://reviews.freebsd.org/D9587
The switch to get_pcpu() in MI code seems to cause hangs on MIPS.
Back out until we can get a better idea of what's happening there.
Reported by: kan, lidl
to sleep. The rmlock implementation enforces this by disabling
sleeping when a read lock is acquired. To simplify the implementation,
sleeping is disabled for most of the duration of rm_rlock. However,
it doesn't need to be disabled until the lock is acquired. If a
sleepable rm lock is contested, then rm_rlock may need to acquire the
backing sx lock. This tripped the overly-broad assertion. Fix by
relaxing the assertion around the call to sx_xlock().
Reported by: mjg
Reviewed by: kib, mjg
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D3324
This field is only used in a KASSERT that verifies that no locks are held
when returning to user mode. Moreover, the td_locks accounting is only
correct when LOCK_DEBUG > 0, which is implied by INVARIANTS.
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D3205
Currently LOCK_DEBUG is always defined in sys/lock.h (0 or 1).
This means that debugging code always built. In addition the kernel
modules have always defined LOCK_DEBUG as 1. So, debugging rmlock code
is always used by kernel modules.
MFC after: 1 week
A _NEW flag passed to _init_flags() to avoid check for double-init.
Differential Revision: https://reviews.freebsd.org/D1208
Reviewed by: jhb, wblock
MFC after: 1 Month
option, unbreak the lock tracing release semantic by embedding
calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined
version of the releasing functions for mutex, rwlock and sxlock.
Failing to do so skips the lockstat_probe_func invokation for
unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
kernel compiled without lock debugging options, potentially every
consumer must be compiled including opt_kdtrace.h.
Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
is linked there and it is only used as a compile-time stub [0].
[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested. As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while. Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].
Sponsored by: EMC / Isilon storage division
Discussed with: rstone
[0] Reported by: rstone
[1] Discussed with: philip
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) an uinptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().
Suggested by: jhb
Reviewed by: jhb
Approved by: re (delphij)
provided by Isilon.
- Add an rm_assert() supporting various lock assertions similar to other
locking primitives. Because rmlocks track readers the assertions are
always fully accurate unlike rw_assert() and sx_assert().
- Flesh out the lock class methods for rmlocks to support sleeping via
condvars and rm_sleep() (but only while holding write locks), rmlock
details in 'show lock' in DDB, and the lc_owner method used by
dtrace.
- Add an internal destroyed cookie so that API functions can assert
that an rmlock is not destroyed.
- Make use of rm_assert() to add various assertions to the API (e.g.
to assert locks are held when an unlock routine is called).
- Give RM_SLEEPABLE locks their own lock class and always use the
rmlock's own lock_object with WITNESS.
- Use THREAD_NO_SLEEPING() / THREAD_SLEEPING_OK() to disallow sleeping
while holding a read lock on an rmlock.
Submitted by: andre
Obtained from: EMC/Isilon
interrupt context can still be idlethread. At that point, without the
panic condition, it can still happen that idlethread then will try to
acquire some locks to carry on some operations.
Skip the idlethread check on block/sleep lock operations when KDB is
active.
Reported by: jh
Tested by: jh
MFC after: 1 week
instruction loads/stores at its will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.
Reviewed by: bde, kib
Discussed with: dim, theraven
MFC after: 2 weeks
Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicing at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in sleepqueues is left because they are not allowed to use
tsleep() either which could happen still.
Reviewed by: bde, jhb, kib
MFC after: 1 week
Historical behavior of letting other CPUs merily go on is a default for
time being. The new behavior can be switched on via
kern.stop_scheduler_on_panic tunable and sysctl.
Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
state
Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set. That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts. Those locks might be held by the stopped threads and would never
be released. To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.
This change has substantial portions written and re-written by attilio
and kib at various times. Other changes are heavily based on the ideas
and patches submitted by jhb and mdf. bde has provided many insights
into the details and history of the current code.
The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console. This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection. Dumping to USB-connected disks may also be affected.
PR: amd64/139614 (at least)
In cooperation with: attilio, jhb, kib, mdf
Discussed with: arch@, bde
Tested by: Eugene Grosbein <eugen@grosbein.net>,
gnn,
Steven Hartland <killing@multiplay.co.uk>,
glebius,
Andrew Boyer <aboyer@averesystems.com>
(various versions of the patch)
MFC after: 3 months (or never)
This enables locking consumers to pass their own structures around as const and
be able to assert locks embedded into those structures.
Reviewed by: ed, kib, jhb
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).
Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.
The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN
while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.
Some technical notes:
- This commit may be considered an ABI nop for all the architectures
different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
accessed avoiding migration, because the size of cpuset_t should be
considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
primirally done in order to leave some more space in userland to cope
with KBI extensions). If you need to access kernel cpuset_t from the
userland please refer to example in this patch on how to do that
correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now
The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.
Tested by: pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by: jeff, jhb, sbruno
- add rm_try_rlock().
- add RM_SLEEPABLE to use sx(9) as the back-end lock in order to sleep while
holding the write lock.
- change rm_noreadtoken to a cpu bitmask to indicate which CPUs need to go
through the lock/unlock in order to synchronize. As a side effect, this
also avoids IPI to CPUs without any readers during rm_wlock.
Discussed with: ups@, rwatson@ on arch@
Sponsored by: Isilon Systems, Inc.
- Add rm_init_flags() and accept extended options only for that variation.
- Add a flags space specifically for rm_init_flags(), rather than borrowing
the lock_init() flag space.
- Define flag RM_RECURSE to use instead of LO_RECURSABLE.
- Define flag RM_NOWITNESS to allow an rmlock to be exempt from WITNESS
checking; this wasn't possible previously as rm_init() always passed
LO_WITNESS when initializing an rmlock's struct lock.
- Add RM_SYSINIT_FLAGS().
- Rename embedded mutex in rmlocks to make it more obvious what it is.
- Update consumers.
- Update man page.
adds probes for mutexes, reader/writer and shared/exclusive locks to
gather contention statistics and other locking information for
dtrace scripts, the lockstat(1M) command and other potential
consumers.
Reviewed by: attilio jhb jb
Approved by: gnn (mentor)