61 Commits

Author SHA1 Message Date
Mateusz Guzik
9488262679 rms: add rms_assert_rlock_ok
So that callers which opportunistically elide the lock can still
assert that they can take it.

Reviewed by:
Differential Revision:
2022-08-23 19:15:48 +00:00
Mark Johnston
afb44cb010 rmlock: Temporarily revert commit c84bb8cd77
It appears to have introduced a regression on arm64, possibly due to the
fact that the pcpu pointer is reloaded outside of the critical section
in _rm_rlock().  Until this is resolved one way or another, let's
revert.

Reported by:	Ronald Klop <ronald-lists@klop.ws>
Sponsored by:	The FreeBSD Foundation
2022-03-07 10:43:19 -05:00
Mark Johnston
89ae8eb74e rmlock: Add required compiler barriers to _rm_runlock()
Also remove excessive whitespace in _rm_rlock().

Reviewed by:	jah, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34381
2022-03-01 09:38:45 -05:00
Mark Johnston
c84bb8cd77 rmlock: Micro-optimize read locking
Use get_pcpu() instead of an open-coded pcpu_find(td->td_oncpu).  This
eliminates some memory accesses and results in a shorter instruction
sequence.  Note that get_pcpu() didn't exist when rmlocks were added.

Reviewed by:	jah, mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34377
2022-02-25 13:55:24 -05:00
Stefan Eßer
a19bd8e30e Restore variable aliasing in the context of cpu set operations
A simplification of set operations removed side-effects of the
previous code, which are restored by this commit.
2022-01-01 11:58:40 +01:00
Stefan Eßer
e2650af157 Make CPU_SET macros compliant with other implementations
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.

Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.

The difference between the FreeBSD versions of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-address
scheme (2 source operands and a separately passed destination).

The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.
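
The two calling conventions described above can be contrasted in a small userland sketch. The `toy_set_t` type and the `TOY_AND2`/`TOY_AND3` macros are hypothetical stand-ins invented for illustration; they are not the actual <sched.h> or <sys/cpuset.h> definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit set type; a real cpuset_t is larger. */
typedef struct { uint64_t bits; } toy_set_t;

/* 2-address form: d = d & s (one source doubles as the destination). */
#define TOY_AND2(d, s)       ((d)->bits &= (s)->bits)
/* 3-address form: d = s1 & s2 (destination passed separately). */
#define TOY_AND3(d, s1, s2)  ((d)->bits = (s1)->bits & (s2)->bits)

/* With the 2-address form, keeping both sources needs a temporary copy. */
uint64_t
toy_and_preserving_2addr(const toy_set_t *a, const toy_set_t *b)
{
	toy_set_t tmp = *a;

	TOY_AND2(&tmp, b);
	return (tmp.bits);
}

/* With the 3-address form, the sources are left alone by construction. */
uint64_t
toy_and_preserving_3addr(const toy_set_t *a, const toy_set_t *b)
{
	toy_set_t dst;

	TOY_AND3(&dst, a, b);
	return (dst.bits);
}

/* The 3-address form still covers the in-place case: pass a as dst too. */
void
toy_and_in_place(toy_set_t *a, const toy_set_t *b)
{
	TOY_AND3(a, a, b);
}

/* Small self-check exercising all three helpers. */
void
toy_and_demo(void)
{
	toy_set_t a = { 0xF0 }, b = { 0x3C };

	assert(toy_and_preserving_2addr(&a, &b) == 0x30);
	assert(toy_and_preserving_3addr(&a, &b) == 0x30);
	toy_and_in_place(&a, &b);
	assert(a.bits == 0x30);
}
```

The last helper illustrates why the 3-address form is a superset: aliasing the destination with a source reproduces the 2-address behavior.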

This patch set makes it possible to unconditionally provide the functions
and macros expected by 3rd party software written for GLIBC based systems,
but breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.

One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.

Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT no
longer require that option.

The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.

Reviewed by:	kib
MFC after:	2 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D33451
2021-12-30 12:20:32 +01:00
Mark Johnston
71f31d784e rmslock: Update td_locks during lock and unlock operations
Reviewed by:	mjg
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32692
2021-10-27 11:18:13 -04:00
Mitchell Horne
2816bd8442 rmlock(9): add an RM_DUPOK flag
Allows duplicate locks to be acquired without witness complaining.
Similar flags already exist for rwlock(9) and sx(9).

Reviewed by:	markj
MFC after:	3 days
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
NetApp PR:	52
Differential Revision:	https://reviews.freebsd.org/D29683n
2021-04-12 11:42:21 -03:00
Konstantin Belousov
b5449c92b4 Use atomic_interrupt_fence() instead of bare __compiler_membar()
for the places which definitely use membar to sync with interrupt handlers.

libc and rtld uses of __compiler_membar() seem to want compiler barriers
proper.

The barrier in sched_unpin_lite() after the td_pinned decrement seems to be
unneeded and is removed instead of converted.

Reviewed by:	markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28956
2021-02-28 01:27:29 +02:00
Mark Johnston
1d44514fcd rmlock: Add a required compiler membar to the rlock slow path
The tracker flags need to be loaded only after the tracker is removed
from its per-CPU queue.  Otherwise, readers may fail to synchronize with
pending writers attempting to propagate priority to active readers, and
readers and writers deadlock on each other.  This was observed in a
stable/12-based armv7 kernel where the compiler had reordered the load
of rmp_flags to before the stores updating the queue.
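
The ordering requirement described above (update the queue first, load the flags afterwards) can be sketched in userland C. This is only an illustration under stated assumptions: `toy_tracker`, `toy_barrier()` and `toy_unlink_then_read_flags()` are hypothetical names, not the kernel's code, and the empty asm statement with a "memory" clobber is the usual GCC/Clang idiom for a compiler-only barrier.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier (GCC/Clang): no instruction is emitted, but the
 * compiler may not move memory accesses across this point. */
#define toy_barrier()	__asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for a tracker on a circular per-CPU queue. */
struct toy_tracker {
	struct toy_tracker *next;
	struct toy_tracker *prev;
	int flags;
};

/*
 * Unlink the tracker from its queue, then load its flags.  Without the
 * barrier the compiler would be free to hoist the load of t->flags above
 * the two stores that update the queue.
 */
int
toy_unlink_then_read_flags(struct toy_tracker *t)
{
	t->next->prev = t->prev;
	t->prev->next = t->next;
	toy_barrier();
	return (t->flags);
}

/* Small self-check: one tracker on a queue with a sentinel head. */
void
toy_barrier_demo(void)
{
	struct toy_tracker head = { &head, &head, 0 };
	struct toy_tracker t = { &head, &head, 7 };

	head.next = &t;
	head.prev = &t;
	assert(toy_unlink_then_read_flags(&t) == 7);
	assert(head.next == &head && head.prev == &head);
}
```

A barrier of this kind constrains only the compiler. It does not order accesses between different CPUs, so it is suited to code that synchronizes with something running on the same CPU.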

Reviewed by:	rlibby, scottl
Discussed with:	kib
Sponsored by:	Rubicon Communications, LLC ("Netgate")
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28821
2021-02-23 21:17:12 -05:00
Mateusz Guzik
42e7abd5db rms: several cleanups + debug read lockers handling
This adds a dedicated counter updated with atomics when INVARIANTS
is used. As a side effect one can reliably determine the lock is held
for reading by at least one thread, but it's still not possible to
find out whether curthread has the lock in said mode.

This should be good enough in practice.

Problem spotted by avg.
2020-11-07 16:57:53 +00:00
Mateusz Guzik
2dee296a3d Rationalize per-cpu zones.
The 2 provided zones had inconsistent naming between each other
("int" and "64") and other allocator zones (which use bytes).

Follow malloc by naming them "pcpu-" + size in bytes.

This is a step towards replacing ad-hoc per-cpu zones with
general slabs.
2020-11-05 15:08:56 +00:00
Mateusz Guzik
6fc2b069ca rms: fixup concurrent writer handling and add more features
Previously the code had one wait channel for all pending writers.
This could result in a buggy scenario where, after a writer switches
the lock mode from readers to writers and goes off CPU, another writer
queues itself and then the last reader wakes up the latter instead
of the former.

Use a separate channel.

While here add features to reliably detect whether curthread has
the lock write-owned. This will be used by ZFS.
2020-11-04 21:18:08 +00:00
Mateusz Guzik
8541ae04b4 rms: fix typo: bitmamp -> bitmap
Reported by:	kib
2020-08-04 20:31:03 +00:00
Mateusz Guzik
3211e783e3 rms: add a comment explaining performance deficiencies of write locking
2020-08-04 19:52:16 +00:00
Mateusz Guzik
00ac9d2632 rms: use smp_rendezvous_cpus_retry instead of a hand-rolled variant
2020-02-12 11:17:18 +00:00
Mateusz Guzik
ea77ce6ef9 rms: use newly added zpcpu routines instead of direct access where appropriate
2020-02-07 22:44:41 +00:00
Mateusz Guzik
1a78ac2416 Add rms_try_rlock and rms_wowned.
2020-01-31 08:36:49 +00:00
Mateusz Guzik
cedad2916e Remove an overzealous assert from rms_runlock.
2020-01-31 08:36:23 +00:00
Mateusz Guzik
3983dc32d7 Plug a warning in read-mostly spinlocks reported by gcc.
2019-12-27 13:37:19 +00:00
Mateusz Guzik
1f162fef76 Add read-mostly sleepable locks
To be used like rmlocks when readers need to be allowed to sleep.
See the manpage for more information.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D22823
2019-12-27 11:19:57 +00:00
Ryan Libby
9825eadf2c bitset: rename confusing macro NAND to ANDNOT
s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too.  The actual
implementation is "and not" (or "but not"), i.e. A but not B.
Fortunately this does appear to be what all existing callers want.

Don't supply a NAND (not (A and B)) operation at this time.
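
The distinction between the two operations is easy to see on a single word. The helpers below are hypothetical illustrations written for this note, not the BIT_* macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* "and not": the bits of a that are not in b (A but not B). */
uint32_t
toy_andnot(uint32_t a, uint32_t b)
{
	return (a & ~b);
}

/* True NAND: the complement of (a and b). */
uint32_t
toy_nand(uint32_t a, uint32_t b)
{
	return (~(a & b));
}

/* Small self-check showing that the two results differ. */
void
toy_andnot_demo(void)
{
	/* 0xC = 1100b, 0xA = 1010b */
	assert(toy_andnot(0xC, 0xA) == 0x4);        /* 1100b & 0101b */
	assert(toy_nand(0xC, 0xA) == 0xFFFFFFF7u);  /* ~1000b */
}
```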

Discussed with:	jeff
Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D22791
2019-12-13 09:32:16 +00:00
Ryan Libby
59fb4a95c7 witness: sleepable rm locks are not sleepable in read mode
There are two classes of rm lock, one "sleepable" and one not.  But even
a "sleepable" rm lock is only sleepable in write mode, and is
non-sleepable when taken in read mode.

Warn about sleepable rm locks in read mode as non-sleepable locks.  Do
this by defining a new lock operation flag, LOP_NOSLEEP, to indicate
that a lock is non-sleepable despite what the LO_SLEEPABLE flag would
indicate, and defining a new witness lock instance flag, LI_SLEEPABLE,
to track the product of LO_SLEEPABLE and LOP_NOSLEEP on the lock
instance.
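
One reading of the flag combination described above is that a lock instance counts as sleepable only when the lock object is sleepable and the operation did not request LOP_NOSLEEP. The sketch below encodes that reading. The `TOY_` constants use made-up values and `toy_instance_flags()` is a hypothetical helper, not the witness code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this illustration. */
#define TOY_LO_SLEEPABLE	0x01u	/* lock object may be held across sleep */
#define TOY_LOP_NOSLEEP		0x02u	/* this acquisition is non-sleepable */
#define TOY_LI_SLEEPABLE	0x04u	/* resulting per-instance flag */

/* Derive the instance flag from the object flags and operation flags. */
unsigned
toy_instance_flags(unsigned lo_flags, unsigned lop_flags)
{
	bool sleepable = (lo_flags & TOY_LO_SLEEPABLE) != 0 &&
	    (lop_flags & TOY_LOP_NOSLEEP) == 0;

	return (sleepable ? TOY_LI_SLEEPABLE : 0);
}

/* Small self-check: write mode vs. read mode of a "sleepable" lock. */
void
toy_witness_demo(void)
{
	/* Write lock on a sleepable lock: instance is sleepable. */
	assert(toy_instance_flags(TOY_LO_SLEEPABLE, 0) == TOY_LI_SLEEPABLE);
	/* Read lock passes NOSLEEP: instance is treated as non-sleepable. */
	assert(toy_instance_flags(TOY_LO_SLEEPABLE, TOY_LOP_NOSLEEP) == 0);
}
```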

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D22527
2019-11-27 01:54:39 +00:00
Eric van Gyzen
d54474e63b Make no assertions about lock state when the scheduler is stopped.
Change the assert paths in rm, rw, and sx locks to match the lock
and unlock paths.  I did this for mutexes in r306346.

Reported by:	Travis Lane <tlane@isilon.com>
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2018-11-13 20:48:05 +00:00
Mateusz Guzik
d0a22279db Remove an unused argument to turnstile_unpend.
PR:	228694
Submitted by:	Julian Pszczołowski <julian.pszczolowski@gmail.com>
2018-06-02 22:37:53 +00:00
Mateusz Guzik
85c1b3c1cb rmlock: partially depessimize lock/unlock fastpath
Previously the slow path was folded in and partially jumped over in the
common case.
2018-05-11 06:59:54 +00:00
Mark Johnston
755230eb9f Clean up the SYSINIT_FLAGS definitions for rwlock(9) and rmlock(9).
Avoid duplication in their macro definitions, and document them. No
functional change intended.

MFC after:	1 week
2017-11-21 14:59:23 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Patrick Kelsey
67d955aab4 Corrected misspelled versions of rendezvous.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().

Reviewed by:	gnn, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10313
2017-04-09 02:00:03 +00:00
Jason A. Harmening
e2a8d17887 Bring back r313037, with fixes for mips:
Implement get_pcpu() for amd64/sparc64/mips/powerpc, and use it to
replace pcpu_find(curcpu) in MI code.

Reviewed by:	andreast, kan, lidl
Tested by:	lidl(mips, sparc64), andreast(powerpc)
Differential Revision:	https://reviews.freebsd.org/D9587
2017-02-19 02:03:09 +00:00
Jason A. Harmening
ad62ba6e96 Revert r313037
The switch to get_pcpu() in MI code seems to cause hangs on MIPS.
Back out until we can get a better idea of what's happening there.

Reported by:	kan, lidl
2017-02-04 06:24:49 +00:00
Jason A. Harmening
65ed483615 Implement get_pcpu() for the remaining architectures and use it to
replace pcpu_find(curcpu) in MI code.
2017-02-01 03:32:49 +00:00
Pedro F. Giffuni
e3043798aa sys/kern: spelling fixes in comments.
No functional change.
2016-04-29 22:15:33 +00:00
John Baldwin
e89d5f43da Threads holding a read lock of a sleepable rm lock are not permitted
to sleep.  The rmlock implementation enforces this by disabling
sleeping when a read lock is acquired. To simplify the implementation,
sleeping is disabled for most of the duration of rm_rlock.  However,
it doesn't need to be disabled until the lock is acquired.  If a
sleepable rm lock is contested, then rm_rlock may need to acquire the
backing sx lock.  This tripped the overly-broad assertion.  Fix by
relaxing the assertion around the call to sx_xlock().

Reported by:	mjg
Reviewed by:	kib, mjg
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D3324
2015-09-15 22:16:21 +00:00
Mark Johnston
ce1c953ee0 Don't modify curthread->td_locks unless INVARIANTS is enabled.
This field is only used in a KASSERT that verifies that no locks are held
when returning to user mode. Moreover, the td_locks accounting is only
correct when LOCK_DEBUG > 0, which is implied by INVARIANTS.

Reviewed by:	jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D3205
2015-08-02 00:03:08 +00:00
Andrey V. Elsukov
41f5f69f96 Build debug version of rmlock's methods only when LOCK_DEBUG > 0.
Currently LOCK_DEBUG is always defined in sys/lock.h (0 or 1).
This means that debugging code is always built. In addition the kernel
modules have always defined LOCK_DEBUG as 1. So, debugging rmlock code
is always used by kernel modules.

MFC after:	1 week
2015-07-26 10:53:32 +00:00
Dmitry Chagin
fd07ddcf6f Add _NEW flag to mtx(9), sx(9), rmlock(9) and rwlock(9).
A _NEW flag passed to _init_flags() avoids the check for double-init.

Differential Revision:	https://reviews.freebsd.org/D1208
Reviewed by:	jhb, wblock
MFC after:	1 Month
2014-12-13 21:00:10 +00:00
Attilio Rao
54366c0bd7 - For kernels compiled only with KDTRACE_HOOKS and not any lock debugging
  option, unbreak the lock tracing release semantic by embedding
  calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined
  version of the releasing functions for mutex, rwlock and sxlock.
  Failing to do so skips the lockstat_probe_func invocation for
  unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
  kernels compiled without lock debugging options, potentially every
  consumer must be compiled including opt_kdtrace.h.
  Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
  dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
  is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested.  As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while.  Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by:	EMC / Isilon storage division
Discussed with:	rstone
[0] Reported by:	rstone
[1] Discussed with:	philip
2013-11-25 07:38:45 +00:00
Davide Italiano
7faf4d90e8 Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) a uintptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().

Suggested by:	jhb
Reviewed by:	jhb
Approved by:	re (delphij)
2013-09-20 23:06:21 +00:00
John Baldwin
c64bc3a076 Fix build with INVARIANT_SUPPORT enabled but not INVARIANTS.
Reported by:	"Matthew D. Fuller" <fullermd@over-yonder.net>
2013-07-08 21:17:20 +00:00
John Baldwin
cd32bd7ad1 Several improvements to rmlock(9). Many of these are based on patches
provided by Isilon.
- Add an rm_assert() supporting various lock assertions similar to other
  locking primitives.  Because rmlocks track readers the assertions are
  always fully accurate unlike rw_assert() and sx_assert().
- Flesh out the lock class methods for rmlocks to support sleeping via
  condvars and rm_sleep() (but only while holding write locks), rmlock
  details in 'show lock' in DDB, and the lc_owner method used by
  dtrace.
- Add an internal destroyed cookie so that API functions can assert
  that an rmlock is not destroyed.
- Make use of rm_assert() to add various assertions to the API (e.g.
  to assert locks are held when an unlock routine is called).
- Give RM_SLEEPABLE locks their own lock class and always use the
  rmlock's own lock_object with WITNESS.
- Use THREAD_NO_SLEEPING() / THREAD_SLEEPING_OK() to disallow sleeping
  while holding a read lock on an rmlock.

Submitted by:	andre
Obtained from:	EMC/Isilon
2013-06-25 18:44:15 +00:00
Attilio Rao
cd2fe4e632 Fixup r240424: On entering KDB backends, the hijacked thread to run
interrupt context can still be idlethread. At that point, without the
panic condition, it can still happen that idlethread then will try to
acquire some locks to carry on some operations.

Skip the idlethread check on block/sleep lock operations when KDB is
active.

Reported by:	jh
Tested by:	jh
MFC after:	1 week
2012-12-22 09:37:34 +00:00
Attilio Rao
3a4730256a Add a unified macro to deny the compiler the ability to reorder
instruction loads/stores at will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.

Reviewed by:	bde, kib
Discussed with:	dim, theraven
MFC after:	2 weeks
2012-10-09 14:32:30 +00:00
Attilio Rao
e3ae0dfe69 Improve check coverage about idle threads.
Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicking at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in sleepqueues is left because they are not allowed to use
tsleep() either which could happen still.

Reviewed by:	bde, jhb, kib
MFC after:	1 week
2012-09-12 22:10:53 +00:00
Andriy Gapon
353705930f panic: add a switch and infrastructure for stopping other CPUs in SMP case
The historical behavior of letting other CPUs merrily go on is the default
for the time being.  The new behavior can be switched on via
kern.stop_scheduler_on_panic tunable and sysctl.

Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
  state

Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set.  That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts.  Those locks might be held by the stopped threads and would never
be released.  To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.

This change has substantial portions written and re-written by attilio
and kib at various times.  Other changes are heavily based on the ideas
and patches submitted by jhb and mdf.  bde has provided many insights
into the details and history of the current code.

The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console.  This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection.  Dumping to USB-connected disks may also be affected.

PR:			amd64/139614 (at least)
In cooperation with:	attilio, jhb, kib, mdf
Discussed with:		arch@, bde
Tested by:		Eugene Grosbein <eugen@grosbein.net>,
			gnn,
			Steven Hartland <killing@multiplay.co.uk>,
			glebius,
			Andrew Boyer <aboyer@averesystems.com>
			(various versions of the patch)
MFC after:		3 months (or never)
2011-12-11 21:02:01 +00:00
Pawel Jakub Dawidek
d576deedb5 Constify arguments for locking KPIs where possible.
This enables locking consumers to pass their own structures around as const and
be able to assert locks embedded into those structures.

Reviewed by:	ed, kib, jhb
2011-11-16 21:51:17 +00:00
Attilio Rao
a38f1f263b Remove pc_cpumask and pc_other_cpus usage from MI code.
Tested by:	pluknet
2011-06-13 13:28:31 +00:00
Attilio Rao
71a19bdc64 Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).

Right now, cpumask_t is an int, 32 bits on all our supported architectures.
cpuset_t, on the other hand, is implemented as an array of longs, and is
easily extensible by definition.
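
The size limit of a single int mask versus an array of longs can be illustrated with a small userland sketch. The `toy_cpuset_t` type, the `TOY_MAXCPU` value and the helper functions are hypothetical and written only for this note; they are not the kernel's cpuset_t implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAXCPU		256	/* assumed limit, for illustration */
#define TOY_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define TOY_NWORDS \
	((TOY_MAXCPU + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

/* Array-of-longs set: raising TOY_MAXCPU only grows the array. */
typedef struct { unsigned long words[TOY_NWORDS]; } toy_cpuset_t;

void
toy_zero(toy_cpuset_t *s)
{
	memset(s, 0, sizeof(*s));
}

void
toy_set(toy_cpuset_t *s, unsigned cpu)
{
	assert(cpu < TOY_MAXCPU);
	s->words[cpu / TOY_BITS_PER_WORD] |= 1UL << (cpu % TOY_BITS_PER_WORD);
}

bool
toy_isset(const toy_cpuset_t *s, unsigned cpu)
{
	assert(cpu < TOY_MAXCPU);
	return ((s->words[cpu / TOY_BITS_PER_WORD] &
	    (1UL << (cpu % TOY_BITS_PER_WORD))) != 0);
}

/* Small self-check: CPU 200 would not fit in a 32-bit int mask. */
void
toy_cpuset_demo(void)
{
	toy_cpuset_t s;

	toy_zero(&s);
	toy_set(&s, 200);
	assert(toy_isset(&s, 200));
	assert(!toy_isset(&s, 0));
}
```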

The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN

while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.

Some technical notes:
- This commit may be considered an ABI nop for all the architectures
  different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, need to be
  accessed avoiding migration, because the size of cpuset_t should be
  considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
  primarily done in order to leave some more space in userland to cope
  with KBI extensions). If you need to access kernel cpuset_t from the
  userland please refer to example in this patch on how to do that
  correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now

The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to come soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.

Tested by:	pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by:	jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00
Olivier Houchard
0c87b5e243 No need to include sys/systm.h twice.
2010-11-16 14:08:21 +00:00
Max Laier
36058c09e4 rmlock(9) two additions and one change/fix:
- add rm_try_rlock().
- add RM_SLEEPABLE to use sx(9) as the back-end lock in order to sleep while
  holding the write lock.
- change rm_noreadtoken to a cpu bitmask to indicate which CPUs need to go
  through the lock/unlock in order to synchronize.  As a side effect, this
  also avoids IPI to CPUs without any readers during rm_wlock.

Discussed with:		ups@, rwatson@ on arch@
Sponsored by:		Isilon Systems, Inc.
2010-09-01 19:50:03 +00:00