2005-01-06 23:35:40 +00:00
|
|
|
/*-
|
2017-11-27 15:20:12 +00:00
|
|
|
* SPDX-License-Identifier: BSD-2-Clause-FreeBSD
|
|
|
|
*
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
* Copyright (c) 2015, 2016 The FreeBSD Foundation
|
2004-12-18 12:52:44 +00:00
|
|
|
* Copyright (c) 2004, David Xu <davidxu@freebsd.org>
|
2003-04-01 01:10:42 +00:00
|
|
|
* Copyright (c) 2002, Jeffrey Roberson <jeff@freebsd.org>
|
|
|
|
* All rights reserved.
|
|
|
|
*
|
2016-02-28 17:52:33 +00:00
|
|
|
* Portions of this software were developed by Konstantin Belousov
|
|
|
|
* under sponsorship from the FreeBSD Foundation.
|
|
|
|
*
|
2003-04-01 01:10:42 +00:00
|
|
|
* Redistribution and use in source and binary forms, with or without
|
|
|
|
* modification, are permitted provided that the following conditions
|
|
|
|
* are met:
|
|
|
|
* 1. Redistributions of source code must retain the above copyright
|
|
|
|
* notice unmodified, this list of conditions, and the following
|
|
|
|
* disclaimer.
|
|
|
|
* 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer in the
|
|
|
|
* documentation and/or other materials provided with the distribution.
|
|
|
|
*
|
|
|
|
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
|
|
|
|
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
|
|
|
|
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
|
|
|
|
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
|
|
|
|
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|
|
|
|
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
|
|
|
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
|
|
|
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
|
|
|
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
|
|
|
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|
|
|
*/
|
|
|
|
|
2003-06-11 00:56:59 +00:00
|
|
|
#include <sys/cdefs.h>
|
|
|
|
__FBSDID("$FreeBSD$");
|
|
|
|
|
2012-03-16 20:32:11 +00:00
|
|
|
#include "opt_umtx_profiling.h"
|
|
|
|
|
2003-04-01 01:10:42 +00:00
|
|
|
#include <sys/param.h>
|
|
|
|
#include <sys/kernel.h>
|
2016-02-28 17:52:33 +00:00
|
|
|
#include <sys/fcntl.h>
|
|
|
|
#include <sys/file.h>
|
|
|
|
#include <sys/filedesc.h>
|
2004-07-02 00:40:07 +00:00
|
|
|
#include <sys/limits.h>
|
2003-04-01 01:10:42 +00:00
|
|
|
#include <sys/lock.h>
|
2003-06-03 05:24:46 +00:00
|
|
|
#include <sys/malloc.h>
|
2016-02-28 17:52:33 +00:00
|
|
|
#include <sys/mman.h>
|
2003-04-01 01:10:42 +00:00
|
|
|
#include <sys/mutex.h>
|
2006-11-06 13:42:10 +00:00
|
|
|
#include <sys/priv.h>
|
2003-04-01 01:10:42 +00:00
|
|
|
#include <sys/proc.h>
|
2016-02-28 17:52:33 +00:00
|
|
|
#include <sys/resource.h>
|
|
|
|
#include <sys/resourcevar.h>
|
|
|
|
#include <sys/rwlock.h>
|
2013-03-09 15:31:19 +00:00
|
|
|
#include <sys/sbuf.h>
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
#include <sys/sched.h>
|
Add a lwpid field into per-cpu structure, the lwpid represents current
running thread's id on each cpu. This allow us to add in-kernel adaptive
spin for user level mutex. While spinning in user space is possible,
without correct thread running state exported from kernel, it hardly
can be implemented efficiently without wasting cpu cycles, however
exporting thread running state unlikely will be implemented soon as
it has to design and stablize interfaces. This implementation is
transparent to user space, it can be disabled dynamically. With this
change, mutex ping-pong program's performance is improved massively on
SMP machine. performance of mysql super-smack select benchmark is increased
about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems
which have bunch of cpus and system-call overhead is low (athlon64, opteron,
and core-2 are known to be fast), the adaptive spin does help performance.
Added sysctls:
kern.threads.umtx_dflt_spins
if the sysctl value is non-zero, a zero umutex.m_spincount will
cause the sysctl value to be used a spin cycle count.
kern.threads.umtx_max_spins
the sysctl sets upper limit of spin cycle count.
Tested on: Athlon64 X2 3800+, Dual Xeon 5130
2006-12-20 04:40:39 +00:00
|
|
|
#include <sys/smp.h>
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
#include <sys/sysctl.h>
|
2003-04-01 01:10:42 +00:00
|
|
|
#include <sys/sysent.h>
|
|
|
|
#include <sys/systm.h>
|
|
|
|
#include <sys/sysproto.h>
|
2010-12-22 05:01:52 +00:00
|
|
|
#include <sys/syscallsubr.h>
|
2016-02-28 17:52:33 +00:00
|
|
|
#include <sys/taskqueue.h>
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
#include <sys/time.h>
|
2004-12-18 12:52:44 +00:00
|
|
|
#include <sys/eventhandler.h>
|
2003-04-01 01:10:42 +00:00
|
|
|
#include <sys/umtx.h>
|
2021-07-29 12:41:29 +03:00
|
|
|
#include <sys/umtxvar.h>
|
2003-04-01 01:10:42 +00:00
|
|
|
|
2016-02-28 17:52:33 +00:00
|
|
|
#include <security/mac/mac_framework.h>
|
|
|
|
|
2004-12-18 12:52:44 +00:00
|
|
|
#include <vm/vm.h>
|
|
|
|
#include <vm/vm_param.h>
|
|
|
|
#include <vm/pmap.h>
|
|
|
|
#include <vm/vm_map.h>
|
|
|
|
#include <vm/vm_object.h>
|
|
|
|
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
#include <machine/atomic.h>
|
Add a lwpid field into per-cpu structure, the lwpid represents current
running thread's id on each cpu. This allow us to add in-kernel adaptive
spin for user level mutex. While spinning in user space is possible,
without correct thread running state exported from kernel, it hardly
can be implemented efficiently without wasting cpu cycles, however
exporting thread running state unlikely will be implemented soon as
it has to design and stablize interfaces. This implementation is
transparent to user space, it can be disabled dynamically. With this
change, mutex ping-pong program's performance is improved massively on
SMP machine. performance of mysql super-smack select benchmark is increased
about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems
which have bunch of cpus and system-call overhead is low (athlon64, opteron,
and core-2 are known to be fast), the adaptive spin does help performance.
Added sysctls:
kern.threads.umtx_dflt_spins
if the sysctl value is non-zero, a zero umutex.m_spincount will
cause the sysctl value to be used a spin cycle count.
kern.threads.umtx_max_spins
the sysctl sets upper limit of spin cycle count.
Tested on: Athlon64 X2 3800+, Dual Xeon 5130
2006-12-20 04:40:39 +00:00
|
|
|
#include <machine/cpu.h>
|
|
|
|
|
2020-11-11 22:35:23 +00:00
|
|
|
#include <compat/freebsd32/freebsd32.h>
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
#ifdef COMPAT_FREEBSD32
|
2006-09-22 00:52:54 +00:00
|
|
|
#include <compat/freebsd32/freebsd32_proto.h>
|
|
|
|
#endif
|
|
|
|
|
2008-06-24 07:32:12 +00:00
|
|
|
#define _UMUTEX_TRY 1
|
|
|
|
#define _UMUTEX_WAIT 2
|
|
|
|
|
2013-03-09 15:31:19 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
|
|
|
#define UPROF_PERC_BIGGER(w, f, sw, sf) \
|
|
|
|
(((w) > (sw)) || ((w) == (sw) && (f) > (sf)))
|
|
|
|
#endif
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
#define UMTXQ_LOCKED_ASSERT(uc) mtx_assert(&(uc)->uc_lock, MA_OWNED)
|
2021-07-29 12:47:39 +03:00
|
|
|
#ifdef INVARIANTS
|
|
|
|
#define UMTXQ_ASSERT_LOCKED_BUSY(key) do { \
|
|
|
|
struct umtxq_chain *uc; \
|
|
|
|
\
|
|
|
|
uc = umtxq_getchain(key); \
|
|
|
|
mtx_assert(&uc->uc_lock, MA_OWNED); \
|
|
|
|
KASSERT(uc->uc_busy != 0, ("umtx chain is not busy")); \
|
|
|
|
} while (0)
|
|
|
|
#else
|
|
|
|
#define UMTXQ_ASSERT_LOCKED_BUSY(key) do {} while (0)
|
|
|
|
#endif
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Don't propagate time-sharing priority, there is a security reason,
|
|
|
|
* a user can simply introduce PI-mutex, let thread A lock the mutex,
|
|
|
|
* and let another thread B block on the mutex, because B is
|
|
|
|
* sleeping, its priority will be boosted, this causes A's priority to
|
|
|
|
* be boosted via priority propagating too and will never be lowered even
|
|
|
|
* if it is using 100%CPU, this is unfair to other processes.
|
|
|
|
*/
|
|
|
|
|
2006-10-26 21:42:22 +00:00
|
|
|
#define UPRI(td) (((td)->td_user_pri >= PRI_MIN_TIMESHARE &&\
|
|
|
|
(td)->td_user_pri <= PRI_MAX_TIMESHARE) ?\
|
|
|
|
PRI_MAX_TIMESHARE : (td)->td_user_pri)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
#define GOLDEN_RATIO_PRIME 2654404609U
|
2016-11-15 01:34:38 +00:00
|
|
|
#ifndef UMTX_CHAINS
|
2010-12-23 03:12:03 +00:00
|
|
|
#define UMTX_CHAINS 512
|
2016-11-15 01:34:38 +00:00
|
|
|
#endif
|
2010-12-23 03:12:03 +00:00
|
|
|
#define UMTX_SHIFTS (__WORD_BIT - 9)
|
2003-06-03 05:24:46 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
#define GET_SHARE(flags) \
|
|
|
|
(((flags) & USYNC_PROCESS_SHARED) == 0 ? THREAD_SHARE : PROCESS_SHARE)
|
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
#define BUSY_SPINS 200
|
|
|
|
|
2020-11-22 05:34:51 +00:00
|
|
|
struct umtx_copyops {
|
|
|
|
int (*copyin_timeout)(const void *uaddr, struct timespec *tsp);
|
|
|
|
int (*copyin_umtx_time)(const void *uaddr, size_t size,
|
|
|
|
struct _umtx_time *tp);
|
|
|
|
int (*copyin_robust_lists)(const void *uaddr, size_t size,
|
|
|
|
struct umtx_robust_lists_params *rbp);
|
|
|
|
int (*copyout_timeout)(void *uaddr, size_t size,
|
|
|
|
struct timespec *tsp);
|
|
|
|
const size_t timespec_sz;
|
|
|
|
const size_t umtx_time_sz;
|
|
|
|
const bool compat32;
|
|
|
|
};
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
_Static_assert(sizeof(struct umutex) == sizeof(struct umutex32), "umutex32");
|
|
|
|
_Static_assert(__offsetof(struct umutex, m_spare[0]) ==
|
|
|
|
__offsetof(struct umutex32, m_spare[0]), "m_spare32");
|
|
|
|
|
|
|
|
int umtx_shm_vnobj_persistent = 0;
|
|
|
|
SYSCTL_INT(_kern_ipc, OID_AUTO, umtx_vnode_persistent, CTLFLAG_RWTUN,
|
|
|
|
&umtx_shm_vnobj_persistent, 0,
|
|
|
|
"False forces destruction of umtx attached to file, on last close");
|
|
|
|
static int umtx_max_rb = 1000;
|
|
|
|
SYSCTL_INT(_kern_ipc, OID_AUTO, umtx_max_robust, CTLFLAG_RWTUN,
|
|
|
|
&umtx_max_rb, 0,
|
2020-03-02 15:30:52 +00:00
|
|
|
"Maximum number of robust mutexes allowed for each thread");
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static uma_zone_t umtx_pi_zone;
|
2008-05-30 02:18:54 +00:00
|
|
|
static struct umtxq_chain umtxq_chains[2][UMTX_CHAINS];
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
static MALLOC_DEFINE(M_UMTX, "umtx", "UMTX queue memory");
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static int umtx_pi_allocated;
|
2003-05-25 18:18:32 +00:00
|
|
|
|
2020-02-26 14:26:36 +00:00
|
|
|
static SYSCTL_NODE(_debug, OID_AUTO, umtx, CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
|
|
|
|
"umtx debug");
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
SYSCTL_INT(_debug_umtx, OID_AUTO, umtx_pi_allocated, CTLFLAG_RD,
|
|
|
|
&umtx_pi_allocated, 0, "Allocated umtx_pi");
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
static int umtx_verbose_rb = 1;
|
|
|
|
SYSCTL_INT(_debug_umtx, OID_AUTO, robust_faults_verbose, CTLFLAG_RWTUN,
|
|
|
|
&umtx_verbose_rb, 0,
|
|
|
|
"");
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2012-03-16 20:32:11 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
|
|
|
static long max_length;
|
|
|
|
SYSCTL_LONG(_debug_umtx, OID_AUTO, max_length, CTLFLAG_RD, &max_length, 0, "max_length");
|
2020-02-26 14:26:36 +00:00
|
|
|
static SYSCTL_NODE(_debug_umtx, OID_AUTO, chains, CTLFLAG_RD | CTLFLAG_MPSAFE, 0,
|
|
|
|
"umtx chain stats");
|
2012-03-16 20:32:11 +00:00
|
|
|
#endif
|
|
|
|
|
2021-07-29 12:41:58 +03:00
|
|
|
static inline void umtx_abs_timeout_init2(struct umtx_abs_timeout *timo,
|
|
|
|
const struct _umtx_time *umtxtime);
|
|
|
|
static int umtx_abs_timeout_gethz(struct umtx_abs_timeout *timo);
|
|
|
|
static inline void umtx_abs_timeout_update(struct umtx_abs_timeout *timo);
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
|
2016-02-28 17:52:33 +00:00
|
|
|
static void umtx_shm_init(void);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static void umtxq_sysinit(void *);
|
|
|
|
static void umtxq_hash(struct umtx_key *key);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
static int do_unlock_pp(struct thread *td, struct umutex *m, uint32_t flags,
|
|
|
|
bool rb);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static void umtx_thread_cleanup(struct thread *td);
|
|
|
|
SYSINIT(umtx, SI_SUB_EVENTHANDLER+1, SI_ORDER_MIDDLE, umtxq_sysinit, NULL);
|
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
#define umtxq_signal(key, nwake) umtxq_signal_queue((key), (nwake), UMTX_SHARED_QUEUE)
|
|
|
|
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
static struct mtx umtx_lock;
|
|
|
|
|
2012-03-16 20:32:11 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
|
|
|
static void
|
2020-01-02 22:07:03 +00:00
|
|
|
umtx_init_profiling(void)
|
2012-03-16 20:32:11 +00:00
|
|
|
{
|
|
|
|
struct sysctl_oid *chain_oid;
|
|
|
|
char chain_name[10];
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < UMTX_CHAINS; ++i) {
|
|
|
|
snprintf(chain_name, sizeof(chain_name), "%d", i);
|
2020-01-02 22:07:03 +00:00
|
|
|
chain_oid = SYSCTL_ADD_NODE(NULL,
|
|
|
|
SYSCTL_STATIC_CHILDREN(_debug_umtx_chains), OID_AUTO,
|
2020-02-26 14:26:36 +00:00
|
|
|
chain_name, CTLFLAG_RD | CTLFLAG_MPSAFE, NULL,
|
|
|
|
"umtx hash stats");
|
2012-03-16 20:32:11 +00:00
|
|
|
SYSCTL_ADD_INT(NULL, SYSCTL_CHILDREN(chain_oid), OID_AUTO,
|
|
|
|
"max_length0", CTLFLAG_RD, &umtxq_chains[0][i].max_length, 0, NULL);
|
|
|
|
SYSCTL_ADD_INT(NULL, SYSCTL_CHILDREN(chain_oid), OID_AUTO,
|
|
|
|
"max_length1", CTLFLAG_RD, &umtxq_chains[1][i].max_length, 0, NULL);
|
|
|
|
}
|
|
|
|
}
|
2013-03-09 15:31:19 +00:00
|
|
|
|
|
|
|
static int
|
|
|
|
sysctl_debug_umtx_chains_peaks(SYSCTL_HANDLER_ARGS)
|
|
|
|
{
|
|
|
|
char buf[512];
|
|
|
|
struct sbuf sb;
|
|
|
|
struct umtxq_chain *uc;
|
|
|
|
u_int fract, i, j, tot, whole;
|
|
|
|
u_int sf0, sf1, sf2, sf3, sf4;
|
|
|
|
u_int si0, si1, si2, si3, si4;
|
|
|
|
u_int sw0, sw1, sw2, sw3, sw4;
|
|
|
|
|
|
|
|
sbuf_new(&sb, buf, sizeof(buf), SBUF_FIXEDLEN);
|
|
|
|
for (i = 0; i < 2; i++) {
|
|
|
|
tot = 0;
|
|
|
|
for (j = 0; j < UMTX_CHAINS; ++j) {
|
|
|
|
uc = &umtxq_chains[i][j];
|
|
|
|
mtx_lock(&uc->uc_lock);
|
|
|
|
tot += uc->max_length;
|
|
|
|
mtx_unlock(&uc->uc_lock);
|
|
|
|
}
|
|
|
|
if (tot == 0)
|
|
|
|
sbuf_printf(&sb, "%u) Empty ", i);
|
|
|
|
else {
|
|
|
|
sf0 = sf1 = sf2 = sf3 = sf4 = 0;
|
|
|
|
si0 = si1 = si2 = si3 = si4 = 0;
|
|
|
|
sw0 = sw1 = sw2 = sw3 = sw4 = 0;
|
|
|
|
for (j = 0; j < UMTX_CHAINS; j++) {
|
|
|
|
uc = &umtxq_chains[i][j];
|
|
|
|
mtx_lock(&uc->uc_lock);
|
|
|
|
whole = uc->max_length * 100;
|
|
|
|
mtx_unlock(&uc->uc_lock);
|
|
|
|
fract = (whole % tot) * 100;
|
|
|
|
if (UPROF_PERC_BIGGER(whole, fract, sw0, sf0)) {
|
|
|
|
sf0 = fract;
|
|
|
|
si0 = j;
|
|
|
|
sw0 = whole;
|
|
|
|
} else if (UPROF_PERC_BIGGER(whole, fract, sw1,
|
|
|
|
sf1)) {
|
|
|
|
sf1 = fract;
|
|
|
|
si1 = j;
|
|
|
|
sw1 = whole;
|
|
|
|
} else if (UPROF_PERC_BIGGER(whole, fract, sw2,
|
|
|
|
sf2)) {
|
|
|
|
sf2 = fract;
|
|
|
|
si2 = j;
|
|
|
|
sw2 = whole;
|
|
|
|
} else if (UPROF_PERC_BIGGER(whole, fract, sw3,
|
|
|
|
sf3)) {
|
|
|
|
sf3 = fract;
|
|
|
|
si3 = j;
|
|
|
|
sw3 = whole;
|
|
|
|
} else if (UPROF_PERC_BIGGER(whole, fract, sw4,
|
|
|
|
sf4)) {
|
|
|
|
sf4 = fract;
|
|
|
|
si4 = j;
|
|
|
|
sw4 = whole;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
sbuf_printf(&sb, "queue %u:\n", i);
|
|
|
|
sbuf_printf(&sb, "1st: %u.%u%% idx: %u\n", sw0 / tot,
|
|
|
|
sf0 / tot, si0);
|
|
|
|
sbuf_printf(&sb, "2nd: %u.%u%% idx: %u\n", sw1 / tot,
|
|
|
|
sf1 / tot, si1);
|
|
|
|
sbuf_printf(&sb, "3rd: %u.%u%% idx: %u\n", sw2 / tot,
|
|
|
|
sf2 / tot, si2);
|
|
|
|
sbuf_printf(&sb, "4th: %u.%u%% idx: %u\n", sw3 / tot,
|
|
|
|
sf3 / tot, si3);
|
|
|
|
sbuf_printf(&sb, "5th: %u.%u%% idx: %u\n", sw4 / tot,
|
|
|
|
sf4 / tot, si4);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
sbuf_trim(&sb);
|
|
|
|
sbuf_finish(&sb);
|
|
|
|
sysctl_handle_string(oidp, sbuf_data(&sb), sbuf_len(&sb), req);
|
|
|
|
sbuf_delete(&sb);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
sysctl_debug_umtx_chains_clear(SYSCTL_HANDLER_ARGS)
|
|
|
|
{
|
|
|
|
struct umtxq_chain *uc;
|
|
|
|
u_int i, j;
|
|
|
|
int clear, error;
|
|
|
|
|
|
|
|
clear = 0;
|
|
|
|
error = sysctl_handle_int(oidp, &clear, 0, req);
|
|
|
|
if (error != 0 || req->newptr == NULL)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
if (clear != 0) {
|
|
|
|
for (i = 0; i < 2; ++i) {
|
|
|
|
for (j = 0; j < UMTX_CHAINS; ++j) {
|
|
|
|
uc = &umtxq_chains[i][j];
|
|
|
|
mtx_lock(&uc->uc_lock);
|
|
|
|
uc->length = 0;
|
2020-01-02 22:07:03 +00:00
|
|
|
uc->max_length = 0;
|
2013-03-09 15:31:19 +00:00
|
|
|
mtx_unlock(&uc->uc_lock);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
SYSCTL_PROC(_debug_umtx_chains, OID_AUTO, clear,
|
|
|
|
CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, 0, 0,
|
2020-02-26 14:26:36 +00:00
|
|
|
sysctl_debug_umtx_chains_clear, "I",
|
|
|
|
"Clear umtx chains statistics");
|
2013-03-09 15:31:19 +00:00
|
|
|
SYSCTL_PROC(_debug_umtx_chains, OID_AUTO, peaks,
|
|
|
|
CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, 0, 0,
|
2020-02-26 14:26:36 +00:00
|
|
|
sysctl_debug_umtx_chains_peaks, "A",
|
|
|
|
"Highest peaks in chains max length");
|
2012-03-16 20:32:11 +00:00
|
|
|
#endif
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static void
|
|
|
|
umtxq_sysinit(void *arg __unused)
|
|
|
|
{
|
2008-05-30 02:18:54 +00:00
|
|
|
int i, j;
|
2004-12-18 12:52:44 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_zone = uma_zcreate("umtx pi", sizeof(struct umtx_pi),
|
|
|
|
NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
|
2008-05-30 02:18:54 +00:00
|
|
|
for (i = 0; i < 2; ++i) {
|
|
|
|
for (j = 0; j < UMTX_CHAINS; ++j) {
|
|
|
|
mtx_init(&umtxq_chains[i][j].uc_lock, "umtxql", NULL,
|
|
|
|
MTX_DEF | MTX_DUPOK);
|
2010-01-10 09:31:57 +00:00
|
|
|
LIST_INIT(&umtxq_chains[i][j].uc_queue[0]);
|
|
|
|
LIST_INIT(&umtxq_chains[i][j].uc_queue[1]);
|
|
|
|
LIST_INIT(&umtxq_chains[i][j].uc_spare_queue);
|
2008-05-30 02:18:54 +00:00
|
|
|
TAILQ_INIT(&umtxq_chains[i][j].uc_pi_list);
|
|
|
|
umtxq_chains[i][j].uc_busy = 0;
|
|
|
|
umtxq_chains[i][j].uc_waiters = 0;
|
2012-04-14 23:53:31 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
2012-03-16 20:32:11 +00:00
|
|
|
umtxq_chains[i][j].length = 0;
|
2020-01-02 22:07:03 +00:00
|
|
|
umtxq_chains[i][j].max_length = 0;
|
2012-04-14 23:53:31 +00:00
|
|
|
#endif
|
2008-05-30 02:18:54 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
2012-04-14 23:53:31 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
2012-03-16 20:32:11 +00:00
|
|
|
umtx_init_profiling();
|
2012-04-14 23:53:31 +00:00
|
|
|
#endif
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_init(&umtx_lock, "umtx lock", NULL, MTX_DEF);
|
2016-02-28 17:52:33 +00:00
|
|
|
umtx_shm_init();
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
|
2005-03-05 09:15:03 +00:00
|
|
|
struct umtx_q *
|
|
|
|
umtxq_alloc(void)
|
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
|
|
|
|
uq = malloc(sizeof(struct umtx_q), M_UMTX, M_WAITOK | M_ZERO);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uq->uq_spare_queue = malloc(sizeof(struct umtxq_queue), M_UMTX,
|
|
|
|
M_WAITOK | M_ZERO);
|
2010-01-10 09:31:57 +00:00
|
|
|
TAILQ_INIT(&uq->uq_spare_queue->head);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
TAILQ_INIT(&uq->uq_pi_contested);
|
|
|
|
uq->uq_inherited_pri = PRI_MAX;
|
|
|
|
return (uq);
|
2005-03-05 09:15:03 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
umtxq_free(struct umtx_q *uq)
|
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2010-01-10 09:31:57 +00:00
|
|
|
MPASS(uq->uq_spare_queue != NULL);
|
|
|
|
free(uq->uq_spare_queue, M_UMTX);
|
2005-03-05 09:15:03 +00:00
|
|
|
free(uq, M_UMTX);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static inline void
|
2004-12-18 12:52:44 +00:00
|
|
|
umtxq_hash(struct umtx_key *key)
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
unsigned n;
|
|
|
|
|
|
|
|
n = (uintptr_t)key->info.both.a + key->info.both.b;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
key->hash = ((n * GOLDEN_RATIO_PRIME) >> UMTX_SHIFTS) % UMTX_CHAINS;
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
}
|
|
|
|
|
2021-07-29 12:42:17 +03:00
|
|
|
struct umtxq_chain *
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_getchain(struct umtx_key *key)
|
2004-12-18 12:52:44 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2010-01-09 06:12:44 +00:00
|
|
|
if (key->type <= TYPE_SEM)
|
2008-05-30 02:18:54 +00:00
|
|
|
return (&umtxq_chains[1][key->hash]);
|
|
|
|
return (&umtxq_chains[0][key->hash]);
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
2008-04-02 04:08:37 +00:00
|
|
|
* Set chain to busy state when following operation
|
|
|
|
* may be blocked (kernel mutex can not be used).
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
*/
|
2021-07-29 12:42:17 +03:00
|
|
|
void
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_busy(struct umtx_key *key)
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtxq_chain *uc;
|
|
|
|
|
|
|
|
uc = umtxq_getchain(key);
|
2008-04-02 04:08:37 +00:00
|
|
|
mtx_assert(&uc->uc_lock, MA_OWNED);
|
|
|
|
if (uc->uc_busy) {
|
2008-04-03 11:49:20 +00:00
|
|
|
#ifdef SMP
|
|
|
|
if (smp_cpus > 1) {
|
|
|
|
int count = BUSY_SPINS;
|
|
|
|
if (count > 0) {
|
|
|
|
umtxq_unlock(key);
|
|
|
|
while (uc->uc_busy && --count > 0)
|
|
|
|
cpu_spinwait();
|
|
|
|
umtxq_lock(key);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
while (uc->uc_busy) {
|
2008-04-02 04:08:37 +00:00
|
|
|
uc->uc_waiters++;
|
|
|
|
msleep(uc, &uc->uc_lock, 0, "umtxqb", 0);
|
|
|
|
uc->uc_waiters--;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
uc->uc_busy = 1;
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
2008-04-02 04:08:37 +00:00
|
|
|
* Unbusy a chain.
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
*/
|
2021-07-29 12:42:17 +03:00
|
|
|
void
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_unbusy(struct umtx_key *key)
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtxq_chain *uc;
|
|
|
|
|
|
|
|
uc = umtxq_getchain(key);
|
2008-04-02 04:08:37 +00:00
|
|
|
mtx_assert(&uc->uc_lock, MA_OWNED);
|
|
|
|
KASSERT(uc->uc_busy != 0, ("not busy"));
|
|
|
|
uc->uc_busy = 0;
|
|
|
|
if (uc->uc_waiters)
|
|
|
|
wakeup_one(uc);
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
}
|
2003-06-03 05:24:46 +00:00
|
|
|
|
2021-07-29 12:46:58 +03:00
|
|
|
void
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(struct umtx_key *key)
|
|
|
|
{
|
|
|
|
|
|
|
|
umtxq_lock(key);
|
|
|
|
umtxq_unbusy(key);
|
|
|
|
umtxq_unlock(key);
|
|
|
|
}
|
|
|
|
|
2010-01-10 09:31:57 +00:00
|
|
|
static struct umtxq_queue *
|
|
|
|
umtxq_queue_lookup(struct umtx_key *key, int q)
|
|
|
|
{
|
|
|
|
struct umtxq_queue *uh;
|
|
|
|
struct umtxq_chain *uc;
|
|
|
|
|
|
|
|
uc = umtxq_getchain(key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
|
|
|
LIST_FOREACH(uh, &uc->uc_queue[q], link) {
|
|
|
|
if (umtx_key_match(&uh->key, key))
|
|
|
|
return (uh);
|
|
|
|
}
|
|
|
|
|
|
|
|
return (NULL);
|
|
|
|
}
|
|
|
|
|
2021-07-29 12:42:17 +03:00
|
|
|
void
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_insert_queue(struct umtx_q *uq, int q)
|
2003-06-03 05:24:46 +00:00
|
|
|
{
|
2010-01-10 09:31:57 +00:00
|
|
|
struct umtxq_queue *uh;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtxq_chain *uc;
|
2004-12-18 12:52:44 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
uc = umtxq_getchain(&uq->uq_key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
2010-01-10 09:31:57 +00:00
|
|
|
KASSERT((uq->uq_flags & UQF_UMTXQ) == 0, ("umtx_q is already on queue"));
|
2010-02-10 05:47:34 +00:00
|
|
|
uh = umtxq_queue_lookup(&uq->uq_key, q);
|
2010-01-10 09:31:57 +00:00
|
|
|
if (uh != NULL) {
|
|
|
|
LIST_INSERT_HEAD(&uc->uc_spare_queue, uq->uq_spare_queue, link);
|
|
|
|
} else {
|
|
|
|
uh = uq->uq_spare_queue;
|
|
|
|
uh->key = uq->uq_key;
|
|
|
|
LIST_INSERT_HEAD(&uc->uc_queue[q], uh, link);
|
2013-03-21 19:58:25 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
|
|
|
uc->length++;
|
|
|
|
if (uc->length > uc->max_length) {
|
|
|
|
uc->max_length = uc->length;
|
|
|
|
if (uc->max_length > max_length)
|
2020-01-02 22:07:03 +00:00
|
|
|
max_length = uc->max_length;
|
2013-03-21 19:58:25 +00:00
|
|
|
}
|
|
|
|
#endif
|
2010-01-10 09:31:57 +00:00
|
|
|
}
|
|
|
|
uq->uq_spare_queue = NULL;
|
|
|
|
|
|
|
|
TAILQ_INSERT_TAIL(&uh->head, uq, uq_link);
|
|
|
|
uh->length++;
|
2006-05-18 08:43:46 +00:00
|
|
|
uq->uq_flags |= UQF_UMTXQ;
|
2010-01-10 09:31:57 +00:00
|
|
|
uq->uq_cur_queue = uh;
|
|
|
|
return;
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
|
|
|
|
2021-07-29 12:42:17 +03:00
|
|
|
void
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_remove_queue(struct umtx_q *uq, int q)
|
2004-12-18 12:52:44 +00:00
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtxq_chain *uc;
|
2010-01-10 09:31:57 +00:00
|
|
|
struct umtxq_queue *uh;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
uc = umtxq_getchain(&uq->uq_key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
2006-05-18 08:43:46 +00:00
|
|
|
if (uq->uq_flags & UQF_UMTXQ) {
|
2010-01-10 09:31:57 +00:00
|
|
|
uh = uq->uq_cur_queue;
|
|
|
|
TAILQ_REMOVE(&uh->head, uq, uq_link);
|
|
|
|
uh->length--;
|
2006-05-18 08:43:46 +00:00
|
|
|
uq->uq_flags &= ~UQF_UMTXQ;
|
2010-01-10 09:31:57 +00:00
|
|
|
if (TAILQ_EMPTY(&uh->head)) {
|
|
|
|
KASSERT(uh->length == 0,
|
|
|
|
("inconsistent umtxq_queue length"));
|
2013-03-21 19:58:25 +00:00
|
|
|
#ifdef UMTX_PROFILING
|
|
|
|
uc->length--;
|
|
|
|
#endif
|
2010-01-10 09:31:57 +00:00
|
|
|
LIST_REMOVE(uh, link);
|
|
|
|
} else {
|
|
|
|
uh = LIST_FIRST(&uc->uc_spare_queue);
|
|
|
|
KASSERT(uh != NULL, ("uc_spare_queue is empty"));
|
|
|
|
LIST_REMOVE(uh, link);
|
|
|
|
}
|
|
|
|
uq->uq_spare_queue = uh;
|
|
|
|
uq->uq_cur_queue = NULL;
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Check if there are multiple waiters
|
|
|
|
*/
|
2021-07-29 12:42:17 +03:00
|
|
|
int
|
2004-12-18 12:52:44 +00:00
|
|
|
umtxq_count(struct umtx_key *key)
|
|
|
|
{
|
2010-01-10 09:31:57 +00:00
|
|
|
struct umtxq_queue *uh;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2018-05-19 05:09:10 +00:00
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(key));
|
2010-01-10 09:31:57 +00:00
|
|
|
uh = umtxq_queue_lookup(key, UMTX_SHARED_QUEUE);
|
|
|
|
if (uh != NULL)
|
|
|
|
return (uh->length);
|
|
|
|
return (0);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check if there are multiple PI waiters and returns first
|
|
|
|
* waiter.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
umtxq_count_pi(struct umtx_key *key, struct umtx_q **first)
|
|
|
|
{
|
2010-01-10 09:31:57 +00:00
|
|
|
struct umtxq_queue *uh;
|
2003-06-03 05:24:46 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
*first = NULL;
|
2018-05-19 05:09:10 +00:00
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(key));
|
2010-01-10 09:31:57 +00:00
|
|
|
uh = umtxq_queue_lookup(key, UMTX_SHARED_QUEUE);
|
|
|
|
if (uh != NULL) {
|
|
|
|
*first = TAILQ_FIRST(&uh->head);
|
|
|
|
return (uh->length);
|
2003-06-03 05:24:46 +00:00
|
|
|
}
|
2010-01-10 09:31:57 +00:00
|
|
|
return (0);
|
2003-06-03 05:24:46 +00:00
|
|
|
}
|
|
|
|
|
2021-07-29 12:42:49 +03:00
|
|
|
/*
|
|
|
|
* Wake up threads waiting on an userland object by a bit mask.
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
umtxq_signal_mask(struct umtx_key *key, int n_wake, u_int bitset)
|
|
|
|
{
|
|
|
|
struct umtxq_queue *uh;
|
|
|
|
struct umtx_q *uq, *uq_temp;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = 0;
|
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(key));
|
|
|
|
uh = umtxq_queue_lookup(key, UMTX_SHARED_QUEUE);
|
|
|
|
if (uh == NULL)
|
|
|
|
return (0);
|
|
|
|
TAILQ_FOREACH_SAFE(uq, &uh->head, uq_link, uq_temp) {
|
|
|
|
if ((uq->uq_bitset & bitset) == 0)
|
|
|
|
continue;
|
|
|
|
umtxq_remove_queue(uq, UMTX_SHARED_QUEUE);
|
|
|
|
wakeup_one(uq);
|
|
|
|
if (++ret >= n_wake)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Wake up threads waiting on an userland object.
|
|
|
|
*/
|
2008-04-02 04:08:37 +00:00
|
|
|
|
2004-12-24 11:30:55 +00:00
|
|
|
static int
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_signal_queue(struct umtx_key *key, int n_wake, int q)
|
2003-06-03 05:24:46 +00:00
|
|
|
{
|
2010-01-10 09:31:57 +00:00
|
|
|
struct umtxq_queue *uh;
|
|
|
|
struct umtx_q *uq;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
int ret;
|
2003-06-03 05:24:46 +00:00
|
|
|
|
2004-12-24 11:30:55 +00:00
|
|
|
ret = 0;
|
2018-05-19 05:09:10 +00:00
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(key));
|
2010-01-10 09:31:57 +00:00
|
|
|
uh = umtxq_queue_lookup(key, q);
|
|
|
|
if (uh != NULL) {
|
|
|
|
while ((uq = TAILQ_FIRST(&uh->head)) != NULL) {
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_remove_queue(uq, q);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
wakeup(uq);
|
2004-12-24 11:30:55 +00:00
|
|
|
if (++ret >= n_wake)
|
2010-01-10 09:31:57 +00:00
|
|
|
return (ret);
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
|
|
|
}
|
2004-12-24 11:30:55 +00:00
|
|
|
return (ret);
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Wake up specified thread.
|
|
|
|
*/
|
|
|
|
static inline void
|
|
|
|
umtxq_signal_thread(struct umtx_q *uq)
|
|
|
|
{
|
|
|
|
|
2018-05-19 05:09:10 +00:00
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(&uq->uq_key));
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_remove(uq);
|
|
|
|
wakeup(uq);
|
|
|
|
}
|
|
|
|
|
2021-07-29 12:43:07 +03:00
|
|
|
/*
|
|
|
|
* Wake up a maximum of n_wake threads that are waiting on an userland
|
|
|
|
* object identified by key. The remaining threads are removed from queue
|
|
|
|
* identified by key and added to the queue identified by key2 (requeued).
|
|
|
|
* The n_requeue specifies an upper limit on the number of threads that
|
|
|
|
* are requeued to the second queue.
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
umtxq_requeue(struct umtx_key *key, int n_wake, struct umtx_key *key2,
|
|
|
|
int n_requeue)
|
|
|
|
{
|
2021-10-21 20:10:42 +03:00
|
|
|
struct umtxq_queue *uh;
|
2021-07-29 12:43:07 +03:00
|
|
|
struct umtx_q *uq, *uq_temp;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = 0;
|
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(key));
|
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(key2));
|
|
|
|
uh = umtxq_queue_lookup(key, UMTX_SHARED_QUEUE);
|
|
|
|
if (uh == NULL)
|
|
|
|
return (0);
|
|
|
|
TAILQ_FOREACH_SAFE(uq, &uh->head, uq_link, uq_temp) {
|
|
|
|
if (++ret <= n_wake) {
|
|
|
|
umtxq_remove(uq);
|
|
|
|
wakeup_one(uq);
|
|
|
|
} else {
|
|
|
|
umtxq_remove(uq);
|
|
|
|
uq->uq_key = *key2;
|
|
|
|
umtxq_insert(uq);
|
|
|
|
if (ret - n_wake == n_requeue)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
2020-01-02 22:07:03 +00:00
|
|
|
static inline int
|
2012-03-30 05:40:26 +00:00
|
|
|
tstohz(const struct timespec *tsp)
|
|
|
|
{
|
|
|
|
struct timeval tv;
|
|
|
|
|
|
|
|
TIMESPEC_TO_TIMEVAL(&tv, tsp);
|
|
|
|
return tvtohz(&tv);
|
|
|
|
}
|
|
|
|
|
2021-07-29 12:41:58 +03:00
|
|
|
void
|
|
|
|
umtx_abs_timeout_init(struct umtx_abs_timeout *timo, int clockid,
|
|
|
|
int absolute, const struct timespec *timeout)
|
2012-03-30 05:40:26 +00:00
|
|
|
{
|
|
|
|
|
|
|
|
timo->clockid = clockid;
|
|
|
|
if (!absolute) {
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
timo->is_abs_real = false;
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_update(timo);
|
Make timespecadd(3) and friends public
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
2018-07-30 15:46:40 +00:00
|
|
|
timespecadd(&timo->cur, timeout, &timo->end);
|
2012-03-30 05:40:26 +00:00
|
|
|
} else {
|
|
|
|
timo->end = *timeout;
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
timo->is_abs_real = clockid == CLOCK_REALTIME ||
|
|
|
|
clockid == CLOCK_REALTIME_FAST ||
|
|
|
|
clockid == CLOCK_REALTIME_PRECISE;
|
|
|
|
/*
|
|
|
|
* If is_abs_real, umtxq_sleep will read the clock
|
|
|
|
* after setting td_rtcgen; otherwise, read it here.
|
|
|
|
*/
|
|
|
|
if (!timo->is_abs_real) {
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_update(timo);
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
}
|
2012-03-30 05:40:26 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(struct umtx_abs_timeout *timo,
|
|
|
|
const struct _umtx_time *umtxtime)
|
2012-03-30 05:40:26 +00:00
|
|
|
{
|
|
|
|
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init(timo, umtxtime->_clockid,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
(umtxtime->_flags & UMTX_ABSTIME) != 0, &umtxtime->_timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
}
|
|
|
|
|
2021-07-29 12:41:58 +03:00
|
|
|
static void
|
|
|
|
umtx_abs_timeout_update(struct umtx_abs_timeout *timo)
|
2012-03-30 05:40:26 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
kern_clock_gettime(curthread, timo->clockid, &timo->cur);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_gethz(struct umtx_abs_timeout *timo)
|
2012-03-30 05:40:26 +00:00
|
|
|
{
|
|
|
|
struct timespec tts;
|
|
|
|
|
2012-08-11 23:48:39 +00:00
|
|
|
if (timespeccmp(&timo->end, &timo->cur, <=))
|
2020-01-02 22:07:03 +00:00
|
|
|
return (-1);
|
Make timespecadd(3) and friends public
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
2018-07-30 15:46:40 +00:00
|
|
|
timespecsub(&timo->end, &timo->cur, &tts);
|
2012-03-30 05:40:26 +00:00
|
|
|
return (tstohz(&tts));
|
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
static uint32_t
|
|
|
|
umtx_unlock_val(uint32_t flags, bool rb)
|
|
|
|
{
|
|
|
|
|
|
|
|
if (rb)
|
|
|
|
return (UMUTEX_RB_OWNERDEAD);
|
|
|
|
else if ((flags & UMUTEX_NONCONSISTENT) != 0)
|
|
|
|
return (UMUTEX_RB_NOTRECOV);
|
|
|
|
else
|
|
|
|
return (UMUTEX_UNOWNED);
|
|
|
|
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Put thread into sleep state, before sleeping, check if
|
|
|
|
* thread was removed from umtx queue.
|
|
|
|
*/
|
2021-07-29 12:42:17 +03:00
|
|
|
int
|
2021-07-29 12:41:58 +03:00
|
|
|
umtxq_sleep(struct umtx_q *uq, const char *wmesg,
|
|
|
|
struct umtx_abs_timeout *abstime)
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtxq_chain *uc;
|
2012-08-11 23:48:39 +00:00
|
|
|
int error, timo;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
if (abstime != NULL && abstime->is_abs_real) {
|
|
|
|
curthread->td_rtcgen = atomic_load_acq_int(&rtc_generation);
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_update(abstime);
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
uc = umtxq_getchain(&uq->uq_key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
2012-03-30 05:40:26 +00:00
|
|
|
for (;;) {
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
if (!(uq->uq_flags & UQF_UMTXQ)) {
|
|
|
|
error = 0;
|
|
|
|
break;
|
|
|
|
}
|
2012-08-11 23:48:39 +00:00
|
|
|
if (abstime != NULL) {
|
2021-07-29 12:41:58 +03:00
|
|
|
timo = umtx_abs_timeout_gethz(abstime);
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
if (timo < 0) {
|
|
|
|
error = ETIMEDOUT;
|
|
|
|
break;
|
|
|
|
}
|
2012-08-11 00:06:56 +00:00
|
|
|
} else
|
2012-08-11 23:48:39 +00:00
|
|
|
timo = 0;
|
|
|
|
error = msleep(uq, &uc->uc_lock, PCATCH | PDROP, wmesg, timo);
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
if (error == EINTR || error == ERESTART) {
|
2012-03-30 05:40:26 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
break;
|
|
|
|
}
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
if (abstime != NULL) {
|
|
|
|
if (abstime->is_abs_real)
|
|
|
|
curthread->td_rtcgen =
|
|
|
|
atomic_load_acq_int(&rtc_generation);
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_update(abstime);
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
}
|
2012-03-30 05:40:26 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
}
|
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
|
|
|
|
|
|
|
curthread->td_rtcgen = 0;
|
2005-01-06 02:08:34 +00:00
|
|
|
return (error);
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Convert userspace address into unique logical address.
|
|
|
|
*/
|
2011-02-23 13:19:14 +00:00
|
|
|
int
|
2015-08-03 21:11:33 +00:00
|
|
|
umtx_key_get(const void *addr, int type, int share, struct umtx_key *key)
|
2004-12-18 12:52:44 +00:00
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct thread *td = curthread;
|
2004-12-18 12:52:44 +00:00
|
|
|
vm_map_t map;
|
|
|
|
vm_map_entry_t entry;
|
|
|
|
vm_pindex_t pindex;
|
|
|
|
vm_prot_t prot;
|
|
|
|
boolean_t wired;
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
key->type = type;
|
|
|
|
if (share == THREAD_SHARE) {
|
|
|
|
key->shared = 0;
|
|
|
|
key->info.private.vs = td->td_proc->p_vmspace;
|
|
|
|
key->info.private.addr = (uintptr_t)addr;
|
2006-10-25 06:28:23 +00:00
|
|
|
} else {
|
|
|
|
MPASS(share == PROCESS_SHARE || share == AUTO_SHARE);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
map = &td->td_proc->p_vmspace->vm_map;
|
|
|
|
if (vm_map_lookup(&map, (vm_offset_t)addr, VM_PROT_WRITE,
|
|
|
|
&entry, &key->info.shared.object, &pindex, &prot,
|
|
|
|
&wired) != KERN_SUCCESS) {
|
2015-10-30 20:47:42 +00:00
|
|
|
return (EFAULT);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
2006-02-04 06:36:39 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if ((share == PROCESS_SHARE) ||
|
|
|
|
(share == AUTO_SHARE &&
|
|
|
|
VM_INHERIT_SHARE == entry->inheritance)) {
|
|
|
|
key->shared = 1;
|
2015-08-04 06:01:13 +00:00
|
|
|
key->info.shared.offset = (vm_offset_t)addr -
|
|
|
|
entry->start + entry->offset;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
vm_object_reference(key->info.shared.object);
|
|
|
|
} else {
|
|
|
|
key->shared = 0;
|
|
|
|
key->info.private.vs = td->td_proc->p_vmspace;
|
|
|
|
key->info.private.addr = (uintptr_t)addr;
|
|
|
|
}
|
|
|
|
vm_map_lookup_done(map, entry);
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
umtxq_hash(key);
|
2004-12-18 12:52:44 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Release key.
|
|
|
|
*/
|
2011-02-23 13:19:14 +00:00
|
|
|
void
|
2004-12-18 12:52:44 +00:00
|
|
|
umtx_key_release(struct umtx_key *key)
|
|
|
|
{
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (key->shared)
|
2004-12-18 12:52:44 +00:00
|
|
|
vm_object_deallocate(key->info.shared.object);
|
1. use per-chain mutex instead of global mutex to reduce
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
2004-11-30 12:02:53 +00:00
|
|
|
}
|
|
|
|
|
2021-07-19 18:40:38 +03:00
|
|
|
#ifdef COMPAT_FREEBSD10
|
|
|
|
/*
|
|
|
|
* Lock a umtx object.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_lock_umtx(struct thread *td, struct umtx *umtx, u_long id,
|
|
|
|
const struct timespec *timeout)
|
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2021-07-19 18:40:38 +03:00
|
|
|
struct umtx_q *uq;
|
|
|
|
u_long owner;
|
|
|
|
u_long old;
|
|
|
|
int error = 0;
|
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init(&timo, CLOCK_REALTIME, 0, timeout);
|
2021-07-19 18:40:38 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Care must be exercised when dealing with umtx structure. It
|
|
|
|
* can fault on any access.
|
|
|
|
*/
|
|
|
|
for (;;) {
|
|
|
|
/*
|
|
|
|
* Try the uncontested case. This should be done in userland.
|
|
|
|
*/
|
|
|
|
owner = casuword(&umtx->u_owner, UMTX_UNOWNED, id);
|
|
|
|
|
|
|
|
/* The acquire succeeded. */
|
|
|
|
if (owner == UMTX_UNOWNED)
|
|
|
|
return (0);
|
|
|
|
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (owner == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
/* If no one owns it but it is contested try to acquire it. */
|
|
|
|
if (owner == UMTX_CONTESTED) {
|
|
|
|
owner = casuword(&umtx->u_owner,
|
|
|
|
UMTX_CONTESTED, id | UMTX_CONTESTED);
|
|
|
|
|
|
|
|
if (owner == UMTX_CONTESTED)
|
|
|
|
return (0);
|
|
|
|
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (owner == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
error = thread_check_susp(td, false);
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
|
|
|
|
/* If this failed the lock has changed, restart. */
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we caught a signal, we have retried and now
|
|
|
|
* exit immediately.
|
|
|
|
*/
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
|
|
|
|
if ((error = umtx_key_get(umtx, TYPE_SIMPLE_LOCK,
|
|
|
|
AUTO_SHARE, &uq->uq_key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set the contested bit so that a release in user space
|
|
|
|
* knows to use the system call for unlock. If this fails
|
|
|
|
* either some one else has acquired the lock or it has been
|
|
|
|
* released.
|
|
|
|
*/
|
|
|
|
old = casuword(&umtx->u_owner, owner, owner | UMTX_CONTESTED);
|
|
|
|
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (old == -1) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We set the contested bit, sleep. Otherwise the lock changed
|
|
|
|
* and we need to retry or we lost a race to the thread
|
|
|
|
* unlocking the umtx.
|
|
|
|
*/
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
if (old == owner)
|
|
|
|
error = umtxq_sleep(uq, "umtx", timeout == NULL ? NULL :
|
|
|
|
&timo);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
|
|
|
|
if (error == 0)
|
|
|
|
error = thread_check_susp(td, false);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (timeout == NULL) {
|
|
|
|
/* Mutex locking is restarted if it is interrupted. */
|
|
|
|
if (error == EINTR)
|
|
|
|
error = ERESTART;
|
|
|
|
} else {
|
|
|
|
/* Timed-locking is not restarted. */
|
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlock a umtx object.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_unlock_umtx(struct thread *td, struct umtx *umtx, u_long id)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
u_long owner;
|
|
|
|
u_long old;
|
|
|
|
int error;
|
|
|
|
int count;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure we own this mtx.
|
|
|
|
*/
|
|
|
|
owner = fuword(__DEVOLATILE(u_long *, &umtx->u_owner));
|
|
|
|
if (owner == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
if ((owner & ~UMTX_CONTESTED) != id)
|
|
|
|
return (EPERM);
|
|
|
|
|
|
|
|
/* This should be done in userland */
|
|
|
|
if ((owner & UMTX_CONTESTED) == 0) {
|
|
|
|
old = casuword(&umtx->u_owner, owner, UMTX_UNOWNED);
|
|
|
|
if (old == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
if (old == owner)
|
|
|
|
return (0);
|
|
|
|
owner = old;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* We should only ever be in here for contested locks */
|
|
|
|
if ((error = umtx_key_get(umtx, TYPE_SIMPLE_LOCK, AUTO_SHARE,
|
|
|
|
&key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
count = umtxq_count(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When unlocking the umtx, it must be marked as unowned if
|
|
|
|
* there is zero or one thread only waiting for it.
|
|
|
|
* Otherwise, it must be marked as contested.
|
|
|
|
*/
|
|
|
|
old = casuword(&umtx->u_owner, owner,
|
|
|
|
count <= 1 ? UMTX_UNOWNED : UMTX_CONTESTED);
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_signal(&key,1);
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
if (old == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
if (old != owner)
|
|
|
|
return (EINVAL);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
#ifdef COMPAT_FREEBSD32
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Lock a umtx object.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_lock_umtx32(struct thread *td, uint32_t *m, uint32_t id,
|
|
|
|
const struct timespec *timeout)
|
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2021-07-19 18:40:38 +03:00
|
|
|
struct umtx_q *uq;
|
|
|
|
uint32_t owner;
|
|
|
|
uint32_t old;
|
|
|
|
int error = 0;
|
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
|
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init(&timo, CLOCK_REALTIME, 0, timeout);
|
2021-07-19 18:40:38 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Care must be exercised when dealing with umtx structure. It
|
|
|
|
* can fault on any access.
|
|
|
|
*/
|
|
|
|
for (;;) {
|
|
|
|
/*
|
|
|
|
* Try the uncontested case. This should be done in userland.
|
|
|
|
*/
|
|
|
|
owner = casuword32(m, UMUTEX_UNOWNED, id);
|
|
|
|
|
|
|
|
/* The acquire succeeded. */
|
|
|
|
if (owner == UMUTEX_UNOWNED)
|
|
|
|
return (0);
|
|
|
|
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (owner == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
/* If no one owns it but it is contested try to acquire it. */
|
|
|
|
if (owner == UMUTEX_CONTESTED) {
|
|
|
|
owner = casuword32(m,
|
|
|
|
UMUTEX_CONTESTED, id | UMUTEX_CONTESTED);
|
|
|
|
if (owner == UMUTEX_CONTESTED)
|
|
|
|
return (0);
|
|
|
|
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (owner == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
error = thread_check_susp(td, false);
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
|
|
|
|
/* If this failed the lock has changed, restart. */
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we caught a signal, we have retried and now
|
|
|
|
* exit immediately.
|
|
|
|
*/
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
if ((error = umtx_key_get(m, TYPE_SIMPLE_LOCK,
|
|
|
|
AUTO_SHARE, &uq->uq_key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set the contested bit so that a release in user space
|
|
|
|
* knows to use the system call for unlock. If this fails
|
|
|
|
* either some one else has acquired the lock or it has been
|
|
|
|
* released.
|
|
|
|
*/
|
|
|
|
old = casuword32(m, owner, owner | UMUTEX_CONTESTED);
|
|
|
|
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (old == -1) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We set the contested bit, sleep. Otherwise the lock changed
|
|
|
|
* and we need to retry or we lost a race to the thread
|
|
|
|
* unlocking the umtx.
|
|
|
|
*/
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
if (old == owner)
|
|
|
|
error = umtxq_sleep(uq, "umtx", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
|
|
|
|
if (error == 0)
|
|
|
|
error = thread_check_susp(td, false);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (timeout == NULL) {
|
|
|
|
/* Mutex locking is restarted if it is interrupted. */
|
|
|
|
if (error == EINTR)
|
|
|
|
error = ERESTART;
|
|
|
|
} else {
|
|
|
|
/* Timed-locking is not restarted. */
|
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlock a umtx object.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_unlock_umtx32(struct thread *td, uint32_t *m, uint32_t id)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
uint32_t owner;
|
|
|
|
uint32_t old;
|
|
|
|
int error;
|
|
|
|
int count;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure we own this mtx.
|
|
|
|
*/
|
|
|
|
owner = fuword32(m);
|
|
|
|
if (owner == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
if ((owner & ~UMUTEX_CONTESTED) != id)
|
|
|
|
return (EPERM);
|
|
|
|
|
|
|
|
/* This should be done in userland */
|
|
|
|
if ((owner & UMUTEX_CONTESTED) == 0) {
|
|
|
|
old = casuword32(m, owner, UMUTEX_UNOWNED);
|
|
|
|
if (old == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
if (old == owner)
|
|
|
|
return (0);
|
|
|
|
owner = old;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* We should only ever be in here for contested locks */
|
|
|
|
if ((error = umtx_key_get(m, TYPE_SIMPLE_LOCK, AUTO_SHARE,
|
|
|
|
&key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
count = umtxq_count(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When unlocking the umtx, it must be marked as unowned if
|
|
|
|
* there is zero or one thread only waiting for it.
|
|
|
|
* Otherwise, it must be marked as contested.
|
|
|
|
*/
|
|
|
|
old = casuword32(m, owner,
|
|
|
|
count <= 1 ? UMUTEX_UNOWNED : UMUTEX_CONTESTED);
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_signal(&key,1);
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
if (old == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
if (old != owner)
|
|
|
|
return (EINVAL);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
#endif /* COMPAT_FREEBSD32 */
|
|
|
|
#endif /* COMPAT_FREEBSD10 */
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Fetch and compare value, sleep on the address if value is not changed.
|
|
|
|
*/
|
2004-12-18 12:52:44 +00:00
|
|
|
static int
|
2006-10-17 02:24:47 +00:00
|
|
|
do_wait(struct thread *td, void *addr, u_long id,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
struct _umtx_time *timeout, int compat32, int is_private)
|
2004-12-18 12:52:44 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2005-03-05 09:15:03 +00:00
|
|
|
struct umtx_q *uq;
|
2006-10-17 02:24:47 +00:00
|
|
|
u_long tmp;
|
2014-10-28 15:30:33 +00:00
|
|
|
uint32_t tmp32;
|
2005-01-14 13:38:15 +00:00
|
|
|
int error = 0;
|
2004-12-18 12:52:44 +00:00
|
|
|
|
2005-03-05 09:15:03 +00:00
|
|
|
uq = td->td_umtxq;
|
2008-04-29 03:48:48 +00:00
|
|
|
if ((error = umtx_key_get(addr, TYPE_SIMPLE_WAIT,
|
|
|
|
is_private ? THREAD_SHARE : AUTO_SHARE, &uq->uq_key)) != 0)
|
2004-12-18 12:52:44 +00:00
|
|
|
return (error);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (compat32 == 0) {
|
|
|
|
error = fueword(addr, &tmp);
|
|
|
|
if (error != 0)
|
|
|
|
error = EFAULT;
|
|
|
|
} else {
|
|
|
|
error = fueword32(addr, &tmp32);
|
|
|
|
if (error == 0)
|
|
|
|
tmp = tmp32;
|
|
|
|
else
|
|
|
|
error = EFAULT;
|
|
|
|
}
|
2012-03-29 02:46:43 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == 0) {
|
|
|
|
if (tmp == id)
|
|
|
|
error = umtxq_sleep(uq, "uwait", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
|
|
|
if ((uq->uq_flags & UQF_UMTXQ) == 0)
|
|
|
|
error = 0;
|
|
|
|
else
|
|
|
|
umtxq_remove(uq);
|
|
|
|
} else if ((uq->uq_flags & UQF_UMTXQ) != 0) {
|
2012-03-29 02:46:43 +00:00
|
|
|
umtxq_remove(uq);
|
2014-10-28 15:30:33 +00:00
|
|
|
}
|
2012-03-29 02:46:43 +00:00
|
|
|
umtxq_unlock(&uq->uq_key);
|
2005-03-05 09:15:03 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
2004-12-24 11:30:55 +00:00
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
2004-12-18 12:52:44 +00:00
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Wake up threads sleeping on the specified address.
|
|
|
|
*/
|
2005-10-26 06:55:46 +00:00
|
|
|
int
|
2008-04-29 03:48:48 +00:00
|
|
|
kern_umtx_wake(struct thread *td, void *uaddr, int n_wake, int is_private)
|
2004-12-18 12:52:44 +00:00
|
|
|
{
|
|
|
|
struct umtx_key key;
|
2004-12-24 11:30:55 +00:00
|
|
|
int ret;
|
2020-01-02 22:07:03 +00:00
|
|
|
|
2008-04-29 03:48:48 +00:00
|
|
|
if ((ret = umtx_key_get(uaddr, TYPE_SIMPLE_WAIT,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
is_private ? THREAD_SHARE : AUTO_SHARE, &key)) != 0)
|
2004-12-24 11:30:55 +00:00
|
|
|
return (ret);
|
2004-12-24 11:59:20 +00:00
|
|
|
umtxq_lock(&key);
|
2015-03-28 21:21:40 +00:00
|
|
|
umtxq_signal(&key, n_wake);
|
2004-12-24 11:59:20 +00:00
|
|
|
umtxq_unlock(&key);
|
2004-12-24 11:30:55 +00:00
|
|
|
umtx_key_release(&key);
|
2004-12-18 12:52:44 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Lock PTHREAD_PRIO_NONE protocol POSIX mutex.
|
|
|
|
*/
|
|
|
|
static int
|
2012-03-30 05:40:26 +00:00
|
|
|
do_lock_normal(struct thread *td, struct umutex *m, uint32_t flags,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
struct _umtx_time *timeout, int mode)
|
2006-08-25 06:12:53 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
uint32_t owner, old, id;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error, rv;
|
2006-08-25 06:12:53 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
id = td->td_tid;
|
|
|
|
uq = td->td_umtxq;
|
2014-10-28 15:30:33 +00:00
|
|
|
error = 0;
|
2012-03-30 05:40:26 +00:00
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Care must be exercised when dealing with umtx structure. It
|
|
|
|
* can fault on any access.
|
|
|
|
*/
|
|
|
|
for (;;) {
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&m->m_owner, &owner);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
2008-06-24 07:32:12 +00:00
|
|
|
if (mode == _UMUTEX_WAIT) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (owner == UMUTEX_UNOWNED ||
|
|
|
|
owner == UMUTEX_CONTESTED ||
|
|
|
|
owner == UMUTEX_RB_OWNERDEAD ||
|
|
|
|
owner == UMUTEX_RB_NOTRECOV)
|
2008-06-24 07:32:12 +00:00
|
|
|
return (0);
|
|
|
|
} else {
|
|
|
|
/*
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
* Robust mutex terminated. Kernel duty is to
|
|
|
|
* return EOWNERDEAD to the userspace. The
|
|
|
|
* umutex.m_flags UMUTEX_NONCONSISTENT is set
|
|
|
|
* by the common userspace code.
|
|
|
|
*/
|
|
|
|
if (owner == UMUTEX_RB_OWNERDEAD) {
|
|
|
|
rv = casueword32(&m->m_owner,
|
|
|
|
UMUTEX_RB_OWNERDEAD, &owner,
|
|
|
|
id | UMUTEX_CONTESTED);
|
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_RB_OWNERDEAD);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (EOWNERDEAD); /* success */
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
|
|
|
MPASS(rv == 1);
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, false);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (rv != 0)
|
|
|
|
return (rv);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
if (owner == UMUTEX_RB_NOTRECOV)
|
|
|
|
return (ENOTRECOVERABLE);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Try the uncontested case. This should be
|
|
|
|
* done in userland.
|
2008-06-24 07:32:12 +00:00
|
|
|
*/
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&m->m_owner, UMUTEX_UNOWNED,
|
|
|
|
&owner, id);
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2008-06-24 07:32:12 +00:00
|
|
|
/* The acquire succeeded. */
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_UNOWNED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (0);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
/*
|
|
|
|
* If no one owns it but it is contested try
|
|
|
|
* to acquire it.
|
|
|
|
*/
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
MPASS(rv == 1);
|
2008-06-24 07:32:12 +00:00
|
|
|
if (owner == UMUTEX_CONTESTED) {
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&m->m_owner,
|
|
|
|
UMUTEX_CONTESTED, &owner,
|
|
|
|
id | UMUTEX_CONTESTED);
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_CONTESTED);
|
2008-06-24 07:32:12 +00:00
|
|
|
return (0);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
|
|
|
if (rv == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv != 0)
|
|
|
|
return (rv);
|
|
|
|
}
|
2013-06-13 09:33:22 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
/*
|
|
|
|
* If this failed the lock has
|
|
|
|
* changed, restart.
|
|
|
|
*/
|
2008-06-24 07:32:12 +00:00
|
|
|
continue;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
|
|
|
|
/* rv == 1 but not contested, likely store failure */
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv != 0)
|
|
|
|
return (rv);
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2008-06-24 07:32:12 +00:00
|
|
|
if (mode == _UMUTEX_TRY)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EBUSY);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we caught a signal, we have retried and now
|
|
|
|
* exit immediately.
|
|
|
|
*/
|
2012-03-30 05:49:32 +00:00
|
|
|
if (error != 0)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (error);
|
|
|
|
|
|
|
|
if ((error = umtx_key_get(m, TYPE_NORMAL_UMUTEX,
|
|
|
|
GET_SHARE(flags), &uq->uq_key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set the contested bit so that a release in user space
|
|
|
|
* knows to use the system call for unlock. If this fails
|
|
|
|
* either some one else has acquired the lock or it has been
|
|
|
|
* released.
|
|
|
|
*/
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&m->m_owner, owner, &old,
|
|
|
|
owner | UMUTEX_CONTESTED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
/* The address was invalid or casueword failed to store. */
|
|
|
|
if (rv == -1 || rv == 1) {
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
2008-06-24 07:32:12 +00:00
|
|
|
umtxq_unbusy(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
|
|
|
if (rv == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv != 0)
|
|
|
|
return (rv);
|
|
|
|
}
|
|
|
|
continue;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We set the contested bit, sleep. Otherwise the lock changed
|
|
|
|
* and we need to retry or we lost a race to the thread
|
|
|
|
* unlocking the umtx.
|
|
|
|
*/
|
|
|
|
umtxq_lock(&uq->uq_key);
|
2008-06-24 07:32:12 +00:00
|
|
|
umtxq_unbusy(&uq->uq_key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
MPASS(old == owner);
|
|
|
|
error = umtxq_sleep(uq, "umtxn", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
2013-06-13 09:33:22 +00:00
|
|
|
|
|
|
|
if (error == 0)
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlock PTHREAD_PRIO_NONE protocol POSIX mutex.
|
|
|
|
*/
|
|
|
|
static int
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
do_unlock_normal(struct thread *td, struct umutex *m, uint32_t flags, bool rb)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct umtx_key key;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uint32_t owner, old, id, newlock;
|
|
|
|
int error, count;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
id = td->td_tid;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
|
|
|
|
again:
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Make sure we own this mtx.
|
|
|
|
*/
|
2014-10-31 17:43:21 +00:00
|
|
|
error = fueword32(&m->m_owner, &owner);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
if ((owner & ~UMUTEX_CONTESTED) != id)
|
|
|
|
return (EPERM);
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
newlock = umtx_unlock_val(flags, rb);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if ((owner & UMUTEX_CONTESTED) == 0) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = casueword32(&m->m_owner, owner, &old, newlock);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
goto again;
|
|
|
|
}
|
|
|
|
MPASS(old == owner);
|
|
|
|
return (0);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* We should only ever be in here for contested locks */
|
|
|
|
if ((error = umtx_key_get(m, TYPE_NORMAL_UMUTEX, GET_SHARE(flags),
|
|
|
|
&key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
count = umtxq_count(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When unlocking the umtx, it must be marked as unowned if
|
|
|
|
* there is zero or one thread only waiting for it.
|
|
|
|
* Otherwise, it must be marked as contested.
|
|
|
|
*/
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (count > 1)
|
|
|
|
newlock |= UMUTEX_CONTESTED;
|
|
|
|
error = casueword32(&m->m_owner, owner, &old, newlock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_lock(&key);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
umtxq_signal(&key, 1);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 1) {
|
|
|
|
if (old != owner)
|
|
|
|
return (EINVAL);
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
goto again;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
2008-06-24 07:32:12 +00:00
|
|
|
/*
|
|
|
|
* Check if the mutex is available and wake up a waiter,
|
|
|
|
* only for simple mutex.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_wake_umutex(struct thread *td, struct umutex *m)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
uint32_t owner;
|
|
|
|
uint32_t flags;
|
|
|
|
int error;
|
|
|
|
int count;
|
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
again:
|
2014-10-31 17:43:21 +00:00
|
|
|
error = fueword32(&m->m_owner, &owner);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
2008-06-24 07:32:12 +00:00
|
|
|
return (EFAULT);
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((owner & ~UMUTEX_CONTESTED) != 0 && owner != UMUTEX_RB_OWNERDEAD &&
|
|
|
|
owner != UMUTEX_RB_NOTRECOV)
|
2008-06-24 07:32:12 +00:00
|
|
|
return (0);
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&m->m_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2008-06-24 07:32:12 +00:00
|
|
|
|
|
|
|
/* We should only ever be in here for contested locks */
|
|
|
|
if ((error = umtx_key_get(m, TYPE_NORMAL_UMUTEX, GET_SHARE(flags),
|
|
|
|
&key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
count = umtxq_count(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (count <= 1 && owner != UMUTEX_RB_OWNERDEAD &&
|
|
|
|
owner != UMUTEX_RB_NOTRECOV) {
|
2014-10-28 15:30:33 +00:00
|
|
|
error = casueword32(&m->m_owner, UMUTEX_CONTESTED, &owner,
|
|
|
|
UMUTEX_UNOWNED);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == -1) {
|
2014-10-28 15:30:33 +00:00
|
|
|
error = EFAULT;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
} else if (error == 1) {
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
goto again;
|
|
|
|
}
|
2014-10-28 15:30:33 +00:00
|
|
|
}
|
2008-06-24 07:32:12 +00:00
|
|
|
|
|
|
|
umtxq_lock(&key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 0 && count != 0) {
|
|
|
|
MPASS((owner & ~UMUTEX_CONTESTED) == 0 ||
|
|
|
|
owner == UMUTEX_RB_OWNERDEAD ||
|
|
|
|
owner == UMUTEX_RB_NOTRECOV);
|
2008-06-24 07:32:12 +00:00
|
|
|
umtxq_signal(&key, 1);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
2008-06-24 07:32:12 +00:00
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
2014-10-28 15:30:33 +00:00
|
|
|
return (error);
|
2008-06-24 07:32:12 +00:00
|
|
|
}
|
|
|
|
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
/*
|
|
|
|
* Check if the mutex has waiters and tries to fix contention bit.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_wake2_umutex(struct thread *td, struct umutex *m, uint32_t flags)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
uint32_t owner, old;
|
|
|
|
int type;
|
|
|
|
int error;
|
|
|
|
int count;
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
switch (flags & (UMUTEX_PRIO_INHERIT | UMUTEX_PRIO_PROTECT |
|
|
|
|
UMUTEX_ROBUST)) {
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
case 0:
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
case UMUTEX_ROBUST:
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
type = TYPE_NORMAL_UMUTEX;
|
|
|
|
break;
|
|
|
|
case UMUTEX_PRIO_INHERIT:
|
|
|
|
type = TYPE_PI_UMUTEX;
|
|
|
|
break;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
case (UMUTEX_PRIO_INHERIT | UMUTEX_ROBUST):
|
|
|
|
type = TYPE_PI_ROBUST_UMUTEX;
|
|
|
|
break;
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
case UMUTEX_PRIO_PROTECT:
|
|
|
|
type = TYPE_PP_UMUTEX;
|
|
|
|
break;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
case (UMUTEX_PRIO_PROTECT | UMUTEX_ROBUST):
|
|
|
|
type = TYPE_PP_ROBUST_UMUTEX;
|
|
|
|
break;
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
default:
|
|
|
|
return (EINVAL);
|
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((error = umtx_key_get(m, type, GET_SHARE(flags), &key)) != 0)
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
return (error);
|
|
|
|
|
|
|
|
owner = 0;
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
count = umtxq_count(&key);
|
|
|
|
umtxq_unlock(&key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
|
|
|
|
error = fueword32(&m->m_owner, &owner);
|
|
|
|
if (error == -1)
|
|
|
|
error = EFAULT;
|
|
|
|
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
/*
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
* Only repair contention bit if there is a waiter, this means
|
|
|
|
* the mutex is still being referenced by userland code,
|
|
|
|
* otherwise don't update any memory.
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
*/
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
while (error == 0 && (owner & UMUTEX_CONTESTED) == 0 &&
|
|
|
|
(count > 1 || (count == 1 && (owner & ~UMUTEX_CONTESTED) != 0))) {
|
|
|
|
error = casueword32(&m->m_owner, owner, &old,
|
|
|
|
owner | UMUTEX_CONTESTED);
|
|
|
|
if (error == -1) {
|
2014-10-28 15:30:33 +00:00
|
|
|
error = EFAULT;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
break;
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 0) {
|
|
|
|
MPASS(old == owner);
|
|
|
|
break;
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
owner = old;
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
umtxq_lock(&key);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == EFAULT) {
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
umtxq_signal(&key, INT_MAX);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
} else if (count != 0 && ((owner & ~UMUTEX_CONTESTED) == 0 ||
|
|
|
|
owner == UMUTEX_RB_OWNERDEAD || owner == UMUTEX_RB_NOTRECOV))
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
umtxq_signal(&key, 1);
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
2021-07-29 12:46:58 +03:00
|
|
|
struct umtx_pi *
|
2006-10-26 09:33:34 +00:00
|
|
|
umtx_pi_alloc(int flags)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct umtx_pi *pi;
|
|
|
|
|
2006-10-26 09:33:34 +00:00
|
|
|
pi = uma_zalloc(umtx_pi_zone, M_ZERO | flags);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
TAILQ_INIT(&pi->pi_blocked);
|
|
|
|
atomic_add_int(&umtx_pi_allocated, 1);
|
|
|
|
return (pi);
|
|
|
|
}
|
|
|
|
|
2021-07-29 12:46:58 +03:00
|
|
|
void
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_free(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
uma_zfree(umtx_pi_zone, pi);
|
|
|
|
atomic_add_int(&umtx_pi_allocated, -1);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Adjust the thread's position on a pi_state after its priority has been
|
|
|
|
* changed.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
umtx_pi_adjust_thread(struct umtx_pi *pi, struct thread *td)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq, *uq1, *uq2;
|
|
|
|
struct thread *td1;
|
|
|
|
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
mtx_assert(&umtx_lock, MA_OWNED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (pi == NULL)
|
|
|
|
return (0);
|
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check if the thread needs to be moved on the blocked chain.
|
|
|
|
* It needs to be moved if either its priority is lower than
|
|
|
|
* the previous thread or higher than the next thread.
|
|
|
|
*/
|
|
|
|
uq1 = TAILQ_PREV(uq, umtxq_head, uq_lockq);
|
|
|
|
uq2 = TAILQ_NEXT(uq, uq_lockq);
|
|
|
|
if ((uq1 != NULL && UPRI(td) < UPRI(uq1->uq_thread)) ||
|
|
|
|
(uq2 != NULL && UPRI(td) > UPRI(uq2->uq_thread))) {
|
|
|
|
/*
|
|
|
|
* Remove thread from blocked chain and determine where
|
|
|
|
* it should be moved to.
|
|
|
|
*/
|
|
|
|
TAILQ_REMOVE(&pi->pi_blocked, uq, uq_lockq);
|
|
|
|
TAILQ_FOREACH(uq1, &pi->pi_blocked, uq_lockq) {
|
|
|
|
td1 = uq1->uq_thread;
|
|
|
|
MPASS(td1->td_proc->p_magic == P_MAGIC);
|
|
|
|
if (UPRI(td1) > UPRI(td))
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (uq1 == NULL)
|
|
|
|
TAILQ_INSERT_TAIL(&pi->pi_blocked, uq, uq_lockq);
|
|
|
|
else
|
|
|
|
TAILQ_INSERT_BEFORE(uq1, uq, uq_lockq);
|
|
|
|
}
|
|
|
|
return (1);
|
|
|
|
}
|
|
|
|
|
2015-01-31 12:27:40 +00:00
|
|
|
static struct umtx_pi *
|
|
|
|
umtx_pi_next(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq_owner;
|
|
|
|
|
|
|
|
if (pi->pi_owner == NULL)
|
|
|
|
return (NULL);
|
|
|
|
uq_owner = pi->pi_owner->td_umtxq;
|
|
|
|
if (uq_owner == NULL)
|
|
|
|
return (NULL);
|
|
|
|
return (uq_owner->uq_pi_blocked);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Floyd's Cycle-Finding Algorithm.
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
umtx_pi_check_loop(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
struct umtx_pi *pi1; /* fast iterator */
|
|
|
|
|
|
|
|
mtx_assert(&umtx_lock, MA_OWNED);
|
|
|
|
if (pi == NULL)
|
|
|
|
return (false);
|
|
|
|
pi1 = pi;
|
|
|
|
for (;;) {
|
|
|
|
pi = umtx_pi_next(pi);
|
|
|
|
if (pi == NULL)
|
|
|
|
break;
|
|
|
|
pi1 = umtx_pi_next(pi1);
|
|
|
|
if (pi1 == NULL)
|
|
|
|
break;
|
|
|
|
pi1 = umtx_pi_next(pi1);
|
|
|
|
if (pi1 == NULL)
|
|
|
|
break;
|
|
|
|
if (pi == pi1)
|
|
|
|
return (true);
|
|
|
|
}
|
|
|
|
return (false);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Propagate priority when a thread is blocked on POSIX
|
|
|
|
* PI mutex.
|
2020-01-02 22:07:03 +00:00
|
|
|
*/
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
static void
|
|
|
|
umtx_propagate_priority(struct thread *td)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq;
|
|
|
|
struct umtx_pi *pi;
|
|
|
|
int pri;
|
|
|
|
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
mtx_assert(&umtx_lock, MA_OWNED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
pri = UPRI(td);
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
pi = uq->uq_pi_blocked;
|
|
|
|
if (pi == NULL)
|
|
|
|
return;
|
2015-01-31 12:27:40 +00:00
|
|
|
if (umtx_pi_check_loop(pi))
|
|
|
|
return;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
for (;;) {
|
|
|
|
td = pi->pi_owner;
|
2010-12-09 02:42:02 +00:00
|
|
|
if (td == NULL || td == curthread)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return;
|
|
|
|
|
|
|
|
MPASS(td->td_proc != NULL);
|
|
|
|
MPASS(td->td_proc->p_magic == P_MAGIC);
|
|
|
|
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(td);
|
2010-12-09 02:42:02 +00:00
|
|
|
if (td->td_lend_user_pri > pri)
|
|
|
|
sched_lend_user_prio(td, pri);
|
|
|
|
else {
|
|
|
|
thread_unlock(td);
|
|
|
|
break;
|
|
|
|
}
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Pick up the lock that td is blocked on.
|
|
|
|
*/
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
pi = uq->uq_pi_blocked;
|
2010-12-29 09:26:46 +00:00
|
|
|
if (pi == NULL)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
break;
|
2010-12-29 09:26:46 +00:00
|
|
|
/* Resort td on the list if needed. */
|
|
|
|
umtx_pi_adjust_thread(pi, td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unpropagate priority for a PI mutex when a thread blocked on
|
|
|
|
* it is interrupted by signal or resumed by others.
|
|
|
|
*/
|
|
|
|
static void
|
2010-12-29 09:26:46 +00:00
|
|
|
umtx_repropagate_priority(struct umtx_pi *pi)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct umtx_q *uq, *uq_owner;
|
|
|
|
struct umtx_pi *pi2;
|
2010-12-29 09:26:46 +00:00
|
|
|
int pri;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
mtx_assert(&umtx_lock, MA_OWNED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2015-01-31 12:27:40 +00:00
|
|
|
if (umtx_pi_check_loop(pi))
|
|
|
|
return;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
while (pi != NULL && pi->pi_owner != NULL) {
|
|
|
|
pri = PRI_MAX;
|
|
|
|
uq_owner = pi->pi_owner->td_umtxq;
|
|
|
|
|
|
|
|
TAILQ_FOREACH(pi2, &uq_owner->uq_pi_contested, pi_link) {
|
|
|
|
uq = TAILQ_FIRST(&pi2->pi_blocked);
|
|
|
|
if (uq != NULL) {
|
|
|
|
if (pri > UPRI(uq->uq_thread))
|
|
|
|
pri = UPRI(uq->uq_thread);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (pri > uq_owner->uq_inherited_pri)
|
|
|
|
pri = uq_owner->uq_inherited_pri;
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(pi->pi_owner);
|
2010-12-29 09:26:46 +00:00
|
|
|
sched_lend_user_prio(pi->pi_owner, pri);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(pi->pi_owner);
|
2010-12-29 09:26:46 +00:00
|
|
|
if ((pi = uq_owner->uq_pi_blocked) != NULL)
|
|
|
|
umtx_pi_adjust_thread(pi, uq_owner->uq_thread);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Insert a PI mutex into owned list.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
umtx_pi_setowner(struct umtx_pi *pi, struct thread *owner)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq_owner;
|
|
|
|
|
|
|
|
uq_owner = owner->td_umtxq;
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
mtx_assert(&umtx_lock, MA_OWNED);
|
2017-11-04 10:49:34 +00:00
|
|
|
MPASS(pi->pi_owner == NULL);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
pi->pi_owner = owner;
|
|
|
|
TAILQ_INSERT_TAIL(&uq_owner->uq_pi_contested, pi, pi_link);
|
|
|
|
}
|
|
|
|
|
2015-02-25 16:12:56 +00:00
|
|
|
/*
|
|
|
|
* Disown a PI mutex, and remove it from the owned list.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
umtx_pi_disown(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
|
|
|
|
mtx_assert(&umtx_lock, MA_OWNED);
|
|
|
|
TAILQ_REMOVE(&pi->pi_owner->td_umtxq->uq_pi_contested, pi, pi_link);
|
|
|
|
pi->pi_owner = NULL;
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Claim ownership of a PI mutex.
|
|
|
|
*/
|
2021-07-29 12:46:58 +03:00
|
|
|
int
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_claim(struct umtx_pi *pi, struct thread *owner)
|
|
|
|
{
|
2015-03-28 21:21:40 +00:00
|
|
|
struct umtx_q *uq;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
int pri;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (pi->pi_owner == owner) {
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (pi->pi_owner != NULL) {
|
|
|
|
/*
|
|
|
|
* userland may have already messed the mutex, sigh.
|
|
|
|
*/
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EPERM);
|
|
|
|
}
|
|
|
|
umtx_pi_setowner(pi, owner);
|
|
|
|
uq = TAILQ_FIRST(&pi->pi_blocked);
|
|
|
|
if (uq != NULL) {
|
|
|
|
pri = UPRI(uq->uq_thread);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(owner);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (pri < UPRI(owner))
|
|
|
|
sched_lend_user_prio(owner, pri);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(owner);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
2007-12-17 05:55:07 +00:00
|
|
|
/*
|
|
|
|
* Adjust a thread's order position in its blocked PI mutex,
|
|
|
|
* this may result new priority propagating process.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
umtx_pi_adjust(struct thread *td, u_char oldpri)
|
|
|
|
{
|
2007-12-17 08:09:37 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
struct umtx_pi *pi;
|
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
2007-12-17 08:09:37 +00:00
|
|
|
/*
|
|
|
|
* Pick up the lock that td is blocked on.
|
|
|
|
*/
|
|
|
|
pi = uq->uq_pi_blocked;
|
2010-12-29 09:26:46 +00:00
|
|
|
if (pi != NULL) {
|
|
|
|
umtx_pi_adjust_thread(pi, td);
|
|
|
|
umtx_repropagate_priority(pi);
|
|
|
|
}
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
2007-12-17 05:55:07 +00:00
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Sleep on a PI mutex.
|
|
|
|
*/
|
2021-07-29 12:46:58 +03:00
|
|
|
int
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
umtxq_sleep_pi(struct umtx_q *uq, struct umtx_pi *pi, uint32_t owner,
|
2021-07-29 12:41:58 +03:00
|
|
|
const char *wmesg, struct umtx_abs_timeout *timo, bool shared)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct thread *td, *td1;
|
|
|
|
struct umtx_q *uq1;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
int error, pri;
|
2018-05-19 05:09:10 +00:00
|
|
|
#ifdef INVARIANTS
|
|
|
|
struct umtxq_chain *uc;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2018-05-19 05:09:10 +00:00
|
|
|
uc = umtxq_getchain(&pi->pi_key);
|
|
|
|
#endif
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = 0;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
td = uq->uq_thread;
|
|
|
|
KASSERT(td == curthread, ("inconsistent uq_thread"));
|
2018-05-19 05:09:10 +00:00
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(&uq->uq_key));
|
2014-11-13 18:51:09 +00:00
|
|
|
KASSERT(uc->uc_busy != 0, ("umtx chain is not busy"));
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_insert(uq);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (pi->pi_owner == NULL) {
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
td1 = tdfind(owner, shared ? -1 : td->td_proc->p_pid);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
2010-11-15 07:33:54 +00:00
|
|
|
if (td1 != NULL) {
|
|
|
|
if (pi->pi_owner == NULL)
|
|
|
|
umtx_pi_setowner(pi, td1);
|
|
|
|
PROC_UNLOCK(td1->td_proc);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
TAILQ_FOREACH(uq1, &pi->pi_blocked, uq_lockq) {
|
|
|
|
pri = UPRI(uq1->uq_thread);
|
|
|
|
if (pri > UPRI(td))
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (uq1 != NULL)
|
|
|
|
TAILQ_INSERT_BEFORE(uq1, uq, uq_lockq);
|
|
|
|
else
|
|
|
|
TAILQ_INSERT_TAIL(&pi->pi_blocked, uq, uq_lockq);
|
|
|
|
|
|
|
|
uq->uq_pi_blocked = pi;
|
2007-12-17 05:55:07 +00:00
|
|
|
thread_lock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
td->td_flags |= TDF_UPIBLOCKED;
|
2007-12-17 05:55:07 +00:00
|
|
|
thread_unlock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_propagate_priority(td);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
2009-03-13 06:06:20 +00:00
|
|
|
umtxq_unbusy(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, wmesg, timo);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
uq->uq_pi_blocked = NULL;
|
2007-12-17 05:55:07 +00:00
|
|
|
thread_lock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
td->td_flags &= ~TDF_UPIBLOCKED;
|
2007-12-17 05:55:07 +00:00
|
|
|
thread_unlock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
TAILQ_REMOVE(&pi->pi_blocked, uq, uq_lockq);
|
2010-12-29 09:26:46 +00:00
|
|
|
umtx_repropagate_priority(pi);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
2009-03-13 06:06:20 +00:00
|
|
|
umtxq_unlock(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Add reference count for a PI mutex.
|
|
|
|
*/
|
2021-07-29 12:46:58 +03:00
|
|
|
void
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_ref(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
|
2018-05-19 05:09:10 +00:00
|
|
|
UMTXQ_LOCKED_ASSERT(umtxq_getchain(&pi->pi_key));
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
pi->pi_refcount++;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Decrease reference count for a PI mutex, if the counter
|
|
|
|
* is decreased to zero, its memory space is freed.
|
2020-01-02 22:07:03 +00:00
|
|
|
*/
|
2021-07-29 12:46:58 +03:00
|
|
|
void
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_unref(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
struct umtxq_chain *uc;
|
|
|
|
|
|
|
|
uc = umtxq_getchain(&pi->pi_key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
|
|
|
KASSERT(pi->pi_refcount > 0, ("invalid reference count"));
|
|
|
|
if (--pi->pi_refcount == 0) {
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
2015-03-28 21:21:40 +00:00
|
|
|
if (pi->pi_owner != NULL)
|
|
|
|
umtx_pi_disown(pi);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
KASSERT(TAILQ_EMPTY(&pi->pi_blocked),
|
|
|
|
("blocked queue not empty"));
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
TAILQ_REMOVE(&uc->uc_pi_list, pi, pi_hashlink);
|
|
|
|
umtx_pi_free(pi);
|
2009-03-13 06:06:20 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Find a PI mutex in hash table.
|
|
|
|
*/
|
2021-07-29 12:46:58 +03:00
|
|
|
struct umtx_pi *
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_lookup(struct umtx_key *key)
|
|
|
|
{
|
|
|
|
struct umtxq_chain *uc;
|
|
|
|
struct umtx_pi *pi;
|
|
|
|
|
|
|
|
uc = umtxq_getchain(key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
|
|
|
|
|
|
|
TAILQ_FOREACH(pi, &uc->uc_pi_list, pi_hashlink) {
|
|
|
|
if (umtx_key_match(&pi->pi_key, key)) {
|
|
|
|
return (pi);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Insert a PI mutex into hash table.
|
|
|
|
*/
|
2021-07-29 12:46:58 +03:00
|
|
|
void
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_insert(struct umtx_pi *pi)
|
|
|
|
{
|
|
|
|
struct umtxq_chain *uc;
|
|
|
|
|
|
|
|
uc = umtxq_getchain(&pi->pi_key);
|
|
|
|
UMTXQ_LOCKED_ASSERT(uc);
|
|
|
|
TAILQ_INSERT_TAIL(&uc->uc_pi_list, pi, pi_hashlink);
|
|
|
|
}
|
|
|
|
|
2021-07-29 12:47:39 +03:00
|
|
|
/*
|
|
|
|
* Drop a PI mutex and wakeup a top waiter.
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
umtx_pi_drop(struct thread *td, struct umtx_key *key, bool rb, int *count)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq_first, *uq_first2, *uq_me;
|
|
|
|
struct umtx_pi *pi, *pi2;
|
|
|
|
int pri;
|
|
|
|
|
|
|
|
UMTXQ_ASSERT_LOCKED_BUSY(key);
|
|
|
|
*count = umtxq_count_pi(key, &uq_first);
|
|
|
|
if (uq_first != NULL) {
|
|
|
|
mtx_lock(&umtx_lock);
|
|
|
|
pi = uq_first->uq_pi_blocked;
|
|
|
|
KASSERT(pi != NULL, ("pi == NULL?"));
|
|
|
|
if (pi->pi_owner != td && !(rb && pi->pi_owner == NULL)) {
|
|
|
|
mtx_unlock(&umtx_lock);
|
|
|
|
/* userland messed the mutex */
|
|
|
|
return (EPERM);
|
|
|
|
}
|
|
|
|
uq_me = td->td_umtxq;
|
|
|
|
if (pi->pi_owner == td)
|
|
|
|
umtx_pi_disown(pi);
|
|
|
|
/* get highest priority thread which is still sleeping. */
|
|
|
|
uq_first = TAILQ_FIRST(&pi->pi_blocked);
|
|
|
|
while (uq_first != NULL &&
|
|
|
|
(uq_first->uq_flags & UQF_UMTXQ) == 0) {
|
|
|
|
uq_first = TAILQ_NEXT(uq_first, uq_lockq);
|
|
|
|
}
|
|
|
|
pri = PRI_MAX;
|
|
|
|
TAILQ_FOREACH(pi2, &uq_me->uq_pi_contested, pi_link) {
|
|
|
|
uq_first2 = TAILQ_FIRST(&pi2->pi_blocked);
|
|
|
|
if (uq_first2 != NULL) {
|
|
|
|
if (pri > UPRI(uq_first2->uq_thread))
|
|
|
|
pri = UPRI(uq_first2->uq_thread);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
thread_lock(td);
|
|
|
|
sched_lend_user_prio(td, pri);
|
|
|
|
thread_unlock(td);
|
|
|
|
mtx_unlock(&umtx_lock);
|
|
|
|
if (uq_first)
|
|
|
|
umtxq_signal_thread(uq_first);
|
|
|
|
} else {
|
|
|
|
pi = umtx_pi_lookup(key);
|
|
|
|
/*
|
|
|
|
* A umtx_pi can exist if a signal or timeout removed the
|
|
|
|
* last waiter from the umtxq, but there is still
|
|
|
|
* a thread in do_lock_pi() holding the umtx_pi.
|
|
|
|
*/
|
|
|
|
if (pi != NULL) {
|
|
|
|
/*
|
|
|
|
* The umtx_pi can be unowned, such as when a thread
|
|
|
|
* has just entered do_lock_pi(), allocated the
|
|
|
|
* umtx_pi, and unlocked the umtxq.
|
|
|
|
* If the current thread owns it, it must disown it.
|
|
|
|
*/
|
|
|
|
mtx_lock(&umtx_lock);
|
|
|
|
if (pi->pi_owner == td)
|
|
|
|
umtx_pi_disown(pi);
|
|
|
|
mtx_unlock(&umtx_lock);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Lock a PI mutex.
|
|
|
|
*/
|
|
|
|
static int
|
2012-03-30 05:40:26 +00:00
|
|
|
do_lock_pi(struct thread *td, struct umutex *m, uint32_t flags,
|
|
|
|
struct _umtx_time *timeout, int try)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
struct umtx_pi *pi, *new_pi;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uint32_t id, old_owner, owner, old;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error, rv;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
id = td->td_tid;
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((error = umtx_key_get(m, (flags & UMUTEX_ROBUST) != 0 ?
|
|
|
|
TYPE_PI_ROBUST_UMUTEX : TYPE_PI_UMUTEX, GET_SHARE(flags),
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
&uq->uq_key)) != 0)
|
|
|
|
return (error);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
2006-10-26 09:33:34 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
pi = umtx_pi_lookup(&uq->uq_key);
|
|
|
|
if (pi == NULL) {
|
|
|
|
new_pi = umtx_pi_alloc(M_NOWAIT);
|
|
|
|
if (new_pi == NULL) {
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_unlock(&uq->uq_key);
|
2006-10-26 09:33:34 +00:00
|
|
|
new_pi = umtx_pi_alloc(M_WAITOK);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
pi = umtx_pi_lookup(&uq->uq_key);
|
2006-10-26 09:33:34 +00:00
|
|
|
if (pi != NULL) {
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_pi_free(new_pi);
|
2006-10-26 09:33:34 +00:00
|
|
|
new_pi = NULL;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
}
|
2006-10-26 09:33:34 +00:00
|
|
|
if (new_pi != NULL) {
|
|
|
|
new_pi->pi_key = uq->uq_key;
|
|
|
|
umtx_pi_insert(new_pi);
|
|
|
|
pi = new_pi;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
umtx_pi_ref(pi);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2006-10-26 09:33:34 +00:00
|
|
|
/*
|
|
|
|
* Care must be exercised when dealing with umtx structure. It
|
|
|
|
* can fault on any access.
|
|
|
|
*/
|
|
|
|
for (;;) {
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Try the uncontested case. This should be done in userland.
|
|
|
|
*/
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&m->m_owner, UMUTEX_UNOWNED, &owner, id);
|
|
|
|
/* The address was invalid. */
|
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/* The acquire succeeded. */
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_UNOWNED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
error = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (owner == UMUTEX_RB_NOTRECOV) {
|
|
|
|
error = ENOTRECOVERABLE;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
/*
|
|
|
|
* Avoid overwriting a possible error from sleep due
|
|
|
|
* to the pending signal with suspension check result.
|
|
|
|
*/
|
|
|
|
if (error == 0) {
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/* If no one owns it but it is contested try to acquire it. */
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (owner == UMUTEX_CONTESTED || owner == UMUTEX_RB_OWNERDEAD) {
|
|
|
|
old_owner = owner;
|
|
|
|
rv = casueword32(&m->m_owner, owner, &owner,
|
|
|
|
id | UMUTEX_CONTESTED);
|
2014-10-28 15:30:33 +00:00
|
|
|
/* The address was invalid. */
|
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 1) {
|
|
|
|
if (error == 0) {
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error != 0)
|
2019-07-15 08:39:52 +00:00
|
|
|
break;
|
2015-02-25 16:17:16 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
/*
|
|
|
|
* If this failed the lock could
|
|
|
|
* changed, restart.
|
|
|
|
*/
|
|
|
|
continue;
|
|
|
|
}
|
2013-06-13 09:33:22 +00:00
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
MPASS(rv == 0);
|
|
|
|
MPASS(owner == old_owner);
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
error = umtx_pi_claim(pi, td);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
if (error != 0) {
|
|
|
|
/*
|
|
|
|
* Since we're going to return an
|
|
|
|
* error, restore the m_owner to its
|
|
|
|
* previous, unowned state to avoid
|
|
|
|
* compounding the problem.
|
|
|
|
*/
|
|
|
|
(void)casuword32(&m->m_owner,
|
|
|
|
id | UMUTEX_CONTESTED, old_owner);
|
|
|
|
}
|
|
|
|
if (error == 0 && old_owner == UMUTEX_RB_OWNERDEAD)
|
|
|
|
error = EOWNERDEAD;
|
|
|
|
break;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
2015-01-31 12:27:40 +00:00
|
|
|
if ((owner & ~UMUTEX_CONTESTED) == id) {
|
|
|
|
error = EDEADLK;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (try != 0) {
|
|
|
|
error = EBUSY;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we caught a signal, we have retried and now
|
|
|
|
* exit immediately.
|
|
|
|
*/
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
2020-01-02 22:07:03 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set the contested bit so that a release in user space
|
|
|
|
* knows to use the system call for unlock. If this fails
|
|
|
|
* either some one else has acquired the lock or it has been
|
|
|
|
* released.
|
|
|
|
*/
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
rv = casueword32(&m->m_owner, owner, &old, owner |
|
|
|
|
UMUTEX_CONTESTED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
/* The address was invalid. */
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
2014-08-22 18:42:14 +00:00
|
|
|
if (error != 0)
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
break;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The lock changed and we need to retry or we
|
|
|
|
* lost a race to the thread unlocking the
|
|
|
|
* umtx. Note that the UMUTEX_RB_OWNERDEAD
|
|
|
|
* value for owner is impossible there.
|
|
|
|
*/
|
|
|
|
continue;
|
2009-03-13 06:06:20 +00:00
|
|
|
}
|
2013-06-13 09:33:22 +00:00
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
|
|
|
|
/* We set the contested bit, sleep. */
|
|
|
|
MPASS(old == owner);
|
|
|
|
error = umtxq_sleep_pi(uq, pi, owner & ~UMUTEX_CONTESTED,
|
|
|
|
"umtxpi", timeout == NULL ? NULL : &timo,
|
|
|
|
(flags & USYNC_PROCESS_SHARED) != 0);
|
|
|
|
if (error != 0)
|
|
|
|
continue;
|
|
|
|
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
2006-10-26 09:33:34 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtx_pi_unref(pi);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlock a PI mutex.
|
|
|
|
*/
|
|
|
|
static int
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
do_unlock_pi(struct thread *td, struct umutex *m, uint32_t flags, bool rb)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct umtx_key key;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uint32_t id, new_owner, old, owner;
|
2021-07-29 12:47:39 +03:00
|
|
|
int count, error;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
id = td->td_tid;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
|
|
|
|
usrloop:
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
|
|
|
* Make sure we own this mtx.
|
|
|
|
*/
|
2014-10-31 17:43:21 +00:00
|
|
|
error = fueword32(&m->m_owner, &owner);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
if ((owner & ~UMUTEX_CONTESTED) != id)
|
|
|
|
return (EPERM);
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
new_owner = umtx_unlock_val(flags, rb);
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/* This should be done in userland */
|
|
|
|
if ((owner & UMUTEX_CONTESTED) == 0) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = casueword32(&m->m_owner, owner, &old, new_owner);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
goto usrloop;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (old == owner)
|
|
|
|
return (0);
|
2006-09-02 02:41:33 +00:00
|
|
|
owner = old;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* We should only ever be in here for contested locks */
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((error = umtx_key_get(m, (flags & UMUTEX_ROBUST) != 0 ?
|
|
|
|
TYPE_PI_ROBUST_UMUTEX : TYPE_PI_UMUTEX, GET_SHARE(flags),
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
&key)) != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
2021-07-29 12:47:39 +03:00
|
|
|
error = umtx_pi_drop(td, &key, rb, &count);
|
|
|
|
if (error != 0) {
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
/* userland messed the mutex */
|
|
|
|
return (error);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When unlocking the umtx, it must be marked as unowned if
|
|
|
|
* there is zero or one thread only waiting for it.
|
|
|
|
* Otherwise, it must be marked as contested.
|
|
|
|
*/
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
if (count > 1)
|
|
|
|
new_owner |= UMUTEX_CONTESTED;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
again:
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = casueword32(&m->m_owner, owner, &old, new_owner);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 0)
|
|
|
|
goto again;
|
|
|
|
}
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_key_release(&key);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 0 && old != owner)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EINVAL);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
return (error);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Lock a PP mutex.
|
|
|
|
*/
|
|
|
|
static int
|
2012-03-30 05:40:26 +00:00
|
|
|
do_lock_pp(struct thread *td, struct umutex *m, uint32_t flags,
|
|
|
|
struct _umtx_time *timeout, int try)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
struct umtx_q *uq, *uq2;
|
|
|
|
struct umtx_pi *pi;
|
|
|
|
uint32_t ceiling;
|
|
|
|
uint32_t owner, id;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error, pri, old_inherited_pri, su, rv;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
id = td->td_tid;
|
|
|
|
uq = td->td_umtxq;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((error = umtx_key_get(m, (flags & UMUTEX_ROBUST) != 0 ?
|
|
|
|
TYPE_PP_ROBUST_UMUTEX : TYPE_PP_UMUTEX, GET_SHARE(flags),
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
&uq->uq_key)) != 0)
|
|
|
|
return (error);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
2006-11-06 13:42:10 +00:00
|
|
|
su = (priv_check(td, PRIV_SCHED_RTPRIO) == 0);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
for (;;) {
|
|
|
|
old_inherited_pri = uq->uq_inherited_pri;
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&m->m_ceilings[0], &ceiling);
|
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
ceiling = RTP_PRIO_MAX - ceiling;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (ceiling > RTP_PRIO_MAX) {
|
|
|
|
error = EINVAL;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (UPRI(td) < PRI_MIN_REALTIME + ceiling) {
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
error = EINVAL;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (su && PRI_MIN_REALTIME + ceiling < uq->uq_inherited_pri) {
|
|
|
|
uq->uq_inherited_pri = PRI_MIN_REALTIME + ceiling;
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if (uq->uq_inherited_pri < UPRI(td))
|
|
|
|
sched_lend_user_prio(td, uq->uq_inherited_pri);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(td);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
rv = casueword32(&m->m_owner, UMUTEX_CONTESTED, &owner,
|
|
|
|
id | UMUTEX_CONTESTED);
|
2014-10-28 15:30:33 +00:00
|
|
|
/* The address was invalid. */
|
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_CONTESTED);
|
2014-10-28 15:30:33 +00:00
|
|
|
error = 0;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
break;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
|
|
|
/* rv == 1 */
|
|
|
|
if (owner == UMUTEX_RB_OWNERDEAD) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
rv = casueword32(&m->m_owner, UMUTEX_RB_OWNERDEAD,
|
|
|
|
&owner, id | UMUTEX_CONTESTED);
|
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_RB_OWNERDEAD);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = EOWNERDEAD; /* success */
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* rv == 1, only check for suspension if we
|
|
|
|
* did not already catched a signal. If we
|
|
|
|
* get an error from the check, the same
|
|
|
|
* condition is checked by the umtxq_sleep()
|
|
|
|
* call below, so we should obliterate the
|
|
|
|
* error to not skip the last loop iteration.
|
|
|
|
*/
|
|
|
|
if (error == 0) {
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (error == 0) {
|
|
|
|
if (try != 0)
|
|
|
|
error = EBUSY;
|
|
|
|
else
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
error = 0;
|
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
} else if (owner == UMUTEX_RB_NOTRECOV) {
|
|
|
|
error = ENOTRECOVERABLE;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (try != 0)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
error = EBUSY;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we caught a signal, we have retried and now
|
|
|
|
* exit immediately.
|
|
|
|
*/
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, "umtxpp", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
uq->uq_inherited_pri = old_inherited_pri;
|
|
|
|
pri = PRI_MAX;
|
|
|
|
TAILQ_FOREACH(pi, &uq->uq_pi_contested, pi_link) {
|
|
|
|
uq2 = TAILQ_FIRST(&pi->pi_blocked);
|
|
|
|
if (uq2 != NULL) {
|
|
|
|
if (pri > UPRI(uq2->uq_thread))
|
|
|
|
pri = UPRI(uq2->uq_thread);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (pri > uq->uq_inherited_pri)
|
|
|
|
pri = uq->uq_inherited_pri;
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(td);
|
2010-12-29 09:26:46 +00:00
|
|
|
sched_lend_user_prio(td, pri);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(td);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (error != 0 && error != EOWNERDEAD) {
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
uq->uq_inherited_pri = old_inherited_pri;
|
|
|
|
pri = PRI_MAX;
|
|
|
|
TAILQ_FOREACH(pi, &uq->uq_pi_contested, pi_link) {
|
|
|
|
uq2 = TAILQ_FIRST(&pi->pi_blocked);
|
|
|
|
if (uq2 != NULL) {
|
|
|
|
if (pri > UPRI(uq2->uq_thread))
|
|
|
|
pri = UPRI(uq2->uq_thread);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (pri > uq->uq_inherited_pri)
|
|
|
|
pri = uq->uq_inherited_pri;
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(td);
|
2010-12-29 09:26:46 +00:00
|
|
|
sched_lend_user_prio(td, pri);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(td);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlock a PP mutex.
|
|
|
|
*/
|
|
|
|
static int
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
do_unlock_pp(struct thread *td, struct umutex *m, uint32_t flags, bool rb)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
struct umtx_q *uq, *uq2;
|
|
|
|
struct umtx_pi *pi;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uint32_t id, owner, rceiling;
|
2006-09-03 00:07:37 +00:00
|
|
|
int error, pri, new_inherited_pri, su;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
id = td->td_tid;
|
|
|
|
uq = td->td_umtxq;
|
2006-11-06 13:42:10 +00:00
|
|
|
su = (priv_check(td, PRIV_SCHED_RTPRIO) == 0);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure we own this mtx.
|
|
|
|
*/
|
2014-10-31 17:43:21 +00:00
|
|
|
error = fueword32(&m->m_owner, &owner);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
|
|
|
|
|
|
|
if ((owner & ~UMUTEX_CONTESTED) != id)
|
|
|
|
return (EPERM);
|
|
|
|
|
|
|
|
error = copyin(&m->m_ceilings[1], &rceiling, sizeof(uint32_t));
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
|
|
|
|
if (rceiling == -1)
|
|
|
|
new_inherited_pri = PRI_MAX;
|
|
|
|
else {
|
|
|
|
rceiling = RTP_PRIO_MAX - rceiling;
|
|
|
|
if (rceiling > RTP_PRIO_MAX)
|
|
|
|
return (EINVAL);
|
|
|
|
new_inherited_pri = PRI_MIN_REALTIME + rceiling;
|
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((error = umtx_key_get(m, (flags & UMUTEX_ROBUST) != 0 ?
|
|
|
|
TYPE_PP_ROBUST_UMUTEX : TYPE_PP_UMUTEX, GET_SHARE(flags),
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
&key)) != 0)
|
|
|
|
return (error);
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
/*
|
|
|
|
* For priority protected mutex, always set unlocked state
|
|
|
|
* to UMUTEX_CONTESTED, so that userland always enters kernel
|
|
|
|
* to lock the mutex, it is necessary because thread priority
|
|
|
|
* has to be adjusted for such mutex.
|
|
|
|
*/
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = suword32(&m->m_owner, umtx_unlock_val(flags, rb) |
|
|
|
|
UMUTEX_CONTESTED);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
if (error == 0)
|
|
|
|
umtxq_signal(&key, 1);
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
|
|
|
if (error == -1)
|
|
|
|
error = EFAULT;
|
|
|
|
else {
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_lock(&umtx_lock);
|
2006-09-03 00:07:37 +00:00
|
|
|
if (su != 0)
|
|
|
|
uq->uq_inherited_pri = new_inherited_pri;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
pri = PRI_MAX;
|
|
|
|
TAILQ_FOREACH(pi, &uq->uq_pi_contested, pi_link) {
|
|
|
|
uq2 = TAILQ_FIRST(&pi->pi_blocked);
|
|
|
|
if (uq2 != NULL) {
|
|
|
|
if (pri > UPRI(uq2->uq_thread))
|
|
|
|
pri = UPRI(uq2->uq_thread);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (pri > uq->uq_inherited_pri)
|
|
|
|
pri = uq->uq_inherited_pri;
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_lock(td);
|
2010-12-29 09:26:46 +00:00
|
|
|
sched_lend_user_prio(td, pri);
|
Commit 8/14 of sched_lock decomposition.
- Use a global umtx spinlock to protect the sleep queues now that there
is no global scheduler lock.
- Use thread_lock() to protect thread state.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:54:50 +00:00
|
|
|
thread_unlock(td);
|
2015-02-28 04:19:02 +00:00
|
|
|
mtx_unlock(&umtx_lock);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
umtx_key_release(&key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
do_set_ceiling(struct thread *td, struct umutex *m, uint32_t ceiling,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uint32_t *old_ceiling)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
struct umtx_q *uq;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uint32_t flags, id, owner, save_ceiling;
|
|
|
|
int error, rv, rv1;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&m->m_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
if ((flags & UMUTEX_PRIO_PROTECT) == 0)
|
|
|
|
return (EINVAL);
|
|
|
|
if (ceiling > RTP_PRIO_MAX)
|
|
|
|
return (EINVAL);
|
|
|
|
id = td->td_tid;
|
|
|
|
uq = td->td_umtxq;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((error = umtx_key_get(m, (flags & UMUTEX_ROBUST) != 0 ?
|
|
|
|
TYPE_PP_ROBUST_UMUTEX : TYPE_PP_UMUTEX, GET_SHARE(flags),
|
|
|
|
&uq->uq_key)) != 0)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (error);
|
|
|
|
for (;;) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&m->m_ceilings[0], &save_ceiling);
|
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
rv = casueword32(&m->m_owner, UMUTEX_CONTESTED, &owner,
|
|
|
|
id | UMUTEX_CONTESTED);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2019-07-15 08:38:01 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(owner == UMUTEX_CONTESTED);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
rv = suword32(&m->m_ceilings[0], ceiling);
|
|
|
|
rv1 = suword32(&m->m_owner, UMUTEX_CONTESTED);
|
|
|
|
error = (rv == 0 && rv1 == 0) ? 0: EFAULT;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if ((owner & ~UMUTEX_CONTESTED) == id) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
rv = suword32(&m->m_ceilings[0], ceiling);
|
|
|
|
error = rv == 0 ? 0 : EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (owner == UMUTEX_RB_OWNERDEAD) {
|
|
|
|
error = EOWNERDEAD;
|
|
|
|
break;
|
|
|
|
} else if (owner == UMUTEX_RB_NOTRECOV) {
|
|
|
|
error = ENOTRECOVERABLE;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we caught a signal, we have retried and now
|
|
|
|
* exit immediately.
|
|
|
|
*/
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We set the contested bit, sleep. Otherwise the lock changed
|
|
|
|
* and we need to retry or we lost a race to the thread
|
|
|
|
* unlocking the umtx.
|
|
|
|
*/
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, "umtxpp", NULL);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
}
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
if (error == 0)
|
|
|
|
umtxq_signal(&uq->uq_key, INT_MAX);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (error == 0 && old_ceiling != NULL) {
|
|
|
|
rv = suword32(old_ceiling, save_ceiling);
|
|
|
|
error = rv == 0 ? 0 : EFAULT;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Lock a userland POSIX mutex.
|
|
|
|
*/
|
|
|
|
static int
|
2006-09-05 12:01:09 +00:00
|
|
|
do_lock_umutex(struct thread *td, struct umutex *m,
|
2012-03-30 05:40:26 +00:00
|
|
|
struct _umtx_time *timeout, int mode)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
uint32_t flags;
|
2006-09-05 12:01:09 +00:00
|
|
|
int error;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&m->m_flags, &flags);
|
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
switch (flags & (UMUTEX_PRIO_INHERIT | UMUTEX_PRIO_PROTECT)) {
|
2012-03-30 05:40:26 +00:00
|
|
|
case 0:
|
|
|
|
error = do_lock_normal(td, m, flags, timeout, mode);
|
|
|
|
break;
|
|
|
|
case UMUTEX_PRIO_INHERIT:
|
|
|
|
error = do_lock_pi(td, m, flags, timeout, mode);
|
|
|
|
break;
|
|
|
|
case UMUTEX_PRIO_PROTECT:
|
|
|
|
error = do_lock_pp(td, m, flags, timeout, mode);
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
return (EINVAL);
|
|
|
|
}
|
2006-09-05 12:01:09 +00:00
|
|
|
if (timeout == NULL) {
|
2008-06-24 07:32:12 +00:00
|
|
|
if (error == EINTR && mode != _UMUTEX_WAIT)
|
2006-09-05 12:01:09 +00:00
|
|
|
error = ERESTART;
|
|
|
|
} else {
|
|
|
|
/* Timed-locking is not restarted. */
|
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
2006-08-30 23:59:45 +00:00
|
|
|
}
|
2006-09-05 12:01:09 +00:00
|
|
|
return (error);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlock a userland POSIX mutex.
|
|
|
|
*/
|
|
|
|
static int
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
do_unlock_umutex(struct thread *td, struct umutex *m, bool rb)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
|
|
|
uint32_t flags;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&m->m_flags, &flags);
|
|
|
|
if (error == -1)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
return (EFAULT);
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
switch (flags & (UMUTEX_PRIO_INHERIT | UMUTEX_PRIO_PROTECT)) {
|
2006-09-02 02:41:33 +00:00
|
|
|
case 0:
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_unlock_normal(td, m, flags, rb));
|
2006-09-02 02:41:33 +00:00
|
|
|
case UMUTEX_PRIO_INHERIT:
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_unlock_pi(td, m, flags, rb));
|
2006-09-02 02:41:33 +00:00
|
|
|
case UMUTEX_PRIO_PROTECT:
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_unlock_pp(td, m, flags, rb));
|
2006-09-02 02:41:33 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2006-09-02 02:41:33 +00:00
|
|
|
return (EINVAL);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
2006-12-03 01:49:22 +00:00
|
|
|
static int
|
|
|
|
do_cv_wait(struct thread *td, struct ucond *cv, struct umutex *m,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
struct timespec *timeout, u_long wflags)
|
2006-12-03 01:49:22 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2006-12-03 01:49:22 +00:00
|
|
|
struct umtx_q *uq;
|
2014-10-28 15:30:33 +00:00
|
|
|
uint32_t flags, clockid, hasw;
|
2006-12-03 01:49:22 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&cv->c_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2006-12-03 01:49:22 +00:00
|
|
|
error = umtx_key_get(cv, TYPE_CV, GET_SHARE(flags), &uq->uq_key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
2010-12-22 05:01:52 +00:00
|
|
|
|
|
|
|
if ((wflags & CVWAIT_CLOCKID) != 0) {
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&cv->c_clockid, &clockid);
|
|
|
|
if (error == -1) {
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
2010-12-22 05:01:52 +00:00
|
|
|
if (clockid < CLOCK_REALTIME ||
|
|
|
|
clockid >= CLOCK_THREAD_CPUTIME_ID) {
|
|
|
|
/* hmm, only HW clock id will work. */
|
2014-10-28 15:30:33 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
2010-12-22 05:01:52 +00:00
|
|
|
return (EINVAL);
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
clockid = CLOCK_REALTIME;
|
|
|
|
}
|
|
|
|
|
2006-12-03 01:49:22 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
|
|
|
/*
|
2010-12-22 05:01:52 +00:00
|
|
|
* Set c_has_waiters to 1 before releasing user mutex, also
|
|
|
|
* don't modify cache line when unnecessary.
|
2006-12-03 01:49:22 +00:00
|
|
|
*/
|
2014-10-31 17:43:21 +00:00
|
|
|
error = fueword32(&cv->c_has_waiters, &hasw);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == 0 && hasw == 0)
|
2014-10-31 17:43:21 +00:00
|
|
|
suword32(&cv->c_has_waiters, 1);
|
2006-12-03 01:49:22 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2006-12-03 01:49:22 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = do_unlock_umutex(td, m, false);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
2012-03-30 12:57:14 +00:00
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init(&timo, clockid,
|
|
|
|
(wflags & CVWAIT_ABSTIME) != 0, timeout);
|
2020-01-02 22:07:03 +00:00
|
|
|
|
2006-12-03 01:49:22 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
if (error == 0) {
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, "ucond", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
2006-12-03 01:49:22 +00:00
|
|
|
}
|
|
|
|
|
2010-08-25 03:14:32 +00:00
|
|
|
if ((uq->uq_flags & UQF_UMTXQ) == 0)
|
|
|
|
error = 0;
|
|
|
|
else {
|
2010-12-22 05:01:52 +00:00
|
|
|
/*
|
|
|
|
* This must be timeout,interrupted by signal or
|
|
|
|
* surprious wakeup, clear c_has_waiter flag when
|
|
|
|
* necessary.
|
|
|
|
*/
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
if ((uq->uq_flags & UQF_UMTXQ) != 0) {
|
|
|
|
int oldlen = uq->uq_cur_queue->length;
|
|
|
|
umtxq_remove(uq);
|
|
|
|
if (oldlen == 1) {
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
2014-10-31 17:43:21 +00:00
|
|
|
suword32(&cv->c_has_waiters, 0);
|
2010-12-22 05:01:52 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
2006-12-03 01:49:22 +00:00
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
|
|
|
}
|
2010-08-25 03:14:32 +00:00
|
|
|
|
2006-12-03 01:49:22 +00:00
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Signal a userland condition variable.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_cv_signal(struct thread *td, struct ucond *cv)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
int error, cnt, nwake;
|
|
|
|
uint32_t flags;
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&cv->c_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2006-12-03 01:49:22 +00:00
|
|
|
if ((error = umtx_key_get(cv, TYPE_CV, GET_SHARE(flags), &key)) != 0)
|
2020-01-02 22:07:03 +00:00
|
|
|
return (error);
|
2006-12-03 01:49:22 +00:00
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
cnt = umtxq_count(&key);
|
|
|
|
nwake = umtxq_signal(&key, 1);
|
|
|
|
if (cnt <= nwake) {
|
|
|
|
umtxq_unlock(&key);
|
2014-10-31 17:43:21 +00:00
|
|
|
error = suword32(&cv->c_has_waiters, 0);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
|
|
|
error = EFAULT;
|
2006-12-03 01:49:22 +00:00
|
|
|
umtxq_lock(&key);
|
|
|
|
}
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
do_cv_broadcast(struct thread *td, struct ucond *cv)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
int error;
|
|
|
|
uint32_t flags;
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&cv->c_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2006-12-03 01:49:22 +00:00
|
|
|
if ((error = umtx_key_get(cv, TYPE_CV, GET_SHARE(flags), &key)) != 0)
|
2020-01-02 22:07:03 +00:00
|
|
|
return (error);
|
2006-12-03 01:49:22 +00:00
|
|
|
|
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
umtxq_signal(&key, INT_MAX);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
|
2014-10-31 17:43:21 +00:00
|
|
|
error = suword32(&cv->c_has_waiters, 0);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
|
|
|
error = EFAULT;
|
2006-12-03 01:49:22 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&key);
|
2006-12-03 01:49:22 +00:00
|
|
|
|
|
|
|
umtx_key_release(&key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
static int
|
2019-06-28 20:40:54 +00:00
|
|
|
do_rw_rdlock(struct thread *td, struct urwlock *rwlock, long fflag,
|
|
|
|
struct _umtx_time *timeout)
|
2008-04-02 04:08:37 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2008-04-02 04:08:37 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
uint32_t flags, wrflags;
|
|
|
|
int32_t state, oldstate;
|
|
|
|
int32_t blocked_readers;
|
2016-08-25 19:15:02 +00:00
|
|
|
int error, error1, rv;
|
2008-04-02 04:08:37 +00:00
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&rwlock->rw_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2008-04-02 04:08:37 +00:00
|
|
|
error = umtx_key_get(rwlock, TYPE_RWLOCK, GET_SHARE(flags), &uq->uq_key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
wrflags = URWLOCK_WRITE_OWNER;
|
|
|
|
if (!(fflag & URWLOCK_PREFER_READER) && !(flags & URWLOCK_PREFER_READER))
|
|
|
|
wrflags |= URWLOCK_WRITE_WAITERS;
|
|
|
|
|
|
|
|
for (;;) {
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
/* try to lock it */
|
|
|
|
while (!(state & wrflags)) {
|
2019-07-02 21:03:06 +00:00
|
|
|
if (__predict_false(URWLOCK_READER_COUNT(state) ==
|
|
|
|
URWLOCK_MAX_READERS)) {
|
2008-04-02 04:08:37 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EAGAIN);
|
|
|
|
}
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state + 1);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(oldstate == state);
|
2008-04-02 04:08:37 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (0);
|
|
|
|
}
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (error)
|
|
|
|
break;
|
|
|
|
|
|
|
|
/* grab monitor lock */
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
2010-02-03 03:56:32 +00:00
|
|
|
/*
|
|
|
|
* re-read the state, in case it changed between the try-lock above
|
|
|
|
* and the check below
|
|
|
|
*/
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1)
|
|
|
|
error = EFAULT;
|
2010-02-03 03:56:32 +00:00
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
/* set read contention bit */
|
2014-10-28 15:30:33 +00:00
|
|
|
while (error == 0 && (state & wrflags) &&
|
|
|
|
!(state & URWLOCK_READ_WAITERS)) {
|
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state | URWLOCK_READ_WAITERS);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(oldstate == state);
|
2008-04-02 04:08:37 +00:00
|
|
|
goto sleep;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
if (error != 0) {
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2013-06-13 09:33:22 +00:00
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* state is changed while setting flags, restart */
|
|
|
|
if (!(state & wrflags)) {
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
sleep:
|
2019-07-02 21:03:06 +00:00
|
|
|
/*
|
|
|
|
* Contention bit is set, before sleeping, increase
|
|
|
|
* read waiter count.
|
|
|
|
*/
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&rwlock->rw_blocked_readers,
|
|
|
|
&blocked_readers);
|
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
suword32(&rwlock->rw_blocked_readers, blocked_readers+1);
|
|
|
|
|
|
|
|
while (state & wrflags) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, "urdlck", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
2008-04-02 04:08:37 +00:00
|
|
|
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
if (error)
|
|
|
|
break;
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* decrease read waiter count, and may clear read contention bit */
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&rwlock->rw_blocked_readers,
|
|
|
|
&blocked_readers);
|
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
suword32(&rwlock->rw_blocked_readers, blocked_readers-1);
|
|
|
|
if (blocked_readers == 1) {
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2016-08-25 16:35:42 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2014-10-28 15:30:33 +00:00
|
|
|
error = EFAULT;
|
2016-08-25 16:35:42 +00:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
for (;;) {
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state & ~URWLOCK_READ_WAITERS);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(oldstate == state);
|
2008-04-02 04:08:37 +00:00
|
|
|
break;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
2020-01-02 22:13:59 +00:00
|
|
|
error1 = thread_check_susp(td, false);
|
2016-08-25 19:15:02 +00:00
|
|
|
if (error1 != 0) {
|
|
|
|
if (error == 0)
|
|
|
|
error = error1;
|
2016-08-25 16:35:42 +00:00
|
|
|
break;
|
2016-08-25 19:15:02 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
umtx_key_release(&uq->uq_key);
|
2008-04-02 04:26:59 +00:00
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
2008-04-02 04:08:37 +00:00
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2012-03-30 05:40:26 +00:00
|
|
|
do_rw_wrlock(struct thread *td, struct urwlock *rwlock, struct _umtx_time *timeout)
|
2008-04-02 04:08:37 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2008-04-02 04:08:37 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
uint32_t flags;
|
|
|
|
int32_t state, oldstate;
|
|
|
|
int32_t blocked_writers;
|
2009-09-25 00:03:13 +00:00
|
|
|
int32_t blocked_readers;
|
2016-08-25 19:15:02 +00:00
|
|
|
int error, error1, rv;
|
2008-04-02 04:08:37 +00:00
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&rwlock->rw_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2008-04-02 04:08:37 +00:00
|
|
|
error = umtx_key_get(rwlock, TYPE_RWLOCK, GET_SHARE(flags), &uq->uq_key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
2009-09-25 00:03:13 +00:00
|
|
|
blocked_readers = 0;
|
2008-04-02 04:08:37 +00:00
|
|
|
for (;;) {
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
2019-07-02 21:03:06 +00:00
|
|
|
while ((state & URWLOCK_WRITE_OWNER) == 0 &&
|
|
|
|
URWLOCK_READER_COUNT(state) == 0) {
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state | URWLOCK_WRITE_OWNER);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(oldstate == state);
|
2008-04-02 04:08:37 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
state = oldstate;
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
2009-09-25 00:03:13 +00:00
|
|
|
if (error) {
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if ((state & (URWLOCK_WRITE_OWNER |
|
|
|
|
URWLOCK_WRITE_WAITERS)) == 0 &&
|
2009-09-25 00:03:13 +00:00
|
|
|
blocked_readers != 0) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
umtxq_signal_queue(&uq->uq_key, INT_MAX,
|
|
|
|
UMTX_SHARED_QUEUE);
|
2009-09-25 00:03:13 +00:00
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
}
|
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
break;
|
2009-09-25 00:03:13 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
|
|
|
|
/* grab monitor lock */
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
|
2010-02-03 03:56:32 +00:00
|
|
|
/*
|
2019-07-02 21:03:06 +00:00
|
|
|
* Re-read the state, in case it changed between the
|
|
|
|
* try-lock above and the check below.
|
2010-02-03 03:56:32 +00:00
|
|
|
*/
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1)
|
|
|
|
error = EFAULT;
|
2010-02-03 03:56:32 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
while (error == 0 && ((state & URWLOCK_WRITE_OWNER) ||
|
|
|
|
URWLOCK_READER_COUNT(state) != 0) &&
|
|
|
|
(state & URWLOCK_WRITE_WAITERS) == 0) {
|
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state | URWLOCK_WRITE_WAITERS);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(oldstate == state);
|
2008-04-02 04:08:37 +00:00
|
|
|
goto sleep;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
if (error != 0) {
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2013-06-13 09:33:22 +00:00
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
2019-07-02 19:32:48 +00:00
|
|
|
if ((state & URWLOCK_WRITE_OWNER) == 0 &&
|
|
|
|
URWLOCK_READER_COUNT(state) == 0) {
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, false);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
2008-04-02 04:08:37 +00:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
sleep:
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&rwlock->rw_blocked_writers,
|
|
|
|
&blocked_writers);
|
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2019-07-02 19:32:48 +00:00
|
|
|
suword32(&rwlock->rw_blocked_writers, blocked_writers + 1);
|
2008-04-02 04:08:37 +00:00
|
|
|
|
2019-07-02 19:32:48 +00:00
|
|
|
while ((state & URWLOCK_WRITE_OWNER) ||
|
|
|
|
URWLOCK_READER_COUNT(state) != 0) {
|
2008-04-02 04:08:37 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_insert_queue(uq, UMTX_EXCLUSIVE_QUEUE);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, "uwrlck", timeout == NULL ?
|
|
|
|
NULL : &timo);
|
2008-04-02 04:08:37 +00:00
|
|
|
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_remove_queue(uq, UMTX_EXCLUSIVE_QUEUE);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
if (error)
|
|
|
|
break;
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&rwlock->rw_blocked_writers,
|
|
|
|
&blocked_writers);
|
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
suword32(&rwlock->rw_blocked_writers, blocked_writers-1);
|
|
|
|
if (blocked_writers == 1) {
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
for (;;) {
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state & ~URWLOCK_WRITE_WAITERS);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0) {
|
|
|
|
MPASS(oldstate == state);
|
2008-04-02 04:08:37 +00:00
|
|
|
break;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
2020-01-02 22:13:59 +00:00
|
|
|
error1 = thread_check_susp(td, false);
|
2013-06-13 09:33:22 +00:00
|
|
|
/*
|
|
|
|
* We are leaving the URWLOCK_WRITE_WAITERS
|
|
|
|
* behind, but this should not harm the
|
|
|
|
* correctness.
|
|
|
|
*/
|
2016-08-25 19:15:02 +00:00
|
|
|
if (error1 != 0) {
|
|
|
|
if (error == 0)
|
|
|
|
error = error1;
|
2013-06-13 09:33:22 +00:00
|
|
|
break;
|
2016-08-25 19:15:02 +00:00
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&rwlock->rw_blocked_readers,
|
|
|
|
&blocked_readers);
|
|
|
|
if (rv == -1) {
|
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
|
|
|
error = EFAULT;
|
|
|
|
break;
|
|
|
|
}
|
2009-09-25 00:03:13 +00:00
|
|
|
} else
|
|
|
|
blocked_readers = 0;
|
2008-04-02 04:08:37 +00:00
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
umtxq_unbusy_unlocked(&uq->uq_key);
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
umtx_key_release(&uq->uq_key);
|
2008-04-02 04:26:59 +00:00
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
2008-04-02 04:08:37 +00:00
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2008-04-03 11:49:20 +00:00
|
|
|
do_rw_unlock(struct thread *td, struct urwlock *rwlock)
|
2008-04-02 04:08:37 +00:00
|
|
|
{
|
|
|
|
struct umtx_q *uq;
|
|
|
|
uint32_t flags;
|
|
|
|
int32_t state, oldstate;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error, rv, q, count;
|
2008-04-02 04:08:37 +00:00
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&rwlock->rw_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2008-04-02 04:08:37 +00:00
|
|
|
error = umtx_key_get(rwlock, TYPE_RWLOCK, GET_SHARE(flags), &uq->uq_key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
|
2014-10-31 17:43:21 +00:00
|
|
|
error = fueword32(&rwlock->rw_state, &state);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1) {
|
|
|
|
error = EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
2008-04-02 04:08:37 +00:00
|
|
|
if (state & URWLOCK_WRITE_OWNER) {
|
|
|
|
for (;;) {
|
2020-01-02 22:07:03 +00:00
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
2014-10-28 15:30:33 +00:00
|
|
|
&oldstate, state & ~URWLOCK_WRITE_OWNER);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
error = EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 1) {
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
|
|
|
if (!(oldstate & URWLOCK_WRITE_OWNER)) {
|
|
|
|
error = EPERM;
|
|
|
|
goto out;
|
|
|
|
}
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
goto out;
|
2008-04-02 04:08:37 +00:00
|
|
|
} else
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
} else if (URWLOCK_READER_COUNT(state) != 0) {
|
|
|
|
for (;;) {
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&rwlock->rw_state, state,
|
|
|
|
&oldstate, state - 1);
|
|
|
|
if (rv == -1) {
|
2013-06-13 09:33:22 +00:00
|
|
|
error = EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 1) {
|
2008-04-02 04:08:37 +00:00
|
|
|
state = oldstate;
|
|
|
|
if (URWLOCK_READER_COUNT(oldstate) == 0) {
|
|
|
|
error = EPERM;
|
|
|
|
goto out;
|
|
|
|
}
|
2020-01-02 22:13:59 +00:00
|
|
|
error = thread_check_susp(td, true);
|
2013-06-13 09:33:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
goto out;
|
|
|
|
} else
|
2008-04-02 04:08:37 +00:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
error = EPERM;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
count = 0;
|
|
|
|
|
|
|
|
if (!(flags & URWLOCK_PREFER_READER)) {
|
|
|
|
if (state & URWLOCK_WRITE_WAITERS) {
|
|
|
|
count = 1;
|
|
|
|
q = UMTX_EXCLUSIVE_QUEUE;
|
|
|
|
} else if (state & URWLOCK_READ_WAITERS) {
|
|
|
|
count = INT_MAX;
|
|
|
|
q = UMTX_SHARED_QUEUE;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
if (state & URWLOCK_READ_WAITERS) {
|
|
|
|
count = INT_MAX;
|
|
|
|
q = UMTX_SHARED_QUEUE;
|
|
|
|
} else if (state & URWLOCK_WRITE_WAITERS) {
|
|
|
|
count = 1;
|
|
|
|
q = UMTX_EXCLUSIVE_QUEUE;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (count) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_signal_queue(&uq->uq_key, count, q);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
}
|
|
|
|
out:
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
2014-10-24 20:02:44 +00:00
|
|
|
#if defined(COMPAT_FREEBSD9) || defined(COMPAT_FREEBSD10)
|
2010-01-04 05:27:49 +00:00
|
|
|
static int
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
do_sem_wait(struct thread *td, struct _usem *sem, struct _umtx_time *timeout)
|
2010-01-04 05:27:49 +00:00
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2010-01-04 05:27:49 +00:00
|
|
|
struct umtx_q *uq;
|
2014-10-28 15:30:33 +00:00
|
|
|
uint32_t flags, count, count1;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
int error, rv, rv1;
|
2010-01-04 05:27:49 +00:00
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&sem->_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2010-01-09 06:05:31 +00:00
|
|
|
error = umtx_key_get(sem, TYPE_SEM, GET_SHARE(flags), &uq->uq_key);
|
2010-01-04 05:27:49 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2012-03-30 05:40:26 +00:00
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
again:
|
2010-01-04 05:27:49 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&sem->_has_waiters, 0, &count1, 1);
|
|
|
|
if (rv == 0)
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
rv1 = fueword32(&sem->_count, &count);
|
2019-07-31 19:16:49 +00:00
|
|
|
if (rv == -1 || (rv == 0 && (rv1 == -1 || count != 0)) ||
|
|
|
|
(rv == 1 && count1 == 0)) {
|
2010-01-04 05:27:49 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, true);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0)
|
|
|
|
goto again;
|
|
|
|
error = rv;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (rv == 0)
|
|
|
|
rv = rv1;
|
|
|
|
error = rv == -1 ? EFAULT : 0;
|
|
|
|
goto out;
|
2010-01-04 05:27:49 +00:00
|
|
|
}
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
error = umtxq_sleep(uq, "usem", timeout == NULL ? NULL : &timo);
|
2010-01-04 05:27:49 +00:00
|
|
|
|
2010-08-25 03:14:32 +00:00
|
|
|
if ((uq->uq_flags & UQF_UMTXQ) == 0)
|
|
|
|
error = 0;
|
|
|
|
else {
|
|
|
|
umtxq_remove(uq);
|
2013-04-19 10:16:00 +00:00
|
|
|
/* A relative timeout cannot be restarted. */
|
|
|
|
if (error == ERESTART && timeout != NULL &&
|
|
|
|
(timeout->_flags & UMTX_ABSTIME) == 0)
|
2010-01-04 05:27:49 +00:00
|
|
|
error = EINTR;
|
|
|
|
}
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
out:
|
2010-01-04 05:27:49 +00:00
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2014-10-24 20:02:44 +00:00
|
|
|
* Signal a userland semaphore.
|
2010-01-04 05:27:49 +00:00
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_sem_wake(struct thread *td, struct _usem *sem)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
2012-04-05 03:05:02 +00:00
|
|
|
int error, cnt;
|
2010-01-04 05:27:49 +00:00
|
|
|
uint32_t flags;
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
error = fueword32(&sem->_flags, &flags);
|
|
|
|
if (error == -1)
|
|
|
|
return (EFAULT);
|
2010-01-09 06:05:31 +00:00
|
|
|
if ((error = umtx_key_get(sem, TYPE_SEM, GET_SHARE(flags), &key)) != 0)
|
2020-01-02 22:07:03 +00:00
|
|
|
return (error);
|
2010-01-04 05:27:49 +00:00
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
cnt = umtxq_count(&key);
|
2012-04-05 03:05:02 +00:00
|
|
|
if (cnt > 0) {
|
|
|
|
/*
|
|
|
|
* Check if count is greater than 0, this means the memory is
|
|
|
|
* still being referenced by user code, so we can safely
|
|
|
|
* update _has_waiters flag.
|
|
|
|
*/
|
|
|
|
if (cnt == 1) {
|
|
|
|
umtxq_unlock(&key);
|
2014-10-31 17:43:21 +00:00
|
|
|
error = suword32(&sem->_has_waiters, 0);
|
2012-04-05 03:05:02 +00:00
|
|
|
umtxq_lock(&key);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (error == -1)
|
|
|
|
error = EFAULT;
|
2012-04-05 03:05:02 +00:00
|
|
|
}
|
2016-08-15 20:09:09 +00:00
|
|
|
umtxq_signal(&key, 1);
|
2010-01-04 05:27:49 +00:00
|
|
|
}
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
return (error);
|
|
|
|
}
|
2014-10-24 20:02:44 +00:00
|
|
|
#endif
|
|
|
|
|
|
|
|
static int
|
|
|
|
do_sem2_wait(struct thread *td, struct _usem2 *sem, struct _umtx_time *timeout)
|
|
|
|
{
|
2021-07-29 12:41:58 +03:00
|
|
|
struct umtx_abs_timeout timo;
|
2014-10-24 20:02:44 +00:00
|
|
|
struct umtx_q *uq;
|
|
|
|
uint32_t count, flags;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error, rv;
|
2014-10-24 20:02:44 +00:00
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
flags = fuword32(&sem->_flags);
|
|
|
|
if (timeout != NULL)
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_init2(&timo, timeout);
|
2014-10-24 20:02:44 +00:00
|
|
|
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
again:
|
2019-07-15 19:18:25 +00:00
|
|
|
error = umtx_key_get(sem, TYPE_SEM, GET_SHARE(flags), &uq->uq_key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
2014-10-24 20:02:44 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_busy(&uq->uq_key);
|
|
|
|
umtxq_insert(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&sem->_count, &count);
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1) {
|
2014-10-24 20:02:44 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (EFAULT);
|
|
|
|
}
|
|
|
|
for (;;) {
|
|
|
|
if (USEM_COUNT(count) != 0) {
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
if (count == USEM_HAS_WAITERS)
|
|
|
|
break;
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&sem->_count, 0, &count, USEM_HAS_WAITERS);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 0)
|
2014-10-24 20:02:44 +00:00
|
|
|
break;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
umtxq_remove(uq);
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, true);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv != 0)
|
|
|
|
return (rv);
|
|
|
|
goto again;
|
2014-10-24 20:02:44 +00:00
|
|
|
}
|
|
|
|
umtxq_lock(&uq->uq_key);
|
|
|
|
umtxq_unbusy(&uq->uq_key);
|
|
|
|
|
|
|
|
error = umtxq_sleep(uq, "usem", timeout == NULL ? NULL : &timo);
|
|
|
|
|
|
|
|
if ((uq->uq_flags & UQF_UMTXQ) == 0)
|
|
|
|
error = 0;
|
|
|
|
else {
|
|
|
|
umtxq_remove(uq);
|
2017-02-23 19:36:38 +00:00
|
|
|
if (timeout != NULL && (timeout->_flags & UMTX_ABSTIME) == 0) {
|
|
|
|
/* A relative timeout cannot be restarted. */
|
|
|
|
if (error == ERESTART)
|
|
|
|
error = EINTR;
|
|
|
|
if (error == EINTR) {
|
2021-07-29 12:41:58 +03:00
|
|
|
umtx_abs_timeout_update(&timo);
|
Make timespecadd(3) and friends public
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
2018-07-30 15:46:40 +00:00
|
|
|
timespecsub(&timo.end, &timo.cur,
|
|
|
|
&timeout->_timeout);
|
2017-02-23 19:36:38 +00:00
|
|
|
}
|
|
|
|
}
|
2014-10-24 20:02:44 +00:00
|
|
|
}
|
|
|
|
umtxq_unlock(&uq->uq_key);
|
|
|
|
umtx_key_release(&uq->uq_key);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Signal a userland semaphore.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
do_sem2_wake(struct thread *td, struct _usem2 *sem)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
2014-10-28 15:30:33 +00:00
|
|
|
int error, cnt, rv;
|
2014-10-24 20:02:44 +00:00
|
|
|
uint32_t count, flags;
|
|
|
|
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = fueword32(&sem->_flags, &flags);
|
|
|
|
if (rv == -1)
|
|
|
|
return (EFAULT);
|
2014-10-24 20:02:44 +00:00
|
|
|
if ((error = umtx_key_get(sem, TYPE_SEM, GET_SHARE(flags), &key)) != 0)
|
2020-01-02 22:07:03 +00:00
|
|
|
return (error);
|
2014-10-24 20:02:44 +00:00
|
|
|
umtxq_lock(&key);
|
|
|
|
umtxq_busy(&key);
|
|
|
|
cnt = umtxq_count(&key);
|
|
|
|
if (cnt > 0) {
|
|
|
|
/*
|
|
|
|
* If this was the last sleeping thread, clear the waiters
|
|
|
|
* flag in _count.
|
|
|
|
*/
|
|
|
|
if (cnt == 1) {
|
|
|
|
umtxq_unlock(&key);
|
2014-10-31 17:43:21 +00:00
|
|
|
rv = fueword32(&sem->_count, &count);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
while (rv != -1 && count & USEM_HAS_WAITERS) {
|
2014-10-28 15:30:33 +00:00
|
|
|
rv = casueword32(&sem->_count, count, &count,
|
2014-10-24 20:02:44 +00:00
|
|
|
count & ~USEM_HAS_WAITERS);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv == 1) {
|
2020-01-02 22:13:59 +00:00
|
|
|
rv = thread_check_susp(td, true);
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
if (rv != 0)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
2014-10-28 15:30:33 +00:00
|
|
|
if (rv == -1)
|
2014-10-24 20:02:44 +00:00
|
|
|
error = EFAULT;
|
Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely. Otherwise, rogue userspace livelocks the
kernel.
To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison. The primitive no longer retries
the operation if it failed spuriously. Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.
On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.
Reviewed by: andrew (arm64, previous version), markj
Tested by: pho
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
|
|
|
else if (rv > 0) {
|
|
|
|
error = rv;
|
|
|
|
}
|
2014-10-24 20:02:44 +00:00
|
|
|
umtxq_lock(&key);
|
|
|
|
}
|
2016-08-15 20:09:09 +00:00
|
|
|
|
|
|
|
umtxq_signal(&key, 1);
|
2014-10-24 20:02:44 +00:00
|
|
|
}
|
|
|
|
umtxq_unbusy(&key);
|
|
|
|
umtxq_unlock(&key);
|
|
|
|
umtx_key_release(&key);
|
|
|
|
return (error);
|
|
|
|
}
|
2010-01-04 05:27:49 +00:00
|
|
|
|
2021-07-19 18:40:38 +03:00
|
|
|
#ifdef COMPAT_FREEBSD10
|
|
|
|
int
|
|
|
|
freebsd10__umtx_lock(struct thread *td, struct freebsd10__umtx_lock_args *uap)
|
|
|
|
{
|
|
|
|
return (do_lock_umtx(td, uap->umtx, td->td_tid, 0));
|
|
|
|
}
|
|
|
|
|
|
|
|
int
|
|
|
|
freebsd10__umtx_unlock(struct thread *td,
|
|
|
|
struct freebsd10__umtx_unlock_args *uap)
|
|
|
|
{
|
|
|
|
return (do_unlock_umtx(td, uap->umtx, td->td_tid));
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2011-12-03 12:30:58 +00:00
|
|
|
inline int
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_copyin_timeout(const void *uaddr, struct timespec *tsp)
|
2011-12-03 12:30:58 +00:00
|
|
|
{
|
|
|
|
int error;
|
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
error = copyin(uaddr, tsp, sizeof(*tsp));
|
2011-12-03 12:30:58 +00:00
|
|
|
if (error == 0) {
|
|
|
|
if (tsp->tv_sec < 0 ||
|
|
|
|
tsp->tv_nsec >= 1000000000 ||
|
|
|
|
tsp->tv_nsec < 0)
|
|
|
|
error = EINVAL;
|
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
static inline int
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_copyin_umtx_time(const void *uaddr, size_t size, struct _umtx_time *tp)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
{
|
|
|
|
int error;
|
2020-01-02 22:07:03 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
if (size <= sizeof(tp->_timeout)) {
|
2012-02-29 02:01:48 +00:00
|
|
|
tp->_clockid = CLOCK_REALTIME;
|
|
|
|
tp->_flags = 0;
|
2020-11-17 03:36:58 +00:00
|
|
|
error = copyin(uaddr, &tp->_timeout, sizeof(tp->_timeout));
|
2020-01-02 22:07:03 +00:00
|
|
|
} else
|
2020-11-17 03:36:58 +00:00
|
|
|
error = copyin(uaddr, tp, sizeof(*tp));
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
if (tp->_timeout.tv_sec < 0 ||
|
|
|
|
tp->_timeout.tv_nsec >= 1000000000 || tp->_timeout.tv_nsec < 0)
|
|
|
|
return (EINVAL);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
2006-09-22 00:52:54 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_copyin_robust_lists(const void *uaddr, size_t size,
|
|
|
|
struct umtx_robust_lists_params *rb)
|
|
|
|
{
|
|
|
|
|
|
|
|
if (size > sizeof(*rb))
|
|
|
|
return (EINVAL);
|
2020-11-18 03:30:31 +00:00
|
|
|
return (copyin(uaddr, rb, size));
|
2020-11-17 03:36:58 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
umtx_copyout_timeout(void *uaddr, size_t sz, struct timespec *tsp)
|
|
|
|
{
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Should be guaranteed by the caller, sz == uaddr1 - sizeof(_umtx_time)
|
|
|
|
* and we're only called if sz >= sizeof(timespec) as supplied in the
|
|
|
|
* copyops.
|
|
|
|
*/
|
|
|
|
KASSERT(sz >= sizeof(*tsp),
|
|
|
|
("umtx_copyops specifies incorrect sizes"));
|
|
|
|
|
|
|
|
return (copyout(tsp, uaddr, sizeof(*tsp)));
|
|
|
|
}
|
|
|
|
|
2021-07-19 18:40:38 +03:00
|
|
|
#ifdef COMPAT_FREEBSD10
|
2020-11-17 03:36:58 +00:00
|
|
|
static int
|
2021-07-19 18:40:38 +03:00
|
|
|
__umtx_op_lock_umtx(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
|
|
|
{
|
|
|
|
struct timespec *ts, timeout;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL)
|
|
|
|
ts = NULL;
|
|
|
|
else {
|
|
|
|
error = ops->copyin_timeout(uap->uaddr2, &timeout);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
ts = &timeout;
|
|
|
|
}
|
|
|
|
#ifdef COMPAT_FREEBSD32
|
|
|
|
if (ops->compat32)
|
|
|
|
return (do_lock_umtx32(td, uap->obj, uap->val, ts));
|
|
|
|
#endif
|
|
|
|
return (do_lock_umtx(td, uap->obj, uap->val, ts));
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
__umtx_op_unlock_umtx(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
2021-07-19 18:40:38 +03:00
|
|
|
#ifdef COMPAT_FREEBSD32
|
|
|
|
if (ops->compat32)
|
|
|
|
return (do_unlock_umtx32(td, uap->obj, uap->val));
|
|
|
|
#endif
|
|
|
|
return (do_unlock_umtx(td, uap->obj, uap->val));
|
|
|
|
}
|
|
|
|
#endif /* COMPAT_FREEBSD10 */
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
2021-07-19 18:40:38 +03:00
|
|
|
#if !defined(COMPAT_FREEBSD10)
|
|
|
|
static int
|
|
|
|
__umtx_op_unimpl(struct thread *td __unused, struct _umtx_op_args *uap __unused,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
|
|
|
{
|
2014-03-18 21:32:03 +00:00
|
|
|
return (EOPNOTSUPP);
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
2021-07-19 18:40:38 +03:00
|
|
|
#endif /* COMPAT_FREEBSD10 */
|
2006-09-22 00:52:54 +00:00
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wait(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
struct _umtx_time timeout, *tm_p;
|
2006-09-22 00:52:54 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
if (uap->uaddr2 == NULL)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = NULL;
|
2006-09-22 00:52:54 +00:00
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
uap->uaddr2, (size_t)uap->uaddr1, &timeout);
|
2006-09-22 00:52:54 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = &timeout;
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
2020-11-17 03:36:58 +00:00
|
|
|
return (do_wait(td, uap->obj, uap->val, tm_p, ops->compat32, 0));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
2007-11-21 04:21:02 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wait_uint(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2007-11-21 04:21:02 +00:00
|
|
|
{
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
struct _umtx_time timeout, *tm_p;
|
2007-11-21 04:21:02 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
if (uap->uaddr2 == NULL)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = NULL;
|
2007-11-21 04:21:02 +00:00
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
uap->uaddr2, (size_t)uap->uaddr1, &timeout);
|
2007-11-21 04:21:02 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = &timeout;
|
2007-11-21 04:21:02 +00:00
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_wait(td, uap->obj, uap->val, tm_p, 1, 0));
|
2008-04-29 03:48:48 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wait_uint_private(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2008-04-29 03:48:48 +00:00
|
|
|
{
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
struct _umtx_time *tm_p, timeout;
|
2008-04-29 03:48:48 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
if (uap->uaddr2 == NULL)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = NULL;
|
2008-04-29 03:48:48 +00:00
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
uap->uaddr2, (size_t)uap->uaddr1, &timeout);
|
2008-04-29 03:48:48 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = &timeout;
|
2008-04-29 03:48:48 +00:00
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_wait(td, uap->obj, uap->val, tm_p, 1, 1));
|
2007-11-21 04:21:02 +00:00
|
|
|
}
|
|
|
|
|
2006-09-22 00:52:54 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wake(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2008-04-29 03:48:48 +00:00
|
|
|
return (kern_umtx_wake(td, uap->obj, uap->val, 0));
|
|
|
|
}
|
|
|
|
|
2010-12-22 05:01:52 +00:00
|
|
|
#define BATCH_SIZE 128
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_nwake_private_native(struct thread *td, struct _umtx_op_args *uap)
|
2010-12-22 05:01:52 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
char *uaddrs[BATCH_SIZE], **upp;
|
|
|
|
int count, error, i, pos, tocopy;
|
2010-12-22 05:01:52 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
upp = (char **)uap->obj;
|
|
|
|
error = 0;
|
|
|
|
for (count = uap->val, pos = 0; count > 0; count -= tocopy,
|
|
|
|
pos += tocopy) {
|
|
|
|
tocopy = MIN(count, BATCH_SIZE);
|
|
|
|
error = copyin(upp + pos, uaddrs, tocopy * sizeof(char *));
|
2010-12-22 05:01:52 +00:00
|
|
|
if (error != 0)
|
|
|
|
break;
|
2020-11-17 03:36:58 +00:00
|
|
|
for (i = 0; i < tocopy; ++i) {
|
2010-12-22 05:01:52 +00:00
|
|
|
kern_umtx_wake(td, uaddrs[i], INT_MAX, 1);
|
2020-11-17 03:36:58 +00:00
|
|
|
}
|
|
|
|
maybe_yield();
|
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
__umtx_op_nwake_private_compat32(struct thread *td, struct _umtx_op_args *uap)
|
|
|
|
{
|
|
|
|
uint32_t uaddrs[BATCH_SIZE], *upp;
|
|
|
|
int count, error, i, pos, tocopy;
|
|
|
|
|
|
|
|
upp = (uint32_t *)uap->obj;
|
|
|
|
error = 0;
|
|
|
|
for (count = uap->val, pos = 0; count > 0; count -= tocopy,
|
|
|
|
pos += tocopy) {
|
|
|
|
tocopy = MIN(count, BATCH_SIZE);
|
|
|
|
error = copyin(upp + pos, uaddrs, tocopy * sizeof(uint32_t));
|
|
|
|
if (error != 0)
|
|
|
|
break;
|
|
|
|
for (i = 0; i < tocopy; ++i) {
|
|
|
|
kern_umtx_wake(td, (void *)(uintptr_t)uaddrs[i],
|
|
|
|
INT_MAX, 1);
|
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
maybe_yield();
|
2010-12-22 05:01:52 +00:00
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
2008-04-29 03:48:48 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_nwake_private(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
|
|
|
{
|
|
|
|
|
|
|
|
if (ops->compat32)
|
|
|
|
return (__umtx_op_nwake_private_compat32(td, uap));
|
|
|
|
return (__umtx_op_nwake_private_native(td, uap));
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
__umtx_op_wake_private(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2008-04-29 03:48:48 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2008-04-29 03:48:48 +00:00
|
|
|
return (kern_umtx_wake(td, uap->obj, uap->val, 1));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_lock_umutex(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
struct _umtx_time *tm_p, timeout;
|
2006-09-22 00:52:54 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = NULL;
|
2006-09-22 00:52:54 +00:00
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
uap->uaddr2, (size_t)uap->uaddr1, &timeout);
|
2006-09-22 00:52:54 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = &timeout;
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_lock_umutex(td, uap->obj, tm_p, 0));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_trylock_umutex(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_lock_umutex(td, uap->obj, NULL, _UMUTEX_TRY));
|
2008-06-24 07:32:12 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wait_umutex(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2008-06-24 07:32:12 +00:00
|
|
|
{
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
struct _umtx_time *tm_p, timeout;
|
2008-06-24 07:32:12 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = NULL;
|
2008-06-24 07:32:12 +00:00
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
uap->uaddr2, (size_t)uap->uaddr1, &timeout);
|
2008-06-24 07:32:12 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = &timeout;
|
2008-06-24 07:32:12 +00:00
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
return (do_lock_umutex(td, uap->obj, tm_p, _UMUTEX_WAIT));
|
2008-06-24 07:32:12 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wake_umutex(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2008-06-24 07:32:12 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_wake_umutex(td, uap->obj));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_unlock_umutex(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_unlock_umutex(td, uap->obj, false));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_set_ceiling(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_set_ceiling(td, uap->obj, uap->val, uap->uaddr1));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
2006-12-03 01:49:22 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_cv_wait(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2006-12-03 01:49:22 +00:00
|
|
|
{
|
|
|
|
struct timespec *ts, timeout;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL)
|
|
|
|
ts = NULL;
|
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_timeout(uap->uaddr2, &timeout);
|
2006-12-03 01:49:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
ts = &timeout;
|
|
|
|
}
|
2006-12-04 14:15:12 +00:00
|
|
|
return (do_cv_wait(td, uap->obj, uap->uaddr1, ts, uap->val));
|
2006-12-03 01:49:22 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_cv_signal(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2006-12-03 01:49:22 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_cv_signal(td, uap->obj));
|
2006-12-03 01:49:22 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_cv_broadcast(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2006-12-03 01:49:22 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_cv_broadcast(td, uap->obj));
|
2006-12-03 01:49:22 +00:00
|
|
|
}
|
|
|
|
|
2008-04-02 04:08:37 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_rw_rdlock(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2008-04-02 04:08:37 +00:00
|
|
|
{
|
2012-02-27 13:38:52 +00:00
|
|
|
struct _umtx_time timeout;
|
2008-04-02 04:08:37 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL) {
|
|
|
|
error = do_rw_rdlock(td, uap->obj, uap->val, 0);
|
|
|
|
} else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(uap->uaddr2,
|
2012-02-27 13:38:52 +00:00
|
|
|
(size_t)uap->uaddr1, &timeout);
|
2008-04-02 04:08:37 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
2012-03-30 05:40:26 +00:00
|
|
|
error = do_rw_rdlock(td, uap->obj, uap->val, &timeout);
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_rw_wrlock(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2008-04-02 04:08:37 +00:00
|
|
|
{
|
2012-02-27 13:38:52 +00:00
|
|
|
struct _umtx_time timeout;
|
2008-04-02 04:08:37 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL) {
|
|
|
|
error = do_rw_wrlock(td, uap->obj, 0);
|
|
|
|
} else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(uap->uaddr2,
|
2012-02-27 13:38:52 +00:00
|
|
|
(size_t)uap->uaddr1, &timeout);
|
2008-04-02 04:08:37 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
|
2012-03-30 05:40:26 +00:00
|
|
|
error = do_rw_wrlock(td, uap->obj, &timeout);
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_rw_unlock(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2008-04-02 04:08:37 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
|
|
|
return (do_rw_unlock(td, uap->obj));
|
2008-04-02 04:08:37 +00:00
|
|
|
}
|
|
|
|
|
2014-10-24 20:02:44 +00:00
|
|
|
#if defined(COMPAT_FREEBSD9) || defined(COMPAT_FREEBSD10)
|
2010-01-04 05:27:49 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_sem_wait(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2010-01-04 05:27:49 +00:00
|
|
|
{
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
struct _umtx_time *tm_p, timeout;
|
2010-01-04 05:27:49 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
|
|
|
if (uap->uaddr2 == NULL)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = NULL;
|
2010-01-04 05:27:49 +00:00
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
uap->uaddr2, (size_t)uap->uaddr1, &timeout);
|
2010-01-04 05:27:49 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
tm_p = &timeout;
|
2010-01-04 05:27:49 +00:00
|
|
|
}
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
return (do_sem_wait(td, uap->obj, tm_p));
|
2010-01-04 05:27:49 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_sem_wake(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2010-01-04 05:27:49 +00:00
|
|
|
{
|
2015-10-30 20:47:42 +00:00
|
|
|
|
|
|
|
return (do_sem_wake(td, uap->obj));
|
2010-01-04 05:27:49 +00:00
|
|
|
}
|
2014-10-24 20:02:44 +00:00
|
|
|
#endif
|
2010-01-04 05:27:49 +00:00
|
|
|
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_wake2_umutex(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
{
|
2015-10-30 20:47:42 +00:00
|
|
|
|
|
|
|
return (do_wake2_umutex(td, uap->obj, uap->val));
|
umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
|
|
|
}
|
|
|
|
|
2014-10-24 20:02:44 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_sem2_wait(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
2014-10-24 20:02:44 +00:00
|
|
|
{
|
|
|
|
struct _umtx_time *tm_p, timeout;
|
2017-02-23 19:36:38 +00:00
|
|
|
size_t uasize;
|
2014-10-24 20:02:44 +00:00
|
|
|
int error;
|
|
|
|
|
|
|
|
/* Allow a null timespec (wait forever). */
|
2017-02-23 19:36:38 +00:00
|
|
|
if (uap->uaddr2 == NULL) {
|
|
|
|
uasize = 0;
|
2014-10-24 20:02:44 +00:00
|
|
|
tm_p = NULL;
|
2017-02-23 19:36:38 +00:00
|
|
|
} else {
|
|
|
|
uasize = (size_t)uap->uaddr1;
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_umtx_time(uap->uaddr2, uasize, &timeout);
|
2014-10-24 20:02:44 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
tm_p = &timeout;
|
|
|
|
}
|
2017-02-23 19:36:38 +00:00
|
|
|
error = do_sem2_wait(td, uap->obj, tm_p);
|
|
|
|
if (error == EINTR && uap->uaddr2 != NULL &&
|
|
|
|
(timeout._flags & UMTX_ABSTIME) == 0 &&
|
2020-11-17 03:36:58 +00:00
|
|
|
uasize >= ops->umtx_time_sz + ops->timespec_sz) {
|
|
|
|
error = ops->copyout_timeout(
|
|
|
|
(void *)((uintptr_t)uap->uaddr2 + ops->umtx_time_sz),
|
|
|
|
uasize - ops->umtx_time_sz, &timeout._timeout);
|
2017-02-23 19:36:38 +00:00
|
|
|
if (error == 0) {
|
|
|
|
error = EINTR;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return (error);
|
2014-10-24 20:02:44 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_sem2_wake(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2014-10-24 20:02:44 +00:00
|
|
|
{
|
2015-10-30 20:47:42 +00:00
|
|
|
|
|
|
|
return (do_sem2_wake(td, uap->obj));
|
2014-10-24 20:02:44 +00:00
|
|
|
}
|
|
|
|
|
2016-02-28 17:52:33 +00:00
|
|
|
#define USHM_OBJ_UMTX(o) \
|
|
|
|
((struct umtx_shm_obj_list *)(&(o)->umtx_data))
|
|
|
|
|
|
|
|
#define USHMF_REG_LINKED 0x0001
|
|
|
|
#define USHMF_OBJ_LINKED 0x0002
|
|
|
|
struct umtx_shm_reg {
|
|
|
|
TAILQ_ENTRY(umtx_shm_reg) ushm_reg_link;
|
|
|
|
LIST_ENTRY(umtx_shm_reg) ushm_obj_link;
|
|
|
|
struct umtx_key ushm_key;
|
|
|
|
struct ucred *ushm_cred;
|
|
|
|
struct shmfd *ushm_obj;
|
|
|
|
u_int ushm_refcnt;
|
|
|
|
u_int ushm_flags;
|
|
|
|
};
|
|
|
|
|
|
|
|
LIST_HEAD(umtx_shm_obj_list, umtx_shm_reg);
|
|
|
|
TAILQ_HEAD(umtx_shm_reg_head, umtx_shm_reg);
|
|
|
|
|
|
|
|
static uma_zone_t umtx_shm_reg_zone;
|
|
|
|
static struct umtx_shm_reg_head umtx_shm_registry[UMTX_CHAINS];
|
|
|
|
static struct mtx umtx_shm_lock;
|
|
|
|
static struct umtx_shm_reg_head umtx_shm_reg_delfree =
|
|
|
|
TAILQ_HEAD_INITIALIZER(umtx_shm_reg_delfree);
|
|
|
|
|
|
|
|
static void umtx_shm_free_reg(struct umtx_shm_reg *reg);
|
|
|
|
|
|
|
|
static void
|
|
|
|
umtx_shm_reg_delfree_tq(void *context __unused, int pending __unused)
|
|
|
|
{
|
|
|
|
struct umtx_shm_reg_head d;
|
|
|
|
struct umtx_shm_reg *reg, *reg1;
|
|
|
|
|
|
|
|
TAILQ_INIT(&d);
|
|
|
|
mtx_lock(&umtx_shm_lock);
|
|
|
|
TAILQ_CONCAT(&d, &umtx_shm_reg_delfree, ushm_reg_link);
|
|
|
|
mtx_unlock(&umtx_shm_lock);
|
|
|
|
TAILQ_FOREACH_SAFE(reg, &d, ushm_reg_link, reg1) {
|
|
|
|
TAILQ_REMOVE(&d, reg, ushm_reg_link);
|
|
|
|
umtx_shm_free_reg(reg);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct task umtx_shm_reg_delfree_task =
|
|
|
|
TASK_INITIALIZER(0, umtx_shm_reg_delfree_tq, NULL);
|
|
|
|
|
|
|
|
static struct umtx_shm_reg *
|
|
|
|
umtx_shm_find_reg_locked(const struct umtx_key *key)
|
|
|
|
{
|
|
|
|
struct umtx_shm_reg *reg;
|
|
|
|
struct umtx_shm_reg_head *reg_head;
|
|
|
|
|
|
|
|
KASSERT(key->shared, ("umtx_p_find_rg: private key"));
|
|
|
|
mtx_assert(&umtx_shm_lock, MA_OWNED);
|
|
|
|
reg_head = &umtx_shm_registry[key->hash];
|
|
|
|
TAILQ_FOREACH(reg, reg_head, ushm_reg_link) {
|
|
|
|
KASSERT(reg->ushm_key.shared,
|
|
|
|
("non-shared key on reg %p %d", reg, reg->ushm_key.shared));
|
|
|
|
if (reg->ushm_key.info.shared.object ==
|
|
|
|
key->info.shared.object &&
|
|
|
|
reg->ushm_key.info.shared.offset ==
|
|
|
|
key->info.shared.offset) {
|
|
|
|
KASSERT(reg->ushm_key.type == TYPE_SHM, ("TYPE_USHM"));
|
|
|
|
KASSERT(reg->ushm_refcnt > 0,
|
|
|
|
("reg %p refcnt 0 onlist", reg));
|
|
|
|
KASSERT((reg->ushm_flags & USHMF_REG_LINKED) != 0,
|
|
|
|
("reg %p not linked", reg));
|
|
|
|
reg->ushm_refcnt++;
|
|
|
|
return (reg);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct umtx_shm_reg *
|
|
|
|
umtx_shm_find_reg(const struct umtx_key *key)
|
|
|
|
{
|
|
|
|
struct umtx_shm_reg *reg;
|
|
|
|
|
|
|
|
mtx_lock(&umtx_shm_lock);
|
|
|
|
reg = umtx_shm_find_reg_locked(key);
|
|
|
|
mtx_unlock(&umtx_shm_lock);
|
|
|
|
return (reg);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
umtx_shm_free_reg(struct umtx_shm_reg *reg)
|
|
|
|
{
|
|
|
|
|
|
|
|
chgumtxcnt(reg->ushm_cred->cr_ruidinfo, -1, 0);
|
|
|
|
crfree(reg->ushm_cred);
|
|
|
|
shm_drop(reg->ushm_obj);
|
|
|
|
uma_zfree(umtx_shm_reg_zone, reg);
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool
|
|
|
|
umtx_shm_unref_reg_locked(struct umtx_shm_reg *reg, bool force)
|
|
|
|
{
|
|
|
|
bool res;
|
|
|
|
|
|
|
|
mtx_assert(&umtx_shm_lock, MA_OWNED);
|
|
|
|
KASSERT(reg->ushm_refcnt > 0, ("ushm_reg %p refcnt 0", reg));
|
|
|
|
reg->ushm_refcnt--;
|
|
|
|
res = reg->ushm_refcnt == 0;
|
|
|
|
if (res || force) {
|
|
|
|
if ((reg->ushm_flags & USHMF_REG_LINKED) != 0) {
|
|
|
|
TAILQ_REMOVE(&umtx_shm_registry[reg->ushm_key.hash],
|
|
|
|
reg, ushm_reg_link);
|
|
|
|
reg->ushm_flags &= ~USHMF_REG_LINKED;
|
|
|
|
}
|
|
|
|
if ((reg->ushm_flags & USHMF_OBJ_LINKED) != 0) {
|
|
|
|
LIST_REMOVE(reg, ushm_obj_link);
|
|
|
|
reg->ushm_flags &= ~USHMF_OBJ_LINKED;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (res);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
umtx_shm_unref_reg(struct umtx_shm_reg *reg, bool force)
|
|
|
|
{
|
|
|
|
vm_object_t object;
|
|
|
|
bool dofree;
|
|
|
|
|
|
|
|
if (force) {
|
|
|
|
object = reg->ushm_obj->shm_object;
|
|
|
|
VM_OBJECT_WLOCK(object);
|
|
|
|
object->flags |= OBJ_UMTXDEAD;
|
|
|
|
VM_OBJECT_WUNLOCK(object);
|
|
|
|
}
|
|
|
|
mtx_lock(&umtx_shm_lock);
|
|
|
|
dofree = umtx_shm_unref_reg_locked(reg, force);
|
|
|
|
mtx_unlock(&umtx_shm_lock);
|
|
|
|
if (dofree)
|
|
|
|
umtx_shm_free_reg(reg);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
umtx_shm_object_init(vm_object_t object)
|
|
|
|
{
|
|
|
|
|
|
|
|
LIST_INIT(USHM_OBJ_UMTX(object));
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
umtx_shm_object_terminated(vm_object_t object)
|
|
|
|
{
|
|
|
|
struct umtx_shm_reg *reg, *reg1;
|
|
|
|
bool dofree;
|
|
|
|
|
2018-12-08 14:04:57 +00:00
|
|
|
if (LIST_EMPTY(USHM_OBJ_UMTX(object)))
|
|
|
|
return;
|
|
|
|
|
2016-02-28 17:52:33 +00:00
|
|
|
dofree = false;
|
|
|
|
mtx_lock(&umtx_shm_lock);
|
|
|
|
LIST_FOREACH_SAFE(reg, USHM_OBJ_UMTX(object), ushm_obj_link, reg1) {
|
|
|
|
if (umtx_shm_unref_reg_locked(reg, true)) {
|
|
|
|
TAILQ_INSERT_TAIL(&umtx_shm_reg_delfree, reg,
|
|
|
|
ushm_reg_link);
|
|
|
|
dofree = true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
mtx_unlock(&umtx_shm_lock);
|
|
|
|
if (dofree)
|
|
|
|
taskqueue_enqueue(taskqueue_thread, &umtx_shm_reg_delfree_task);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
umtx_shm_create_reg(struct thread *td, const struct umtx_key *key,
|
|
|
|
struct umtx_shm_reg **res)
|
|
|
|
{
|
|
|
|
struct umtx_shm_reg *reg, *reg1;
|
|
|
|
struct ucred *cred;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
reg = umtx_shm_find_reg(key);
|
|
|
|
if (reg != NULL) {
|
|
|
|
*res = reg;
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
cred = td->td_ucred;
|
|
|
|
if (!chgumtxcnt(cred->cr_ruidinfo, 1, lim_cur(td, RLIMIT_UMTXP)))
|
|
|
|
return (ENOMEM);
|
|
|
|
reg = uma_zalloc(umtx_shm_reg_zone, M_WAITOK | M_ZERO);
|
|
|
|
reg->ushm_refcnt = 1;
|
|
|
|
bcopy(key, ®->ushm_key, sizeof(*key));
|
2020-09-09 22:12:51 +00:00
|
|
|
reg->ushm_obj = shm_alloc(td->td_ucred, O_RDWR, false);
|
2016-02-28 17:52:33 +00:00
|
|
|
reg->ushm_cred = crhold(cred);
|
|
|
|
error = shm_dotruncate(reg->ushm_obj, PAGE_SIZE);
|
|
|
|
if (error != 0) {
|
|
|
|
umtx_shm_free_reg(reg);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
mtx_lock(&umtx_shm_lock);
|
|
|
|
reg1 = umtx_shm_find_reg_locked(key);
|
|
|
|
if (reg1 != NULL) {
|
|
|
|
mtx_unlock(&umtx_shm_lock);
|
|
|
|
umtx_shm_free_reg(reg);
|
|
|
|
*res = reg1;
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
reg->ushm_refcnt++;
|
|
|
|
TAILQ_INSERT_TAIL(&umtx_shm_registry[key->hash], reg, ushm_reg_link);
|
|
|
|
LIST_INSERT_HEAD(USHM_OBJ_UMTX(key->info.shared.object), reg,
|
|
|
|
ushm_obj_link);
|
|
|
|
reg->ushm_flags = USHMF_REG_LINKED | USHMF_OBJ_LINKED;
|
|
|
|
mtx_unlock(&umtx_shm_lock);
|
|
|
|
*res = reg;
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
umtx_shm_alive(struct thread *td, void *addr)
|
|
|
|
{
|
|
|
|
vm_map_t map;
|
|
|
|
vm_map_entry_t entry;
|
|
|
|
vm_object_t object;
|
|
|
|
vm_pindex_t pindex;
|
|
|
|
vm_prot_t prot;
|
|
|
|
int res, ret;
|
|
|
|
boolean_t wired;
|
|
|
|
|
|
|
|
map = &td->td_proc->p_vmspace->vm_map;
|
|
|
|
res = vm_map_lookup(&map, (uintptr_t)addr, VM_PROT_READ, &entry,
|
|
|
|
&object, &pindex, &prot, &wired);
|
|
|
|
if (res != KERN_SUCCESS)
|
|
|
|
return (EFAULT);
|
|
|
|
if (object == NULL)
|
|
|
|
ret = EINVAL;
|
|
|
|
else
|
|
|
|
ret = (object->flags & OBJ_UMTXDEAD) != 0 ? ENOTTY : 0;
|
|
|
|
vm_map_lookup_done(map, entry);
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
umtx_shm_init(void)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
umtx_shm_reg_zone = uma_zcreate("umtx_shm", sizeof(struct umtx_shm_reg),
|
|
|
|
NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
|
|
|
|
mtx_init(&umtx_shm_lock, "umtxshm", NULL, MTX_DEF);
|
|
|
|
for (i = 0; i < nitems(umtx_shm_registry); i++)
|
|
|
|
TAILQ_INIT(&umtx_shm_registry[i]);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
umtx_shm(struct thread *td, void *addr, u_int flags)
|
|
|
|
{
|
|
|
|
struct umtx_key key;
|
|
|
|
struct umtx_shm_reg *reg;
|
|
|
|
struct file *fp;
|
|
|
|
int error, fd;
|
|
|
|
|
|
|
|
if (__bitcount(flags & (UMTX_SHM_CREAT | UMTX_SHM_LOOKUP |
|
|
|
|
UMTX_SHM_DESTROY| UMTX_SHM_ALIVE)) != 1)
|
|
|
|
return (EINVAL);
|
|
|
|
if ((flags & UMTX_SHM_ALIVE) != 0)
|
|
|
|
return (umtx_shm_alive(td, addr));
|
|
|
|
error = umtx_key_get(addr, TYPE_SHM, PROCESS_SHARE, &key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
KASSERT(key.shared == 1, ("non-shared key"));
|
|
|
|
if ((flags & UMTX_SHM_CREAT) != 0) {
|
|
|
|
error = umtx_shm_create_reg(td, &key, ®);
|
|
|
|
} else {
|
|
|
|
reg = umtx_shm_find_reg(&key);
|
|
|
|
if (reg == NULL)
|
|
|
|
error = ESRCH;
|
|
|
|
}
|
|
|
|
umtx_key_release(&key);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
KASSERT(reg != NULL, ("no reg"));
|
|
|
|
if ((flags & UMTX_SHM_DESTROY) != 0) {
|
|
|
|
umtx_shm_unref_reg(reg, true);
|
|
|
|
} else {
|
|
|
|
#if 0
|
|
|
|
#ifdef MAC
|
|
|
|
error = mac_posixshm_check_open(td->td_ucred,
|
|
|
|
reg->ushm_obj, FFLAGS(O_RDWR));
|
|
|
|
if (error == 0)
|
|
|
|
#endif
|
|
|
|
error = shm_access(reg->ushm_obj, td->td_ucred,
|
|
|
|
FFLAGS(O_RDWR));
|
|
|
|
if (error == 0)
|
|
|
|
#endif
|
|
|
|
error = falloc_caps(td, &fp, &fd, O_CLOEXEC, NULL);
|
|
|
|
if (error == 0) {
|
|
|
|
shm_hold(reg->ushm_obj);
|
|
|
|
finit(fp, FFLAGS(O_RDWR), DTYPE_SHM, reg->ushm_obj,
|
|
|
|
&shm_ops);
|
|
|
|
td->td_retval[0] = fd;
|
|
|
|
fdrop(fp, td);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
umtx_shm_unref_reg(reg, false);
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_shm(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops __unused)
|
2016-02-28 17:52:33 +00:00
|
|
|
{
|
|
|
|
|
|
|
|
return (umtx_shm(td, uap->uaddr1, uap->val));
|
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
__umtx_op_robust_lists(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *ops)
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
{
|
|
|
|
struct umtx_robust_lists_params rb;
|
|
|
|
int error;
|
|
|
|
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
if (ops->compat32) {
|
|
|
|
if ((td->td_pflags2 & TDP2_COMPAT32RB) == 0 &&
|
|
|
|
(td->td_rb_list != 0 || td->td_rbp_list != 0 ||
|
|
|
|
td->td_rb_inact != 0))
|
|
|
|
return (EBUSY);
|
|
|
|
} else if ((td->td_pflags2 & TDP2_COMPAT32RB) != 0) {
|
|
|
|
return (EBUSY);
|
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
bzero(&rb, sizeof(rb));
|
2020-11-17 03:36:58 +00:00
|
|
|
error = ops->copyin_robust_lists(uap->uaddr1, uap->val, &rb);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
2006-09-22 00:52:54 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
if (ops->compat32)
|
|
|
|
td->td_pflags2 |= TDP2_COMPAT32RB;
|
2015-10-30 19:20:40 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
td->td_rb_list = rb.robust_list_offset;
|
|
|
|
td->td_rbp_list = rb.robust_priv_list_offset;
|
|
|
|
td->td_rb_inact = rb.robust_inact_offset;
|
|
|
|
return (0);
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
#if defined(__i386__) || defined(__amd64__)
|
|
|
|
/*
|
|
|
|
* Provide the standard 32-bit definitions for x86, since native/compat32 use a
|
|
|
|
* 32-bit time_t there. Other architectures just need the i386 definitions
|
|
|
|
* along with their standard compat32.
|
|
|
|
*/
|
|
|
|
struct timespecx32 {
|
|
|
|
int64_t tv_sec;
|
|
|
|
int32_t tv_nsec;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct umtx_timex32 {
|
|
|
|
struct timespecx32 _timeout;
|
|
|
|
uint32_t _flags;
|
|
|
|
uint32_t _clockid;
|
|
|
|
};
|
|
|
|
|
|
|
|
#ifndef __i386__
|
|
|
|
#define timespeci386 timespec32
|
|
|
|
#define umtx_timei386 umtx_time32
|
|
|
|
#endif
|
|
|
|
#else /* !__i386__ && !__amd64__ */
|
|
|
|
/* 32-bit architectures can emulate i386, so define these almost everywhere. */
|
|
|
|
struct timespeci386 {
|
|
|
|
int32_t tv_sec;
|
|
|
|
int32_t tv_nsec;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct umtx_timei386 {
|
|
|
|
struct timespeci386 _timeout;
|
|
|
|
uint32_t _flags;
|
|
|
|
uint32_t _clockid;
|
|
|
|
};
|
|
|
|
|
|
|
|
#if defined(__LP64__)
|
|
|
|
#define timespecx32 timespec32
|
|
|
|
#define umtx_timex32 umtx_time32
|
|
|
|
#endif
|
|
|
|
#endif
|
|
|
|
|
|
|
|
static int
|
|
|
|
umtx_copyin_robust_lists32(const void *uaddr, size_t size,
|
|
|
|
struct umtx_robust_lists_params *rbp)
|
|
|
|
{
|
|
|
|
struct umtx_robust_lists_params_compat32 rb32;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
if (size > sizeof(rb32))
|
|
|
|
return (EINVAL);
|
|
|
|
bzero(&rb32, sizeof(rb32));
|
|
|
|
error = copyin(uaddr, &rb32, size);
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
CP(rb32, *rbp, robust_list_offset);
|
|
|
|
CP(rb32, *rbp, robust_priv_list_offset);
|
|
|
|
CP(rb32, *rbp, robust_inact_offset);
|
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
|
|
|
#ifndef __i386__
|
2006-09-22 00:52:54 +00:00
|
|
|
static inline int
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
umtx_copyin_timeouti386(const void *uaddr, struct timespec *tsp)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
struct timespeci386 ts32;
|
2006-09-22 00:52:54 +00:00
|
|
|
int error;
|
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
error = copyin(uaddr, &ts32, sizeof(ts32));
|
2006-09-22 00:52:54 +00:00
|
|
|
if (error == 0) {
|
2011-12-03 12:28:33 +00:00
|
|
|
if (ts32.tv_sec < 0 ||
|
|
|
|
ts32.tv_nsec >= 1000000000 ||
|
|
|
|
ts32.tv_nsec < 0)
|
|
|
|
error = EINVAL;
|
|
|
|
else {
|
2020-11-17 03:36:58 +00:00
|
|
|
CP(ts32, *tsp, tv_sec);
|
|
|
|
CP(ts32, *tsp, tv_nsec);
|
2011-12-03 12:28:33 +00:00
|
|
|
}
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
2005-01-18 13:53:10 +00:00
|
|
|
return (error);
|
2004-12-18 12:52:44 +00:00
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
static inline int
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
umtx_copyin_umtx_timei386(const void *uaddr, size_t size, struct _umtx_time *tp)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
{
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
struct umtx_timei386 t32;
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
int error;
|
2020-01-02 22:07:03 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
t32._clockid = CLOCK_REALTIME;
|
|
|
|
t32._flags = 0;
|
|
|
|
if (size <= sizeof(t32._timeout))
|
|
|
|
error = copyin(uaddr, &t32._timeout, sizeof(t32._timeout));
|
2020-01-02 22:07:03 +00:00
|
|
|
else
|
2020-11-17 03:36:58 +00:00
|
|
|
error = copyin(uaddr, &t32, sizeof(t32));
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
2020-11-17 03:36:58 +00:00
|
|
|
if (t32._timeout.tv_sec < 0 ||
|
|
|
|
t32._timeout.tv_nsec >= 1000000000 || t32._timeout.tv_nsec < 0)
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
return (EINVAL);
|
2020-11-17 03:36:58 +00:00
|
|
|
TS_CP(t32, *tp, _timeout);
|
|
|
|
CP(t32, *tp, _flags);
|
|
|
|
CP(t32, *tp, _clockid);
|
In revision 231989, we pass a 16-bit clock ID into kernel, however
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
2012-02-25 02:12:17 +00:00
|
|
|
return (0);
|
|
|
|
}
|
|
|
|
|
2006-09-22 00:52:54 +00:00
|
|
|
static int
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
umtx_copyout_timeouti386(void *uaddr, size_t sz, struct timespec *tsp)
|
2008-06-24 07:32:12 +00:00
|
|
|
{
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
struct timespeci386 remain32 = {
|
|
|
|
.tv_sec = tsp->tv_sec,
|
|
|
|
.tv_nsec = tsp->tv_nsec,
|
|
|
|
};
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Should be guaranteed by the caller, sz == uaddr1 - sizeof(_umtx_time)
|
|
|
|
* and we're only called if sz >= sizeof(timespec) as supplied in the
|
|
|
|
* copyops.
|
|
|
|
*/
|
|
|
|
KASSERT(sz >= sizeof(remain32),
|
|
|
|
("umtx_copyops specifies incorrect sizes"));
|
|
|
|
|
|
|
|
return (copyout(&remain32, uaddr, sizeof(remain32)));
|
|
|
|
}
|
|
|
|
#endif /* !__i386__ */
|
|
|
|
|
|
|
|
#if defined(__i386__) || defined(__LP64__)
|
|
|
|
static inline int
|
|
|
|
umtx_copyin_timeoutx32(const void *uaddr, struct timespec *tsp)
|
|
|
|
{
|
|
|
|
struct timespecx32 ts32;
|
2010-01-04 05:27:49 +00:00
|
|
|
int error;
|
|
|
|
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
error = copyin(uaddr, &ts32, sizeof(ts32));
|
|
|
|
if (error == 0) {
|
|
|
|
if (ts32.tv_sec < 0 ||
|
|
|
|
ts32.tv_nsec >= 1000000000 ||
|
|
|
|
ts32.tv_nsec < 0)
|
|
|
|
error = EINVAL;
|
|
|
|
else {
|
|
|
|
CP(ts32, *tsp, tv_sec);
|
|
|
|
CP(ts32, *tsp, tv_nsec);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline int
|
|
|
|
umtx_copyin_umtx_timex32(const void *uaddr, size_t size, struct _umtx_time *tp)
|
|
|
|
{
|
|
|
|
struct umtx_timex32 t32;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
t32._clockid = CLOCK_REALTIME;
|
|
|
|
t32._flags = 0;
|
|
|
|
if (size <= sizeof(t32._timeout))
|
|
|
|
error = copyin(uaddr, &t32._timeout, sizeof(t32._timeout));
|
|
|
|
else
|
|
|
|
error = copyin(uaddr, &t32, sizeof(t32));
|
2020-11-17 03:36:58 +00:00
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
if (t32._timeout.tv_sec < 0 ||
|
|
|
|
t32._timeout.tv_nsec >= 1000000000 || t32._timeout.tv_nsec < 0)
|
|
|
|
return (EINVAL);
|
|
|
|
TS_CP(t32, *tp, _timeout);
|
|
|
|
CP(t32, *tp, _flags);
|
|
|
|
CP(t32, *tp, _clockid);
|
2020-11-17 03:36:58 +00:00
|
|
|
return (0);
|
2010-01-04 05:27:49 +00:00
|
|
|
}
|
2014-10-24 20:02:44 +00:00
|
|
|
|
|
|
|
static int
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
umtx_copyout_timeoutx32(void *uaddr, size_t sz, struct timespec *tsp)
|
2014-10-24 20:02:44 +00:00
|
|
|
{
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
struct timespecx32 remain32 = {
|
2020-11-17 03:36:58 +00:00
|
|
|
.tv_sec = tsp->tv_sec,
|
|
|
|
.tv_nsec = tsp->tv_nsec,
|
|
|
|
};
|
2014-10-24 20:02:44 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
/*
|
|
|
|
* Should be guaranteed by the caller, sz == uaddr1 - sizeof(_umtx_time)
|
|
|
|
* and we're only called if sz >= sizeof(timespec) as supplied in the
|
|
|
|
* copyops.
|
|
|
|
*/
|
|
|
|
KASSERT(sz >= sizeof(remain32),
|
|
|
|
("umtx_copyops specifies incorrect sizes"));
|
2010-12-22 05:01:52 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
return (copyout(&remain32, uaddr, sizeof(remain32)));
|
2010-12-22 05:01:52 +00:00
|
|
|
}
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
#endif /* __i386__ || __LP64__ */
|
2010-12-22 05:01:52 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
typedef int (*_umtx_op_func)(struct thread *td, struct _umtx_op_args *uap,
|
|
|
|
const struct umtx_copyops *umtx_ops);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
static const _umtx_op_func op_table[] = {
|
2021-07-19 18:40:38 +03:00
|
|
|
#ifdef COMPAT_FREEBSD10
|
|
|
|
[UMTX_OP_LOCK] = __umtx_op_lock_umtx,
|
|
|
|
[UMTX_OP_UNLOCK] = __umtx_op_unlock_umtx,
|
|
|
|
#else
|
|
|
|
[UMTX_OP_LOCK] = __umtx_op_unimpl,
|
|
|
|
[UMTX_OP_UNLOCK] = __umtx_op_unimpl,
|
|
|
|
#endif
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_WAIT] = __umtx_op_wait,
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
[UMTX_OP_WAKE] = __umtx_op_wake,
|
2016-04-19 11:37:43 +00:00
|
|
|
[UMTX_OP_MUTEX_TRYLOCK] = __umtx_op_trylock_umutex,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_MUTEX_LOCK] = __umtx_op_lock_umutex,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_MUTEX_UNLOCK] = __umtx_op_unlock_umutex,
|
|
|
|
[UMTX_OP_SET_CEILING] = __umtx_op_set_ceiling,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_CV_WAIT] = __umtx_op_cv_wait,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_CV_SIGNAL] = __umtx_op_cv_signal,
|
|
|
|
[UMTX_OP_CV_BROADCAST] = __umtx_op_cv_broadcast,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_WAIT_UINT] = __umtx_op_wait_uint,
|
|
|
|
[UMTX_OP_RW_RDLOCK] = __umtx_op_rw_rdlock,
|
|
|
|
[UMTX_OP_RW_WRLOCK] = __umtx_op_rw_wrlock,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_RW_UNLOCK] = __umtx_op_rw_unlock,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_WAIT_UINT_PRIVATE] = __umtx_op_wait_uint_private,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_WAKE_PRIVATE] = __umtx_op_wake_private,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_MUTEX_WAIT] = __umtx_op_wait_umutex,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_MUTEX_WAKE] = __umtx_op_wake_umutex,
|
2014-10-24 20:02:44 +00:00
|
|
|
#if defined(COMPAT_FREEBSD9) || defined(COMPAT_FREEBSD10)
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_SEM_WAIT] = __umtx_op_sem_wait,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_SEM_WAKE] = __umtx_op_sem_wake,
|
2014-10-24 20:02:44 +00:00
|
|
|
#else
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_SEM_WAIT] = __umtx_op_unimpl,
|
|
|
|
[UMTX_OP_SEM_WAKE] = __umtx_op_unimpl,
|
2014-10-24 20:02:44 +00:00
|
|
|
#endif
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_NWAKE_PRIVATE] = __umtx_op_nwake_private,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_MUTEX_WAKE2] = __umtx_op_wake2_umutex,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_SEM2_WAIT] = __umtx_op_sem2_wait,
|
2015-10-30 19:20:40 +00:00
|
|
|
[UMTX_OP_SEM2_WAKE] = __umtx_op_sem2_wake,
|
2016-02-28 17:52:33 +00:00
|
|
|
[UMTX_OP_SHM] = __umtx_op_shm,
|
2020-11-17 03:36:58 +00:00
|
|
|
[UMTX_OP_ROBUST_LISTS] = __umtx_op_robust_lists,
|
|
|
|
};
|
|
|
|
|
|
|
|
static const struct umtx_copyops umtx_native_ops = {
|
|
|
|
.copyin_timeout = umtx_copyin_timeout,
|
|
|
|
.copyin_umtx_time = umtx_copyin_umtx_time,
|
|
|
|
.copyin_robust_lists = umtx_copyin_robust_lists,
|
|
|
|
.copyout_timeout = umtx_copyout_timeout,
|
|
|
|
.timespec_sz = sizeof(struct timespec),
|
|
|
|
.umtx_time_sz = sizeof(struct _umtx_time),
|
2006-09-22 00:52:54 +00:00
|
|
|
};
|
|
|
|
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
#ifndef __i386__
|
|
|
|
static const struct umtx_copyops umtx_native_opsi386 = {
|
|
|
|
.copyin_timeout = umtx_copyin_timeouti386,
|
|
|
|
.copyin_umtx_time = umtx_copyin_umtx_timei386,
|
|
|
|
.copyin_robust_lists = umtx_copyin_robust_lists32,
|
|
|
|
.copyout_timeout = umtx_copyout_timeouti386,
|
|
|
|
.timespec_sz = sizeof(struct timespeci386),
|
|
|
|
.umtx_time_sz = sizeof(struct umtx_timei386),
|
|
|
|
.compat32 = true,
|
|
|
|
};
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#if defined(__i386__) || defined(__LP64__)
|
|
|
|
/* i386 can emulate other 32-bit archs, too! */
|
|
|
|
static const struct umtx_copyops umtx_native_opsx32 = {
|
|
|
|
.copyin_timeout = umtx_copyin_timeoutx32,
|
|
|
|
.copyin_umtx_time = umtx_copyin_umtx_timex32,
|
2020-11-17 03:36:58 +00:00
|
|
|
.copyin_robust_lists = umtx_copyin_robust_lists32,
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
.copyout_timeout = umtx_copyout_timeoutx32,
|
|
|
|
.timespec_sz = sizeof(struct timespecx32),
|
|
|
|
.umtx_time_sz = sizeof(struct umtx_timex32),
|
2020-11-17 03:36:58 +00:00
|
|
|
.compat32 = true,
|
|
|
|
};
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
|
|
|
|
#ifdef COMPAT_FREEBSD32
|
|
|
|
#ifdef __amd64__
|
|
|
|
#define umtx_native_ops32 umtx_native_opsi386
|
|
|
|
#else
|
|
|
|
#define umtx_native_ops32 umtx_native_opsx32
|
2020-11-17 03:36:58 +00:00
|
|
|
#endif
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
#endif /* COMPAT_FREEBSD32 */
|
|
|
|
#endif /* __i386__ || __LP64__ */
|
|
|
|
|
|
|
|
#define UMTX_OP__FLAGS (UMTX_OP__32BIT | UMTX_OP__I386)
|
2020-11-17 03:36:58 +00:00
|
|
|
|
2020-11-22 05:34:51 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
kern__umtx_op(struct thread *td, void *obj, int op, unsigned long val,
|
|
|
|
void *uaddr1, void *uaddr2, const struct umtx_copyops *ops)
|
|
|
|
{
|
|
|
|
struct _umtx_op_args uap = {
|
|
|
|
.obj = obj,
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
.op = op & ~UMTX_OP__FLAGS,
|
2020-11-17 03:36:58 +00:00
|
|
|
.val = val,
|
|
|
|
.uaddr1 = uaddr1,
|
|
|
|
.uaddr2 = uaddr2
|
|
|
|
};
|
|
|
|
|
|
|
|
if ((uap.op >= nitems(op_table)))
|
|
|
|
return (EINVAL);
|
|
|
|
return ((*op_table[uap.op])(td, &uap, ops));
|
|
|
|
}
|
|
|
|
|
|
|
|
int
|
|
|
|
sys__umtx_op(struct thread *td, struct _umtx_op_args *uap)
|
2006-09-22 00:52:54 +00:00
|
|
|
{
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
static const struct umtx_copyops *umtx_ops;
|
2015-10-30 19:20:40 +00:00
|
|
|
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
umtx_ops = &umtx_native_ops;
|
|
|
|
#ifdef __LP64__
|
|
|
|
if ((uap->op & (UMTX_OP__32BIT | UMTX_OP__I386)) != 0) {
|
|
|
|
if ((uap->op & UMTX_OP__I386) != 0)
|
|
|
|
umtx_ops = &umtx_native_opsi386;
|
|
|
|
else
|
|
|
|
umtx_ops = &umtx_native_opsx32;
|
|
|
|
}
|
|
|
|
#elif !defined(__i386__)
|
|
|
|
/* We consider UMTX_OP__32BIT a nop on !i386 ILP32. */
|
|
|
|
if ((uap->op & UMTX_OP__I386) != 0)
|
|
|
|
umtx_ops = &umtx_native_opsi386;
|
|
|
|
#else
|
|
|
|
/* Likewise, UMTX_OP__I386 is a nop on i386. */
|
|
|
|
if ((uap->op & UMTX_OP__32BIT) != 0)
|
|
|
|
umtx_ops = &umtx_native_opsx32;
|
|
|
|
#endif
|
2020-11-17 03:36:58 +00:00
|
|
|
return (kern__umtx_op(td, uap->obj, uap->op, uap->val, uap->uaddr1,
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
uap->uaddr2, umtx_ops));
|
2006-09-22 00:52:54 +00:00
|
|
|
}
|
|
|
|
|
2020-11-22 05:34:51 +00:00
|
|
|
#ifdef COMPAT_FREEBSD32
|
2021-07-19 18:40:38 +03:00
|
|
|
#ifdef COMPAT_FREEBSD10
|
|
|
|
int
|
2021-11-22 22:36:55 +00:00
|
|
|
freebsd10_freebsd32__umtx_lock(struct thread *td,
|
|
|
|
struct freebsd10_freebsd32__umtx_lock_args *uap)
|
2021-07-19 18:40:38 +03:00
|
|
|
{
|
|
|
|
return (do_lock_umtx32(td, (uint32_t *)uap->umtx, td->td_tid, NULL));
|
|
|
|
}
|
|
|
|
|
|
|
|
int
|
2021-11-22 22:36:55 +00:00
|
|
|
freebsd10_freebsd32__umtx_unlock(struct thread *td,
|
|
|
|
struct freebsd10_freebsd32__umtx_unlock_args *uap)
|
2021-07-19 18:40:38 +03:00
|
|
|
{
|
|
|
|
return (do_unlock_umtx32(td, (uint32_t *)uap->umtx, td->td_tid));
|
|
|
|
}
|
|
|
|
#endif /* COMPAT_FREEBSD10 */
|
|
|
|
|
2020-11-22 05:34:51 +00:00
|
|
|
int
|
|
|
|
freebsd32__umtx_op(struct thread *td, struct freebsd32__umtx_op_args *uap)
|
|
|
|
{
|
|
|
|
|
2021-11-17 20:12:24 +00:00
|
|
|
return (kern__umtx_op(td, uap->obj, uap->op, uap->val, uap->uaddr1,
|
2020-11-22 05:34:51 +00:00
|
|
|
uap->uaddr2, &umtx_native_ops32));
|
|
|
|
}
|
2021-07-19 18:40:38 +03:00
|
|
|
#endif /* COMPAT_FREEBSD32 */
|
2020-11-22 05:34:51 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
void
|
|
|
|
umtx_thread_init(struct thread *td)
|
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
td->td_umtxq = umtxq_alloc();
|
|
|
|
td->td_umtxq->uq_thread = td;
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
umtx_thread_fini(struct thread *td)
|
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtxq_free(td->td_umtxq);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* It will be called when new thread is created, e.g fork().
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
umtx_thread_alloc(struct thread *td)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq;
|
|
|
|
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
uq->uq_inherited_pri = PRI_MAX;
|
|
|
|
|
|
|
|
KASSERT(uq->uq_flags == 0, ("uq_flags != 0"));
|
|
|
|
KASSERT(uq->uq_thread == td, ("uq_thread != td"));
|
|
|
|
KASSERT(uq->uq_pi_blocked == NULL, ("uq_pi_blocked != NULL"));
|
|
|
|
KASSERT(TAILQ_EMPTY(&uq->uq_pi_contested), ("uq_pi_contested is not empty"));
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* exec() hook.
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
*
|
|
|
|
* Clear robust lists for all process' threads, not delaying the
|
2020-11-21 10:32:40 +00:00
|
|
|
* cleanup to thread exit, since the relevant address space is
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
* destroyed right now.
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
*/
|
2020-11-21 10:32:40 +00:00
|
|
|
void
|
|
|
|
umtx_exec(struct proc *p)
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
struct thread *td;
|
|
|
|
|
|
|
|
KASSERT(p == curproc, ("need curproc"));
|
|
|
|
KASSERT((p->p_flag & P_HADTHREADS) == 0 ||
|
|
|
|
(p->p_flag & P_STOPPED_SINGLE) != 0,
|
|
|
|
("curproc must be single-threaded"));
|
2019-05-08 16:30:38 +00:00
|
|
|
/*
|
|
|
|
* There is no need to lock the list as only this thread can be
|
|
|
|
* running.
|
|
|
|
*/
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
FOREACH_THREAD_IN_PROC(p, td) {
|
|
|
|
KASSERT(td == curthread ||
|
|
|
|
((td->td_flags & TDF_BOUNDARY) != 0 && TD_IS_SUSPENDED(td)),
|
|
|
|
("running thread %p %p", p, td));
|
|
|
|
umtx_thread_cleanup(td);
|
|
|
|
td->td_rb_list = td->td_rbp_list = td->td_rb_inact = 0;
|
|
|
|
}
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2020-11-21 10:32:40 +00:00
|
|
|
* thread exit hook.
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
umtx_thread_exit(struct thread *td)
|
|
|
|
{
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
umtx_thread_cleanup(td);
|
|
|
|
}
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_read_uptr(struct thread *td, uintptr_t ptr, uintptr_t *res, bool compat32)
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
{
|
|
|
|
u_long res1;
|
|
|
|
uint32_t res32;
|
|
|
|
int error;
|
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
if (compat32) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = fueword32((void *)ptr, &res32);
|
|
|
|
if (error == 0)
|
|
|
|
res1 = res32;
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
} else {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
error = fueword((void *)ptr, &res1);
|
|
|
|
}
|
|
|
|
if (error == 0)
|
|
|
|
*res = res1;
|
|
|
|
else
|
|
|
|
error = EFAULT;
|
|
|
|
return (error);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_read_rb_list(struct thread *td, struct umutex *m, uintptr_t *rb_list,
|
|
|
|
bool compat32)
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
{
|
|
|
|
struct umutex32 m32;
|
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
if (compat32) {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
memcpy(&m32, m, sizeof(m32));
|
|
|
|
*rb_list = m32.m_rb_lnk;
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
} else {
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
*rb_list = m->m_rb_lnk;
|
[2/2] _umtx_op: introduce 32-bit/i386 flags for operations
This patch takes advantage of the consolidation that happened to provide two
flags that can be used with the native _umtx_op(2): UMTX_OP___32BIT and
UMTX_OP__I386.
UMTX_OP__32BIT iindicates that we are being provided with 32-bit structures.
Note that this flag alone indicates a 64bit time_t, since this is the
majority case.
UMTX_OP__I386 has been provided so that we can emulate i386 as well,
regardless of whether the host is amd64 or not.
Both imply a different set of copyops in sysumtx_op. freebsd32__umtx_op
simply ignores the flags, since it's already doing a 32-bit operation and
it's unlikely we'll be running an emulator under compat32. Future work
could consider it, but the author sees little benefit.
This will be used by qemu-bsd-user to pass on all _umtx_op calls to the
native interface as long as the host/target endianness matches, effectively
eliminating most if not all of the remaining unresolved deadlocks for most.
This version changed a fair amount from what was under review, mostly in
response to refactoring of the prereq reorganization and battle-testing
it with qemu-bsd-user. The main changes are as follows:
1.) The i386 flag got renamed to omit '32BIT' since this is redundant.
2.) The flags are now properly handled on 32-bit platforms to emulate other
32-bit platforms.
3.) Robust list handling was fixed, and the 32-bit functionality that was
previously gated by COMPAT_FREEBSD32 is now unconditional.
4.) Robust list handling was also improved, including the error reported
when a process has already registered 32-bit ABI lists and also
detecting if native robust lists have already been registered. Both
scenarios now return EBUSY rather than EINVAL, because the input is
technically valid but we're too busy with another ABI's lists.
libsysdecode/kdump/truss support will go into review soon-ish, along with
the associated manpage update.
Reviewed by: kib (earlier version)
MFC after: 3 weeks
2020-11-22 05:47:45 +00:00
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_handle_rb(struct thread *td, uintptr_t rbp, uintptr_t *rb_list, bool inact,
|
|
|
|
bool compat32)
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
{
|
|
|
|
struct umutex m;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
KASSERT(td->td_proc == curproc, ("need current vmspace"));
|
|
|
|
error = copyin((void *)rbp, &m, sizeof(m));
|
|
|
|
if (error != 0)
|
|
|
|
return (error);
|
|
|
|
if (rb_list != NULL)
|
2020-11-17 03:36:58 +00:00
|
|
|
umtx_read_rb_list(td, &m, rb_list, compat32);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if ((m.m_flags & UMUTEX_ROBUST) == 0)
|
|
|
|
return (EINVAL);
|
|
|
|
if ((m.m_owner & ~UMUTEX_CONTESTED) != td->td_tid)
|
|
|
|
/* inact is cleared after unlock, allow the inconsistency */
|
|
|
|
return (inact ? 0 : EINVAL);
|
|
|
|
return (do_unlock_umutex(td, (struct umutex *)rbp, true));
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
umtx_cleanup_rb_list(struct thread *td, uintptr_t rb_list, uintptr_t *rb_inact,
|
2020-11-17 03:36:58 +00:00
|
|
|
const char *name, bool compat32)
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
{
|
|
|
|
int error, i;
|
|
|
|
uintptr_t rbp;
|
|
|
|
bool inact;
|
|
|
|
|
|
|
|
if (rb_list == 0)
|
|
|
|
return;
|
2020-11-17 03:36:58 +00:00
|
|
|
error = umtx_read_uptr(td, rb_list, &rbp, compat32);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
for (i = 0; error == 0 && rbp != 0 && i < umtx_max_rb; i++) {
|
|
|
|
if (rbp == *rb_inact) {
|
|
|
|
inact = true;
|
|
|
|
*rb_inact = 0;
|
|
|
|
} else
|
|
|
|
inact = false;
|
2020-11-17 03:36:58 +00:00
|
|
|
error = umtx_handle_rb(td, rbp, &rbp, inact, compat32);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
}
|
|
|
|
if (i == umtx_max_rb && umtx_verbose_rb) {
|
|
|
|
uprintf("comm %s pid %d: reached umtx %smax rb %d\n",
|
|
|
|
td->td_proc->p_comm, td->td_proc->p_pid, name, umtx_max_rb);
|
|
|
|
}
|
|
|
|
if (error != 0 && umtx_verbose_rb) {
|
|
|
|
uprintf("comm %s pid %d: handling %srb error %d\n",
|
|
|
|
td->td_proc->p_comm, td->td_proc->p_pid, name, error);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
/*
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
* Clean up umtx data.
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
*/
|
|
|
|
static void
|
|
|
|
umtx_thread_cleanup(struct thread *td)
|
|
|
|
{
|
|
|
|
struct umtx_q *uq;
|
|
|
|
struct umtx_pi *pi;
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
uintptr_t rb_inact;
|
2020-11-17 03:36:58 +00:00
|
|
|
bool compat32;
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
/*
|
|
|
|
* Disown pi mutexes.
|
|
|
|
*/
|
|
|
|
uq = td->td_umtxq;
|
|
|
|
if (uq != NULL) {
|
2019-05-08 16:30:38 +00:00
|
|
|
if (uq->uq_inherited_pri != PRI_MAX ||
|
|
|
|
!TAILQ_EMPTY(&uq->uq_pi_contested)) {
|
|
|
|
mtx_lock(&umtx_lock);
|
|
|
|
uq->uq_inherited_pri = PRI_MAX;
|
|
|
|
while ((pi = TAILQ_FIRST(&uq->uq_pi_contested)) != NULL) {
|
|
|
|
pi->pi_owner = NULL;
|
|
|
|
TAILQ_REMOVE(&uq->uq_pi_contested, pi, pi_link);
|
|
|
|
}
|
|
|
|
mtx_unlock(&umtx_lock);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
}
|
2019-05-08 16:30:38 +00:00
|
|
|
sched_lend_user_prio_cond(td, PRI_MAX);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
|
2020-11-17 03:36:58 +00:00
|
|
|
compat32 = (td->td_pflags2 & TDP2_COMPAT32RB) != 0;
|
|
|
|
td->td_pflags2 &= ~TDP2_COMPAT32RB;
|
|
|
|
|
2019-05-08 16:30:38 +00:00
|
|
|
if (td->td_rb_inact == 0 && td->td_rb_list == 0 && td->td_rbp_list == 0)
|
|
|
|
return;
|
|
|
|
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
/*
|
|
|
|
* Handle terminated robust mutexes. Must be done after
|
|
|
|
* robust pi disown, otherwise unlock could see unowned
|
|
|
|
* entries.
|
|
|
|
*/
|
|
|
|
rb_inact = td->td_rb_inact;
|
|
|
|
if (rb_inact != 0)
|
2020-11-17 03:36:58 +00:00
|
|
|
(void)umtx_read_uptr(td, rb_inact, &rb_inact, compat32);
|
|
|
|
umtx_cleanup_rb_list(td, td->td_rb_list, &rb_inact, "", compat32);
|
|
|
|
umtx_cleanup_rb_list(td, td->td_rbp_list, &rb_inact, "priv ", compat32);
|
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
2016-05-17 09:56:22 +00:00
|
|
|
if (rb_inact != 0)
|
2020-11-17 03:36:58 +00:00
|
|
|
(void)umtx_handle_rb(td, rb_inact, NULL, true, compat32);
|
This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
|
|
|
}
|