mutex structure is added as following:
struct umutex {
__lwpid_t m_owner;
uint32_t m_flags;
uint32_t m_ceilings[2];
uint32_t m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.
The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.
No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
if the specified priority is zero. This avoids a race where the calling
thread could read a snapshot of it's current priority, then a different
thread could change the first thread's priority, then the original thread
would call sched_prio() inside msleep() undoing the change made by the
second thread. I used a priority of zero as no thread that calls msleep()
or tsleep() should be specifying a priority of zero anyway.
The various places that passed 'curthread->td_priority' or some variant
as the priority now pass 0.
threads can wait for a thread to exit, and safely assume that the thread
has left userland and is no longer using its userland stack, this is
necessary for pthread_join when a thread is waiting for another thread
to exit which has user customized stack, after pthread_join returns,
the userland stack can be reused for other purposes, without this change,
the joiner thread has to spin at the address to ensure the thread is really
exited.
and always has been, but the system call itself returns
errno in a register so the problem is really a function of
libc, not the system call.
Discussed with : Matthew Dillion <dillon@apollo.backplane.com>
errno can be tampered potentially by nested signal handle.
Now all error codes are returned in negative value, positive value are
reserved for future expansion.
more general than previous. It also lets me implement cancelable point
in thread library. Also in theory, umtx_lock and umtx_unlock can
be implemented by using umtx_wait and umtx_wake, all atomic operations
can be done in userland without kernel's casuptr() function.
call mmap() to create a shared space, and then initialize umtx on it,
after that, each thread in different processes can use the umtx same
as threads in same process.
2. introduce a new syscall _umtx_op to support timed lock and condition
variable semantics. also, orignal umtx_lock and umtx_unlock inline
functions now are reimplemented by using _umtx_op, the _umtx_op can
use arbitrary id not just a thread id.
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
pointer to the corresponding struct thread to the thread ID (lwpid_t)
assigned to that thread. The primary reason for this change is that
libthr now internally uses the same ID as the debugger and the kernel
when referencing to a kernel thread. This allows us to implement the
support for debugging without additional translations and/or mappings.
To preserve the ABI, the 1:1 threading syscalls, including the umtx
locking API have not been changed to work on a lwpid_t. Instead the
1:1 threading syscalls operate on long and the umtx locking API has
not been changed except for the contested bit. Previously this was
the least significant bit. Now it's the most significant bit. Since
the contested bit should not be tested by userland, this change is
not expected to be visible. Just to be sure, UMTX_CONTESTED has been
removed from <sys/umtx.h>.
Reviewed by: mtm@
ABI preservation tested on: i386, ia64
comes across it, it will turn into a core dump in userland instead of
a kernel panic. I had also inverted the sense of the test, so
Double pointy hat to: mtm
1. There was a race condition between a thread unlocking
a umtx and the thread contesting it. If the unlocking
thread won the race it may try to wakeup a thread that
was not yet in msleep(). The contesting thread would then
go to sleep to await a wakeup that would never come. It's
not possible to close the race by using a lock because
calls to casuptr() may have to fault a page in from swap.
Instead, the race was closed by introducing a flag that
the unlocking thread will set when waking up a thread.
The contesting thread will check for this flag before
going to sleep. For now the flag is kept in td_flags,
but it may be better to use some other member or create
a new one because of the possible performance/contention
issues of having to own sched_lock. Thanks to jhb for
pointing me in the right direction on this one.
2. Once a umtx was contested all future locks and unlocks
were happening in the kernel, regardless of whether it
was contested or not. To prevent this from happening,
when a thread locks a umtx it checks the queue for that
umtx and unsets the contested bit if there are no other
threads waiting on it. Again, this is slightly more
complicated than it needs to be because we can't hold
a lock across casuptr(). So, the thread has to check
the queue again after unseting the bit, and reset the
contested bit if it finds that another thread has put
itself on the queue in the mean time.
3. Remove the if... block for unlocking an uncontested
umtx, and replace it with a KASSERT. The _only_ time
a thread should be unlocking a umtx in the kernel is
if it is contested.
notice another typo in the same line. This typo makes libthr unuseable,
but it's effects where counter-balanced by the extra semicolon, which
made libthr remarkably useable for the past several months.
- Use a hash of umtx queues to queue blocked threads. We hash on pid and the
virtual address of the umtx structure. This eliminates cases where we
previously held a lock across a casuptr call.
Reviwed by: jhb (quickly)
- umtx_lock() is defined as an inline in umtx.h. It tries to do an
uncontested acquire of a lock which falls back to the _umtx_lock()
system-call if that fails.
- umtx_unlock() is also an inline which falls back to _umtx_unlock() if the
uncontested unlock fails.
- Locks are keyed off of the thr_id_t of the currently running thread which
is currently just the pointer to the 'struct thread' in kernel.
- _umtx_lock() uses the proc pointer to synchronize access to blocked thread
queues which are stored in the first blocked thread.