is also implemented in glibc and is used by a number of existing
applications (mysql, firefox, etc).
This mutex type is a default mutex with the additional property that
it spins briefly when attempting to acquire a contested lock, doing
trylock operations in userland before entering the kernel to block if
eventually unsuccessful.
The expectation is that applications requesting this mutex type know
that the mutex is likely to be only held for very brief periods, so it
is faster to spin in userland and probably succeed in acquiring the
mutex, than to enter the kernel and sleep, only to be woken up almost
immediately. This can help significantly in certain cases when
pthread mutexes are heavily contended and held for brief durations
(such as mysql).
Spin up to 200 times before entering the kernel, which represents only
a few us on modern CPUs. No performance degradation was observed with
this value and it is sufficient to avoid a large performance drop in
mysql performance in the heavily contended pthread mutex case.
The libkse implementation is a NOP.
Reviewed by: jeff
MFC after: 3 days
threading library.
- Now that libpthread is a symlink, it's no longer possible
to link applications with libpthread and have libmap.conf(5)
select the desired threading library; applications will be
linked to the default threading library, libkse or libthr.
Remove an obsolete paragraph.
- Mention that improvements can be seen compared to libkse.
Reviewed by: deischen, davidxu
the threading libraries is built. This simplifies the
logic in makefiles that need to check if the pthreads
support is present. It also fixes a bug where we would
build a threading library that we shouldn't have built:
for example, building with WITHOUT_LIBTHR and the default
value of DEFAULT_THREADING_LIB (libthr) would mistakenly
build the libthr library, but not install it.
Approved by: re (kensmith)
Warning, after symbol versioning is enabled, going back is not easy
(use WITHOUT_SYMVER at your own risk).
Change the default thread library to libthr.
There most likely still needs to be a version bump for at least the
thread libraries. If necessary, this will happen later.
_thr_ucond_broadcast, clear condition variable pointer in cancellation
info after returing from _thr_ucond_wait, since kernel has already
dropped the internal lock, so we don't need to unlock it in cancellation
handler again.
is also returned by pthread_detach() if a thread was already
detached, the error code was already documented:
> [EINVAL] The implementation has detected that the value speci-
> fied by thread does not refer to a joinable thread.
o The TLS pointer (r2) points 0x7000 after the *end* of the TCB.
o _rtld_allocate_tls() gets a pointer to the current TCB, not the
current TLS pointer.
o _rtld_free_tls() gets the size of the TCB structure.
into pthread structure to keep track of locked PTHREAD_PRIO_PROTECT mutex,
no real mutex code is changed, the mutex locking and unlocking code should
has same performance as before.
wait(), waitpid() and usleep(), they are internal versions and
should not be cancellation points.
2. Make wait3() as a cancellation point.
3. Move raise() and pause() into file thr_sig.c.
4. Add functions _sigsuspend, _sigwait, _sigtimedwait and _sigwaitinfo,
remove SIGCANCEL bit in wait-set for those functions, the signal is
used internally to implement thread cancellation.
to make it work, turnstile like mechanism to support priority
propagating and other realtime scheduling options in kernel
should be available to userland mutex, for the moment, I just
want to make libthr be simple and efficient thread library.
Discussed with: deischen, julian
* Add posix_memalign().
* Move calloc() from calloc.c to malloc.c. Add a calloc() implementation in
rtld-elf in order to make the loader happy (even though calloc() isn't
used in rtld-elf).
* Add _malloc_prefork() and _malloc_postfork(), and use them instead of
directly manipulating __malloc_lock.
Approved by: phk, markm (mentor)
operation, the caller is blocked util target threads are really
suspended, also avoid suspending a thread when it is holding a
critical lock.
Fix a bug in _thr_ref_delete which tests a never set flag.
rewritten, now timers created with same sigev_notify_attributes will
run in same thread, this allows user to organize which timers can
run in same thread to save some thread resource.
to point at libmap.conf(5). This will help answer questions about what
and why it is, although not in great detail.
Approved by: re (scottl)
MFC after: 1 week
MFC note: When MFC'd, don't MFC mention of work not yet MFC'd.
1. fast simple type mutex.
2. __thread tls works.
3. asynchronous cancellation works ( using signal ).
4. thread synchronization is fully based on umtx, mainly, condition
variable and other synchronization objects were rewritten by using
umtx directly. those objects can be shared between processes via
shared memory, it has to change ABI which does not happen yet.
5. default stack size is increased to 1M on 32 bits platform, 2M for
64 bits platform.
As the result, some mysql super-smack benchmarks show performance is
improved massivly.
Okayed by: jeff, mtm, rwatson, scottl
run as a 32 bit support library for an amd64 kernel. 32 bit consumers of
libthr have zero chance of running on an amd64 kernel since we don't
implement the i386_set_ldt() family of functions. Note that this commit
doesn't make it actually work, it just removes one more obstacle.
no userland locks are heald, the dead thread lock can no longer protect
access to it. Therefore, instead of using an if (!dead)...else clause
after walking the active threads list test the thread pointer before
deciding not to walk the dead threads list. If the thread pointer is null
it means it was not found in the active threads list and the dead threads
list should be checked.
2. Do not free the stack of a thread that is not marked dead. This is the
2nd and final part of eliminating the race to free a thread's stack.
MFC after: 3 days
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.
Discussed with: davidxu, deischen, marcel
MFC after: 3 days
specified mutex is invalid. In spec parlance 'MAY FAIL' means it's
up to the implementor. So, remove the check for NULL pointers for two
reasons:
1. A mutex may be invalid without necessarily being NULL.
2. If the pointer to the mutex is NULL core-dumping in the
vicinity of the problem is much much much better than failing
in some other part of the code (especially when the application
doesn't check the return value of the function that you oh so
helpfully set to EINVAL).
o In the rwlock code: move a duplicated check inside an if..else to after
the if...else clause.
o When initializing a static rwlock move the initialization check
inside the lock.
o In thr_setschedparam.c: When breaking out of the trylock...retry if busy
loop make sure to reset the mtx pointer to null if the mutex is nolonger
in a queue.
pointer to the corresponding struct thread to the thread ID (lwpid_t)
assigned to that thread. The primary reason for this change is that
libthr now internally uses the same ID as the debugger and the kernel
when referencing to a kernel thread. This allows us to implement the
support for debugging without additional translations and/or mappings.
To preserve the ABI, the 1:1 threading syscalls, including the umtx
locking API have not been changed to work on a lwpid_t. Instead the
1:1 threading syscalls operate on long and the umtx locking API has
not been changed except for the contested bit. Previously this was
the least significant bit. Now it's the most significant bit. Since
the contested bit should not be tested by userland, this change is
not expected to be visible. Just to be sure, UMTX_CONTESTED has been
removed from <sys/umtx.h>.
Reviewed by: mtm@
ABI preservation tested on: i386, ia64
a fork, make sure that the current thread isn't detached and freed. As
a consequence the thread should be inserted into the head of the
active list only once (in the beginning).
followed are: Only 3 functions (pthread_cancel, pthread_setcancelstate,
pthread_setcanceltype) are required to be async-signal-safe by POSIX. None of
the rest of the pthread api is required to be async-signal-safe. This means
that only the three mentioned functions are safe to use from inside
signal handlers.
However, there are certain system/libc calls that are
cancellation points that a caller may call from within a signal handler,
and since they are cancellation points calls have to be made into libthr
to test for cancellation and exit the thread if necessary. So, the
cancellation test and thread exit code paths must be async-signal-safe
as well. A summary of the changes follows:
o Almost all of the code paths that masked signals, as well as locking the
pthread structure now lock only the pthread structure.
o Signals are masked (and left that way) as soon as a thread enters
pthread_exit().
o The active and dead threads locks now explicitly require that signals
are masked.
o Access to the isdead field of the pthread structure is protected by both
the active and dead list locks for writing. Either one is sufficient for
reading.
o The thread state and type fields have been combined into one three-state
switch to make it easier to read without requiring a lock. It doesn't need
a lock for writing (and therefore for reading either) because only the
current thread can write to it and it is an integer value.
o The thread state field of the pthread structure has been eliminated. It
was an unnecessary field that mostly duplicated the flags field, but
required additional locking that would make a lot more code paths require
signal masking. Any truly unique values (such as PS_DEAD) have been
reborn as separate members of the pthread structure.
o Since the mutex and condvar pthread functions are not async-signal-safe
there is no need to muck about with the wait queues when handling
a signal ...
o ... which also removes the need for wrapping signal handlers and sigaction(2).
o The condvar and mutex async-cancellation code had to be revised as a result
of some of these changes, which resulted in semi-unrelated changes which
would have been difficult to work on as a separate commit, so they are
included as well.
The only part of the changes I am worried about is related to locking for
the pthread joining fields. But, I will take a closer look at them once this
mega-patch is committed.
makeing sure the spinlock isn't already in use might be a nice feature to
have in theory, it's hard to implement in practice since the passed in
pointer may not be NULL, but still be an invalid value (i.e. 1..2..3.. etc).
functionality spelled out in SUSv3.
o Signal of 0 means do everything except send the signal
o Check that the signal is not invalid
o Check that the target thread is not dead/invalid
sigprocmask no longer needs to be wrapped.
o raise(3) is applied to the calling thread in a threaded program.
o In the sigaction wrapper reference the correct structure.
o Don't treat SIGTHR especially anymore (infact it won't exist in
a little while).
we still have to DTRT when an asynchronously cancellable thread is
cancelled while waiting for a mutex.
o While dequeueing a waiting mutex don't skip a thread if it has
a cancel pending. Only skip it if it is also async cancellable.
the cause of any bugs because it is *always* indirectly set
in the for...loop, but better to be explicit about it.
o Check the magic number of the passed in thread only after it has
been found in the active thread list. Otherwise, if the check is done
at the very beginning we may end up pointing to garbage if the
thread was once a valid thread, but has now been destroyed.
that this provokes. "Wherever possible" means "In the kernel OR NOT
C++" (implying C).
There are places where (void *) pointers are not valid, such as for
function pointers, but in the special case of (void *)0, agreement
settles on it being OK.
Most of the fixes were NULL where an integer zero was needed; many
of the fixes were NULL where ascii <nul> ('\0') was needed, and a
few were just "other".
Tested on: i386 sparc64
has been executed. On return from the signal handler
the call will either be restarted or EINTR will be returned,
but it will not go back to its previous state. So, it is
sufficient to simply change the state to 'running' without
actually trying to wake up the thread.
a PTHREAD_RWLOCK_INITIALIZER to do for rwlocks what
a similarly named symbol does for statically initialized mutexes.
This symbol was dropped in The Open Group Base Specifications Issue 6
and does not exist in IEEE Std 1003.1, 2003, but it should still be
supported for backwards compatibility.
Pointy hat: mtm
o Instead of checking both the passed in pointer and its value
for NULL, only check the latter. Any caller that passes in
a NULL pointer is obviously wrong.
o Fix mutex priority protocols. Keep separate counts of priority
inheritance and protection mutexes to make things easier.
This will not have much affect since this is only the
userland side, and the rest involves kernel scheduling.
These files had tags after the copyright notice,
inside the comment block (incorrect, removed),
and outside the comment block (correct).
Approved by: rwatson (mentor)
what do I get for my troubles? libc breaks offcourse!
Reimplement a hack (in libthr) that allows libc to use
rwlocks without initializing them first. The hack was reimplemented
so that only a private libc version of the rwlock locking functions
initializes an uninitialized rwlock. The application version will
correctly fail.
the system call got interrupted and the absolute timeout is
converted to a relative timeout, it may happen that we get a
negative number. In such a case, simply set the timeout to
zero so that if the event that the thread wants to wait for has
happened it can still return successfully, but if it hasn't
happened then the thread doesn't suspend indefinitely. This should
fix certain applications (including mozilla) that seem to hang
indefinitely sometimes.
Noticed and debugged by: Morten Johansen <root@morten-johansen.net>
o Simplify the logic by removing a lot of unnecesary nesting
o Reduce the amount of local variables
o Zero-out the allocated structure and get rid of
all the unnecessary setting to 0 and NULL;
Refactor _pthread_mutex_destroy
o Simplify the logic by removing a lot of unnecesary nesting
o No need to check pointer that the mutex attributes points
to. Checking passed in pointer is enough.
a list in the thread structure to keep track of the locks and
how many times they have been locked. This list is checked
on every lock and unlock. The traversal through the list is
O(n). Most applications don't hold so many locks at once that
this will become a problem. However, if it does become a problem
it might be a good idea to review this once libthr is
off probation and in the optimization cycle.
This fixes:
o deadlock when a thread tries to recursively acquire a
read lock when a writer is waiting on the lock.
o a thread could previously successfully unlock a lock it did not own
o deadlock when a thread tries to acquire a write lock on
a lock it already owns for reading or writing [ this is admittedly
not required by POSIX, but is nice to have ]
code and simply return EINVAL (which is allowed by the standard) in
all those pthread functions that previously initialized it.
o Refactor the pthread_rwlock_[try]rdlock() and pthread_rwlock_[try]wrlock()
functions. They are now completeley condensed into rwlock_rdlock_common()
and rwlock_wrlock_common(), respectively.
o If the application tries to destroy an rwlock that is currently
held by a thread return EBUSY where it previously went ahead and
freed all resources associated with the lock.
o Refactor _pthread_rwlock_init() to make it look (relatively) sane.
o When obtaining a read lock on an rwlock the check for whether it
would exceed the maximum allowed read locks should happen *before*
we obtain the lock.
o The pthread_rwlock_* functions shall *never* return EINTR, so make
sure to requeue/resuspend the thread if it encounters such an error.
o Make a note that pthread_rwlock_unlock() needs to ensure it holds a
lock on an rwlock it tries to unlock. It will be implemented in a
separate commit because it requires some additional rwlock infrastructure.
waiting on a locked mutex. This involves passing a struct timespec
from the pthread mutex locking interfaces all the way down to the
function that suspends the thread until the mutex is released.
The timeout is assumed to be an absolute time (i.e. not relative to
the current time).
Also, in _thread_suspend() make the passed in timespec const.
o Remove some code duplication between _thread_init(), which is run once
to initialize libthr and the intitial thread, and pthread_create(), which
initializes newly created threads, into a new function called from both
places: init_td_common()
o Move initialization of certain parts of libthr into a separate
function. These include:
- Active threads list and it's lock
- Dead threads list and it's lock & condition variable
- Naming and insertion of the initial thread into the
active threads list.
work before anyways, and I didn't want to fix broken code I had no
way of testing. It was necessary however, in order to get rid of GIANT_LOCK.
Pthread priorities will have to wait a little longer to get fixed.
problems: (1) The wrong flag was being checked for in the attribute
(2) The pthread's state was not being set to indicate it was
suspended.
Noticed by: Igor Sysoev <is@rambler-co.ru>
On ia64, where there's no libc_r at all, libkse is now the default
thread library by virtue of these links.
The reasons for this change are:
1. libkse is slated to become the default thread library anyway,
2. active development and maintenance is only present for libkse,
3. GNOME and KDE, both in the process of being supported on ia64,
work better with KSE; even on ia64.
Create a private, single underscore, version of pthread_mutex_unlock for libc.
pthread_mutex_lock already has one. These versions are different from the
ones that applications will link against because they block all signals
from the time a call to lock the mutex is made until it is successfully
unlocked.
a thread receives a spurious wakeup from sigtimedwait(), so make sure
that the call to the queueing code is called only once before entering
the loop (not in the loop). This should fix some fatal errors people
are seeing with messages stating the thread is already on the mutex queue.
These errors may still be triggered from signal handlers; however, since
that part of the code is not locked down yet.
not spinlock_t. Spinlock_t and the associated functions and macros may
require blocking signals in order for async-safe libc functions to behave
appropriately in libthr. This is undesriable for libthr internal locking.
So, this is the first step in completely separating libthr from libc's
locking primitives.
Three new macros should be used for internal libthr locking from now on:
THR_LOCK, THR_TRYLOCK, THR_UNLOCK.
and the disabling of signals. What we are really interested in is
keeping track of recursive disabling of signals. We should not
be recursively acquiring thread locks. Any such situations should
be reorganized to not require a recursive lock.
Separating the two out also allows us to block signals independent of
acquiring thread locks. This will be needed in libthr in the near future when
we put the pieces together to protect libc functions that use pthread mutexes
and low level locks.
implementation and the new improved one. We now precompute the
signal set passed to sigtimedwait, using an inverted set when
necessary for compatibility with older kernels.
exit function has invalidated the need for _spin[un]lock_pthread().
The _spin[un]lock() functions can now dereference curthread without
the danger that the ldtentry containing the pointer to the thread
has been cleared out from under them.
breakages. Note that runtime compatibility is not guaranteed. Future
changes to setjmp/longjmp in libc will break threaded applications
linked against libc_r.so.5 on ia64. We pull our "tier 2" card once
more...
Reviewed by: ru
an application compiled -static with libthr would dump core in
malloc(3) because the stub thread initialization routine in libc would
be used instead of the libthr supplied one.
condition variables. Cosmetic.
Explicitly compare against PTHREAD_MUTEX_INITIALIZER. We shouldn't
encourage calls to the mutex functions with null pointers to mutexes.
Approved by: re/jhb
from multiple threads don't initialze the same condition variable
more than once.
Explicitly compare cond pointers with PTHREAD_COND_INITIALIZER instead
of NULL. Just because it happens to be defined as NULL is no reason
to encourage the idea that people can call those functions with
NULL pointers to a condition variable.
Approved by: re/jhb
The dead list thread is sufficient for synchronization.
Retire the arch_id (ldt array slot) in the gc thread instead of the
doing it in the thread itself.
Approved by: re/jhb
joiner by making sure all locks and unlocks occur in the same order. For
the record the lock order is: DEAD_LIST, THREAD_LIST, exiting thread, joiner
thread.
Approved by: re/rwatson
thread is not dead, the join loop is guaranteed to execute at least
once, so there is no need to pick up the thread list lock after
we return from suspenstion only to release it after the loop.
Approved by: re/blanket libthr
joined and then the joiner thread. There isn't an easy (sane?) way
to make it use the correct order without introducing races involving
the target thread and finding which (active or dead) list it is on. So,
after locking the canceled thread it will try to lock the joined thread
and if it fails release the first lock and try again from the top.
Introduce a new function, _spintrylock, which is simply a wrapper arround
umtx_trylock(), to help accomplish this.
Approved by: re/blanket libthr
Modify the thread creation and thread searching routine
to lock the thread lists with the new locks instead of GIANT_LOCK.
Approved by: re/blanket libthr
list is protected by a spinlock_t, but the dead list uses a pthread_mutex
because it is necessary to synchronize other threads with the garbage
collector thread. Lock/Unlock macros are used so it's easier to make
changes to the locks in the future.
The 'dead thread list' lock is intended to replace the gc mutex.
This doesn't have any practical ramifications. It simply makes it
clearer what the purpose of the lock is. The gc will use this lock,
instead of the gc mutex, to synchronize access to the dead list with
other threads.
Modify _pthread_exit() to use these two new locks instead of GIANT_LOCK,
and also to properly lock and protect thread state changes,
especially with respect to a joining thread.
The gc thread was also re-arranged to be more organized and less nested.
_pthread_join() was also modified to use the thread list locks. However,
locking and unlocking here needs special care because a thread could find
itself in a position where it's joining an exiting thread that is
waiting on the dead list lock, which this thread (joiner) holds. If the
joiner doesn't take care to lock *and* unlock in the same order they
(the joiner and the joinee) could deadlock against each other.
Approved by: re/blanket libthr
pthread_cond_t) internaly in addition to the low-level spinlock_t. The
garbage collector mutex and condition variable are two such examples. This
might lead to critical sections nested within critical sections. Implement
a reference counting mechanism so that signals are masked only on the first
entry and unmasked on the last exit.
I'm not sure I like the idea of nested critical sections, but if
the library is going to use the pthread primitives it might be necessary.
Approved by: re/blanket libthr
Access to the thread's flags and state is protected by
_thread_critical_enter/exit(). When a thread is signaled with a condition
its state must be protected by locking it and disabling
signals before it is taken of the waiters' queue.
Move the implementation of pthread_cond_signal() and pthread_cond_broadcast()
into one function, cond_signal(). Its behaviour is determined by the
last argument, int broadcast. If this is set to 1 it will remove all
waiters, otherwise it will wake up only the first waiter thread.
Remove an extraneous call to pthread_testcancel().
Approved by: re/blanket libthr
that take the address of a struct pthread as their first argument.
_spin[un]lock() just become wrappers arround these two functions.
These new functions are for use in situations where curthread can't be
used. One example is _thread_retire(), where we invalidate the array index
curthread uses to get its pointer..
Approved by: re/blanket libthr
Prevent one thread from messing up another thread's saved signal
mask by saving it in struct pthread instead of leaving it as a
global variable. D'oh!
Approved by: re/blanket libthr
in thr_private.h
o Lock down the ldt_entries array and ldt_free, which points to
the next free slot. As noted in the comments, it's necessary
to special case the initial_thread because %gs is not setup
for it yet. This is ok because that early in the program there
won't be any reentrancy issues anyways.
Approved by: re/blanket libthr
When in either the mutex or cond queue we notice that the thread
is already on one of the queues, don't just simply abort(). Print
out the thread's identifiers and what queue it was on.
Approved by: markm/mentor, re/blanket libthr
is the *only* remaining thread in the application, in which case we
should not core dump, and instead exit gracefully.
Approved by: markm/mentor, re/blanket libthr
of pthread_cond_timedwait() is moved into cond_wait_common().
Pthread_cond_wait() and pthread_cond_timedwait() are now wrappers around
this function. Previously, the former called the latter with the abstime
pointing to 0 time. This violated Posix semantics should an application
have reason to call it with that argument because instead or returning
immediately it would have waited indefinitely for the cv to be signaled.
Approved by: markm/mentor, re/blanket libthr
Reviewed by: jeff
respect to other threads and signal handlers by moving to
the _thread_critical_enter/exit functions.
o Introduce an static function, testcancel(), that is used by
the other functions in this module. This allows it to make
locking assumptions that the top-level functions can't.
o Rework the code flow a bit to reduce indentation levels.
Approved by: markm/mentor, re/blanket libthr
Reviewed by: jeff
Note that the tp register (r13) is reserved as the TLS pointer in
the same way that that gp register (r1) is reserved as the global
pointer. This implementation uses the tp register to point to the
thread structure used by the threads implementation. This is not
in violation with the runtime specification provided the TLS is
a fixed distance from the thread structure. This is only an issue
when code used the __thread keyword to create TLS. This is not
supported at the moment.