however if current thread is executing cancellation handler, signal
SIGCANCEL may have already been blocked, this is unexpected, unblock the
signal in new thread if this happens.
MFC after: 1 week
the semantics of pthread_mutex_islocked_np() to return true if and only if
the mutex is held by the current thread.
Obviously, change the regression test to match.
MFC after: 2 weeks
locked. This is intended primarily to support the userland equivalent
of the various *_ASSERT_LOCKED() macros we have in the kernel.
MFC after: 2 weeks
loop count.
2. Add function pthread_mutex_setyieldloops_np to turn a mutex's yield
loop count.
3. Make environment variables PTHREAD_SPINLOOPS and PTHREAD_YIELDLOOPS
to be only used for turnning PTHREAD_MUTEX_ADAPTIVE_NP mutex.
doesn't use the default CFLAGS which contain -fno-strict-aliasing.
Until the code is cleaned up, just add -fno-strict-aliasing to the
CFLAGS of these for the tinderboxes' sake, allowing the rest of the
tree to have -Werror enabled again.
fixes a NULL-dereference of curthread when libstdc+ initializes
the exception handling globals on archs we can't use GNU TLS due
to lack of support in binutils 2.15 (i.e. arm and sparc64), yet,
thus making threaded C++ programs compiled with GCC 4.2.1 work
again on these archs.
Reviewed by: davidxu
MFC after: 3 days
to tune pthread mutex performance:
1. LIBPTHREAD_SPINLOOPS
If a pthread mutex is being locked by another thread, this environment
variable sets total number of spin loops before the current thread
sleeps in kernel, this saves a syscall overhead if the mutex will be
unlocked very soon (well written application code).
2. LIBPTHREAD_YIELDLOOPS
If a pthread mutex is being locked by other threads, this environment
variable sets total number of sched_yield() loops before the currrent
thread sleeps in kernel. if a pthread mutex is locked, the current thread
gives up cpu, but will not sleep in kernel, this means, current thread
does not set contention bit in mutex, but let lock owner to run again
if the owner is on kernel's run queue, and when lock owner unlocks the
mutex, it does not need to enter kernel and do lots of work to resume
mutex waiters, in some cases, this saves lots of syscall overheads for
mutex owner.
In my practice, sometimes LIBPTHREAD_YIELDLOOPS can massively improve performance
than LIBPTHREAD_SPINLOOPS, this depends on application. These two environments
are global to all pthread mutex, there is no interface to set them for each
pthread mutex, the default values are zero, this means spinning is turned off
by default.
is also implemented in glibc and is used by a number of existing
applications (mysql, firefox, etc).
This mutex type is a default mutex with the additional property that
it spins briefly when attempting to acquire a contested lock, doing
trylock operations in userland before entering the kernel to block if
eventually unsuccessful.
The expectation is that applications requesting this mutex type know
that the mutex is likely to be only held for very brief periods, so it
is faster to spin in userland and probably succeed in acquiring the
mutex, than to enter the kernel and sleep, only to be woken up almost
immediately. This can help significantly in certain cases when
pthread mutexes are heavily contended and held for brief durations
(such as mysql).
Spin up to 200 times before entering the kernel, which represents only
a few us on modern CPUs. No performance degradation was observed with
this value and it is sufficient to avoid a large performance drop in
mysql performance in the heavily contended pthread mutex case.
The libkse implementation is a NOP.
Reviewed by: jeff
MFC after: 3 days
threading library.
- Now that libpthread is a symlink, it's no longer possible
to link applications with libpthread and have libmap.conf(5)
select the desired threading library; applications will be
linked to the default threading library, libkse or libthr.
Remove an obsolete paragraph.
- Mention that improvements can be seen compared to libkse.
Reviewed by: deischen, davidxu
the threading libraries is built. This simplifies the
logic in makefiles that need to check if the pthreads
support is present. It also fixes a bug where we would
build a threading library that we shouldn't have built:
for example, building with WITHOUT_LIBTHR and the default
value of DEFAULT_THREADING_LIB (libthr) would mistakenly
build the libthr library, but not install it.
Approved by: re (kensmith)
Warning, after symbol versioning is enabled, going back is not easy
(use WITHOUT_SYMVER at your own risk).
Change the default thread library to libthr.
There most likely still needs to be a version bump for at least the
thread libraries. If necessary, this will happen later.
_thr_ucond_broadcast, clear condition variable pointer in cancellation
info after returing from _thr_ucond_wait, since kernel has already
dropped the internal lock, so we don't need to unlock it in cancellation
handler again.
is also returned by pthread_detach() if a thread was already
detached, the error code was already documented:
> [EINVAL] The implementation has detected that the value speci-
> fied by thread does not refer to a joinable thread.
o The TLS pointer (r2) points 0x7000 after the *end* of the TCB.
o _rtld_allocate_tls() gets a pointer to the current TCB, not the
current TLS pointer.
o _rtld_free_tls() gets the size of the TCB structure.
into pthread structure to keep track of locked PTHREAD_PRIO_PROTECT mutex,
no real mutex code is changed, the mutex locking and unlocking code should
has same performance as before.
wait(), waitpid() and usleep(), they are internal versions and
should not be cancellation points.
2. Make wait3() as a cancellation point.
3. Move raise() and pause() into file thr_sig.c.
4. Add functions _sigsuspend, _sigwait, _sigtimedwait and _sigwaitinfo,
remove SIGCANCEL bit in wait-set for those functions, the signal is
used internally to implement thread cancellation.
to make it work, turnstile like mechanism to support priority
propagating and other realtime scheduling options in kernel
should be available to userland mutex, for the moment, I just
want to make libthr be simple and efficient thread library.
Discussed with: deischen, julian
* Add posix_memalign().
* Move calloc() from calloc.c to malloc.c. Add a calloc() implementation in
rtld-elf in order to make the loader happy (even though calloc() isn't
used in rtld-elf).
* Add _malloc_prefork() and _malloc_postfork(), and use them instead of
directly manipulating __malloc_lock.
Approved by: phk, markm (mentor)
operation, the caller is blocked util target threads are really
suspended, also avoid suspending a thread when it is holding a
critical lock.
Fix a bug in _thr_ref_delete which tests a never set flag.
rewritten, now timers created with same sigev_notify_attributes will
run in same thread, this allows user to organize which timers can
run in same thread to save some thread resource.
to point at libmap.conf(5). This will help answer questions about what
and why it is, although not in great detail.
Approved by: re (scottl)
MFC after: 1 week
MFC note: When MFC'd, don't MFC mention of work not yet MFC'd.
1. fast simple type mutex.
2. __thread tls works.
3. asynchronous cancellation works ( using signal ).
4. thread synchronization is fully based on umtx, mainly, condition
variable and other synchronization objects were rewritten by using
umtx directly. those objects can be shared between processes via
shared memory, it has to change ABI which does not happen yet.
5. default stack size is increased to 1M on 32 bits platform, 2M for
64 bits platform.
As the result, some mysql super-smack benchmarks show performance is
improved massivly.
Okayed by: jeff, mtm, rwatson, scottl
run as a 32 bit support library for an amd64 kernel. 32 bit consumers of
libthr have zero chance of running on an amd64 kernel since we don't
implement the i386_set_ldt() family of functions. Note that this commit
doesn't make it actually work, it just removes one more obstacle.
no userland locks are heald, the dead thread lock can no longer protect
access to it. Therefore, instead of using an if (!dead)...else clause
after walking the active threads list test the thread pointer before
deciding not to walk the dead threads list. If the thread pointer is null
it means it was not found in the active threads list and the dead threads
list should be checked.
2. Do not free the stack of a thread that is not marked dead. This is the
2nd and final part of eliminating the race to free a thread's stack.
MFC after: 3 days
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.
Discussed with: davidxu, deischen, marcel
MFC after: 3 days
specified mutex is invalid. In spec parlance 'MAY FAIL' means it's
up to the implementor. So, remove the check for NULL pointers for two
reasons:
1. A mutex may be invalid without necessarily being NULL.
2. If the pointer to the mutex is NULL core-dumping in the
vicinity of the problem is much much much better than failing
in some other part of the code (especially when the application
doesn't check the return value of the function that you oh so
helpfully set to EINVAL).
o In the rwlock code: move a duplicated check inside an if..else to after
the if...else clause.
o When initializing a static rwlock move the initialization check
inside the lock.
o In thr_setschedparam.c: When breaking out of the trylock...retry if busy
loop make sure to reset the mtx pointer to null if the mutex is nolonger
in a queue.
pointer to the corresponding struct thread to the thread ID (lwpid_t)
assigned to that thread. The primary reason for this change is that
libthr now internally uses the same ID as the debugger and the kernel
when referencing to a kernel thread. This allows us to implement the
support for debugging without additional translations and/or mappings.
To preserve the ABI, the 1:1 threading syscalls, including the umtx
locking API have not been changed to work on a lwpid_t. Instead the
1:1 threading syscalls operate on long and the umtx locking API has
not been changed except for the contested bit. Previously this was
the least significant bit. Now it's the most significant bit. Since
the contested bit should not be tested by userland, this change is
not expected to be visible. Just to be sure, UMTX_CONTESTED has been
removed from <sys/umtx.h>.
Reviewed by: mtm@
ABI preservation tested on: i386, ia64
a fork, make sure that the current thread isn't detached and freed. As
a consequence the thread should be inserted into the head of the
active list only once (in the beginning).
followed are: Only 3 functions (pthread_cancel, pthread_setcancelstate,
pthread_setcanceltype) are required to be async-signal-safe by POSIX. None of
the rest of the pthread api is required to be async-signal-safe. This means
that only the three mentioned functions are safe to use from inside
signal handlers.
However, there are certain system/libc calls that are
cancellation points that a caller may call from within a signal handler,
and since they are cancellation points calls have to be made into libthr
to test for cancellation and exit the thread if necessary. So, the
cancellation test and thread exit code paths must be async-signal-safe
as well. A summary of the changes follows:
o Almost all of the code paths that masked signals, as well as locking the
pthread structure now lock only the pthread structure.
o Signals are masked (and left that way) as soon as a thread enters
pthread_exit().
o The active and dead threads locks now explicitly require that signals
are masked.
o Access to the isdead field of the pthread structure is protected by both
the active and dead list locks for writing. Either one is sufficient for
reading.
o The thread state and type fields have been combined into one three-state
switch to make it easier to read without requiring a lock. It doesn't need
a lock for writing (and therefore for reading either) because only the
current thread can write to it and it is an integer value.
o The thread state field of the pthread structure has been eliminated. It
was an unnecessary field that mostly duplicated the flags field, but
required additional locking that would make a lot more code paths require
signal masking. Any truly unique values (such as PS_DEAD) have been
reborn as separate members of the pthread structure.
o Since the mutex and condvar pthread functions are not async-signal-safe
there is no need to muck about with the wait queues when handling
a signal ...
o ... which also removes the need for wrapping signal handlers and sigaction(2).
o The condvar and mutex async-cancellation code had to be revised as a result
of some of these changes, which resulted in semi-unrelated changes which
would have been difficult to work on as a separate commit, so they are
included as well.
The only part of the changes I am worried about is related to locking for
the pthread joining fields. But, I will take a closer look at them once this
mega-patch is committed.
makeing sure the spinlock isn't already in use might be a nice feature to
have in theory, it's hard to implement in practice since the passed in
pointer may not be NULL, but still be an invalid value (i.e. 1..2..3.. etc).
functionality spelled out in SUSv3.
o Signal of 0 means do everything except send the signal
o Check that the signal is not invalid
o Check that the target thread is not dead/invalid
sigprocmask no longer needs to be wrapped.
o raise(3) is applied to the calling thread in a threaded program.
o In the sigaction wrapper reference the correct structure.
o Don't treat SIGTHR especially anymore (infact it won't exist in
a little while).
we still have to DTRT when an asynchronously cancellable thread is
cancelled while waiting for a mutex.
o While dequeueing a waiting mutex don't skip a thread if it has
a cancel pending. Only skip it if it is also async cancellable.
the cause of any bugs because it is *always* indirectly set
in the for...loop, but better to be explicit about it.
o Check the magic number of the passed in thread only after it has
been found in the active thread list. Otherwise, if the check is done
at the very beginning we may end up pointing to garbage if the
thread was once a valid thread, but has now been destroyed.
that this provokes. "Wherever possible" means "In the kernel OR NOT
C++" (implying C).
There are places where (void *) pointers are not valid, such as for
function pointers, but in the special case of (void *)0, agreement
settles on it being OK.
Most of the fixes were NULL where an integer zero was needed; many
of the fixes were NULL where ascii <nul> ('\0') was needed, and a
few were just "other".
Tested on: i386 sparc64
has been executed. On return from the signal handler
the call will either be restarted or EINTR will be returned,
but it will not go back to its previous state. So, it is
sufficient to simply change the state to 'running' without
actually trying to wake up the thread.
a PTHREAD_RWLOCK_INITIALIZER to do for rwlocks what
a similarly named symbol does for statically initialized mutexes.
This symbol was dropped in The Open Group Base Specifications Issue 6
and does not exist in IEEE Std 1003.1, 2003, but it should still be
supported for backwards compatibility.
Pointy hat: mtm
o Instead of checking both the passed in pointer and its value
for NULL, only check the latter. Any caller that passes in
a NULL pointer is obviously wrong.