to tune pthread mutex performance:
1. LIBPTHREAD_SPINLOOPS
If a pthread mutex is being locked by another thread, this environment
variable sets total number of spin loops before the current thread
sleeps in kernel, this saves a syscall overhead if the mutex will be
unlocked very soon (well written application code).
2. LIBPTHREAD_YIELDLOOPS
If a pthread mutex is being locked by other threads, this environment
variable sets total number of sched_yield() loops before the currrent
thread sleeps in kernel. if a pthread mutex is locked, the current thread
gives up cpu, but will not sleep in kernel, this means, current thread
does not set contention bit in mutex, but let lock owner to run again
if the owner is on kernel's run queue, and when lock owner unlocks the
mutex, it does not need to enter kernel and do lots of work to resume
mutex waiters, in some cases, this saves lots of syscall overheads for
mutex owner.
In my practice, sometimes LIBPTHREAD_YIELDLOOPS can massively improve performance
than LIBPTHREAD_SPINLOOPS, this depends on application. These two environments
are global to all pthread mutex, there is no interface to set them for each
pthread mutex, the default values are zero, this means spinning is turned off
by default.
is also implemented in glibc and is used by a number of existing
applications (mysql, firefox, etc).
This mutex type is a default mutex with the additional property that
it spins briefly when attempting to acquire a contested lock, doing
trylock operations in userland before entering the kernel to block if
eventually unsuccessful.
The expectation is that applications requesting this mutex type know
that the mutex is likely to be only held for very brief periods, so it
is faster to spin in userland and probably succeed in acquiring the
mutex, than to enter the kernel and sleep, only to be woken up almost
immediately. This can help significantly in certain cases when
pthread mutexes are heavily contended and held for brief durations
(such as mysql).
Spin up to 200 times before entering the kernel, which represents only
a few us on modern CPUs. No performance degradation was observed with
this value and it is sufficient to avoid a large performance drop in
mysql performance in the heavily contended pthread mutex case.
The libkse implementation is a NOP.
Reviewed by: jeff
MFC after: 3 days
_thr_ucond_broadcast, clear condition variable pointer in cancellation
info after returing from _thr_ucond_wait, since kernel has already
dropped the internal lock, so we don't need to unlock it in cancellation
handler again.
is also returned by pthread_detach() if a thread was already
detached, the error code was already documented:
> [EINVAL] The implementation has detected that the value speci-
> fied by thread does not refer to a joinable thread.
into pthread structure to keep track of locked PTHREAD_PRIO_PROTECT mutex,
no real mutex code is changed, the mutex locking and unlocking code should
has same performance as before.
wait(), waitpid() and usleep(), they are internal versions and
should not be cancellation points.
2. Make wait3() as a cancellation point.
3. Move raise() and pause() into file thr_sig.c.
4. Add functions _sigsuspend, _sigwait, _sigtimedwait and _sigwaitinfo,
remove SIGCANCEL bit in wait-set for those functions, the signal is
used internally to implement thread cancellation.
to make it work, turnstile like mechanism to support priority
propagating and other realtime scheduling options in kernel
should be available to userland mutex, for the moment, I just
want to make libthr be simple and efficient thread library.
Discussed with: deischen, julian
* Add posix_memalign().
* Move calloc() from calloc.c to malloc.c. Add a calloc() implementation in
rtld-elf in order to make the loader happy (even though calloc() isn't
used in rtld-elf).
* Add _malloc_prefork() and _malloc_postfork(), and use them instead of
directly manipulating __malloc_lock.
Approved by: phk, markm (mentor)
operation, the caller is blocked util target threads are really
suspended, also avoid suspending a thread when it is holding a
critical lock.
Fix a bug in _thr_ref_delete which tests a never set flag.