Commit Graph

352 Commits

Author SHA1 Message Date
David Xu
21a9296f63 Remove incorrect comments, also make sure signal is
disabled when unregistering sigaction.
2010-09-01 13:22:55 +00:00
David Xu
12c61c22ce In function __pthread_cxa_finalize(), also make code for removing
atfork handler be async-signal safe.
2010-09-01 07:09:46 +00:00
David Xu
a523216bc6 pthread_atfork should acquire writer lock and protect the code
with critical region.
2010-09-01 03:55:10 +00:00
David Xu
ada33a6e36 Change atfork lock from mutex to rwlock, also make mutexes used by malloc()
module private type, when private type mutex is locked/unlocked, thread
critical region is entered or leaved. These changes makes fork()
async-signal safe which required by POSIX. Note that user's atfork handler
still needs to be async-signal safe, but it is not problem of libthr, it
is user's responsiblity.
2010-09-01 03:11:21 +00:00
David Xu
02c3c85869 Add signal handler wrapper, the reason to add it becauses there are
some cases we want to improve:
  1) if a thread signal got a signal while in cancellation point,
     it is possible the TDP_WAKEUP may be eaten by signal handler
     if the handler called some interruptibly system calls.
  2) In signal handler, we want to disable cancellation.
  3) When thread holding some low level locks, it is better to
     disable signal, those code need not to worry reentrancy,
     sigprocmask system call is avoided because it is a bit expensive.
The signal handler wrapper works in this way:
  1) libthr installs its signal handler if user code invokes sigaction
     to install its handler, the user handler is recorded in internal
     array.
  2) when a signal is delivered, libthr's signal handler is invoke,
     libthr checks if thread holds some low level lock or is in critical
     region, if it is true, the signal is buffered, and all signals are
     masked, once the thread leaves critical region, correct signal
     mask is restored and buffered signal is processed.
  3) before user signal handler is invoked, cancellation is temporarily
     disabled, after user signal handler is returned, cancellation state
     is restored, and pending cancellation is rescheduled.
2010-09-01 02:18:33 +00:00
David Xu
ed0ee6af2e Unregister thread specific data destructor when a corresponding dso
is unloaded.
2010-08-27 05:20:22 +00:00
David Xu
8e60ce996b clear lock to zero state if it is destroyed. 2010-08-27 03:23:07 +00:00
David Xu
1ac3d5022c eliminate unused code. 2010-08-26 09:04:27 +00:00
David Xu
6b932eca79 Decrease rdlock count only when thread unlocked a reader lock.
MFC after:	3 days
2010-08-26 07:09:48 +00:00
Konstantin Belousov
247a32fac5 Remove unused source.
MFC after:	2 weeks
2010-08-24 11:55:25 +00:00
Konstantin Belousov
47536ff629 The __hidden definition is provided by sys/cdefs.h.
MFC after:	2 weeks
2010-08-24 11:54:48 +00:00
David Xu
5cf2219535 Add wrapper for setcontext() and swapcontext(), the wrappers
unblock SIGCANCEL which is needed by thread cancellation.
2010-08-24 09:57:06 +00:00
Konstantin Belousov
ea246b6369 On shared object unload, in __cxa_finalize, call and clear all installed
atexit and __cxa_atexit handlers that are either installed by unloaded
dso, or points to the functions provided by the dso.

Use _rtld_addr_phdr to locate segment information from the address of
private variable belonging to the dso, supplied by crtstuff.c. Provide
utility function __elf_phdr_match_addr to do the match of address against
dso executable segment.

Call back into libthr from __cxa_finalize using weak
__pthread_cxa_finalize symbol to remove any atfork handler which
function points into unloaded object.

The rtld needs private __pthread_cxa_finalize symbol to not require
resolution of the weak undefined symbol at initialization time. This
cannot work, since rtld is relocated before sym_zero is set up.

Idea by:	kan
Reviewed by:	kan (previous version)
MFC after:	3 weeks
2010-08-23 15:38:02 +00:00
David Xu
82746ea546 Reduce redundant code.
Submitted by: kib
2010-08-20 13:42:48 +00:00
David Xu
635f917a9d In current implementation, thread cancellation is done in signal handler,
which does not know what is the state of interrupted system call, for
example, open() system call opened a file and the thread is still cancelled,
result is descriptor leak, there are other problems which can cause resource
leak or undeterminable side effect when a thread is cancelled. However, this
is no longer true in new implementation.

  In defering mode, a thread is canceled if cancellation request is pending and
later the thread enters a cancellation point, otherwise, a later
pthread_cancel() just causes SIGCANCEL to be sent to the target thread, and
causes target thread to abort system call, userland code in libthr then checks
cancellation state, and cancels the thread if needed. For example, the
cancellation point open(), the thread may be canceled at start,
but later, if it opened a file descriptor, it is not canceled, this avoids
file handle leak. Another example is read(), a thread may be canceled at start
of the function, but later, if it read some bytes from a socket, the thread
is not canceled, the caller then can decide if it should still enable cancelling
or disable it and continue reading data until it thinks it has read all
bytes of a packet, and keeps a protocol stream in health state, if user ignores
partly reading of a packet without disabling cancellation, then second iteration
of read loop cause the thread to be cancelled.
An exception is that the close() cancellation point always closes a file handle
despite whether the thread is cancelled or not.

  The old mechanism is still kept, for a functions which is not so easily to
fix a cancellation problem, the rough mechanism is used.

Reviewed by: kib@
2010-08-20 05:15:39 +00:00
David Xu
719863239e According to specification, function fcntl() is a cancellation point only
when cmd argument is F_SETLKW.
2010-08-20 04:15:05 +00:00
David Xu
cdcffc3f1c Tweak code a bit to be POSIX compatible, when a cancellation request
is acted upon, or when a thread calls pthread_exit(), the thread first
disables cancellation by setting its cancelability state to
PTHREAD_CANCEL_DISABLE and its cancelability type to
PTHREAD_CANCEL_DEFERRED. The cancelability state remains set to
PTHREAD_CANCEL_DISABLE until the thread has terminated.

It has no effect if a cancellation cleanup handler or thread-specific
data destructor routine changes the cancelability state to
PTHREAD_CANCEL_ENABLE.
2010-08-17 02:50:12 +00:00
Konstantin Belousov
b144e48b2a Use _SIG_VALID instead of expanded form of the macro.
Submitted by:	Garrett Cooper <yanegomi gmail com>
MFC after:	1 week
2010-07-12 10:15:33 +00:00
Daniel Eischen
1cfc8fc759 Coalesce one more broken line. 2010-05-24 13:44:39 +00:00
Daniel Eischen
9ed8360e53 Coalesce a couple of broken lines since they can fit within 80
characters.  Little nit found while looking at a bug report.
2010-05-24 13:43:11 +00:00
David Xu
60e9cdf158 remove file thr_sem_new.c. 2010-01-05 07:50:31 +00:00
David Xu
791f7a99e2 Remove extra new semaphore stubs, because libc already has them, and
ld can find the newest version which is default.

Poked by: kan@
2010-01-05 06:21:29 +00:00
David Xu
9b0f1823b5 Use umtx to implement process sharable semaphore, to make this work,
now type sema_t is a structure which can be put in a shared memory area,
and multiple processes can operate it concurrently.
User can either use mmap(MAP_SHARED) + sem_init(pshared=1) or use sem_open()
to initialize a shared semaphore.
Named semaphore uses file system and is located in /tmp directory, and its
file name is prefixed with 'SEMD', so now it is chroot or jail friendly.
In simplist cases, both for named and un-named semaphore, userland code
does not have to enter kernel to reduce/increase semaphore's count.
The semaphore is designed to be crash-safe, it means even if an application
is crashed in the middle of operating semaphore, the semaphore state is
still safely recovered by later use, there is no waiter counter maintained
by userland code.
The main semaphore code is in libc and libthr only has some necessary stubs,
this makes it possible that a non-threaded application can use semaphore
without linking to thread library.
Old semaphore implementation is kept libc to maintain binary compatibility.
The kernel ksem API is no longer used in the new implemenation.

Discussed on: threads@
2010-01-05 02:37:59 +00:00
Marcel Moolenaar
e4f141b546 Work-around a race condition on ia64 while unlocking a contested lock.
The race condition is believed to be in UMTX_OP_MUTEX_WAKE. On ia64,
we simply go to the kernel to unlock.
The big question is why this is only a race condition on ia64...

MFC after:	3 days
2009-12-14 01:26:01 +00:00
Konstantin Belousov
066d836b02 Current pselect(3) is implemented in usermode and thus vulnerable to
well-known race condition, which elimination was the reason for the
function appearance in first place. If sigmask supplied as argument to
pselect() enables a signal, the signal might be delivered before thread
called select(2), causing lost wakeup. Reimplement pselect() in kernel,
making change of sigmask and sleep atomic.

Since signal shall be delivered to the usermode, but sigmask restored,
set TDP_OLDMASK and save old mask in td_oldsigmask. The TDP_OLDMASK
should be cleared by ast() in case signal was not gelivered during
syscall execution.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:55:34 +00:00
Jilles Tjoelker
29670497af Make openat(2) a cancellation point.
This is required by POSIX and matches open(2).

Reviewed by:	kib, jhb
MFC after:	1 month
2009-10-11 20:19:45 +00:00
David Xu
daf3ced72b don't report error if key was deleted.
PR:	threads/135462
2009-09-25 00:15:30 +00:00
Attilio Rao
b13c5f2883 rwlock implemented from libthr need to fall through the 'hard path' and
query umtx also if the shared waiters bit is set on a shared lock.
The writer starvation avoidance technique, infact, can lead to shared
waiters on a shared lock which can bring to a missed wakeup and thus
to a deadlock if the right bit is not checked (a notable case is the
writers counterpart to be handled through expired timeouts).

Fix that by checking for the shared waiters bit also when unlocking the
shared locks.

That bug was causing a reported MySQL deadlock.
Many thanks go to Nick Esborn and his employer DesertNet which provided
time and machines to identify and fix this issue.

PR:		thread/135673
Reported by:	Nick Esborn <nick at desert dot net>
Tested by:	Nick Esborn <nick at desert dot net>
Reviewed by:	jeff
2009-09-23 21:38:57 +00:00
Attilio Rao
137ae5d291 In the current code, rdlock_count is not correctly handled for some cases.
The most notable is that it is not bumped in rwlock_rdlock_common() when
the hard path (__thr_rwlock_rdlock()) returns successfully.
This can lead to deadlocks in libthr when rwlocks recursion in read mode
happens.
Fix the interested parts by correctly handling rdlock_count.

PR:		threads/136345
Reported by:	rink
Tested by:	rink
Reviewed by:	jeff
Approved by:	re (kib)
MFC:		2 weeks
2009-07-06 09:31:04 +00:00
Brian Feldman
43af51a2b5 These are some cosmetic changes to improve the clarity of libthr's fork implementation. 2009-05-11 16:45:53 +00:00
Robert Watson
d1f2f1c3f3 Now that the kernel defines CACHE_LINE_SIZE in machine/param.h, use
that definition in the custom locking code for the run-time linker
rather than local definitions.

Pointed out by:	tinderbox
MFC after:	2 weeks
2009-04-19 23:02:50 +00:00
Konstantin Belousov
29986e1bac Forcibly unlock the malloc() locks in the child process after fork(),
by temporary pretending that the process is still multithreaded.
Current malloc lock primitives do nothing for singlethreaded process.

Reviewed by:	davidxu, deischen
2009-03-19 10:32:25 +00:00
David Xu
5b71b82e70 Don't ignore other fcntl functions, directly call __sys_fcntl if
WITHOUT_SYSCALL_COMPAT is not defined.

Reviewed by:	deischen
2009-03-09 05:54:43 +00:00
David Xu
c30c187d60 Don't reference non-existent __fcntl_compat if WITHOUT_SYSCALL_COMPAT is defined.
Submitted by:	Pawel Worach "pawel dot worach at gmail dot com"
2009-03-09 02:34:02 +00:00
Peter Wemm
70ba1e8fc1 When libthr and rtld start up, there are a number of magic spells cast
in order to get the symbol binding state "just so".  This is to allow
locking to be activated and not run into recursion problems later.

However, one of the magic bits involves an explicit call to _umtx_op()
to force symbol resolution.  It does a wakeup operation on a fake,
uninitialized (ie: random contents) umtx.  Since libthr isn't active, this
is harmless.  Nothing can match the random wakeup.

However, valgrind finds this and is not amused.  Normally I'd just
write a suppression record for it, but the idea of passing random
args to syscalls (on purpose) just doesn't feel right.
2008-12-07 02:32:49 +00:00
Konstantin Belousov
10b4034657 Provide custom simple allocator for rtld locks in libthr. The allocator
does not use any external symbols, thus avoiding possible recursion into
rtld to resolve symbols, when called.

Reviewed by:	kan, davidxu
Tested by:	rink
MFC after:	1 month
2008-12-02 11:58:31 +00:00
Alexander Kabaev
97df383415 Invoke _rtld_atfork_post earlier, before we reinitialize rtld locks
by switching into single-thread mode.

libthr ignores broken use of lock bitmaps used by default rtld locking
implementation, this in turn turns lock handoff in _rtld_thread_init
into NOP. This in turn makes child processes of forked multi-threaded
programs to run with _thr_signal_block still in effect, with most
signals blocked.

Reported by: phk, kib
2008-12-01 21:00:25 +00:00
Konstantin Belousov
e711c6f0d1 Unlock the malloc() locks in the child process after fork(). This gives
us working malloc in the fork child of the multithreaded process.

Although POSIX requires that only async-signal safe functions shall be
operable after fork in multithreaded process, not having malloc lower
the quality of our implementation.

Tested by:	rink
Discussed with:	kan, davidxu
Reviewed by:	kan
MFC after:	1 month
2008-11-29 21:46:28 +00:00
Konstantin Belousov
cb5c4b10ba Add two rtld exported symbols, _rtld_atfork_pre and _rtld_atfork_post.
Threading library calls _pre before the fork, allowing the rtld to
lock itself to ensure that other threads of the process are out of
dynamic linker. _post releases the locks.

This allows the rtld to have consistent state in the child. Although
child may legitimately call only async-safe functions, the call may
need plt relocation resolution, and this requires working rtld.

Reported and debugging help by:	rink
Reviewed by:	kan, davidxu
MFC after:	1 month (anyway, not before 7.1 is out)
2008-11-27 11:27:59 +00:00
Marcel Moolenaar
03fad2ad5f Allow psaddr_t to be widened by using thr_pread_{int,long,ptr},
where critical. Some places still use ps_pread/ps_pwrite directly,
but only need changed when byte-order comes into the picture.
Also, change th_p in td_event_msg_t from a pointer type to
psaddr_t, so that events also work when psaddr_t is widened.
2008-09-14 16:07:21 +00:00
Jason Evans
5b3842aefa Move call to _malloc_thread_cleanup() so that if this is the last thread,
the call never happens.  This is necessary because malloc may be used
during exit handler processing.

Submitted by:	davidxu
2008-09-09 17:14:32 +00:00
Jason Evans
d6742bfbd3 Add thread-specific caching for small size classes, based on magazines.
This caching allows for completely lock-free allocation/deallocation in the
steady state, at the expense of likely increased memory use and
fragmentation.

Reduce the default number of arenas to 2*ncpus, since thread-specific
caching typically reduces arena contention.

Modify size class spacing to include ranges of 2^n-spaced, quantum-spaced,
cacheline-spaced, and subpage-spaced size classes.  The advantages are:
fewer size classes, reduced false cacheline sharing, and reduced internal
fragmentation for allocations that are slightly over 512, 1024, etc.

Increase RUN_MAX_SMALL, in order to limit fragmentation for the
subpage-spaced size classes.

Add a size-->bin lookup table for small sizes to simplify translating sizes
to size classes.  Include a hard-coded constant table that is used unless
custom size class spacing is specified at run time.

Add the ability to disable tiny size classes at compile time via
MALLOC_TINY.
2008-08-27 02:00:53 +00:00
David Xu
fc45432be6 In function pthread_condattr_getpshared, store result correctly.
PR:		kern/126128
2008-08-01 01:21:49 +00:00
David Xu
7de1ecef2d Add two commands to _umtx_op system call to allow a simple mutex to be
locked and unlocked completely in userland. by locking and unlocking mutex
in userland, it reduces the total time a mutex is locked by a thread,
in some application code, a mutex only protects a small piece of code, the
code's execution time is less than a simple system call, if a lock contention
happens, however in current implemenation, the lock holder has to extend its
locking time and enter kernel to unlock it, the change avoids this disadvantage,
it first sets mutex to free state and then enters kernel and wake one waiter
up. This improves performance dramatically in some sysbench mutex tests.

Tested by: kris
Sounds great: jeff
2008-06-24 07:32:12 +00:00
David Xu
83a0758789 Make pthread_cleanup_push() and pthread_cleanup_pop() as a pair of macros,
use stack space to keep cleanup information, this eliminates overhead of
calling malloc() and free() in thread library.

Discussed on: thread@
2008-06-09 01:14:10 +00:00
Doug Rabson
cd7d66a21f Call the fcntl compatiblity wrapper from the thread library fcntl wrappers
so that they get the benefit of the (limited) forward ABI compatibility.

MFC after: 1 week
2008-05-30 14:47:42 +00:00
David Xu
1b3418b2dc Eliminate global mutex by using pthread_once's state field as
a semaphore.
2008-05-30 00:02:59 +00:00
David Xu
850f4d66cb - Reduce function call overhead for uncontended case.
- Remove unused flags MUTEX_FLAGS_* and their code.
- Check validity of the timeout parameter in mutex_self_lock().
2008-05-29 07:57:33 +00:00
David Xu
cf181aee60 Remove libc_r's remnant code. 2008-05-06 07:27:11 +00:00
David Xu
8d6a11a070 Use UMTX_OP_WAIT_UINT_PRIVATE and UMTX_OP_WAKE_PRIVATE to save
time in kernel(avoid VM lookup).
2008-04-29 03:58:18 +00:00