about the fpu code here. It should be using fxsave/fxrstor instead of
saving/restoring the control word. The SSE registers are used a lot in
gcc generated code on amd64. I'm not sure how this all fits together
though.
On ia64, where there's no libc_r at all, libkse is now the default
thread library by virtue of these links.
The reasons for this change are:
1. libkse is slated to become the default thread library anyway,
2. active development and maintenance is only present for libkse,
3. GNOME and KDE, both in the process of being supported on ia64,
work better with KSE; even on ia64.
can clear the pointer to mutex, not the thread doing mutex
handoff. Because _mutex_lock_backout does not hold scheduler
lock while testing THR_FLAGS_IN_SYNCQ and then reading mutex
pointer, it is possible mutex owner begin to unlock and
handoff the mutex to the current thread, and mutex pointer
will be cleared to NULL before current thread reading it, so
current thread will end up with deferencing a NULL pointer,
Fix the race by making mutex waiters to clear their mutex pointers.
While I am here, also save inherited priority in mutex for
PTHREAD_PRIO_INERIT mutex in mutex_trylock_common just like what
we did in mutex_lock_common.
for interrupted field.
Also in _thr_sig_handler, retrieve current signal mask from kernel not
from ucp, the later is pre-unioned mask, not current signal mask.
pthread_md.h. This commit only moves the definition; it does not
change it for any of the platforms. This more easily allows 64-bit
architectures (in particular) to pick a slightly larger stack size.
THR_SETCONTEXT as PANIC(). The THR_SETCONTEXT macro is currently not
used, which means that the definition we had could be wrong, overly
pessimistic or unknowingly right. I don't like the odds...
The new _ia64_break_setcontext() and corresponding kernel fixes make
KSE mostly usable. There's still a case where we don't properly
restore a context and end up with a NaT consumption fault (typically
an indication for not handling NaT collection points correctly),
but at least now mutex_d works...
state for amd64 was twice as large as necessary. Peter
recently fixed this, so the comment no longer applies.
Also, since the size of struct mcontext changed, adjust
the threads library version of get&set context to match.
FYI, any change layout/size change to any arch's struct
mcontext will likely need some minor changes in libpthread.
to avoid potential memory leak, also fix a bug in pthread_create, contention
scope should be inherited when PTHREAD_INHERIT_SCHED is set, and also check
right field for PTHREAD_INHERIT_SCHED, scheduling inherit flag is in sched_inherit.
2. Execute hooks registered by atexit() on thread stack but not on scheduler
stack.
3. Simplify some code in _kse_single_thread by calling xxx_destroy functions.
Reviewed by: deischen
should be a value past to pthread_attr_setguardsize, not a rounded up value.
Also fix a stack size matching bug in thr_stack.c, now stack matching code
uses number of pages but not bytes length to match stack size, so for example,
size 512 bytes and size 513 bytes should both match 1 page stack size.
Reviewed by: deischen
a shared library or any other dyanmic allocated data block, once
pthread_once_t is initialized, a mutex is allocated, if we unload the
shared library or free those data block, then there is no way to deallocate
the mutex, result is memory leak.
To fix this problem, we don't use mutex field in pthread_once_t, instead,
we use its state field and an internal mutex and conditional variable in
libkse to do any synchronization, we introduce a third state IN_PROGRESS to
wait if another thread is already in invoking init_routine().
Also while I am here, make pthread_once() conformed to pthread cancellation
point specification.
Reviewed by: deischen
instead of long types for low-level locks.
Add prototypes for some internal libc functions that are
wrapped by the library as cancellation points.
Add memory barriers to alpha atomic swap functions (submitted
by davidxu).
Requested by: bde
critical region, we wrap some syscalls for thread cancellation point, and
when syscalls returns, we call _thr_leave_cancellation_point, at the time
if a signal comes in, it would be buffered, and when the thread leaves
_thr_leave_cancellation_point, buffered signals will be processed, to avoid
messing up normal syscall errno, we should save and restore errno around
signal handling code.
yet, so we can protect some locking code from being interrupted by signal
handling. When KSE mode is turned on, reset the thread flag to scope process
except we are running in 1:1 mode which we needn't turn it off.
Also remove some unused member variables in structure kse.
Tested by: deischen
have execute permissions. Run "perl verify" instead. Replace all
occurences of the hardcoding of ./verify with $(VERIFY) to allow
it to be overridden as well.
otherwise masks all signals until fork() returns, in child process,
we reset library state before restoring signal masks until we reach
a safe to point.
Reviewed by: deischen
happens, the context of the interrupted thread is exported to
userland. Unlike most contexts, it will be an async context and
we cannot easily use our existing functions to set such a
context.
To avoid a lot of complexity that may possibly interfere with
the common case, we simply let the kernel deal with it. However,
we don't use the EPC based syscall path to invoke setcontext(2).
No, we use the break-based syscall path. That way the trapframe
will be compatible with the context we're trying to restore and
we save the kernel a lot of trouble. The kind of trouble we did
not want to go though ourselves...
However, we also need to set the threads mailbox and there's no
syscall to help us out. To avoid creating a new syscall, we use
the context itself to pass the information to the kernel so that
the kernel can update the mailbox. This involves setting a flag
(_MC_FLAGS_KSE_SET_MBOX) and setting ifa (the address) and isr
(the value).
TCB. We know that the thread pointer points to &tcb->tcb_tp, so all
we have to do is subtract offsetof(struct tcb, tcb_tp) from the
thread pointer to get to the TCB. Any reasonably smart compiler will
translate accesses to fields in the TCB as negative offsets from TP.
In _tcb_set() make sure the fake TCB gets a pointer to the current
KCB, just like any other TCB. This fixes a NULL-pointer dereference
in _thr_ref_add() when it tried to get the current KSE.
makecontext(). We only supply 3, not 4. This is mostly harmless,
except that on ia64 the garbage can include NaT bits, resulting
in NaT consumption faults.
that the TLS is 16-byte aligned, as well as guarantee that the thread
pointer is 16-byte aligned as it points to struct ia64_tp. Likewise,
struct tcb and struct ksd are also guaranteed to be 16-byte aligned
(if they weren't already).
archs that can (or are required to) have per-thread registers.
Tested on i386, amd64; marcel is testing on ia64 and will
have some follow-up commits.
Reviewed by: davidxu
context functions. We don't need to enter the kernel anymore. The
contexts are compatible (ie a context created by getcontext() can
be restored by _ia64_restore_context()).
While here, make the use of THR_ALIGNBYTES and THR_ALIGN a no-op.
They are going to be removed anyway.
We write 1 for r8 in the context so that _ia64_restore_context()
will return with a non-zero value. _ia64_save_context() always
return 0.
o In _ia64_restore_context(), don't restore the thread pointer. It
is not normally part of the context. Also, restore the return
registers. We get called for contexts created by getcontext(),
which means we have to restore all the syscall return values.
the userland version of [gs]etcontext to switch between a thread
and the UTS scheduler (and back again). This also fixes a bug
in i386 _thr_setcontext() which wasn't properly restoring the
context.
Reviewed by: davidxu
This eliminates ping-ponging of locks, where the idle KSE wakes
up only to find the lock it needs is being held. This gives
little or no gain to M:N mode but greatly speeds up 1:1 mode.
Reviewed & Tested by: davidxu
handed-off/signaled to a higher priority thread. Note that when
there are idle KSEs that could run the higher priority thread,
we still add the preemption point because it seems to take the
kernel a while to schedule an idle KSE. The drawbacks are that
threads will be swapped more often between CPUs (KSEs) and
that there will be an extra userland context switch (the idle
KSE is still woken and will probably resume the preempted
thread). We'll revisit this if and when idle CPU/KSE wakeup
times improve.
Inspired by: Petri Helenius <pete@he.iki.fi>
Reviewed by: davidxu
is system bound thread and when it is blocked, no upcall is generated.
o Add ability to libkse to allow it run in pure 1:1 threading mode,
defining SYSTEM_SCOPE_ONLY in Makefile can turn on this option.
o Eliminate code for installing dummy signal handler for sigwait call.
o Add hash table to find thread.
Reviewed by: deischen
its waitset, but if the signal is not masked by the thread, the signal
can interrupt the thread and signal action can be invoked by the thread,
sigwait should return with errno set to EINTR.
Also save and restore thread internal state(timeout and interrupted)
around signal handler invoking.