returning the error directly.
For sem_post(), make sure that the correct thread is woken up. This has
unfortunate performance implications, but is necessary for POSIX compliance.
Approved by: jkh
just use _foo() <-- foo(). In the case of a libpthread that doesn't do
call conversion (such as linuxthreads and our upcoming libpthread), this
is adequate. In the case of libc_r, we still need three names, which are
now _thread_sys_foo() <-- _foo() <-- foo().
Convert all internal libc usage of: aio_suspend(), close(), fsync(), msync(),
nanosleep(), open(), fcntl(), read(), and write() to _foo() instead of foo().
Remove all internal libc usage of: creat(), pause(), sleep(), system(),
tcdrain(), wait(), and waitpid().
Make thread cancellation fully POSIX-compliant.
Suggested by: deischen
are not supported by this implementation, and the error return values
from sem_init(), sem_open(), sem_close(), and sem_unlink() reflect this.
Approved by: jkh
signal handler. Explicitly check for jumps to anywhere other than the
current stack, since such jumps are undefined according to POSIX.
While we're at it, convert thread cancellation to use continuations, since
it's cleaner than the original cancellation code.
Avoid delivering a signal to a thread twice. This was a pre-existing bug,
but was likely unexposed until these other changes were made.
Defer signals generated by pthread_kill() so that they can be delivered on
the appropriate stack. deischen claims that this is unnecessary, which is
likely true, but without this change, pthread_kill() can cause undefined
priority queue states and/or PANICs in [sig|_]longjmp(), so I'm leaving
this in for now. To compile this code out and exercise the bug, define
the _NO_UNDISPATCH cpp macro. Defining _PTHREADS_INVARIANTS as well will
cause earlier crashes.
PR: kern/14685
Collaboration with: deischen
the case that a CPU hungry main thread is prevented from being preempted
due to a negative calculation of its time slice.
Reported by: Alexander Litvin <archer@lucky.net>
the initial thread). Instead, just leave an unmapped gap between thread
stacks and make sure that the thread stacks won't grow into these gaps,
simply by limiting the size of the stacks with the 'len' argument to
mmap(). This (if I understand correctly) reduces VM overhead
considerably.
Reviewed by: deischen
handler. Thread-to-thread signals (pthread_signal) are treated differently
than process signals; a pthread_signal can wakeup a blocked thread if
a signal handler is not installed for that signal.
Found by: ACE tests
o Cancellation flags were not getting properly set/cleared.
o Loops waiting for internal locks were not being exited
correctly by a cancelled thread.
o Minor spelling (cancelation -> cancellation) and formatting
corrections (missing tab).
Found by: tg
Reviewed by: jasone
o Don't call signal handlers with the signal handler access lock
held.
o Remove pending signals before calling signal handlers. If
pending signals were not removed prior to handling them,
invocation of the handler could cause the handler to be
called more than once for the same signal. Found by: JB
o When SIGCHLD arrives, wake up all threads in PS_WAIT_WAIT
(wait4).
PR: bin/15328
Reviewed by: jasone
Before this change, a signal was delivered to each thread that
didn't have the signal masked. Signals also improperly woke up
threads waiting on I/O. With this change, signals are now
handled in the following way:
o If a thread is waiting in a sigwait for the signal,
then the thread is woken up.
o If no threads are sigwait'ing on the signal and a
thread is in a sigsuspend waiting for the signal,
then the thread is woken up.
o In the case that no threads are waiting or suspended
on the signal, then the signal is delivered to the
first thread we find that has the signal unmasked.
o If no threads are waiting or suspended on the signal,
and no threads have the signal unmasked, then the signal
is added to the process wide pending signal set. The
signal will be delivered to the first thread that unmasks
the signal.
If there is an installed signal handler, it is only invoked
if the chosen thread was not in a sigwait.
In the case that multiple threads are waiting or suspended
on a signal, or multiple threads have the signal unmasked,
we wake up/deliver the signal to the first thread we find.
The above rules still apply.
Reported by: Scott Hess <scott@avantgo.com>
Reviewed by: jb, jasone
to use mmap(..., MAP_STACK, ...) on alpha too since that should work
now.
* Add hooks to allow GDB to access the internals of pthreads without
having to know the exact layout of struct pthread.
Reviewed by: deischen
eischen (Daniel Eischen) added wrappers to protect against cancled
threads orphaning internal resources.
the cancelability code is still a bit fuzzy but works for test
programs of my own, OpenBSD's and some examples from ORA's books.
add readdir_r to both libc and libc_r
add some 'const' attributes to function parameters
Reviewed by: eischen, jasone
with NetBSD and the Single Unix Specification v2.
This updates some structures with other, almost equivalent types and
effort is under way to get the whole more consistent.
Also removes a double definition of INET6 and some other clean-ups.
Reviewed by: green, bde, phk
Some part obtained from: NetBSD, SUSv2 specification
-----------------------------
Most of the userland changes are in libc. For both the alpha
and the i386 setjmp has been changed to accomodate for the
new sigset_t. Internally, libc is mostly rewritten to use the
new syscalls. The exception is in compat-43/sigcompat.c
The POSIX thread library has also been rewritten to use the
new sigset_t. Except, that it currently only handles NSIG
signals instead of the maximum _SIG_MAXSIG. This should not
be a problem because current applications don't use any
signals higher than NSIG.
There are version bumps for the following libraries:
libdialog
libreadline
libc
libc_r
libedit
libftpio
libss
These libraries either a) have one of the modified structures
visible in the interface, or b) use sigset_t internally and
may cause breakage if new binaries are used against libraries
that don't have the sigset_t change. This not an immediate
issue, but will be as soon as applications start using the
new range to its fullest.
NOTE: libncurses already had an version bump and has not been
given one now.
NOTE: doscmd is a real casualty and has been disconnected for
the moment. Reconnection will eventually happen after
doscmd has been fixed. I'm aware that being the last one
to touch it, I'm automaticly promoted to being maintainer.
According to good taste this means that I will receive a
badge which either will be glued or mechanically stapled,
drilled or otherwise violently forced onto me :-)
NOTE: pcvt/vttest cannot be compiled with -traditional. The
change cause sys/types to be included along the way which
contains the const and volatile modifiers. I don't consider
this a solution, but more a workaround.
might have been mmapped, and if so, passing the pointer to free() is
really not a good idea.
[ In the next millenium, when I've taken over the world, I'm going
to ban 8 character tabs. You've been warned. ]
Always use mmap() for default-size stack allocation. Use MAP_ANON instead
of MAP_STACK on the alpha architecture.
Reduce the amount of code executed while owning _gc_mutex during stack
allocation.
Cache discarded default thread stacks for use in subsequent thread creations.
Create a red zone at the end of each stack (including the initial thread
stack), with the hope of causing a segfault if a stack overflows.
To activate these modifications, add -D_PTHREAD_GSTACK to CFLAGS in
src/lib/libc_r/Makefile. Since the modifications depend on the VM_STACK
kernel option, I'm not sure how to safely use growable stacks by default.
Testing, as well as algorithmic and stylistic comments are welcome.
o The polling mechanism for I/O readiness was changed from
select() to poll(). In additon, a wrapped version of poll()
is now provided.
o The wrapped select routine now converts each fd_set to a
poll array so that the thread scheduler doesn't have to
perform a bitwise search for selected fds each time file
descriptors are polled for I/O readiness.
o The thread scheduler was modified to use a new queue (_workq)
for threads that need work. Threads waiting for I/O readiness
and spinblocks are added to the work queue in addition to the
waiting queue. This reduces the time spent forming/searching
the array of file descriptors being polled.
o The waiting queue (_waitingq) is now maintained in order of
thread wakeup time. This allows the thread scheduler to
find the nearest wakeup time by looking at the first thread
in the queue instead of searching the entire queue.
o Removed file descriptor locking for select/poll routines. An
application should not rely on the threads library for providing
this locking; if necessary, the application should use mutexes
to protect selecting/polling of file descriptors.
o Retrieve and use the kernel clock rate/resolution at startup
instead of hardcoding the clock resolution to 10 msec (tested
with kernel running at 1000 HZ).
o All queues have been changed to use queue.h macros. These
include the queues of all threads, dead threads, and threads
waiting for file descriptor locks.
o Added reinitialization of the GC mutex and condition variable
after a fork. Also prevented reallocation of the ready queue
after a fork.
o Prevented the wrapped close routine from closing the thread
kernel pipes.
o Initialized file descriptor table for stdio entries at thread
init.
o Provided additional flags to indicate to what queues threads
belong.
o Moved TAILQ initialization for statically allocated mutex and
condition variables to after the spinlock.
o Added dispatching of signals to pthread_kill. Removing the
dispatching of signals from thread activation broke sigsuspend
when pthread_kill was used to send a signal to a thread.
o Temporarily set the state of a thread to PS_SUSPENDED when it
is first created and placed in the list of threads so that it
will not be accidentally scheduled before becoming a member
of one of the scheduling queues.
o Change the signal handler to queue signals to the thread kernel
pipe if the scheduling queues are protected. When scheduling
queues are unprotected, signals are then dequeued and handled.
o Ensured that all installed signal handlers block the scheduling
signal and that the scheduling signal handler blocks all
other signals. This ensures that the signal handler is only
interruptible for and by non-scheduling signals. An atomic
lock is used to decide which instance of the signal handler
will handle pending signals.
o Removed _lock_thread_list and _unlock_thread_list as they are
no longer used to protect the thread list.
o Added missing RCS IDs to modified files.
o Added checks for appropriate queue membership and activity when
adding, removing, and searching the scheduling queues. These
checks add very little overhead and are enabled when compiled
with _PTHREADS_INVARIANTS defined. Suggested and implemented
by Tor Egge with some modification by me.
o Close a race condition in uthread_close. (Tor Egge)
o Protect the scheduling queues while modifying them in
pthread_cond_signal and _thread_fd_unlock. (Tor Egge)
o Ensure that when a thread gets a mutex, the mutex is on that
threads list of owned mutexes. (Tor Egge)
o Set the kernel-in-scheduler flag in _thread_kern_sched_state
and _thread_kern_sched_state_unlock to prevent a scheduling
signal from calling the scheduler again. (Tor Egge)
o Don't use TAILQ_FOREACH macro while searching the waiting
queue for threads in a sigwait state, because a change of
state destroys the TAILQ link. It is actually safe to do
so, though, because once a sigwaiting thread is found, the
loop ends and the function returns. (Tor Egge)
o When dispatching signals to threads, make the thread inherit
the signal deferral flag of the currently running thread.
(Tor Egge)
Submitted by: Daniel Eischen <eischen@vigrid.com> and
Tor Egge <Tor.Egge@fast.no>
o Runnable threads are now maintained in priority queues. The
implementation requires two things:
1.) The priority queues must be protected during insertion
and removal of threads. Since the kernel scheduler
must modify the priority queues, a spinlock for
protection cannot be used. The functions
_thread_kern_sched_defer() and _thread_kern_sched_undefer()
were added to {un}defer kernel scheduler activation.
2.) A thread (active) priority change can be performed only
when the thread is removed from the priority queue. The
implementation uses a threads active priority when
inserting it into the queue.
A by-product is that thread switches are much faster. A
separate queue is used for waiting and/or blocked threads,
and it is searched at most 2 times in the kernel scheduler
when there are active threads. It should be possible to
reduce this to once by combining polling of threads waiting
on I/O with the loop that looks for timed out threads and
the minimum timeout value.
o Functions to defer kernel scheduler activation were added. These
are _thread_kern_sched_defer() and _thread_kern_sched_undefer()
and may be called recursively. These routines do not block the
scheduling signal, but latch its occurrence. The signal handler
will not call the kernel scheduler when the running thread has
deferred scheduling, but it will be called when running thread
undefers scheduling.
o Added support for _POSIX_THREAD_PRIORITY_SCHEDULING. All the
POSIX routines required by this should now be implemented.
One note, SCHED_OTHER, SCHED_FIFO, and SCHED_RR are required
to be defined by including pthread.h. These defines are currently
in sched.h. I modified pthread.h to include sched.h but don't
know if this is the proper thing to do.
o Added support for priority protection and inheritence mutexes.
This allows definition of _POSIX_THREAD_PRIO_PROTECT and
_POSIX_THREAD_PRIO_INHERIT.
o Added additional error checks required by POSIX for mutexes and
condition variables.
o Provided a wrapper for sigpending which is marked as a hidden
syscall.
o Added a non-portable function as a debugging aid to allow an
application to monitor thread context switches. An application
can install a routine that gets called everytime a thread
(explicitly created by the application) gets context switched.
The routine gets passed the pthread IDs of the threads that are
being switched in and out.
Submitted by: Dan Eischen <eischen@vigrid.com>
Changes by me:
o Added a PS_SPINBLOCK state to deal with the priority inversion
problem most often (I think) seen by threads calling malloc/free/realloc.
o Dispatch signals to the running thread directly rather than at a
context switch to avoid the situation where the switch never occurs.
make pthread_yield() more reliable,
threads always (I hope) preempted at least every 0.1 sec, as intended.
PR: bin/7744
Submitted by: "Richard Seaman, Jr." <dick@tar.com>
the thread kernel into a garbage collector thread which is started when
the fisrt thread is created (other than the initial thread). This
removes the window of opportunity where a context switch will cause a
thread that has locked the malloc spinlock, to enter the thread kernel,
find there is a dead thread and try to free memory, therefore trying
to lock the malloc spinlock against itself.
The garbage collector thread acts just like any other thread, so
instead of having a spinlock to control accesses to the dead thread
list, it uses a mutex and a condition variable so that it can happily
wait to be signalled when a thread exists.
launching an application into space when someone tries to debug it.
The dead thread list now has it's own link pointer, so use that when
reporting the grateful dead.
- Add support of a thread being listed in the dead thread list as well
as the thread list.
- Add a new thread state to make sigwait work properly. (Submitted by
Daniel M. Eischen <eischen@vigrid.com>)
- Add global variable for the garbage collector mutex and condition
variable.
- Delete a couple of prototypes that are no longer required.
- Add a prototype for the garbage collector thread.
to fork. It is difficult to do real vfork in libc_r, since almost every
operation with file descriptsor changes _thread_fd_table and friends.
popen(3) works much better with this change.
are started instead of init (pid = 1). This allows an embedded
implementation quite like VxWorks, with (possibly) a single threaded
program running instead of init. The neat thing is that the same threaded
process can run in a multi-user workstation environment too.
initialized mutex. Statically initialized mutexes are actually
initialized at first use (pthread_mutex_lock/pthread_mutex_trylock).
To prevent concurrent initialization by multiple threads, all
static initializations are now serialized by a spinlock.
Reviewed by: jb
signal can arrive before the thread is woken from it's wait4. In this
case, don't return an EINTR, just set the thread state to running and
the wait4 wrapper will loop and get the exit status of the process.
line number every time a file descriptor is locked.
This looks like a big change but it isn't. It should reduce the size
of libc_r and make it run slightly faster.
time that a thread keeps the file descriptor table locked. In particular,
perform malloc/free calls outside the lock and handle the situation
where two threads can race to initialise the table entry for the same
file descriptor.
with -D_LOCK_DEBUG. This adds the file name and line number to each lock
call and these are stored in the spinlock structure. When using debug
mode, the lock function will check if the thread is trying to lock
something it has already locked. This is not supposed to happen because
the lock will be freed too early.
Without lock debug, libc_r should be smaller and slightly faster.
cleanup destructor, so trap this case to prevent me from being being
burnt again by applications that try to do this. With this change, an
application (like one using a mis-configured ACE) will exit the process
after displaying a message quoting the POSIX section that the application
has violated.
is allocated or not, rather than keeping a count and attempting to
know it it is in-use. POSIX says that once a key is deleted, using the
key again results in undefined behaviour.
written without returning to the caller. This only occurs on pipes
where either the number of bytes written is greater than the pipe
buffer or if there is insufficient space in the pipe buffer because the
reader is reading slower than the writer is writing.
for the process, not a separate set for each thread). By default, the
process now only has signal handlers installed for SIGVTALRM, SIGINFO
and SIGCHLD. The thread kernel signal handler is installed for other
signals on demand. This means that SIG_IGN and SIG_DFL processing is now
left to the kernel, not the thread kernel.
Change the signal dispatch to no longer use a signal thread, and
call the signal handler using the stack of the thread that has the
signal pending.
Change the atomic lock method to use test-and-set asm code with
a yield if blocked. This introduces separate locks for each type
of object instead of blocking signals to prevent a context
switch. It was this blocking of signals that caused the performance
degradation the people have noted.
This is a *big* change!
it was. Add a FILE_WAIT state and queue threads waiting for a FILE
lock. Start using the sys/queue.h macros instead of the way that MIT
pthreads did it.
Add a thread name to the private thread structure and a non-POSIX
function to set this. This helps (me at least) when sending a SIGINFO
to a threaded process to get a /tmp/uthread.dump to see what the
<expletive deleted> threads are doing this time. It is nice to be
able to recognise (yes, I spell that with an 's' too) which threads
are which.
threads from invalid ones. The pthread structure is opaque to the user
so this change does not cause any incompatibilities.
Hopefully this change will help code that was written for draft 4
fail gracefully if the programmer ignores the compiler warning about
the change in the level of indirection for the argument passed to
pthread_detach(). I got burnt, so I fixed then (expletive deleted)
thing.
These functions comply with the revised standard. That should shut
Terry up!
specifically:
uthread_accept.c: Fix for inherited socket not getting correct entry in
pthread flags.
uthread_create.c: Fix to allow pthread_t pointer return to be null if
caller doesn't care about return.
uthread_fd.c: Fix for return codes to be placed into correct errno.
uthread_init.c: Changes to make gcc-2.8 thread aware for exception stack
frames (WARNING: This is #ifdef'ed out by default and is
different from the Cygnus egcs fix).
uthread_ioctl.c: Fix for blocking/non-blocking ioctl.
uthread_kern.c: Signal handling fixes (only one case left to fix,
that of an externally sent SIGSEGV and friends -
a fairly unusual case).
uthread_write.c: Fix for lock of fd - ask for write lock, not read/write.
uthread_writev.c: Fix for lock of fd - ask for write lock, not read/write.
Pthreads now works well enough to run the LDAP and ACAPD(with the gcc 2.8 fix)
sample implementations.
functions would return -1 and set errno to indicate the specific error.
POSIX requires that the functions return the error code as the return
value of the function instead.
The addition of the nanosleep syscall was correctly added to
libc/sys/Makefile so that it is renamed as _thread_sys_nanosleep().
This syscall is one of those that libc_r has to re-implement because
the only behaviour is to block the process. So libc_r just ignores the
fact that a nanosleep syscall exists and goes its own way - as it has
done all along .... and now it does again. And now a simple program
can sleep again. Phew.
which don't provide a non-blocking interface.
This is a short term "fix" which changes a half-lose to a half-win.
The thread that accesses a device that does not provide a non-blocking
interface will block for its time slice.
A medium term solution would be to use rfork. A long-term solution
would be some sort of kernel thread/SMP implementation.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
Here are the diffs for libc_r to get it one step closer to P1003.1c
These make most of the thread/mutex/condvar structures opaque to the
user. There are three functions which have been renamed with _np
suffixes because they are extensions to P1003.1c (I did them for JAVA,
which needs to suspend/resume threads and also start threads suspended).
I've created a new header (pthread_np.h) for the non-POSIX stuff.
The egrep tags stuff in /usr/src/lib/libc_r/Makefile that I uncommented
doesn't work. I think its best to delete it. I don't think libc_r needs
tags anyway, 'cause most of the source is in libc which does have tags.
also:
Here's the first batch of man pages for the thread functions.
The diff to /usr/src/lib/libc_r/Makefile removes some stuff that was
inherited from /usr/src/lib/libc/Makefile that should only be done with
libc.
also:
I should have sent this diff with the pthread(3) man page.
It allows people to type
make -DWANT_LIBC_R world
to get libc_r built with the rest of the world. I put this in the
pthread(3) man page. The default is still not to build libc_r.
also:
The diff attached adds a pthread(3) man page to /usr/src/share/man/man3.
The idea is that without libc_r installed, this man page will give people
enough info to know that they have to build libc_r.
It uses a static constructor to call _thread_init() at program start-up
time. That eliminates the need for any initialization hooks in crt0.o.
Added a symbol reference in "uthread_init.c", to ensure that the new
module will always be pulled in when the archive version of the library
is used.
In "Makefile.inc", defined CPLUSPLUSLIB, so that the constructor will be
properly invoked in the shared library.
Suggested by: Christopher Provenzano, Peter Wemm, and others.