2002-10-02 18:01:51 +00:00
|
|
|
.\" Copyright (c) 2002 Packet Design, LLC.
|
|
|
|
.\" All rights reserved.
|
|
|
|
.\"
|
|
|
|
.\" Subject to the following obligations and disclaimer of warranty,
|
|
|
|
.\" use and redistribution of this software, in source or object code
|
|
|
|
.\" forms, with or without modifications are expressly permitted by
|
|
|
|
.\" Packet Design; provided, however, that:
|
|
|
|
.\"
|
|
|
|
.\" (i) Any and all reproductions of the source or object code
|
|
|
|
.\" must include the copyright notice above and the following
|
|
|
|
.\" disclaimer of warranties; and
|
|
|
|
.\" (ii) No rights are granted, in any manner or form, to use
|
|
|
|
.\" Packet Design trademarks, including the mark "PACKET DESIGN"
|
|
|
|
.\" on advertising, endorsements, or otherwise except as such
|
|
|
|
.\" appears in the above copyright notice or in the software.
|
|
|
|
.\"
|
|
|
|
.\" THIS SOFTWARE IS BEING PROVIDED BY PACKET DESIGN "AS IS", AND
|
|
|
|
.\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, PACKET DESIGN MAKES NO
|
|
|
|
.\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING
|
|
|
|
.\" THIS SOFTWARE, INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED
|
|
|
|
.\" WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
|
|
|
|
.\" OR NON-INFRINGEMENT. PACKET DESIGN DOES NOT WARRANT, GUARANTEE,
|
|
|
|
.\" OR MAKE ANY REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS
|
|
|
|
.\" OF THE USE OF THIS SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY,
|
|
|
|
.\" RELIABILITY OR OTHERWISE. IN NO EVENT SHALL PACKET DESIGN BE
|
|
|
|
.\" LIABLE FOR ANY DAMAGES RESULTING FROM OR ARISING OUT OF ANY USE
|
|
|
|
.\" OF THIS SOFTWARE, INCLUDING WITHOUT LIMITATION, ANY DIRECT,
|
|
|
|
.\" INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, PUNITIVE, OR CONSEQUENTIAL
|
|
|
|
.\" DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, LOSS OF
|
|
|
|
.\" USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY THEORY OF
|
|
|
|
.\" LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
|
|
|
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
|
|
|
|
.\" THE USE OF THIS SOFTWARE, EVEN IF PACKET DESIGN IS ADVISED OF
|
|
|
|
.\" THE POSSIBILITY OF SUCH DAMAGE.
|
|
|
|
.\"
|
|
|
|
.\" $FreeBSD$
|
|
|
|
.\"
|
2005-11-24 07:33:35 +00:00
|
|
|
.Dd July 12, 2004
|
2002-10-02 18:01:51 +00:00
|
|
|
.Dt KSE 2
|
|
|
|
.Os
|
|
|
|
.Sh NAME
|
|
|
|
.Nm kse
|
|
|
|
.Nd "kernel support for user threads"
|
|
|
|
.Sh LIBRARY
|
|
|
|
.Lb libc
|
|
|
|
.Sh SYNOPSIS
|
|
|
|
.In sys/types.h
|
|
|
|
.In sys/kse.h
|
|
|
|
.Ft int
|
|
|
|
.Fn kse_create "struct kse_mailbox *mbx" "int newgroup"
|
|
|
|
.Ft int
|
|
|
|
.Fn kse_exit void
|
|
|
|
.Ft int
|
2003-02-25 09:49:46 +00:00
|
|
|
.Fn kse_release "struct timespec *timeout"
|
2002-10-02 18:01:51 +00:00
|
|
|
.Ft int
|
2005-11-24 07:33:35 +00:00
|
|
|
.Fn kse_switchin "struct kse_thr_mailbox *tmbx" "int flags"
|
2002-10-02 18:01:51 +00:00
|
|
|
.Ft int
|
2005-11-24 07:33:35 +00:00
|
|
|
.Fn kse_thr_interrupt "struct kse_thr_mailbox *tmbx" "int cmd" "long data"
|
2003-12-10 02:38:51 +00:00
|
|
|
.Ft int
|
|
|
|
.Fn kse_wakeup "struct kse_mailbox *mbx"
|
2002-10-02 18:01:51 +00:00
|
|
|
.Sh DESCRIPTION
|
2002-12-18 09:22:32 +00:00
|
|
|
These system calls implement kernel support for multi-threaded processes.
|
2002-10-02 18:01:51 +00:00
|
|
|
.\"
|
|
|
|
.Ss Overview
|
|
|
|
.\"
|
|
|
|
Traditionally, user threading has been implemented in one of two ways:
|
|
|
|
either all threads are managed in user space and the kernel is unaware
|
|
|
|
of any threading (also known as
|
|
|
|
.Dq "N to 1" ) ,
|
|
|
|
or else separate processes sharing
|
|
|
|
a common memory space are created for each thread (also known as
|
|
|
|
.Dq "N to N" ) .
|
|
|
|
These approaches have advantages and disadvantages:
|
|
|
|
.Bl -column "- Cannot utilize multiple CPUs" "+ Can utilize multiple CPUs"
|
|
|
|
.It Sy "User threading Kernel threading"
|
|
|
|
.It "+ Lightweight - Heavyweight"
|
|
|
|
.It "+ User controls scheduling - Kernel controls scheduling"
|
|
|
|
.It "- Syscalls must be wrapped + No syscall wrapping required"
|
|
|
|
.It "- Cannot utilize multiple CPUs + Can utilize multiple CPUs"
|
|
|
|
.El
|
|
|
|
.Pp
|
|
|
|
The KSE system is a
|
|
|
|
hybrid approach that achieves the advantages of both the user and kernel
|
|
|
|
threading approaches.
|
|
|
|
The underlying philosophy of the KSE system is to give kernel support
|
|
|
|
for user threading without taking away any of the user threading library's
|
|
|
|
ability to make scheduling decisions.
|
|
|
|
A kernel-to-user upcall mechanism is used to pass control to the user
|
|
|
|
threading library whenever a scheduling decision needs to be made.
|
2002-12-27 08:21:15 +00:00
|
|
|
An arbitrarily number of user threads are multiplexed onto a fixed number of
|
2002-10-02 18:01:51 +00:00
|
|
|
virtual CPUs supplied by the kernel.
|
|
|
|
This can be thought of as an
|
|
|
|
.Dq "N to M"
|
|
|
|
threading scheme.
|
|
|
|
.Pp
|
|
|
|
Some general implications of this approach include:
|
|
|
|
.Bl -bullet
|
|
|
|
.It
|
|
|
|
The user process can run multiple threads simultaneously on multi-processor
|
|
|
|
machines.
|
|
|
|
The kernel grants the process virtual CPUs to schedule as it
|
|
|
|
wishes; these may run concurrently on real CPUs.
|
|
|
|
.It
|
|
|
|
All operations that block in the kernel become asynchronous, allowing
|
|
|
|
the user process to schedule another thread when any thread blocks.
|
|
|
|
.It
|
|
|
|
Multiple thread schedulers within the same process are possible, and they
|
|
|
|
may operate independently of each other.
|
|
|
|
.El
|
|
|
|
.\"
|
|
|
|
.Ss Definitions
|
|
|
|
.\"
|
|
|
|
KSE allows a user process to have multiple
|
|
|
|
.Sy threads
|
|
|
|
of execution in existence at the same time, some of which may be blocked
|
|
|
|
in the kernel while others may be executing or blocked in user space.
|
|
|
|
A
|
|
|
|
.Sy "kernel scheduling entity"
|
|
|
|
(KSE) is a
|
|
|
|
.Dq "virtual CPU"
|
|
|
|
granted to the process for the purpose of executing threads.
|
|
|
|
A thread that is currently executing is always associated with
|
|
|
|
exactly one KSE, whether executing in user space or in the kernel.
|
|
|
|
The KSE is said to be
|
|
|
|
.Sy assigned
|
|
|
|
to the thread.
|
|
|
|
.Pp
|
|
|
|
The KSE becomes
|
|
|
|
.Sy unassigned ,
|
|
|
|
and the associated thread is suspended, when the KSE has an associated
|
2002-12-27 08:21:15 +00:00
|
|
|
.Sy mailbox ,
|
2003-01-20 11:30:08 +00:00
|
|
|
(see below) the thread has an associated
|
2002-12-27 08:21:15 +00:00
|
|
|
.Sy thread mailbox ,
|
|
|
|
(also see below) and any of the following occurs:
|
2002-10-02 18:01:51 +00:00
|
|
|
.Bl -bullet
|
|
|
|
.It
|
2002-12-27 08:21:15 +00:00
|
|
|
The thread invokes a system call that blocks.
|
2002-10-02 18:01:51 +00:00
|
|
|
.It
|
|
|
|
The thread makes any other demand of the kernel that cannot be immediately
|
|
|
|
satisfied, e.g., touches a page of memory that needs to be fetched from disk,
|
|
|
|
causing a page fault.
|
|
|
|
.It
|
|
|
|
Another thread that was previously blocked in the kernel completes its
|
|
|
|
work in the kernel (or is
|
|
|
|
.Sy interrupted )
|
2002-12-27 08:21:15 +00:00
|
|
|
and becomes ready to return to user space, and the current thread is returning
|
|
|
|
to user space.
|
2002-10-02 18:01:51 +00:00
|
|
|
.It
|
|
|
|
A signal is delivered to the process, and this KSE is chosen to deliver it.
|
|
|
|
.El
|
|
|
|
.Pp
|
|
|
|
In other words, as soon as there is a scheduling decision to be made,
|
|
|
|
the KSE becomes unassigned, because the kernel does not presume to know
|
|
|
|
how the process' other runnable threads should be scheduled.
|
|
|
|
Unassigned KSEs always return to user space as soon as possible via
|
|
|
|
the
|
|
|
|
.Sy upcall
|
|
|
|
mechanism (described below), allowing the user process to decide how
|
|
|
|
that KSE should be utilized next.
|
|
|
|
KSEs always complete as much work as possible in the kernel before
|
|
|
|
becoming unassigned.
|
|
|
|
.Pp
|
|
|
|
A
|
|
|
|
.Sy "KSE group"
|
|
|
|
is a collection of KSEs that are scheduled uniformly and which share
|
|
|
|
access to the same pool of threads, which are associated with the KSE group.
|
|
|
|
A KSE group is the smallest entity to which a kernel scheduling
|
|
|
|
priority may be assigned.
|
|
|
|
For the purposes of process scheduling and accounting, each
|
|
|
|
KSE group
|
2002-12-27 08:21:15 +00:00
|
|
|
counts similarly to a traditional unthreaded process.
|
2002-10-02 18:01:51 +00:00
|
|
|
Individual KSEs within a KSE group are effectively indistinguishable,
|
|
|
|
and any KSE in a KSE group may be assigned by the kernel to any runnable
|
2002-12-27 08:21:15 +00:00
|
|
|
(in the kernel) thread associated with that KSE group.
|
2002-10-02 18:01:51 +00:00
|
|
|
In practice, the kernel attempts to preserve the affinity between threads
|
|
|
|
and actual CPUs to optimize cache behavior, but this is invisible to the
|
2003-02-24 22:53:26 +00:00
|
|
|
user process.
|
|
|
|
(Affinity is not yet implemented.)
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
|
|
|
Each KSE has a unique
|
|
|
|
.Sy "KSE mailbox"
|
|
|
|
supplied by the user process.
|
|
|
|
A mailbox consists of a control structure containing a pointer to an
|
|
|
|
.Sy "upcall function"
|
|
|
|
and a user stack.
|
|
|
|
The KSE invokes this function whenever it becomes unassigned.
|
|
|
|
The kernel updates this structure with information about threads that have
|
|
|
|
become runnable and signals that have been delivered before each upcall.
|
|
|
|
Upcalls may be temporarily blocked by the user thread scheduling code
|
|
|
|
during critical sections.
|
|
|
|
.Pp
|
|
|
|
Each user thread has a unique
|
|
|
|
.Sy "thread mailbox"
|
|
|
|
as well.
|
|
|
|
Threads are referred to using pointers to these mailboxes when communicating
|
|
|
|
between the kernel and the user thread scheduler.
|
|
|
|
Each KSE's mailbox contains a pointer to the mailbox of the user thread
|
|
|
|
that the KSE is currently executing.
|
|
|
|
This pointer is saved when the thread blocks in the kernel.
|
|
|
|
.Pp
|
|
|
|
Whenever a thread blocked in the kernel is ready to return to user space,
|
|
|
|
it is added to the KSE group's list of
|
|
|
|
.Sy completed
|
|
|
|
threads.
|
|
|
|
This list is presented to the user code at the next upcall as a linked list
|
|
|
|
of thread mailboxes.
|
2002-10-08 22:42:42 +00:00
|
|
|
.Pp
|
|
|
|
There is a kernel-imposed limit on the number of threads in a KSE group
|
|
|
|
that may be simultaneously blocked in the kernel (this number is not
|
|
|
|
currently visible to the user).
|
|
|
|
When this limit is reached, upcalls are blocked and no work is performed
|
|
|
|
for the KSE group until one of the threads completes (or a signal is
|
|
|
|
received).
|
2002-10-02 18:01:51 +00:00
|
|
|
.\"
|
|
|
|
.Ss Managing KSEs
|
|
|
|
.\"
|
|
|
|
To become multi-threaded, a process must first invoke
|
|
|
|
.Fn kse_create .
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_create
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
creates a new KSE (except for the very first invocation; see below).
|
|
|
|
The KSE will be associated with the mailbox pointed to by
|
|
|
|
.Fa mbx .
|
|
|
|
If
|
|
|
|
.Fa newgroup
|
|
|
|
is non-zero, a new KSE group is also created containing the KSE.
|
|
|
|
Otherwise, the new KSE is added to the current KSE group.
|
|
|
|
Newly created KSEs are initially unassigned; therefore,
|
|
|
|
they will upcall immediately.
|
|
|
|
.Pp
|
|
|
|
Each process initially has a single KSE in a single KSE group executing
|
|
|
|
a single user thread.
|
|
|
|
Since the KSE does not have an associated mailbox, it must remain assigned
|
|
|
|
to the thread and does not perform any upcalls.
|
|
|
|
The result is the traditional, unthreaded mode of operation.
|
|
|
|
Therefore, as a special case, the first call to
|
|
|
|
.Fn kse_create
|
|
|
|
by this initial thread with
|
|
|
|
.Fa newgroup
|
|
|
|
equal to zero does not create a new KSE; instead, it simply associates the
|
|
|
|
current KSE with the supplied KSE mailbox, and no immediate upcall results.
|
2002-12-27 08:21:15 +00:00
|
|
|
However, an upcall will be triggered the next time the thread blocks and
|
|
|
|
the required conditions are met.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
|
|
|
The kernel does not allow more KSEs to exist in a KSE group than the
|
|
|
|
number of physical CPUs in the system (this number is available as the
|
|
|
|
.Xr sysctl 3
|
|
|
|
variable
|
|
|
|
.Va hw.ncpu ) .
|
|
|
|
Having more KSEs than CPUs would not add any value to the user process,
|
|
|
|
as the additional KSEs would just compete with each other for access to
|
|
|
|
the real CPUs.
|
|
|
|
Since the extra KSEs would always be side-lined, the result
|
|
|
|
to the application would be exactly the same as having fewer KSEs.
|
|
|
|
There may however be arbitrarily many user threads, and it is up to the
|
|
|
|
user thread scheduler to handle mapping the application's user threads
|
|
|
|
onto the available KSEs.
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_exit
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
causes the KSE assigned to the currently running thread to be destroyed.
|
|
|
|
If this KSE is the last one in the KSE group, there must be no remaining
|
|
|
|
threads associated with the KSE group blocked in the kernel.
|
2002-12-27 08:21:15 +00:00
|
|
|
This system call does not return unless there is an error.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
|
|
|
As a special case, if the last remaining KSE in the last remaining KSE group
|
2002-12-18 09:22:32 +00:00
|
|
|
invokes this system call, then the KSE is not destroyed;
|
2006-08-04 07:56:35 +00:00
|
|
|
instead, the KSE just loses the association with its mailbox and
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_exit
|
|
|
|
returns normally.
|
|
|
|
This returns the process to its original, unthreaded state.
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_release
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
is used to
|
|
|
|
.Dq park
|
|
|
|
the KSE assigned to the currently running thread when it is not needed,
|
|
|
|
e.g., when there are more available KSEs than runnable user threads.
|
2002-12-27 08:21:15 +00:00
|
|
|
The thread converts to an upcall but does not get scheduled until
|
|
|
|
there is a new reason to do so, e.g., a previously
|
2003-02-25 09:49:46 +00:00
|
|
|
blocked thread becomes runnable, or the timeout expires.
|
2002-10-02 18:01:51 +00:00
|
|
|
If successful,
|
|
|
|
.Fn kse_release
|
2003-01-20 11:28:41 +00:00
|
|
|
does not return to the caller.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2003-12-10 02:38:51 +00:00
|
|
|
.Fn kse_switchin
|
|
|
|
system call can be used by the UTS, when it has selected a new thread,
|
|
|
|
to switch to the context of that thread.
|
|
|
|
The use of
|
|
|
|
.Fn kse_switchin
|
|
|
|
is machine dependent.
|
|
|
|
Some platforms do not need a system call to switch to a new context,
|
|
|
|
while others require its use in particular cases.
|
|
|
|
.Pp
|
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_wakeup
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
is the opposite of
|
|
|
|
.Fn kse_release .
|
2002-12-27 08:21:15 +00:00
|
|
|
It causes the (parked) KSE associated with the mailbox pointed to by
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fa mbx
|
|
|
|
to be woken up, causing it to upcall.
|
2002-12-18 09:22:32 +00:00
|
|
|
If the KSE has already woken up for another reason, this system call has no
|
2002-10-02 18:01:51 +00:00
|
|
|
effect.
|
|
|
|
The
|
|
|
|
.Fa mbx
|
2002-12-19 09:40:28 +00:00
|
|
|
argument
|
2002-10-02 18:01:51 +00:00
|
|
|
may be
|
|
|
|
.Dv NULL
|
|
|
|
to specify
|
|
|
|
.Dq "any KSE in the current KSE group" .
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_thr_interrupt
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
is used to interrupt a currently blocked thread.
|
|
|
|
The thread must either be blocked in the kernel or assigned to a KSE
|
|
|
|
(i.e., executing).
|
|
|
|
The thread is then marked as interrupted.
|
|
|
|
As soon as the thread invokes an interruptible system call (or immediately
|
|
|
|
for threads already blocked in one), the thread will be made runnable again,
|
|
|
|
even though the kernel operation may not have completed.
|
|
|
|
The effect on the interrupted system call is the same as if it had been
|
|
|
|
interrupted by a signal; typically this means an error is returned with
|
|
|
|
.Va errno
|
|
|
|
set to
|
|
|
|
.Er EINTR .
|
|
|
|
.\"
|
|
|
|
.Ss Signals
|
|
|
|
.\"
|
2005-01-11 20:50:51 +00:00
|
|
|
The current implementation creates a special signal thread.
|
2004-10-08 20:40:30 +00:00
|
|
|
Kernel threads (KSEs) in a process mask all signals, and only the signal
|
|
|
|
thread waits for signals to be delivered to the process, the signal thread
|
|
|
|
is responsible
|
|
|
|
for dispatching signals to user threads.
|
|
|
|
.Pp
|
|
|
|
A downside of this is that if a multiplexed thread
|
2005-01-11 20:50:51 +00:00
|
|
|
calls the
|
|
|
|
.Fn execve
|
|
|
|
syscall, its signal mask and pending signals may not be
|
|
|
|
available in the kernel.
|
|
|
|
They are stored
|
|
|
|
in userland and the kernel does not know where to get them, however
|
|
|
|
.Tn POSIX
|
2004-10-08 20:40:30 +00:00
|
|
|
requires them to be restored and passed them to new process.
|
2005-01-11 20:50:51 +00:00
|
|
|
Just setting the mask for the thread before calling
|
|
|
|
.Fn execve
|
|
|
|
is only a
|
2004-10-08 20:40:30 +00:00
|
|
|
close approximation to the problem as it does not re-deliver back to the kernel
|
2005-01-11 20:50:51 +00:00
|
|
|
any pending signals that the old process may have blocked, and it allows a
|
|
|
|
window in which new signals may be delivered to the process between the setting
|
|
|
|
of the mask and the
|
|
|
|
.Fn execve .
|
2004-10-08 20:40:30 +00:00
|
|
|
.Pp
|
|
|
|
For now this problem has been solved by adding a special combined
|
2005-01-11 20:50:51 +00:00
|
|
|
.Fn kse_thr_interrupt Ns / Ns Fn execve
|
|
|
|
mode to the
|
2004-10-08 20:40:30 +00:00
|
|
|
.Fn kse_thr_interrupt
|
|
|
|
syscall.
|
2005-01-11 20:50:51 +00:00
|
|
|
The
|
2004-10-08 20:40:30 +00:00
|
|
|
.Fn kse_thr_interrupt
|
2005-01-11 20:50:51 +00:00
|
|
|
syscall has a sub command
|
|
|
|
.Dv KSE_INTR_EXECVE ,
|
|
|
|
that allows it to accept a
|
|
|
|
.Vt kse_execv_args
|
2004-10-08 20:40:30 +00:00
|
|
|
structure, and allowing it to adjust the signals and then atomically
|
2005-01-11 20:50:51 +00:00
|
|
|
convert into an
|
|
|
|
.Fn execve
|
|
|
|
call.
|
2004-10-08 20:40:30 +00:00
|
|
|
Additional pending signals and the correct signal mask can be passed
|
2005-01-11 20:50:51 +00:00
|
|
|
to the kernel in this way.
|
|
|
|
The thread library overrides the
|
|
|
|
.Fn execve
|
|
|
|
syscall
|
|
|
|
and translates it into
|
|
|
|
.Fn kse_intr_interrupt
|
|
|
|
call, allowing a multiplexed thread
|
|
|
|
to restore pending signals and the correct signal mask before doing the
|
|
|
|
.Fn exec .
|
2004-10-08 20:40:30 +00:00
|
|
|
This solution to the problem may change.
|
2002-10-02 18:01:51 +00:00
|
|
|
.\"
|
|
|
|
.Ss KSE Mailboxes
|
|
|
|
.\"
|
2005-01-11 20:50:51 +00:00
|
|
|
Each KSE has a unique mailbox for user-kernel communication defined in
|
|
|
|
.In sys/kse.h .
|
|
|
|
Some of the fields there are:
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
2002-11-22 23:48:38 +00:00
|
|
|
.Va km_version
|
|
|
|
describes the version of this structure and must be equal to
|
|
|
|
.Dv KSE_VER_0 .
|
2002-10-02 18:01:51 +00:00
|
|
|
.Va km_udata
|
|
|
|
is an opaque pointer ignored by the kernel.
|
|
|
|
.Pp
|
|
|
|
.Va km_func
|
|
|
|
points to the KSE's upcall function;
|
|
|
|
it will be invoked using
|
|
|
|
.Va km_stack ,
|
|
|
|
which must remain valid for the lifetime of the KSE.
|
|
|
|
.Pp
|
|
|
|
.Va km_curthread
|
|
|
|
always points to the thread that is currently assigned to this KSE if any,
|
|
|
|
or
|
|
|
|
.Dv NULL
|
|
|
|
otherwise.
|
|
|
|
This field is modified by both the kernel and the user process as follows.
|
|
|
|
.Pp
|
|
|
|
When
|
|
|
|
.Va km_curthread
|
|
|
|
is not
|
|
|
|
.Dv NULL ,
|
|
|
|
it is assumed to be pointing at the mailbox for the currently executing
|
|
|
|
thread, and the KSE may be unassigned, e.g., if the thread blocks in the
|
|
|
|
kernel.
|
|
|
|
The kernel will then save the contents of
|
|
|
|
.Va km_curthread
|
|
|
|
with the blocked thread, set
|
|
|
|
.Va km_curthread
|
|
|
|
to
|
|
|
|
.Dv NULL ,
|
|
|
|
and upcall to invoke
|
|
|
|
.Fn km_func .
|
|
|
|
.Pp
|
|
|
|
When
|
|
|
|
.Va km_curthread
|
|
|
|
is
|
|
|
|
.Dv NULL ,
|
|
|
|
the kernel will never perform any upcalls with this KSE; in other words,
|
|
|
|
the KSE remains assigned to the thread even if it blocks.
|
|
|
|
.Va km_curthread
|
|
|
|
must be
|
|
|
|
.Dv NULL
|
|
|
|
while the KSE is executing critical user thread scheduler
|
|
|
|
code that would be disrupted by an intervening upcall;
|
|
|
|
in particular, while
|
|
|
|
.Fn km_func
|
|
|
|
itself is executing.
|
|
|
|
.Pp
|
|
|
|
Before invoking
|
|
|
|
.Fn km_func
|
|
|
|
in any upcall, the kernel always sets
|
|
|
|
.Va km_curthread
|
|
|
|
to
|
|
|
|
.Dv NULL .
|
|
|
|
Once the user thread scheduler has chosen a new thread to run,
|
|
|
|
it should point
|
|
|
|
.Va km_curthread
|
|
|
|
at the thread's mailbox, re-enabling upcalls, and then resume the thread.
|
|
|
|
.Em Note :
|
|
|
|
modification of
|
|
|
|
.Va km_curthread
|
2002-12-27 08:21:15 +00:00
|
|
|
by the user thread scheduler must be atomic
|
|
|
|
with the loading of the context of the new thread, to avoid
|
|
|
|
the situation where the thread context area
|
|
|
|
may be modified by a blocking async operation, while there
|
|
|
|
is still valid information to be read out of it.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
|
|
|
.Va km_completed
|
|
|
|
points to a linked list of user threads that have completed their work
|
|
|
|
in the kernel since the last upcall.
|
|
|
|
The user thread scheduler should put these threads back into its
|
|
|
|
own runnable queue.
|
2003-01-20 11:30:08 +00:00
|
|
|
Each thread in a KSE group that completes a kernel operation
|
2002-12-27 08:21:15 +00:00
|
|
|
(synchronous or asynchronous) that results in an upcall is guaranteed to be
|
2002-10-02 18:01:51 +00:00
|
|
|
linked into exactly one KSE's
|
|
|
|
.Va km_completed
|
|
|
|
list; which KSE in the group, however, is indeterminate.
|
2002-12-27 08:21:15 +00:00
|
|
|
Furthermore, the completion will be reported in only one upcall.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
|
|
|
.Va km_sigscaught
|
|
|
|
contains the list of signals caught by this process since the previous
|
|
|
|
upcall to any KSE in the process.
|
|
|
|
As long as there exists one or more KSEs with an associated mailbox in
|
|
|
|
the user process, signals are delivered this way rather than the
|
2003-02-24 22:53:26 +00:00
|
|
|
traditional way.
|
|
|
|
(This has not been implemented and may change.)
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
2002-11-22 23:48:38 +00:00
|
|
|
.Va km_timeofday
|
|
|
|
is set by the kernel to the current system time before performing
|
|
|
|
each upcall.
|
|
|
|
.Pp
|
2002-10-02 18:01:51 +00:00
|
|
|
.Va km_flags
|
|
|
|
may contain any of the following bits OR'ed together:
|
|
|
|
.Bl -tag -width indent
|
2005-01-11 20:50:51 +00:00
|
|
|
.It Dv KMF_NOUPCALL
|
|
|
|
Block upcalls from happening.
|
|
|
|
The thread is in some critical section.
|
|
|
|
.It Dv KMF_NOCOMPLETED , KMF_DONE , KMF_BOUND
|
2005-07-31 03:30:48 +00:00
|
|
|
This thread should be considered to be permanently bound to
|
2004-10-08 20:40:30 +00:00
|
|
|
its KSE, and treated much like a non-threaded process would be.
|
2005-01-11 20:50:51 +00:00
|
|
|
It is a
|
|
|
|
.Dq "long term"
|
|
|
|
version of
|
|
|
|
.Dv KMF_NOUPCALL
|
|
|
|
in some ways.
|
|
|
|
.It Dv KMF_WAITSIGEVENT
|
2005-07-31 03:30:48 +00:00
|
|
|
Implement characteristics needed for the signal delivery thread.
|
2002-10-02 18:01:51 +00:00
|
|
|
.El
|
|
|
|
.\"
|
|
|
|
.Ss Thread Mailboxes
|
|
|
|
.\"
|
|
|
|
Each user thread must have associated with it a unique
|
2004-10-08 20:40:30 +00:00
|
|
|
.Vt "struct kse_thr_mailbox"
|
2005-01-11 20:50:51 +00:00
|
|
|
as defined in
|
|
|
|
.In sys/kse.h .
|
|
|
|
It includes the following fields.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Pp
|
|
|
|
.Va tm_udata
|
|
|
|
is an opaque pointer ignored by the kernel.
|
|
|
|
.Pp
|
|
|
|
.Va tm_context
|
|
|
|
stores the context for the thread when the thread is blocked in user space.
|
2002-12-27 08:21:15 +00:00
|
|
|
This field is also updated by the kernel before a completed thread is returned
|
2002-10-02 18:01:51 +00:00
|
|
|
to the user thread scheduler via
|
|
|
|
.Va km_completed .
|
|
|
|
.Pp
|
|
|
|
.Va tm_next
|
|
|
|
links the
|
|
|
|
.Va km_completed
|
|
|
|
threads together when returned by the kernel with an upcall.
|
|
|
|
The end of the list is marked with a
|
|
|
|
.Dv NULL
|
|
|
|
pointer.
|
|
|
|
.Pp
|
2002-11-22 23:48:38 +00:00
|
|
|
.Va tm_uticks
|
|
|
|
and
|
|
|
|
.Va tm_sticks
|
|
|
|
are time counters for user mode and kernel mode execution, respectively.
|
|
|
|
These counters count ticks of the statistics clock (see
|
2002-11-29 15:57:50 +00:00
|
|
|
.Xr clocks 7 ) .
|
2002-11-22 23:48:38 +00:00
|
|
|
While any thread is actively executing in the kernel, the corresponding
|
|
|
|
.Va tm_sticks
|
|
|
|
counter is incremented.
|
|
|
|
While any KSE is executing in user space and that KSE's
|
|
|
|
.Va km_curthread
|
|
|
|
pointer is not equal to
|
|
|
|
.Dv NULL ,
|
|
|
|
the corresponding
|
|
|
|
.Va tm_uticks
|
|
|
|
counter is incremented.
|
|
|
|
.Pp
|
2002-10-02 18:01:51 +00:00
|
|
|
.Va tm_flags
|
|
|
|
may contain any of the following bits OR'ed together:
|
|
|
|
.Bl -tag -width indent
|
2005-01-11 20:50:51 +00:00
|
|
|
.It Dv TMF_NOUPCALL
|
|
|
|
Similar to
|
|
|
|
.Dv KMF_NOUPCALL .
|
|
|
|
This flag inhibits upcalling for critical sections.
|
2004-10-08 20:40:30 +00:00
|
|
|
Some architectures require this to be in one place and some in the other.
|
2002-10-02 18:01:51 +00:00
|
|
|
.El
|
|
|
|
.Sh RETURN VALUES
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2003-01-20 11:28:41 +00:00
|
|
|
.Fn kse_create ,
|
|
|
|
.Fn kse_wakeup ,
|
2002-10-02 18:01:51 +00:00
|
|
|
and
|
|
|
|
.Fn kse_thr_interrupt
|
2002-12-18 09:22:32 +00:00
|
|
|
system calls
|
2002-10-02 18:01:51 +00:00
|
|
|
return zero if successful.
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_exit
|
|
|
|
and
|
|
|
|
.Fn kse_release
|
2002-12-18 09:22:32 +00:00
|
|
|
system calls
|
2002-10-02 18:01:51 +00:00
|
|
|
do not return if successful.
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
All of these system calls return a non-zero error code in case of an error.
|
2002-10-02 18:01:51 +00:00
|
|
|
.Sh ERRORS
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_create
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
will fail if:
|
|
|
|
.Bl -tag -width Er
|
|
|
|
.It Bq Er ENXIO
|
|
|
|
There are already as many KSEs in the KSE group as hardware processors.
|
|
|
|
.It Bq Er EAGAIN
|
|
|
|
The system-imposed limit on the total number of KSE groups under
|
|
|
|
execution would be exceeded.
|
|
|
|
The limit is given by the
|
|
|
|
.Xr sysctl 3
|
|
|
|
MIB variable
|
|
|
|
.Dv KERN_MAXPROC .
|
|
|
|
(The limit is actually ten less than this
|
|
|
|
except for the super user.)
|
|
|
|
.It Bq Er EAGAIN
|
|
|
|
The user is not the super user, and the system-imposed limit on the total
|
|
|
|
number of KSE groups under execution by a single user would be exceeded.
|
|
|
|
The limit is given by the
|
|
|
|
.Xr sysctl 3
|
|
|
|
MIB variable
|
|
|
|
.Dv KERN_MAXPROCPERUID .
|
|
|
|
.It Bq Er EAGAIN
|
|
|
|
The user is not the super user, and the soft resource limit corresponding
|
2002-12-19 09:40:28 +00:00
|
|
|
to the
|
|
|
|
.Fa resource
|
|
|
|
argument
|
2002-10-02 18:01:51 +00:00
|
|
|
.Dv RLIMIT_NPROC
|
|
|
|
would be exceeded (see
|
|
|
|
.Xr getrlimit 2 ) .
|
|
|
|
.It Bq Er EFAULT
|
2002-12-19 09:40:28 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fa mbx
|
2002-12-19 09:40:28 +00:00
|
|
|
argument
|
2002-10-02 18:01:51 +00:00
|
|
|
points to an address which is not a valid part of the process address space.
|
|
|
|
.El
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_exit
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
will fail if:
|
|
|
|
.Bl -tag -width Er
|
|
|
|
.It Bq Er EDEADLK
|
|
|
|
The current KSE is the last in its KSE group and there are still one or more
|
|
|
|
threads associated with the KSE group blocked in the kernel.
|
|
|
|
.It Bq Er ESRCH
|
|
|
|
The current KSE has no associated mailbox, i.e., the process is operating
|
|
|
|
in traditional, unthreaded mode (in this case use
|
2003-06-17 09:36:47 +00:00
|
|
|
.Xr _exit 2
|
2002-10-02 18:01:51 +00:00
|
|
|
to exit the process).
|
|
|
|
.El
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_release
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
will fail if:
|
|
|
|
.Bl -tag -width Er
|
|
|
|
.It Bq Er ESRCH
|
|
|
|
The current KSE has no associated mailbox, i.e., the process is operating is
|
|
|
|
traditional, unthreaded mode.
|
|
|
|
.El
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_wakeup
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
will fail if:
|
|
|
|
.Bl -tag -width Er
|
|
|
|
.It Bq Er ESRCH
|
2002-12-19 09:40:28 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fa mbx
|
2002-12-19 09:40:28 +00:00
|
|
|
argument
|
2002-10-02 18:01:51 +00:00
|
|
|
is not
|
|
|
|
.Dv NULL
|
|
|
|
and the mailbox pointed to by
|
|
|
|
.Fa mbx
|
|
|
|
is not associated with any KSE in the process.
|
|
|
|
.It Bq Er ESRCH
|
2002-12-19 09:40:28 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fa mbx
|
2002-12-19 09:40:28 +00:00
|
|
|
argument
|
2002-10-02 18:01:51 +00:00
|
|
|
is
|
|
|
|
.Dv NULL
|
|
|
|
and the current KSE has no associated mailbox, i.e., the process is operating
|
|
|
|
in traditional, unthreaded mode.
|
|
|
|
.El
|
|
|
|
.Pp
|
2002-12-18 09:22:32 +00:00
|
|
|
The
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fn kse_thr_interrupt
|
2002-12-18 09:22:32 +00:00
|
|
|
system call
|
2002-10-02 18:01:51 +00:00
|
|
|
will fail if:
|
|
|
|
.Bl -tag -width Er
|
|
|
|
.It Bq Er ESRCH
|
|
|
|
The thread corresponding to
|
|
|
|
.Fa tmbx
|
|
|
|
is neither currently assigned to any KSE in the process nor blocked in the
|
|
|
|
kernel.
|
|
|
|
.El
|
|
|
|
.Sh SEE ALSO
|
|
|
|
.Xr rfork 2 ,
|
|
|
|
.Xr pthread 3 ,
|
|
|
|
.Xr ucontext 3
|
|
|
|
.Rs
|
|
|
|
.%A "Thomas E. Anderson"
|
|
|
|
.%A "Brian N. Bershad"
|
|
|
|
.%A "Edward D. Lazowska"
|
|
|
|
.%A "Henry M. Levy"
|
|
|
|
.%J "ACM Transactions on Computer Systems"
|
|
|
|
.%N Issue 1
|
|
|
|
.%V Volume 10
|
|
|
|
.%D February 1992
|
|
|
|
.%I ACM Press
|
|
|
|
.%P pp. 53-79
|
|
|
|
.%T "Scheduler activations: effective kernel support for the user-level management of parallelism"
|
|
|
|
.Re
|
|
|
|
.Sh HISTORY
|
2002-12-18 09:22:32 +00:00
|
|
|
The KSE system calls first appeared in
|
2002-10-02 18:01:51 +00:00
|
|
|
.Fx 5.0 .
|
|
|
|
.Sh AUTHORS
|
|
|
|
KSE was originally implemented by
|
|
|
|
.An -nosplit
|
|
|
|
.An "Julian Elischer" Aq julian@FreeBSD.org ,
|
|
|
|
with additional contributions by
|
|
|
|
.An "Jonathan Mini" Aq mini@FreeBSD.org ,
|
|
|
|
.An "Daniel Eischen" Aq deischen@FreeBSD.org ,
|
|
|
|
and
|
|
|
|
.An "David Xu" Aq davidxu@FreeBSD.org .
|
|
|
|
.Pp
|
|
|
|
This manual page was written by
|
|
|
|
.An "Archie Cobbs" Aq archie@FreeBSD.org .
|
|
|
|
.Sh BUGS
|
|
|
|
The KSE code is
|
|
|
|
.Ud .
|