Commit Graph

245 Commits

Author SHA1 Message Date
julian
3fc9836d46 Change the process flag P_KSES to P_THREADED.
This is just a cosmetic change, but I've been meaning to do it for about a year.
2003-02-27 02:05:19 +00:00
jeff
5c29a640b8 - Add a new function, thread_signal_add(), that is called from postsig to
add a signal to a mailbox's pending set.
 - Add a new function, thread_signal_upcall(), which causes the current thread
   to upcall so that we can deliver pending signals.

Reviewed by:	mini
2003-02-17 09:58:11 +00:00
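
A minimal sketch of the idea in the commit above, assuming an illustrative
mailbox layout and the standard sigset macros from sys/signalvar.h; the real
thread_signal_add()/thread_signal_upcall() signatures and mailbox fields may
differ.

    /* Sketch only: the mailbox layout here is illustrative. */
    #include <sys/param.h>
    #include <sys/signalvar.h>

    struct thread_mailbox_sketch {
        sigset_t tm_sigpend;            /* signals waiting for an upcall */
    };

    static void
    thread_signal_add_sketch(struct thread_mailbox_sketch *tm, int sig)
    {
        SIGADDSET(tm->tm_sigpend, sig); /* mark the signal pending */
        /* thread_signal_upcall() would then force the current thread into
         * an upcall so the userland UTS can deliver the pending set. */
    }
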
julian
af55753a06 Move a bunch of flags from the KSE to the thread.
I was in two minds as to where to put them in the first place..
I should have listened to the other mind.

Submitted by:	 parts by davidxu@
Reviewed by:	jeff@ mini@
2003-02-17 09:55:10 +00:00
jeff
aa384c931f - Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back into
the proc.  These counters are only examined through calcru.

Submitted by:	davidxu
Tested on:	x86, alpha, UP/SMP
2003-02-17 02:19:58 +00:00
julian
e8efa7328e Revert the commit by davidxu, plus the fixes applied since.
I'm not convinced there is anything major wrong with the patch, but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.
2003-02-01 12:17:09 +00:00
tjr
e3277471d4 Use a local variable to store the number of ticks that elapsed in
kernel mode instead of (unintentionally) using the global `ticks'.
This error completely broke profiling.
2003-01-31 11:22:31 +00:00
davidxu
4b9b549ca2 Move the UPCALL-related data structures out of the kse and introduce a new
structure, kse_upcall, to manage upcalls. All of the KSE binding
and loaning code is gone.

A thread that owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and take those contexts back
to userland. Any thread without an upcall structure has to export its
contexts and exit at the user boundary.

Any thread running in user mode owns an upcall structure. When it enters
the kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread blocks in the kernel a new UPCALL thread is created and
the upcall structure is transferred to the new UPCALL thread. If the kse
mailbox's current thread pointer is NULL, then no UPCALL thread is created
when a thread blocks in the kernel.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit; when all upcalls in a ksegrp have been removed, the group
is automatically shut down. An upcall owner thread also exits when the
process is exiting; when an owner thread exits, the upcall it owns is
removed with it.

A KSE is a pure scheduler entity; it represents a virtual CPU. When a thread
is running, it always has a KSE associated with it. The scheduler is free to
assign a KSE to a thread according to thread priority; if a thread's priority
changes, its KSE can be moved from one thread to another.

When a ksegrp is created, N KSEs are created in the group, where N is the
number of physical CPUs in the current system. This makes it possible for
threads in the kernel to execute on different CPUs in parallel even if the
userland UTS is only single-CPU safe. Userland calls kse_create to add more
upcall structures to a ksegrp and increase its own concurrency; the kernel is
not restricted by the number of upcalls userland provides.

The code hasn't been tested under SMP by the author due to lack of hardware.

Reviewed by: julian
2003-01-26 11:41:35 +00:00
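
For orientation, a rough sketch of the per-upcall bookkeeping the commit above
describes; the field names are made up for this sketch and do not match the
real struct kse_upcall.

    /* Illustrative only; see sys/proc.h for the real structure. */
    #include <sys/queue.h>

    struct kse_upcall_sketch {
        TAILQ_ENTRY(kse_upcall_sketch) ku_link; /* on the ksegrp's list */
        struct ksegrp      *ku_ksegrp;          /* owning ksegrp */
        struct thread      *ku_owner;           /* owner thread */
        struct kse_mailbox *ku_mailbox;         /* userland mailbox */
    };
    /* kse_exit() removes an upcall; once a ksegrp's upcall list is empty,
     * the whole group is shut down. */
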
julian
dde96893c9 Add code to ddb to allow backtracing an arbitrary thread.
(show thread {address})

Remove the IDLE kse state and replace it with a change in
the way threads share KSEs. Every KSE now has a thread, which is
considered its "owner"; however, a KSE may also be lent to other
threads in the same group to allow completion of in-kernel work.
In this case the owner remains the same and the KSE will revert to the
owner when the other work has been completed.

All creation of upcalls etc. is now done from
kse_reassign(), which in turn is called from mi_switch() or
thread_exit(). This means that special code can be removed from
msleep() and cv_wait().

kse_release() no longer leaves a KSE with no thread; instead it
converts the existing thread into the KSE's owner and sets it up
for doing an upcall. It is just inhibited from being scheduled until
there is some reason to do an upcall.

Remove all trace of the kse_idle queue since it is no longer needed.
"Idle" KSEs are now on the loanable queue.
2002-12-28 01:23:07 +00:00
rwatson
551263d9d4 To reduce per-return overhead of userret(), call into
mac_thread_userret() only if PS_MACPEND is set in the process AST mask.
This avoids the cost of the entry point in the common case, but
requires policies interested in the userret event to set the flag
(protected by the scheduler lock) if they do want the event.  Since
all the policies that we're working with which use mac_thread_userret()
use the entry point only selectively to perform operations deferred
for locking reasons, this maintains the desired semantics.

Approved by:	re
Requested by:	bde
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-08 19:00:17 +00:00
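
A sketch of the cheap common-case test described above, assuming PS_MACPEND
lives in p_sflag under sched_lock; this is not the committed code.

    /* Sketch only: skip the MAC entry point unless a policy asked for it. */
    mtx_lock_spin(&sched_lock);
    if (p->p_sflag & PS_MACPEND) {
        p->p_sflag &= ~PS_MACPEND;
        mtx_unlock_spin(&sched_lock);
        mac_thread_userret(td);         /* deferred policy work */
    } else
        mtx_unlock_spin(&sched_lock);
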
julian
64467d2a2f Back out david's last commit. The suspension code needs to be called
for non-KSE processes too.
2002-10-26 04:44:17 +00:00
davidxu
9f183ef3fc Move suspension checking code from userret() into thread_userret(). 2002-10-26 02:56:51 +00:00
jeff
ef4d4e378e - Create a new scheduler API that is defined in sys/sched.h
- Begin moving scheduler-specific functionality into sched_4bsd.c
 - Replace direct manipulation of scheduler data with hooks provided by the
   new api.
 - Remove KSE specific state modifications and single runq assumptions from
   kern_switch.c

Reviewed by:	-arch
2002-10-12 05:32:24 +00:00
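
For illustration only, replacing a direct runqueue manipulation with a
scheduler hook might look like the following; sched_add() is an assumption
about the sys/sched.h API sketched here, not a quote of it.

    /* Before (sketch): kern_switch.c poked the run queue directly. */
    /* runq_add(&runq, ke); */

    /* After (sketch): go through the scheduler-specific hook instead. */
    sched_add(ke);          /* hook name assumed; see sys/sched.h */
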
jhb
7cc0ed53c2 - Move p_cpulimit to struct proc from struct plimit and protect it with
sched_lock.  This means that we no longer access p_limit in mi_switch()
  and the p_limit pointer can be protected by the proc lock.
- Remove PRS_ZOMBIE check from CPU limit test in mi_switch().  PRS_ZOMBIE
  processes don't call mi_switch(), and even if they did there is no longer
  the danger of p_limit being NULL (which is what the original zombie check
  was added for).
- When we bump the current process's soft CPU limit in ast(), just bump the
  private p_cpulimit instead of the shared rlimit.  This fixes an XXX for
  some value of fix.  There is still a (probably benign) bug in that this
  code doesn't check that the new soft limit exceeds the hard limit.

Inspired by:	bde (2)
2002-10-09 17:17:24 +00:00
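
A sketch of the ast()-side bump described in the last item, assuming
p_cpulimit counts seconds and is protected by sched_lock; the 5-second step is
illustrative, and the hard-limit clamp the commit mentions appears only as a
comment.

    /* Sketch only: raise the private soft limit, not the shared rlimit. */
    mtx_lock_spin(&sched_lock);
    p->p_cpulimit += 5;     /* illustrative 5-second bump */
    /* Note: the commit points out that the real code does not yet clamp
     * this against the hard limit (p_rlimit[RLIMIT_CPU].rlim_max). */
    mtx_unlock_spin(&sched_lock);
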
jmallett
cd6e4d8c7c Access td->td_kse inside sched_lock.
Submitted by:	julian
2002-10-02 18:25:09 +00:00
jmallett
056df6de99 De-obfuscate local use of members of 'struct thread', for which we have
local variables, and group assignment.
2002-10-02 16:39:39 +00:00
rwatson
4be0d09ad3 Add a new MAC entry point, mac_thread_userret(td), which permits policy
modules to perform MAC-related events when a thread returns to user
space.  This is required for policies that have floating process labels,
as it's not always possible to acquire the process lock at arbitrary
points in the stack during system call processing; process labels might
represent traditional authentication data, process history information,
or other data.

LOMAC will use this entry point to perform the process label update
prior to the thread returning to userspace, when plugged into the MAC
framework.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-02 02:42:38 +00:00
jmallett
7a693db242 Back out kernel support for reliable signal queues.
Requested by:	rwatson, phk, and many others
2002-10-01 17:15:53 +00:00
jhb
cb9df84644 Minor style nits in a comment. 2002-10-01 15:49:32 +00:00
jhb
014597718b Various style fixups.
Submitted by:	bde (mostly)
2002-10-01 14:16:50 +00:00
jhb
a1ef3e37b0 Actually clear PS_XCPU in ast() when we handle it.
Submitted by:	bde
Pointy hat to:	jhb
2002-10-01 14:13:13 +00:00
jhb
f350519764 - Add a new per-process flag PS_XCPU to indicate that at least one thread
has exceeded its CPU time limit.
- In mi_switch(), set PS_XCPU when the CPU time limit is exceeded.
- Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set.

Requested by:	many
2002-09-30 21:13:54 +00:00
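
In outline, the split described above looks roughly like the sketch below; the
comparison in mi_switch() and the locking shown are assumptions, and
runtime_exceeds_cpu_limit() is a hypothetical helper used only for
illustration.

    /* Sketch of the two halves; details and locking are illustrative. */

    /* mi_switch(): only note that the limit was exceeded (sched_lock held). */
    if (runtime_exceeds_cpu_limit(p))       /* hypothetical helper */
        p->p_sflag |= PS_XCPU;

    /* ast(): do the real work outside the context-switch path. */
    if (p->p_sflag & PS_XCPU) {
        mtx_lock_spin(&sched_lock);
        p->p_sflag &= ~PS_XCPU;
        mtx_unlock_spin(&sched_lock);
        PROC_LOCK(p);
        psignal(p, SIGXCPU);                /* deliver the CPU-limit signal */
        PROC_UNLOCK(p);
    }
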
jmallett
0341f71df1 First half of implementation of ksiginfo, signal queues, and such. This
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control.  There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.

After this has been in the tree for a while, I will make in-kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland.  That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.

CODAFS is unfinished in this regard because the logic is unclear in
some places.

Sponsored by:	New Gold Technology
Reviewed by:	bde, tjr, jake [an older version, logic similar]
2002-09-30 20:20:22 +00:00
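
A rough sketch of the queued-signal shape the commit above describes, using
the sys/queue.h TailQ macros; the field names are illustrative, not the
committed ksiginfo layout.

    #include <sys/queue.h>

    /* Illustrative layout only. */
    struct ksiginfo_sketch {
        TAILQ_ENTRY(ksiginfo_sketch) ksi_link;  /* pending-queue linkage */
        int   ksi_signo;                        /* signal number */
        int   ksi_code;                         /* siginfo code */
        void *ksi_addr;                         /* faulting address, if any */
    };
    TAILQ_HEAD(sigqueue_sketch, ksiginfo_sketch);
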
julian
bcb38a31ff Slightly clean up the thread_userret() and thread_consider_upcall() calls.
Also some slight changes for TDF_BOUND testing and small style changes.
Should ONLY affect KSE programs.

Submitted by:	davidxu
2002-09-23 06:14:30 +00:00
rwatson
d410071f5d Spell proprly properly:
failed to set signal flags proprly for ast()
  failed to set signal flags proprly for ast()
  failed to set signal flags proprly for ast()
  failed to set signal flags proprly for ast()
2002-08-22 14:36:03 +00:00
mini
b244d01c4d Revert removal of cred_free_thread(): It is used to ensure that a thread's
credentials are not improperly borrowed when the thread is not current in
the kernel.

Requested by:	jhb, alfred
2002-07-11 02:18:33 +00:00
julian
acb01acb4e Don't slow every syscall and trap by doing locks and stuff if the
'stop' bits are not set. This is a temporary thing.. I think this code probably
needs to be rewritten anyhow.
2002-07-10 06:40:22 +00:00
julian
aa2dc0a5d9 Part 1 of KSE-III
The ability to schedule multiple threads per process
(on one cpu) by making ALL system calls optionally asynchronous.
To come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by:	Almost everyone who counts
	(at various times, peter, jhb, matt, alfred, mini, bernd,
	and a cast of thousands)

	NOTE: this is still Beta code, and contains lots of debugging stuff.
	expect slight instability in signals..
2002-06-29 17:26:22 +00:00
mini
ef6f2f567d Remove unused diagnostic function cred_free_thread().
Approved by:	alfred
2002-06-24 06:22:00 +00:00
jhb
1a2a2fa24a We no longer need to acquire Giant in ast() for ktrpsig() in postsig() now
that ktrace no longer needs Giant.
2002-06-07 05:43:40 +00:00
julian
304195369e CURSIG() is not a macro, so rename it to cursig().
Obtained from:	KSE tree
2002-05-29 23:44:32 +00:00
bde
14ae95f735 Moved signal handling and rescheduling from userret() to ast() so that
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.

Avoid locking in userret() in most of the remaining cases.

Submitted by:	luoqi (first part only, long ago, reorganized by me)
Reminded by:	dillon
2002-04-04 17:49:48 +00:00
jake
855079d5b7 Style fixes purposefully left out of last commit. I checked the kse tree
and didn't see any changes that this conflicts with.
2002-03-29 16:45:03 +00:00
jake
8f9ce8398d Remove abuse of intr_disable/restore in MI code by moving the loop in ast()
back into the calling MD code.  The MD code must ensure no races between
checking the astpending flag and returning to usermode.

Submitted by:	peter (ia64 bits)
Tested on:	alpha (peter, jeff), i386, ia64 (peter), sparc64
2002-03-29 16:35:26 +00:00
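
In outline, the MD return path the commit describes has to behave like the
pseudo-C below; the real code is per-platform and usually in assembly, so
names such as disable_intr()/enable_intr() are only stand-ins here.

    /* Pseudo-C outline of the MD return-to-usermode loop. */
    for (;;) {
        disable_intr();                 /* close the race window */
        if ((p->p_sflag & (PS_ASTPENDING | PS_NEEDRESCHED)) == 0)
            break;                      /* return to usermode; interrupts
                                         * come back on at the iret */
        enable_intr();
        ast(framep);                    /* handle it, then re-check */
    }
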
imp
969e82886e Remove last two abuses of cpu_critical_{enter,exit} in the MI code.
Reviewed by: jake, jhb, rwatson
2002-03-21 06:11:09 +00:00
jhb
2e425ee2fc Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined.
Instead of caching the ucred reference, just go ahead and eat the
decrement and increment of the refcount.  Now that Giant is pushed down
into crfree(), we no longer have to get Giant in the common case.  In the
case when we are actually free'ing the ucred, we would normally free it on
the next kernel entry, so the cost there is not new, just in a different
place.  This also removes td_cache_ucred from struct thread.  This is
still only done #ifdef DIAGNOSTIC.

[ missed this file in the previous commit ]

Tested on:	i386, alpha
2002-03-20 21:12:04 +00:00
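
Under DIAGNOSTIC, the bookkeeping described above amounts to something like
the sketch below; crhold()/crfree() are the real refcount helpers, but the
surrounding code is simplified and not the committed version.

    #ifdef DIAGNOSTIC
        /* Returning to userland: drop the reference so td_ucred is
         * NULL while the thread runs in user mode. */
        crfree(td->td_ucred);
        td->td_ucred = NULL;
    #endif

    #ifdef DIAGNOSTIC
        /* Next kernel entry: take a fresh reference from the proc. */
        PROC_LOCK(p);
        crhold(p->p_ucred);
        td->td_ucred = p->p_ucred;
        PROC_UNLOCK(p);
    #endif
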
jake
b27e79327d Make this compile.
Pointy hat to:	julian
2002-02-23 01:42:13 +00:00
julian
53eb1d9219 Add some DIAGNOSTIC code.
While in userland, keep the thread's ucred reference in a shadow
field so that the usual place to store it is NULL.
If DIAGNOSTIC is not set, the thread ucred is kept valid until the next
kernel entry, at which time it is checked against the process cred
and possibly corrected. Produces a BIG speedup in
kernels with INVARIANTS set. (A previous commit corrected it
for the non INVARIANTS case already)

Reviewed by:	dillon@freebsd.org
2002-02-22 23:58:22 +00:00
julian
abe785e035 If the credential on an incoming thread is correct, don't bother
reacquiring it. In the same vein, don't bother dropping the thread cred
when going to userland. We are guaranteed to need it when we come back
(which we are guaranteed to do).

Reviewed by:	jhb@freebsd.org, bde@freebsd.org (slightly different version)
2002-02-17 01:09:56 +00:00
julian
37369620df In a threaded world, different priorities become properties of
different entities.  Make it so.

Reviewed by:	jhb@freebsd.org (john baldwin)
2002-02-11 20:37:54 +00:00
bde
73ef84f92b Changed the type of pcb_flags from u_char to u_int and adjusted things.
This removes the only atomic operation on a char type in the entire
kernel.
2002-01-17 17:49:23 +00:00
jhb
1ce407b675 Change the preemption code for software interrupt thread schedules and
mutex releases to not require flags for the cases when preemption is
not allowed:

The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex release and swi schedule,
respectively when that switch is not safe.  Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer.  This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs.  Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called.  (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)

I've tested the changes to the interrupt code on i386 and alpha.  I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine.  PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken.  Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.

Reviewed by:	peter
Tested on:	i386, alpha
2002-01-05 08:47:13 +00:00
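
The wrapping described in the commit, for a clock or fast interrupt handler
that schedules a software interrupt, is roughly this; the cookie name is
illustrative.

    /* Sketch: keep the handler from switching at the swi_sched() point. */
    critical_enter();
    swi_sched(softclock_cookie, 0);     /* cookie name is illustrative */
    critical_exit();
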
jhb
82f83a1cbe Axe a stale comment. Holding sched_lock across both setrunqueue() and
mi_switch() is sufficient.
2002-01-04 10:55:51 +00:00
jhb
3b3c195480 - Change all callers of addupc_task() to check PS_PROFIL explicitly and
remove the check from addupc_task().  It would need sched_lock while
  testing the flag anyways.
- Always read sticks while holding sched_lock using a temporary variable
  where needed.
- Always init prticks to 0 in ast() to quiet a warning.
2001-12-18 09:06:10 +00:00
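
The caller-side pattern from the first item would look roughly as follows; the
addupc_task() argument list here is an assumption, not the real prototype.

    /* Sketch: callers now check PS_PROFIL themselves.
     * pc is the user program counter; prticks the saved tick count. */
    if (p->p_sflag & PS_PROFIL)
        addupc_task(td, pc, prticks);   /* argument list assumed */
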
jhb
a3b98398cb Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
  prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
  count and a per-thread critical section saved state set when entering
  a critical section while at nesting level 0 and restored when exiting
  to nesting level 0.  This moves the saved state out of spin mutexes so
  that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
  cpu_critical_enter/exit.  MI code such as device drivers and spin
  mutexes use the MI wrappers.  Note that since the MI wrappers store
  the state in the current thread, they do not have any return values or
  arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
  assigned to curthread->td_savecrit during fork_exit().

Tested on:	i386, alpha
2001-12-18 00:27:18 +00:00
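
In outline, the MI wrappers described above behave like this sketch; the
structure follows the commit's description (per-thread nesting count plus
saved MD state), not the literal committed code.

    /* Sketch of the MI wrappers around the MD cpu_critical_enter/exit. */
    void
    critical_enter(void)
    {
        struct thread *td = curthread;

        if (td->td_critnest == 0)
            td->td_savecrit = cpu_critical_enter(); /* save MD state */
        td->td_critnest++;
    }

    void
    critical_exit(void)
    {
        struct thread *td = curthread;

        if (td->td_critnest == 1)
            cpu_critical_exit(td->td_savecrit);     /* restore MD state */
        td->td_critnest--;
    }
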
jhb
46e3f92a5d Add a per-thread ucred reference for syscalls and synchronous traps from
userland.  The per thread ucred reference is immutable and thus needs no
locks to be read.  However, until all the proc locking associated with
writes to p_ucred are completed, it is still not safe to use the per-thread
reference.

Tested on:	x86 (SMP), alpha, sparc64
2001-10-26 08:12:54 +00:00
jhb
78494eacb6 Remove a bogus comment. "atomic" doesn't mean that the operation is done
as a physical atomic operation.  That would require the code to use the
atomic API, which it does not.  Instead, the operation is made pseudo
atomic (hence the quotes) by use of the lock to protect clearing all of the
flags in question.
2001-09-21 19:26:57 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doozie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
dillon
08e732a88b Remove the MPSAFE keyword from the parser for syscalls.master.
Instead introduce the [M] prefix to existing keywords.  e.g.
MSTD is the MP SAFE version of STD.  This is preparatory for a
massive Giant lock pushdown.  The old MPSAFE keyword made
syscalls.master too messy.

Begin comments MP-Safe procedures with the comment:
/*
 * MPSAFE
 */
This comment means that the procedure may be called without
Giant held (The procedure itself may still need to obtain
Giant temporarily to do its thing).

sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE
sv_transtrap() is now MP SAFE and assumed to be MP SAFE

ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown)
trapsignal() is now MP SAFE (Giant Pushdown)

Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant)
test in syscall[2]() in */*/trap.c now do not.  Instead they
explicitly unlock Giant if they previously obtained it, and then
assert that it is no longer held to catch broken system calls.

Rebuild syscall tables.
2001-08-30 18:50:57 +00:00
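
As an example of the new prefix, an entry in syscalls.master that used to be
STD and is now known to be Giant-free would simply become MSTD; the line below
is illustrative, not a quote of the file.

    ; Illustrative syscalls.master entry only.
    36      MSTD    BSD     { int sync(void); }
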
jhb
4a89454dcd - Close races with signals and other AST's being triggered while we are in
the process of exiting the kernel.  The ast() function now loops as long
  as the PS_ASTPENDING or PS_NEEDRESCHED flags are set.  It returns with
  preemption disabled so that any further AST's that arrive via an
  interrupt will be delayed until the low-level MD code returns to user
  mode.
- Use u_int's to store the tick counts for profiling purposes so that we
  do not need sched_lock just to read p_sticks.  This also closes a
  problem where the call to addupc_task() could screw up the arithmetic
  due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
  clear_resched(), and resched_wanted() in favor of direct bit operations
  on p_sflag.
- Fix up locking with sched_lock some.  In addupc_intr(), use sched_lock
  to ensure pr_addr and pr_ticks are updated atomically with setting
  PS_OWEUPC.  In ast() we clear pr_ticks atomically with clearing
  PS_OWEUPC.  We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.

Reviewed by:	bde (mostly)
2001-08-10 22:53:32 +00:00
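
In outline, the ast() loop from the first item is roughly the sketch below,
with locking and the actual work elided.

    /* Sketch of the ast() retry loop; details and locking elided. */
    do {
        sflag = p->p_sflag;
        p->p_sflag &= ~(PS_ASTPENDING | PS_NEEDRESCHED);
        /* ...deliver profiling hits, check signals, reschedule... */
    } while (p->p_sflag & (PS_ASTPENDING | PS_NEEDRESCHED));
    /* Return with further ASTs held off until the MD code reaches usermode. */
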
dillon
52f62a303c postsig() currently requires Giant to be held. Giant is held properly at
the first postsig() call, but not always held at the second place,
resulting in an occasional panic.
2001-07-04 15:36:30 +00:00