Commit Graph

244 Commits

Author SHA1 Message Date
jhb
6bd4022f0d Correct handling of PDROP in msleep() to just skip the mtx_lock() rather
than clear the lock pointer so that sleepq_add() still gets the correct
lock pointer and doesn't bogusly trip an assertion.
2004-03-02 14:58:33 +00:00
jhb
a738033028 Switch the sleep/wakeup and condition variable implementations to use the
sleep queue interface:
- Sleep queues attempt to merge some of the benefits of both sleep queues
  and condition variables.  Having sleep qeueus in a hash table avoids
  having to allocate a queue head for each wait channel.  Thus, struct cv
  has shrunk down to just a single char * pointer now.  However, the
  hash table does not hold threads directly, but queue heads.  This means
  that once you have located a queue in the hash bucket, you no longer have
  to walk the rest of the hash chain looking for threads.  Instead, you have
  a list of all the threads sleeping on that wait channel.
- Outside of the sleepq code and the sleep/cv code the kernel no longer
  differentiates between cv's and sleep/wakeup.  For example, calls to
  abortsleep() and cv_abort() are replaced with a call to sleepq_abort().
  Thus, the TDF_CVWAITQ flag is removed.  Also, calls to unsleep() and
  cv_waitq_remove() have been replaced with calls to sleepq_remove().
- The sched_sleep() function no longer accepts a priority argument as
  sleep's no longer inherently bump the priority.  Instead, this is soley
  a propery of msleep() which explicitly calls sched_prio() before
  blocking.
- The TDF_ONSLEEPQ flag has been dropped as it was never used.  The
  associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been
  dropped and replaced with a single explicit clearing of td_wchan.
  TD_SET_ONSLEEPQ() would really have only made sense if it had taken
  the wait channel and message as arguments anyway.  Now that that only
  happens in one place, a macro would be overkill.
2004-02-27 18:52:44 +00:00
jeff
8dd6e29330 - Revert rev 1.240 we no longer need a kthread for loadav(). 2004-02-01 05:37:36 +00:00
jeff
d1598189c6 - Use sched_load() rather than grabbing the sx lock and traversing the proc
table to discover the load.
2004-02-01 02:51:33 +00:00
jhb
40204c6793 Move the loadav() callout into its own kthread since it uses allproc_lock
which is a sleepable lock and thus is not safe to acquire from a callout
routine.
2004-01-28 20:44:41 +00:00
jeff
c133d855d2 - Use a unique string for the sched_setup SYSINIT and rename sched_setup to
synch_setup.  The schedulers use the sched_setup function name.
2004-01-25 07:49:45 +00:00
jeff
8ac75a1ec6 - Add a flags parameter to mi_switch. The value of flags may be SW_VOL or
SW_INVOL.  Assert that one of these is set in mi_switch() and propery
   adjust the rusage statistics.  This is to simplify the large number of
   users of this interface which were previously all required to adjust the
   proper counter prior to calling mi_switch().  This also facilitates more
   switch and locking optimizations.
 - Change all callers of mi_switch() to pass the appropriate paramter and
   remove direct references to the process statistics.
2004-01-25 03:54:52 +00:00
bde
c13e065ac9 Removed mostly-dead code for setting switchtime after the idle loop
clobbers this variable.  Long ago, when the idle loop wasn't in a
process, it set switchtime.tv_sec to zero to indicate that the time
needs to be read after the idle loop finishes.  The special case for
this isn't needed now that there is an idle process (for each CPU).
The time is read in the normal way when the idle process is switched
away from.  The seconds component of the time is only zero for the
first second after the uptime is set, and the mostly-dead code was only
executed during this time.  (This was slightly broken by using uptimes
instead of times relative to the Epoch -- in the original version the
seconds component of the time was only 0 for the first second after
the Epoch.)

In mi_switch(), moved the setting of switchticks to just after the
first (and now only) setting of switchtime.  This setting used to be
delayed since a late setting was needed for the idle case and an early
setting was not needed.  Now the early setting is needed so that
fork_exit() doesn't need to set either switchtime or switchticks.
Removed now-completely-rotted comment attached to this.  Most of the
code described by the comment had already moved to sched_switch().
2003-10-29 15:23:09 +00:00
jeff
c7b38e5072 - Collapse sched_switchin() and sched_switchout() into sched_switch(). Now
mi_switch() calls sched_switch() which calls cpu_switch().  This is
   actually one less function call than it had been.
2003-10-16 08:53:46 +00:00
bms
dd43324d3d Add a pre-emption counter, td_generation, so that threads can notice
when they have been pre-empted by other threads. This is bumped from
within mi_switch() every time a context switch takes place.

Discussed with:	pete
2003-10-05 09:35:08 +00:00
jeff
e16e9289c6 - On my Pentium4-M laptop, invalpg takes ~1100 cycles if the page is found in
the TLB and ~1600 if it is not.  Therefore, it is more effecient to
   invalidate the TLB after operations that use CMAP rather than before.
 - So that the tlb is invalidated prior to switching off of a processor, we
   must change the switchin functions to switchout functions.
 - Remove td_switchout from the thread and move it to the x86 pcb.
 - Move the code that calls switchout into swtch.s.  These changes make this
   optimization truely x86 specific.
2003-09-30 08:11:36 +00:00
sam
77a4adf88f Change instances of callout_init that specify MPSAFE behaviour to
use CALLOUT_MPSAFE instead of "1" for the second parameter.  This
does not change the behaviour; it just makes the intent more clear.
2003-08-19 17:51:11 +00:00
jhb
72dc80b91f - Various style fixes in both code and comments.
- Update some stale comments.
- Sort a couple of includes.
- Only set 'newcpu' in updatepri() if we use it.
- No functional changes.

Obtained from:	bde (via an old diff I got a long time ago)
2003-08-15 21:29:06 +00:00
grehan
d2678252c2 Update powerpc to use the (old thread,new thread) calling convention
for cpu_throw() and cpu_switch().
2003-08-14 03:56:24 +00:00
jhb
b198db6f26 - Convert Alpha over to the new calling conventions for cpu_throw() and
cpu_switch() where both the old and new threads are passed in as
  arguments.  Only powerpc uses the old conventions now.
- Update comments in the Alpha swtch.s to reflect KSE changes.

Tested by:	obrien, marcel
2003-08-12 19:33:36 +00:00
peter
b68c98870a unifdef -DLAZY_SWITCH and start to tidy up the associated glue. 2003-07-10 01:02:59 +00:00
davidxu
79cd14dc97 o Change kse_thr_interrupt to allow send a signal to a specified thread,
or unblock a thread in kernel, and allow UTS to specify whether syscall
  should be restarted.
o Add ability for UTS to monitor signal comes in and removed from process,
  the flag PS_SIGEVENT is used to indicate the events.
o Add a KMF_WAITSIGEVENT for KSE mailbox flag, UTS call kse_release with
  this flag set to wait for above signal event.
o For SA based thread, kernel masks all signal in its signal mask, let
  UTS to use kse_thr_interrupt interrupt a thread, and install a signal
  frame in userland for the thread.
o Add a tm_syncsig in thread mailbox, when a hardware trap occurs,
  it is used to deliver synchronous signal to userland, and upcall
  is schedule, so UTS can process the synchronous signal for the thread.

Reviewed by: julian (mentor)
2003-06-28 08:29:05 +00:00
peter
07b2700826 Tidy up leftover lazy_switch instrumentation that is no longer needed.
This cleans up some #ifdef hell.
2003-06-27 22:39:14 +00:00
davidxu
95b64acdb5 Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.
2003-06-15 00:31:24 +00:00
obrien
97ddbdc5a1 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
phk
e62620c360 Add a couple of XXX comments where the intent is not clear.
Found by:       FlexeLint
2003-05-31 20:13:58 +00:00
marcel
2c3af6b0c7 Revamp of the syscall path, exception and context handling. The
prime objectives are:
o  Implement a syscall path based on the epc inststruction (see
   sys/ia64/ia64/syscall.s).
o  Revisit the places were we need to save and restore registers
   and define those contexts in terms of the register sets (see
   sys/ia64/include/_regset.h).

Secundairy objectives:
o  Remove the requirement to use contigmalloc for kernel stacks.
o  Better handling of the high FP registers for SMP systems.
o  Switch to the new cpu_switch() and cpu_throw() semantics.
o  Add a good unwinder to reconstruct contexts for the rare
   cases we need to (see sys/contrib/ia64/libuwx)

Many files are affected by this change. Functionally it boils
down to:
o  The EPC syscall doesn't preserve registers it does not need
   to preserve and places the arguments differently on the stack.
   This affects libc and truss.
o  The address of the kernel page directory (kptdir) had to
   be unstaticized for use by the nested TLB fault handler.
   The name has been changed to ia64_kptdir to avoid conflicts.
   The renaming affects libkvm.
o  The trapframe only contains the special registers and the
   scratch registers. For syscalls using the EPC syscall path
   no scratch registers are saved. This affects all places where
   the trapframe is accessed. Most notably the unaligned access
   handler, the signal delivery code and the debugger.
o  Context switching only partly saves the special registers
   and the preserved registers. This affects cpu_switch() and
   triggered the move to the new semantics, which additionally
   affects cpu_throw().
o  The high FP registers are either in the PCB or on some
   CPU. context switching for them is done lazily. This affects
   trap().
o  The mcontext has room for all registers, but not all of them
   have to be defined in all cases. This mostly affects signal
   delivery code now. The *context syscalls are as of yet still
   unimplemented.

Many details went into the removal of the requirement to use
contigmalloc for kernel stacks. The details are mostly CPU
specific and limited to exception_save() and exception_restore().
The few places where we create, destroy or switch stacks were
mostly simplified by not having to construct physical addresses
and additionally saving the virtual addresses for later use.

Besides more efficient context saving and restoring, which of
course yields a noticable speedup, this also fixes the dreaded
SMP bootup problem as a side-effect. The details of which are
still not fully understood.

This change includes all the necessary backward compatibility
code to have it handle older userland binaries that use the
break instruction for syscalls. Support for break-based syscalls
has been pessimized in favor of a clean implementation. Due to
the overall better performance of the kernel, this will still
be notived as an improvement if it's noticed at all.

Approved by: re@ (jhb)
2003-05-16 21:26:42 +00:00
jhb
f0272107fb - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
jhb
bed2a8a99b Remove TD_ON_RUNQ() from a check to make sure Giant is not held when
calling mi_switch().  The kernel would panic on an earlier KASSERT() in
mi_switch() if TD_ON_RUNQ() was true.
2003-05-05 21:12:36 +00:00
jhb
2fdc3dd90a Garbage collect unused TDF_INMSLEEP flag. 2003-05-01 17:05:24 +00:00
peter
ed2d737246 AMD64 uses the new-style cpu_switch()/cpu_throw() calling conventions. 2003-04-30 21:45:03 +00:00
jhb
2464deac76 - Protect p_numthreads with the sched_lock.
- Protect p_singlethread with both the sched_lock and the proc lock.
- Protect p_suspcount with the proc lock.
2003-04-23 18:46:51 +00:00
peter
eb7d0d7512 Commit a partial lazy thread switch mechanism for i386. it isn't as lazy
as it could be and can do with some more cleanup.  Currently its under
options LAZY_SWITCH.  What this does is avoid %cr3 reloads for short
context switches that do not involve another user process.  ie: we can
take an interrupt, switch to a kthread and return to the user without
explicitly flushing the tlb.  However, this isn't as exciting as it could
be, the interrupt overhead is still high and too much blocks on Giant
still.  There are some debug sysctls, for stats and for an on/off switch.

The main problem with doing this has been "what if the process that you're
running on exits while we're borrowing its address space?" - in this case
we use an IPI to give it a kick when we're about to reclaim the pmap.

Its not compiled in unless you add the LAZY_SWITCH option.  I want to fix a
few more things and get some more feedback before turning it on by default.

This is NOT a replacement for Bosko's lazy interrupt stuff.  This was more
meant for the kthread case, while his was for interrupts.  Mine helps a
little for interrupts, but his helps a lot more.

The stats are enabled with options SWTCH_OPTIM_STATS - this has been a
pseudo-option for years, I just added a bunch of stuff to it.

One non-trivial change was to select a new thread before calling
cpu_switch() in the first place.  This allows us to catch the silly
case of doing a cpu_switch() to the current process.  This happens
uncomfortably often.  This simplifies a bit of the asm code in cpu_switch
(no longer have to call choosethread() in the middle).  This has been
implemented on i386 and (thanks to jake) sparc64.  The others will come
soon.  This is actually seperate to the lazy switch stuff.

Glanced at by:  jake, jhb
2003-04-02 23:53:30 +00:00
jeff
5538fc3a38 - Borrow the KSE single threading code for exec and exit. We use the check
if (p->p_numthreads > 1) and not a flag because action is only necessary
   if there are other threads.  The rest of the system has no need to
   identify thr threaded processes.
 - In kern_thread.c use thr_exit1() instead of thread_exit() if P_THREADED
   is not set.
2003-04-01 01:26:20 +00:00
davidxu
38a6446279 Adjust code for userland preemptive. Userland can set a quantum in
kse_mailbox to schedule an upcall, this is useful for userland timeout
routine, for example pthread_cond_timedwait().

Also extract upcall scheduling code from kse_reassign and create
a new function called thread_switchout to include these code.

Reviewed by: julain
2003-03-19 05:49:38 +00:00
jhb
96d7c30c59 Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls
to WITNESS_WARN().
2003-03-04 21:03:05 +00:00
harti
4be3365196 When a process has been waiting on a condition variable or mutex the
td_wmesg field in the thread structure points to the description string of
the condition variable or mutex. If the condvar or the mutex had been
initialized from a loadable module that was unloaded in the meantime,
td_wmesg may now point to invalid memory. Retrieving the process table now
may panic the kernel (or access junk). Setting the td_wmesg field to NULL
after unblocking on the condvar/mutex prevents this panic.

PR:		kern/47408
Approved by:	jake (mentor)
2003-02-27 08:43:27 +00:00
julian
0d22df7a73 Change the process flags P_KSES to be P_THREADED.
This is just a cosmetic change but I've been meaning to do it for about a year.
2003-02-27 02:05:19 +00:00
julian
f9e7584a83 Move a bunch of flags from the KSE to the thread.
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.

Submitted by:	 parts by davidxu@
Reviewed by:	jeff@ mini@
2003-02-17 09:55:10 +00:00
alfred
1269929493 Print a backtrace in case we tsleep from inside of DDB. 2003-02-14 12:44:07 +00:00
julian
14da3a9517 Add code to ddb to allow backtracing an arbitrary thread.
(show thread {address})

Remove the IDLE kse state and replace it with a change in
the way threads sahre KSEs. Every KSE now has a thread, which is
considered its "owner" however a KSE may also be lent to other
threads in the same group to allow completion of in-kernel work.
n this case the owner remains the same and the KSE will revert to the
owner when the other work has been completed.

All creations of upcalls etc. is now done from
kse_reassign() which in turn is called from mi_switch or
thread_exit(). This means that special code can be removed from
msleep() and cv_wait().

kse_release() does not leave a KSE with no thread any more but
converts the existing thread into teh KSE's owner, and sets it up
for doing an upcall. It is just inhibitted from being scheduled until
there is some reason to do an upcall.

Remove all trace of the kse_idle queue since it is no-longer needed.
"Idle" KSEs are now on the loanable queue.
2002-12-28 01:23:07 +00:00
julian
cfb81f2986 Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage
during the context switch. Rearrange thread cleanups
to avoid problems with Giant. Clean threads when freed or
when recycled.

Approved by:	re (jhb)
2002-12-10 02:33:45 +00:00
jeff
90783165ac - Move FSCALE back to kern_sync. This is not scheduler specific.
- Create a new callout for lbolt and move it out of schedcpu().  This is not
   scheduler specific either.

Approved by:	re
2002-11-21 08:57:08 +00:00
davidxu
db2bf8d23e Add an actual implementation of kse_thr_interrupt() 2002-10-30 02:28:41 +00:00
julian
07781f7f33 More work on the interaction between suspending and sleeping threads.
Also clean up some code used with 'single-threading'.

Reviewed by:	davidxu
2002-10-25 07:11:12 +00:00
jeff
451c2a5505 - Create a new scheduler api that is defined in sys/sched.h
- Begin moving scheduler specific functionality into sched_4bsd.c
 - Replace direct manipulation of scheduler data with hooks provided by the
   new api.
 - Remove KSE specific state modifications and single runq assumptions from
   kern_switch.c

Reviewed by:	-arch
2002-10-12 05:32:24 +00:00
jhb
109070fe68 - Move p_cpulimit to struct proc from struct plimit and protect it with
sched_lock.  This means that we no longer access p_limit in mi_switch()
  and the p_limit pointer can be protected by the proc lock.
- Remove PRS_ZOMBIE check from CPU limit test in mi_switch().  PRS_ZOMBIE
  processes don't call mi_switch(), and even if they did there is no longer
  the danger of p_limit being NULL (which is what the original zombie check
  was added for).
- When we bump the current processes soft CPU limit in ast(), just bump the
  private p_cpulimit instead of the shared rlimit.  This fixes an XXX for
  some value of fix.  There is still a (probably benign) bug in that this
  code doesn't check that the new soft limit exceeds the hard limit.

Inspired by:	bde (2)
2002-10-09 17:17:24 +00:00
julian
6f389985be Round out the facilty for a 'bound' thread to loan out its KSE
in specific situations. The owner thread must be blocked, and the
borrower can not proceed back to user space with the borrowed KSE.
The borrower will return the KSE on the next context switch where
teh owner wants it back. This removes a lot of possible
race conditions and deadlocks. It is consceivable that the
borrower should inherit the priority of the owner too.
that's another discussion and would be simple to do.

Also, as part of this, the "preallocatd spare thread" is attached to the
thread doing a syscall rather than the KSE. This removes the need to lock
the scheduler when we want to access it, as it's now "at hand".

DDB now shows a lot mor info for threaded proceses though it may need
some optimisation to squeeze it all back into 80 chars again.
(possible JKH project)

Upcalls are now "bound" threads, but "KSE Lending" now means that
other completing syscalls can be completed using that KSE before the upcall
finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is
one of the highest priorities in the KSE system.) The upcall when it happens
will present all the completed syscalls to the KSE for selection.
2002-10-09 02:33:36 +00:00
jmallett
4e57a787f2 XXX Add a check for p->p_limit being NULL before dereferencing it. This is
totally bogus but will hide the occurances of access of 0xbc(NULL) which
people have run into lately.  This is not a proper fix, just a bandaid, until
the cause of this happening is tracked down and fixed.

Reviewed by:	rwatson
2002-10-03 04:09:00 +00:00
jhb
127b25cac0 Rename the mutex thread and process states to use a more generic 'LOCK'
name instead.  (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of
TD_ON_MUTEX())  Eventually a turnstile abstraction will be added that
will be shared with mutexes and other types of locks.  SLOCK/TDI_LOCK will
be used internally by the turnstile code and will not be specific to
mutexes.  Making the change now ensures that turnstiles can be dropped
in at a later date without affecting the ABI of userland applications.
2002-10-02 20:31:47 +00:00
jhb
c949ba6096 - Adjust comment noting that handling of CPU limit exhaustion is done in
ast().
- Actually set KEF_ASTPENDING so ast() is called.  I think this is buggy
  for a process with multiple KSE's in that PS_XCPU is not a KSE event,
  it's a process-wide event.  IMO there really should probably be two
  ASTPENDING flags, one for per-process, and one for per-KSE.

Submitted by:	bde
2002-10-01 14:10:08 +00:00
jhb
07c6865714 - Add a new per-process flag PS_XCPU to indicate that at least one thread
has exceeded its CPU time limit.
- In mi_switch(), set PS_XCPU when the CPU time limit is exceeded.
- Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set.

Requested by:	many
2002-09-30 21:13:54 +00:00
julian
b3a916f5d5 Implement basic KSE loaning. This stops a hread that is blocked in BOUND mode
from stopping another thread from completing a syscall, and this allows it to
release its resources etc. Probably more related commits to follow (at least
one I know of)

Initial concept by: julian, dillon
Submitted by:	davidxu
2002-09-29 23:04:34 +00:00
julian
c69b2fd9b6 Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00
julian
651e8e6053 Rejig the code to figure out estcpu and work out how long a KSEGRP has been
idle. What was there before was surprisingly ALMOST correct.

Peter and I fried our brains on this for a couple of hours figuring out
what this actually means in the context of multiple threads.

Reviewed by:	peter@freebsd.org
2002-08-30 00:25:49 +00:00