116 Commits

Author SHA1 Message Date
davidxu
69017b27e9 In adjustrunqueue(), add code to handle the thread-migrating case for
the ULE scheduler. In the original code, the local run queue of a threaded
ksegrp is corrupted if adjustrunqueue() is called while a thread is migrating.
2005-08-03 01:23:45 +00:00
ups
5273b0bf9f Restore preemption of idle threads.
Submitted by:	jhb
2005-06-10 03:00:29 +00:00
ups
4421a08742 Lots of whitespace cleanup.
Fix for broken if condition.

Submitted by:	nate@
2005-06-09 19:43:08 +00:00
ups
d9753fcc91 Fix some race conditions for pinned threads that may cause them to run
on the wrong CPU.

Add IPI support for preempting a thread on another CPU.

MFC after:	3 weeks
2005-06-09 18:26:31 +00:00
ups
acfce18a2a Use low level constructs borrowed from interrupt threads to wait for
work in proc0.
Remove the TDP_WAKEPROC0 workaround.
2005-05-23 23:01:53 +00:00
ups
c8d93020ce Fix a bug that caused preemption to happen for a thread in the same
ksegrp with the same priority as the currently running thread.
This can cause propagate_priority() to panic.

Pointy hat to: ups
2005-05-19 01:08:30 +00:00
ups
7bac02c146 Sprinkle some volatile magic and rearrange things a bit to avoid race
conditions in critical_exit now that it no longer blocks interrupts.

Reviewed by:	jhb
2005-04-08 03:37:53 +00:00
jhb
41cadaa11e Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any effect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
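
The split described in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: the struct, its field names, and the simulated interrupt flag are all hypothetical. It shows a bare critical section as a plain counter update, while a "spinlock section" additionally blocks (simulated) interrupts on the outermost entry and restores them on the outermost exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-thread state; field names are illustrative, not the kernel's. */
struct toy_thread {
    int  critnest;        /* critical section nesting depth */
    int  spinlock_count;  /* spinlock section nesting depth */
    bool saved_intr;      /* interrupt state saved by the outermost spinlock entry */
};

/* Simulated CPU interrupt-enable flag (an assumption of this model). */
static bool toy_intr_enabled = true;

/* A bare critical section is only a counter increment. */
static void toy_critical_enter(struct toy_thread *td)
{
    td->critnest++;
}

static void toy_critical_exit(struct toy_thread *td)
{
    assert(td->critnest > 0);
    td->critnest--;
}

/* A spinlock section blocks interrupts once, then nests like a critical section. */
static void toy_spinlock_enter(struct toy_thread *td)
{
    if (td->spinlock_count == 0) {
        td->saved_intr = toy_intr_enabled;
        toy_intr_enabled = false;
    }
    td->spinlock_count++;
    toy_critical_enter(td);
}

/* Interrupts are restored only when the outermost spinlock section ends. */
static void toy_spinlock_exit(struct toy_thread *td)
{
    assert(td->spinlock_count > 0);
    toy_critical_exit(td);
    td->spinlock_count--;
    if (td->spinlock_count == 0)
        toy_intr_enabled = td->saved_intr;
}
```

In this model a standalone toy_critical_enter() never touches the interrupt flag, which is the property the message calls "very cheap".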
rwatson
560261414f Add a read-only kern.sched.preemption sysctl so that user space can tell
if "options PREEMPTION" is compiled into the kernel.
2005-03-20 17:05:12 +00:00
rwatson
04058cab9f A further step on the journey of making panics and debugging more reliable:
in the window between the beginning of panic() and entering the debugger,
it's possible to receive interrupts.  If we receive an interrupt, don't
preempt if panicstr != NULL, as the system is in the process of failing, and
the preempting thread is likely to stumble over the failure.  The typical
scenario is during the printf() in panic() prior to entering the debugger,
but when running with a slower console type such as serial console.

It could be that the panic string should be passed to the debugger to print,
so that it can run from the debugger's environment rather than a regular
kernel printf.

Glanced at by:	jhb
2005-03-17 15:18:01 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary
2005-01-06 23:35:40 +00:00
jeff
c2b9649e7a - Define KTR points for KTR_SCHED.
2004-12-26 00:14:21 +00:00
jeff
c94fadce10 - Garbage collect several unused members of struct kse and struct ksegrp.
As best as I can tell, some of these were never used.
2004-12-14 10:53:55 +00:00
das
6175c08488 Remove local definitions of RANGEOF() and use __rangeof() instead.
Also remove a few bogus casts.
2004-11-20 23:00:59 +00:00
rwatson
ab85f61559 Add basic critical section tracing to KTR using event type KTR_CRITICAL.
This generates a KTR event for each critical section entered and exited.

It would be desirable to also log the filename and line number of the
source entering or exiting the critical section, but this requires
hacking up the critical section API, so I've not done that yet.
2004-11-07 23:11:32 +00:00
scottl
e049505e4b If a process needs to be swapped in, wakeup the swapper from within
critical_exit as the process is getting scheduled to run.  This is suboptimal
but for now avoids the LOR between the scheduler and the sleepq systems.
This is a 5.3 candidate.

Submitted by: davidxu
MFC After: 3 days
2004-10-16 06:38:22 +00:00
ups
02ee911318 Fix maybe_preempt_in_ksegrp for !SMP.
Tested   by: tegge
Reviewed by: julian
Approved by: sam (mentor)
MFC after: 3 days
2004-10-13 22:07:04 +00:00
phk
e948dce998 Make !SMP kernels compile, and as far as I can tell, work again.
2004-10-12 20:57:37 +00:00
ups
5d0d8550e7 Prevent preemption in slot_fill.
Implement preemption between threads in the same ksegrp in out-of-slot
situations to prevent priority inversion.

Tested   by: pho
Reviewed by: jhb, julian
Approved by: sam (mentor)
MFC: ASAP
2004-10-12 16:30:20 +00:00
julian
30d2ba06b9 Don't release the slot twice.. sched_rem() has already done it.
Submitted by:	stephan uphoff (ups at tree dot com)
MFC after:	3 days
2004-10-10 05:19:22 +00:00
julian
57fb03da54 When preempting a thread, put it back on the HEAD of its run queue.
(Only really implemented in 4bsd)

MFC after:	4 days
2004-10-05 22:03:10 +00:00
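
The head-versus-tail distinction in this commit can be sketched with a minimal FIFO run queue. The types and function names below are hypothetical and do not mirror the kernel's runq API; the sketch only shows that a preempted thread re-inserted at the head is the next one chosen, while an ordinary wakeup goes to the tail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal thread and run queue types for illustration only. */
struct toy_rq_thread {
    int id;
    struct toy_rq_thread *next;
};

struct toy_runq {
    struct toy_rq_thread *head;
    struct toy_rq_thread *tail;
};

/*
 * Add a thread to the queue. A preempted thread goes back on the head so it
 * resumes before threads that were already waiting; others go on the tail.
 */
static void toy_runq_add(struct toy_runq *rq, struct toy_rq_thread *td,
    int preempted)
{
    td->next = NULL;
    if (rq->head == NULL) {
        rq->head = rq->tail = td;
    } else if (preempted) {
        td->next = rq->head;
        rq->head = td;
    } else {
        rq->tail->next = td;
        rq->tail = td;
    }
}

/* Remove and return the thread at the head, or NULL if the queue is empty. */
static struct toy_rq_thread *toy_runq_choose(struct toy_runq *rq)
{
    struct toy_rq_thread *td = rq->head;

    if (td != NULL) {
        rq->head = td->next;
        if (rq->head == NULL)
            rq->tail = NULL;
        td->next = NULL;
    }
    return td;
}
```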
julian
7b170fd9fa Use some macros to track available scheduler slots to allow
easier debugging.

MFC after:	4 days
2004-10-05 21:10:44 +00:00
das
8b64b8f028 The zone from which proc structures are allocated is marked
UMA_ZONE_NOFREE to guarantee type stability, so proc_fini() should
never be called.  Move an assertion from proc_fini() to proc_dtor()
and garbage-collect the rest of the unreachable code.  I have retained
vm_proc_dispose(), since I consider its disuse a bug.
2004-09-19 18:34:17 +00:00
julian
6461286b21 clean up thread runq accounting a bit.
MFC after:	3 days
2004-09-16 07:12:59 +00:00
julian
b4933d4405 Use specific code to revert a partial add to the run queue, not
remrunqueue() which can't handle a partially added thread.

MFC after:	1 week
2004-09-16 05:37:40 +00:00
julian
d7dd18c6b5 Oops accidentally removed #ifdef SCHED_4BSD
as part of another commit
This function is not yet used in ULE
2004-09-15 03:51:51 +00:00
julian
2e10eab995 Commit a fix for some panics we've been seeing with preemption.
MFC after:	2 days
2004-09-13 23:06:39 +00:00
julian
0b88c839d5 Add some kasserts
2004-09-13 23:02:52 +00:00
julian
9993c65718 Add some code to allow threads to nominate a sibling to run if they are going to sleep.
MFC after:	1 week
2004-09-10 21:04:38 +00:00
julian
35060cd448 Make debug printf less threatening and make it only print out once.
MFC after:	2 days
2004-09-07 06:38:22 +00:00
julian
91180c0a8c Don't do IPIs on behalf of interrupt threads.
just punt straight on through to the preemption code.

Make a KASSERT out of a condition that can no longer occur.
MFC after:	1 week
2004-09-06 07:23:14 +00:00
julian
5813d27029 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies the scheduler that no more than N threads from that ksegrp
should be allowed to be concurrently scheduled. This is also
used to enforce 'fairness' at this time so that a ksegrp with
10000 threads can not swamp the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventually develop
its own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
libkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
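
The "simple pair of counters" and the sched_set_concurrency(kg, N) limit described above could look roughly like the following. This is a sketch under assumptions: the struct, field names, and helper functions are hypothetical and are not taken from the kernel source. One counter holds the allowed concurrency and the other the number of free slots; a thread may go onto the system run queue only while a slot is free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-group scheduler state: the pair of counters. */
struct toy_ksegrp {
    int concurrency;  /* maximum threads allowed on the system run queue */
    int avail_slots;  /* free slots; may be negative right after a shrink */
};

/* Change the limit, adjusting the free-slot count by the same delta. */
static void toy_set_concurrency(struct toy_ksegrp *kg, int n)
{
    kg->avail_slots += n - kg->concurrency;
    kg->concurrency = n;
}

/*
 * Try to take a slot. Returns false when the group is at its limit, in which
 * case the thread would have to wait on the group's own queue.
 */
static bool toy_slot_use(struct toy_ksegrp *kg)
{
    if (kg->avail_slots <= 0)
        return false;
    kg->avail_slots--;
    return true;
}

/* Give a slot back when a thread leaves the system run queue or stops running. */
static void toy_slot_release(struct toy_ksegrp *kg)
{
    kg->avail_slots++;
    assert(kg->avail_slots <= kg->concurrency);
}
```

With the limit capped at the number of CPUs, a group with thousands of runnable threads can occupy at most that many slots, which is the fairness property the message describes.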
julian
46d0945926 remove unused code
MFC after:	 2 days
2004-09-02 23:37:41 +00:00
scottl
d9af98161a Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
julian
8354ba9e3a Give the 4bsd scheduler the ability to wake up idle processors
when there is new work to be done.

MFC after:	5 days
2004-09-01 06:42:02 +00:00
julian
e9d9514975 Give setrunqueue() and sched_add() more of a clue as to
where they are coming from and what is expected from them.

MFC after:	2 days
2004-09-01 02:11:28 +00:00
peter
9e60f4336e Backout the previous backout (with scott's ok). sched_ule.c:1.122 is
believed to fix the problem with ULE that this change triggered.
2004-08-28 01:04:44 +00:00
scottl
30583f7adf Revert the previous change. It works great for 4BSD but causes major
problems for ULE.  The reason is quite unknown and worrisome.
2004-08-20 05:58:38 +00:00
scottl
b336a56514 In maybe_preempt(), ignore threads that are in an inconsistent state. This
is an effective band-aid for at least some of the scheduler corruption seen
recently.  The real fix will involve protecting threads while they are
inconsistent, and will come later.

Submitted by: julian
2004-08-20 05:18:50 +00:00
scottl
ab3ce7c4d9 Add a temporary debugging hack to detect a deadlock in setrunqueue(). This
is here so that we can gather stats on the nature of the recent rash of
hard lockups, and in this particular case panic the machine instead of
letting it deadlock forever.
2004-08-10 00:26:25 +00:00
julian
38d3d854fe Make kg->kg_runnable actually count runnable threads in the ksegrp run queue
instead of only doing it sometimes.. This is not used outside of debugging code
in the current code, but that will probably change.
2004-08-09 20:36:03 +00:00
julian
61fada7840 Increase the amount of data exported by KTR in the KTR_RUNQ setting.
This extra data is needed to really follow what is going on in the
threaded case.
2004-08-09 18:21:12 +00:00
jhb
d3254af40d Don't scare users with a warning about preemption being off when it isn't
yet safe to have on by default.
2004-08-06 15:49:44 +00:00
rwatson
4ab080249a Pass a thread argument into cpu_critical_{enter,exit}() rather than
dereference curthread.  It is called only from critical_{enter,exit}(),
which already dereferences curthread.  This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.

Head nodding:	jhb, bmilekic
2004-07-27 16:41:01 +00:00
scottl
36b2b29e6c Remove the previous hack since it doesn't make a difference and is getting
in the way of debugging.
2004-07-23 19:59:16 +00:00
scottl
9c40ab7a35 Disable the PREEMPTION-enabled code in critical_exit() that encourages
switching to a different thread.  This is just a hack to try to improve
stability some more, but likely points closer to the real culprit.
2004-07-22 14:32:48 +00:00
jhb
0cb3276d57 - Move TDF_OWEPREEMPT, TDF_OWEUPC, and TDF_USTATCLOCK over to td_pflags
  since they are only accessed by curthread and thus do not need any
  locking.
- Move pr_addr and pr_ticks out of struct uprof (which is per-process)
  and directly into struct thread as td_profil_addr and td_profil_ticks
  as these variables are really per-thread.  (They are used to defer an
  addupc_intr() that was too "hard" until ast()).
2004-07-16 21:04:55 +00:00
marcel
a9ad69d5af Update for the KDB framework:
o  Make debugging code conditional upon KDB instead of DDB.
o  Call kdb_enter() instead of Debugger().
o  Call kdb_backtrace() instead of db_print_backtrace() or backtrace().

kern_mutex.c:
o  Replace checks for db_active with checks for kdb_active and make
   them unconditional.

kern_shutdown.c:
o  s/DDB_UNATTENDED/KDB_UNATTENDED/g
o  s/DDB_TRACE/KDB_TRACE/g
o  Save the TID of the thread doing the kernel dump so the debugger
   knows which thread to select as the current when debugging the
   kernel core file.
o  Clear kdb_active instead of db_active and do so unconditionally.
o  Remove backtrace() implementation.

kern_synch.c:
o  Call kdb_reenter() instead of db_error().
2004-07-10 21:36:01 +00:00
marcel
82affa1f89 Unbreak build for the !PREEMPTION case: don't define variables
that aren't used in that case.
2004-07-03 00:57:43 +00:00
jhb
696704716d Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switched to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
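
The three outcomes of maybe_preempt() described in the first bullet of this commit can be modelled in user space as follows. This is a simplified sketch, not the committed code: all names are hypothetical, the run queue is reduced to a length counter, a context switch is reduced to updating a pointer and a counter, and it assumes that a numerically lower priority value is more urgent. The safety checks mentioned in the message ("if it is not safe to preempt") are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical thread model for illustration. */
struct sk_thread {
    int  priority;     /* assumption: lower value means more urgent */
    int  critnest;     /* critical section nesting depth */
    bool owe_preempt;  /* models the TDF_OWEPREEMPT flag */
};

static struct sk_thread *sk_curthread;  /* thread currently "running" */
static int sk_runq_len;                 /* threads waiting on the run queue */
static int sk_switches;                 /* context switches performed */

/* Returns true if we switched to td directly, so it must not be enqueued. */
static bool sk_maybe_preempt(struct sk_thread *td)
{
    struct sk_thread *cur = sk_curthread;

    if (td->priority >= cur->priority)
        return false;              /* not urgent enough: just enqueue */
    if (cur->critnest > 0) {
        cur->owe_preempt = true;   /* defer until the critical section ends */
        return false;
    }
    sk_curthread = td;             /* stands in for an immediate mi_switch() */
    sk_switches++;
    return true;
}

/* Enqueue td unless it was switched to directly. */
static void sk_sched_add(struct sk_thread *td)
{
    if (sk_maybe_preempt(td))
        return;
    sk_runq_len++;
}

/* Leaving the outermost critical section performs the deferred preemption. */
static void sk_critical_exit(struct sk_thread *td)
{
    assert(td->critnest > 0);
    td->critnest--;
    if (td->critnest == 0 && td->owe_preempt) {
        td->owe_preempt = false;
        sk_switches++;             /* stands in for the deferred mi_switch() */
    }
}
```

The model keeps the key property from the message: a thread that preempts directly never touches the run queue, while a deferred preemption leaves the new thread queued until the critical section is exited.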