freebsd-dev

Author	SHA1	Message	Date
David Xu	b41f1452d9	Add scheduler CORE, work I did half a year ago and recently picked up again. The scheduler is forked from ULE, but the algorithm for detecting an interactive process is almost completely different from ULE's; it comes from the Linux paper "Understanding the Linux 2.6.8.1 CPU Scheduler", although I still use the same word "score" for a priority boost as in the ULE scheduler. Briefly, the scheduler has the following characteristics: 1. A timesharing process's nice value is seriously respected; the timeslice and interaction-detection algorithm are based on the nice value. 2. Per-CPU scheduling queues and load balancing. 3. O(1) scheduling. 4. Some CPU affinity code in the wakeup path. 5. Support for POSIX SCHED_FIFO and SCHED_RR. Unlike the 4BSD and ULE schedulers, which use the fuzzy RQ_PPQ, the scheduler uses 256 priority queues. Unlike ULE, which uses pull and push, the scheduler uses only the pull method; the main reason is to let a relatively idle CPU do the work. Currently, however, the whole scheduler is protected by the big sched_lock, so the benefit is not visible, and it can really be worse than nothing because all other CPUs are locked out while balancing work is done, a problem the 4BSD scheduler does not have. The scheduler does not support hyperthreading very well; in fact, it does not distinguish between physical and logical CPUs. This should be improved in the future. The scheduler has a priority inversion problem on MP machines; it is not good for realtime scheduling and can cause realtime processes to starve. As a result, it seems that MySQL super-smack runs better on my Pentium-D machine when using libthr, on either a UP or an SMP kernel.	2006-06-13 13:12:56 +00:00
Olivier Houchard	4bb0f51d1d	sched_rem() already sets ke->ke_state to KES_THREAD, so there's no need to redo it.	2006-06-01 22:45:56 +00:00
Alexander Kabaev	3f34977614	Trim trailing whitespace.	2005-12-28 17:13:31 +00:00
Nate Lawson	1335c4df32	Restore KTR_CRITICAL but conditionally compile it in as KTR_SCHED. Requested by: scottl, jhb	2005-12-18 18:10:57 +00:00
Nate Lawson	8615fd8696	Clean up unused or poorly utilized KTR values. Remove KTR_FS, KTR_KGDB, and KTR_IO as they were never used. Remove KTR_CLK since it was only used for hardclock firing and use KTR_INTR there instead. Remove KTR_CRITICAL since it was only used for crit enter/exit and use KTR_CONTENTION instead.	2005-12-17 03:57:10 +00:00
David Xu	3c424d1447	In adjustrunqueue(), add code to handle the thread-migrating case for the ULE scheduler. In the original code, the local run queue of a threaded ksegrp is corrupted if adjustrunqueue() is called while a thread is migrating.	2005-08-03 01:23:45 +00:00
Stephan Uphoff	3ea6bbc59a	Restore preemption of idle threads. Submitted by: jhb	2005-06-10 03:00:29 +00:00
Stephan Uphoff	a3f2d84279	Lots of whitespace cleanup. Fix for broken if condition. Submitted by: nate@	2005-06-09 19:43:08 +00:00
Stephan Uphoff	f3a0f87396	Fix some race conditions for pinned threads that may cause them to run on the wrong CPU. Add IPI support for preempting a thread on another CPU. MFC after: 3 weeks	2005-06-09 18:26:31 +00:00
Stephan Uphoff	d13ec71369	Use low level constructs borrowed from interrupt threads to wait for work in proc0. Remove the TDP_WAKEPROC0 workaround.	2005-05-23 23:01:53 +00:00
Stephan Uphoff	503c2ea34d	Fix a bug that caused preemption to happen for a thread in the same ksegrp with the same priority as the currently running thread. This can cause propagate_priority() to panic. Pointy hat to: ups	2005-05-19 01:08:30 +00:00
Stephan Uphoff	779186434a	Sprinkle some volatile magic and rearrange things a bit to avoid race conditions in critical_exit now that it no longer blocks interrupts. Reviewed by: jhb	2005-04-08 03:37:53 +00:00
John Baldwin	c6a37e8413	Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any effect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case. Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fix up the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch. This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example). Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more	2005-04-04 21:53:56 +00:00
Robert Watson	6220dcba84	Add a read-only kern.sched.preemption sysctl so that user space can tell if "options PREEMPTION" is compiled into the kernel.	2005-03-20 17:05:12 +00:00
Robert Watson	bc60830675	A further step on the journey of making panics and debugging more reliable: in the window between the beginning of panic() and entering the debugger, it's possible to receive interrupts. If we receive an interrupt, don't preempt if panicstr != NULL, as the system is in the process of failing, and the preempting thread is likely to stumble over the failure. The typical scenario is during the printf() in panic() prior to entering the debugger, but when running with a slower console type such as serial console. It could be that the panic string should be passed to the debugger to print, so that it can run from the debugger's environment rather than a regular kernel printf. Glanced at by: jhb	2005-03-17 15:18:01 +00:00
Warner Losh	9454b2d864	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
Jeff Roberson	85da7a569b	- Define KTR points for KTR_SCHED.	2004-12-26 00:14:21 +00:00
Jeff Roberson	7842f65e7f	- Garbage collect several unused members of struct kse and struct ksegrp. As best as I can tell, some of these were never used.	2004-12-14 10:53:55 +00:00
David Schultz	6db36923ad	Remove local definitions of RANGEOF() and use __rangeof() instead. Also remove a few bogus casts.	2004-11-20 23:00:59 +00:00
Robert Watson	f42a43fa2d	Add basic critical section tracing to KTR using event type KTR_CRITICAL. This generates a KTR event for each critical section entered and exited. It would be desirable to also log the filename and line number of the source entering or exiting the critical section, but this requires hacking up the critical section API, so I've not done that yet.	2004-11-07 23:11:32 +00:00
Scott Long	b96741f410	If a process needs to be swapped in, wake up the swapper from within critical_exit as the process is getting scheduled to run. This is suboptimal but for now avoids the LOR between the scheduler and the sleepq systems. This is a 5.3 candidate. Submitted by: davidxu MFC After: 3 days	2004-10-16 06:38:22 +00:00
Stephan Uphoff	7c71b6453a	Fix maybe_preempt_in_ksegrp for !SMP. Tested by: tegge Reviewed by: julian Approved by: sam (mentor) MFC after: 3 days	2004-10-13 22:07:04 +00:00
Poul-Henning Kamp	13e7430fde	Make !SMP kernels compile, and as far as I can tell, work again.	2004-10-12 20:57:37 +00:00
Stephan Uphoff	84f9d4b137	Prevent preemption in slot_fill. Implement preemption between threads in the same ksegrp in out-of-slot situations to prevent priority inversion. Tested by: pho Reviewed by: jhb, julian Approved by: sam (mentor) MFC: ASAP	2004-10-12 16:30:20 +00:00
Julian Elischer	042b7b1af0	Don't release the slot twice.. sched_rem() has already done it. Submitted by: stephan uphoff (ups at tree dot com) MFC after: 3 days	2004-10-10 05:19:22 +00:00
Julian Elischer	c20c691bed	When preempting a thread, put it back on the HEAD of its run queue. (Only really implemented in 4bsd) MFC after: 4 days	2004-10-05 22:03:10 +00:00
Julian Elischer	d39063f20d	Use some macros to track available scheduler slots to allow easier debugging. MFC after: 4 days	2004-10-05 21:10:44 +00:00
David Schultz	8daa8c602a	The zone from which proc structures are allocated is marked UMA_ZONE_NOFREE to guarantee type stability, so proc_fini() should never be called. Move an assertion from proc_fini() to proc_dtor() and garbage-collect the rest of the unreachable code. I have retained vm_proc_dispose(), since I consider its disuse a bug.	2004-09-19 18:34:17 +00:00
Julian Elischer	14f0e2e9bf	clean up thread runq accounting a bit. MFC after: 3 days	2004-09-16 07:12:59 +00:00
Julian Elischer	9da3e923f4	Use specific code to revert a partial add to the run queue, not remrunqueue(), which can't handle a partially added thread. MFC after: 1 week	2004-09-16 05:37:40 +00:00
Julian Elischer	e8807f22f9	Oops, accidentally removed #ifdef SCHED_4BSD as part of another commit. This function is not yet used in ULE.	2004-09-15 03:51:51 +00:00
Julian Elischer	1f9f5df61d	Commit a fix for some panics we've been seeing with preemption. MFC after: 2 days	2004-09-13 23:06:39 +00:00
Julian Elischer	b2578c6c06	Add some kasserts	2004-09-13 23:02:52 +00:00
Julian Elischer	3389af30e8	Add some code to allow threads to nominate a sibling to run if they are going to sleep. MFC after: 1 week	2004-09-10 21:04:38 +00:00
Julian Elischer	5498350529	Make debug printf less threatening and make it only print out once. MFC after: 2 days	2004-09-07 06:38:22 +00:00
Julian Elischer	6a574b2afc	Don't do IPIs on behalf of interrupt threads. Just punt straight on through to the preemption code. Make a KASSERT out of a condition that can no longer occur. MFC after: 1 week	2004-09-06 07:23:14 +00:00
Julian Elischer	ed062c8d66	Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies the scheduler that no more than N threads from that ksegrp should be allowed to be concurrently scheduled. This is also used to enforce 'fairness' at this time so that a ksegrp with 10000 threads can not swamp the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventually develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as libkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. Exit and exec code now transition a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week	2004-09-05 02:09:54 +00:00
Julian Elischer	44692526be	remove unused code MFC after: 2 days	2004-09-02 23:37:41 +00:00
Scott Long	9923b511ed	Turn PREEMPTION into a kernel option. Make sure that it's defined if FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is enabled (code inspired by the PREEMPTION warning in kern_switch.c). This is a possible MT5 candidate.	2004-09-02 18:59:15 +00:00
Julian Elischer	6804a3ab6d	Give the 4bsd scheduler the ability to wake up idle processors when there is new work to be done. MFC after: 5 days	2004-09-01 06:42:02 +00:00
Julian Elischer	2630e4c90c	Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them. MFC after: 2 days	2004-09-01 02:11:28 +00:00
Peter Wemm	6f96710c60	Backout the previous backout (with scott's ok). sched_ule.c:1.122 is believed to fix the problem with ULE that this change triggered.	2004-08-28 01:04:44 +00:00
Scott Long	2384290ced	Revert the previous change. It works great for 4BSD but causes major problems for ULE. The reason is quite unknown and worrisome.	2004-08-20 05:58:38 +00:00
Scott Long	2c86298c6c	In maybe_preempt(), ignore threads that are in an inconsistent state. This is an effective band-aid for at least some of the scheduler corruption seen recently. The real fix will involve protecting threads while they are inconsistent, and will come later. Submitted by: julian	2004-08-20 05:18:50 +00:00
Scott Long	0f4ad91810	Add a temporary debugging hack to detect a deadlock in setrunqueue(). This is here so that we can gather stats on the nature of the recent rash of hard lockups, and in this particular case panic the machine instead of letting it deadlock forever.	2004-08-10 00:26:25 +00:00
Julian Elischer	1a5cd27b4b	Make kg->kg_runnable actually count runnable threads in the ksegrp run queue instead of only doing it sometimes. This is not used outside of debugging code in the current code, but that will probably change.	2004-08-09 20:36:03 +00:00
Julian Elischer	732d95288a	Increase the amount of data exported by KTR in the KTR_RUNQ setting. This extra data is needed to really follow what is going on in the threaded case.	2004-08-09 18:21:12 +00:00
John Baldwin	44fe3c1ff0	Don't scare users with a warning about preemption being off when it isn't yet safe to have on by default.	2004-08-06 15:49:44 +00:00
Robert Watson	1a8cfbc450	Pass a thread argument into cpu_critical_{enter,exit}() rather than dereference curthread. It is called only from critical_{enter,exit}(), which already dereferences curthread. This doesn't seem to affect SMP performance in my benchmarks, but improves MySQL transaction throughput by about 1% on UP on my Xeon. Head nodding: jhb, bmilekic	2004-07-27 16:41:01 +00:00
Scott Long	18f480f8f6	Remove the previous hack since it doesn't make a difference and is getting in the way of debugging.	2004-07-23 19:59:16 +00:00

121 Commits