freebsd-dev

Author	SHA1	Message	Date
David Xu	3db720fdce	Add user priority loaning code to support priority propagation for 1:1 threading's POSIX priority mutexes, the code is no-op unless priority-aware umtx code is committed.	2006-08-25 06:12:53 +00:00
Maxim Konovalov	1f36c876a1	o Fix grammar in the comment, indent macros. No functional changes.	2006-07-02 20:53:52 +00:00
Maxim Konovalov	75d960eb2e	o Remove rev. 1.57 leftover, not reached code.	2006-07-02 20:49:46 +00:00
David E. O'Brien	2e4db89cfc	Fix building with GCC 4.2: define data types before referring to them.	2006-06-29 19:37:31 +00:00
David Xu	36ec198bd5	Add scheduler API sched_relinquish(), the API is used to implement yield() and sched_yield() syscalls. Every scheduler has its own way to relinquish cpu, the ULE and CORE schedulers have two internal run- queues, a timesharing thread which calls yield() syscall should be moved to inactive queue.	2006-06-15 06:37:39 +00:00
David Xu	b41f1452d9	Add scheduler CORE, the work I have done half a year ago, recent, I picked it up again. The scheduler is forked from ULE, but the algorithm to detect an interactive process is almost completely different with ULE, it comes from Linux paper "Understanding the Linux 2.6.8.1 CPU Scheduler", although I still use same word "score" as a priority boost in ULE scheduler. Briefly, the scheduler has following characteristic: 1. Timesharing process's nice value is seriously respected, timeslice and interaction detecting algorithm are based on nice value. 2. per-cpu scheduling queue and load balancing. 3. O(1) scheduling. 4. Some cpu affinity code in wakeup path. 5. Support POSIX SCHED_FIFO and SCHED_RR. Unlike scheduler 4BSD and ULE which using fuzzy RQ_PPQ, the scheduler uses 256 priority queues. Unlike ULE which using pull and push, the scheduelr uses pull method, the main reason is to let relative idle cpu do the work, but current the whole scheduler is protected by the big sched_lock, so the benefit is not visible, it really can be worse than nothing because all other cpu are locked out when we are doing balancing work, which the 4BSD scheduelr does not have this problem. The scheduler does not support hyperthreading very well, in fact, the scheduler does not make the difference between physical CPU and logical CPU, this should be improved in feature. The scheduler has priority inversion problem on MP machine, it is not good for realtime scheduling, it can cause realtime process starving. As a result, it seems the MySQL super-smack runs better on my Pentium-D machine when using libthr, despite on UP or SMP kernel.	2006-06-13 13:12:56 +00:00
David Xu	0ae716e5ee	Make ke_rqindex unsigned.	2006-06-06 12:26:17 +00:00
John-Mark Gurney	5c06d111b8	back out for now... revert ccpu to being kern.ccpu...	2006-04-27 17:57:59 +00:00
John-Mark Gurney	c71ce6a445	move remaining sysctl into the kern.sched tree...	2006-04-26 19:42:38 +00:00
John Baldwin	0f180a7cce	Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of it's current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway. The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.	2006-04-17 18:20:38 +00:00
Peter Wemm	4da0d332f4	Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't merit being in opt_global.h and forcing a global recompile when only a few files reference it. Approved by: re	2005-06-24 00:16:57 +00:00
Stephan Uphoff	a3f2d84279	Lots of whitespace cleanup. Fix for broken if condition. Submitted by: nate@	2005-06-09 19:43:08 +00:00
Stephan Uphoff	f3a0f87396	Fix some race conditions for pinned threads that may cause them to run on the wrong CPU. Add IPI support for preempting a thread on another CPU. MFC after:3 weeks	2005-06-09 18:26:31 +00:00
Joseph Koshy	ebccf1e3a6	Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities and documentation into -CURRENT. Bump FreeBSD_version. Reviewed by: alc, jhb (kernel changes)	2005-04-19 04:01:25 +00:00
Maxim Konovalov	f305048664	Fix a typo in the comment. Noticed by: Samy Al Bahra	2005-04-15 14:01:43 +00:00
Stephan Uphoff	779186434a	Sprinkle some volatile magic and rearrange things a bit to avoid race conditions in critical_exit now that it no longer blocks interrupts. Reviewed by: jhb	2005-04-08 03:37:53 +00:00
John Baldwin	f5c157d986	Rework the interface between priority propagation (lending) and the schedulers a bit to ensure more correct handling of priorities and fewer priority inversions: - Add two functions to the sched(9) API to handle priority lending: sched_lend_prio() and sched_unlend_prio(). The turnstile code uses these functions to ask the scheduler to lend a thread a set priority and to tell the scheduler when it thinks it is ok for a thread to stop borrowing priority. The unlend case is slightly complex in that the turnstile code tells the scheduler what the minimum priority of the thread needs to be to satisfy the requirements of any other threads blocked on locks owned by the thread in question. The scheduler then decides where the thread can go back to normal mode (if it's normal priority is high enough to satisfy the pending lock requests) or it it should continue to use the priority specified to the sched_unlend_prio() call. This involves adding a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag for priority elevation. - Schedulers now refuse to lower the priority of a thread that is currently borrowing another therad's priority. - If a scheduler changes the priority of a thread that is currently sitting on a turnstile, it will call a new function turnstile_adjust() to inform the turnstile code of the change. This function resorts the thread on the priority list of the turnstile if needed, and if the thread ends up at the head of the list (due to having the highest priority) and its priority was raised, then it will propagate that new priority to the owner of the lock it is blocked on. Some additional fixes specific to the 4BSD scheduler include: - Common code for updating the priority of a thread when the user priority of its associated kse group has been consolidated in a new static function resetpriority_thread(). One change to this function is that it will now only adjust the priority of a thread if it already has a time sharing priority, thus preserving any boosts from a tsleep() until the thread returns to userland. Also, resetpriority() no longer calls maybe_resched() on each thread in the group. Instead, the code calling resetpriority() is responsible for calling resetpriority_thread() on any threads that need to be updated. - schedcpu() now uses resetpriority_thread() instead of just calling sched_prio() directly after it updates a kse group's user priority. - sched_clock() now uses resetpriority_thread() rather than writing directly to td_priority. - sched_nice() now updates all the priorities of the threads after the group priority has been adjusted. Discussed with: bde Reviewed by: ups, jeffr Tested on: 4bsd, ule Tested on: i386, alpha, sparc64	2004-12-30 20:52:44 +00:00
Jeff Roberson	907bdbc288	- Wrap the thread count adjustment in sched_load_add() and sched_load_rem() so that we may place some ktr entries nearby. - Define other KTR_SCHED tracepoints so that we may graph the operation of the scheduler.	2004-12-26 00:16:24 +00:00
Jeff Roberson	7842f65e7f	- Garbage collect several unused members of struct kse and struce ksegrp. As best as I can tell, some of these were never used.	2004-12-14 10:53:55 +00:00
Stephan Uphoff	5656474145	Propagate TDF_NEEDRESCHED to replacement thread in sched_switch(). Reviewed by: julian, jhb (in October) Approved by: sam (mentor) MFC after: 4 weeks	2004-12-07 18:17:24 +00:00
Julian Elischer	c20c691bed	When preempting a thread, put it back on the HEAD of its run queue. (Only really implemented in 4bsd) MFC after: 4 days	2004-10-05 22:03:10 +00:00
Julian Elischer	d39063f20d	Use some macros to trach available scheduler slots to allow easier debugging. MFC after: 4 days	2004-10-05 21:10:44 +00:00
Julian Elischer	14f0e2e9bf	clean up thread runq accounting a bit. MFC after: 3 days	2004-09-16 07:12:59 +00:00
Julian Elischer	b2578c6c06	Add some kasserts	2004-09-13 23:02:52 +00:00
Scott Long	1e7fad6b6a	Revert the previous round of changes to td_pinned. The scheduler isn't fully initialed when the pmap layer tries to call sched_pini() early in the boot and results in an quick panic. Use ke_pinned instead as was originally done with Tor's patch. Approved by: julian	2004-09-11 10:07:22 +00:00
Julian Elischer	5c854accc1	Make up my mind if cpu pinning is stored in the thread structure or the scheduler specific extension to it. Put it in the extension as the implimentation details of how the pinning is done needn't be visible outside the scheduler. Submitted by: tegge (of course!) (with changes) MFC after: 3 days	2004-09-10 22:28:33 +00:00
Julian Elischer	3389af30e8	Add some code to allow threads to nominat a sibling to run if theyu are going to sleep. MFC after: 1 week	2004-09-10 21:04:38 +00:00
Julian Elischer	6a574b2afc	Don't do IPIs on behalf of interrupt threads. just punt straight on through to teh preemption code. Make a KASSSERT out of a condition that can no longer occur. MFC after: 1 week	2004-09-06 07:23:14 +00:00
Julian Elischer	0fe38d47b7	slight code cleanup MFC after: 1 week	2004-09-05 23:23:58 +00:00
Julian Elischer	bce73aeddb	turn on IPIs for 4bsd scheduler by default. MFC after: 1 week	2004-09-05 02:19:53 +00:00
Julian Elischer	ed062c8d66	Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week	2004-09-05 02:09:54 +00:00
Julian Elischer	00b0483d5c	Don't declare a function we are not defining.	2004-09-03 09:19:49 +00:00
Julian Elischer	37c28a022b	fix compile for UP	2004-09-03 09:15:10 +00:00
Julian Elischer	293968d8d3	ooops finish last commit. moved the variables but not the declarations.	2004-09-03 08:19:31 +00:00
Julian Elischer	82a1dfc16d	Move 4bsd specific experimental IP code into the 4bsd file. Move the sysctls into kern.sched	2004-09-03 07:42:31 +00:00
Julian Elischer	6804a3ab6d	Give the 4bsd scheduler the ability to wake up idle processors when there is new work to be done. MFC after: 5 days	2004-09-01 06:42:02 +00:00
Julian Elischer	2630e4c90c	Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them. MFC after: 2 days	2004-09-01 02:11:28 +00:00
Julian Elischer	ad59c36ba1	diff reduction for upcoming patch. Use a macro that masks some of the odd goings on with sub-structures, because they will go away anyhow.	2004-08-22 05:21:41 +00:00
Julian Elischer	0f54f48225	Properly keep track of how many kses are on the system run queue(s).	2004-08-11 20:54:48 +00:00
Julian Elischer	732d95288a	Increase the amount of data exported by KTR in the KTR_RUNQ setting. This extra data is needed to really follow what is going on in the threaded case.	2004-08-09 18:21:12 +00:00
Scott Long	e038d35422	Clean up whitespace, increase consistency and correctness. Submitted by: bde	2004-07-23 23:09:00 +00:00
Julian Elischer	55d44f79ea	When calling scheduler entrypoints for creating new threads and processes, specify "us" as the thread not the process/ksegrp/kse. You can always find the others from the thread but the converse is not true. Theorotically this would lead to runtime being allocated to the wrong entity in some cases though it is not clear how often this actually happenned. (would only affect threaded processes and would probably be pretty benign, but it WAS a bug..) Reviewed by: peter	2004-07-18 23:36:13 +00:00
John Baldwin	52eb84641d	- Move TDF_OWEPREEMPT, TDF_OWEUPC, and TDF_USTATCLOCK over to td_pflags since they are only accessed by curthread and thus do not need any locking. - Move pr_addr and pr_ticks out of struct uprof (which is per-process) and directly into struct thread as td_profil_addr and td_profil_ticks as these variables are really per-thread. (They are used to defer an addupc_intr() that was too "hard" until ast()).	2004-07-16 21:04:55 +00:00
John Baldwin	6942d4339e	Set TDF_NEEDRESCHED when a higher priority thread is scheduled in sched_add() rather than just doing it in sched_wakeup(). The old ithread preemption code used to set NEEDRESCHED unconditionally if it didn't preempt which masked this bug in SCHED_4BSD. Noticed by: jake Reported by: kensmith, marcel	2004-07-13 20:49:13 +00:00
John Baldwin	0c0b25ae91	Implement preemption of kernel threads natively in the scheduler rather than as one-off hacks in various other parts of the kernel: - Add a function maybe_preempt() that is called from sched_add() to determine if a thread about to be added to a run queue should be preempted to directly. If it is not safe to preempt or if the new thread does not have a high enough priority, then the function returns false and sched_add() adds the thread to the run queue. If the thread should be preempted to but the current thread is in a nested critical section, then the flag TDF_OWEPREEMPT is set and the thread is added to the run queue. Otherwise, mi_switch() is called immediately and the thread is never added to the run queue since it is switch to directly. When exiting an outermost critical section, if TDF_OWEPREEMPT is set, then clear it and call mi_switch() to perform the deferred preemption. - Remove explicit preemption from ithread_schedule() as calling setrunqueue() now does all the correct work. This also removes the do_switch argument from ithread_schedule(). - Do not use the manual preemption code in mtx_unlock if the architecture supports native preemption. - Don't call mi_switch() in a loop during shutdown to give ithreads a chance to run if the architecture supports native preemption since the ithreads will just preempt DELAY(). - Don't call mi_switch() from the page zeroing idle thread for architectures that support native preemption as it is unnecessary. - Native preemption is enabled on the same archs that supported ithread preemption, namely alpha, i386, and amd64. This change should largely be a NOP for the default case as committed except that we will do fewer context switches in a few cases and will avoid the run queues completely when preempting. Approved by: scottl (with his re@ hat)	2004-07-02 20:21:44 +00:00
John Baldwin	bf0acc273a	- Change mi_switch() and sched_switch() to accept an optional thread to switch to. If a non-NULL thread pointer is passed in, then the CPU will switch to that thread directly rather than calling choosethread() to pick a thread to choose to. - Make sched_switch() aware of idle threads and know to do TD_SET_CAN_RUN() instead of sticking them on the run queue rather than requiring all callers of mi_switch() to know to do this if they can be called from an idlethread. - Move constants for arguments to mi_switch() and thread_single() out of the middle of the function prototypes and up above into their own section.	2004-07-02 19:09:50 +00:00
Scott Long	36c6fd1c0f	Fix another typo in the previous commit.	2004-06-21 23:47:47 +00:00
Scott Long	c38dd4b6bd	Fix typo that somehow crept into the previous commit	2004-06-21 22:42:46 +00:00
Scott Long	dc09579417	Add the sysctl node 'kern.sched.name' that has the name of the scheduler currently in use. Move the 4bsd kern.quantum node to kern.sched.quantum for consistency.	2004-06-21 22:05:46 +00:00
Julian Elischer	fa88511615	Nice, is a property of a process as a whole.. I mistakenly moved it to the ksegroup when breaking up the process structure. Put it back in the proc structure.	2004-06-16 00:26:31 +00:00

1 2

87 Commits