freebsd-dev

Author	SHA1	Message	Date
John Baldwin	007ddf7e7a	Now that the return value semantics of cv's for multithreaded processes have been unified with that of msleep(9), further refine the sleepq interface and consolidate some duplicated code: - Move the pre-sleep checks for theaded processes into a thread_sleep_check() function in kern_thread.c. - Move all handling of TDF_SINTR to be internal to subr_sleepqueue.c. Specifically, if a thread is awakened by something other than a signal while checking for signals before going to sleep, clear TDF_SINTR in sleepq_catch_signals(). This removes a sched_lock lock/unlock combo in that edge case during an interruptible sleep. Also, fix sleepq_check_signals() to properly handle the condition if TDF_SINTR is clear rather than requiring the callers of the sleepq API to notice this edge case and call a non-_sig variant of sleepq_wait(). - Clarify the flags arguments to sleepq_add(), sleepq_signal() and sleepq_broadcast() by creating an explicit submask for sleepq types. Also, add an explicit SLEEPQ_MSLEEP type rather than a magic number of 0. Also, add a SLEEPQ_INTERRUPTIBLE flag for use with sleepq_add() and move the setting of TDF_SINTR to sleepq_add() if this flag is set rather than sleepq_catch_signals(). Note that it is the caller's responsibility to ensure that sleepq_catch_signals() is called if and only if this flag is passed to the preceeding sleepq_add(). Note that this also removes a sched_lock lock/unlock pair from sleepq_catch_signals(). It also ensures that for an interruptible sleep, TDF_SINTR is always set when TD_ON_SLEEPQ() is true.	2004-08-19 11:31:42 +00:00
Julian Elischer	732d95288a	Increase the amount of data exported by KTR in the KTR_RUNQ setting. This extra data is needed to really follow what is going on in the threaded case.	2004-08-09 18:21:12 +00:00
John Baldwin	c950c15c76	Workaround a possible deadlock on SMP due to a spin lock LOR by disabling the immediate awakening of proc0 (scheduler kproc, controls swapping processes in and out). The scheduler process periodically awakens already, so this will not result in processes not being swapped in, there will just be more latency in between a thread being made runnable and the scheduler waking up to swap the affected process back in.	2004-08-04 20:24:40 +00:00
David Xu	8bda8a620c	Use P_SINGLE_EXIT to check single-threading case, P_WEXIT is not for that purpose.	2004-07-28 06:30:52 +00:00
John Baldwin	52eb84641d	- Move TDF_OWEPREEMPT, TDF_OWEUPC, and TDF_USTATCLOCK over to td_pflags since they are only accessed by curthread and thus do not need any locking. - Move pr_addr and pr_ticks out of struct uprof (which is per-process) and directly into struct thread as td_profil_addr and td_profil_ticks as these variables are really per-thread. (They are used to defer an addupc_intr() that was too "hard" until ast()).	2004-07-16 21:04:55 +00:00
Marcel Moolenaar	2d50560abc	Update for the KDB framework: o Make debugging code conditional upon KDB instead of DDB. o Call kdb_enter() instead of Debugger(). o Call kdb_backtrace() instead of db_print_backtrace() or backtrace(). kern_mutex.c: o Replace checks for db_active with checks for kdb_active and make them unconditional. kern_shutdown.c: o s/DDB_UNATTENDED/KDB_UNATTENDED/g o s/DDB_TRACE/KDB_TRACE/g o Save the TID of the thread doing the kernel dump so the debugger knows which thread to select as the current when debugging the kernel core file. o Clear kdb_active instead of db_active and do so unconditionally. o Remove backtrace() implementation. kern_synch.c: o Call kdb_reenter() instead of db_error().	2004-07-10 21:36:01 +00:00
John Baldwin	0c0b25ae91	Implement preemption of kernel threads natively in the scheduler rather than as one-off hacks in various other parts of the kernel: - Add a function maybe_preempt() that is called from sched_add() to determine if a thread about to be added to a run queue should be preempted to directly. If it is not safe to preempt or if the new thread does not have a high enough priority, then the function returns false and sched_add() adds the thread to the run queue. If the thread should be preempted to but the current thread is in a nested critical section, then the flag TDF_OWEPREEMPT is set and the thread is added to the run queue. Otherwise, mi_switch() is called immediately and the thread is never added to the run queue since it is switch to directly. When exiting an outermost critical section, if TDF_OWEPREEMPT is set, then clear it and call mi_switch() to perform the deferred preemption. - Remove explicit preemption from ithread_schedule() as calling setrunqueue() now does all the correct work. This also removes the do_switch argument from ithread_schedule(). - Do not use the manual preemption code in mtx_unlock if the architecture supports native preemption. - Don't call mi_switch() in a loop during shutdown to give ithreads a chance to run if the architecture supports native preemption since the ithreads will just preempt DELAY(). - Don't call mi_switch() from the page zeroing idle thread for architectures that support native preemption as it is unnecessary. - Native preemption is enabled on the same archs that supported ithread preemption, namely alpha, i386, and amd64. This change should largely be a NOP for the default case as committed except that we will do fewer context switches in a few cases and will avoid the run queues completely when preempting. Approved by: scottl (with his re@ hat)	2004-07-02 20:21:44 +00:00
John Baldwin	bf0acc273a	- Change mi_switch() and sched_switch() to accept an optional thread to switch to. If a non-NULL thread pointer is passed in, then the CPU will switch to that thread directly rather than calling choosethread() to pick a thread to choose to. - Make sched_switch() aware of idle threads and know to do TD_SET_CAN_RUN() instead of sticking them on the run queue rather than requiring all callers of mi_switch() to know to do this if they can be called from an idlethread. - Move constants for arguments to mi_switch() and thread_single() out of the middle of the function prototypes and up above into their own section.	2004-07-02 19:09:50 +00:00
John Baldwin	a5471e4ef4	Remove the signal_caught argument from sleepq_timedwait() as it was effectively always zero.	2004-06-28 18:57:06 +00:00
Tim J. Robbins	be5318b2ca	Remove a stale and misleading comment.	2004-06-07 09:35:00 +00:00
Bruce Evans	a13ec35b05	Fixed some common printf format errors. Don't assume that "struct foo " is "void " (it isn't) or that the default promotion of pid_t is int. Instead, assume that casting "struct foo " to "void " and printing the result with %p is useful, and that all pid_t's are representable as longs. Fixed some minor style bugs (mainly spelling errors in comments).	2004-05-14 20:51:42 +00:00
Warner Losh	7f8a436ff2	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
John Baldwin	1ed3e44f22	- Remove old sleep queues. - Remove sleepqueue argument from sleepq_set_timeout() since it is not used.	2004-03-12 19:06:18 +00:00
Robert Watson	ce89352952	Mark loadaverage callout as CALLOUT_MPSAFE. Reviewed by: jhb	2004-03-08 22:01:19 +00:00
John Baldwin	959c0c4122	Correct handling of PDROP in msleep() to just skip the mtx_lock() rather than clear the lock pointer so that sleepq_add() still gets the correct lock pointer and doesn't bogusly trip an assertion.	2004-03-02 14:58:33 +00:00
John Baldwin	44f3b09204	Switch the sleep/wakeup and condition variable implementations to use the sleep queue interface: - Sleep queues attempt to merge some of the benefits of both sleep queues and condition variables. Having sleep qeueus in a hash table avoids having to allocate a queue head for each wait channel. Thus, struct cv has shrunk down to just a single char * pointer now. However, the hash table does not hold threads directly, but queue heads. This means that once you have located a queue in the hash bucket, you no longer have to walk the rest of the hash chain looking for threads. Instead, you have a list of all the threads sleeping on that wait channel. - Outside of the sleepq code and the sleep/cv code the kernel no longer differentiates between cv's and sleep/wakeup. For example, calls to abortsleep() and cv_abort() are replaced with a call to sleepq_abort(). Thus, the TDF_CVWAITQ flag is removed. Also, calls to unsleep() and cv_waitq_remove() have been replaced with calls to sleepq_remove(). - The sched_sleep() function no longer accepts a priority argument as sleep's no longer inherently bump the priority. Instead, this is soley a propery of msleep() which explicitly calls sched_prio() before blocking. - The TDF_ONSLEEPQ flag has been dropped as it was never used. The associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been dropped and replaced with a single explicit clearing of td_wchan. TD_SET_ONSLEEPQ() would really have only made sense if it had taken the wait channel and message as arguments anyway. Now that that only happens in one place, a macro would be overkill.	2004-02-27 18:52:44 +00:00
Jeff Roberson	40ece05382	- Revert rev 1.240 we no longer need a kthread for loadav().	2004-02-01 05:37:36 +00:00
Jeff Roberson	e7f004fe23	- Use sched_load() rather than grabbing the sx lock and traversing the proc table to discover the load.	2004-02-01 02:51:33 +00:00
John Baldwin	d5b75694e7	Move the loadav() callout into its own kthread since it uses allproc_lock which is a sleepable lock and thus is not safe to acquire from a callout routine.	2004-01-28 20:44:41 +00:00
Jeff Roberson	d1605f0ac9	- Use a unique string for the sched_setup SYSINIT and rename sched_setup to synch_setup. The schedulers use the sched_setup function name.	2004-01-25 07:49:45 +00:00
Jeff Roberson	29bcc4514f	- Add a flags parameter to mi_switch. The value of flags may be SW_VOL or SW_INVOL. Assert that one of these is set in mi_switch() and propery adjust the rusage statistics. This is to simplify the large number of users of this interface which were previously all required to adjust the proper counter prior to calling mi_switch(). This also facilitates more switch and locking optimizations. - Change all callers of mi_switch() to pass the appropriate paramter and remove direct references to the process statistics.	2004-01-25 03:54:52 +00:00
Bruce Evans	b3aeaf2ed1	Removed mostly-dead code for setting switchtime after the idle loop clobbers this variable. Long ago, when the idle loop wasn't in a process, it set switchtime.tv_sec to zero to indicate that the time needs to be read after the idle loop finishes. The special case for this isn't needed now that there is an idle process (for each CPU). The time is read in the normal way when the idle process is switched away from. The seconds component of the time is only zero for the first second after the uptime is set, and the mostly-dead code was only executed during this time. (This was slightly broken by using uptimes instead of times relative to the Epoch -- in the original version the seconds component of the time was only 0 for the first second after the Epoch.) In mi_switch(), moved the setting of switchticks to just after the first (and now only) setting of switchtime. This setting used to be delayed since a late setting was needed for the idle case and an early setting was not needed. Now the early setting is needed so that fork_exit() doesn't need to set either switchtime or switchticks. Removed now-completely-rotted comment attached to this. Most of the code described by the comment had already moved to sched_switch().	2003-10-29 15:23:09 +00:00
Jeff Roberson	ae53b483cc	- Collapse sched_switchin() and sched_switchout() into sched_switch(). Now mi_switch() calls sched_switch() which calls cpu_switch(). This is actually one less function call than it had been.	2003-10-16 08:53:46 +00:00
Bruce M Simpson	0c9601bc6b	Add a pre-emption counter, td_generation, so that threads can notice when they have been pre-empted by other threads. This is bumped from within mi_switch() every time a context switch takes place. Discussed with: pete	2003-10-05 09:35:08 +00:00
Jeff Roberson	fa3f9daae5	- On my Pentium4-M laptop, invalpg takes ~1100 cycles if the page is found in the TLB and ~1600 if it is not. Therefore, it is more effecient to invalidate the TLB after operations that use CMAP rather than before. - So that the tlb is invalidated prior to switching off of a processor, we must change the switchin functions to switchout functions. - Remove td_switchout from the thread and move it to the x86 pcb. - Move the code that calls switchout into swtch.s. These changes make this optimization truely x86 specific.	2003-09-30 08:11:36 +00:00
Sam Leffler	c06eb4e293	Change instances of callout_init that specify MPSAFE behaviour to use CALLOUT_MPSAFE instead of "1" for the second parameter. This does not change the behaviour; it just makes the intent more clear.	2003-08-19 17:51:11 +00:00
John Baldwin	70fca4277e	- Various style fixes in both code and comments. - Update some stale comments. - Sort a couple of includes. - Only set 'newcpu' in updatepri() if we use it. - No functional changes. Obtained from: bde (via an old diff I got a long time ago)	2003-08-15 21:29:06 +00:00
Peter Grehan	eac100658a	Update powerpc to use the (old thread,new thread) calling convention for cpu_throw() and cpu_switch().	2003-08-14 03:56:24 +00:00
John Baldwin	e9911cf591	- Convert Alpha over to the new calling conventions for cpu_throw() and cpu_switch() where both the old and new threads are passed in as arguments. Only powerpc uses the old conventions now. - Update comments in the Alpha swtch.s to reflect KSE changes. Tested by: obrien, marcel	2003-08-12 19:33:36 +00:00
Peter Wemm	e95babf3a8	unifdef -DLAZY_SWITCH and start to tidy up the associated glue.	2003-07-10 01:02:59 +00:00
David Xu	9dde3bc999	o Change kse_thr_interrupt to allow send a signal to a specified thread, or unblock a thread in kernel, and allow UTS to specify whether syscall should be restarted. o Add ability for UTS to monitor signal comes in and removed from process, the flag PS_SIGEVENT is used to indicate the events. o Add a KMF_WAITSIGEVENT for KSE mailbox flag, UTS call kse_release with this flag set to wait for above signal event. o For SA based thread, kernel masks all signal in its signal mask, let UTS to use kse_thr_interrupt interrupt a thread, and install a signal frame in userland for the thread. o Add a tm_syncsig in thread mailbox, when a hardware trap occurs, it is used to deliver synchronous signal to userland, and upcall is schedule, so UTS can process the synchronous signal for the thread. Reviewed by: julian (mentor)	2003-06-28 08:29:05 +00:00
Peter Wemm	eabd19726f	Tidy up leftover lazy_switch instrumentation that is no longer needed. This cleans up some #ifdef hell.	2003-06-27 22:39:14 +00:00
David Xu	0e2a4d3aeb	Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.	2003-06-15 00:31:24 +00:00
David E. O'Brien	677b542ea2	Use __FBSDID().	2003-06-11 00:56:59 +00:00
Poul-Henning Kamp	4fe77d64a0	Add a couple of XXX comments where the intent is not clear. Found by: FlexeLint	2003-05-31 20:13:58 +00:00
Marcel Moolenaar	f2c49dd248	Revamp of the syscall path, exception and context handling. The prime objectives are: o Implement a syscall path based on the epc inststruction (see sys/ia64/ia64/syscall.s). o Revisit the places were we need to save and restore registers and define those contexts in terms of the register sets (see sys/ia64/include/_regset.h). Secundairy objectives: o Remove the requirement to use contigmalloc for kernel stacks. o Better handling of the high FP registers for SMP systems. o Switch to the new cpu_switch() and cpu_throw() semantics. o Add a good unwinder to reconstruct contexts for the rare cases we need to (see sys/contrib/ia64/libuwx) Many files are affected by this change. Functionally it boils down to: o The EPC syscall doesn't preserve registers it does not need to preserve and places the arguments differently on the stack. This affects libc and truss. o The address of the kernel page directory (kptdir) had to be unstaticized for use by the nested TLB fault handler. The name has been changed to ia64_kptdir to avoid conflicts. The renaming affects libkvm. o The trapframe only contains the special registers and the scratch registers. For syscalls using the EPC syscall path no scratch registers are saved. This affects all places where the trapframe is accessed. Most notably the unaligned access handler, the signal delivery code and the debugger. o Context switching only partly saves the special registers and the preserved registers. This affects cpu_switch() and triggered the move to the new semantics, which additionally affects cpu_throw(). o The high FP registers are either in the PCB or on some CPU. context switching for them is done lazily. This affects trap(). o The mcontext has room for all registers, but not all of them have to be defined in all cases. This mostly affects signal delivery code now. The *context syscalls are as of yet still unimplemented. Many details went into the removal of the requirement to use contigmalloc for kernel stacks. The details are mostly CPU specific and limited to exception_save() and exception_restore(). The few places where we create, destroy or switch stacks were mostly simplified by not having to construct physical addresses and additionally saving the virtual addresses for later use. Besides more efficient context saving and restoring, which of course yields a noticable speedup, this also fixes the dreaded SMP bootup problem as a side-effect. The details of which are still not fully understood. This change includes all the necessary backward compatibility code to have it handle older userland binaries that use the break instruction for syscalls. Support for break-based syscalls has been pessimized in favor of a clean implementation. Due to the overall better performance of the kernel, this will still be notived as an improvement if it's noticed at all. Approved by: re@ (jhb)	2003-05-16 21:26:42 +00:00
John Baldwin	90af4afacb	- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe. Reviewed by: arch@ Approved by: re (rwatson)	2003-05-13 20:36:02 +00:00
John Baldwin	e668d8d834	Remove TD_ON_RUNQ() from a check to make sure Giant is not held when calling mi_switch(). The kernel would panic on an earlier KASSERT() in mi_switch() if TD_ON_RUNQ() was true.	2003-05-05 21:12:36 +00:00
John Baldwin	f2957f6b9a	Garbage collect unused TDF_INMSLEEP flag.	2003-05-01 17:05:24 +00:00
Peter Wemm	cb1f265c60	AMD64 uses the new-style cpu_switch()/cpu_throw() calling conventions.	2003-04-30 21:45:03 +00:00
John Baldwin	112afcb232	- Protect p_numthreads with the sched_lock. - Protect p_singlethread with both the sched_lock and the proc lock. - Protect p_suspcount with the proc lock.	2003-04-23 18:46:51 +00:00
Peter Wemm	cc66ebe2a9	Commit a partial lazy thread switch mechanism for i386. it isn't as lazy as it could be and can do with some more cleanup. Currently its under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process. ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be, the interrupt overhead is still high and too much blocks on Giant still. There are some debug sysctls, for stats and for an on/off switch. The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap. Its not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default. This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more. The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it. One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually seperate to the lazy switch stuff. Glanced at by: jake, jhb	2003-04-02 23:53:30 +00:00
Jeff Roberson	2c10d16a4b	- Borrow the KSE single threading code for exec and exit. We use the check if (p->p_numthreads > 1) and not a flag because action is only necessary if there are other threads. The rest of the system has no need to identify thr threaded processes. - In kern_thread.c use thr_exit1() instead of thread_exit() if P_THREADED is not set.	2003-04-01 01:26:20 +00:00
David Xu	6ce75196ce	Adjust code for userland preemptive. Userland can set a quantum in kse_mailbox to schedule an upcall, this is useful for userland timeout routine, for example pthread_cond_timedwait(). Also extract upcall scheduling code from kse_reassign and create a new function called thread_switchout to include these code. Reviewed by: julain	2003-03-19 05:49:38 +00:00
John Baldwin	263067951a	Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().	2003-03-04 21:03:05 +00:00
Hartmut Brandt	b89bc9e62b	When a process has been waiting on a condition variable or mutex the td_wmesg field in the thread structure points to the description string of the condition variable or mutex. If the condvar or the mutex had been initialized from a loadable module that was unloaded in the meantime, td_wmesg may now point to invalid memory. Retrieving the process table now may panic the kernel (or access junk). Setting the td_wmesg field to NULL after unblocking on the condvar/mutex prevents this panic. PR: kern/47408 Approved by: jake (mentor)	2003-02-27 08:43:27 +00:00
Julian Elischer	ac2e415327	Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.	2003-02-27 02:05:19 +00:00
Julian Elischer	4a338afd7a	Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case.. I should have listenned to the other mind. Submitted by: parts by davidxu@ Reviewed by: jeff@ mini@	2003-02-17 09:55:10 +00:00
Alfred Perlstein	aae87a3681	Print a backtrace in case we tsleep from inside of DDB.	2003-02-14 12:44:07 +00:00
Julian Elischer	93a7aa79d6	Add code to ddb to allow backtracing an arbitrary thread. (show thread {address}) Remove the IDLE kse state and replace it with a change in the way threads sahre KSEs. Every KSE now has a thread, which is considered its "owner" however a KSE may also be lent to other threads in the same group to allow completion of in-kernel work. n this case the owner remains the same and the KSE will revert to the owner when the other work has been completed. All creations of upcalls etc. is now done from kse_reassign() which in turn is called from mi_switch or thread_exit(). This means that special code can be removed from msleep() and cv_wait(). kse_release() does not leave a KSE with no thread any more but converts the existing thread into teh KSE's owner, and sets it up for doing an upcall. It is just inhibitted from being scheduled until there is some reason to do an upcall. Remove all trace of the kse_idle queue since it is no-longer needed. "Idle" KSEs are now on the loanable queue.	2002-12-28 01:23:07 +00:00
Julian Elischer	696058c3c5	Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage during the context switch. Rearrange thread cleanups to avoid problems with Giant. Clean threads when freed or when recycled. Approved by: re (jhb)	2002-12-10 02:33:45 +00:00
Jeff Roberson	148302c9c9	- Move FSCALE back to kern_sync. This is not scheduler specific. - Create a new callout for lbolt and move it out of schedcpu(). This is not scheduler specific either. Approved by: re	2002-11-21 08:57:08 +00:00
David Xu	34e80e027d	Add an actual implementation of kse_thr_interrupt()	2002-10-30 02:28:41 +00:00
Julian Elischer	9d10277721	More work on the interaction between suspending and sleeping threads. Also clean up some code used with 'single-threading'. Reviewed by: davidxu	2002-10-25 07:11:12 +00:00
Jeff Roberson	b43179fbe8	- Create a new scheduler api that is defined in sys/sched.h - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c Reviewed by: -arch	2002-10-12 05:32:24 +00:00
John Baldwin	5715307f74	- Move p_cpulimit to struct proc from struct plimit and protect it with sched_lock. This means that we no longer access p_limit in mi_switch() and the p_limit pointer can be protected by the proc lock. - Remove PRS_ZOMBIE check from CPU limit test in mi_switch(). PRS_ZOMBIE processes don't call mi_switch(), and even if they did there is no longer the danger of p_limit being NULL (which is what the original zombie check was added for). - When we bump the current processes soft CPU limit in ast(), just bump the private p_cpulimit instead of the shared rlimit. This fixes an XXX for some value of fix. There is still a (probably benign) bug in that this code doesn't check that the new soft limit exceeds the hard limit. Inspired by: bde (2)	2002-10-09 17:17:24 +00:00
Julian Elischer	48bfcddd94	Round out the facilty for a 'bound' thread to loan out its KSE in specific situations. The owner thread must be blocked, and the borrower can not proceed back to user space with the borrowed KSE. The borrower will return the KSE on the next context switch where teh owner wants it back. This removes a lot of possible race conditions and deadlocks. It is consceivable that the borrower should inherit the priority of the owner too. that's another discussion and would be simple to do. Also, as part of this, the "preallocatd spare thread" is attached to the thread doing a syscall rather than the KSE. This removes the need to lock the scheduler when we want to access it, as it's now "at hand". DDB now shows a lot mor info for threaded proceses though it may need some optimisation to squeeze it all back into 80 chars again. (possible JKH project) Upcalls are now "bound" threads, but "KSE Lending" now means that other completing syscalls can be completed using that KSE before the upcall finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is one of the highest priorities in the KSE system.) The upcall when it happens will present all the completed syscalls to the KSE for selection.	2002-10-09 02:33:36 +00:00
Juli Mallett	a723033a4d	XXX Add a check for p->p_limit being NULL before dereferencing it. This is totally bogus but will hide the occurances of access of 0xbc(NULL) which people have run into lately. This is not a proper fix, just a bandaid, until the cause of this happening is tracked down and fixed. Reviewed by: rwatson	2002-10-03 04:09:00 +00:00
John Baldwin	551cf4e150	Rename the mutex thread and process states to use a more generic 'LOCK' name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will be used internally by the turnstile code and will not be specific to mutexes. Making the change now ensures that turnstiles can be dropped in at a later date without affecting the ABI of userland applications.	2002-10-02 20:31:47 +00:00
John Baldwin	1d56414515	- Adjust comment noting that handling of CPU limit exhaustion is done in ast(). - Actually set KEF_ASTPENDING so ast() is called. I think this is buggy for a process with multiple KSE's in that PS_XCPU is not a KSE event, it's a process-wide event. IMO there really should probably be two ASTPENDING flags, one for per-process, and one for per-KSE. Submitted by: bde	2002-10-01 14:10:08 +00:00
John Baldwin	dc183990ca	- Add a new per-process flag PS_XCPU to indicate that at least one thread has exceeded its CPU time limit. - In mi_switch(), set PS_XCPU when the CPU time limit is exceeded. - Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set. Requested by: many	2002-09-30 21:13:54 +00:00
Julian Elischer	9eb1fdea37	Implement basic KSE loaning. This stops a hread that is blocked in BOUND mode from stopping another thread from completing a syscall, and this allows it to release its resources etc. Probably more related commits to follow (at least one I know of) Initial concept by: julian, dillon Submitted by: davidxu	2002-09-29 23:04:34 +00:00
Julian Elischer	71fad9fdee	Completely redo thread states. Reviewed by: davidxu@freebsd.org	2002-09-11 08:13:56 +00:00
Julian Elischer	472be95807	Rejig the code to figure out estcpu and work out how long a KSEGRP has been idle. What was there before was surprisingly ALMOST correct. Peter and I fried our brains on this for a couple of hours figuring out what this actually means in the context of multiple threads. Reviewed by: peter@freebsd.org	2002-08-30 00:25:49 +00:00
Peter Wemm	d13947c3b0	updatepri() works on a ksegrp (where the scheduling parameters are), so directly give it the ksegrp instead of the thread. The only thing it used to use in the thread was the ksegrp. Reviewed by: julian	2002-08-28 23:45:15 +00:00
Julian Elischer	04774f2357	Slight cleanup of some comments/whitespace. Make idle process state more consistant. Add an assert on thread state. Clean up idleproc/mi_switch() interaction. Use a local instead of referencing curthread 7 times in a row (I've been told curthread can be expensive on some architectures) Remove some commented out code. Add a little commented out code (completion coming soon) Reviewed by: jhb@freebsd.org	2002-08-01 18:45:10 +00:00
Seigo Tanimura	133267776c	In endtsleep() and cv_timedwait_end(), a thread marked TDF_TIMEOUT may be swapped out. Do not put such the thread directly back to the run queue. Spotted by: David Xu <davidx@viasoft.com.cn> While I am here, s/PS_TIMEOUT/TDF_TIMEOUT/.	2002-07-30 10:12:11 +00:00
Seigo Tanimura	9eb881f804	- Optimize wakeup() and its friends; if a thread waken up is being swapped in, we do not have to ask for the scheduler thread to do that. - Assert that a process is not swapped out in runq functions and swapout(). - Introduce thread_safetoswapout() for readability. - In swapout_procs(), perform a test that may block (check of a thread working on its vm map) first. This lets us call swapout() with the sched_lock held, providing a better atomicity.	2002-07-30 06:54:05 +00:00
Julian Elischer	1d7b9ed2e6	Create a new thread state to describe threads that would be ready to run except for the fact tha they are presently swapped out. Also add a process flag to indicate that the process has started the struggle to swap back in. This will be needed for the case where multiple threads start the swapin action top a collision. Also add code to stop a process fropm being swapped out if one of the threads in this process is actually off running on another CPU.. that might hurt... Submitted by: Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>	2002-07-29 18:33:32 +00:00
Julian Elischer	294e6308bf	slight stylisations to take into account recent code changes.	2002-07-24 23:59:15 +00:00
Julian Elischer	2d014fd7f8	Fix a reversed test. Fix some style nits. Fix a KASSERT message. Add/fix some comments. Submitted by: bde@freebsd.org	2002-07-17 19:20:48 +00:00
John Baldwin	627ed43ba7	Add a KASSERT() to assert that td_critnest is == 1 when mi_switch() is called.	2002-07-17 02:46:13 +00:00
Andrew Gallatin	fe79953325	Allow alphas to do crashdumps: Refuse to run anything in choosethread() after a panic which is not an interrupt thread, or the thread which caused the panic. Also, remove panicstr checks from msleep() and from cv_wait() in order to allow threads to go to sleep and yeild the cpu to the panicing thread, or to an interrupt thread which might be doing the crashdump. Reviewed by: jhb (and it was mostly his idea too)	2002-07-17 02:23:44 +00:00
Julian Elischer	c3b98db091	Thinking about it I came to the conclusion that the KSE states were incorrectly formulated. The correct states should be: IDLE: On the idle KSE list for that KSEG RUNQ: Linked onto the system run queue. THREAD: Attached to a thread and slaved to whatever state the thread is in. This means that most places where we were adjusting kse state can go away as it is just moving around because the thread is.. The only places we need to adjust the KSE state is in transition to and from the idle and run queues. Reviewed by: jhb@freebsd.org	2002-07-14 03:43:33 +00:00
Julian Elischer	ac8bcbb700	oops, state cannot be two different values at once.. use \|\| instead of &&	2002-07-14 01:36:48 +00:00
Matthew Dillon	fbcf77c2ea	Re-enable the idle page-zeroing code. Remove all IPIs from the idle page-zeroing code as well as from the general page-zeroing code and use a lazy tlb page invalidation scheme based on a callback made at the end of mi_switch. A number of people came up with this idea at the same time so credit belongs to Peter, John, and Jake as well. Two-way SMP buildworld -j 5 tests (second run, after stabilization) 2282.76 real 2515.17 user 704.22 sys before peter's IPI commit 2266.69 real 2467.50 user 633.77 sys after peter's commit 2232.80 real 2468.99 user 615.89 sys after this commit Reviewed by: peter, jhb Approved by: peter	2002-07-12 20:17:06 +00:00
Julian Elischer	fe0f1bf4b1	make this repect ps_sigintr if there is a pre-existing signal or suspension request. Submitted by: David Xu	2002-07-06 08:47:24 +00:00
Julian Elischer	55fb7ca894	Fix at least one of the things wrong with signals ^Z should work a lot better now. Submitted by: peter@freebsd.org	2002-07-06 02:45:11 +00:00
Julian Elischer	aa0fa33464	Try clean up some of the mess that resulted from layers and layers of p4 merges from -current as things started getting different. Corroborated by: Similar patches just mailed by BDE.	2002-07-03 09:15:20 +00:00
Julian Elischer	8b768fc82b	When going back to SLEEP state, make sure our State is correctly marked so.	2002-07-02 05:40:51 +00:00
Julian Elischer	e602ba25fd	Part 1 of KSE-III The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..	2002-06-29 17:26:22 +00:00
Alfred Perlstein	016091145e	more caddr_t removal.	2002-06-29 02:00:02 +00:00
Matthew Dillon	727300861d	I Noticed a defect in the way wakeup() scans the tailq. Tor noticed an even worse defect in wakeup_one(). This patch cleans up both. Submitted by: tegge MFC after: 3 days	2002-06-24 00:14:36 +00:00
John Baldwin	9ba7fe1b76	- Catch up to new ktrace API. - ktrace trace points in msleep() and cv_wait() no longer need Giant.	2002-06-07 05:39:16 +00:00
Julian Elischer	628855e758	CURSIG() is not a macro so rename it cursig(). Obtained from: KSE tree	2002-05-29 23:44:32 +00:00
John Baldwin	cc5d39f81e	Minor nit: get p pointer in msleep() from td->td_proc (where td == curthread) rather than from curproc.	2002-05-23 04:14:18 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
Peter Wemm	30171114b3	Fix a gcc-3.1+ warning. warning: deprecated use of label at end of compound statement ie: you cannot do this anymore: switch(foo) { .... default: }	2002-03-19 11:02:06 +00:00
Poul-Henning Kamp	1cbb9c3b03	Convert p->p_runtime and PCPU(switchtime) to bintime format.	2002-02-22 13:32:01 +00:00
Julian Elischer	2c1007663f	In a threaded world, differnt priorirites become properties of different entities. Make it so. Reviewed by: jhb@freebsd.org (john baldwin)	2002-02-11 20:37:54 +00:00
John Baldwin	c86b6ff551	Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed: The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex releease and swi schedule, respectively when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.) I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly. Reviewed by: peter Tested on: i386, alpha	2002-01-05 08:47:13 +00:00
John Baldwin	7e1f6dfe9d	Modify the critical section API as follows: - The MD functions critical_enter/exit are renamed to start with a cpu_ prefix. - MI wrapper functions critical_enter/exit maintain a per-thread nesting count and a per-thread critical section saved state set when entering a critical section while at nesting level 0 and restored when exiting to nesting level 0. This moves the saved state out of spin mutexes so that interlocking spin mutexes works properly. - Most low-level MD code that used critical_enter/exit now use cpu_critical_enter/exit. MI code such as device drivers and spin mutexes use the MI wrappers. Note that since the MI wrappers store the state in the current thread, they do not have any return values or arguments. - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is assigned to curthread->td_savecrit during fork_exit(). Tested on: i386, alpha	2001-12-18 00:27:18 +00:00
Luigi Rizzo	af1408e33f	Add/correct description for some sysctl variables where it was missing. The description field is unused in -stable, so the MFC there is equivalent to a comment. It can be done at any time, i am just setting a reminder in 45 days when hopefully we are past 4.5-release. MFC after: 45 days	2001-12-16 16:07:20 +00:00
John Baldwin	ac9a258074	Assert that Giant is not held in mi_switch() unless the process state is SMTX or SRUN.	2001-10-23 17:52:49 +00:00
Ian Dowse	72ec63a53d	Introduce some jitter to the timing of the samples that determine the system load average. Previously, the load average measurement was susceptible to synchronisation with processes that run at regular intervals such as the system bufdaemon process. Each interval is now chosen at random within the range of 4 to 6 seconds. This large variation is chosen so that over the shorter 5-minute load average timescale there is a good dispersion of samples across the 5-second sample period (the time to perform 60 5-second samples now has a standard deviation of approx 4.5 seconds).	2001-10-20 16:07:17 +00:00
Ian Dowse	0eb6ce3169	Move the code that computes the system load average from vm_meter.c to kern_synch.c in preparation for adding some jitter to the inter-sample time. Note that the "vm.loadavg" sysctl still lives in vm_meter.c which isn't the right place, but it is appropriate for the current (bad) name of that sysctl. Suggested by: jhb (some time ago) Reviewed by: bde	2001-10-20 13:10:43 +00:00
John Baldwin	21832b1ec0	GC some #if 0'd code.	2001-09-21 19:21:18 +00:00
John Baldwin	3226cbf43b	Whitespace and spelling fixes.	2001-09-21 19:16:12 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Matthew Dillon	918c3b1361	Make yield() MPSAFE. Synchronize syscalls.master with all MPSAFE changes to date. Synchronize new syscall generation follows because yield() will panic if it is out of sync with syscalls.master.	2001-09-01 03:54:09 +00:00
John Baldwin	b285782b29	Release the sched_lock before bombing out in mi_switch() via db_error(). This makes things slightly easier if you call a function that calls mi_switch() as it keeps the locking before and after closer.	2001-08-21 23:10:37 +00:00
John Baldwin	161778121a	Add a hook to mi_switch() to abort via db_error() if we attempt to perform a context switch from DDB. Consulting from: bde	2001-08-21 20:09:05 +00:00
John Baldwin	91a4536f22	- Fix a bug in the previous workaround for the tsleep/endtsleep race. callout_stop() would fail in two cases: 1) The timeout was currently executing, and 2) The timeout had already executed. We only needed to work around the race for 1). We caught some instances of 2) via the PS_TIMEOUT flag, however, if endtsleep() fired after the process had been woken up but before it had resumed execution, PS_TIMEOUT would not be set, but callout_stop() would fail, so we would block the process until endtsleep() resumed it. Except that endtsleep() had already run and couldn't resume it. This adds a new flag PS_TIMOFAIL to indicate the case of 2) when PS_TIMEOUT isn't set. - Implement this race fix for condition variables as well. Tested by: sos	2001-08-21 18:42:45 +00:00
John Baldwin	688ebe120c	- Close races with signals and other AST's being triggered while we are in the process of exiting the kernel. The ast() function now loops as long as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with preemption disabled so that any further AST's that arrive via an interrupt will be delayed until the low-level MD code returns to user mode. - Use u_int's to store the tick counts for profiling purposes so that we do not need sched_lock just to read p_sticks. This also closes a problem where the call to addupc_task() could screw up the arithmetic due to non-atomic reads of p_sticks. - Axe need_proftick(), aston(), astoff(), astpending(), need_resched(), clear_resched(), and resched_wanted() in favor of direct bit operations on p_sflag. - Fix up locking with sched_lock some. In addupc_intr(), use sched_lock to ensure pr_addr and pr_ticks are updated atomically with setting PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing PS_OWEUPC. We also do not grab the lock just to test a flag. - Simplify the handling of Giant in ast() slightly. Reviewed by: bde (mostly)	2001-08-10 22:53:32 +00:00
John Baldwin	8791b43513	Work around a race between msleep() and endtsleep() where it was possible for endtsleep() to be executing when msleep() resumed, for endtsleep() to spin on sched_lock long enough for the other process to loop on msleep() and sleep again resulting in endtsleep() waking up the "wrong" msleep. Obtained from: BSD/OS	2001-08-10 21:08:56 +00:00
John Baldwin	4d33620270	Style nit: covert a couple of if (p_wchan) tests to if (p_wchan != NULL).	2001-08-10 20:56:25 +00:00
John Baldwin	8ec48c6dbf	- Remove asleep(), await(), and M_ASLEEP. - Callers of asleep() and await() have been converted to calling tsleep(). The only caller outside of M_ASLEEP was the ata driver, which called both asleep() and await() with spl-raised, so there was no need for the asleep() and await() pair. M_ASLEEP was unused. Reviewed by: jasone, peter	2001-08-10 06:37:05 +00:00
John Baldwin	b39bc3e160	Use 'p' instead of the potentially more expensive 'curproc' inside of mi_switch().	2001-08-02 22:15:31 +00:00
John Baldwin	36c2e9feb4	Apply the cluebat to myself and undo the await() -> mawait() rename. The asleep() and await() functions split the functionality of msleep() up into two halves. Only the asleep() half (which is what puts the process on the sleep queue) actually needs the lock usually passed to msleep() held to prevent lost wakeups. await() does not need the lock held, so the lock can be released prior to calling await() and does not need to be passed in to the await() function. Typical usage of these functions would be as follows: mtx_lock(&foo_mtx); ... do stuff ... asleep(&foo_cond, PRIxx, "foowt", hz); ... mtx_unlock&foo_mtx); ... await(-1, -1); Inspired by: dillon on the couch at Usenix	2001-07-31 22:06:56 +00:00
John Baldwin	e9121d0663	Add a safety belt to mawait() for the (cold \|\| panicstr) case identical to the one in msleep() such that we return immediately rather than blocking. Submitted by: peter Prodded by: sheldonh	2001-07-31 20:57:57 +00:00
Jake Burkholder	d652b3d918	Backout mwakeup, etc.	2001-07-06 01:16:43 +00:00
Jake Burkholder	9316aed2ef	Implement mwakeup, mwakeup_one, cv_signal_drop and cv_broadcast_drop. These take an additional mutex argument, which is dropped before any processes are made runnable. This can avoid contention on the mutex if the processes would immediately acquire it, and is done in such a way that wakeups will not be lost. Reviewed by: jhb	2001-07-04 00:32:50 +00:00
John Baldwin	d68a8cc0ab	Remove commented-out garbage that skipped updating schedcpu() stats for ithreads in SWAIT.	2001-07-03 08:03:56 +00:00
John Baldwin	97b4306f0f	Just check p_oncpu when determining if a process is executing or not. We already did this in the SMP case, and it is now maintained in the UP case as well, and makes the code slightly more readable. Note that curproc is always executing, thus the p != curproc test does not need to be performed if the p_oncpu check is made.	2001-07-03 08:00:57 +00:00
John Baldwin	9d36b83e2c	Axe spl's that are covered by the sched_lock (and have been for quite some time.)	2001-07-03 07:53:35 +00:00
John Baldwin	36f1548b96	Include the wait message and channel for msleep() in the KTR tracepoint.	2001-07-03 07:39:06 +00:00
John Baldwin	8f451b4114	Remove bogus need_resched() of the current CPU in roundrobin(). We don't actually need to force a context switch of the current process. The act of firing the event triggers a context switch to softclock() and then switching back out again which is equivalent to a preemption, thus no further work is needed on the local CPU.	2001-07-03 05:33:09 +00:00
John Baldwin	a300519d41	Make the schedlock saved critical section state a per-thread property.	2001-06-30 03:11:26 +00:00
John Baldwin	1df95969b5	- Lock CURSIG() with the proc lock to close the signal race with psignal. - Grab Giant around ktrace points. - Clean up KTR_PROC tracepoints to not display the value of sched_lock.mtx_lock as it isn't really needed anymore and just obfuscates the messages. - Add a few if conditions to replace gotos. - Ensure that every msleep KTR event ends up with a matching msleep resume KTR event (this was broken when we didn't do a mi_switch()). - Only note via ktrace that we resumed from a switch once rather than twice in several places in msleep(). - Remove spl's rom asleep and await as the proc lock and sched_lock provide all the needed locking. - In mawait() add in a needed ktrace point for noting that we are about to switch out.	2001-06-22 23:11:26 +00:00
John Baldwin	b516d2f5e1	Add in assertions to ensure that we always call msleep or mawait with either a timeout or a held mutex to detect unprotected infinite sleeps that can easily lead to deadlock. Submitted by: alfred	2001-05-23 19:38:26 +00:00
Alfred Perlstein	a4d22b8035	Remove KASSERT test for sleeping on mv_mtx, instead let WITNESS catch it. Requested by: jhb	2001-05-22 00:58:20 +00:00
Alfred Perlstein	5ee5c3aa1f	remove my private assertions from tsleep. add one assertion to ensure we don't sleep while holding vm.	2001-05-19 01:40:48 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
John Baldwin	74fc745594	- Remove unneeded include of sys/ipl.h. - Lock the process before calling killproc() to kill it for exceeding the maximum CPU limit.	2001-05-15 23:15:06 +00:00
John Baldwin	6caa8a1501	Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP. - It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the _process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the _process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delievered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the _process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwared_hardclock/statclock which then call the _process() functions. - forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around. - need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above. - Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h. - The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in. Reviewed by: jake, peter Looked over by: eivind	2001-04-27 19:28:25 +00:00
John Baldwin	1005a129e5	Convert the allproc and proctree locks from lockmgr locks to sx locks.	2001-03-28 11:52:56 +00:00
John Baldwin	192846463a	Rework the witness code to work with sx locks as well as mutexes. - Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatability, sys/lock.h includes sys/lockmgr.h. - Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches. - Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks. - Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent. - Add some new flags that can be passed to mtx_init(): - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example. - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations. - All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag. - The lock order reversal messages have been improved to output slightly more accurate file and line numbers.	2001-03-28 09:03:24 +00:00
John Baldwin	eed4805444	- Proc locking. - Remove unneeded spl()'s.	2001-03-07 03:01:53 +00:00
John Baldwin	c978f49e20	Add a mtx_assert() in maybe_resched() just to be sure it's always called with sched_lock held.	2001-02-22 13:47:01 +00:00
John Baldwin	5a93f3e851	- Use the new NOCPU constant. - Fix a warning. Noticed by: bde (2)	2001-02-22 00:32:13 +00:00
John Baldwin	5813dc03bd	- Don't call clear_resched() in userret(), instead, clear the resched flag in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request. - While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch(). - Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.	2001-02-20 05:26:15 +00:00
Jake Burkholder	d5a08a6065	Implement a unified run queue and adjust priority levels accordingly. - All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.	2001-02-12 00:20:08 +00:00
Jake Burkholder	c11f93b3e7	Acquire sched_lock around need_resched() in roundrobin() to satisfy assertions that it is held. Since roundrobin() is a timeout there's no possible way that it could be called with sched_lock held.	2001-02-10 19:07:32 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
Peter Wemm	2508f69037	Zap last remaining references to (and a use use of) of simple_locks.	2001-01-31 04:29:52 +00:00
John Baldwin	1899325c72	- Catch up to proc flag changes. - Add in some locking ops that might fix SIGXCPU, but don't enable them yet. - Assert that sched_lock is not recursed when mi_switch() is called.	2001-01-24 11:10:55 +00:00
Matt Jacob	15516f16d2	Do not do the commenting out the way that saves bytes and looks cleaner to you. Do it the way Vox Populi wants it.	2001-01-23 16:35:33 +00:00
Matt Jacob	462574faf5	Move (now) unused variable declaration inside the block (now commented out).	2001-01-22 22:22:38 +00:00
John Baldwin	049ebc15a1	Temporarily disable the printf() for micruptime() going backwards, the SIGXCPU signal, and killing of processes that exceed their allowed run time until they can play nice with sched_lock. Right now they are just potentital panics waiting to happen. The printf() has bitten several people.	2001-01-20 02:57:59 +00:00
Jason Evans	238510fc46	Implement condition variables.	2001-01-16 01:00:43 +00:00
Jake Burkholder	ef73ae4b0c	Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other then curproc.	2001-01-10 04:43:51 +00:00
Jake Burkholder	c0c2557090	- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.	2000-12-13 00:17:05 +00:00
John Baldwin	7b29322c25	Add in #include of <sys/lock.h> since it was axed from <sys/proc.h>. Noticed by: Wesley Morgan <morganw@chemikals.org> Pointy hat to: me	2000-12-06 00:33:58 +00:00
Jake Burkholder	86360fee54	Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup from struct proc, which are now unused (p_nthread already was). Remove process flag P_KTHREADP which was untested and only set in vfs_aio.c (it should use kthread_create). Move the yield system call to kern_synch.c as kern_threads.c has been removed completely. moral support from: alfred, jhb	2000-12-02 05:41:30 +00:00
Jake Burkholder	1512b5d6ab	Use an mp-safe callout for endtsleep.	2000-12-01 04:55:52 +00:00
John Baldwin	1bd0eefb4c	Fix up priority propagation: - Use a better test for determining when a process is running. - Convert some checks to assertions. - Remove unnecessary tests. - Save the priority before acquiring a mutex rather than in msleep(9).	2000-11-30 00:51:16 +00:00
John Baldwin	e2979dcc85	Don't drop Giant and the passed in mutex incorrectly in the cold \|\| panicstr case. Do drop the passed in mutex in that case if PDROP is specified.	2000-11-29 18:32:50 +00:00
Jake Burkholder	4f55983606	Use callout_reset instead of timeout(9). Most callouts are statically allocated, 2 have been added to struct proc for setitimer and sleep. Reviewed by: jhb, jlemon	2000-11-27 22:52:31 +00:00
Jake Burkholder	553629ebc9	Protect the following with a lockmgr lock: allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd	2000-11-22 07:42:04 +00:00
Jake Burkholder	7da6f97772	- Split the run queue and sleep queue linkage, so that a process may block on a mutex while on the sleep queue without corrupting it. - Move dropping of Giant to after the acquire of sched_lock. Tested by: John Hay <jhay@icomtek.csir.co.za> jhb	2000-11-17 18:09:18 +00:00

1 2 3 4 5 ...

358 Commits