freebsd-skq

Author	SHA1	Message	Date
jake	95fb73d495	Use an mp-safe callout for endtsleep.	2000-12-01 04:55:52 +00:00
jhb	ab556afac4	Fix up priority propagation: - Use a better test for determining when a process is running. - Convert some checks to assertions. - Remove unnecessary tests. - Save the priority before acquiring a mutex rather than in msleep(9).	2000-11-30 00:51:16 +00:00
jhb	896b5da519	Don't drop Giant and the passed in mutex incorrectly in the cold || panicstr case. Do drop the passed in mutex in that case if PDROP is specified.	2000-11-29 18:32:50 +00:00
jake	9326f655fc	Use callout_reset instead of timeout(9). Most callouts are statically allocated, 2 have been added to struct proc for setitimer and sleep. Reviewed by: jhb, jlemon	2000-11-27 22:52:31 +00:00
jake	0c0be4e826	Protect the following with a lockmgr lock: allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd	2000-11-22 07:42:04 +00:00
jake	3a97b3e213	- Split the run queue and sleep queue linkage, so that a process may block on a mutex while on the sleep queue without corrupting it. - Move dropping of Giant to after the acquire of sched_lock. Tested by: John Hay <jhay@icomtek.csir.co.za> jhb	2000-11-17 18:09:18 +00:00
jhb	c0bba69cbe	Don't release and acquire Giant in mi_switch(). Instead, release and acquire Giant as needed in functions that call mi_switch(). The releases need to be done outside of the sched_lock to avoid potential deadlocks from trying to acquire Giant while interrupts are disabled. Submitted by: witness	2000-11-16 02:16:44 +00:00
jhb	5fa45f43dd	Argh, add in a missing release of the sched_lock.	2000-11-16 01:16:54 +00:00
jhb	8b193931b5	CURSIG() calls functions that acquire sleep mutexes, so it is not a good idea to be holding the sched_lock while we are calling it. As such, release sched_lock before calling CURSIG() in msleep() and mawait() and reacquire it after CURSIG() returns. Submitted by: witness	2000-11-16 01:07:19 +00:00
jhb	de636b04e8	- Rename await() to mawait(). mawait() is to await() as msleep() is to tsleep(). Namely, mawait() takes an extra argument which is a mutex to drop when going to sleep. Just as with msleep(), if the priority argument includes the PDROP flag, then the mutex will be dropped and will not be reacquired when the process wakes up. - Add in a backwards compatible macro await() that passes in NULL as the mutex argument to mawait().	2000-11-15 22:39:35 +00:00
jhb	0efbfa0260	- Replace a KASSERT() that knew too much about mutex internals with a mtx_assert() that ensures the mutex we release during msleep() is both not recursed and owned by the current process.	2000-11-15 22:30:48 +00:00
jhb	c70d0c6d5a	- Convert references from tsleep() -> msleep() - Fix a buglet in a comment above await()	2000-11-15 22:27:38 +00:00
jhb	be4bef8719	- GC some #if 0'd code regarding the non-existant safepri variable. - Don't dink with the witness state of Giant unless we actually own it during mi_switch().	2000-10-20 07:52:10 +00:00
jhb	fd275a78bd	- Change fast interrupts on x86 to push a full interrupt frame and to return through doreti to handle ast's. This is necessary for the clock interrupts to work properly. - Change the clock interrupts on the x86 to be fast instead of threaded. This is needed because both hardclock() and statclock() need to run in the context of the current process, not in a separate thread context. - Kill the prevproc hack as it is no longer needed. - We really need Giant when we call psignal(), but we don't want to block during the clock interrupt. Instead, use two p_flag's in the proc struct to mark the current process as having a pending SIGVTALRM or a SIGPROF and let them be delivered during ast() when hardclock() has finished running. - Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was broken on the x86 if it was turned on since cpl is gone. It's only use was to bogusly run softclock() directly during hardclock() rather than scheduling an SWI. - Remove the COM_LOCK simplelock and replace it with a clock_lock spin mutex. Since the spin mutex already handles disabling/restoring interrupts appropriately, this also lets us axe all the *_intr() fu. - Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use temporary fast interrupts for the APIC trial. - Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending signals in hardclock() that are to be delivered in ast(). Submitted by: jakeb (making statclock safe in a fast interrupt) Submitted by: cp (concept of delaying signals until ast())	2000-10-06 02:20:21 +00:00
jhb	98932a243c	Add a KASSERT() to catch instances where the mutex that we pass in to msleep() are recursed. Suggested by: cp	2000-09-24 00:33:51 +00:00
jhb	ebc05310ca	Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just use struct mtx, struct witness, and struct witness_blessed. Requested by: bde	2000-09-14 20:15:16 +00:00
jake	273d0f5a2d	Rename tsleep to msleep and add a mutex argument, which is released before sleeping and re-acquired before msleep returns. A compatibility cpp macro has been provided for tsleep to avoid changing all occurences of it in the kernel. Remove an assertion that the Giant mutex be held before calling tsleep or asleep. This is intended to serve the same purpose as condition variables, but does not preclude their addition in the future. Approved by: jasone Obtained from: BSD/OS	2000-09-11 00:20:02 +00:00
dfr	14c49fbe71	Fix printf warnings in CTRx calls.	2000-09-10 13:34:35 +00:00
jasone	769e0f974d	Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh	2000-09-07 01:33:02 +00:00
phk	e5de271d47	Previous commit changing SYSCTL_HANDLER_ARGS violated KNF. Pointed out by: bde	2000-07-04 11:25:35 +00:00
phk	61ff05be25	Style police catches up with rev 1.26 of src/sys/sys/sysctl.h: Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources: -sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)	2000-07-03 09:35:31 +00:00
jake	961b97d434	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
jake	d93fbc9916	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
grog	747ab40a69	Correct a couple of typos.	2000-05-07 05:09:45 +00:00
green	74f13b7793	Change the scheduler to actually respect the PUSER barrier. It's been wrong for many years that negative niceness would lower the priority of a process below PUSER, and once below PUSER, there were conditionals in the code that are required to test for whether a process was in the kernel which would break. The breakage could (and did) cause lock-ups, basically nothing else but the least nice program being able to run in some conditions. The algorithm which adjusts the priority now subtracts PRIO_MIN to do things properly, and the ESTCPULIM() algorithm was updated to use PRIO_TOTAL (PRIO_MAX - PRIO_MIN) to calculate the estcpu. NICE_WEIGHT is now 1 to accomodate the full range of priorities better (a -20 process with full CPU time has the priority of a +0 process with no CPU time). There are now 20 queues (exactly; 80 priorities) for use in user processes' scheduling, and PUSER has been lowered to 48 to accomplish this. This means, to the user, that things will be scheduled more correctly (noticeable), there is no lock-up anymore WRT a niced -20 process never releasing the CPU time for other processes. In this fair system, tsleep()ed < PUSER processes now will get the proper higher priority than priority >= PUSER user processes. The detective work of this was done by me, along with part of the solution. Luoqi Chen has provided most of the solution, and really helped me understand what was happening better, to boot :) Submitted by: luoqi Concept reviewed by: bde	2000-04-30 18:33:43 +00:00
dillon	b852fcb160	The SMP cleanup commit broke UP compiles. Make UP compiles work again.	2000-03-28 18:06:49 +00:00
dillon	689641c1ea	Commit major SMP cleanups and move the BGL (big giant lock) in the syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it). A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated. Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion. This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur. This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later. Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch	2000-03-28 07:16:37 +00:00
dufault	705c38904d	I applied the wrong patch set. Back out anything associated with the known bogus currtpriority. This undoes the previous changes to sys/i386/i386/trap.c, sys/alpha/alpha/trap.c, sys/sys/systm.h Now we have the patch set approved by bde. Approved by: bde	2000-03-02 22:03:49 +00:00
dufault	0bdb67cb26	Patches that eliminate extra context switches in FIFO case. Fixes p1003_1b regression test in the simple case of no RR and FIFO processes competing. Reviewed by: jkh, bde	2000-03-02 16:20:07 +00:00
peter	031f01d30f	Don't make the ktrace hook in tsleep() deref a null curproc after a panic. PR: 15169 Submitted by: David Gilbert <dgilbert@velocet.ca>	1999-11-30 09:01:46 +00:00
phk	8da3ba86dc	Add a bit of sanity checking and problem avoidance in case the timecounter hardware is bogus. This will produce a new warning "microuptime() went backwards" and try to not screw up the process resource accounting.	1999-11-29 11:29:04 +00:00
bde	1ad19bea4d	Scheduler fixes equivalent to the ones logged in the following NetBSD commit to kern_synch.c: ---------------------------- revision 1.55 date: 1999/02/23 02:56:03; author: ross; state: Exp; lines: +39 -10 Scheduler bug fixes and reorganization * fix the ancient nice(1) bug, where nice +20 processes incorrectly steal 10 - 20% of the CPU, (or even more depending on load average) * provide a new schedclk() mechanism at a new clock at schedhz, so high platform hz values don't cause nice +0 processes to look like they are niced * change the algorithm slightly, and reorganize the code a lot * fix percent-CPU calculation bugs, and eliminate some no-op code === nice bug === Correctly divide the scheduler queues between niced and compute-bound processes. The current nice weight of two (sort of, see `algorithm change' below) neatly divides the USRPRI queues in half; this should have been used to clip p_estcpu, instead of UCHAR_MAX. Besides being the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op, and it was done after decay_cpu() which can only _reduce_ the value. It has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can scheduler-penalize themselves onto the same queue as nice +20 processes. (Or even a higher one.) === New schedclk() mechansism === Some platforms should be cutting down stathz before hitting the scheduler, since the scheduler algorithm only works right in the vicinity of 64 Hz. Rather than prescale hz, then scale back and forth by 4 every time p_estcpu is touched (each occurance an abstraction violation), use p_estcpu without scaling and require schedhz to be generated directly at the right frequency. Use a default stathz (well, actually, profhz) / 4, so nothing changes unless a platform defines schedhz and a new clock. Define these for alpha, where hz==1024, and nice was totally broke. === Algorithm change === The nice value used to be added to the exponentially-decayed scheduler history value p_estcpu, in _addition_ to be incorporated directly (with greater wieght) into the priority calculation. At first glance, it appears to be a pointless increase of 1/8 the nice effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x that because it will ramp up linearly but be decayed only exponentially, thus converging to an additional .75 nice for a loadaverage of one. I killed this, it makes the behavior hard to control, almost impossible to analyze, and the effect (~~nothing at for the first second, then somewhat increased niceness after three seconds or more, depending on load average) pointless. === Other bugs === hz -> profhz in the p_pctcpu = f(p_cpticks) calcuation. Collect scheduler functionality. Try to put each abstraction in just one place. ---------------------------- The details are a little different in FreeBSD: === nice bug === Fixing this is the main point of this commit. We use essentially the same clipping rule as NetBSD (our limit on p_estcpu differs by a scale factor). However, clipping at all is fundamentally bad. It gives free CPU the hoggiest hogs once they reach the limit, and reaching the limit is normal for long-running hogs. This will be fixed later. === New schedclk() mechanism === We don't use the NetBSD schedclk() (now schedclock()) mechanism. We require (real)stathz to be about 128 and scale by an extra factor of 2 compared with NetBSD's statclock(). We scale p_estcpu instead of scaling the clock. This is more accurate and flexible. === Algorithm change === Same change. === Other bugs === The p_pctcpu bug was fixed long ago. We don't try as hard to abstract functionality yet. Related changes: the new limit on p_estcpu must be exported to kern_exit.c for clipping in wait1(). Agreed with by: dufault	1999-11-28 12:12:13 +00:00
bde	0f795adedc	Updated comments for the move in the previous commit.	1999-11-27 15:27:11 +00:00
bde	4955977bad	Moved scheduling-related code to kern_synch.c so that it is easier to fix and extend. The new function containing the code is named schedclock() as in NetBSD, but it has slightly different semantics (it already handles incrementation of p->p_cpticks, and it should handle any calling frequency). Agreed with in principle by: dufault	1999-11-27 12:32:27 +00:00
phk	8fca18de89	This is a partial commit of the patch from PR 14914: Alot of the code in sys/kern directly accesses the Q_HEAD and Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914	1999-11-16 10:56:05 +00:00
marcel	d5e8d714b9	sigset_t change (part 2 of 5) ----------------------------- The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements. The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t. struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done though function sigprop(). The latter because the table doesn't holds properties for all signals, but only for the first NSIG signals. signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace polution. Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types. NOTE: kdump (and port linux_kdump) must be recompiled. Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.	1999-09-29 15:03:48 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
peter	eac345f15d	Don't initialize run queues here, do it all in one place.	1999-08-19 00:14:43 +00:00
bde	79ceaf95c2	The magic "no-cpu" cpu number is 0xff. Don't misrepresent cpu numbers as chars or use bogus casts in an attempt to unmisrepresnt them. In top, don't assume that 0xff is the only negative cpu number when cpu numbers are (mis)represented.	1999-03-05 16:38:13 +00:00
julian	ee38b91324	The tunable parameter for the scheduler quantum was inverted. Higher numbers led to smaller quanta. In discussion with BDE, change this parameter to be in uSecs to make it machine independent, and limit it to non zero multiples of 'tick' (rounding down). Also make the variabel globally available so that the present function that returns its value (used for posix scheduling I believe) can go away. Submitted by: Bruce Evans <bde@freebsd.org>	1999-03-03 18:15:29 +00:00
bde	6d8d63664b	Removed all traces of `p_switchtime'. The relevant timestamp is per-cpu, not per-process. Keep it in `switchtime' consistently. It is now clear that the timestamp is always valid in fork_trampoline() except when the child is running on a previously idle cpu, which can only happen if there are multiple cpus, so don't check or set the timestamp in fork_trampoline except in the (i386) SMP case. Just remove the alpha code for setting it unconditionally, since there is no SMP case for alpha and the code had rotted. Parts reviewed by: dfr, phk	1999-02-28 10:53:29 +00:00
bde	d51135c0c3	Improved scheduling in uiomove(), etc. resched_wanted() is true too often for it to be a good criterion for switching kernel cpu hogs -- it is true after most wakeups. Use the criterion "has been running for >= 2 quanta" instead.	1999-02-22 16:57:48 +00:00
eivind	89e1199534	KNFize, by bde.	1999-01-10 01:58:29 +00:00
eivind	a8dc66f457	Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith	1999-01-08 17:31:30 +00:00
dillon	953800406c	Add asleep() and await() support. Currently highly experimental. A small support structure had to be added to the proc structure, and a few minor conditional panics no longer apply.	1998-12-21 07:41:51 +00:00
dg	b4bceb0b07	Compare p_cpulimit with RLIM_INFINITY before comparing it with the process runtime. p_runtime is unsigned while p_cpulimit is not, so this avoids the nasty side effect of the process getting killed when the runtime comes up "negative" due to other bugs.	1998-11-27 11:44:22 +00:00
bde	8fdbb5fce3	Fixed the previous fix - stathz doesn't give the statclock frequency when it is 0. Submitted by: mostly by Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp>	1998-11-26 16:49:55 +00:00
bde	043d2a6202	Oops, yet again back out some local changes that shouldn't have been in the previous commit.	1998-11-26 14:05:58 +00:00
bde	0d3ca540ea	Fixed scaling of p_pctcpu. It was wrong by a factor of stathz/hz. Until recently, this was half compensated for in at least ps and top by multiplying by 100/stathz to get a better wrong factor of 100/hz.	1998-11-26 14:00:08 +00:00
bde	2fee4bebdc	Oops, back out some local changes that shouldn't have been in the previous commit.	1998-10-25 20:11:36 +00:00

114 Commits