freebsd-skq

Author	SHA1	Message	Date
phk	93429f2778	Add a new per-thread private flag: TDP_GEOM. This flag gets set whenever the thread posts an event on the GEOM event queue, and if the flag is set when the thread is prepared to return to userland from the kernel, g_waitidle() will be called to make sure that the posted events have completed. This can replace an insufficient number of g_waitidle() calls in various other places, and has the advantage of being failsafe: Any system call which does a VOP_OPEN()/VOP_CLOSE will now correctly wait for any geom events it posted as part of spoils or tastes. Assert that topology and Giant is not held in g_waitidle().	2004-10-23 20:49:17 +00:00
jhb	ce2d3f89af	Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand. - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. - calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console. Submitted by: bde (mostly) MFC after: 1 month	2004-10-05 18:51:11 +00:00
jhb	d0df115aaa	Don't try to protect td_sticks with sched_lock. It doesn't need it as it is only accessed by curthread.	2004-09-23 21:03:58 +00:00
jhb	3956303607	Various small style fixes.	2004-09-22 15:24:33 +00:00
julian	2782d4b3fc	Remove an unneeded argument.. The removed argument could trivially be derived from the remaining one. That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument. Having both proc and thread as an argumen tjust gives an opportunity for them to get out sync. MFC after: 3 days	2004-08-31 07:34:54 +00:00
julian	ee753ed190	Remove sched_free_thread() which was only used in diagnostics. It has outlived its usefulness and has started causing panics for people who turn on DIAGNOSTIC, in what is otherwise good code. MFC after: 2 days	2004-08-31 06:12:13 +00:00
davidxu	f8c21c52ad	Call thread_user_enter for M:N thread, ast() should be treated as another entrance of kernel.	2004-08-08 22:28:33 +00:00
jhb	0cb3276d57	- Move TDF_OWEPREEMPT, TDF_OWEUPC, and TDF_USTATCLOCK over to td_pflags since they are only accessed by curthread and thus do not need any locking. - Move pr_addr and pr_ticks out of struct uprof (which is per-process) and directly into struct thread as td_profil_addr and td_profil_ticks as these variables are really per-thread. (They are used to defer an addupc_intr() that was too "hard" until ast()).	2004-07-16 21:04:55 +00:00
jhb	1b16b181d1	- Change mi_switch() and sched_switch() to accept an optional thread to switch to. If a non-NULL thread pointer is passed in, then the CPU will switch to that thread directly rather than calling choosethread() to pick a thread to choose to. - Make sched_switch() aware of idle threads and know to do TD_SET_CAN_RUN() instead of sticking them on the run queue rather than requiring all callers of mi_switch() to know to do this if they can be called from an idlethread. - Move constants for arguments to mi_switch() and thread_single() out of the middle of the function prototypes and up above into their own section.	2004-07-02 19:09:50 +00:00
jhb	ca6f6cfd39	Tidy up uprof locking. Mostly the fields are protected by both the proc lock and sched_lock so they can be read with either lock held. Document the locking as well. The one remaining bogosity is that pr_addr and pr_ticks should be per-thread but profiling of multithreaded apps is currently undefined.	2004-07-02 03:50:48 +00:00
julian	7a48fb22ac	Remove unused variable.	2004-03-31 08:20:44 +00:00
peter	1cb95fd2b7	Push Giant down a little further: - no longer serialize on Giant for thread_single*() and family in fork, exit and exec - thread_wait() is mpsafe, assert no Giant - reduce scope of Giant in exit to not cover thread_wait and just do vm_waitproc(). - assert that thread_single() family are not called with Giant - remove the DROP/PICKUP_GIANT macros from thread_single() family - assert that thread_suspend_check() s not called with Giant - remove manual drop_giant hack in thread_suspend_check since we know it isn't held. - remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family - mark kse_create() mpsafe	2004-03-13 22:31:39 +00:00
rwatson	e2aad13d33	Put "failed to set signal flags properly for ast()" check under DIAGNOSTIC instead of INVARIANTS. INVARIANTS is intended for tests that don't substantially change code flow or behavior (passive), but this test required locking both the proc lock and scheduler lock in order to execute. It also appears to be a very advisory diagnostic as opposed to an invariant violation. Following discussion with: bde	2004-03-05 17:35:28 +00:00
jhb	279b2b8278	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
jeff	c85cdc3d0f	- Add a flags parameter to mi_switch. The value of flags may be SW_VOL or SW_INVOL. Assert that one of these is set in mi_switch() and propery adjust the rusage statistics. This is to simplify the large number of users of this interface which were previously all required to adjust the proper counter prior to calling mi_switch(). This also facilitates more switch and locking optimizations. - Change all callers of mi_switch() to pass the appropriate paramter and remove direct references to the process statistics.	2004-01-25 03:54:52 +00:00
peter	f79f1784c9	Log involuntary context switches correctly.	2003-09-05 22:15:26 +00:00
davidxu	69df6d1c3b	kse.h is not needed for these files.	2003-08-05 12:08:49 +00:00
peter	8dd9d4012a	When ktracing context switches, make sure we record involuntary switches. Otherwise, when we get a evicted from the cpu, there is no record of it. This is not a default ktrace flag.	2003-07-31 01:36:24 +00:00
davidxu	788b1fc17a	o Change kse_thr_interrupt to allow send a signal to a specified thread, or unblock a thread in kernel, and allow UTS to specify whether syscall should be restarted. o Add ability for UTS to monitor signal comes in and removed from process, the flag PS_SIGEVENT is used to indicate the events. o Add a KMF_WAITSIGEVENT for KSE mailbox flag, UTS call kse_release with this flag set to wait for above signal event. o For SA based thread, kernel masks all signal in its signal mask, let UTS to use kse_thr_interrupt interrupt a thread, and install a signal frame in userland for the thread. o Add a tm_syncsig in thread mailbox, when a hardware trap occurs, it is used to deliver synchronous signal to userland, and upcall is schedule, so UTS can process the synchronous signal for the thread. Reviewed by: julian (mentor)	2003-06-28 08:29:05 +00:00
davidxu	1d77a8e0f6	1. Add code to support bound thread. when blocked, a bound thread never schedules an upcall. Signal delivering to a bound thread is same as non-threaded process. This is intended to be used by libpthread to implement PTHREAD_SCOPE_SYSTEM thread. 2. Simplify kse_release() a bit, remove sleep loop.	2003-06-15 12:51:26 +00:00
davidxu	abb4420bbe	Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.	2003-06-15 00:31:24 +00:00
obrien	3b8fff9e4c	Use __FBSDID().	2003-06-11 00:56:59 +00:00
jhb	89a4eb17de	- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe. Reviewed by: arch@ Approved by: re (rwatson)	2003-05-13 20:36:02 +00:00
jhb	2c416d197d	The signotify() sanity check in userret() doesn't need Giant anymore.	2003-04-23 18:51:55 +00:00
jhb	128ae3c8d8	- Move PS_PROFIL and its new cousin PS_STOPPROF back over to p_flag and rename them appropriately. Protect both flags with both the proc lock and the sched_lock. - Protect p_profthreads with the proc lock. - Remove Giant from profil(2).	2003-04-22 20:54:04 +00:00
jhb	bffa90cc0a	Tweak locking in the PS_XCPU handler to hold the sched_lock while reading p_runtime.	2003-04-17 22:33:04 +00:00
jeff	46e6ba39f1	- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c - signotify() now operates on a thread since unmasked pending signals are stored in the thread. - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.	2003-03-31 22:49:17 +00:00
jeff	4a3718fb25	- Change trapsignal() to accept a thread and not a proc. - Change all consumers to pass in a thread. Right now this does not cause any functional changes but it will be important later when signals can be delivered to specific threads.	2003-03-31 22:02:38 +00:00
davidxu	b47a4be33e	Fix signal delivering bug for threaded process.	2003-03-11 02:59:50 +00:00
jhb	e4bcd25517	Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().	2003-03-04 21:03:05 +00:00
julian	3fc9836d46	Change the process flags P_KSES to be P_THREADED. This is just a cosmetic change but I've been meaning to do it for about a year.	2003-02-27 02:05:19 +00:00
jeff	5c29a640b8	- Add a new function, thread_signal_add(), that is called from postsig to add a signal to a mailbox's pending set. - Add a new function, thread_signal_upcall(), this causes the current thread to upcall so that we can deliver pending signals. Reviewed by: mini	2003-02-17 09:58:11 +00:00
julian	af55753a06	Move a bunch of flags from the KSE to the thread. I was in two minds as to where to put them in the first case.. I should have listenned to the other mind. Submitted by: parts by davidxu@ Reviewed by: jeff@ mini@	2003-02-17 09:55:10 +00:00
jeff	aa384c931f	- Move ke_sticks, ke_iticks, ke_uticks, ke_uu, ke_su, and ke_iu back into the proc. These counters are only examined through calcru. Submitted by: davidxu Tested on: x86, alpha, UP/SMP	2003-02-17 02:19:58 +00:00
julian	e8efa7328e	Reversion of commit by Davidxu plus fixes since applied. I'm not convinced there is anything major wrong with the patch but them's the rules.. I am using my "David's mentor" hat to revert this as he's offline for a while.	2003-02-01 12:17:09 +00:00
tjr	e3277471d4	Use a local variable to store the number of ticks that elapsed in kernel mode instead of (unintentionally) using the global `ticks'. This error completely broke profiling.	2003-01-31 11:22:31 +00:00
davidxu	4b9b549ca2	Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone. A thread owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and takes those contexts back to userland. Any thread without upcall structure has to export their contexts and exit at user boundary. Any thread running in user mode owns an upcall structure, when it enters kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in kernel, a new UPCALL thread is created and the upcall structure is transfered to the new UPCALL thread. if the kse mailbox's current thread pointer is NULL, then when a thread is blocked in kernel, no UPCALL thread will be created. Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit, when all upcalls in ksegrp are removed, the group is atomatically shutdown. An upcall owner thread also exits when process is in exiting state. when an owner thread exits, the upcall it owns is also removed. KSE is a pure scheduler entity. it represents a virtual cpu. when a thread is running, it always has a KSE associated with it. scheduler is free to assign a KSE to thread according thread priority, if thread priority is changed, KSE can be moved from one thread to another. When a ksegrp is created, there is always N KSEs created in the group. the N is the number of physical cpu in the current system. This makes it is possible that even an userland UTS is single CPU safe, threads in kernel still can execute on different cpu in parallel. Userland calls kse_create to add more upcall structures into ksegrp to increase concurrent in userland itself, kernel is not restricted by number of upcalls userland provides. The code hasn't been tested under SMP by author due to lack of hardware. Reviewed by: julian	2003-01-26 11:41:35 +00:00
julian	dde96893c9	Add code to ddb to allow backtracing an arbitrary thread. (show thread {address}) Remove the IDLE kse state and replace it with a change in the way threads sahre KSEs. Every KSE now has a thread, which is considered its "owner" however a KSE may also be lent to other threads in the same group to allow completion of in-kernel work. n this case the owner remains the same and the KSE will revert to the owner when the other work has been completed. All creations of upcalls etc. is now done from kse_reassign() which in turn is called from mi_switch or thread_exit(). This means that special code can be removed from msleep() and cv_wait(). kse_release() does not leave a KSE with no thread any more but converts the existing thread into teh KSE's owner, and sets it up for doing an upcall. It is just inhibitted from being scheduled until there is some reason to do an upcall. Remove all trace of the kse_idle queue since it is no-longer needed. "Idle" KSEs are now on the loanable queue.	2002-12-28 01:23:07 +00:00
rwatson	551263d9d4	To reduce per-return overhead of userret(), call into mac_thread_userret() only if PS_MACPEND is set in the process AST mask. This avoids the cost of the entry point in the common case, but requires policies interested in the userret event to set the flag (protected by the scheduler lock) if they do want the event. Since all the policies that we're working with which use mac_thread_userret() use the entry point only selectively to perform operations deferred for locking reasons, this maintains the desired semantics. Approved by: re Requested by: bde Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-08 19:00:17 +00:00
julian	64467d2a2f	iBack out david's last commit. the suspension code needs to be called for non KSE processes too.	2002-10-26 04:44:17 +00:00
davidxu	9f183ef3fc	Move suspension checking code from userret() into thread_userret().	2002-10-26 02:56:51 +00:00
jeff	ef4d4e378e	- Create a new scheduler api that is defined in sys/sched.h - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c Reviewed by: -arch	2002-10-12 05:32:24 +00:00
jhb	7cc0ed53c2	- Move p_cpulimit to struct proc from struct plimit and protect it with sched_lock. This means that we no longer access p_limit in mi_switch() and the p_limit pointer can be protected by the proc lock. - Remove PRS_ZOMBIE check from CPU limit test in mi_switch(). PRS_ZOMBIE processes don't call mi_switch(), and even if they did there is no longer the danger of p_limit being NULL (which is what the original zombie check was added for). - When we bump the current processes soft CPU limit in ast(), just bump the private p_cpulimit instead of the shared rlimit. This fixes an XXX for some value of fix. There is still a (probably benign) bug in that this code doesn't check that the new soft limit exceeds the hard limit. Inspired by: bde (2)	2002-10-09 17:17:24 +00:00
jmallett	cd6e4d8c7c	Access td->td_kse inside sched_lock. Submitted by: julian	2002-10-02 18:25:09 +00:00
jmallett	056df6de99	De-obfuscate local use of members of 'struct thread', for which we have local variables, and group assignment.	2002-10-02 16:39:39 +00:00
rwatson	4be0d09ad3	Add a new MAC entry point, mac_thread_userret(td), which permits policy modules to perform MAC-related events when a thread returns to user space. This is required for policies that have floating process labels, as it's not always possible to acquire the process lock at arbitrary points in the stack during system call processing; process labels might represent traditional authentication data, process history information, or other data. LOMAC will use this entry point to perform the process label update prior to the thread returning to userspace, when plugged into the MAC framework. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-02 02:42:38 +00:00
jmallett	7a693db242	Back our kernel support for reliable signal queues. Requested by: rwatson, phk, and many others	2002-10-01 17:15:53 +00:00
jhb	cb9df84644	Minor style nits in a comment.	2002-10-01 15:49:32 +00:00
jhb	014597718b	Various style fixups. Submitted by: bde (mostly)	2002-10-01 14:16:50 +00:00
jhb	a1ef3e37b0	Actually clear PS_XCPU in ast() when we handle it. Submitted by: bde Pointy hat to: jhb	2002-10-01 14:13:13 +00:00
jhb	f350519764	- Add a new per-process flag PS_XCPU to indicate that at least one thread has exceeded its CPU time limit. - In mi_switch(), set PS_XCPU when the CPU time limit is exceeded. - Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set. Requested by: many	2002-09-30 21:13:54 +00:00
jmallett	0341f71df1	First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs. After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such. CODAFS is unfinished in this regard because the logic is unclear in some places. Sponsored by: New Gold Technology Reviewed by: bde, tjr, jake [an older version, logic similar]	2002-09-30 20:20:22 +00:00
julian	bcb38a31ff	slightly clean up the thread_userret() and thread_consider_upcall() calls. also some slight changes for TDF_BOUND testing and small style changes Should ONLY affect KSE programs Submitted by: davidxu	2002-09-23 06:14:30 +00:00
rwatson	d410071f5d	Spell proprly properly: failed to set signal flags proprly for ast() failed to set signal flags proprly for ast() failed to set signal flags proprly for ast() failed to set signal flags proprly for ast()	2002-08-22 14:36:03 +00:00
mini	b244d01c4d	Revert removal of cred_free_thread(): It is used to ensure that a thread's credentials are not improperly borrowed when the thread is not current in the kernel. Requested by: jhb, alfred	2002-07-11 02:18:33 +00:00
julian	acb01acb4e	Don't slow every syscall and trap by doing locks and stuff if the 'stop' bits are not set. This is a temporary thing.. I think this code probably needs to be rewritten anyhow.	2002-07-10 06:40:22 +00:00
julian	aa2dc0a5d9	Part 1 of KSE-III The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..	2002-06-29 17:26:22 +00:00
mini	ef6f2f567d	Remove unused diagnostic function cread_free_thread(). Approved by: alfred	2002-06-24 06:22:00 +00:00
jhb	1a2a2fa24a	We no longer need to acqure Giant in ast() for ktrpsig() in postsig() now that ktrace no longer needs Giant.	2002-06-07 05:43:40 +00:00
julian	304195369e	CURSIG() is not a macro so rename it cursig(). Obtained from: KSE tree	2002-05-29 23:44:32 +00:00
bde	14ae95f735	Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask. Avoid locking in userret() in most of the remaining cases. Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon	2002-04-04 17:49:48 +00:00
jake	855079d5b7	Style fixes purposefully left out of last commit. I checked the kse tree and didn't see any changes that this conflicts with.	2002-03-29 16:45:03 +00:00
jake	8f9ce8398d	Remove abuse of intr_disable/restore in MI code by moving the loop in ast() back into the calling MD code. The MD code must ensure no races between checking the astpening flag and returning to usermode. Submitted by: peter (ia64 bits) Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64	2002-03-29 16:35:26 +00:00
imp	969e82886e	Remove last two abuses of cpu_critical_{enter,exit} in the MI code. Reviewed by: jake, jhb, rwatson	2002-03-21 06:11:09 +00:00
jhb	2e425ee2fc	Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined. Instead of caching the ucred reference, just go ahead and eat the decerement and increment of the refcount. Now that Giant is pushed down into crfree(), we no longer have to get Giant in the common case. In the case when we are actually free'ing the ucred, we would normally free it on the next kernel entry, so the cost there is not new, just in a different place. This also removse td_cache_ucred from struct thread. This is still only done #ifdef DIAGNOSTIC. [ missed this file in the previous commit ] Tested on: i386, alpha	2002-03-20 21:12:04 +00:00
jake	b27e79327d	Make this compile. Pointy hat to: julian	2002-02-23 01:42:13 +00:00
julian	53eb1d9219	Add some DIAGNOSTIC code. While in userland, keep the thread's ucred reference in a shadow field so that the usual place to store it is NULL. If DIAGNOSTIC is not set, the thread ucred is kept valid until the next kernel entry, at which time it is checked against the process cred and possibly corrected. Produces a BIG speedup in kernels with INVARIANTS set. (A previous commit corrected it for the non INVARIANTS case already) Reviewed by: dillon@freebsd.org	2002-02-22 23:58:22 +00:00
julian	abe785e035	If the credential on an incoming thread is correct, don't bother reaquiring it. In the same vein, don't bother dropping the thread cred when goinf ot userland. We are guaranteed to nned it when we come back, (which we are guaranteed to do). Reviewed by: jhb@freebsd.org, bde@freebsd.org (slightly different version)	2002-02-17 01:09:56 +00:00
julian	37369620df	In a threaded world, differnt priorirites become properties of different entities. Make it so. Reviewed by: jhb@freebsd.org (john baldwin)	2002-02-11 20:37:54 +00:00
bde	73ef84f92b	Changed the type of pcb_flags from u_char to u_int and adjusted things. This removes the only atomic operation on a char type in the entire kernel.	2002-01-17 17:49:23 +00:00
jhb	1ce407b675	Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed: The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex releease and swi schedule, respectively when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.) I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly. Reviewed by: peter Tested on: i386, alpha	2002-01-05 08:47:13 +00:00
jhb	82f83a1cbe	Axe a stale comment. Holding sched_lock across both setrunqueue() and mi_switch() is sufficient.	2002-01-04 10:55:51 +00:00
jhb	3b3c195480	- Change all callers of addupc_task() to check PS_PROFIL explicitly and remove the check from addupc_task(). It would need sched_lock while testing the flag anyways. - Always read sticks while holding sched_lock using a temporary variable where needed. - Always init prticks to 0 in ast() to quiet a warning.	2001-12-18 09:06:10 +00:00
jhb	a3b98398cb	Modify the critical section API as follows: - The MD functions critical_enter/exit are renamed to start with a cpu_ prefix. - MI wrapper functions critical_enter/exit maintain a per-thread nesting count and a per-thread critical section saved state set when entering a critical section while at nesting level 0 and restored when exiting to nesting level 0. This moves the saved state out of spin mutexes so that interlocking spin mutexes works properly. - Most low-level MD code that used critical_enter/exit now use cpu_critical_enter/exit. MI code such as device drivers and spin mutexes use the MI wrappers. Note that since the MI wrappers store the state in the current thread, they do not have any return values or arguments. - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is assigned to curthread->td_savecrit during fork_exit(). Tested on: i386, alpha	2001-12-18 00:27:18 +00:00
jhb	46e3f92a5d	Add a per-thread ucred reference for syscalls and synchronous traps from userland. The per thread ucred reference is immutable and thus needs no locks to be read. However, until all the proc locking associated with writes to p_ucred are completed, it is still not safe to use the per-thread reference. Tested on: x86 (SMP), alpha, sparc64	2001-10-26 08:12:54 +00:00
jhb	78494eacb6	Remove a bogus comment. "atomic" doesn't mean that the operation is done as a physical atomic operation. That would require the code to use the atomic API, which it does not. Instead, the operation is made psuedo atomic (hence the quotes) by use of the lock to protect clearing all of the flags in question.	2001-09-21 19:26:57 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
dillon	08e732a88b	Remove the MPSAFE keyword from the parser for syscalls.master. Instead introduce the [M] prefix to existing keywords. e.g. MSTD is the MP SAFE version of STD. This is prepatory for a massive Giant lock pushdown. The old MPSAFE keyword made syscalls.master too messy. Begin comments MP-Safe procedures with the comment: /* * MPSAFE / This comments means that the procedure may be called without Giant held (The procedure itself may still need to obtain Giant temporarily to do its thing). sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE sv_transtrap() is now MP SAFE and assumed to be MP SAFE ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown) trapsignal() is now MP SAFE (Giant Pushdown) Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant) test in syscall[2]() in /*/trap.c now do not. Instead they explicitly unlock Giant if they previously obtained it, and then assert that it is no longer held to catch broken system calls. Rebuild syscall tables.	2001-08-30 18:50:57 +00:00
jhb	4a89454dcd	- Close races with signals and other AST's being triggered while we are in the process of exiting the kernel. The ast() function now loops as long as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with preemption disabled so that any further AST's that arrive via an interrupt will be delayed until the low-level MD code returns to user mode. - Use u_int's to store the tick counts for profiling purposes so that we do not need sched_lock just to read p_sticks. This also closes a problem where the call to addupc_task() could screw up the arithmetic due to non-atomic reads of p_sticks. - Axe need_proftick(), aston(), astoff(), astpending(), need_resched(), clear_resched(), and resched_wanted() in favor of direct bit operations on p_sflag. - Fix up locking with sched_lock some. In addupc_intr(), use sched_lock to ensure pr_addr and pr_ticks are updated atomically with setting PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing PS_OWEUPC. We also do not grab the lock just to test a flag. - Simplify the handling of Giant in ast() slightly. Reviewed by: bde (mostly)	2001-08-10 22:53:32 +00:00
dillon	52f62a303c	postsig() currently requires Giant to be held. Giant is held properly at the first postsig() call, but not always held at the second place, resulting in an occassional panic.	2001-07-04 15:36:30 +00:00
jhb	7bb1f29898	Grab Giant around postsig() since sendsig() can call into the vm to grow the stack and we already needed Giant for KTRACE.	2001-07-03 05:27:53 +00:00
jhb	cbc88996c6	Move ast() and userret() to sys/kern/subr_trap.c now that they are MI.	2001-06-29 19:51:37 +00:00
jhb	d82893e676	Add a new MI pointer to the process' trapframe p_frame instead of using various differently named pointers buried under p_md. Reviewed by: jake (in principle)	2001-06-29 11:10:41 +00:00
jhb	cc8833dfe9	Grab Giant around trap_pfault() for now.	2001-06-29 04:18:10 +00:00
jhb	e5e16e09ad	- Grab the proc lock around CURSIG and postsig(). Don't release the proc lock until after grabbing the sched_lock to avoid CURSIG racing with psignal. - Don't grab Giant for addupc_task() as it isn't needed. Reported by: tegge (signal race), bde (addupc_task a while back)	2001-06-22 23:05:11 +00:00
jhb	7a4f835060	Don't hold sched_lock across addupc_task(). Reported by: David Taylor <davidt@yadt.co.uk> Submitted by: bde	2001-06-06 00:57:24 +00:00
jhb	8d7fd621d7	Don't acquire Giant just to call trap_fatal(), we are about to panic anyway so we'd rather see the printf's then block if the system is hosed.	2001-05-23 22:58:09 +00:00
bde	5fd5877aef	Convert npx interrupts into traps instead of vice versa. This is much simpler for npx exceptions that start as traps (no assembly required...) and works better for npx exceptions that start as interrupts (there is no longer a problem for nested interrupts). Submitted by: original (pre-SMPng) version by luoqi	2001-05-22 21:20:49 +00:00
alfred	a3f0842419	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
jhb	3fbeaa9056	Remove unneeded includes of sys/ipl.h and machine/ipl.h.	2001-05-15 23:22:29 +00:00
jhb	c20ad9aee2	Simplify the vm fault trap handling code a bit by using if-else instead of duplicating code in the then case and then using a goto to jump around the else case.	2001-05-11 23:50:08 +00:00
jhb	8bfdafc934	Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP. - It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the _process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the _process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delievered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the _process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwared_hardclock/statclock which then call the _process() functions. - forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around. - need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above. - Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h. - The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in. Reviewed by: jake, peter Looked over by: eivind	2001-04-27 19:28:25 +00:00
jhb	9c1fb038d7	- Release Giant a bit earlier on syscall exit. - Don't try to grab Giant before postsig() in userret() as it is no longer needed. - Don't grab Giant before psignal() in ast() but get the proc lock instead.	2001-03-07 03:53:39 +00:00
jake	d273bd6e2b	- Rename the lcall system call handler from Xsyscall to Xlcall_syscall to be more like Xint0x80_syscall and less like c function syscall(). - Reduce code duplication between the int0x80 and lcall handlers by shuffling the elfags into the right place, saving the sizeof the instruction in tf_err and jumping into the common int0x80 code. Reviewed by: peter	2001-02-25 02:53:06 +00:00
jhb	4d15762396	The p_md.md_regs member of proc is used in signal handling to reference the the original trapframe of the syscall, trap, or interrupt that entered the kernel. Before SMPng, ast's were handled via a psuedo trap at the end of doerti. With the SMPng commit, ast's were broken out into a separate ast() function that was called from doreti to match the behavior of other architectures. Unfortunately, when this was done, the p_md.md_regs member of curproc was not updateda in ast(), thus when signals are handled by userret() after an interrupt that returns to userland, we end up using a stale trapframe that will result in the registers from the old trapframe overwriting the real trapframe and smashing all the registers right before we return to usermode. The saved %cs:%eip from where we were in usermode are saved in the trapframe for example.	2001-02-22 19:35:20 +00:00
jhb	3f7cd4b044	- Change ast() to take a pointer to a trapframe like other architectures. - Don't use an atomic operation to update cnt.v_soft in ast(). This is the only place the variable is written to, and sched_lock is always held when it is written, so it is already protected and the mutex release of sched_lock asserts a memory barrier that ensures the value will be updated in a timely fashion.	2001-02-22 18:05:15 +00:00
jhb	667eb173f1	- Use TRAPF_PC() on the alpha to acess the PC in the trap frame. - Don't hold sched_lock around addupc_task() as this apparently breaks profiling badly due to sched_lock being held across copyin(). Reported by: bde (2)	2001-02-22 16:23:12 +00:00
jhb	27efeb0d30	- Don't call clear_resched() in userret(), instead, clear the resched flag in mi_switch() just before calling cpu_switch() so that the first switch after a resched request will satisfy the request. - While I'm at it, move a few things into mi_switch() and out of cpu_switch(), specifically set the p_oncpu and p_lastcpu members of proc in mi_switch(), and handle the sched_lock state change across a context switch in mi_switch(). - Since cpu_switch() no longer handles the sched_lock state change, we have to setup an initial state for sched_lock in fork_exit() before we release it.	2001-02-20 05:26:15 +00:00
bde	97e52ec00f	Removed all traces of T_ASTFLT (except for gaps where it was). It became unused except in dead code when ast() was split off from trap().	2001-02-19 15:47:38 +00:00
bde	49ef1aaa13	Changed the aston() family to operate on a specified process instead of always on curproc. This is needed to implement signal delivery properly (see a future log message for kern_sig.c). Debogotified the definition of aston(). aston() was defined in terms of signotify() (perhaps because only the latter already operated on a specified process), but aston() is the primitive. Similar changes are needed in the ia64 versions of cpu.h and trap.c. I didn't make them because the ia64 is missing the prerequisite changes to make astpending and need_resched per-process and those changes are too large to make without testing.	2001-02-19 04:15:59 +00:00

1 2 3 4 5 ...

325 Commits