Commit Graph

301 Commits

Julian Elischer
060563ec50 Move the _oncpu entry from the KSE to the thread.
The entry in the KSE still exists but its purpose will change a bit
when we add the ability to lock a KSE to a cpu.
2003-04-10 17:35:44 +00:00
Mike Barcroft
fd7a8150fb o In struct prison, add an allprison linked list of prisons (protected
by allprison_mtx), a unique prison/jail identifier field, and two path
  fields (pr_path for reporting and a pr_root vnode instance) to store
  the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
  vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
  to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
  struct xprison instances that represent a snapshot of active jails.

Reviewed by:	rwatson, tjr
2003-04-09 02:55:18 +00:00
Peter Wemm
cc66ebe2a9 Commit a partial lazy thread switch mechanism for i386. It isn't as lazy
as it could be and could use some more cleanup.  Currently it's under
options LAZY_SWITCH.  What this does is avoid %cr3 reloads for short
context switches that do not involve another user process, i.e. we can
take an interrupt, switch to a kthread and return to the user without
explicitly flushing the TLB.  However, this isn't as exciting as it could
be: the interrupt overhead is still high and too much still blocks on
Giant.  There are some debug sysctls, for stats and for an on/off switch.

The main problem with doing this has been "what if the process that you're
running on exits while we're borrowing its address space?" - in this case
we use an IPI to give it a kick when we're about to reclaim the pmap.

It's not compiled in unless you add the LAZY_SWITCH option.  I want to fix a
few more things and get some more feedback before turning it on by default.

This is NOT a replacement for Bosko's lazy interrupt stuff.  This was more
meant for the kthread case, while his was for interrupts.  Mine helps a
little for interrupts, but his helps a lot more.

The stats are enabled with options SWTCH_OPTIM_STATS - this has been a
pseudo-option for years, I just added a bunch of stuff to it.

One non-trivial change was to select a new thread before calling
cpu_switch() in the first place.  This allows us to catch the silly
case of doing a cpu_switch() to the current process.  This happens
uncomfortably often.  This simplifies a bit of the asm code in cpu_switch()
(we no longer have to call choosethread() in the middle).  This has been
implemented on i386 and (thanks to jake) sparc64.  The others will come
soon.  This is actually separate from the lazy switch stuff.

Glanced at by:  jake, jhb
2003-04-02 23:53:30 +00:00
John Baldwin
959d22329a - Remove witness_dead and just use witness_watch instead.  If witness_watch
is set to 0, it now has the same effect as setting witness_dead used to
  have.
- Added a sysctl handler that allows root to change witness_watch from a
  non-zero value to zero to disable witness at runtime.  Note that you
  can't turn witness back on once it is off.  You can only turn it off as
  a one-way switch.
- Added a comment describing the possible values of witness_watch.
2003-03-24 21:03:53 +00:00
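
A hedged sketch of how such a one-way sysctl handler could look, assuming the usual <sys/sysctl.h> handler interface; the handler name and the error choice below are illustrative, not a copy of the actual change:

    /* Illustrative one-way handler: witness_watch may go to 0, never back up. */
    static int
    sysctl_debug_witness_watch(SYSCTL_HANDLER_ARGS)
    {
        int error, value;

        value = witness_watch;
        error = sysctl_handle_int(oidp, &value, 0, req);
        if (error != 0 || req->newptr == NULL)
            return (error);
        if (value != 0)
            return (EINVAL);        /* only turning witness off is allowed */
        witness_watch = 0;
        return (0);
    }

    SYSCTL_PROC(_debug, OID_AUTO, witness_watch, CTLTYPE_INT | CTLFLAG_RW,
        NULL, 0, sysctl_debug_witness_watch, "I",
        "witness is watching lock operations");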
John Baldwin
2ca9461a05 Trim an extra blank line that snuck into the last commit. 2003-03-11 22:33:42 +00:00
John Baldwin
427b3a6549 - Change witness_displaydescendants() to accept the indentation level as
a parameter instead of using the level of a given witness.  When
  recursing, pass an indent level of indent + 1.
- Make use of the information witness_levelall() provides in
  witness_display_list() to use an O(n) algorithm instead of an O(n^2)
  algo to decide which witnesses to display hierarchies from.  Basically,
  we only display a hierarchy for witnesses with a level of 0.
- Add a new per-witness flag that is reset at the start of
  witness_display() for all witnesses and is set the first time a witness
  is displayed in witness_displaydescendants().  If a witness is
  encountered more than once in the lock order tree (which happens often),
  witness_displaydescendants() marks the later occurrences with the string
  "(already displayed)" and doesn't display the subtree under that
  witness.  This avoids duplicating large amounts of the lock order tree
  in the 'show witness' output in DDB.

All these changes serve to make 'show witness' a lot more readable and
useful than it was previously.
2003-03-11 22:14:21 +00:00
John Baldwin
f82c6950be - Split the itismychild() function into two functions: insertchild()
adds a witness to the child list of a parent witness.  rebalancetree()
  runs through the entire tree removing direct descendants of witnesses
  who already have said child witness as an indirect descendant through
  another direct descendant.  itismychild() now calls insertchild()
  followed by rebalancetree() and no longer needs the evil hack of
  having a static 'recursed' variable.
- Add a function reparentchildren() that adds all the direct descendants
  of one witness as direct descendants of another witness.
- Change the return value of itismychild() and similar functions so that
  they return 0 in the case of failure due to lack of resources instead
  of 1.  This makes the return value more intuitive.
- Check the return value of itismychild() when defining the static lock
  order in witness_initialize().
- Don't try to setup a lock instance in witness_lock() if itismychild()
  fails.  Witness is hosed anyways so no need to do any more witness
  related activity at that point.  It also makes the code flow easier to
  understand.
- Add a new depart() function as the opposite of enroll().  When the
  reference count of a witness drops to 0 in witness_destroy(), this
  function is called on that witness.  First, it runs through the
  lock order tree using reparentchildren() to reparent direct descendants
  of the departing witness to each of the witness' parents in the tree.
  Next, it releases its own child list and other associated resources.
  Finally, it calls rebalancetree() to rebalance the lock order tree.
- Sort function prototypes into something closer to alphabetical order.

As a result of these changes, there should no longer be 'dead' witnesses
in the order tree, and repeatedly loading and unloading a module should no
longer exhaust witness of its internal resources.

Inspired by:	gallatin
2003-03-11 22:07:35 +00:00
John Baldwin
d5b13ee082 Trim useless "../" leading strings from filenames passed into witness. 2003-03-11 21:53:12 +00:00
John Baldwin
28e4d137a2 Adjust style of #ifdef's and #endif's to be more consistent and in line
with recent additions to style(9).
2003-03-11 21:38:49 +00:00
John Baldwin
d278a7f9ba Do the lock order check skip for the LOP_TRYLOCK case after the check for
recursing on a lock instead of before.  This fixes a bug where WITNESS
could get a little confused if you did an sx_tryslock() on a sx lock that
you already had an slock on.  WITNESS would still function correctly but
it could result in weirdness in the output of 'show locks'.  This also
makes it possible for mtx_trylock() to recurse on a lock.
2003-03-11 20:54:37 +00:00
John Baldwin
0e8677f68b Now that we have WITNESS_WARN(), we only call witness_list() from the
ddb 'show locks' command.  Thus, move witness_list() to the #ifdef DDB
section and remove extra checks for calling this function outside of
DDB.  Also, witness_list() now returns void instead of returning an int.

Reported by:	Steve Ames <steve@energistic.com>
Prodded by:	davidxu
2003-03-10 17:03:57 +00:00
John Baldwin
9da590b49b Oops, fix the double faults people were seeing with the recent changes to
witness.  Sleepable locks such as sx locks always come before all mutexes
including Giant.  However, the static lock order list placed Giant before
the proctree and allproc sx locks.  This resulted in witness creating a
cycle in its lock order "tree" (real trees don't have cycles) leading to
infinite recursion and eventually a double fault.  To fix, put Giant after
sx locks in the lock order list.
2003-03-06 17:25:06 +00:00
John Baldwin
c141c242ac Bah, fix a bogon in the last commit: get the sense of a compare test right
so that we allow a sleepable lock to be acquired with Giant held rather
than allowing a sleepable lock to be acquired with anything but Giant held.
2003-03-04 22:34:07 +00:00
John Baldwin
35580ede37 A small overhaul of witness:
- Add a comment about special lock order rules and Giant near the top of
  subr_witness.c.  Specifically, this documents and explains the real lock
  order relationship between Giant and sleepable locks (i.e. lockmgr locks
  and sx locks).  Basically, Giant can be safely acquired either before or
  after sleepable locks and the case of Giant before a sleepable lock is
  exempted as a special case.
- Add a new static function 'witness_list_lock()' that displays a single
  line of information about a struct lock_instance.  This is used to
  make the output of witness messages more consistent and reduce some code
  duplication.
- Fixup a few comments in witness_lock().
- Properly handle the Giant-before-sleepable-lock lock order exception in
  a more general fashion and remove the no longer needed LI_SLEPT flag.
- Break up the last condition before assuming a reversal a bit to try
  and make the logic less confusing in witness_lock().
- Axe WITNESS_SLEEP() now that LI_SLEPT is no longer needed and replace it
  with a more general WITNESS_WARN() macro/function combination.
  WITNESS_WARN() allows you to output a customized message out to the
  console along with a list of held locks.  It will optionally drop into
  the debugger as well.  You can exempt a single lock from the check by
  passing it in as the second argument.  You can also use flags to specify
  if Giant should be exempt from the check, if all sleepable locks should
  be exempt from the check, and if witness should panic if any non-exempt
  locks are found.
- Make the witness_list() function static.  Other areas of the kernel
  should use the new WITNESS_WARN() instead.
2003-03-04 20:56:39 +00:00
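
As a rough, hedged illustration of the new macro, a code path that is about to sleep might use it like this; the WARN_* flag names follow the description above but should be treated as assumptions, and the surrounding function is hypothetical:

    /*
     * Warn about any locks held at this point, accepting Giant and
     * sleepable locks; NULL means no additional lock is exempted.
     */
    static void
    example_prepare_to_sleep(const char *wmesg)
    {
        WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL,
            "about to sleep on \"%s\"", wmesg);
    }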
Peter Wemm
af3d516f55 Initiate de-orbit burn for USE_PCI_BIOS_FOR_READ_WRITE. This has been
#if'ed out for a while.  Complete the deed and tidy up some other bits.

We need to be able to call this stuff from outer edges of interrupt
handlers for devices that have the ISR bits in pci config space.  Making
the bios code mpsafe was just too hairy.  We had also stubbed it out some
time ago due to there simply being too much brokenness in too many systems.
This adds a leaf lock so that it is safe to use pci_read_config() and
pci_write_config() from interrupt handlers.  We will still use pcibios
to do interrupt routing if there is no ACPI.  [yes, I tested this]

Briefly glanced at by:  imp
2003-02-18 03:36:49 +00:00
Jeff Roberson
5215b1872f - Split the struct kse into struct upcall and struct kse. struct kse will
soon be visible only to schedulers.  This greatly simplifies much of the
   KSE code.

Submitted by:	davidxu
2003-02-17 05:14:26 +00:00
Peter Wemm
1c425b874c Add a 'debug.witness_trace' sysctl (and tunable) when DDB is present.
This causes LOR and could-sleep messages to come with a stack trace.
2003-02-13 01:35:56 +00:00
Julian Elischer
6f8132a867 Reversion of commit by Davidxu plus fixes since applied.
I'm not convinced there is anything major wrong with the patch but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.
2003-02-01 12:17:09 +00:00
David Xu
0dbb100b9b Move the UPCALL-related data structures out of the kse and introduce a new
data structure called kse_upcall to manage UPCALLs. All KSE binding
and loaning code is gone.

A thread that owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and take those contexts back
to userland. Any thread without an upcall structure has to export its
contexts and exit at the user boundary.

Any thread running in user mode owns an upcall structure. When it enters
the kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in the kernel, a new UPCALL thread is created and
the upcall structure is transferred to the new UPCALL thread. If the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in the kernel, no UPCALL thread will be created.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit; when all upcalls in a ksegrp are removed, the group is
automatically shut down. An upcall owner thread also exits when the process
is in the exiting state. When an owner thread exits, the upcall it owns is
also removed.

A KSE is a pure scheduler entity: it represents a virtual cpu. When a thread
is running, it always has a KSE associated with it. The scheduler is free to
assign a KSE to a thread according to thread priority; if a thread's priority
changes, the KSE can be moved from one thread to another.

When a ksegrp is created, N KSEs are always created in the group, where
N is the number of physical cpus in the current system. This makes it
possible that even if a userland UTS is only single-CPU safe, threads in the
kernel can still execute on different cpus in parallel. Userland calls
kse_create to add more upcall structures to a ksegrp to increase concurrency
in userland itself; the kernel is not restricted by the number of upcalls
userland provides.

The code hasn't been tested under SMP by the author due to lack of hardware.

Reviewed by: julian
2003-01-26 11:41:35 +00:00
Jake Burkholder
be0800e85e Oops, add zstty to the witness order list.
Noticed by:	benno
2003-01-09 15:45:28 +00:00
Jake Burkholder
c3c2862df4 - Add a spin lock to single thread cache invalidation and tlb flush ipis,
which allows ipis to be sent outside of Giant.
- Remove the ap boot mutex, which is unused.
2002-12-22 20:50:23 +00:00
Kris Kennaway
4ef3d7a27b Enforce correct ordering of the filedesc structure and pipe mutex, because
WITNESS can get the order wrong if it guesses based on first use.

Reviewed by:	jhb, alfred
2002-12-22 16:32:34 +00:00
John Baldwin
d2b28e078a Correct an assertion in the code to traverse the list of locks to find an
earlier acquired lock with the same witness as the lock currently being
acquired.  If we had released several earlier acquired locks after
acquiring enough locks to require another lock_list_entry bucket in the
lock list, then subsequent lock_list_entry buckets could contain only one
lock instance in which case i would be zero.

Reported by:	Joel M. Baldwin <qumqats@outel.org>
2002-11-11 16:36:20 +00:00
Alan Cox
151113a946 Catch up with the removal of the vm page buckets spin mutex. 2002-11-02 22:42:18 +00:00
Poul-Henning Kamp
ab33958276 #unifdef the code for checking blessed lock collisions until we need it.
Spotted by:	DARPA & NAI Labs.
2002-10-20 08:48:39 +00:00
Peter Wemm
d2575b9651 Register the machine check private state spinlock on ia64. 2002-10-12 00:33:36 +00:00
Poul-Henning Kamp
37c841831f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
Jeff Roberson
c76e20451c - Tell witness about ALQ's spin lock. 2002-09-22 07:11:57 +00:00
Jake Burkholder
c0d676c068 Make this driver work a whole lot better.
- Get the initial mode from the prom settings and don't clobber the mode
  on open.
- Copy output into an internal ring buffer instead of accessing the tty
  outq directly in the interrupt handler.  This fixes a problem where
  garbage would show up in the output stream.
- Reset the console port completely and reprogram all the parameters
  before enabling it.  This fixes seemingly random hangs on startup
  when using a fast interrupt handler.
- Add minimal locking in place of spls.
- Remove dead code and minor cleanups.
2002-09-08 04:45:16 +00:00
Ian Dowse
9261400aa2 Add WITNESS_FILE() and WITNESS_LINE(), which allow users of witness
to print out the file and line from the lock object. These will be
used shortly by CTR() calls in the mutex code.

Reviewed by:	jhb, jake
2002-08-26 18:31:26 +00:00
Mark Peek
11a78c514f Silence compiler warnings when DDB is not defined.
PR:		36002
Submitted by:	Yoshikazu GOTO <goto@snowy.to>
2002-07-15 02:03:17 +00:00
Peter Wemm
f1b665c8fe Revive backed out pmap related changes from Feb 2002. The highlights are:
- It actually works this time, honest!
- Fine grained TLB shootdowns for SMP on i386.  IPI's are very expensive,
  so try and optimize things where possible.
- Introduce ranged shootdowns that can be done as a single IPI.
- PG_G support for i386
- Specific-cpu targeted shootdowns.  For example, there is no sense in
  globally purging the TLB cache when we are stealing a page from
  the local unshared process on the local cpu.  Use pm_active to track
  this.
- Add some instrumentation for the tlb shootdown code.
- Rip out SMP code from <machine/cpufunc.h>
- Try and fix some very bogus PG_G and PG_PS interactions that were bad
  enough to cause vm86 bios calls to break.  vm86 depended on our existing
  bugs and this was the cause of the VESA panics last time.
- Fix the silly one-line error that caused the 'panic: bad pte' last time.
- Fix a couple of other silly one-line errors that should have caused more
  pain than they did.

Some more work is needed:
- pmap_{zero,copy}_page[_idle].  These can be done without IPI's if we
  have a hook in cpu_switch.
- The IPI handlers need some cleanup.  I have a bogus %ds load that can
  be avoided.
- APTD handling is rather bogus and appears to be a large source of
  global TLB IPI shootdowns for no really good reason.

I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop.
I expect to see a bigger difference when there is significant pageout
activity or the system otherwise has memory shortages.

I have backed out a few optimizations that I had been using over the last
few days in order to be a little more conservative.  I'll revisit these
again over the next few days as the dust settles.

New option:  DISABLE_PG_G - In case I missed something.
2002-07-12 07:56:11 +00:00
Alan Cox
70c1763634 o Resurrect vm_page_lock_queues(), vm_page_unlock_queues(), and the free
queue lock (revision 1.33 of vm/vm_page.c removed them).
 o Make the free queue lock a spin lock because it's sometimes acquired
   inside of a critical section.
2002-07-04 22:07:37 +00:00
Julian Elischer
e602ba25fd Part 1 of KSE-III
The ability to schedule multiple threads per process
(on one cpu) by making ALL system calls optionally asynchronous.
To come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by:	Almost everyone who counts
	(at various times, peter, jhb, matt, alfred, mini, bernd,
	and a cast of thousands)

	NOTE: this is still Beta code, and contains lots of debugging stuff.
	Expect slight instability in signals.
2002-06-29 17:26:22 +00:00
John Baldwin
48849938e8 Change the all locks list from a STAILQ to a TAILQ. This bloats struct
lock_object by another pointer (though all of lock_object should be
conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE()
in witness_destroy() (called for every mtx_destroy() and sx_destroy())
instead of an O(n) STAILQ_REMOVE.  Since WITNESS is so dog slow as it is,
the speed-up is worth the space cost.

Suggested by:	iedowse
2002-06-06 20:51:04 +00:00
John Baldwin
8dcb900b62 Handle "dead" witnesses better in the situation of several short term locks
being created and destroyed without a single long-term one around to ensure
the witness associated with that group of locks stays alive.  The pipe
mutexes are an example of this group.  For a dead witness we no longer
clear the witness name.  Instead, when looking up the witness for a lock,
if a dead witness' (a witness with a refcount of 0) w_name pointer is
identical to the witness name of the lock then we revive that witness
instead of using a new witness for the lock.  This results in far fewer
dead witness objects and also better preserves locking orders over the long
term resulting in more correct lock order checking.  Note that we can't
ever dereference w_name of a dead witness since we don't know if the string
it is pointing to has been free()'d or kldunload()'d out from under us.
2002-06-06 19:04:38 +00:00
John Baldwin
525c135972 In witness_unlock(), when updating a lock list entry bucket, decrement the
count of lock list entries after we fixup the bucket of lock list entries.
In theory we can remove the intr_disable/intr_restore() calls now.
2002-05-20 19:16:22 +00:00
John Baldwin
bbd296aba6 - Allow witness_sleep() to be called when witness hasn't been initialized
yet.  We just return without performing any checks.
- Don't explicitly enter and exit critical sections when walking lock
  lists.  We don't need a critical section to walk the list of sleep
  locks for a thread.  We check to see if a spin lock list is empty
  before we walk it.  If the list is empty we don't need to walk it.  If
  it isn't then we already hold at least one spin lock and are already in
  a critical section and thus don't need our own explicit critical
  section.
2002-05-20 17:49:46 +00:00
Alfred Perlstein
e649887b1e Make funsetown() take a 'struct sigio **' so that the locking can
be done internally.

Ensure that no one can fsetown() to a dying process/pgrp.  We need
to check the process for P_WEXIT to see if it's exiting.  Process
groups are already safe because there is no such thing as a pgrp
zombie, therefore the proctree lock completely protects the pgrp
from having sigio structures associated with it after it runs
funsetownlst.

Add sigio lock to witness list under proctree and allproc, but over
proc and pgrp.

Seigo Tanimura helped with this.
2002-05-06 19:31:28 +00:00
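
A hedged sketch of the resulting calling convention: the caller now hands funsetown() the address of its sigio pointer so the clearing (and the locking for it) happens inside the function.  The tty field used below is just for illustration:

    void funsetown(struct sigio **sigiop);      /* new prototype */

    static void
    example_tty_close(struct tty *tp)
    {
        funsetown(&tp->t_sigio);        /* was: funsetown(tp->t_sigio) */
    }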
Alan Cox
ea0f50bcf0 o Convert the vm_page buckets mutex to a spin lock. (This resolves
an issue on the Alpha platform found by jeff@.)
 o Simplify vm_page_lookup().

Reviewed by:	jhb
2002-04-30 21:24:47 +00:00
John Baldwin
e64b74e35b Whitespace bogon. 2002-04-27 04:48:36 +00:00
Marcel Moolenaar
9ae9d0ff86 Insert a semi-colon between label 'skip:' and the closing brace
of the FOREACH loop to silence GCC 3.
2002-04-27 02:58:18 +00:00
Dag-Erling Smørgrav
521eb014c8 Add the mutex profiling lock to the witness list. This hopefully unbreaks
the MUTEX_PROFILING + WITNESS + !WITNESS_SKIPSPIN case.

Submitted by:	Hiten Pandya <hiten@uk.FreeBSD.org>
2002-04-25 22:48:40 +00:00
John Baldwin
f089b57070 - Merge the pgrpsess_lock and proctree_lock sx locks into one proctree_lock
sx lock.  Trying to get the lock order between these locks was getting
  too complicated as the locking in wait1() was being fixed.
- leavepgrp() now requires an exclusive lock of proctree_lock to be held
  when it is called.
- fixjobc() no longer gets a shared lock of proctree_lock now that it
  requires an xlock be held by the caller.
- Locking notes in sys/proc.h are adjusted to note that everything that
  used to be protected by the pgrpsess_lock is now protected by the
  proctree_lock.
2002-04-16 17:03:05 +00:00
John Baldwin
9522390c28 Display the recursion count in the lock_instance in the show locks
output.

Indirectly requested by:	peter
2002-04-10 01:25:11 +00:00
John Baldwin
9351347a17 Cosmetic fixup in output of lock types in show locks output. 2002-04-10 01:19:53 +00:00
John Baldwin
b6396e1656 Add a new char * pointer lo_type to struct lock_object that is used to
point to a more generic name for a lock that is more suitable for use by
witness when grouping locks.  For example, although network driver locks
use the interface name for the name of each lock, they should all use the
same witness and be treated the same as witness.  Another example is that
all UMA zone locks should be treated the same.  The witness code has also
been updated to print out the lock type in addition to the lock name in a
few places where it is relevant.
2002-04-04 20:45:21 +00:00
John Baldwin
c08cf3c3e8 Enforce an implicit lock order of sleepable locks before non-sleepable
locks.
2002-04-02 19:27:21 +00:00
John Baldwin
48c343df5f Explicitly document how we implicitly enforce the lock order of sleep
locks before spin locks.
2002-04-02 16:51:20 +00:00
Jeff Roberson
f22a4b62f5 Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks
with this flag.  Remove the dup_list and dup_ok code from subr_witness.  Now
we just check for the flag instead of doing string compares.

Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface.  The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.

Approved by:	jhb
2002-03-27 09:23:41 +00:00
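
A hedged sketch of the kind of lock class MTX_DUPOK is meant for: per-object locks where holding two instances of the same class at once is legitimate.  The structure, names, and the four-argument mtx_init() form below are illustrative:

    struct example_node {
        struct mtx  n_mtx;
        int         n_value;
    };

    static void
    example_node_init(struct example_node *np)
    {
        mtx_init(&np->n_mtx, "example node", NULL, MTX_DEF | MTX_DUPOK);
    }

    static void
    example_swap(struct example_node *a, struct example_node *b)
    {
        int tmp;

        /* Two locks of the same class held at once: needs MTX_DUPOK. */
        mtx_lock(&a->n_mtx);
        mtx_lock(&b->n_mtx);
        tmp = a->n_value;
        a->n_value = b->n_value;
        b->n_value = tmp;
        mtx_unlock(&b->n_mtx);
        mtx_unlock(&a->n_mtx);
    }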
Warner Losh
cb9a238a8a Remove last two abuses of cpu_critical_{enter,exit} in the MI code.
Reviewed by: jake, jhb, rwatson
2002-03-21 06:11:09 +00:00
John Baldwin
60e269643d - Use a MI critical section in witness_sleep() and witness_list() as they
simply need to prevent switching from another CPU and do not need
  interrupts disabled.
- Add a comment to witness_list() about why displaying spin locks for
  threads on other CPU's really is just a bad idea and probably shouldn't
  be done.
2002-03-08 18:57:57 +00:00
Peter Wemm
d1693e1701 Back out all the pmap related stuff I've touched over the last few days.
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels.  Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely.  Userland programs still crashed.
2002-02-27 09:51:33 +00:00
Peter Wemm
6bd95d70db Work-in-progress commit syncing up pmap cleanups that I have been working
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
  a single IPI, since the IPI is very expensive.  Adjust some callers
  that used to trigger this inside tight loops to do a ranged shootdown
  at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
  PSE and the 4MB pages at the start of the kernel.  This should solve
  some rumored strangeness about stale PG_G entries getting stuck
  underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines.  gcc
  seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers.  I will fix
  this again shortly.

This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round.  The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP).  I have seen world
speedups of a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
2002-02-25 23:49:51 +00:00
Seigo Tanimura
f591779bb5 Lock struct pgrp, session and sigio.
New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
  session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by:	jhb, alfred
Tested on:	cvsup.jp.FreeBSD.org
		(which is a quad-CPU machine running -current)
2002-02-23 11:12:57 +00:00
Julian Elischer
079b7badea Pre-KSE/M3 commit.
This is a low-functionality change that makes the kernel access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process.  It IS still embedded there,
but this removes all the code that assumes that, in preparation for the next
commit which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
John Baldwin
78a1485fd1 Fixes for alpha pmap on SMP machines:
- Create a private list of active pmaps rather than abusing the list of all
  processes when we need to look up pmaps.  The process list needs a sx lock
  and we can't be getting sx locks in the middle of cpu_switch()
  (pmap_activate() can call pmap_get_asn() from cpu_switch()).  Instead, we
  protect the list with a spinlock.  This also means the list is shorter
  since a pmap can be used by more than one process and we could (at least
  in theory) dink with pmaps more than once, but now we only touch each
  pmap once when we have to update all of them.
- Wrap pmap_activate()'s code to get a new ASN in an explicit critical section
  so that when it is called while doing an exec() we can't get preempted.
- Replace splhigh() in pmap_growkernel() with a critical section to prevent
  preemption while we are adjusting the kernel page tables.
- Fix abuse of PCPU_GET(), which doesn't return an L-value.
- Also add some slight cleanups to the ASN handling by adding some macros
  instead of magic numbers in relation to the ASN and ASN generations.

Reviewed by:	dfr
2002-02-06 04:30:26 +00:00
John Baldwin
c86b6ff551 Change the preemption code for software interrupt thread schedules and
mutex releases to not require flags for the cases when preemption is
not allowed:

The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex release and swi schedule,
respectively, when that switch is not safe.  Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer.  This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs.  Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called.  (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)

I've tested the changes to the interrupt code on i386 and alpha.  I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine.  PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken.  Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.

Reviewed by:	peter
Tested on:	i386, alpha
2002-01-05 08:47:13 +00:00
John Baldwin
422f61655f Remove brain damaged code in witness_lock(). We could have easily
just used PCPU_GET(spinlocks) w/o needing the w_mtx held.  It is more
correct to just check td_critnest now though.
2002-01-05 08:29:54 +00:00
John Baldwin
98f9879242 Introduce a standard name for the lock protecting an interrupt controller
and its associated state variables: icu_lock with the name "icu".  This
renames the imen_mtx for x86 SMP, but also uses the lock to protect
access to the 8259 PIC on x86 UP.  This also adds an appropriate lock to
the various Alpha chipsets which fixes problems with Alpha SMP machines
dropping interrupts with an SMP kernel.
2001-12-20 23:48:31 +00:00
John Baldwin
7e1f6dfe9d Modify the critical section API as follows:
- The MD functions critical_enter/exit are renamed to start with a cpu_
  prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
  count and a per-thread critical section saved state set when entering
  a critical section while at nesting level 0 and restored when exiting
  to nesting level 0.  This moves the saved state out of spin mutexes so
  that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
  cpu_critical_enter/exit.  MI code such as device drivers and spin
  mutexes use the MI wrappers.  Note that since the MI wrappers store
  the state in the current thread, they do not have any return values or
  arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
  assigned to curthread->td_savecrit during fork_exit().

Tested on:	i386, alpha
2001-12-18 00:27:18 +00:00
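
A hedged sketch of the MI wrapper scheme described above; the logic illustrates "save the MD state only at the outermost nesting level" and is not a copy of the actual functions:

    void
    critical_enter(void)
    {
        struct thread *td = curthread;

        if (td->td_critnest == 0)
            td->td_savecrit = cpu_critical_enter();
        td->td_critnest++;
    }

    void
    critical_exit(void)
    {
        struct thread *td = curthread;

        if (td->td_critnest == 1)
            cpu_critical_exit(td->td_savecrit);
        td->td_critnest--;
    }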
David E. O'Brien
91f9161737 Repeat after me -- "Use of ANSI string concatenation can be bad."
In this case, C99's __func__ is properly defined as:

	static const char __func__[] = "function-name";

and GCC 3.1 will not allow it to be used in bogus string concatenation.
2001-12-10 05:40:12 +00:00
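
To make the failure mode concrete, a hedged example (hypothetical function) of the construct GCC 3.1 rejects and its portable replacement:

    void
    frob(void)
    {
        /* printf("error in " __func__ "\n");  <- string concatenation: rejected */
        printf("error in %s\n", __func__);     /* __func__ is an array: use %s */
    }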
John Baldwin
f4076cc158 Add a couple of returns to make recovering from a failed witness_assert()
more sane in the RESTARTABLE_PANICS case.
2001-11-15 19:46:36 +00:00
John Baldwin
74e4502e62 Replace 'curproc' with 'td->td_proc'. 2001-10-08 21:05:46 +00:00
John Baldwin
0479e3d339 Move the ap boot spin lock earlier in the lock order before the sio(4)
lock since we occasionally call printf() while holding the ap boot lock
which can call down into the sio(4) driver if using a serial console.
2001-10-01 22:50:30 +00:00
John Baldwin
e649bcb506 Remove unneeded proc variables and fix comments. 2001-09-21 21:54:45 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
Make the kernel aware that there are smaller units of scheduling than the
process (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
John Baldwin
6385dec00e Style nits:
- Don't use punctuation or newlines in panic messages.
- Remove excess blank lines.

Requested and partially submitted by:	bde
2001-08-24 17:46:58 +00:00
John Baldwin
c19fe5e261 Add witness_upgrade() and witness_downgrade() for handling upgrades and
downgrades of shared/exclusive locks.
2001-08-23 22:47:05 +00:00
John Baldwin
d7c4536a55 Convert some KASSERT()'s into if (foo) panic() because they are testing
how locks are managed by the rest of the kernel, not verifying the internal
integrity of witness itself.
2001-08-23 22:44:47 +00:00
John Baldwin
827dcaf663 Make witness compile w/o DDB.
Reported by:	wpaul
2001-08-10 22:33:59 +00:00
John Baldwin
32bca5fe03 - Fix panicstr checks to explicitly check against NULL.
- Add a few more panicstr checks so that we don't panic recursively.

Requested by:	sheldonh (2)
2001-07-31 17:44:57 +00:00
John Baldwin
a5dd141db6 Add a missing ~ so that the LO_INITIALIZED flag actually gets turned off
in witness_destroy().
2001-07-20 23:29:25 +00:00
John Baldwin
ec178c1e4c Don't check witness assertions if the lock doesn't use witness or witness
is dead.
2001-06-28 22:22:20 +00:00
John Baldwin
04297fe609 - Add a new witness_assert() to perform arbitrary locking assertions.
- Clean up the KTR tracepoints to be slightly more consistent and useful
- Fix a bug in WITNESS where we would recurse indefinitely and blow the
  stack when acquiring Giant after sleeping with a sleepable lock held.

Reported by:	tanimura (3)
2001-06-27 06:27:29 +00:00
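
As a rough illustration of what such locking assertions buy, here is a hedged mtx_assert()-style usage (the softc and field names are hypothetical; witness_assert() is the general-purpose form described above):

    static void
    example_update(struct example_softc *sc)
    {
        mtx_assert(&sc->sc_mtx, MA_OWNED);      /* caller must hold sc_mtx */
        sc->sc_count++;
    }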
John Baldwin
b7e554f5d6 - Move the 'clk' spinlock below other spin locks since KTR trace events
may need the clock lock for nanotime().
- Add KTR trace events for lock list manipulations and other witness
  operations.
- Use a temporary variable instead of setting the lock list head directly
  and then setting up the links to add a new lock list entry to the lock
  list.  This small race could result in witness "forgetting" about all
  the locks held by this process temporarily during an interrupt.
- Close a more fatal race condition when removing a lock from a list.
  Removing a lock from the list entails both decrementing the count of
  items in this bucket as well as shuffling items in the current bucket up
  a notch to replace the gap left by the removed item.  Wrap these
  operations in a critical section.
2001-06-25 23:17:52 +00:00
Peter Wemm
0978669829 "Fix" the previous initial attempt at fixing TUNABLE_INT(). This time
around, use a common function for looking up and extracting the tunables
from the kernel environment.  This saves duplicating the same function
over and over again.  This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.
2001-06-08 05:24:21 +00:00
Peter Wemm
4422746fdf Back out part of my previous commit. This was a last minute change
and I botched testing.  This is a perfect example of how NOT to do
this sort of thing. :-(
2001-06-07 03:17:26 +00:00
Peter Wemm
81930014ef Make the TUNABLE_*() macros look and behave more consistently like the
SYSCTL_*() macros.  TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.
2001-06-06 22:17:08 +00:00
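
A hedged sketch of the resulting style, pairing a tunable with the sysctl that exposes the same variable; the names below are hypothetical:

    static int example_enable = 1;
    TUNABLE_INT("debug.example_enable", &example_enable);
    SYSCTL_INT(_debug, OID_AUTO, example_enable, CTLFLAG_RW,
        &example_enable, 0, "enable the (hypothetical) example feature");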
John Baldwin
1ad5401134 - Don't panic on a try lock operation for a sleep lock if we hold a spin
lock.  Since we won't actually block on a try lock operation, it's not
  a problem.  Add a comment explaining why it is safe to skip lock order
  checking with try locks.
- Remove the ithread list lock spin lock from the order list.
2001-05-17 22:44:56 +00:00
John Baldwin
9e5620599e Check witness_dead in more functions to avoid panic'ing when assertions
fail due to witness exhausting its internal resources and shutting down.

Reported by:	Szilveszter Adam <sziszi@petra.hos.u-szeged.hu>
Tested by:	David Wolfskill <david@catwhisker.org>
2001-05-11 20:25:29 +00:00
John Baldwin
2d96f0b145 - Move state about lock objects out of struct lock_object and into a new
struct lock_instance that is stored in the per-process and per-CPU lock
  lists.  Previously, the lock lists just kept a pointer to each lock held.
  That pointer is now replaced by a lock instance which contains a pointer
  to the lock object, the file and line of the last acquisition of a lock,
  and various flags about a lock including its recursion count.
- If we sleep while holding a sleepable lock, then mark that lock instance
  as having slept and ignore any lock order violations that occur while
  acquiring Giant when we wake up with slept locks.  This is ok because of
  Giant's special nature.
- Allow witness to differentiate between shared and exclusive locks and
  unlocks of a lock.  Witness will now detect the case when a lock is
  acquired first in one mode and then in another.  Mutexes are always
  locked and unlocked exclusively.  Witness will also now detect the case
  where a process attempts to unlock a shared lock while holding an
  exclusive lock and vice versa.
- Fix a bug in the lock list implementation where we used the wrong
  constant to detect the case where a lock list entry was full.
2001-05-04 17:15:16 +00:00
Alfred Perlstein
aad7597ce0 When panic()'ing because of recursion on a non-recursive mutex, print
out the location it was initially locked.

Ok'd by: jake
2001-04-30 01:01:52 +00:00
John Baldwin
9d4f526475 Spelling nit: acquring -> acquiring.
Reported by:	T. William Wells <bill@twwells.com>
2001-04-21 01:50:32 +00:00
John Baldwin
d8915a7f34 - Whoops, forgot to enable the clock lock in the spin order list on the
alpha.
- Change the Debugger() functions to pass in the real function name.
2001-04-19 15:49:54 +00:00
John Baldwin
3c41f323c9 Check to see if enroll() returns NULL in the witness initialization. This
can happen if witness runs out of resources during initialization or if
witness_skipspin is enabled.

Sleuthing by:	Peter Jeremy <peter.jeremy@alcatel.com.au>
2001-04-17 03:35:38 +00:00
John Baldwin
7a9aa5d372 - Add a comment at the start of the spin locks list.
- The alpha SMP code uses an "ap boot" spinlock as well.
2001-04-13 08:31:38 +00:00
Boris Popov
16162e5789 Avoid endless recursion on panic.
Reviewed by:	jhb
2001-04-10 00:56:19 +00:00
John Baldwin
d53d22496f Maintain a reference count on the witness struct. When the reference
count drops to 0 in witness_destroy, set the w_name and w_file pointers
to point to the string "(dead)" and the w_line field to 0.  This way,
if a mutex of a given name is used only in a module, then as long as
all mutexes in the module are destroyed when the module is unloaded,
witness will not maintain stale references to the mutex's name in the
module's data section causing a panic later on when the w_name or w_file
field's are examined.
2001-04-09 22:34:05 +00:00
John Baldwin
3dcb6789d7 - Split out the functionality of displaying the contents of a single lock
list into a public witness_list_locks() function.  Call this function
  twice in witness_list() instead of using an evil goto.
- Adjust the 'show locks' command to take an optional parameter which
  specifies the pid of a process to list the locks of.  By default the
  locks held by the current process are displayed.
2001-04-06 21:37:52 +00:00
John Baldwin
026e76f43e Close a race condition where if we were obtaining a sleep lock and no spin
locks were held, we could be preempted and switch CPU's in between the time
that we set a variable to the list of spin locks on our CPU and the time
that we checked that variable to ensure no spinlocks were held while
grabbing a sleep lock.  Losing the race resulted in checking some other
CPU's spin lock list and bogusly panicing.
2001-03-28 16:11:51 +00:00
John Baldwin
f7012f592a - s/mutexes/locks/g in appropriate comments.
- Rename the 'show mutexes' ddb command to 'show locks' since it shows
  a list of all the lock objects held by the current process.
2001-03-28 12:39:40 +00:00
John Baldwin
192846463a Rework the witness code to work with sx locks as well as mutexes.
- Introduce lock classes and lock objects.  Each lock class specifies a
  name and set of flags (or properties) shared by all locks of a given
  type.  Currently there are three lock classes: spin mutexes, sleep
  mutexes, and sx locks.  A lock object specifies properties of an
  additional lock along with a lock name and all of the extra stuff needed
  to make witness work with a given lock.  This abstract lock stuff is
  defined in sys/lock.h.  The lockmgr constants, types, and prototypes have
  been moved to sys/lockmgr.h.  For temporary backwards compatability,
  sys/lock.h includes sys/lockmgr.h.
- Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin
  locks held.  By making this per-cpu, we do not have to jump through
  magic hoops to deal with sched_lock changing ownership during context
  switches.
- Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with
  proc->p_sleeplocks, which is a list of held sleep locks including sleep
  mutexes and sx locks.
- Add helper macros for logging lock events via the KTR_LOCK KTR logging
  level so that the log messages are consistent.
- Add some new flags that can be passed to mtx_init():
  - MTX_NOWITNESS - specifies that this lock should be ignored by witness.
    This is used for the mutex that blocks a sx lock for example.
  - MTX_QUIET - this is not new, but you can pass this to mtx_init() now
    and no events will be logged for this lock, so that one doesn't have
    to change all the individual mtx_lock/unlock() operations.
- All lock objects maintain an initialized flag.  Use this flag to export
  a mtx_initialized() macro that can be safely called from drivers.  Also,
  we no longer walk the all_mtx list if MUTEX_DEBUG is defined as witness
  performs the corresponding checks using the initialized flag.
- The lock order reversal messages have been improved to output slightly
  more accurate file and line numbers.
2001-03-28 09:03:24 +00:00
John Baldwin
6283b7d01b - Switch from using save/disable/restore_intr to using critical_enter/exit
and change the u_int mtx_saveintr member of struct mtx to a critical_t
  mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
  an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules.  Change all the
  _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
  parameters and rename them to use a prefix of two underscores.  Inside
  of kern_mutex.c, generate wrapper functions for
  _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
  that are called from modules.  The macros mtx_{un,}lock_{spin,}_flags()
  are mapped to the __mtx_* macros inside of the kernel to inline the
  usual case of mutex operations and map to the internal _mtx_* functions
  in the module case so that modules will use WITNESS and KTR logging if
  the kernel is compiled with support for it.
2001-03-28 02:40:47 +00:00
John Baldwin
5db078a9be Fix mtx_legal2block. The only time that it is bad to block on a mutex is
if we hold a spin mutex, since we can trivially get into deadlocks if we
start switching out of processes that hold spinlocks.  Checking to see if
interrupts were disabled was a sort of cheap way of doing this since most
of the time interrupts were only disabled when holding a spin lock.  At
least on the i386.  To fix this properly, use a per-process counter
p_spinlocks that counts the number of spin locks currently held, and
instead of checking to see if interrupts are disabled in the witness code,
check to see if we hold any spin locks.  Since child processes always
start up with the sched lock magically held in fork_exit(), we initialize
p_spinlocks to 1 for child processes.  Note that proc0 doesn't go through
fork_exit(), so it starts with no spin locks held.

Consulting from:	cp
2001-03-09 07:24:17 +00:00
John Baldwin
1b43703b47 - Add an extra check in priority_propagation() for UP systems to ensure we
don't end up back at ourselves which would indicate deadlock.
- Add the proc lock to the witness dup_list as we may hold more than one
  process lock at a time.
- Don't assert a mutex is owned in _mtx_unlock_sleep() as that is too late.
  We do the checks in the macros instead.
2001-03-07 02:45:15 +00:00
Julian Elischer
a96dcd84d2 Shuffle netgraph mutexes a bit and hold a reference on a node
from the function that is calling the destructor.
2001-02-28 18:49:09 +00:00
Jake Burkholder
5b270b2a55 Sigh. Try to get priorities sorted out. Don't bother trying to
update native priority, it is difficult to get right and likely
to end up horribly wrong.  Use an honestly wrong fixed value
that seems to work; PUSER for user threads, and the interrupt
priority for ithreads.  Set it once when the process is created
and forget about it.

Suggested by:	bde
Pointy hat:	me
2001-02-28 02:53:44 +00:00
Jake Burkholder
be15bfc091 Initialize native priority to PRI_MAX. It was usually 0 which made a
process's priority go through the roof when it released a (contested)
mutex.  Only set the native priority in mtx_lock if it hasn't already
been set.

Reviewed by:	jhb
2001-02-26 23:27:35 +00:00
Jake Burkholder
a10f496636 Remove brackets around variables in a function that used to be
a macro.
2001-02-25 16:18:13 +00:00
Julian Elischer
7433466190 Move netgraph spinlock order entries out of
the #ifdef SMP section. They need to be there for UP too.
2001-02-25 04:56:23 +00:00
John Baldwin
1103f3b05b Grrr, s/INVARIANTS_SUPPORT/INVARIANT_SUPPORT/. 2001-02-24 21:29:32 +00:00
John Baldwin
15ec816acc - Axe RETIP() as it was very i386 specific and unwieldy. Instead, use the
passed in filename and line number in the KTR tracepoint message.
- Even though it is #if 0'd code, change the code to detect that a process
  is an interrupt thread to check p->p_ithd against NULL rather than
  checking non-existent process flags from BSD/OS.
- Use '%p' to print pointers in KTR log messages instead of assuming
  sizeof(int) == sizeof(void *).
- Don't set p_mtxname to NULL when releasing a mutex.  It doesn't hurt
  to leave it set (we don't clear w_mesg for example) and at least at
  one time in the past, there used to be race conditions in the kernel
  that would result in setting this to NULL causing the kernel to
  dereference NULL.
- Make the _mtx_assert() function be compiled in if INVARIANTS_SUPPORT is
  defined rather than if INVARIANTS is defined so that a KLD compiled
  with INVARIANTS that uses mtx_assert() can be used with a kernel that
  just has INVARIANT_SUPPORT compiled in.
2001-02-24 19:36:13 +00:00
Julian Elischer
33338e7370 Add knowledge of the netgraph spinlocks into the Witness code.
Well, at least I think that's how it's done.
2001-02-24 14:29:47 +00:00
John Baldwin
25d209f260 - Use the NOCPU constant.
- Move the ithread spin locks before sched lock and clk in preparation for
  future commits to the ithread code.
2001-02-22 02:12:54 +00:00
Bosko Milekic
2786342687 Change all instances of `CURPROC' and `CURTHD' to `curproc,' in order
to stay consistent.

Requested by: bde
2001-02-12 03:15:43 +00:00
Jake Burkholder
d5a08a6065 Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propagated up into
  the interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propagation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propagate_priority().  It should now have the desired
  effect without needing to also propagate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propagate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propagate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at appropriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
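
To illustrate the "one array, different portions per scheduling class" idea above, a hedged sketch of mapping a priority onto one of 64 queues; the constant names and the divisor are assumptions for illustration, not the values from runq.h:

    #define EX_NQS  64              /* number of run queues */
    #define EX_PPQ  (256 / EX_NQS)  /* priority levels per queue */

    static __inline int
    ex_runq_index(int pri)
    {
        return (pri / EX_PPQ);      /* 0..63; lower index = higher priority */
    }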
Bosko Milekic
5746a1d866 - Place back STR string declarations for lock/unlock strings used for KTR_LOCK
tracing in order to avoid duplication.
- Insert some tracepoints back into the mutex acq/rel code, thus ensuring
  that we can trace all lock acq/rel's again.
- All CURPROC != NULL checks are MPASS()es (under MUTEX_DEBUG) because they
  signify a serious mutex corruption.
- Change up some KASSERT()s to MPASS()es, and vice-versa, depending on the
  type of problem we're debugging (INVARIANTS is used here to check that
  the API is being used properly whereas MUTEX_DEBUG is used to ensure that
  something general isn't happening that will have bad impact on mutex
  locks).

Reminded by: jhb, jake, asmodai
2001-02-11 02:54:16 +00:00
John Baldwin
c75e5182ce Unify the two sleep lock order lists to enforce the process lock ->
uidinfo lock locking order.
2001-02-09 20:52:02 +00:00
John Baldwin
e910ba59fc - Change the 'witness_list' ddb command to 'show mutexes'. Note that this
will only display sleep mutexes held by the current process.
- Clean up some nits in the witness_display() function and add a ddb
  command 'show witness' that dumps the hierarchy and order lists to the
  console.
- Use queue(3) macros where appropriate.
- Resort the spin lock order list so that "com" is before "sched_lock".
  Also, add appropriate #ifdef's around SMP and i386-specific mutexes.
- Add two new mutexes used to protect the ithread lists and tables to the
  order list.

Requested by:	bde (1)
2001-02-09 15:19:41 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
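
A short, hedged usage sketch of the renamed interface above; the locks are hypothetical and assumed to have been initialized elsewhere as MTX_DEF and MTX_SPIN respectively:

    static struct mtx example_sleep_mtx;    /* MTX_DEF-initialized */
    static struct mtx example_spin_mtx;     /* MTX_SPIN-initialized */

    static void
    example(void)
    {
        mtx_lock(&example_sleep_mtx);
        /* ... data protected by the sleep mutex ... */
        mtx_unlock(&example_sleep_mtx);

        mtx_lock_spin(&example_spin_mtx);
        /* ... short critical region under the spin mutex ... */
        mtx_unlock_spin(&example_spin_mtx);

        /* Suppress KTR logging for one particular acquire/release pair. */
        mtx_lock_flags(&example_sleep_mtx, MTX_QUIET);
        mtx_unlock_flags(&example_sleep_mtx, MTX_QUIET);
    }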
John Baldwin
d38b8dbfc8 Add a new ddb command 'witness_list' that lists the mutexes held by
curproc.

Requested by:	peter
2001-01-27 07:51:34 +00:00
Jason Evans
1b367556b5 Convert all simplelocks to mutexes and remove the simplelock implementations. 2001-01-24 12:35:55 +00:00
John Baldwin
8484de7555 - Don't use a union and fun tricks to shave one extra pointer off of struct
mtx right now as it makes debugging harder.  When we are in optimizing
  mode, we can revisit this.
- Fix the KTR trace messages to use %p rather than 0x%p to avoid duplicate
  0x's in KTR output.
- During witness_fixup, release Giant so that witness doesn't get confused.
  Also, grab all_mtx while walking the list of mutexes.
- Remove w_sleep and w_recurse.  Instead, perform checks on mutexes using
  the mutex's mtx_flags field.
- Allow debug.witness_ddb and debug.witness_skipspin to be set from the
  loader.
- Add Giant to the front of existing order_list entries to help ensure
  Giant is always first.
- Add an order entry for the various proc locks.  Note that this only
  helps keep proc in order mostly as the allproc and proctree mutexes are
  only obtained during a lockmgr operation on the specified mutex.
2001-01-24 10:57:01 +00:00
Jason Evans
56771ca74b Print correct file name and line number in mtx_assert().
Noticed by:	jake
2001-01-22 05:56:55 +00:00
Jason Evans
0cde2e34af Move most of sys/mutex.h into kern/kern_mutex.c, thereby making the mutex
inline functions non-inlined.  Hide parts of the mutex implementation that
should not be exposed.

Make sure that WITNESS code is not executed during boot until the mutexes
are fully initialized by SI_SUB_MUTEX (the original motivation for this
commit).

Submitted by:	peter
2001-01-21 22:34:43 +00:00
Jason Evans
527c2fd277 Make the order of the static initializer for all_mtx match the order of
fields in struct mtx.

Found by:	jake
2001-01-21 11:05:02 +00:00
Jason Evans
d1c1b8413e Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.
2001-01-21 07:52:20 +00:00
Jake Burkholder
c1ef8aac9e - Make npx_intr INTR_MPSAFE and move acquiring Giant into the
function itself.
- Remove a hack to allow acquiring Giant from the npx asm trap
  vector.
2001-01-20 02:30:58 +00:00
Bosko Milekic
08812b3925 Implement MTX_RECURSE flag for mtx_init().
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.

The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.

The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.
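
A sketch of such an initialization (assuming the three-argument
mtx_init() of this era; the lock name is made up):

    static struct mtx foo_mtx;

    /* Without MTX_RECURSE, witness panics on the first recursive acquire. */
    mtx_init(&foo_mtx, "foo driver lock", MTX_DEF | MTX_RECURSE);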

Reviewed by: jhb
2001-01-19 01:59:14 +00:00
Jake Burkholder
ef73ae4b0c Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables
other than curproc.
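
A sketch of the access pattern (the fields used here are illustrative):

    int cpu;

    cpu = PCPU_GET(cpuid);                  /* read a per-cpu field */
    PCPU_SET(witness_spin_check, 0);        /* write one */
    /* PCPU_PTR(field) returns a pointer into the current CPU's data. */
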
2001-01-10 04:43:51 +00:00
John Baldwin
562e4ffe86 - Add a new flag MTX_QUIET that can be passed to the various mtx_*
functions.  If this flag is set, then no KTR log messages are issued.
  This is useful for blocking excessive logging, such as with the internal
  mutex used by the witness code.
- Use MTX_QUIET on all of the mtx_enter/exit operations on the internal
  mutex used by the witness code.
- If we are in a panic, don't do witness checks in witness_enter(),
  witness_exit(), and witness_try_enter(), just return.
2000-12-13 21:53:42 +00:00
Jake Burkholder
92cf772d8d - Add code to detect if a system call returns with locks other than Giant
held and panic if so (conditional on witness); a sketch follows this list.
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
  panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
  a system call, which were missing for alpha and ia64.
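
A rough sketch of the check described in the first item (not a verbatim
copy of the change; `p' and `code' are the usual syscall-path locals,
and the witness_list() calling convention is an assumption):

    #ifdef WITNESS
        if (witness_list(p))
            panic("system call %s returning with mutex(s) held",
                syscallnames[code]);
    #endif
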
2000-12-12 01:14:32 +00:00
John Baldwin
428b4b5562 Oops, the witness mutex is a spin lock, so use MTX_SPIN in the call to
mtx_init().  Since the witness code ignores its internal mutex, this
doesn't result in any functional change.
2000-12-12 00:37:18 +00:00
David Malone
7cc0979fd6 Convert more malloc+bzero to malloc+M_ZERO.
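
The pattern being converted, roughly (malloc type and object are
illustrative):

    /* Before: */
    sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK);
    bzero(sc, sizeof(*sc));

    /* After: */
    sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK | M_ZERO);
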
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
John Baldwin
6936206ebd Split the WITNESS and MUTEX_DEBUG options apart so that WITNESS does not
depend on MUTEX_DEBUG.  The MUTEX_DEBUG option turns on extra assertions
and checks to verify that mutexes themselves are implemented properly.
The WITNESS option uses extra checks and diagnostics to verify that other
code is using mutexes properly.
2000-12-01 00:10:59 +00:00
John Baldwin
1bd0eefb4c Fix up priority propagation:
- Use a better test for determining when a process is running.
- Convert some checks to assertions.
- Remove unnecessary tests.
- Save the priority before acquiring a mutex rather than in msleep(9).
2000-11-30 00:51:16 +00:00
John Baldwin
86327ad8a4 Set p_mtxname when blocking on a mutex and clear it when waking up. 2000-11-29 20:17:15 +00:00
John Baldwin
f404050e44 Use an atomic operation with an appropriate memory barrier when releasing
a contested sleep mutex in the case that at least two processes are blocked
on the contested mutex.
2000-11-29 18:41:19 +00:00
John Baldwin
8f838cb563 The sched_lock mutex goes after the sio mutex in the locking order since
a software interrupt can be scheduled in the sio interrupt handler while
the sio mutex is held.
2000-11-29 18:38:14 +00:00
John Baldwin
bbc7a98a31 Save the line number and filename of the last mtx_enter operation for
spin locks.  We already do this for sleep locks.
2000-11-29 18:37:01 +00:00
Alfred Perlstein
0931dcefb3 Move the #define of _KERN_MUTEX_C_ so that it's before any system headers
are included.  System headers can include sys/mutex.h and then certain
macros do not get defined.
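
A sketch of the required ordering (the headers listed are illustrative):

    /* Define the guard before anything that might pull in sys/mutex.h,
     * so that the header sees it on its first (and only) pass. */
    #define _KERN_MUTEX_C_

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/mutex.h>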

Reviewed by: jake
2000-11-26 21:14:17 +00:00
Jake Burkholder
a5d5c61c12 Add uidinfo hash and uidinfo struct to the witness order list. 2000-11-26 15:05:46 +00:00
Jake Burkholder
fa2fbc3dac - Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
  softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
  CALLOUT_MPSAFE, which specifies that a callout handler does not
  need giant.  There is still no way to set this flag when
  registering a callout.

Reviewed by:	-smp@, jlemon
2000-11-19 06:02:32 +00:00
Jake Burkholder
7da6f97772 - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
John Baldwin
20cdcc5b73 Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
John Baldwin
9c36c934a1 Include the right headers to get the DDB #define and the db_active variable. 2000-11-15 22:08:16 +00:00
John Baldwin
59f857e4ea Declare the 'witness_spin_check' properly as a per-CPU variable in the
non-SMP case.
2000-11-15 22:02:05 +00:00
John Baldwin
ecbd8e3710 Don't perform witness checks in witness_enter() during a panic. 2000-11-15 22:00:31 +00:00
John Baldwin
0fe4e534b1 Minor whitespace nit in a comment. 2000-11-10 21:21:20 +00:00
John Baldwin
a5a96a1978 - Use MUTEX_DECLARE() and MTX_COLD for the WITNESS code's internal mutex so
it can function before malloc(9) is up and running (sketched after this list).
- Add two new options WITNESS_DDB and WITNESS_SKIPSPIN.  If WITNESS_SKIPSPIN
  is enabled, then spin mutexes are ignored by the WITNESS code.  If
  WITNESS_DDB is turned on and DDB is compiled into the kernel, then the
  kernel will drop into DDB when either a lock hierarchy violation occurs
  or mutexes are held when going to sleep.
- Add some new sysctls:
  debug.witness_ddb is a read-write sysctl that corresponds to WITNESS_DDB.
     The kernel option merely changes the default value to on at boot.
  debug.witness_skipspin is a read-only sysctl that one can use to determine
     if the kernel was compiled with WITNESS_SKIPSPIN.
- Wipe out the BSD/OS-specific lock order lists.  We get to build our own
  lists now as we add mutexes to the kernel.
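
A sketch of the first item above (the MUTEX_DECLARE() argument list is
an assumption, not verified against the tree):

    /* Statically allocated so it works before malloc(9); MTX_COLD defers
     * the parts of initialization that need malloc(). */
    MUTEX_DECLARE(static, w_mtx);

    mtx_init(&w_mtx, "witness lock", MTX_COLD | MTX_SPIN);
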
2000-10-27 02:59:30 +00:00
John Baldwin
3127162743 Quiet some warnings. 2000-10-25 04:37:54 +00:00
John Baldwin
b67a3e6e85 Propagate the 'const'ness of mutex descriptions to the witness code to
quiet warnings.
2000-10-20 22:45:01 +00:00
John Baldwin
78f0da0373 Actually enable the witness code if the WITNESS kernel option is enabled. 2000-10-20 21:58:11 +00:00
John Baldwin
f5271ebc2f Doh. Fix a 64-bit-ism by using uintptr_t for a temporary lock variable
instead of int.
2000-10-20 20:24:40 +00:00
John Baldwin
36412d79b4 - Make the mutex code almost completely machine independent. This greatly
reduces the maintenance load for the mutex code.  The only MD portions
  of the mutex code are in machine/mutex.h now, which include the assembly
  macros for handling mutexes as well as optionally overriding the mutex
  micro-operations.  For example, we use optimized micro-ops on the x86
  platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option.  In the new code,
  mtx_assert() only depends on INVARIANTS, allowing other kernel developers
  to have working mutex assertions without having to include all of the
  mutex debugging code.  The SMP_DEBUG kernel option has been renamed to
  MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack.  Instead, we dynamically allocate
  separate mtx_debug structures on the fly in mtx_init, except for mutexes
  that are initialized very early in the boot process.  These mutexes
  are declared using a special MUTEX_DECLARE() macro, and use a new
  flag MTX_COLD when calling mtx_init.  This is still somewhat hackish,
  but it is less evil than the mtx_f filler struct, and the mtx struct is
  now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
  operations on the mutex mtx_lock field to make it easier for other archs
  to override/optimize mutex ops if needed.  These new tiny ops also clean
  up the code in some places by replacing long atomic operation function
  calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
  to obtain a sleep mutex.  Calling mi_switch() would bogusly release
  Giant before switching to the next process.  Instead, inline most of the
  code from mi_switch() in the mtx_enter_hard() function.  Note that when
  we finally kill Giant we can back this out and go back to calling
  mi_switch().
2000-10-20 07:26:37 +00:00
John Baldwin
606f8eb27a Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just
use struct mtx, struct witness, and struct witness_blessed.

Requested by:	bde
2000-09-14 20:15:16 +00:00
Jason Evans
5340642a2e Style cleanups. No functional changes. 2000-09-09 23:18:48 +00:00
Jason Evans
46bf3fe5a6 Add file and line arguments to WITNESS_ENTER() and WITNESS_EXIT(), since
__FILE__ and __LINE__ don't get expanded usefully in inline functions.

Add const to all witness*() arguments that are filenames.
2000-09-09 22:43:22 +00:00
Jason Evans
12473b76dc Rename mtx_enter(), mtx_try_enter(), and mtx_exit() and wrap them with cpp
macros that expand to pass filename and line number information.  This is
necessary since we're using inline functions instead of macros now.
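
A sketch of the wrapping pattern described (the internal name
_mtx_enter is an assumption):

    /* The macro captures the call site; the renamed function takes it
     * explicitly. */
    #define mtx_enter(mtxp, type) \
        _mtx_enter((mtxp), (type), __FILE__, __LINE__)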

Add const to the filename pointers passed throughout the mtx and witness
code.
2000-09-08 21:48:06 +00:00
Jason Evans
0384fff8c5 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00