freebsd-dev

Author	SHA1	Message	Date
John Baldwin	122eceef61	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm	2005-07-15 18:17:59 +00:00
Gleb Smirnoff	4f20185860	Add additional newline to debug.mutex.prof.stats header, so that column names are printed exactly above the columns.	2005-04-08 14:14:09 +00:00
John Baldwin	c6a37e8413	Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any affect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case. Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch. This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example). Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more	2005-04-04 21:53:56 +00:00
John Baldwin	33fb8a386e	Rework the optimization for spinlocks on UP to be slightly less drastic and turn it back on. Specifically, the actual changes are now less intrusive in that the _get_spin_lock() and _rel_spin_lock() macros now have their contents changed for UP vs SMP kernels which centralizes the changes. Also, UP kernels do not use _mtx_lock_spin() and no longer include it. The UP versions of the spin lock functions do not use any atomic operations, but simple compares and stores which allow mtx_owned() to still work for spin locks while removing the overhead of atomic operations. Tested on: i386, alpha	2005-01-05 21:13:27 +00:00
John Baldwin	2ff0e645d1	Refine the turnstile and sleep queue interfaces just a bit: - Add a new _lock() call to each API that locks the associated chain lock for a lock_object pointer or wait channel. The _lookup() functions now require that the chain lock be locked via _lock() when they are called. - Change sleepq_add(), turnstile_wait() and turnstile_claim() to lookup the associated queue structure internally via _lookup() rather than accepting a pointer from the caller. For turnstiles, this means that the actual lookup of the turnstile in the hash table is only done when the thread actually blocks rather than being done on each loop iteration in _mtx_lock_sleep(). For sleep queues, this means that sleepq_lookup() is no longer used outside of the sleep queue code except to implement an assertion in cv_destroy(). - Change sleepq_broadcast() and sleepq_signal() to require that the chain lock is already required. For condition variables, this lets the cv_broadcast() and cv_signal() functions lock the sleep queue chain lock while testing the waiters count. This means that the waiters count internal to condition variables is no longer protected by the interlock mutex and cv_broadcast() and cv_signal() now no longer require that the interlock be held when they are called. This lets consumers of condition variables drop the lock before waking other threads which can result in fewer context switches. MFC after: 1 month	2004-10-12 18:36:20 +00:00
Stephan Uphoff	b9a80acadb	Force MUTEX_WAKE_ALL. A race condition in single thread wakeup may break priority inheritance. Tested by: pho Reviewed by: jhb,julian Approved by: sam (mentor) MFC: ASAP	2004-10-12 16:28:18 +00:00
Scott Long	9923b511ed	Turn PREEMPTION into a kernel option. Make sure that it's defined if FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is enabled (code inspired by the PREEMPTION warning in kern_switch.c). This is a possible MT5 candidate.	2004-09-02 18:59:15 +00:00
John-Mark Gurney	000968010a	add options MPROF_BUFFERS and MPROF_HASH_SIZE that adjust the sizes of the mutex profiling buffers. Document them in the man page and in NOTES. Ensure _HASH_SIZE is larger than _BUFFERS with a cpp error.	2004-08-19 06:38:26 +00:00
John Baldwin	bdcfcf5bc4	Cache the value of curthread in the _get_sleep_lock() and _get_spin_lock() macros and pass the value to the associated _mtx_*() functions to avoid more curthread dereferences in the function implementations. This provided a very modest perf improvement in some benchmarks. Suggested by: rwatson Tested by: scottl	2004-08-04 20:18:45 +00:00
Maxime Henrion	9f1b87f106	Instead of calling ia32_pause() conditionally on __i386__ or __amd64__ being defined, define and use a new MD macro, cpu_spinwait(). It only expands to something on i386 and amd64, so the compiled code should be identical. Name of the macro found by: jhb Reviewed by: jhb	2004-08-03 18:44:27 +00:00
Robert Watson	a9abdce44a	Add "options ADAPTIVE_GIANT" which causes Giant to also be treated in an adaptive fashion when adaptive mutexes are enabled. The theory behind non-adaptive Giant is that Giant will be held for long periods of time, and therefore spinning waiting on it is wasteful. However, in MySQL benchmarks which are relatively Giant-free, running Giant adaptive makes an observable difference on SMP (5% transaction rate improvement). As such, make adaptive behavior on Giant an option so it can be more widely benchmarked.	2004-07-27 16:34:48 +00:00
Peter Wemm	b09cb1027b	#ifdef __i386__ -> __i386__ \|\| __amd64__	2004-07-20 02:15:10 +00:00
Pawel Jakub Dawidek	ece2d9891e	Now we have NO_ADAPTIVE_MUTEXES option, so use it here too. Missed by: scottl	2004-07-18 23:27:14 +00:00
Scott Long	701f140800	Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to NO_ADAPTIVE_MUTEXES. This option has been enabled by default on amd64 for quite some time, and has been extensively tested on i386 and sparc64. It shows measurable performance gains in many circumstances, and few negative effects. It would be nice in t he future if adaptive mutexes actually went to sleep after a certain amount of spinning, but that will require quite a bit more testing.	2004-07-18 15:59:03 +00:00
Marcel Moolenaar	2d50560abc	Update for the KDB framework: o Make debugging code conditional upon KDB instead of DDB. o Call kdb_enter() instead of Debugger(). o Call kdb_backtrace() instead of db_print_backtrace() or backtrace(). kern_mutex.c: o Replace checks for db_active with checks for kdb_active and make them unconditional. kern_shutdown.c: o s/DDB_UNATTENDED/KDB_UNATTENDED/g o s/DDB_TRACE/KDB_TRACE/g o Save the TID of the thread doing the kernel dump so the debugger knows which thread to select as the current when debugging the kernel core file. o Clear kdb_active instead of db_active and do so unconditionally. o Remove backtrace() implementation. kern_synch.c: o Call kdb_reenter() instead of db_error().	2004-07-10 21:36:01 +00:00
John Baldwin	0c0b25ae91	Implement preemption of kernel threads natively in the scheduler rather than as one-off hacks in various other parts of the kernel: - Add a function maybe_preempt() that is called from sched_add() to determine if a thread about to be added to a run queue should be preempted to directly. If it is not safe to preempt or if the new thread does not have a high enough priority, then the function returns false and sched_add() adds the thread to the run queue. If the thread should be preempted to but the current thread is in a nested critical section, then the flag TDF_OWEPREEMPT is set and the thread is added to the run queue. Otherwise, mi_switch() is called immediately and the thread is never added to the run queue since it is switch to directly. When exiting an outermost critical section, if TDF_OWEPREEMPT is set, then clear it and call mi_switch() to perform the deferred preemption. - Remove explicit preemption from ithread_schedule() as calling setrunqueue() now does all the correct work. This also removes the do_switch argument from ithread_schedule(). - Do not use the manual preemption code in mtx_unlock if the architecture supports native preemption. - Don't call mi_switch() in a loop during shutdown to give ithreads a chance to run if the architecture supports native preemption since the ithreads will just preempt DELAY(). - Don't call mi_switch() from the page zeroing idle thread for architectures that support native preemption as it is unnecessary. - Native preemption is enabled on the same archs that supported ithread preemption, namely alpha, i386, and amd64. This change should largely be a NOP for the default case as committed except that we will do fewer context switches in a few cases and will avoid the run queues completely when preempting. Approved by: scottl (with his re@ hat)	2004-07-02 20:21:44 +00:00
John Baldwin	bf0acc273a	- Change mi_switch() and sched_switch() to accept an optional thread to switch to. If a non-NULL thread pointer is passed in, then the CPU will switch to that thread directly rather than calling choosethread() to pick a thread to choose to. - Make sched_switch() aware of idle threads and know to do TD_SET_CAN_RUN() instead of sticking them on the run queue rather than requiring all callers of mi_switch() to know to do this if they can be called from an idlethread. - Move constants for arguments to mi_switch() and thread_single() out of the middle of the function prototypes and up above into their own section.	2004-07-02 19:09:50 +00:00
John Baldwin	535eb30962	Add a new kernel option MUTEX_WAKE_ALL that changes the mutex unlock code to awaken all waiters when a contested mutex is released instead of just the highest priority waiter. If the various threads are awakened in sequence then each thread may acquire and release the lock in question without contention resulting in fewer expensive unlock and lock operations. This old behavior of waking just the highest priority is still used if this option is specified. Making the algorithm conditional on a kernel option will allows us to benchmark both cases later and determine which one should be used by default. Requested by: tanimura-san	2004-04-06 19:12:24 +00:00
Robert Watson	94ffb20d72	Add a reset sysctl for mutex profiling: zeros all of the mutex profiling buffers and hash table. This makes it a lot easier to do multiple profiling runs without rebooting or performing gratuitous arithmetic. Sysctl is named debug.mutex.prof.reset. Reviewed by: jake	2004-01-28 22:11:53 +00:00
John Baldwin	8d768e7676	Rework witness_lock() to make it slightly more useful and flexible. - witness_lock() is split into two pieces: witness_checkorder() and witness_lock(). Witness_checkorder() determines if acquiring a specified lock at the time it is called would result in a lock order. It optionally adds a new lock order relationship as well. witness_lock() updates witness's data structures to assume that a lock has been acquired by stick a new lock instance in the appropriate lock instance list. - The mutex and sx lock functions now call checkorder() prior to trying to acquire a lock and continue to call witness_lock() after the acquire is completed. This will let witness catch a deadlock before it happens rather than trying to do so after the threads have deadlocked (i.e. never actually report it). - A new function witness_defineorder() has been added that adds a lock order between two locks at runtime without having to acquire the locks. If the lock order cannot be added it will return an error. This function is available to programmers via the WITNESS_DEFINEORDER() macro which accepts either two mutexes or two sx locks as its arguments. - A few simple wrapper macros were added to allow developers to call witness_checkorder() anywhere as a way of enforcing locking assertions in code that might acquire a certain lock in some situations. The macros are: witness_check_{mutex,shared_sx,exclusive_sx} and take an appropriate lock as the sole argument. - The code to remove a lock instance from a lock list in witness_unlock() was unnested by using a goto to vastly improve the readability of this function.	2004-01-28 20:39:57 +00:00
Jeff Roberson	29bcc4514f	- Add a flags parameter to mi_switch. The value of flags may be SW_VOL or SW_INVOL. Assert that one of these is set in mi_switch() and propery adjust the rusage statistics. This is to simplify the large number of users of this interface which were previously all required to adjust the proper counter prior to calling mi_switch(). This also facilitates more switch and locking optimizations. - Change all callers of mi_switch() to pass the appropriate paramter and remove direct references to the process statistics.	2004-01-25 03:54:52 +00:00
Robert Watson	8dc10be885	Add some basic support for measuring sleep mutex contention to the mutex profiling code. As with existing mutex profiling, measurement is done with respect to mtx_lock() instances in the code, as opposed to specific mutexes. In particular, measure two things: (1) Lock contention. How often did this mtx_lock() call get made and have to sleep (or almost sleep) waiting for the lock. This helps identify the "victims" of contention. (2) Hold contention. How often, while the lock was held by a thread as a result of this mtx_lock(), did another thread try to acquire the same mutex. This helps identify the causes of contention. I'm currently exploring adding measurement of "time waited for the lock", but the current implementation has proven useful to me so far so I figured I'd commit it so others could try it out. Note that this increases the size of mutexes when MUTEX_PROFILING is enabled, so you might find you need to further bump UMA_BOOT_PAGES. Fixes welcome. The once over: des, others	2004-01-25 01:59:27 +00:00
John Baldwin	eac097962f	- Allow mtx_trylock() to recurse on a recursive mutex. Attempts to recurse on a non-recursive mutex will fail but will not trigger any assertions. - Add an assertion to mtx_lock() that one never recurses on a non-recursive mutex. This is mostly useful for the non-WITNESS case. Requested by: deischen, julian, others (1)	2004-01-05 23:09:51 +00:00
John Baldwin	961a7b244d	Add an implementation of turnstiles and change the sleep mutex code to use turnstiles to implement blocking isntead of implementing a thread queue directly. These turnstiles are somewhat similar to those used in Solaris 7 as described in Solaris Internals but are also different. Turnstiles do not come out of a fixed-sized pool. Rather, each thread is assigned a turnstile when it is created that it frees when it is destroyed. When a thread blocks on a lock, it donates its turnstile to that lock to serve as queue of blocked threads. The queue associated with a given lock is found by a lookup in a simple hash table. The turnstile itself is protected by a lock associated with its entry in the hash table. This means that sched_lock is no longer needed to contest on a mutex. Instead, sched_lock is only used when manipulating run queues or thread priorities. Turnstiles also implement priority propagation inherently. Currently turnstiles only support mutexes. Eventually, however, turnstiles may grow two queue's to support a non-sleepable reader/writer lock implementation. For more details, see the comments in sys/turnstile.h and kern/subr_turnstile.c. The two primary advantages from the turnstile code include: 1) the size of struct mutex shrinks by four pointers as it no longer stores the thread queue linkages directly, and 2) less contention on sched_lock in SMP systems including the ability for multiple CPUs to contend on different locks simultaneously (not that this last detail is necessarily that much of a big win). Note that 1) means that this commit is a kernel ABI breaker, so don't mix old modules with a new kernel and vice versa. Tested on: i386 SMP, sparc64 SMP, alpha SMP	2003-11-11 22:07:29 +00:00
John Baldwin	4110951861	If a spin lock is held for too long and WITNESS is enabled, then call witness_display_spinlock() to see if we can find out where the current owner of the spin lock last acquired the lock.	2003-07-31 18:52:18 +00:00
John Baldwin	47b722c1af	When complaining about a sleeping thread owning a mutex, display the thread's pid to make debugging easier for people who don't want to have to use the intended tool for these panics (witness). Indirectly prodded by: kris	2003-07-30 20:42:15 +00:00
John Baldwin	f7ee15901a	- Add comments about the maintenance of the per-thread list of contested locks held by each thread. - Fix a bug in the original BSD/OS code where a contested lock was not properly handed off from the old thread to the new thread when a contested lock with more than one blocked thread was transferred from one thread to another. - Don't use an atomic operation to write the MTX_CONTESTED value to mtx_lock in the aforementioned special case. The memory barriers and exclusion provided by sched_lock are sufficient. Spotted by: alc (2)	2003-07-02 16:14:09 +00:00
David E. O'Brien	677b542ea2	Use __FBSDID().	2003-06-11 00:56:59 +00:00
Poul-Henning Kamp	b82af320cf	Add "" around mutex name to make message less confusing.	2003-05-31 21:11:01 +00:00
John Baldwin	27dad03c97	Use TD_IS_RUNNING() instead of thread_running() in the adaptive mutex code.	2003-04-17 22:28:58 +00:00
Julian Elischer	060563ec50	Move the _oncpu entry from the KSE to the thread. The entry in the KSE still exists but it's purpose will change a bit when we add the ability to lock a KSE to a cpu.	2003-04-10 17:35:44 +00:00
Tim J. Robbins	f949f795aa	Remove unused mtx_lock_giant(), mtx_unlock_giant(), related globals and sysctls.	2003-03-23 11:26:11 +00:00
Poul-Henning Kamp	b4b138c27f	Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a newsted include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.	2003-03-18 08:45:25 +00:00
John Baldwin	75d468ee12	Axe the useless MTX_SLEEPABLE flag. mutexes are not sleepable locks. Nothing used this flag and WITNESS would have panic'd during mtx_init() if anything had.	2003-03-11 20:02:57 +00:00
John Baldwin	1106937d99	Remove safety belt: it is now ok to do a mtx_trylock() on a mutex you already own. The mtx_trylock() will fail however. Enhance the comment at the top of the try lock function to explain this. Requested by: jlemon and his evil netisr locking	2003-03-04 21:32:25 +00:00
John Baldwin	5fa8dd90f9	Miscellaneous cleanups to _mtx_lock_sleep(): - Declare some local variables at the top of the function instead of in a nested block. - Use mtx_owned() instead of masking off bits from mtx_lock manually. - Read the value of mtx_lock into 'v' as a separate line rather than inside an if statement for clarity. This code is hairy enough as it is.	2003-03-04 20:32:41 +00:00
John Baldwin	6b869595c5	Properly assert that mtx_trylock() is not called on a mutex we already owned. Previously the KASSERT would only trigger if we successfully acquired a lock that we already held. However, _obtain_lock() fails to acquire locks that we already hold, so the KASSERT was never checked in the case it was supposed to fail.	2003-03-04 20:30:30 +00:00
Mike Makonnen	0bd5f7979d	Unbreak mutex profiling (at least for me). o Always check for null when dereferencing the filename component. o Implement a try-and-backoff method for allocating memory to dump stats to avoid a spin-lock -> sleep-lock mutex lock order panic with WITNESS. Approved by: des, markm (mentor) Not objected: jhb	2003-02-25 22:28:46 +00:00
Dag-Erling Smørgrav	ecf031c9ad	There's absolutely no need for a struct-within-a-struct, so move the counters out of the inner struct and remove it.	2003-01-21 20:33:27 +00:00
Poul-Henning Kamp	fa669ab7b8	Disable the kernacc() check in mtx_validate() until such time that kernacc does not require Giant. This means that we may miss panics on a class of mutex programming bugs, but only if running with a Chernobyl setting of debug-flags. Spotted by: Pete Carah <pete@ns.altadena.net>	2002-10-25 08:40:20 +00:00
Dag-Erling Smørgrav	f2c1ea8152	Whitespace cleanup.	2002-10-23 10:26:54 +00:00
Robert Drehmel	d08926b1f6	Change the `mutex_prof' structure to use three variables contained in an anonymous structure as counters, instead of an array with preprocessor-defined names for indices. Remove the associated XXX- comment.	2002-10-22 16:06:28 +00:00
Dag-Erling Smørgrav	6d0369001a	Reduce the overhead of the mutex statistics gathering code, try to produce shorter lines in the report, and clean up some minor style issues.	2002-10-21 18:48:28 +00:00
Jeff Roberson	b43179fbe8	- Create a new scheduler api that is defined in sys/sched.h - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c Reviewed by: -arch	2002-10-12 05:32:24 +00:00
John Baldwin	551cf4e150	Rename the mutex thread and process states to use a more generic 'LOCK' name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will be used internally by the turnstile code and will not be specific to mutexes. Making the change now ensures that turnstiles can be dropped in at a later date without affecting the ABI of userland applications.	2002-10-02 20:31:47 +00:00
Julian Elischer	2735483034	uh, commit all of the patch	2002-09-29 23:28:58 +00:00
Julian Elischer	e081731767	commit the version I actually tested.. Submitted by: davidxu	2002-09-29 23:23:25 +00:00
Julian Elischer	9eb1fdea37	Implement basic KSE loaning. This stops a hread that is blocked in BOUND mode from stopping another thread from completing a syscall, and this allows it to release its resources etc. Probably more related commits to follow (at least one I know of) Initial concept by: julian, dillon Submitted by: davidxu	2002-09-29 23:04:34 +00:00
Julian Elischer	71fad9fdee	Completely redo thread states. Reviewed by: davidxu@freebsd.org	2002-09-11 08:13:56 +00:00
John Baldwin	0d975d6341	Add some KASSERT()'s to ensure that we don't perform spin mutex ops on sleep mutexes and vice versa. WITNESS normally should catch this but not everyone uses WITNESS so this is a fallback to catch nasty but easy to do bugs.	2002-09-03 18:25:16 +00:00
Ian Dowse	02bd1bcd2a	Add a new KTR type KTR_CONTENTION, and use it in the mutex code to log the start and end of periods during which mtx_lock() is waiting to acquire a sleep mutex. The log message includes the file and line of both the waiter and the holder. Reviewed by: jhb, jake	2002-08-26 18:39:38 +00:00
John Baldwin	ce39e722ec	Disable optimization of spinlocks on UP kernels w/o debugging for now since it breaks mtx_owned() on spin mutexes when used outside of mtx_assert(). Unfortunately we currently use it in the i386 MD code and in the sio(4) driver. Reported by: bde	2002-07-27 16:54:23 +00:00
Dag-Erling Smørgrav	b61860ad2d	Add mtx_ prefixes to the fields used for mutex profiling, and fix a bug where the profiling code would report the release point instead of the acquisition point. Requested by: bde	2002-07-03 01:50:27 +00:00
Julian Elischer	e602ba25fd	Part 1 of KSE-III The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..	2002-06-29 17:26:22 +00:00
John Baldwin	6a95e08f2f	Replace thread_runnable() with thread_running() as the latter is more accurate. Suggested by: julian	2002-06-04 22:36:24 +00:00
John Baldwin	7fcca6096f	Optimize the adaptive mutex spin a bit. Use a simple while loop with simple reads (and on IA32, a "pause" instruction for each interation of the loop) to spin until either the mutex owner field changes, or the lock owner stops executing. Suggested by: tanimura Tested on: i386	2002-06-04 21:53:48 +00:00
John Baldwin	5853d37d3b	Add a private thread_runnable() macro to make the code more readable and make the KSE diff easier to maintain.	2002-06-04 21:50:02 +00:00
Dag-Erling Smørgrav	db586c8b7c	Make the counters uintmax_ts, and use %ju rather than %llu.	2002-05-23 03:08:42 +00:00
John Baldwin	6b8c698908	Rename pause() to ia32_pause() so it doesn't conflict with the pause() function defined in <unistd.h>. I didn't #ifdef _KERNEL it because the mutex implementation in libpthread will probably need this.	2002-05-22 20:32:39 +00:00
John Baldwin	0228ea4e0b	Rename cpu_pause() to pause(). Originally I was going to make this an MI API with empty cpu_pause() functions on other arch's, but this functionality is definitely unique to IA-32, so I decided to leave it as i386-only and wrap it in #ifdef's. I should have dropped the cpu_ prefix when I made that decision. Requested by: bde	2002-05-22 13:19:22 +00:00
John Baldwin	703fc290fb	Add appropriate IA32 "pause" instructions to improve performanec on Pentium 4's and newer IA32 processors. The "pause" instruction has been verified by Intel to be a NOP on all currently existing IA32 processors prior to the Pentium 4.	2002-05-21 22:26:35 +00:00
John Baldwin	0e54ddadd9	Fix an old cut 'n' paste bug inherited from BSD/OS: don't increment 'i' twice once we are in the long wait stage of spinning on a spin mutex.	2002-05-21 21:27:05 +00:00
John Baldwin	e6302957fe	Whitespace fixup, properly indent the body of an else clause.	2002-05-21 21:13:27 +00:00
John Baldwin	2498cf8c42	Add code to make default mutexes adaptive if the ADAPTIVE_MUTEXES kernel option is used (not on by default). - In the case of trying to lock a mutex, if the MTX_CONTESTED flag is set, then we can safely read the thread pointer from the mtx_lock member while holding sched_lock. We then examine the thread to see if it is currently executing on another CPU. If it is, then we keep looping instead of blocking. - In the case of trying to unlock a mutex, it is now possible for a mutex to have MTX_CONTESTED set in mtx_lock but to not have any threads actually blocked on it, so we need to handle that case. In that case, we just release the lock as if MTX_CONTESTED was not set and return. - We do not adaptively spin on Giant as Giant is held for long times and it slows SMP systems down to a crawl (it was taking several minutes, like 5-10 or so for my test alpha and sparc64 SMP boxes to boot up when they adaptively spinned on Giant). - We only compile in the code to do this for SMP kernels, it doesn't make sense for UP kernels. Tested on: i386, alpha, sparc64	2002-05-21 20:47:11 +00:00
John Baldwin	e8fdcfb57a	Optimize spin mutexes for UP kernels without debugging to just enter and exit critical sections. We only contest on a spin mutex on an SMP kernel running on an SMP machine.	2002-05-21 20:34:28 +00:00
John Baldwin	0c88508a78	Change mtx_init() to now take an extra argument. The third argument is the generic lock type for use with witness. If this argument is NULL then the lock name is used as the lock type. Add a macro for a lock type name for network driver locks.	2002-04-04 20:52:27 +00:00
Dag-Erling Smørgrav	e633070431	Revert to open hashing. It makes the code simpler, and works farily well even when the number of records approaches the size of the hash table. Besides, the previous implementation (using linear probing) was broken :) Also, use the newly introduced MTX_SYSINIT.	2002-04-02 23:26:32 +00:00
John Baldwin	c53c013bae	- Move the MI mutexes sched_lock and Giant from being declared in the various machdep.c's to being declared in kern_mutex.c. - Add a new function mutex_init() used to perform early initialization needed for mutexes such as setting up thread0's contested lock list and initializing MI mutexes. Change the various MD startup routines to call this function instead of duplicating all the code themselves. Tested on: alpha, i386	2002-04-02 22:19:16 +00:00
John Baldwin	7feefcd6ce	Spelling police.	2002-04-02 20:44:30 +00:00
Andrew R. Reiter	c27b56999e	- Add MTX_SYSINIT and SX_SYSINIT as macro glue for allowing sx and mtx locks to be able to setup a SYSINIT call. This helps in places where a lock is needed to protect some data, but the data is not truly associated with a subsystem that can properly initialize it's lock. The macros use the mtx_sysinit() and sx_sysinit() functions, respectively, as the handler argument to SYSINIT(). Reviewed by: alfred, jhb, smp@	2002-04-02 16:05:43 +00:00
Dag-Erling Smørgrav	b784ffe91a	Instead of get_cyclecount(9), use nanotime(9) to record acquisition and release times. Measurements are made and stored in nanoseconds but presented in microseconds, which should be sufficient for the locks for which we actually want this (those that are held long and / or often). Also, rename some variables and structure members to unit-agnostic names.	2002-04-02 14:42:01 +00:00
Dag-Erling Smørgrav	6c35e80948	Mutex profiling code, conditional on the MUTEX_PROFILING option. Adds the following sysctl variables: debug.mutex.prof.enable enable / disable profiling debug.mutex.prof.acquisitions number of mutex acquisitions recorded debug.mutex.prof.records number of acquisition points recorded debug.mutex.prof.maxrecords max number of acquisition points debug.mutex.prof.rejected number of rejections (due to full table) debug.mutex.prof.hashsize hash size debug.mutex.prof.collisions number of hash collisions debug.mutex.prof.stats profiling statistics The code records four numbers for each acquisition point (identified by source file name and line number): longest time held, total time held, number of non-recursive acquisitions, average time held. The measurements are in clock cycles (as returned by get_cyclecount(9)); this may cause measurements on some SMP systems to be unreliable. This can probably be worked around by replacing get_cyclecount(9) by some incarnation of nanotime(9). This work was derived from initial patches by eivind.	2002-04-02 00:01:49 +00:00
Jeff Roberson	f22a4b62f5	Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares. Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone. Approved by: jhb	2002-03-27 09:23:41 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
Peter Wemm	114730b0a8	Tidy up some unused variables	2002-02-20 21:25:44 +00:00
Matthew Dillon	735da6de88	Add kern_giant_ucred to instrument Giant around ucred related operations such a getgid(), setgid(), etc...	2002-02-18 17:51:47 +00:00
Julian Elischer	2c1007663f	In a threaded world, differnt priorirites become properties of different entities. Make it so. Reviewed by: jhb@freebsd.org (john baldwin)	2002-02-11 20:37:54 +00:00
John Baldwin	18fc2ba9ff	Use the mtx_owner() macro in one spot in _mtx_lock_sleep() to make the code easier to read.	2002-02-09 00:12:53 +00:00
John Baldwin	bf07c922ac	Bump the limits for determining if we've held a spinlock too long as they seem to be too short for the 500 Mhz DS20 I'm testing on. The rather arbitrary numbers are rather bogus anyways. We should probably have variables for these limits that are calibrated in the MD startup code somehow.	2002-01-15 14:20:33 +00:00
John Baldwin	c86b6ff551	Change the preemption code for software interrupt thread schedules and mutex releases to not require flags for the cases when preemption is not allowed: The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent switching to a higher priority thread on mutex releease and swi schedule, respectively when that switch is not safe. Now that the critical section API maintains a per-thread nesting count, the kernel can easily check whether or not it should switch without relying on flags from the programmer. This fixes a few bugs in that all current callers of swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from fast interrupt handlers and the swi_sched of softclock needed this flag. Note that to ensure that swi_sched()'s in clock and fast interrupt handlers do not switch, these handlers have to be explicitly wrapped in critical_enter/exit pairs. Presently, just wrapping the handlers is sufficient, but in the future with the fully preemptive kernel, the interrupt must be EOI'd before critical_exit() is called. (critical_exit() can switch due to a deferred preemption in a fully preemptive kernel.) I've tested the changes to the interrupt code on i386 and alpha. I have not tested ia64, but the interrupt code is almost identical to the alpha code, so I expect it will work fine. PowerPC and ARM do not yet have interrupt code in the tree so they shouldn't be broken. Sparc64 is broken, but that's been ok'd by jake and tmm who will be fixing the interrupt code for sparc64 shortly. Reviewed by: peter Tested on: i386, alpha	2002-01-05 08:47:13 +00:00
John Baldwin	7e1f6dfe9d	Modify the critical section API as follows: - The MD functions critical_enter/exit are renamed to start with a cpu_ prefix. - MI wrapper functions critical_enter/exit maintain a per-thread nesting count and a per-thread critical section saved state set when entering a critical section while at nesting level 0 and restored when exiting to nesting level 0. This moves the saved state out of spin mutexes so that interlocking spin mutexes works properly. - Most low-level MD code that used critical_enter/exit now use cpu_critical_enter/exit. MI code such as device drivers and spin mutexes use the MI wrappers. Note that since the MI wrappers store the state in the current thread, they do not have any return values or arguments. - mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is assigned to curthread->td_savecrit during fork_exit(). Tested on: i386, alpha	2001-12-18 00:27:18 +00:00
John Baldwin	ba48b69a13	Remove definition of witness and comment stating that this file implements witness. Witness moved off to subr_witness.c a while ago.	2001-11-15 19:08:55 +00:00
Matthew Dillon	d23f5958bc	Add mtx_lock_giant() and mtx_unlock_giant() wrappers for sysctl management of Giant during the Giant unwinding phase, and start work on instrumenting Giant for the file and proc mutexes. These wrappers allow developers to turn on and off Giant around various subsystems. DEVELOPERS SHOULD NEVER TURN OFF GIANT AROUND A SUBSYSTEM JUST BECAUSE THE SYSCTL EXISTS! General developers should only considering turning on Giant for a subsystem whos default is off (to help track down bugs). Only developers working on particular subsystems who know what they are doing should consider turning off Giant. These wrappers will greatly improve our ability to unwind Giant and test the kernel on a (mostly) subsystem by subsystem basis. They allow Giant unwinding developers (GUDs) to emplace appropriate subsystem and structural mutexes in the main tree and then request that the larger community test the work by turning off Giant around the subsystem(s), without the larger community having to mess around with patches. These wrappers also allow GUDs to boot into a (more likely to be) working system in the midst of their unwinding work and to test that work under more controlled circumstances. There is a master sysctl, kern.giant.all, which defaults to 0 (off). If turned on it overrides ALL other kern.giant sysctls and forces Giant to be turned on for all wrapped subsystems. If turned off then Giant around individual subsystems are controlled by various other kern.giant.XXX sysctls. Code which overlaps multiple subsystems must have all related subsystem Giant sysctls turned off in order to run without Giant.	2001-10-26 20:48:04 +00:00
John Baldwin	7ada587697	The mtx_init() and sx_init() functions bzero'd locks before handing them off to witness_init() making the check for double intializating a lock by testing the LO_INITIALIZED flag moot. Workaround this by checking the LO_INITIALIZED flag ourself before we bzero the lock structure.	2001-10-20 01:22:42 +00:00
John Baldwin	21377ce065	Remove superflous parens after de-macroizing.	2001-09-26 00:05:18 +00:00
John Baldwin	dde96c9933	Since we no longer inline any debugging code in the mutex operations, move all the debugging code into the function versions of the mutex operations in kern_mutex.c. This reduced the __mtx_* macros to simply wrappers of the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in favor of just calling the _{get,rel}_lock_* macros. The tangled hairy mass of macros calling macros is at least a bit more sane now.	2001-09-22 21:19:55 +00:00
John Baldwin	a44f918bf9	Fix a bug in propagate priority: the kse group pointer wasn't being updated in the loop so the new thread always seemd to have the same priority as the original thread and no actual priorities were changed.	2001-09-19 22:52:59 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Bosko Milekic	76dcbd6f9f	Force a commit on kern_mutex.c to explain reason for last commit but while I'm at it also add a comment in mtx_validate() explaining the purpose of the last change. Basically, this fixes booting kernels compiled with MUTEX_DEBUG. What used to happen is before we setidt from init386() [still using BTX idt], we called mtx_init() on several mutex locks, notably Giant and some others. This is a problem for MUTEX_DEBUG because it enables mtx_validate() which calls kernacc(), some of which in turn requires Giant. Fix by calling kernacc() from mtx_validate() only if (!cold).	2001-08-24 23:00:59 +00:00
Bosko Milekic	ab07087e16	* empty log message *	2001-08-24 22:53:45 +00:00
John Baldwin	5cb0fbe47e	If we have already panic'd then don't bother enforcing mutex asserts as things are pretty much shot already and all panic'ing does is hurt our chances of getting a dump. Inspired by: sheldonh	2001-07-31 17:45:50 +00:00
John Baldwin	c4f7a18726	Count the context switch when blocking on a mutex as a voluntary context switch. Count the context switch when preempting the current thread to let a higher priority thread blocked on a mutex we just released run as an involuntary context switch. Reported by: bde	2001-06-25 18:29:32 +00:00
John Baldwin	2d96f0b145	- Move state about lock objects out of struct lock_object and into a new struct lock_instance that is stored in the per-process and per-CPU lock lists. Previously, the lock lists just kept a pointer to each lock held. That pointer is now replaced by a lock instance which contains a pointer to the lock object, the file and line of the last acquisition of a lock, and various flags about a lock including its recursion count. - If we sleep while holding a sleepable lock, then mark that lock instance as having slept and ignore any lock order violations that occur while acquiring Giant when we wake up with slept locks. This is ok because of Giant's special nature. - Allow witness to differentiate between shared and exclusive locks and unlocks of a lock. Witness will now detect the case when a lock is acquired first in one mode and then in another. Mutexes are always locked and unlocked exclusively. Witness will also now detect the case where a process attempts to unlock a shared lock while holding an exclusive lock and vice versa. - Fix a bug in the lock list implementation where we used the wrong constant to detect the case where a lock list entry was full.	2001-05-04 17:15:16 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
John Baldwin	7141f2ad46	Exit and re-enter the critical section while spinning for a spinlock so that interrupts can come in while we are waiting for a lock.	2001-04-17 03:34:52 +00:00
Mark Murray	f0b60d7560	Handle a rare but fatal race invoked sometimes when SIGSTOP is invoked.	2001-04-13 09:29:34 +00:00
John Baldwin	192846463a	Rework the witness code to work with sx locks as well as mutexes. - Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatability, sys/lock.h includes sys/lockmgr.h. - Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches. - Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks. - Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent. - Add some new flags that can be passed to mtx_init(): - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example. - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations. - All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we on longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag. - The lock order reversal messages have been improved to output slightly more accurate file and line numbers.	2001-03-28 09:03:24 +00:00
John Baldwin	6283b7d01b	- Switch from using save/disable/restore_intr to using critical_enter/exit and change the u_int mtx_saveintr member of struct mtx to a critical_t mtx_savecrit. - On the alpha we no longer need a custom _get_spin_lock() macro to avoid an extra PAL call, so remove it. - Partially fix using mutexes with WITNESS in modules. Change all the _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line parameters and rename them to use a prefix of two underscores. Inside of kern_mutex.c, generate wrapper functions for _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore) that are called from modules. The macros mtx_{un,}lock_{spin,}_flags() are mapped to the __mtx_* macros inside of the kernel to inline the usual case of mutex operations and map to the internal _mtx_* functions in the module case so that modules will use WITNESS and KTR logging if the kernel is compiled with support for it.	2001-03-28 02:40:47 +00:00
John Baldwin	5db078a9be	Fix mtx_legal2block. The only time that it is bad to block on a mutex is if we hold a spin mutex, since we can trivially get into deadlocks if we start switching out of processes that hold spinlocks. Checking to see if interrupts were disabled was a sort of cheap way of doing this since most of the time interrupts were only disabled when holding a spin lock. At least on the i386. To fix this properly, use a per-process counter p_spinlocks that counts the number of spin locks currently held, and instead of checking to see if interrupts are disabled in the witness code, check to see if we hold any spin locks. Since child processes always start up with the sched lock magically held in fork_exit(), we initialize p_spinlocks to 1 for child processes. Note that proc0 doesn't go through fork_exit(), so it starts with no spin locks held. Consulting from: cp	2001-03-09 07:24:17 +00:00
John Baldwin	1b43703b47	- Add an extra check in priority_propagation() for UP systems to ensure we don't end up back at ourselves which would indicate deadlock. - Add the proc lock to the witness dup_list as we may hold more than one process lock at a time. - Don't assert a mutex is owned in _mtx_unlock_sleep() as that is too late. We do the checks in the macros instead.	2001-03-07 02:45:15 +00:00

1 2 3 4 5

205 Commits