freebsd-dev

Author	SHA1	Message	Date
Andrew Thompson	018cecb61b	Add functions WITNESS so it can be asserted that the lock is not released for a section of code, this uses WITNESS_NORELEASE() and WITNESS_RELEASEOK() to mark the boundaries. Both functions require the lock to be held when calling. This is intended for scenarios like a bus asserting that the bus lock is not dropped during a driver call. There doesn't appear to be a man page to document this in. Reviewed by: jhb	2009-01-21 04:19:18 +00:00
Kip Macy	3120b9d428	- convert radix node head lock from mutex to rwlock - make radix node head lock not recursive - fix LOR in rtexpunge - fix LOR in rtredirect Reviewed by: sam	2008-12-07 21:15:43 +00:00
Dag-Erling Smørgrav	e11e3f187d	Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.	2008-10-23 20:26:15 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Attilio Rao	ca86d51602	In the actual code for witness_warn: - If there aren't spinlocks held, but there are problems with old sleeplocks, they are not reported. - If the spinlock found is not the only one, problems are not reported. Fix these 2 problems. Reported by: tegge	2008-10-20 19:22:16 +00:00
Attilio Rao	ac0dd8886d	- Fix a race in witness_checkorder() where, between the PCPU_GET() and PCPU_PTR() curthread can migrate on another CPU and get incorrect results. - Fix a similar race into witness_warn(). - Fix the interlock's checks bypassing by correctly using the appropriate children even when the lock_list chunk to be explored is not the first one. - Allow witness_warn() to work with spinlocks too. Bugs found by: tegge Submitted by: jhb, tegge Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-16 12:42:56 +00:00
John Baldwin	4ec9dc4de2	Oops, missed updating a place with with 's/lock1/plock/' when adding interlock support to WITNESS. Specifically, the printf listing the first location when duplicate locks of the same type are acquired. Reported by: pho	2008-10-03 18:13:05 +00:00
John Baldwin	c1fa2e4200	Update description of witness_watch.	2008-09-24 18:47:24 +00:00
Sam Leffler	39297ba455	Make ddb command registration dynamic so modules can extend the command set (only so long as the module is present): o add db_command_register and db_command_unregister to add and remove commands, respectively o replace linker sets with SYSINIT's (and SYSUINIT's) that register commands o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table for registering top-level commands, show operands, and show all operands, respectively While here also: o sort command lists o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases for existing commands o add "show all trace" as an alias for "show alltrace" o add "show all locks" as an alias for "show alllocks" Submitted by: Guillaume Ballet <gballet@gmail.com> (original version) Reviewed by: jhb MFC after: 1 month	2008-09-15 22:45:14 +00:00
Attilio Rao	d56bc17bce	- For any lock list we hold the head in order to reduce allocation from the free list and in this way avoid contention on the w_mtx. In order to make the code simple, we rely on the rule that when the head has not a child it also doesn't have other subsequent entries. Actually this assertion is broken because we can free all the head children and quit witness_unlock() with the head still allocated, with no children and subsequent entries present. Fix this by shifting the head if other entries are present and still freeing the object, but leaving always an head. - Fix witness_thread_has_locks() in order to report, correctly, if the lock list linked to a specific thread has children or not based on the above explained rule. - Fix a printout into DDB's "show alllocks" command in order to show, correctly, the process name that is really what we want. - Fix style(9) for a comment. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> Reported by: Marko Kiiskila <marko dot kiiskila at nokia dot com> Sponsored by: Nokia	2008-09-12 21:44:01 +00:00
John Baldwin	413134305e	Teach WITNESS about the interlocks used with lockmgr. This removes a bunch of spurious witness warnings since lockmgr grew witness support. Before this, every time you passed an interlock to a lockmgr lock WITNESS treated it as a LOR. Reviewed by: attilio	2008-09-10 19:13:30 +00:00
Attilio Rao	988d28340a	- Improve some witness_watch operability in code which does perform both lock tracking and checks, doing just the former ones. - Fix a bug where sysctl utility was printing crazy values when setting a new value for debug.witness.watch [0] [0] Reported by: yongari	2008-08-30 13:20:35 +00:00
Attilio Rao	df3310e04a	- Make witness_watch a 3 state value. 1 means that witness is up and running. 0 means that witness is disabled but that it can be established later again in effective way. -1 means that witness is disabled permanently - Fix a bug causing kernel to panic on witness disabling through witness_watch. lock lists queues were still full of entries and this was causing throubles with debugging stubs (like witness_thread_exit()). Reported by: kris, yongari Sponsored by: Nokia	2008-08-29 15:47:53 +00:00
Attilio Rao	3d06b4b330	Introduce some WITNESS improvements: - Speedup the lock orderings lookup modifying the witness graph from a linked tree to a matrix. A table lookup caches the lock orderings in order to make a O(1) access for them. Any witness object has an unique index withing this lookup cache table. - Reduce the lock contention on w_mtx acquiring it only when the LOR actually happens and not in a sane case. In order to do this don't totally flush lock lists (per-CPU spinlocks list and per-thread sleeplocks list) but check for ll_count anytime we need to have to verify allocations sanity. - Introduce the function witness_thread_exit() in the witness namespace which should verify a thread doesn't hold any witness occurrence why exiting. - Rename the sysctl debug.witness.graphs into debug.witness.fullgraph and add debug.witness.badstacks which prints out stacks for LOR revealed. This is implemented using the stack(9) support, which makes WITNESS to be dependent by the STACK option or by the DDB (including STACK) option. - Fix style(9) for src/sys/kern/subr_witness.c The hash table approach has been developed by Ilya Maykov on the behalf of Isilon Systems which kindly released the patch. Jeff Roberson, ported the patch to -CURRENT and fixed w_mtx contention, on the behalf of Nokia. Submitted by: Ilya Maykov <ivmaykov at gmail dot com> (Isilon Systems), jeff Sponsored by: Nokia	2008-08-13 18:24:22 +00:00
Robert Watson	1a4b919f8e	witness_addgraph() is required even if DDB isn't compiled into the kernel, so exclude it from #ifdef DDB. Submitted by: attilio	2008-07-19 17:47:23 +00:00
Attilio Rao	90356491d7	- Embed the recursion counter for any locking primitive directly in the lock_object, using an unified field called lo_data. - Replace lo_type usage with the w_name usage and at init time pass the lock "type" directly to witness_init() from the parent lock init function. Handle delayed initialization before than witness_initialize() is called through the witness_pendhelp structure. - Axe out LO_ENROLLPEND as it is not really needed. The case where the mutex init delayed wants to be destroyed can't happen because witness_destroy() checks for witness_cold and panic in case. - In enroll(), if we cannot allocate a new object from the freelist, notify that to userspace through a printf(). - Modify the depart function in order to return nothing as in the current CVS version it always returns true and adjust callers accordingly. - Fix the witness_addgraph() argument name prototype. - Remove unuseful code from itismychild(). This commit leads to a shrinked struct lock_object and so smaller locks, in particular on amd64 where 2 uintptr_t (16 bytes per-primitive) are gained. Reviewed by: jhb	2008-05-15 20:10:06 +00:00
Attilio Rao	688b98135c	Add a new witness sysctl which returns the relations between any lock and its children in the form: "parent","child" so that head and bottom of an oriented graph can be easilly detected and various form of diagrams can be build. The sysctl is called debug.witness.graphs and it is read-only; in order to get the list of relations, a simple: #sysctl debug.witness.graphs will do the trick. This approach has been choosen in order to support easilly things like the DOT format and such. Soon, an auto-explicative awk script, which filters simple informations returned by the sysctl and converts them into a real DOT script, will be committed to the repository between examples. Discussed with: rwatson	2008-05-07 21:41:36 +00:00
Robert Watson	8501a69cc9	Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to explicitly select write locking for all use of the inpcb mutex. Update some pcbinfo lock assertions to assert locked rather than write-locked, although in practice almost all uses of the pcbinfo rwlock main exclusive, and all instances of inpcb lock acquisition are exclusive. This change should introduce (ideally) little functional change. However, it lays the groundwork for significantly increased parallelism in the TCP/IP code. MFC after: 3 months Tested by: kris (superset of committered patch)	2008-04-17 21:38:18 +00:00
Attilio Rao	0b0100db88	struct lock_instance and struct lock_list_entry don't need to be in the public namespace for WITNESS as they are only used internally so just move them in the private namespace for the subsystem (with all related supporting definitions).	2008-04-13 01:20:47 +00:00
Attilio Rao	e5f94314ad	- Re-introduce WITNESS support for lockmgr. About the old implementation the only one difference is that lockmgr*() functions now accept LK_NOWITNESS flag which skips ordering for the instanced calling. - Remove an unuseful stub in witness_checkorder() (because the above check doesn't allow ever happening) and allow witness_upgrade() to accept non-try operation too.	2008-04-12 19:57:30 +00:00
Attilio Rao	1859cffaef	Add missing stubs for spinlocks cpuset and intrcnt. Submitted by: kris	2008-04-12 13:51:18 +00:00
Pawel Jakub Dawidek	4582cb68b1	- There is no more "uidinfo struct" mutex. - The "uidinfo hash" lock is now a rwlock. Reminded by: kib	2008-03-17 11:48:40 +00:00
Robert Watson	237fdd787b	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
Jeff Roberson	6617724c5f	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
Rafal Jaworowski	6b7ba54456	Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family. The PQ3 is a high performance integrated communications processing system based on the e500 core, which is an embedded RISC processor that implements the 32-bit Book E definition of the PowerPC architecture. For details refer to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E This port was tested and successfully run on the following members of the PQ3 family: MPC8533, MPC8541, MPC8548, MPC8555. The following major integrated peripherals are supported: * On-chip peripherals bus * OpenPIC interrupt controller * UART * Ethernet (TSEC) * Host/PCI bridge * QUICC engine (SCC functionality) This commit brings the main functionality and will be followed by individual drivers that are logically separate from this base. Approved by: cognet (mentor) Obtained from: Juniper, Semihalf MFp4: e500	2008-03-03 17:17:00 +00:00
Robert Watson	3de213cc00	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
Attilio Rao	4a32616a77	Fix the spinlock static table adding missing spinlocks. - rm_spinlock has turnstile chain as child - srclock has callout and clk as child, found by witness "emulation". Just move it very high in our ranking	2007-11-24 04:32:32 +00:00
Julian Elischer	e01eafef2a	A bunch more files that should probably print out a thread name instead of a process name.	2007-11-14 06:51:33 +00:00
Attilio Rao	c8790f5d09	Fix some entries in the locks static table of witness. In particular: - smp_tlb_mtx is no longer used, so it is axed. - smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in the table, however, has been the source of a false positive LOR reporting with the dt_lock. However, smp rendezvous lock would have had sched_lock there for older lock, so it wasn't still a leaf lock. - allpmaps is only used in ia32 architecture, so it is inserted in the appropriate stub. Addictionally: - kse_zombie_lock is no longer present, so its definition is axed out. - zombie_lock doesn't need to have an exported symbol, so just let's it be declared as static. Tested by: kris Approved by: jeff (mentor) Approved by: re	2007-09-20 20:38:43 +00:00
Marius Strobl	79be8b5082	- Remove zstty spin lock for no longer existing zs(4). - Move the rtc_mtx spin lock out from under #ifdef SMP as it's just not SMP-specific. - Add a new spin lock pcib_mtx for locking "fast" interrupt handlers of host-to-PCI bridge drivers on sparc64.	2007-06-16 23:30:57 +00:00
Sam Leffler	68e8e04e93	Update 802.11 wireless support: o major overhaul of the way channels are handled: channels are now fully enumerated and uniquely identify the operating characteristics; these changes are visible to user applications which require changes o make scanning support independent of the state machine to enable background scanning and roaming o move scanning support into loadable modules based on the operating mode to enable different policies and reduce the memory footprint on systems w/ constrained resources o add background scanning in station mode (no support for adhoc/ibss mode yet) o significantly speedup sta mode scanning with a variety of techniques o add roaming support when background scanning is supported; for now we use a simple algorithm to trigger a roam: we threshold the rssi and tx rate, if either drops too low we try to roam to a new ap o add tx fragmentation support o add first cut at 802.11n support: this code works with forthcoming drivers but is incomplete; it's included now to establish a baseline for other drivers to be developed and for user applications o adjust max_linkhdr et. al. to reflect 802.11 requirements; this eliminates prepending mbufs for traffic generated locally o add support for Atheros protocol extensions; mainly the fast frames encapsulation (note this can be used with any card that can tx+rx large frames correctly) o add sta support for ap's that beacon both WPA1+2 support o change all data types from bsd-style to posix-style o propagate noise floor data from drivers to net80211 and on to user apps o correct various issues in the sta mode state machine related to handling authentication and association failures o enable the addition of sta mode power save support for drivers that need net80211 support (not in this commit) o remove old WI compatibility ioctls (wicontrol is officially dead) o change the data structures returned for get sta info and get scan results so future additions will not break user apps o fixed tx rate is now maintained internally as an ieee rate and not an index into the rate set; this needs to be extended to deal with multi-mode operation o add extended channel specifications to radiotap to enable 11n sniffing Drivers: o ath: add support for bg scanning, tx fragmentation, fast frames, dynamic turbo (lightly tested), 11n (sniffing only and needs new hal) o awi: compile tested only o ndis: lightly tested o ipw: lightly tested o iwi: add support for bg scanning (well tested but may have some rough edges) o ral, ural, rum: add suppoort for bg scanning, calibrate rssi data o wi: lightly tested This work is based on contributions by Atheros, kmacy, sephe, thompsa, mlaier, kevlo, and others. Much of the scanning work was supported by Atheros. The 11n work was supported by Marvell.	2007-06-11 03:36:55 +00:00
Jeff Roberson	bd43e47156	Commit 10/14 of sched_lock decomposition. - Add new spinlocks to support thread_lock() and adjust ordering. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:55:45 +00:00
Attilio Rao	02b0a160dc	Fix some problems introduced with the last descriptors tables locking patch: - Do the correct test for ldt allocation - Drop dt_lock just before to call kmem_free (since it acquires blocking locks inside) - Solve a deadlock with smp_rendezvous() where other CPU will wait undefinitively for dt_lock acquisition. - Add dt_lock in the WITNESS list of spinlocks While applying these modifies, change the requirement for user_ldt_free() making that returning without dt_lock held. Tested by: marcus, tegge Reviewed by: tegge Approved by: jeff (mentor)	2007-05-29 18:55:41 +00:00
Jeff Roberson	8b98fec903	- Move clock synchronization into a seperate clock lock so the global scheduler lock is not involved. sched_lock still protects the sched_clock call. Another patch will remedy this. Contributed by: Attilio Rao <attilio@FreeBSD.org> Tested by: kris, jeff	2007-05-20 22:11:50 +00:00
Joseph Koshy	382d30cdd8	Fix witness(4) warnings about mutex use. Group mutexes used in hwpmc(4) into 3 "types" in the sense of witness(4): - leaf spin mutexes---only one of these should be held at a time, so these mutexes are specified as belonging to a single witness type "pmc-leaf". - `struct pmc_owner' descriptors are protected by a spin mutex of witness type "pmc-owner-proc". Since we call wakeup_one() while holding these mutexes, the witness type of these mutexes needs to dominate that of "sleepq chain" mutexes. - logger threads use a sleep mutex, of type "pmc-sleep". Submitted by: wkoszek (earlier patch)	2007-04-19 08:02:51 +00:00
Pawel Jakub Dawidek	028e84c68b	allprison mutex was converted to sx(9) lock.	2007-04-05 23:32:32 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Wojciech A. Koszek	4850546f51	ng_node and ng_worklist locks both migrated from being spinning locks to adaptive mutexes. Let witness(4) calm down and bring proper types of those locks to the lock order database. Glanced at by: rwatson	2007-04-01 15:48:10 +00:00
John Baldwin	aa89d8cd52	Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes, rwlocks, and sx locks to 'lock_object'.	2007-03-21 21:20:51 +00:00
Robert Watson	7ee76f9d4e	Remove unnecessary privilege and privilege check for WITNESS sysctl. Head nod: jhb	2007-02-20 23:49:31 +00:00
Alan Cox	0e2056ee7f	Remove the vm page queue free mutex from the CDEV order.	2007-02-07 05:43:31 +00:00
Mike Pritchard	af7a34173d	The change to the vm_page_queue_freelist lock from a spin lock to a sleep lock missed the witness code, and the system will panic immediately on boot if WITNESS is enabled. Changed the witness definition to the new type.	2007-02-06 05:51:55 +00:00
Konstantin Belousov	e6a4f4cd40	Record kqueue -> struct mount mtx -> vnode interlock lock order to catch the places where reverse lock order is instantiated. OKed by: jeff	2007-02-02 09:02:18 +00:00
Suleiman Souhlal	e8ac01c56a	Remove hptlock from the static witness table, now that it's a regular sleep mutex.	2007-01-16 22:56:28 +00:00
Kip Macy	7c0435b933	MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile wait (time waited to acquire) and hold times for all kernel locks. If the architecture has a system synchronized TSC, the profiling code will use that - thereby minimizing profiling overhead. Large chunks of profiling code have been moved out of line, the overhead measured on the T1 for when it is compiled in but not enabled is < 1%. Approved by: scottl (standing in for mentor rwatson) Reviewed by: des and jhb	2006-11-11 03:18:07 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Scott Long	988129b824	Introduce a spinlock for synchronizing access to the video output hardware in syscons. This replaces a simple access semaphore that was assumed to be protected by Giant but often was not. If two threads that were otherwise SMP-safe called printf at the same time, there was a high likelyhood that the semaphore would get corrupted and result in a permanently frozen video console. This is similar to what is already done in the serial console drivers.	2006-09-13 15:48:15 +00:00
Suleiman Souhlal	bec31a8fee	The "taskqueue_fast" spinlocks were renamed to "fast_taskqueue" in subr_taskqueue.c:r1.32 Reported by: rdivacky	2006-08-26 11:21:25 +00:00
John Baldwin	de833b7c0c	Use db_lookup_thread() to lookup the thread for the passed in address and change 'show locks' to only list the locks for a given thread rather than for all the threads in the process containing a specified thread.	2006-04-25 20:24:23 +00:00
Marius Strobl	fa63296aba	Remove last vestiges of sab(4).	2006-04-25 19:43:53 +00:00
Marcel Moolenaar	07c8931358	Add the scc_hwmtx spin mutex, defined by scc(4).	2006-04-07 22:15:54 +00:00
John Baldwin	6b81555744	Axe KTR_ALQ_MASK now that KTR_WITNESS is off unless you hack an #ifdef in subr_witness.c. I did add a comment in subr_witness.c noting that KTR_WITNESS is incompatible with KTR_ALQ.	2006-01-25 14:57:23 +00:00
John Baldwin	2b604e82b2	- Add a new KTR_SUBSYS in place of KTR_SPARE1 to serve as a subsystem placeholder similar to KTR_DEV. Explain the use of KTR_DEV and KTR_SUBSYS in a comment as well. - Retire KTR_WITNESS and instead have KTR_WITNESS default to off but use KTR_SUBSYS if it is enabled.	2006-01-24 22:23:45 +00:00
John Baldwin	83a81bcb14	Add a new file (kern/subr_lock.c) for holding code related to struct lock_obj objects: - Add new lock_init() and lock_destroy() functions to setup and teardown lock_object objects including KTR logging and registering with WITNESS. - Move all the handling of LO_INITIALIZED out of witness and the various lock init functions into lock_init() and lock_destroy(). - Remove the constants for static indices into the lock_classes[] array and change the code outside of subr_lock.c to use LOCK_CLASS to compare against a known lock class. - Move the 'show lock' ddb function and lock_classes[] array out of kern_mutex.c over to subr_lock.c.	2006-01-17 16:55:17 +00:00
John Baldwin	3c6decc327	Trim another pointer from struct lock_object (and thus from struct mtx and struct sx). Instead of storing a direct pointer to a our lock_class struct in lock_object, reserve 4 bits in the lo_flags field to serve as an index into a global lock_classes array that contains pointers to the lock classes. Only debugging code such as WITNESS or INVARIANTS checks and KTR logging need to access the lock_class member, so this shouldn't add any overhead to production kernels. It might add some slight overhead to kernels using those debug options however. As with the previous set of changes to lock_object, this is going to completely obliterate the kernel ABI, so be sure to recompile all your modules.	2006-01-06 18:07:32 +00:00
John Baldwin	b0e9883e2f	Teach WITNESS_SAVE() and WITNESS_RESTORE() to work with spin locks instead of only sleep locks.	2005-12-29 20:54:25 +00:00
John Baldwin	0a46ed7d56	Fix a deadlock I introduced with the recently added printf to warn about spin locks that are not in the static order list. It is not safe to call printf while holding the witness spin mutex since the console drivers that back printf may need to use their own spin locks which would try to talk to witness when they were locked. Given this, it is possible for one CPU to lock a console driver lock (such as sio) which then tries to lock the witness lock while another CPU is doing the printf while holding the witness lock. Fix this by moving the printf outside of the witness lock. All other printf's in witness are already correct. MFC after: 3 days	2005-12-29 20:53:01 +00:00
John Baldwin	5d2162b2f8	Tweak witness handling of lock object to shave 2 pointers off of each lock object (and thus off of each mutex and sx lock): - Rename the all_locks list to pending_locks and only put locks initialized before SI_SUB_WITNESS on the list so that the SI_SUB_WITNESS can add them to witness once it starts up. - Now that pending_locks is only used during early startup, change it from a TAILQ to an STAILQ. This removes a pointer from the STAILQ_ENTRY in struct lock_object. - Since the pending_locks list is only used during the single-threaded early boot it no longer needs to be protected by a mutex, so remove all_mtx. - Since the lo_list member of struct lock_object is now only used during early boot before witness is running, collapse lo_list and lo_witness into a union. This shaves the second pointer off of struct lock_object. - Axe lock_cur_cnt and lock_max_cnt. With these changes, struct mtx shrinks from 36 to 28 bytes on 32-bit platforms and from 72 to 56 bytes on 64-bit platforms. Note that this commit will completely and utterly destroy the kernel ABI, so no MFC. Tested on: alpha, amd64, i386, sparc64	2005-12-05 20:45:24 +00:00
John Baldwin	e0f66ef861	Reorganize the interrupt handling code a bit to make a few things cleaner and increase flexibility to allow various different approaches to be tried in the future. - Split struct ithd up into two pieces. struct intr_event holds the list of interrupt handlers associated with interrupt sources. struct intr_thread contains the data relative to an interrupt thread. Currently we still provide a 1:1 relationship of events to threads with the exception that events only have an associated thread if there is at least one threaded interrupt handler attached to the event. This means that on x86 we no longer have 4 bazillion interrupt threads with no handlers. It also means that interrupt events with only INTR_FAST handlers no longer have an associated thread either. - Renamed struct intrhand to struct intr_handler to follow the struct intr_foo naming convention. This did require renaming the powerpc MD struct intr_handler to struct ppc_intr_handler. - INTR_FAST no longer implies INTR_EXCL on all architectures except for powerpc. This means that multiple INTR_FAST handlers can attach to the same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach to the same interrupt. Sharing INTR_FAST handlers may not always be desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun either. Drivers can always still use INTR_EXCL to ask for an interrupt exclusively. The way this sharing works is that when an interrupt comes in, all the INTR_FAST handlers are executed first, and if any threaded handlers exist, the interrupt thread is scheduled afterwards. This type of layout also makes it possible to investigate using interrupt filters ala OS X where the filter determines whether or not its companion threaded handler should run. - Aside from the INTR_FAST changes above, the impact on MD interrupt code is mostly just 's/ithread/intr_event/'. - A new MI ddb command 'show intrs' walks the list of interrupt events dumping their state. It also has a '/v' verbose switch which dumps info about all of the handlers attached to each event. - We currently don't destroy an interrupt thread when the last threaded handler is removed because it would suck for things like ppbus(8)'s braindead behavior. The code is present, though, it is just under #if 0 for now. - Move the code to actually execute the threaded handlers for an interrrupt event into a separate function so that ithread_loop() becomes more readable. Previously this code was all in the middle of ithread_loop() and indented halfway across the screen. - Made struct intr_thread private to kern_intr.c and replaced td_ithd with a thread private flag TDP_ITHREAD. - In statclock, check curthread against idlethread directly rather than curthread's proc against idlethread's proc. (Not really related to intr changes) Tested on: alpha, amd64, i386, sparc64 Tested on: arm, ia64 (older version of patch by cognet and marcel)	2005-10-25 19:48:48 +00:00
John Baldwin	cf23efc12a	Don't panic if a spin lock is initialized that isn't in our static order list. Just warn about it instead. Requested by: scottl MFC after: 1 day	2005-10-24 20:14:24 +00:00
John Baldwin	971d0ad835	Spell hierarchy correctly in comments. Submitted by: Wojciech A. Koszek dunstan at freebsd dot czest dot pl	2005-10-24 15:57:27 +00:00
John Baldwin	2a2b58faa4	Add entry for the spin mutex used by the hptmv(4) driver. MFC after: 1 day Tested by: Philip Kizer pckizer at nostrum dot com	2005-10-20 14:49:59 +00:00
John Baldwin	d27acf445e	Add the spin lock used by the binary nvidia driver to the static lock order list so that WITNESS and the driver play together nicely. Tested by: Harald Schmalzbauer MFC after: 3 days	2005-09-26 18:30:12 +00:00
John Baldwin	b27dbfbf4a	- Enforce an implicit lock order that Giant cannot be locked while holding any other non-sleepable lock. In plain English: Giant comes before all other mutexes. - Add some extra description to the lock order reversal printf's to indicate when a reversal is triggered by a hard-coded implicit rule. Requested by: truckman (2) MFC after: 1 week	2005-09-15 19:07:14 +00:00
Don Lewis	908b3deb2b	Relocate witness_levelall(), witness_leveldescendents(), and witness_displaydescendants() so that they are protected by "#ifdef DDB/#endif" to unbreak kernels not using "option DDB". MFC after: 3 weeks	2005-09-11 07:57:06 +00:00
John Baldwin	acc0265cc2	- Add some comments to some of the static lock orders. Don't explicitly link proctree and allproc to Giant since that order is already implicitly enforced. - Use a goto to handle the case where we want to enforce a reversal before calling isitmydescendant() in witness_checkorder() so that the logic is easier to follow and so that it is easier to add more forced-reversal cases in the future. MFC after: 3 days	2005-09-02 20:23:49 +00:00
Don Lewis	4053cae340	Track all lock relationships instead of pruning direct relationships if an indirect relationship exists (keep both A->B->C and A->C). This allows witness_checkorder() to use isitmychild() instead of the much more expensive isitmydescendant() to check for valid lock ordering. Don't do an expensive tree walk to update the w_level values when the tree is updated. Only update the w_level values when using the debugger to display the tree. Nuke the experimental "witness_watch > 1" mode that only compared w_level for the two locks. This information is no longer maintained at run time, and the use of isitmychild() in witness_checkorder should bring performance close enough to the acceptable level that this hack is not needed. Report witness data structure allocation statistics under the debug.witness sysctl. Reviewed by: jhb MFC after: 30 days	2005-08-25 03:47:37 +00:00
Robert Watson	ae018704a1	Add an order between UDP inpcb locks and the IPv4 multicast address list lock, as there has been a report that an alternative lock order is getting introduced. This should help ferret it out. Reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>	2005-08-09 13:27:50 +00:00
Robert Watson	dd5a318ba3	Introduce in_multi_mtx, which will protect IPv4-layer multicast address lists, as well as accessor macros. For now, this is a recursive mutex due code sequences where IPv4 multicast calls into IGMP calls into ip_output(), which then tests for a multicast forwarding case. For support macros in in_var.h to check multicast address lists, assert that in_multi_mtx is held. Acquire in_multi_mtx around iteration over the IPv4 multicast address lists, such as in ip_input() and ip_output(). Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses, as well as over the manipulation of ifnet multicast address lists in order to keep the two layers in sync. Lock down accesses to IPv4 multicast addresses in IGMP, or assert the lock when performing IGMP join/leave events. Eliminate spl's associated with IPv4 multicast addresses, portions of IGMP that weren't previously expunged by IGMP locking. Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded lock order in WITNESS, in that order. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 10 days	2005-08-03 19:29:47 +00:00
Marius Strobl	fce21e7e25	After some input from bde@ and rereading the datasheet use a MTX_SPIN mutex instead of a MTX_DEF one in order to defer preemption while reading the date and time registers. If we don't manage to read them within the time slot where we are guaranteed that no updates occur we might actually read them during an update in which case the output is undefined.	2005-06-04 23:24:50 +00:00
Jeff Roberson	17314e6286	- Define the real lock order with cdev and a few vm/vfs related locks. This can be removed once cdev no longer calls free() with the cdev lock held.	2005-04-22 22:43:31 +00:00
Jeff Roberson	57f66be038	- Check LO_DUPOK as well as LOP_DUPOK when determining whether we should warn about duplicate acquires. Sponsored by: Isilon Systems, Inc.	2005-04-22 22:39:46 +00:00
Vinod Kashyap	f0c1dee27f	The latest release of the FreeBSD driver (twa) for 3ware's 9xxx series controllers. This corresponds to the 9.2 release (for FreeBSD 5.2.1) on the 3ware website. Highlights of this release are: 1. The driver has been re-architected to use a "Common Layer" (all tw_cl* files), which is a consolidation of all OS-independent parts of the driver. The FreeBSD OS specific portions of the driver go into an "OS Layer" (all tw_osl* files). This re-architecture is to achieve better maintainability, consistency of behavior across OS's, and better portability to new OS's (drivers for new OS's can be written by just adding an OS Layer that's specific to the OS, by complying to a "Common Layer Programming Interface" API. 2. The driver takes advantage of multiple processors. 3. The driver has a new firmware image bundled, the new features of which include Online Capacity Expansion and multi-lun support, among others. More details about 3ware's 9.2 release can be found here: http://www.3ware.com/download/Escalade9000Series/9.2/9.2_Release_Notes_Web.pdf Since the Common Layer is used across OS's, the FreeBSD specific include path for header files (/sys/dev/twa) is not part of the #include pre-processor directive in any of the source files. For being able to integrate twa into the kernel despite this, Makefile.<arch> has been changed to add the include path to CFLAGS. Reviewed by: scottl	2005-04-12 22:07:11 +00:00
Pawel Jakub Dawidek	c19618dd7d	CDEV lock should be before 'system map' lock. Hardcode this order to help track down reported LOR. LOR reported by: Thierry Herbelot <thierry@herbelot.com> LOR info: http://sources.zabbadoz.net/freebsd/lor.html#080	2005-04-09 13:32:01 +00:00
Pawel Jakub Dawidek	cd104dd3c1	Add a missing terminator. Confirmed by: rwatson	2005-04-09 11:31:31 +00:00
Robert Watson	53358cc907	Document, via WITNESS, that the NFS server mutex falls ahead of the socket buffer mutexes.	2005-03-09 21:38:53 +00:00
Bill Paul	58a6edd121	When you call MiniportInitialize() for an 802.11 driver, it will at some point result in a status event being triggered (it should be a link down event: the Microsoft driver design guide says you should generate one when the NIC is initialized). Some drivers generate the event during MiniportInitialize(), such that by the time MiniportInitialize() completes, the NIC is ready to go. But some drivers, in particular the ones for Atheros wireless NICs, don't generate the event until after a device interrupt occurs at some point after MiniportInitialize() has completed. The gotcha is that you have to wait until the link status event occurs one way or the other before you try to fiddle with any settings (ssid, channel, etc...). For the drivers that set the event sycnhronously this isn't a problem, but for the others we have to pause after calling ndis_init_nic() and wait for the event to arrive before continuing. Failing to wait can cause big trouble: on my SMP system, calling ndis_setstate_80211() after ndis_init_nic() completes, but _before_ the link event arrives, will lock up or reset the system. What we do now is check to see if a link event arrived while ndis_init_nic() was running, and if it didn't we msleep() until it does. Along the way, I discovered a few other problems: - Defered procedure calls run at PASSIVE_LEVEL, not DISPATCH_LEVEL. ntoskrnl_run_dpc() has been fixed accordingly. (I read the documentation wrong.) - Similarly, the NDIS interrupt handler, which is essentially a DPC, also doesn't need to run at DISPATCH_LEVEL. ndis_intrtask() has been fixed accordingly. - MiniportQueryInformation() and MiniportSetInformation() run at DISPATCH_LEVEL, and each request must complete before another can be submitted. ndis_get_info() and ndis_set_info() have been fixed accordingly. - Turned the sleep lock that guards the NDIS thread job list into a spin lock. We never do anything with this lock held except manage the job list (no other locks are held), so it's safe to do this, and it's possible that ndis_sched() and ndis_unsched() can be called from DISPATCH_LEVEL, so using a sleep lock here is semantically incorrect. Also updated subr_witness.c to add the lock to the order list.	2005-03-07 03:05:31 +00:00
Robert Watson	5324bda309	When DDB is not defined, don't implement witness_thread_has_locks() and witness_proc_has_locks(), as they are unused, which results in a compiler error. This problem was introduced with the implementation of "show alllocks". Spotted by: Artem Kuchin <matrix at itlegion dot ru>	2005-01-22 21:14:21 +00:00
John Baldwin	83ae089aab	- Up the WITNESS_COUNT macro from 200 to 1024 to support the growing number of lock types in the kernel. This results in an increase of witness data usage from ~145k to ~280k on i386 for kernels with 'options WITNESS'. - Remove the unused witness malloc bucket. Submitted by: Michal Mertl mime at traveller dot cz (1)	2004-12-28 21:21:27 +00:00
Robert Watson	6ce8940626	Attempt to slightly refine the print out from "show alllocks" -- list the process and thread numbers/names on the same line rather than on separate lines, and print the thread pointer not just the tid.	2004-12-27 10:47:08 +00:00
Robert Watson	b6dd9ef2fe	Add "show alllocks" command to DDB, which dumps a list of processes and threads currently holding sleep mutexes (and spin mutexes for curthread). This can be quite useful in looking for a lock condition summary for a system, as it avoids manually iterating through threads and processes to find all the interesting locks. NB: "alllocks" is up there with "lockedvnods" for a bad argument for show. MFC after: 2 weeks	2004-12-26 22:52:24 +00:00
John-Mark Gurney	e211cad004	clean up some tunables that should of been removed a while ago...	2004-11-09 06:46:14 +00:00
Robert Watson	cc34aa2094	Add entropy harvest mutex to hard-coded spin lock witness lock order, remove previous entropy harvesting mutex names as they are no longer present. Commit to this file was ommitted when randomdev_soft.c:1.5 was made. Feet shot: Robert Huff <roberthuff at rcn dot com>	2004-10-11 08:26:18 +00:00
Brian Feldman	41f57cbc8d	Don't "implicitly order all sleep locks before spin locks" in witness when the spin lock in question isn't -- it's the critical_enter() that KDB set. No more panic in DDB for console -> syscons -> tty -> knote operations.	2004-10-09 08:16:37 +00:00
Robert Watson	030c6fb156	Hard code witness lock order for BPF locks.	2004-09-09 05:01:37 +00:00
John-Mark Gurney	80e6bbe95b	make witness it's own sysctl branch instead of using _ to do this. I have left the old tunables in to give people a few days to transition their loader.conf and sysctl.conf's over to the new names.. MFC after: 5 days	2004-09-06 23:27:28 +00:00
John Baldwin	0e5a07e533	Remove a potential deadlock on i386 SMP by changing the lazypmap ipi and spin-wait code to use the same spin mutex (smp_tlb_mtx) as the TLB ipi and spin-wait code snippets so that you can't get into the situation of one CPU doing a TLB shootdown to another CPU that is doing a lazy pmap shootdown each of which are waiting on each other. With this change, only one of the CPUs would do an IPI and spin-wait at a time.	2004-08-04 20:31:19 +00:00
Robert Watson	3ed994c6c3	Add netatalk mutexes to hard-coded WITNESS lock order.	2004-07-25 20:16:51 +00:00
Marcel Moolenaar	eba21ad501	Update for the KDB framework: o Make debugging code conditional upon KDB instead of DDB. o s/WITNESS_DDB/WITNESS_KDB/g o s/witness_ddb/witness_kdb/g o Rename the debug.witness_ddb sysctl to debug.witness_kdb. o Call kdb_backtrace() instead of backtrace(). o Call kdb_enter() instead Debugger(). o Assert kdb_active instead of db_active.	2004-07-10 21:42:16 +00:00
John Baldwin	776b99ee1a	Check the lock lists to see if they are empty directly rather than assigning a pointer to the list and then dereferencing the pointer as a second step. When the first spin lock is acquired, curthread is not in a critical section so it may be preempted and would end up using another CPUs lock list instead of its own. When this code was in witness_lock() this sequence was safe as curthread was in a critical section already since witness_lock() is called after the lock is acquired. Tested by: Daniel Lang dl at leo.org	2004-07-09 17:46:27 +00:00
Robert Watson	cce9e3f104	Introduce socket and UNIX domain socket locks into hard-coded lock order definition for witness. Send lock before receive lock, and socket locks after accept but before select: filedesc -> accept -> so_snd -> so_rcv -> sellck All routing locks after send lock: so_rcv -> radix node head All protocol locks before socket locks: unp -> so_snd udp -> udpinp -> so_snd tcp -> tcpinp -> so_snd	2004-06-13 00:23:03 +00:00
John Baldwin	ba8b26f960	- Comment out NULL, NULL barrier for Unix domain sockets section as the double NULL entries signal Witness to stop processing the array of order entries meaning none of the spin locks are added resulting in panics on boot. - Add a missing NULL, NULL terminator to the Slip locks list to keep them separate from the spin locks.	2004-06-03 20:07:44 +00:00
Robert Watson	d97e0534fa	Expand the hard-coded WITNESS lock order to include the following relationships: Sockets: filedesc->accept->sellck Routing: radix node head->rtentry->ifaddr UDP: udp->udpinp TCP: tcp->tcpinp SLIP: slip_mtx->slip sc_mtx Drop in a place holder section for UNIX domain sockets. Various sections to be expanded over the next few days.	2004-06-02 23:28:06 +00:00
Alfred Perlstein	12e9993f65	Emit a traceback when witness_trace is set and witness_warn() is called and triggers (typically caused by sleeping with a non-sleepable lock). Reviewed by: jhb	2004-03-23 00:32:27 +00:00
John Baldwin	dd75b0a90d	Add an implementation of a generic sleep queue abstraction that is used to queue threads sleeping on a wait channel similar to how turnstiles are used to queue threads waiting for a lock. This subsystem will be used as the backend for sleep/wakeup and condition variables initially. Eventually it will also be used to replace the ithread-specific iwait thread inhibitor. Sleep queues are also not locked by sched_lock, so this splits sched_lock up a bit further increasing concurrency within the scheduler. Sleep queues also natively support timeouts on sleeps and interruptible sleeps allowing for the reduction of a lot of duplicated code between the sleep/wakeup and condition variable implementations. For more details on the sleep queue implementation, check the comments in sys/sleepqueue.h and kern/subr_sleepqueue.c.	2004-02-27 18:33:09 +00:00
John Baldwin	3e9ac3ebf2	Remove a bogus assertion. Noticed by: bde Pointy hat to: jhb	2004-02-03 15:14:27 +00:00
John Baldwin	9c9c52a3ed	- Assert that witness_cold is not true in enroll(). - Only check witness_watch once in enroll(). Reported by: ru (2)	2004-02-02 22:15:17 +00:00
John Baldwin	8d768e7676	Rework witness_lock() to make it slightly more useful and flexible. - witness_lock() is split into two pieces: witness_checkorder() and witness_lock(). Witness_checkorder() determines if acquiring a specified lock at the time it is called would result in a lock order. It optionally adds a new lock order relationship as well. witness_lock() updates witness's data structures to assume that a lock has been acquired by stick a new lock instance in the appropriate lock instance list. - The mutex and sx lock functions now call checkorder() prior to trying to acquire a lock and continue to call witness_lock() after the acquire is completed. This will let witness catch a deadlock before it happens rather than trying to do so after the threads have deadlocked (i.e. never actually report it). - A new function witness_defineorder() has been added that adds a lock order between two locks at runtime without having to acquire the locks. If the lock order cannot be added it will return an error. This function is available to programmers via the WITNESS_DEFINEORDER() macro which accepts either two mutexes or two sx locks as its arguments. - A few simple wrapper macros were added to allow developers to call witness_checkorder() anywhere as a way of enforcing locking assertions in code that might acquire a certain lock in some situations. The macros are: witness_check_{mutex,shared_sx,exclusive_sx} and take an appropriate lock as the sole argument. - The code to remove a lock instance from a lock list in witness_unlock() was unnested by using a goto to vastly improve the readability of this function.	2004-01-28 20:39:57 +00:00
Ruslan Ermilov	33fe8fd0df	Register the uart(4)'s spin lock with witness(4).	2004-01-25 15:04:37 +00:00
Mark Murray	4e3a7a14d9	Fix a major faux pas of mine. I was causing 2 very bad things to happen in interrupt context; 1) sleep locks, and 2) malloc/free calls. 1) is fixed by using spin locks instead. 2) is fixed by preallocating a FIFO (implemented with a STAILQ) and using elements from this FIFO instead. This turns out to be rather fast. OK'ed by: re (scottl) Thanks to: peter, jhb, rwatson, jake Apologies to: *	2003-11-20 15:35:48 +00:00
Peter Wemm	0d2a298904	Initial landing of SMP support for FreeBSD/amd64. - This is heavily derived from John Baldwin's apic/pci cleanup on i386. - I have completely rewritten or drastically cleaned up some other parts. (in particular, bootstrap) - This is still a WIP. It seems that there are some highly bogus bioses on nVidia nForce3-150 boards. I can't stress how broken these boards are. I have a workaround in mind, but right now the Asus SK8N is broken. The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed. - Most of my testing has been with SCHED_ULE. SCHED_4BSD works. - the apic and acpi components are 'standard'. - If you have an nVidia nForce3-150 board, you are stuck with 'device atpic' in addition, because they somehow managed to forget to connect the 8254 timer to the apic, even though its in the same silicon! ARGH! This directly violates the ACPI spec.	2003-11-17 08:58:16 +00:00
Bruce Evans	416ab90e6b	Localized the cy driver's locking.	2003-11-16 00:55:54 +00:00
John Baldwin	961a7b244d	Add an implementation of turnstiles and change the sleep mutex code to use turnstiles to implement blocking isntead of implementing a thread queue directly. These turnstiles are somewhat similar to those used in Solaris 7 as described in Solaris Internals but are also different. Turnstiles do not come out of a fixed-sized pool. Rather, each thread is assigned a turnstile when it is created that it frees when it is destroyed. When a thread blocks on a lock, it donates its turnstile to that lock to serve as queue of blocked threads. The queue associated with a given lock is found by a lookup in a simple hash table. The turnstile itself is protected by a lock associated with its entry in the hash table. This means that sched_lock is no longer needed to contest on a mutex. Instead, sched_lock is only used when manipulating run queues or thread priorities. Turnstiles also implement priority propagation inherently. Currently turnstiles only support mutexes. Eventually, however, turnstiles may grow two queue's to support a non-sleepable reader/writer lock implementation. For more details, see the comments in sys/turnstile.h and kern/subr_turnstile.c. The two primary advantages from the turnstile code include: 1) the size of struct mutex shrinks by four pointers as it no longer stores the thread queue linkages directly, and 2) less contention on sched_lock in SMP systems including the ability for multiple CPUs to contend on different locks simultaneously (not that this last detail is necessarily that much of a big win). Note that 1) means that this commit is a kernel ABI breaker, so don't mix old modules with a new kernel and vice versa. Tested on: i386 SMP, sparc64 SMP, alpha SMP	2003-11-11 22:07:29 +00:00
John Baldwin	b95bb3e62b	Update spin lock order list for new i386 interrupt and SMP code.	2003-11-03 22:38:30 +00:00
Mike Silbersack	184dcdc7c8	Change all SYSCTLS which are readonly and have a related TUNABLE from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide more useful error messages.	2003-10-21 18:28:36 +00:00
Sam Leffler	6c024e8ef6	add fast swi taskqueue spinlock to the order_list so witness doesn't complain Submitted by: Tor Egge <Tor.Egge@cvsup.no.freebsd.org>	2003-09-06 21:06:08 +00:00
John Baldwin	b35b737201	Insert cosmetic spaces. Reported by: kris	2003-08-04 19:24:25 +00:00
John Baldwin	1beccae67c	Add a new function to look for a spinlock's instance when it is held by another thread. We use the td_oncpu member of the other field to locate it's associated CPU and then search the that CPU's list of spin locks contained in its per-CPU data. This is not always safe and may in fact panic or just not work, but it is useful in at least one case.	2003-07-31 18:50:58 +00:00
Peter Wemm	e95babf3a8	unifdef -DLAZY_SWITCH and start to tidy up the associated glue.	2003-07-10 01:02:59 +00:00
David E. O'Brien	677b542ea2	Use __FBSDID().	2003-06-11 00:56:59 +00:00
Poul-Henning Kamp	b1921a6f33	Remove return after panic. Found by: FlexeLint	2003-05-31 20:09:42 +00:00
Peter Wemm	d5167abf3c	Add __amd64__ to the ifdefs that introduce the "pcicfg" spinlock to witness. Approved by: re (safe amd64 support)	2003-05-31 06:42:37 +00:00
Julian Elischer	060563ec50	Move the _oncpu entry from the KSE to the thread. The entry in the KSE still exists but it's purpose will change a bit when we add the ability to lock a KSE to a cpu.	2003-04-10 17:35:44 +00:00
Mike Barcroft	fd7a8150fb	o In struct prison, add an allprison linked list of prisons (protected by allprison_mtx), a unique prison/jail identifier field, two path fields (pr_path for reporting and pr_root vnode instance) to store the chroot() point of each jail. o Add jail_attach(2) to allow a process to bind to an existing jail. o Add change_root() to perform the chroot operation on a specified vnode. o Generalize change_dir() to accept a vnode, and move namei() calls to callers of change_dir(). o Add a new sysctl (security.jail.list) which is a group of struct xprison instances that represent a snapshot of active jails. Reviewed by: rwatson, tjr	2003-04-09 02:55:18 +00:00
Peter Wemm	cc66ebe2a9	Commit a partial lazy thread switch mechanism for i386. it isn't as lazy as it could be and can do with some more cleanup. Currently its under options LAZY_SWITCH. What this does is avoid %cr3 reloads for short context switches that do not involve another user process. ie: we can take an interrupt, switch to a kthread and return to the user without explicitly flushing the tlb. However, this isn't as exciting as it could be, the interrupt overhead is still high and too much blocks on Giant still. There are some debug sysctls, for stats and for an on/off switch. The main problem with doing this has been "what if the process that you're running on exits while we're borrowing its address space?" - in this case we use an IPI to give it a kick when we're about to reclaim the pmap. Its not compiled in unless you add the LAZY_SWITCH option. I want to fix a few more things and get some more feedback before turning it on by default. This is NOT a replacement for Bosko's lazy interrupt stuff. This was more meant for the kthread case, while his was for interrupts. Mine helps a little for interrupts, but his helps a lot more. The stats are enabled with options SWTCH_OPTIM_STATS - this has been a pseudo-option for years, I just added a bunch of stuff to it. One non-trivial change was to select a new thread before calling cpu_switch() in the first place. This allows us to catch the silly case of doing a cpu_switch() to the current process. This happens uncomfortably often. This simplifies a bit of the asm code in cpu_switch (no longer have to call choosethread() in the middle). This has been implemented on i386 and (thanks to jake) sparc64. The others will come soon. This is actually seperate to the lazy switch stuff. Glanced at by: jake, jhb	2003-04-02 23:53:30 +00:00
John Baldwin	959d22329a	- Remove witness_dead and just use witness_watch instead. If witness_watch is set to 0, it now has the same affect as setting witness_dead used to have. - Added a sysctl handler that allows root to change witness_watch from a non-zero value to zero to disable witness at runtime. Note that you can't turn witness back on once it is off. You can only turn it off as a one-way switch. - Added a comment describing the possible values of witness_watch.	2003-03-24 21:03:53 +00:00
John Baldwin	2ca9461a05	Trim an extra blank line that snuck into the last commit.	2003-03-11 22:33:42 +00:00
John Baldwin	427b3a6549	- Change witness_displaydescendants() to accept the indentation level as a parameter instead of using the level of a given witness. When recursing, pass an indent level of indent + 1. - Make use of the information witness_levelall() provides in witness_display_list() to use an O(n) algorithm instead of an O(n^2) algo to decide which witnesses to display hierarchies from. Basically, we only display a hierarchy for witnesses with a level of 0. - Add a new per-witness flag that is reset at the start of witness_display() for all witness's and is set the first time a witness is displayed in witness_displaydescendants(). If a witness is encountered more than once in the lock order tree (which happens often), witness_displaydescendants() marks the later occurrences with the string "(already displayed)" and doesn't display the subtree under that witness. This avoids duplicating large amounts of the lock order tree in the 'show witness' output in DDB. All these changes serve to make 'show witness' a lot more readable and useful than it was previously.	2003-03-11 22:14:21 +00:00
John Baldwin	f82c6950be	- Split the itismychild() function into two functions: insertchild() adds a witness to the child list of a parent witness. rebalancetree() runs through the entire tree removing direct descendants of witnesses who already have said child witness as an indirect descendant through another direct descendant. itismychild() now calls insertchild() followed by rebalancetree() and no longer needs the evil hack of having static recursed variable. - Add a function reparentchildren() that adds all the direct descendants of one witness as direct descendants of another witness. - Change the return value of itismychild() and similar functions so that they return 0 in the case of failure due to lack of resources instead of 1. This makes the return value more intuitive. - Check the return value of itismychild() when defining the static lock order in witness_initialize(). - Don't try to setup a lock instance in witness_lock() if itismychild() fails. Witness is hosed anyways so no need to do any more witness related activity at that point. It also makes the code flow easier to understand. - Add a new depart() function as the opposite of enroll(). When the reference count of a witness drops to 0 in witness_destroy(), this function is called on that witness. First, it runs through the lock order tree using reparentchildren() to reparent direct descendants of the departing witness to each of the witness' parents in the tree. Next, it releases it's own child list and other associated resources. Finally it calls rebalanacetree() to rebalance the lock order tree. - Sort function prototypes into something closer to alphabetical order. As a result of these changes, there should no longer be 'dead' witnesses in the order tree, and repeatedly loading and unloading a module should no longer exhaust witness of its internal resources. Inspired by: gallatin	2003-03-11 22:07:35 +00:00
John Baldwin	d5b13ee082	Trim useless "../" leading strings from filenames passed into witness.	2003-03-11 21:53:12 +00:00
John Baldwin	28e4d137a2	Adjust style of #ifdef's and #endif's to be more consistent and in line with recent additions to style(9).	2003-03-11 21:38:49 +00:00
John Baldwin	d278a7f9ba	Do the lock order check skip for the LOP_TRYLOCK case after the check for recursing on a lock instead of before. This fixes a bug where WITNESS could get a little confused if you did an sx_tryslock() on a sx lock that you already had an slock on. WITNESS would still function correctly but it could result in weirdness in the output of 'show locks'. This also makes it possible for mtx_trylock() to recurse on a lock.	2003-03-11 20:54:37 +00:00
John Baldwin	0e8677f68b	Now that we have WITNESS_WARN(), we only call witness_list() from the ddb 'show locks' command. Thus, move witness_list() to the #ifdef DDB section and remove extra checks for calling this function outside of DDB. Also, witness_list() now returns void instead of returning an int. Reported by: Steve Ames <steve@energistic.com> Prodded by: davidxu	2003-03-10 17:03:57 +00:00
John Baldwin	9da590b49b	Oops, fix the double faults people were seeing with the recent changes to witness. Sleepable locks such as sx locks always come before all mutexes including Giant. However, the static lock order list placed Giant before the proctree and allproc sx locks. This resulted in witness creating a cycle in its lock order "tree" (real trees don't have cycles) leading to infinite recursion and eventually a double fault. To fix, put Giant after sx locks in the lock order list.	2003-03-06 17:25:06 +00:00
John Baldwin	c141c242ac	Bah, fix a bogon in the last commit: get the sense of a compare test right so that we allow a sleepable lock to be acquired with Giant held rather than allowing a sleepable lock to be acquired with anything but Giant held.	2003-03-04 22:34:07 +00:00
John Baldwin	35580ede37	A small overhaul of witness: - Add a comment about special lock order rules and Giant near the top of subr_witness.c. Specifically, this documents and explains the real lock order relationship between Giant and sleepable locks (i.e. lockmgr locks and sx locks). Basically, Giant can be safely acquired either before or after sleepable locks and the case of Giant before a sleepable lock is exempted as a special case. - Add a new static function 'witness_list_lock()' that displays a single line of information about a struct lock_instance. This is used to make the output of witness messages more consistent and reduce some code duplication. - Fixup a few comments in witness_lock(). - Properly handle the Giant-before-sleepable-lock lock order exception in a more general fashion and remove the no longer needed LI_SLEPT flag. - Break up the last condition before assuming a reversal a bit to try and make the logic less confusing in witness_lock(). - Axe WITNESS_SLEEP() now that LI_SLEPT is no longer needed and replace it with a more general WITNESS_WARN() macro/function combination. WITNESS_WARN() allows you to output a customized message out to the console along with a list of held locks. It will optionally drop into the debugger as well. You can exempt a single lock from the check by passing it in as the second argument. You can also use flags to specify if Giant should be exempt from the check, if all sleepable locks should be exempt from the check, and if witness should panic if any non-exempt locks are found. - Make the witness_list() function static. Other areas of the kernel should use the new WITNESS_WARN() instead.	2003-03-04 20:56:39 +00:00
Peter Wemm	af3d516f55	Initiate de-orbit burn for USE_PCI_BIOS_FOR_READ_WRITE. This has been #if'ed out for a while. Complete the deed and tidy up some other bits. We need to be able to call this stuff from outer edges of interrupt handlers for devices that have the ISR bits in pci config space. Making the bios code mpsafe was just too hairy. We had also stubbed it out some time ago due to there simply being too much brokenness in too many systems. This adds a leaf lock so that it is safe to use pci_read_config() and pci_write_config() from interrupt handlers. We still will use pcibios to do interrupt routing if there is no acpi.. [yes, I tested this] Briefly glanced at by: imp	2003-02-18 03:36:49 +00:00
Jeff Roberson	5215b1872f	- Split the struct kse into struct upcall and struct kse. struct kse will soon be visible only to schedulers. This greatly simplifies much the KSE code. Submitted by: davidxu	2003-02-17 05:14:26 +00:00
Peter Wemm	1c425b874c	Add a 'debug.witness_trace' sysctl (and tunable) when DDB is present. This causes LOR and could-sleep messages to come with a stack trace.	2003-02-13 01:35:56 +00:00
Julian Elischer	6f8132a867	Reversion of commit by Davidxu plus fixes since applied. I'm not convinced there is anything major wrong with the patch but them's the rules.. I am using my "David's mentor" hat to revert this as he's offline for a while.	2003-02-01 12:17:09 +00:00
David Xu	0dbb100b9b	Move UPCALL related data structure out of kse, introduce a new data structure called kse_upcall to manage UPCALL. All KSE binding and loaning code are gone. A thread owns an upcall can collect all completed syscall contexts in its ksegrp, turn itself into UPCALL mode, and takes those contexts back to userland. Any thread without upcall structure has to export their contexts and exit at user boundary. Any thread running in user mode owns an upcall structure, when it enters kernel, if the kse mailbox's current thread pointer is not NULL, then when the thread is blocked in kernel, a new UPCALL thread is created and the upcall structure is transfered to the new UPCALL thread. if the kse mailbox's current thread pointer is NULL, then when a thread is blocked in kernel, no UPCALL thread will be created. Each upcall always has an owner thread. Userland can remove an upcall by calling kse_exit, when all upcalls in ksegrp are removed, the group is atomatically shutdown. An upcall owner thread also exits when process is in exiting state. when an owner thread exits, the upcall it owns is also removed. KSE is a pure scheduler entity. it represents a virtual cpu. when a thread is running, it always has a KSE associated with it. scheduler is free to assign a KSE to thread according thread priority, if thread priority is changed, KSE can be moved from one thread to another. When a ksegrp is created, there is always N KSEs created in the group. the N is the number of physical cpu in the current system. This makes it is possible that even an userland UTS is single CPU safe, threads in kernel still can execute on different cpu in parallel. Userland calls kse_create to add more upcall structures into ksegrp to increase concurrent in userland itself, kernel is not restricted by number of upcalls userland provides. The code hasn't been tested under SMP by author due to lack of hardware. Reviewed by: julian	2003-01-26 11:41:35 +00:00
Jake Burkholder	be0800e85e	Oops, add zstty to the witness order list. Noticed by: benno	2003-01-09 15:45:28 +00:00
Jake Burkholder	c3c2862df4	- Add a spin lock to single thread cache invalidation and tlb flush ipis, which allows ipis to be sent outside of Giant. - Remove the ap boot mutex, which is unused.	2002-12-22 20:50:23 +00:00
Kris Kennaway	4ef3d7a27b	Enforce correct ordering of the filedesc structure and pipe mutex, because WITNESS can get the order wrong if it guesses based on first use. Reviewed by: jhb, alfred	2002-12-22 16:32:34 +00:00
John Baldwin	d2b28e078a	Correct an assertion in the code to traverse the list of locks to find an earlier acquired lock with the same witness as the lock currently being acquired. If we had released several earlier acquired locks after acquiring enough locks to require another lock_list_entry bucket in the lock list, then subsequent lock_list_entry buckets could contain only one lock instance in which case i would be zero. Reported by: Joel M. Baldwin <qumqats@outel.org>	2002-11-11 16:36:20 +00:00
Alan Cox	151113a946	Catch up with the removal of the vm page buckets spin mutex.	2002-11-02 22:42:18 +00:00
Poul-Henning Kamp	ab33958276	#unifdef the code for checking blessed lock collisions until we need it. Spotted by: DARPA & NAI Labs.	2002-10-20 08:48:39 +00:00
Peter Wemm	d2575b9651	Register the machine check private state spinlock on ia64.	2002-10-12 00:33:36 +00:00
Poul-Henning Kamp	37c841831f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
Jeff Roberson	c76e20451c	- Tell witness about ALQ's spin lock.	2002-09-22 07:11:57 +00:00
Jake Burkholder	c0d676c068	Make this driver work a whole lot better. - Get the initial mode from the prom settings and don't clobber the mode on open. - Copy output into an internal ring buffer instead of accessing the tty outq directly in the interrupt handler. This fixes a problem where garbage would show up in the output stream. - Reset the console port completely and reprogram all the parameters before enabling it. This fixes seemingly random hangs on startup when using a fast interrupt handler. - Add minimal locking in place of spls. - Remove dead code and minor cleanups.	2002-09-08 04:45:16 +00:00
Ian Dowse	9261400aa2	Add WITNESS_FILE() and WITNESS_LINE(), which allow users of witness to print out the file and line from the lock object. These will be used shortly by CTR() calls in the mutex code. Reviewed by: jhb, jake	2002-08-26 18:31:26 +00:00
Mark Peek	11a78c514f	Silence compiler warnings when DDB is not defined. PR: 36002 Submitted by: Yoshikazu GOTO <goto@snowy.to>	2002-07-15 02:03:17 +00:00
Peter Wemm	f1b665c8fe	Revive backed out pmap related changes from Feb 2002. The highlights are: - It actually works this time, honest! - Fine grained TLB shootdowns for SMP on i386. IPI's are very expensive, so try and optimize things where possible. - Introduce ranged shootdowns that can be done as a single IPI. - PG_G support for i386 - Specific-cpu targeted shootdowns. For example, there is no sense in globally purging the TLB cache for where we are stealing a page from the local unshared process on the local cpu. Use pm_active to track this. - Add some instrumentation for the tlb shootdown code. - Rip out SMP code from <machine/cpufunc.h> - Try and fix some very bogus PG_G and PG_PS interactions that were bad enough to cause vm86 bios calls to break. vm86 depended on our existing bugs and this was the cause of the VESA panics last time. - Fix the silly one-line error that caused the 'panic: bad pte' last time. - Fix a couple of other silly one-line errors that should have caused more pain than they did. Some more work is needed: - pmap_{zero,copy}_page[_idle]. These can be done without IPI's if we have a hook in cpu_switch. - The IPI handlers need some cleanup. I have a bogus %ds load that can be avoided. - APTD handling is rather bogus and appears to be a large source of global TLB IPI shootdowns for no really good reason. I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop. I expect to see a bigger difference when there is significant pageout activity or the system otherwise has memory shortages. I have backed out a few optimizations that I had been using over the last few days in order to be a little more conservative. I'll revisit these again over the next few days as the dust settles. New option: DISABLE_PG_G - In case I missed something.	2002-07-12 07:56:11 +00:00
Alan Cox	70c1763634	o Resurrect vm_page_lock_queues(), vm_page_unlock_queues(), and the free queue lock (revision 1.33 of vm/vm_page.c removed them). o Make the free queue lock a spin lock because it's sometimes acquired inside of a critical section.	2002-07-04 22:07:37 +00:00
Julian Elischer	e602ba25fd	Part 1 of KSE-III The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..	2002-06-29 17:26:22 +00:00
John Baldwin	48849938e8	Change the all locks list from a STAILQ to a TAILQ. This bloats struct lock_object by another pointer (though all of lock_object should be conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE() in witness_destroy() (called for every mtx_destroy() and sx_destroy()) instead of an O(n) STAILQ_REMOVE. Since WITNESS is so dog slow as it is, the speed-up is worth the space cost. Suggested by: iedowse	2002-06-06 20:51:04 +00:00
John Baldwin	8dcb900b62	Handle "dead" witnesses better in the situation of several short term locks being created and destroyed without a single long-term one around to ensure the witness associated with that group of locks stays alive. The pipe mutexes are an example of this group. For a dead witness we no longer clear the witness name. Instead, when looking up the witness for a lock, if a dead witness' (a witness with a refcount of 0) w_name pointer is identical to the witness name of the lock then we revive that witness instead of using a new witness for the lock. This results in far fewer dead witness objects and also better preserves locking orders over the long term resulting in more correct lock order checking. Note that we can't ever derefence w_name of a dead witness since we don't know if the string it is pointing to has been free()'d or kldunload()'d out from under us.	2002-06-06 19:04:38 +00:00
John Baldwin	525c135972	In witness_unlock(), when updating a lock list entry bucket, decrement the count of lock list entries after we fixup the bucket of lock list entries. In theory we can remove the intr_disable/intr_restore() calls now.	2002-05-20 19:16:22 +00:00
John Baldwin	bbd296aba6	- Allow witness_sleep() to be called when witness hasn't been initialized yet. We just return without performing any checks. - Don't explicitly enter and exit critical sections when walking lock lists. We don't need a critical section to walk the list of sleep locks for a thread. We check to see if a spin lock list is empty before we walk it. If the list is empty we don't need to walk it. If it isn't then we already hold at least one spin lock and are already in a critical section and thus don't need our own explicit critical section.	2002-05-20 17:49:46 +00:00

1 2 3 4 5 ...

363 Commits