freebsd-nq

Author	SHA1	Message	Date
Robert Watson	161a0c7cff	Don't hold the UNIX domain socket subsystem lock over the body of the UNIX domain socket garbage collection implementation, as that risks holding the mutex over potentially sleeping operations (as well as introducing some nasty lock order issues, etc). unp_gc() will hold the lock long enough to do necessary deferral checks and set that it's running, but then release it until it needs to reset the gc state. RELENG_5 candidate. Discussed with: alfred	2004-08-25 21:24:36 +00:00
Robert Watson	fe0f2d4e11	Conditional acquisition of socket buffer mutexes when testing socket buffers with kqueue filters is no longer required: the kqueue framework will guarantee that the mutex is held on entering the filter, either due to a call from the socket code already holding the mutex, or by explicitly acquiring it. This removes the last of the conditional socket locking.	2004-08-24 05:28:18 +00:00
Warner Losh	0160658e84	Set the description to NULL in the right detach routine. This should keep dangling pointers to strings in loaded modules from hanging around after the drivers are unloaded.	2004-08-24 05:19:15 +00:00
David Xu	d30412a8db	Remove checking of single exit flag in thread_user_enter(), this is generic code for threaded process, should not be here.	2004-08-23 22:54:37 +00:00
Peter Wemm	f1009e1e1f	Commit Doug White and Alan Cox's fix for the cross-ipi smp deadlock. We were obtaining different spin mutexes (which disable interrupts after acquisition) and spin waiting for delivery. For example, KSE processes do LDT operations which use smp_rendezvous, while other parts of the system are doing things like tlb shootdowns with a different mutex. This patch uses the common smp_rendezvous mutex for all MD home-grown IPIs that spinwait for delivery. Having the single mutex means that the spinloop to acquire it will enable interrupts periodically, thus avoiding the cross-ipi deadlock. Obtained from: dwhite, alc Reviewed by: jhb	2004-08-23 21:39:29 +00:00
Alexander Kabaev	cffdaf2dce	Temporarily back out r1.74 as it seems to cause a number of regressions according to numerous reports. It might get reintroduced some time later when an exact failure mode is understood better.	2004-08-23 02:39:45 +00:00
Robert Watson	d963815baf	Make debug.kdb.stop_cpus also a TUNABLE() so it can be set prior to boot to help debug early nasty hangs.	2004-08-22 15:10:52 +00:00
Julian Elischer	ad59c36ba1	diff reduction for upcoming patch. Use a macro that masks some of the odd goings on with sub-structures, because they will go away anyhow.	2004-08-22 05:21:41 +00:00
Don Lewis	1a1c04b6b3	Don't bother calling the module event handlers from module_shutdown() in the shutdown_final state if the RB_NOSYNC flag is set. The specific motivation in this case is that a system panic in an interrupt context results in a call to module_shutdown(), which calls g_modevent(), which calls g_malloc(..., M_WAITOK), which results in a second panic. While g_modevent() could be fixed to not call malloc() for MOD_SHUTDOWN events (which it doesn't handle in any case), it is probably also a good idea to entirely skip the execution of the module shutdown handlers after a panic. This may be a MFC candidate for RELENG_5.	2004-08-20 21:47:48 +00:00
Don Lewis	8ded654028	Don't attempt to trigger the syncer thread final sync code in the shutdown_pre_sync state if the RB_NOSYNC flag is set. This is the likely cause of hangs after a system panic that are keeping crash dumps from being done. This is a MFC candidate for RELENG_5. MFC after: 3 days	2004-08-20 19:21:47 +00:00
John Baldwin	55c45354ff	Remove some dead code under a straggling APIC_IO #ifdef that I missed back before 5.2.	2004-08-20 17:24:52 +00:00
Robert Watson	7b38f0d3c3	Back out uipc_socket.c:1.208, as it incorrectly assumes that all sockets are connection-oriented for the purposes of kqueue registration. Since UDP sockets aren't connection-oriented, this appeared to break a great many things, such as RPC-based applications and services (i.e., NFS). Since jmg isn't around I'm backing this out before too many more feet are shot, but intend to investigate the right solution with him once he's available. Apologies to: jmg Discussed with: imp, scottl	2004-08-20 16:24:23 +00:00
Scott Long	2384290ced	Revert the previous change. It works great for 4BSD but causes major problems for ULE. The reason is quite unknown and worrisome.	2004-08-20 05:58:38 +00:00
Scott Long	2c86298c6c	In maybe_preempt(), ignore threads that are in an inconsistent state. This is an effective band-aid for at least some of the scheduler corruption seen recently. The real fix will involve protecting threads while they are inconsistent, and will come later. Submitted by: julian	2004-08-20 05:18:50 +00:00
John-Mark Gurney	5d6dd4685a	make sure that the socket is either accepting connections or is connected when attaching a knote to it... otherwise return EINVAL... Pointed out by: benno	2004-08-20 04:15:30 +00:00
Nate Lawson	0b54748fec	Add a newline.	2004-08-19 20:16:09 +00:00
Poul-Henning Kamp	d298f91974	Add bioq_takefirst(). If the bioq is empty, NULL is returned. Otherwise the front element is removed and returned. This can simplify locking in many drivers from: lock() bp = bioq_first(bq); if (bp == NULL) { unlock() return } bioq_remove(bp, bq) unlock to: lock() bp = bioq_takefirst(bq); unlock() if (bp == NULL) return;	2004-08-19 19:51:51 +00:00
Nate Lawson	c003dab8ff	Add debugging to rman_manage_region() as well. This is useful since we manage subregions in ACPI. MFC after: 3 days	2004-08-19 16:41:12 +00:00
Robert Watson	16239786ca	Remove GIANT_REQUIRED from setugidsafety() as knote_fdclose() no longer requires Giant.	2004-08-19 14:59:51 +00:00
John Baldwin	007ddf7e7a	Now that the return value semantics of cv's for multithreaded processes have been unified with that of msleep(9), further refine the sleepq interface and consolidate some duplicated code: - Move the pre-sleep checks for threaded processes into a thread_sleep_check() function in kern_thread.c. - Move all handling of TDF_SINTR to be internal to subr_sleepqueue.c. Specifically, if a thread is awakened by something other than a signal while checking for signals before going to sleep, clear TDF_SINTR in sleepq_catch_signals(). This removes a sched_lock lock/unlock combo in that edge case during an interruptible sleep. Also, fix sleepq_check_signals() to properly handle the condition if TDF_SINTR is clear rather than requiring the callers of the sleepq API to notice this edge case and call a non-_sig variant of sleepq_wait(). - Clarify the flags arguments to sleepq_add(), sleepq_signal() and sleepq_broadcast() by creating an explicit submask for sleepq types. Also, add an explicit SLEEPQ_MSLEEP type rather than a magic number of 0. Also, add a SLEEPQ_INTERRUPTIBLE flag for use with sleepq_add() and move the setting of TDF_SINTR to sleepq_add() if this flag is set rather than sleepq_catch_signals(). Note that it is the caller's responsibility to ensure that sleepq_catch_signals() is called if and only if this flag is passed to the preceding sleepq_add(). Note that this also removes a sched_lock lock/unlock pair from sleepq_catch_signals(). It also ensures that for an interruptible sleep, TDF_SINTR is always set when TD_ON_SLEEPQ() is true.	2004-08-19 11:31:42 +00:00
John-Mark Gurney	000968010a	add options MPROF_BUFFERS and MPROF_HASH_SIZE that adjust the sizes of the mutex profiling buffers. Document them in the man page and in NOTES. Ensure _HASH_SIZE is larger than _BUFFERS with a cpp error.	2004-08-19 06:38:26 +00:00
Robert Watson	4c5bc1ca39	Add UNP_UNLOCK_ASSERT() to assert that the UNIX domain socket subsystem lock is not held. Rather than annotating that the lock is released after calls to unp_detach() with a comment, annotate with an assertion. Assert that the UNIX domain socket subsystem lock is not held when unp_externalize() and unp_internalize() are called.	2004-08-19 01:45:16 +00:00
Robert Watson	2cfe973b62	Annotate call to DELAY() in interrupt storm mitigation as being something to revisit. Approved by: re (scottl)	2004-08-17 04:09:09 +00:00
Alexander Kabaev	c8b876219f	Upgrading a lock does not play well together with acquiring an exclusive lock and can lead to two threads being granted exclusive access. Check that no one has the same lock in exclusive mode before proceeding to acquire it. The LK_WANT_EXCL and LK_WANT_UPGRADE bits act as mini-locks and can block other threads. Normally this is not a problem since the mini locks are upgraded to full locks and the release of the locks will unblock the other threads. However if a thread reset the bits without obtaining a full lock other threads are not awoken. Add missing wakeups for these cases. PR: kern/69964 Submitted by: Stephan Uphoff <ups at tree dot com> Very good catch by: Stephan Uphoff <ups at tree dot com>	2004-08-16 15:01:22 +00:00
David E. O'Brien	78c37b0de8	s/MAX_SAFE_MAXVNODES/MAXVNODES_MAX/g	2004-08-16 08:33:37 +00:00
Robert Watson	40f2ac28a0	Always acquire the UNIX domain socket subsystem lock (UNP lock) before dereferencing sotounpcb() and checking its value, as so_pcb is protected by protocol locking, not subsystem locking. This prevents races during close() by one thread and use of this socket in another. unp_bind() now asserts the UNP lock, and uipc_bind() now acquires the lock around calls to unp_bind().	2004-08-16 04:41:03 +00:00
Brian Feldman	8912c44d9f	Add the missing knote_fdclose().	2004-08-16 03:09:01 +00:00
Brian Feldman	1c0f9af5b5	Allocate the marker, when scanning a kqueue, from the "heap" instead of the stack. When swapped out, a process's kernel stack would be unavailable, and we could get a page fault when scanning the same kqueue. PR: kern/61849	2004-08-16 03:08:38 +00:00
Robert Watson	ce5f32de11	Annotate the current UNIX domain socket locking strategies, order, strengths, and weaknesses in a comment. Assert a copyright over the changes made as part of the locking work.	2004-08-16 01:52:04 +00:00
Mike Silbersack	5173e8f567	Major enhancements to pipe memory usage: - pipespace is now able to resize non-empty pipes; this allows for many more resizing opportunities - Backing is no longer pre-allocated for the reverse direction of pipes. This direction is rarely (if ever) used, so this cuts the amount of map space allocated to a pipe in half. - Pipe growth is now much more dynamic; a pipe will now grow when the total amount of data it contains and the size of the write are larger than the size of pipe. Previously, only individual writes greater than the size of the pipe would cause growth. - In low memory situations, pipes will now shrink during both read and write operations, where possible. Once the memory shortage ends, the growth code will cause these pipes to grow back to an appropriate size. - If the full PIPE_SIZE allocation fails when a new pipe is created, the allocation will be retried with SMALL_PIPE_SIZE. This helps to deal with the situation of a fragmented map after a low memory period has ended. - Minor documentation + code changes to support the above. In total, these changes increase the total number of pipes that can be allocated simultaneously, drastically reducing the chances that pipe allocation will fail. Performance appears unchanged due to dynamic resizing.	2004-08-16 01:27:24 +00:00
Don Lewis	b6915bdbe5	Yet another tweak to the shutdown messages in boot(): Don't count busy buffers before the initial call to sync() and don't skip the initial sync() if no busy buffers were called. Always call sync() at least once if syncing is requested. This defers the "Syncing disks, buffers remaining..." message until after the initial sync() call and the first count of busy buffers. This backs out changes in kern_shutdown 1.162. Print a different message when there are no busy buffers after the initial sync(), which is now the expected situation. Print an additional message when syncing has completed successfully in the unusual situation where the work of syncing was done by boot(). Uppercase one message to make it consistent with all of the other kernel shutdown messages. Discussed with: bde (in a much earlier form, prior to 1.162) Reviewed by: njl (in an earlier form)	2004-08-15 19:17:23 +00:00
John-Mark Gurney	ad3b9257c2	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
Robert Watson	d8939d82cb	Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we attempt to IPI other cpus when entering the debugger in order to stop them while in the debugger. The default remains to issue the stop; however, that can result in a hang if another cpu has interrupts disabled and is spinning, since the IPI won't be received and the KDB will wait indefinitely. We probably need to add a timeout, but this is a useful stopgap in the mean time. Reviewed by: marcel	2004-08-15 02:06:27 +00:00
Robert Watson	6cbea71c82	Cause pfind() not to return processes in the PRS_NEW state. As a result, threads consuming the result of pfind() will not need to check for a NULL credential pointer or other signs of an incompletely created process. However, this also means that pfind() cannot be used to test for the existence or find such a process. Annotate pfind() to indicate that this is the case. A review of current consumers seems to indicate that this is not a problem for any of them. This closes a number of race conditions that could result in NULL pointer dereferences and related failure modes. Other related races continue to exist, especially during iteration of the allproc list without due caution. Discussed with: tjr, green	2004-08-14 17:15:16 +00:00
Poul-Henning Kamp	d8e8b6755c	Add some KASSERTS.	2004-08-14 08:33:49 +00:00
Julian Elischer	f0017f3321	Whitespace nit.	2004-08-14 07:21:20 +00:00
Robert Watson	b295bdcded	After completing a name lookup for a target UNIX domain socket to connect to, re-check that the local UNIX domain socket hasn't been closed while we slept, and if so, return EINVAL. This affects the system running both with and without Giant over the network stack, and recent ULE changes appear to cause it to trigger more frequently than previously under load. While here, improve catching of possibly closed UNIX domain sockets in one or two additional circumstances. I have a much larger set of related changes in Perforce, but they require more testing before they can be merged. One debugging printf is left in place to indicate when such a race takes place: this is typically triggered by a buggy application that simultaneously connect()'s and close()'s a UNIX domain socket file descriptor. I'll remove this at some point in the future, but am interested in seeing how frequently this is reported. In the case of Martin's reported problem, it appears to be a result of a non-thread safe syslog() implementation in the C library, which does not synchronize access to its logging file descriptor. Reported by: mbr	2004-08-14 03:43:49 +00:00
John-Mark Gurney	ac77164d64	clean up whitespace...	2004-08-13 17:43:53 +00:00
John-Mark Gurney	7d5e45a391	looks like rwatson forgot tabs... :)	2004-08-13 07:38:58 +00:00
Julian Elischer	c00661f83c	Don't keep evaluating our own cpu mask.. it's not likely to have changed....	2004-08-13 00:57:43 +00:00
Robert Watson	44f31f7556	Trim trailing white space.	2004-08-12 18:06:21 +00:00
Warner Losh	9f7f340a0f	Minor formatting fixes for lines > 80 characters	2004-08-12 17:26:22 +00:00
Jeff Roberson	f2b74cbf28	- Introduce a new flag KEF_HOLD that prevents sched_add() from doing a migration. Use this in sched_prio() and sched_switch() to stop us from migrating threads that are in short term sleeps or are runnable. These extra migrations were added in the patches to support KSE. - Only set NEEDRESCHED if the thread we're adding in sched_add() is a lower priority and is being placed on the current queue. - Fix some minor whitespace problems.	2004-08-12 07:56:33 +00:00
Julian Elischer	0f54f48225	Properly keep track of how many kses are on the system run queue(s).	2004-08-11 20:54:48 +00:00
Robert Watson	217a4b6e4e	Replace a reference to splnet() with a reference to locking in a comment.	2004-08-11 03:43:10 +00:00
Marcel Moolenaar	4da47b2fec	Add __elfN(dump_thread). This function is called from __elfN(coredump) to allow dumping per-thread machine specific notes. On ia64 we use this function to flush the dirty registers onto the backingstore before we write out the PRSTATUS notes. Tested on: alpha, amd64, i386, ia64 & sparc64 Not tested on: arm, powerpc	2004-08-11 02:35:06 +00:00
Robert Watson	87e83e7d4c	In v_addpollinfo(), we allocate storage to back vp->v_pollinfo. However, we may sleep when doing so; check that we didn't race with another thread allocating storage for the vnode after allocation is made to a local pointer, and only update the vnode pointer if it's still NULL. Otherwise, accept that another thread got there first, and release the local storage. Discussed with: jmg	2004-08-11 01:27:53 +00:00
Alan Cox	fad44deea3	Eliminate the acquisition and release of Giant within physio(). Remove the spl calls. Reviewed by: phk@ Discussed with: scottl@	2004-08-10 21:47:11 +00:00
John Baldwin	274f8f48e8	Synchronize the extra SA threading checks and return value handling of condition variables with that of msleep(). Reviewed by: davidxu	2004-08-10 17:42:59 +00:00
Jeff Roberson	2454aaf51c	- Use a new flag, KEF_XFERABLE, to record with certainty that this kse had contributed to the transferable load count. This prevents any potential problems with sched_pin() being used around calls to setrunqueue(). - Change the sched_add() load balancing algorithm to try to migrate on wakeup. This attempts to place threads that communicate with each other on the same CPU. - Don't clear the idle counts in kseq_transfer(), let the cpus do that when they call sched_add() from kseq_assign(). - Correct a few out of date comments. - Make sure the ke_cpu field is correct when we preempt. - Call kseq_assign() from sched_clock() to catch any assignments that were done without IPI. Presently all assignments are done with an IPI, but I'm trying a patch that limits that. - Don't migrate a thread if it is still runnable in sched_add(). Previously, this could only happen for KSE threads, but due to changes to sched_switch() all threads went through this path. - Remove some code that was added with preemption but is not necessary.	2004-08-10 07:52:21 +00:00
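
The locking simplification described in the bioq_takefirst() commit above (d298f91974) can be sketched in user space. This is only an illustrative model, not the kernel implementation: the struct layouts, bio_id, bioq_insert_tail() as written here, the q_lock()/q_unlock() placeholders and driver_start() are all assumptions made for the example. The point it shows is that the emptiness check moves outside the locked region, so the lock is dropped on exactly one path.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel's definitions. */
struct bio {
	int		 bio_id;	/* hypothetical payload */
	struct bio	*bio_next;
};

struct bio_queue_head {
	struct bio	*head;
	struct bio	*tail;
};

/* Placeholder lock: a real driver would use its own mutex here. */
static int q_locked;

static void
q_lock(void)
{
	assert(!q_locked);
	q_locked = 1;
}

static void
q_unlock(void)
{
	assert(q_locked);
	q_locked = 0;
}

static void
bioq_init(struct bio_queue_head *bq)
{
	bq->head = bq->tail = NULL;
}

static void
bioq_insert_tail(struct bio_queue_head *bq, struct bio *bp)
{
	bp->bio_next = NULL;
	if (bq->tail != NULL)
		bq->tail->bio_next = bp;
	else
		bq->head = bp;
	bq->tail = bp;
}

/* Return NULL if the queue is empty; otherwise unlink and return the front. */
static struct bio *
bioq_takefirst(struct bio_queue_head *bq)
{
	struct bio *bp;

	bp = bq->head;
	if (bp != NULL) {
		bq->head = bp->bio_next;
		if (bq->head == NULL)
			bq->tail = NULL;
		bp->bio_next = NULL;
	}
	return (bp);
}

/* Hypothetical driver routine using the "after" pattern from the commit. */
static int
driver_start(struct bio_queue_head *bq)
{
	struct bio *bp;

	q_lock();
	bp = bioq_takefirst(bq);
	q_unlock();		/* single unlock, no early-return path */
	if (bp == NULL)
		return (-1);
	return (bp->bio_id);
}
```

With the older bioq_first()/bioq_remove() pairing, the empty case needs its own unlock before returning; here the caller tests the result after the lock has already been released.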


7591 Commits