freebsd-dev

Author	SHA1	Message	Date
Mateusz Guzik	e2ab16b1a6	lockprof: move panic check after inspecting the state	2021-05-23 17:55:27 +00:00
Mateusz Guzik	6a467cc5e1	lockprof: pass lock type as an argument instead of reading the spin flag	2021-05-23 17:55:27 +00:00
Mateusz Guzik	a0842e69aa	lockprof: add contested-only profiling This allows tracking all wait times with much smaller runtime impact. For example when doing -j 104 buildkernel on tmpfs: no profiling: 2921.70s user 282.72s system 6598% cpu 48.562 total all acquires: 2926.87s user 350.53s system 6656% cpu 49.237 total contested only: 2919.64s user 290.31s system 6583% cpu 48.756 total	2021-05-22 19:28:37 +00:00
Mateusz Guzik	fca5cfd584	lockprof: retire lock_prof_skipcount The implementation uses a global variable for ALL calls, defeating the point of sampling in the first place. Remove it as it clearly remains unused.	2021-05-22 19:28:37 +00:00
Edward Tomasz Napierala	7f6157f7fd	lock_delay(9): improve interaction with restrict_starvation After `e7a5b3bd05`, the la->delay value was adjusted after being set by the starvation_limit code block, which is wrong. Reported By: avg Reviewed By: avg Fixes: `e7a5b3bd05` Sponsored By: NetApp, Inc. Sponsored By: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D29513	2021-04-03 13:08:53 +01:00
Edward Tomasz Napierala	e7a5b3bd05	Modify lock_delay() to increase the delay time after spinning Modify lock_delay() to increase the delay time after spinning, not before. Previously we would spin at least twice instead of once. In NetApp's benchmarks this fixes a performance regression compared to FreeBSD 10, which called cpu_spinwait() directly. Reviewed By: mjg Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D27331	2021-02-25 18:55:26 +00:00
Mateusz Guzik	eac22dd480	lockmgr: shrink struct lock by 8 bytes on LP64 Currently the struct has a 4 byte padding stemming from 3 ints. 1. prio comfortably fits in short, unfortunately there is no dedicated type for it and plumbing it throughout the codebase is not worth it right now, instead an assert is added which covers also flags for safety 2. lk_exslpfail can in principle exceed u_short, but the count is already not considered reliable and it only ever gets modified straight to 0. In other words it can be incrementing with an upper bound of USHRT_MAX With these in place struct lock shrinks from 48 to 40 bytes. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D28680	2021-02-15 13:57:25 +00:00
Mateusz Guzik	6fed89b179	kern: clean up empty lines in .c and .h files	2020-09-01 22:12:32 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Mateusz Guzik	2e77cad11d	locks: add default delay struct Use it for all primitives. This makes everything fit in 8 bytes.	2020-01-05 12:48:19 +00:00
Mateusz Guzik	6b8dd26e7c	locks: convert delay times to u_short int is just a waste of space for this purpose.	2020-01-05 12:47:29 +00:00
Mateusz Guzik	3ac2ac2e08	lockprof: use IPI-injected fences to fix hangs on stat dump and reset The previously used quiesce_all_cpus walks all CPUs and waits until curthread can run on them. Even on contemporary machines this becomes a significant problem under load when it can literally take minutes for the operation to complete. With the patch the stall is normally less than 1 second. Reviewed by: kib, jeff (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21740	2019-11-30 17:24:42 +00:00
Mateusz Guzik	d2be3ef05c	lockprof: move per-cpu data to dpcpu Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21747	2019-09-22 20:44:24 +00:00
Mateusz Guzik	cbba2cb367	lockprof: use CPUFOREACH and drop always false lp_cpu NULL checks Sponsored by: The FreeBSD Foundation	2019-09-21 19:05:38 +00:00
John Baldwin	2e43efd0bb	Drop "All rights reserved" from my copyright statements. Reviewed by: rgrimes MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19485	2019-03-06 22:11:45 +00:00
Mateusz Guzik	a045941bd2	locks: tweak backoff a little bit Previous limits were chosen when locking primitives had spurious lock accesses. Flipping the starting point to 1 (or rather 2 as the first call shifts it) provides a modest win when mild contention is seen while not hurting worse cases. Tested on a bunch of one, two and four socket old and new systems (Westmere, Skylake, Threadripper and others) by doing concurrent page faults, buildkernel/buildworld and other stuff (although not all systems got all the tests). Another thing is the upper limit. It is semi-arbitrarily chosen as it was getting out of hand for slightly less small systems (e.g. a 128-thread one). Note that backoff is fundamentally a speculative bandaid and this change just makes it fit a little bit better. It remains completely oblivious to the hardware topology or the contention pattern. This is being experimented with.	2018-04-08 16:34:10 +00:00
Pedro F. Giffuni	8a36da99de	sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 15:20:12 +00:00
Mateusz Guzik	3c798b2b1f	locks: follow up r313386 Unfinished diff was committed by accident. The loop in lock_delay was changed to decrement, but the loop iterator was still incrementing.	2017-02-07 16:01:07 +00:00
Mateusz Guzik	8e5a3e9a9d	locks: change backoff to exponential Previous implementation would use a random factor to spread readers and reduce chances of starvation. This visibly reduces effectiveness of the mechanism. Switch to the more traditional exponential variant. Try to limit starvation by imposing an upper limit of spins after which spinning is half of what other threads get. Note the mechanism is turned off by default. Reviewed by: kib (previous version)	2017-02-07 14:49:36 +00:00
Mateusz Guzik	290511163d	Sprinkle __read_mostly on backoff and lock profiling code. MFC after: 1 month	2017-01-27 15:03:51 +00:00
Mateusz Guzik	1ada904147	Implement trivial backoff for locking primitives. All current spinning loops retry an atomic op the first chance they get, which leads to performance degradation under load. One classic solution to the problem consists of delaying the test to an extent. This implementation has a trivial linear increment and a random factor for each attempt. For simplicity, this first touch implementation only modifies spinning loops where the lock owner is running. spin mutexes and thread lock were not modified. Current parameters are autotuned on boot based on mp_cpus. Autotune factors are very conservative and are subject to change later. Reviewed by: kib, jhb Tested by: pho MFC after: 1 week	2016-08-01 21:48:37 +00:00
Dmitry Chagin	fd07ddcf6f	Add _NEW flag to mtx(9), sx(9), rmlock(9) and rwlock(9). A _NEW flag passed to _init_flags() to avoid check for double-init. Differential Revision: https://reviews.freebsd.org/D1208 Reviewed by: jhb, wblock MFC after: 1 Month	2014-12-13 21:00:10 +00:00
John Baldwin	e432d5f6a7	Drop the 3rd clause from all 3 clause BSD licenses where I am the sole holder to convert them to 2 clause BSD licenses. MFC after: 1 week	2014-02-05 18:13:27 +00:00
John-Mark Gurney	3a6cdc4e55	fix spelling of lock_initialized.. jhb approved.. MFC after: 1 week	2014-01-28 17:27:54 +00:00
John Baldwin	cd32bd7ad1	Several improvements to rmlock(9). Many of these are based on patches provided by Isilon. - Add an rm_assert() supporting various lock assertions similar to other locking primitives. Because rmlocks track readers the assertions are always fully accurate unlike rw_assert() and sx_assert(). - Flesh out the lock class methods for rmlocks to support sleeping via condvars and rm_sleep() (but only while holding write locks), rmlock details in 'show lock' in DDB, and the lc_owner method used by dtrace. - Add an internal destroyed cookie so that API functions can assert that an rmlock is not destroyed. - Make use of rm_assert() to add various assertions to the API (e.g. to assert locks are held when an unlock routine is called). - Give RM_SLEEPABLE locks their own lock class and always use the rmlock's own lock_object with WITNESS. - Use THREAD_NO_SLEEPING() / THREAD_SLEEPING_OK() to disallow sleeping while holding a read lock on an rmlock. Submitted by: andre Obtained from: EMC/Isilon	2013-06-25 18:44:15 +00:00
Jeff Roberson	28d91af30f	- Implement run-time expansion of the KTR buffer via sysctl. - Implement a function to ensure that all preempted threads have switched back out at least once. Use this to make sure there are no stale references to the old ktr_buf or the lock profiling buffers before updating them. Reviewed by: marius (sparc64 parts), attilio (earlier patch) Sponsored by: EMC / Isilon Storage Division	2012-11-15 00:51:57 +00:00
Andriy Gapon	353705930f	panic: add a switch and infrastructure for stopping other CPUs in SMP case Historical behavior of letting other CPUs merrily go on is a default for time being. The new behavior can be switched on via kern.stop_scheduler_on_panic tunable and sysctl. Stopping of the CPUs has (at least) the following benefits: - more of the system state at panic time is preserved intact - threads and interrupts do not interfere with dumping of the system state Only one thread runs uninterrupted after panic if stop_scheduler_on_panic is set. That thread might call code that is also used in normal context and that code might use locks to prevent concurrent execution of certain parts. Those locks might be held by the stopped threads and would never be released. To work around this issue, it was decided that instead of explicit checks for panic context, we would rather put those checks inside the locking primitives. This change has substantial portions written and re-written by attilio and kib at various times. Other changes are heavily based on the ideas and patches submitted by jhb and mdf. bde has provided many insights into the details and history of the current code. The new behavior may cause problems for systems that use a USB keyboard for interfacing with system console. This is because of some unusual locking patterns in the ukbd code which have to be used because on one hand ukbd is below syscons, but on the other hand it has to interface with other usb code that uses regular mutexes/Giant for its concurrency protection. Dumping to USB-connected disks may also be affected. PR: amd64/139614 (at least) In cooperation with: attilio, jhb, kib, mdf Discussed with: arch@, bde Tested by: Eugene Grosbein <eugen@grosbein.net>, gnn, Steven Hartland <killing@multiplay.co.uk>, glebius, Andrew Boyer <aboyer@averesystems.com> (various versions of the patch) MFC after: 3 months (or never)	2011-12-11 21:02:01 +00:00
Ed Schouten	6472ac3d8a	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
Matthew D Fleming	00f0e671ff	Explicitly wire the user buffer rather than doing it implicitly in sbuf_new_for_sysctl(9). This allows using an sbuf with a SYSCTL_OUT drain for extremely large amounts of data where the caller knows that appropriate references are held, and sleeping is not an issue. Inspired by: rwatson	2011-01-27 00:34:12 +00:00
John Baldwin	58ccf5b41c	Remove unneeded includes of <sys/linker_set.h>. Other headers that use it internally contain nested includes. Reviewed by: bde	2011-01-11 13:59:06 +00:00
Rebecca Cran	b1ce21c6ef	Fix typos. PR: bin/148894 Submitted by: olgeni	2010-11-09 10:59:09 +00:00
Matthew D Fleming	4e6571599b	Re-add r212370 now that the LOR in powerpc64 has been resolved: Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough SBUF_FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk (original patch)	2010-09-16 16:13:12 +00:00
Matthew D Fleming	404a593e28	Revert r212370, as it causes a LOR on powerpc. powerpc does a few unexpected things in copyout(9) and so wiring the user buffer is not sufficient to perform a copyout(9) while holding a random mutex. Requested by: nwhitehorn	2010-09-13 18:48:23 +00:00
Matthew D Fleming	dd67e2103c	Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk	2010-09-09 18:33:46 +00:00
Ed Schouten	60ae52f785	Use ISO C99 integer types in sys/kern where possible. There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.	2010-06-21 09:55:56 +00:00
Andriy Gapon	e7154e7ef1	lock_profile_release_lock: do not compare unsigned with zero Found by: Coverity Prevent CID: 3660 Reviewed by: jhb MFC after: 2 weeks	2010-06-17 10:15:13 +00:00
John Baldwin	3aa6d94e0c	Update several places that iterate over CPUs to use CPU_FOREACH().	2010-06-11 18:46:34 +00:00
Jeff Roberson	2e6b8de462	- Implement a new mechanism for resetting lock profiling. We now guarantee that all cpus have acknowledged the cleared enable int by scheduling the resetting thread on each cpu in succession. Since all lock profiling happens within a critical section this guarantees that all cpus have left lock profiling before we clear the datastructures. - Assert that the per-thread queue of locks lock profiling is aware of is clear on thread exit. There were several cases where this was not true that slows lock profiling and leaks information. - Remove all objects from all lists before clearing any per-cpu information in reset. Lock profiling objects can migrate between per-cpu caches and previously these migrated objects could be zero'd before they'd been removed Discussed with: attilio Sponsored by: Nokia	2009-03-15 06:41:47 +00:00
Kip Macy	947265b6bd	- track maximum wait time - resize columns based on actual observed numerical values MFC after: 3 days	2008-07-27 21:45:20 +00:00
Attilio Rao	90356491d7	- Embed the recursion counter for any locking primitive directly in the lock_object, using an unified field called lo_data. - Replace lo_type usage with the w_name usage and at init time pass the lock "type" directly to witness_init() from the parent lock init function. Handle delayed initialization before than witness_initialize() is called through the witness_pendhelp structure. - Axe out LO_ENROLLPEND as it is not really needed. The case where the mutex init delayed wants to be destroyed can't happen because witness_destroy() checks for witness_cold and panic in case. - In enroll(), if we cannot allocate a new object from the freelist, notify that to userspace through a printf(). - Modify the depart function in order to return nothing as in the current CVS version it always returns true and adjust callers accordingly. - Fix the witness_addgraph() argument name prototype. - Remove unuseful code from itismychild(). This commit leads to a shrinked struct lock_object and so smaller locks, in particular on amd64 where 2 uintptr_t (16 bytes per-primitive) are gained. Reviewed by: jhb	2008-05-15 20:10:06 +00:00
Attilio Rao	13ddf72de7	Really, no explicit checks against lock_class_* object should be done in consumers code: using locks properties is much more appropriate. Fix current code doing these bogus checks. Note: Really, callout are not usable by all !(LC_SPINLOCK | LC_SLEEPABLE) primitives like rmlocks doesn't implement the generic lock layer functions, but they can be equipped for this, so the check is still valid. Tested by: matteo, kris (earlier version) Reviewed by: jhb	2008-02-06 00:04:09 +00:00
Kris Kennaway	357911ce77	Fix logic in skipcount handling (used to sample every 1/N lock operations to reduce profiling overhead)	2008-01-08 01:11:40 +00:00
Jeff Roberson	0c66dc6758	- Pause a while after disabling lock profiling and before resetting it to be sure that all participating CPUs have stopped updating it. - Restore the behavior of printing the name of the lock type in the output.	2007-12-31 03:45:51 +00:00
Jeff Roberson	eea4f254fe	- Re-implement lock profiling in such a way that it no longer breaks the ABI when enabled. There is no longer an embedded lock_profile_object in each lock. Instead a list of lock_profile_objects is kept per-thread for each lock it may own. The cnt_hold statistic is now always 0 to facilitate this. - Support shared locking by tracking individual lock instances and statistics in the per-thread per-instance lock_profile_object. - Make the lock profiling hash table a per-cpu singly linked list with a per-cpu static lock_prof allocator. This removes the need for an array of spinlocks and reduces cache contention between cores. - Use a separate hash for spinlocks and other locks so that only a critical_enter() is required and not a spinlock_enter() to modify the per-cpu tables. - Count time spent spinning in the lock statistics. - Remove the LOCK_PROFILE_SHARED option as it is always supported now. - Specifically drop and release the scheduler locks in both schedulers since we track owners now. In collaboration with: Kip Macy Sponsored by: Nokia	2007-12-15 23:13:31 +00:00
Stephan Uphoff	f53d15fe1b	Initial checkin for rmlock (read mostly lock) a multi reader single writer lock optimized for almost exclusive reader access. (see also rmlock.9) TODO: Convert to per cpu variables linkerset as soon as it is available. Optimize UP (single processor) case.	2007-11-08 14:47:55 +00:00
Attilio Rao	4486adc51f	Currently the LO_NOPROFILE flag (which is masked on upper level code by per-primitive macros like MTX_NOPROFILE, SX_NOPROFILE or RW_NOPROFILE) is not really honoured. In particular lock_profile_obtain_lock_failure() and lock_profile_obtain_lock_success() are naked respect this flag. The bug leads to locks marked with no-profiling to be profiled as well. In the case of the clock_lock, used by the timer i8254 this leads to unpredictable behaviour both on amd64 and ia32 (double faults panic, sudden reboots, etc.). The amd64 clock_lock is also not marked as not profilable as it should be. Fix these bugs adding proper checks in the lock profiling code and at clock_lock initialization time. i8254 bug pointed out by: kris Tested by: matteo, Giuseppe Cocomazzi <sbudella at libero dot it> Approved by: jeff (mentor) Approved by: re	2007-09-14 01:12:39 +00:00
Kris Kennaway	cdcc788a7e	Revert some debugging KTRs that were added during development.	2007-06-03 18:24:31 +00:00
John Baldwin	c91fcee75d	Move lock_profile_object_{init,destroy}() into lock_{init,destroy}().	2007-05-18 15:04:59 +00:00
Kip Macy	8289600ce7	skip call to _lock_profile_obtain_lock_success entirely if acquisition time is non-zero (i.e. recursing or adding sharers)	2007-04-03 18:36:27 +00:00
Kip Macy	fe68a91631	general LOCK_PROFILING cleanup - only collect timestamps when a lock is contested - this reduces the overhead of collecting profiles from 20x to 5x - remove unused function from subr_lock.c - generalize cnt_hold and cnt_lock statistics to be kept for all locks - NOTE: rwlock profiling generates invalid statistics (and most likely always has) someone familiar with that should review	2007-02-26 08:26:44 +00:00

62 Commits