freebsd-skq

Author	SHA1	Message	Date
mmacy	c78ee89370	lockf: annotate LOCKF_DEBUG only var	2018-05-19 05:04:38 +00:00
mjg	c5969d119d	lockf: change the owner hash from pid to vnode-based This adds a bit missed due to the patch split, see r332882 Tested by: pho	2018-04-24 06:10:36 +00:00
mjg	a525500c85	lockf: add per-chain locks to the owner hash This combined with previous changes significantly depessimizes the behaviour under contentnion. In particular the lock1_processes test (locking/unlocking separate files) from the will-it-scale suite was executed with 128 concurrency on a 4-socket Broadwell with 128 hardware threads. Operations/second (lock+unlock) go from ~750000 to ~45000000 (6000%) For reference single-process is ~1680000 (i.e. on stock kernel the resulting perf is less than half of the single-threaded run), Note this still does not really scale all that well as the locks were just bolted on top of the current implementation. Significant room for improvement is still here. In particular the top performance fluctuates depending on the extent of false sharing in given run (which extends beyond the file). Added chain+lock pairs were not padded w.r.t. cacheline size. One big ticket item is the hash used for spreading threads: it used to be the process pid (which basically serialized all threaded ops). Temporarily the vnode addr was slapped in instead. Tested by: pho	2018-04-23 08:23:10 +00:00
mjg	36aee106b7	lockf: skip locking the graph if not necessary (common case) Tested by: pho	2018-04-23 07:54:02 +00:00
mjg	671996ff90	lockf: perform wakeup onlly when there is anybody waiting Tested by: pho	2018-04-23 07:52:56 +00:00
mjg	0bb3de5fe9	lockf: skip the hard work in lf_purgelocks if possible Tested by: pho	2018-04-23 07:52:10 +00:00
mjg	2bb2295c93	lockf: free state only when recycling the vnode This avoids malloc/free cycles when locking/unlocking the vnode when nobody is contending. Tested by: pho	2018-04-23 07:51:19 +00:00
mjg	49cba071c4	lockf: slightly depessimize 1. check if P_ADVLOCK is already set and if so, don't lock to set it (stolen from DragonFly) 2. when trying for fast path unlock, check that we are doing unlock first instead of taking the interlock for no reason (e.g. if we want to lock). whilere make it more likely that falling fast path will not take the interlock either by checking for state Note the code is severely pessimized both single- and multithreaded.	2018-04-22 09:30:07 +00:00
pfg	4736ccfd9c	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
avg	8bd567c2ae	put very expensive sanity checks of advisory locks under DIAGNOSTIC The checks have quadratic complexity over a number of advisory locks active for a file and that could be a lot. What's the worse is that the checks are done while holding ls_lock. That could lead to a long a very long backlog and performance degradation even if all requested locks are compatible (e.g. all shared locks). The checks used to be under INVARIANTS. Discussed with: kib MFC after: 2 weeks Sponsored by: Panzura	2017-01-30 15:20:13 +00:00
sephe	d7fabd9d6d	Fix LINT building. Sponsored by: Microsoft	2016-09-18 07:37:00 +00:00
emaste	00b67b15b9	Renumber license clauses in sys/kern to avoid skipping #3	2016-09-15 13:16:20 +00:00
kib	13c308de0c	When sleeping waiting for either local or remote advisory lock, interrupt sleeps with the ERESTART on the suspension attempts. Otherwise, single-threading requests are deferred until the locks are granted for NFS files, which causes hangs. When retrying local registration of the remotely-granted adv lock, allow full suspension and check for suspension, for usual reasons. Reported by: markj, pho Reviewed by: jilles Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-06-26 20:08:42 +00:00
pfg	28823d0656	sys/kern: spelling fixes in comments. No functional change.	2016-04-29 22:15:33 +00:00
delphij	27945e456a	Improve style and fix a possible use-after-free case introduced in r268384 by reinitializing the 'freestate' pointer after freeing the memory. Obtained from: HardenedBSD (71fab80c5dd3034b71a29a61064625018671bbeb) PR: 194525 Submitted by: Oliver Pinter <oliver.pinter@hardenedbsd.org> MFC after: 2 weeks	2015-01-10 06:48:35 +00:00
kib	c8b684225c	Correct the problem reported by test16 from tools/regression/file/flock/flock.c, which completes the fix in r192685. When the lock was stolen from us, retry the whole lock sequence in kernel, instead of returning EINTR to usermode and hoping that application would handle it correctly by restarting the lock acquire. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-08 08:10:15 +00:00
ed	e97eae1577	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
kib	4ce38c4283	In lf_iteratelocks_vnode, increment state->ls_threads around iterating of the vnode advisory lock list. This prevents deallocation of state while inside the loop. Reported and tested by: pho MFC after: 2 weeks	2009-06-25 18:54:56 +00:00
kib	21376236dd	Decrement state->ls_threads when vnode appeared to be doomed. Reported and tested by: pho	2009-06-17 12:43:04 +00:00
kib	4184c18920	Do not leak the state->ls_lock after VI_DOOMED check introduced in the r192683. Reported by: pho Submitted by: jhb	2009-06-10 16:17:38 +00:00
kib	862b0fc4e3	The advisory lock may be activated or activated and removed during the sleep waiting for conditions when the lock may be granted. To prevent lf_setlock() from accessing possibly freed memory, add reference counting to the struct lockf_entry. Bump refcount around the sleep. Make lf_free_lock() return non-zero when structure was freed, and use this after the sleep to return EINTR to the caller. The error code might need a clarification, but we cannot return success to usermode, since the lock is not owned anymore. Reviewed by: dfr Tested by: pho MFC after: 1 month	2009-05-24 12:39:38 +00:00
kib	c54d127bf0	In lf_purgelocks(), assert that state->ls_pending is empty after we weeded out threads, and clean ls_active instead of ls_pending. Reviewed by: dfr Tested by: pho MFC after: 1 month	2009-05-24 12:37:55 +00:00
kib	fec8f2f845	In lf_advlockasync(), recheck for doomed vnode after the state->ls_lock is acquired. In the lf_purgelocks(), assert that vnode is doomed and set *statep to NULL before clearing ls_pending list. Otherwise, we allow for the thread executing lf_advlockasync() to put new pending entry after state->ls_lock is dropped in lf_purgelocks(). Reviewed by: dfr Tested by: pho MFC after: 1 month	2009-05-24 12:33:16 +00:00
kib	56e5e587c8	Replace the while statement with the if for clarity. The loop body cannot be executed more then once. Reviewed by: dfr Tested by: pho MFC after: 1 month	2009-05-24 12:28:38 +00:00
ganbold	93f46a969a	Remove unused variable. Found with: Coverity Prevent(tm) CID: 3664 Approved by: kib	2008-11-27 04:40:37 +00:00
dfr	f98e1f1bbf	Don't rely on the value of *statep without first taking the vnode interlock. Reviewed by: Mike Tancsa MFC after: 2 weeks	2008-10-24 16:04:10 +00:00
dfr	41cea6d5ca	Re-implement the client side of rpc.lockd in the kernel. This implementation provides the correct semantics for flock(2) style locks which are used by the lockf(1) command line tool and the pidfile(3) library. It also implements recovery from server restarts and ensures that dirty cache blocks are written to the server before obtaining locks (allowing multiple clients to use file locking to safely share data). Sponsored by: Isilon Systems PR: 94256 MFC after: 2 weeks	2008-06-26 10:21:54 +00:00
dfr	b95b50cdbb	When blocking on an F_FLOCK style lock request which is upgrading a shared lock to exclusive, drop the shared lock before deadlock detection. MFC after: 2 days	2008-05-09 10:34:23 +00:00
dfr	f50ee5045a	Fix compilation with LOCKF_DEBUG.	2008-04-16 14:08:12 +00:00
kib	52243403eb	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
dfr	60db59bdb1	Don't try to use an SX lock while holding the vnode interlock. Sponsored by: Isilon Systems	2008-04-01 16:07:01 +00:00
dfr	79d2dfdaa6	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
jeff	4cd4553bb5	- Fix the last of the threading bugs that were introduced as far back as 1.38 in 2001. Break out of the FOREACH_THREAD_IN_PROC loop when we've discovered a new proc in the chain. - Increment i and check for maxlockdepth once per matching process not once per thread. This didn't properly terminate the loop before. - Fix a bug which has existed potentially since rev 1.1. waitblock->lf_next can be NULL when a thread has been woken-up but not yet scheduled. Check for this condition rather than blindly dereferencing. Found by: libMicro	2008-03-19 07:13:24 +00:00
jeff	46f09d5bc3	- Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from requiring the per-process spinlock to only requiring the process lock. - Reflect these changes in the proc.h documentation and consumers throughout the kernel. This is a substantial reduction in locking cost for these fields and was made possible by recent changes to threading support.	2008-03-19 06:19:01 +00:00
kib	b236f3d925	Do not call free() while holding vnode interlock. Reported and tested by: Peter Holm Reviewed by: jeff Approved by: re (kensmith)	2007-08-07 09:04:50 +00:00
jeff	4c779e4b9e	- Remove explicit Giant protection from lockf. Use the vnode interlock to protect this datastructure instead. - Preallocate an extra lockf structure in case we want to split a lock on insert or delete. - msleep() on the vnode interlock when blocking on a lock. Reviewed by: rwatson Approved by: re	2007-07-03 21:22:58 +00:00
jeff	91d1501790	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
phk	d3b2ec6954	Print name of device instead of useless major/minor numbers.	2005-03-29 08:13:01 +00:00
phk	9e33e49ed5	Fix a debug message to print a usable device name rather than useless major+minor tupple.	2005-03-15 14:08:10 +00:00
jeff	c9f0aca772	- Make lf_print static and move its prototype into kern_lockf.c - Protect all of the advlock code with Giant as some filesystems may not be entering with Giant held now. Sponsored by: Isilon Systems, Inc.	2005-01-25 10:15:26 +00:00
imp	20280f1431	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
imp	74cf37bd00	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
obrien	3b8fff9e4c	Use __FBSDID().	2003-06-11 00:56:59 +00:00
kan	9468fdaf14	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
phk	e059b79437	Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a newsted include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.	2003-03-18 08:45:25 +00:00
imp	cf874b345d	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
alfred	bf8e8a6e8f	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
mux	ad09b5b153	- Fix a bunch of casts to long which were truncating off_t's. - Remove the comments which were justifying this by the fact that we don't have %q in the kernel, this was probably right back in time, but we now have %q, and we even have better to print those types (%j).	2002-11-07 21:56:05 +00:00
mux	fefd97fe38	Remove a conditional #include <sys/kernel.h>, it is already included unconditionally before. Submitted by: Olivier Houchard <cognet@ci0.org>	2002-09-14 14:44:41 +00:00
phk	46cc4d0ca8	Add a #include for <sys/mount.h>	2002-08-13 10:07:05 +00:00

1 2

91 Commits