freebsd-dev

Author	SHA1	Message	Date
John Baldwin	9078981ab1	Convert the global mutex protecting the directory lookup name cache from a mutex to a reader/writer lock. Lookup operations first grab a read lock and perform the lookup. If the operation results in a need to modify the cache, then it tries to do an upgrade. If that fails, it drops the read lock, obtains a write lock, and redoes the lookup.	2009-01-28 19:05:18 +00:00
John Baldwin	8a7ef10b71	- Mark all standalone INT/LONG/QUAD sysctl's MPSAFE. This is done inside the SYSCTL() macros and thus does not need to be done for all of the nodes scattered across the source tree. - Mark the name-cache related sysctl's (including debug.hashstat.) MPSAFE. - Mark vm.loadavg MPSAFE. - Remove GIANT_REQUIRED from vmtotal() (everything in this routine already has sufficient locking) and mark vm.vmtotal MPSAFE. - Mark the vm.stats.(sys\|vm). sysctls MPSAFE.	2009-01-23 22:49:23 +00:00
Stephen McKay	58c1607e03	Add a limit on namecache entries. In normal operation, the number of cache entries is roughly equal to the number of active vnodes. However, when most of the recently accessed vnodes have many hard links, the number of cache entries can be 32000 times as large, exhausting kernel memory and provoking a panic in kmem_malloc(). MFC after: 2 weeks	2009-01-20 04:21:21 +00:00
Konstantin Belousov	83e73926ad	In r185557, the check for existing negative entry for the given name did not compared nc_dvp with supplied parent directory vnode pointer. Add the check and note that now branches for vp != NULL and vp == NULL are the same, thus can be merged. Reported and reviewed by: kan Tested by: pho MFC after: 2 weeks	2008-12-30 12:51:14 +00:00
Joe Marcus Clarke	4769218f4b	Do not KASSERT when vp->v_dd is NULL. Only directories which have had ".." looked up would have v_dd set to a non-NULL value. This fixes a panic seen when running installworld on a diskless system with a separate /usr file system. Submitted by: cracauer Approved by: kib	2008-12-23 20:43:42 +00:00
Konstantin Belousov	86dcb537c9	Keep the hold on the vnode during VOP_VPTOCNP() call, allowing the vop implementation to drop vnode lock, if needed. Reported and tested by: pho	2008-12-23 20:04:31 +00:00
Joe Marcus Clarke	b9022449b3	Add a new VOP, VOP_VPTOCNP, which translates a vnode to its component name on a best-effort basis. Teach vn_fullpath to use this new VOP if a regular VFS cache lookup fails. This VOP is designed to supplement the VFS cache to provide a better chance that a vnode-to-name lookup will succeed. Currently, an implementation for devfs is being committed. The default implementation is to return ENOENT. A big thanks to kib for the mentorship on this, and to pho for running it through his stress test suite. Reviewed by: arch Approved by: kib	2008-12-12 00:57:38 +00:00
Konstantin Belousov	d6568724e1	Shared lookup makes it possible to create several negative cache entries for one name. Then, creating inode with that name would remove one entry, leaving others dormant. Reclaiming the vnode would uncover negative entries, causing false return of ENOENT from the calls like stat, that do not create inode. Prevent creation of the duplicated negative entries. Reported and debugged with: pho Reviewed by: jhb X-MFC: after shared lookup changes	2008-12-02 11:14:16 +00:00
Joe Marcus Clarke	ef61995ebd	Move vn_fullpath1() outside of FILEDESC locking. This is being done in advance of teaching vn_fullpath1() how to query file systems for vnode-to-name mappings when cache lookups fail. Thanks to kib for guidance and patience on this process. Reviewed by: kib Approved by: kib	2008-11-25 15:36:15 +00:00
John Baldwin	d2722d704c	Part 1 of making shared lookups more resilient with respect to forced unmounts. When we upgrade a vnode lock from shared to exclusive during a name cache lookup, fail the lookup with EBADF if the vnode is invalidated while we are waiting for the exclusive lock. Also, for correctness (though I'm not sure it can occur in practice), downgrade an exclusively locked vnode if it should be share locked. Tested by: pho	2008-09-24 18:51:33 +00:00
John Baldwin	cbb598af66	Sort includes.	2008-09-18 20:04:22 +00:00
John Baldwin	969bf150df	Fix a race condition with concurrent LOOKUP namecache operations for a vnode not in the namecache when shared lookups are enabled (vfs.lookup_shared=1, it is currently off by default) and the filesystem supports shared lookups (e.g. NFS client). Specifically, if multiple concurrent LOOKUPs both miss in the name cache in parallel, each of the lookups may each end up adding an entry to the namecache resulting in duplicate entries in the namecache for the same pathname. A subsequent removal of the mapping of that pathname to that vnode (via remove or rename) would only evict one of the entries from the name cache. As a result, subseqent lookups for that pathname would still return the old vnode. This race was observed with shared lookups over NFS where a file was updated by writing a new file out to a temporary file name and then renaming that temporary file to the "real" file to effect atomic updates of a file. Other processes on the same client that were periodically reading the file would occasionally receive an ESTALE error from open(2) because the VOP_GETATTR() in nfs_open() would receive that error when given the stale vnode. The fix here is to check for duplicates in cache_enter() and just return if an entry for this same directory and leaf file name for this vnode is already in the cache. The check for duplicates is done by walking the per-vnode list of name cache entries. It is expected that this list should be very small in the common case (usually 0 or 1 entries during a cache_enter() since most files only have 1 "leaf" name). Reviewed by: ups, scottl MFC after: 2 months	2008-08-23 15:13:39 +00:00
Alfred Perlstein	cbd3ba3edf	Prevent crashes due to unlocked access to hash buckets in two sysctls. Use CACHE_LOCK to prevent crashes. Sysctls fixed: debug.hashstat.nchash and debug.hashstat.rawnchash. Obtained from: Juniper Networks MFC After: 1 week	2008-08-16 21:48:10 +00:00
Christian S.J. Peron	dfc714fba1	Currently, BSM audit pathname token generation for chrooted or jailed processes are not producing absolute pathname tokens. It is required that audited pathnames are generated relative to the global root mount point. This modification changes our implementation of audit_canon_path(9) and introduces a new function: vn_fullpath_global(9) which performs a vnode -> pathname translation relative to the global mount point based on the contents of the name cache. Much like vn_fullpath, vn_fullpath_global is a wrapper function which called vn_fullpath1. Further, the string parsing routines have been converted to use the sbuf(9) framework. This change also removes the conditional acquisition of Giant, since the vn_fullpath1 method will not dip into file system dependent code. The vnode locking was modified to use vhold()/vdrop() instead the vref() and vrele(). This will modify the hold count instead of modifying the user count. This makes more sense since it's the kernel that requires the reference to the vnode. This also makes sure that the vnode does not get recycled we hold the reference to it. [1] Discussed with: rwatson Reviewed by: kib [1] MFC after: 2 weeks	2008-07-31 16:57:41 +00:00
Pawel Jakub Dawidek	b03d720760	- Use LK_TYPE_MASK where needed. Actually after sys/sys/lockmgr.h:1.69 it is no longer needed, but for now we still want to be consistent with other similar checks in the tree. - Call ASSERT_VOP_ELOCKED() only when vget() returns 0. Reviewed by: jeff	2008-04-09 20:19:55 +00:00
Konstantin Belousov	0a3af16a75	Add the utility function vn_commname() to retrieve the command name from the vfs namecache, when available. Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 11:53:03 +00:00
Robert Watson	237fdd787b	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Kris Kennaway	e6d64a0f15	Remove remaining Giant acquisition around vn_fullpath1. This was missed in r1.106 and has not been required for some years now. Reviewed by: jeff MFC After: 1 week	2007-11-22 21:26:25 +00:00
Pawel Jakub Dawidek	b4d7e2983c	Fix some locking cases where we ask for exclusively locked vnode, but we get shared locked vnode in instead when vfs.lookup_shared is set to 1. Discussed with: kib, kris Tested by: kris Approved by: re (kensmith)	2007-09-21 10:16:56 +00:00
Pawel Jakub Dawidek	dfe97ff4a5	We only flush entries related to the given file system. Currently there are no 'invalid' cache entires - file system is responsible for keeping it that way. The comment should have been updated in rev.1.25.	2007-06-18 09:28:24 +00:00
Pawel Jakub Dawidek	6e042171bd	To avoid a deadlock when handling .. directory during a lookup, we unlock parent vnode and relock it after locking child vnode. The problem was that we always relock it exclusively, even when it was share-locked. Discussed with: jeff	2007-05-25 22:23:38 +00:00
Pawel Jakub Dawidek	b4c85af977	We no longer need to put namecache entries onto temporary mplist. It was useful in revision 1.86, but should have been removed in 1.89.	2007-05-25 22:19:49 +00:00
Pawel Jakub Dawidek	950afe9972	The cache_leaf_test() function seems to be unused, so remove it.	2007-05-25 22:16:17 +00:00
Pawel Jakub Dawidek	f013ccb768	- Remove redundant initialization. - Compare pointer with NULL.	2007-05-22 23:05:48 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Robert Watson	873fbcd776	Further system call comment cleanup: - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde) - Remove extra blank lines in some cases. - Add extra blank lines in some cases. - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name. - Add punctuation. - Re-wrap some comments.	2007-03-05 13:10:58 +00:00
Christian S.J. Peron	4f0840f348	Axe Giant from vn_fullpath(9). The vnode -> pathname lookup should be filesystem agnostic. We are not touching any file system specific functions in this code path. Since we have a cache lock, there is really no need to keep Giant around here. This eliminates Giant acquisitions for any syscall which is auditing pathnames. Discussed with: jeff	2006-06-16 05:09:28 +00:00
John-Mark Gurney	e98b5a89de	remove duplicate sizeof vnode entry (debug.sizeof.vnode already existed)... move ncsize into debug.sizeof and rename to namecache...	2006-04-16 18:38:30 +00:00
Jeff Roberson	2f0bca553a	- Don't check v_mount for NULL to determine if a vnode has been recycled. Use the more appropriate VI_DOOMED flag instead. Sponsored by: Isilon Systems, Inc. MFC After: 1 week	2006-02-06 10:15:27 +00:00
Jeff Roberson	32b6dcd8a4	- Fix a leaked reference to a vnode via v_dd. We rely on cache_purge() and cache_zap() to clear the v_dd pointers when a directory vnode is forcibly discarded. For this to work, all vnodes with v_dd pointers to a directory must also have name cache entries linked via v_cache_dst to that dvp otherwise we could not find them at cache_purge() time. The following code snipit could break this guarantee by unlinking a directory before fetching it's dotdot. The dotdot lookup would initialize the v_dd field of the unlinked directory which could never be cleared. To fix this we don't initialize v_dd for orphaned vnodes. printf("rmdir: %d\n", rmdir("../foo")); /* foo is cwd */ printf("chdir: %d\n", chdir("..")); printf("%s\n", getwd(NULL)); Sponsored by: Isilon Systems, Inc. Discovered by: kkenn Approved by: re (blanket vfs)	2005-06-17 01:05:13 +00:00
Jeff Roberson	6bd8103d33	- Clear v_dd in cache_zap() instead of cache_purge() as cache_purge() may not be called in all cases where we free the cnp. Sponsored by: Isilon Systems, Inc.	2005-06-13 05:59:59 +00:00
Jeff Roberson	eff2d12635	- Add KTR_VFS messages for various name cache related events. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:03 +00:00
Jeff Roberson	1b2da2d0fa	- Assert that we're not adding a doomed vnode to the name cache. Sponsored by: Isilon Systems, Inc.	2005-06-11 08:47:30 +00:00
Jeff Roberson	4585e3ac5a	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. Se vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
David Schultz	7ce7f713ee	Eliminate v_id and v_ddid. The name cache now holds references to vnodes whose names it caches, so we no longer need a `generation number' to tell us if a referenced vnode is invalid. Replace the use of the parent's v_id in the hash function with the address of the parent vnode. Tested by: Peter Holm Glanced at by: jeff, phk	2005-03-30 03:01:36 +00:00
David Schultz	dd33f0d92f	Merge kern___cwd() and vn_fullpath(), which were virtually identical, except for places where people forget to update one of them. We now collect only one set of stats for both of these routines. Other changes in this commit include: - Start acquiring Giant again in vn_fullpath(), since it is required when crossing a mount point. - Expand the scope of the cache lock to avoid dropping it and picking it up again for every pathname component. This also makes it trivial to avoid races in stats collection. - Assert that nc_dvp == v_dd for directories instead of returning an error to userland when this is not true. AFAIK, it should always be true when v_dd is non-null. - For vn_fullpath(), handle the first (non-directory) vnode separately. Glanced at by: jeff, phk	2005-03-30 02:59:32 +00:00
Jeff Roberson	5280e61f2f	- Move the logic that locks and refs the new vnode from vfs_cache_lookup() to cache_lookup(). This allows us to acquire the vnode interlock before dropping the cache lock. This protects the vnodes identity until we have locked it. Sponsored by: Isilon Systems, Inc.	2005-03-29 12:59:06 +00:00
Jeff Roberson	571211c454	- Get rid of the old LOOKUP_SHARED code. namei() now supplies the proper lock flags via cn_lkflag. Sponsored by: Isilon Systems, Inc.	2005-03-29 10:08:23 +00:00
Jeff Roberson	b75719afea	- Invalidate the childrens v_dd pointers when we cache_purge() a directory. Otherwise the stale pointer may be accessed after a vnode is freed. Sponsored by: Isilon Systems, Inc.	2005-03-29 09:58:41 +00:00
Jeff Roberson	f7b404d88f	- Remove an unused variable. Sponsored by: Isilon Systems, Inc.	2005-03-28 13:29:48 +00:00
Jeff Roberson	ee5a0a2d7c	- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. Sponsored by: Isilon Systems, Inc.	2005-03-28 09:26:17 +00:00
Jeff Roberson	fdd6a3ff3c	- All of the bugs which lead to the complication of the LOOKUP_SHARED config option have now been fixed. All filesystems are properly locked and checked via DEBUG_VFS_LOCKS. Remove the workaround code. Sponsored by: Isilon Systems, Inc.	2005-03-24 06:00:45 +00:00
Poul-Henning Kamp	2adc2b87c7	Make a SYSCTL_NODE and a mutex static	2005-02-10 12:16:42 +00:00
Jeff Roberson	799cc2dcee	- Simplify the cache locking. The lock order relationship with the vnode lock is much simpler than I originally thought it would be. Now, the cache lock is always acquired before the vnode lock. - Provide some gotos in __getcwd() to simplify the unlocking a bit. - Move Giant acquisition down into __getcwd(). Sponsored By: Isilon Systems, Inc.	2005-01-24 10:24:12 +00:00
Warner Losh	9454b2d864	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
Warner Losh	7f8a436ff2	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
Jeff Roberson	98d7d155c1	- Apply a big giant lock around the namecache. This has been sitting in my tree since BSDcon.	2003-10-05 07:13:50 +00:00

1 2 3

135 Commits