Fix a race condition with concurrent LOOKUP namecache operations for a vnode

not in the namecache when shared lookups are enabled (vfs.lookup_shared=1,
it is currently off by default) and the filesystem supports shared lookups
(e.g. NFS client).  Specifically, if multiple concurrent LOOKUPs both miss
in the name cache in parallel, each of the lookups may each end up adding an
entry to the namecache resulting in duplicate entries in the namecache
for the same pathname.  A subsequent removal of the mapping of that
pathname to that vnode (via remove or rename) would only evict one of the
entries from the name cache.  As a result, subseqent lookups for that
pathname would still return the old vnode.

This race was observed with shared lookups over NFS where a file was updated
by writing a new file out to a temporary file name and then renaming that
temporary file to the "real" file to effect atomic updates of a file.  Other
processes on the same client that were periodically reading the file would
occasionally receive an ESTALE error from open(2) because the VOP_GETATTR()
in nfs_open() would receive that error when given the stale vnode.

The fix here is to check for duplicates in cache_enter() and just return
if an entry for this same directory and leaf file name for this vnode is
already in the cache.  The check for duplicates is done by walking the
per-vnode list of name cache entries.  It is expected that this list should
be very small in the common case (usually 0 or 1 entries during a
cache_enter() since most files only have 1 "leaf" name).

Reviewed by:	ups, scottl
MFC after:	2 months
This commit is contained in:
jhb 2008-08-23 15:13:39 +00:00
parent 6455ebee9c
commit 6c646c3b72

View File

@ -500,8 +500,39 @@ cache_enter(dvp, vp, cnp)
hold = 0;
zap = 0;
/*
* Calculate the hash key and setup as much of the new
* namecache entry as possible before acquiring the lock.
*/
ncp = cache_alloc(cnp->cn_namelen);
ncp->nc_vp = vp;
ncp->nc_dvp = dvp;
len = ncp->nc_nlen = cnp->cn_namelen;
hash = fnv_32_buf(cnp->cn_nameptr, len, FNV1_32_INIT);
bcopy(cnp->cn_nameptr, ncp->nc_name, len);
hash = fnv_32_buf(&dvp, sizeof(dvp), hash);
CACHE_LOCK();
/*
* See if this vnode is already in the cache with this name.
* This can happen with concurrent lookups of the same path
* name.
*/
if (vp) {
struct namecache *n2;
TAILQ_FOREACH(n2, &vp->v_cache_dst, nc_dst) {
if (n2->nc_dvp == dvp &&
n2->nc_nlen == cnp->cn_namelen &&
!bcmp(n2->nc_name, cnp->cn_nameptr, n2->nc_nlen)) {
CACHE_UNLOCK();
cache_free(ncp);
return;
}
}
}
numcache++;
if (!vp) {
numneg++;
@ -513,16 +544,9 @@ cache_enter(dvp, vp, cnp)
}
/*
* Set the rest of the namecache entry elements, calculate it's
* hash key and insert it into the appropriate chain within
* the cache entries table.
* Insert the new namecache entry into the appropriate chain
* within the cache entries table.
*/
ncp->nc_vp = vp;
ncp->nc_dvp = dvp;
len = ncp->nc_nlen = cnp->cn_namelen;
hash = fnv_32_buf(cnp->cn_nameptr, len, FNV1_32_INIT);
bcopy(cnp->cn_nameptr, ncp->nc_name, len);
hash = fnv_32_buf(&dvp, sizeof(dvp), hash);
ncpp = NCHHASH(hash);
LIST_INSERT_HEAD(ncpp, ncp, nc_hash);
if (LIST_EMPTY(&dvp->v_cache_src)) {