NFS clients was reported to freebsd-fs@ under the subject "NFS
corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when
a TCP mounted root fs was changed to using UDP. I believe that this
problem was caused by the change in mnt_stat.f_iosize that occurred
because rsize was decreased to the maximum supported by UDP. This
patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize,
since the latter is set to f_iosize when the vnode is allocated, but
does not change for a given vnode when f_iosize changes.
Reported by: pjd
Reviewed by: kib
MFC after: 2 weeks
appropriate timestamps. Restore the assertions which verify that
NCF_TS is set when timestamp is asked for.
Reviewed by: jhb (previous version)
MFC after: 2 weeks
we will only trust a positive name cache entry for a specified amount of
time before falling back to a LOOKUP RPC, even if the ctime for the file
handle matches the cached copy in the name cache entry. The timeout is
configured via a new 'nametimeo' mount option and defaults to 60 seconds.
It may be set to zero to disable positive name caching entirely.
Reviewed by: rmacklem
MFC after: 1 week
from TCP to UDP and the rsize/wsize/readdirsize is greater
than NFS_MAXDGRAMDATA, it is possible for a thread doing an
I/O RPC to get stuck repeatedly doing retries. This happens
because the RPC will use a resize/wsize/readdirsize that won't
work for UDP and, as such, it will keep failing indefinitely.
This patch returns an error for this case, to avoid the problem.
A discussion on freebsd-fs@ seemed to indicate that returning
an error was preferable to silently ignoring the "udp"/"mntudp"
option.
This problem was discovered while investigating a problem reported
by pjd@ via email.
MFC after: 2 weeks
entries on one client when a directory was renamed on another client. The
root cause for the stale entry being trusted is that each per-vnode nfsnode
structure has a single 'n_ctime' timestamp used to validate positive name
cache entries. However, if there are multiple entries for a single vnode,
they all share a single timestamp. To fix this, extend the name cache
to allow filesystems to optionally store a timestamp value in each name
cache entry. The NFS clients now fetch the timestamp associated with
each name cache entry and use that to validate cache hits instead of the
timestamps previously stored in the nfsnode. Another part of the fix is
that the NFS clients now use timestamps from the post-op attributes of
RPCs when adding name cache entries rather than pulling the timestamps out
of the file's attribute cache. The latter is subject to races with other
lookups updating the attribute cache concurrently. Some more details:
- Add a variant of nfsm_postop_attr() to the old NFS client that can return
a vattr structure with a copy of the post-op attributes.
- Handle lookups of "." as a special case in the NFS clients since the name
cache does not store name cache entries for ".", so we cannot get a
useful timestamp. It didn't really make much sense to recheck the
attributes on the the directory to validate the namecache hit for "."
anyway.
- ABI compat shims for the name cache routines are present in this commit
so that it is safe to MFC.
MFC after: 2 weeks
The effect of this was, for clients mounted via inet6 addresses,
that the DRC cache would never have a hit in the server. It also
broke NFSv4 callbacks when an inet6 address was the only one available
in the client. This patch fixes the above, plus deletes opt_inet6.h
from a couple of files it is not needed for.
MFC after: 2 weeks
jhb@ spotted that nfscl_getstateid() might modify credentials when
called from nfsrpc_read() for the case where p != NULL, whereas
nfsrpc_read() only did a crdup() to get new credentials for p == NULL.
This bug was introduced by r195510, since pre-r195510 nfscl_getstateid()
only modified credentials for the p == NULL case. This patch modifies
nfsrpc_read()/nfsrpc_write() so that they do crdup() for the p != NULL case.
It is conceivable that this bug caused the crash reported by glebius@, but
that will not be determined for some time, since the crash occurred after
about 1month of operation.
Tested by: glebius
Reviewed by: jhb
MFC after: 2 weeks
of the same lock_owner4 string. As such, the handling of cleanup
of lock_owners could be simplified. This simplification permitted
the client to do a ReleaseLockOwner operation when the process that
the lock_owner4 string represents, has exited. This permits the
server to release any storage related to the lock_owner4 string
before the associated open is closed. Without this change, it
is possible to exhaust a server's storage when a long running
process opens a file and then many child processes do locking
on the file, because the open doesn't get closed. A similar patch
was applied to the Linux NFSv4 client recently so that it wouldn't
exhaust a server's storage.
Reviewed by: zack
MFC after: 2 weeks
client. This does not change the client's behaviour, but prepares
the code so that nfsrpc_rellockown() can be called elsewhere in a
future commit.
MFC after: 2 weeks
head nfsc_defunctlockowner. This patch simply removes the code that
loops through this always empty list, since the code no longer does
anything useful. It should not have any effect on the client's
behaviour.
MFC after: 2 weeks
for regular files. Since other file types don't write into the
buffer cache, calling ncl_flush() is almost a no-op. However, it does
clear the NMODIFIED flag and this shouldn't be done by nfs_fsync() for
directories.
MFC after: 2 weeks
before the nfs_decode_args() call in the new NFS client, so
that a specfied command line value won't be overwritten.
Also, modify the calculation for small values of desiredvnodes
to avoid an unusually large value or a divide by zero crash.
It seems that the default value for nm_wcommitsize is very
conservative and may need to change at some time.
PR: kern/159351
Submitted by: onwahe at gmail.com (earlier version)
Reviewed by: jhb
MFC after: 2 weeks
kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.
Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc
This was reported to the mailing list freebsd-net@freebsd.org
on July 21, 2011 under the subject "LOR with nfsclient sillyrename".
The LOR occurred when nfs_inactive() called vrele(sp->s_dvp)
while holding the vnode lock on the file in s_dvp. This patch
modifies the client so that it performs the vrele(sp->s_dvp)
as a separate task to avoid the LOR. This fix was discussed
with jhb@ and kib@, who both proposed variations of it.
Tested by: pho, jlott at averesystems.com
Submitted by: jhb (earlier version)
Reviewed by: kib
Approved by: re (kib)
MFC after: 2 weeks
failed after the file was created in nfs_create(). This would
probably only happen during a forced dismount. The old NFS client
does have a vput() for this case. Detected by pho during recent
testing, where an open syscall returned with a vnode still locked.
Tested by: pho
Approved by: re (kib)
MFC after: 2 weeks
loop in nfscl_getcl() when a forced dismount is in progress,
because nfsv4_lock() will return 0 without sleeping when
MNTK_UNMOUNTF is set.
This patch fixes it so it won't loop calling nfsv4_lock()
for this case.
MFC after: 2 weeks
multiple instances of the same lock_owner when a process both
inherited an open file descriptor plus opened the same file itself.
Since some NFSv4 servers cannot handle multiple instances of
the same lock_owner string, this patch changes the algorithm
used by nfscl_getopen() in the new NFSv4 client to keep that
from happening. The new algorithm is simpler, since there is
no longer any need to ascend the process's parentage tree because
all NFSv4 Closes for a file are done at VOP_INACTIVE()/VOP_RECLAIM(),
making the Opens indistinct w.r.t. use with Lock Ops.
This problem was discovered at the recent NFSv4 interoperability
Bakeathon.
MFC after: 2 weeks
to the lock_owner4 string that goes on the wire. Also, add
code to do a ReleaseLockOwner Op on the lock_owner4 string
before a Close. Apparently not all NFSv4 servers handle multiple
instances of the same lock_owner4 string, at least not in a
compatible way. This patch avoids having multiple instances,
except for one unusual case, which will be fixed by a future commit.
Found at the recent NFSv4 interoperability Bakeathon.
Tested by: tdh at excfb.com
MFC after: 2 weeks
mode attribute in as 0 when doing writes. The change adds
the Mode attribute plus the others except Owner and Owner_group
to the list requested by the NFSv4 Write Operation. This fixed
a problem where an executable file built by "cc" would get mode
0111 instead of 0755 for some NFSv4 servers.
Found at the recent NFSv4 interoperability Bakeathon.
Tested by: tdh at excfb.com
MFC after: 2 weeks
the NFS subsystems use five of the rpcsec_gss/kgssapi entry points,
but since it was not obvious which others might be useful, all
nineteen were included. Basically the nineteen entry points are
set in a structure called rpc_gss_entries and inline functions
defined in sys/rpc/rpcsec_gss.h check for the entry points being
non-NULL and then call them. A default value is returned otherwise.
Requested by rwatson.
Reviewed by: jhb
MFC after: 2 weeks
cloned from the old NFS client, plus additions for NFSv4. A
review of this code is in progress, however it was felt by the
reviewer that it could go in now, before code slush. Any changes
required by the review can be committed as bug fixes later.
should be ok, since the client now delays NFSv4 Close operations
until VOP_INACTIVE()/VOP_RECLAIM(). As such, there should be no
risk that the NFSv4 Open is closed while an associated byte range lock
still exists.
Tested by: avg
MFC after: 2 weeks
"p_leader" for the "id" for POSIX byte range locking. I think
this would only have affected processes created by rfork(2)
with the RFTHREAD flag specified. This patch fixes that by
passing the "id" down through the various functions from
nfs_advlock().
MFC after: 2 weeks
VM_PAGER_AGAIN to VM_PAGER_ERROR for the uwritten pages. Return
VM_PAGER_AGAIN for the partially written page. Always forward at least
one page in the loop of vm_object_page_clean().
VM_PAGER_ERROR causes the page reactivation and does not clear the
page dirty state, so the write is not lost.
The change fixes an infinite loop in vm_object_page_clean() when the
filesystem returns permanent errors for some page writes.
Reported and tested by: gavin
Reviewed by: alc, rmacklem
MFC after: 1 week
Pathconf RPC for cases where the reply doesn't include
the answer. This fixes a problem reported by avg@ where
the NFSv3 Pathconf RPC would fail when "ls -l" did an
lpathconf(2) for _PC_ACL_NFS4.
Tested by: avg
MFC after: 2 weeks
correctly during a forced dismount. This required that
the exclusive and shared (refcnt) sleep lock functions check
for MNTK_UMOUNTF before sleeping, so that they won't block
while nfscl_umount() is getting rid of the state. As
such, a "struct mount *" argument was added to the locking
functions. I believe the only remaining case where a forced
dismount can get hung in the kernel is when a thread is
already attempting to do a TCP connect to a dead server
when the krpc client structure called nr_client is NULL.
This will only happen just after a "mount -u" with options
that force a new TCP connection is done, so it shouldn't
be a problem in practice.
MFC after: 2 weeks
in the new NFS client so that a forced dismount doesn't
get stuck in the VFS_SYNC() call that happens before
VFS_UNMOUNT() in dounmount().
Additional changes are needed before forced dismounts will work.
MFC after: 2 weeks
argument for a write RPC when it succeeds for the first one and
fails for a subsequent RPC within the same call to the function.
This makes it compatible with the old NFS client for this case.
MFC after: 2 weeks
to both NFS clients. This avoids the crash reported by
Sergey Kandaurov (pluknet@gmail.com) to the freebsd-fs@
list with subject "[old nfsclient] different nmount()
args passed from mount vs mount_nfs" dated May 17, 2011.
Tested by: pluknet at gmail.com (old nfs client)
MFC after: 2 weeks
new NFS client. It will then be reduced to whatever the
server says it can support. There might be an argument
that this could be one block larger, but since NFS is
a byte granular system, I chose not to do that.
Suggested by: Matt Dillon
Tested by: Daniel Braniss (earlier version)
MFC after: 2 weeks
that are now in "struct statfs" for NFSv3 and NFSv4. Since
the ffiles value is uint64_t on the wire, I clip the value
to INT64_MAX to avoid setting f_ffree negative.
Tested by: kib
MFC after: 2 weeks
checking at open time. It may improve performance for read-only
NFS mounts. Use deliberately.
MFC after: 1 week
Reviewed by: rmacklem, jhb (earlier version)