Author SHA1 Message Date
Rick Macklem
c1970a7eba nfscl: Fix IO_APPEND writes from kernel space
Commit 867c27c23a modified the NFS client so that
it did IO_APPEND writes directly to the NFS server
bypassing the buffer cache, via a call to
nfs_directio_write().  Unfortunately, this (very old)
function assumed that the uio iov was for user space
addresses.  As such, an IO_APPEND VOP_WRITE() that
was for system space, such as ktrace(1) does, would
write bogus data.

This patch fixes nfs_directio_write() so that it
handles kernel space uio iovs.

Reported by:	bz
Tested by:	bz
MFC after:	2 weeks
2022-03-28 15:11:52 -07:00
Gordon Bergling
c1ad8a39a1 nfsclient: Fix a typo in source code comments
- s/ony/only/

Obtained from:	NetBSD
MFC after:	3 days
2022-03-27 19:27:05 +02:00
Rick Macklem
f37dc50d9f nfscl: Do not do a Lookup+Open for pNFS mounts
A NFSv4.1/4.2 pNFS mount needs to do a
separate Open+LayoutGet RPC, so do not do
a Lookup+Open RPC for these mounts.

The Lookup+Open RPCs are still disabled,
until further testing is done, so this patch
has no effect at this time.
2022-03-17 07:48:06 -07:00
Rick Macklem
57014f21e7 nfscl: Fix NFSv4.1/4.2 Lookup+Open RPC
Use of the Lookup+Open RPC is currently disabled,
due to a problem detected during testing.  This
patch fixes this problem.  The problem was that
nfscl_postop_attr() does not parse the attributes
if nd_repstat != 0.  It also would parse the
return status for the operation, where the
Lookup+Open code had already parsed it.

The first change in the patch does not make any
semantics change, but makes the code identical
to what is done later in the function, so that
it is apparent that the semantics should be the
same in both places.

Lookup+Open remains disabled while further
testing is being done, so this patch has no
effect at this time.
2022-03-13 13:15:12 -07:00
Rick Macklem
1cedb4ea1a nfscl: Fix a use after free in nfscl_cleanupkext()
ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
  owp->nfsow_owner.  If we free that particular
  owner structure, then in subsequent comparisons
  "own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
  of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
  without the NFSCLSTATE mutex held, this could race with
  nfscl_cleanupkext().
  This could happen when the exclusive lock is held
  on the client, such as when delegations are being returned
  or when recovering from NFSERR_EXPIRED.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
    nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() so that it will never free more
    than the first matching element.  Normally there should only
    be one element in each list with a matching open/lock owner
    anyhow (but there might be a bug that results in a duplicate).
    This should guarantee that the FOREACH_SAFE loops in
    nfscl_cleanupkext() are adequate.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
    and nfscl_freelockowner(), if it is not already held.
    This serializes all of these calls with the ones done in
    nfscl_cleanup_common().

Reported by:	ler
Reviewed by:	markj
Tested by:	cy
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34334
2022-02-25 07:27:03 -08:00
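
Fixes 1 and 2 of commit 1cedb4ea1a above can be illustrated with a small userland sketch.  It is not the kernel code: the list type, the SKETCH_OWNERLEN constant and all sketch_* names are hypothetical, and locking (fix 3) is left out.  It only shows the two ideas of copying the owner key to a local buffer before anything is freed and of freeing at most the first matching element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_OWNERLEN 12	/* hypothetical owner string length */

struct sketch_owner {
	struct sketch_owner *next;
	unsigned char owner[SKETCH_OWNERLEN];
};

/*
 * Free at most the first element whose owner matches "own".
 * Returns true if an element was freed.
 */
static bool
sketch_cleanup_common(struct sketch_owner **headp, const unsigned char *own)
{
	struct sketch_owner **pp, *op;

	for (pp = headp; (op = *pp) != NULL; pp = &op->next) {
		if (memcmp(op->owner, own, SKETCH_OWNERLEN) == 0) {
			*pp = op->next;
			free(op);
			return (true);
		}
	}
	return (false);
}

/*
 * Remove the element matching "target".  The key is copied to a local
 * buffer first, so it stays valid after "target" itself is freed.
 */
static bool
sketch_cleanup_one(struct sketch_owner **headp, struct sketch_owner *target)
{
	unsigned char own[SKETCH_OWNERLEN];

	memcpy(own, target->owner, sizeof(own));
	return (sketch_cleanup_common(headp, own));
}

/* Helper that pushes a new owner onto the front of the list. */
static struct sketch_owner *
sketch_push(struct sketch_owner **headp, const char *name)
{
	struct sketch_owner *op;
	size_t n;

	op = calloc(1, sizeof(*op));
	if (op == NULL)
		return (NULL);
	n = strlen(name);
	if (n > SKETCH_OWNERLEN)
		n = SKETCH_OWNERLEN;
	memcpy(op->owner, name, n);
	op->next = *headp;
	*headp = op;
	return (op);
}
```

Because at most one element is unlinked per call, a caller that saved the "next" pointer before the call can be sure that pointer is still valid, unless the saved "next" element is itself the one that was freed.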
Rick Macklem
06148d2251 Revert "nfscl: Fix a use after free in nfscl_cleanupkext()"
This reverts commit dd08b84e35.

cy@ reported a problem caused by this patch.  He will be
testing an alternate patch, but I'm reverting this one.
2022-02-24 07:01:03 -08:00
Rick Macklem
dd08b84e35 nfscl: Fix a use after free in nfscl_cleanupkext()
ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
  owp->nfsow_owner.  If we free that particular
  owner structure, then in subsequent comparisons
  "own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
  of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
  without the NFSCLSTATE mutex held, this could race with
  nfscl_cleanupkext().
  This could happen when the exclusive lock is held
  on the client, such as when delegations are being returned.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
    nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() to return whether or not a
    free was done.
    When a free was done, do a goto to restart the loop, instead
    of using FOREACH_SAFE, which was not safe in this case.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
    and nfscl_freelockowner(), if it is not already held.
    This serializes all of these calls with the ones done in
    nfscl_cleanup_common().

Reported by:	ler
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34334
2022-02-22 14:21:43 -08:00
Rick Macklem
98c788737f nfsclient: Delete unused function nfscl_getcookie()
The function nfscl_getcookie(), which is essentially the
same as ncl_getcookie(), is never called, so delete it.
This is probably cruft left over from the port of the
NFSv4 code to FreeBSD several years ago.

Found while modifying the code to better use the
directory offset cookies.

MFC after:	2 weeks
2022-01-27 15:30:26 -08:00
Rick Macklem
a91a57846b nfsd: Do not accept audit/alarm ACEs for the NFSv4 server
The UFS and ZFS file systems only support Allow/Deny ACEs
in the NFSv4 ACLs.  This patch does not allow the server
to parse Audit/Alarm ACEs.  The NFSv4 client is still
allowed to parse Audit/Alarm ACEs, since non-FreeBSD NFSv4
servers may use them.

This patch should not have a significant effect, since the
UFS and ZFS file systems will not handle these ACEs anyhow.
It simply serves as an additional "safety belt" for the
NFSv4 server.

MFC after:	2 weeks
2022-01-11 09:40:07 -08:00
Rick Macklem
5da9b3b011 Revert "nfscommon: Add arguments for support of the dacl attribute"
This reverts commit 0fa074b53e.

I now see that the implementation of the "dacl" operation
requires the NFSv4 server to do "automatic inheritance"
and I do not plan on doing this.  As such, this patch is
harmless, but unneeded.
2022-01-11 08:30:50 -08:00
Rick Macklem
e4df1036f6 nfscl: Always invalidate buffers for append writes
kib@ reported a problem which was resolved by
reverting commit 867c27c23a, which changed the NFS
client to use direct RPCs to the server for
IO_APPEND writes.  He also spotted that the
code only invalidated buffer cache buffers
when they were marked NMODIFIED (had been
written into).

This patch modifies the NFS VOP_WRITE() to
always invalidate the buffer cache buffers
and pages for the file when IO_APPEND is
specified.  It also includes some cleanup
suggested by kib@.

Reported by:	kib
Tested by:	kib
Reviewed by:	kib
MFC after:	10 weeks
2022-01-06 14:18:36 -08:00
Rick Macklem
0fa074b53e nfscommon: Add arguments for support of the dacl attribute
NFSv4.1/4.2 has an alternative to the acl attribute, called
dacl, that includes support for the ACL_ENTRY_INHERITED flag,
called NFSV4ACE_INHERITED in NFSv4.

This patch adds a dacl argument to nfsrv_buildacl(),
nfsrv_dissectacl() and nfsrv_dissectace(), so that they
will handle NFSV4ACE_INHERITED when dacl == true.

Since these functions are always called with dacl == false
for this patch, semantics should not have changed.
A future patch will add support for dacl.

MFC after:	2 weeks
2021-12-26 16:43:46 -08:00
Rick Macklem
b70042adfe nfscl: Check for mmap(2)'d file before doing direct output
Commit 867c27c23a modified the NFS client so that
it does IO_APPEND writes directly to the NFS server,
bypassing the buffer cache.  However, this could result
in stale data in client pages when the file is mmap(2)'d.
As such, the NFS client needs to call vm_object_is_active()
to check if the file is mmap(2)'d and only do direct
output if the file is not mmap(2)'d.

This patch adds this check.

Although a simple patch, I have given it a long MFC,
since the related commit 867c27c23a made a significant
semantics change and, as such, has a long MFC.

MFC after:	3 months
2021-12-20 13:10:26 -08:00
Rick Macklem
150da1e3cd nfscl: Partially revert commit 867c27c23a
Commit 867c27c23a enabled the n_directio_opens code
in open/close, which sets/clears NNONCACHE, for
IO_APPEND. This code should not be enabled unless
newnfs_directio_enable is non-zero.

This patch reverts that part of commit 867c27c23a.

A future patch will fix the case where the
file that is being written IO_APPEND is mmap()'d.

MFC after:	3 months
2021-12-16 14:30:37 -08:00
Rick Macklem
e0861304a7 nfscl: Handle CB_SEQUENCE not first op correctly
The check for "not first operation" in CB_SEQUENCE
was done after the slot, etc. was updated. This patch
moves the check to the beginning of CB_SEQUENCE
processing.

While here, also fix the check for "no CB_SEQUENCE operation first"
by moving the check to the beginning of callback operation parsing,
since the check was in a couple of the other operations, but
not all of them.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260412
MFC after:	2 weeks
2021-12-15 16:36:40 -08:00
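
The ordering rule that commit e0861304a7 above enforces can be sketched as a single up-front check.  This is only an illustration under assumptions: the enum values are hypothetical and are not the protocol's operation numbers, the real client checks while it parses each operation rather than over an array, and treating an empty compound as acceptable is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation codes for this sketch; not the wire values. */
enum sketch_cbop {
	SKETCH_CB_SEQUENCE,
	SKETCH_CB_RECALL,
	SKETCH_CB_LAYOUTRECALL
};

/*
 * Validate operation ordering for a callback compound before any
 * operation is processed: CB_SEQUENCE must be the first operation
 * and must not appear anywhere else.
 */
static bool
sketch_cb_order_ok(const enum sketch_cbop *ops, size_t nops)
{
	size_t i;

	if (nops == 0)
		return (true);	/* assumption: nothing to process */
	if (ops[0] != SKETCH_CB_SEQUENCE)
		return (false);
	for (i = 1; i < nops; i++) {
		if (ops[i] == SKETCH_CB_SEQUENCE)
			return (false);
	}
	return (true);
}
```

Doing the check before any slot or session state is touched mirrors the point of the commit: a malformed request is rejected without side effects.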
Rick Macklem
867c27c23a nfscl: Change IO_APPEND writes to direct I/O
IO_APPEND writes have always been very slow over NFS, due to
the need to acquire an up to date file size after flushing
all writes to the NFS server.

This patch switches the IO_APPEND writes to use direct I/O,
bypassing the buffer cache.  As such, flushing of writes
normally only occurs when the open(..O_APPEND..) is done.
It does imply that all writes must be done synchronously
and must be committed to stable storage on the file server
(NFSWRITE_FILESYNC).

For a simple test program that does 10,000 IO_APPEND writes
in a loop, performance improved significantly with this patch.

For a UFS exported file system, the test ran 12x faster.
This drops to 3x faster when the open(2)/close(2) are done
for each loop iteration.
For a ZFS exported file system, the test ran 40% faster.

The much smaller improvement may have been because the ZFS
file system I tested against does not have a ZIL log and
does have "sync" enabled.

Note that IO_APPEND write performance is still much slower
than when done on local file systems.

Although this is a simple patch, it does result in a
significant semantics change, so I have given it a
large MFC time.

Tested by:	otis
MFC after:	3 months
2021-12-15 08:35:48 -08:00
Rick Macklem
fe04c91184 nfscl: add a filesize limit check to nfs_allocate()
As reported in PR#260343, nfs_allocate() did not check
the filesize rlimit. This patch adds that check.

PR:	260343
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33422
2021-12-13 15:32:19 -08:00
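
A userland sketch of the kind of file size limit check that commit fe04c91184 above adds is shown here.  It uses getrlimit(2) rather than the kernel's internal helpers, the sketch_* names are hypothetical, and returning EFBIG is an assumption made for this sketch, not something stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/resource.h>

/*
 * Return 0 if the range [offset, offset + len) ends at or below "limit",
 * or EFBIG otherwise.  Written to avoid overflow in offset + len.
 */
static int
sketch_check_fsize(uint64_t offset, uint64_t len, uint64_t limit)
{
	if (len > limit || offset > limit - len)
		return (EFBIG);
	return (0);
}

/* Same check against the calling process's RLIMIT_FSIZE soft limit. */
static int
sketch_check_fsize_rlimit(uint64_t offset, uint64_t len)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_FSIZE, &rl) != 0)
		return (errno);
	if (rl.rlim_cur == RLIM_INFINITY)
		return (0);
	return (sketch_check_fsize(offset, len, (uint64_t)rl.rlim_cur));
}
```

An allocation of 100 bytes at offset 0 passes against a limit of 100, while the same length at offset 1 does not.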
Rick Macklem
24947b701d nfscl: Fix must_commit handling for mirrored pNFS mounts
For pNFS mounts to mirrored Flexible File layout pNFS servers,
the "must_commit" component in the nfsclwritedsdorpc
structure must be checked and the "must_commit" argument passed
into nfscl_doiods() must be updated.  Technically, only writes to
the DS with a writeverf change must be redone, but since this
occurrence will be rare, the must_commit argument to nfscl_doiods()
is set to 1, so all writes to all DSs will be redone.

This bug would affect few, since use of mirrored pNFS servers
is rare and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots.

MFC after:	2 weeks
2021-12-12 15:40:30 -08:00
Rick Macklem
ead50c94cb nfscl: Fix must_commit/writeverf handling for Direct I/O
Without this patch, the KASSERT(must_commit == 0,..) can be
triggered by the writeverf in the Direct I/O write reply changing.
This is not a situation that should cause a panic(). Correct
handling is to ignore the change in "writeverf" for Direct
I/O, since it is done with NFSWRITE_FILESYNC.

This patch modifies the semantics of the "must_commit"
argument slightly, allowing an initial value of 2 to indicate
that a change in "writeverf" should be ignored.
It also fixes the KASSERT()s.

This bug would affect few, since Direct I/O is not enabled
by default and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots, however I found the
bug when testing against a Linux 5.15.1 kernel nfsd, which
replied to a NFSWRITE_FILESYNC write with a "writeverf" of all
0x0 bytes.

MFC after:	2 weeks
2021-12-11 15:00:30 -08:00
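
The modified "must_commit" semantics described in commit ead50c94cb above can be sketched as follows.  This is a guess at the shape of the logic, not the kernel code: the function name is hypothetical, and both updating the saved verifier and leaving the value 2 unchanged on return are assumptions made for this sketch.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_VERFSIZE 8	/* an NFS write verifier is 8 bytes */

/*
 * Compare the verifier in a write reply with the saved one.
 * On entry, *must_commit is 0 for the normal case or 2 to request that
 * a verifier change be ignored (the write was done FILESYNC).
 * On return, it is 1 if uncommitted writes must be redone, otherwise
 * it is unchanged.
 */
static void
sketch_check_writeverf(unsigned char *saved, const unsigned char *reply,
    int *must_commit)
{
	if (memcmp(saved, reply, SKETCH_VERFSIZE) == 0)
		return;
	/* Remember the new verifier for later comparisons. */
	memcpy(saved, reply, SKETCH_VERFSIZE);
	if (*must_commit != 2)
		*must_commit = 1;
}
```

With this shape, a Direct I/O caller passes 2 and never sees the value become 1, so an assertion that no recommit was requested cannot fire just because the server sent a different verifier.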
Rick Macklem
ab639f2398 nfscl: Check for an error return from nfsrv_getattrbits()
There were two places where the client code did not check
for a parse error return from nfsrv_getattrbits().

This patch fixes both of these cases.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260272
MFC after:	2 weeks
2021-12-09 14:32:22 -08:00
Rick Macklem
d9931c2561 nfscl: Sanity check the callback tag length
The sanity check for tag length in a callback request
was broken in two ways:

It checked for a negative value, but not a large positive
value.

It did not set taglen to -1, to indicate to the code that
it should not be used.

This patch fixes both of these issues.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260266
MFC after:	2 weeks
2021-12-09 14:15:48 -08:00
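
The two-sided check described in commit d9931c2561 above can be sketched as below.  The upper bound SKETCH_TAG_MAX and the function name are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TAG_MAX 1024	/* hypothetical upper bound for this sketch */

/*
 * Validate a tag length taken from the wire.  Returns the length when it
 * is usable, or -1 to tell the caller that the tag must not be used.
 */
static int32_t
sketch_check_taglen(int32_t taglen)
{
	if (taglen < 0 || taglen > SKETCH_TAG_MAX)
		return (-1);
	return (taglen);
}
```

A caller would assign the result back to its own length variable, so later code only needs to test for -1.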
Rick Macklem
c3134a6af0 nfscl: Disable use of the LookupOpen RPC
The LookupOpen RPC reduces the number of Open RPCs
needed.  Unfortunately, it breaks certain software
builds over NFS, so disable it until this is fixed.

The LookupOpen RPC is only used for NFSv4.1/4.2
mounts when the "oneopenown" mount option is
specified, so this should not affect many users.
2021-11-27 15:34:45 -08:00
Rick Macklem
22f7bcb523 nfscl: Sanity check irdcnt in nfsrpc_createsession
Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	259996
MFC after:	2 weeks
2021-11-26 15:28:40 -08:00
Konstantin Belousov
8ef0c11e7c nfsclient: upgrade vnode lock in VOP_OPEN()/VOP_CLOSE() if we need to flush buffers
VOP_FSYNC() asserts that the vnode is exclusively locked for NFS.
If we try to execute a file with recently modified content, the assert is
triggered.

Reviewed by:	rmacklem
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32999
2021-11-16 19:13:29 +02:00
Rick Macklem
ce9676de86 pNFS: Add nfsstats counters for number of Layouts
For pNFS, Layouts are issued by the server to indicate
where a file's data resides on the DS(s).  This patch
adds counters for how many layouts are allocated to
the nfsstatsv1 structure, using two reserved fields.

MFC after:	2 weeks
2021-11-12 17:32:55 -08:00
Rick Macklem
44744f7538 nfscl: Add a LayoutError RPC for NFSv4.2 pNFS mounts
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing the writing.  The Linux
client then sends a LayoutError RPC to the MDS server to
tell it about the error.  This patch adds the same to the
FreeBSD NFSv4.2 pNFS client, to maintain Linux compatible
behaviour, particularly for non-FreeBSD pNFS servers.

MFC after:	2 weeks
2021-11-11 15:43:58 -08:00
Rick Macklem
f0c9847a6c vfs: Add "ioflag" and "cred" arguments to VOP_ALLOCATE
When the NFSv4.2 server does a VOP_ALLOCATE(), it needs
the operation to be done for the RPC's credential and not
td_ucred. It also needs the writing to be done synchronously.

This patch adds "ioflag" and "cred" arguments to VOP_ALLOCATE()
and modifies vop_stdallocate() to use these arguments.

The VOP_ALLOCATE.9 man page will be patched separately.

Reviewed by:	khng, kib
Differential Revision:	https://reviews.freebsd.org/D32865
2021-11-06 13:26:43 -07:00
Rick Macklem
f5d5164fb6 nfscl: Fix two more cases for forced dismount
Although I was not able to cause a failure during testing, there
are places in nfscl_removedeleg() and nfscl_renamedeleg() where
I think a forced dismount could get hung.  This patch fixes those.

This patch only affects forced dismount and only if the NFSv4
server is issuing delegations to the client.

Found by code inspection.

MFC after:	2 weeks
2021-11-05 15:33:19 -07:00
Rick Macklem
6b67753488 nfscl: Fix forced dismount from looping on commit
When a forced dismount is in progress, it is possible to
end up looping, retrying commits that fail.
This patch fixes the problem by pretending
that commits succeeded when a forced dismount is in progress.

MFC after:	2 weeks
2021-11-03 14:25:44 -07:00
Rick Macklem
4412225859 nfscl: Fix use after free for forced dismount
When a forced dismount is done and delegations are being
issued by the server (disabled by default for FreeBSD
servers), the delegation structure is free'd before the
loop calling vflush().  This could result in a use after
free of the delegation structure.

This patch changes the code so that the delegation
structures are not free'd until after the vflush()
loop for forced dismounts.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-03 12:15:40 -07:00
Rick Macklem
331883a2f2 nfscl: Check for a forced dismount in nfscl_getref()
The nfscl_getref() function is called within nfscl_doiods() when
the NFSv4.1/4.2 pNFS client is doing I/O on a DS.  As such,
nfscl_getref() needs to check for a forced dismount.
This patch adds that check.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-02 17:28:13 -07:00
Rick Macklem
d5d2ce1c85 nfscl: Do pNFS layout return_on_close synchronously
pNFS servers that specify that Layouts are to be returned
upon close may expect that LayoutReturn to happen before
the associated Close.

This patch modifies the NFSv4.1/4.2 pNFS client so that this
is done.  This only affects a pNFS mount against a non-FreeBSD
NFSv4.1/4.2 server that specifies return_on_close in LayoutGet
replies.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-10-31 16:31:31 -07:00
Rick Macklem
50dcff0816 nfscl: Add setting n_localmodtime to the Write RPC code
Similar to commit 2be417843a, I believe there could be a race between
the NFS client VOP_LOOKUP() and file Writing that could result in stale
file attributes being loaded into the NFS vnode by VOP_LOOKUP().

I have not been able to reproduce a failure due to this race, but
I believe that there are two possibilities:

The Lookup RPC happens while VOP_WRITE() is being executed and loads
stale file attributes after VOP_WRITE() returns when it has already
completed the Write/Commit RPC(s).
--> For this case, setting the local modify timestamp at the end of
  VOP_WRITE() should ensure that stale file attributes are not loaded.

The Lookup RPC occurs after VOP_WRITE() has returned, while
asynchronous Write/Commit RPCs are in progress and then is
blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which
will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the
NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP()
then acquires the NFS vnode lock and fills in stale file attributes.
 --> Setting the local modify timestamp in ncl_flush() and ncl_vinvalbuf()
   when they clear NMODIFIED should ensure that stale file attributes
   are not loaded.

This patch does the above.

PR:	259071
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32677
2021-10-30 17:08:28 -07:00
Rick Macklem
ab87c39c25 nfscl: Set n_localmodtime in Deallocate
Commit 2be417843a added n_localmodtime, which is used by Lookup
and ReaddirPlus to check to see if the file attributes in an RPC
reply might be stale.  This patch sets n_localmodtime in Deallocate.
Done as a separate commit, since Deallocate is not in stable/13.

PR:	259071
Reviewed by:	asomers
Differential Revision:	https://reviews.freebsd.org/D32635
2021-10-30 16:46:14 -07:00
Rick Macklem
2be417843a PR#259071 provides a test program that fails for the NFS client.
Testing with it, there appears to be a race between Lookup
and VOPs like Setattr-of-size, where Lookup ends up loading
stale attributes (including what might be the wrong file size)
into the NFS vnode's attribute cache.

The race occurs when the modifying VOP (which holds a lock
on the vnode), blocks the acquisition of the vnode in Lookup,
after the RPC (with now potentially stale attributes).

Here's what seems to happen:
Child                                Parent

does stat(), which does
VOP_LOOKUP(), doing the Lookup
RPC with the directory vnode
locked, acquiring file attributes
valid at this point in time

blocks waiting for locked file       does ftruncate(), which
vnode                                does VOP_SETATTR() of Size,
                                     changing the file's size
                                     while holding an exclusive
                                     lock on the file's vnode
                                     releases the vnode lock
acquires file vnode and fills in
now stale attributes including
the old wrong Size
                                     does a read() which returns
                                     wrong data size

This patch fixes the problem by saving a timestamp in the NFS vnode
in the VOPs that modify the file (Setattr-of-size, Allocate).
Then lookup/readdirplus compares that timestamp with the time just
before starting the RPC after it has acquired the file's vnode.
If the modifying RPC occurred during the Lookup, the attributes
in the RPC reply are discarded, since they might be stale.

With this patch the test program works as expected.

Note that the test program does not fail on a July stable/12,
although this race is in the NFS client code.  I suspect a
fairly recent change to the name caching code exposed this
bug.

PR:	259071
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32635
2021-10-30 16:35:02 -07:00
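
The timestamp comparison described in commit 2be417843a above can be sketched in userland C.  The function names are hypothetical and the sketch uses struct timespec directly; treating equal timestamps as "possibly stale" is a conservative assumption made here, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Return true if timespec "a" is strictly earlier than "b". */
static bool
sketch_ts_before(const struct timespec *a, const struct timespec *b)
{
	if (a->tv_sec != b->tv_sec)
		return (a->tv_sec < b->tv_sec);
	return (a->tv_nsec < b->tv_nsec);
}

/*
 * Decide whether attributes from a Lookup reply may be loaded.
 * "rpc_start" is sampled just before the RPC is sent and "localmod" is
 * the time saved by the last local modifying operation on the file.
 * The attributes are usable only if that modification predates the RPC.
 */
static bool
sketch_attrs_usable(const struct timespec *rpc_start,
    const struct timespec *localmod)
{
	return (sketch_ts_before(localmod, rpc_start));
}
```

In the Child/Parent scenario above, the ftruncate() would record a "localmod" later than the Child's "rpc_start", so the Child would discard the attributes from its Lookup reply.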
Rick Macklem
dc6dd769de nfscl: Use NFSMNTP_DELEGISSUED in two more functions
Commit 5e5ca4c8fc added a NFSMNTP_DELEGISSUED flag to indicate when
a delegation has been issued to the mount.  For the common case
where an NFSv4 server is not issuing delegations, this flag
can be checked to avoid acquisition of the NFSCLSTATEMUTEX.

This patch adds checks for NFSMNTP_DELEGISSUED being set
to two more functions.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where the server is not issuing delegations.

MFC after:	2 weeks
2021-10-29 20:35:02 -07:00
Rick Macklem
23024f004a nfscl: Add a missing delegation lock release
There was a case in nfscl_doiods() where the function would return
without releasing the delegation shared lock, if it was acquired by
the call to nfscl_getstateid().  This patch adds that release.

I have never observed a failure due to this missing release, so I
do not know if it ever happens in practice.  However, since the pNFS
client is not yet heavily used, it might be the case.

Found by code inspection during a recent NFSv4 IETF working group
testing event.

MFC after:	2 weeks
2021-10-25 19:11:45 -07:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Rick Macklem
52dee2bc03 nfscl: Handle NFSv4.1/4.2 Close RPC NFSERR_DELAY replies better
Without this patch, if a NFSv4.1/4.2 server replies NFSERR_DELAY to
a Close operation, the client loops retrying the Close while holding
a shared lock on the clientID.  This shared lock blocks returns of
delegations, even though the server has issued a CB_RECALL to request
the delegation return.

This patch delays doing a retry of a Close that received a reply of
NFSERR_DELAY until after the shared lock on the clientID is released,
for NFSv4.1/4.2.  To fix this for NFSv4.0 would be very difficult and
since the only known NFSv4 server to reply NFSERR_DELAY to Close only
does NFSv4.1/4.2, this fix is hoped to be sufficient.

This problem was detected during a recent IETF working group NFSv4
testing event.

MFC after:	2 weeks
2021-10-18 15:05:34 -07:00
Rick Macklem
d95c0a12a2 nfscl: Modify Close RPC so that it does not use "owner" for NFSv4.1/4.2
This patch modifies the function that does the Close RPC (nfsrpc_closerpc)
so that it does not use the open_owner (nfso_own) for NFSv4.1/4.2.
Use of the seqid in the open_owner structure is only needed for NFSv4.0.
Same applies to a NFSERR_STALESTATEID reply, which should only happen
for NFSv4.0.  This allows nfsrpc_closerpc() to be called when nfso_own
is no longer valid.  This, in turn, allows nfsrpc_closerpc() to be called
after the shared lock on the clientID is released, for NFSv4.1/4.2.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 weeks
2021-10-17 17:50:56 -07:00
Rick Macklem
e2aab5e2d7 nfscl: Move release of the clientID lock into nfscl_doclose()
This patch moves release of the shared clientID lock from nfsrpc_close()
just after the nfscl_doclose() call to the end of nfscl_doclose() call.
This does make the code cleaner, since the shared lock is acquired at
the beginning of nfscl_doclose().  The only semantics change is that
the code no longer drops and reacquires the NFSCLSTATELOCK() mutex,
which I do not believe will have a negative effect on the NFSv4 client.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 weeks
2021-10-16 15:49:38 -07:00
Rick Macklem
77c595ce33 nfscl: Add an argument to nfscl_tryclose()
This patch adds a new argument to nfscl_tryclose() to indicate
whether or not it should loop when a NFSERR_DELAY reply is received
from the NFSv4 server.  Since this new argument is always passed in
as "true" at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 weeks
2021-10-15 14:25:38 -07:00
Rick Macklem
6495766acf nfscl: Restructure nfscl_freeopen() slightly
This patch factors the unlinking of the nfsclopen structure out of
nfscl_freeopen() into a separate function called nfscl_unlinkopen().
It also adds a new argument to nfscl_freeopen() to conditionally do
the unlink.  Since this new argument is always passed in as "true"
at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 weeks
2021-10-14 17:28:01 -07:00
Rick Macklem
24af0fcdfc nfscl: Make nfscl_getlayout() acquire the correct pNFS layout
Without this patch, if a pNFS read layout has already been acquired
for a file, writes would be redirected to the Metadata Server (MDS),
because nfscl_getlayout() would not acquire a read/write layout for
the file.  This happened because there was no "mode" argument to
nfscl_getlayout() to indicate whether reading or writing was being done.
Since doing I/O through the Metadata Server is not encouraged for some
pNFS servers, it is preferable to get a read/write layout for writes
instead of redirecting the write to the MDS.

This patch adds an access mode argument to nfscl_getlayout() and
nfsrpc_getlayout(), so that nfscl_getlayout() knows to acquire a read/write
layout for writing, even if a read layout has already been acquired.
This patch only affects NFSv4.1/4.2 client behaviour when pNFS ("pnfs" mount
option against a server that supports pNFS) is in use.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	2 weeks
2021-10-13 15:48:54 -07:00
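
The decision that the new access mode argument in commit 24af0fcdfc above makes possible can be sketched as a small predicate.  The enum and function names are hypothetical; the sketch only captures the rule that an already acquired read layout does not satisfy a write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout modes for this sketch. */
enum sketch_iomode {
	SKETCH_IOMODE_NONE,	/* no layout held */
	SKETCH_IOMODE_READ,	/* read layout */
	SKETCH_IOMODE_RW	/* read/write layout */
};

/*
 * Return true if the layout already held is good enough for the
 * requested I/O ("wanted" is READ or RW).  When it returns false,
 * the caller needs to acquire a new layout.
 */
static bool
sketch_layout_ok(enum sketch_iomode held, enum sketch_iomode wanted)
{
	if (held == SKETCH_IOMODE_NONE)
		return (false);
	if (wanted == SKETCH_IOMODE_RW)
		return (held == SKETCH_IOMODE_RW);
	return (true);
}
```

Without a "wanted" argument, the only question that can be asked is whether any layout is held, which is how writes ended up being redirected to the MDS.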
Rick Macklem
b82168e657 nfscl: Fix another deadlock related to the NFSv4 clientID lock
Without this patch, it is possible to hang the NFSv4 client,
when a rename/remove is being done on a file where the client
holds a delegation, if pNFS is being used.  For a delegation
to be returned, dirty data blocks must be flushed to the NFSv4
server.  When pNFS is in use, a shared lock on the clientID
must be acquired while doing a write to the DS(s).
However, if rename/remove is doing the delegation return
an exclusive lock will be acquired on the clientID, preventing
the write to the DS(s) from acquiring a shared lock on the clientID.

This patch stops rename/remove from doing a delegation return
if pNFS is enabled.  Since doing delegation return in the same
compound as rename/remove is only an optimization, not doing
so should not cause problems.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	1 week
2021-10-12 17:21:01 -07:00
Rick Macklem
120b20bdf4 nfscl: Fix a deadlock related to the NFSv4 clientID lock
Without this patch, it is possible for a process doing an NFSv4
Open/create of a file to block to allow another process
to acquire the exclusive lock on the clientID when holding
a shared lock on the clientID.  As such, both processes
deadlock, with one wanting the exclusive lock, while the
other holds the shared lock.  This deadlock is unlikely to occur
unless delegations are in use on the NFSv4 mount.

This patch fixes the problem by not deferring to the process
waiting for the exclusive lock when a shared lock (reference cnt)
is already held by the process.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	1 week
2021-10-11 21:58:24 -07:00
Mateusz Guzik
b4a58fbf64 vfs: remove cn_thread
It is always curthread.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32453
2021-10-11 13:21:47 +00:00
Rick Macklem
235891a127 nfscl: Fix NFS VOP_ALLOCATE for mounts without Allocate support
Without this patch, nfs_allocate() fell back on using vop_stdallocate()
for NFS mounts without Allocate operation support.  This was incorrect,
since some file systems, such as ZFS, cannot do allocate via
vop_stdallocate(), which uses writes to try and allocate blocks.

Also, fix nfs_allocate() to return EINVAL when mounts cannot do Allocate,
since that is the correct error for posix_fallocate(2).
Note that Allocate is only supported by some NFSv4.2 servers.

MFC after:	2 weeks
2021-10-10 14:27:52 -07:00
Rick Macklem
62c5be4ab4 nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg()
Commit 5e5ca4c8fc added a flag to a NFSv4 mount point that is set when
the first delegation is acquired from the NFSv4 server.

For a common case where delegations are not being issued by the
NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for
open/lock state, finds the delegation list empty, then just unlocks the
mutex and returns. This patch adds a check of the flag to avoid the
need to acquire the mutex for this common case.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where the server is not issuing delegations.

This commit should not affect the high level semantics of delegation
handling.

MFC after:      2 weeks
2021-09-26 18:37:25 -07:00
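
The fast path described in commit 62c5be4ab4 above can be sketched with a pthread mutex and a C11 atomic flag standing in for the kernel mutex and the mount flag.  All names are hypothetical, and the lock_acquisitions counter exists only so the effect can be observed in this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_mount {
	atomic_bool deleg_issued;	/* set once the first delegation arrives */
	pthread_mutex_t state_mtx;	/* protects deleg_count */
	int deleg_count;
	int lock_acquisitions;		/* instrumentation for this sketch only */
};

static void
sketch_mount_init(struct sketch_mount *mp)
{
	atomic_init(&mp->deleg_issued, false);
	pthread_mutex_init(&mp->state_mtx, NULL);
	mp->deleg_count = 0;
	mp->lock_acquisitions = 0;
}

/* Record a new delegation and set the "has been issued" flag. */
static void
sketch_add_deleg(struct sketch_mount *mp)
{
	pthread_mutex_lock(&mp->state_mtx);
	mp->lock_acquisitions++;
	mp->deleg_count++;
	atomic_store(&mp->deleg_issued, true);
	pthread_mutex_unlock(&mp->state_mtx);
}

/*
 * Remove one delegation, if there is one.  When no delegation has ever
 * been issued to this mount, return without touching the mutex.
 */
static bool
sketch_remove_deleg(struct sketch_mount *mp)
{
	bool removed = false;

	if (!atomic_load(&mp->deleg_issued))
		return (false);
	pthread_mutex_lock(&mp->state_mtx);
	mp->lock_acquisitions++;
	if (mp->deleg_count > 0) {
		mp->deleg_count--;
		removed = true;
	}
	pthread_mutex_unlock(&mp->state_mtx);
	return (removed);
}
```

The flag is never cleared in this sketch, so once a delegation has been seen every later call takes the mutex again; only mounts that never receive delegations get the lock-free path.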
Gordon Bergling
90d60ca8b7 nfsclient: Fix a typo in a comment
- s/derefernce/dereference/

MFC after:	3 days
2021-09-26 15:17:00 +02:00