Commit Graph

374 Commits

Author SHA1 Message Date
Alan Somers
32fbc5d824 nfs: don't truncate directory cookies to 32-bits in the NFS server
In NFSv2, the directory cookie was 32-bits.  NFSv3 widened it to
64-bits and SVN r22521 widened the corresponding argument in
VOP_READDIR, but FreeBSD's NFS server continued to treat the cookies as
32-bits, and 0-extended to fill the field on the wire.  Nobody ever
noticed, because every in-tree file system generates cookies that fit
comfortably within 32-bits.

Also, have better type safety for txdr_hyper.  Turn it into an inline
function that type-checks its arguments.  Prevents warnings about
shift-count-overflow.

PR:		260375
MFC after:	2 weeks
Reviewed by:	rmacklem
Differential Revision: https://reviews.freebsd.org/D33404
2021-12-15 20:54:57 -07:00
Rick Macklem
2d90ef4714 nfsd: Fix Verify for attributes like FilesAvail
When the Verify operation calls nfsv4_loadattr(), it provides
the "struct statfs" information that can be used for doing a
compare for FilesAvail, FilesFree, FilesTotal, SpaceAvail,
SpaceFree and SpaceTotal.  However, the code erroneously
used the "struct nfsstatfs *" argument that is NULL.
This patch fixes these cases to use the correct argument
structure.  For the case of FilesAvail, the code in
nfsv4_fillattr() was factored out into a separate function
called nfsv4_filesavail(), so that it can be called from
nfsv4_loadattr() as well as nfsv4_fillattr().

In fact, most of the code in nfsv4_filesavail() is old
OpenBSD code that does not build/run on FreeBSD, but I
left it in place, in case it is of some use someday.

I am not aware of any extant NFSv4 client that does Verify
on these attributes.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260176
MFC after:	2 weeks
2021-12-04 14:38:55 -08:00
Rick Macklem
480be96e1e nfsd: Sanity check the Layouttype count
Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260155
MFC after:	2 weeks
2021-12-04 14:22:19 -08:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Rick Macklem
fd020f197d nfsd: Sanity check the ACL attribute
When an ACL is presented to the NFSv4 server in
Setattr or Verify, parsing of the ACL assumed a
sane acecnt and sane sizes for the "who" strings.
This patch adds sanity checks for these.

The patch also fixes handling of an error
return from nfsrv_dissectacl() for one broken
case.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260111
MFC after:	2 weeks
2021-12-01 13:55:17 -08:00
Rick Macklem
638b90a191 nfs: Quiet a few "unused" warnings
For most of these warnings, the variable is loaded
with data parsed out of an RPC messages.  In case
the data is useful in the future, I just marked
these with __unused.
2021-11-28 15:48:51 -08:00
Rick Macklem
1c15c8c0e9 nfscl: Sanity check the Sequence slotid in reply
The slotid in the Sequence reply must be the same as
in the request.  Check that it is the same and log
a console message if it is not, plus set it to the
correct value.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260071
MFC after:	2 weeks
2021-11-27 15:02:04 -08:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Rick Macklem
ce9676de86 pNFS: Add nfsstats counters for number of Layouts
For pNFS, Layouts are issued by the server to indicate
where a file's data resides on the DS(s).  This patch
adds counters for how many layouts are allocated to
the nfsstatsv1 structure, using two reserved fields.

MFC after:	2 weeks
2021-11-12 17:32:55 -08:00
Rick Macklem
44744f7538 nfscl: Add a LayoutError RPC for NFSv4.2 pNFS mounts
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing.  For the Linux
client, it then sends a LayoutError RPC to the MDS server to
tell it about the error.  This patch adds the same to the
FreeBSD NFSv4.2 pNFS client, to maintain Linux compatible
behaviour, particlularily for non-FreeBSD pNFS servers.

MFC after:	2 weeks
2021-11-11 15:43:58 -08:00
Rick Macklem
f8dc06303b nfsd: Fix the NFSv4.2 pNFS MDS server for NFSERR_NOSPC via LayoutError
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing.  For the Linux
client, it then sends a LayoutError RPC to the MDS server to
tell it about the error and keeps retrying, doing repeated
LayoutGets to the MDS and Write RPCs to the DS.  The Linux client is
"stuck" until disk space on the DS is free'd up unless
a subsequent LayoutGet request is sent a NFSERR_NOSPC
reply.
The looping problem still occurs for NFSv4.1 mounts, but no
fix for this is known at this time.

This patch changes the pNFS MDS server to reply to LayoutGet
operations with NFSERR_NOSPC once a LayoutError reports the
problem, until the DS has available space.  This keeps the Linux
NFSv4.2 from looping.

Found during recent testing because of issues w.r.t. a DS
being out of space found during a recent IEFT NFSv4 working
group testing event.

MFC after:	2 weeks
2021-11-08 15:58:00 -08:00
Rick Macklem
d70ca5b00e nfsd: Fix f_bavail and f_ffree for NFSv4 when negative
Since the NFS Space_available and Files_available are unsigned,
the NFSv3 server sets them to 0 when negative, so that they
do not appear to be large positive values for non-FreeBSD clients.
This patch fixes the NFSv4 server to do the same.

Found during a recent IEFT NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-08 12:59:31 -08:00
Rick Macklem
80e5955b08 nfscl: Fix NFSv4.1/4.2 pnfs mounts using nconnect
When a mount with the "pnfs" and "nconnect" options specified
does an I/O operation, it erroneously uses a TCP connection
to the MDS when it is meant to be a DS operation and, as such,
needs to use a TCP connection to the DS.  This patch fixes this.

When the "pnfs" and "nconnect" options are specified for a
NFSv4.1/4.2 mount, there probably should be N connections
established to each DS for I/O RPCs.  This is a fair amount
of work and may be done in a future commit.

This problem was found during a recent IETF NFSv4 working
group testing event.

MFC after:	2 weeks
2021-11-04 17:06:34 -07:00
Rick Macklem
ae49051c03 nfscl: Fix forced dismount when "nconnect" is specified
When a forced dismount is done and the "nconnect" mount
option was used, the additional connections must be closed.
This patch does that.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-03 13:26:38 -07:00
Rick Macklem
4412225859 nfscl: Fix use after free for forced dismount
When a forced dismount is done and delegations are being
issued by the server (disabled by default for FreeBSD
servers), the delegation structure is free'd before the
loop calling vflush().  This could result in a use after
free of the delegation structure.

This patch changes the code so that the delegation
structures are not free'd until after the vflush()
loop for forced dismounts.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-03 12:15:40 -07:00
Rick Macklem
5a95a6e8e4 nfscl: Use a smaller initial delay time for NFSERR_DELAY
For NFS RPCs that receive a NFSERR_DELAY reply, the delay time
is initially 1sec and then increases exponentially to NFS_TRYLATERDEL.
It was found that this delay time is excessive for some NFSv4
servers, which work well with a 1msec delay.
A 1sec delay resulted in very slow performance for Remove and
Rename when delegations and pNFS were enabled.

This patch decreases the initial delay time to 1msec.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-01 17:21:31 -07:00
Rick Macklem
d5d2ce1c85 nfscl: Do pNFS layout return_on_close synchronously
For pNFS servers that specify that Layouts are to be returned
upon close, they may expect that LayoutReturn to happen before
the associated Close.

This patch modifies the NFSv4.1/4.2 pNFS client so that this
is done.  This only affects a pNFS mount against a non-FreeBSD
NFSv4.1/4.2 server that specifies return_on_close in LayoutGet
replies.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-10-31 16:31:31 -07:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Rick Macklem
52dee2bc03 nfscl: Handle NFSv4.1/4.2 Close RPC NFSERR_DELAY replies better
Without this patch, if a NFSv4.1/4.2 server replies NFSERR_DELAY to
a Close operation, the client loops retrying the Close while holding
a shared lock on the clientID.  This shared lock blocks returns of
delegations, even though the server has issued a CB_RECALL to request
the delegation return.

This patch delays doing a retry of a Close that received a reply of
NFSERR_DELAY until after the shared lock on the clientID is released,
for NFSv4.1/4.2.  To fix this for NFSv4.0 would be very difficult and
since the only known NFSv4 server to reply NFSERR_DELAY to Close only
does NFSv4.1/4.2, this fix is hoped to be sufficient.

This problem was detected during a recent IETF working group NFSv4
testing event.

MFC after:	2 week
2021-10-18 15:05:34 -07:00
Rick Macklem
77c595ce33 nfscl: Add an argument to nfscl_tryclose()
This patch adds a new argument to nfscl_tryclose() to indicate
whether or not it should loop when a NFSERR_DELAY reply is received
from the NFSv4 server.  Since this new argument is always passed in
as "true" at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 week
2021-10-15 14:25:38 -07:00
Rick Macklem
6495766acf nfscl: Restructure nfscl_freeopen() slightly
This patch factors the unlinking of the nfsclopen structure out of
nfscl_freeopen() into a separate function called nfscl_unlinkopen().
It also adds a new argument to nfscl_freeopen() to conditionally do
the unlink.  Since this new argument is always passed in as "true"
at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 week
2021-10-14 17:28:01 -07:00
Rick Macklem
24af0fcdfc nfscl: Make nfscl_getlayout() acquire the correct pNFS layout
Without this patch, if a pNFS read layout has already been acquired
for a file, writes would be redirected to the Metadata Server (MDS),
because nfscl_getlayout() would not acquire a read/write layout for
the file.  This happened because there was no "mode" argument to
nfscl_getlayout() to indicate whether reading or writing was being done.
Since doing I/O through the Metadata Server is not encouraged for some
pNFS servers, it is preferable to get a read/write layout for writes
instead of redirecting the write to the MDS.

This patch adds a access mode argument to nfscl_getlayout() and
nfsrpc_getlayout(), so that nfscl_getlayout() knows to acquire a read/write
layout for writing, even if a read layout has already been acquired.
This patch only affects NFSv4.1/4.2 client behaviour when pNFS ("pnfs" mount
option against a server that supports pNFS) is in use.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	2 week
2021-10-13 15:48:54 -07:00
Rick Macklem
120b20bdf4 nfscl: Fix a deadlock related to the NFSv4 clientID lock
Without this patch, it is possible for a process doing an NFSv4
Open/create of a file to block to allow another process
to acquire the exclusive lock on the clientID when holding
a shared lock on the clientID.  As such, both processes
deadlock, with one wanting the exclusive lock, while the
other holds the shared lock.  This deadlock is unlikely to occur
unless delegations are in use on the NFSv4 mount.

This patch fixes the problem by not deferring to the process
waiting for the exclusive lock when a shared lock (reference cnt)
is already held by the process.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	1 week
2021-10-11 21:58:24 -07:00
Mateusz Guzik
ef7d2c1fc1 nfs: eliminate thread argument from nfsvno_namei
This is a step towards retiring struct componentname cn_thread

Reviewed by:	rmacklem
Differential Revision:	https://reviews.freebsd.org/D32267
2021-10-02 00:57:20 +00:00
Rick Macklem
55089ef4f8 nfscl: Make vfs.nfs.maxcopyrange larger by default
As of commit 103b207536, the NFSv4.2 server will limit the size
of a Copy operation based upon a 1 second timeout.  The Linux 5.2
kernel server also limits Copy operation size to 4Mbytes.
As such, the NFSv4.2 client can attempt a large Copy without
resulting in a long RPC RTT for these servers.

This patch changes vfs.nfs.maxcopyrange to 64bits and sets
the default to the maximum possible size of SSIZE_MAX, since
a larger size makes the Copy operation more efficient and
allows for copying to complete with fewer RPCs.
The sysctl may be need to be made smaller for other non-FreeBSD
NFSv4.2 servers.

MFC after:	2 weeks
2021-09-11 15:36:32 -07:00
Rick Macklem
08b9cc316a nfscl: Add a VOP_DEALLOCATE() for the NFSv4.2 client
This patch adds a VOP_DEALLOCATE() to the NFS client.
For NFSv4.2 servers that support the Deallocate operation,
it is used. Otherwise, it falls back on calling
vop_stddeallocate().

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31640
2021-08-27 18:31:36 -07:00
Rick Macklem
bb958dcf3d nfsd: Add support for the NFSv4.2 Deallocate operation
The recently added VOP_DEALLOCATE(9) VOP call allows
implementation of the Deallocate NFSv4.2 operation.

Since the Deallocate operation is a single succeed/fail
operation, the call to VOP_DEALLOCATE(9) loops so long
as progress is being made.  It calls maybe_yield()
between loop iterations to allow other processes
to preempt it.

Where RFC 7862 underspecifies behaviour, the code
is written to be Linux NFSv4.2 server compatible.

Reviewed by:	khng
Differential Revision:	https://reviews.freebsd.org/D31624
2021-08-26 18:14:11 -07:00
Rick Macklem
3ad1e1c1ce nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2
This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2
NFS client, which can be used by nfs_lookup() so that a
subsequent Open RPC is not required.
It uses the cn_flags OPENREAD, OPENWRITE added by commit c18c74a87c.
This reduced the number of RPCs by about 15% for a kernel
build over NFS.

For now, use of Lookup+Open is only done when the "oneopenown"
mount option is used.  It may be possible for Lookup+Open to
be used for non-oneopenown NFSv4.1/4.2 mounts, but that will
require extensive further testing to determine if it works.

While here, I've added the changes to the nfscommon module
that are needed to implement the Deallocate NFSv4.2 operation.
This avoids needing another cycle of changes to the internal
KAPI between the NFS modules.

This commit has changed the internal KAPI between the NFS
modules and, as such, all need to be rebuilt from sources.
I have not bumped __FreeBSD_version, since it was bumped a
few days ago.
2021-08-11 18:49:26 -07:00
Rick Macklem
ee29e6f311 nfsd: Add sysctl to set maximum I/O size up to 1Mbyte
Since MAXPHYS now allows the FreeBSD NFS client
to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio
so that the maximum NFS server I/O size can be set up to 1Mbyte.
The Linux NFS client can also do 1Mbyte I/O operations.

The default of 128Kbytes for the maximum I/O size has
not been changed for two reasons:
- kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O
- The limited benchmarking I can do actually shows a drop in I/O rate
  when the I/O size is above 256Kbytes.
However, daveb@spectralogic.com reports seeing an increase
in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client.

Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30826
2021-07-16 15:01:03 -07:00
Rick Macklem
1e0a518d65 nfscl: Add a Linux compatible "nconnect" mount option
Linux has had an "nconnect" NFS mount option for some time.
It specifies that N (up to 16) TCP connections are to created for a mount,
instead of just one TCP connection.

A discussion on freebsd-net@ indicated that this could improve
client<-->server network bandwidth, if either the client or server
have one of the following:
- multiple network ports aggregated to-gether with lagg/lacp.
- a fast NIC that is using multiple queues
It does result in using more IP port#s and might increase server
peak load for a client.

One difference from the Linux implementation is that this implementation
uses the first TCP connection for all RPCs composed of small messages
and uses the additional TCP connections for RPCs that normally have
large messages (Read/Readdir/Write).  The Linux implementation spreads
all RPCs across all TCP connections in a round robin fashion, whereas
this implementation spreads Read/Readdir/Write across the additional
TCP connections in a round robin fashion.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30970
2021-07-08 17:39:04 -07:00
Rick Macklem
c5f4772c66 nfscl: Improve "Consider increasing kern.ipc.maxsockbuf" message
When the setting of kern.ipc.maxsockbuf is less than what is
desired for I/O based on vfs.maxbcachebuf and vfs.nfs.bufpackets,
a console message of "Consider increasing kern.ipc.maxsockbuf".
is printed.

This patch modifies the message to provide a suggested value
for kern.ipc.maxsockbuf.
Note that the setting is only needed when the NFS rsize/wsize
is set to vfs.maxbcachebuf.

While here, make nfs_bufpackets global, so that it can be used
by a future patch that adds a sysctl to set the NFS server's
maximum I/O size.  Also, remove "sizeof(u_int32_t)" from the maximum
packet length, since NFS_MAXXDR is already an "overestimate"
of the actual length.

MFC after:	2 weeks
2021-06-30 15:15:41 -07:00
Rick Macklem
aed98fa5ac nfscl: Make NFSv4.0 client acquisition NFSv4.1/4.2 compatible
When the NFSv4.0 client was implemented, acquisition of a clientid
via SetClientID/SetClientIDConfirm was done upon the first Open,
since that was when it was needed.  NFSv4.1/4.2 acquires the clientid
during mount (via ExchangeID/CreateSession), since the associated
session is required during mount.

This patch modifies the NFSv4.0 mount so that it acquires the
clientid during mount.  This simplifies the code and makes it
easy to implement "find the highest minor version supported by
the NFSv4 server", which will be done for the default minorversion
in a future commit.
The "start_renewthread" argument for nfscl_getcl() is replaced
by "tryminvers", which will be used by the aforementioned
future commit.

MFC after:	2 weeks
2021-06-15 17:48:51 -07:00
Rick Macklem
a5df139ec6 nfsd: Fix when NFSERR_WRONGSEC may be replied to NFSv4 clients
Commit d224f05fcf pre-parsed the next operation number for
the put file handle operations.  This patch uses this next
operation number, plus the type of the file handle being set by
the put file handle operation, to implement the rules in RFC5661
Sec. 2.6 with respect to replying NFSERR_WRONGSEC.

This patch also adds a check to see if NFSERR_WRONGSEC should be
replied when about to perform Lookup, Lookupp or Open with a file
name component, so that the NFSERR_WRONGSEC reply is done for
these operations, as required by RFC5661 Sec. 2.6.

This patch does not have any practical effect for the FreeBSD NFSv4
client and I believe that the same is true for the Linux client,
since NFSERR_WRONGSEC is considered a fatal error at this time.

MFC after:	2 weeks
2021-06-05 16:53:07 -07:00
Rick Macklem
947bd2479b nfsd: Add support for the NFSv4.1/4.2 Secinfo_no_name operation
The Linux client is now attempting to use the Secinfo_no_name
operation for NFSv4.1/4.2 mounts.  Although it does not seem to
mind the NFSERR_NOTSUPP reply, adding support for it seems
reasonable.

I also noticed that "savflag" needed to be 64bits in
nfsrvd_secinfo() since nd_flag in now 64bits, so I changed
the declaration of it there.  I also added code to set "vp" NULL
after performing Secinfo/Secinfo_no_name, since these
operations consume the current FH, which is represented
by "vp" in nfsrvd_compound().

Fixing when the server replies NFSERR_WRONGSEC so that
it conforms to RFC5661 Sec. 2.6 still needs to be done
in a future commit.

MFC after:	2 weeks
2021-05-30 17:52:43 -07:00
Rick Macklem
3f7e14ad93 nfscl: Add hash lists for the NFSv4 opens
A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently.  When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.

This patch adds a table of hash lists for the opens, hashed on
file handle.  This table will be used by future commits to
search for an open based on file handle more efficiently.

MFC after:	2 weeks
2021-05-22 14:53:56 -07:00
Rick Macklem
fc0dc94029 nfsd: Reduce the callback timeout to 800msec
Recent discussion on the nfsv4@ietf.org mailing list confirmed
that an NFSv4 server should reply to an RPC in less than 1second.
If an NFSv4 RPC requires a delegation be recalled,
the server will attempt a CB_RECALL callback.
If the client is not responsive, the RPC reply will be delayed
until the callback times out.
Without this patch, the timeout is set to 4 seconds (set in
ticks, but used as seconds), resulting in the RPC reply taking over 4sec.
This patch redefines the constant as being in milliseconds and it
implements that for a value of 800msec, to ensure the RPC
reply is sent in less than 1second.

This patch only affects mounts from clients when delegations
are enabled on the server and the client is unresponsive to callbacks.

MFC after:	2 weeks
2021-05-18 16:17:58 -07:00
Rick Macklem
dd02d9d605 nfscl: Add support for va_birthtime to NFSv4
There is a NFSv4 file attribute called TimeCreate
that can be used for va_birthtime.
r362175 added some support for use of TimeCreate.
This patch completes support of va_birthtime by adding
support for setting this attribute to the server.
It also eanbles the client to
acquire and set the attribute for a NFSv4
server that supports the attribute.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30156
2021-05-07 17:30:56 -07:00
Mark Johnston
8bde6d15d1 nfsclient: Copy only initialized fields in nfs_getattr()
When loading attributes from the cache, the NFS client is careful to
copy only the fields that it initialized.  After fetching attributes
from the server, however, it would copy the entire vattr structure
initialized from the RPC response, so uninitialized stack bytes would
end up being copied to userspace.  In particular, va_birthtime (v2 and
v3) and va_gen (v3) had this problem.

Use a common subroutine to copy fields provided by the NFS client, and
ensure that we provide a dummy va_gen for the v3 case.

Reviewed by:	rmacklem
Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30090
2021-05-04 08:53:57 -04:00
Rick Macklem
f5ff282bc0 nfscl: fix the handling of NFSERR_DELAY for Open/LayoutGet RPCs
For a pNFS mount, the NFSv4.1/4.2 client uses compound RPCs that
have both Open and LayoutGet operations in them.
If the pNFS server were tp reply NFSERR_DELAY for one of these
compounds, the retry after a delay cannot be handled by
newnfs_request(), since there is a reference held on the open
state for the Open operation in them.

Fix this by adding these RPCs to the "don't do delay here"
list in newnfs_request().

This patch is only needed if the mount is using pNFS (the "pnfs"
mount option) and probably only matters if the MDS server
is issuing delegations as well as pNFS layouts.

Found by code inspection.

MFC after:	2 weeks
2021-04-26 17:48:21 -07:00
Rick Macklem
8759773148 nfsd: fix the slot sequence# when a callback fails
Commit 4281bfec36 patched the server so that the
callback session slot would be free'd for reuse when
a callback attempt fails.
However, this can often result in the sequence# for
the session slot to be advanced such that the client
end will reply NFSERR_SEQMISORDERED.

To avoid the NFSERR_SEQMISORDERED client reply,
this patch negates the sequence# advance for the
case where the callback has failed.
The common case is a failed back channel, where
the callback cannot be sent to the client, and
not advancing the sequence# is correct for this
case.  For the uncommon case where the client's
reply to the callback is lost, not advancing the
sequence# will indicate to the client that the
next callback is a retry and not a new callback.
But, since the FreeBSD server always sets "csa_cachethis"
false in the callback sequence operation, a retry
and a new callback should be handled the same way
by the client, so this should not matter.

Until you have this patch in your NFSv4.1/4.2 server,
you should consider avoiding the use of delegations.
Even with this patch, interoperation with the
Linux NFSv4.1/4.2 client in kernel versions prior
to 5.3 can result in frequent 15second delays if
delegations are enabled.  This occurs because, for
kernels prior to 5.3, the Linux client does a TCP
reconnect every time it sees multiple concurrent
callbacks and then it takes 15seconds to recover
the back channel after doing so.

MFC after:	2 weeks
2021-04-26 16:24:10 -07:00
Rick Macklem
aad780464f nfscl: return delegations in the NFS VOP_RECLAIM()
After a vnode is recycled it can no longer be
acquired via vfs_hash_get() and, as such,
a delegation for the vnode cannot be recalled.

In the unlikely event that a delegation still
exists when the vnode is being recycled, return
the delegation since it will no longer be
recallable.

Until you have this patch in your NFSv4 client,
you should consider avoiding the use of delegations.

MFC after:	2 weeks
2021-04-25 17:57:55 -07:00
Rick Macklem
78ffcb86d9 nfscommon: fix function name in comment
MFC after:	2 weeks
2021-04-19 20:09:46 -07:00
Rick Macklem
34256484af Revert "nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661"
This reverts commit 9edaceca81.

It turns out that the Linux client intentionally does an NFSv4.1
RPC with only a Sequence operation in it and with "seqid + 1"
for the slot.  This is used to re-synchronize the slot's seqid
and the client expects the NFS4ERR_SEQ_MISORDERED error reply.

As such, revert the patch, so that the server remains RFC5661
compliant.
2021-04-15 14:08:40 -07:00
Rick Macklem
9edaceca81 nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.

Sometimes, after some Linux NFSv4.1/4.2 clients establish
a new TCP connection, they will advance the sequence number
for a session slot by 2 instead of 1.
RFC5661 specifies that a server should reply
NFS4ERR_SEQ_MISORDERED for this case.
This might result in a system call error in the client and
seems to disable future use of the slot by the client.
Since advancing the sequence number by 2 seems harmless,
allow this case if vfs.nfs.linuxseqsesshack is non-zero.

Note that, if the order of RPCs is actually reversed,
a subsequent RPC with a smaller sequence number value
for the slot will be received.  This will result in
a NFS4ERR_SEQ_MISORDERED reply.
This has not been observed during testing.
Setting vfs.nfs.linuxseqsesshack to 0 will provide
RFC5661 compliant behaviour.

This fix affects the fairly rare case where a NFSv4
Linux client does a TCP reconnect and then apparently
erroneously increments the sequence number for the
session slot twice during the reconnect cycle.

PR:	254816
MFC after:	2 weeks
2021-04-11 16:51:25 -07:00
Rick Macklem
7763814fc9 nfsv4 client: do the BindConnectionToSession as required
During a recent testing event, it was reported that the NFSv4.1/4.2
server erroneously bound the back channel to a new TCP connection.
RFC5661 specifies that the fore channel is implicitly bound to a
new TCP connection when an RPC with Sequence (almost any of them)
is done on it.  For the back channel to be bound to the new TCP
connection, an explicit BindConnectionToSession must be done as
the first RPC on the new connection.

Since new TCP connections are created by the "reconnect" layer
(sys/rpc/clnt_rc.c) of the krpc, this patch adds an optional
upcall done by the krpc whenever a new connection is created.
The patch also adds the specific upcall function that does a
BindConnectionToSession and configures the krpc to call it
when required.

This is necessary for correct interoperability with NFSv4.1/NFSv4.2
servers when the nfscbd daemon is running.

If doing NFSv4.1/NFSv4.2 mounts without this patch, it is
recommended that the nfscbd daemon not be running and that
the "pnfs" mount option not be specified.

PR:	254840
Comments by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29475
2021-04-11 14:34:57 -07:00
Rick Macklem
22cefe3d83 nfsd: fix replies from session cache for multiple retries
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.

Commit 05a39c2c1c fixed replying with the cached reply in
in the session slot if same session slot sequence#.
However, the code uses the reply and, as such,
will fail for a subsequent retry of the RPC.
A subsequent retry would be an extremely rare event,
but this patch fixes this, so long as m_copym(..M_NOWAIT)
does not fail, which should also be a rare event.

This fix affects the exceedingly rare case where a NFSv4
client retries a non-idempotent RPC, such as a lock
operation, multiple times.  Note that retries only occur
after the client has needed to create a new TCP connection,
with a new TCP connection for each retry.

MFC after:	2 weeks
2021-04-10 15:50:25 -07:00
Rick Macklem
7a606f280a nfsd: make the server repeat CB_RECALL every couple of seconds
Commit 01ae8969a9 stopped the NFSv4.1/4.2 server from implicitly
binding the back channel to a new TCP connection so that it
conforms to RFC5661, for NFSv4.1/4.2. An effect of this
for the Linux NFS client is that it will do a
BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN
set in a sequence reply. This will fix the back channel, but the
first attempt at a callback like CB_RECALL will already have
failed. Without this patch, a CB_RECALL will not be retried
and that can result in a 5 minute delay until the delegation
times out.

This patch modifies the code so that it will retry the
CB_RECALL every couple of seconds, often avoiding the
5 minute delay.

This is not critical for correct behaviour, but avoids
the 5 minute delay for the case where the Linux client
re-binds the back channel via BindConnectionToSession.

MFC after:	2 weeks
2021-04-04 18:15:54 -07:00
Rick Macklem
5f742d3879 nfsv4 client: fix forced dismount when sleeping on nfsv4lck
During a recent NFSv4 testing event a test server caused a hang
where "umount -N" failed.  The renew thread was sleeping on "nfsv4lck"
and the "umount" was sleeping, waiting for the renew thread to
terminate.

This is the first of two patches that is hoped to fix the renew thread
so that it will terminate when "umount -N" is done on the mount.

nfsv4_lock() checks for forced dismount, but only after it wakes up
from msleep().  Without this patch, a wakeup() call was required.
This patch adds a 1second timeout on the msleep(), so that it will
wake up and see the forced dismount flag.  Normally a wakeup()
will occur in less than 1second, but if a premature return from
msleep() does occur, it will simply loop around and msleep() again.

While here, replace the nfsmsleep() wrapper that was used for portability
with the actual msleep() call and make the same change for nfsv4_getref().

MFC after:	2 weeks
2021-03-19 14:09:33 -07:00
Rick Macklem
c04199affe nfsclient: Fix ReadDS/WriteDS/CommitDS nfsstats RPC counts for a NFSv3 DS
During a recent virtual NFSv4 testing event, a bug in the FreeBSD client
was detected when doing I/O DS operations on a Flexible File Layout pNFS
server.  For an NFSv3 DS, the Read/Write/Commit nfsstats were incremented
instead of the ReadDS/WriteDS/CommitDS counts.
This patch fixes this.

Only the RPC counts reported by nfsstat(1) were affected by this bug,
the I/O operations were performed correctly.

MFC after:	2 weeks
2021-03-02 14:18:23 -08:00