Commit Graph

349 Commits

Author SHA1 Message Date
Mateusz Guzik
b4a58fbf64 vfs: remove cn_thread
It is always curthread.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32453
2021-10-11 13:21:47 +00:00
Rick Macklem
dfe887b7d2 nfsd: Disable the NFSv4.2 Allocate operation by default
Some exported file systems, such as ZFS ones, cannot do VOP_ALLOCATE().
Since an NFSv4.2 server must either support the Allocate operation for
all file systems or not support it at all, define a sysctl called
vfs.nfsd.enable_v42allocate to enable the Allocate operation.
This sysctl is false by default and can only be set true if all
exported file systems (or all DSs for a pNFS server) can perform
VOP_ALLOCATE().

Unfortunately, there is no way to know if a ZFS file system will
be exported once the nfsd is operational, even if there are none
exported when the nfsd is started up, so enabling Allocate must
be done manually for a server configuration.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	2 weeks
2021-10-10 18:46:02 -07:00
Rick Macklem
93a32050ab nfsd: Fix pNFS handling of Deallocate
For a pNFS server configuration, an NFSv4.2 Deallocate operation
is proxied to the DS(s).  The code that parsed the reply for the
proxy RPC is broken and did not process the pre-operation attributes.

This patch fixes this problem.

This bug would only affect pNFS servers built from recent main/FreeBSD14
sources.
2021-10-02 14:11:15 -07:00
Mateusz Guzik
ef7d2c1fc1 nfs: eliminate thread argument from nfsvno_namei
This is a step towards retiring struct componentname cn_thread

Reviewed by:	rmacklem
Differential Revision:	https://reviews.freebsd.org/D32267
2021-10-02 00:57:20 +00:00
Alexander Motin
272c4a4dc5 Allow setting NFS server scope and owner.
By default NFS server reports as scope and owner major the host UUID
value and zero for owner minor.  It works good in case of standalone
server.  But in case of CARP-based HA cluster failover the values
should remain persistent, otherwise some clients like VMware ESXi
get confused by the change and fail to reconnect automatically.

The patch makes server scope, major owner and minor owner values
configurable via sysctls.  If not set (by default) the host UUID
value is still used.

Reviewed by:	rmacklem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31952
2021-09-14 14:18:03 -04:00
Rick Macklem
f1c8811d2d nfsd: Fix build after commit 103b207536 for 32bit arches
MFC after:	2 weeks
2021-09-08 18:55:06 -07:00
Rick Macklem
103b207536 nfsd: Use the COPY_FILE_RANGE_TIMEO1SEC flag
Although it is not specified in the RFCs, the concept that
the NFSv4 server should reply to an RPC request within a
reasonable time is accepted practice within the NFSv4 community.

Without this patch, the NFSv4.2 server attempts to reply to
a Copy operation within 1 second by limiting the copy to
vfs.nfs.maxcopyrange bytes (default 10Mbytes). This is crude at
best, given the large variation in I/O subsystem performance.

This patch uses the COPY_FILE_RANGE_TIMEO1SEC flag added by
commit c5128c48df to limit the reply time for a Copy
operation to approximately 1 second.

MFC after:	2 weeks
2021-09-08 14:29:20 -07:00
Rick Macklem
13914e51eb nfsd: Make loop calling VOP_ALLOCATE() iterate until done
The NFSv4.2 Deallocate operation loops on VOP_DEALLOCATE()
while progress is being made (remaining length decreasing).
This patch changes the loop on VOP_ALLOCATE() for the NFSv4.2
Allocate operation do the same, instead of stopping after
an arbitrary 20 iterations.

MFC after:	2 weeks
2021-08-29 16:46:27 -07:00
Rick Macklem
bb958dcf3d nfsd: Add support for the NFSv4.2 Deallocate operation
The recently added VOP_DEALLOCATE(9) VOP call allows
implementation of the Deallocate NFSv4.2 operation.

Since the Deallocate operation is a single succeed/fail
operation, the call to VOP_DEALLOCATE(9) loops so long
as progress is being made.  It calls maybe_yield()
between loop iterations to allow other processes
to preempt it.

Where RFC 7862 underspecifies behaviour, the code
is written to be Linux NFSv4.2 server compatible.

Reviewed by:	khng
Differential Revision:	https://reviews.freebsd.org/D31624
2021-08-26 18:14:11 -07:00
Rick Macklem
06afb53bcd nfsd: Fix sanity check for NFSv4.2 Allocate operations
The NFSv4.2 Allocate operation sanity checks the aa_offset
and aa_length arguments.  Since they are assigned to variables
of type off_t (signed) it was possible for them to be negative.
It was also possible for aa_offset+aa_length to exceed OFF_MAX
when stored in lo_end, which is uint64_t.

This patch adds checks for these cases to the sanity check.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31511
2021-08-12 16:48:28 -07:00
Rick Macklem
ee29e6f311 nfsd: Add sysctl to set maximum I/O size up to 1Mbyte
Since MAXPHYS now allows the FreeBSD NFS client
to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio
so that the maximum NFS server I/O size can be set up to 1Mbyte.
The Linux NFS client can also do 1Mbyte I/O operations.

The default of 128Kbytes for the maximum I/O size has
not been changed for two reasons:
- kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O
- The limited benchmarking I can do actually shows a drop in I/O rate
  when the I/O size is above 256Kbytes.
However, daveb@spectralogic.com reports seeing an increase
in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client.

Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30826
2021-07-16 15:01:03 -07:00
Rick Macklem
1e0a518d65 nfscl: Add a Linux compatible "nconnect" mount option
Linux has had an "nconnect" NFS mount option for some time.
It specifies that N (up to 16) TCP connections are to created for a mount,
instead of just one TCP connection.

A discussion on freebsd-net@ indicated that this could improve
client<-->server network bandwidth, if either the client or server
have one of the following:
- multiple network ports aggregated to-gether with lagg/lacp.
- a fast NIC that is using multiple queues
It does result in using more IP port#s and might increase server
peak load for a client.

One difference from the Linux implementation is that this implementation
uses the first TCP connection for all RPCs composed of small messages
and uses the additional TCP connections for RPCs that normally have
large messages (Read/Readdir/Write).  The Linux implementation spreads
all RPCs across all TCP connections in a round robin fashion, whereas
this implementation spreads Read/Readdir/Write across the additional
TCP connections in a round robin fashion.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30970
2021-07-08 17:39:04 -07:00
Rick Macklem
e1a907a25c krpc: Acquire ref count of CLIENT for backchannel use
Michael Dexter <editor@callfortesting.org> reported
a crash in FreeNAS, where the first argument to
clnt_bck_svccall() was no longer valid.
This argument is a pointer to the callback CLIENT
structure, which is free'd when the associated
NFSv4 ClientID is free'd.

This appears to have occurred because a callback
reply was still in the socket receive queue when
the CLIENT structure was free'd.

This patch acquires a reference count on the CLIENT
that is not CLNT_RELEASE()'d until the socket structure
is destroyed. This should guarantee that the CLIENT
structure is still valid when clnt_bck_svccall() is called.
It also adds a check for closed or closing to
clnt_bck_svccall() so that it will not process the callback
RPC reply message after the ClientID is free'd.

Comments by:	mav
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30153
2021-06-11 16:57:14 -07:00
Rick Macklem
a5df139ec6 nfsd: Fix when NFSERR_WRONGSEC may be replied to NFSv4 clients
Commit d224f05fcf pre-parsed the next operation number for
the put file handle operations.  This patch uses this next
operation number, plus the type of the file handle being set by
the put file handle operation, to implement the rules in RFC5661
Sec. 2.6 with respect to replying NFSERR_WRONGSEC.

This patch also adds a check to see if NFSERR_WRONGSEC should be
replied when about to perform Lookup, Lookupp or Open with a file
name component, so that the NFSERR_WRONGSEC reply is done for
these operations, as required by RFC5661 Sec. 2.6.

This patch does not have any practical effect for the FreeBSD NFSv4
client and I believe that the same is true for the Linux client,
since NFSERR_WRONGSEC is considered a fatal error at this time.

MFC after:	2 weeks
2021-06-05 16:53:07 -07:00
Rick Macklem
56e9d8e38e nfsd: Fix NFSv4.1/4.2 Secinfo_no_name when security flavors empty
Commit 947bd2479b added support for the Secinfo_no_name operation.
When a non-exported file system is being traversed, the list of
security flavors is empty.  It turns out that the Linux client
mount attempt fails when the security flavors list in the
Secinfo_no_name reply is empty.

This patch modifies Secinfo/Secinfo_no_name so that it replies
with all four security flavors when the list is empty.
This fixes Linux NFSv4.1/4.2 mounts when the file system at
the NFSv4 root (as specified on a V4: exports(5) line) is
not exported.

MFC after:	2 weeks
2021-06-04 20:31:20 -07:00
Rick Macklem
d224f05fcf nfsd: Pre-parse the next NFSv4 operation number for put FH operations
RFC5661 Sec. 2.6 specifies when a NFSERR_WRONGSEC error reply can be done.
For the four operations PutFH, PutrootFH, PutpublicFH and RestoreFH,
NFSERR_WRONGSEC can or cannot be replied, depending upon what operation
follows one of these operations in the compound.

This patch modifies nfsrvd_compound() so that it parses the next operation
number before executing any of the above four operations, storing it in
"nextop".

A future commit will implement use of "nextop" to decide if NFSERR_WRONGSEC
can be replied for the above four operations.

This commit should not change the semantics of performing the compound RPC.

MFC after:	2 weeks
2021-06-03 20:48:26 -07:00
Rick Macklem
984c71f903 nfsd: Fix the failure return for non-fh NFSv4 operations
Without this patch, nfsd_checkrootexp() returns failure
and then the NFSv4 operation would reply NFSERR_WRONGSEC.
RFC5661 Sec. 2.6 only allows a few NFSv4 operations, none
of which call nfsv4_checktootexp(), to return NFSERR_WRONGSEC.
This patch modifies nfsd_checkrootexp() to return the
error instead of a boolean and sets the returned error to an RPC
layer AUTH_ERR, as discussed on nfsv4@ietf.org.
The patch also fixes nfsd_errmap() so that the pseudo
error NFSERR_AUTHERR is handled correctly such that an RPC layer
AUTH_ERR is replied to the NFSv4 client.

The two new "enum auth_stat" values have not yet been assigned
by IANA, but are the expected next two values.

The effect on extant NFSv4 clients of this change appears
limited to reporting a different failure error when a
mount that does not use adequate security is attempted.

MFC after:	2 weeks
2021-06-02 15:28:07 -07:00
Rick Macklem
1d4afcaca2 nfsd: Delete extraneous NFSv4 root checks
There are several NFSv4.1/4.2 server operation functions which
have unneeded checks for the NFSv4 root being set up.
The checks are not needed because the operations always follow
a Sequence operation, which performs the check.

This patch deletes these checks, simplifying the code so
that a future patch that fixes the checks to conform with
RFC5661 Sec. 2.6 will be less extension.

MFC after:	2 weeks
2021-05-31 19:41:17 -07:00
Mateusz Guzik
68c2544264 nfs: even up value returned by nfsrv_parsename with copyinstr
Reported by:	dim
Reviewed by:	rmacklem
2021-05-31 16:32:04 +00:00
Rick Macklem
947bd2479b nfsd: Add support for the NFSv4.1/4.2 Secinfo_no_name operation
The Linux client is now attempting to use the Secinfo_no_name
operation for NFSv4.1/4.2 mounts.  Although it does not seem to
mind the NFSERR_NOTSUPP reply, adding support for it seems
reasonable.

I also noticed that "savflag" needed to be 64bits in
nfsrvd_secinfo() since nd_flag in now 64bits, so I changed
the declaration of it there.  I also added code to set "vp" NULL
after performing Secinfo/Secinfo_no_name, since these
operations consume the current FH, which is represented
by "vp" in nfsrvd_compound().

Fixing when the server replies NFSERR_WRONGSEC so that
it conforms to RFC5661 Sec. 2.6 still needs to be done
in a future commit.

MFC after:	2 weeks
2021-05-30 17:52:43 -07:00
Rick Macklem
d80a903a1c nfsd: Add support for CLAIM_DELEG_PREV_FH to the NFSv4.1/4.2 Open
Commit b3d4c70dc6 added support for CLAIM_DELEG_CUR_FH to Open.
While doing this, I noticed that CLAIM_DELEG_PREV_FH support
could be added the same way.  Although I am not aware of any extant
NFSv4.1/4.2 client that uses this claim type, it seems prudent to add
support for this variant of Open to the NFSv4.1/4.2 server.

This patch does not affect mounts from extant NFSv4.1/4.2 clients,
as far as I know.

MFC after:	2 weeks
2021-05-20 18:37:40 -07:00
Rick Macklem
b3d4c70dc6 nfsd: Add support for CLAIM_DELEG_CUR_FH to the NFSv4.1/4.2 Open
The Linux NFSv4.1/4.2 client now uses the CLAIM_DELEG_CUR_FH
variant of the Open operation when delegations are recalled and
the client has a local open of the file.  This patch adds
support for this variant of Open to the NFSv4.1/4.2 server.

This patch only affects mounts from Linux clients when delegations
are enabled on the server.

MFC after:	2 weeks
2021-05-18 15:53:54 -07:00
Rick Macklem
46269d66ed NFSv4 server: Re-establish the delegation recall timeout
Commit 7a606f280a allowed the server to do retries of CB_RECALL
callbacks every couple of seconds.  This was needed to allow the
Linux client to re-establish the back channel.
However this patch broke the delegation timeout check, such that
it would just keep retrying CB_RECALLS.
If the client has crashed or been network patitioned from the
server, this continues until the client TCP reconnects to
the server and re-establishes the back channel.

This patch modifies the code such that it still times out the
delegation recall after some minutes, so that the server will
allow the conflicting client request once the delegation times out.

This patch only affects the NFSv4 server when delegations are
enabled and a NFSv4 client that holds a delegation has crashed
or been network partitioned from the server for at least several
minutes when a delegation needs to be recalled.

MFC after:	2 weeks
2021-05-16 16:40:01 -07:00
Rick Macklem
dd02d9d605 nfscl: Add support for va_birthtime to NFSv4
There is a NFSv4 file attribute called TimeCreate
that can be used for va_birthtime.
r362175 added some support for use of TimeCreate.
This patch completes support of va_birthtime by adding
support for setting this attribute to the server.
It also eanbles the client to
acquire and set the attribute for a NFSv4
server that supports the attribute.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30156
2021-05-07 17:30:56 -07:00
Rick Macklem
8759773148 nfsd: fix the slot sequence# when a callback fails
Commit 4281bfec36 patched the server so that the
callback session slot would be free'd for reuse when
a callback attempt fails.
However, this can often result in the sequence# for
the session slot to be advanced such that the client
end will reply NFSERR_SEQMISORDERED.

To avoid the NFSERR_SEQMISORDERED client reply,
this patch negates the sequence# advance for the
case where the callback has failed.
The common case is a failed back channel, where
the callback cannot be sent to the client, and
not advancing the sequence# is correct for this
case.  For the uncommon case where the client's
reply to the callback is lost, not advancing the
sequence# will indicate to the client that the
next callback is a retry and not a new callback.
But, since the FreeBSD server always sets "csa_cachethis"
false in the callback sequence operation, a retry
and a new callback should be handled the same way
by the client, so this should not matter.

Until you have this patch in your NFSv4.1/4.2 server,
you should consider avoiding the use of delegations.
Even with this patch, interoperation with the
Linux NFSv4.1/4.2 client in kernel versions prior
to 5.3 can result in frequent 15second delays if
delegations are enabled.  This occurs because, for
kernels prior to 5.3, the Linux client does a TCP
reconnect every time it sees multiple concurrent
callbacks and then it takes 15seconds to recover
the back channel after doing so.

MFC after:	2 weeks
2021-04-26 16:24:10 -07:00
Rick Macklem
4281bfec36 nfsd: fix session slot handling for failed callbacks
When the NFSv4.1/4.2 server does a callback to a client
on the back channel, it will use a session slot in the
back channel session. If the back channel has failed,
the callback will fail and, without this patch, the
session slot will not be released.
As more callbacks are attempted, all session slots
can become busy and then the nfsd thread gets stuck
waiting for a back channel session slot.

This patch frees the session slot upon callback
failure to avoid this problem.

Without this patch, the problem can be avoided by leaving
delegations disabled in the NFS server.

MFC after:	2 weeks
2021-04-23 15:24:47 -07:00
Rick Macklem
5a89498d19 nfsd: fix stripe size reply for the File Layout pNFS server
At a recent testing event I found out that I had misinterpreted
RFC5661 where it describes the stripe size in the File Layout's
nfl_util field. This patch fixes the pNFS File Layout server
so that it returns the correct value to the NFSv4.1/4.2 pNFS
enabled client.

This affects almost no one, since pNFS server configurations
are rare and the extant pNFS aware NFS clients seemed to
function correctly despite the erroneous stripe size.
It *might* be needed for correct behaviour if a recent
Linux client mounts a FreeBSD pNFS server configuration
that is using File Layout (non-mirrored configuration).

MFC after:	2 weeks
2021-04-19 17:54:54 -07:00
Rick Macklem
05a39c2c1c nfsd: fix replies from session cache for retried RPCs
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.

The FreeBSD server failec to reply using the cached
reply in the session slot when an RPC was retried on
the session slot, as indicated by same slot sequence#.

This patch fixes this.  It should also fix a similar
failure for NFSv4.0 mounts, when the sequence# in
the open/lock_owner requires a reply be done from
an entry locked into the DRC.

This fix affects the fairly rare case where a NFSv4
client retries a non-idempotent RPC, such as a lock
operation.  Note that retries only occur after the
client has needed to create a new TCP connection.

MFC after:	2 weeks
2021-04-08 14:04:22 -07:00
Rick Macklem
7a606f280a nfsd: make the server repeat CB_RECALL every couple of seconds
Commit 01ae8969a9 stopped the NFSv4.1/4.2 server from implicitly
binding the back channel to a new TCP connection so that it
conforms to RFC5661, for NFSv4.1/4.2. An effect of this
for the Linux NFS client is that it will do a
BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN
set in a sequence reply. This will fix the back channel, but the
first attempt at a callback like CB_RECALL will already have
failed. Without this patch, a CB_RECALL will not be retried
and that can result in a 5 minute delay until the delegation
times out.

This patch modifies the code so that it will retry the
CB_RECALL every couple of seconds, often avoiding the
5 minute delay.

This is not critical for correct behaviour, but avoids
the 5 minute delay for the case where the Linux client
re-binds the back channel via BindConnectionToSession.

MFC after:	2 weeks
2021-04-04 18:15:54 -07:00
Rick Macklem
6f2addd838 nfsd: fix BindConnectionToSession so that it clears "cb path down"
Commit 01ae8969a9 stopped the NFSv4.1/4.2 server from implicitly
binding the back channel to a new TCP connection so that it
conforms to RFC5661, for NFSv4.1/4.2. An effect of this
for the Linux NFS client is that it will do a
BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN
set in a sequence reply. It will do this for every RPC
reply until it no longer sees the flag.
Without that patch, this will happen until the client does
an Open, which will clear LCL_CBDOWN.

This patch clears LCL_CBDOWN right away, so that
NFSV4SEQ_CBPATHDOWN will no longer be sent to the client
in Sequence replies and the Linux client will not repeat
the BindConnectionToSession RPCs.

This is not critical for correct behaviour, but reduces
RPC overheads for cases where the Open will not be done
for a while.

MFC after:	2 weeks
2021-04-04 15:05:39 -07:00
Rick Macklem
01ae8969a9 nfsd: do not implicitly bind the back channel for NFSv4.1/4.2 mounts
The NFSv4.1 (and 4.2 on 13) server incorrectly binds
a new TCP connection to the back channel when first
used by an RPC with a Sequence op in it (almost all of them).
RFC5661 specifies that only the fore channel should be bound.

This was done because early clients (including FreeBSD)
did not do the required BindConnectionToSession RPC.

Unfortunately, this breaks the Linux client when the
"nconnects" mount option is used, since the server
may do a callback on the incorrect TCP connection.

This patch converts the server behaviour to that
required by the RFC.  It also makes the server test/indicate
failure of the back channel more aggressively.

Until this patch is applied to the server, the
"nconnects" mount option is not recommended for a Linux
NFSv4.1/4.2 client mount to the FreeBSD server.

Reported by:	bcodding@redhat.com
Tested by:	bcodding@redhat.com
PR:		254560
MFC after:	1 week
2021-03-30 14:31:05 -07:00
Konstantin Belousov
4a21bcb241 nfsserver: use VOP_VPUT_PAIR().
Apply VOP_VPUT_PAIR() to the end of vnode operations after the
VOP_MKNOD(), VOP_MKDIR(), VOP_LINK(), VOP_SYMLINK(), VOP_CREATE().

Reviewed by:	chs, mckusick
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-02-12 03:02:21 +02:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Rick Macklem
148a227bf8 nfsd: add KASSERTs to nfsm_trimtrailing() for M_EXTPG mbufs
Add KASSERTS to nfsm_trimtrailing() to confirm the sanity of
the arguments for the M_EXTPG case.

Suggested by:	kib
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D28053
2021-01-10 13:50:15 -08:00
Konstantin Belousov
51a9b978e7 nfs server: improve use of the VFS KPI
In particular, do not assume that vn_start_write() returns the same mp
as it was passed in, or never returns error.

Also be more accurate to return NULL vp and mp when error occured, to
catch wrong control flow easier.

Stop checking for NULL mp before calling vn_finished_write(), NULL mp
is handled transparently by the function.

Reviewed by:	rmacklem
Tested by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27881
2021-01-02 20:17:12 +02:00
Rick Macklem
dc78533a52 nfsd: fix NFSv4.0 seqid handling for ERELOOKUP
Commit 774a36851e fixed the NFS server so that it could handle
ERELOOKUP returns from VOP calls by redoing the operation/RPC.
However, for NFSv4.0, redoing an Open would increment
the open_owner's seqid multiple times, breaking the protocol.
This patch sets a new flag called ND_ERELOOKUP on the RPC when
a redo is in progress.  Then the code that increments the seqid
avoids the seqid increment/check when the flag is set, since
it indicates this has already been done for the Open.
2021-01-01 14:21:51 -08:00
Rick Macklem
774a36851e nfsd: fix NFS server for ERELOOKUP
r367672 modified UFS such that certain VOPs, such as
VOP_CREATE() will intermittently return ERELOOKUP.
When this happens, the entire system call, or NFS
operation in the case of the NFS server, must be redone.

This patch adds that support to the NFS server by rolling
back the state of the NFS request arguments and NFS
reply arguments mbuf lists to the condition they were
in before the operation and then redoing the operation.

Tested by:	pho
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D27875
2021-01-01 13:55:51 -08:00
Rick Macklem
ff45b9fc1a Bjorn reported a problem where the Linux NFSv4.1 client is
using an open_to_lock_owner4 when that lock_owner4 has already
been created by a previous open_to_lock_owner4. This caused the NFS server
to reply NFSERR_INVAL.

For NFSv4.0, this is an error, although the updated NFSv4.0 RFC7530 notes
that the correct error reply is NFSERR_BADSEQID (RFC3530 did not specify
what error to return).

For NFSv4.1, it is not obvious whether or not this is allowed by RFC5661,
but the NFSv4.1 server can handle this case without error.
This patch changes the NFSv4.1 (and NFSv4.2) server to handle multiple
uses of the same lock_owner in open_to_lock_owner so that it now correctly
interoperates with the Linux NFS client.
It also changes the error returned for NFSv4.0 to be NFSERR_BADSEQID.

Thanks go to Bjorn for diagnosing this and testing the patch.
He also provided a program that I could use to reproduce the problem.

Tested by:	bj@cebitec.uni-bielefeld.de (Bjorn Fischer)
PR:		249567
Reported by:	bj@cebitec.uni-bielefeld.de (Bjorn Fischer)
MFC after:	3 days
2020-09-26 23:05:38 +00:00
Rick Macklem
58dd2b52cb Fix a LOR between the NFS server and server side krpc.
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures in nfsrv_checksequence().  This was fixed by r365789.
A similar bug exists in nfsrv_bindconnsess(), where SVC_RELEASE() is called
while mutexes are held.
This patch applies a fix similar to r365789, moving the SVC_RELEASE() call
down to after the mutexes are released.

This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.

MFC after:	1 week
2020-09-18 23:52:56 +00:00
Rick Macklem
a5c55410b3 Fix a LOR between the NFS server and server side krpc.
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures.
The code in nfsrv_checksequence() would call SVC_RELEASE() with the mutex
held.  Normally this is ok, since all that happens is SVC_RELEASE()
decrements a reference count.  However, if the socket has just been shut
down, SVC_RELEASE() drops the reference count to 0 and acquires a sleep
lock during destruction of the server side krpc structure.

This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.

MFC after:	1 week
2020-09-16 02:25:18 +00:00
Rick Macklem
2848d6d4de Fix a case where the NFSv4.0 server might crash if delegations are enabled.
asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
kdb_backtrace
vpanic
panic
nfsrv_docallback
nfsrv_checkgetattr
nfsrvd_getattr
nfsrvd_dorpc
nfssvc_program
svc_run_internal
svc_thread_start
fork_exit
fork_trampoline
where the panic message was "docallb", which indicates that a callback
was attempted when the ClientID is unconfirmed.
This would not normally occur, but it is possible to have an unconfirmed
ClientID structure with delegation structure(s) chained off it if the
client were to issue a SetClientID with the same "id" but different
"verifier" after acquiring delegations on the previously confirmed ClientID.

The bug appears to be that nfsrv_checkgetattr() failed to check for
this uncommon case of an unconfirmed ClientID with a delegation structure
that no longer refers to a delegation the client knows about.

This patch adds a check for this case, handling it as if no delegation
exists, which is the case when the above occurs.
Although difficult to reproduce, this change should avoid the panic().

PR:		249127
Reported by:	asomers
Reviewed by:	asomers
MFC after:	1 week
Differential Revision:	https://reviews.freebbsd.org/D26342
2020-09-14 00:44:50 +00:00
Mateusz Guzik
586ee69f09 fs: clean up empty lines in .c and .h files 2020-09-01 21:18:40 +00:00
Eric van Gyzen
0bb426274e Fix nfsrvd_locku memory leak
Coverity detected memory leak fix.

Submitted by:	bret_ketchum@dell.com
Reported by:	Coverity
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D26231
2020-08-31 15:31:17 +00:00
Rick Macklem
6e4b6ff88f Add flags to enable NFS over TLS to the NFS client and server.
An Internet Draft titled "Towards Remote Procedure Call Encryption By Default"
(soon to be an RFC I think) describes how Sun RPC is to use TLS with NFS
as a specific application case.
Various commits prepared the NFS code to use KERN_TLS, mainly enabling use
of ext_pgs mbufs for large RPC messages.
r364475 added TLS support to the kernel RPC.

This commit (which is the final one for kernel changes required to do
NFS over TLS) adds support for three export flags:
MNT_EXTLS - Requires a TLS connection.
MNT_EXTLSCERT - Requires a TLS connection where the client presents a valid
            X.509 certificate during TLS handshake.
MNT_EXTLSCERTUSER - Requires a TLS connection where the client presents a
            valid X.509 certificate with "user@domain" in the otherName
            field of the SubjectAltName during TLS handshake.
Without these export options, clients are permitted, but not required, to
use TLS.

For the client, a new nmount(2) option called "tls" makes the client do
a STARTTLS Null RPC and TLS handshake for all TCP connections used for the
mount. The CLSET_TLS client control option is used to indicate to the kernel RPC
that this should be done.

Unless the above export flags or "tls" option is used, semantics should
not change for the NFS client nor server.

For NFS over TLS to work, the userspace daemons rpctlscd(8) { for client }
or rpctlssd(8) daemon { for server } must be running.
2020-08-27 23:57:30 +00:00
Rick Macklem
808306dd0f Delete the unused "use_ext" argument to nfscl_reqstart().
This is a partial revert of r363210, since the "use_ext" argument added
by that commit is not actually useful.

This patch should not result in any semantics change.
2020-08-18 01:41:12 +00:00
Rick Macklem
02511d2112 Add an argument to newnfs_connect() that indicates use TLS for the connection.
For NFSv4.0, the server creates a server->client TCP connection for callbacks.
If the client mount on the server is using TLS, enable TLS for this callback
TCP connection.
TLS connections from clients will not be supported until the kernel RPC
changes are committed.

Since this changes the internal ABI between the NFS kernel modules that
will require a version bump, delete newnfs_trimtrailing(), which is no
longer used.

Since LCL_TLSCB is not yet set, these changes should not have any semantic
affect at this time.
2020-08-11 00:26:45 +00:00
Rick Macklem
cb889ce631 Add optional support for ext_pgs mbufs to the NFS server's read, readlink
and getxattr operations.

This patch optionally enables generation of read, readlink and getxattr replies
in ext_pgs mbufs.  Since neither of ND_EXTPG or ND_TLS are currently ever set,
there is no change in semantics at this time.
It also corrects the message in a couple of panic()s that should never occur.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-31 23:35:49 +00:00
Rick Macklem
ea83d07e82 Add support for ext_pgs mbufs to nfsrvd_readdir() and nfsrvd_readdirplus().
This patch code that optionally (based on ND_TLS, never set yet) generates
readdir replies in ext_pgs mbufs.
To trim the list back, a new function that is ext_pgs aware called
nfsm_trimtrailing() replaces newnfs_trimtrailing().
newnfs_trimtrailing() is no longer used, but will be removed in a future
commit, since its removal does modify the internal kpi between the NFS
modules.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-29 22:58:08 +00:00
Rick Macklem
2de592f6e1 Fix the NFS server so that it sets va_birthtime.
r362490 marked that the NFSv4 attribute TimeCreate (va_birthtime) is supported,
but it did not change the NFS server code to actually do it.
As such, errors could occur when unrolling a tarball onto an NFSv4 mounted
volume, since setting TimeCreate would fail with a NFSERR_ATTRNOTSUPP reply.

This patch fixes the server so that it does TimeCreate and also makes
sure that TimeCreate will not be set for a DS file for a pNFS server.

A separate commit will add a check to the NFSv4 client for support of
the TimeCreate attribute before attempting to set it, to avoid a problem
when mounting a server that does not support the attribute.
The failures will still occur for r362490 or later kernels that do not
have this patch, since they indicate support for the attribute, but do not
actually support the attribute.
2020-07-26 23:03:41 +00:00
Rick Macklem
18a48314ba Add support for ext_pgs mbufs to nfsrv_adj().
This patch uses a slightly different algorithm for nfsrv_adj()
since ext_pgs mbuf lists are not permitted to have m_len == 0 mbufs.
As such, the code now frees mbufs after the adjustment in the list instead
of setting their m_len field to 0.
Since mbuf(s) may be trimmed off the tail of the list, the function now
returns a pointer to the last mbuf in the list.  This saves the caller
from needing to use m_last() to find the last mbuf.
It also implies that it might return a nul list, which required a check for
that in nfsrvd_readlink().

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-26 02:42:09 +00:00