Commit Graph

432 Commits

Author SHA1 Message Date
Rick Macklem
78ffcb86d9 nfscommon: fix function name in comment
MFC after:	2 weeks
2021-04-19 20:09:46 -07:00
Rick Macklem
34256484af Revert "nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661"
This reverts commit 9edaceca81.

It turns out that the Linux client intentionally does an NFSv4.1
RPC with only a Sequence operation in it and with "seqid + 1"
for the slot.  This is used to re-synchronize the slot's seqid
and the client expects the NFS4ERR_SEQ_MISORDERED error reply.

As such, revert the patch, so that the server remains RFC5661
compliant.
2021-04-15 14:08:40 -07:00
Rick Macklem
9edaceca81 nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.

Sometimes, after some Linux NFSv4.1/4.2 clients establish
a new TCP connection, they will advance the sequence number
for a session slot by 2 instead of 1.
RFC5661 specifies that a server should reply
NFS4ERR_SEQ_MISORDERED for this case.
This might result in a system call error in the client and
seems to disable future use of the slot by the client.
Since advancing the sequence number by 2 seems harmless,
allow this case if vfs.nfs.linuxseqsesshack is non-zero.

Note that, if the order of RPCs is actually reversed,
a subsequent RPC with a smaller sequence number value
for the slot will be received.  This will result in
a NFS4ERR_SEQ_MISORDERED reply.
This has not been observed during testing.
Setting vfs.nfs.linuxseqsesshack to 0 will provide
RFC5661 compliant behaviour.

This fix affects the fairly rare case where a NFSv4
Linux client does a TCP reconnect and then apparently
erroneously increments the sequence number for the
session slot twice during the reconnect cycle.

PR:	254816
MFC after:	2 weeks
2021-04-11 16:51:25 -07:00
Rick Macklem
7763814fc9 nfsv4 client: do the BindConnectionToSession as required
During a recent testing event, it was reported that the NFSv4.1/4.2
server erroneously bound the back channel to a new TCP connection.
RFC5661 specifies that the fore channel is implicitly bound to a
new TCP connection when an RPC with Sequence (almost any of them)
is done on it.  For the back channel to be bound to the new TCP
connection, an explicit BindConnectionToSession must be done as
the first RPC on the new connection.

Since new TCP connections are created by the "reconnect" layer
(sys/rpc/clnt_rc.c) of the krpc, this patch adds an optional
upcall done by the krpc whenever a new connection is created.
The patch also adds the specific upcall function that does a
BindConnectionToSession and configures the krpc to call it
when required.

This is necessary for correct interoperability with NFSv4.1/NFSv4.2
servers when the nfscbd daemon is running.

If doing NFSv4.1/NFSv4.2 mounts without this patch, it is
recommended that the nfscbd daemon not be running and that
the "pnfs" mount option not be specified.

PR:	254840
Comments by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29475
2021-04-11 14:34:57 -07:00
Rick Macklem
22cefe3d83 nfsd: fix replies from session cache for multiple retries
Recent testing of network partitioning a FreeBSD NFSv4.1
server from a Linux NFSv4.1 client identified problems
with both the FreeBSD server and Linux client.

Commit 05a39c2c1c fixed replying with the cached reply in
in the session slot if same session slot sequence#.
However, the code uses the reply and, as such,
will fail for a subsequent retry of the RPC.
A subsequent retry would be an extremely rare event,
but this patch fixes this, so long as m_copym(..M_NOWAIT)
does not fail, which should also be a rare event.

This fix affects the exceedingly rare case where a NFSv4
client retries a non-idempotent RPC, such as a lock
operation, multiple times.  Note that retries only occur
after the client has needed to create a new TCP connection,
with a new TCP connection for each retry.

MFC after:	2 weeks
2021-04-10 15:50:25 -07:00
Rick Macklem
7a606f280a nfsd: make the server repeat CB_RECALL every couple of seconds
Commit 01ae8969a9 stopped the NFSv4.1/4.2 server from implicitly
binding the back channel to a new TCP connection so that it
conforms to RFC5661, for NFSv4.1/4.2. An effect of this
for the Linux NFS client is that it will do a
BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN
set in a sequence reply. This will fix the back channel, but the
first attempt at a callback like CB_RECALL will already have
failed. Without this patch, a CB_RECALL will not be retried
and that can result in a 5 minute delay until the delegation
times out.

This patch modifies the code so that it will retry the
CB_RECALL every couple of seconds, often avoiding the
5 minute delay.

This is not critical for correct behaviour, but avoids
the 5 minute delay for the case where the Linux client
re-binds the back channel via BindConnectionToSession.

MFC after:	2 weeks
2021-04-04 18:15:54 -07:00
Rick Macklem
5f742d3879 nfsv4 client: fix forced dismount when sleeping on nfsv4lck
During a recent NFSv4 testing event a test server caused a hang
where "umount -N" failed.  The renew thread was sleeping on "nfsv4lck"
and the "umount" was sleeping, waiting for the renew thread to
terminate.

This is the first of two patches that is hoped to fix the renew thread
so that it will terminate when "umount -N" is done on the mount.

nfsv4_lock() checks for forced dismount, but only after it wakes up
from msleep().  Without this patch, a wakeup() call was required.
This patch adds a 1second timeout on the msleep(), so that it will
wake up and see the forced dismount flag.  Normally a wakeup()
will occur in less than 1second, but if a premature return from
msleep() does occur, it will simply loop around and msleep() again.

While here, replace the nfsmsleep() wrapper that was used for portability
with the actual msleep() call and make the same change for nfsv4_getref().

MFC after:	2 weeks
2021-03-19 14:09:33 -07:00
Rick Macklem
c04199affe nfsclient: Fix ReadDS/WriteDS/CommitDS nfsstats RPC counts for a NFSv3 DS
During a recent virtual NFSv4 testing event, a bug in the FreeBSD client
was detected when doing I/O DS operations on a Flexible File Layout pNFS
server.  For an NFSv3 DS, the Read/Write/Commit nfsstats were incremented
instead of the ReadDS/WriteDS/CommitDS counts.
This patch fixes this.

Only the RPC counts reported by nfsstat(1) were affected by this bug,
the I/O operations were performed correctly.

MFC after:	2 weeks
2021-03-02 14:18:23 -08:00
Alexander Motin
d01032736c Fix diroffdiroff, probably copy/paste bug.
Too long name looks bad in `vmstat -m`.

MFC after:	1 week
2021-02-28 09:08:31 -05:00
Alex Richardson
8d55837dc1 qeueue.h: Add {SLIST,STAILQ,LIST,TAILQ}_END()
We provide these for compat with other queue.h headers since some software
assumes it exists (e.g. the libevent contrib code), but we are not
encouraging their use (NULL should be used instead).

This fixes the following warning (which should arguable be an error since
it results in a function call to an undefined function):

.../contrib/libevent/buffer.c:495:16: warning: implicit declaration of function 'LIST_END' is invalid in C99 [-Wimplicit-function-declaration]
             cbent != LIST_END(&buffer->callbacks);
                      ^
.../contrib/libevent/buffer.c:495:13: warning: comparison between pointer and integer ('struct evbuffer_cb_entry *' and 'int') [-Wpointer-integer-compare]
             cbent != LIST_END(&buffer->callbacks);
             ~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed By:	jhb
Differential Revision: https://reviews.freebsd.org/D27151
2021-01-25 15:09:35 +00:00
Rick Macklem
dc78533a52 nfsd: fix NFSv4.0 seqid handling for ERELOOKUP
Commit 774a36851e fixed the NFS server so that it could handle
ERELOOKUP returns from VOP calls by redoing the operation/RPC.
However, for NFSv4.0, redoing an Open would increment
the open_owner's seqid multiple times, breaking the protocol.
This patch sets a new flag called ND_ERELOOKUP on the RPC when
a redo is in progress.  Then the code that increments the seqid
avoids the seqid increment/check when the flag is set, since
it indicates this has already been done for the Open.
2021-01-01 14:21:51 -08:00
Rick Macklem
774a36851e nfsd: fix NFS server for ERELOOKUP
r367672 modified UFS such that certain VOPs, such as
VOP_CREATE() will intermittently return ERELOOKUP.
When this happens, the entire system call, or NFS
operation in the case of the NFS server, must be redone.

This patch adds that support to the NFS server by rolling
back the state of the NFS request arguments and NFS
reply arguments mbuf lists to the condition they were
in before the operation and then redoing the operation.

Tested by:	pho
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D27875
2021-01-01 13:55:51 -08:00
Rick Macklem
665b1365fe Add a new "tlscertname" NFS mount option.
When using NFS-over-TLS, an NFS client can optionally provide an X.509
certificate to the server during the TLS handshake.  For some situations,
such as different NFS servers or different certificates being mapped
to different user credentials on the NFS server, there may be a need
for different mounts to provide different certificates.

This new mount option called "tlscertname" may be used to specify a
non-default certificate be provided.  This alernate certificate will
be stored in /etc/rpc.tlsclntd in a file with a name based on what is
provided by this mount option.
2020-12-23 13:42:55 -08:00
Brooks Davis
52e63ec2f1 VFS_QUOTACTL: Remove needless casts of arg
The argument is a void * so there's no need to cast it to caddr_t.

Update documentation to match function decleration.

Reviewed by:	freqlabs
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27093
2020-12-17 21:58:10 +00:00
Alan Somers
ac8c4a61af nfs: Mark unused statistics variable as reserved
FreeBSD's NFS exporter has long exported some unused statistics fields.
Revision r366992 removed them from nfsstat. This revision renames those
fields in the kernel's exported structures to make it clear to other
consumers that they are unused.

Reported by:	emaste
Reviewed by:	emaste
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D27258
2020-11-18 04:35:49 +00:00
Konstantin Belousov
96474d2a3f Do not copy vp into f_data for DTYPE_VNODE files.
The pointer to vnode is already stored into f_vnode, so f_data can be
reused.  Fix all found users of f_data for DTYPE_VNODE.

Provide finit_vnode() helper to initialize file of DTYPE_VNODE type.

Reviewed by:	markj (previous version)
Discussed with:	freqlabs (openzfs chunk)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 21:55:21 +00:00
Mateusz Guzik
586ee69f09 fs: clean up empty lines in .c and .h files 2020-09-01 21:18:40 +00:00
Rick Macklem
6e4b6ff88f Add flags to enable NFS over TLS to the NFS client and server.
An Internet Draft titled "Towards Remote Procedure Call Encryption By Default"
(soon to be an RFC I think) describes how Sun RPC is to use TLS with NFS
as a specific application case.
Various commits prepared the NFS code to use KERN_TLS, mainly enabling use
of ext_pgs mbufs for large RPC messages.
r364475 added TLS support to the kernel RPC.

This commit (which is the final one for kernel changes required to do
NFS over TLS) adds support for three export flags:
MNT_EXTLS - Requires a TLS connection.
MNT_EXTLSCERT - Requires a TLS connection where the client presents a valid
            X.509 certificate during TLS handshake.
MNT_EXTLSCERTUSER - Requires a TLS connection where the client presents a
            valid X.509 certificate with "user@domain" in the otherName
            field of the SubjectAltName during TLS handshake.
Without these export options, clients are permitted, but not required, to
use TLS.

For the client, a new nmount(2) option called "tls" makes the client do
a STARTTLS Null RPC and TLS handshake for all TCP connections used for the
mount. The CLSET_TLS client control option is used to indicate to the kernel RPC
that this should be done.

Unless the above export flags or "tls" option is used, semantics should
not change for the NFS client nor server.

For NFS over TLS to work, the userspace daemons rpctlscd(8) { for client }
or rpctlssd(8) daemon { for server } must be running.
2020-08-27 23:57:30 +00:00
Rick Macklem
808306dd0f Delete the unused "use_ext" argument to nfscl_reqstart().
This is a partial revert of r363210, since the "use_ext" argument added
by that commit is not actually useful.

This patch should not result in any semantics change.
2020-08-18 01:41:12 +00:00
Rick Macklem
02511d2112 Add an argument to newnfs_connect() that indicates use TLS for the connection.
For NFSv4.0, the server creates a server->client TCP connection for callbacks.
If the client mount on the server is using TLS, enable TLS for this callback
TCP connection.
TLS connections from clients will not be supported until the kernel RPC
changes are committed.

Since this changes the internal ABI between the NFS kernel modules that
will require a version bump, delete newnfs_trimtrailing(), which is no
longer used.

Since LCL_TLSCB is not yet set, these changes should not have any semantic
affect at this time.
2020-08-11 00:26:45 +00:00
Rick Macklem
cb889ce631 Add optional support for ext_pgs mbufs to the NFS server's read, readlink
and getxattr operations.

This patch optionally enables generation of read, readlink and getxattr replies
in ext_pgs mbufs.  Since neither of ND_EXTPG or ND_TLS are currently ever set,
there is no change in semantics at this time.
It also corrects the message in a couple of panic()s that should never occur.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-31 23:35:49 +00:00
Rick Macklem
194d870481 Fix the NFSv4 client so that it checks for support of TimeCreate before
trying to set it.

r362490 added support for setting of the TimeCreate (va_birthtime) attribute,
but it does so without checking to see if the server supports the attribute.
This could result in NFSERR_ATTRNOTSUPP error replies to the Setattr operation.
This patch adds code to check that the server supports TimeCreate before
attempting to do a Setattr of it to avoid these error returns.
2020-07-26 23:13:10 +00:00
Rick Macklem
18a48314ba Add support for ext_pgs mbufs to nfsrv_adj().
This patch uses a slightly different algorithm for nfsrv_adj()
since ext_pgs mbuf lists are not permitted to have m_len == 0 mbufs.
As such, the code now frees mbufs after the adjustment in the list instead
of setting their m_len field to 0.
Since mbuf(s) may be trimmed off the tail of the list, the function now
returns a pointer to the last mbuf in the list.  This saves the caller
from needing to use m_last() to find the last mbuf.
It also implies that it might return a nul list, which required a check for
that in nfsrvd_readlink().

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-26 02:42:09 +00:00
Rick Macklem
cfaafa7908 Add support for ext_pgs mbufs to nfsm_uiombuflist() and nfsm_split().
This patch uses a slightly different algorithm for nfsm_uiombuflist() for
the non-ext_pgs case, where a variable called "mcp" is maintained, pointing to
the current location that mbuf data can be filled into. This avoids use of
mtod(mp, char *) + mp->m_len to calculate the location, since this does
not work for ext_pgs mbufs and I think it makes the algorithm more readable.
This change should not result in semantic changes for the non-ext_pgs case.
The patch also deletes come unneeded code.

It also adds support for anonymous page ext_pgs mbufs to nfsm_split().

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.
At this time for this case, use of ext_pgs mbufs cannot be enabled, since
ktls_encrypt() replaces the unencrypted data with encrypted data in place.

Until such time as this can be enabled, there should be no semantic change.
Also, note that this code is only used by the NFS client for a mirrored pNFS
server.
2020-07-24 23:17:09 +00:00
Rick Macklem
022346fa62 Add support for ext_pgs mbufs to nfsrvd_rephead().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-07 00:42:23 +00:00
Rick Macklem
34fc29e0c9 Add support for ext_pgs mbufs to nfsm_strtom().
Also, add a new function nfsm_add_ext_pgs() which will either add a page
or add a new ext_pgs mbuf with a page to the mbuf list. Used by nfsm_strtom().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-05 21:55:16 +00:00
Rick Macklem
dccb580624 Add support for ext_pgs mbufs to nfscl_reqstart() and nfsm_set().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-04 03:28:13 +00:00
Rick Macklem
606007409c Fix build breakage caused by r362903. Only pmap.h is needed now, but
vm_page.h and vm_pageout.h is needed later, so put them in now.

Pointy hat goes on me.
2020-07-03 05:21:05 +00:00
Rick Macklem
2da1527844 Add support for ext_pgs mbufs to nfsm_build().
This is the first of a series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-03 01:19:29 +00:00
Rick Macklem
4476c1def0 Add a boolean argument to nfscl_reqstart() to indicate that ext_pgs mbufs
should be used.

For KERN_TLS (and possibly some other future network interface) the mbuf
list passed into sosend() must be ext_pgs mbufs. The krpc could simply
copy all the mbuf data into ext_pgs mbufs before calling sosend(), but
that would be inefficient for large RPC messages.
This patch adds an argument to nfscl_reqstart() to indicate that it should
fill the RPC message into ext_pgs mbufs.
It also adds fields to "struct nfsrv_descript" needed for building NFS RPC
messages in ext_pgs mbufs, along with new flags for this.

Since the argument is always "false", this commit should not result in any
semantic change. However, this commit prepares the code
for future commits that will add support for building of NFS RPC messages
in ext_pgs mbufs.
2020-06-26 03:11:54 +00:00
Doug Rabson
c07782e10e Add some missing parts for supporting va_birthtime.
Reviewed by:	rmacklem
2020-06-22 08:23:16 +00:00
Alan Somers
eea79fde5a Remove vfs_statfs and vnode_mount macros from NFS
These macro definitions are no longer needed as the NFS OSX port is long
dead.  The vfs_statfs macro conflicts with the vfsops field of the same
name.

Submitted by:	shivank@
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	Google, Inc. (GSoC 2020)
Differential Revision:	https://reviews.freebsd.org/D25263
2020-06-17 16:20:19 +00:00
Doug Rabson
3900c11481 Add support for the timecreate attribute
This maps to the va_birthtime VFS attribute.
2020-06-14 11:41:57 +00:00
Rick Macklem
1f7104d720 Fix export_args ex_flags field so that is 64bits, the same as mnt_flags.
Since mnt_flags was upgraded to 64bits there has been a quirk in
"struct export_args", since it hold a copy of mnt_flags
in ex_flags, which is an "int" (32bits).
This happens to currently work, since all the flag bits used in ex_flags are
defined in the low order 32bits. However, new export flags cannot be defined.
Also, ex_anon is a "struct xucred", which limits it to 16 additional groups.
This patch revises "struct export_args" to make ex_flags 64bits and replaces
ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a
groups list, so it can be malloc'd up to NGROUPS in size.
This requires that the VFS_CHECKEXP() arguments change, so I also modified the
last "secflavors" argument to be an array pointer, so that the
secflavors could be copied in VFS_CHECKEXP() while the export entry is locked.
(Without this patch VFS_CHECKEXP() returns a pointer to the secflavors
array and then it is used after being unlocked, which is potentially
a problem if the exports entry is changed.
In practice this does not occur when mountd is run with "-S",
but I think it is worth fixing.)

This patch also deleted the vfs_oexport_conv() function, since
do_mount_update() does the conversion, as required by the old vfs_cmount()
calls.

Reviewed by:	kib, freqlabs
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D25088
2020-06-14 00:10:18 +00:00
Ryan Moeller
245bfd34da Deduplicate fsid comparisons
Comparing fsid_t objects requires internal knowledge of the fsid structure
and yet this is duplicated across a number of places in the code.

Simplify by creating a fsidcmp function (macro).

Reviewed by:	mjg, rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24749
2020-05-21 01:55:35 +00:00
Rick Macklem
3d7650f04c Add a function nfsm_set() to initialize "struct nfsrv_descript" for building
mbuf lists.

This function is currently trivial, but will that will change when
support for building NFS messages in ext_pgs mbufs is added.
Adding support for ext_pgs mbufs is needed for KERN_TLS, which will
be used to implement nfs-over-tls.
2020-05-18 00:07:45 +00:00
John Baldwin
07a34ce381 Remove unused header for DES.
The NFS port doesn't use any of the DES functions.
2020-05-13 18:35:02 +00:00
Ryan Moeller
b9cc3262bc nfs: Remove APPLESTATIC macro
It is no longer useful.

Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24811
2020-05-12 13:23:25 +00:00
Ryan Moeller
32033b3d30 Remove APPLEKEXT ifndefs
They are no longer useful.

Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24752
2020-05-08 14:39:38 +00:00
Rick Macklem
04d6c514b0 Delete unused function newnfs_trimleading.
The NFS function called newnfs_trimleading() has not been used by the
code in long time. To give you a clue, it still had a K&R style function
declaration.
Delete it, since it is just cruft, as a part of the NFS mbuf handling
cleanup in preparation for adding ext_pgs mbuf support.
The ext_pgs mbuf support for the build/send side is needed by
nfs-over-tls.
2020-05-06 00:44:03 +00:00
Rick Macklem
3973ef1dfc Revert r360514, to avoid unnecessary churn of the sources.
r360514 prepared the NFS code for changes to handle ext_pgs mbufs on
the receive side. However, at this time, KERN_TLS does not pass
ext_pgs mbufs up through soreceive(). As such, as this time, only
the send/build side of the NFS mbuf code needs to handle ext_pgs mbufs.
Revert r360514 since the rather extensive changes required for receive
side ext_pgs mbufs are not yet needed.
This avoids unnecessary churn of the sources.
2020-05-05 00:58:03 +00:00
Rick Macklem
0c9cd5cacd Factor some code out of nfsm_dissct() into separate functions.
Factoring some of the code in nfsm_dissct() out into separate functions
allows these functions to be used elsewhere in the NFS mbuf handling code.
Other uses of these functions will be done in future commits.
It also makes it easier to add support for ext_pgs mbufs, which is needed
for nfs-over-tls under development in base/projects/nfs-over-tls.

Although the algorithm in nfsm_dissct() is somewhat re-written by this
patch, the semantics of nfsm_dissct() should not have changed.
2020-05-01 00:36:14 +00:00
Rick Macklem
5ecf33c6c4 Get rid of uio_XXX macros used for the Mac OS/X port.
The NFS code had a bunch of Mac OS/X accessor functions named uio_XXX
left over from the port to Mac OS/X. Since that port is long forgotten,
replace the calls with the code generated by the FreeBSD macros for these
in nfskpiport.h. This allows the macros to be deleted from nfskpiport.h
and I think makes the code more readable.

This patch should not result in any semantic change.
2020-04-28 02:11:02 +00:00
Rick Macklem
e4a458bb1b Remove Mac OS/X macros that did nothing for FreeBSD.
The macros CAST_USER_ADDR_T() and CAST_DOWN() were used for the Mac OS/X
port. The first of these macros was a no-op for FreeBSD and the second
is no longer used.
This patch gets rid of them. It also deletes the "mbuf_t" typedef which
is no longer used in the FreeBSD code from nfskpiport.h

This patch should not change semantics.
2020-04-25 02:18:59 +00:00
Rick Macklem
897d7d45ba Make the NFSv4.n client's recovery from NFSERR_BADSESSION RFC5661 conformant.
RFC5661 specifies that a client's recovery upon receipt of NFSERR_BADSESSION
should first consist of a CreateSession operation using the extant ClientID.
If that fails, then a full recovery beginning with the ExchangeID operation
is to be done.
Without this patch, the FreeBSD client did not attempt the CreateSession
operation with the extant ClientID and went directly to a full recovery
beginning with ExchangeID. I have had this patch several years, but since
no extant NFSv4.n server required the CreateSession with extant ClientID,
I have never committed it.
I an committing it now, since I suspect some future NFSv4.n server will
require this and it should not negatively impact recovery for extant NFSv4.n
servers, since they should all return NFSERR_STATECLIENTID for this first
CreateSession.

The patched client has been tested for recovery against both the FreeBSD
and Linux NFSv4.n servers and no problems have been observed.

MFC after:	1 month
2020-04-22 21:00:14 +00:00
Rick Macklem
ae070589d3 Replace all instances of the typedef mbuf_t with "struct mbuf *".
The typedef mbuf_t was used for the Mac OS/X port of the code long ago.
Since this port is no longer used and the use of mbuf_t obscures what
the code does (and is not consistent with style(9)), it is no longer needed.
This patch replaces all instances of mbuf_t with "struct mbuf *", so that
it is no longer used.

This patch should not result in any semantic change.
2020-04-17 21:17:51 +00:00
Rick Macklem
66ea9219a2 Delete the mbuf macros that were used for the Mac OS/X port.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since r359757, r359780, r359785, r359810, r359811 have removed all uses
of these macros, this patch deleted the macros from the .h files.

My eventual goal is deleting nfskpiport.h, but that will take some more
editting to replace uses of the remaining macros.
2020-04-13 00:07:37 +00:00
Rick Macklem
e3e7c612f3 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.

This is the final patch of this series and the macros should now be
able to be deleted from the .h files in a future commit.
2020-04-11 23:37:58 +00:00
Rick Macklem
9f6624d317 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.
2020-04-11 20:57:15 +00:00
Rick Macklem
c948a17a52 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.
This conversion will be committed one file at a time.
2020-04-09 23:11:19 +00:00
Rick Macklem
8de97f394e Remove the old NFS lock device driver that uses Giant.
This NFS lock device driver was replaced by the kernel NLM around FreeBSD7 and
has not normally been used since then.
To use it, the kernel had to be built without "options NFSLOCKD" and
the nfslockd.ko had to be deleted as well.
Since it uses Giant and is no longer used, this patch removes it.

With this device driver removed, there is now a lot of unused code
in the userland rpc.lockd. That will be removed on a future commit.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D22933
2020-04-09 14:44:46 +00:00
Rick Macklem
76fd19b0a2 Fix noisy NFSv4 server printf.
Peter reported that his dmesg was getting cluttered with
nfsrv_cache_session: no session
messages when he rebooted his NFS server and they did not seem useful.
He was correct, in that these messages are "normal" and expected when
NFSv4.1 or NFSv4.2 are mounted and the server is rebooted.
This patch silences the printf() during the grace period after a reboot.
It also adds the client IP address to the printf(), so that the message
is more useful if/when it occurs. If this happens outside of the
server's grace period, it does indicate something is not working correctly.
Instead of adding yet another nd_XXX argument, the arguments for
nfsrv_cache_session() were simplified to take a "struct nfsrv_descript *".

Reported by:	pen@lysator.liu.se
MFC after:	2 weeks
2020-04-06 23:21:39 +00:00
Mark Johnston
355b3b7fd7 Simplify td_ucred handling in newnfs_connect().
No functional change intended.

MFC after:	1 week
2020-03-26 15:02:56 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Eric van Gyzen
cf64777f50 Add missing comma in nfsv4_errstr
Reported by:	Coverity
CID:		1412243
Sponsored by:	Dell EMC Isilon
2020-01-13 21:49:27 +00:00
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Rick Macklem
57fd4aa7e4 Change NFSv4.1 and NFSv4.2 error strings to start with lower case letter.
r356084 added error strings for NFSv4.1 and NFSv4.2, with the first
character capitalized. Since the other error strings were not capitalized
and these strings would usually be imbedded in an error, I decided to
make the first characters lower cased.
No real effect but more consistent.
2019-12-26 21:06:34 +00:00
Rick Macklem
8f2940cec7 Add NFSv4.1 and NFSv4.2 errors to nfsv4_errstr.h.
nfsv4_errstr.h only had strings for NFSv4.0 errors. This patch adds the
errors for NFSv4.1 and NFSv4.2. At this time, this file is not used by
any sources in the tree, so the change is not significant.
I do plan on using nfsv4_errstr.h in a future patch to mount_nfs.c.
Since I am doing this patch so that "minor version mismatch" will be
recognized, I made that string less abbreviated.
2019-12-25 22:25:30 +00:00
Rick Macklem
c057a37818 Add support for NFSv4.2 to the NFS client and server.
This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes
(RFC-8276) to the NFS client and server.
NFSv4.2 is comprised of several optional features that can be supported
in addition to NFSv4.1. This patch adds the following optional features:
   - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED)
   - posix_fallocate()
   - intra server file range copying via the copy_file_range(2) syscall
     --> Avoiding data tranfer over the wire to/from the NFS client.
   - lseek(SEEK_DATA/SEEK_HOLE)
   - Extended attribute syscalls for "user" namespace attributes as defined
     by RFC-8276.

Although this patch is fairly large, it should not affect support for
the other versions of NFS. However it does add two new sysctls that allow
a sysadmin to limit which minor versions of NFSv4 a server supports, allowing
a sysadmin to disable NFSv4.2.

Unfortunately, when the NFS stats structure was last revised, it was assumed
that there would be no additional operations added beyond what was
specified in RFC-7862. However RFC-8276 did add additional operations,
forcing the NFS stats structure to revised again. It now has extra unused
entries in all arrays, so that future extensions to NFSv4.2 can be
accomodated without revising this structure again.

A future commit will update nfsstat(1) to report counts for the new NFSv4.2
specific operations/procedures.

This patch affects the internal interface between the nfscommon, nfscl and
nfsd modules and, as such, they all must be upgraded simultaneously.
I will do a version bump (although arguably not needed), due to this.

This code has survived a "make universe" but has not been built with a
recent GCC. If you encounter build problems, please email me.

Relnotes:	yes
2019-12-12 23:22:55 +00:00
Rick Macklem
8e1906f700 Fix kernel handling of a NFSERR_MINORVERSMISMATCH NFSv4 server reply.
When an NFSv4 server replies NFSERR_MINORVERSMISMATCH, it does not generate
a status result for the first operation in the compound. Without this
patch, this will result in a bogus EBADXDR error return.
Returning EBADXDR is relatively harmless, but a correct reply of
NFSERR_MINORVERSMISMATCH is needed by the pNFS client to select the correct
minor version to use for a File Layout DS now that there can be NFSv4.2
DS servers.

mount_nfs.c still needs to be fixed for this, although how the mount fails
is only useful to help sysadmins isolate why a mount fails.

Found during testing of the NFSv4.2 client and server.

MFC after:	2 weeks
2019-12-08 00:06:00 +00:00
Rick Macklem
238da71f91 Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which adds NFSv4.2 support to the NFS client and server.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-07 23:13:51 +00:00
Rick Macklem
394dae30b1 Set the XATTRSUPPORT attribute bit for NFSv4.2, always cleared for now.
Since r355472 added code which clears the XATTRSUPPORT bit for non-NFSv4.2
mounts, it is now safe to set it.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
This commit completes updates to nfsproto.h required by the NFSv4.2.
2019-12-07 01:10:38 +00:00
Rick Macklem
2096ce0339 Add a couple of definitions for NFSv4.2 and update macros to use them.
This patch adds code to macros to clear attribute bits not supported
by NFSv4.2. For now, these bits are never set anyhow, but this prepares
the code for the addition of NFSv4.2 support in a future commit.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-06 23:51:11 +00:00
Rick Macklem
8f9259a550 Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which adds NFSv4.2 support to the NFS client and server.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-06 01:53:02 +00:00
Rick Macklem
348d9b567e Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which adds NFSv4.2 support to the NFS client and server.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-04 23:24:40 +00:00
Rick Macklem
e1cda5eea6 Fix two races while handling nfsuserd daemon start/stop.
A crash was reported where the nr_client field was NULL during an upcall
to the nfsuserd daemon. Since nr_client == NULL only occurs when the
nfsuserd daemon is being shut down, it appeared to be caused by a race
between doing an upcall and the daemon shutting down.
By inspection two races were identified:
1 - The nfsrv_nfsuserd variable is used to indicate whether or not the
    daemon is running. However it did not handle the intermediate phase
    where the daemon is starting or stopping.

    This was fixed by making nfsrv_nfsuserd tri-state and having the
    functions that are called during start/stop to obey the intermediate
    state.

2 - nfsrv_nfsuserd was checked to see that the daemon was running at
    the beginning of an upcall, but nothing prevented the daemon from
    being shut down while an upcall was still in progress.
    This race probably caused the crash.

    The patch fixes this by adding a count of upcalls in progress and
    having the shut down function delay until this count goes to zero
    before getting rid of nr_client and related data used by an upcall.

Tested by:	avg (Panzura QA)
Reported by:	avg
Reviewed by:	avg
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D22377
2019-11-28 23:34:23 +00:00
Rick Macklem
14eff785e8 Fix the pNFS server's reporting of SpaceUsed (va_bytes).
The pNFS server currently reports SpaceUsed (va_bytes) for the metadata
file. This in not correct, since the metadata file is always empty and,
as such, va_bytes is just the allocation for the empty file.
This patch adds va_bytes to the list of attributes acquired from the
DS for a file, so that it includes the allocated data size and is updated
when the file is written.
For files created on a pNFS server before this patch is applied, the
va_bytes value is estimated by rounding va_size up to a multiple of
BLKDEV_IOSIZE. Once the file is written after this patch has been
applied to the metadata server, the va_bytes returned for the file
will be correct.

This patch only affects a pNFS metadata server.

Found during testing of the NFSv4.2 pNFS server for the Allocate operation.
(Not yet in head/current.)

MFC after:	2 weeks
2019-11-22 00:22:55 +00:00
Konstantin Belousov
c6ba06d86c Fix interface between nfsclient and vnode pager.
Make the nfsclient always call vnode_pager_setsize() with the vnode
exclusively locked.  This ensures that page fault always can find the
backing page if the object size check succeeded.  Set VV_VMSIZEVNLOCK
flag on NFS nodes.

The main offender breaking the interface in nfsclient is
nfs_loadattrcache(), which is used whenever server responded with
updated attributes, which can happen on non-changing operations as
well.  Also, iod threads only have buffers locked (and even that is
LK_KERNPROC), but they still may call nfs_loadattrcache() on RPC
response.

Instead of immediately calling vnode_pager_setsize() if server
response indicated changed file size, but the vnode is not exclusively
locked, set a new node flag NVNSETSZSKIP.  When the vnode exclusively
locked, or when we can temporary upgrade the lock to exclusive, call
vnode_pager_setsize(), by providing the nfsclient VOP_LOCK() implementation.

Tested by:	pho
Discussed with:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D21883
2019-10-22 16:17:38 +00:00
Rick Macklem
ee7201a725 Replace all mtx_assert() calls for n_mtx and ncl_iod_mutex with macros.
To be consistent with replacing the mtx_lock()/mtx_unlock() calls on
the NFS node mutex (n_mtx) and ncl_iod_mutex, this patch replaces
all mtx_assert() calls on these mutexes with macros as well.
This will simplify changing these locks to sx locks in a future commit.
However, this change may be delayed indefinitely, since it appears there
is a deadlock when vnode_pager_setsize() is called to shrink the size
and the NFS node lock is held.
There is no semantic change as a result of this commit.

Suggested by:	kib
MFC after:	1 week
2019-09-26 02:54:45 +00:00
Rick Macklem
b662b41e62 Replace all mtx_lock()/mtx_unlock() on the iod lock with macros.
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called and the iod lock is held when the NFS node lock
is acquired, the iod mutex will need to be changed to an sx lock as well.
To simply the future commit that changes both the NFS node lock and iod lock
to sx locks, this commit replaces all mtx_lock()/mtx_unlock() calls on the
iod lock with macros.
There is no semantic change as a result of this commit.

I don't know when the future commit will happen and be MFC'd, so I have
set the MFC on this commit to one week so that it can be MFC'd at the same
time.

Suggested by:	kib
MFC after:	1 week
2019-09-24 23:38:10 +00:00
Rick Macklem
5d85e12f44 Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros.
For a long time, some places in the NFS code have locked/unlocked the
NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas
others have simply used mtx_lock()/mtx_unlock().
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock
with the macros to simply making the change to an sx lock in future commit.
There is no semantic change as a result of this commit.

I am not sure if the change to an sx lock will be MFC'd soon, so I put
an MFC of 1 week on this commit so that it could be MFC'd with that commit.

Suggested by:	kib
MFC after:	1 week
2019-09-24 01:58:54 +00:00
Rick Macklem
2e67077700 Delete the unused "nd" argument for nfsrv_checkdsattr().
The "nd" argument for nfsrv_checkdsattr() is no longer used by the function.
This patch deletes it. This allows subsequent patches to delete the "nd"
argument from nfsrv_proxyds(), since it's only use of "nd" was to pass it
to nfsrv_checkdsattr(). The same will then be true for nfsvno_getattr(),
which passes "nd" to nfsrv_proxyds().
Getting rid of the "nd" argument from nfsvno_getattr() avoids confusion
over why it might need "nd".

This patch is trivial and does not have any semantic effect.
Found by inspection while working on the NFSv4.2 server.
2019-09-04 22:37:28 +00:00
Rick Macklem
a6f77c9a6e Add #ifdef INET as requested by bz@. 2019-04-21 22:53:51 +00:00
Rick Macklem
b4372164ed Add support for the ModeSetMasked attribute to the NFSv4.1 server.
I do not know of an extant NFSv4.1 client that currently does a Setattr
operation for the ModeSetMasked, but it has been discussed on the linux-nfs
mailing list.
This patch adds support for doing a Setattr of ModeSetMasked, so that it
will work for any future NFSv4.1 client that chooses to do so.
Tested via a hacked FreeBSD NFSv4.1 client.

MFC after:	2 weeks
2019-04-19 23:35:08 +00:00
Rick Macklem
ea5776ec47 Fix the NFSv4.0 server so that it does not support NFSv4.1 attributes.
During inspection of a packet trace, I noticed that an NFSv4.0 mount
reported that it supported attributes that are only defined for NFSv4.1.
In practice, this bug appears to be benign, since NFSv4.0 clients will
not use attributes that were added for NFSv4.1.
However, this was not correct and this patch fixes the NFSv4.0 server
so that it only supports attributes defined for NFSv4.0.
It also adds a definition for NFSv4.1 attributes that can only be set,
although it is only defined as 0 for now.
This is anticipation of the addition of support for the NFSv4.1 mode+mask
attribute soon.

MFC after:	2 weeks
2019-04-19 03:36:22 +00:00
Rick Macklem
eeb1f3ed51 Fix the NFSv4 client to safely find processes.
r340744 broke the NFSv4 client, because it replaced pfind_locked() with a
call to pfind(), since pfind() acquires the sx lock for the pid hash and
the NFSv4 already holds a mutex when it does the call.
The patch fixes the problem by recreating a pfind_any_locked() and adding the
functions pidhash_slockall() and pidhash_sunlockall to acquire/release
all of the pid hash locks.
These functions are then used by the NFSv4 client instead of acquiring
the allproc_lock and calling pfind().

Reviewed by:	kib, mjg
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19887
2019-04-15 01:27:15 +00:00
Rick Macklem
80405bcf79 Add INET6 support for the upcalls to the nfsuserd daemon.
The kernel code uses UDP to do upcalls to the nfsuserd(8) daemon to get
updates to the username<->uid and groupname<->gid mappings.
A change to AF_LOCAL last year had to be reverted, since it could result
in vnode locking issues on the AF_LOCAL socket.
This patch adds INET6 support and the required #ifdef INET and INET6
to the code.

Requested by:	bz
PR:		205193
Reviewed by:	bz, rgrimes
MFC after:	2 weeks
Differential Revision:	http://reviews.freebsd.org/D19218
2019-04-06 21:53:46 +00:00
Rick Macklem
02c8dd7d72 Revert r320698, since the related userland changes were reverted by r338192.
r338192 reverted the changes to nfsuserd so that it could use an AF_LOCAL
socket, since it resulted in a vnode locking panic().
Post r338192 nfsuserd daemons use the old AF_INET socket for upcalls and
do not use these kernel changes.
I left them in for a while, so that nfsuserd daemons built from head sources
between r320757 (Jul. 6, 2017) and r338192 (Aug. 22, 2018) would need them
by default.
This only affects head, since the changes were never MFC'd.
I will add an UPDATING entry, since an nfsuserd daemon built from head
sources between r320757 and r338192 will not run unless the "-use-udpsock"
option is specified. (This command line option is only in the affected
revisions of the nfsuserd daemon.)

I suspect few will be affected by this, since most who run systems built
from head sources (not stable or releases) will have rebuilt their nfsuserd
daemon from sources post r338192 (Aug. 22, 2018)

This is being reverted in preparation for an update to include AF_INET6
support to the code.
2019-04-04 23:30:27 +00:00
Edward Tomasz Napierala
2df8bd90c8 Drop unused 'p' argument to nfsv4_strtogid().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 15:07:47 +00:00
Edward Tomasz Napierala
c703cba811 Drop unused 'p' argument to nfsv4_gidtostr().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 15:05:11 +00:00
Edward Tomasz Napierala
0658ac3943 Drop unused 'p' argument to nfsv4_strtouid().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 15:02:52 +00:00
Edward Tomasz Napierala
0f86b94a56 Drop unused 'p' argument to nfsv4_uidtostr().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 14:59:08 +00:00
Edward Tomasz Napierala
f32bf2922f Drop unused 'p' argument to nfsrv_getuser().
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19455
2019-03-12 14:53:53 +00:00
Edward Tomasz Napierala
01c27978f5 Don't pass td to nfsvno_open().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-04 14:50:00 +00:00
Edward Tomasz Napierala
127152fe56 Don't pass td to nfsvno_createsub().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-04 14:30:53 +00:00
Edward Tomasz Napierala
5edc9102dc Don't pass td to nfsd_fhtovp(), it's unused.
Reviewed by:	rmacklem (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19421
2019-03-04 13:18:04 +00:00
Edward Tomasz Napierala
af444b18ed Push down the thread argument in NFS server code, using curthread
instead of passing it explicitly. No functional changes

Reviewed by:	rmacklem (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19419
2019-03-04 13:12:23 +00:00
Edward Tomasz Napierala
113aa93390 Push down td in nfsrvd_dorpc() - make it use curthread instead
of it being explicitly passed as an argument. No functional changes.

The big picture here is that I want to get rid of the 'td' argument
being passed everywhere, and this is the first piece that affects
the NFS server.

Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19417
2019-03-04 13:02:36 +00:00
Mateusz Guzik
cc426dd319 Remove unused argument to priv_check_cred.
Patch mostly generated with cocinnelle:

@@
expression E1,E2;
@@

- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)

Sponsored by:	The FreeBSD Foundation
2018-12-11 19:32:16 +00:00
Rick Macklem
778f29833b nfsm_advance() would panic() when the offs argument was negative.
The code assumed that this would indicate a corrupted mbuf chain, but
it could simply be caused by bogus RPC message data.
This patch replaces the panic() with a printf() plus error return.

MFC after:	1 week
2018-11-20 01:56:34 +00:00
Brooks Davis
1493c2ee62 Make vop_symlink take a const target path.
This will enable callers to take const paths as part of syscall
decleration improvements.

Where doing so is easy and non-distruptive carry the const through
implementations. In UFS the value is passed to an interface that must
take non-const values. In ZFS, const poisoning would touch code shared
with upstream and it's not worth adding diffs.

Bump __FreeBSD_version for external API consumers.

Reviewed by:	kib (prior version)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17805
2018-11-02 14:42:36 +00:00
Rick Macklem
3e5ba2e187 Fix LORs between vn_start_write() and vn_lock() in the pNFS server.
When coding the pNFS server, I added several vn_start_write() calls done
while the vnode was locked, not realizing I had introduced LORs and
possible deadlock when an exported file system on the MDS is suspended.
This patch fixes this by removing the added vn_start_write() calls and
modifying the code so that the extant vn_start_write() call before the
NFS RPC/operation is done when needed by the pNFS server.
Flags are changed so that LayoutCommit and LayoutReturn now get a
vn_start_write() done for them.
When the pNFS server is enabled, the code now also changes the flags for
Getattr, so that the vn_start_write() is done for Getattr, since it may
need to do a vn_set_extattr(). The nfs_writerpc flag array was made global
to the NFS server and renamed nfsrv_writerpc, which is consistent naming
for globals in the NFS server.
Thanks go to kib@ for reporting that doing vn_start_write() while the vnode is
locked results in a LOR.
This patch only affects the behaviour of the pNFS server.
2018-08-17 21:12:16 +00:00
Rick Macklem
93df87f208 Allow newnfs_request() to retry all callback RPCs with an NFSERR_DELAY reply.
The code in newnfs_request() retries RPCs that get a reply of NFSERR_DELAY,
but exempts certain NFSv4 operations. However, for callback RPCs, there
should not be any exemptions at this time. The code would have erroneously
exempted the CBRECALL callback, since it has the same operation number as
the CLOSE operation.
This patch fixes this by checking for a callback RPC (indicated by clp != NULL)
and not checking for exempt operations for callbacks.
This would have only affected the NFSv4 server when delegations are enabled
(they are not enabled by default) and the client replies to CBRECALL with
NFSERR_DELAY. This may never actually happen.
Spotted during code inspection.

MFC after:	2 weeks
2018-08-07 21:29:14 +00:00
Rick Macklem
a3e709cd33 Modify the NFSv4.1 server so that it allows ReclaimComplete as done by ESXi 6.7.
I believe that a ReclaimComplete with rca_one_fs == TRUE is only
to be used after a file system has been transferred to a different
file server.  However, RFC5661 is somewhat vague w.r.t. this and
the ESXi 6.7 client does both a ReclaimComplete with rca_one_fs == TRUE
and one with ReclaimComplete with rca_one_fs == FALSE.
Therefore, just ignore the rca_one_fs == TRUE operation and return
NFS_OK without doing anything instead of replying NFS4ERR_NOTSUPP.
This allows the ESXi 6.7 NFSv4.1 client to do a mount.
After discussion on the NFSv4 IETF working group mailing list, doing this
along with setting a flag to note that a ReclaimComplete with rca_one_fs TRUE
was an appropriate way to handle this.
The flag that indicates that a ReclaimComplete with rca_one_fs == TRUE was
done may be used to disable replies of NFS4ERR_GRACE for non-reclaim
state operations in a future commit.

This patch along with r332790, r334492 and r336357 allow ESXi 6.7 NFSv4.1 mounts
work ok. ESX 6.5 NFSv4.1 mounts do not work well, due to what I believe are
violations of RFC-5661 and should not be used.

Reported by:	andreas.nagy@frequentis.com
Tested by:	andreas.nagy@frequentis.com, daniel@ftml.net (earlier version)
MFC after:	2 weeks
Relnotes:	yes
2018-07-28 20:21:04 +00:00
Rick Macklem
cecf6c6e9c Set CLSET_TIMEOUT on TCP connections to pNFS DSs.
Use CLSET_TIMEOUT to set the timeout for connections to DSs instead of
specifying a timeout on each RPC. This is done so that SO_SNDTIMEO
is set on the TCP socket as well as specifying a time limit when
waiting for an RPC reply.  Useful if the send queue for the TCP
connection has become constipated, due to a failed DS.
The choice of lease_duration / 4 is fairly arbitrary, but seems to work
ok, with a lower bound of 10sec.
For client connections to a DS, set the retry limit to vfs.nfsd.dsretries,
which is 2 by default.
This patch should only affect pNFS connections to DSs.
This patch requires r336542.

MFC after:	2 weeks
2018-07-21 01:33:07 +00:00
Rick Macklem
5d54f186bb Modify the reasons for not issuing a delegation in the NFSv4.1 server.
The ESXi NFSv4.1 client will generate warning messages when the reason for
not issuing a delegation is two. Two refers to a resource limit and I do
not see why it would be considered invalid. However it probably was not the
best choice of reason for not issuing a delegation.
This patch changes the reasons used to ones that the ESXi client doesn't
complain about. This change does not affect the FreeBSD client and does
not appear to affect behaviour of the Linux NFSv4.1 client.
RFC5661 defines these "reasons" but does not give any guidance w.r.t. which
ones are more appropriate to return to a client.

Tested by:	andreas.nagy@frequentis.com
PR:		226650
MFC after:	2 weeks
2018-07-16 21:32:50 +00:00
Rick Macklem
5da3882447 Shut down the TCP connection to a DS in the pNFS client when Renew fails.
When a NFSv4.1 client mount using pNFS detects a failure trying to do a
Renew (actually just a Sequence operation), the code would simply try
again and again and again every 30sec.
This would tie up the "nfscl" thread, which should also be doing other
things like Renews on other DSs and the MDS.
This patch adds code which closes down the TCP connection and marks it
defunct when Renew detects an failure to communicate with the DS, so
further Renews will not be attempted until a new working TCP connection to
the DS is established.
It also makes the call to nfscl_cancelreqs() unconditional, since
nfscl_cancelreqs() checks the NFSCLDS_SAMECONN flag and does so while holding
the lock.
This fix only applies to the NFSv4.1 client whne using pNFS and without it
the only effect would have been an "nfscl" thread busy doing Renew attempts
on an unresponsive DS.

MFC after:	2 weeks
2018-07-15 18:54:44 +00:00
Rick Macklem
89c64a3a4f Fix the pNFS client when mirrors aren't on the same machine.
Without this patch, the client side NFSv4.1 pNFS code erroneously did writes
and commits to both DS mirrors using the TCP connection of the first one.
For my test setup this worked, since I have both DSs running on the same
machine, but it would have failed when the DSs are on separate machines.
This patch fixes the code to use the correct TCP connection for each DS.
This patch should only affect the NFSv4.1 client when using "pnfs" mounts
to mirrored DSs.

MFC after:	2 weeks
2018-07-14 19:51:44 +00:00
Rick Macklem
a6fed5f514 Modify the NFSv4.1 pNFS client to use separate TCP connections for DSs.
Without this patch, the NFSv4.1 pNFS client shared a single TCP connection
for all DSs that resided on the same machine. This made disabling one of
the DSs impossible. Although unlikely, it is possible that the storage
subsystem has failed in such a way that the storage for one DS on a machine
is no longer functioning correctly, but the storage used by another DS on
the same machine is still ok. For this case, it would be nice if a system
can fail one of the DSs without failing them all.
This patch changes the default behaviour to use separate TCP connections
for each DS even if they reside on the same machine.
I do not believe that this will be a problem for extant pNFS servers, but
a sysctl can be set to restore the old behaviour if this change causes a
problem for an extant pNFS server.
This patch only affects the NFSv4.1 pNFS client.

MFC after:	2 weeks
2018-07-12 20:46:22 +00:00
Rick Macklem
de9a1a70ab Add support for a "forced" pnfsdskill to the pNFS server kernel code.
The pnfsdskill(8) command will normally fail if there is no valid mirror
for the DS to be disabled. However, a system administrator may need to
disable a DS which does not have a valid mirror so that the nfsd threads
can be terminated. This patch adds the kernel code needed by pnfsdskill(8)
to implement this "forced" case of disabling a DS.
This patch only affects the pNFS server.
2018-07-09 19:58:01 +00:00
Rick Macklem
2f32675c83 Add an optional feature to the pNFS server.
Without this patch, the pNFS server distributes the data storage files across
all of the specified DSs.
A tester noted that it would be nice if a system administrator could control
which DSs are used to store the file data for a given exported MDS file system.
This patch adds the kernel support to do this. It also makes a slight semantic
change to nfsv4_findmirror(), since some uses of it no longer require that
the DS being searched for have a current mirror.
A patch that will be committed in a few minutes will modify the nfsd daemon
to support this feature.
The patch should only affect sites using the pNFS server (specified via the
"-p" command line option for nfsd.

Suggested by:	james.rose@framestore.com
2018-07-02 19:21:33 +00:00
Rick Macklem
9f4c522e6b Set the slotid and ND_HASSLOTID flag for NFSv4.1 sequenced operations.
Most NFSv4.1 compound RPCs start with a Sequence operation. For these
cases, save the slotid and note that it is saved by setting ND_HASSLOTID.
This is used by r335568 to free up the session slot and disable it.

MFC after:	2 weeks
2018-06-23 00:48:45 +00:00
Rick Macklem
b18130d330 Define ND_HASSLOTID needed by r335568.
r335568 uses a flag called ND_HASSLOTID to indicate that the slotid is set,
so it can free and invalidate it.
This flag needs to be set, which will be done in a subsequent commit.

MFC after:	2 weeks
2018-06-23 00:37:15 +00:00
Rick Macklem
ba6cce3aea Fix the handling of NFSv4.1 sessions for "soft" mounts.
When a "soft" mount is used for NFSv4.1, an RPC that fails without completing
will leave a slot in the NFSv4.1 session in an indeterminate state.
As such, all that can be done is free up the slot while making is no longer
usable.
A "soft" NFSv4.1 mount is not recommended in general, since it will leave
Open/Lock state in an indeterminate state. An exception is a pNFS mount of
a DS, since there are no Opens/Locks done for them except file creates
where loss of the Open state does not matter.
The patch also makes connections to DSs soft, so that they will fail when
a DS is non-functional or network partitioned, allowing the pNFS MDS to disable
the DS for a mirrored configuration.
This patch should not affect normal "hard" NFSv4.1 mounts.

MFC after:	2 weeks
2018-06-22 21:37:20 +00:00
Rick Macklem
2e35b8fe24 Change the NFSv4.1 pNFS client so that it returns the DS error in layoutreturn.
When the NFSv4.1 pNFS client gets an error for a DS I/O operation using a
Flexible File layout, it returns the layout with an error.
This patch changes the code slightly, so that it returns the layout for all
errors except EACCES and lets the MDS decide what to do based on the error.
It also makes a couple of changes to nfscl_layoutrecall() to ensure that
the first layoutreturn(s) will have the error in the reply.
Plus, the patch adds a wakeup() so that the "nfscl" thread won't wait 1sec
before doing the LayoutReturn.
Tested against the pNFS service.
This patch should not affect non-pNFS use of the client.
The unused "dsp" argument will be used by a future patch that disables the
connection to the DS when possible.

MFC after:	2 weeks
2018-06-22 21:25:27 +00:00
Rick Macklem
755e4b7936 Revert r335263, since it can cause crashes in unusual circumstances.
This needs to be fixed in a different way.
2018-06-17 23:08:54 +00:00
Rick Macklem
2bad64241c Make the pNFS NFSv4.1 client return a Flexible File layout upon error.
The Flexible File layout LayoutReturn operation has argument fields where
an I/O error encountered when attempting I/O on a DS can be reported back
to the MDS.
This patch adds code to the client to do this for the Flexible File layout
mirrored case.
This patch should only affect mounts using the "pnfs" option against servers
that support the Flexible File layout.

MFC after:	2 weeks
2018-06-17 16:30:06 +00:00
Rick Macklem
46d30d3d9c Fix NFSv4.1 client side handling of "soft,retrans=2" mounts.
Normally "soft,retrans=2" cannot be safely used on NFSv4 mounts, since
the RPC can fail and leave the open/lock state in an undefined state.
Doing I/O on a pNFS DS is an exception to this, since no open/lock state
is maintained on the DS server.
It is useful to do "soft,retrans=2" connections to a DS when it is mirrored,
so that the client can detect failure of the DS. As such, mounts from the MDS
to the DSs should use these mount options when mirroring is enabled.
However, the NFSv4.1 client still leaves the session in an undefined state
when this happens.
This patch fixes the problem by setting the session defunct, so it will
no longer be used.
The patch also sets "retries=2" on the connections done by the client to
a DS, which is the internal equivalent of "soft,retrans=2".
The client does not know if the server implements mirroring at connection
time, but always doing this should be safe, since it will fall back on doing
I/O via the MDS as a proxy when there is a failure doing an I/O RPC to the DS.

This patch should not affect non-pNFS client mounts.

MFC after:	2 weeks
2018-06-16 19:45:06 +00:00
Rick Macklem
c338c94d20 Move four functions in nfscl.ko to nfscommon.ko.
Four functions nfscl_reqstart(), nfscl_fillsattr(), nfsm_stateidtom()
and nfsmnt_mdssession() are now called from within the nfsd.
As such, they needed to be moved from nfscl.ko to nfscommon.ko so that
nfsd.ko would load when nfscl.ko wasn't loaded.

Reported by:	herbert@gojira.at
2018-06-14 10:00:19 +00:00
Rick Macklem
90d2dfab19 Merge the pNFS server code from projects/pnfs-planb-server into head.
This code merge adds a pNFS service to the NFSv4.1 server. Although it is
a large commit it should not affect behaviour for a non-pNFS NFS server.
Some documentation on how this works can be found at:
http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt
and will hopefully be turned into a proper document soon.
This is a merge of the kernel code. Userland and man page changes will
come soon, once the dust settles on this merge.
It has passed a "make universe", so I hope it will not cause build problems.
It also adds NFSv4.1 server support for the "current stateid".

Here is a brief overview of the pNFS service:
A pNFS service separates the Read/Write oeprations from all the other NFSv4.1
Metadata operations. It is hoped that this separation allows a pNFS service
to be configured that exceeds the limits of a single NFS server for either
storage capacity and/or I/O bandwidth.
It is possible to configure mirroring within the data servers (DSs) so that
the data storage file for an MDS file will be mirrored on two or more of
the DSs.
When this is used, failure of a DS will not stop the pNFS service and a
failed DS can be recovered once repaired while the pNFS service continues
to operate.  Although two way mirroring would be the norm, it is possible
to set a mirroring level of up to four or the number of DSs, whichever is
less.
The Metadata server will always be a single point of failure,
just as a single NFS server is.

A Plan B pNFS service consists of a single MetaData Server (MDS) and K
Data Servers (DS), all of which are recent FreeBSD systems.
Clients will mount the MDS as they would a single NFS server.
When files are created, the MDS creates a file tree identical to what a
single NFS server creates, except that all the regular (VREG) files will
be empty. As such, if you look at the exported tree on the MDS directly
on the MDS server (not via an NFS mount), the files will all be of size 0.
Each of these files will also have two extended attributes in the system
attribute name space:
pnfsd.dsfile - This extended attrbute stores the information that
    the MDS needs to find the data storage file(s) on DS(s) for this file.
pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime
    and Change attributes for the file, so that the MDS doesn't need to
    acquire the attributes from the DS for every Getattr operation.
For each regular (VREG) file, the MDS creates a data storage file on one
(or more if mirroring is enabled) of the DSs in one of the "dsNN"
subdirectories.  The name of this file is the file handle
of the file on the MDS in hexadecimal so that the name is unique.
The DSs use subdirectories named "ds0" to "dsN" so that no one directory
gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize
on the MDS, with the default being 20.
For production servers that will store a lot of files, this value should
probably be much larger.
It can be increased when the "nfsd" daemon is not running on the MDS,
once the "dsK" directories are created.

For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces
of information to the client that allows it to do I/O directly to the DS.
DeviceInfo - This is relatively static information that defines what a DS
             is. The critical bits of information returned by the FreeBSD
             server is the IP address of the DS and, for the Flexible
             File layout, that NFSv4.1 is to be used and that it is
             "tightly coupled".
             There is a "deviceid" which identifies the DeviceInfo.
Layout     - This is per file and can be recalled by the server when it
             is no longer valid. For the FreeBSD server, there is support
             for two types of layout, call File and Flexible File layout.
             Both allow the client to do I/O on the DS via NFSv4.1 I/O
             operations. The Flexible File layout is a more recent variant
             that allows specification of mirrors, where the client is
             expected to do writes to all mirrors to maintain them in a
             consistent state. The Flexible File layout also allows the
             client to report I/O errors for a DS back to the MDS.
             The Flexible File layout supports two variants referred to as
             "tightly coupled" vs "loosely coupled". The FreeBSD server always
             uses the "tightly coupled" variant where the client uses the
             same credentials to do I/O on the DS as it would on the MDS.
             For the "loosely coupled" variant, the layout specifies a
             synthetic user/group that the client uses to do I/O on the DS.
             The FreeBSD server does not do striping and always returns
             layouts for the entire file. The critical information in a layout
             is Read vs Read/Writea and DeviceID(s) that identify which
             DS(s) the data is stored on.

At this time, the MDS generates File Layout layouts to NFSv4.1 clients
that know how to do pNFS for the non-mirrored DS case unless the sysctl
vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File
layouts are generated.
The mirrored DS configuration always generates Flexible File layouts.
For NFS clients that do not support NFSv4.1 pNFS, all I/O operations
are done against the MDS which acts as a proxy for the appropriate DS(s).
When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy.
If the DS is on the same machine, the MDS/DS will do the RPC on the DS as
a proxy and so on, until the machine runs out of some resource, such as
session slots or mbufs.
As such, DSs must be separate systems from the MDS.

Tested by:	james.rose@framestore.com
Relnotes:	yes
2018-06-12 19:36:32 +00:00
Rick Macklem
73b1879c2d Add a couple of safety belt checks to the NFSv4.1 client related to sessions.
There were a couple of cases in newnfs_request() that it assumed that it
was an NFSv4.1 mount with a session. This should always be the case when
a Sequence operation is in the reply or the server replies NFSERR_BADSESSION.
However, if a server was broken and sent an erroneous reply, these safety
belt checks should avoid trouble.
The one check required a small tweak to nfsmnt_mdssession() so that it
returns NULL when there is no session instead of the offset of the field
in the structure (0x8 for i386).
This patch should have no effect on normal operation of the client.
Found by inspection during pNFS server development.

MFC after:	2 weeks
2018-06-11 19:00:07 +00:00
Rick Macklem
be9d155ff4 Delete some macros that are unused.
These macros were added because they were used by the pNFS server last
year. However, they are no longer used by the pNFS server code and
might as well be deleted.
This is a partial reversion of r326735.
2018-06-09 23:38:22 +00:00
Rick Macklem
d506aa140d Delete an unused macro and clean up a comment about it.
NFSDEV_MIRRORSTR was defined for the pNFS server, but has not been used,
so this patch deletes it. It also cleans up the comment and hopefully
makes it more readable.
2018-06-09 23:14:59 +00:00
Rick Macklem
dec8894b45 Fix the default number of threads for Flex File layout pNFS client I/O.
The intent was that the default would be based on number of CPUs, but the
code disabled using taskqueue() by default.
This code is only executed when mounting a NFSv4.1 server that supports the
Flexible File layout for pNFS and, since such servers are rare, this change
shouldn't result in a POLA violation.
(The FreeBSD pNFS server is still a project and the only other one that
 uses Flexible File layout is being developed by Primary Data and I don't
 know if they have even shipped any to customers yet.)
Found while testing the pNFS server.
2018-06-02 00:11:26 +00:00
Rick Macklem
9442a64e53 Add the BindConnectiontoSession operation to the NFSv4.1 server.
Under some fairly unusual circumstances, the Linux NFSv4.1 client is
doing a BindConnectiontoSession operation for TCP connections.
It is also used by the ESXi6.5 NFSv4.1 client.
This patch adds this operation to the NFSv4.1 server.

Reported by:	andreas.nagy@frequentis.com
Tested by:	andreas.nagy@frequentis.com
MFC after:	2 weeks
2018-06-01 19:47:41 +00:00
Rick Macklem
0ebe2634be End grace for the NFSv4 server if all mounts do ReclaimComplete.
The NFSv4 protocol requires that the server only allow reclaim of state
and not issue any new open/lock state for a grace period after booting.
The NFSv4.0 protocol required this grace period to be greater than the
lease duration (over 2minutes). For NFSv4.1, the client tells the server
that it has done reclaiming state by doing a ReclaimComplete operation.
If all NFSv4 clients are NFSv4.1, the grace period can end once all the
clients have done ReclaimComplete, shortening the time period considerably.
This patch does this. If there are any NFSv4.0 mounts, the grace period
will still be over 2minutes.
This change is only an optimization and does not affect correct operation.

Tested by:	andreas.nagy@frequentis.com
MFC after:	2 months
2018-05-15 20:28:50 +00:00
Rick Macklem
5d4835e4b7 Add support for the TestStateID operation to the NFSv4.1 server.
The Linux client now uses the TestStateID operation, so this patch adds
support for it to the NFSv4.1 server. The FreeBSD client never uses this
operation, so it should not be affected.

MFC after:	2 months
2018-05-11 22:16:23 +00:00
Conrad Meyer
b97b91b547 nfs: Remove NFSSOCKADDRALLOC, NFSSOCKADDRFREE macros
They were just thin wrappers over malloc(9) w/ M_ZERO and free(9).

Discussed with:	rmacklem, markj
Sponsored by:	Dell EMC Isilon
2018-01-25 22:38:39 +00:00
Conrad Meyer
222daa421f style: Remove remaining deprecated MALLOC/FREE macros
Mechanically replace uses of MALLOC/FREE with appropriate invocations of
malloc(9) / free(9) (a series of sed expressions).  Something like:

* MALLOC(a, b, ... -> a = malloc(...
* FREE( -> free(
* free((caddr_t) -> free(

No functional change.

For now, punt on modifying contrib ipfilter code, leaving a definition of
the macro in its KMALLOC().

Reported by:	jhb
Reviewed by:	cy, imp, markj, rmacklem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14035
2018-01-25 22:25:13 +00:00
John Baldwin
b1288166e0 Use long for the last argument to VOP_PATHCONF rather than a register_t.
pathconf(2) and fpathconf(2) both return a long.  The kern_[f]pathconf()
functions now accept a pointer to a long value rather than modifying
td_retval directly.  Instead, the system calls explicitly store the
returned long value in td_retval[0].

Requested by:	bde
Reviewed by:	kib
Sponsored by:	Chelsio Communications
2018-01-17 22:36:58 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
John Baldwin
5538424353 Replace one more LINK_MAX with NFS_LINK_MAX missed in r326991.
Sponsored by:	Chelsio Communications
2017-12-19 22:43:39 +00:00
John Baldwin
a0a073b16d Update NFS to handle larger link counts post ino64.
- Define a NFS_LINK_MAX as UINT32_MAX to match the wire protocol.
- Use NFS_LINK_MAX instead of LINK_MAX as the fallback value reported
  for a PATHCONF RPC by the NFS server.
- Use NFS_LINK_MAX instead of LINK_MAX as the default value reported
  by the NFS client pathconf() if not overridden by the NFS server.
- When reading the link count out of an RPC reply, read the full 32
  bits instead of the lower 16 bits.

Reviewed by:	rmacklem (earlier version)
Sponsored by:	Chelsio Communications
2017-12-19 19:18:48 +00:00
Rick Macklem
219afc4fe2 Define macros used by the pNFS server code.
This commit defines some macros used by the pNFS server code.
They will not be used until the main pNFS server code merge occurs,
which will probably be in April 2018.
2017-12-09 21:04:56 +00:00
Pedro F. Giffuni
d63027b668 sys/fs: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:15:37 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Rick Macklem
f49c813c1d Use taskqueue(9) to do writes/commits to mirrored DSs concurrently.
When the NFSv4.1 pNFS client is using a Flexible File Layout specifying
mirrored Data Servers, it must do the writes and commits to all mirrors.
This patch modifies the client to use a taskqueue to perform these writes
and commits concurrently.
The number of threads can't be changed for taskqueue(9), so it is set
to 4 * mp_ncpus by default, but this can be overridden by setting the
sysctl vfs.nfs.pnfsiothreads.

Differential Revision:	https://reviews.freebsd.org/D12632
2017-10-16 23:28:12 +00:00
Rick Macklem
63918d3848 Fix forced dismount when a pNFS mount is hung on a DS.
When a "pnfs" NFSv4.1 mount is hung because of an unresponsive DS,
a forced dismount wouldn't work, because the RPC socket for the DS
was not being closed. This patch fixes this.
This will only affect "pnfs" mounts where the pNFS server's DS
is unresponsive (crashed or network partitioned or...).
Found during testing of the pNFS server.

MFC after:	2 weeks
2017-10-10 21:05:40 +00:00
Rick Macklem
b949cc41d1 Add Flex File Layout support to the NFSv4.1 pNFS client.
This patch adds support for the Flexible File Layout to the pNFS client.
Although the patch is rather large, it should only affect NFS mounts
using the "pnfs" option against pNFS servers that do not support File
Layout.
There are still a couple of things missing from the Flexible File Layout
client implementation:
- The code does not yet do a LayoutReturn with I/O error stats when
  I/O error(s) occur when attempting to do I/O on a DS.
  This will be fixed in a future commit, since it is important for the
  MDS to know that I/O on a DS is failing.
- The current code does writes and commits to mirror DSs serially.
  Making them happen concurrently will be done in a future commit,
  after discussion on freebsd-current@ on the best way to do this.
- The code does not handle NFSv4.0 DSs. Since there is no extant pNFS
  server that implements NFSv4.0 DSs and NFSv4.1 DSs makes more sense
  now, I don't intend to implement this until there is a need for it.
  There is support for NFSv4.1 and NFSv3 DSs.
2017-10-05 20:10:40 +00:00
Rick Macklem
64059dce60 Add a few definitions for the Flex File Layout.
This patch adds a few definitions for the Flex File Layout.
Until a future commit adds Flex File layout support, these new fields
are not used.
This patch should not affect the "pnfs" option for File Layout.
2017-10-04 22:55:30 +00:00
Rick Macklem
efea6b20b9 Add support for Flex File Layout to the pNFS client structures.
This patch modifies the pNFS client layout and deviceinfo structures
to add fields and unions for the Flex File Layout. Until a future
commit adds Flex File layout support, these new fields are not used.
This patch should not affect the "pnfs" option for File Layout.
2017-09-29 23:13:01 +00:00
Rick Macklem
8dd62f2eaa Add the NFS client state flag that enables Flexible File Layout.
This patch adds a NFSSTA_FLEXFILE flag that will be used to enable
Flexible File Layout for the NFSv4.1 pNFS client. It is not yet
used, but will be after a future commit adds Flex File Layout support.
2017-09-28 23:05:08 +00:00
Rick Macklem
be3d32ad6e Change nfsv4_getipaddr() and nfsrpc_fillsa() to not use sockaddr_storage.
This patch changes nfsv4_getipaddr() and nfsrpc_fillsa() to use
a sockaddr_in * and sockaddr_in6 * instead of sockaddr_storage, to
avoid allocating the latter on the stack. It also moves the nfsrpc_fillsa()
call to after the completion of parsing of the DeviceInfo reply from
the server. This patch is in preparation for addition of Flex File
Layout support in a future commit.
It only affects the "pnfs" NFSv4.1 client mount option and should not
have changed its semantics.
2017-09-28 22:33:01 +00:00
Rick Macklem
a8462c582c Add major and minor version arguments to nfscl_reqstart().
This patch adds "vers" and "minorvers" arguments to nfscl_reqstart().
The patch always passes them in as "0" and that implies no change
in semantics. These arguments will be used by a future commit that
adds support for the Flexible File Layout.
2017-09-26 23:42:44 +00:00
Rick Macklem
6b43e06029 Add a few definitions for Flex File Layout for pNFS.
These definitions will be used by a future commit.
2017-09-21 00:41:12 +00:00
Rick Macklem
0f29b8292d Make the nfsrpc_layoutget() function a static.
Make the NFSv4 pNFS client function nfsrpc_layoutget() a static, since it
is only used in sys/fs/nfsclient/nfs_clrpcops.c.
This prepares the code for future patches that add Flex File layout
support.
2017-09-19 23:28:22 +00:00
Rick Macklem
2742a21091 Add a new function called nfsm_uiombuflist(), similar to nfsm_uiombuf().
This patch adds a new function called nfsm_uiombuflist(), which is
similar to nfsm_uiombuf(), but doesn't not use the fields in
struct nfsrv_descript. This new function will be used by the pNFS client
for writing to mirrors using Flex Files layout.
The function is not yet called anywhere.
Also, get rid of #ifndef APPLE, which is ancient cruft left over from
the Mac OSX port of the NFSv4 client.
2017-09-19 21:31:36 +00:00
Rick Macklem
b0932afacc Simplify nfsrpc_layoutreturn() args.
Simplify nfsrpc_layoutreturn() args. in preparation for the addition
of Flex File layout support, since File layout uses a 0 length field.
Flex Files does use a longer field, but that will be added in a
subsequent commit.
2017-09-19 20:45:25 +00:00
Rick Macklem
ab118d04be Simplify nfsrpc_layoutcommit() args.
Simplify nfsrpc_layoutcommit() args. in preparation for the addition
of Flex File layout support, since it also uses a 0 length field.
2017-09-19 20:18:41 +00:00
Rick Macklem
47cbff34fa Add kernel support for the NFS client forced dismount "umount -N" option.
When an NFS mount is hung against an unresponsive NFS server, the "umount -f"
option can be used to dismount the mount. Unfortunately, "umount -f" gets
hung as well if a "umount" without "-f" has already been done. Usually,
this is because of a vnode lock being held by the "umount" for the mounted-on
vnode.
This patch adds kernel code so that a new "-N" option can be added to "umount",
allowing it to avoid getting hung for this case.
It adds two flags. One indicates that a forced dismount is about to happen
and the other is used, along with setting mnt_data == NULL, to handshake
with the nfs_unmount() VFS call.
It includes a slight change to the interface used between the client and
common NFS modules, so I bumped __FreeBSD_version to ensure both modules are
rebuilt.

Tested by:	pho
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11735
2017-07-29 19:52:47 +00:00
Rick Macklem
16f300fa4a Replace the checks for MNTK_UNMOUNTF with a macro that does the same thing.
This patch defines a macro that checks for MNTK_UNMOUNTF and replaces
explicit checks with this macro. It has no effect on semantics, but
prepares the code for a future patch where there will also be a
NFS specific flag for "forced dismount about to occur".

Suggested by:	kib
MFC after:	2 weeks
2017-07-27 20:55:31 +00:00
Edward Tomasz Napierala
1d2fef9b9a Rename vfs.nfsd.enable_uidtostring to vfs.nfs.enable_uidtostring.
It applies to both NFS client and NFS server, and is useful for both.
This is different from vfs.nfsd.enable_stringtouid, which is specific
to server side.

Reviewed by:	rmacklem@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-07-19 09:59:32 +00:00
Rick Macklem
25d694a6fa Add support for AF_LOCAL socket upcalls to the nfsuserd daemon.
This patch adds support for AF_LOCAL socket upcalls to an nfsuserd daemon
that supports them. A future patch to the nfsuserd daemon will use AF_LOCAL
sockets to avoid a problem when using upcalls to 127.0.0.1 if jails are
in use.

Suggested by:	dfr
PR:		205193
2017-07-06 00:53:12 +00:00
Edward Tomasz Napierala
3c264086aa Revert part of r320359, as suggested by rmacklem@. That case is only used
for nfsuserd -manage-gids and shouldn't depend on sysctl.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-06-27 15:14:06 +00:00
Edward Tomasz Napierala
6a3450e178 Add vfs.nfsd.nfsd_enable_uidtostring, which works just like
vfs.nfsd.nfsd_enable_stringtouid, but in reverse - when set to 1,
it forces the NFSv4 server to return numeric UIDs and GIDs instead
of "user@domain" strings. This helps with clients that can't
translate returned identifiers, eg when rerooting.

The same can be achieved by just never running nfsuserd(8),
but the sysctl is useful to toggle the behaviour back and forth
without rebooting.

Reviewed by:	rmacklem (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11326
2017-06-26 13:11:21 +00:00
Rick Macklem
81b07aac10 Add support to the NFSv4.1/pNFS client for commits through the DS.
A NFSv4.1/pNFS server using File Layout can specify that Commit operations
are to be done against the DS instead of MDS. Since no extant pNFS
server did this, the code was untested and "#ifdef notyet".
The FreeBSD pNFS server I am developing does specify that Commits be done
through the DS, so the code has been enabled/tested.
This patch should only affect the case of a pNFS server that specfies
Commits through the DS.

PR:		219551
MFC after:	2 weeks
2017-06-26 00:43:04 +00:00
Rick Macklem
a351e99ce6 Add two new compound RPCs to the NFSv4.1/pNFS client.
When the NFSv4.1 client is doing pNFS, it needs to get an Open and
a Layout for every file it will be doing I/O on. The current code
does two separate RPCs to get these. This patch adds two new compounds
that do the both the Open and LayoutGet in the same RPC, reducing the
RPC count.
It also factors out the code that sets up and parses the LayoutGet operation
into separate functions, so that the code doesn't get duplicated for
these new RPCs.
This patch is fairly large, but should only affect the NFSv4.1 client
when the "pnfs" option is specified.

PR:		219550
MFC after:	2 weeks
2017-06-24 20:01:21 +00:00
Rick Macklem
ee791357a2 Add the definition of maxbcachebuf to sys/buf.h.
r320070 removed the definition of maxbcachebuf from sys/param.h to
fix the build for arm.
This patch adds the definition of maxbcachebuf to sys/buf.h, which
should be ok, since sys/buf.h is not being included in arm/arm/elf_note.S.

Suggested by:	kib
MFC after:	2 weeks
2017-06-19 22:07:53 +00:00
Rick Macklem
95ac7f1a74 Fix the NFS client/server so that it actually uses the 64bit ino_t filenos.
The code still doesn't use d_off. That will come in a future commit.
The code also removes the checks for servers returning a fileno that
doesn't fit in 32bits, since that should work ok now.
Bump __FreeBSD_version since this patch changes the interface between
the NFS kernel modules.

Reviewed by:	kib
2017-06-18 21:48:31 +00:00
Rick Macklem
1d9f01b18e Take "extern int maxbcachebuf" out of sys/param.h, since it breaks the
arm build.

In the arm build, elf_note.S includes sys/param.h and then does an
elf macro called ELFNOTE(). Although the compile error doesn't make
sense to me, I believe it just means that an "extern ..." can't exist
in param.h for this inclusion case.
I suspect adding #if !defined(LOCORE) might fix the build, but this
commit just takes the definition out.
I will ask freebsd-current@ what is the best was to deal with this
and do a subsequent commit after that.

Reported by:	melounmichal@gmail.com
2017-06-18 12:28:43 +00:00