Commit Graph

5017 Commits

Author SHA1 Message Date
Hans Petter Selasky
d14b53ee31 cuse(3): Allow shared memory allocations up to, but excluding 2 GBytes.
Currently the cuse(3) mmap(2) offset is split into 128 banks of 16 Mbytes.
Allow cuse(3) to make allocations that span multiple banks at the expense
of any fragmentation issues that may arise. Typically mmap(2) buffers are
well below 16 Mbytes. This allows 8K video resolution to work using webcamd.

Reviewed by:	markj @
Differential Revision:	https://reviews.freebsd.org/D35830
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-07-20 10:41:11 +02:00
Hans Petter Selasky
0996dd7df6 cuse(3): Fix an off-by-one.
The page allocation limit is inclusive and not exclusive.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-07-20 10:41:11 +02:00
Dimitry Andric
276099434d Adjust dtnfsclient_unload() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:

    sys/fs/nfsclient/nfs_clkdtrace.c:544:19: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
    dtnfsclient_unload()
                      ^
                       void

This is because dtnfsclient_unload() is declared with a (void) argument
list, but defined with an empty argument list. Make the definition match
the declaration.

MFC after:	3 days
2022-07-19 20:41:24 +02:00
Ed Maste
a5f59e8565 cd9660: Use ANSI (c89) prototypes
Sponsored by:	The FreeBSD Foundation
2022-07-17 08:14:49 -04:00
Mark Johnston
7f3c78fbc9 vm_pager: Remove references to KVME_TYPE_DEFAULT in the kernel
Keep the definition around since it's used by userspace.

Reviewed by:	alc, imp, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35791
2022-07-17 07:09:48 -04:00
Rick Macklem
088ba4356a nfsd: Fix CreateSession for an established ClientID
I mis-read the RFC w.r.t. handling of the sequenceid
when a CreateSession is done after the initial one
that confirms the ClientID.  Fortunately this does
not affect most extant NFSv4.1/4.2 clients, since
they only acquire a single session for TCP for a
ClientID (Solaris might be an exception?).

This patch fixes the server to handle this case,
where the RFC requires the sequenceid be incremented
for each CreateSession and is required to reply to
a retried CreateSession with a cached reply.
It adds a field to nfsclient called lc_prevsess,
which caches the sessionid, which is the only field
in a CreateSession reply that will change for a
retry, to implement this reply cache.

The recent commits up to d4a11b3e3b that mark
session slots bad when "intr" and/or "soft" mounts
are used by the client needs this server patch.
Without this patch, the client will do a full
recovery, including a new ClientID, losing all
byte range locks.  However, prior to the recent
client commits, the client would hang when all
session slots were bad, so even without this
patch it is not a regression.

PR: 260011
MFC after:	2 weeks
2022-07-13 16:28:56 -07:00
Rick Macklem
d4a11b3e3b nfscl: Fix CreateSession for an established ClientID
Commit 981ef32230 added optional use of the session
slots marked bad to recover a new session when all
slots are marked bad.  The recovery worked against
a FreeBSD NFSv4.1/4.2 server, but not a Linux one.
It turns out that it was a bug in the FreeBSD client
and not the Linux server.

This patch fixes the client so that DeleteSession
followed by CreateSession after receiving a
NFSERR_BADSESSION error reply works against the
Linux server (and conforms to the RFC).

This also implies that the FreeBSD NFSv4.1/4.2
server needs to be fixed in a future commit.
Without the fix, the FreeBSD server does a full
recovery, including creation of a new ClientID,
but since "intr" mounts were broken, this does
not result in a regression.

This patch only affects the case where a CreateSession
is done for an already confirmed ClientID, which was
not being done prior to commit 981ef32230.

PR: 260011
MFC after:	2 weeks
2022-07-11 16:50:34 -07:00
Rick Macklem
2adb30740b nfscl: Replace "cred" with NULL to cleanup code
Commit 326bcf9394 added a new "cred" argument to nfscl_reqstart().
Fsinfo is a NFSv3 RPC and since the "cred" argument is not
used for NFSv3, it does not matter what is passed in.
However, to be consistent with the rest of the patch, change the
argument to NULL.

This patch should not result in a semantics change.

PR: 260011
MFC after:	2 weeks
2022-07-11 15:58:07 -07:00
Rick Macklem
8f4a5fc6bc nfscl: Do not call nfscl_hasexpired() for NFSv4.1/4.2
Commit 981ef32230 enabled marking of potentially bad
session slots when an RPC is interrupted if the "intr"
mount option is used.  As such, it no longer makes
sense to call nfscl_hasexpired() for I/O operations that
reply NFSERR_BADSTATEID for NFSv4.1/4.2, which does a full
recovery of NFSv4 open state, destroying all byte range locks.
Recovery of open state should not be usually needed, since
the session slot has been marked potentially bad and,
although opens for the process that has been terminated via
a signal may be broken, locks for other processes will still
be valid.

This patch disables calls to nfscl_hasexpired for NFSv4.1/4.2
mounts, when I/O RPCs receive NFSERR_BADSTATEID replies.
It does not affect the behaviour of NFSv4.0 mounts nor
hard (non "intr") mounts.

PR: 260011
MFC after:	2 weeks
2022-07-10 13:56:38 -07:00
Rick Macklem
981ef32230 nfscl: Enable detection of bad session slots
To deal with broken session slots caused by the use of the
"soft" and/or "intr" mount options, nfsv4_sequencelookup()
has been modified to track the potentially broken session
slots (commit 40ada74ee1).  Then, when all session slots
are potentially broken, nfsv4_sequencelookup() does a
DeleteSession operation, so that the NFSv4.1/4.2 server will
reply NFSERR_BADSESSION to uses of the session.
The client will then recover by doing a CreateSession to
acquire a new session.

This patch adds the code that marks potentially bad
slots, so that the above semantics become functional.
It has been successfully tested against a FreeBSD
NFSv4.1/4.2 server, but does not work against a Linux 5.15
NFSv4.1/4.2 server. (The Linux 5.15 server creates
a new session with the same sessionid as the destroyed
one and, as such, keeps returning NFSERR_BADSESSION.
I believe this is a bug in the Linux server.)

However, this should not cause a regression and will
make "intr" mounts fairly usable against the NFSv4.1/4.2
servers where it works.

PR: 260011
MFC after:	2 weeks
2022-07-10 13:33:19 -07:00
Rick Macklem
627f1555f5 nfscl: Initialize nfsess_badslots to zero
Commit 40ada74ee1 added a field to mark bad session slots.
This patch ensures that the field is initialized to 0.

PR: 260011
MFC after:	2 weeks
2022-07-09 16:12:31 -07:00
Rick Macklem
40ada74ee1 nfscl: Add optional support for slots marked bad
This patch adds support for session slots marked bad
to nfsv4_sequencelookup().  An additional boolean
argument indicates if the check for slots marked bad
should be done.

The "cred" argument added to nfscl_reqstart() by
commit 326bcf9394 is now passed into nfsv4_setquence()
so that it can optionally set the boolean argument
for nfsv4_sequencelookup().  When optionally enabled,
nfsv4_setsequence() will do a DestroySession when all
slots are marked bad.

Since the code that marks slots bad is not yet committed,
this patch should not result in a semantics change.

PR: 260011
MFC after:	2 weeks
2022-07-09 14:43:16 -07:00
Rick Macklem
dff31ae1c5 nfscl: Move nfsrpc_destroysession into nfscommon
This patch moves nfsrpc_destroysession() into nfscommon.ko
and also modifies its arguments slightly.  This will allow
the function to be called from nfsv4_sequencelookup() in
a future commit.

This patch should not result in a semantics change.

PR: 260011
MFC after:	2 weeks
2022-07-09 08:02:14 -07:00
Rick Macklem
2b766d5e5a nfscl: Change the cred argument to non-NULL for pNFS proxies
Commit 326bcf9394 added a "cred" argument to nfscl_reqstart().
For the pNFS proxy calls on the server, the argument
should be "cred" instead of NULL.
This patch fixes this.

Since the argument is not yet used, this patch
should not result in a semantics change.

PR: 260011
MFC after:	2 weeks
2022-07-08 17:27:23 -07:00
Rick Macklem
326bcf9394 nfscl: Add a cred argument to nfscl_reqstart()
To deal with broken session slots caused by the use of the
"soft" and/or "intr" mount options, nfsv4_sequencelookup()
will be modified to track the potentially broken session
slots.  Then, when all session slots are potentially
broken, do a DeleteSession operation, so that the NFSv4
server will reply NFSERR_BADSESSION to uses of the session.
These changes will be done in future commits.  However,
to do the DeleteSession RPC, a "cred" argument is needed
for nfscl_reqstart().  This patch adds this argument,
which is unused at this time.  If the argument is NULL,
it indicates that DeleteSession should not be done
(usually because the RPC does not use sessions).

This patch should not cause any semantics change.

PR: 260011
MFC after:	2 weeks
2022-07-08 16:58:06 -07:00
Rick Macklem
be7b87de16 nfscl: Fix setting of nfsess_defunct for nfscl_hasexpired()
Commit a7bb120f8b added a printf for the case where recovery
has not marked the session defunct by setting nfsess_defunct
to 1.  It turns out that nfscl_hasexpired() calls
nfsrpc_setclient() directly, without setting nfsess_defunct.
This patch replaces the printf with code that sets
nfsess_defunct to 1 to handle this case.

If SIGTERM is issued to a process when it is doing I/O on
an "intr" mount, the NFSv4 server may reply NFSERR_BADSTATEID,
due to the Open being prematurely closed.
This can result in a call to nfscl_hasexpired() to do a
recovery.

This would explain at least one hang described in the PR.

PR: 260011
MFC after:	2 weeks
2022-07-08 07:37:36 -07:00
Hans Petter Selasky
2c28cd09d9 cuse(3): Remove PAGE_SIZE from libcuse.
To allow for a dynamic page size on arm64 remove the static value from libcuse.

Differential Revision:	https://reviews.freebsd.org/D35585
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-06-25 12:01:59 +02:00
Rick Macklem
c11e64ce51 nfscommon: Clean up the code by removing the vnode_vtype() macro
The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code and, therefore,
use of the macro has been deleted by previous commits.
This commit deletes the, now unused, macro.

This commit should not result in a semantics change.
2022-06-24 13:56:35 -07:00
Rick Macklem
1ebc14c900 nfscommon: Clean up the code by not using the vnode_vtype() macro
The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
avoid using it to clean up the code.

This commit should not result in a semantics change.
2022-06-24 13:47:57 -07:00
Rick Macklem
746974c061 nfscl: Clean up the code by not using the vnode_vtype() macro
The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
avoid using it to clean up the code.

This commit should not result in a semantics change.
2022-06-23 16:13:12 -07:00
Rick Macklem
5d3fe02c5a nfsd: Clean up the code by not using the vnode_vtype() macro
The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
avoid using it to clean up the code.

This commit should not result in a semantics change.
2022-06-22 13:20:32 -07:00
Rick Macklem
0586a12904 nfscl: Clean up the code by removing vfs_flags() macro
The vfs_flags() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
remove it to clean up the code.

This commit should not result in a semantics change.
2022-06-20 13:23:04 -07:00
Rick Macklem
6d25ea6d96 nfscl: Clean up the code by removing #if(n)def APPLE
The definition of "APPLE" was used by the Mac OSX port.
For FreeBSD, this definition is never used, so remove
the references to it to clean up the code.

This commit should not result in a semantics change.
2022-06-18 13:43:02 -07:00
Rick Macklem
3c4266eda1 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c and
nfs_clstate.c.

This commit should not result in a semantics change.
2022-06-17 16:46:11 -07:00
Rick Macklem
1e70163c50 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c and
nfs_clvfsops.c. Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-17 14:51:11 -07:00
Rick Macklem
c692ea4026 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c.
Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-16 16:46:06 -07:00
Rick Macklem
af6665e0aa nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c.
Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-16 16:17:13 -07:00
Rick Macklem
8cb42d6918 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c.
Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-15 16:10:50 -07:00
Rick Macklem
da47c186ac nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c.
Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-15 13:12:54 -07:00
Rick Macklem
1c665e95d4 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c.
Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-14 13:35:25 -07:00
Konstantin Belousov
7fd37611b9 null_vptocnp(): busy nullfs mp instead of refing it
null_nodeget() needs a valid mount point data, otherwise we might
race and dereference NULL.

Using MBF_NOWAIT makes non-forced unmount non-transparent for
vn_fullpath() over nullfs, but we make no guarantee that fullpath
calculation succeeds anyway.

Reported and tested by:	pho
Reviewed by:	jah
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35477
2022-06-14 10:32:45 +03:00
Rick Macklem
41c029d506 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c.
Future commits will do the same for other functions.

This commit should not result in a semantics change.
2022-06-13 15:57:42 -07:00
Konstantin Belousov
156745b42d fdescfs: allow chown/utime etc on fdescfs fd for underlying files opened with O_PATH
Reported and tested by:	dchagin
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35410
2022-06-06 22:27:36 +03:00
Rick Macklem
56b64e28e1 nfscl: Do not flush when a write delegation is held
When a NFSv4 byte range write lock is unlocked, all
data modifications need to be flushed to the server
to satisfy the coherency requirements for byte range
locking.  However, if a write delegation for the
file is held by the client, flushing is not required,
since no other NFSv4 client can have the file NFSv4
Opened.

Found by inspection as suggested by a similar change
that was done to the Linux NFSv4 client.
2022-06-02 12:11:55 -07:00
Rick Macklem
9792c7d3eb nfscl: Enable support for the Lookup+Open RPC
Commits 3ad1e1c1ce and 57014f21e7 added a Lookup+Open
RPC for NFSv4.1/4.2, which can reduce the RPC count by
10-20% for some loads.  This has now received a fair amount
of testing, so I think it is ok to enable it.

Note that the Lookup+Open RPC is only used when the
"oneopenown" mount option is specified.  As such, this
change won't affect most NFSv4.1/4.2 mounts.
2022-05-31 11:59:39 -07:00
Dmitry Chagin
31d1b816fe sysent: Get rid of bogus sys/sysent.h include.
Where appropriate hide sysent.h under proper condition.

MFC after:	2 weeks
2022-05-28 20:52:17 +03:00
Rick Macklem
a7bb120f8b nfscl: Add a diagnostic printf() for a "should never happen" case
When a NFSv4.1/4.2 session to the NFS server (not a pNFS DS) is
replaced, the old session should always be marked defunct by
nfsess_defunct being set non-zero.

However, the hang reported by the PR suggests that this might
be the case.

This patch adds a printf() to indicate this has somehow happened.

PR:	260011
MFC after: 	2 weeks
2022-05-27 14:32:46 -07:00
Rick Macklem
425e5c739b nfscl: Do not handle NFSERR_BADSESSION in operation code
The NFSERR_BADSESSION reply from a NFSv4.1/4.2 server
is handled by newnfs_request().  It should not be handled
separately after newnfs_request() has returned.

These two cases were spotted during code inspection.
One of them should only redo what newnfs_request() already
did by the same "nfscl" thread.  The other might have
resulted in recovery being done twice, but the code is
only used for "pnfs" mounts, so that would be rare.
Also, since NFSERR_BADSESSION should only be replied by
a server after the server reboots, this would be extremely
rare.

MFC after: 	2 weeks
2022-05-27 14:20:31 -07:00
Alan Somers
0bef4927ea fusefs: handle evil servers that return illegal inode numbers
* If during FUSE_CREATE, FUSE_MKDIR, etc the server returns the same
  inode number for the new file as for its parent directory, reject it.
  Previously this would triggers a recurse-on-non-recursive lock panic.

* If during FUSE_LINK the server returns a different inode number for
  the new name as for the old one, reject it.  Obviously, that can't be
  a hard link.

* If during FUSE_LOOKUP the server returns the same inode number for the
  new file as for its parent directory, reject it.  Nothing good can
  come of this.

PR:		263662
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D35128
2022-05-12 14:32:26 -06:00
Alan Somers
dcfa054216 fusefs: fix an undefined variable access
In an error path, a dtrace probe could access an undefined variable.

Reported by:	Coverity (CID 1471986)
MFC after:	2 weeks
Sponsored by:	Axcient
2022-05-05 08:43:51 -06:00
Rick Macklem
ef4edb70c9 nfsd: Add a sanity check for Owner/OwnerGroup string length
Robert Morris reported that, if a client sends an absurdly
large Owner/OwnerGroup string, the kernel malloc() for the
large size string can block forever.

This patch adds a sanity limit for Owner/OwnerGroup string
length.  Since the RFCs do not specify any limit and FreeBSD
can handle a group name greater than 1Kbyte, the limit is
set at a generous 10Kbytes.

Reported by:	rtm@lcs.mit.edu
PR:	260546
MFC after:	2 weeks
2022-05-04 13:58:22 -07:00
Rick Macklem
f32bf50d43 nfsd: Fix handling of Open/Create for the pNFS server
When the MDS of a pNFS service receives an Open/Create
and the file already exists, it must do a Setattr of
size == 0.  Without this patch, this was eroneously
done via a VOP_SETAATR() call, which would set the
length of the MDS file to 0 (which is already is,
since all data lives on the DSs).

This patch fixes the problem by doing a nfsvno_setattr()
instead of VOP_SETATTR(), which knows to do a proxied
Setattr on the DSs.

For a non-pNFS server, the change has no effect, since
nfsvno_setattr() only does a VOP_SETATTR() for that case.

This was found during a recent IETF NFSv4 testing event.

MFC after:	2 weeks
2022-05-04 13:52:33 -07:00
Rick Macklem
70910e4b55 nfscl: Acquire a refcount on "cred" for mirrored pNFS RPCs
When the NFSv4.1/4.2 client is doing a pnfs mount to
mirrored DS(s), asynchronous threads are used to do the
RPCs against the DS(s) concurrently.  If a DS is slow
to reply, it is possible for the "cred" to be free'd
before the asynchronous thread is done with it, causing
a panic/crash.

This patch fixes the problem by acquiring a refcount on
the "cred" while it is being used by the asynchronous thread
for a DS RPC.  This bug was found during a recent IETF
NFSv4 testing event.

This bug only affects "pnfs" mounts to mirrored pNFS
servers.

MFC after:	2 weeks
2022-05-03 07:22:15 -07:00
Rick Macklem
271f6d52a6 nfsd: Fix session slot freeing for NFSv4.1/4.2
Without this patch the NFSv4.1/4.2 server erroneously
always frees session slot zero for callbacks.  This only
affects 4.1/4.2 mounts if the server has delegations
enabled or is a pNFS configuration.  Even for those
cases, the effect is mainly to only use slot 0 for
callbacks, serializing all of them.  There is a slight
chance that callbacks will fail if the client performs
them in a different order than received on the TCP
connection.

If this bug affects your server, you will see console
messages like:
  newnfs_request: Bad session slot

This patch fixes the problem.  Found during a recent
IETF NFSv4 testing event.

PR:	263728
MFC after:	2 weeks
2022-05-02 12:47:43 -07:00
Rick Macklem
47d75c29f5 nfsd: Add a sanity check to SecinfoNoname for file type
Robert Morris reported that, for the case of SecinfoNoname
with the Parent option, providing a non-directory could
cause a crash.

This patch adds a sanity check for v_type == VDIR for
this case, to avoid the crash.

Reported by:	rtm@lcs.mit.edu
PR:	260300
MFC after:	2 weeks
2022-05-01 13:41:31 -07:00
Rick Macklem
5218d82c81 nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct, followed by the Write operation.

This patch modifies the NFSv4 client to use an Appendwrite
RPC, which does a Verify to check the file's size before
doing the Write.  This avoids the need for a Getattr RPC
to preceed this RPC and reduces the RPC count by half for
IO_APPEND writes, so long as the client knows the file's
size.

The nfsd structure was moved from the stack to be malloc()'d,
since the kernel stack limit was being exceeded.

While here, fix the types of a few variables, although
there should not be any semantics change caused by these
type changes.
2022-04-30 13:49:23 -07:00
Alan Somers
2f6362484c fusefs: use the fsname mount option if set
The daemon can specify fsname=XXX in its mount options.  If so, the file
system should report f_mntfromname as XXX during statfs.  This will show
up in the output of commands like mount and df.

Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35090
2022-04-29 11:10:03 -06:00
Alan Somers
45825a12f9 fusefs: fix FUSE_CREATE with file handles and fuse protocol < 7.9
Prior to fuse protocol version 7.9, the fuse_entry_out structure had a
smaller size.  But fuse_vnop_create did not take that into account when
working with servers that use older protocols.  The bug does not matter
for servers which don't use file handles or open flags (the only fields
affected).

PR:		263625
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
2022-04-28 15:13:09 -06:00
Mateusz Guzik
11c5495554 ext2: plug a set-but-not-used var
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-04-19 12:45:57 +00:00
Alan Somers
3a1b3c6a1e fusefs: correctly handle servers that report too much data written
During a FUSE_WRITE, the kernel requests the server to write a certain
amount of data, and the server responds with the amount that it actually
did write.  It is obviously an error for the server to write more than
it was provided, and we always treated it as such, but there were two
problems:

* If the server responded with a huge amount, greater than INT_MAX, it
  would trigger an integer overflow which would cause a panic.

* When extending the file, we wrongly set the file's size before
  validing the amount written.

PR:		263263
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	2 weeks
Sponsored by:	Axcient
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D34955
2022-04-18 18:59:10 -06:00
Rick Macklem
32c3e0f049 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
local to nfs_clrpcops.c.
Future commits will do the same for other functions.
2022-04-15 18:51:01 -07:00
Alan Somers
155ac516c6 fusefs: validate servers' error values
Formerly fusefs would pass up the stack any error value returned by the
fuse server.  However, some values aren't valid for userland, but have
special meanings within the kernel.  One of these, EJUSTRETURN, could
cause a kernel page fault if the server returned it in response to
FUSE_LOOKUP.  Fix by validating all errors returned by the server.

Also, fix a data lifetime bug in the FUSE_DESTROY test.

PR:		263220
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	3 weeks
Sponsored by:	Axcient
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D34931
2022-04-15 13:57:32 -06:00
Rick Macklem
068fc05745 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_nget().
Future commits will do the same for other functions.
2022-04-14 16:15:56 -07:00
John Baldwin
ac9c3c32c6 unionfs: Use __diagused for a variable only used in KASSERT(). 2022-04-13 16:08:20 -07:00
Rick Macklem
4ad3423bc2 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_loadattrcache().
Future commits will do the same for other functions.
2022-04-13 07:43:13 -07:00
Rick Macklem
5580e5bd71 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_request().
Future commits will do the same for other functions.
2022-04-10 14:05:44 -07:00
Rick Macklem
38c3cf6aed nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_postop_attr().
Future commits will do the same for other functions.
2022-04-09 18:53:25 -07:00
Rick Macklem
c45d934f6b nfscl: Ansify a function header 2022-04-09 15:14:05 -07:00
Rick Macklem
21de450aa1 nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct, followed by the Write operation.

This patch modifies nfscl_wcc_data() to optionally
acquire the file's size, for use with an AppendWrite.
Although the "stuff" arguments are always NULL
(these were used for the Mac OSX port and should be
cleared out someday), make the argument to
nfscl_wcc_data() explicitly NULL for clarity.

This patch does not cause any semantics change until
the AppendWrite is added in a future commit.
2022-04-08 13:59:05 -07:00
John Baldwin
771e4a8613 smbfs: Remove unused variable. 2022-04-07 17:01:28 -07:00
John Baldwin
9fe2867ce4 smbfs_rename: Move all references to flags under #ifdef notnow. 2022-04-07 17:01:28 -07:00
Alan Somers
3227325366 fusefs: fix two bugs regarding VOP_RECLAIM of the root inode
* We never send FUSE_LOOKUP for the root inode, since its inode number
  is hard-coded to 1.  Therefore, we should not send FUSE_FORGET for it,
  lest the server see its lookup count fall below 0.

* During VOP_RECLAIM, if we are reclaiming the root inode, we must clear
  the file system's vroot pointer.  Otherwise it will be left pointing
  at a reclaimed vnode, which will cause future VOP_LOOKUP operations to
  fail.  Previously we only cleared that pointer during VFS_UMOUNT.  I
  don't know of any real-world way to trigger this bug.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D34753
2022-04-06 16:16:52 -06:00
Warner Losh
393b7606f9 Unbreak the build: Also define NFSV42_OLDNPROCS here.
If nfsproto.h is included before nfsport.h, then NFSV42_OLDNPROCS would
be undefined when it is used in struct nfsstatsov1.

Sponsored by:		Netflix
2022-04-05 11:54:20 -06:00
Rick Macklem
330aa8acde nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct before doing the Write operation.

This patch prepares the NFSv4 client for such an
RPC, which will be added in a future commit.

This patch does not cause any semantics change.
2022-04-05 08:11:37 -07:00
Gordon Bergling
ef1534cad8 fusefs(5): Fix a typo in a source code comment
- s/accomodate/accommodate/

MFC after:	3 days
2022-04-02 14:56:21 +02:00
Rick Macklem
c1970a7eba nfscl: Fix IO_APPEND writes from kernel space
Commit 867c27c23a modified the NFS client so that
it did IO_APPEND writes directly to the NFS server
bypassing the buffer cache, via a call to
nfs_directio_write().  Unfortunately, this (very old)
function assumed that the uio iov was for user space
addresses.  As such, a IO_APPEND VOP_WRITE() that
was for system space, such as ktrace(1) does, would
write bogus data.

This patch fixes nfs_directio_write() so that it
handles kernel space uio iovs.

Reported by:	bz
Tested by:	bz
MFC after:	2 weeks
2022-03-28 15:11:52 -07:00
Gordon Bergling
c1ad8a39a1 nfsclient: Fix a typos in source code comments
- s/ony/only/

Obtained from:	NetBSD
MFC after:	3 days
2022-03-27 19:27:05 +02:00
Mateusz Guzik
bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) 2022-03-24 10:20:51 +00:00
Mateusz Guzik
aeabf8d4b9 nullfs: hash insertion without vnode lock upgrade
Use the hash lock to serialize instead.

This enables shared-locked ".." lookups.

Reviewed by:	markj
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D34466
2022-03-19 10:47:10 +00:00
Mark Johnston
c0b98fe16f fusefs: Initialize a pad word in the mknod message
Reported by:	Jenkins (KMSAN job)
Reviewed by:	asomers
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34593
2022-03-17 18:30:21 -04:00
Rick Macklem
f37dc50d9f nfscl: Do not do a Lookup+Open for pNFS mounts
A NFSv4.1/4.2 pNFS mount needs to do a
separate Open+LayoutGet RPC, so do not do
a Lookup+Open RPC for these mounts.

The Lookup+Open RPCs are still disabled,
until further testing is done, so this patch
has no effect at this time.
2022-03-17 07:48:06 -07:00
Rick Macklem
57014f21e7 nfscl: Fix NFSv4.1/4.2 Lookup+Open RPC
Use of the Lookup+Open RPC is currently disabled,
due to a problem detected during testing.  This
patch fixes this problem.  The problem was that
nfscl_postop_attr() does not parse the attributes
if nd_repstat != 0.  It also would parse the
return status for the operation, where the
Lookup+Open code had already parsed it.

The first change in the patch does not make any
semantics change, but makes the code identical
to what is done later in the function, so that
it is apparent that the semantics should be the
same in both places.

Lookup+Open remains disabled while further
testing is being done, so this patch has no
effect at this time.
2022-03-13 13:15:12 -07:00
Mateusz Guzik
0134bbe56f vfs: prefix lookup and relookup with vfs_
Reviewed by:	imp, mckusick
Differential Revision:		https://reviews.freebsd.org/D34530
2022-03-13 14:44:39 +00:00
Rick Macklem
3fc3fe9091 nfsd: Do not exempt NFSv3 Fsinfo from the TLS check
The Fsinfo RPC is exempt from the check for
Kerberized NFS being required, as recommended
by RFC2623.  However, there is no reason to
exempt Fsinfo from the requirement to use TLS.

This patch fixes the code so that the exemption
only applies to Kerberized NFS and not
NFS-over-TLS.

This only affects NFS-over-TLS for an NFSv3
mount when it is required, but the client does
not do so.

MFC after:	1 month
2022-03-09 16:52:42 -08:00
Rick Macklem
1cedb4ea1a nfscl: Fix a use after free in nfscl_cleanupkext()
ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
  owp->nfsow_owner.  If we free that particular
  owner structure, than in subsequent comparisons
  "own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
  of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
  without the NFSCLSTATE mutex held, this could race with
  nfscl_cleanupkext().
  This could happen when the exclusive lock is held
  on the client, such as when delegations are being returned
  or when recovering from NFSERR_EXPIRED.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
    nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() so that it will never free more
    than the first matching element.  Normally there should only
    be one element in each list with a matching open/lock owner
    anyhow (but there might be a bug that results in a duplicate).
    This should guarantee that the FOREACH_SAFE loops in
    nfscl_cleanupkext() are adequate.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
    and nfscl_freelockowner(), if it is not already held.
    This serializes all of these calls with the ones done in
    nfscl_cleanup_common().

Reported by:	ler
Reviewed by:	markj
Tested by:	cy
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34334
2022-02-25 07:27:03 -08:00
Alan Somers
e8553be9bc fusefs: fix a cached attributes bug during directory rename
When renaming a directory into a different parent directory, invalidate
the cached attributes of the new parent.  Otherwise, stat will show the
wrong st_nlink value.

MFC after:	1 week
Reviewed by:	ngie
Differential Revision: https://reviews.freebsd.org/D34336
2022-02-24 14:07:25 -07:00
Rick Macklem
06148d2251 Revert "nfscl: Fix a use after free in nfscl_cleanupkext()"
This reverts commit dd08b84e35.

cy@ reported a problem caused by this patch.  He will be
testing an alternate patch, but I'm reverting this one.
2022-02-24 07:01:03 -08:00
Jason A. Harmening
fcb164742b unionfs: rework unionfs_getwritemount()
VOP_GETWRITEMOUNT() is called on the vn_start_write() path without any
vnode locks guaranteed to be held.  It's therefore unsafe to blindly
access per-mount and per-vnode data.  Instead, follow the approach taken
by nullfs and use the vnode interlock coupled with the hold count to
ensure the mount and the vnode won't be recycled while they are being
accessed.

Reviewed by:	kib (earlier version), markj, pho
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D34282
2022-02-23 22:10:02 -06:00
Rick Macklem
dd08b84e35 nfscl: Fix a use after free in nfscl_cleanupkext()
ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
  owp->nfsow_owner.  If we free that particular
  owner structure, than in subsequent comparisons
  "own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
  of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
  without the NFSCLSTATE mutex held, this could race with
  nfscl_cleanupkext().
  This could happen when the exclusive lock is held
  on the client, such as when delegations are being returned.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
    nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() to return whether or not a
    free was done.
    When a free was done, do a goto to restart the loop, instead
    of using FOREACH_SAFE, which was not safe in this case.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
    and nfscl_freelockowner(), if it not already held.
    This serializes all of these calls with the ones done in
    nfscl_cleanup_common().

Reported by:	ler
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34334
2022-02-22 14:21:43 -08:00
Mark Johnston
c7cd607a4e msdosfs: Fix mounting when the device sector size is >512B
HugeSectors * BytesPerSec should be computed before converting
HugeSectors to a DEV_BSIZE-based count.

Fixes:	ba2c98389b ("msdosfs: sanity check sector count from BPB")
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34264
2022-02-14 10:06:47 -05:00
Jason A. Harmening
974efbb3d5 unionfs: fix typo in comment
I deleted the wrong word when writing up a comment in a prior change;
the covered vnode may be recursed during any unmount, not just forced
unmount.
2022-02-10 15:17:43 -06:00
Rick Macklem
17a56f3fab nfsd: Reply NFSERR_SEQMISORDERED for bogus seqid argument
The ESXi NFSv4.1 client bogusly sends the wrong value
for the csa_sequence argument for a Create_session operation.
RFC8881 requires this value to be the same as the sequence
reply from the ExchangeID operation most recently done for
the client ID.

Without this patch, the server replies NFSERR_STALECLIENTID,
which is the correct response for an NFSv4.0 SetClientIDConfirm
but is not the correct error for NFSv4.1/4.2, which is
specified as NFSERR_SEQMISORDERED in RFC8881.
This patch fixes this.

This change does not fix the issue reported in the PR, where
the ESXi client loops, attempting ExchangeID/Create_session
repeatedly.

Reported by:	asomers
Tested by:	asomers
PR:	261291
MFC after:	1 week
2022-02-09 15:17:50 -08:00
Gordon Bergling
8ea3ceda76 fs: fix a few common typos in source code comments
- s/quadradically/quadratically/
- s/persistant/persistent/

Obtained from:	NetBSD
MFC after:	3 days
2022-02-06 13:48:31 +01:00
Alan Somers
18ed2ce77a fusefs: fix the build without INVARIANTS after 00134a0789
MFC after:	2 weeks
MFC with:	00134a0789
Reported by:	se
2022-02-04 18:44:27 -07:00
Alan Somers
00134a0789 fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting
FUSE file systems that do not set FUSE_NO_OPENDIR_SUPPORT do not
guarantee that d_off will be valid after closing and reopening a
directory.  That conflicts with NFS's statelessness, that results in
unresolvable bugs when NFS reads large directories, if:

* The file system _does_ change the d_off field for the last directory
  entry previously returned by VOP_READDIR, or
* The file system deletes the last directory entry previously seen by
  NFS.

Rather than doing a poor job of exporting such file systems, it's better
just to refuse.

Even though this is technically a breaking change, 13.0-RELEASE's
NFS-FUSE support was bad enough that an MFC should be allowed.

MFC after:	3 weeks.
Reviewed by:	rmacklem
Differential Revision: https://reviews.freebsd.org/D33726
2022-02-04 16:31:05 -07:00
Alan Somers
4a6526d84a fusefs: optimize NFS readdir for FUSE_NO_OPENDIR_SUPPORT
In its lowest common denominator, FUSE does not require that a directory
entry's d_off field is valid outside of the lifetime of the directory's
FUSE file handle.  But since NFS is stateless, it must reopen the
directory on every call to VOP_READDIR.  That means reading the
directory all the way from the first entry.  Not only does this create
an O(n^2) condition for large directories, but it can also result in
incorrect behavior if either:

* The file system _does_ change the d_off field for the last directory
  entry previously seen by NFS, or
* The file system deletes the last directory entry previously seen by
  NFS.

Handily, for file systems that set FUSE_NO_OPENDIR_SUPPORT d_off is
guaranteed to be valid for the lifetime of the directory entry, there is
no need to read the directory from the start.

MFC after:	3 weeks
Reviewed by:	rmacklem
2022-02-04 16:30:58 -07:00
Alan Somers
d088dc76e1 Fix NFS exports of FUSE file systems for big directories
The FUSE protocol does not require that a directory entry's d_off field
outlive the lifetime of its directory's file handle.  Since the NFS
server must reopen the directory on every VOP_READDIR call, that means
it can't pass uio->uio_offset down to the FUSE server.  Instead, it must
read the directory from 0 each time.  It may need to issue multiple
FUSE_READDIR operations until it finds the d_off field that it's looking
for.  That was the intention behind SVN r348209 and r297887, but a logic
bug prevented subsequent FUSE_READDIR operations from ever being issued,
rendering large directories incompletely browseable.

MFC after:	3 weeks
Reviewed by:	rmacklem
2022-02-04 16:30:49 -07:00
Jason A. Harmening
83d61d5b73 unionfs: do not force LK_NOWAIT if VI_OWEINACT is set
I see no apparent need to avoid waiting on the lock just because
vinactive() may be called on another thread while the thread that
cleared the vnode refcount has the lock dropped.  In fact, this
can at least lead to a panic of the form "vn_lock: error <errno>
incompatible with flags" if LK_RETRY was passed to VOP_LOCK().
In this case LK_NOWAIT may cause the underlying FS to return an
error which is incompatible with LK_RETRY.

Reported by:	pho
Reviewed by:	kib, markj, pho
Differential Revision:	https://reviews.freebsd.org/D34109
2022-02-02 21:08:17 -06:00
Jason A. Harmening
6ff167aa42 unionfs: allow lock recursion when reclaiming the root vnode
The unionfs root vnode will always share a lock with its lower vnode.
If unionfs was mounted with the 'below' option, this will also be the
vnode covered by the unionfs mount.  During unmount, the covered vnode
will be locked by dounmount() while the unionfs root vnode will be
locked by vgone().  This effectively requires recursion on the same
underlying like, albeit through two different vnodes.

Reported by:	pho
Reviewed by:	kib, markj, pho
Differential Revision:	https://reviews.freebsd.org/D34109
2022-02-02 21:08:17 -06:00
Jason A. Harmening
0cd8f3e958 unionfs: fix assertion order in unionfs_lock()
VOP_LOCK() may be handed a vnode that is concurrently reclaimed.
unionfs_lock() accounts for this by checking for empty vnode private
data under the interlock.  But it incorrectly asserts that the vnode
is using the unionfs dispatch table before making this check.
Reverse the order, and also update KASSERT_UNIONFS_VNODE() to provide
more useful information.

Reported by:	pho
Reviewed by:	kib, markj, pho
Differential Revision:	https://reviews.freebsd.org/D34109
2022-02-02 21:08:17 -06:00
Rick Macklem
e2fe58d61b nfsd: Allow file owners to perform Open(Delegate_cur)
Commit b0b7d978b6 changed the NFSv4 server's default
behaviour to check the file's mode or ACL for permission to
open the file, to be Linux and Solaris compatible.
However, it turns out that Linux makes an exception for
the case of Claim_delegate_cur(_fh).

When a NFSv4 client is returning a delegation, it must
acquire Opens against the server to replace the ones
done locally in the client.  The client does this via
an Open operation with Claim_delegate_cur(_fh).  If
this operation fails, due to a change to the file's
mode or ACL after the delegation was issued, the
client does not have any way to retain the open.

As such, the Linux client allows the file's owner
to perform an Open with Claim_delegate_cur(_fh)
no matter what the mode or ACL allows.

This patch makes the FreeBSD server allow this case,
to be Linux compatible.

This patch only affects the case where delegations
are enabled, which is not the default.

MFC after:	2 weeks
2022-02-02 14:10:16 -08:00
Konstantin Belousov
303d3ae7e8 ufs, msdosfs: do not record witness order when creating vnode
When allocating new vnode, we need to lock it exclusively before
making it externally visible.  Since other threads cannot observe the
vnode yet, current lock order cannot create LoR conditions.

Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34126
2022-02-01 10:51:55 +02:00
Konstantin Belousov
d51b0786a2 msdosfs_denode.c: some style
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D34126
2022-02-01 10:51:48 +02:00
Konstantin Belousov
66c5fbca77 insmntque1(): remove useless arguments
Also remove once-used functions to clean up after failed insmntque1(),
which were destructor callbacks in previous life.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D34071
2022-01-31 16:49:08 +02:00
Konstantin Belousov
9cd59de2e1 ext2fs: remove remnants of the UFS snapshot code
Noted and reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34095
2022-01-31 04:37:16 +02:00
Jason A. Harmening
a01ca46b9b unionfs: use VV_ROOT to check for root vnode in unionfs_lock()
This avoids a potentially wild reference to the mount object.
Additionally, simplify some of the checks around VV_ROOT in
unionfs_nodeget().

Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D33914
2022-01-29 22:38:44 -06:00
Rick Macklem
98c788737f nfsclient: Delete unused function nfscl_getcookie()
The function nfscl_getcookie(), which is essentially the
same as ncl_getcookie(), is never called, so delete it.
This is probably cruft left over from the port of the
NFSv4 code to FreeBSD several years ago.

Found while modifying the code to better use the
directory offset cookies.

MFC after:	2 weeks
2022-01-27 15:30:26 -08:00
Mateusz Guzik
2a7e4cf843 Revert b58ca5df0b ("vfs: remove the now unused insmntque1")
I was somehow convinced that insmntque calls insmntque1 with a NULL
destructor. Unfortunately this worked well enough to not immediately
blow up in simple testing.

Keep not using the destructor in previously patched filesystems though
as it avoids unnecessary casts.

Noted by:	kib
Reported by:	pho
2022-01-27 16:32:22 +00:00
Mateusz Guzik
d35991d327 nullfs: ansify fs/nullfs/null_subr.c 2022-01-27 01:01:45 +01:00
Mateusz Guzik
3150cf0c13 unionfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:57:37 +01:00
Mateusz Guzik
5ccdfdabc8 tmpfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:56:12 +01:00
Mateusz Guzik
4e91a0b9fe nullfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:47 +01:00
Mateusz Guzik
ade1367ba8 fdescfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:38 +01:00
Mateusz Guzik
3af3e99ce4 devfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:30 +01:00
Mark Johnston
3d8562348c fusefs: Address -Wunused-but-set-variable warnings
Reviewed by:	asomers
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33957
2022-01-20 08:25:00 -05:00
Alan Somers
89d57b94d7 fusefs: implement VOP_DEALLOCATE
MFC after:	Never
Reviewed by:	khng
Differential Revision: https://reviews.freebsd.org/D33800
2022-01-18 21:13:02 -07:00
Jason A. Harmening
39a2dc44f8 unionfs: allow vnode lock to be held shared during VOP_OPEN
do_execve() will hold the vnode lock shared when it calls VOP_OPEN(),
but unionfs_open() requires the lock to be held exclusive to
correctly synchronize node status updates.  This requirement is
asserted in unionfs_get_node_status().

Change unionfs_open() to temporarily upgrade the lock as is already
done in unionfs_close().  Related to this, fix various cases throughout
unionfs in which vnodes are not checked for reclamation following lock
upgrades that may have temporarily dropped the lock.  Also fix another
related issue in which unionfs_lock() can incorrectly add LK_NOWAIT
during a downgrade operation, which trips a lockmgr assertion.

Reviewed by:	kib (prior version), markj, pho
Reported by:	pho
Differential Revision: https://reviews.freebsd.org/D33729
2022-01-11 18:44:03 -08:00
Rick Macklem
a91a57846b nfsd: Do not accept audit/alarm ACEs for the NFSv4 server
The UFS and ZFS file systems only support Allow/Deny ACEs
in the NFSv4 ACLs.  This patch does not allow the server
to parse Audit/Alarm ACEs.  The NFSv4 client is still
allowed to pase Audit/Alarm ACEs, since non-FreeBSD NFSv4
servers may use them.

This patch should not have a significant effect, since the
UFS and ZFS file systems will not handle these ACEs anyhow.
It simply serves as an additional "safety belt" for the
NFSv4 server.

MFC after:	2 weeks
2022-01-11 09:40:07 -08:00
Rick Macklem
5da9b3b011 Revert "nfscommon: Add arguments for support of the dacl attribute"
This reverts commit 0fa074b53e.

I now see that the implementation of the "dacl" operation
requires that the NFSv4 server to "automatic inheritance"
and I do not plan on doing this.  As such, this patch is
harmless, but unneeded.
2022-01-11 08:30:50 -08:00
Rick Macklem
b1f80dfac9 Revert "nfscommon: Return NFSERR_ATTRNOTSUPP for AUDIT/ALARM ACEs"
This reverts commit f10dc28ec2.

The client should still be able to getfacl
audit and alarm ACEs, for non-FreeBSD NFSv4 servers.

A patch that only disables audit/alarm for the server
side will be committed to replace this patch.
2022-01-11 08:26:42 -08:00
Alexander Motin
3455c738ac nfsd: Reduce callouts rate.
Before this callouts were scheduled twice a seconds even if nfsd was
never used.  This reduces the rate to ~1Hz and only after nfsd first
started.

MFC after:	2 weeks
2022-01-09 13:14:23 -05:00
Konstantin Belousov
aaaa4fb54e msdosfs: use mntfs vnode for pm_devvp
to prevent races with devfs VCHR vnode reclamation, same as it was
done for UFS.

Reported by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 06:21:58 +02:00
Konstantin Belousov
41e85eeab9 msdosfs: on integrity error, fire a task to remount filesystem to ro
In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 06:20:48 +02:00
Konstantin Belousov
b2e4b63584 msdosfs: add msdosfs_integrity_error()
A function to remount the filesystem from rw to ro on integrity error.
The work is performed in taskqueue to allow the call to be done from
almost arbitrary context where erronous state was detected.

Tested by:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 06:20:48 +02:00
Konstantin Belousov
ba2c98389b msdosfs: sanity check sector count from BPB
We use sector count to size the FAT inuse bitset.  If sector count is
corrupted, kernel might be tricked into doing unbound allocation.
Ensure that the sector count does not exceed the actual volume size.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov
65990b68a2 msdosfs: clusterfree() is used only in error handling cases
Change its return type to void, because its result is ignored in both
call sites.  Remove oldcnp argument as well, it is NULL always.

Suggested and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov
aec97963cd msdosfs: do no allow lookup to return vdp except for dot lookups
In collaboaration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov
1319c433f4 msdosfs: handle a case when non-dot lookup returned dvp
This means that filesystem is corrupted, there is a loop.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov
2c9a1c22c3 msdosfs: take inusemap inconsistency as an error, not invariants violation
In other words, stop silently accepting freeing free cluster in
non-debug kernels, but return the error to the caller.  Modify callers
to handle errors from usemap_free().

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov
595ed4d767 msdosfs: handle inconsistently hashed denodes
It is possible, on the corrupted msdosfs volume, to have file which
denode inode number does not match the one calculated using directory
cluster.  Instead of asserting the condition as impossible, handle it
and return error, after reclaiming the aliased vnode.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Rick Macklem
e4df1036f6 nfscl: Always invalidate buffers for append writes
kib@ reported a problem which was resolved by
reverting commit 867c27c23a, which changed the NFS
client to use direct RPCs to the server for
IO_APPEND writes.  He also spotted that the
code only invalidated buffer cache buffers
when they were marked NMODIFIED (had been
written into).

This patch modifies the NFS VOP_WRITE() to
always invalidate the buffer cache buffers
and pages for the file when IO_APPEND is
specified.  It also includes some cleanup
suggested by kib@.

Reported by:	kib
Tested by:	kib
Reviewed by:	kib
MFC after:	10 weeks
2022-01-06 14:18:36 -08:00
Jason A. Harmening
9e891d43f5 unionfs: implement VOP_SET_TEXT/VOP_UNSET_TEXT
The implementation simply passes the text ref to the appropriate
underlying vnode.  Without this, the default [un]set_text
implementation will only manage the text ref on the unionfs vnode,
causing it to be out of sync with the underlying filesystems and
potentially allowing corruption of executable file contents.
On INVARIANTS kernels, it also readily produces a panic on process
termination because the VM object representing the executable mapping
is backed by the underlying vnode, not the unionfs vnode.

PR:	251342
Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D33611
2022-01-02 19:52:58 -08:00
Jason A. Harmening
d877dd5767 unionfs: simplify writecount management
Use atomics to track the writecount granted to the underlying FS,
and avoid holding the vnode interlock while calling the underling FS'
VOP_ADD_WRITECOUNT().  This also fixes a WITNESS warning about nesting
the same lock type.  Also add comments explaining why we need to track
the writecount on the unionfs vnode in the first place.  Finally,
simplify writecount management to only use the upper vnode and assert
that we shouldn't have an active writecount on the lower vnode through
unionfs.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33611
2022-01-02 19:52:58 -08:00
Konstantin Belousov
04fd468da0 mountmsdosfs(): some style
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-02 22:25:07 +02:00
Alan Somers
398c88c758 fusefs: implement VOP_ALLOCATE
Now posix_fallocate will be correctly forwarded to fuse file system
servers, for those that support it.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33389
2021-12-31 21:05:28 -07:00
Alan Somers
1613087a81 fusefs: fix .. lookups when the parent has been reclaimed.
By default, FUSE file systems are assumed not to support lookups for "."
and "..".  They must opt-in to that.  To cope with this limitation, the
fusefs kernel module caches every fuse vnode's parent's inode number,
and uses that during VOP_LOOKUP for "..".  But if the parent's vnode has
been reclaimed that won't be possible.  Previously we paniced in this
situation.  Now, we'll return ESTALE instead.  Or, if the file system
has opted into ".." lookups, we'll just do that instead.

This commit also fixes VOP_LOOKUP to respect the cache timeout for ".."
lookups, if the FUSE file system specified a finite timeout.

PR:		259974
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33239
2021-12-31 20:38:27 -07:00
Alan Somers
5169832c96 fusefs: copy_file_range must update file timestamps
If FUSE_COPY_FILE_RANGE returns successfully, update the atime of the
source and the mtime and ctime of the destination.

MFC after:	2 weeks
Reviewers:	pfg
Differential Revision: https://reviews.freebsd.org/D33159
2021-12-31 17:43:57 -07:00
Alan Somers
13d593a5b0 Fix a race in fusefs that can corrupt a file's size.
VOPs like VOP_SETATTR can change a file's size, with the vnode
exclusively locked.  But VOPs like VOP_LOOKUP look up the file size from
the server without the vnode locked.  So a race is possible.  For
example:

1) One thread calls VOP_SETATTR to truncate a file.  It locks the vnode
   and sends FUSE_SETATTR to the server.
2) A second thread calls VOP_LOOKUP and fetches the file's attributes from
   the server.  Then it blocks trying to acquire the vnode lock.
3) FUSE_SETATTR returns and the first thread releases the vnode lock.
4) The second thread acquires the vnode lock and caches the file's
   attributes, which are now out-of-date.

Fix this race by recording a timestamp in the vnode of the last time
that its filesize was modified.  Check that timestamp during VOP_LOOKUP
and VFS_VGET.  If it's newer than the time at which FUSE_LOOKUP was
issued to the server, ignore the attributes returned by FUSE_LOOKUP.

PR:		259071
Reported by:	Agata <chogata@moosefs.pro>
Reviewed by:	pfg
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33158
2021-12-31 17:38:42 -07:00
Fedor Uporov
f1d5e2c862 Improve extents verification logic
Add functionality for extents validation inside the filesystem
extents block. The main logic is implemented under
ext4_validate_extent_entries() function, which verifies extents
or extents indexes depending of extent depth value.

PR:                     259112
Reported by:            Robert Morris
Reviewed by:            pfg
MFC after:              2 weeks
Differential Revision:  https://reviews.freebsd.org/D33375
2021-12-30 09:14:45 +03:00
Fedor Uporov
ced2172822 Add more accurate check for root inode
Check that root inode has links and is directory.

PR:             259105
Reported by:    Robert Morris
MFC after:      2 weeks
2021-12-30 09:14:45 +03:00
Fedor Uporov
bb9f1ba4b5 Add more accurate directory entries check
Rename ext2_dirbadentry() to ext2_check_direntry(). Add directory
entry inode value check, and call ext2_check_direntry() in all cases.
The dirchk sysctl is removed.

PR:                     259024,259041
Reported by:            Robert Morris
Reviewed by:            pfg
MFC after:              2 weeks
Differential Revision:  https://reviews.freebsd.org/D33374
2021-12-30 09:14:44 +03:00
Fedor Uporov
5034b44574 Remove unnecessary e2fs_first_dblock value check
MFC after:      2 weeks
2021-12-30 09:14:44 +03:00
Rick Macklem
f10dc28ec2 nfscommon: Return NFSERR_ATTRNOTSUPP for AUDIT/ALARM ACEs
FreeBSD only supports Allow/Deny ACEs in NFSv4 ACLs.
As such, it does not make sense to parse Audit/Alarm
ACEs.  Modify nfsrv_dissectace() so that it returns
NFSERR_ATTRNOTSUPP if an Audit/Alarm ACE is found in
the ACL being parsed.  The code has been #ifdef notnow'd,
since Audit/Alarm ACEs might be supported someday.

This should not have significant impact, since FreeBSD
reports to clients that only Allow/Deny ACEs are
supported and an attempt to set one would have failed
anyhow.

MFC after:	2 weeks
2021-12-27 08:03:41 -08:00
Rick Macklem
0fa074b53e nfscommon: Add arguments for support of the dacl attribute
NFSv4.1/4.2 has an alternative to the acl attribute, called
dacl, that includes support for the ACL_ENTRY_INHERITED flag,
called NFSV4ACE_INHERITED in NFSv4.

This patch adds a dacl argument to nfsrv_buildacl(),
nfsrv_dissectacl() and nfsrv_dissectace(), so that they
will handle NFSV4ACE_INHERITED when dacl == true.

Since these functions are always called with dacl == false
for this patch, semantics should not have changed.
A future patch will add support for dacl.

MFC after:	2 weeks
2021-12-26 16:43:46 -08:00
Rick Macklem
744c2dc7dd rpc: Delete AUTH_NEEDS_TLS(_MUTUAL_HOST) auth_stat values
I thought that these new auth_stat values had been agreed
upon by the IETF NFSv4 working group, but that no longer
is the case.  As such, delete them and use AUTH_TOOWEAK
instead.  Leave the code that uses these new auth_stat
values in the sources #ifdef notnow, in case they are
defined in the future.

MFC after:	1 week
2021-12-23 14:31:53 -08:00
Rick Macklem
b70042adfe nfscl: Check for mmap(2)'d file before doing direct output
Commit 867c27c23a modified the NFS client so that
it does IO_APPEND writes directly to the NFS server,
bypassing the buffer cache.  However, this could result
in stale data in client pages when the file is mmap(2)'d.
As such, the NFS client needs to call vm_object_is_active()
to check if the file is mmap(2)'d and only do direct
output if the file is not mmap(2)'d.

This patch adds this check.

Although a simple patch, I have given it a long MFC,
since the related commit 867c27c23a made a significant
semantics change and, as such, has a long MFC.

MFC after:	3 months
2021-12-20 13:10:26 -08:00
Rick Macklem
150da1e3cd nfscl: Partially revert commit 867c27c23a
Commit 867c27c23a enabled the n_directio_opens code
in open/close, which sets/clears NNONCACHE, for
IO_APPEND. This code should not be enabled unless
newnfs_directio_enable is non-zero.

This patch reverts that part of commit 867c27c23a.

A future patch that fixes the case where the
file that is being written IO_APPEND is mmap()'d.

MFC after:	3 months
2021-12-16 14:30:37 -08:00
Alan Somers
b214fcceac Change VOP_READDIR's cookies argument to a **uint64_t
The cookies argument is only used by the NFS server.  NFSv2 defines the
cookie as 32 bits on the wire, but NFSv3 increased it to 64 bits.  Our
VOP_READDIR, however, has always defined it as u_long, which is 32 bits
on some architectures.  Change it to 64 bits on all architectures.  This
doesn't matter for any in-tree file systems, but it matters for some
FUSE file systems that use 64-bit directory cookies.

PR:             260375
Reviewed by:    rmacklem
Differential Revision: https://reviews.freebsd.org/D33404
2021-12-15 20:54:57 -07:00
Alan Somers
32fbc5d824 nfs: don't truncate directory cookies to 32-bits in the NFS server
In NFSv2, the directory cookie was 32-bits.  NFSv3 widened it to
64-bits and SVN r22521 widened the corresponding argument in
VOP_READDIR, but FreeBSD's NFS server continued to treat the cookies as
32-bits, and 0-extended to fill the field on the wire.  Nobody ever
noticed, because every in-tree file system generates cookies that fit
comfortably within 32-bits.

Also, have better type safety for txdr_hyper.  Turn it into an inline
function that type-checks its arguments.  Prevents warnings about
shift-count-overflow.

PR:		260375
MFC after:	2 weeks
Reviewed by:	rmacklem
Differential Revision: https://reviews.freebsd.org/D33404
2021-12-15 20:54:57 -07:00
Rick Macklem
e0861304a7 nfscl: Handle CB_SEQUENCE not first op correctly
The check for "not first operation" in CB_SEQUENCE
was done after the slot, etc. was updated. This patch
moves the check to the beginning of CB_SEQUENCE
processing.

While here, also fix the check for "no CB_SEQUENCE operation first"
by moving the check to the beginning of callback operation parsing,
since the check was in a couple of the other operations, but
not all of them.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260412
MFC after:	2 weeks
2021-12-15 16:36:40 -08:00
Rick Macklem
867c27c23a nfscl: Change IO_APPEND writes to direct I/O
IO_APPEND writes have always been very slow over NFS, due to
the need to acquire an up to date file size after flushing
all writes to the NFS server.

This patch switches the IO_APPEND writes to use direct I/O,
bypassing the buffer cache.  As such, flushing of writes
normally only occurs when the open(..O_APPEND..) is done.
It does imply that all writes must be done synchronously
and must be committed to stable storage on the file server
(NFSWRITE_FILESYNC).

For a simple test program that does 10,000 IO_APPEND writes
in a loop, performance improved significantly with this patch.

For a UFS exported file system, the test ran 12x faster.
This drops to 3x faster when the open(2)/close(2) are done
for each loop iteration.
For a ZFS exported file system, the test ran 40% faster.

The much smaller improvement may have been because the ZFS
file system I tested against does not have a ZIL log and
does have "sync" enabled.

Note that IO_APPEND write performance is still much slower
than when done on local file systems.

Although this is a simple patch, it does result in a
significant semantics change, so I have given it a
large MFC time.

Tested by:	otis
MFC after:	3 months
2021-12-15 08:35:48 -08:00
Rick Macklem
fe04c91184 nfscl: add a filesize limit check to nfs_allocate()
As reported in PR#260343, nfs_allocate() did not check
the filesize rlimit. This patch adds that check.

PR:	260343
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33422
2021-12-13 15:32:19 -08:00
Rick Macklem
c302f889e2 nfsd: Limit parsing of layout errors to maxcnt bytes
This patch decrements maxcnt by the appropriate
number of bytes during parsing and checks to see
if there is data remaining.  If not, it just returns
from nfsrv_flexlayouterr() without further processing.
This prevents the tl pointer from running off the end
of the error data pointed at by layp, if there are
flaws in the data.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260293
MFC after:	2 weeks
2021-12-13 15:21:31 -08:00
Rick Macklem
24947b701d nfscl: Fix must_commit handling for mirrored pNFS mounts
For pNFS mounts to mirrored Flexible File layout pNFS servers,
the "must_commit" component in the nfsclwritedsdorpc
structure must be checked and the "must_commit" argument passed
into nfscl_doiods() must be updated.  Technically, only writes to
the DS with a writeverf change must be redone, but since this
occurrence will be rare, the must_commit argument to nfscl_doiosd()
is set to 1, so all writes to all DSs will be redone.

This bug would affect few, since use of mirrored pNFS servers
is rare and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots.

MFC after:	2 weeks
2021-12-12 15:40:30 -08:00
Rick Macklem
ead50c94cb nfscl: Fix must_commit/writeverf handling for Direct I/O
Without this patch, the KASSERT(must_commit == 0,..) can be
triggered by the writeverf in the Direct I/O write reply changing.
This is not a situation that should cause a panic(). Correct
handling is to ignore the change in "writeverf" for Direct
I/O, since it is done with NFSWRITE_FILESYNC.

This patch modifies the semantics of the "must_commit"
argument slightly, allowing an initial value of 2 to indicate
that a change in "writeverf" should be ignored.
It also fixes the KASSERT()s.

This bug would affect few, since Direct I/O is not enabled
by default and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots, however I found the
bug when testing against a Linux 5.15.1 kernel nfsd, which
replied to a NFSWRITE_FILESYNC write with a "writeverf" of all
0x0 bytes.

MFC after:	2 weeks
2021-12-11 15:00:30 -08:00
Rick Macklem
ab639f2398 nfscl: Check for an error return from nfsrv_getattrbits()
There were two places where the client code did not check
for a parse error return from nfsrv_getattrbits().

This patch fixes both of these cases.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260272
MFC after:	2 weeks
2021-12-09 14:32:22 -08:00
Rick Macklem
d9931c2561 nfscl: Sanity check the callback tag length
The sanity check for tag length in a callback request
was broken in two ways:

It checked for a negative value, but not a large positive
value.

It did not set taglen to -1, to indicate to the code that
it should not be used.

This patch fixes both of these issues.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260266
MFC after:	2 weeks
2021-12-09 14:15:48 -08:00
Dmitry Chagin
0f74021fb6 pseudofs: Destroy vncache hashtbl on pseudofs module unload.
Reviewed by:		mjg, kib
Differential Revision:	https://reviews.freebsd.org/D31605
MFC after:		2 weeks
2021-12-09 21:41:08 +00:00
Bjoern A. Zeeb
df38ada293 modules: increase MAXMODNAME and provide backward compat
With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.

MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first.  With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].

Reported by:	Greg V (greg unrelenting.technology)
Sponsored by:	The FreeBSD Foundation
Suggested by:	imp [1]
MFC after:	10 days
Reviewed by:	imp (and others to different level)
Differential Revision:	https://reviews.freebsd.org/D32383
2021-12-09 18:09:53 +00:00
Jason A. Harmening
cfc2cfeca1 unionfs: implement VOP_VPUT_PAIR
unionfs must pass VOP_VPUT_PAIR directly to the underlying FS so that
it can have a chance to manage any special locking considerations that
may be necessary.  The unionfs implementation is based heavily on the
corresponding nullfs implementation.

Also note some outstanding issues with the unionfs locking scheme, as
a first step in fixing those issues in a future change.

Discussed with:	kib
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D33008
2021-12-07 16:20:02 -08:00
Jason A. Harmening
6d8420d444 Remove unnecessary thread argument from unionfs_nodeget() and _noderem()
Also remove a couple of write-only variables found by the recent clang
update.  No functional change intended.

Discussed with:	kib
Differential Revision:	https://reviews.freebsd.org/D33008
2021-12-07 16:20:02 -08:00
Alan Somers
41ae9f9e64 fusefs: invalidate the cache during copy_file_range
FUSE_COPY_FILE_RANGE instructs the server to write data to a file.
fusefs must invalidate any cached data within the written range.

PR:		260242
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33280
2021-12-06 21:41:50 -07:00
Alan Somers
dc433e1530 fusefs: inline fuse_io_dispatch
This function was always confusing, because it created an H-shaped
callgraph: two functions called in and left via different paths based on
which which called.

MFC after: 2 weeks
2021-12-06 21:41:50 -07:00
Alan Somers
25927e068f fusefs: correctly handle an inode that changes file types
Correctly handle the situation where a FUSE server unlinks a file, then
creates a new file of a different type but with the same inode number.
Previously fuse_vnop_lookup in this situation would return EAGAIN.  But
since it didn't call vgone(), the vnode couldn't be reused right away.
Fix this by immediately calling vgone() and reallocating a new vnode.

This problem can occur in three code paths, during VOP_LOOKUP,
VOP_SETATTR, or following FUSE_GETATTR, which usually happens during
VOP_GETATTR but can occur during other vops, too.  Note that the correct
response actually doesn't depend on whether the entry cache has expired.
In fact, during VOP_LOOKUP, we can't even tell.  Either it has expired
already, or else the vnode got reclaimed by vnlru.

Also, correct the error code during the VOP_SETATTR path.

PR:		258022
Reported by:	chogata@moosefs.pro
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33283
2021-12-06 21:36:46 -07:00
Rick Macklem
2d90ef4714 nfsd: Fix Verify for attributes like FilesAvail
When the Verify operation calls nfsv4_loadattr(), it provides
the "struct statfs" information that can be used for doing a
compare for FilesAvail, FilesFree, FilesTotal, SpaceAvail,
SpaceFree and SpaceTotal.  However, the code erroneously
used the "struct nfsstatfs *" argument that is NULL.
This patch fixes these cases to use the correct argument
structure.  For the case of FilesAvail, the code in
nfsv4_fillattr() was factored out into a separate function
called nfsv4_filesavail(), so that it can be called from
nfsv4_loadattr() as well as nfsv4_fillattr().

In fact, most of the code in nfsv4_filesavail() is old
OpenBSD code that does not build/run on FreeBSD, but I
left it in place, in case it is of some use someday.

I am not aware of any extant NFSv4 client that does Verify
on these attributes.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260176
MFC after:	2 weeks
2021-12-04 14:38:55 -08:00
Rick Macklem
480be96e1e nfsd: Sanity check the Layouttype count
Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260155
MFC after:	2 weeks
2021-12-04 14:22:19 -08:00
Cy Schubert
db0ac6ded6 Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"
This reverts commit 266f97b5e9, reversing
changes made to a10253cffe.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.
2021-12-02 14:45:04 -08:00
Cy Schubert
266f97b5e9 wpa: Import wpa_supplicant/hostapd commit 14ab4a816
This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after:      1 month
2021-12-02 13:35:14 -08:00
Rick Macklem
fd020f197d nfsd: Sanity check the ACL attribute
When an ACL is presented to the NFSv4 server in
Setattr or Verify, parsing of the ACL assumed a
sane acecnt and sane sizes for the "who" strings.
This patch adds sanity checks for these.

The patch also fixes handling of an error
return from nfsrv_dissectacl() for one broken
case.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260111
MFC after:	2 weeks
2021-12-01 13:55:17 -08:00
Rick Macklem
33d0be8a92 nfsd: Do not try to cache a reply for NFSERR_BADSLOT
When nfsrv_checksequence() replies NFSERR_BADSLOT,
the value of nd_slotid is not valid.  As such, the
reply cannot be cached in the session.
Do not set ND_HASSEQUENCE for this case.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260076
MFC after:	2 weeks
2021-12-01 13:46:41 -08:00
Neel Chauhan
3dd3a395ba ext2: Check for e2fs_first_dblock in ext2_compute_sb_data()
This prevents a kernel panic on a damaged ext2 superblock.

PR:			259107
Reported by:		Robert Morris <rtm@lcs.mit.edu>
Differential Revision:	https://reviews.freebsd.org/D33029
2021-11-29 09:53:45 -08:00
Alan Somers
91972cfcdd fusefs: update atime on reads when using cached attributes
When using cached attributes, whether or not the data cache is enabled,
fusefs must update a file's atime whenever it reads from it, so long as
it wasn't mounted with -o noatime.  Update it in-kernel, and flush it to
the server on close or during the next setattr operation.

The downside is that close() will now frequently trigger a FUSE_SETATTR
upcall.  But if you care about performance, you should be using
-o noatime anyway.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33145
2021-11-28 18:53:31 -07:00
Alan Somers
65d70b3bae fusefs: fix copy_file_range when extending a file
When copy_file_range extends a file, it must update the cached file
size.

MFC after:	2 weeks
Reviewed by:	rmacklem, pfg
Differential Revision: https://reviews.freebsd.org/D33151
2021-11-28 18:35:58 -07:00
Rick Macklem
638b90a191 nfs: Quiet a few "unused" warnings
For most of these warnings, the variable is loaded
with data parsed out of an RPC messages.  In case
the data is useful in the future, I just marked
these with __unused.
2021-11-28 15:48:51 -08:00
Alan Somers
8fbae6c7bd fusefs: delete a redundant getnanouptime
It's been redundant since SVN r346060 added another getnanouptime just
above.

MFC after:	2 weeks
2021-11-28 16:05:30 -07:00
Rick Macklem
c3134a6af0 nfscl: Disable use of the LookupOpen RPC
The LookupOpen RPC reduces the number of Open RPCs
needed.  Unfortunately, it breaks certain software
builds over NFS, so disable it until this is fixed.

The LookupOpen RPC is only used for NFSv4.1/4.2
mounts when the "oneopenown" mount option is
specified, so this should not affect many users.
2021-11-27 15:34:45 -08:00
Mateusz Guzik
4dcdf3987c vfs: replace the MNTK_TEXT_REFS flag with VIRF_TEXT_REF
This allows to stop maintaing the VI_TEXT_REF flag and consequently
opens up fully lockless v_writecount adjustment.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33127
2021-11-27 23:07:25 +00:00
Rick Macklem
1c15c8c0e9 nfscl: Sanity check the Sequence slotid in reply
The slotid in the Sequence reply must be the same as
in the request.  Check that it is the same and log
a console message if it is not, plus set it to the
correct value.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260071
MFC after:	2 weeks
2021-11-27 15:02:04 -08:00
Rick Macklem
5b430a1323 nfsd: Sanity check the len argument for ListXattr
The check for the original len being >= retlen needs to
be done before the "if (nd->nd_repstat == 0)" code, so
that it can be reported as too small.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260046
MFC after:	2 weeks
2021-11-26 15:56:29 -08:00
Rick Macklem
bdd57cbb1b nfsd: Add checks for layout errors in LayoutReturn
For a LayoutReturn when using the Flexible File Layout,
error reports may be provided in the request.
Sanity check the size of these error reports and
check that they exist before calling nfsrv_flexlayouterr().

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260012
MFC after:	2 weeks
2021-11-26 15:42:32 -08:00
Rick Macklem
22f7bcb523 nfscl: Sanity check irdcnt in nfsrpc_createsession
Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	259996
MFC after:	2 weeks
2021-11-26 15:28:40 -08:00
Mateusz Guzik
1879021942 tmpfs: add vop_stdadd_writecount_nomsync to fifo vnode ops
Reported by:	yasu
Fixes: 3ffcfa599e ("vfs: add vop_stdadd_writecount_nomsync")
2021-11-26 19:32:35 +00:00
Mateusz Guzik
3ffcfa599e vfs: add vop_stdadd_writecount_nomsync
This avoids needing to inspect the mount point every time.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D33125
2021-11-26 12:06:08 +00:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Mateusz Guzik
40dd1c9c06 ext2: plug set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 23:02:26 +00:00
Mateusz Guzik
873606999f unionfs: plug a set-but-not-unused var
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:31:35 +00:00
Mateusz Guzik
35b12b8711 fdescfs: plug a set-but-not-unused var
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-11-24 21:25:24 +00:00
Neel Chauhan
be60d8f276 ext2fs: check for eh_depth in ext4_ext_check_header()
PR:			259112
Reported by:		Robert Morris <rtm@lcs.mit.edu>
Reviewed by:		fsu
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D33030
2021-11-18 09:54:42 -08:00
Konstantin Belousov
8ef0c11e7c nfsclient: upgrade vnode lock in VOP_OPEN()/VOP_CLOSE() if we need to flush buffers
VOP_FSYNC() asserts that the vnode is exclusively locked for NFS.
If we try to execute file with recently modified content, the assert is
triggered.

Reviewed by:	rmacklem
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32999
2021-11-16 19:13:29 +02:00
Jason A. Harmening
06f79675b7 unionfs: fix potential deadlock in VOP_RMDIR
VOP_RMDIR() is called with both parent and child directory vnodes
locked.  The relookup operation performed by the unionfs implementation
may relock both vnodes.  Accordingly, unionfs_relookup() drops the
parent vnode lock, but the child vnode lock is never dropped.
Although relookup() will very likely try to relock the child vnode
which is already locked, in most cases this doesn't produce a deadlock
because unionfs_lock() forces LK_CANRECURSE (!).  However, relocking
of the parent vnode while the child vnode remains locked effectively
reverses the expected parent->child lock order, which can produce
a deadlock e.g. in the presence of a concurrent unionfs_lookup()
operation.  Address the issue by dropping the child lock around
the unionfs_relookup() call in unionfs_rmdir().

Reported by:	pho
Reviewed by:	kib, markj
Differential Revision: https://reviews.freebsd.org/D32986
2021-11-14 20:07:42 -08:00
Rick Macklem
ce9676de86 pNFS: Add nfsstats counters for number of Layouts
For pNFS, Layouts are issued by the server to indicate
where a file's data resides on the DS(s).  This patch
adds counters for how many layouts are allocated to
the nfsstatsv1 structure, using two reserved fields.

MFC after:	2 weeks
2021-11-12 17:32:55 -08:00
Konstantin Belousov
25809a018d mntfs: lock mntfs pseudo devfs vnode properly
Require devvp locked for mntfs_freevp(), to have it locked around
vgone().  Make that true for ffs, which is the only consumer of
the interface.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:41 +02:00
Rick Macklem
44744f7538 nfscl: Add a LayoutError RPC for NFSv4.2 pNFS mounts
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing.  For the Linux
client, it then sends a LayoutError RPC to the MDS server to
tell it about the error.  This patch adds the same to the
FreeBSD NFSv4.2 pNFS client, to maintain Linux compatible
behaviour, particlularily for non-FreeBSD pNFS servers.

MFC after:	2 weeks
2021-11-11 15:43:58 -08:00
Rick Macklem
f8dc06303b nfsd: Fix the NFSv4.2 pNFS MDS server for NFSERR_NOSPC via LayoutError
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing.  For the Linux
client, it then sends a LayoutError RPC to the MDS server to
tell it about the error and keeps retrying, doing repeated
LayoutGets to the MDS and Write RPCs to the DS.  The Linux client is
"stuck" until disk space on the DS is free'd up unless
a subsequent LayoutGet request is sent a NFSERR_NOSPC
reply.
The looping problem still occurs for NFSv4.1 mounts, but no
fix for this is known at this time.

This patch changes the pNFS MDS server to reply to LayoutGet
operations with NFSERR_NOSPC once a LayoutError reports the
problem, until the DS has available space.  This keeps the Linux
NFSv4.2 from looping.

Found during recent testing because of issues w.r.t. a DS
being out of space found during a recent IEFT NFSv4 working
group testing event.

MFC after:	2 weeks
2021-11-08 15:58:00 -08:00
Rick Macklem
d70ca5b00e nfsd: Fix f_bavail and f_ffree for NFSv4 when negative
Since the NFS Space_available and Files_available are unsigned,
the NFSv3 server sets them to 0 when negative, so that they
do not appear to be large positive values for non-FreeBSD clients.
This patch fixes the NFSv4 server to do the same.

Found during a recent IEFT NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-08 12:59:31 -08:00
Rick Macklem
a7e014eee5 nfsd: Fix the NFSv4 pNFS MDS server for DS NFSERR_NOSPC
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing.  For the Linux
client, it then sends a LayoutError RPC to the server to
tell it about the error and keeps retrying, doing repeated
LayoutGet and Write RPCs to the DS.  The Linux client is
"stuck" until disk space on the DS is free'd up.
For a mirrored server configuration, the first mirror that
ran out of space was taken offline.  This does not make
much sense, since the other mirror(s) will run out of space
soon and the fix is a manual cleanup up disk space.

This patch changes the pNFS server to not disable a mirror
for the mirrored case when this occurs.

Further work is needed, since the Linux client expects the
MDS to reply NFSERR_NOSPC to LayoutGets once the DS is out
of space.  Without this further change, the above mentioned
looping occurs.

Found during a recent IEFT NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-07 11:43:03 -08:00
Rick Macklem
f0c9847a6c vfs: Add "ioflag" and "cred" arguments to VOP_ALLOCATE
When the NFSv4.2 server does a VOP_ALLOCATE(), it needs
the operation to be done for the RPC's credential and not
td_ucred. It also needs the writing to be done synchronously.

This patch adds "ioflag" and "cred" arguments to VOP_ALLOCATE()
and modifies vop_stdallocate() to use these arguments.

The VOP_ALLOCATE.9 man page will be patched separately.

Reviewed by:	khng, kib
Differential Revision:	https://reviews.freebsd.org/D32865
2021-11-06 13:26:43 -07:00
Jason A. Harmening
5f73b3338e unionfs: Improve vnode validation
Instead of validating that a vnode belongs to unionfs only when the
caller attempts to extract the upper or lower vnode pointers, do this
validation any time the caller tries to extract a unionfs_node from
the vnode private data.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32629
2021-11-06 07:08:34 -07:00
Jason A. Harmening
fb273fe70f unionfs: replace zero-length read check with KASSERT
The lower FS VOP_READDIR() shouldn't return an empty read without
setting EOF; don't try to handle this case only for non-DIAGNOSTIC
builds.

Noted by:	kib
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32629
2021-11-06 07:08:34 -07:00
Jason A. Harmening
66191a76ac unionfs: Improve locking assertions
Add an assertion to unionfs_node_update() that the upper vnode is
exclusively locked; we already make the same assertion for the lower
vnode.
Also, assert in unionfs_noderem() that the vnode lock is not recursed
and acquire v_lock with LK_NOWAIT.  Since v_lock is not the active
lock for the vnode at this point, it should not be contended.
Finally, remove VDIR assertions from unionfs_get_cached_vnode().
lvp/uvp will be referenced but not locked at this point, so v_type
may concurrently change due to vgonel().  The cached unionfs node,
if one exists, would only have made it into the cache if lvp/uvp
were of type VDIR at the time of insertion; the corresponding
VDIR assert in unionfs_ins_cached_vnode() should be safe because
lvp/uvp will be locked by that time and will not be used if either
is doomed.

Noted by:	kib
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32629
2021-11-06 07:08:33 -07:00
Jason A. Harmening
3ecefc4a61 unionfs: assorted style fixes
No functional change intended, beyond slightly different panic strings

Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D32629
2021-11-06 07:08:33 -07:00
Jason A. Harmening
866dd6335a unionfs: various locking fixes
--Clearing cached subdirectories in unionfs_noderem() should be done
  under the vnode interlock

--When preparing to switch the vnode lock in both unionfs_node_update()
  and unionfs_noderem(), the incoming lock should be acquired before
  updating the v_vnlock field to point to it.  Otherwise we effectively
  break the locking contract for a brief window.

Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D32629
2021-11-06 07:08:33 -07:00
Rick Macklem
f5d5164fb6 nfscl: Fix two more cases for forced dismount
Although I was not able to cause a failure during testing, there
are places in nfscl_removedeleg() and nfscl_renamedeleg() where
I think a forced dismount could get hung.  This patch fixes those.

This patch only affects forced dismount and only if the NFSv4
server is issuing delegations to the client.

Found by code inspection.

MFC after:	2 weeks
2021-11-05 15:33:19 -07:00
Rick Macklem
80e5955b08 nfscl: Fix NFSv4.1/4.2 pnfs mounts using nconnect
When a mount with the "pnfs" and "nconnect" options specified
does an I/O operation, it erroneously uses a TCP connection
to the MDS when it is meant to be a DS operation and, as such,
needs to use a TCP connection to the DS.  This patch fixes this.

When the "pnfs" and "nconnect" options are specified for a
NFSv4.1/4.2 mount, there probably should be N connections
established to each DS for I/O RPCs.  This is a fair amount
of work and may be done in a future commit.

This problem was found during a recent IETF NFSv4 working
group testing event.

MFC after:	2 weeks
2021-11-04 17:06:34 -07:00
Konstantin Belousov
15bd9fa3be procfs_doprocfile(): simplify
Now that proc_get_binpath() does not return NULL in fullpath on success,
directly use sbuf_cat() over the value.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-11-04 06:13:47 +02:00
Rick Macklem
6b67753488 nfscl: Fix forced dismount from looping on commit
When a forced dismount is in progress, it is possible to
end up looping, retrying commits that fail.
This patch fixes the problem by pretending
that commits succeeded when a forced dismount is in prgress.

MFC after:	2 weeks
2021-11-03 14:25:44 -07:00
Rick Macklem
ae49051c03 nfscl: Fix forced dismount when "nconnect" is specified
When a forced dismount is done and the "nconnect" mount
option was used, the additional connections must be closed.
This patch does that.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-03 13:26:38 -07:00
Rick Macklem
4412225859 nfscl: Fix use after free for forced dismount
When a forced dismount is done and delegations are being
issued by the server (disabled by default for FreeBSD
servers), the delegation structure is free'd before the
loop calling vflush().  This could result in a use after
free of the delegation structure.

This patch changes the code so that the delegation
structures are not free'd until after the vflush()
loop for forced dismounts.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-03 12:15:40 -07:00
Rick Macklem
331883a2f2 nfscl: Check for a forced dismount in nfscl_getref()
The nfscl_getref() function is called within nfscl_doiods() when
the NFSv4.1/4.2 pNFS client is doing I/O on a DS.  As such,
nfscl_getref() needs to check for a forced dismount.
This patch adds that check.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-02 17:28:13 -07:00
Rick Macklem
5a95a6e8e4 nfscl: Use a smaller initial delay time for NFSERR_DELAY
For NFS RPCs that receive a NFSERR_DELAY reply, the delay time
is initially 1sec and then increases exponentially to NFS_TRYLATERDEL.
It was found that this delay time is excessive for some NFSv4
servers, which work well with a 1msec delay.
A 1sec delay resulted in very slow performance for Remove and
Rename when delegations and pNFS were enabled.

This patch decreases the initial delay time to 1msec.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-11-01 17:21:31 -07:00
Rick Macklem
d5d2ce1c85 nfscl: Do pNFS layout return_on_close synchronously
For pNFS servers that specify that Layouts are to be returned
upon close, they may expect that LayoutReturn to happen before
the associated Close.

This patch modifies the NFSv4.1/4.2 pNFS client so that this
is done.  This only affects a pNFS mount against a non-FreeBSD
NFSv4.1/4.2 server that specifies return_on_close in LayoutGet
replies.

Found during a recent IETF NFSv4 working group testing event.

MFC after:	2 weeks
2021-10-31 16:31:31 -07:00
Konstantin Belousov
e5248548f9 procfs: return right hardlink from /proc/curproc/file
Use proc_get_binpath() to get the hardlink right.

PR:	248184
Reviewed by:	emaste, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32738
2021-10-31 03:05:14 +02:00
Rick Macklem
50dcff0816 nfscl: Add setting n_localmodtime to the Write RPC code
Similar to commit 2be417843a, I believe there could be a race between
the NFS client VOP_LOOKUP() and file Writing that could result in stale
file attributes being loaded into the NFS vnode by VOP_LOOKUP().

I have not been able to reproduce a failure due to this race, but
I believe that there are two possibilities:

The Lookup RPC happens while VOP_WRITE() is being executed and loads
stale file attributes after VOP_WRITE() returns when it has already
completed the Write/Commit RPC(s).
--> For this case, setting the local modify timestamp at the end of
  VOP_WRITE() should ensure that stale file attributes are not loaded.

The Lookup RPC occurs after VOP_WRITE() has returned, while
asynchronous Write/Commit RPCs are in progress and then is
blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which
will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the
NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP()
then acquires the NFS vnode lock and fills in stale file attributes.
 --> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf()
   when they clear NMODIFIED should ensure that stale file attributes
   are not loaded.

This patch does the above.

PR:	259071
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32677
2021-10-30 17:08:28 -07:00
Rick Macklem
ab87c39c25 nfscl: Set n_localmodtime in Deallocate
Commit 2be417843a added n_localmodtime, which is used by Lookup
and ReaddirPlus to check to see if the file attributes in an RPC
reply might be stale.  This patch sets n_localmodtime in Deallocate.
Done as a separate commit, since Deallocate is not in stable/13.

PR:	259071
Reviewed by:	asomers
Differential Revision:	https://reviews.freebsd.org/D32635
2021-10-30 16:46:14 -07:00
Rick Macklem
2be417843a PR#259071 provides a test program that fails for the NFS client.
Testing with it, there appears to be a race between Lookup
and VOPs like Setattr-of-size, where Lookup ends up loading
stale attributes (including what might be the wrong file size)
into the NFS vnode's attribute cache.

The race occurs when the modifying VOP (which holds a lock
on the vnode), blocks the acquisition of the vnode in Lookup,
after the RPC (with now potentially stale attributes).

Here's what seems to happen:
Child                                Parent

does stat(), which does
VOP_LOOKUP(), doing the Lookup
RPC with the directory vnode
locked, acquiring file attributes
valid at this point in time

blocks waiting for locked file       does ftruncate(), which
vnode                                does VOP_SETATTR() of Size,
                                     changing the file's size
                                     while holding an exclusive
                                     lock on the file's vnode
                                     releases the vnode lock
acquires file vnode and fills in
now stale attributes including
the old wrong Size
                                     does a read() which returns
                                     wrong data size

This patch fixes the problem by saving a timestamp in the NFS vnode
in the VOPs that modify the file (Setattr-of-size, Allocate).
Then lookup/readdirplus compares that timestamp with the time just
before starting the RPC after it has acquired the file's vnode.
If the modifying RPC occurred during the Lookup, the attributes
in the RPC reply are discarded, since they might be stale.

With this patch the test program works as expected.

Note that the test program does not fail on a July stable/12,
although this race is in the NFS client code.  I suspect a
fairly recent change to the name caching code exposed this
bug.

PR:	259071
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32635
2021-10-30 16:35:02 -07:00
Rick Macklem
dc6dd769de nfscl: Use NFSMNTP_DELEGISSUED in two more functions
Commit 5e5ca4c8fc added a NFSMNTP_DELEGISSUED flag to indicate when
a delegation has been issued to the mount.  For the common case
where an NFSv4 server is not issuing delegations, this flag
can be checked to avoid acquisition of the NFSCLSTATEMUTEX.

This patch adds checks for NFSMNTP_DELEGISSUED being set
to two more functions.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.

MFC after:	2 week
2021-10-29 20:35:02 -07:00
Rick Macklem
23024f004a nfscl: Add a missing delegation lock release
There was a case in nfscl_doiods() where the function would return
without releasing the delegation shared lock, if it was aquired by
the call to nfscl_getstateid().  This patch adds that release.

I have never observed a failure due to this missing release, so I
do not know if it ever happens in practice.  However, since the pNFS
client is not yet heavily used, it might be the case.

Found by code inspection during a recent NFSv4 IETF working group
testing event.

MFC after:	2 week
2021-10-25 19:11:45 -07:00
Jason A. Harmening
fd8ad2128d unionfs: implement vnode-based cache lookup
unionfs uses a per-directory hashtable to cache subdirectory nodes.
Currently this hashtable is looked up using the directory name, but
since unionfs nodes aren't removed from the cache until they're
reclaimed, this poses some problems.  For example, if a directory is
created on a unionfs mount shortly after deleting a previous directory
with the same path, the cache may end up reusing the node for the
previous directory, including its upper/lower FS vnodes.  Operations
against those vnodes with then likely fail because the vnodes
represent deleted files; for example UFS will reject VOP_MKDIR()
against such a vnode because its effective link count is 0.  This may
then manifest as e.g. mkdir(2) or open(2) returning ENOENT for an
attempt to create a file under the re-created directory.

While it would be possible to fix this by explicitly managing the
name-based cache during delete or rename operations, or by rejecting
cache hits if the underlying FS vnodes don't match those passed to
unionfs_nodeget(), it seems cleaner to instead hash the unionfs nodes
based on their underlying FS vnodes.  Since unionfs prefers to operate
against the upper vnode if one is present, the lower vnode will only
be used for hashing as long as the upper vnode is NULL.  This should
also make hashing faster by eliminating string traversal and using
the already-computed hash index stored in each vnode.

While here, fix a couple of other cache-related issues:

--Remove 8 bytes of unnecessary baggage from each unionfs node by
  getting rid of the stored hash mask field.  The mask is knowable
  at compile time.

--When a matching node is found in the cache, reference its vnode
  using vrefl() while still holding the vnode interlock.  Previously
  unionfs_nodeget() would vref() the vnode after the interlock was
  dropped, but the vnode may be reclaimed during that window.  This
  caused intermittent panics from vn_lock(9) during unionfs stress
  testing.

Reviewed by:	kib, markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D32533
2021-10-24 10:05:50 -07:00
Konstantin Belousov
2bd6d910b2 msdosfs_rename: remove write-only variables
Reviewed by:	imp, mjg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32577
2021-10-20 21:29:49 +03:00
Konstantin Belousov
80bca63cf4 tmpfs: remove write-only variables
Reviewed by:	imp, mjg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32577
2021-10-20 21:29:49 +03:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Rick Macklem
52dee2bc03 nfscl: Handle NFSv4.1/4.2 Close RPC NFSERR_DELAY replies better
Without this patch, if a NFSv4.1/4.2 server replies NFSERR_DELAY to
a Close operation, the client loops retrying the Close while holding
a shared lock on the clientID.  This shared lock blocks returns of
delegations, even though the server has issued a CB_RECALL to request
the delegation return.

This patch delays doing a retry of a Close that received a reply of
NFSERR_DELAY until after the shared lock on the clientID is released,
for NFSv4.1/4.2.  To fix this for NFSv4.0 would be very difficult and
since the only known NFSv4 server to reply NFSERR_DELAY to Close only
does NFSv4.1/4.2, this fix is hoped to be sufficient.

This problem was detected during a recent IETF working group NFSv4
testing event.

MFC after:	2 week
2021-10-18 15:05:34 -07:00
Rick Macklem
d95c0a12a2 nfscl: Modify Close RPC so that it does not use "owner" for NFSv4.1/4.2
This patch modifies the function that does the Close RPC (nfsrpc_closerpc)
so that it does not use the open_owner (nfso_own) for NFSv4.1/4.2.
Use of the seqid in the open_owner structure is only needed for NFSv4.0.
Same applies to a NFSERR_STALESTATEID reply, which should only happen
for NFSv4.0.  This allows nfsrpc_closerpc() to be called when nfso_own
is no longer valid.  This, in turn, allows nfsrpc_closerpc() to be called
after the shared lock on the clientID is released, for NFSv4.1/4.2.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 week
2021-10-17 17:50:56 -07:00
Rick Macklem
e2aab5e2d7 nfscl: Move release of the clientID lock into nfscl_doclose()
This patch moves release of the shared clientID lock from nfsrpc_close()
just after the nfscl_doclose() call to the end of nfscl_doclose() call.
This does make the code cleaner, since the shared lock is acquired at
the beginning of nfscl_doclose().  The only semantics change is that
the code no longer drops and reaquires the NFSCLSTATELOCK() mutex,
which I do not believe will have a negative effect on the NFSv4 client.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 week
2021-10-16 15:49:38 -07:00
Rick Macklem
77c595ce33 nfscl: Add an argument to nfscl_tryclose()
This patch adds a new argument to nfscl_tryclose() to indicate
whether or not it should loop when a NFSERR_DELAY reply is received
from the NFSv4 server.  Since this new argument is always passed in
as "true" at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 week
2021-10-15 14:25:38 -07:00
Rick Macklem
6495766acf nfscl: Restructure nfscl_freeopen() slightly
This patch factors the unlinking of the nfsclopen structure out of
nfscl_freeopen() into a separate function called nfscl_unlinkopen().
It also adds a new argument to nfscl_freeopen() to conditionally do
the unlink.  Since this new argument is always passed in as "true"
at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after:	2 week
2021-10-14 17:28:01 -07:00
Jason A. Harmening
152c35ee4f unionfs: Ensure SAVENAME is set for unionfs vnode operations
"rm-style" system calls such as kern_frmdirat() and kern_funlinkat()
don't supply SAVENAME to preserve the pathname buffer for subsequent
vnode ops.  For unionfs this poses an issue because the pathname may
be needed for a relookup operation in unionfs_remove()/unionfs_rmdir().
Currently unionfs doesn't check for this case, leading to a panic on
DIAGNOSTIC kernels and use-after-free of cn_nameptr otherwise.

The unionfs node's stored buffer would suffice as a replacement for
cnp->cn_nameptr in some (but not all) cases, but it's cleaner to just
ensure that unionfs vnode ops always have a valid cn_nameptr by setting
SAVENAME in unionfs_lookup().

While here, do some light cleanup in unionfs_lookup() and assert that
HASBUF is always present in the relevant relookup calls.

Reported by:	pho
Reviewed by:	markj
Differential Revision: https://reviews.freebsd.org/D32148
2021-10-13 19:25:31 -07:00
Rick Macklem
24af0fcdfc nfscl: Make nfscl_getlayout() acquire the correct pNFS layout
Without this patch, if a pNFS read layout has already been acquired
for a file, writes would be redirected to the Metadata Server (MDS),
because nfscl_getlayout() would not acquire a read/write layout for
the file.  This happened because there was no "mode" argument to
nfscl_getlayout() to indicate whether reading or writing was being done.
Since doing I/O through the Metadata Server is not encouraged for some
pNFS servers, it is preferable to get a read/write layout for writes
instead of redirecting the write to the MDS.

This patch adds a access mode argument to nfscl_getlayout() and
nfsrpc_getlayout(), so that nfscl_getlayout() knows to acquire a read/write
layout for writing, even if a read layout has already been acquired.
This patch only affects NFSv4.1/4.2 client behaviour when pNFS ("pnfs" mount
option against a server that supports pNFS) is in use.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	2 week
2021-10-13 15:48:54 -07:00
Rick Macklem
b82168e657 nfscl: Fix another deadlock related to the NFSv4 clientID lock
Without this patch, it is possible to hang the NFSv4 client,
when a rename/remove is being done on a file where the client
holds a delegation, if pNFS is being used.  For a delegation
to be returned, dirty data blocks must be flushed to the NFSv4
server.  When pNFS is in use, a shared lock on the clientID
must be acquired while doing a write to the DS(s).
However, if rename/remove is doing the delegation return
an exclusive lock will be acquired on the clientID, preventing
the write to the DS(s) from acquiring a shared lock on the clientID.

This patch stops rename/remove from doing a delegation return
if pNFS is enabled.  Since doing delegation return in the same
compound as rename/remove is only an optimization, not doing
so should not cause problems.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	1 week
2021-10-12 17:21:01 -07:00
Kyle Evans
7259ca3104 fifos: delegate unhandled kqueue filters to underlying filesystem
This gives the vfs layer a chance to provide handling for EVFILT_VNODE,
for instance.  Change pipe_specops to use the default vop_kqfilter to
accommodate fifoops that don't specify the method (i.e. all in-tree).

Based on a patch by Jan Kokemüller.

PR:		225934
Reviewed by:	kib, markj (both pre-KASSERT)
Differential Revision:	https://reviews.freebsd.org/D32271
2021-10-12 02:43:07 -05:00
Rick Macklem
120b20bdf4 nfscl: Fix a deadlock related to the NFSv4 clientID lock
Without this patch, it is possible for a process doing an NFSv4
Open/create of a file to block to allow another process
to acquire the exclusive lock on the clientID when holding
a shared lock on the clientID.  As such, both processes
deadlock, with one wanting the exclusive lock, while the
other holds the shared lock.  This deadlock is unlikely to occur
unless delegations are in use on the NFSv4 mount.

This patch fixes the problem by not deferring to the process
waiting for the exclusive lock when a shared lock (reference cnt)
is already held by the process.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	1 week
2021-10-11 21:58:24 -07:00
Mateusz Guzik
2b68eb8e1d vfs: remove thread argument from VOP_STAT
and fo_stat.
2021-10-11 13:22:32 +00:00
Mateusz Guzik
b4a58fbf64 vfs: remove cn_thread
It is always curthread.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32453
2021-10-11 13:21:47 +00:00
Rick Macklem
dfe887b7d2 nfsd: Disable the NFSv4.2 Allocate operation by default
Some exported file systems, such as ZFS ones, cannot do VOP_ALLOCATE().
Since an NFSv4.2 server must either support the Allocate operation for
all file systems or not support it at all, define a sysctl called
vfs.nfsd.enable_v42allocate to enable the Allocate operation.
This sysctl is false by default and can only be set true if all
exported file systems (or all DSs for a pNFS server) can perform
VOP_ALLOCATE().

Unfortunately, there is no way to know if a ZFS file system will
be exported once the nfsd is operational, even if there are none
exported when the nfsd is started up, so enabling Allocate must
be done manually for a server configuration.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after:	2 weeks
2021-10-10 18:46:02 -07:00
Rick Macklem
235891a127 nfscl: Fix NFS VOP_ALLOCATE for mounts without Allocate support
Without this patch, nfs_allocate() fell back on using vop_stdallocate()
for NFS mounts without Allocate operation support.  This was incorrect,
since some file systems, such as ZFS, cannot do allocate via
vop_stdallocate(), which uses writes to try and allocate blocks.

Also, fix nfs_allocate() to return EINVAL when mounts cannot do Allocate,
since that is the correct error for posix_fallocate(2).
Note that Allocate is only supported by some NFSv4.2 servers.

MFC after:	2 weeks
2021-10-10 14:27:52 -07:00
Alan Somers
032a5bd55b fusefs: Fix a bug during VOP_STRATEGY when the server changes file size
If the FUSE server tells the kernel that a file's size has changed, then
the kernel must invalidate any portion of that file in cache.  But the
kernel can't do that during VOP_STRATEGY, because the file's buffers are
already locked.  Instead, proceed with the write.

PR:		256937
Reported by:	Agata <chogata@moosefs.pro>
Tested by:	Agata <chogata@moosefs.pro>
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D32332
2021-10-06 14:07:33 -06:00
Alan Somers
7430017b99 fusefs: fix a recurse-on-non-recursive lockmgr panic
fuse_vnop_bmap needs to know the file's size in order to calculate the
optimum amount of readahead.  If the file's size is unknown, it must ask
the FUSE server.  But if the file's data was previously cached and the
server reports that its size has shrunk, fusefs must invalidate the
cached data.  That's not possible during VOP_BMAP because the buffer
object is already locked.

Fix the panic by not querying the FUSE server for the file's size during
VOP_BMAP if we don't need it.  That's also a a slight performance
optimization.

PR:		256937
Reported by:	Agata <chogata@moosefs.pro>
Tested by:	Agata <chogata@moosefs.pro>
MFC after:	2 weeks
2021-10-06 14:07:33 -06:00
Alan Somers
5d94aaacb5 fusefs: quiet some cache-related warnings
If the FUSE server does something that would make our cache incoherent,
we should print a warning to the user.  However, we previously warned in
some situations when we shouldn't, such as if the file's size changed on
the server _after_ our own attribute cache had expired.  This change
suppresses the warning in cases like that.  It also moves the warning
logic to a single place within the code.

PR:		256936
Reported by:	Agata <chogata@moosefs.pro>
Tested by:	Agata <chogata@moosefs.pro>, jSML4ThWwBID69YC@protonmail.com
MFC after:	2 weeks
2021-10-06 14:07:33 -06:00
Kyle Evans
6b88668f0b vfs: remove dead fifoop VOP_KQFILTER implementations
These began to become obsolete in d6d64f0f2c (r137739) and the deal
was later sealed in 003e18aef4 (r137801) when vfs.fifofs.fops was
dropped and vop-bypass for pipes became mandatory.

PR:		225934
Suggested by:	markj
Reviewe by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D32270
2021-10-03 01:02:51 -05:00
Rick Macklem
93a32050ab nfsd: Fix pNFS handling of Deallocate
For a pNFS server configuration, an NFSv4.2 Deallocate operation
is proxied to the DS(s).  The code that parsed the reply for the
proxy RPC is broken and did not process the pre-operation attributes.

This patch fixes this problem.

This bug would only affect pNFS servers built from recent main/FreeBSD14
sources.
2021-10-02 14:11:15 -07:00
Mateusz Guzik
ef7d2c1fc1 nfs: eliminate thread argument from nfsvno_namei
This is a step towards retiring struct componentname cn_thread

Reviewed by:	rmacklem
Differential Revision:	https://reviews.freebsd.org/D32267
2021-10-02 00:57:20 +00:00
Alan Somers
7124d2bc3a fusefs: implement FUSE_NO_OPEN_SUPPORT and FUSE_NO_OPENDIR_SUPPORT
For file systems that allow it, fusefs will skip FUSE_OPEN,
FUSE_RELEASE, FUSE_OPENDIR, and FUSE_RELEASEDIR operations, a minor
optimization.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D32141
2021-09-26 21:57:29 -06:00
Alan Somers
a3a1ce3794 fusefs: diff reduction in fuse_kernel.h
Synchronize formatting and documentation in fuse_kernel.h with upstream
sources.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision:	https://reviews.freebsd.org/D32141
2021-09-26 21:57:07 -06:00
Rick Macklem
62c5be4ab4 nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg()
Commit 5e5ca4c8fc added a flag to a NFSv4 mount point that is set when
the first delegation is acquired from the NFSv4 server.

For a common case where delegations are not being issued by the
NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for
open/lock state, finds the delegation list empty, then just unlocks the
mutex and returns. This patch adds a check of the flag to avoid the
need to acquire the mutex for this common case.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.

This commit should not affect the high level semantics of delegation
handling.

MFC after:      2 weeks
2021-09-26 18:37:25 -07:00
Gordon Bergling
90d60ca8b7 nfsclient: Fix a typo in a comment
- s/derefernce/dereference/

MFC after:	3 days
2021-09-26 15:17:00 +02:00
Mateusz Guzik
d71e1a883c fifo: support flock
This evens it up with Linux.

Original patch by:	Greg V <greg@unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D24255#565302
2021-09-25 14:58:31 +00:00
Jason A. Harmening
f9e28f9003 unionfs: lock newly-created vnodes before calling insmntque()
This fixes an insta-panic when attempting to use unionfs with
DEBUG_VFS_LOCKS.  Note that unionfs still has a long way to
go before it's generally stable or usable.

Reviewed by:	kib (prior version), markj
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D31917
2021-09-23 19:20:30 -07:00
Alan Somers
4f917847c9 fusefs: don't panic if FUSE_GETATTR fails durint VOP_GETPAGES
During VOP_GETPAGES, fusefs needs to determine the file's length, which
could require a FUSE_GETATTR operation.  If that fails, it's better to
SIGBUS than panic.

MFC after:	1 week
Sponsored by:	Axcient
Reviewed by: 	markj, kib
Differential Revision: https://reviews.freebsd.org/D31994
2021-09-21 14:01:06 -06:00
Rick Macklem
ad6dc36520 nfscl: Use vfs.nfs.maxalloclen to limit Deallocate RPC RTT
Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not
allow a reply with partial completion.  As such, the only way to
limit the time the operation takes to provide a reasonable RPC RTT
is to limit the size of the allocation/deallocation in the NFSv4.2
client.

This patch uses the sysctl vfs.nfs.maxalloclen to set
the limit on the size of the Deallocate operation.
There is no way to know how long a server will take to do an
deallocate operation, but 64Mbytes results in a reasonable
RPC RTT for the slow hardware I test on.

For an 8Gbyte deallocation, the elapsed time for doing it in 64Mbyte
chunks was the same (within margin of variability) as the
elapsed time taken for a single large deallocation
operation for a FreeBSD server with a UFS file system.
2021-09-18 14:38:43 -07:00
Konstantin Belousov
197a4f29f3 buffer pager: allow get_blksize method to return error
Reported and reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31998
2021-09-17 20:29:55 +03:00
Rick Macklem
9ebe4b8c67 nfscl: Add vfs.nfs.maxalloclen to limit Allocate/Deallocate RPC RTT
Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not
allow a reply with partial completion.  As such, the only way to
limit the time the operation takes to provide a reasonable RPC RTT
is to limit the size of the allocation/deallocation in the NFSv4.2
client.

This patch adds a sysctl called vfs.nfs.maxalloclen to set
the limit on the size of the Allocate operation.
There is no way to know how long a server will take to do an
allocate operation, but 64Mbytes results in a reasonable
RPC RTT for the slow hardware I test on, so that is what
the default value for vfs.nfs.maxalloclen is set to.

For an 8Gbyte allocation, the elapsed time for doing it in 64Mbyte
chunks was the same as the elapsed time taken for a single large
allocation operation for a FreeBSD server with a UFS file system.

MFC after:	2 weeks
2021-09-15 17:29:45 -07:00
Alexander Motin
272c4a4dc5 Allow setting NFS server scope and owner.
By default NFS server reports as scope and owner major the host UUID
value and zero for owner minor.  It works good in case of standalone
server.  But in case of CARP-based HA cluster failover the values
should remain persistent, otherwise some clients like VMware ESXi
get confused by the change and fail to reconnect automatically.

The patch makes server scope, major owner and minor owner values
configurable via sysctls.  If not set (by default) the host UUID
value is still used.

Reviewed by:	rmacklem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31952
2021-09-14 14:18:03 -04:00
Rick Macklem
55089ef4f8 nfscl: Make vfs.nfs.maxcopyrange larger by default
As of commit 103b207536, the NFSv4.2 server will limit the size
of a Copy operation based upon a 1 second timeout.  The Linux 5.2
kernel server also limits Copy operation size to 4Mbytes.
As such, the NFSv4.2 client can attempt a large Copy without
resulting in a long RPC RTT for these servers.

This patch changes vfs.nfs.maxcopyrange to 64bits and sets
the default to the maximum possible size of SSIZE_MAX, since
a larger size makes the Copy operation more efficient and
allows for copying to complete with fewer RPCs.
The sysctl may be need to be made smaller for other non-FreeBSD
NFSv4.2 servers.

MFC after:	2 weeks
2021-09-11 15:36:32 -07:00
Rick Macklem
f1c8811d2d nfsd: Fix build after commit 103b207536 for 32bit arches
MFC after:	2 weeks
2021-09-08 18:55:06 -07:00
Rick Macklem
103b207536 nfsd: Use the COPY_FILE_RANGE_TIMEO1SEC flag
Although it is not specified in the RFCs, the concept that
the NFSv4 server should reply to an RPC request within a
reasonable time is accepted practice within the NFSv4 community.

Without this patch, the NFSv4.2 server attempts to reply to
a Copy operation within 1 second by limiting the copy to
vfs.nfs.maxcopyrange bytes (default 10Mbytes). This is crude at
best, given the large variation in I/O subsystem performance.

This patch uses the COPY_FILE_RANGE_TIMEO1SEC flag added by
commit c5128c48df to limit the reply time for a Copy
operation to approximately 1 second.

MFC after:	2 weeks
2021-09-08 14:29:20 -07:00
Jason A. Harmening
312d49ef7a unionfs: style
Fix the more egregious style(9) violations in unionfs.
No functional change intended.
2021-09-01 07:55:37 -07:00
Jason A. Harmening
abe95116ba unionfs: rework pathname handling
Running stress2 unionfs tests reliably produces a namei_zone corruption
panic due to unionfs_relookup() attempting to NUL-terminate a newly-
allocate pathname buffer without first validating the buffer length.

Instead, avoid allocating new pathname buffers in unionfs entirely,
using already-provided buffers while ensuring the the correct flags
are set in struct componentname to prevent freeing or manipulation
of those buffers at lower layers.

While here, also compute and store the path length once in the unionfs
node instead of constantly invoking strlen() on it.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D31728
2021-09-01 07:55:09 -07:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Rick Macklem
13914e51eb nfsd: Make loop calling VOP_ALLOCATE() iterate until done
The NFSv4.2 Deallocate operation loops on VOP_DEALLOCATE()
while progress is being made (remaining length decreasing).
This patch changes the loop on VOP_ALLOCATE() for the NFSv4.2
Allocate operation do the same, instead of stopping after
an arbitrary 20 iterations.

MFC after:	2 weeks
2021-08-29 16:46:27 -07:00
Rick Macklem
08b9cc316a nfscl: Add a VOP_DEALLOCATE() for the NFSv4.2 client
This patch adds a VOP_DEALLOCATE() to the NFS client.
For NFSv4.2 servers that support the Deallocate operation,
it is used. Otherwise, it falls back on calling
vop_stddeallocate().

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31640
2021-08-27 18:31:36 -07:00