Commit Graph

4592 Commits

Author SHA1 Message Date
Alan Somers
032a5bd55b fusefs: Fix a bug during VOP_STRATEGY when the server changes file size
If the FUSE server tells the kernel that a file's size has changed, then
the kernel must invalidate any portion of that file in cache.  But the
kernel can't do that during VOP_STRATEGY, because the file's buffers are
already locked.  Instead, proceed with the write.

PR:		256937
Reported by:	Agata <chogata@moosefs.pro>
Tested by:	Agata <chogata@moosefs.pro>
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D32332
2021-10-06 14:07:33 -06:00
Alan Somers
7430017b99 fusefs: fix a recurse-on-non-recursive lockmgr panic
fuse_vnop_bmap needs to know the file's size in order to calculate the
optimum amount of readahead.  If the file's size is unknown, it must ask
the FUSE server.  But if the file's data was previously cached and the
server reports that its size has shrunk, fusefs must invalidate the
cached data.  That's not possible during VOP_BMAP because the buffer
object is already locked.

Fix the panic by not querying the FUSE server for the file's size during
VOP_BMAP if we don't need it.  That's also a slight performance
optimization.

PR:		256937
Reported by:	Agata <chogata@moosefs.pro>
Tested by:	Agata <chogata@moosefs.pro>
MFC after:	2 weeks
2021-10-06 14:07:33 -06:00
Alan Somers
5d94aaacb5 fusefs: quiet some cache-related warnings
If the FUSE server does something that would make our cache incoherent,
we should print a warning to the user.  However, we previously warned in
some situations when we shouldn't, such as if the file's size changed on
the server _after_ our own attribute cache had expired.  This change
suppresses the warning in cases like that.  It also moves the warning
logic to a single place within the code.

PR:		256936
Reported by:	Agata <chogata@moosefs.pro>
Tested by:	Agata <chogata@moosefs.pro>, jSML4ThWwBID69YC@protonmail.com
MFC after:	2 weeks
2021-10-06 14:07:33 -06:00
Kyle Evans
6b88668f0b vfs: remove dead fifoop VOP_KQFILTER implementations
These began to become obsolete in d6d64f0f2c (r137739) and the deal
was later sealed in 003e18aef4 (r137801) when vfs.fifofs.fops was
dropped and vop-bypass for pipes became mandatory.

PR:		225934
Suggested by:	markj
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D32270
2021-10-03 01:02:51 -05:00
Rick Macklem
93a32050ab nfsd: Fix pNFS handling of Deallocate
For a pNFS server configuration, an NFSv4.2 Deallocate operation
is proxied to the DS(s).  The code that parsed the reply for the
proxy RPC was broken and did not process the pre-operation attributes.

This patch fixes this problem.

This bug would only affect pNFS servers built from recent main/FreeBSD14
sources.
2021-10-02 14:11:15 -07:00
Mateusz Guzik
ef7d2c1fc1 nfs: eliminate thread argument from nfsvno_namei
This is a step towards retiring struct componentname cn_thread

Reviewed by:	rmacklem
Differential Revision:	https://reviews.freebsd.org/D32267
2021-10-02 00:57:20 +00:00
Alan Somers
7124d2bc3a fusefs: implement FUSE_NO_OPEN_SUPPORT and FUSE_NO_OPENDIR_SUPPORT
For file systems that allow it, fusefs will skip FUSE_OPEN,
FUSE_RELEASE, FUSE_OPENDIR, and FUSE_RELEASEDIR operations, a minor
optimization.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D32141
2021-09-26 21:57:29 -06:00
Alan Somers
a3a1ce3794 fusefs: diff reduction in fuse_kernel.h
Synchronize formatting and documentation in fuse_kernel.h with upstream
sources.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision:	https://reviews.freebsd.org/D32141
2021-09-26 21:57:07 -06:00
Rick Macklem
62c5be4ab4 nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg()
Commit 5e5ca4c8fc added a flag to a NFSv4 mount point that is set when
the first delegation is acquired from the NFSv4 server.

For a common case where delegations are not being issued by the
NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for
open/lock state, finds the delegation list empty, then just unlocks the
mutex and returns. This patch adds a check of the flag to avoid the
need to acquire the mutex for this common case.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.

This commit should not affect the high level semantics of delegation
handling.

MFC after:      2 weeks
2021-09-26 18:37:25 -07:00
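The fast path described in the nfscl_removedeleg() commit above can be illustrated with a small userland sketch. This is not the kernel code: `struct mount_state`, `add_deleg()`, `remove_deleg()` and the `lock_acquisitions` counter are hypothetical names invented for the example, and a pthread mutex stands in for the open/lock state mutex. The idea is only that a "delegation has been issued" flag is tested before the mutex is taken.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-mount client state (not the real nfscl types). */
struct mount_state {
	atomic_bool	deleg_issued;		/* set when the first delegation arrives */
	pthread_mutex_t	state_lock;		/* protects deleg_count */
	int		deleg_count;		/* stand-in for the delegation list */
	int		lock_acquisitions;	/* instrumentation for this example only */
};

static void
mount_state_init(struct mount_state *ms)
{
	atomic_init(&ms->deleg_issued, false);
	pthread_mutex_init(&ms->state_lock, NULL);
	ms->deleg_count = 0;
	ms->lock_acquisitions = 0;
}

/* Record a newly acquired delegation and set the flag. */
static void
add_deleg(struct mount_state *ms)
{
	pthread_mutex_lock(&ms->state_lock);
	ms->lock_acquisitions++;
	atomic_store(&ms->deleg_issued, true);
	ms->deleg_count++;
	pthread_mutex_unlock(&ms->state_lock);
}

/* Remove one delegation if present; returns true if one was removed. */
static bool
remove_deleg(struct mount_state *ms)
{
	bool removed = false;

	/* Fast path: no delegation was ever issued, so skip the mutex. */
	if (!atomic_load(&ms->deleg_issued))
		return (false);

	pthread_mutex_lock(&ms->state_lock);
	ms->lock_acquisitions++;
	if (ms->deleg_count > 0) {
		ms->deleg_count--;
		removed = true;
	}
	pthread_mutex_unlock(&ms->state_lock);
	return (removed);
}
```

In this sketch the flag is never cleared, so once any delegation has been seen every later call takes the mutex again; only the case where the server never issues delegations benefits.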
Gordon Bergling
90d60ca8b7 nfsclient: Fix a typo in a comment
- s/derefernce/dereference/

MFC after:	3 days
2021-09-26 15:17:00 +02:00
Mateusz Guzik
d71e1a883c fifo: support flock
This evens it up with Linux.

Original patch by:	Greg V <greg@unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D24255#565302
2021-09-25 14:58:31 +00:00
Jason A. Harmening
f9e28f9003 unionfs: lock newly-created vnodes before calling insmntque()
This fixes an insta-panic when attempting to use unionfs with
DEBUG_VFS_LOCKS.  Note that unionfs still has a long way to
go before it's generally stable or usable.

Reviewed by:	kib (prior version), markj
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D31917
2021-09-23 19:20:30 -07:00
Alan Somers
4f917847c9 fusefs: don't panic if FUSE_GETATTR fails during VOP_GETPAGES
During VOP_GETPAGES, fusefs needs to determine the file's length, which
could require a FUSE_GETATTR operation.  If that fails, it's better to
SIGBUS than panic.

MFC after:	1 week
Sponsored by:	Axcient
Reviewed by: 	markj, kib
Differential Revision: https://reviews.freebsd.org/D31994
2021-09-21 14:01:06 -06:00
Rick Macklem
ad6dc36520 nfscl: Use vfs.nfs.maxalloclen to limit Deallocate RPC RTT
Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not
allow a reply with partial completion.  As such, the only way to
limit the time the operation takes to provide a reasonable RPC RTT
is to limit the size of the allocation/deallocation in the NFSv4.2
client.

This patch uses the sysctl vfs.nfs.maxalloclen to set
the limit on the size of the Deallocate operation.
There is no way to know how long a server will take to do a
deallocate operation, but 64Mbytes results in a reasonable
RPC RTT for the slow hardware I test on.

For an 8Gbyte deallocation, the elapsed time for doing it in 64Mbyte
chunks was the same (within margin of variability) as the
elapsed time taken for a single large deallocation
operation for a FreeBSD server with a UFS file system.
2021-09-18 14:38:43 -07:00
Konstantin Belousov
197a4f29f3 buffer pager: allow get_blksize method to return error
Reported and reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31998
2021-09-17 20:29:55 +03:00
Rick Macklem
9ebe4b8c67 nfscl: Add vfs.nfs.maxalloclen to limit Allocate/Deallocate RPC RTT
Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not
allow a reply with partial completion.  As such, the only way to
limit the time the operation takes to provide a reasonable RPC RTT
is to limit the size of the allocation/deallocation in the NFSv4.2
client.

This patch adds a sysctl called vfs.nfs.maxalloclen to set
the limit on the size of the Allocate operation.
There is no way to know how long a server will take to do an
allocate operation, but 64Mbytes results in a reasonable
RPC RTT for the slow hardware I test on, so that is what
the default value for vfs.nfs.maxalloclen is set to.

For an 8Gbyte allocation, the elapsed time for doing it in 64Mbyte
chunks was the same as the elapsed time taken for a single large
allocation operation for a FreeBSD server with a UFS file system.

MFC after:	2 weeks
2021-09-15 17:29:45 -07:00
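The client-side limiting described in the vfs.nfs.maxalloclen commit above amounts to splitting one large request into bounded chunks. A minimal sketch, assuming a hypothetical callback `alloc_rpc_fn` that stands in for a single Allocate RPC; `chunked_allocate()` and `count_rpc()` are illustrative names, not functions from the NFS client.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical callback standing in for one Allocate RPC. */
typedef int (*alloc_rpc_fn)(int64_t offset, int64_t len, void *arg);

/*
 * Issue the request in pieces of at most maxalloclen bytes, stopping at
 * the first error.  Returns 0 on success or an errno value.
 */
static int
chunked_allocate(int64_t offset, int64_t len, int64_t maxalloclen,
    alloc_rpc_fn rpc, void *arg)
{
	int error = 0;

	if (offset < 0 || len <= 0 || maxalloclen <= 0 ||
	    len > INT64_MAX - offset)
		return (EINVAL);

	while (len > 0) {
		int64_t chunk = len < maxalloclen ? len : maxalloclen;

		error = rpc(offset, chunk, arg);
		if (error != 0)
			break;
		offset += chunk;
		len -= chunk;
	}
	return (error);
}

/* Example callback: counts how many RPCs would be sent. */
static int
count_rpc(int64_t offset, int64_t len, void *arg)
{
	(void)offset;
	(void)len;
	(*(int *)arg)++;
	return (0);
}
```

With a 64Mbyte limit, the 8Gbyte allocation mentioned in the commit message would be sent as 128 chunks in this sketch.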
Alexander Motin
272c4a4dc5 Allow setting NFS server scope and owner.
By default, the NFS server reports the host UUID as scope and owner
major, and zero for owner minor.  This works well for a standalone
server.  But in a CARP-based HA cluster, the values should remain
persistent across failover; otherwise some clients, such as VMware
ESXi, get confused by the change and fail to reconnect automatically.

The patch makes server scope, major owner and minor owner values
configurable via sysctls.  If not set (by default) the host UUID
value is still used.

Reviewed by:	rmacklem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31952
2021-09-14 14:18:03 -04:00
Rick Macklem
55089ef4f8 nfscl: Make vfs.nfs.maxcopyrange larger by default
As of commit 103b207536, the NFSv4.2 server will limit the size
of a Copy operation based upon a 1 second timeout.  The Linux 5.2
kernel server also limits Copy operation size to 4Mbytes.
As such, the NFSv4.2 client can attempt a large Copy without
resulting in a long RPC RTT for these servers.

This patch changes vfs.nfs.maxcopyrange to 64bits and sets
the default to the maximum possible size of SSIZE_MAX, since
a larger size makes the Copy operation more efficient and
allows for copying to complete with fewer RPCs.
The sysctl may need to be made smaller for other non-FreeBSD
NFSv4.2 servers.

MFC after:	2 weeks
2021-09-11 15:36:32 -07:00
Rick Macklem
f1c8811d2d nfsd: Fix build after commit 103b207536 for 32bit arches
MFC after:	2 weeks
2021-09-08 18:55:06 -07:00
Rick Macklem
103b207536 nfsd: Use the COPY_FILE_RANGE_TIMEO1SEC flag
Although it is not specified in the RFCs, the concept that
the NFSv4 server should reply to an RPC request within a
reasonable time is accepted practice within the NFSv4 community.

Without this patch, the NFSv4.2 server attempts to reply to
a Copy operation within 1 second by limiting the copy to
vfs.nfs.maxcopyrange bytes (default 10Mbytes). This is crude at
best, given the large variation in I/O subsystem performance.

This patch uses the COPY_FILE_RANGE_TIMEO1SEC flag added by
commit c5128c48df to limit the reply time for a Copy
operation to approximately 1 second.

MFC after:	2 weeks
2021-09-08 14:29:20 -07:00
Jason A. Harmening
312d49ef7a unionfs: style
Fix the more egregious style(9) violations in unionfs.
No functional change intended.
2021-09-01 07:55:37 -07:00
Jason A. Harmening
abe95116ba unionfs: rework pathname handling
Running stress2 unionfs tests reliably produces a namei_zone corruption
panic due to unionfs_relookup() attempting to NUL-terminate a
newly-allocated pathname buffer without first validating the buffer length.

Instead, avoid allocating new pathname buffers in unionfs entirely,
using already-provided buffers while ensuring the correct flags
are set in struct componentname to prevent freeing or manipulation
of those buffers at lower layers.

While here, also compute and store the path length once in the unionfs
node instead of constantly invoking strlen() on it.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D31728
2021-09-01 07:55:09 -07:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preparation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Rick Macklem
13914e51eb nfsd: Make loop calling VOP_ALLOCATE() iterate until done
The NFSv4.2 Deallocate operation loops on VOP_DEALLOCATE()
while progress is being made (remaining length decreasing).
This patch changes the loop on VOP_ALLOCATE() for the NFSv4.2
Allocate operation to do the same, instead of stopping after
an arbitrary 20 iterations.

MFC after:	2 weeks
2021-08-29 16:46:27 -07:00
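The "iterate while progress is being made" loop described in the VOP_ALLOCATE() commit above can be sketched as follows. The callback type `alloc_step_fn` is a hypothetical stand-in for one call that advances the offset and shrinks the remaining length; returning EIO when progress stalls is an assumption made for this example, not a statement about the error the server code returns.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical step: advances *offset and reduces *len by the amount done. */
typedef int (*alloc_step_fn)(int64_t *offset, int64_t *len, void *arg);

/*
 * Call step() until the whole range is done, an error occurs, or a call
 * makes no progress.  There is no fixed cap on the number of iterations.
 */
static int
allocate_until_done(int64_t offset, int64_t len, alloc_step_fn step, void *arg)
{
	int64_t olen;
	int error;

	do {
		olen = len;
		error = step(&offset, &len, arg);
	} while (error == 0 && len > 0 && len < olen);

	/* Assumption for this sketch: a stalled loop is reported as EIO. */
	if (error == 0 && len > 0)
		error = EIO;
	return (error);
}

/* Example step that completes at most max_step bytes per call. */
struct step_state {
	int64_t	max_step;
	int	calls;
};

static int
bounded_step(int64_t *offset, int64_t *len, void *arg)
{
	struct step_state *st = arg;
	int64_t done = *len < st->max_step ? *len : st->max_step;

	st->calls++;
	*offset += done;
	*len -= done;
	return (0);
}
```

A 100-byte range handled 3 bytes at a time needs 34 calls here, more than a fixed 20-iteration cap would have allowed.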
Rick Macklem
08b9cc316a nfscl: Add a VOP_DEALLOCATE() for the NFSv4.2 client
This patch adds a VOP_DEALLOCATE() to the NFS client.
For NFSv4.2 servers that support the Deallocate operation,
it is used. Otherwise, it falls back on calling
vop_stddeallocate().

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31640
2021-08-27 18:31:36 -07:00
Konstantin Belousov
85fb840ebf msdosfs: drop now unused DE_RENAME
Submitted by:	trasz
Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31464
2021-08-27 18:39:45 +03:00
Konstantin Belousov
6ae13c0feb msdosfs: add doscheckpath lock
Similar to the UFS revision 8df4bc48c8

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31464
2021-08-27 18:39:45 +03:00
Konstantin Belousov
95d42526e9 msdosfs: fix rename
Use the same locking algorithm for msdosfs_rename() as used by ufs_rename().
Convert doscheckpath() to non-sleeping version.

Reported by:	trasz
PR:	257522
In collaboration with:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31464
2021-08-27 18:39:45 +03:00
Konstantin Belousov
ae7e8a02e6 msdosfs deget(): add locking flags argument
LK_EXCLUSIVE must always be passed; some consumers also need the
ability to specify LK_NOWAIT.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31464
2021-08-27 18:39:45 +03:00
Konstantin Belousov
92d4e08827 msdosfs: unstaticise msdosfs_lookup_
and rename it to msdosfs_lookup_ino(), similarly to UFS

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31464
2021-08-27 18:39:45 +03:00
Rick Macklem
bb958dcf3d nfsd: Add support for the NFSv4.2 Deallocate operation
The recently added VOP_DEALLOCATE(9) VOP call allows
implementation of the Deallocate NFSv4.2 operation.

Since the Deallocate operation is a single succeed/fail
operation, the call to VOP_DEALLOCATE(9) loops so long
as progress is being made.  It calls maybe_yield()
between loop iterations to allow other processes
to preempt it.

Where RFC 7862 underspecifies behaviour, the code
is written to be Linux NFSv4.2 server compatible.

Reviewed by:	khng
Differential Revision:	https://reviews.freebsd.org/D31624
2021-08-26 18:14:11 -07:00
Ka Ho Ng
8d7cd10ba6 tmpfs: Implement VOP_DEALLOCATE
Implementing VOP_DEALLOCATE to allow hole-punching in the same manner as
POSIX shared memory's fspacectl(SPACECTL_DEALLOC) support.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31684
2021-08-26 05:34:54 +08:00
Ka Ho Ng
399be91098 tmpfs: Move partial page invalidation to a separate helper
The partial page invalidation code is factored out to be a separate
helper from tmpfs_reg_resize().

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31683
2021-08-26 05:34:54 +08:00
Ka Ho Ng
a48416f844 tmpfs: Fix error being cleared after commit c12118f6ce
In tmpfs_link() error was erroneously cleared in commit c12118f6ce.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
MFC with:	c12118f6ce
2021-08-25 00:35:29 +08:00
Ka Ho Ng
c12118f6ce tmpfs: Fix styles
A lot of return statements were in the wrong style before this commit.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-08-24 22:45:08 +08:00
Gordon Bergling
47f880ebeb ext2fs(5): Correct a typo in an error message
- s/talbes/tables/

MFC after:	1 week
2021-08-22 07:58:22 +02:00
Rick Macklem
06afb53bcd nfsd: Fix sanity check for NFSv4.2 Allocate operations
The NFSv4.2 Allocate operation sanity checks the aa_offset
and aa_length arguments.  Since they are assigned to variables
of type off_t (signed) it was possible for them to be negative.
It was also possible for aa_offset+aa_length to exceed OFF_MAX
when stored in lo_end, which is uint64_t.

This patch adds checks for these cases to the sanity check.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31511
2021-08-12 16:48:28 -07:00
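The two cases named in the Allocate sanity-check commit above (a value that becomes negative when stored in a signed off_t, and an offset+length sum beyond OFF_MAX) can be checked without overflow as sketched below. This assumes a 64-bit off_t, so INT64_MAX plays the role of OFF_MAX; `alloc_args_valid()` is a hypothetical helper, and any other checks the server performs are not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Validate wire-format (unsigned 64-bit) offset and length before they
 * are stored in signed 64-bit variables.
 */
static bool
alloc_args_valid(uint64_t aa_offset, uint64_t aa_length)
{
	const uint64_t off_max = INT64_MAX;	/* assumed OFF_MAX */

	/* Either value would turn negative once stored in an off_t. */
	if (aa_offset > off_max || aa_length > off_max)
		return (false);

	/* aa_offset + aa_length would exceed OFF_MAX; test without adding. */
	if (aa_length > off_max - aa_offset)
		return (false);

	return (true);
}
```

Comparing `aa_length` against `off_max - aa_offset` avoids computing the sum, which is the step that could wrap.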
Rick Macklem
3ad1e1c1ce nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2
This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2
NFS client, which can be used by nfs_lookup() so that a
subsequent Open RPC is not required.
It uses the cn_flags OPENREAD, OPENWRITE added by commit c18c74a87c.
This reduced the number of RPCs by about 15% for a kernel
build over NFS.

For now, use of Lookup+Open is only done when the "oneopenown"
mount option is used.  It may be possible for Lookup+Open to
be used for non-oneopenown NFSv4.1/4.2 mounts, but that will
require extensive further testing to determine if it works.

While here, I've added the changes to the nfscommon module
that are needed to implement the Deallocate NFSv4.2 operation.
This avoids needing another cycle of changes to the internal
KAPI between the NFS modules.

This commit has changed the internal KAPI between the NFS
modules and, as such, all need to be rebuilt from sources.
I have not bumped __FreeBSD_version, since it was bumped a
few days ago.
2021-08-11 18:49:26 -07:00
Rick Macklem
efea1bc1fd nfscl: Cache an open stateid for the "oneopenown" mount option
For NFSv4.1/4.2, if the "oneopenown" mount option is used,
there is, at most, only one open stateid for each NFS vnode.
When an open stateid for a file is acquired, set a pointer to
the open structure in the NFS vnode.  This pointer can be used to
acquire the open stateid without searching the open linked list
when the following is true:
- No delegations have been issued for the file.  Since delegations
  can outlive an NFS vnode for a file, use the global
  NFSMNTP_DELEGISSUED flag on the mount to determine this.
- No lock stateid has been issued for the file.  To determine
  this, a new NFS vnode flag called NMIGHTBELOCKED is set when a lock
  stateid is issued, which can then be tested.

When this open structure pointer can be used, it avoids the need to
acquire the NFSCLSTATELOCK() and searching the open structure list for
an open.  The NFSCLSTATELOCK() can be highly contended when there are
a lot of opens issued for the NFSv4.1/4.2 mount.

This patch only affects NFSv4.1/4.2 mounts when the "oneopenown"
mount option is used.

MFC after:	2 weeks
2021-07-28 15:48:27 -07:00
Rick Macklem
54ff3b3986 nfscl: Set correct lockowner for "oneopenown" mount option
For NFSv4.1/4.2, the client may use either an open, lock or
delegation stateid as the stateid argument for an I/O operation.
RFC 5661 defines an order of preference of delegation, then lock
and finally open stateid for the argument, although NFSv4.1/4.2
servers are expected to handle any stateid type.

For the "oneopenown" mount option, the lock owner was not being
correctly generated and, as such, the I/O operation would use an
open stateid, even when a lock stateid existed.  Although this
did not and should not affect an NFSv4.1/4.2 server's behaviour,
this patch makes the behaviour for "oneopenown" the same as when
the mount option is not specified.

Found during inspection of packet captures.  No failure during
testing against NFSv4.1/4.2 servers of the unpatched code occurred.

MFC after:	2 weeks
2021-07-28 15:23:05 -07:00
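The order of preference mentioned in the lockowner commit above (delegation, then lock, then open stateid) reduces to a simple selection. The enum and function below are hypothetical names for illustration and do not mirror the client's data structures.

```c
#include <stdbool.h>

/* Hypothetical labels for the kinds of stateid a client may hold. */
enum stateid_kind {
	STATEID_NONE,
	STATEID_OPEN,
	STATEID_LOCK,
	STATEID_DELEG
};

/* Pick the stateid for an I/O operation: delegation, then lock, then open. */
static enum stateid_kind
pick_io_stateid(bool has_deleg, bool has_lock, bool has_open)
{
	if (has_deleg)
		return (STATEID_DELEG);
	if (has_lock)
		return (STATEID_LOCK);
	if (has_open)
		return (STATEID_OPEN);
	return (STATEID_NONE);
}
```

The behaviour the commit corrects corresponds to the lock stateid not being found, so the selection fell through to the open stateid even though a lock stateid existed.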
Konstantin Belousov
4eaf9609fe nullfs: provide custom null_rename bypass
fdvp and fvp vnodes are not locked, and race with reclaim cannot be handled
by the generic bypass routine.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:48 +03:00
Konstantin Belousov
26e72728ce null_rename: some style
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:47 +03:00
Konstantin Belousov
10db189649 fifofs: fifo vnode might be relocked before VOP_OPEN() is called
Handle it in fifo_close by checking for v_fifoinfo == NULL

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:47 +03:00
Konstantin Belousov
4f21442e10 null_lookup: restore dvp lock always, not only on success
The caller of VOP_LOOKUP() passes dvp locked and expects it locked on return.
Relock of lower vnode in any case could leave upper vnode reclaimed and
unlocked.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:47 +03:00
Konstantin Belousov
d5b078163e null_bypass(): prevent losing the only reference to the lower vnode
The upper vnode reference to the lower vnode is the only reference that
keeps our pointer to the lower vnode alive. If lower vnode is relocked
during the VOP call, upper vnode might become unlocked and reclaimed,
which invalidates our reference.

Add a transient vhold around VOP call.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:47 +03:00
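The transient hold described in the null_bypass() commit above can be modelled with a toy reference count. Everything here is hypothetical: `lower_node`, `upper_node`, `hold()`, `drop()` and `bypass_op()` are invented for the sketch and only imitate the idea that an extra hold keeps the lower object valid if the upper one is reclaimed in the middle of an operation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy lower object: "destroyed" is set when the last hold goes away. */
struct lower_node {
	int	holds;
	bool	destroyed;
};

/* Toy upper object that owns one hold on its lower object. */
struct upper_node {
	struct lower_node *lower;
};

static void
hold(struct lower_node *l)
{
	l->holds++;
}

static void
drop(struct lower_node *l)
{
	if (--l->holds == 0)
		l->destroyed = true;
}

/* Reclaiming the upper object releases its hold on the lower one. */
static void
reclaim_upper(struct upper_node *u)
{
	drop(u->lower);
	u->lower = NULL;
}

/*
 * Run an operation on the lower object.  The transient hold keeps it
 * alive even if the upper object is reclaimed part-way through.
 * Returns true if the lower object was still valid after the operation.
 */
static bool
bypass_op(struct upper_node *u, bool reclaim_during_op)
{
	struct lower_node *l = u->lower;
	bool alive_after;

	hold(l);			/* transient hold around the call */
	if (reclaim_during_op)
		reclaim_upper(u);	/* simulates a reclaim during the call */
	alive_after = !l->destroyed;	/* still safe to examine l here */
	drop(l);			/* release the transient hold */
	return (alive_after);
}
```

Without the `hold()`/`drop()` pair, the simulated reclaim would drop the last hold while the operation was still using the lower object.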
Konstantin Belousov
161e9a9736 nullfs: provide custom null_advlock bypass
The advlock VOP takes the vnode unlocked, which makes the normal bypass
function racy.  Same as null_pgcache_read(), nullfs implementation needs
to take interlock and reference lower vnode under it.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:47 +03:00
Konstantin Belousov
7b7227c4a6 null_bypass(): some style
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31310
2021-07-27 19:58:47 +03:00
Jason A. Harmening
c746ed724d Allow stacked filesystems to be recursively unmounted
In certain emergency cases such as media failure or removal, UFS will
initiate a forced unmount in order to prevent dirty buffers from
accumulating against the no-longer-usable filesystem.  The presence
of a stacked filesystem such as nullfs or unionfs above the UFS mount
will prevent this forced unmount from succeeding.

This change addresses the situation by allowing stacked filesystems to
be recursively unmounted on a taskqueue thread when the MNT_RECURSE
flag is specified to dounmount().  This call will block until all upper
mounts have been removed unless the caller specifies the MNT_DEFERRED
flag to indicate the base filesystem should also be unmounted from the
taskqueue.

To achieve this, the recently-added vfs_pin_from_vp()/vfs_unpin() KPIs
have been combined with the existing 'mnt_uppers' list used by nullfs
and renamed to vfs_register_upper_from_vp()/vfs_unregister_upper().
The format of the mnt_uppers list has also been changed to accommodate
filesystems such as unionfs in which a given mount may be stacked atop
more than one lower mount.  Additionally, management of lower FS
reclaim/unlink notifications has been split into a separate list
managed by a separate set of KPIs, as registration of an upper FS no
longer implies interest in these notifications.

Reviewed by:	kib, mckusick
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D31016
2021-07-24 12:52:00 -07:00
Rick Macklem
7685f8344d nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts
For NFSv4.1/4.2, the client may set the "seqid" field of the
stateid to 0 in RPC requests.  This indicates to the server that
it should not check the "seqid" or return NFSERR_OLDSTATEID if the
"seqid" value is not up to date w.r.t. Open/Lock operations
on the stateid.  This "seqid" is incremented by the NFSv4 server
for each Open/OpenDowngrade/Lock/Locku operation done on the stateid.

Since a failure return of NFSERR_OLDSTATEID is of no use to
the client for I/O operations, it makes sense to set "seqid"
to 0 for the stateid argument for I/O operations.
This avoids server failure replies of NFSERR_OLDSTATEID,
although I am not aware of any case where this failure occurs.

This makes the FreeBSD NFSv4.1/4.2 client compatible with the
Linux NFSv4.1/4.2 client.

MFC after:	2 weeks
2021-07-19 17:35:39 -07:00
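The seqid handling described in the stateid commit above can be pictured as copying a stateid and zeroing its "seqid" field before using it as an I/O argument. The struct below follows the NFSv4 stateid layout (a 32-bit seqid followed by 12 opaque bytes); `struct stateid` and `io_stateid()` are illustrative names, not the client's own definitions.

```c
#include <stdint.h>

/* Illustrative stateid: 32-bit seqid plus 12 opaque bytes. */
struct stateid {
	uint32_t	seqid;
	uint8_t		other[12];
};

/*
 * Build the stateid argument for an I/O operation: same opaque part,
 * but seqid set to 0 so the server does not check it.
 */
static struct stateid
io_stateid(const struct stateid *current)
{
	struct stateid out = *current;

	out.seqid = 0;
	return (out);
}
```

The stored stateid is left untouched in this sketch, so operations that do need the current seqid can still use it.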
Rick Macklem
ee29e6f311 nfsd: Add sysctl to set maximum I/O size up to 1Mbyte
Since MAXPHYS now allows the FreeBSD NFS client
to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio
so that the maximum NFS server I/O size can be set up to 1Mbyte.
The Linux NFS client can also do 1Mbyte I/O operations.

The default of 128Kbytes for the maximum I/O size has
not been changed for two reasons:
- kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O
- The limited benchmarking I can do actually shows a drop in I/O rate
  when the I/O size is above 256Kbytes.
However, daveb@spectralogic.com reports seeing an increase
in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client.

Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30826
2021-07-16 15:01:03 -07:00