freebsd-nq

Author	SHA1	Message	Date
Alan Somers	a3a1ce3794	fusefs: diff reduction in fuse_kernel.h Synchronize formatting and documentation in fuse_kernel.h with upstream sources. MFC after: 2 weeks Reviewed by: pfg Differential Revision: https://reviews.freebsd.org/D32141	2021-09-26 21:57:07 -06:00
Rick Macklem	62c5be4ab4	nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg() Commit `5e5ca4c8fc` added a flag to a NFSv4 mount point that is set when the first delegation is acquired from the NFSv4 server. For a common case where delegations are not being issued by the NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for open/lock state, finds the delegation list empty, then just unlocks the mutex and returns. This patch adds a check of the flag to avoid the need to acquire the mutex for this common case. This change appears to be performance neutral for a small number of opens, but should reduce lock contention for a large number of opens for the common case where server is not issuing delegations. This commit should not affect the high level semantics of delegation handling. MFC after: 2 weeks	2021-09-26 18:37:25 -07:00
Gordon Bergling	90d60ca8b7	nfsclient: Fix a typo in a comment - s/derefernce/dereference/ MFC after: 3 days	2021-09-26 15:17:00 +02:00
Mateusz Guzik	d71e1a883c	fifo: support flock This evens it up with Linux. Original patch by: Greg V <greg@unrelenting.technology> Differential Revision: https://reviews.freebsd.org/D24255#565302	2021-09-25 14:58:31 +00:00
Jason A. Harmening	f9e28f9003	unionfs: lock newly-created vnodes before calling insmntque() This fixes an insta-panic when attempting to use unionfs with DEBUG_VFS_LOCKS. Note that unionfs still has a long way to go before it's generally stable or usable. Reviewed by: kib (prior version), markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D31917	2021-09-23 19:20:30 -07:00
Alan Somers	4f917847c9	fusefs: don't panic if FUSE_GETATTR fails durint VOP_GETPAGES During VOP_GETPAGES, fusefs needs to determine the file's length, which could require a FUSE_GETATTR operation. If that fails, it's better to SIGBUS than panic. MFC after: 1 week Sponsored by: Axcient Reviewed by: markj, kib Differential Revision: https://reviews.freebsd.org/D31994	2021-09-21 14:01:06 -06:00
Rick Macklem	ad6dc36520	nfscl: Use vfs.nfs.maxalloclen to limit Deallocate RPC RTT Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not allow a reply with partial completion. As such, the only way to limit the time the operation takes to provide a reasonable RPC RTT is to limit the size of the allocation/deallocation in the NFSv4.2 client. This patch uses the sysctl vfs.nfs.maxalloclen to set the limit on the size of the Deallocate operation. There is no way to know how long a server will take to do an deallocate operation, but 64Mbytes results in a reasonable RPC RTT for the slow hardware I test on. For an 8Gbyte deallocation, the elapsed time for doing it in 64Mbyte chunks was the same (within margin of variability) as the elapsed time taken for a single large deallocation operation for a FreeBSD server with a UFS file system.	2021-09-18 14:38:43 -07:00
Konstantin Belousov	197a4f29f3	buffer pager: allow get_blksize method to return error Reported and reviewed by: asomers Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31998	2021-09-17 20:29:55 +03:00
Rick Macklem	9ebe4b8c67	nfscl: Add vfs.nfs.maxalloclen to limit Allocate/Deallocate RPC RTT Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not allow a reply with partial completion. As such, the only way to limit the time the operation takes to provide a reasonable RPC RTT is to limit the size of the allocation/deallocation in the NFSv4.2 client. This patch adds a sysctl called vfs.nfs.maxalloclen to set the limit on the size of the Allocate operation. There is no way to know how long a server will take to do an allocate operation, but 64Mbytes results in a reasonable RPC RTT for the slow hardware I test on, so that is what the default value for vfs.nfs.maxalloclen is set to. For an 8Gbyte allocation, the elapsed time for doing it in 64Mbyte chunks was the same as the elapsed time taken for a single large allocation operation for a FreeBSD server with a UFS file system. MFC after: 2 weeks	2021-09-15 17:29:45 -07:00
Alexander Motin	272c4a4dc5	Allow setting NFS server scope and owner. By default NFS server reports as scope and owner major the host UUID value and zero for owner minor. It works good in case of standalone server. But in case of CARP-based HA cluster failover the values should remain persistent, otherwise some clients like VMware ESXi get confused by the change and fail to reconnect automatically. The patch makes server scope, major owner and minor owner values configurable via sysctls. If not set (by default) the host UUID value is still used. Reviewed by: rmacklem MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D31952	2021-09-14 14:18:03 -04:00
Rick Macklem	55089ef4f8	nfscl: Make vfs.nfs.maxcopyrange larger by default As of commit `103b207536`, the NFSv4.2 server will limit the size of a Copy operation based upon a 1 second timeout. The Linux 5.2 kernel server also limits Copy operation size to 4Mbytes. As such, the NFSv4.2 client can attempt a large Copy without resulting in a long RPC RTT for these servers. This patch changes vfs.nfs.maxcopyrange to 64bits and sets the default to the maximum possible size of SSIZE_MAX, since a larger size makes the Copy operation more efficient and allows for copying to complete with fewer RPCs. The sysctl may be need to be made smaller for other non-FreeBSD NFSv4.2 servers. MFC after: 2 weeks	2021-09-11 15:36:32 -07:00
Rick Macklem	f1c8811d2d	nfsd: Fix build after commit `103b207536` for 32bit arches MFC after: 2 weeks	2021-09-08 18:55:06 -07:00
Rick Macklem	103b207536	nfsd: Use the COPY_FILE_RANGE_TIMEO1SEC flag Although it is not specified in the RFCs, the concept that the NFSv4 server should reply to an RPC request within a reasonable time is accepted practice within the NFSv4 community. Without this patch, the NFSv4.2 server attempts to reply to a Copy operation within 1 second by limiting the copy to vfs.nfs.maxcopyrange bytes (default 10Mbytes). This is crude at best, given the large variation in I/O subsystem performance. This patch uses the COPY_FILE_RANGE_TIMEO1SEC flag added by commit `c5128c48df` to limit the reply time for a Copy operation to approximately 1 second. MFC after: 2 weeks	2021-09-08 14:29:20 -07:00
Jason A. Harmening	312d49ef7a	unionfs: style Fix the more egregious style(9) violations in unionfs. No functional change intended.	2021-09-01 07:55:37 -07:00
Jason A. Harmening	abe95116ba	unionfs: rework pathname handling Running stress2 unionfs tests reliably produces a namei_zone corruption panic due to unionfs_relookup() attempting to NUL-terminate a newly- allocate pathname buffer without first validating the buffer length. Instead, avoid allocating new pathname buffers in unionfs entirely, using already-provided buffers while ensuring the the correct flags are set in struct componentname to prevent freeing or manipulation of those buffers at lower layers. While here, also compute and store the path length once in the unionfs node instead of constantly invoking strlen() on it. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D31728	2021-09-01 07:55:09 -07:00
Andrew Turner	b792434150	Create sys/reg.h for the common code previously in machine/reg.h Move the common kernel function signatures from machine/reg.h to a new sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2). Reviewed by: imp, markj Sponsored by: DARPA, AFRL (original work) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19830	2021-08-30 12:50:53 +01:00
Rick Macklem	13914e51eb	nfsd: Make loop calling VOP_ALLOCATE() iterate until done The NFSv4.2 Deallocate operation loops on VOP_DEALLOCATE() while progress is being made (remaining length decreasing). This patch changes the loop on VOP_ALLOCATE() for the NFSv4.2 Allocate operation do the same, instead of stopping after an arbitrary 20 iterations. MFC after: 2 weeks	2021-08-29 16:46:27 -07:00
Rick Macklem	08b9cc316a	nfscl: Add a VOP_DEALLOCATE() for the NFSv4.2 client This patch adds a VOP_DEALLOCATE() to the NFS client. For NFSv4.2 servers that support the Deallocate operation, it is used. Otherwise, it falls back on calling vop_stddeallocate(). Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D31640	2021-08-27 18:31:36 -07:00
Konstantin Belousov	85fb840ebf	msdosfs: drop now unused DE_RENAME Submitted by: trasz Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31464	2021-08-27 18:39:45 +03:00
Konstantin Belousov	6ae13c0feb	msdosfs: add doscheckpath lock Similar to the UFS revision `8df4bc48c8` Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31464	2021-08-27 18:39:45 +03:00
Konstantin Belousov	95d42526e9	msdosfs: fix rename Use the same locking algorithm for msdosfs_rename() as used by ufs_rename(). Convert doscheckpath() to non-sleeping version. Reported by: trasz PR: 257522 In collaboration with: pho Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31464	2021-08-27 18:39:45 +03:00
Konstantin Belousov	ae7e8a02e6	msdosfs deget(): add locking flags argument LK_EXCLUSIVE must be passed always, some consumers need the ability to specify LK_NOWAIT Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31464	2021-08-27 18:39:45 +03:00
Konstantin Belousov	92d4e08827	msdosfs: unstaticise msdosfs_lookup_ and rename it to msdosfs_lookup_ino(), similarly to UFS Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31464	2021-08-27 18:39:45 +03:00
Rick Macklem	bb958dcf3d	nfsd: Add support for the NFSv4.2 Deallocate operation The recently added VOP_DEALLOCATE(9) VOP call allows implementation of the Deallocate NFSv4.2 operation. Since the Deallocate operation is a single succeed/fail operation, the call to VOP_DEALLOCATE(9) loops so long as progress is being made. It calls maybe_yield() between loop iterations to allow other processes to preempt it. Where RFC 7862 underspecifies behaviour, the code is written to be Linux NFSv4.2 server compatible. Reviewed by: khng Differential Revision: https://reviews.freebsd.org/D31624	2021-08-26 18:14:11 -07:00
Ka Ho Ng	8d7cd10ba6	tmpfs: Implement VOP_DEALLOCATE Implementing VOP_DEALLOCATE to allow hole-punching in the same manner as POSIX shared memory's fspacectl(SPACECTL_DEALLOC) support. Sponsored by: The FreeBSD Foundation Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D31684	2021-08-26 05:34:54 +08:00
Ka Ho Ng	399be91098	tmpfs: Move partial page invalidation to a separate helper The partial page invalidation code is factored out to be a separate helper from tmpfs_reg_resize(). Sponsored by: The FreeBSD Foundation Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D31683	2021-08-26 05:34:54 +08:00
Ka Ho Ng	a48416f844	tmpfs: Fix error being cleared after commit `c12118f6ce` In tmpfs_link() error was erroneously cleared in commit `c12118f6ce`. Sponsored by: The FreeBSD Foundation MFC after: 3 days MFC with: `c12118f6ce`	2021-08-25 00:35:29 +08:00
Ka Ho Ng	c12118f6ce	tmpfs: Fix styles A lot of return statements were in the wrong style before this commit. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2021-08-24 22:45:08 +08:00
Gordon Bergling	47f880ebeb	ext2fs(5): Correct a typo in an error message - s/talbes/tables/ MFC after: 1 week	2021-08-22 07:58:22 +02:00
Rick Macklem	06afb53bcd	nfsd: Fix sanity check for NFSv4.2 Allocate operations The NFSv4.2 Allocate operation sanity checks the aa_offset and aa_length arguments. Since they are assigned to variables of type off_t (signed) it was possible for them to be negative. It was also possible for aa_offset+aa_length to exceed OFF_MAX when stored in lo_end, which is uint64_t. This patch adds checks for these cases to the sanity check. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D31511	2021-08-12 16:48:28 -07:00
Rick Macklem	3ad1e1c1ce	nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2 This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2 NFS client, which can be used by nfs_lookup() so that a subsequent Open RPC is not required. It uses the cn_flags OPENREAD, OPENWRITE added by commit `c18c74a87c`. This reduced the number of RPCs by about 15% for a kernel build over NFS. For now, use of Lookup+Open is only done when the "oneopenown" mount option is used. It may be possible for Lookup+Open to be used for non-oneopenown NFSv4.1/4.2 mounts, but that will require extensive further testing to determine if it works. While here, I've added the changes to the nfscommon module that are needed to implement the Deallocate NFSv4.2 operation. This avoids needing another cycle of changes to the internal KAPI between the NFS modules. This commit has changed the internal KAPI between the NFS modules and, as such, all need to be rebuilt from sources. I have not bumped __FreeBSD_version, since it was bumped a few days ago.	2021-08-11 18:49:26 -07:00
Rick Macklem	efea1bc1fd	nfscl: Cache an open stateid for the "oneopenown" mount option For NFSv4.1/4.2, if the "oneopenown" mount option is used, there is, at most, only one open stateid for each NFS vnode. When an open stateid for a file is acquired, set a pointer to the open structure in the NFS vnode. This pointer can be used to acquire the open stateid without searching the open linked list when the following is true: - No delegations have been issued for the file. Since delegations can outlive an NFS vnode for a file, use the global NFSMNTP_DELEGISSUED flag on the mount to determine this. - No lock stateid has been issued for the file. To determine this, a new NFS vnode flag called NMIGHTBELOCKED is set when a lock stateid is issued, which can then be tested. When this open structure pointer can be used, it avoids the need to acquire the NFSCLSTATELOCK() and searching the open structure list for an open. The NFSCLSTATELOCK() can be highly contended when there are a lot of opens issued for the NFSv4.1/4.2 mount. This patch only affects NFSv4.1/4.2 mounts when the "oneopenown" mount option is used. MFC after: 2 weeks	2021-07-28 15:48:27 -07:00
Rick Macklem	54ff3b3986	nfscl: Set correct lockowner for "oneopenown" mount option For NFSv4.1/4.2, the client may use either an open, lock or delegation stateid as the stateid argument for an I/O operation. RFC 5661 defines an order of preference of delegation, then lock and finally open stateid for the argument, although NFSv4.1/4.2 servers are expected to handle any stateid type. For the "oneopenown" mount option, the lock owner was not being correctly generated and, as such, the I/O operation would use an open stateid, even when a lock stateid existed. Although this did not and should not affect an NFSv4.1/4.2 server's behaviour, this patch makes the behaviour for "oneopenown" the same as when the mount option is not specified. Found during inspection of packet captures. No failure during testing against NFSv4.1/4.2 servers of the unpatched code occurred. MFC after: 2 weeks	2021-07-28 15:23:05 -07:00
Konstantin Belousov	4eaf9609fe	nullfs: provide custom null_rename bypass fdvp and fvp vnodes are not locked, and race with reclaim cannot be handled by the generic bypass routine. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:48 +03:00
Konstantin Belousov	26e72728ce	null_rename: some style Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:47 +03:00
Konstantin Belousov	10db189649	fifofs: fifo vnode might be relocked before VOP_OPEN() is called Handle it in fifo_close by checking for v_fifoinfo == NULL Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:47 +03:00
Konstantin Belousov	4f21442e10	null_lookup: restore dvp lock always, not only on success Caller of VOP_LOOKUP() passes dvp locked and expect it locked on return. Relock of lower vnode in any case could leave upper vnode reclaimed and unlocked. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:47 +03:00
Konstantin Belousov	d5b078163e	null_bypass(): prevent loosing the only reference to the lower vnode The upper vnode reference to the lower vnode is the only reference that keeps our pointer to the lower vnode alive. If lower vnode is relocked during the VOP call, upper vnode might become unlocked and reclaimed, which invalidates our reference. Add a transient vhold around VOP call. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:47 +03:00
Konstantin Belousov	161e9a9736	nullfs: provide custom null_advlock bypass The advlock VOP takes the vnode unlocked, which makes the normal bypass function racy. Same as null_pgcache_read(), nullfs implementation needs to take interlock and reference lower vnode under it. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:47 +03:00
Konstantin Belousov	7b7227c4a6	null_bypass(): some style Reivewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D31310	2021-07-27 19:58:47 +03:00
Jason A. Harmening	c746ed724d	Allow stacked filesystems to be recursively unmounted In certain emergency cases such as media failure or removal, UFS will initiate a forced unmount in order to prevent dirty buffers from accumulating against the no-longer-usable filesystem. The presence of a stacked filesystem such as nullfs or unionfs above the UFS mount will prevent this forced unmount from succeeding. This change addreses the situation by allowing stacked filesystems to be recursively unmounted on a taskqueue thread when the MNT_RECURSE flag is specified to dounmount(). This call will block until all upper mounts have been removed unless the caller specifies the MNT_DEFERRED flag to indicate the base filesystem should also be unmounted from the taskqueue. To achieve this, the recently-added vfs_pin_from_vp()/vfs_unpin() KPIs have been combined with the existing 'mnt_uppers' list used by nullfs and renamed to vfs_register_upper_from_vp()/vfs_unregister_upper(). The format of the mnt_uppers list has also been changed to accommodate filesystems such as unionfs in which a given mount may be stacked atop more than one lower mount. Additionally, management of lower FS reclaim/unlink notifications has been split into a separate list managed by a separate set of KPIs, as registration of an upper FS no longer implies interest in these notifications. Reviewed by: kib, mckusick Tested by: pho Differential Revision: https://reviews.freebsd.org/D31016	2021-07-24 12:52:00 -07:00
Rick Macklem	7685f8344d	nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts For NFSv4.1/4.2, the client may set the "seqid" field of the stateid to 0 in RPC requests. This indicates to the server that it should not check the "seqid" or return NFSERR_OLDSTATEID if the "seqid" value is not up to date w.r.t. Open/Lock operations on the stateid. This "seqid" is incremented by the NFSv4 server for each Open/OpenDowngrade/Lock/Locku operation done on the stateid. Since a failure return of NFSERR_OLDSTATEID is of no use to the client for I/O operations, it makes sense to set "seqid" to 0 for the stateid argument for I/O operations. This avoids server failure replies of NFSERR_OLDSTATEID, although I am not aware of any case where this failure occurs. This makes the FreeBSD NFSv4.1/4.2 client compatible with the Linux NFSv4.1/4.2 client. MFC after: 2 weeks	2021-07-19 17:35:39 -07:00
Rick Macklem	ee29e6f311	nfsd: Add sysctl to set maximum I/O size up to 1Mbyte Since MAXPHYS now allows the FreeBSD NFS client to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio so that the maximum NFS server I/O size can be set up to 1Mbyte. The Linux NFS client can also do 1Mbyte I/O operations. The default of 128Kbytes for the maximum I/O size has not been changed for two reasons: - kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O - The limited benchmarking I can do actually shows a drop in I/O rate when the I/O size is above 256Kbytes. However, daveb@spectralogic.com reports seeing an increase in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client. Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30826	2021-07-16 15:01:03 -07:00
Mark Johnston	7a9bc8a82e	nfssvc: Zero the buffer copied out when NFSSVC_DUMPMNTOPTS is set Reported by: KMSAN MFC after: 1 week Sponsored by: The FreeBSD Foundation	2021-07-15 22:41:10 -04:00
Mark Johnston	44de1834b5	nfsclient: Avoid copying uninitialized bytes into statfs hst will be nul-terminated but the remaining space in the buffer is left uninitialized. Avoid copying the entire buffer to ensure that uninitialized bytes are not leaked via statfs(2). Reported by: KMSAN Reviewed by: rmacklem MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31167	2021-07-15 12:18:17 -04:00
Rick Macklem	7f5508fe78	nfscl: Avoid KASSERT() panic in cache_enter_time() Commit `844aa31c6d` added cache_enter_time_flags(), specifically so that the NFS client could specify that cache enter replace any stale entry for the same name. Doing so avoids a KASSERT() panic() in cache_enter_time(), as reported by the PR. This patch uses cache_enter_time_flags() for Readdirplus, to avoid the panic(), since it is impossible for the NFS client to know if another client (or a local process on the NFS server) has replaced a file with another file of the same name. This patch only affects NFS mounts that use the "rdirplus" mount option. There may be other places in the NFS client where this needs to be done, but no panic() has been observed during testing. PR: 257043 MFC after: 2 weeks	2021-07-14 13:33:37 -07:00
Mark Johnston	b9ca419a21	fifo: Explicitly initialize generation numbers when opening The fi_rgen and fi_wgen fields are generation numbers used when sleeping waiting for the other end of the fifo to be opened. The fields were not explicitly initialized after allocation, but this was harmless. To avoid false positives from KMSAN, though, ensure that they get initialized to zero. Reported by: KMSAN MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2021-07-13 17:45:49 -04:00
Rick Macklem	1e0a518d65	nfscl: Add a Linux compatible "nconnect" mount option Linux has had an "nconnect" NFS mount option for some time. It specifies that N (up to 16) TCP connections are to created for a mount, instead of just one TCP connection. A discussion on freebsd-net@ indicated that this could improve client<-->server network bandwidth, if either the client or server have one of the following: - multiple network ports aggregated to-gether with lagg/lacp. - a fast NIC that is using multiple queues It does result in using more IP port#s and might increase server peak load for a client. One difference from the Linux implementation is that this implementation uses the first TCP connection for all RPCs composed of small messages and uses the additional TCP connections for RPCs that normally have large messages (Read/Readdir/Write). The Linux implementation spreads all RPCs across all TCP connections in a round robin fashion, whereas this implementation spreads Read/Readdir/Write across the additional TCP connections in a round robin fashion. Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30970	2021-07-08 17:39:04 -07:00
Rick Macklem	c5f4772c66	nfscl: Improve "Consider increasing kern.ipc.maxsockbuf" message When the setting of kern.ipc.maxsockbuf is less than what is desired for I/O based on vfs.maxbcachebuf and vfs.nfs.bufpackets, a console message of "Consider increasing kern.ipc.maxsockbuf". is printed. This patch modifies the message to provide a suggested value for kern.ipc.maxsockbuf. Note that the setting is only needed when the NFS rsize/wsize is set to vfs.maxbcachebuf. While here, make nfs_bufpackets global, so that it can be used by a future patch that adds a sysctl to set the NFS server's maximum I/O size. Also, remove "sizeof(u_int32_t)" from the maximum packet length, since NFS_MAXXDR is already an "overestimate" of the actual length. MFC after: 2 weeks	2021-06-30 15:15:41 -07:00
Jason A. Harmening	372691a7ae	unionfs: release parent vnodes in deferred context Each unionfs node holds a reference to its parent directory vnode. A single open file reference can therefore end up keeping an arbitrarily deep vnode hierarchy in place. When that reference is released, the resulting VOP_RECLAIM call chain can then exhaust the kernel stack. This is easily reproducible by running the unionfs.sh stress2 test. Fix it by deferring recursive unionfs vnode release to taskqueue context. PR: 238883 Reviewed By: kib (earlier version), markj Differential Revision: https://reviews.freebsd.org/D30748	2021-06-29 06:02:01 -07:00
Alan Somers	18b19f8c6e	fusefs: correctly set lock owner during FUSE_SETLK During FUSE_SETLK, the owner field should uniquely identify the calling process. The fusefs module now sets it to the process's pid. Previously, it expected the calling process to set it directly, which was wrong. libfuse also apparently expects the owner field to be set during FUSE_GETLK, though I'm not sure why. PR: 256005 Reported by: Agata <chogata@moosefs.pro> MFC after: 2 weeks Reviewed by: pfg Differential Revision: https://reviews.freebsd.org/D30622	2021-06-25 20:40:08 -06:00
Rick Macklem	a145cf3f73	nfscl: Change the default minor version for NFSv4 mounts When NFSv4.1 support was added to the client, the implementation was still experimental and, as such, the default minor version was set to 0. Since the NFSv4.1 client implementation is now believed to be solid and the NFSv4.1/4.2 protocol is significantly better than NFSv4.0, I beieve that NFSv4.1/4.2 should be used where possible. This patch changes the default minor version for NFSv4 to be the highest minor version supported by the NFSv4 server. If a specific minor version is desired, the "minorversion" mount option can be used to override this default. This is compatible with the Linux NFSv4 client behaviour. This was discussed on freebsd-current@ in mid-May 2021 under the subject "changing the default NFSv4 minor version" and the consensus seemed to be support for this change. It also appeared that changing this for FreeBSD 13.1 was not considered a POLA violation, so long as UPDATING and RELNOTES entries were made for it. MFC after: 2 weeks	2021-06-24 18:52:23 -07:00
Konstantin Belousov	190110f2eb	unionfs: do not use bare struct componentname Allocate nameidata on stack and NDPREINIT() it, for compatibility with assumptions from other filesystems' lookup code. Reviewed by: mckusick Discussed with: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D30041	2021-06-23 23:46:15 +03:00
Alan Somers	5403f2c163	fusefs: ensure that FUSE ops' headers' unique values are actually unique Every FUSE operation has a unique value in its header. As the name implies, these values are supposed to be unique among all outstanding operations. And since FUSE_INTERRUPT is asynchronous and racy, it is desirable that the unique values be unique among all operations that are "close in time". Ensure that they are actually unique by incrementing them whenever we reuse a fuse_dispatcher object, for example during fsync, write, and listextattr. PR: 244686 MFC after: 2 weeks Reviewed by: pfg Differential Revision: https://reviews.freebsd.org/D30810	2021-06-19 14:45:29 -06:00
Alan Somers	b97c7abc1a	fusefs: delete dead code It was always dead, accidentally included in SVN r345876. MFC after: 2 weeks Reviewed by: pfg	2021-06-19 14:45:04 -06:00
gAlfonso-bit	9b876fbd50	Simplify fuse_device_filt_write It always returns 1, so why bother having a variable. MFC after: 2 weeks MFC with: `7b8622fa22` Pull Request: https://github.com/freebsd/freebsd-src/pull/478	2021-06-16 15:54:24 -06:00
Alan Somers	7b8622fa22	fusefs: support EVFILT_WRITE on /dev/fuse /dev/fuse is always ready for writing, so it's kind of dumb to poll it. But some applications do it anyway. Better to return ready than EINVAL. MFC after: 2 weeks Reviewed by: emaste, pfg Differential Revision: https://reviews.freebsd.org/D30784	2021-06-16 13:34:14 -06:00
Alan Somers	0b9a5c6fa1	fusefs: improve warnings about buggy FUSE servers The fusefs driver will print warning messages about FUSE servers that commit protocol violations. Previously it would print those warnings on every violation, but that could spam the console. Now it will print each warning no more than once per lifetime of the mount. There is also now a dtrace probe for each violation. MFC after: 2 weeks Sponsored by: Axcient Reviewed by: emaste, pfg Differential Revision: https://reviews.freebsd.org/D30780	2021-06-16 13:31:31 -06:00
Rick Macklem	aed98fa5ac	nfscl: Make NFSv4.0 client acquisition NFSv4.1/4.2 compatible When the NFSv4.0 client was implemented, acquisition of a clientid via SetClientID/SetClientIDConfirm was done upon the first Open, since that was when it was needed. NFSv4.1/4.2 acquires the clientid during mount (via ExchangeID/CreateSession), since the associated session is required during mount. This patch modifies the NFSv4.0 mount so that it acquires the clientid during mount. This simplifies the code and makes it easy to implement "find the highest minor version supported by the NFSv4 server", which will be done for the default minorversion in a future commit. The "start_renewthread" argument for nfscl_getcl() is replaced by "tryminvers", which will be used by the aforementioned future commit. MFC after: 2 weeks	2021-06-15 17:48:51 -07:00
Alan Somers	d63e6bc256	fusefs: delete dead code Delete two fields in the per-mountpoint struct that have never been used. MFC after: 2 weeks Sponsored by: Axcient	2021-06-15 13:34:01 -06:00
Rick Macklem	e1a907a25c	krpc: Acquire ref count of CLIENT for backchannel use Michael Dexter <editor@callfortesting.org> reported a crash in FreeNAS, where the first argument to clnt_bck_svccall() was no longer valid. This argument is a pointer to the callback CLIENT structure, which is free'd when the associated NFSv4 ClientID is free'd. This appears to have occurred because a callback reply was still in the socket receive queue when the CLIENT structure was free'd. This patch acquires a reference count on the CLIENT that is not CLNT_RELEASE()'d until the socket structure is destroyed. This should guarantee that the CLIENT structure is still valid when clnt_bck_svccall() is called. It also adds a check for closed or closing to clnt_bck_svccall() so that it will not process the callback RPC reply message after the ClientID is free'd. Comments by: mav MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30153	2021-06-11 16:57:14 -07:00
Rick Macklem	5e5ca4c8fc	nfscl: Add a "has acquired a delegation" flag for delegations A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. For a common case where delegations are not being issued by the NFSv4 server, the code acquires the mutex lock for open/lock state, finds the delegation list empty and just unlocks the mutex and returns. This patch adds an NFS mount point flag that is set when a delegation is issued for the mount. Then the patched code checks for this flag before acquiring the open/lock mutex, avoiding the need to acquire the lock for the case where delegations are not being issued by the NFSv4 server. This change appears to be performance neutral for a small number of opens, but should reduce lock contention for a large number of opens for the common case where server is not issuing delegations. This commit should not affect the high level semantics of delegation handling. MFC after: 2 weeks	2021-06-09 08:00:43 -07:00
Rick Macklem	03c81af249	nfscl: Fix generation of va_fsid for a tree of NFSv4 server file systems Pre-r318997 the code looked like: if (vp->v_mount->mnt_stat.f_fsid.val[0] != (uint32_t)np->n_vattr.na_filesid[0]) vap->va_fsid = (uint32_t)np->n_vattr.na_filesid[0]; Doing this assignment got lost by r318997 and, as such, NFSv4 mounts of servers with trees of file systems on the server is broken, due to duplicate fileno values for the same st_dev/va_fsid. Although I could have re-introduced the assignment, since the value of na_filesid[0] is not guaranteed to be unique across the server file systems, I felt it was better to always do the hash for na_filesid[0,1]. Since dev_t (st_dev/va_fsid) is now 64bits, I switched to a 64bit hash. There is a slight chance of a hash conflict where 2 different na_filesid values map to same va_fsid, which will be documented in the BUGS section of the man page for mount_nfs(8). Using a table to keep track of mappings to catch conflicts would not easily scale to 10,000+ server file systems and, when the conflict occurs, it only results in fts(3) reporting a "directory cycle" under certain circumstances. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30660	2021-06-07 13:48:25 -07:00
Jason A. Harmening	59409cb90f	Add a generic mechanism for preventing forced unmount This is aimed at preventing stacked filesystems like nullfs and unionfs from "losing" their lower mounts due to forced unmount. Otherwise, VFS operations that are passed through to the lower filesystem(s) may crash or otherwise cause unpredictable behavior. Introduce two new functions: vfs_pin_from_vp() and vfs_unpin(). which are intended to be called on the lower mount(s) when the stacked filesystem is mounted and unmounted, respectively. Much as registration in the mnt_uppers list previously did, pinning will prevent even forced unmount of the lower FS and will allow the stacked FS to freely operate on the lower mount either by direct use of the struct mount* or indirect use through a properly-referenced vnode's v_mount field. vfs_pin_from_vp() is modeled after vfs_ref_from_vp() in that it uses the mount interlock coupled with re-checking vp->v_mount to ensure that it will fail in the face of a pending unmount request, even if the concurrent unmount fully completes. Adopt these new functions in both nullfs and unionfs. Reviewed By: kib, markj Differential Revision: https://reviews.freebsd.org/D30401	2021-06-05 18:20:36 -07:00
Rick Macklem	a5df139ec6	nfsd: Fix when NFSERR_WRONGSEC may be replied to NFSv4 clients Commit `d224f05fcf` pre-parsed the next operation number for the put file handle operations. This patch uses this next operation number, plus the type of the file handle being set by the put file handle operation, to implement the rules in RFC5661 Sec. 2.6 with respect to replying NFSERR_WRONGSEC. This patch also adds a check to see if NFSERR_WRONGSEC should be replied when about to perform Lookup, Lookupp or Open with a file name component, so that the NFSERR_WRONGSEC reply is done for these operations, as required by RFC5661 Sec. 2.6. This patch does not have any practical effect for the FreeBSD NFSv4 client and I believe that the same is true for the Linux client, since NFSERR_WRONGSEC is considered a fatal error at this time. MFC after: 2 weeks	2021-06-05 16:53:07 -07:00
Rick Macklem	56e9d8e38e	nfsd: Fix NFSv4.1/4.2 Secinfo_no_name when security flavors empty Commit `947bd2479b` added support for the Secinfo_no_name operation. When a non-exported file system is being traversed, the list of security flavors is empty. It turns out that the Linux client mount attempt fails when the security flavors list in the Secinfo_no_name reply is empty. This patch modifies Secinfo/Secinfo_no_name so that it replies with all four security flavors when the list is empty. This fixes Linux NFSv4.1/4.2 mounts when the file system at the NFSv4 root (as specified on a V4: exports(5) line) is not exported. MFC after: 2 weeks	2021-06-04 20:31:20 -07:00
Rick Macklem	d224f05fcf	nfsd: Pre-parse the next NFSv4 operation number for put FH operations RFC5661 Sec. 2.6 specifies when a NFSERR_WRONGSEC error reply can be done. For the four operations PutFH, PutrootFH, PutpublicFH and RestoreFH, NFSERR_WRONGSEC can or cannot be replied, depending upon what operation follows one of these operations in the compound. This patch modifies nfsrvd_compound() so that it parses the next operation number before executing any of the above four operations, storing it in "nextop". A future commit will implement use of "nextop" to decide if NFSERR_WRONGSEC can be replied for the above four operations. This commit should not change the semantics of performing the compound RPC. MFC after: 2 weeks	2021-06-03 20:48:26 -07:00
Konstantin Belousov	f9b1e711f0	fdescfs: add an option to return underlying file vnode on lookup The 'nodup' option forces fdescfs to return real vnode behind file descriptor instead of the fdescfs fd vnode, on lookup. The end result is that e.g. stat("/dev/fd/3") returns the stat data for the underlying vnode, if any. Similarly, fchdir(2) works in the expected way. For open(2), if applied over file descriptor opened with O_PATH, it effectively re-open that vnode into normal file descriptor which has the specified access mode, assuming the current vnode permissions allow it. If the file descriptor does not reference vnode, the behavior is unchanged. This is done by a mount option, because permission check on open(2) breaks established fdescfs open semantic of dup(2)-ing the descriptor. So it is not suitable for /dev/fd mount. Tested by: Andrew Walker <awalker@ixsystems.com> Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30140	2021-06-04 03:30:12 +03:00
Rick Macklem	984c71f903	nfsd: Fix the failure return for non-fh NFSv4 operations Without this patch, nfsd_checkrootexp() returns failure and then the NFSv4 operation would reply NFSERR_WRONGSEC. RFC5661 Sec. 2.6 only allows a few NFSv4 operations, none of which call nfsv4_checktootexp(), to return NFSERR_WRONGSEC. This patch modifies nfsd_checkrootexp() to return the error instead of a boolean and sets the returned error to an RPC layer AUTH_ERR, as discussed on nfsv4@ietf.org. The patch also fixes nfsd_errmap() so that the pseudo error NFSERR_AUTHERR is handled correctly such that an RPC layer AUTH_ERR is replied to the NFSv4 client. The two new "enum auth_stat" values have not yet been assigned by IANA, but are the expected next two values. The effect on extant NFSv4 clients of this change appears limited to reporting a different failure error when a mount that does not use adequate security is attempted. MFC after: 2 weeks	2021-06-02 15:28:07 -07:00
Rick Macklem	1d4afcaca2	nfsd: Delete extraneous NFSv4 root checks There are several NFSv4.1/4.2 server operation functions which have unneeded checks for the NFSv4 root being set up. The checks are not needed because the operations always follow a Sequence operation, which performs the check. This patch deletes these checks, simplifying the code so that a future patch that fixes the checks to conform with RFC5661 Sec. 2.6 will be less extension. MFC after: 2 weeks	2021-05-31 19:41:17 -07:00
Mateusz Guzik	f4aa64528e	tmpfs: save on relocking the allnode lock in tmpfs_free_node_locked	2021-05-31 23:21:15 +00:00
Mateusz Guzik	68c2544264	nfs: even up value returned by nfsrv_parsename with copyinstr Reported by: dim Reviewed by: rmacklem	2021-05-31 16:32:04 +00:00
Rick Macklem	947bd2479b	nfsd: Add support for the NFSv4.1/4.2 Secinfo_no_name operation The Linux client is now attempting to use the Secinfo_no_name operation for NFSv4.1/4.2 mounts. Although it does not seem to mind the NFSERR_NOTSUPP reply, adding support for it seems reasonable. I also noticed that "savflag" needed to be 64bits in nfsrvd_secinfo() since nd_flag in now 64bits, so I changed the declaration of it there. I also added code to set "vp" NULL after performing Secinfo/Secinfo_no_name, since these operations consume the current FH, which is represented by "vp" in nfsrvd_compound(). Fixing when the server replies NFSERR_WRONGSEC so that it conforms to RFC5661 Sec. 2.6 still needs to be done in a future commit. MFC after: 2 weeks	2021-05-30 17:52:43 -07:00
Jason A. Harmening	a4b07a2701	VFS_QUOTACTL(9): allow implementation to indicate busy state changes Instead of requiring all implementations of vfs_quotactl to unbusy the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param to VFS_QUOTACTL(9). The implementation may then indicate to the caller whether it needed to unbusy the mount. Also, add stbool.h to libprocstat modules which #define _KERNEL before including sys/mount.h. Otherwise they'll pull in sys/types.h before defining _KERNEL and therefore won't have the bool definition they need for mp_busy. Reviewed By: kib, markj Differential Revision: https://reviews.freebsd.org/D30556	2021-05-30 14:53:47 -07:00
Mateusz Guzik	284cf3f18b	ext2: add missing uio_td initialization to ext2_htree_append_block Reported by: pho	2021-05-30 17:19:31 +00:00
Jason A. Harmening	271fcf1c28	Revert commits `6d3e78ad6c` and `54256e7954` Parts of libprocstat like to pretend they're kernel components for the sake of including mount.h, and including sys/types.h in the _KERNEL case doesn't fix the build for some reason. Revert both the VFS_QUOTACTL() change and the follow-up "fix" for now.	2021-05-29 17:48:02 -07:00
Mateusz Guzik	331a7601c9	tmpfs: save on common case relocking in tmpfs_reclaim	2021-05-29 22:04:10 +00:00
Mateusz Guzik	439d942b9e	tmpfs: drop a redundant NULL check in tmpfs_alloc_vp	2021-05-29 22:04:10 +00:00
Mateusz Guzik	7fbeaf33b8	tmpfs: drop useless parent locking from tmpfs_dir_getdotdotdent The id field is immutable until the node gets freed.	2021-05-29 22:04:10 +00:00
Jason A. Harmening	6d3e78ad6c	VFS_QUOTACTL(9): allow implementation to indicate busy state changes Instead of requiring all implementations of vfs_quotactl to unbusy the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param to VFS_QUOTACTL(9). The implementation may then indicate to the caller whether it needed to unbusy the mount. Reviewed By: kib, markj Differential Revision: https://reviews.freebsd.org/D30218	2021-05-29 14:05:39 -07:00
Rick Macklem	96b40b8967	nfscl: Use hash lists to improve expected search performance for opens A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. When searching for an open, usually for a match by file handle, a linear search of all opens is done. Commit `3f7e14ad93` added a hash table of lists hashed on file handle for the opens. This patch uses the hash lists for searching for a matching open based of file handle instead of an exhaustive linear search of all opens. This change appears to be performance neutral for a small number of opens, but should improve expected performance for a large number of opens. This commit should not affect the high level semantics of open handling. MFC after: 2 weeks	2021-05-27 19:08:36 -07:00
Rick Macklem	724072ab1d	nfscl: Use hash lists to improve expected search performance for opens A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. When searching for an open, usually for a match by file handle, a linear search of all opens is done. Commit `3f7e14ad93` added a hash table of lists hashed on file handle for the opens. This patch uses the hash lists for searching for a matching open based of file handle instead of an exhaustive linear search of all opens. This change appears to be performance neutral for a small number of opens, but should improve expected performance for a large number of opens. This patch also moves any found match to the front of the hash list, to try and maintain the hash lists in recently used ordering (least recently used at the end of the list). This commit should not affect the high level semantics of open handling. MFC after: 2 weeks	2021-05-25 14:19:29 -07:00
Rick Macklem	3f7e14ad93	nfscl: Add hash lists for the NFSv4 opens A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. When searching for an open, usually for a match by file handle, a linear search of all opens is done. This patch adds a table of hash lists for the opens, hashed on file handle. This table will be used by future commits to search for an open based on file handle more efficiently. MFC after: 2 weeks	2021-05-22 14:53:56 -07:00
Konstantin Belousov	f784da883f	Move mnt_maxsymlinklen into appropriate fs mount data structures Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week X-MFC-Note: struct mount layout Differential revision: https://reviews.freebsd.org/D30325	2021-05-22 15:16:09 +03:00
Konstantin Belousov	42881526d4	nullfs: dirty v_object must imply the need for inactivation Otherwise pages are cleaned some time later when the lower fs decides that it is time to do it. This mostly manifests itself as delayed mtime update, e.g. breaking make-like programs. Reported by: mav Tested by: mav, pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-05-22 12:30:17 +03:00
Rick Macklem	d80a903a1c	nfsd: Add support for CLAIM_DELEG_PREV_FH to the NFSv4.1/4.2 Open Commit `b3d4c70dc6` added support for CLAIM_DELEG_CUR_FH to Open. While doing this, I noticed that CLAIM_DELEG_PREV_FH support could be added the same way. Although I am not aware of any extant NFSv4.1/4.2 client that uses this claim type, it seems prudent to add support for this variant of Open to the NFSv4.1/4.2 server. This patch does not affect mounts from extant NFSv4.1/4.2 clients, as far as I know. MFC after: 2 weeks	2021-05-20 18:37:40 -07:00
Rick Macklem	c28cb257dd	nfscl: Fix NFSv4.1/4.2 mount recovery from an expired lease The most difficult NFSv4 client recovery case happens when the lease has expired on the server. For NFSv4.0, the client will receive a NFSERR_EXPIRED reply from the server to indicate this has happened. For NFSv4.1/4.2, most RPCs have a Sequence operation and, as such, the client will receive a NFSERR_BADSESSION reply when the lease has expired for these RPCs. The client will then call nfscl_recover() to handle the NFSERR_BADSESSION reply. However, for the expired lease case, the first reclaim Open will fail with NFSERR_NOGRACE. This patch recognizes this case and calls nfscl_expireclient() to handle the recovery from an expired lease. This patch only affects NFSv4.1/4.2 mounts when the lease expires on the server, due to a network partitioning that exceeds the lease duration or similar. MFC after: 2 weeks	2021-05-19 14:52:56 -07:00
Mateusz Guzik	4fe925b81e	fdescfs: allow shared locking of root vnode Eliminates fdescfs from lock profile when running poudriere.	2021-05-19 17:58:54 +00:00
Mateusz Guzik	43999a5cba	pseudofs: use vget_prep + vget_finish instead of vget + the interlock	2021-05-19 17:58:42 +00:00
Rick Macklem	fc0dc94029	nfsd: Reduce the callback timeout to 800msec Recent discussion on the nfsv4@ietf.org mailing list confirmed that an NFSv4 server should reply to an RPC in less than 1second. If an NFSv4 RPC requires a delegation be recalled, the server will attempt a CB_RECALL callback. If the client is not responsive, the RPC reply will be delayed until the callback times out. Without this patch, the timeout is set to 4 seconds (set in ticks, but used as seconds), resulting in the RPC reply taking over 4sec. This patch redefines the constant as being in milliseconds and it implements that for a value of 800msec, to ensure the RPC reply is sent in less than 1second. This patch only affects mounts from clients when delegations are enabled on the server and the client is unresponsive to callbacks. MFC after: 2 weeks	2021-05-18 16:17:58 -07:00
Rick Macklem	b3d4c70dc6	nfsd: Add support for CLAIM_DELEG_CUR_FH to the NFSv4.1/4.2 Open The Linux NFSv4.1/4.2 client now uses the CLAIM_DELEG_CUR_FH variant of the Open operation when delegations are recalled and the client has a local open of the file. This patch adds support for this variant of Open to the NFSv4.1/4.2 server. This patch only affects mounts from Linux clients when delegations are enabled on the server. MFC after: 2 weeks	2021-05-18 15:53:54 -07:00
Rick Macklem	46269d66ed	NFSv4 server: Re-establish the delegation recall timeout Commit `7a606f280a` allowed the server to do retries of CB_RECALL callbacks every couple of seconds. This was needed to allow the Linux client to re-establish the back channel. However this patch broke the delegation timeout check, such that it would just keep retrying CB_RECALLS. If the client has crashed or been network patitioned from the server, this continues until the client TCP reconnects to the server and re-establishes the back channel. This patch modifies the code such that it still times out the delegation recall after some minutes, so that the server will allow the conflicting client request once the delegation times out. This patch only affects the NFSv4 server when delegations are enabled and a NFSv4 client that holds a delegation has crashed or been network partitioned from the server for at least several minutes when a delegation needs to be recalled. MFC after: 2 weeks	2021-05-16 16:40:01 -07:00
Mateusz Guzik	eec2e4ef7f	tmpfs: reimplement the mtime scan to use the lazy list Tested by: pho Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D30065	2021-05-15 20:48:45 +00:00
Mateusz Guzik	128e25842e	vm: add another pager private flag Move OBJ_SHADOWLIST around to let pager flags be next to each other. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D30258	2021-05-15 20:47:29 +00:00
Konstantin Belousov	28bc23ab92	tmpfs: dynamically register tmpfs pager Remove OBJT_SWAP_TMPFS. Move tmpfs-specific swap pager bits into tmpfs_subr.c. There is no longer any code to directly support tmpfs in sys/vm, most tmpfs knowledge is shared by non-anon swap object type implementation. The tmpfs-specific methods are provided by registered tmpfs pager, which inherits from the swap pager. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30168	2021-05-13 20:13:34 +03:00
Konstantin Belousov	8b99833ac2	procfs_map: switch to use vm_object_kvme_type to get object type, and stop enumerating OBJT_XXX constants. This also provides properly a pointer for the vnode, if object backs any. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30168	2021-05-13 20:10:35 +03:00
Rick Macklem	cb07628d9e	nfscl: Delete unneeded redundant MODULE_DEPEND() calls There are two module declarations in the nfscl.ko module for "nfscl" and "nfs". Both of these declarations had MODULE_DEPEND() calls. This patch deletes the MODULE_DEPEND() calls for "nfs" to avoid confusion with respect to what modules this module is dependent upon. The patch also adds comments explaining why there are two module declarations within the module. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D30102	2021-05-10 17:34:29 -07:00
Fedor Uporov	2a984c2b49	Make encode/decode extra time functions inline. Mentioned by: pfg MFC after: 2 weeks	2021-05-08 06:42:20 +03:00
Rick Macklem	dd02d9d605	nfscl: Add support for va_birthtime to NFSv4 There is a NFSv4 file attribute called TimeCreate that can be used for va_birthtime. r362175 added some support for use of TimeCreate. This patch completes support of va_birthtime by adding support for setting this attribute to the server. It also eanbles the client to acquire and set the attribute for a NFSv4 server that supports the attribute. Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30156	2021-05-07 17:30:56 -07:00
Konstantin Belousov	4b8365d752	Add OBJT_SWAP_TMPFS pager This is OBJT_SWAP pager, specialized for tmpfs. Right now, both swap pager and generic vm code have to explicitly handle swap objects which are tmpfs vnode v_object, in the special ways. Replace (almost) all such places with proper methods. Since VM still needs a notion of the 'swap object', regardless of its use, add yet another type-classification flag OBJ_SWAP. Set it in vm_object_allocate() where other type-class flags are set. This change almost completely eliminates the knowledge of tmpfs from VM, and opens a way to make OBJT_SWAP_TMPFS loadable from tmpfs.ko. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30070	2021-05-07 17:08:03 +03:00
Fedor Uporov	c40a160fd0	Make inode extra time fields updating logic more closer to linux. Found using pjdfstest: pjdfstest/tests/utimensat/09.t Reviewed by: pfg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29933	2021-05-07 10:46:55 +03:00
Fedor Uporov	b3f4665639	Invalidate inode extents cache on truncation. It is needed to invalidate cache in case of inode space removal to avoid situation, when extents cache returns not exist extent. Reviewed by: pfg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29931	2021-05-07 10:27:37 +03:00
Fedor Uporov	5679656e09	Improve extents verification logic. It is possible to walk thru inode extents if EXT2FS_PRINT_EXTENTS macro is defined. The extents headers magics and physical blocks ranges are checked during extents walk. Reviewed by: pfg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29932	2021-05-07 10:27:28 +03:00
Fedor Uporov	1ed5f62d61	Add chr/blk devices support. The dev field is placed into the inode structure. The major/minor numbers conversion to/from linux compatile format happen during on-disk inodes writing/reading. Reviewed by: pfg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29930	2021-05-07 10:08:31 +03:00
Fedor Uporov	1484574843	Fix inode birthtime updating logic. The birthtime field of struct vattr does not checked for VNOVAL in case of ext2_setattr() and produce incorrect inode birthtime values. Found using pjdfstest: pjdfstest/tests/utimensat/03.t Reviewed by: pfg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29929	2021-05-07 10:08:20 +03:00
Mark Johnston	8bde6d15d1	nfsclient: Copy only initialized fields in nfs_getattr() When loading attributes from the cache, the NFS client is careful to copy only the fields that it initialized. After fetching attributes from the server, however, it would copy the entire vattr structure initialized from the RPC response, so uninitialized stack bytes would end up being copied to userspace. In particular, va_birthtime (v2 and v3) and va_gen (v3) had this problem. Use a common subroutine to copy fields provided by the NFS client, and ensure that we provide a dummy va_gen for the v3 case. Reviewed by: rmacklem Reported by: KMSAN MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30090	2021-05-04 08:53:57 -04:00
Rick Macklem	0755df1eee	nfscl: fix typo in a comment MFC after: 2 weeks	2021-05-03 18:29:27 -07:00
Mark Johnston	243b324f96	devfs: Avoid comparison with an uninitialized var in devfs_fp_check() devvn_refthread() will initialize *devp only if it succeeds, so check for success before comparing with fp->f_data. Other devvn_refthread() callers are careful to do this. Reported by: KMSAN Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30068	2021-05-03 13:24:30 -04:00
Rick Macklem	f6fec55fe3	nfscl: add check for NULL clp and forced dismounts to nfscl_delegreturnvp() Commit `aad780464f` added a function called nfscl_delegreturnvp() to return delegations during the NFS VOP_RECLAIM(). The function erroneously assumed that nm_clp would be non-NULL. It will be NULL for NFSV4.0 mounts until a regular file is opened. It will also be NULL during vflush() in nfs_unmount() for a forced dismount. This patch adds a check for clp == NULL to fix this. Also, since it makes no sense to call nfscl_delegreturnvp() during a forced dismount, the patch adds a check for that case and does not do the call during forced dismounts. PR: 255436 Reported by: ish@amail.plala.or.jp MFC after: 2 weeks	2021-04-27 17:30:16 -07:00
Rick Macklem	f5ff282bc0	nfscl: fix the handling of NFSERR_DELAY for Open/LayoutGet RPCs For a pNFS mount, the NFSv4.1/4.2 client uses compound RPCs that have both Open and LayoutGet operations in them. If the pNFS server were tp reply NFSERR_DELAY for one of these compounds, the retry after a delay cannot be handled by newnfs_request(), since there is a reference held on the open state for the Open operation in them. Fix this by adding these RPCs to the "don't do delay here" list in newnfs_request(). This patch is only needed if the mount is using pNFS (the "pnfs" mount option) and probably only matters if the MDS server is issuing delegations as well as pNFS layouts. Found by code inspection. MFC after: 2 weeks	2021-04-26 17:48:21 -07:00
Rick Macklem	8759773148	nfsd: fix the slot sequence# when a callback fails Commit `4281bfec36` patched the server so that the callback session slot would be free'd for reuse when a callback attempt fails. However, this can often result in the sequence# for the session slot to be advanced such that the client end will reply NFSERR_SEQMISORDERED. To avoid the NFSERR_SEQMISORDERED client reply, this patch negates the sequence# advance for the case where the callback has failed. The common case is a failed back channel, where the callback cannot be sent to the client, and not advancing the sequence# is correct for this case. For the uncommon case where the client's reply to the callback is lost, not advancing the sequence# will indicate to the client that the next callback is a retry and not a new callback. But, since the FreeBSD server always sets "csa_cachethis" false in the callback sequence operation, a retry and a new callback should be handled the same way by the client, so this should not matter. Until you have this patch in your NFSv4.1/4.2 server, you should consider avoiding the use of delegations. Even with this patch, interoperation with the Linux NFSv4.1/4.2 client in kernel versions prior to 5.3 can result in frequent 15second delays if delegations are enabled. This occurs because, for kernels prior to 5.3, the Linux client does a TCP reconnect every time it sees multiple concurrent callbacks and then it takes 15seconds to recover the back channel after doing so. MFC after: 2 weeks	2021-04-26 16:24:10 -07:00
Rick Macklem	aad780464f	nfscl: return delegations in the NFS VOP_RECLAIM() After a vnode is recycled it can no longer be acquired via vfs_hash_get() and, as such, a delegation for the vnode cannot be recalled. In the unlikely event that a delegation still exists when the vnode is being recycled, return the delegation since it will no longer be recallable. Until you have this patch in your NFSv4 client, you should consider avoiding the use of delegations. MFC after: 2 weeks	2021-04-25 17:57:55 -07:00
Rick Macklem	02695ea890	nfscl: fix delegation recall when the file is not open Without this patch, if a NFSv4 server recalled a delegation when the file is not open, the renew thread would block in the NFS VOP_INACTIVE() trying to acquire the client state lock that it already holds. This patch fixes the problem by delaying the vrele() call until after the client state lock is released. This bug has been in the NFSv4 client for a long time, but since it only affects delegation when recalled due to another client opening the file, it got missed during previous testing. Until you have this patch in your client, you should avoid the use of delegations. MFC after: 2 weeks	2021-04-25 12:55:00 -07:00
Rick Macklem	4281bfec36	nfsd: fix session slot handling for failed callbacks When the NFSv4.1/4.2 server does a callback to a client on the back channel, it will use a session slot in the back channel session. If the back channel has failed, the callback will fail and, without this patch, the session slot will not be released. As more callbacks are attempted, all session slots can become busy and then the nfsd thread gets stuck waiting for a back channel session slot. This patch frees the session slot upon callback failure to avoid this problem. Without this patch, the problem can be avoided by leaving delegations disabled in the NFS server. MFC after: 2 weeks	2021-04-23 15:24:47 -07:00
Rick Macklem	78ffcb86d9	nfscommon: fix function name in comment MFC after: 2 weeks	2021-04-19 20:09:46 -07:00
Rick Macklem	5a89498d19	nfsd: fix stripe size reply for the File Layout pNFS server At a recent testing event I found out that I had misinterpreted RFC5661 where it describes the stripe size in the File Layout's nfl_util field. This patch fixes the pNFS File Layout server so that it returns the correct value to the NFSv4.1/4.2 pNFS enabled client. This affects almost no one, since pNFS server configurations are rare and the extant pNFS aware NFS clients seemed to function correctly despite the erroneous stripe size. It might be needed for correct behaviour if a recent Linux client mounts a FreeBSD pNFS server configuration that is using File Layout (non-mirrored configuration). MFC after: 2 weeks	2021-04-19 17:54:54 -07:00
Rick Macklem	34256484af	Revert "nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661" This reverts commit `9edaceca81`. It turns out that the Linux client intentionally does an NFSv4.1 RPC with only a Sequence operation in it and with "seqid + 1" for the slot. This is used to re-synchronize the slot's seqid and the client expects the NFS4ERR_SEQ_MISORDERED error reply. As such, revert the patch, so that the server remains RFC5661 compliant.	2021-04-15 14:08:40 -07:00
Konstantin Belousov	5edf7227ec	pseudofs: limit writes to 1M Noted and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29752	2021-04-14 10:23:21 +03:00
Konstantin Belousov	8cca7b7f28	nfs client: depend on xdr Since `7763814fc9` nfsrpc_setclient() uses mem_alloc() that is macro around malloc(M_RPC). M_RPC is provided by xdr.ko. Reviewed by: rmacklem Sponsored by: Mellanox Technologies/NVidia Networking MFC after: 1 week	2021-04-13 18:04:43 +03:00
Rick Macklem	9edaceca81	nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661 Recent testing of network partitioning a FreeBSD NFSv4.1 server from a Linux NFSv4.1 client identified problems with both the FreeBSD server and Linux client. Sometimes, after some Linux NFSv4.1/4.2 clients establish a new TCP connection, they will advance the sequence number for a session slot by 2 instead of 1. RFC5661 specifies that a server should reply NFS4ERR_SEQ_MISORDERED for this case. This might result in a system call error in the client and seems to disable future use of the slot by the client. Since advancing the sequence number by 2 seems harmless, allow this case if vfs.nfs.linuxseqsesshack is non-zero. Note that, if the order of RPCs is actually reversed, a subsequent RPC with a smaller sequence number value for the slot will be received. This will result in a NFS4ERR_SEQ_MISORDERED reply. This has not been observed during testing. Setting vfs.nfs.linuxseqsesshack to 0 will provide RFC5661 compliant behaviour. This fix affects the fairly rare case where a NFSv4 Linux client does a TCP reconnect and then apparently erroneously increments the sequence number for the session slot twice during the reconnect cycle. PR: 254816 MFC after: 2 weeks	2021-04-11 16:51:25 -07:00
Rick Macklem	7763814fc9	nfsv4 client: do the BindConnectionToSession as required During a recent testing event, it was reported that the NFSv4.1/4.2 server erroneously bound the back channel to a new TCP connection. RFC5661 specifies that the fore channel is implicitly bound to a new TCP connection when an RPC with Sequence (almost any of them) is done on it. For the back channel to be bound to the new TCP connection, an explicit BindConnectionToSession must be done as the first RPC on the new connection. Since new TCP connections are created by the "reconnect" layer (sys/rpc/clnt_rc.c) of the krpc, this patch adds an optional upcall done by the krpc whenever a new connection is created. The patch also adds the specific upcall function that does a BindConnectionToSession and configures the krpc to call it when required. This is necessary for correct interoperability with NFSv4.1/NFSv4.2 servers when the nfscbd daemon is running. If doing NFSv4.1/NFSv4.2 mounts without this patch, it is recommended that the nfscbd daemon not be running and that the "pnfs" mount option not be specified. PR: 254840 Comments by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29475	2021-04-11 14:34:57 -07:00
Rick Macklem	22cefe3d83	nfsd: fix replies from session cache for multiple retries Recent testing of network partitioning a FreeBSD NFSv4.1 server from a Linux NFSv4.1 client identified problems with both the FreeBSD server and Linux client. Commit `05a39c2c1c` fixed replying with the cached reply in in the session slot if same session slot sequence#. However, the code uses the reply and, as such, will fail for a subsequent retry of the RPC. A subsequent retry would be an extremely rare event, but this patch fixes this, so long as m_copym(..M_NOWAIT) does not fail, which should also be a rare event. This fix affects the exceedingly rare case where a NFSv4 client retries a non-idempotent RPC, such as a lock operation, multiple times. Note that retries only occur after the client has needed to create a new TCP connection, with a new TCP connection for each retry. MFC after: 2 weeks	2021-04-10 15:50:25 -07:00
Rick Macklem	05a39c2c1c	nfsd: fix replies from session cache for retried RPCs Recent testing of network partitioning a FreeBSD NFSv4.1 server from a Linux NFSv4.1 client identified problems with both the FreeBSD server and Linux client. The FreeBSD server failec to reply using the cached reply in the session slot when an RPC was retried on the session slot, as indicated by same slot sequence#. This patch fixes this. It should also fix a similar failure for NFSv4.0 mounts, when the sequence# in the open/lock_owner requires a reply be done from an entry locked into the DRC. This fix affects the fairly rare case where a NFSv4 client retries a non-idempotent RPC, such as a lock operation. Note that retries only occur after the client has needed to create a new TCP connection. MFC after: 2 weeks	2021-04-08 14:04:22 -07:00
Rick Macklem	7a606f280a	nfsd: make the server repeat CB_RECALL every couple of seconds Commit `01ae8969a9` stopped the NFSv4.1/4.2 server from implicitly binding the back channel to a new TCP connection so that it conforms to RFC5661, for NFSv4.1/4.2. An effect of this for the Linux NFS client is that it will do a BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN set in a sequence reply. This will fix the back channel, but the first attempt at a callback like CB_RECALL will already have failed. Without this patch, a CB_RECALL will not be retried and that can result in a 5 minute delay until the delegation times out. This patch modifies the code so that it will retry the CB_RECALL every couple of seconds, often avoiding the 5 minute delay. This is not critical for correct behaviour, but avoids the 5 minute delay for the case where the Linux client re-binds the back channel via BindConnectionToSession. MFC after: 2 weeks	2021-04-04 18:15:54 -07:00
Rick Macklem	6f2addd838	nfsd: fix BindConnectionToSession so that it clears "cb path down" Commit `01ae8969a9` stopped the NFSv4.1/4.2 server from implicitly binding the back channel to a new TCP connection so that it conforms to RFC5661, for NFSv4.1/4.2. An effect of this for the Linux NFS client is that it will do a BindConnectionToSession when it sees NFSV4SEQ_CBPATHDOWN set in a sequence reply. It will do this for every RPC reply until it no longer sees the flag. Without that patch, this will happen until the client does an Open, which will clear LCL_CBDOWN. This patch clears LCL_CBDOWN right away, so that NFSV4SEQ_CBPATHDOWN will no longer be sent to the client in Sequence replies and the Linux client will not repeat the BindConnectionToSession RPCs. This is not critical for correct behaviour, but reduces RPC overheads for cases where the Open will not be done for a while. MFC after: 2 weeks	2021-04-04 15:05:39 -07:00
Konstantin Belousov	76b1b5ce6d	nullfs: protect against user creating inconsistent state The VFS conventions is that VOP_LOOKUP() methods do not need to handle ISDOTDOT lookups for VV_ROOT vnodes (since they cannot, after all). Nullfs bypasses VOP_LOOKUP() to lower filesystem, and there, due to user actions, it is possible to get into situation where - upper vnode does not have VV_ROOT set - lower vnode is root - ISDOTDOT is requested User just needs to nullfs-mount non-root of some filesystem, and then move some directory under mount, out of mount, using lower filesystem. In this case, nullfs cannot do much, but we still should and can ensure internal kernel structures are consistent. Avoid ISDOTDOT lookup forwarding when VV_ROOT is set on lower dvp, return somewhat arbitrary ENOENT. PR: 253593 Reported by: Gregor Koscak <elogin41@gmail.com> Test by: Patrick Sullivan <sulli00777@gmail.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-04-02 15:40:25 +03:00
Rick Macklem	4e6c2a1ee9	nfsv4 client: factor loop contents out into a separate function Commit `fdc9b2d50f` replaced a couple of while loops with LIST_FOREACH() loops. This patch factors the body of that loop out into a separate function called nfscl_checkown(). This prepares the code for future changes to use a hash table of lists for open searches via file handle. This patch should not result in a semantics change. MFC after: 2 weeks	2021-04-01 15:36:37 -07:00
Rick Macklem	01ae8969a9	nfsd: do not implicitly bind the back channel for NFSv4.1/4.2 mounts The NFSv4.1 (and 4.2 on 13) server incorrectly binds a new TCP connection to the back channel when first used by an RPC with a Sequence op in it (almost all of them). RFC5661 specifies that only the fore channel should be bound. This was done because early clients (including FreeBSD) did not do the required BindConnectionToSession RPC. Unfortunately, this breaks the Linux client when the "nconnects" mount option is used, since the server may do a callback on the incorrect TCP connection. This patch converts the server behaviour to that required by the RFC. It also makes the server test/indicate failure of the back channel more aggressively. Until this patch is applied to the server, the "nconnects" mount option is not recommended for a Linux NFSv4.1/4.2 client mount to the FreeBSD server. Reported by: bcodding@redhat.com Tested by: bcodding@redhat.com PR: 254560 MFC after: 1 week	2021-03-30 14:31:05 -07:00
Rick Macklem	fdc9b2d50f	nfsv4 client: replace while loops with LIST_FOREACH() loops This patch replaces a couple of while() loops with LIST_FOREACH() loops. While here, declare a couple of variables "bool". I think LIST_FOREACH() is preferred and makes the code more readable. This also prepares the code for future changes to use a hash table of lists for open searches via file handle. This patch should not result in a semantics change. MFC after: 2 weeks	2021-03-29 14:14:51 -07:00
Rick Macklem	e61b29ab5d	nfsv4.1/4.2 client: fix handling of delegations for "oneopenown" mnt option If a delegation for a file has been acquired, the "oneopenown" option was ignored when the local open was issued. This could result in multiple openowners/opens for a file, that would be transferred to the server when the delegation was recalled. This would not be serious, but could result in more than one openowner. Since the Amazon/EFS does not issue delegations, this probably never occurs in practice. Spotted during code inspection. This small patch fixes the code so that it checks for "oneopenown" when doing client local opens on a delegation. MFC after: 2 weeks	2021-03-29 12:09:19 -07:00
Rick Macklem	82ee386c2a	nfsv4 client: fix forced dismount when sleeping in the renew thread During a recent NFSv4 testing event a test server caused a hang where "umount -N" failed. The renew thread was sleeping on "nfsv4lck" and the "umount" was sleeping, waiting for the renew thread to terminate. This is the second of two patches that is hoped to fix the renew thread so that it will terminate when "umount -N" is done on the mount. This patch adds a 5second timeout on the msleep()s and checks for the forced dismount flag so that the renew thread will wake up and see the forced dismount flag. Normally a wakeup() will occur in less than 5seconds, but if a premature return from msleep() does occur, it will simply loop around and msleep() again. The patch also adds the "mp" argument to nfsv4_lock() so that it will return when the forced dismount flag is set. While here, replace the nfsmsleep() wrapper that was used for portability with the actual msleep() call. MFC after: 2 weeks	2021-03-23 13:04:37 -07:00
Alan Somers	9c5aac8f2e	fusefs: fix a dead store in fuse_vnop_advlock kevans actually caught this in the original review and I fixed it, but then I committed an older copy of the branch. Whoops. Reported by: kevans MFC after: 13 days MFC with: `929acdb19a` Differential Revision: https://reviews.freebsd.org/D29031	2021-03-19 19:38:57 -06:00
Rick Macklem	5f742d3879	nfsv4 client: fix forced dismount when sleeping on nfsv4lck During a recent NFSv4 testing event a test server caused a hang where "umount -N" failed. The renew thread was sleeping on "nfsv4lck" and the "umount" was sleeping, waiting for the renew thread to terminate. This is the first of two patches that is hoped to fix the renew thread so that it will terminate when "umount -N" is done on the mount. nfsv4_lock() checks for forced dismount, but only after it wakes up from msleep(). Without this patch, a wakeup() call was required. This patch adds a 1second timeout on the msleep(), so that it will wake up and see the forced dismount flag. Normally a wakeup() will occur in less than 1second, but if a premature return from msleep() does occur, it will simply loop around and msleep() again. While here, replace the nfsmsleep() wrapper that was used for portability with the actual msleep() call and make the same change for nfsv4_getref(). MFC after: 2 weeks	2021-03-19 14:09:33 -07:00
Alan Somers	929acdb19a	fusefs: fix two bugs regarding fcntl file locks 1) F_SETLKW (blocking) operations would be sent to the FUSE server as F_SETLK (non-blocking). 2) Release operations, F_SETLK with lk_type = F_UNLCK, would simply return EINVAL. PR: 253500 Reported by: John Millikin <jmillikin@gmail.com> MFC after: 2 weeks	2021-03-18 17:09:10 -06:00
Rick Macklem	fd232a21bb	nfsv4 pnfs client: fix updating of the layout stateid.seqid During a recent NFSv4 testing event a test server was replying NFSERR_OLDSTATEID for layout stateids presented to the server for LayoutReturn operations. Upon rereading RFC5661, it was apparent that the FreeBSD NFSv4.1/4.2 pNFS client did not maintain the seqid field of the layout stateid correctly. This patch is believed to correct the problem. Tested against a FreeBSD pNFS server with diagnostics added to check the stateid's seqid did not indicate problems. Unfortunately, testing aginst this server will not happen in the near future, so the fix may not be correct yet. MFC after: 2 weeks	2021-03-18 12:20:25 -07:00
Gordon Bergling	5666643a95	Fix some common typos in comments - occured -> occurred - normaly -> normally - controling -> controlling - fileds -> fields - insterted -> inserted - outputing -> outputting MFC after: 1 week	2021-03-13 18:26:15 +01:00
Konstantin Belousov	16dea83410	null_vput_pair(): release use reference on dvp earlier We might own the last use reference, and then vrele() at the end would need to take the dvp vnode lock to inactivate, which causes deadlock with vp. We cannot vrele() dvp from start since this might unlock ldvp. Handle it by holding the vnode and dropping use ref after lowerfs VOP_VPUT_PAIR() ended. This effectivaly requires unlock of the vp vnode after VOP_VPUT_PAIR(), so the call is changed to set unlock_vp to true unconditionally. This opens more opportunities for vp to be reclaimed, if lvp is still alive we reinstantiate vp with null_nodeget(). Reported and tested by: pho Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29178	2021-03-12 13:31:08 +02:00
Rick Macklem	c04199affe	nfsclient: Fix ReadDS/WriteDS/CommitDS nfsstats RPC counts for a NFSv3 DS During a recent virtual NFSv4 testing event, a bug in the FreeBSD client was detected when doing I/O DS operations on a Flexible File Layout pNFS server. For an NFSv3 DS, the Read/Write/Commit nfsstats were incremented instead of the ReadDS/WriteDS/CommitDS counts. This patch fixes this. Only the RPC counts reported by nfsstat(1) were affected by this bug, the I/O operations were performed correctly. MFC after: 2 weeks	2021-03-02 14:18:23 -08:00
Rick Macklem	94f2e42f5e	nfsclient: Fix the stripe unit size for a File Layout pNFS layout During a recent virtual NFSv4 testing event, a bug in the FreeBSD client was detected when doing a File Layout pNFS DS I/O operation. The size of the I/O operation was smaller than expected. The I/O size is specified as a stripe unit size in bits 6->31 of nflh_util in the layout. I had misinterpreted RFC5661 and had shifted the value right by 6 bits. The correct interpretation is to use the value as presented (it is always an exact multiple of 64), clearing bits 0->5. This patch fixes this. Without the patch, I/O through the DSs work, but the I/O size is 1/64th of what is optimal. MFC after: 2 weeks	2021-03-01 12:49:32 -08:00
Rick Macklem	15bed8c46b	nfsclient: add nfs node locking around uses of n_direofoffset During code inspection I noticed that the n_direofoffset field of the NFS node was being manipulated without any lock being held to make it SMP safe. This patch adds locking of the NFS node's mutex around handling of n_direofoffset to make it SMP safe. I have not seen any failure that could be attributed to n_direofoffset being manipulated concurrently by multiple processors, but I think this is possible, since directories are read with shared vnode locking, plus locks only on individual buffer cache blocks. However, there have been as yet unexplained issues w.r.t reading large directories over NFS that could have conceivably been caused by concurrent manipulation of n_direofoffset. MFC after: 2 weeks	2021-02-28 14:53:54 -08:00
Rick Macklem	3e04ab36ba	nfsclient: add checks for a server returning the current directory Commit `3fe2c68ba2` dealt with a panic in cache_enter_time() where the vnode referred to the directory argument. It would also be possible to get these panics if a broken NFS server were to return the directory as an new object being created within the directory or in a Lookup reply. This patch adds checks to avoid the panics and logs messages to indicate that the server is broken for the file object creation cases. Reviewd by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D28987	2021-02-28 14:15:32 -08:00
Alexander Motin	d01032736c	Fix diroffdiroff, probably copy/paste bug. Too long name looks bad in `vmstat -m`. MFC after: 1 week	2021-02-28 09:08:31 -05:00
Rick Macklem	3fe2c68ba2	nfsclient: fix panic in cache_enter_time() Juraj Lutter (otis@) reported a panic "dvp != vp not true" in cache_enter_time() called from the NFS client's nfsrpc_readdirplus() function. This is specific to an NFSv3 mount with the "rdirplus" mount option. Unlike NFSv4, NFSv3 replies to ReaddirPlus includes entries for the current directory. This trivial patch avoids doing a cache_enter_time() call for the current directory to avoid the panic. Reported by: otis Tested by: otis Reviewed by: mjg MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D28969	2021-02-27 17:54:05 -08:00
Ryan Libby	d7671ad8d6	Close races in vm object chain traversal for unlock We were unlocking the vm object before reading the backing_object field. In the meantime, the object could be freed and reused. This could cause us to go off the rails in the object chain traversal, failing to unlock the rest of the objects in the original chain and corrupting the lock state of the victim chain. Reviewed by: bdrewery, kib, markj, vangyzen MFC after: 3 days Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D28926	2021-02-25 12:11:19 -08:00
Alex Richardson	ba2cfa80e1	Fix makefs bootstrap after `d485c77f20` The makefs msdosfs code includes fs/msdosfs/denode.h which directly uses struct buf from <sys/buf.h> rather than the makefs struct m_buf. To work around this problem provide a local denode.h that includes ffs/buf.h and defines buf as an alias for m_buf. Reviewed By: kib, emaste Differential Revision: https://reviews.freebsd.org/D28835	2021-02-22 17:55:45 +00:00
Konstantin Belousov	8b7239681e	ext2fs: clear write cluster tracking on truncation Reviewed by: fsu, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Konstantin Belousov	2bfd8992c7	vnode: move write cluster support data to inodes. The data is only needed by filesystems that 1. use buffer cache 2. utilize clustering write support. Requested by: mjg Reviewed by: asomers (previous version), fsu (ext2 parts), mckusick Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Konstantin Belousov	d485c77f20	Remove #define _KERNEL hacks from libprocstat Make sys/buf.h, sys/pipe.h, sys/fs/devfs/devfs*.h headers usable in userspace, assuming that the consumer has an idea what it is for. Unhide more material from sys/mount.h and sys/ufs/ufs/inode.h, sys/ufs/ufs/ufsmount.h for consumption of userspace tools, with the same caveat. Remove unacceptable hack from usr.sbin/makefs which relied on sys/buf.h being unusable in userspace, where it override struct buf with its own definition. Instead, provide struct m_buf and struct m_vnode and adapt code to use local variants. Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Alexander V. Chernikov	605284b894	Enforce net epoch in in6_selectsrc(). in6_selectsrc() may call fib6_lookup() in some cases, which requires epoch. Wrap in6_selectsrc* calls into epoch inside its users. Mark it as requiring epoch by adding NET_EPOCH_ASSERT(). MFC after: 1 weeek Differential Revision: https://reviews.freebsd.org/D28647	2021-02-15 22:33:12 +00:00
Alan Somers	71befc3506	fusefs: set d_off during VOP_READDIR This allows d_off to be used with lseek to position the file so that getdirentries(2) will return the next entry. It is not used by readdir(3). PR: 253411 Reported by: John Millikin <jmillikin@gmail.com> Reviewed by: cem MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28605	2021-02-12 21:50:52 -07:00

1 2 3 4 5 ...

4685 Commits