freebsd-nq

Author	SHA1	Message	Date
Rick Macklem	77c595ce33	nfscl: Add an argument to nfscl_tryclose() This patch adds a new argument to nfscl_tryclose() to indicate whether or not it should loop when a NFSERR_DELAY reply is received from the NFSv4 server. Since this new argument is always passed in as "true" at this time, no semantics change should occur. This is being done to prepare the code for a future patch that fixes the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close operation. MFC after: 2 week	2021-10-15 14:25:38 -07:00
Rick Macklem	6495766acf	nfscl: Restructure nfscl_freeopen() slightly This patch factors the unlinking of the nfsclopen structure out of nfscl_freeopen() into a separate function called nfscl_unlinkopen(). It also adds a new argument to nfscl_freeopen() to conditionally do the unlink. Since this new argument is always passed in as "true" at this time, no semantics change should occur. This is being done to prepare the code for a future patch that fixes the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close operation. MFC after: 2 week	2021-10-14 17:28:01 -07:00
Rick Macklem	24af0fcdfc	nfscl: Make nfscl_getlayout() acquire the correct pNFS layout Without this patch, if a pNFS read layout has already been acquired for a file, writes would be redirected to the Metadata Server (MDS), because nfscl_getlayout() would not acquire a read/write layout for the file. This happened because there was no "mode" argument to nfscl_getlayout() to indicate whether reading or writing was being done. Since doing I/O through the Metadata Server is not encouraged for some pNFS servers, it is preferable to get a read/write layout for writes instead of redirecting the write to the MDS. This patch adds a access mode argument to nfscl_getlayout() and nfsrpc_getlayout(), so that nfscl_getlayout() knows to acquire a read/write layout for writing, even if a read layout has already been acquired. This patch only affects NFSv4.1/4.2 client behaviour when pNFS ("pnfs" mount option against a server that supports pNFS) is in use. This problem was detected during a recent NFSv4 interoperability testing event held by the IETF working group. MFC after: 2 week	2021-10-13 15:48:54 -07:00
Rick Macklem	b82168e657	nfscl: Fix another deadlock related to the NFSv4 clientID lock Without this patch, it is possible to hang the NFSv4 client, when a rename/remove is being done on a file where the client holds a delegation, if pNFS is being used. For a delegation to be returned, dirty data blocks must be flushed to the NFSv4 server. When pNFS is in use, a shared lock on the clientID must be acquired while doing a write to the DS(s). However, if rename/remove is doing the delegation return an exclusive lock will be acquired on the clientID, preventing the write to the DS(s) from acquiring a shared lock on the clientID. This patch stops rename/remove from doing a delegation return if pNFS is enabled. Since doing delegation return in the same compound as rename/remove is only an optimization, not doing so should not cause problems. This problem was detected during a recent NFSv4 interoperability testing event held by the IETF working group. MFC after: 1 week	2021-10-12 17:21:01 -07:00
Rick Macklem	120b20bdf4	nfscl: Fix a deadlock related to the NFSv4 clientID lock Without this patch, it is possible for a process doing an NFSv4 Open/create of a file to block to allow another process to acquire the exclusive lock on the clientID when holding a shared lock on the clientID. As such, both processes deadlock, with one wanting the exclusive lock, while the other holds the shared lock. This deadlock is unlikely to occur unless delegations are in use on the NFSv4 mount. This patch fixes the problem by not deferring to the process waiting for the exclusive lock when a shared lock (reference cnt) is already held by the process. This problem was detected during a recent NFSv4 interoperability testing event held by the IETF working group. MFC after: 1 week	2021-10-11 21:58:24 -07:00
Rick Macklem	62c5be4ab4	nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg() Commit `5e5ca4c8fc` added a flag to a NFSv4 mount point that is set when the first delegation is acquired from the NFSv4 server. For a common case where delegations are not being issued by the NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for open/lock state, finds the delegation list empty, then just unlocks the mutex and returns. This patch adds a check of the flag to avoid the need to acquire the mutex for this common case. This change appears to be performance neutral for a small number of opens, but should reduce lock contention for a large number of opens for the common case where server is not issuing delegations. This commit should not affect the high level semantics of delegation handling. MFC after: 2 weeks	2021-09-26 18:37:25 -07:00
Rick Macklem	3ad1e1c1ce	nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2 This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2 NFS client, which can be used by nfs_lookup() so that a subsequent Open RPC is not required. It uses the cn_flags OPENREAD, OPENWRITE added by commit `c18c74a87c`. This reduced the number of RPCs by about 15% for a kernel build over NFS. For now, use of Lookup+Open is only done when the "oneopenown" mount option is used. It may be possible for Lookup+Open to be used for non-oneopenown NFSv4.1/4.2 mounts, but that will require extensive further testing to determine if it works. While here, I've added the changes to the nfscommon module that are needed to implement the Deallocate NFSv4.2 operation. This avoids needing another cycle of changes to the internal KAPI between the NFS modules. This commit has changed the internal KAPI between the NFS modules and, as such, all need to be rebuilt from sources. I have not bumped __FreeBSD_version, since it was bumped a few days ago.	2021-08-11 18:49:26 -07:00
Rick Macklem	efea1bc1fd	nfscl: Cache an open stateid for the "oneopenown" mount option For NFSv4.1/4.2, if the "oneopenown" mount option is used, there is, at most, only one open stateid for each NFS vnode. When an open stateid for a file is acquired, set a pointer to the open structure in the NFS vnode. This pointer can be used to acquire the open stateid without searching the open linked list when the following is true: - No delegations have been issued for the file. Since delegations can outlive an NFS vnode for a file, use the global NFSMNTP_DELEGISSUED flag on the mount to determine this. - No lock stateid has been issued for the file. To determine this, a new NFS vnode flag called NMIGHTBELOCKED is set when a lock stateid is issued, which can then be tested. When this open structure pointer can be used, it avoids the need to acquire the NFSCLSTATELOCK() and searching the open structure list for an open. The NFSCLSTATELOCK() can be highly contended when there are a lot of opens issued for the NFSv4.1/4.2 mount. This patch only affects NFSv4.1/4.2 mounts when the "oneopenown" mount option is used. MFC after: 2 weeks	2021-07-28 15:48:27 -07:00
Rick Macklem	54ff3b3986	nfscl: Set correct lockowner for "oneopenown" mount option For NFSv4.1/4.2, the client may use either an open, lock or delegation stateid as the stateid argument for an I/O operation. RFC 5661 defines an order of preference of delegation, then lock and finally open stateid for the argument, although NFSv4.1/4.2 servers are expected to handle any stateid type. For the "oneopenown" mount option, the lock owner was not being correctly generated and, as such, the I/O operation would use an open stateid, even when a lock stateid existed. Although this did not and should not affect an NFSv4.1/4.2 server's behaviour, this patch makes the behaviour for "oneopenown" the same as when the mount option is not specified. Found during inspection of packet captures. No failure during testing against NFSv4.1/4.2 servers of the unpatched code occurred. MFC after: 2 weeks	2021-07-28 15:23:05 -07:00
Rick Macklem	7685f8344d	nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts For NFSv4.1/4.2, the client may set the "seqid" field of the stateid to 0 in RPC requests. This indicates to the server that it should not check the "seqid" or return NFSERR_OLDSTATEID if the "seqid" value is not up to date w.r.t. Open/Lock operations on the stateid. This "seqid" is incremented by the NFSv4 server for each Open/OpenDowngrade/Lock/Locku operation done on the stateid. Since a failure return of NFSERR_OLDSTATEID is of no use to the client for I/O operations, it makes sense to set "seqid" to 0 for the stateid argument for I/O operations. This avoids server failure replies of NFSERR_OLDSTATEID, although I am not aware of any case where this failure occurs. This makes the FreeBSD NFSv4.1/4.2 client compatible with the Linux NFSv4.1/4.2 client. MFC after: 2 weeks	2021-07-19 17:35:39 -07:00
Rick Macklem	a145cf3f73	nfscl: Change the default minor version for NFSv4 mounts When NFSv4.1 support was added to the client, the implementation was still experimental and, as such, the default minor version was set to 0. Since the NFSv4.1 client implementation is now believed to be solid and the NFSv4.1/4.2 protocol is significantly better than NFSv4.0, I beieve that NFSv4.1/4.2 should be used where possible. This patch changes the default minor version for NFSv4 to be the highest minor version supported by the NFSv4 server. If a specific minor version is desired, the "minorversion" mount option can be used to override this default. This is compatible with the Linux NFSv4 client behaviour. This was discussed on freebsd-current@ in mid-May 2021 under the subject "changing the default NFSv4 minor version" and the consensus seemed to be support for this change. It also appeared that changing this for FreeBSD 13.1 was not considered a POLA violation, so long as UPDATING and RELNOTES entries were made for it. MFC after: 2 weeks	2021-06-24 18:52:23 -07:00
Rick Macklem	aed98fa5ac	nfscl: Make NFSv4.0 client acquisition NFSv4.1/4.2 compatible When the NFSv4.0 client was implemented, acquisition of a clientid via SetClientID/SetClientIDConfirm was done upon the first Open, since that was when it was needed. NFSv4.1/4.2 acquires the clientid during mount (via ExchangeID/CreateSession), since the associated session is required during mount. This patch modifies the NFSv4.0 mount so that it acquires the clientid during mount. This simplifies the code and makes it easy to implement "find the highest minor version supported by the NFSv4 server", which will be done for the default minorversion in a future commit. The "start_renewthread" argument for nfscl_getcl() is replaced by "tryminvers", which will be used by the aforementioned future commit. MFC after: 2 weeks	2021-06-15 17:48:51 -07:00
Rick Macklem	5e5ca4c8fc	nfscl: Add a "has acquired a delegation" flag for delegations A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. For a common case where delegations are not being issued by the NFSv4 server, the code acquires the mutex lock for open/lock state, finds the delegation list empty and just unlocks the mutex and returns. This patch adds an NFS mount point flag that is set when a delegation is issued for the mount. Then the patched code checks for this flag before acquiring the open/lock mutex, avoiding the need to acquire the lock for the case where delegations are not being issued by the NFSv4 server. This change appears to be performance neutral for a small number of opens, but should reduce lock contention for a large number of opens for the common case where server is not issuing delegations. This commit should not affect the high level semantics of delegation handling. MFC after: 2 weeks	2021-06-09 08:00:43 -07:00
Rick Macklem	96b40b8967	nfscl: Use hash lists to improve expected search performance for opens A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. When searching for an open, usually for a match by file handle, a linear search of all opens is done. Commit `3f7e14ad93` added a hash table of lists hashed on file handle for the opens. This patch uses the hash lists for searching for a matching open based of file handle instead of an exhaustive linear search of all opens. This change appears to be performance neutral for a small number of opens, but should improve expected performance for a large number of opens. This commit should not affect the high level semantics of open handling. MFC after: 2 weeks	2021-05-27 19:08:36 -07:00
Rick Macklem	724072ab1d	nfscl: Use hash lists to improve expected search performance for opens A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. When searching for an open, usually for a match by file handle, a linear search of all opens is done. Commit `3f7e14ad93` added a hash table of lists hashed on file handle for the opens. This patch uses the hash lists for searching for a matching open based of file handle instead of an exhaustive linear search of all opens. This change appears to be performance neutral for a small number of opens, but should improve expected performance for a large number of opens. This patch also moves any found match to the front of the hash list, to try and maintain the hash lists in recently used ordering (least recently used at the end of the list). This commit should not affect the high level semantics of open handling. MFC after: 2 weeks	2021-05-25 14:19:29 -07:00
Rick Macklem	3f7e14ad93	nfscl: Add hash lists for the NFSv4 opens A problem was reported via email, where a large (130000+) accumulation of NFSv4 opens on an NFSv4 mount caused significant lock contention on the mutex used to protect the client mount's open/lock state. Although the root cause for the accumulation of opens was not resolved, it is obvious that the NFSv4 client is not designed to handle 100000+ opens efficiently. When searching for an open, usually for a match by file handle, a linear search of all opens is done. This patch adds a table of hash lists for the opens, hashed on file handle. This table will be used by future commits to search for an open based on file handle more efficiently. MFC after: 2 weeks	2021-05-22 14:53:56 -07:00
Rick Macklem	c28cb257dd	nfscl: Fix NFSv4.1/4.2 mount recovery from an expired lease The most difficult NFSv4 client recovery case happens when the lease has expired on the server. For NFSv4.0, the client will receive a NFSERR_EXPIRED reply from the server to indicate this has happened. For NFSv4.1/4.2, most RPCs have a Sequence operation and, as such, the client will receive a NFSERR_BADSESSION reply when the lease has expired for these RPCs. The client will then call nfscl_recover() to handle the NFSERR_BADSESSION reply. However, for the expired lease case, the first reclaim Open will fail with NFSERR_NOGRACE. This patch recognizes this case and calls nfscl_expireclient() to handle the recovery from an expired lease. This patch only affects NFSv4.1/4.2 mounts when the lease expires on the server, due to a network partitioning that exceeds the lease duration or similar. MFC after: 2 weeks	2021-05-19 14:52:56 -07:00
Rick Macklem	f6fec55fe3	nfscl: add check for NULL clp and forced dismounts to nfscl_delegreturnvp() Commit `aad780464f` added a function called nfscl_delegreturnvp() to return delegations during the NFS VOP_RECLAIM(). The function erroneously assumed that nm_clp would be non-NULL. It will be NULL for NFSV4.0 mounts until a regular file is opened. It will also be NULL during vflush() in nfs_unmount() for a forced dismount. This patch adds a check for clp == NULL to fix this. Also, since it makes no sense to call nfscl_delegreturnvp() during a forced dismount, the patch adds a check for that case and does not do the call during forced dismounts. PR: 255436 Reported by: ish@amail.plala.or.jp MFC after: 2 weeks	2021-04-27 17:30:16 -07:00
Rick Macklem	aad780464f	nfscl: return delegations in the NFS VOP_RECLAIM() After a vnode is recycled it can no longer be acquired via vfs_hash_get() and, as such, a delegation for the vnode cannot be recalled. In the unlikely event that a delegation still exists when the vnode is being recycled, return the delegation since it will no longer be recallable. Until you have this patch in your NFSv4 client, you should consider avoiding the use of delegations. MFC after: 2 weeks	2021-04-25 17:57:55 -07:00
Rick Macklem	02695ea890	nfscl: fix delegation recall when the file is not open Without this patch, if a NFSv4 server recalled a delegation when the file is not open, the renew thread would block in the NFS VOP_INACTIVE() trying to acquire the client state lock that it already holds. This patch fixes the problem by delaying the vrele() call until after the client state lock is released. This bug has been in the NFSv4 client for a long time, but since it only affects delegation when recalled due to another client opening the file, it got missed during previous testing. Until you have this patch in your client, you should avoid the use of delegations. MFC after: 2 weeks	2021-04-25 12:55:00 -07:00
Rick Macklem	4e6c2a1ee9	nfsv4 client: factor loop contents out into a separate function Commit `fdc9b2d50f` replaced a couple of while loops with LIST_FOREACH() loops. This patch factors the body of that loop out into a separate function called nfscl_checkown(). This prepares the code for future changes to use a hash table of lists for open searches via file handle. This patch should not result in a semantics change. MFC after: 2 weeks	2021-04-01 15:36:37 -07:00
Rick Macklem	fdc9b2d50f	nfsv4 client: replace while loops with LIST_FOREACH() loops This patch replaces a couple of while() loops with LIST_FOREACH() loops. While here, declare a couple of variables "bool". I think LIST_FOREACH() is preferred and makes the code more readable. This also prepares the code for future changes to use a hash table of lists for open searches via file handle. This patch should not result in a semantics change. MFC after: 2 weeks	2021-03-29 14:14:51 -07:00
Rick Macklem	e61b29ab5d	nfsv4.1/4.2 client: fix handling of delegations for "oneopenown" mnt option If a delegation for a file has been acquired, the "oneopenown" option was ignored when the local open was issued. This could result in multiple openowners/opens for a file, that would be transferred to the server when the delegation was recalled. This would not be serious, but could result in more than one openowner. Since the Amazon/EFS does not issue delegations, this probably never occurs in practice. Spotted during code inspection. This small patch fixes the code so that it checks for "oneopenown" when doing client local opens on a delegation. MFC after: 2 weeks	2021-03-29 12:09:19 -07:00
Rick Macklem	82ee386c2a	nfsv4 client: fix forced dismount when sleeping in the renew thread During a recent NFSv4 testing event a test server caused a hang where "umount -N" failed. The renew thread was sleeping on "nfsv4lck" and the "umount" was sleeping, waiting for the renew thread to terminate. This is the second of two patches that is hoped to fix the renew thread so that it will terminate when "umount -N" is done on the mount. This patch adds a 5second timeout on the msleep()s and checks for the forced dismount flag so that the renew thread will wake up and see the forced dismount flag. Normally a wakeup() will occur in less than 5seconds, but if a premature return from msleep() does occur, it will simply loop around and msleep() again. The patch also adds the "mp" argument to nfsv4_lock() so that it will return when the forced dismount flag is set. While here, replace the nfsmsleep() wrapper that was used for portability with the actual msleep() call. MFC after: 2 weeks	2021-03-23 13:04:37 -07:00
Rick Macklem	fd232a21bb	nfsv4 pnfs client: fix updating of the layout stateid.seqid During a recent NFSv4 testing event a test server was replying NFSERR_OLDSTATEID for layout stateids presented to the server for LayoutReturn operations. Upon rereading RFC5661, it was apparent that the FreeBSD NFSv4.1/4.2 pNFS client did not maintain the seqid field of the layout stateid correctly. This patch is believed to correct the problem. Tested against a FreeBSD pNFS server with diagnostics added to check the stateid's seqid did not indicate problems. Unfortunately, testing aginst this server will not happen in the near future, so the fix may not be correct yet. MFC after: 2 weeks	2021-03-18 12:20:25 -07:00
Mateusz Guzik	586ee69f09	fs: clean up empty lines in .c and .h files	2020-09-01 21:18:40 +00:00
Alan Somers	eea79fde5a	Remove vfs_statfs and vnode_mount macros from NFS These macro definitions are no longer needed as the NFS OSX port is long dead. The vfs_statfs macro conflicts with the vfsops field of the same name. Submitted by: shivank@ Reviewed by: rmacklem MFC after: 2 weeks Sponsored by: Google, Inc. (GSoC 2020) Differential Revision: https://reviews.freebsd.org/D25263	2020-06-17 16:20:19 +00:00
Ryan Moeller	b9cc3262bc	nfs: Remove APPLESTATIC macro It is no longer useful. Reviewed by: rmacklem Approved by: mav (mentor) MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D24811	2020-05-12 13:23:25 +00:00
Ryan Moeller	32033b3d30	Remove APPLEKEXT ifndefs They are no longer useful. Reviewed by: rmacklem Approved by: mav (mentor) MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D24752	2020-05-08 14:39:38 +00:00
Rick Macklem	897d7d45ba	Make the NFSv4.n client's recovery from NFSERR_BADSESSION RFC5661 conformant. RFC5661 specifies that a client's recovery upon receipt of NFSERR_BADSESSION should first consist of a CreateSession operation using the extant ClientID. If that fails, then a full recovery beginning with the ExchangeID operation is to be done. Without this patch, the FreeBSD client did not attempt the CreateSession operation with the extant ClientID and went directly to a full recovery beginning with ExchangeID. I have had this patch several years, but since no extant NFSv4.n server required the CreateSession with extant ClientID, I have never committed it. I an committing it now, since I suspect some future NFSv4.n server will require this and it should not negatively impact recovery for extant NFSv4.n servers, since they should all return NFSERR_STATECLIENTID for this first CreateSession. The patched client has been tested for recovery against both the FreeBSD and Linux NFSv4.n servers and no problems have been observed. MFC after: 1 month	2020-04-22 21:00:14 +00:00
Rick Macklem	c057a37818	Add support for NFSv4.2 to the NFS client and server. This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes (RFC-8276) to the NFS client and server. NFSv4.2 is comprised of several optional features that can be supported in addition to NFSv4.1. This patch adds the following optional features: - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED) - posix_fallocate() - intra server file range copying via the copy_file_range(2) syscall --> Avoiding data tranfer over the wire to/from the NFS client. - lseek(SEEK_DATA/SEEK_HOLE) - Extended attribute syscalls for "user" namespace attributes as defined by RFC-8276. Although this patch is fairly large, it should not affect support for the other versions of NFS. However it does add two new sysctls that allow a sysadmin to limit which minor versions of NFSv4 a server supports, allowing a sysadmin to disable NFSv4.2. Unfortunately, when the NFS stats structure was last revised, it was assumed that there would be no additional operations added beyond what was specified in RFC-7862. However RFC-8276 did add additional operations, forcing the NFS stats structure to revised again. It now has extra unused entries in all arrays, so that future extensions to NFSv4.2 can be accomodated without revising this structure again. A future commit will update nfsstat(1) to report counts for the new NFSv4.2 specific operations/procedures. This patch affects the internal interface between the nfscommon, nfscl and nfsd modules and, as such, they all must be upgraded simultaneously. I will do a version bump (although arguably not needed), due to this. This code has survived a "make universe" but has not been built with a recent GCC. If you encounter build problems, please email me. Relnotes: yes	2019-12-12 23:22:55 +00:00
Rick Macklem	eeb1f3ed51	Fix the NFSv4 client to safely find processes. r340744 broke the NFSv4 client, because it replaced pfind_locked() with a call to pfind(), since pfind() acquires the sx lock for the pid hash and the NFSv4 already holds a mutex when it does the call. The patch fixes the problem by recreating a pfind_any_locked() and adding the functions pidhash_slockall() and pidhash_sunlockall to acquire/release all of the pid hash locks. These functions are then used by the NFSv4 client instead of acquiring the allproc_lock and calling pfind(). Reviewed by: kib, mjg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D19887	2019-04-15 01:27:15 +00:00
Edward Tomasz Napierala	c9172fb4f1	Work around the "nfscl: bad open cnt on server" assertion that can happen when rerooting into NFSv4 rootfs with kernel built with INVARIANTS. I've talked to rmacklem@ (back in 2017), and while the root cause is still unknown, the case guarded by assertion (nfscl_doclose() being called from VOP_INACTIVE) is believed to be safe, and the whole thing seems to run just fine. Obtained from: CheriBSD MFC after: 2 weeks Sponsored by: DARPA, AFRL	2019-02-19 12:45:37 +00:00
Rick Macklem	743d528198	Silence newer gcc warnings. Newer versions of gcc generate "set, but not used" warnings. Add __unused macros to silence these warnings. Although the variables are not being used, they are values parsed from arguments to callback RPCs that might be needed in the future. Requested by: mmacy	2018-07-30 20:25:32 +00:00
Eitan Adler	33f4bccaa6	Use https over http for FreeBSD pages	2018-07-27 10:40:48 +00:00
Rick Macklem	5da3882447	Shut down the TCP connection to a DS in the pNFS client when Renew fails. When a NFSv4.1 client mount using pNFS detects a failure trying to do a Renew (actually just a Sequence operation), the code would simply try again and again and again every 30sec. This would tie up the "nfscl" thread, which should also be doing other things like Renews on other DSs and the MDS. This patch adds code which closes down the TCP connection and marks it defunct when Renew detects an failure to communicate with the DS, so further Renews will not be attempted until a new working TCP connection to the DS is established. It also makes the call to nfscl_cancelreqs() unconditional, since nfscl_cancelreqs() checks the NFSCLDS_SAMECONN flag and does so while holding the lock. This fix only applies to the NFSv4.1 client whne using pNFS and without it the only effect would have been an "nfscl" thread busy doing Renew attempts on an unresponsive DS. MFC after: 2 weeks	2018-07-15 18:54:44 +00:00
Rick Macklem	89c64a3a4f	Fix the pNFS client when mirrors aren't on the same machine. Without this patch, the client side NFSv4.1 pNFS code erroneously did writes and commits to both DS mirrors using the TCP connection of the first one. For my test setup this worked, since I have both DSs running on the same machine, but it would have failed when the DSs are on separate machines. This patch fixes the code to use the correct TCP connection for each DS. This patch should only affect the NFSv4.1 client when using "pnfs" mounts to mirrored DSs. MFC after: 2 weeks	2018-07-14 19:51:44 +00:00
Rick Macklem	0e7bd20bb2	Close down the TCP connection to a pNFS DS when it is disabled. So long as the TCP connection to a pNFS DS isn't shared with other DSs, it can be closed down when the DS is being disabled in the pNFS client. This causes any RPCs in progress to fail. This patch only affects the NFSv4.1 pNFS client when errors occur while doing I/O on a DS. MFC after: 2 weeks	2018-07-13 20:03:05 +00:00
Rick Macklem	2e35b8fe24	Change the NFSv4.1 pNFS client so that it returns the DS error in layoutreturn. When the NFSv4.1 pNFS client gets an error for a DS I/O operation using a Flexible File layout, it returns the layout with an error. This patch changes the code slightly, so that it returns the layout for all errors except EACCES and lets the MDS decide what to do based on the error. It also makes a couple of changes to nfscl_layoutrecall() to ensure that the first layoutreturn(s) will have the error in the reply. Plus, the patch adds a wakeup() so that the "nfscl" thread won't wait 1sec before doing the LayoutReturn. Tested against the pNFS service. This patch should not affect non-pNFS use of the client. The unused "dsp" argument will be used by a future patch that disables the connection to the DS when possible. MFC after: 2 weeks	2018-06-22 21:25:27 +00:00
Rick Macklem	2bad64241c	Make the pNFS NFSv4.1 client return a Flexible File layout upon error. The Flexible File layout LayoutReturn operation has argument fields where an I/O error encountered when attempting I/O on a DS can be reported back to the MDS. This patch adds code to the client to do this for the Flexible File layout mirrored case. This patch should only affect mounts using the "pnfs" option against servers that support the Flexible File layout. MFC after: 2 weeks	2018-06-17 16:30:06 +00:00
Rick Macklem	90d2dfab19	Merge the pNFS server code from projects/pnfs-planb-server into head. This code merge adds a pNFS service to the NFSv4.1 server. Although it is a large commit it should not affect behaviour for a non-pNFS NFS server. Some documentation on how this works can be found at: http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt and will hopefully be turned into a proper document soon. This is a merge of the kernel code. Userland and man page changes will come soon, once the dust settles on this merge. It has passed a "make universe", so I hope it will not cause build problems. It also adds NFSv4.1 server support for the "current stateid". Here is a brief overview of the pNFS service: A pNFS service separates the Read/Write oeprations from all the other NFSv4.1 Metadata operations. It is hoped that this separation allows a pNFS service to be configured that exceeds the limits of a single NFS server for either storage capacity and/or I/O bandwidth. It is possible to configure mirroring within the data servers (DSs) so that the data storage file for an MDS file will be mirrored on two or more of the DSs. When this is used, failure of a DS will not stop the pNFS service and a failed DS can be recovered once repaired while the pNFS service continues to operate. Although two way mirroring would be the norm, it is possible to set a mirroring level of up to four or the number of DSs, whichever is less. The Metadata server will always be a single point of failure, just as a single NFS server is. A Plan B pNFS service consists of a single MetaData Server (MDS) and K Data Servers (DS), all of which are recent FreeBSD systems. Clients will mount the MDS as they would a single NFS server. When files are created, the MDS creates a file tree identical to what a single NFS server creates, except that all the regular (VREG) files will be empty. As such, if you look at the exported tree on the MDS directly on the MDS server (not via an NFS mount), the files will all be of size 0. Each of these files will also have two extended attributes in the system attribute name space: pnfsd.dsfile - This extended attrbute stores the information that the MDS needs to find the data storage file(s) on DS(s) for this file. pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime and Change attributes for the file, so that the MDS doesn't need to acquire the attributes from the DS for every Getattr operation. For each regular (VREG) file, the MDS creates a data storage file on one (or more if mirroring is enabled) of the DSs in one of the "dsNN" subdirectories. The name of this file is the file handle of the file on the MDS in hexadecimal so that the name is unique. The DSs use subdirectories named "ds0" to "dsN" so that no one directory gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize on the MDS, with the default being 20. For production servers that will store a lot of files, this value should probably be much larger. It can be increased when the "nfsd" daemon is not running on the MDS, once the "dsK" directories are created. For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces of information to the client that allows it to do I/O directly to the DS. DeviceInfo - This is relatively static information that defines what a DS is. The critical bits of information returned by the FreeBSD server is the IP address of the DS and, for the Flexible File layout, that NFSv4.1 is to be used and that it is "tightly coupled". There is a "deviceid" which identifies the DeviceInfo. Layout - This is per file and can be recalled by the server when it is no longer valid. For the FreeBSD server, there is support for two types of layout, call File and Flexible File layout. Both allow the client to do I/O on the DS via NFSv4.1 I/O operations. The Flexible File layout is a more recent variant that allows specification of mirrors, where the client is expected to do writes to all mirrors to maintain them in a consistent state. The Flexible File layout also allows the client to report I/O errors for a DS back to the MDS. The Flexible File layout supports two variants referred to as "tightly coupled" vs "loosely coupled". The FreeBSD server always uses the "tightly coupled" variant where the client uses the same credentials to do I/O on the DS as it would on the MDS. For the "loosely coupled" variant, the layout specifies a synthetic user/group that the client uses to do I/O on the DS. The FreeBSD server does not do striping and always returns layouts for the entire file. The critical information in a layout is Read vs Read/Writea and DeviceID(s) that identify which DS(s) the data is stored on. At this time, the MDS generates File Layout layouts to NFSv4.1 clients that know how to do pNFS for the non-mirrored DS case unless the sysctl vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File layouts are generated. The mirrored DS configuration always generates Flexible File layouts. For NFS clients that do not support NFSv4.1 pNFS, all I/O operations are done against the MDS which acts as a proxy for the appropriate DS(s). When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy. If the DS is on the same machine, the MDS/DS will do the RPC on the DS as a proxy and so on, until the machine runs out of some resource, such as session slots or mbufs. As such, DSs must be separate systems from the MDS. Tested by: james.rose@framestore.com Relnotes: yes	2018-06-12 19:36:32 +00:00
Rick Macklem	8097753476	Add checks for the Flexible File layout to LayoutRecall callbacks. The Flexible File layout case wasn't handled by LayoutRecall callbacks because it just checked for File layout and returned NFSERR_NOMATCHLAYOUT otherwise. This patch adds the Flexible File layout handling. Found during testing of the pNFS server. MFC after: 2 weeks	2018-06-10 19:03:21 +00:00
Rick Macklem	260785fe60	Fix the sleep event for layout recall. The sleep for I/O completion during an NFSv4.1 pNFS layout recall used the wrong event value and could result in the "[nfscl]" thread hung for the mount. This patch fixes the event to be the correct. This bug will only affect NFSv4.1 pnfs mounts and only when the server does a layout recall callback, so it won't affect many. Without the patch, a mount without the "pnfs" option will avoid the problem. Found during testing of the pNFS server. MFC after: 1 week	2018-05-26 23:02:15 +00:00
Conrad Meyer	222daa421f	style: Remove remaining deprecated MALLOC/FREE macros Mechanically replace uses of MALLOC/FREE with appropriate invocations of malloc(9) / free(9) (a series of sed expressions). Something like: * MALLOC(a, b, ... -> a = malloc(... * FREE( -> free( * free((caddr_t) -> free( No functional change. For now, punt on modifying contrib ipfilter code, leaving a definition of the macro in its KMALLOC(). Reported by: jhb Reviewed by: cy, imp, markj, rmacklem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14035	2018-01-25 22:25:13 +00:00
Alexander Kabaev	151ba7933a	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385	2017-12-25 04:48:39 +00:00
Pedro F. Giffuni	d63027b668	sys/fs: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.	2017-11-27 15:15:37 +00:00
Rick Macklem	b949cc41d1	Add Flex File Layout support to the NFSv4.1 pNFS client. This patch adds support for the Flexible File Layout to the pNFS client. Although the patch is rather large, it should only affect NFS mounts using the "pnfs" option against pNFS servers that do not support File Layout. There are still a couple of things missing from the Flexible File Layout client implementation: - The code does not yet do a LayoutReturn with I/O error stats when I/O error(s) occur when attempting to do I/O on a DS. This will be fixed in a future commit, since it is important for the MDS to know that I/O on a DS is failing. - The current code does writes and commits to mirror DSs serially. Making them happen concurrently will be done in a future commit, after discussion on freebsd-current@ on the best way to do this. - The code does not handle NFSv4.0 DSs. Since there is no extant pNFS server that implements NFSv4.0 DSs and NFSv4.1 DSs makes more sense now, I don't intend to implement this until there is a need for it. There is support for NFSv4.1 and NFSv3 DSs.	2017-10-05 20:10:40 +00:00
Rick Macklem	bd290946e9	Fix a memory leak that occurred in the pNFS client. When a "pnfs" NFSv4.1 mount was unmounted, it didn't free up the layouts and deviceinfo structures. This leak only affects "pnfs" mounts and only when the mount is umounted. Found while testing the pNFS Flexible File layout client code. MFC after: 2 weeks	2017-09-27 23:23:41 +00:00
Rick Macklem	b0932afacc	Simplify nfsrpc_layoutreturn() args. Simplify nfsrpc_layoutreturn() args. in preparation for the addition of Flex File layout support, since File layout uses a 0 length field. Flex Files does use a longer field, but that will be added in a subsequent commit.	2017-09-19 20:45:25 +00:00
Rick Macklem	ab118d04be	Simplify nfsrpc_layoutcommit() args. Simplify nfsrpc_layoutcommit() args. in preparation for the addition of Flex File layout support, since it also uses a 0 length field.	2017-09-19 20:18:41 +00:00

1 2 3

105 Commits