freebsd-skq

Author	SHA1	Message	Date
Rick Macklem	0649fcaea5	Fix the NFS client for "text file modified, process killed" mmap'd case. When an mmap'd text file is written and then executed immediately afterwards, it was possible that the modify time would change after the text file was executing, resulting in the process executing the file being killed. This was usually only observed when the file system's times were set to higher resolution, but could have occurred for any time resolution. This was reported on a recent email list discussion. This patch adds a VOP_SET_TEXT() to the NFS client which flushed all dirty pages to the NFS server and then makes sure that n_mtime is up to date to avoid this from occurring. Thanks go to kib@ and pho@ for their help with developing this patch. Tested by: pho Reviewed by: kib MFC after: 2 weeks	2017-04-12 21:37:12 +00:00
Rick Macklem	7d9b62e1a1	Don't throw away Open state when a NFSv4.1 client recovery fails. If the ExchangeID/CreateSession operations done by an NFSv4.1 client after the server crashes/reboots fails, it is possible that some process/thread is waiting for an open_owner lock. If the client state is free'd, this can cause a crash. This would not normally happen, but has been observed on a mount of the AmazonEFS service. Reported by: cperciva Tested by: cperciva PR: 216086 MFC after: 2 weeks	2017-04-11 22:47:02 +00:00
Rick Macklem	9e507a6a00	During a server crash recovery, fix the NFSv4.1 client for a NFSERR_BADSESSION during recovery. If the NFSv4.1 client gets a NFSv4.1 NFSERR_BADSESSION reply to an Open/Lock operation while recovering from the server crash/reboot, allow the opens to be retained for a subsequent recovery attempt. Since NFSv4.1 servers should only reply NFSERR_BADSESSION after a crash/reboot that has lost state, this case should almost never happen. However, for the AmazonEFS file service, this has been observed when the client does a fresh TCP connection for RPCs. Reported by: cperciva Tested by: cperciva PR: 216088 MFC after: 2 weeks	2017-04-11 20:28:15 +00:00
Konstantin Belousov	6627d919de	Remove debugging printf. Instead, issue a diagnostic and return appropriate error if ncl_flush() was unable to clean buffer queue after the specified number or retries. Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2017-04-11 08:29:12 +00:00
Rick Macklem	8efffe80ba	Avoid starvation of the server crash recovery thread for the NFSv4 client. This patch gives a requestor of the exclusive lock on the client state in the NFSv4 client priority over shared lock requestors. This avoids the server crash recovery thread being starved out by other threads doing RPCs. Tested by: cperciva PR: 216087 MFC after: 2 weeks	2017-04-10 01:28:01 +00:00
Rick Macklem	4e7dcfab2d	Fix the NFSv4 client hndling of a stale write verifier in the Commit operation. When the NFSv4 client Commit operation encountered a stale write verifier, it erroneously mapped that to EIO. This could have caused recently written data to be lost when a server crashes/reboots between an UNSTABLE write and the subsequent commit. This patch fixes this. The bug was only for the NFSv4 client and did not affect NFSv3. Tested by: cperciva PR: 215887 MFC after: 2 weeks	2017-04-09 21:50:21 +00:00
Rick Macklem	83a37350bf	Fix parsing failure for NFSv4 Setattr operation for failed case. If an operation that preceeds a Setattr in an NFSv4 compound fails, there is no bitmap of attributes to parse. Without this patch, the parsing would fail and return EBADRPC instead of the correct failure error. This could break recovery from a server crash/reboot. Tested by: cperciva PR: 215883 MFC after: 2 weeks	2017-04-09 12:32:22 +00:00
Konstantin Belousov	a046da7e90	Remove spl*() calls from the nfsclient code. Style adjustments in the related lines in ncl_writebp(). Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-04-06 12:44:34 +00:00
Konstantin Belousov	ea52525928	Make nfs pageout coherent with the dirty state of the buffers. Write out the dirty pages using VOP_WRITE() instead of directly calling ncl_writerpc(). The state of the buffers now reflects the write, fixing some hard to diagnose consistency and write order issues. The change also allowed to remove remapping of paged out pages into kernel space and related allocation of the phys buffer. Reviewed by: markj, rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D10241	2017-04-05 17:26:20 +00:00
Konstantin Belousov	b73cd4d344	Handle nfs IO_ASYNC write requests asynchronously. Reviewed by: markj, rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D10241	2017-04-05 17:20:31 +00:00
Konstantin Belousov	48fe926362	Handle possible vnode reclamation after ncl_vinvalbuf() call. ncl_vinvalbuf() might need to upgrade vnode lock, allowing the vnode to be reclaimed by other thread. Handle the situation, indicated by the returned error zero and VI_DOOMED iflag set, converting it into EBADF. Handle all calls, even where the vnode is exclusively locked right now. Reviewed by: markj, rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D10241	2017-04-05 17:11:39 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Konstantin Belousov	bd1623def1	Do not access memory past the buffer end. Do not accept and silently truncate too long hostname. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-02-16 06:36:16 +00:00
Konstantin Belousov	599009e261	Do not allocate char[MNAMELEN] on stack in nfsclient. Right now this is not critical, but will be after planned increase of MNAMELEN from 88 to 1k. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-02-16 06:34:20 +00:00
Ed Maste	1dc349ab95	prefix UFS symbols with UFS_ to reduce namespace pollution Specifically: ROOTINO -> UFS_ROOTINO WINO -> UFS_WINO NXADDR -> UFS_NXADDR NDADDR -> UFS_NDADDR NIADDR -> UFS_NIADDR MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency) Also prefix ext2's and nandfs's NDADDR and NIADDR with EXT2_ and NANDFS_ Reviewed by: kib, mckusick Obtained from: NetBSD MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D9536	2017-02-15 19:50:26 +00:00
Konstantin Belousov	1c32456953	Use type-independent formats for printing nlink_t and ino_t. Extracted from: ino64 work by gleb, mckusick Discussed with: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-06 16:59:33 +00:00
Rick Macklem	b2fc0141d9	Fix NFSv4.1 client recovery from NFS4ERR_BAD_SESSION errors. For most NFSv4.1 servers, a NFS4ERR_BAD_SESSION error is a rare failure that indicates that the server has lost session/open/lock state. However, recent testing by cperciva@ against the AmazonEFS server found several problems with client recovery from this due to it generating this failure frequently. Briefly, the problems fixed are: - If all session slots were in use at the time of the failure, some processes would continue to loop waiting for a slot on the old session forever. - If an RPC that doesn't use open/lock state failed with NFS4ERR_BAD_SESSION, it would fail the RPC/syscall instead of initiating recovery and then looping to retry the RPC. - If a successful reply to an RPC for an old session wasn't processed until after a new session was created for a NFS4ERR_BAD_SESSION error, it would erroneously update the new session and corrupt it. - The use of the first element of the session list in the nfs mount structure (which is always the current metadata session) was slightly racey. With changes for the above problems it became more racey, so all uses of this head pointer was wrapped with a NFSLOCKMNT()/NFSUNLOCKMNT(). - Although the kernel malloc() usually allocates more bytes than requested and, as such, this wouldn't have caused problems, the allocation of a session structure was 1 byte smaller than it should have been. (Null termination byte for the string not included in byte count.) There are probably still problems with a pNFS data server that fails with NFS4ERR_BAD_SESSION, but I have no server that does this to test against (the AmazonEFS server doesn't do pNFS), so I can't fix these yet. Although this patch is fairly large, it should only affect the handling of NFS4ERR_BAD_SESSION error replies from an NFSv4.1 server. Thanks go to cperciva@ for the extension testing he did to help isolate/fix these problems. Reported by: cperciva Tested by: cperciva MFC after: 3 months Differential Revision: https://reviews.freebsd.org/D8745	2016-12-23 23:14:53 +00:00
Konstantin Belousov	abc1515601	NFSv4 client tracks opens, and the track records are only dropped when the vnode is inactivated. This contradicts with the nullfs caching which keeps upper vnode around, as consequence keeping the use reference to lower vnode. Add a filesystem flag to request nullfs to not cache when mounted over that filesystem, and set the flag for nfs v4 mounts. Reported by: asomers Reviewed by: rmacklem Tested by: asomers, rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-27 09:20:58 +00:00
Konstantin Belousov	753a007f0d	Use buffer pager for NFS. The pager, due to its construction, implements clustering for the page-ins. In particular, buildworld load demonstrates reduction of the READ RPCs from 39k down to 24k. No change in real or CPU time was observed. Discussed with, and measured by: bde No objections from: rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-22 10:58:24 +00:00
Konstantin Belousov	fc2c3afee0	Minor cleanup, remove unneeded XXX comments and unused re-define. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-22 10:24:59 +00:00
Rick Macklem	1b819cf265	Update the nfsstats structure to include the changes needed by the patch in D1626 plus changes so that it includes counts for NFSv4.1 (and the draft of NFSv4.2). Also, make all the counts uint64_t and add a vers field at the beginning, so that future revisions can easily be implemented. There is code in place to handle the old vesion of the nfsstats structure for backwards binary compatibility. Subsequent commits will update nfsstat(8) to use the new fields. Submitted by: will (earlier version) Reviewed by: ken MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D1626	2016-08-12 22:44:59 +00:00
Konstantin Belousov	ad600ac8e3	Remove ncl_printf(), use printf(9) directly. After r303710 the function duplicates printf(). Correct function names in the messages []. Noted by: bde [] Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-03 15:58:20 +00:00
Konstantin Belousov	83d7cf21ea	Remove unneeded (recursing) Giant acquisition around vprintf(9). Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-08-03 11:49:17 +00:00
Konstantin Belousov	20de93c6c0	Clean other flags in ncl_inactive, only. Add comment explaining why other flags should be unset. Suggested and reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 12 days Approved by: re (gjb)	2016-06-26 14:18:28 +00:00
Konstantin Belousov	8f73d398ed	Since VOP_INACTIVE() is not guaranteed to be called, all cleanups executed by inactive methods, must be repeated on reclaim. In particular, unlink and free sillyrenamed vnode both on inactivation and reclaim. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-06-25 11:34:06 +00:00
Konstantin Belousov	e37dfd3d2b	Do not access NFS data for reclaimed vnode. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (delphij)	2016-06-19 18:29:43 +00:00
Conrad Meyer	ab8316b8df	nfs_clvfsops: Fix leading whitespace introduced in r299848 Replace spaces with tabs. No functional change. Sponsored by: EMC / Isilon Storage Division	2016-06-07 20:16:01 +00:00
Conrad Meyer	15634fd60c	nfs_clvfsops: Prevent strdup of stack garbage with bogus mount specs If strlen(hostp) was zero, the stack array 'nam' would never be initialized before being strdup()ed. Fix this by initializing it to the empty string. It's possible some external condition makes this case impossible, in which case, an assertion instead of this workaround is appropriate. Introduced in r299848. Reported by: Coverity CID: 1355336 Sponsored by: EMC / Isilon Storage Division	2016-06-07 20:00:20 +00:00
Gleb Smirnoff	fefbf77024	Comment fix: the getsockaddr() is actually meant here. Reviewed by: rmacklem	2016-05-18 17:40:53 +00:00
Edward Tomasz Napierala	0d1654c39b	Make it possible to reroot into NFS. This means one can have eg an NFSv4 root over WiFi: boot from md_root (small rootfs image preloaded by loader(8)), setup WiFi, and then reroot into the actual root, over NFS. Note that it's currently limited to NFSv4, and due to problems with nfsuserd(8) it requres a workaround on the server side: one needs to set the vfs.nfsd.enable_stringtouid=1 sysctl and not run nfsuserd(8) on either the server or the client side. Reviewed by: rmacklem@ MFC after: 1 month Relnotes: yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6347	2016-05-15 08:34:59 +00:00
Konstantin Belousov	b6a60ae74a	Use vfs_hash_ref(9) to eliminate LK_EXCLOTHER kludge. As a consequence, the nfs client override of VOP_LOCK1() is no longer needed. Reviewed and tested by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-05-11 06:35:46 +00:00
Pedro F. Giffuni	a96c9b30e2	NFS: spelling fixes on comments. No funcional change.	2016-04-29 16:07:25 +00:00
Rick Macklem	84aa8a8ad1	Bruce Evans reported that there was a performance regression between the old and new NFS clients. He did a good job of isolating the problem which was caused by the new NFS client not setting the post write mtime correctly. The new NFS client code was cloned from the old client, but was incorrect, because the mtime in the nfs vnode's cache wasn't yet updated. This patch fixes this problem. The patch also adds missing mutex locking. Reported and tested by: bde MFC after: 2 weeks	2016-04-11 21:55:21 +00:00
Pedro F. Giffuni	74b8d63dcc	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
Bjoern A. Zeeb	8676704962	Unbreak NOIP builds after r294084.	2016-01-15 16:45:36 +00:00
Alexander V. Chernikov	d3bf8f6486	Make nfscl_getmyip() use new routing KPI. * Use standard IPv6 SAS instead of rt->rt_ifa address. * Make address lookup work for IPv6 LLA. * Save address into buffer provided by caller instead of using static vars. Discussed with: rmacklem	2016-01-15 09:05:14 +00:00
Gleb Smirnoff	f17f88d3e0	Fix breakage caused by r292373 in ZFS/FUSE/NFS/SMBFS. With the new VOP_GETPAGES() KPI the "count" argument counts pages already, and doesn't need to be translated from bytes to pages. While here make it consistent that rbehind and rahead are updated only if we doesn't return error. Pointy hat to: glebius	2015-12-16 23:48:50 +00:00
Gleb Smirnoff	b0cd20172d	A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions. Such arrayed requests for now should be preceeded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete. Strenghtening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq(). o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller. This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault. Discussed with: kib, alc, jeff, scottl Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-12-16 21:30:45 +00:00
Kirk McKusick	43a993bb7d	For performance reasons, it is useful to have a single string used as the name of a filesystem when setting it as the first parameter to the getnewvnode() function. Most filesystems call getnewvnode from just one place so can use a literal string as the first parameter. However, NFS calls getnewvnode from two places, so we create a global constant string that can be used by the two instances. This change also collapses two instances of getnewvnode() in the UFS filesystem to a single call. Reviewed by: kib Tested by: Peter Holm	2015-11-29 21:01:02 +00:00
Rick Macklem	b179878dde	Revert r283330 since it broke directory caching in the client. At this time I cannot see a way to fix directory caching when it has partial blocks in the buffer cache, due to the fact that the syscall's uio_offset won't stay the same as the lblkno * NFS_DIRBLKSIZ offset. Reported by: bde MFC after: 2 weeks	2015-11-21 00:15:41 +00:00
Rick Macklem	f315383406	mnt_stat.f_iosize (which is used to set bo_bsize) must be set to the largest size of buffer cache block or the mapping of the buffer is bogus. When a mount with rsize=4096,wsize=4096 was done, f_iosize would be set to 4096. This resulted in corrupted directory data, since the buffer cache block size for directories is NFS_DIRBLKSIZ (8192). This patch fixes the code so that it always sets f_iosize to at least NFS_DIRBLKSIZ. Tested by: krichy@cflinux.hu PR: 177971 MFC after: 2 weeks	2015-11-17 01:44:26 +00:00
Conrad Meyer	b5af3f30a7	nfsclient: Protest loudly when GETATTR responses are invalid BROKEN NFS SERVER OR MIDDLEWARE: Certain WAN "accelerators" attempt to cache NFS GETATTR traffic, but actually corrupt it (e.g., responding to requests with attributes for totally different files). Warn very verbosely when this is detected. Linux' NFS client has a similar warning. Adds a sysctl/tunable (vfs.nfs.fileid_maxwarnings) to configure the quantity of warnings; default to 10. (Zero disables; -1 is unlimited.) Adds a failpoint to aid in validating the warning / behavior with a non-broken server. Use something like: sysctl 'debug.fail_point.nfscl_force_fileid_warning=10%return(1)' Reviewed by: rmacklem Approved by: markj (mentor) Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3304	2015-08-05 22:27:30 +00:00
Rick Macklem	2a3508eb48	If a "principal" argument isn't provided for a Kerberized NFS mount, the kernel would generate a bogus one with a ":/<path>" suffix. This would only occur for the case where there was no explicit "principal" argument and the getaddrinfo() call in mount_nfs.c failed to a return a cannonical name for the server. This patch fixes this unusual case. PR: 201073 Submitted by: masato@itc.naist.jp MFC after: 2 weeks	2015-07-03 22:11:07 +00:00
Rick Macklem	d189dcb6e2	Alex Burlyga reported a POLA violation for the new NFS client as compared to the old NFS client via email to the freebsd-fs@ mailing list. For the new client, when multiple clients attempted to create a symbolic link concurrently, more that one client would report success instead of EEXIST. This was caused by code in the new client that mapped EEXIST to OK assuming it was caused by a retried RPC request. Since the old client did not do this, the patch defaults to the old behaviour and permits the new behaviour to be enabled via a sysctl. Reported by: alex.burlyga.ietf@gmail.com Tested by: alex.burlyga.ietf@gmail.com MFC after: 2 weeks	2015-07-03 01:15:21 +00:00
Gleb Smirnoff	093ebe1d28	o Un-inline vm_pager_get_pages(), vm_pager_get_pages_async(). o Provide an extensive set of assertions for input array of pages. o Remove now duplicate assertions from different pagers. Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-06-17 22:44:27 +00:00
Rick Macklem	262a84286d	The NFS client generated directory block(s) with d_fileno == 0 so that it would not return less data than requested. Since returning less directory data than requested is not a problem for FreeBSD and even UFS no longer returns directory structures with d_fileno == 0, this patch stops the client from doing this. Although entries with d_fileno == 0 should not be a problem, the man pages no longer document that these entries should be ignored, so there was a concern that these entries might be an issue in the future. Suggested by: trasz Tested by: trasz MFC after: 2 weeks	2015-05-23 21:58:41 +00:00
Rick Macklem	86b9457f5b	The NFS client wasn't handling getdirentries(2) requests for sizes that are not an exact multiple of DIRBLKSIZ correctly. Fortunately readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications were affected. This patch fixes this problem by reducing the size of the directory read to an exact multiple of DIRBLKSIZ. Tested by: trasz Reported by: trasz Reviewed by: trasz MFC after: 2 weeks	2015-05-21 23:14:18 +00:00
Alexander Motin	a87627b26b	Do not promote large async writes to sync. Present implementation of large sync writes is too strict and so can be quite slow. Instead of doing that, execute large async write in chunks, syncing each chunk separately. It would be good to fix large sync writes too, but I leave it to somebody with more skills in this area. Reviewed by: rmacklem MFC after: 1 week	2015-05-14 10:04:42 +00:00
Rick Macklem	7cfdc2a7bc	MAXBSIZE defines both the largest UFS block size and the largest size for a buffer in the buffer cache. This patch defines a new constant MAXBCACHEBUF, which is the largest size for a buffer in the buffer cache. Having a separate constant allows MAXBCACHEBUF to be set larger than MAXBSIZE on a per-architecture basis, so that NFS can do larger read/writes for these architectures. It modifies sys/param.h so that BKVASIZE can also be set on a per-architecture basis. A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE is fixed as well. Differential Revision: https://reviews.freebsd.org/D2330 Reviewed by: mav, kib MFC after: 2 weeks	2015-04-25 00:52:01 +00:00
Pedro F. Giffuni	2f39c91019	Prevent a double free. This is similar to r281756 so set the ptr NULL after free as a safety belt against future changes. Obtained from: HardenedBSD (b2e77ced9ae213d358b44d98f552d9ae4636ecac) Submitted by: Oliver Pinter Revewed by: rmacklem	2015-04-20 16:40:13 +00:00

1 2 3 4 5 ...

320 Commits