freebsd-dev

Author	SHA1	Message	Date
Rick Macklem	86b9457f5b	The NFS client wasn't handling getdirentries(2) requests for sizes that are not an exact multiple of DIRBLKSIZ correctly. Fortunately readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications were affected. This patch fixes this problem by reducing the size of the directory read to an exact multiple of DIRBLKSIZ. Tested by: trasz Reported by: trasz Reviewed by: trasz MFC after: 2 weeks	2015-05-21 23:14:18 +00:00
Rick Macklem	2f88b3d20a	Delete some duplicate code that was harmless because exactly the same code is at the end of the nfscl_checksattr() function that is called just before it. As such, this code had already been executed and didn't do anything. MFC after: 1 week	2014-12-25 22:29:37 +00:00
Konstantin Belousov	6c21f6edb8	The VOP_LOOKUP() implementations for CREATE op do not put the name into namecache, to avoid cache thrashing when doing large operations. E.g., tar archive extraction is not usually followed by access to many of the files created. Right now, each VOP_LOOKUP() implementation explicitly knows about this quirk and tests for both MAKEENTRY flag presence and op != CREATE to make the call to cache_enter(). Centralize the handling of the quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP. VFS now sets NOCACHE flag for CREATE namei() calls. Note that the change in semantics is backward-compatible and could be merged to the stable branch, and is compatible with non-changed third-party filesystems which correctly handle MAKEENTRY. Suggested by: Chris Torek <torek@pi-coral.com> Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-18 10:01:12 +00:00
Konstantin Belousov	65589a29f4	Check for the cross-device cross-link attempt in the VFS, instead of forcing filesystem VOP_LINK() methods to repeat the code. In tmpfs_link(), remove redundant check for the type of the source, already done by VFS. Note that NFS server already performs this check before calling VOP_LINK(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-16 14:04:46 +00:00
Rick Macklem	c0990edac6	Modify the NFSv4 client's Pathconf RPC (actually a Getattr Op.) so that it only does the RPC for names that are answered by the RPC. Doing the RPC for other names is harmless, but unnecessary. MFC after: 2 weeks	2014-04-23 22:13:10 +00:00
Rick Macklem	c7b560b9b4	For an NFSv4 mount with the "nocto" option, don't get the up to date file attributes upon close. This reduces the Getattr RPC count by about 65% for software builds. MFC after: 2 weeks	2014-04-21 19:10:23 +00:00
Rick Macklem	cf766161ff	For software builds, the NFS client does many small synchronous (with FILE_SYNC) writes because non-contiguous byte ranges in the same buffer cache block are being written. This patch adds a new mount option "noncontigwr" which allows the non-contiguous byte ranges to be combined, with the dirty byte range becoming the superset of the bytes that are dirty, if the file has not been file locked. This reduces the number of writes significantly for software builds. The only case where this change might break existing applications is where an application is writing non-overlapping byte ranges within the same buffer cache block of a file from multiple clients concurrently. Since such an application would normally do file locking on the file, avoiding the byte range merge for files that have been file locked should be sufficient for most (maybe all?) cases. Submitted by: jhb (earlier version) Reviewed by: kib MFC after: 3 weeks	2013-12-07 23:05:59 +00:00
Attilio Rao	54366c0bd7	- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantics by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invocation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip	2013-11-25 07:38:45 +00:00
Rick Macklem	562395581b	When the NFSv4.1 client is writing to a pNFS Data Server (DS), the file's size attribute does not get updated. As such, it is necessary to invalidate the attribute cache before clearing NMODIFIED for pNFS. MFC after: 2 weeks	2013-06-21 22:26:18 +00:00
Jeff Roberson	22a722605d	- Convert the bufobj lock to rwlock. - Use a shared bufobj lock in getblk() and inmem(). - Convert softdep's lk to rwlock to match the bufobj lock. - Move INFREECNT to b_flags and protect it with the buf lock. - Remove unnecessary locking around bremfree() and BKGRDINPROG. Sponsored by: EMC / Isilon Storage Division Discussed with: mckusick, kib, mdf	2013-05-31 00:43:41 +00:00
Rick Macklem	77a03c148c	Add support for the eofflag to nfs_readdir() in the new NFS client so that it works under a unionfs mount. Submitted by: Jared Yanovich (slovichon@gmail.com) Reviewed by: kib MFC after: 2 weeks	2013-05-12 21:48:08 +00:00
John Baldwin	3b14c753ff	Revert 195703 and 195821 as this special stop handling in NFS is now implemented via VFCF_SBDRY rather than passing PBDRY to individual sleep calls.	2013-03-13 21:06:03 +00:00
Attilio Rao	89f6b8632c	Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but a few notes are reported: * The KPI changes as follows: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho	2013-03-09 02:32:23 +00:00
John Baldwin	d177f14da9	Use vfs_timestamp() to set file timestamps rather than invoking getmicrotime() or getnanotime() directly in NFS. Reviewed by: rmacklem, bde MFC after: 1 week	2013-01-18 18:43:38 +00:00
Rick Macklem	1f60bfd822	Move the NFSv4.1 client patches over from projects/nfsv4.1-client to head. I don't think the NFS client behaviour will change unless the new "minorversion=1" mount option is used. It includes basic NFSv4.1 support plus support for pNFS using the Files Layout only. All problems detected during an NFSv4.1 Bakeathon testing event in June 2012 have been resolved in this code and it has been tested against the NFSv4.1 server available to me. Although not reviewed, I believe that kib@ has looked at it.	2012-12-08 22:52:39 +00:00
Rick Macklem	7af1242a34	PR# 165923 reported intermittent write failures for dirty memory mapped pages being written back on an NFS mount. Since any thread can call VOP_PUTPAGES() to write back a dirty page, the credentials of that thread may not have write access to the file on an NFS server. (Often the uid is 0, which may be mapped to "nobody" in the NFS server.) Although there is no completely correct fix for this (NFS servers check access on every write RPC instead of at open/mmap time), this patch avoids the common cases by holding onto a credential that recently opened the file for writing and uses that credential for the write RPCs being done by VOP_PUTPAGES() for both NFS clients. Tested by: Joel Ray Holveck (joelh at juniper.net) PR: kern/165923 Reviewed by: kib MFC after: 2 weeks	2012-05-12 12:02:51 +00:00
Rick Macklem	4964d80705	It was reported via email that some non-FreeBSD NFS servers do not include file attributes in the reply to an NFS create RPC under certain circumstances. This resulted in a vnode of type VNON that was not usable. This patch adds an NFS getattr RPC to nfs_create() for this case, to fix the problem. It was tested by the person that reported the problem and confirmed to fix this case for their server. Tested by: Steven Haber (steven.haber at isilon.com) MFC after: 2 weeks	2012-04-27 22:23:06 +00:00
Konstantin Belousov	a53373fabe	Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client behaviour on error from write RPC back to behaviour of old nfs client. When set to not zero, the pages for which write failed are kept dirty. PR: kern/165927 Reviewed by: alc MFC after: 2 weeks	2012-03-17 23:03:20 +00:00
Rick Macklem	5e99212d36	Post r230394, the Lookup RPC counts for both NFS clients increased significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels. Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks	2012-03-03 01:06:54 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows SSIZE_MAX-sized i/o requests from usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP use mnt_noasync counter to temporarily remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
John Baldwin	0b17c7bea5	Add a timeout on positive name cache entries in the NFS client. That is, we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely. Reviewed by: rmacklem MFC after: 1 week	2012-01-25 20:05:58 +00:00
John Baldwin	5aefb4cbbf	Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks	2012-01-20 20:02:01 +00:00
Rick Macklem	a5e583eea0	Modify the new NFS client so that nfs_fsync() only calls ncl_flush() for regular files. Since other file types don't write into the buffer cache, calling ncl_flush() is almost a no-op. However, it does clear the NMODIFIED flag and this shouldn't be done by nfs_fsync() for directories. MFC after: 2 weeks	2011-11-15 23:35:43 +00:00
Rick Macklem	670bf6f126	Since NFSv4 byte range locking only works for regular files, add a sanity check for the vnode type to the NFSv4 client. MFC after: 2 weeks	2011-11-14 00:10:11 +00:00
Rick Macklem	d1907de2ba	The new NFS client failed to vput() the new vnode if a setattr failed after the file was created in nfs_create(). This would probably only happen during a forced dismount. The old NFS client does have a vput() for this case. Detected by pho during recent testing, where an open syscall returned with a vnode still locked. Tested by: pho Approved by: re (kib) MFC after: 2 weeks	2011-07-30 22:57:38 +00:00
Zack Kirsch	68347a92db	Simple find/replace of VOP_ISLOCKED -> NFSVOPISLOCKED. This is done so that NFSVOPISLOCKED can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:41 +00:00
Zack Kirsch	a998963469	Simple find/replace of VOP_UNLOCK -> NFSVOPUNLOCK. This is done so that NFSVOPUNLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:36 +00:00
Zack Kirsch	98f234f338	Simple find/replace of vn_lock -> NFSVOPLOCK. This is done so that NFSVOPLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:31 +00:00
Rick Macklem	8f0e65c915	Add DTrace support to the new NFS client. This is essentially cloned from the old NFS client, plus additions for NFSv4. A review of this code is in progress, however it was felt by the reviewer that it could go in now, before code slush. Any changes required by the review can be committed as bug fixes later.	2011-06-18 23:02:53 +00:00
Rick Macklem	fb35711d76	Add support for flock(2) locks to the new NFSv4 client. I think this should be ok, since the client now delays NFSv4 Close operations until VOP_INACTIVE()/VOP_RECLAIM(). As such, there should be no risk that the NFSv4 Open is closed while an associated byte range lock still exists. Tested by: avg MFC after: 2 weeks	2011-06-05 20:22:56 +00:00
Rick Macklem	f8f4e256e7	The new NFSv4 client was erroneously using "p" instead of "p_leader" for the "id" for POSIX byte range locking. I think this would only have affected processes created by rfork(2) with the RFTHREAD flag specified. This patch fixes that by passing the "id" down through the various functions from nfs_advlock(). MFC after: 2 weeks	2011-06-05 18:17:37 +00:00
Rick Macklem	b398d10657	Fix the new NFS client so that it doesn't do an NFSv3 Pathconf RPC for cases where the reply doesn't include the answer. This fixes a problem reported by avg@ where the NFSv3 Pathconf RPC would fail when "ls -l" did an lpathconf(2) for _PC_ACL_NFS4. Tested by: avg MFC after: 2 weeks	2011-05-31 17:43:25 +00:00
Rick Macklem	81ddb192e8	Add some missing mutex locking to the new NFS client. MFC after: 2 weeks	2011-05-25 21:17:53 +00:00
Rick Macklem	147206ae68	Fix the new NFS client so that it correctly sets the "must_commit" argument for a write RPC when it succeeds for the first one and fails for a subsequent RPC within the same call to the function. This makes it compatible with the old NFS client for this case. MFC after: 2 weeks	2011-05-25 20:53:08 +00:00
Alan Cox	76036f2bbd	Eliminate duplicate #include's.	2011-05-22 18:11:41 +00:00
Rick Macklem	1f3765902c	Change the sysctl naming for the old and new NFS clients to vfs.oldnfs.xxx and vfs.nfs.xxx respectively. This makes the default nfs client use vfs.nfs.xxx after r221124.	2011-05-15 20:52:43 +00:00
Ruslan Ermilov	e2f2b37089	Implemented a mount option "nocto" that disables cache coherency checking at open time. It may improve performance for read-only NFS mounts. Use deliberately. MFC after: 1 week Reviewed by: rmacklem, jhb (earlier version)	2011-05-04 13:27:45 +00:00
Rick Macklem	bc62b5cf6a	Add a vput() to nfs_lookitup() in the experimental NFS client for a case that will probably never happen. It can only happen if a server were to successfully lookup a file, but not return attributes for that file. Although technically allowed by the NFSv3 RFC, I doubt any server would ever do this. However, if it did, the client would have not vput()'d the new vnode when it needed to do so. MFC after: 2 weeks	2011-04-18 01:02:43 +00:00
Rick Macklem	ab42af2708	Add vput() calls in two places in the experimental NFS client that would be needed if, in the future, nfscl_loadattrcache() were to return an error. Currently nfscl_loadattrcache() never returns an error, so these cases never currently happen. MFC after: 2 weeks	2011-04-18 00:41:23 +00:00
Rick Macklem	78d8a60009	Change the mutex locking for several locations in the experimental NFS client's vnode op functions to make them compatible with the regular NFS client. I'll admit I'm not sure that the mutex locks around the assignments are needed, but the regular client has them, so I added them. Also, add handling of the case of partial attributes in setattr to be compatible with the regular client. MFC after: 2 weeks	2011-04-17 23:56:57 +00:00
Rick Macklem	0a9f005dff	Fix up some of the sysctls for the experimental NFS client so that they use the same names as the regular client. Also add string descriptions for them. MFC after: 2 weeks	2011-04-17 18:56:17 +00:00
Rick Macklem	4b3a38ecdf	Add a lktype flags argument to nfscl_nget() and ncl_nget() in the experimental NFS client so that its nfs_lookup() function can use cn_lkflags in a manner analogous to the regular NFS client. MFC after: 2 weeks	2011-04-16 23:20:21 +00:00
Rick Macklem	149ce1025c	Add VOP_PATHCONF() support to the experimental NFS client so that it can, along with other things, report whether or not NFS4 ACLs are supported. MFC after: 2 weeks	2011-04-13 22:37:28 +00:00
Rick Macklem	f93d95cbf6	Modify nfs_open() in the experimental NFS client to be compatible with the regular NFS client. Also, fix a couple of mutex lock issues. MFC after: 1 week	2010-10-29 13:46:21 +00:00
Rick Macklem	ca27c028d8	Modify the NFS clients and the NLM so that the NLM can be used by both clients. Since the NLM uses various fields of the nfsmount structure, those fields were extracted and put in a separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h. This structure also has a function pointer for a function that extracts the required information from the mount point and nfs vnode for that particular client, for information stored differently by the clients. Reviewed by: jhb MFC after: 2 weeks	2010-10-19 00:20:00 +00:00
Rick Macklem	a8c0af5906	Fix the experimental NFS client so that it doesn't panic when NFSv2,3 byte range locking is attempted. A fix that allows the nlm_advlock() to work with both clients is in progress, but may take a while. As such, I am doing this commit so that the kernel doesn't panic in the meantime. Submitted by: jh MFC after: 2 weeks	2010-09-09 15:45:11 +00:00
John Baldwin	8e27c18282	Store the full timestamp when caching timestamps of files and directories for purposes of validating name cache entries. This closes races where two updates to a file or directory within the same second could result in stale entries in the name cache. While here, remove the 'n_expiry' field as it is no longer used. Reviewed by: rmacklem MFC after: 1 week	2010-09-07 14:29:45 +00:00
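
Note on 86b9457f5b: the message describes reducing the size of a directory read to an exact multiple of DIRBLKSIZ. A minimal sketch of that arithmetic follows. It is not the committed code: the helper name example_trim_dirread() and the constant EXAMPLE_DIRBLKSIZ (with its value of 512) are assumptions made for illustration only.

```c
#include <stddef.h>

/* Assumed block size for illustration; the real DIRBLKSIZ comes from kernel headers. */
#define EXAMPLE_DIRBLKSIZ 512

/*
 * Hypothetical helper: trim a requested directory-read size down to the
 * largest exact multiple of EXAMPLE_DIRBLKSIZ that fits within it.
 * Returns 0 when the request is smaller than one block.
 */
static size_t
example_trim_dirread(size_t resid)
{
	return (resid - (resid % EXAMPLE_DIRBLKSIZ));
}
```

Under this assumed block size, a 1000-byte request would be trimmed to 512 bytes, while a 1024-byte request is left unchanged. How the actual client treats a request smaller than one block is not stated in the commit message, so the sketch simply returns 0 for that case.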

76 Commits