freebsd-skq

Author	SHA1	Message	Date
Rick Macklem	7cfdc2a7bc	MAXBSIZE defines both the largest UFS block size and the largest size for a buffer in the buffer cache. This patch defines a new constant MAXBCACHEBUF, which is the largest size for a buffer in the buffer cache. Having a separate constant allows MAXBCACHEBUF to be set larger than MAXBSIZE on a per-architecture basis, so that NFS can do larger read/writes for these architectures. It modifies sys/param.h so that BKVASIZE can also be set on a per-architecture basis. A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE is fixed as well. Differential Revision: https://reviews.freebsd.org/D2330 Reviewed by: mav, kib MFC after: 2 weeks	2015-04-25 00:52:01 +00:00
Pedro F. Giffuni	2f39c91019	Prevent a double free. This is similar to r281756 so set the ptr NULL after free as a safety belt against future changes. Obtained from: HardenedBSD (b2e77ced9ae213d358b44d98f552d9ae4636ecac) Submitted by: Oliver Pinter Reviewed by: rmacklem	2015-04-20 16:40:13 +00:00
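The "set the ptr NULL after free" safety belt mentioned above can be illustrated with a minimal userland sketch. The `holder` structure and `holder_release()` are hypothetical names chosen for illustration; they are not taken from the NFS code touched by this commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for a heap-allocated name; not taken from the NFS code. */
struct holder {
	char *name;
};

/* Free the buffer and clear the pointer so a repeated call is a harmless no-op. */
static void
holder_release(struct holder *h)
{
	assert(h != NULL);
	free(h->name);		/* free(NULL) is defined to do nothing */
	h->name = NULL;		/* safety belt against a later double free */
}
```

Because the pointer is cleared, an error path that calls `holder_release()` a second time ends up calling `free(NULL)` instead of freeing the same block twice.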
Pedro F. Giffuni	a3a4b110da	nfsrpc_createv4: fix double free. Reported by: Oliver Pinter, clang static checker Obtained from: HardenedBSD (commit 63cac77c42c0c3fc67da62f97d5ab651d52ae707) Reviewed by: rmacklem MFC after: 5 days	2015-04-19 23:55:59 +00:00
Alexander Motin	afdfc9a40d	Change wcommitsize default from one empirical value to another. The new value is more predictable with growing RAM size:
              hibufspace  maxvnodes      old       new
i386:
   256MB        32980992      15800  2198732   2097152
     2GB        94027776     107677   878764   4194304
amd64:
   256MB        32980992      15800  2198732   2097152
     1GB       114114560      68062  1678155   4194304
     4GB       217055232     111807  1955452   4194304
    16GB      1717846016     337308  5097465  16777216
    64GB      1734918144    1164427  1490479  16777216
   256GB      1734918144    4426453   391983  16777216
Reviewed by: rmacklem MFC after: 2 weeks	2015-04-19 11:34:41 +00:00
Edward Tomasz Napierala	50a220c699	Replace "new NFS" with just "NFS" in some sysctl description strings. Sponsored by: The FreeBSD Foundation	2015-04-19 06:18:41 +00:00
Rick Macklem	dda11d4ab9	File systems that do not use the buffer cache (such as ZFS) must use VOP_FSYNC() to perform the NFS server's Commit operation. This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which is set by file systems that use the buffer cache. If this flag is not set, the NFS server always does a VOP_FSYNC(). This should be ok for old file system modules that do not set MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although it might not be optimal for file systems that use the buffer cache. Reviewed by: kib MFC after: 2 weeks	2015-04-15 20:16:31 +00:00
Gleb Smirnoff	4d6481a4c9	o Enhance vm_pager_free_nonreq() function: - Allow calling the function with the vm object lock held. - Allow specifying a reqpage that doesn't match any page in the region, meaning that all pages are freed. o Utilize the new function in a couple more places in the vnode pager. Reviewed by: alc, kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-03-17 19:19:19 +00:00
Rick Macklem	07d491dede	r245508 modified the NFS client's Setattr RPC to use VA_UTIMES_NULL to indicate whether it should set the time to the current tod on the server. This had the side effect of making the NFS client use the client's timestamp for exclusive create, starting with FreeBSD9.2. Unfortunately a bug in some Solaris NFS servers causes these servers to return NFS_OK to the Setattr RPC done during exclusive create, but not actually set the file's mode, leaving the file's mode == 0. This patch restores the NFS client's behaviour to use the server's tod for the exclusive open's Setattr RPC, to avoid the Solaris server bug and to restore the pre-FreeBSD9.2 NFS behaviour. Discussed on: freebsd-fs PR: 186293 MFC after: 3 months	2014-12-28 21:13:52 +00:00
Rick Macklem	2f88b3d20a	Delete some duplicate code that was harmless because exactly the same code is at the end of the nfscl_checksattr() function that is called just before it. As such, this code had already been executed and didn't do anything. MFC after: 1 week	2014-12-25 22:29:37 +00:00
Rick Macklem	62c23db947	Fix kernel builds with "options NFS_DEBUG" that were broken by r276096. Also delete the two kernel options NFS_GATHERDELAY, NFS_WDELAYHASHSIZ which are no longer used. Reported by: bz	2014-12-23 14:24:36 +00:00
Rick Macklem	c15882f091	Remove the old NFS client and server from head, which means that the NFSCLIENT and NFSSERVER kernel options will no longer work. This commit only removes the kernel components. Removal of unused code in the user utilities will be done later. This commit does not include an addition to UPDATING, but that will be committed in a few minutes. Discussed on: freebsd-fs	2014-12-23 00:47:46 +00:00
Konstantin Belousov	6c21f6edb8	The VOP_LOOKUP() implementations for CREATE op do not put the name into namecache, to avoid cache thrashing when doing large operations. E.g., tar archive extraction is not usually followed by access to many of the files created. Right now, each VOP_LOOKUP() implementation explicitly knows about this quirk and tests for both MAKEENTRY flag presence and op != CREATE to make the call to cache_enter(). Centralize the handling of the quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP. VFS now sets NOCACHE flag for CREATE namei() calls. Note that the change in semantics is backward-compatible and could be merged to the stable branch, and is compatible with non-changed third-party filesystems which correctly handle MAKEENTRY. Suggested by: Chris Torek <torek@pi-coral.com> Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-18 10:01:12 +00:00
Edward Tomasz Napierala	2fbe0cff73	Fix handling of "conn" mount_nfs(8) option. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2014-10-30 09:25:03 +00:00
Edward Tomasz Napierala	5a06ac3540	Add support for "timeo", "actimeo", "noac", and "proto" options to mount_nfs(8). They are implemented on Linux, OS X, and Solaris, and thus can be expected to appear in automounter maps. Reviewed by: rmacklem@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2014-10-30 08:50:01 +00:00
Rick Macklem	6a30c96cdc	Clip the settings for the NFS rsize, wsize mount options to a power of 2. For non-power of 2 settings, intermittent page faults have been reported. Although the bug that causes these page faults/crashes has not been identified, it does not appear to occur when rsize, wsize is a power of 2. Reported by: tcberner@gmail.com MFC after: 2 weeks	2014-10-22 22:27:51 +00:00
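Clipping a size to a power of 2 with an fls()-style helper can be sketched as follows. This assumes "clip" means rounding down; `sketch_fls()` is a portable stand-in written for this example (it returns the 1-based position of the highest set bit), and `clip_pow2()` is a hypothetical name, not the function in the NFS client.

```c
#include <assert.h>

/* Portable stand-in for fls(): 1-based index of the highest set bit, 0 if none. */
static int
sketch_fls(unsigned int v)
{
	int bit = 0;

	while (v != 0) {
		bit++;
		v >>= 1;
	}
	return (bit);
}

/* Round a positive size down to the nearest power of 2. */
static unsigned int
clip_pow2(unsigned int size)
{
	assert(size > 0);
	return (1u << (sketch_fls(size) - 1));
}
```

Under this sketch, a requested size of 50000 becomes 32768, while 65536 is already a power of 2 and is left unchanged.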
Rick Macklem	fcf121d481	Revert r273481 so it can be recoded using fls(), which some feel will make it more readable.	2014-10-22 21:57:35 +00:00
Rick Macklem	88cc4e92da	Clip the settings for the NFS rsize, wsize mount options to a power of 2. For non-power of 2 settings, intermittent page faults have been reported. Although the bug that causes these page faults/crashes has not been identified, it does not appear to occur when rsize, wsize is a power of 2. Reported by: tcberner@gmail.com MFC after: 2 weeks	2014-10-22 20:47:11 +00:00
Davide Italiano	2be111bf7d	Follow up to r225617. In order to maximize the re-usability of kernel code in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv(). This fixes a namespace collision with libc symbols. Submitted by: kmacy Tested by: make universe	2014-10-16 18:04:43 +00:00
Alan Cox	396b3e34b4	Avoid an exclusive acquisition of the object lock on the expected execution path through the NFS clients' getpages functions. Introduce vm_pager_free_nonreq(). This function can be used to eliminate code that is duplicated in many getpages functions. Also, in contrast to the code that currently appears in those getpages functions, vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one case. Reviewed by: kib MFC after: 6 weeks Sponsored by: EMC / Isilon Storage Division	2014-09-14 18:07:55 +00:00
Konstantin Belousov	65589a29f4	Check for the cross-device cross-link attempt in the VFS, instead of forcing filesystem VOP_LINK() methods to repeat the code. In tmpfs_link(), remove redundant check for the type of the source, already done by VFS. Note that NFS server already performs this check before calling VOP_LINK(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-16 14:04:46 +00:00
Rick Macklem	c59e4cc34d	Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server. MFC after: 1 month	2014-07-01 20:47:16 +00:00
Rick Macklem	2d5f835917	There might be a potential race condition for the NFSv4 client when a newly created file has another open done on it that updates the open mode. This patch moves the code that updates the open mode up into the block where the mutex is held to ensure this cannot happen. No bug caused by this potential race has been observed, but this fix is a safety belt to ensure it cannot happen. MFC after: 2 weeks	2014-06-28 21:47:15 +00:00
Rick Macklem	c0990edac6	Modify the NFSv4 client's Pathconf RPC (actually a Getattr Op.) so that it only does the RPC for names that are answered by the RPC. Doing the RPC for other names is harmless, but unnecessary. MFC after: 2 weeks	2014-04-23 22:13:10 +00:00
Rick Macklem	9eeef7464b	Fixes mkdir for the NFSv2 client that was broken by r264705. Reported by: bdrewery MFC after: 2 weeks	2014-04-22 04:42:46 +00:00
Rick Macklem	c7b560b9b4	For an NFSv4 mount with the "nocto" option, don't get the up to date file attributes upon close. This reduces the Getattr RPC count by about 65% for software builds. MFC after: 2 weeks	2014-04-21 19:10:23 +00:00
Rick Macklem	c3e4a7261c	Modify the NFSv4 client create/mkdir RPC so that it acquires post-create/mkdir directory attributes. This allows the RPC to name cache the newly created directory and reduces the lookup RPC count for applications creating a lot of directories. MFC after: 2 weeks	2014-04-20 22:19:00 +00:00
Rick Macklem	de1a42bd0c	Modify the NFSv4 client open/create RPC so that it acquires post-open/create directory attributes. This allows the RPC to name cache the newly created file and reduces the lookup RPC count by about 10% for software builds. MFC after: 2 weeks	2014-04-19 19:40:20 +00:00
Rick Macklem	a6f8e64e74	Modify the Lookup RPC for NFSv4 so that it acquires directory attributes. This allows the client to cache directory names when they are looked up, reducing the Lookup RPC count by about 40% for software builds. MFC after: 2 weeks	2014-04-18 22:05:34 +00:00
Robert Watson	4a14441044	Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks	2014-03-16 10:55:57 +00:00
Rick Macklem	b921158ae0	The NFSv4 client was passing both the p and cred arguments to nfsv4_fillattr() as NULLs for the Getattr callback. This caused nfsv4_fillattr() to not fill in the Change attribute for the reply. I believe this was a violation of the RFC, but had little effect on server behaviour. This patch passes a non-NULL p argument to fix this. MFC after: 1 week	2013-12-24 00:48:39 +00:00
Rick Macklem	6b8fe5d59d	The NFSv4.1 client didn't return NFSv4.1 specific error codes for the Getattr and Recall callbacks. This patch fixes it. Since the NFSv4.1 specific error codes would only happen for abnormal circumstances, this patch has little effect, in practice. MFC after: 1 week	2013-12-23 15:16:53 +00:00
Rick Macklem	cf766161ff	For software builds, the NFS client does many small synchronous (with FILE_SYNC) writes because non-contiguous byte ranges in the same buffer cache block are being written. This patch adds a new mount option "noncontigwr" which allows the non-contiguous byte ranges to be combined, with the dirty byte range becoming the superset of the bytes that are dirty, if the file has not been file locked. This reduces the number of writes significantly for software builds. The only case where this change might break existing applications is where an application is writing non-overlapping byte ranges within the same buffer cache block of a file from multiple clients concurrently. Since such an application would normally do file locking on the file, avoiding the byte range merge for files that have been file locked should be sufficient for most (maybe all?) cases. Submitted by: jhb (earlier version) Reviewed by: kib MFC after: 3 weeks	2013-12-07 23:05:59 +00:00
Sergey Kandaurov	0d8dc7cc39	- Nuke a second copy of nfscl_attrcache extern declarations from under ifdef KDTRACE_HOOKS. This fixes kernel build with options KDTRACE_HOOKS. - Fix style inconsistencies.	2013-11-26 22:41:40 +00:00
Gleb Smirnoff	285e7a2d97	Fix build, attempt two.	2013-11-26 20:27:57 +00:00
Gleb Smirnoff	6882b8ea66	Fix build.	2013-11-26 10:34:34 +00:00
Attilio Rao	54366c0bd7	- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invocation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip	2013-11-25 07:38:45 +00:00
Rick Macklem	42b6336a98	Fix an NFSv4.1 client specific case where a forced dismount would hang. The hang occurred in nfsv4_setsequence() when it couldn't find an available session slot and is fixed by checking for a forced dismount in progress and just returning for this case. MFC after: 1 month	2013-11-09 21:24:56 +00:00
Pawel Jakub Dawidek	7008be5bd7	Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. 
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belong to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that.
This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation	2013-09-05 00:09:56 +00:00
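The one-hot array index described in this commit message can be modelled with a short sketch. The bit positions are an assumption derived from the description (two version bits at the top, then five index bits, leaving the low 57 bits for rights); every `SK_`/`sk_` name is hypothetical and this is not the actual header.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout, following the description: bits 63-62 hold the version
 * field, bits 61-57 hold a one-hot array index, the low 57 bits hold rights.
 */
#define SK_CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))
#define SK_INDEX_MASK		(0x1fULL << 57)

#define SK_CAP_LOOKUP	SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_CAP_FCHMOD	SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_CAP_PDKILL	SK_CAPRIGHT(1, 0x0000000000000800ULL)

/* Return the encoded array index, or -1 unless exactly one index bit is set. */
static int
sk_right_index(uint64_t right)
{
	uint64_t idxbits = (right & SK_INDEX_MASK) >> 57;
	int i;

	if (idxbits == 0 || (idxbits & (idxbits - 1)) != 0)
		return (-1);
	for (i = 0; i < 5; i++) {
		if (idxbits == (1ULL << i))
			return (i);
	}
	return (-1);
}

/* Add one right (or a same-element alias) to the matching array element. */
static void
sk_rights_add(uint64_t *elems, uint64_t right)
{
	int i = sk_right_index(right);

	assert(i >= 0);		/* rejects rights mixed across elements */
	elems[i] |= right;
}
```

With this encoding, OR-ing `SK_CAP_LOOKUP` and `SK_CAP_PDKILL` sets two index bits, so `sk_right_index()` returns -1 and the mistake is detectable, whereas an alias such as `SK_CAP_FCHMOD | SK_CAP_LOOKUP` still maps to element 0.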
Rick Macklem	f7d8291af0	Crashes have been observed for NFSv4.1 mounts when the system is being shut down which were caused by the nfscbd_pool being destroyed before the backchannel is disabled. This patch is believed to fix the problem, by simply avoiding ever destroying the nfscbd_pool. Since the NFS client module cannot be unloaded, this should not cause a memory leak. MFC after: 2 weeks	2013-09-04 22:47:56 +00:00
Rick Macklem	8fe6bddff7	Forced dismounts of NFS mounts can fail when thread(s) are stuck waiting for an RPC reply from the server while holding the mount point busy (mnt_lockref incremented). This happens because dounmount() msleep()s waiting for mnt_lockref to become 0, before calling VFS_UNMOUNT(). This patch adds a new VFS operation called VFS_PURGE(), which the NFS client implements as purging RPCs in progress. Making this call before checking mnt_lockref fixes the problem, by ensuring that the VOP_xxx() calls will fail and unbusy the mount point. Reported by: sbruno Reviewed by: kib MFC after: 2 weeks	2013-09-01 23:02:59 +00:00
Rick Macklem	88a2437a65	Add support for host-based (Kerberos 5 service principal) initiator credentials to the kernel rpc. Modify the NFSv4 client to add support for the gssname and allgssname mount options to use this capability. Requires the gssd daemon to be running with the "-h" option. Reviewed by: jhb	2013-07-09 01:05:28 +00:00
Rick Macklem	a820822ec8	A problem with the old NFS client where large writes to large files would sometimes result in a corrupted file was reported via email. This problem appears to have been caused by r251719 (reverting r251719 fixed the problem). Although I have not been able to reproduce this problem, I suspect it is caused by another thread increasing np->n_size after the mtx_unlock(&np->n_mtx) but before the vnode_pager_setsize() call. Since the np->n_mtx mutex serializes updates to np->n_size, doing the vnode_pager_setsize() with the mutex locked appears to avoid the problem. Unfortunately, vnode_pager_setsize() where the new size is smaller, cannot be called with a mutex held. This patch returns the semantics to be close to pre-r251719 (actually pre-r248567, r248581, r248567 for the new client) such that the call to vnode_pager_setsize() is only delayed until after the mutex is unlocked when np->n_size is shrinking. Since the file is growing when being written, I believe this will fix the corruption. A better solution might be to replace the mutex with a sleep lock, but that is a non-trivial conversion, so this fix is hoped to be sufficient in the meantime. Reported by: David G. Lawrence (dg@dglawrence.com) Tested by: David G. Lawrence (to be done soon) Reviewed by: kib MFC after: 1 week	2013-07-03 00:19:03 +00:00
Rick Macklem	2e6a4b0c55	Fix r252074 so that it builds on 64bit arches.	2013-06-22 21:58:21 +00:00
Rick Macklem	1dd95a046c	The NFSv4.1 LayoutCommit operation requires a valid offset and length. (0, 0 is not sufficient) This patch adds a loop for each file layout, using the offset, length of each file layout in a separate LayoutCommit.	2013-06-21 22:46:16 +00:00
Rick Macklem	562395581b	When the NFSv4.1 client is writing to a pNFS Data Server (DS), the file's size attribute does not get updated. As such, it is necessary to invalidate the attribute cache before clearing NMODIFIED for pNFS. MFC after: 2 weeks	2013-06-21 22:26:18 +00:00
Rick Macklem	315c38d135	Since some NFSv4 servers enforce the requirement for a reserved port#, enable use of the (no)resvport mount option for NFSv4. I had thought that the RFC required that non-reserved port #s be allowed, but I couldn't find it in the RFC. MFC after: 2 weeks	2013-06-21 19:41:30 +00:00
Jeff Roberson	22a722605d	- Convert the bufobj lock to rwlock. - Use a shared bufobj lock in getblk() and inmem(). - Convert softdep's lk to rwlock to match the bufobj lock. - Move INFREECNT to b_flags and protect it with the buf lock. - Remove unnecessary locking around bremfree() and BKGRDINPROG. Sponsored by: EMC / Isilon Storage Division Discussed with: mckusick, kib, mdf	2013-05-31 00:43:41 +00:00
Rick Macklem	734b03c38d	Post-r248567, there were times when the client would return a truncated directory for some NFS servers. This turned out to be because the size of a directory reported by an NFS server can be smaller than the ufs-like directory created from the RPC XDR in the client. This patch fixes the problem by changing r248567 so that vnode_pager_setsize() is only done for regular files. Reported and tested by: hartmut.brandt@dlr.de Reviewed by: kib MFC after: 1 week	2013-05-28 22:36:01 +00:00
Rick Macklem	77a03c148c	Add support for the eofflag to nfs_readdir() in the new NFS client so that it works under a unionfs mount. Submitted by: Jared Yanovich (slovichon@gmail.com) Reviewed by: kib MFC after: 2 weeks	2013-05-12 21:48:08 +00:00
Rick Macklem	64a0e848ab	When an NFS unmount occurs, once vflush() writes the last dirty buffer for the last vnode on the mount back to the server, it returns. At that point, the code continues with the unmount, including freeing up the nfs specific part of the mount structure. It is possible that an nfsiod thread will try to check for an empty I/O queue in the nfs specific part of the mount structure after it has been free'd by the unmount. This patch avoids this problem by setting the iodmount entries for the mount back to NULL while holding the mutex in the unmount and checking the appropriate entry is non-NULL after acquiring the mutex in the nfsiod thread. Reported and tested by: pho Reviewed by: kib MFC after: 2 weeks	2013-04-18 23:20:16 +00:00


272 Commits