freebsd-skq

Author	SHA1	Message	Date
Rick Macklem	dda11d4ab9	File systems that do not use the buffer cache (such as ZFS) must use VOP_FSYNC() to perform the NFS server's Commit operation. This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which is set by file systems that use the buffer cache. If this flag is not set, the NFS server always does a VOP_FSYNC(). This should be ok for old file system modules that do not set MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although it might not be optimal for file systems that use the buffer cache. Reviewed by: kib MFC after: 2 weeks	2015-04-15 20:16:31 +00:00
Mateusz Guzik	cd29b292b8	Convert nullfs hash lock from a mutex to an rwlock.	2014-12-30 21:41:35 +00:00
Mateusz Guzik	4fce16e4c9	Provide vfs suspension support only for filesystems which need it, take two. nullfs and unionfs need to request suspension if underlying filesystem(s) use it. Utilize mnt_kern_flag for this purpose. This is a fixup for 273271. No strong objections from: kib Pointy hat to: mjg MFC after: 2 weeks	2014-10-20 18:00:50 +00:00
Konstantin Belousov	effc6a3593	VOP_LOOKUP() may relock the directory vnode for some reasons. Since nullfs vnode shares vnode lock with lower vnode, this allows the reclamation of nullfs directory vnode in null_lookup(). In this situation, VOP must return ENOENT. More, since after the reclamation, the locks of nullfs directory vnode and lower vnode are no longer shared, the relock of the ldvp does not restore the correct locking state of dvp, and leaks ldvp lock. Correct this by unlocking ldvp and locking dvp. Use cached value of dvp->v_mount. Reported by: bdrewery Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-08-08 11:39:05 +00:00
Konstantin Belousov	0ebe0000b6	Assert that nullfs vnode has VV_ROOT set whenever lower vnode has. Assert that dotdot lookup on the root vnode is not performed. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-28 14:20:31 +00:00
Konstantin Belousov	289dd6dd7c	Fix typo. MFC after: 3 days	2014-07-24 23:14:03 +00:00
Konstantin Belousov	65589a29f4	Check for the cross-device cross-link attempt in the VFS, instead of forcing filesystem VOP_LINK() methods to repeat the code. In tmpfs_link(), remove redundand check for the type of the source, already done by VFS. Note that NFS server already performs this check before calling VOP_LINK(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-16 14:04:46 +00:00
Dag-Erling Smørgrav	1a05c762b9	Fix the length calculation for the final block of a sendfile(2) transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [13:11] In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12] Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13] Security: CVE-2013-5666 Security: FreeBSD-SA-13:11.sendfile Security: CVE-2013-5691 Security: FreeBSD-SA-13:12.ifioctl Security: CVE-2013-5710 Security: FreeBSD-SA-13:13.nullfs Approved by: re	2013-09-10 10:05:59 +00:00
Konstantin Belousov	18a8d3d7f8	The tvp vnode on rename is usually unlinked. Drop the cached null vnode for tvp to allow the free of the lower vnode, if needed. PR: kern/180236 Tested by: smh Sponsored by: The FreeBSD Foundation MFC after: 1 week	2013-07-04 19:01:18 +00:00
Konstantin Belousov	74c7ff1a0e	Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(), for the case when the nullfs vnode is not reclaimed. Otherwise, later reclamation would not unlock the lower vnode. Reported by: antoine Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2013-05-21 11:31:56 +00:00
Konstantin Belousov	0fc6daa72d	- Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The null_hashget() obtains the reference on the nullfs vnode, which must be dropped. - Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode. - Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching. Reported by: G??ran L??wkrantz <goran.lowkrantz@ismobile.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-05-11 11:17:44 +00:00
Jilles Tjoelker	6d6a91c50f	nullfs: Improve f_flags in statfs(). Include some flags of the nullfs mount itself: MNT_RDONLY, MNT_NOEXEC, MNT_NOSUID, MNT_UNION, MNT_NOSYMFOLLOW. This allows userland code calling statfs() or fstatfs() to see these flags. In particular, this allows opendir() to detect that a -t nullfs -o union mount needs deduplication (otherwise at least . and .. are returned twice) and allows rtld to detect a -t nullfs -o noexec mount as noexec. Turn off the MNT_ROOTFS flag from the underlying filesystem because the nullfs mount is definitely not the root filesystem. Reviewed by: kib MFC after: 1 week	2013-03-02 12:42:23 +00:00
Konstantin Belousov	e8f966eeb8	Remove the filtering of the acceptable mount options for nullfs, added in r245004. Although the report was for noatime option which is non-functional for the nullfs, other standard options like nosuid or noexec are useful with it. Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au> MFC after: 3 days	2013-01-16 05:32:49 +00:00
Konstantin Belousov	603f963e56	The current default size of the nullfs hash table used to lookup the existing nullfs vnode by the lower vnode is only 16 slots. Since the default mode for the nullfs is to cache the vnodes, hash has extremely huge chains. Size the nullfs hashtbl based on the current value of desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a given vnode. Pointy hat to: kib Diagnosed and reviewed by: peter Tested by: peter, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 5 days	2013-01-14 05:44:47 +00:00
Konstantin Belousov	6b17595133	When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp. Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days	2013-01-10 18:24:48 +00:00
Konstantin Belousov	268dd286a0	Fix reversed condition in the assertion. Pointy hat to: kib MFC after: 13 days	2013-01-04 07:52:47 +00:00
Konstantin Belousov	9cf4c952ca	Add the "nocache" nullfs mount option, which disables the caching of the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable. Note that disabling the cache costs more than 2x wall time in the metadata-hungry scenarious. The default is "cache". Tested and benchmarked by: pho (previous version) MFC after: 2 weeks	2013-01-03 19:17:57 +00:00
Konstantin Belousov	6db79c26ce	Remove the check and panic for an impossible condition. The NULL lowervp vnode v_vnlock would cause panic due to NULL pointer dereference much earlier. MFC after: 1 week	2012-11-20 15:25:00 +00:00
Attilio Rao	c6e0355cee	r16312 is not any longer real since many years (likely since when VFS received granular locking) but the comment present in UFS has been copied all over other filesystems code incorrectly for several times. Removes comments that makes no sense now. Reviewed by: kib MFC after: 3 days	2012-11-19 22:43:45 +00:00
Attilio Rao	bc2258da88	Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.	2012-11-09 18:02:25 +00:00
Konstantin Belousov	140dedb81c	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already has file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Caling the VOPs instead of directly accessing v_writecount provide the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks	2012-11-02 13:56:36 +00:00
Konstantin Belousov	82ed933c6f	Grammar fixes. Submitted by: bf MFC after: 1 week	2012-10-14 18:13:33 +00:00
Konstantin Belousov	806efacae0	Replace the XXX comment with the proper description. MFC after: 1 week	2012-10-14 17:07:34 +00:00
Konstantin Belousov	d9e9650a36	Allow shared lookups for nullfs mounts, if lower filesystem supports it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability: 1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower. Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded. 2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs inability to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode to become stale. Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode. Note that nullfs reclamation procedure calls vput() on the lowervp vnode, temporary unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but not-MPSAFE code often put partially initialized vnode on some globally visible list, and later can decide that half-constructed vnode is not needed. If nullfs mount is created above such filesystem, then other threads might catch such not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on nearby removal of the support for non-MPSAFE filesystems. In collaboration with: pho MFC after: 3 weeks	2012-09-09 19:20:23 +00:00
Edward Tomasz Napierala	af6e6b87ad	Remove unused thread argument to vrecycle(). Reviewed by: kib	2012-04-23 14:10:34 +00:00
Kevin Lo	11753bd018	Use NULL instead of 0	2012-03-13 10:04:13 +00:00
Konstantin Belousov	66f02f4b25	Do not expose unlocked unconstructed nullfs vnode on mount list. Lock the native nullfs vnode lock before switching the locks. Tested by: pho MFC after: 1 week	2012-03-02 09:48:46 +00:00
Konstantin Belousov	37a1046e61	Allow shared locks for reads when lower filesystem accept shared locking. Tested by: pho MFC after: 1 week	2012-02-29 15:18:53 +00:00
Konstantin Belousov	cec1d07726	Document that null_nodeget() cannot take shared-locked lowervp due to insmntque() requirements. Tested by: pho MFC after: 1 week	2012-02-29 15:18:04 +00:00
Konstantin Belousov	409b12c08a	In null_reclaim(), assert that reclaimed vnode is fully constructed, instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, paniced later when getting rid of nullfs reference on lowervp. While there, remove initializations from the declaration block. Tested by: pho MFC after: 1 week	2012-02-29 15:15:36 +00:00
Konstantin Belousov	e4e1d9f382	Always request exclusive lock for the lower vnode in nullfs_vget(). The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode. Reported by: rea Tested by: pho MFC after: 1 week	2012-02-29 15:09:20 +00:00
Konstantin Belousov	67e3d54f80	Move the code to destroy half-contructed nullfs vnode into helper function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock. In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements. Reported by: rea Tested by: pho MFC after: 1 week	2012-02-29 15:06:00 +00:00
Konstantin Belousov	da732fc69f	Merge a split multi-line comment. MFC after: 1 week	2012-02-29 14:43:27 +00:00
Martin Matuska	bf3db8aa65	To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters: allow.mount.devfs: allow mounting the devfs filesystem inside a jail allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting. Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks	2012-02-23 18:51:24 +00:00
Martin Matuska	61f0e25abf	Allow mounting nullfs(5) inside jails. This is now possible thanks to r230129. MFC after: 1 month	2012-02-09 10:39:01 +00:00
Eygene Ryabinkin	15c75a0d9f	Subject: NULLFS: properly destroy node hash Use hashdestroy() instead of naive free(). Approved by: kib MFC after: 2 weeks	2012-01-18 11:23:46 +00:00
Dimitry Andric	f39adedd5b	In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode pointer 'lowervp' instead of 'vp', which is uninitialized at that point. Reviewed by: kib MFC after: 1 week	2012-01-05 17:06:04 +00:00
Konstantin Belousov	dd0f9532f3	Do the vput() for the lowervp in the null_nodeget() for error case too. Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself. Reported and tested by: pho MFC after: 1 week	2012-01-03 21:09:07 +00:00
Konstantin Belousov	48a1e3f624	Document the state of the lowervp vnode for null_nodeget(). Tested by: pho MFC after: 1 week	2012-01-03 21:03:20 +00:00
Konstantin Belousov	f82360acf2	Existing VOP_VPTOCNP() interface has a fatal flow that is critical for nullfs. The problem is that resulting vnode is only required to be held on return from the successfull call to vop, instead of being referenced. Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or like. Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix. Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject of re approval)	2011-11-19 07:50:49 +00:00
Konstantin Belousov	f82ee01c1c	Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiled in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump. MFC after: 1 week	2011-11-19 07:41:37 +00:00
Konstantin Belousov	4d2310dd81	Use the plain panic calls, without additional printing around them. The debugger and dumping support is adequate. Tested by: pho MFC after: 1 week	2011-11-19 07:40:13 +00:00
Konstantin Belousov	17edcd764d	The use of VOP_ISLOCKED() without a check for the return values can cause false positives. Replace the #ifdef block with the proper ASSERT_VOP_UNLOCKED() assert. Tested by: pho MFC after: 1 week	2011-10-24 13:56:31 +00:00
Konstantin Belousov	234ab7412e	The only possible error return from null_nodeget() is due to insmntque1 failure (the getnewvnode cannot return an error). In this case, the null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK() in the nullfs_mount() after null_nodeget() failure is wrong. Tested by: pho MFC after: 1 week	2011-10-24 13:53:32 +00:00
Konstantin Belousov	ffa43617e8	The covered vnode must be reloced if it was unlocked. Remove VOP_ISLOCKED test because of this and also because it can lead to false positives. Tested by: pho MFC after: 1 week	2011-10-24 13:48:13 +00:00
Peter Holm	9ce7379778	Only unlock if the lock is exclusive. Reported by: Subbsd <subbsd gmail com> Discussed with: kib	2011-10-24 10:35:37 +00:00
Rick Macklem	694a586a43	Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib	2011-05-22 01:07:54 +00:00
Rebecca Cran	974206cf70	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-23 09:22:33 +00:00
Rick Macklem	2d0c83b139	Add a null_remove() function to nullfs, so that the v_usecount of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required. The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well. Tested by: to.my.trociny at gmail.com Submitted by: to.my.trociny at gmail.com (earlier version) Reviewed by: kib MFC after: 2 weeks	2010-08-31 01:16:45 +00:00
Konstantin Belousov	de082cd17a	Disable bypass for the vop_advlockpurge(). The vop is called after vop_revoke(), the v_data is already destroyed. Reported and tested by: ed	2010-05-16 05:00:29 +00:00

1 2 3 4 5

240 Commits