freebsd-skq

Author	SHA1	Message	Date
Konstantin Belousov	effc6a3593	VOP_LOOKUP() may relock the directory vnode for some reasons. Since nullfs vnode shares vnode lock with lower vnode, this allows the reclamation of nullfs directory vnode in null_lookup(). In this situation, VOP must return ENOENT. More, since after the reclamation, the locks of nullfs directory vnode and lower vnode are no longer shared, the relock of the ldvp does not restore the correct locking state of dvp, and leaks ldvp lock. Correct this by unlocking ldvp and locking dvp. Use cached value of dvp->v_mount. Reported by: bdrewery Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-08-08 11:39:05 +00:00
Konstantin Belousov	0ebe0000b6	Assert that nullfs vnode has VV_ROOT set whenever lower vnode has. Assert that dotdot lookup on the root vnode is not performed. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-28 14:20:31 +00:00
Konstantin Belousov	289dd6dd7c	Fix typo. MFC after: 3 days	2014-07-24 23:14:03 +00:00
Konstantin Belousov	65589a29f4	Check for the cross-device cross-link attempt in the VFS, instead of forcing filesystem VOP_LINK() methods to repeat the code. In tmpfs_link(), remove redundant check for the type of the source, already done by VFS. Note that NFS server already performs this check before calling VOP_LINK(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-16 14:04:46 +00:00
Dag-Erling Smørgrav	1a05c762b9	Fix the length calculation for the final block of a sendfile(2) transmission which could be tricked into rounding up to the nearest page size, leaking up to a page of kernel memory. [13:11] In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR and SIOCSIFNETMASK at the socket layer rather than pass them on to the link layer without validation or credential checks. [SA-13:12] Prevent cross-mount hardlinks between different nullfs mounts of the same underlying filesystem. [SA-13:13] Security: CVE-2013-5666 Security: FreeBSD-SA-13:11.sendfile Security: CVE-2013-5691 Security: FreeBSD-SA-13:12.ifioctl Security: CVE-2013-5710 Security: FreeBSD-SA-13:13.nullfs Approved by: re	2013-09-10 10:05:59 +00:00
Konstantin Belousov	18a8d3d7f8	The tvp vnode on rename is usually unlinked. Drop the cached null vnode for tvp to allow the free of the lower vnode, if needed. PR: kern/180236 Tested by: smh Sponsored by: The FreeBSD Foundation MFC after: 1 week	2013-07-04 19:01:18 +00:00
Konstantin Belousov	0fc6daa72d	- Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The null_hashget() obtains the reference on the nullfs vnode, which must be dropped. - Fix a wart which existed from the introduction of the nullfs caching, do not unlock lower vnode in the nullfs_reclaim_lowervp(). It should be innocent, but now it is also formally safe. Inform the nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on nullfs inode. - Add a callback to the upper filesystems for the lower vnode unlinking. When inactivating a nullfs vnode, check if the lower vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC on the lower vnode, and reclaim upper vnode if so. This allows nullfs to purge cached vnodes for the unlinked lower vnode, avoiding excessive caching. Reported by: Göran Löwkrantz <goran.lowkrantz@ismobile.com> Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-05-11 11:17:44 +00:00
Konstantin Belousov	6b17595133	When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp. Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days	2013-01-10 18:24:48 +00:00
Konstantin Belousov	9cf4c952ca	Add the "nocache" nullfs mount option, which disables the caching of the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable. Note that disabling the cache costs more than 2x wall time in the metadata-hungry scenarios. The default is "cache". Tested and benchmarked by: pho (previous version) MFC after: 2 weeks	2013-01-03 19:17:57 +00:00
Konstantin Belousov	140dedb81c	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks	2012-11-02 13:56:36 +00:00
Konstantin Belousov	82ed933c6f	Grammar fixes. Submitted by: bf MFC after: 1 week	2012-10-14 18:13:33 +00:00
Konstantin Belousov	806efacae0	Replace the XXX comment with the proper description. MFC after: 1 week	2012-10-14 17:07:34 +00:00
Konstantin Belousov	d9e9650a36	Allow shared lookups for nullfs mounts, if lower filesystem supports it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability: 1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower. Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded. 2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs inability to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode to become stale. Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode. Note that nullfs reclamation procedure calls vput() on the lowervp vnode, temporary unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but not-MPSAFE code often put partially initialized vnode on some globally visible list, and later can decide that half-constructed vnode is not needed. If nullfs mount is created above such filesystem, then other threads might catch such not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on nearby removal of the support for non-MPSAFE filesystems. In collaboration with: pho MFC after: 3 weeks	2012-09-09 19:20:23 +00:00
Edward Tomasz Napierala	af6e6b87ad	Remove unused thread argument to vrecycle(). Reviewed by: kib	2012-04-23 14:10:34 +00:00
Konstantin Belousov	409b12c08a	In null_reclaim(), assert that reclaimed vnode is fully constructed, instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, panicked later when getting rid of nullfs reference on lowervp. While there, remove initializations from the declaration block. Tested by: pho MFC after: 1 week	2012-02-29 15:15:36 +00:00
Konstantin Belousov	dd0f9532f3	Do the vput() for the lowervp in the null_nodeget() for error case too. Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself. Reported and tested by: pho MFC after: 1 week	2012-01-03 21:09:07 +00:00
Konstantin Belousov	f82360acf2	Existing VOP_VPTOCNP() interface has a fatal flaw that is critical for nullfs. The problem is that resulting vnode is only required to be held on return from the successful call to vop, instead of being referenced. Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or like. Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix. Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject of re approval)	2011-11-19 07:50:49 +00:00
Konstantin Belousov	f82ee01c1c	Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiled in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump. MFC after: 1 week	2011-11-19 07:41:37 +00:00
Rebecca Cran	974206cf70	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-23 09:22:33 +00:00
Rick Macklem	2d0c83b139	Add a null_remove() function to nullfs, so that the v_usecount of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required. The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well. Tested by: to.my.trociny at gmail.com Submitted by: to.my.trociny at gmail.com (earlier version) Reviewed by: kib MFC after: 2 weeks	2010-08-31 01:16:45 +00:00
Konstantin Belousov	de082cd17a	Disable bypass for the vop_advlockpurge(). The vop is called after vop_revoke(), the v_data is already destroyed. Reported and tested by: ed	2010-05-16 05:00:29 +00:00
Konstantin Belousov	c808c9632d	Add explicit struct ucred * argument for VOP_VPTOCNP, to be used by vn_open_cred in default implementation. Valid struct ucred is needed for audit and MAC, and curthread credentials may be wrong. This further requires modifying the interface of vn_fullpath(9), but it is out of scope of this change. Reviewed by: rwatson	2009-06-21 19:21:01 +00:00
Konstantin Belousov	b0f34bb643	Implement the bypass routine for VOP_VPTOCNP in nullfs. Among other things, this makes procfs <pid>/file working for executables started from nullfs mount. Tested by: pho PR: 94269, 104938	2009-05-31 14:58:43 +00:00
Konstantin Belousov	cec9ed6d7f	Lock the real null vnode lock before substitution of vp->v_vnlock. This should not really matter for correctness, since vp->v_lock is not locked before the call, and null_lock() holds the interlock, but makes the control flow for reclaim more clear. Tested by: pho	2009-05-31 14:52:45 +00:00
Edward Tomasz Napierala	c97fcdba57	Add VOP_ACCESSX, which can be used to query for newly added V* permissions, such as VWRITE_ACL. For filesystems that don't implement it, there is a default implementation, which works as a wrapper around VOP_ACCESS. Reviewed by: rwatson@	2009-05-30 13:59:05 +00:00
Peter Holm	aa73f8c7a2	Do not use null_bypass for VOP_ISLOCKED, directly call default implementation. null_bypass cannot work for the !nullfs-vnodes, in particular, for VBAD vnodes. In collaboration with: kib	2009-03-18 13:54:35 +00:00
Attilio Rao	b13ec5e016	Remove the null_islocked() overloaded vop because the standard one does the same.	2009-03-13 07:09:20 +00:00
Konstantin Belousov	062ef8a5f8	Do not use bypass for vop_vptocnp() from nullfs, call standard implementation instead. The bypass does not assume that returned vnode is only held. Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com> Reviewed by: jhb Tested by: pho, pluknet <pluknet gmail com>	2009-03-10 14:35:21 +00:00
Bjoern A. Zeeb	7956d34b95	Remove unused local variables. Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks	2009-01-31 17:36:22 +00:00
Konstantin Belousov	5147a76a0e	In null_lookup(), do the needed cleanup instead of panicking saying the cleanup is needed. Reported by: kris, pho Tested by: pho MFC after: 2 weeks	2008-11-26 13:41:15 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Ed Schouten	19c5cd6288	Fix two small typos in comments in the nullfs vnops code. Submitted by: Jille Timmermans <jille quis cx>	2008-09-11 20:15:34 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	0e9eb108f0	Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and not currently used. Hopefully this will be soon replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Additionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavily broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo	2008-01-24 12:34:30 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Daichi GOTO	1016626062	This changes give nullfs correctly work with latest unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:57:11 +00:00
Robert Watson	97cd541437	Where I previously removed calls to kdb_enter(), now remove include of kdb.h. Pointed out by: bde	2007-05-29 11:28:28 +00:00
Konstantin Belousov	d413d21071	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.	2007-05-18 13:02:13 +00:00
Pawel Jakub Dawidek	10bcafe9ab	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
Kip Macy	2f6a774be4	change vop_lock handling to allowing tracking of callers' file and line for acquisition of lockmgr locks Approved by: scottl (standing in for mentor rwatson)	2006-11-13 05:51:22 +00:00
Jeff Roberson	4bf5133b1f	- Define a null_getwritemount to get the mount-point for the lower filesystem so that nullfs doesn't permit you to circumvent snapshots. Discussed with: tegge Sponsored by: Isilon Systems, Inc.	2006-03-12 04:58:18 +00:00
Jeff Roberson	f5cacb3964	- spell VOP_LOCK(vp, LK_RELEASE... VOP_UNLOCK(vp,... so that asserts in vop_lock_post do not trigger. - Rearrange null_inactive to null_hashrem earlier so there is no chance of finding the null node on the hash list after the locks have been switched. - We should never have a NULL lowervp in null_reclaim() so there is no need to handle this situation. panic instead. MFC After: 1 week	2006-02-22 06:17:31 +00:00
Alexander Kabaev	d11c07ba56	Handle a race condition where NULLFS vnode can be cleaned while threads can still be asleep waiting for lowervp lock. Tested by: kkenn Discussed with: ssouhlal, jeffr	2005-09-15 19:21:26 +00:00
Suleiman Souhlal	cdeb72045b	Use vput() instead of vrele() in null_reclaim() since the lower vnode is locked. MFC after: 3 days	2005-09-02 15:49:55 +00:00
Jeff Roberson	7fd2deacb4	- As this is presently the one and only place where duplicate acquires of the vnode interlock are allowed mark it by passing MTX_DUPOK to this lock operation only. Sponsored by: Isilon Systems, Inc.	2005-04-22 22:42:44 +00:00
Jeff Roberson	ba73105324	- Lock the clearing of v_data so it is safe to inspect it with the interlock. Sponsored by: Isilon Systems, Inc.	2005-03-17 12:00:05 +00:00
Jeff Roberson	bc855512c8	- Assume that all lower filesystems now support proper locking. Assert that they set v->v_vnlock. This is true for all filesystems in the tree. - Remove all uses of LK_THISLAYER. If the lower layer is locked, the null layer is locked. We only use vget() to get a reference now. null essentially does no locking. This fixes LOOKUP_SHARED with nullfs. - Remove the special LK_DRAIN considerations, I do not believe this is needed now as LK_DRAIN doesn't destroy the lower vnode's lock, and it's hardly used anymore. - Add one well commented hack to prevent the lowervp from going away while we're in its VOP_LOCK routine. This can only happen if we're forcibly unmounted while some callers are waiting in the lock. In this case the lowervp could be recycled after we drop our last ref in null_reclaim(). Prevent this with a vhold().	2005-03-15 13:49:33 +00:00
Jeff Roberson	9feb7408f8	- We have to transfer lockers after resetting our vnlock pointer. Sponsored by: Isilon Systems, Inc.	2005-03-15 11:28:45 +00:00
Jeff Roberson	8da0046596	- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK. - VOP_INACTIVE should no longer drop the vnode lock. - The vnode lock is required around calls to vrecycle() and vgone(). Sponsored by: Isilon Systems, Inc.	2005-03-13 12:18:25 +00:00

132 Commits