Commit Graph

72 Commits

Author SHA1 Message Date
Konstantin Belousov
0fc6daa72d - Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The
null_hashget() obtains the reference on the nullfs vnode, which must
  be dropped.

- Fix a wart which existed from the introduction of the nullfs
  caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
  It should be innocent, but now it is also formally safe.  Inform the
  nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
  nullfs inode.

- Add a callback to the upper filesystems for the lower vnode
  unlinking. When inactivating a nullfs vnode, check if the lower
  vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
  on the lower vnode, and reclaim upper vnode if so.  This allows
  nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
  excessive caching.

Reported by:	G??ran L??wkrantz <goran.lowkrantz@ismobile.com>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-05-11 11:17:44 +00:00
Konstantin Belousov
603f963e56 The current default size of the nullfs hash table used to lookup the
existing nullfs vnode by the lower vnode is only 16 slots.  Since the
default mode for the nullfs is to cache the vnodes, hash has extremely
huge chains.

Size the nullfs hashtbl based on the current value of
desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a
given vnode.

Pointy hat to:	    kib
Diagnosed and reviewed by:	peter
Tested by:    peter, pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2013-01-14 05:44:47 +00:00
Konstantin Belousov
268dd286a0 Fix reversed condition in the assertion.
Pointy hat to:	kib
MFC after:	13 days
2013-01-04 07:52:47 +00:00
Konstantin Belousov
9cf4c952ca Add the "nocache" nullfs mount option, which disables the caching of
the free nullfs vnodes, switching nullfs behaviour to pre-r240285.
The option is mostly intended as the last-resort when higher pressure
on the vnode cache due to doubling of the vnode counts is not
desirable.

Note that disabling the cache costs more than 2x wall time in the
metadata-hungry scenarious.  The default is "cache".

Tested and benchmarked by:	pho (previous version)
MFC after:	2 weeks
2013-01-03 19:17:57 +00:00
Konstantin Belousov
6db79c26ce Remove the check and panic for an impossible condition. The NULL
lowervp vnode v_vnlock would cause panic due to NULL pointer
dereference much earlier.

MFC after:	1 week
2012-11-20 15:25:00 +00:00
Attilio Rao
c6e0355cee r16312 is not any longer real since many years (likely since when VFS
received granular locking) but the comment present in UFS has been
copied all over other filesystems code incorrectly for several times.

Removes comments that makes no sense now.

Reviewed by:	kib
MFC after:	3 days
2012-11-19 22:43:45 +00:00
Konstantin Belousov
d9e9650a36 Allow shared lookups for nullfs mounts, if lower filesystem supports
it.  There are two problems which shall be addressed for shared
lookups use to have measurable effect on nullfs scalability:

1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup
operation to lower fs, resulting vnode is often only shared-locked. Then
null_nodeget() cannot instantiate covering vnode for lower vnode, since
insmntque1() and null_hashins() require exclusive lock on the lower.

Change the assert that lower vnode is exclusively locked to only
require any lock.  If null hash failed to find pre-existing nullfs
vnode for lower vnode and the vnode is shared-locked, the lower vnode
lock is upgraded.

2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs
inability to detect reclamation of the lower vnode.  Reclamation of a
nullfs vnode at deactivation time prevents a reference to the lower
vnode to become stale.

Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the
VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode
together with the reclamation of the lower vnode.

Note that nullfs reclamation procedure calls vput() on the lowervp
vnode, temporary unlocking the vnode being reclaimed. This seems to be
fine for MPSAFE filesystems, but not-MPSAFE code often put partially
initialized vnode on some globally visible list, and later can decide
that half-constructed vnode is not needed.  If nullfs mount is created
above such filesystem, then other threads might catch such not
properly initialized vnode. Instead of trying to overcome this case,
e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I
decided to rely on nearby removal of the support for non-MPSAFE
filesystems.

In collaboration with:	pho
MFC after:	3 weeks
2012-09-09 19:20:23 +00:00
Konstantin Belousov
66f02f4b25 Do not expose unlocked unconstructed nullfs vnode on mount list.
Lock the native nullfs vnode lock before switching the locks.

Tested by:	pho
MFC after:	1 week
2012-03-02 09:48:46 +00:00
Konstantin Belousov
cec1d07726 Document that null_nodeget() cannot take shared-locked lowervp due to
insmntque() requirements.

Tested by:	pho
MFC after:	1 week
2012-02-29 15:18:04 +00:00
Konstantin Belousov
67e3d54f80 Move the code to destroy half-contructed nullfs vnode into helper
function null_destroy_proto() from null_insmntque_dtr(). Also
apply null_destroy_proto() in null_nodeget() when we raced and a vnode
is found in the hash, so the currently allocated protonode shall be
destroyed.

Lock the vnode interlock around reassigning the v_vnlock.

In fact, this path will not be exercised after several later commits,
since null_nodeget() cannot take shared-locked lowervp at all due to
insmntque() requirements.

Reported by:	rea
Tested by:	pho
MFC after:	1 week
2012-02-29 15:06:00 +00:00
Konstantin Belousov
da732fc69f Merge a split multi-line comment.
MFC after:	1 week
2012-02-29 14:43:27 +00:00
Eygene Ryabinkin
15c75a0d9f Subject: NULLFS: properly destroy node hash
Use hashdestroy() instead of naive free().

Approved by:	kib
MFC after:	2 weeks
2012-01-18 11:23:46 +00:00
Dimitry Andric
f39adedd5b In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode
pointer 'lowervp' instead of 'vp', which is uninitialized at that point.

Reviewed by:	kib
MFC after:	1 week
2012-01-05 17:06:04 +00:00
Konstantin Belousov
dd0f9532f3 Do the vput() for the lowervp in the null_nodeget() for error case too.
Several callers of null_nodeget() did the cleanup itself, but several
missed it, most prominent being null_bypass(). Remove the cleanup from
the callers, now null_nodeget() handles lowervp free itself.

Reported and tested by:	pho
MFC after:	1 week
2012-01-03 21:09:07 +00:00
Konstantin Belousov
48a1e3f624 Document the state of the lowervp vnode for null_nodeget().
Tested by:	pho
MFC after:	1 week
2012-01-03 21:03:20 +00:00
Konstantin Belousov
4d2310dd81 Use the plain panic calls, without additional printing around them.
The debugger and dumping support is adequate.

Tested by:	pho
MFC after:	1 week
2011-11-19 07:40:13 +00:00
Konstantin Belousov
b9131889d2 Do not drop vnode interlock in null_checkvp(). null_lock() verifies that
v_data is not-null before calling NULLVPTOLOWERVP(), and dropping the
interlock allows for reclaim to clean v_data and free the memory.

While there, remove unneeded semicolons and convert the infinite loops
to panics. I have a will to remove null_checkvp() altogether, or leave
it as a trivial stub, but not now.

Reported and tested by:	pho
2009-05-31 14:54:20 +00:00
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
Jeff Roberson
4c65d593e2 - Simplify null_hashget() and null_hashins() by using vref() rather
than a complex series of steps involving vget() without a lock type
   to emulate the same thing.
2008-03-29 23:24:54 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
Daichi GOTO
1016626062 This changes give nullfs correctly work with latest unionfs.
Submitted by:   Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer)
Reviewed by:    jeff, kensmith
Approved by:    re (kensmith)
MFC after:      1 week
2007-10-14 13:57:11 +00:00
Tor Egge
61b9d89ff0 Make insmntque() externally visibile and allow it to fail (e.g. during
late stages of unmount).  On failure, the vnode is recycled.

Add insmntque1(), to allow for file system specific cleanup when
recycling vnode on failure.

Change getnewvnode() to no longer call insmntque().  Previously,
embryonic vnodes were put onto the list of vnode belonging to a file
system, which is unsafe for a file system marked MPSAFE.

Change vfs_hash_insert() to no longer lock the vnode.  The caller now
has that responsibility.

Change most file systems to lock the vnode and call insmntque() or
insmntque1() after a new vnode has been sufficiently setup.  Handle
failed insmntque*() calls by propagating errors to callers, possibly
after some file system specific cleanup.

Approved by:	re (kensmith)
Reviewed by:	kib
In collaboration with:	kib
2007-03-13 01:50:27 +00:00
Jeff Roberson
9c12e63100 - Assert that the lowervp is locked in null_hashget().
- Simplify the logic dealing with recycled vnodes in null_hashget() and
   null_hashins().  Since we hold the lower node locked in both cases
   the null node can not be undergoing recycling unless reclaim somehow
   called null_nodeget().  The logic that was in place was not safe and
   was essentially dead code.

MFC After:	1 week
2006-02-22 06:15:12 +00:00
Robert Watson
5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Jeff Roberson
8e82c4cd5f - Clear VI_OWEINACT before calling vget() with no lock type. We know
the node is actually already locked, and VOP_INACTIVE is not desirable
   in this case.
2005-04-11 11:17:20 +00:00
Jeff Roberson
bc855512c8 - Assume that all lower filesystems now support proper locking. Assert
that they set v->v_vnlock.  This is true for all filesystems in the
   tree.
 - Remove all uses of LK_THISLAYER.  If the lower layer is locked, the
   null layer is locked.  We only use vget() to get a reference now.
   null essentially does no locking.  This fixes LOOKUP_SHARED with
   nullfs.
 - Remove the special LK_DRAIN considerations, I do not believe this is
   needed now as LK_DRAIN doesn't destroy the lower vnode's lock, and
   it's hardly used anymore.
 - Add one well commented hack to prevent the lowervp from going away
   while we're in it's VOP_LOCK routine.  This can only happen if we're
   forcibly unmounted while some callers are waiting in the lock.  In
   this case the lowervp could be recycled after we drop our last ref
   in null_reclaim().  Prevent this with a vhold().
2005-03-15 13:49:33 +00:00
Jeff Roberson
8da0046596 - The VI_DOOMED flag now signals the end of a vnode's relationship with
the filesystem.  Check that rather than VI_XLOCK.
 - VOP_INACTIVE should no longer drop the vnode lock.
 - The vnode lock is required around calls to vrecycle() and vgone().

Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:18:25 +00:00
Warner Losh
d167cf6f3a /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 18:10:42 +00:00
Poul-Henning Kamp
aec0fb7b40 Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

	Replace the entire vop_t frobbing thing with properly typed
	structures.  The only casualty is that we can not add a new
	VOP_ method with a loadable module.  History has not given
	us reason to belive this would ever be feasible in the the
	first place.

	Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

	Give coda correct prototypes and function definitions for
	all vop_()s.

	Generate a bit more data from the vnode_if.src file:  a
	struct vop_vector and protype typedefs for all vop methods.

	Add a new vop_bypass() and make vop_default be a pointer
	to another struct vop_vector.

	Remove a lot of vfs_init since vop_vector is ready to use
	from the compiler.

	Cast various vop_mumble() to void * with uppercase name,
	for instance VOP_PANIC, VOP_NULL etc.

	Implement VCALL() by making vdesc_offset the offsetof() the
	relevant function pointer in vop_vector.  This is disgusting
	but since the code is generated by a script comparatively
	safe.  The alternative for nullfs etc. would be much worse.

	Fix up all vnode method vectors to remove casts so they
	become typesafe.  (The bulk of this is generated by scripts)
2004-12-01 23:16:38 +00:00
Marcel Moolenaar
4ea4f1f97e Update for the KDB framework:
o  Call kdb_enter() instead of Debugger().
o  Make debugging code conditional upon KDB instead of DDB.
2004-07-10 21:20:11 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Tim J. Robbins
549398753a MFp4: Fix two bugs causing possible deadlocks or panics, and one nit:
- Emulate lock draining (LK_DRAIN) in null_lock() to avoid deadlocks
  when the vnode is being recycled.
- Don't allow null_nodeget() to return a nullfs vnode from the wrong
  mount when multiple nullfs's are mounted. It's unclear why these checks
  were removed in null_subr.c 1.35, but they are definitely necessary.
  Without the checks, trying to unmount a nullfs mount will erroneously
  return EBUSY, and forcibly unmounting with -f will cause a panic.
- Bump LOG2_SIZEVNODE up to 8, since vnodes are >256 bytes now. The old
  value (7) didn't cause any problems, but made the hash algorithm
  suboptimal.

These changes fix nullfs enough that a parallel buildworld succeeds.

Submitted by:	tegge (partially; LK_DRAIN)
Tested by:	kris
2003-06-17 08:52:45 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Kirk McKusick
a5b65058d5 Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by:	DARPA & NAI Labs.
2002-10-14 03:20:36 +00:00
Jeff Roberson
4d93c0be1f - Use vrefcnt() where it is safe to do so instead of doing direct and
unlocked accesses to v_usecount.
 - Lock access to the buf lists in the various sync routines.  interlock
   locking could be avoided almost entirely in leaf filesystems if the
   fsync function had a generic helper.
2002-09-25 02:32:42 +00:00
Nate Lawson
06be2aaa83 Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by:   phk
Reviewed by:    bde, rwatson (earlier version)
2002-09-14 09:02:28 +00:00
Semen Ustimenko
1cfdefbb9f Fix a race during null node creation between relookuping the hash and
adding vnode to hash. The fix is to use atomic hash-lookup-and-add-if-
not-found operation. The odd thing is that this race can't happen
actually because the lowervp vnode is locked exclusively now during the
whole process of null node creation. This must be thought as a step
toward shared lookups.

Also remove vp->v_mount checks when looking for a match in the hash,
as this is the vestige.

Also add comments and cosmetic changes.
2002-06-13 21:49:09 +00:00
Semen Ustimenko
1542003115 Change null_hashlock into null_hashmtx, because there is no need for
lockmgr and this helps to vget() vnode from hash without a race.

Reviewed by:	bp
MFC after:	2 weeks
2002-06-13 20:18:50 +00:00
Semen Ustimenko
08720e34b0 Fix the "error" path (when dropping not fully initialized vnode).
Also move hash operations out of null_vnops.c and explicitly initialize
v_lock in null_node_alloc (to set wmesg).

Reviewed by:	bp
MFC after:	2 weeks
2002-06-13 18:25:06 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Ruslan Ermilov
99d300a1ec - FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file
systems were repo-copied from sys/miscfs to sys/fs.

- Renamed the following file systems and their modules:
  fdesc -> fdescfs, portal -> portalfs, union -> unionfs.

- Renamed corresponding kernel options:
  FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.

- Install header files for the above file systems.

- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
  Makefiles.
2001-05-23 09:42:29 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Poul-Henning Kamp
fc2ffbe604 Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)
2001-02-04 13:13:25 +00:00
Poul-Henning Kamp
53ce36d17a Remove unneeded #include <sys/proc.h> lines. 2000-10-29 13:57:19 +00:00
Jason Evans
a18b1f1d4d Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.
2000-10-04 01:29:17 +00:00
Boris Popov
4451405fbd Fix vnode locking bugs in the nullfs.
Add correct support for v_object management, so mmap() operation should
work properly.
Add support for extattrctl() routine (submitted by semenu).

At this point nullfs can be considered as functional and much more stable.
In fact, it should behave as a "hard" "symlink" to underlying filesystem.

Reviewed in general by:		mckusick, dillon
Parts of logic obtained from:	NetBSD
2000-09-25 15:38:32 +00:00
Boris Popov
8da8066061 Various cleanups towards make nullfs functional (it is still broken
at this point):

Replace all '#ifdef DEBUG' with '#ifdef NULLFS_DEBUG' and add NULLFSDEBUG
macro.

Protect nullfs hash table with lockmgr.

Use proper order of operations when freeing mnt_data.

Return correct fsid in the null_getattr().

Add null_open() function to catch MNT_NODEV (obtained from NetBSD).

Add null_rename() to catch cross-fs rename operations (submitted by
Ustimenko Semen <semen@iclub.nsu.ru>)

Remove duplicate $FreeBSD$ tags.
2000-09-05 09:02:07 +00:00
Boris Popov
7da1e3f0b0 Get rid from the __P() macros.
Encouraged by:	peter
2000-09-05 07:54:39 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00