105 Commits

Author SHA1 Message Date
kib
82d5c5773a Remove the filtering of the acceptable mount options for nullfs, added
in r245004.  Although the report was for noatime option which is
non-functional for the nullfs, other standard options like nosuid or
noexec are useful with it.

Reported by:	Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au>
MFC after:	3 days
2013-01-16 05:32:49 +00:00
kib
a7c71037df Add the "nocache" nullfs mount option, which disables the caching of
the free nullfs vnodes, switching nullfs behaviour to pre-r240285.
The option is mostly intended as the last-resort when higher pressure
on the vnode cache due to doubling of the vnode counts is not
desirable.

Note that disabling the cache costs more than 2x wall time in the
metadata-hungry scenarious.  The default is "cache".

Tested and benchmarked by:	pho (previous version)
MFC after:	2 weeks
2013-01-03 19:17:57 +00:00
attilio
d5d551ec46 Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag.
Porters should refer to __FreeBSD_version 1000021 for this change as
it may have happened at the same timeframe.
2012-11-09 18:02:25 +00:00
kib
3ed1c80d25 Allow shared lookups for nullfs mounts, if lower filesystem supports
it.  There are two problems which shall be addressed for shared
lookups use to have measurable effect on nullfs scalability:

1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup
operation to lower fs, resulting vnode is often only shared-locked. Then
null_nodeget() cannot instantiate covering vnode for lower vnode, since
insmntque1() and null_hashins() require exclusive lock on the lower.

Change the assert that lower vnode is exclusively locked to only
require any lock.  If null hash failed to find pre-existing nullfs
vnode for lower vnode and the vnode is shared-locked, the lower vnode
lock is upgraded.

2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs
inability to detect reclamation of the lower vnode.  Reclamation of a
nullfs vnode at deactivation time prevents a reference to the lower
vnode to become stale.

Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the
VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode
together with the reclamation of the lower vnode.

Note that nullfs reclamation procedure calls vput() on the lowervp
vnode, temporary unlocking the vnode being reclaimed. This seems to be
fine for MPSAFE filesystems, but not-MPSAFE code often put partially
initialized vnode on some globally visible list, and later can decide
that half-constructed vnode is not needed.  If nullfs mount is created
above such filesystem, then other threads might catch such not
properly initialized vnode. Instead of trying to overcome this case,
e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I
decided to rely on nearby removal of the support for non-MPSAFE
filesystems.

In collaboration with:	pho
MFC after:	3 weeks
2012-09-09 19:20:23 +00:00
kevlo
af6e15978b Use NULL instead of 0 2012-03-13 10:04:13 +00:00
kib
3f374b0489 Allow shared locks for reads when lower filesystem accept shared locking.
Tested by:	pho
MFC after:	1 week
2012-02-29 15:18:53 +00:00
kib
a51474db93 Always request exclusive lock for the lower vnode in nullfs_vget().
The null_nodeget() requires exclusive lock on lowervp to be able to
insmntque() new vnode.

Reported by:	rea
Tested by:	pho
MFC after:	1 week
2012-02-29 15:09:20 +00:00
mm
4825085ea4 To improve control over the use of mount(8) inside a jail(8), introduce
a new jail parameter node with the following parameters:

allow.mount.devfs:
	allow mounting the devfs filesystem inside a jail

allow.mount.nullfs:
	allow mounting the nullfs filesystem inside a jail

Both parameters are disabled by default (equals the behavior before
devfs and nullfs in jails). Administrators have to explicitly allow
mounting devfs and nullfs for each jail. The value "-1" of the
devfs_ruleset parameter is removed in favor of the new allow setting.

Reviewed by:	jamie
Suggested by:	pjd
MFC after:	2 weeks
2012-02-23 18:51:24 +00:00
mm
76b54f7cc2 Allow mounting nullfs(5) inside jails.
This is now possible thanks to r230129.

MFC after:	1 month
2012-02-09 10:39:01 +00:00
kib
e4b773d887 Do the vput() for the lowervp in the null_nodeget() for error case too.
Several callers of null_nodeget() did the cleanup itself, but several
missed it, most prominent being null_bypass(). Remove the cleanup from
the callers, now null_nodeget() handles lowervp free itself.

Reported and tested by:	pho
MFC after:	1 week
2012-01-03 21:09:07 +00:00
kib
c8591306ab The use of VOP_ISLOCKED() without a check for the return values can cause
false positives. Replace the #ifdef block with the proper
ASSERT_VOP_UNLOCKED() assert.

Tested by:	pho
MFC after:	1 week
2011-10-24 13:56:31 +00:00
kib
177754802c The only possible error return from null_nodeget() is due to insmntque1
failure (the getnewvnode cannot return an error). In this case, the
null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK()
in the nullfs_mount() after null_nodeget() failure is wrong.

Tested by:	pho
MFC after:	1 week
2011-10-24 13:53:32 +00:00
kib
7950dabcc0 The covered vnode must be reloced if it was unlocked. Remove VOP_ISLOCKED
test because of this and also because it can lead to false positives.

Tested by:	pho
MFC after:	1 week
2011-10-24 13:48:13 +00:00
pho
2a208667af Only unlock if the lock is exclusive.
Reported by:	Subbsd <subbsd gmail com>
Discussed with:	kib
2011-10-24 10:35:37 +00:00
rmacklem
fbb8a5e8ec Add a lock flags argument to the VFS_FHTOVP() file system
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.

Reviewed by:	kib
2011-05-22 01:07:54 +00:00
attilio
1dcb84131b Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
kib
eff8c6d35e Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:01:21 +00:00
attilio
4014b55830 Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is
always curthread.

As KPI gets broken by this patch, manpages and __FreeBSD_version will be
updated by further commits.

Tested by:	Andrea Barberio <insomniac at slackware dot it>
2008-02-25 18:45:57 +00:00
attilio
e1db4e70b3 Conver all explicit instances to VOP_ISLOCKED(arg, NULL) into
VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should
only acquire curthread as argument; this will lead in axing the additional
argument from both functions, making the code cleaner.

Reviewed by: jeff, kib
2008-02-08 21:45:47 +00:00
attilio
71b7824213 VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
attilio
18d0a0dd51 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
alfred
3a60df401c Get rid of qaddr_t.
Requested by: bde
2007-10-16 10:54:55 +00:00
rwatson
be966bcc03 Where I previously removed calls to kdb_enter(), now remove include of
kdb.h.

Pointed out by:	bde
2007-05-29 11:28:28 +00:00
rwatson
b3193c8a43 Rather than entering the debugger via kdb_enter() in the event the
root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and
then returning EDEADLK, panic.
2007-05-27 13:10:16 +00:00
pjd
cb2d7c85a8 Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.

Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.

Approved by:	mckusick
Discussed with:	many (on IRC)
Tested with:	ufs, msdosfs, cd9660, nullfs and zfs
2007-02-15 22:08:35 +00:00
tegge
83154f853d Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().
2006-09-26 04:12:49 +00:00
rodrigc
92e37b9fef Remove incorrect null_checkexp() routine. This
will allow the NFS server to call vfs_stdcheckexp() on the exported nullfs
filesystem, not the underlying filesystem being nullfs mounted.
If the lower filesystem was not NFS exported, then the NFS exported
null filesystem would not work.

Pointed out by:	scottl
PR:		kern/87906
MFC after:	1 week
2006-05-28 22:45:52 +00:00
rodrigc
d1d9c4f5bc Modify MNT_UPDATE behavior for nullfs so that it does not
return EOPNOTSUPP if an "export" parameter was passed in.
This should allow nullfs mounts to be NFS exported.

PR:		kern/87906
MFC after:	1 week
2006-05-28 20:09:18 +00:00
jhb
6acd384eb7 Correctly set MNTK_MPSAFE flag from the lower vnode's mount rather than
always turning it on along with any flags set in the lower mount.

Tested by:	kris
Reviewed by:	jeff
MFC after:	3 days
2006-02-10 18:06:49 +00:00
jeff
9ea95f5e38 - No need to WANTPARENT when we're just going to vrele it in a deadlock
prone way later.

Reported by:	kkenn
MFC After:	3 days
2006-02-07 11:31:32 +00:00
des
5d3c44687b Eradicate caddr_t from the VFS API. 2005-12-14 00:49:52 +00:00
rwatson
be4f357149 Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
kris
fa8ac58228 Reflect mpsafety of the underlying filesystem in the nullfs image.
I benchmarked this by simultaneously extracting 4 large tarballs (basically
world images) on a 4-processor AMD64 system, in a malloc-backed md.

With this patch, system time was reduced by 43%, and wall clock time by 33%.

Submitted by:	jeff
MFC after: 	1 week
2005-10-16 21:45:25 +00:00
jeff
9375f1d524 - Honor the flags argument passed to null_root(). The filesystem below
us will decide whether or not to grab a real shared lock.
2005-04-11 11:16:29 +00:00
jeff
226bf6ead4 - Update vfs_root implementations to match the new prototype. None of
these filesystems will support shared locks until they are explicitly
   modified to do so.  Careful review must be done to ensure that this
   is safe for each individual filesystem.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 07:36:16 +00:00
phk
da2718f1af Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().
I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

	The credentials for syncing a file (ability to write to the
	file) should be checked at the system call level.

	Credentials for syncing one or more filesystems ("none")
	should be checked at the system call level as well.

	If the filesystem implementation needs a particular credential
	to carry out the syncing it would logically have to the
	cached mount credential, or a credential cached along with
	any delayed write data.

Discussed with:	rwatson
2005-01-11 07:36:22 +00:00
imp
fb0a393e99 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 18:10:42 +00:00
phk
1eff61ea7f Use vfs_mountedfrom(), rely on vfs_mount.c calling VFS_STATFS(). 2004-12-06 20:02:13 +00:00
phk
6c14f71ef7 VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases
doesn't.  Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.

Fix this the cleaner way:  Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.
2004-12-05 22:41:02 +00:00
phk
59f305606c Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

	Replace the entire vop_t frobbing thing with properly typed
	structures.  The only casualty is that we can not add a new
	VOP_ method with a loadable module.  History has not given
	us reason to belive this would ever be feasible in the the
	first place.

	Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

	Give coda correct prototypes and function definitions for
	all vop_()s.

	Generate a bit more data from the vnode_if.src file:  a
	struct vop_vector and protype typedefs for all vop methods.

	Add a new vop_bypass() and make vop_default be a pointer
	to another struct vop_vector.

	Remove a lot of vfs_init since vop_vector is ready to use
	from the compiler.

	Cast various vop_mumble() to void * with uppercase name,
	for instance VOP_PANIC, VOP_NULL etc.

	Implement VCALL() by making vdesc_offset the offsetof() the
	relevant function pointer in vop_vector.  This is disgusting
	but since the code is generated by a script comparatively
	safe.  The alternative for nullfs etc. would be much worse.

	Fix up all vnode method vectors to remove casts so they
	become typesafe.  (The bulk of this is generated by scripts)
2004-12-01 23:16:38 +00:00
phk
5525bf4639 Use system wide no-op vfs_start function. 2004-11-25 09:11:27 +00:00
phk
37ad4f1923 Refuse attempts to mount root filesystem 2004-11-09 22:21:10 +00:00
phk
2d868d02cf Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version.  This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space.  A few places abused
it to get hold of some credentials to pass around.  Effectively
it is unused.

Reorganize the root filesystem selection code.
2004-07-30 22:08:52 +00:00
alfred
8a1713aada Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
2004-07-12 08:14:09 +00:00
marcel
32de0087b0 Update for the KDB framework:
o  Call kdb_enter() instead of Debugger().
o  Make debugging code conditional upon KDB instead of DDB.
2004-07-10 21:20:11 +00:00
imp
b49b7fe799 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
phk
fd139fd7d0 Initialize struct vfsops C99-sparsely.
Submitted by:   hmp
Reviewed by:	phk
2003-06-12 20:48:38 +00:00
phk
2ebd6ca61c Use temporary variable to avoid double expansion of macro with side effects.
Found by:       FlexeLint
2003-05-31 18:46:45 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00