414 Commits

Author SHA1 Message Date
mckusick
9d00a4a781 When scanning the freelist looking for candidate vnodes to recycle,
be sure to exit the loop with vp == NULL if no candidates are found.
Formerly, this bug would cause the last vnode inspected to be used,
even if it was not available. The result was a panic "vn_finished_write:
neg cnt".

Sponsored by:	DARPA & NAI Labs.
2002-10-14 19:54:39 +00:00
mckusick
05ff8976a7 Unconditionally reset vp->v_vnlock back to the default in the
vclean() function (e.g., vp->v_vnlock = &vp->v_lock) rather
than requiring filesystems that use alternate locks to do so
in their vop_reclaim functions. This change is a further cleanup
of the vop_stdlock interface.

Submitted by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
Sponsored by:	DARPA & NAI Labs.
2002-10-14 19:44:51 +00:00
mckusick
25230d4c6a Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by:	DARPA & NAI Labs.
2002-10-14 03:20:36 +00:00
mckusick
16ad96c43c When considering a vnode for reuse in getnewvnode, we call
vcanrecycle to check a free vnode's availability. If it is
available, vcanrecycle returns an error code of zero and the
vnode in question locked. The getnewvnode routine then used
to call vn_start_write with the V_NOWAIT flag. If the filesystem
was suspended while taking a snapshot, the vn_start_write would
fail but getnewvnode would fail to unlock the vnode, instead
leaving it locked on the freelist. The result would be that the
vnode would be locked forever and would eventually hang the
system with a race to the root when it was attempted to recycle
it. This fix moves the vn_start_write check into vcanrecycle
where it will properly unlock the vnode if it is unavailable
for recycling due to filesystem suspension.

Sponsored by:	DARPA & NAI Labs.
2002-10-11 01:04:14 +00:00
sobomax
18d9db4bb5 Fix problem introduced in rev.1.406, which can cause already unlocked
mutex being unlocked again causing system panic.
2002-10-05 12:56:10 +00:00
phk
b55fa4540e Fix some harmless mis-indents.
Spotted by:	FlexeLint
2002-10-01 15:48:31 +00:00
rwatson
5d5060bddf Move vnode MAC label initialization to after the release of the vnode
interlock in getnewvnode() to avoid possible sleeps while holding
the mutex.  Note that the warning from Witness is a slight false
positive since we know there will be no contention on the interlock
since we haven't made the vnode available for use yet, but the theory
is not a bad one.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-30 20:51:48 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
jeff
31b1ddae74 - Move ASSERT_VOP_*LOCK* functionality into functions in vfs_subr.c
- Make the VI asserts more orthogonal to the rest of the asserts by using a
   new, common vfs_badlock() function and adding a 'str' arg.
 - Adjust generated ASSERTS to match the new prototype.
 - Adjust explicit ASSERTS to match the new prototype.
2002-09-26 04:48:44 +00:00
jeff
ee7cd9172d - Lock down the syncer with sync_mtx.
- Enable vfs_badlock_mutex by default.
 - Assert that the vp is locked in VOP_UNLOCK.
 - Use standard interlock macros in remaining code.
 - Correct a race in getnewvnode().
 - Lock access to v_numoutput with interlock.
 - Lock access to buf lists and splay tree with interlock.
 - Add VOP and VI asserts.
 - Lock b_vnbufs with the vnode interlock.
 - Add vrefcnt() for callers who want to retreive the vnode ref without
   holding a lock.  Add a comment that describes when this is safe.
 - Add vholdl() and vdropl() so that callers who already own the interlock
   can avoid race conditions and unnecessary unlocking.
 - Move the VOP_GETATTR() in vflush() into the WRITECLOSE conditional case.
 - Hold the interlock before droping the mntlist_mtx in vflush() to avoid
   a race.
 - Fix locking in vfs_msync().
2002-09-25 02:22:21 +00:00
njl
00c79f5c92 Remove any VOP_PRINT that redundantly prints the tag.
Move lockmgr_printinfo() into vprint() for everyone's benefit.

Suggested by: bde
2002-09-18 20:42:04 +00:00
njl
0590c43070 Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by:   phk
Reviewed by:    bde, rwatson (earlier version)
2002-09-14 09:02:28 +00:00
julian
06f500f894 Indentation does not make a block.. need curly braces too.
Submitted by: Eagle-eyes evans <bde@freebsd.org>
2002-09-11 18:15:26 +00:00
julian
5702a380a5 Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00
phk
3303b3f624 Fix an inherited style bug: compare with NOCRED instead of NULL.
Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:46:19 +00:00
phk
55be95d161 Introduce new extattr_check_cred() function which implements the canonical
crential washing for extended attributes.

Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:38:57 +00:00
charnier
7dd9d47059 Replace various spelling with FALLTHROUGH which is lint()able 2002-08-25 13:23:09 +00:00
jeff
da601a39ac - Fix a mistake in my last few commits. The PDROP flag stops msleep from
re-acquiring the mutex.

Pointy hat to:	me
Noticed by:	tegge
2002-08-23 00:32:03 +00:00
jeff
6c5497f47a - Make vn_lock() vget() and VOP_LOCK() all behave the same way WRT
LK_INTERLOCK.  The interlock will never be held on return from these
   functions even when there is an error.  Errors typically only occur when
   the XLOCK is held which means this isn't the vnode we want anyway.  Almost
   all users of these interfaces expected this behavior even though it was
   not provided before.
2002-08-22 07:44:45 +00:00
jeff
1e39ba8620 - Fix interlock handling in vn_lock(). Previously, vn_lock() could return
with interlock held in error conditions when the caller did not specify
   LK_INTERLOCK.
 - Add several comments to vn_lock() describing the rational behind the code
   flow since it was not immediately obvious.
2002-08-22 06:51:06 +00:00
jeff
275611472a - Document two cases, one in vget and the other in vn_lock, where the state
of interlock on exit is not consistent.  There are probably several bugs
   relating to this.
2002-08-21 08:34:48 +00:00
jeff
ca5f1feb36 - If vn_lock fails with the LK_INTERLOCK flag set, interlock will not be
released.  vcanrecycle() failed to unlock interlock under this condition.
 - Remove an extra VOP_UNLOCK from a failure case in vcanrecycle().

Pointed out by:	rwatson
2002-08-21 06:40:34 +00:00
jeff
2fc7835d26 - Add two new debugging macros: ASSERT_VI_LOCKED and ASSERT_VI_UNLOCKED
- Use the new VI asserts in place of the old mtx_assert checks.
 - Add the VI asserts to the automated lock checking in the VOP calls.  The
   interlock should not be held across vops with a few exceptions.
 - Add the vop_(un)lock_{pre,post} functions to assert that interlock is held
   when LK_INTERLOCK is set.
2002-08-21 06:19:29 +00:00
jeff
d18378e088 - Extend the vnode_free_list_mtx to cover numvnodes and freevnodes. This
was done only some of the time before, and now it is uniformly applied.
2002-08-13 05:29:48 +00:00
mux
f43070c325 - Introduce a new struct xvfsconf, the userland version of struct vfsconf.
- Make getvfsbyname() take a struct xvfsconf *.
- Convert several consumers of getvfsbyname() to use struct xvfsconf.
- Correct the getvfsbyname.3 manpage.
- Create a new vfs.conflist sysctl to dump all the struct xvfsconf in the
  kernel, and rewrite getvfsbyname() to use this instead of the weird
  existing API.
- Convert some {set,get,end}vfsent() consumers to use the new vfs.conflist
  sysctl.
- Convert a vfsload() call in nfsiod.c to kldload() and remove the useless
  vfsisloadable() and endvfsent() calls.
- Add a warning printf() in vfs_sysctl() to tell people they are using
  an old userland.

After these changes, it's possible to modify struct vfsconf without
breaking the binary compatibility.  Please note that these changes don't
break this compatibility either.

When bp will have updated mount_smbfs(8) with the patch I sent him, there
will be no more consumers of the {set,get,end}vfsent(), vfsisloadable()
and vfsload() API, and I will promptly delete it.
2002-08-10 20:19:04 +00:00
jeff
f91961bfed - Move some logic from getnewvnode() to a new function vcanrecycle()
- Unlock the free list mutex around vcanrecycle to prevent a lock order
   reversal.
2002-08-05 10:15:56 +00:00
jeff
02517b6731 - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
rwatson
a5dcc1fd3d Include file cleanup; mac.h and malloc.h at one point had ordering
relationship requirements, and no longer do.

Reminded by:	bde
2002-08-01 17:47:56 +00:00
des
2ca172b725 Nit in previous commit: the correct sysctl type is "S,xvnode" 2002-07-31 12:25:28 +00:00
des
9c7ec03502 Initialize v_cachedid to -1 in getnewvnode().
Reintroduce the kern.vnode sysctl and make it export xvnodes rather than
vnodes.

Sponsored by:	DARPA, NAI Labs
2002-07-31 12:24:35 +00:00
rwatson
6bb9b1da05 Note that the privilege indicating flag to vaccess() originally used
by the process accounting system is now deprecated.
2002-07-31 02:05:12 +00:00
rwatson
261170743f Introduce support for Mandatory Access Control and extensible
kernel access control.

Invoke the necessary MAC entry points to maintain labels on vnodes.
In particular, initialize the label when the vnode is allocated or
reused, and destroy the label when the vnode is going to be released,
or reused.  Wow, an object where there really is exactly one place
where it's allocated, and one other where it's freed.  Amazing.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-07-31 02:03:46 +00:00
jeff
5dce00d8f1 - Backout the patch made in revision 1.75 of vfs_mount.c. The vputs here
were hiding the real problem of the missing unlock in sync_inactive.
 - Add the missing unlock in sync_inactive.

Submitted by:	iedowse
2002-07-29 06:26:55 +00:00
truckman
b1555a2743 Wire the sysctl output buffer before grabbing any locks to prevent
SYSCTL_OUT() from blocking while locks are held.  This should
only be done when it would be inconvenient to make a temporary copy of
the data and defer calling SYSCTL_OUT() until after the locks are
released.
2002-07-28 19:59:31 +00:00
rwatson
7be639a7c0 Teach discretionary access control methods for files about VAPPEND
and VALLPERM.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-07-22 03:57:07 +00:00
mckusick
b44cb5787c Add support to UFS2 to provide storage for extended attributes.
As this code is not actually used by any of the existing
interfaces, it seems unlikely to break anything (famous
last words).

The internal kernel interface to manipulate these attributes
is invoked using two new IO_ flags: IO_NORMAL and IO_EXT.
These flags may be specified in the ioflags word of VOP_READ,
VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that
you want to do I/O to the normal data part of the file and
IO_EXT means that you want to do I/O to the extended attributes
part of the file. IO_NORMAL and IO_EXT are mutually exclusive
for VOP_READ and VOP_WRITE, but may be specified individually
or together in the case of VOP_TRUNCATE. For example, when
removing a file, VOP_TRUNCATE is called with both IO_NORMAL
and IO_EXT set. For backward compatibility, if neither IO_NORMAL
nor IO_EXT is set, then IO_NORMAL is assumed.

Note that the BA_ and IO_ flags have been `merged' so that they
may both be used in the same flags word. This merger is possible
by assigning the IO_ flags to the low sixteen bits and the BA_
flags the high sixteen bits. This works because the high sixteen
bits of the IO_ word is reserved for read-ahead and help with
write clustering so will never be used for flags. This merge
lets us get away from code of the form:

        if (ioflags & IO_SYNC)
                flags |= BA_SYNC;

For the future, I have considered adding a new field to the
vattr structure, va_extsize. This addition could then be
exported through the stat structure to allow applications to
find out the size of the extended attribute storage and also
would provide a more standard interface for truncating them
(via VOP_SETATTR rather than VOP_TRUNCATE).

I am also contemplating adding a pathconf parameter (for
concreteness, lets call it _PC_MAX_EXTSIZE) which would
let an application determine the maximum size of the extended
atribute storage.

Sponsored by:	DARPA & NAI Labs.
2002-07-19 07:29:39 +00:00
mckusick
3abb526f86 Change utimes to set the file creation time (for filesystems that
support creation times such as UFS2) to the value of the
modification time if the value of the modification time is older
than the current creation time. See utimes(2) for further details.

Sponsored by:	DARPA & NAI Labs.
2002-07-17 02:03:19 +00:00
dillon
da4e111a55 Replace the global buffer hash table with per-vnode splay trees using a
methodology similar to the vm_map_entry splay and the VM splay that Alan
Cox is working on.  Extensive testing has appeared to have shown no
increase in overhead.

Disadvantages
    Dirties more cache lines during lookups.

    Not as fast as a hash table lookup (but still N log N and optimal
    when there is locality of reference).

Advantages
    vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem
    syncer operate more efficiently.

    I get to rip out all the old hacks (some of which were mine) that tried
    to keep the v_dirtyblkhd tailq sorted.

    The per-vnode splay tree should be easier to lock / SMPng pushdown on
    vnodes will be easier.

    This commit along with another that Alan is working on for the VM page
    global hash table will allow me to implement ranged fsync(), optimize
    server-side nfs commit rpcs, and implement partial syncs by the
    filesystem syncer (aka filesystem syncer would detect that someone is
    trying to get the vnode lock, remembers its place, and skip to the
    next vnode).

Note that the buffer cache splay is somewhat more complex then other splays
due to special handling of background bitmap writes (multiple buffers with
the same lblkno in the same vnode), and B_INVAL discontinuities between the
old hash table and the existence of the buffer on the v_cleanblkhd list.

Suggested by: alc
2002-07-10 17:02:32 +00:00
jeff
fe9018671a - Use standard locking functions in syncer's opv
- vput instead of vrele syncer vnodes in vfs_mount
 - Add vop_lookup_{pre,post} to verify locking in VOP_LOOKUP
2002-07-09 19:54:20 +00:00
jeff
cca3a0ef3d - Don't hold the vn lock while calling VOP_CLOSE in vclean(). 2002-07-07 06:38:22 +00:00
jeff
8bf1a039cb - BUF_REFCNT() seems to be the preferred method for verifying a locked buf.
Tell vop_strategy_pre() to use this instead.
 - Ignore B_CLUSTER bufs.  Their components are locked but they don't really
   exist so they don't have to be.  This isn't ideal but it is safe.
2002-07-07 05:29:45 +00:00
jeff
f1b0400267 Fix a mistake in my last commit. Don't grab an extra reference to the object
in bp->b_object.
2002-07-06 21:27:20 +00:00
jeff
0dd7645264 Fixup uses of GETVOBJECT.
- Cache a pointer to the vnode's object in the buf.
 - Hold a reference to that object in addition to the vnode's reference just
   to be consistent.
 - Cleanup code that got the object indirectly through the vp and VOP calls.

This fixes at least one case where we were calling GETVOBJECT without a lock.
It also avoids an expensive layered call at the cost of another pointer in
struct buf.
2002-07-06 08:59:52 +00:00
jeff
908b0eb9a7 - Add vop_strategy_pre to validate VOP_STRATEGY locking.
- Disable original vop_strategy lock specification.
 - Switch to the new vop_strategy_pre for lock validation.

VOP_STRATEGY requires only that the buf is locked UNLESS the block numbers need
to be translated.  There may be other reasons, but as long as the underlying
layer uses a VOP to perform the operations they will be caught later.
2002-07-06 05:21:12 +00:00
jeff
3bce786a77 Add "vop_rename_pre" to do pre rename lock verification. This is enabled only
with DEBUG_VFS_LOCKS.
2002-07-06 04:39:48 +00:00
mux
4f6ffa4183 Move vfs_rootmountalloc() in vfs_mount.c and remove lite2_vfs_mountroot()
which was #if 0'd and is not likely to be used now.
2002-07-03 09:27:24 +00:00
mux
eb5a0f4a7e Move every code related to mount(2) in a new file, vfs_mount.c.
The file vfs_conf.c which was dealing with root mounting has
been repo-copied into vfs_mount.c to preserve history.
This makes nmount related development easier, and help reducing
the size of vfs_syscalls.c, which is still an enormous file.

Reviewed by:	rwatson
Repo-copy by:	peter
2002-07-02 17:09:22 +00:00
iedowse
4416f82706 Use indirect function pointer hooks instead of #ifdef SOFTUPDATES
direct calls for the two places where the kernel calls into soft
updates code. Set up the hooks in softdep_initialize() and NULL
them out in softdep_uninitialize(). This change allows soft updates
to function correctly when ufs is loaded as a module.

Reviewed by:	mckusick
2002-07-01 17:59:40 +00:00
obrien
4db8ac83cb Rename the db command lockedvnodes to lockedvnods so that it fits on the
help screen and one doens't think we have a lockedvnodesmap command.
2002-06-29 04:45:09 +00:00
alfred
708aac7550 nuke caddr_t. 2002-06-28 23:17:36 +00:00