Commit Graph

190 Commits

Author SHA1 Message Date
John Baldwin
3634d5b241 Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and
LK_CANRECURSE after a lock is created.  Use them to implement macros that
otherwise manipulated the flags directly.  Assert that the associated
lockmgr lock is exclusively locked by the current thread when manipulating
these flags to ensure the flag updates are safe.  This last change required
some minor shuffling in a few filesystems to exclusively lock a brand new
vnode slightly earlier.

Reviewed by:	kib
MFC after:	3 days
2010-08-20 19:46:50 +00:00
Konstantin Belousov
5673e3cb08 The cache_enter(9) function shall not be called for doomed dvp.
Assert this.

In the reported panic, vdestroy() fired the assertion "vp has namecache
for ..", because pseudofs may end up doing cache_enter() with reclaimed
dvp, after dotdot lookup temporary unlocked dvp.
Similar problem exists in ufs_lookup() for "." lookup, when vnode
lock needs to be upgraded.

Verify that dvp is not reclaimed before calling cache_enter().

Reported and tested by:	pho
Reviewed by:	kan
MFC after:	2 weeks
2010-04-20 10:19:27 +00:00
Jaakko Heinonen
e48fbf26e8 Truncate read request rather than returning EIO if the request is
larger than MAXPHYS + 1. This fixes a problem with cat(1) when it
uses a large I/O buffer.

Reported by:	Fernando Apesteguía
Suggested by:	jilles
Reviewed by:	des
Approved by:	trasz (mentor)
2010-01-22 08:45:12 +00:00
Konstantin Belousov
481208a815 If a race is detected, pfs_vncache_alloc() may reclaim a vnode that had
never been inserted into the pfs_vncache list. Since pfs_vncache_free()
does not anticipate this case, it decrements pfs_vncache_entries
unconditionally; if the vnode was not in the list, pfs_vncache_entries
will no longer reflect the actual number of list entries. This may cause
size of the cache to exceed the configured maximum. It may also trigger
a panic during module unload or system shutdown.

Do not decrement pfs_vncache_entries for the vnode that was not in the
list.

Submitted by:	tegge
Reviewed by:	des
MFC after:	1 week
2009-09-07 12:10:41 +00:00
Konstantin Belousov
6cc745d2d7 insmntque_stddtr() clears vp->v_data and resets vp->v_op to
dead_vnodeops before calling vgone(). Revert r189706 and corresponding
part of the r186560.

Noted and reviewed by:	tegge
Approved by:	des (pseudofs part)
MFC after:	3 days
2009-09-07 11:55:34 +00:00
Konstantin Belousov
34f83c86f6 Remove spurious pfs_unlock().
PR:	kern/137310
Reviewed by:	des
MFC after:	3 days
2009-08-31 09:26:04 +00:00
Konstantin Belousov
9f80ce043d Change the type of uio_resid member of struct uio from int to ssize_t.
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.

Tested by:	pho
Reviewed by:	jhb, bde (as part of the review of the bigger patch)
2009-06-25 18:46:30 +00:00
Konstantin Belousov
c4df27d5c8 VOP_IOCTL takes unlocked vnode as an argument. Due to this, v_data may
be NULL or derefenced memory may become free at arbitrary moment.

Lock the vnode in cd9660, devfs and pseudofs implementation of VOP_IOCTL
to prevent reclaim; check whether the vnode was already reclaimed after
the lock is granted.

Reported by:	georg at dts su
Reviewed by:	des (pseudofs)
MFC after:	2 weeks
2009-06-10 13:57:36 +00:00
Dag-Erling Smørgrav
c097b30885 Drop Giant.
MFC after:	1 week
2009-06-06 00:44:13 +00:00
Konstantin Belousov
b00098d164 Unlock the pseudofs vnode before calling fill method for pfs_readlink().
The fill code may need to lock another vnode, e.g. procfs file
implementation.

Reviewed by:	des
Tested by:	pho
MFC after:	2 weeks
2009-05-31 15:01:50 +00:00
Dag-Erling Smørgrav
26088b9d4d Use a temporary variable to avoid a duplicate strlen().
Submitted by:	kib
MFC after:	1 week
2009-05-28 10:24:26 +00:00
Attilio Rao
dfd233edd5 Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
Dag-Erling Smørgrav
4970b8ae0a Remove spurious locking in pfs_write().
Reported by:	Andrew Brampton <me@bramp.net>
MFC after:	1 week
2009-04-08 09:02:42 +00:00
Dag-Erling Smørgrav
59fc1b068f Fix an inverted KASSERT. Add similar assertions in other similar places.
Reported by:	Andrew Brampton <me@bramp.net>
MFC after:	1 week
2009-04-07 16:13:10 +00:00
Dag-Erling Smørgrav
655fcdaa00 Fix a logic bug that caused the pfs_attr method to be called only for
PFS_PROCDEP nodes.

Submitted by:	Andrew Brampton <brampton@gmail.com>
MFC after:	2 weeks
2009-02-16 15:17:26 +00:00
Joe Marcus Clarke
4424c9d053 Fix a deadlock which can occur due to a pseudofs vnode not getting unlocked.
Reported by:	Richard Todd <rmtodd@ichotolot.servalan.com>
Reviewed by:	kib
Approved by:	kib
2009-01-09 22:06:48 +00:00
Joe Marcus Clarke
e7f54c1b71 Add a VOP_VPTOCNP implementation for pseudofs which covers file systems
such as procfs and linprocfs.

This implementation's locking was enhanced by kib.

Reviewed by:	kib
		des
Approved by:	des
		kib
Tested by:	pho
2008-12-30 21:49:39 +00:00
Konstantin Belousov
78e4cea909 When the insmntque() in the pfs_vncache_alloc() fails, vop_reclaim calls
pfs_vncache_free() that removes pvd from the list, while it is not yet
put on the list.

Prevent the invalid removal from the list by clearing pvd_next and
pvd_prev for the newly allocated pvd, and only move pfs_vncache list
head when the pvd was at the head.

Suggested and approved by:	des
MFC after:	2 weeks
2008-12-29 13:25:58 +00:00
Konstantin Belousov
505d02eebe Drop the pseudofs vnode lock around call to pfs_read handler. The handler
may need to lock arbitrary vnodes, causing either lock order reversal or
recursive vnode lock acquisition.

Tested by:	pho
Approved by:	des
MFC after:	2 weeks
2008-12-29 12:12:23 +00:00
Konstantin Belousov
99ec92c962 After the pfs_vncache_mutex is dropped, another thread may attempt to
do pfs_vncache_alloc() for the same pfs_node and pid. In this case, we
could end up with two vnodes for the pair. Recheck the cache under the
locked pfs_vncache_mutex after all sleeping operations are done [1].

This case mostly cannot happen now because pseudofs uses exclusive vnode
locking for lookup. But it does drop the vnode lock for dotdot lookups,
and Marcus' pseudofs_vptocnp implementation is vulnerable too.

Do not call free() on the struct pfs_vdata after insmntque() failure,
because vp->v_data points to the structure, and pseudofs_reclaim()
frees it by the call to pfs_vncache_free().

Tested by:	pho [1]
Approved by:	des
MFC after:	2 weeks
2008-12-29 12:07:18 +00:00
Edward Tomasz Napierala
15bc6b2bd8 Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by:	rwatson (mentor)
2008-10-28 13:44:11 +00:00
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
Konstantin Belousov
caf8aec886 fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs
initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(),
vattr_null() or by zeroing it. Remove these to allow preinitialization
of fields work in vn_stat(). This is needed to get birthtime initialized
correctly.

Submitted by:   Jaakko Heinonen <jh saunalahti fi>
Discussed on:   freebsd-fs
MFC after:	1 month
2008-09-20 19:50:52 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Attilio Rao
628f51d275 Introduce some functions in the vnode locks namespace and in the ffs
namespace in order to handle lockmgr fields in a controlled way instead
than spreading all around bogus stubs:
- VN_LOCK_AREC() allows lock recursion for a specified vnode
- VN_LOCK_ASHARE() allows lock sharing for a specified vnode

In FFS land:
- BUF_AREC() allows lock recursion for a specified buffer lock
- BUF_NOREC() disallows recursion for a specified buffer lock

Side note: union_subr.c::unionfs_node_update() is the only other function
directly handling lockmgr fields. As this is not simple to fix, it has
been left behind as "sole" exception.
2008-02-24 16:38:58 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
Alfred Perlstein
77465d9390 Get rid of qaddr_t.
Requested by: bde
2007-10-16 10:54:55 +00:00
John Baldwin
c1f7cf23b1 Use the correct pid when checking to see whether or not the /proc/<pid>
directory itself (rather than any of its contents) is visible to the
current thread.

MFC after:	1 week
PR:		kern/90063
Submitted by:	john of 8192.net
Approved by:	re (kensmith)
2007-10-05 17:37:25 +00:00
Bruce A. Mah
5cca41595d Fix off-by-one error (introduced in r1.60) that had the effect of
disallowing a read of exactly MAXPHYS bytes.

Reviewed by:	des, rdivacky
MFC after:	1 week
Sponsored by:	nCircle Network Security
2007-06-07 15:04:30 +00:00
Dag-Erling Smørgrav
b77d604841 Fix old locking bugs which were revealed when pseudofs was made MPSAFE.
Submitted by:	tegge
2007-04-23 19:17:01 +00:00
Dag-Erling Smørgrav
8edf8ae133 Avoid "unused variable" warning when building without PSEUDOFS_TRACE. 2007-04-15 20:35:18 +00:00
Dag-Erling Smørgrav
388596dffc Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE. 2007-04-15 17:10:01 +00:00
Dag-Erling Smørgrav
f61bc4ea5e Further pseudofs improvements:
The pfs_info mutex is only needed to lock pi_unrhdr.  Everything else
in struct pfs_info is modified only while Giant is held (during
vfs_init() / vfs_uninit()); add assertions to that effect.

Simplify pfs_destroy somewhat.

Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the
assertions which were added in the previous commit to ensure they were
consistent.

Assert that Giant is held while the vnode cache is initialized and
destroyed.  Also assert that the cache is empty when it is destroyed.

Rename the vnode cache mutex for consistency.

Fix a long-standing bug in pfs_getattr(): it would uncritically return
the node's pn_fileno as st_ino.  This would result in st_ino being 0
if the node had not previously been visited by readdir(), and also in
an incorrect st_ino for process directories and any files contained
therein.  Correct this by abstracting the fileno manipulations
previously done in pfs_readdir() into a new function, pfs_fileno(),
which is used by both pfs_getattr() and pfs_readdir().
2007-04-14 14:08:30 +00:00
Dag-Erling Smørgrav
15bad11fdb Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process-
specific nodes when the process exits)

Move the vnode-cache-walking loop which was duplicated in pfs_exit() and
pfs_disable() into its own function, pfs_purge(), which looks for vnodes
marked as dead and / or belonging to the specified pfs_node and reclaims
them.  Note that this loop is still extremely inefficient.

Add a comment in pfs_vncache_alloc() explaining why we have to purge the
vnode from the vnode cache before returning, in case anyone should be
tempted to remove the call to cache_purge().

Move the special handling for pfstype_root nodes into pfs_fileno_alloc()
and pfs_fileno_free() (the root node's fileno must always be 2).  This
also fixes a bug where pfs_fileno_free() would reclaim the root node's
fileno, triggering a panic in the unr code, as that fileno was never
allocated from unr to begin with.

When destroying a pfs_node, release its fileno and purge it from the
vnode cache.  I wish we could put off the call to pfs_purge() until
after the entire tree had been destroyed, but then we'd have vnodes
referencing freed pfs nodes.  This probably doesn't matter while we're
still under Giant, but might become an issue later.

When destroying a pseudofs instance, destroy the tree before tearing
down the fileno allocator.

In pfs_mount(), acquire the mountpoint interlock when required.

MFC after:	3 weeks
2007-04-11 22:40:57 +00:00
Dag-Erling Smørgrav
56c62ab69c Whitespace nits. 2007-04-05 13:43:00 +00:00
Tor Egge
61b9d89ff0 Make insmntque() externally visibile and allow it to fail (e.g. during
late stages of unmount).  On failure, the vnode is recycled.

Add insmntque1(), to allow for file system specific cleanup when
recycling vnode on failure.

Change getnewvnode() to no longer call insmntque().  Previously,
embryonic vnodes were put onto the list of vnode belonging to a file
system, which is unsafe for a file system marked MPSAFE.

Change vfs_hash_insert() to no longer lock the vnode.  The caller now
has that responsibility.

Change most file systems to lock the vnode and call insmntque() or
insmntque1() after a new vnode has been sufficiently setup.  Handle
failed insmntque*() calls by propagating errors to callers, possibly
after some file system specific cleanup.

Approved by:	re (kensmith)
Reviewed by:	kib
In collaboration with:	kib
2007-03-13 01:50:27 +00:00
Dag-Erling Smørgrav
771709eb78 Add a pn_destroy field to pfs_node. This field points to a destructor
function which is called from pfs_destroy() before the node is reclaimed.

Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor
function in addition to the usual attr / fill / vis pointers.

This breaks both the programming and binary interfaces between pseudofs
and its consumers.  It is believed that there are no pseudofs consumers
outside the source tree, so that the impact of this change is minimal.

Submitted by:	Aniruddha Bohra <bohra@cs.rutgers.edu>
2007-03-12 12:16:52 +00:00
John Baldwin
b082761327 Use the vnode interlock to close a race where pfs_vncache_alloc() could
attempt to vn_lock() a destroyed vnode resulting in a hang.

MFC after:	1 week
Submitted by:	ups
Reviewed by:	des
2007-01-02 17:27:52 +00:00
Alexander Leidinger
85646f7eb1 Correctly calculate a buffer length. It was off by one so a read() returned
one byte less than needed.

This is a RELENG_x_y candidate, since it fixes a problem with Oracle 10.

Noticed by:	Dmitry Ganenko <dima@apk-inform.com>
Testcase by:	Dmitry Ganenko <dima@apk-inform.com>
Reviewed by:	des
Submitted by:	rdivacky
Sponsored by:	Google SoC 2006
MFC after:	1 week
2006-06-27 20:21:38 +00:00
Kelly Yancey
c9ad8a67af Restore the ability to mount procfs and fdescfs filesystems via the
mount(2) system call:

  * Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and
    linprocfs).  This (mostly) restores the ability to mount these
    filesystems using the old mount(2) system call (see below for the
    rest of the fix).

  * Remove not-NULL check for the data argument from the mount(2) entry
    point.  Per the mount(2) man page, it is up to the individual
    filesystem being mounted to verify data.  Or, in the case of procfs,
    etc. the filesystem is free to ignore the data parameter if it does
    not use it.  Enforcing data to be not-NULL in the mount(2) system call
    entry point prevented passing NULL to filesystems which ignored the
    data pointer value.  Apparently, passing NULL was common practice
    in such cases, as even our own mount_std(8) used to do it in the
    pre-nmount(2) world.

All userland programs in the tree were converted to nmount(2) long ago,
but I've found at least one external program which broke due to this
(presumably unintentional) mount(2) API change.  One could argue that
external programs should also be converted to nmount(2), but then there
isn't much point in keeping the mount(2) interface for backward
compatibility if it isn't backward compatible.
2006-05-15 19:42:10 +00:00
John Baldwin
06ad42b2f7 Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT
  stop event earlier.  After we have signalled that, we set P_WEXIT and
  then wait for any processes with a hold on the vmspace via PHOLD to
  release it.  PHOLD now KASSERT()'s that P_WEXIT is clear when it is
  invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops
  to zero.
- Change proc_rwmem() to require that the processing read from has its
  vmspace held via PHOLD by the caller and get rid of all the junk to
  screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it
  doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers
  FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem()
  to clear an earlier single-step simualted via a breakpoint).  We only
  do one to avoid races.  Also, by making the EINVAL error for unknown
  requests be part of the default: case in the switch, the various
  switch cases can now just break out to return which removes a _lot_ of
  duplicated PRELE and proc unlocks, etc.  Also, it fixes at least one bug
  where a LWP ptrace command could return EINVAL with the proc lock still
  held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and
  ptrace_clear_single_step() to always be called with the proc lock
  held (it was a mixed bag previously).  Alpha and arm have to drop
  the lock while the mess around with breakpoints, but other archs
  avoid extra lock release/acquires in ptrace().  I did have to fix a
  couple of other consumers in kern_kse and a few other places to
  hold the proc lock and PHOLD.

Tested by:	ps (1 mostly, but some bits of 2-4 as well)
MFC after:	1 week
2006-02-22 18:57:50 +00:00
John Baldwin
f8e3eeb519 Change pfs_visible() to optionally return a pointer to the process
associated with the passed in pfs_node.  If it does return a pointer, it
keeps the process locked.  This allows a lot of places that were calling
pfind() again right after pfs_visible() to not have to do that and avoids
races since we don't drop the proc lock just to turn around and lock it
again.  This will become more important with future changes to fix races
between procfs/ptrace and exit(2).  Also, removed a duplicate pfs_visible()
call in pfs_getextattr().

Reviewed by:	des
MFC after:	1 week
2006-02-22 17:24:54 +00:00
Dag-Erling Smørgrav
8ab2a64d2f Eliminate an unnecessary bcopy(). 2005-08-12 12:22:05 +00:00
Jeff Roberson
8b3676f1a1 - Since we don't hold a usecount in pfs_exit we have to get a holdcnt
prior to calling vgone() to prevent any races.

Sponsored by:	Isilon Systems, Inc.
Approved by:	re (vfs blanket)
2005-07-07 07:33:10 +00:00
Dag-Erling Smørgrav
4cd27a97bc Fix an old pasto. 2005-04-30 16:27:20 +00:00
Jeff Roberson
4585e3ac5a - Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case.  Se vfs_lookup.c r1.79 for details.

Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:59:09 +00:00
Jeff Roberson
eddcb03d02 - We no longer have to bother with PDIRUNLOCK, lookup() handles it for us.
Sponsored by:   Isilon Systems, Inc.
2005-03-28 09:34:36 +00:00
Jeff Roberson
d9b2d9f7a2 - Update vfs_root implementations to match the new prototype. None of
these filesystems will support shared locks until they are explicitly
   modified to do so.  Careful review must be done to ensure that this
   is safe for each individual filesystem.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 07:36:16 +00:00
Poul-Henning Kamp
7f661c6ba1 Use subr_unit 2005-03-19 08:22:36 +00:00
Dag-Erling Smørgrav
0e3b5c73b2 Hook pfs_lookup() up to vfs_cachedlookup_desc instead of vfs_lookup_desc,
as suggested by Matt's comment.  Also fix some style and paranoia issues.

The entire function could benefit from review by a VFS guru.

MFC after:	6 weeks
2005-03-14 16:24:50 +00:00
Dag-Erling Smørgrav
bc593ccd83 Fix two long-standing bugs in pfs_readdir():
Since we used an sbuf of size resid to accumulate dirents, we would end
up returning one byte short when we had enough dirents to fill or exceed
the size of the sbuf (the last byte being lost to bogus NUL termination)
causing the next call to return EINVAL due to an unaligned offset.  This
went undetected for a long time because I did most of my testing in
single-user mode, where there are rarely enough processes to fill the
4096-byte buffer ls(1) uses.  The most common symptom of this bug is that
tab completion of /proc or /compat/linux/proc does not work properly when
many processes are running.

Also, a check near the top would return EINVAL if resid was smaller than
PFS_DELEN, even if it was 0, which is frequently the case and perfectly
allowable.  Change the test so that it returns 0 if resid is 0.

MFC after:	2 weeks
2005-03-14 16:21:32 +00:00
Dag-Erling Smørgrav
cb5abc7d2d If PSEUDOFS_TRACE is defined, create a sysctl knob to enable / disable
pseudofs call tracing.
2005-03-14 16:06:47 +00:00
Dag-Erling Smørgrav
de52d21a02 fbsdidize. 2005-03-14 15:54:11 +00:00
Jeff Roberson
8da0046596 - The VI_DOOMED flag now signals the end of a vnode's relationship with
the filesystem.  Check that rather than VI_XLOCK.
 - VOP_INACTIVE should no longer drop the vnode lock.
 - The vnode lock is required around calls to vrecycle() and vgone().

Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:18:25 +00:00
Poul-Henning Kamp
a24042b727 Avoid a couple of mutex operations in the process exit path for the
common case where procfs have never been mounted.

OK'ed by:	des
2005-03-01 12:20:49 +00:00
Poul-Henning Kamp
83c6439714 Whitespace in vop_vector{} initializations. 2005-01-13 18:59:48 +00:00
Robert Watson
f644bbc45c Annotate that pfs_exit() always acquires and releases two mutexes for
every process exist, even if procfs isn't mounted.  And one of those
mutexes is Giant.  No immediate thoughts on fixing this.
2005-01-08 04:56:38 +00:00
Poul-Henning Kamp
def91cf267 Use vfs_mountedfrom().
Since VFS_STATFS() always calls the filesystem with mp->mnt_stat now, the
vfs_statfs method is now a no-op.  Explain this in a comment.
2004-12-06 20:52:46 +00:00
Alexander Kabaev
f6968c4a99 Fix a typo in PFS_TRACE.
PR:		kern/74461
Submitted by:	Craig Rodrigues <rodrigc at crodrigues.org>
2004-12-06 20:07:17 +00:00
Poul-Henning Kamp
aec0fb7b40 Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

	Replace the entire vop_t frobbing thing with properly typed
	structures.  The only casualty is that we can not add a new
	VOP_ method with a loadable module.  History has not given
	us reason to belive this would ever be feasible in the the
	first place.

	Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

	Give coda correct prototypes and function definitions for
	all vop_()s.

	Generate a bit more data from the vnode_if.src file:  a
	struct vop_vector and protype typedefs for all vop methods.

	Add a new vop_bypass() and make vop_default be a pointer
	to another struct vop_vector.

	Remove a lot of vfs_init since vop_vector is ready to use
	from the compiler.

	Cast various vop_mumble() to void * with uppercase name,
	for instance VOP_PANIC, VOP_NULL etc.

	Implement VCALL() by making vdesc_offset the offsetof() the
	relevant function pointer in vop_vector.  This is disgusting
	but since the code is generated by a script comparatively
	safe.  The alternative for nullfs etc. would be much worse.

	Fix up all vnode method vectors to remove casts so they
	become typesafe.  (The bulk of this is generated by scripts)
2004-12-01 23:16:38 +00:00
Robert Watson
10b7196db4 Back out pseudo_vnops.c:1.45, which was a workaround for pfind()
returning incompletely initialized processes.  This problem was
eliminated by kern_proc.c:1.215, which causes pfind() not to
return processes in the PRS_NEW state.
2004-09-02 16:04:09 +00:00
Dag-Erling Smørgrav
c9b9a82654 Release the vnode cache mutex when calling vgone(), since vgone() may
sleep.  This makes pfs_exit() even less efficient than before, but on
the bright side, the vnode cache mutex no longer needs to be recursive.
2004-08-15 21:58:02 +00:00
Robert Watson
d990378077 Commit a work-around for a more general bug involving process state:
check whether p_ucred is NULL or not in pfs_getattr() before
dereferencing the credential, and return ENOENT if there wasn't one.

This is a symptom of a larger problem, wherein pfind() can return
references to incompletely initialized processes, and we instead ought
to not return them, or check the process state before acting on the
process.

Reported by:	kris
Discussed with:	tjr, others
2004-08-13 20:27:56 +00:00
Poul-Henning Kamp
5e8c582ac2 Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version.  This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space.  A few places abused
it to get hold of some credentials to pass around.  Effectively
it is unused.

Reorganize the root filesystem selection code.
2004-07-30 22:08:52 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Alfred Perlstein
f257b7a54b Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
2004-07-12 08:14:09 +00:00
Dag-Erling Smørgrav
195a6b21e4 Accumulate directory entries in a fixed-length sbuf, and uiomove them in
one go before returning.  This avoids calling uiomove() while holding
allproc_lock.

Don't adjust uio->uio_offset manually, uiomove() does that for us.

Don't drop allproc_lock before calling panic().

Suggested by:	alfred
2004-07-09 11:43:37 +00:00
Brian Feldman
6fedf94775 When taking event callbacks (like process_exit) out from under Giant, those
which do not lock Giant themselves will be exposed.  Unbreak pfs_exit().
2004-03-14 15:57:45 +00:00
Jacques Vidrine
a9c2bfa8e9 Fix a panic in pseudofs(9) that could occur when doing an I/O
operation with a large request or large offset.

Reported by:	Joel Ray Holveck <joelh@piquan.org>
Submitted by:	des
2004-02-10 21:06:47 +00:00
Dag-Erling Smørgrav
b331ec01c4 Constify, and add an API function to find a named node in a directory. 2003-12-07 17:41:19 +00:00
Jeff Roberson
9c695a2697 - Don't cache_purge() in *_reclaim routines. vclean() does it for us so
this is redundant.
2003-10-05 02:43:30 +00:00
Jacques Vidrine
8b7358ca43 Introduce a uiomove_frombuf helper routine that handles computing and
validating the offset within a given memory buffer before handing the
real work off to uiomove(9).

Use uiomove_frombuf in procfs to correct several issues with
integer arithmetic that could result in underflows/overflows.  As a
side-effect, the code is significantly simplified.

Add additional sanity checks when computing a memory allocation size
in pfs_read.

Submitted by:	rwatson  (original uiomove_frombuf -- bugs are mine :-)
Reported by:	Joost Pol <joost@pine.nl>  (integer underflows/overflows)
2003-10-02 15:00:55 +00:00
Dag-Erling Smørgrav
134ce0f9cc Add pfs_visible() checks to pfs_getattr() and pfs_getextattr(). This
also fixes pfs_access() since it relies on VOP_GETATTR() which will call
pfs_getattr().  This prevents jailed processes from discovering the
existence, start time and ownership of processes outside the jail.

PR:		kern/48156
2003-08-19 10:26:41 +00:00
John Baldwin
d49ebea58c Spell the name of the lock right in addition to getting the type right.
Submitted by:	Kim Culhan <kimc@w8hd.org>
2003-08-18 19:23:01 +00:00
John Baldwin
cda369cac4 The allproc lock is a sx lock, not a mutex, so fix the assertion. This
asserts that the sx lock is held, but does not specify if the lock is held
shared or exclusive, thus either type of lock satisfies the assertion.
2003-08-18 18:02:33 +00:00
Dag-Erling Smørgrav
653fae1761 Rework pfs_iterate() a bit to eliminate a bug related to process
directories.  Previously, pfs_iterate() would return -1 when it
reached the end of the process list while processing a process
directory node, even if the parent directory contained further nodes
(which is the case for the linprocfs root directory, where the process
directory node is actually first in the list).  With this patch,
pfs_iterate() will continue to traverse the parent directory's node
list after exhausting the process list (as was the intention all
along).  The code should hopefully be easier to read as well.

While I'm here, have pfs_iterate() assert that the allproc lock is
held.
2003-08-18 13:36:09 +00:00
John-Mark Gurney
efe0afa930 fix grammar in comment 2003-06-20 23:29:04 +00:00
Poul-Henning Kamp
7652131bee Initialize struct vfsops C99-sparsely.
Submitted by:   hmp
Reviewed by:	phk
2003-06-12 20:48:38 +00:00
Don Lewis
64820e19bc Don't unlock the parent directory vnode twice if the ISDOTDOT flag
is set.
2003-06-01 09:16:26 +00:00
Alexander Kabaev
104a9b7e3e Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
John Baldwin
75b8b3b25c Replace the at_fork, at_exec, and at_exit functions with the slightly more
flexible process_fork, process_exec, and process_exit eventhandlers.  This
reduces code duplication and also means that I don't have to go duplicate
the eventhandler locking three more times for each of at_fork, at_exec, and
at_exit.

Reviewed by:	phk, jake, almost complete silence on arch@
2003-03-24 21:15:35 +00:00
Alexander Kabaev
c162e9c2eb Rename vfs_stdsync function to vfs_stdnosync which matches more
closely what function is really doing. Update all existing consumers
to use the new name.

Introduce a new vfs_stdsync function, which iterates over mount
point's vnodes and call FSYNC on each one of them in turn.

Make nwfs and smbfs use this new function instead of rolling their
own identical sync implementations.

Reviewed by:	jeff
2003-03-11 22:15:10 +00:00
Dag-Erling Smørgrav
7b726be320 Get rid of caddr_t. 2003-03-02 22:23:45 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Tim J. Robbins
613fcc1359 Do not allow a cached vnode to be shared among multiple mounts of the same
kind of pseudofs-based filesystem. Fixes (at least) one problem where
when procfs is mounted mupltiple times, trying to unmount one will often
cause the wrong one to get unmounted, and other problem where mounting
one procfs on top of another caused the kernel to lock up.

Reviewed by:		des
2003-01-28 09:21:42 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Robert Watson
4273cc51a5 GC an unused reference to vop_refreshlabel_desc; reference to
opt_mac.h was removed previously so it was never compiled in.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-01-21 04:05:37 +00:00
Poul-Henning Kamp
c6e3ae999b Since Jeffr made the std* functions the default in rev 1.63 of
kern/vfs_defaults.c it is wrong for the individual filesystems to use
the std* functions as that prevents override of the default.

Found by:       src/tools/tools/vop_table
2003-01-04 08:47:19 +00:00
Robert Watson
763bbd2f4f Slightly change the semantics of vnode labels for MAC: rather than
"refreshing" the label on the vnode before use, just get the label
right from inception.  For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system.  With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance.  This
also corrects sematics for shared vnode locks, which were not
previously present in the system.  This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form.  With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception.  We'll introduce a work around for this shortly.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-26 14:38:24 +00:00
Poul-Henning Kamp
ce2fb5776b '&' not used for pointers to functions.
Spotted by:	FlexeLint
2002-10-20 21:31:16 +00:00
Poul-Henning Kamp
65a728a53b Plug an infrequent (I think) memory leak.
Spotted by:	FlexeLint
2002-10-15 18:51:02 +00:00
Kirk McKusick
a5b65058d5 Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by:	DARPA & NAI Labs.
2002-10-14 03:20:36 +00:00
Jeff Roberson
4d93c0be1f - Use vrefcnt() where it is safe to do so instead of doing direct and
unlocked accesses to v_usecount.
 - Lock access to the buf lists in the various sync routines.  interlock
   locking could be avoided almost entirely in leaf filesystems if the
   fsync function had a generic helper.
2002-09-25 02:32:42 +00:00
Nate Lawson
06be2aaa83 Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by:   phk
Reviewed by:    bde, rwatson (earlier version)
2002-09-14 09:02:28 +00:00
Jeff Roberson
e6e370a7fe - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
Robert Watson
dee93f2c52 Introduce support for Mandatory Access Control and extensible
kernel access control.

Modify pseudofs so that it can support synthetic file systems with
the multilabel flag set.  In particular, implement vop_refreshlabel()
as pn_refreshlabel().  Implement pfs_refreshlabel() to invoke this,
and have it fall back to the mount label if the file system does
not implement pn_refreshlabel() for the node.  Otherwise, permit
the file system to determine how the service is provided.

Approved by:	des
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-01 01:33:12 +00:00
Jeff Roberson
922b974a44 Lock down pseudofs:
- Initialize lock structure in vncache_alloc
 - Return locked vnodes from vncache_alloc
 - Setup vnode op vectors to use default lock, unlock, and islocked
 - Implement simple locking scheme required for lookup
2002-07-08 01:50:14 +00:00
Dag-Erling Smørgrav
a3d37b1322 Gratuitous whitespace cleanup. 2002-06-06 16:59:24 +00:00
John Baldwin
f44d9e24fb Change p_can{debug,see,sched,signal}()'s first argument to be a thread
pointer instead of a proc pointer and require the process pointed to
by the second argument to be locked.  We now use the thread ucred reference
for the credential checks in p_can*() as a result.  p_canfoo() should now
no longer need Giant.
2002-05-19 00:14:50 +00:00