Commit Graph

594 Commits

Author SHA1 Message Date
truckman
9ed03e6eb3 When shutting down the syncer kernel thread, first tell it to run
faster and iterate to over its work list a few times in an attempt
to empty the work list before the syncer terminates.  This leaves
fewer dirty blocks to be written at the "syncing disks" stage and
keeps the the "giving up on N buffers" problem from being triggered
by the presence of a large soft updates work list at system shutdown
time.  The downside is that the syncer takes noticeably longer to
terminate.

Tested by:	"Arjan van Leeuwen" <avleeuwen AT piwebs DOT com>
Approved by:	mckusick
2004-07-01 23:59:19 +00:00
phk
40dd98a3bd Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
phk
dfd1f7fd50 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
phk
ba2d5c4937 Remove a left over from userland buffer-cache access to disks. 2004-06-14 14:25:03 +00:00
rwatson
afc098b3e1 Assert Giant in vrele(). 2004-05-31 19:06:01 +00:00
mux
79217d1505 Put deprecated sysctl code inside BURN_BRIDGES. 2004-04-11 21:09:22 +00:00
imp
74cf37bd00 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00
peter
5e91995e52 Kill some XXXKSE's. vnlru/syncer are single threaded. 2004-03-29 22:45:33 +00:00
phk
2a5e157787 Properly vector all bwrite() and BUF_WRITE() calls through the same path
and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().
2004-03-11 18:02:36 +00:00
phk
eeb7579130 Remove unused second arg to vfinddev().
Don't call addaliasu() on VBLK nodes.
2004-03-11 16:33:11 +00:00
kan
e795b7939d Always call vn_finished_write after vn_start_write was called. All
occurences of 'goto done' after vn_start_write invocation were cleaning
up incompletely.
2004-03-06 04:09:54 +00:00
jhb
d25301c858 Switch the sleep/wakeup and condition variable implementations to use the
sleep queue interface:
- Sleep queues attempt to merge some of the benefits of both sleep queues
  and condition variables.  Having sleep qeueus in a hash table avoids
  having to allocate a queue head for each wait channel.  Thus, struct cv
  has shrunk down to just a single char * pointer now.  However, the
  hash table does not hold threads directly, but queue heads.  This means
  that once you have located a queue in the hash bucket, you no longer have
  to walk the rest of the hash chain looking for threads.  Instead, you have
  a list of all the threads sleeping on that wait channel.
- Outside of the sleepq code and the sleep/cv code the kernel no longer
  differentiates between cv's and sleep/wakeup.  For example, calls to
  abortsleep() and cv_abort() are replaced with a call to sleepq_abort().
  Thus, the TDF_CVWAITQ flag is removed.  Also, calls to unsleep() and
  cv_waitq_remove() have been replaced with calls to sleepq_remove().
- The sched_sleep() function no longer accepts a priority argument as
  sleep's no longer inherently bump the priority.  Instead, this is soley
  a propery of msleep() which explicitly calls sched_prio() before
  blocking.
- The TDF_ONSLEEPQ flag has been dropped as it was never used.  The
  associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been
  dropped and replaced with a single explicit clearing of td_wchan.
  TD_SET_ONSLEEPQ() would really have only made sense if it had taken
  the wait channel and message as arguments anyway.  Now that that only
  happens in one place, a macro would be overkill.
2004-02-27 18:52:44 +00:00
truckman
1de257deb3 Split the mlock() kernel code into two parts, mlock(), which unpacks
the syscall arguments and does the suser() permission check, and
kern_mlock(), which does the resource limit checking and calls
vm_map_wire().  Split munlock() in a similar way.

Enable the RLIMIT_MEMLOCK checking code in kern_mlock().

Replace calls to vslock() and vsunlock() in the sysctl code with
calls to kern_mlock() and kern_munlock() so that the sysctl code
will obey the wired memory limits.

Nuke the vslock() and vsunlock() implementations, which are no
longer used.

Add a member to struct sysctl_req to track the amount of memory
that is wired to handle the request.

Modify sysctl_wire_old_buffer() to return an error if its call to
kern_mlock() fails.  Only wire the minimum of the length specified
in the sysctl request and the length specified in its argument list.
It is recommended that sysctl handlers that use sysctl_wire_old_buffer()
should specify reasonable estimates for the amount of data they
want to return so that only the minimum amount of memory is wired
no matter what length has been specified by the request.

Modify the callers of sysctl_wire_old_buffer() to look for the
error return.

Modify sysctl_old_user to obey the wired buffer length and clean up
its implementation.

Reviewed by:	bms
2004-02-26 00:27:04 +00:00
phk
711ff67b90 Check for NODEV return from udev2dev() 2004-02-21 23:52:03 +00:00
phk
5551e292d8 Device megapatch 6/6:
This is what we came here for:  Hang dev_t's from their cdevsw,
refcount cdevsw and dev_t and generally keep track of things a lot
better than we used to:

Hold a cdevsw reference around all entrances into the device driver,
this will be necessary to safely determine when we can unload driver
code.

Hold a dev_t reference while the device is open.

KASSERT that we do not enter the driver on a non-referenced dev_t.

Remove old D_NAG code, anonymous dev_t's are not a problem now.

When destroy_dev() is called on a referenced dev_t, move it to
dead_cdevsw's list.  When the refcount drops, free it.

Check that cdevsw->d_version is correct.  If not, set all methods
to the dead_*() methods to prevent entrance into driver.  Print
warning on console to this effect.  The device driver may still
explode if it is also incompatible with newbus, but in that case
we probably didn't get this far in the first place.
2004-02-21 21:57:26 +00:00
phk
39fb4aef3d Device megapatch 5/6:
Remove the unused second argument from udev2dev().

Convert all remaining users of makedev() to use udev2dev().  The
semantic difference is that udev2dev() will only locate a pre-existing
dev_t, it will not line makedev() create a new one.

Apart from the tiny well controlled windown in D_PSEUDO drivers,
there should no longer be any "anonymous" dev_t's in the system
now, only dev_t's created with make_dev() and make_dev_alias()
2004-02-21 21:32:15 +00:00
kan
443a35f2bf More style fixes.
Obtained from:	bde
2004-01-05 23:40:46 +00:00
kan
2ffbc2464e style(9):
Add empty line before first code line in functions with no local
variables.
Properly terminate comment sentences.
Indent lines which are longer that 80 characters.
Move v_addpollinfo closer to the rest of poll-related functions.
Move DEBUG_VFS_LOCKS ifdefed block to the end of file.

Obtained from:	bde (partly)
2004-01-05 19:04:29 +00:00
kan
2872921967 Cosmetics: strip '\n' from a string passed to Debugger(). 2004-01-04 03:42:20 +00:00
bde
7d91626477 v_vxproc was a bogus name for a thread (pointer). 2003-12-28 09:12:56 +00:00
jeff
5443bd4c65 - In vget() if LK_NOWAIT is specified we should return EBUSY and not ENOENT.
Submitted by:	Stephan Uphoff <ups@stups.com>
2003-12-16 17:08:27 +00:00
jeff
aa712bc6e4 - When doing a forced unmount, VFS attempts to keep VCHR vnodes valid by
reassigning their v_ops field to specfs, detaching from the mountpoint, etc.
   However, this is not sufficient.  If we vclean() the vnode the pages owned
   by the vnode are lost, potentially while buffers reference them.  Implement
   parts of vclean() seperately in vgonechrl() so that the pages and bufs
   associated with a device vnode are not destroyed while in use.
2003-12-16 17:05:05 +00:00
jeff
e35dcab926 - Don't forget to unlock the vnode interlock in the LK_NOWAIT case.
Submitted by:	Stephan Uphoff <ups@stups.com>
Approved by:	re (rwatson)
2003-11-30 22:09:58 +00:00
tanimura
7eade05dfa - Implement selwakeuppri() which allows raising the priority of a
thread being waken up.  The thread waken up can run at a priority as
  high as after tsleep().

- Replace selwakeup()s with selwakeuppri()s and pass appropriate
  priorities.

- Add cv_broadcastpri() which raises the priority of the broadcast
  threads.  Used by selwakeuppri() if collision occurs.

Not objected in:	-arch, -current
2003-11-09 09:17:26 +00:00
kan
36d60f3bb7 Remove mntvnode_mtx and replace it with per-mountpoint mutex.
Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to
operate on this mutex transparently.

Eventually new mutex will be protecting more fields in
struct mount, not only vnode list.

Discussed with: jeff
2003-11-05 04:30:08 +00:00
wollman
dabc1a332d Add appropriate const poisoning to the assert_*locked() family so that I can
call ASSERT_VOP_LOCKED(vp, __func__) without a diagnostic.

Inspired by:	the evil and rude OpenAFS cache manager code
2003-10-23 18:17:36 +00:00
alc
512489f301 Initialize the buf's b_object in pbgetvp(). Clear it in pbrelvp(). (This
facilitates synchronization of the vm page's valid field using the
vm object's lock.)

Suggested by:	tegge
2003-10-20 18:24:38 +00:00
phk
888092f317 Simplify count_dev() 2003-10-17 11:56:48 +00:00
phk
edaf9214c4 Simplify vn_isdisk() a bit. 2003-10-12 14:04:39 +00:00
jeff
5b9cc4b22e - Fix a typo, I meant & and not |. This was causing lockups from the syncer
looping forever due to list corruption.

Solved by:	tegge
2003-10-11 21:50:45 +00:00
jeff
2c3fea92c8 - Fix an XXX. Check the error of vn_lock() in vflush(). Don't specify
LK_RETRY either, we don't want this vnode if it turns into another.
 - Remove the code that checks the mount point after acquiring the lock
   we are guaranteed to either fail or get the vnode that we wanted.
2003-10-05 07:12:38 +00:00
jeff
6b8324e841 - Rename vcanrecycle() to vtryrecycle() to reflect its new role.
- In vtryrecycle() try to vgonel the vnode if all of the previous checks
   passed.  We won't vgonel if someone has either acquired a hold or usecount
   or started the vgone process elsewhere.  This is because we may have been
   removed from the free list while we were inspecting the vnode for
   recycling.
 - The VI_TRYLOCK stops two threads from entering getnewvnode() and recycling
   the same vnode.  To further reduce the likelyhood of this event, requeue
   the vnode on the tail of the list prior to calling vtryrecycle().  We can
   not actually remove the vnode from the list until we know that it's
   going to be recycled because other interlock holders may see the VI_FREE
   flag and try to remove it from the free list.
 - Kill a bogus XXX comment.  If XLOCK is set we shouldn't wait for it
   regardless of MNT_WAIT because the vnode does not actually belong to
   this filesystem.
2003-10-05 05:35:41 +00:00
jeff
e15704d590 - Don't cache_purge() in getnewvnode. It's done in vclean(). With this
purge, the purge in vclean, and the filesystems purge, we had 3 purges
   per vnode.
 - Move the insmntque(vp, 0) to vclean() so that we may remove it from the
   two vgone() functions and reduce the number of lock operations required.
2003-10-05 02:48:04 +00:00
jeff
8d0f78003f - Solve a LOR with the sync_mtx by using the VI_ONWORKLST flag to determine
whether or not the sync failed.  This could potentially get set between
   the time that we VOP_UNLOCK and VI_LOCK() but the race would harmelssly
   lead to the sync being delayed by an extra 30 seconds.  If we do not move
   the vnode it could cause an endless loop if it continues to fail to sync.
 - Use vhold and vdrop to stop the vnode from changing identities while we
   have it unlocked.  Other internal vfs lists are likely to follow this
   scheme.
2003-10-05 00:35:41 +00:00
jeff
8d72c43763 - Move the xlock 'locking' code into vx_lock() and vx_unlock().
- Create a new function, vgonechrl(), which performs vgone for an in-use
   character device.  Move the code from vflush() that did this into
   vgonechrl().
 - Hold the xlock across the entirety of vgonel() and vgonechrl() so that
   at no point will an invalid vnode exist on any list without XLOCK set.
 - Move the xlock code out of vclean() now that it is in the vgone*()
   functions.
2003-10-05 00:02:41 +00:00
jeff
55547647ec - In sched_sync() test our preconditions prior to dropping the sync_mtx.
This is so that we may grab the interlock while still holding the
   sync_mtx.  We have to VI_TRYLOCK() because in all other cases the lock
   order runs the other way.
 - If we don't meet any of the preconditions, reinsert the vp into the
   list for the next second.
 - We don't need to panic if we fail to sync here because each FSYNC
   function handles this case.  Removing this redundant code also
   simplifies locking.
2003-10-04 18:03:53 +00:00
jeff
bb40013912 - In a Giantless world, the vn_lock() in vcanrecycle() could legitimately
fail.  Remove the panic from that case and document why it might fail.
 - Document the reason for calling cache_purge() on a newly created vnode.
 - In insmntque() order the operations so that we can call mtx_unlock()
   one fewer times.  This makes the code somewhat clearer as well.
 - Add XXX comments in sched_sync() and vflush().
 - In vget(), do not sleep while waiting for XLOCK to clear if LK_NOWAIT is
   set.
 - In vclean() we don't need to acquire a lock around a single TAILQ_FIRST
   call.  It's ok if we race here, the vinvalbuf will just do nothing.
 - Increase the scope of the lock in vgonel() to reduce the number of lock
   operations that are performed.
2003-10-04 15:10:40 +00:00
jeff
517dcea6c8 - In reassignbuf() don't unlock vp and lock newvp if they are the same.
Doing so creates a race where the buf is on neither list.
 - Only vfree() in an error case in vclean() if VSHOULDFREE() thinks we
   should.
 - Convert the error case in vclean() to INVARIANTS from DIAGNOSTIC as this
   really should not happen and is fast to check.
2003-09-20 00:21:48 +00:00
jeff
45f3b1b270 - Remove spls(). The locking that has replaced them is in place and they
no longer serve as guidelines for future work.
2003-09-19 23:52:06 +00:00
kan
cf77f9f005 Eliminate one case of VI_UNLOCK followed by an immediate
VI_LOCK.
2003-09-19 19:13:54 +00:00
jhb
37641f86f1 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
phk
eb30c92e49 Revert stuff which accidentally ended up in the previous commit. 2003-07-22 10:36:36 +00:00
phk
c4a9334fa6 Don't attempt to inline large functions mb_alloc() and mb_free(),
it more than doubles the text size of this file.

GCC has wisely ignored us on this previously
2003-07-22 10:24:41 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
phk
ddccb1d287 Remove unused variable and now unbalanced call to splbio();
Found by:       FlexeLint
2003-05-31 20:09:01 +00:00
alc
53638c7027 Make the maximum number of vnodes a function of both the physical memory
size and the kernel's heap size, specifically, vm_kmem_size.  This
function allows a maximum of 40% of the vm_kmem_size to be used for
vnodes and vm objects.  This is a conservative bound based upon recent
problem reports.  (In other words, a slight increase in this percentage
may be safe.)

Finally, machines with less than ~3GB of RAM should be unaffected
by this change, i.e., the maximum number of vnodes should remain
the same.  If necessary, machines with 3GB or more of RAM can increase
the maximum number of vnodes by increasing vm_kmem_size.

Desired by:	scottl
Tested by:	jake
Approved by:	re (rwatson,scottl)
2003-05-23 19:54:02 +00:00
truckman
80040f21a3 Detect that a vnode has been reclaimed while vflush() was waiting to lock
the vnode and restart the loop.  Vflush() is vulnerable since it does not
hold a reference to the vnode and it holds no other locks while waiting
for the vnode lock.  The vnode will no longer be on the list when the
loop is restarted.

Approved by:	re (rwatson)
2003-05-16 19:46:51 +00:00
alc
0422418ef4 Optimize the use of splay in gbincore(). During a "make buildworld" the
desired buffer is found at one of the roots more than 60% of the time.
Thus, checking both roots before performing either splay eliminates
unnecessary splays on the first tree splayed.

Approved by:	re (jhb)
2003-05-13 04:36:02 +00:00
rwatson
c21d149f29 Remove bogus locking from DDB's "show lockedvnods" command: using
synchronization primitives from inside DDB is generally a bad idea,
and in this case it frequently results in panics due to DDB commands
being executed from the sio fast interrupt context on a serial
console.  Replace the locking with a note that a lack of locking
means that DDB may get see inconsistent views of the mount and vnode
lists, which could also result in a panic.  More frequently,
though, this avoids a panic than causes it.

Discussed with ages ago:	bde
Approved by:			re (scottl)
2003-05-12 14:37:47 +00:00
alc
410b675ed9 - Revert kern/vfs_subr.c revision 1.444. The vm_object's size isn't
trustworthy for vnode-backed objects.
 - Restore the old behavior of vm_object_page_remove() when the end
   of the given range is zero.  Add a comment to vm_object_page_remove()
   regarding this behavior.

Reported by:	iedowse
2003-05-03 08:09:24 +00:00
alc
e9c4374a87 Lock accesses to the vm_object's ref_count and resident_page_count. 2003-05-01 03:10:38 +00:00
alc
d5ac0bc453 Various changes to vm_object_page_remove():
- Eliminate an odd, special-case feature:
   if start == end == 0 then all pages are removed.  Only one caller
   used this feature and that caller can trivially pass the object's
   size.
 - Assert that the vm_object is locked on entry; don't bother testing
   for a NULL vm_object.
 - Style: Fix lines that are longer than 80 characters.
2003-04-26 23:41:30 +00:00
alc
373b18b5c3 - Convert vm_object_pip_wait() from using tsleep() to msleep().
- Make vm_object_pip_sleep() static.
 - Lock the vm_object when performing vm_object_pip_wait().
2003-04-26 18:33:18 +00:00
alc
87da2c3cf3 - Acquire the vm_object's lock when performing vm_object_page_clean().
- Add a parameter to vm_pageout_flush() that tells vm_pageout_flush()
   whether its caller has locked the vm_object.  (This is a temporary
   measure to bootstrap vm_object locking.)
2003-04-24 04:31:25 +00:00
alc
83fe46be18 Update locking around vm_object_page_remove() to use the new macros. 2003-04-18 16:39:03 +00:00
alc
227f7746c4 Use vm_object_pip_wait() rather than reimplementing it. 2003-04-13 05:10:44 +00:00
tegge
5e14826743 Adjust the number of vnodes scanned by vlrureclaim() according to the
size of the vnode list.
2003-03-26 22:15:58 +00:00
yar
f9968b4d9f We shouldn't assert that a vode is locked in vop_lock_post()
if VOP_LOCK() has failed.

Reviewed by:	jeff
2003-03-22 13:21:54 +00:00
jeff
ec5374265b - Remove a dead check for bp->b_vp == vp in vtruncbuf(). This has not been
possible for some time.
 - Lock the buf before accessing fields.  This should very rarely be locked.
 - Assert that B_DELWRI is set after we acquire the buf.  This should always
   be the case now.
2003-03-13 07:22:53 +00:00
jeff
ae3c8799da - Remove a race between fsync like functions and flushbufqueues() by
requiring locked bufs in vfs_bio_awrite().  Previously the buf could
   have been written out by fsync before we acquired the buf lock if it
   weren't for giant.  The cluster_wbuild() handles this race properly but
   the single write at the end of vfs_bio_awrite() would not.
 - Modify flushbufqueues() so there is only one copy of the loop.  Pass a
   parameter in that says whether or not we should sync bufs with deps.
 - Call flushbufqueues() a second time and then break if we couldn't find
   any bufs without deps.
2003-03-13 07:19:23 +00:00
alc
c50367da67 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
njl
5a225ad933 Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
2003-03-03 19:15:40 +00:00
jeff
8e95e91722 - Hold the vnode interlock across calls to bgetvp instead of acquiring it
internally.  This is required to stop multiple bufs from being associated
   with a single lblkno.
2003-03-02 06:05:23 +00:00
jeff
98d7696db0 - gc USE_BUFHASH. The smp locking of the buf cache renders this useless. 2003-03-01 05:55:03 +00:00
mckusick
6e9f6f2d6d Prevent large files from monopolizing the system buffers. Keep
track of the number of dirty buffers held by a vnode. When a
bdwrite is done on a buffer, check the existing number of dirty
buffers associated with its vnode. If the number rises above
vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one
of the other (hopefully older) dirty buffers associated with
the vnode is written (using bawrite). In the event that this
approach fails to curb the growth in it the vnode's number of
dirty buffers (due to soft updates rollback dependencies),
the more drastic approach of doing a VOP_FSYNC on the vnode
is used. This code primarily affects very large and actively
written files such as snapshots. This change should eliminate
hanging when taking snapshots or doing background fsck on
very large filesystems.

Hopefully, one day it will be possible to cache filesystem
metadata in the VM cache as is done with file data. As it
stands, only the buffer cache can be used which limits total
metadata storage to about 20Mb no matter how much memory is
available on the system. This rather small memory gets badly
thrashed causing a lot of extra I/O. For example, taking a
snapshot of a 1Tb filesystem minimally requires about 35,000
write operations, but because of the cache thrashing (we only
have about 350 buffers at our disposal) ends up doing about
237,540 I/O's thus taking twenty-five minutes instead of four
if it could run entirely in the cache.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 06:44:42 +00:00
jeff
9e4c9a6ce9 - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
   these fields instead.
 - Hold the vnode interlock while locking bufs on the clean/dirty queues.
   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
   BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by:	arch, mckusick
2003-02-25 03:37:48 +00:00
phk
af9c7adfc3 Bracket the kern.vnode sysctl in #ifdef notyet because it results
in massive locking issues on diskless systems.

It is also not clear that this sysctl is non-dangerous in its
requirements for locked down memory on large RAM systems.
2003-02-23 18:09:05 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
iedowse
1ec5f03e6d Add a new vnode flag VI_DOINGINACT to indicate that a VOP_INACTIVE
call is in progress on the vnode. When vput() or vrele() sees a
1->0 reference count transition, it now return without any further
action if this flag is set. This flag is necessary to avoid recursion
into VOP_INACTIVE if the filesystem inactive routine causes the
reference count to increase and then drop back to zero. It is also
used to guarantee that an unlocked vnode will not be recycled while
blocked in VOP_INACTIVE().

There are at least two cases where the recursion can occur: one is
that the softupdates code called by ufs_inactive() via ffs_truncate()
can call vput() on the vnode. This has been reported by many people
as "lockmgr: draining against myself" panics. The other case is
that nfs_inactive() can call vget() and then vrele() on the vnode
to clean up a sillyrename file.

Reviewed by:	mckusick (an older version of the patch)
2002-12-29 18:30:49 +00:00
phk
2eae537376 Use a timeout of one second while we wait for the vnode washer,
this prevents a potential race and makes the system a little bit
less jerky under extreme loads.
2002-12-29 11:18:25 +00:00
phk
90510abb6e Vnodes pull in 800-900 bytes these days, all things counted, so we need
to treat desiredvnodes much more like a limit than as a vague concept.

On a 2GB RAM machine where desired vnodes is 130k, we run out of
kmem_map space when we hit about 190k vnodes.

If we wake up the vnode washer in getnewvnode(), sleep until it is done,
so that it has a chance to offer us a washed vnode.  If we don't sleep
here we'll just race ahead and allocate yet a vnode which will never
get freed.

In the vnodewasher, instead of doing 10 vnodes per mountpoint per
rotation, do 10% of the vnodes distributed evenly across the
mountpoints.
2002-12-29 10:39:05 +00:00
phk
1496f0639d KASSERT that vop_revoke() gets a VCHR. 2002-12-28 22:27:14 +00:00
alc
6be448f264 Perform vm_object_lock() and vm_object_unlock() around
vm_object_page_remove().
2002-12-15 05:41:56 +00:00
alc
a7482ae294 To avoid lock order reversals in getnewvnode(), the call to uma_zfree()
must be delayed until the vnode interlock is released.

Reported by:	kris@
Approved by:	re (jhb)
2002-12-08 05:06:50 +00:00
robert
b4c9e24303 Do not set a variable (vp->p_pollinfo) to NULL if we know
it already has that value.

Approved by:	re
2002-11-27 16:45:54 +00:00
rwatson
312cab0dee Slightly change the semantics of vnode labels for MAC: rather than
"refreshing" the label on the vnode before use, just get the label
right from inception.  For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system.  With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance.  This
also corrects sematics for shared vnode locks, which were not
previously present in the system.  This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form.  With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception.  We'll introduce a work around for this shortly.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-26 14:38:24 +00:00
phk
7320ebcf72 In vrele() we can actually have a VCHR with v_rdev == NULL if we
came from the bottom of addaliasu().  Don't panic.
2002-10-25 07:58:25 +00:00
mckusick
6b1611bd94 Within ufs, the ffs_sync and ffs_fsync functions did not always
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned.  This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.

Sponsored by:   DARPA & NAI Labs.
2002-10-25 00:20:37 +00:00
phk
a3766d9d16 Fix the spechash lock order reversal by keeping an updated sum
of v_usecount in the dev_t which vcount() can return without
locking any vnodes.

Seen by:	jhb
2002-10-24 19:38:56 +00:00
mckusick
9d00a4a781 When scanning the freelist looking for candidate vnodes to recycle,
be sure to exit the loop with vp == NULL if no candidates are found.
Formerly, this bug would cause the last vnode inspected to be used,
even if it was not available. The result was a panic "vn_finished_write:
neg cnt".

Sponsored by:	DARPA & NAI Labs.
2002-10-14 19:54:39 +00:00
mckusick
05ff8976a7 Unconditionally reset vp->v_vnlock back to the default in the
vclean() function (e.g., vp->v_vnlock = &vp->v_lock) rather
than requiring filesystems that use alternate locks to do so
in their vop_reclaim functions. This change is a further cleanup
of the vop_stdlock interface.

Submitted by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
Sponsored by:	DARPA & NAI Labs.
2002-10-14 19:44:51 +00:00
mckusick
25230d4c6a Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by:	DARPA & NAI Labs.
2002-10-14 03:20:36 +00:00
mckusick
16ad96c43c When considering a vnode for reuse in getnewvnode, we call
vcanrecycle to check a free vnode's availability. If it is
available, vcanrecycle returns an error code of zero and the
vnode in question locked. The getnewvnode routine then used
to call vn_start_write with the V_NOWAIT flag. If the filesystem
was suspended while taking a snapshot, the vn_start_write would
fail but getnewvnode would fail to unlock the vnode, instead
leaving it locked on the freelist. The result would be that the
vnode would be locked forever and would eventually hang the
system with a race to the root when it was attempted to recycle
it. This fix moves the vn_start_write check into vcanrecycle
where it will properly unlock the vnode if it is unavailable
for recycling due to filesystem suspension.

Sponsored by:	DARPA & NAI Labs.
2002-10-11 01:04:14 +00:00
sobomax
18d9db4bb5 Fix problem introduced in rev.1.406, which can cause already unlocked
mutex being unlocked again causing system panic.
2002-10-05 12:56:10 +00:00
phk
b55fa4540e Fix some harmless mis-indents.
Spotted by:	FlexeLint
2002-10-01 15:48:31 +00:00
rwatson
5d5060bddf Move vnode MAC label initialization to after the release of the vnode
interlock in getnewvnode() to avoid possible sleeps while holding
the mutex.  Note that the warning from Witness is a slight false
positive since we know there will be no contention on the interlock
since we haven't made the vnode available for use yet, but the theory
is not a bad one.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-30 20:51:48 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
jeff
31b1ddae74 - Move ASSERT_VOP_*LOCK* functionality into functions in vfs_subr.c
- Make the VI asserts more orthogonal to the rest of the asserts by using a
   new, common vfs_badlock() function and adding a 'str' arg.
 - Adjust generated ASSERTS to match the new prototype.
 - Adjust explicit ASSERTS to match the new prototype.
2002-09-26 04:48:44 +00:00
jeff
ee7cd9172d - Lock down the syncer with sync_mtx.
- Enable vfs_badlock_mutex by default.
 - Assert that the vp is locked in VOP_UNLOCK.
 - Use standard interlock macros in remaining code.
 - Correct a race in getnewvnode().
 - Lock access to v_numoutput with interlock.
 - Lock access to buf lists and splay tree with interlock.
 - Add VOP and VI asserts.
 - Lock b_vnbufs with the vnode interlock.
 - Add vrefcnt() for callers who want to retreive the vnode ref without
   holding a lock.  Add a comment that describes when this is safe.
 - Add vholdl() and vdropl() so that callers who already own the interlock
   can avoid race conditions and unnecessary unlocking.
 - Move the VOP_GETATTR() in vflush() into the WRITECLOSE conditional case.
 - Hold the interlock before droping the mntlist_mtx in vflush() to avoid
   a race.
 - Fix locking in vfs_msync().
2002-09-25 02:22:21 +00:00
njl
00c79f5c92 Remove any VOP_PRINT that redundantly prints the tag.
Move lockmgr_printinfo() into vprint() for everyone's benefit.

Suggested by: bde
2002-09-18 20:42:04 +00:00
njl
0590c43070 Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by:   phk
Reviewed by:    bde, rwatson (earlier version)
2002-09-14 09:02:28 +00:00
julian
06f500f894 Indentation does not make a block.. need curly braces too.
Submitted by: Eagle-eyes evans <bde@freebsd.org>
2002-09-11 18:15:26 +00:00
julian
5702a380a5 Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00
phk
3303b3f624 Fix an inherited style bug: compare with NOCRED instead of NULL.
Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:46:19 +00:00
phk
55be95d161 Introduce new extattr_check_cred() function which implements the canonical
crential washing for extended attributes.

Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:38:57 +00:00
charnier
7dd9d47059 Replace various spelling with FALLTHROUGH which is lint()able 2002-08-25 13:23:09 +00:00
jeff
da601a39ac - Fix a mistake in my last few commits. The PDROP flag stops msleep from
re-acquiring the mutex.

Pointy hat to:	me
Noticed by:	tegge
2002-08-23 00:32:03 +00:00
jeff
6c5497f47a - Make vn_lock() vget() and VOP_LOCK() all behave the same way WRT
LK_INTERLOCK.  The interlock will never be held on return from these
   functions even when there is an error.  Errors typically only occur when
   the XLOCK is held which means this isn't the vnode we want anyway.  Almost
   all users of these interfaces expected this behavior even though it was
   not provided before.
2002-08-22 07:44:45 +00:00
jeff
1e39ba8620 - Fix interlock handling in vn_lock(). Previously, vn_lock() could return
with interlock held in error conditions when the caller did not specify
   LK_INTERLOCK.
 - Add several comments to vn_lock() describing the rational behind the code
   flow since it was not immediately obvious.
2002-08-22 06:51:06 +00:00
jeff
275611472a - Document two cases, one in vget and the other in vn_lock, where the state
of interlock on exit is not consistent.  There are probably several bugs
   relating to this.
2002-08-21 08:34:48 +00:00
jeff
ca5f1feb36 - If vn_lock fails with the LK_INTERLOCK flag set, interlock will not be
released.  vcanrecycle() failed to unlock interlock under this condition.
 - Remove an extra VOP_UNLOCK from a failure case in vcanrecycle().

Pointed out by:	rwatson
2002-08-21 06:40:34 +00:00
jeff
2fc7835d26 - Add two new debugging macros: ASSERT_VI_LOCKED and ASSERT_VI_UNLOCKED
- Use the new VI asserts in place of the old mtx_assert checks.
 - Add the VI asserts to the automated lock checking in the VOP calls.  The
   interlock should not be held across vops with a few exceptions.
 - Add the vop_(un)lock_{pre,post} functions to assert that interlock is held
   when LK_INTERLOCK is set.
2002-08-21 06:19:29 +00:00
jeff
d18378e088 - Extend the vnode_free_list_mtx to cover numvnodes and freevnodes. This
was done only some of the time before, and now it is uniformly applied.
2002-08-13 05:29:48 +00:00
mux
f43070c325 - Introduce a new struct xvfsconf, the userland version of struct vfsconf.
- Make getvfsbyname() take a struct xvfsconf *.
- Convert several consumers of getvfsbyname() to use struct xvfsconf.
- Correct the getvfsbyname.3 manpage.
- Create a new vfs.conflist sysctl to dump all the struct xvfsconf in the
  kernel, and rewrite getvfsbyname() to use this instead of the weird
  existing API.
- Convert some {set,get,end}vfsent() consumers to use the new vfs.conflist
  sysctl.
- Convert a vfsload() call in nfsiod.c to kldload() and remove the useless
  vfsisloadable() and endvfsent() calls.
- Add a warning printf() in vfs_sysctl() to tell people they are using
  an old userland.

After these changes, it's possible to modify struct vfsconf without
breaking the binary compatibility.  Please note that these changes don't
break this compatibility either.

When bp will have updated mount_smbfs(8) with the patch I sent him, there
will be no more consumers of the {set,get,end}vfsent(), vfsisloadable()
and vfsload() API, and I will promptly delete it.
2002-08-10 20:19:04 +00:00
jeff
f91961bfed - Move some logic from getnewvnode() to a new function vcanrecycle()
- Unlock the free list mutex around vcanrecycle to prevent a lock order
   reversal.
2002-08-05 10:15:56 +00:00
jeff
02517b6731 - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
rwatson
a5dcc1fd3d Include file cleanup; mac.h and malloc.h at one point had ordering
relationship requirements, and no longer do.

Reminded by:	bde
2002-08-01 17:47:56 +00:00
des
2ca172b725 Nit in previous commit: the correct sysctl type is "S,xvnode" 2002-07-31 12:25:28 +00:00
des
9c7ec03502 Initialize v_cachedid to -1 in getnewvnode().
Reintroduce the kern.vnode sysctl and make it export xvnodes rather than
vnodes.

Sponsored by:	DARPA, NAI Labs
2002-07-31 12:24:35 +00:00
rwatson
6bb9b1da05 Note that the privilege indicating flag to vaccess() originally used
by the process accounting system is now deprecated.
2002-07-31 02:05:12 +00:00
rwatson
261170743f Introduce support for Mandatory Access Control and extensible
kernel access control.

Invoke the necessary MAC entry points to maintain labels on vnodes.
In particular, initialize the label when the vnode is allocated or
reused, and destroy the label when the vnode is going to be released,
or reused.  Wow, an object where there really is exactly one place
where it's allocated, and one other where it's freed.  Amazing.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-07-31 02:03:46 +00:00
jeff
5dce00d8f1 - Backout the patch made in revision 1.75 of vfs_mount.c. The vputs here
were hiding the real problem of the missing unlock in sync_inactive.
 - Add the missing unlock in sync_inactive.

Submitted by:	iedowse
2002-07-29 06:26:55 +00:00
truckman
b1555a2743 Wire the sysctl output buffer before grabbing any locks to prevent
SYSCTL_OUT() from blocking while locks are held.  This should
only be done when it would be inconvenient to make a temporary copy of
the data and defer calling SYSCTL_OUT() until after the locks are
released.
2002-07-28 19:59:31 +00:00
rwatson
7be639a7c0 Teach discretionary access control methods for files about VAPPEND
and VALLPERM.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-07-22 03:57:07 +00:00
mckusick
b44cb5787c Add support to UFS2 to provide storage for extended attributes.
As this code is not actually used by any of the existing
interfaces, it seems unlikely to break anything (famous
last words).

The internal kernel interface to manipulate these attributes
is invoked using two new IO_ flags: IO_NORMAL and IO_EXT.
These flags may be specified in the ioflags word of VOP_READ,
VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that
you want to do I/O to the normal data part of the file and
IO_EXT means that you want to do I/O to the extended attributes
part of the file. IO_NORMAL and IO_EXT are mutually exclusive
for VOP_READ and VOP_WRITE, but may be specified individually
or together in the case of VOP_TRUNCATE. For example, when
removing a file, VOP_TRUNCATE is called with both IO_NORMAL
and IO_EXT set. For backward compatibility, if neither IO_NORMAL
nor IO_EXT is set, then IO_NORMAL is assumed.

Note that the BA_ and IO_ flags have been `merged' so that they
may both be used in the same flags word. This merger is possible
by assigning the IO_ flags to the low sixteen bits and the BA_
flags the high sixteen bits. This works because the high sixteen
bits of the IO_ word is reserved for read-ahead and help with
write clustering so will never be used for flags. This merge
lets us get away from code of the form:

        if (ioflags & IO_SYNC)
                flags |= BA_SYNC;

For the future, I have considered adding a new field to the
vattr structure, va_extsize. This addition could then be
exported through the stat structure to allow applications to
find out the size of the extended attribute storage and also
would provide a more standard interface for truncating them
(via VOP_SETATTR rather than VOP_TRUNCATE).

I am also contemplating adding a pathconf parameter (for
concreteness, lets call it _PC_MAX_EXTSIZE) which would
let an application determine the maximum size of the extended
atribute storage.

Sponsored by:	DARPA & NAI Labs.
2002-07-19 07:29:39 +00:00
mckusick
3abb526f86 Change utimes to set the file creation time (for filesystems that
support creation times such as UFS2) to the value of the
modification time if the value of the modification time is older
than the current creation time. See utimes(2) for further details.

Sponsored by:	DARPA & NAI Labs.
2002-07-17 02:03:19 +00:00
dillon
da4e111a55 Replace the global buffer hash table with per-vnode splay trees using a
methodology similar to the vm_map_entry splay and the VM splay that Alan
Cox is working on.  Extensive testing has appeared to have shown no
increase in overhead.

Disadvantages
    Dirties more cache lines during lookups.

    Not as fast as a hash table lookup (but still N log N and optimal
    when there is locality of reference).

Advantages
    vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem
    syncer operate more efficiently.

    I get to rip out all the old hacks (some of which were mine) that tried
    to keep the v_dirtyblkhd tailq sorted.

    The per-vnode splay tree should be easier to lock / SMPng pushdown on
    vnodes will be easier.

    This commit along with another that Alan is working on for the VM page
    global hash table will allow me to implement ranged fsync(), optimize
    server-side nfs commit rpcs, and implement partial syncs by the
    filesystem syncer (aka filesystem syncer would detect that someone is
    trying to get the vnode lock, remembers its place, and skip to the
    next vnode).

Note that the buffer cache splay is somewhat more complex then other splays
due to special handling of background bitmap writes (multiple buffers with
the same lblkno in the same vnode), and B_INVAL discontinuities between the
old hash table and the existence of the buffer on the v_cleanblkhd list.

Suggested by: alc
2002-07-10 17:02:32 +00:00
jeff
fe9018671a - Use standard locking functions in syncer's opv
- vput instead of vrele syncer vnodes in vfs_mount
 - Add vop_lookup_{pre,post} to verify locking in VOP_LOOKUP
2002-07-09 19:54:20 +00:00
jeff
cca3a0ef3d - Don't hold the vn lock while calling VOP_CLOSE in vclean(). 2002-07-07 06:38:22 +00:00
jeff
8bf1a039cb - BUF_REFCNT() seems to be the preferred method for verifying a locked buf.
Tell vop_strategy_pre() to use this instead.
 - Ignore B_CLUSTER bufs.  Their components are locked but they don't really
   exist so they don't have to be.  This isn't ideal but it is safe.
2002-07-07 05:29:45 +00:00
jeff
f1b0400267 Fix a mistake in my last commit. Don't grab an extra reference to the object
in bp->b_object.
2002-07-06 21:27:20 +00:00
jeff
0dd7645264 Fixup uses of GETVOBJECT.
- Cache a pointer to the vnode's object in the buf.
 - Hold a reference to that object in addition to the vnode's reference just
   to be consistent.
 - Cleanup code that got the object indirectly through the vp and VOP calls.

This fixes at least one case where we were calling GETVOBJECT without a lock.
It also avoids an expensive layered call at the cost of another pointer in
struct buf.
2002-07-06 08:59:52 +00:00
jeff
908b0eb9a7 - Add vop_strategy_pre to validate VOP_STRATEGY locking.
- Disable original vop_strategy lock specification.
 - Switch to the new vop_strategy_pre for lock validation.

VOP_STRATEGY requires only that the buf is locked UNLESS the block numbers need
to be translated.  There may be other reasons, but as long as the underlying
layer uses a VOP to perform the operations they will be caught later.
2002-07-06 05:21:12 +00:00
jeff
3bce786a77 Add "vop_rename_pre" to do pre rename lock verification. This is enabled only
with DEBUG_VFS_LOCKS.
2002-07-06 04:39:48 +00:00
mux
4f6ffa4183 Move vfs_rootmountalloc() in vfs_mount.c and remove lite2_vfs_mountroot()
which was #if 0'd and is not likely to be used now.
2002-07-03 09:27:24 +00:00
mux
eb5a0f4a7e Move every code related to mount(2) in a new file, vfs_mount.c.
The file vfs_conf.c which was dealing with root mounting has
been repo-copied into vfs_mount.c to preserve history.
This makes nmount related development easier, and help reducing
the size of vfs_syscalls.c, which is still an enormous file.

Reviewed by:	rwatson
Repo-copy by:	peter
2002-07-02 17:09:22 +00:00
iedowse
4416f82706 Use indirect function pointer hooks instead of #ifdef SOFTUPDATES
direct calls for the two places where the kernel calls into soft
updates code. Set up the hooks in softdep_initialize() and NULL
them out in softdep_uninitialize(). This change allows soft updates
to function correctly when ufs is loaded as a module.

Reviewed by:	mckusick
2002-07-01 17:59:40 +00:00
obrien
4db8ac83cb Rename the db command lockedvnodes to lockedvnods so that it fits on the
help screen and one doens't think we have a lockedvnodesmap command.
2002-06-29 04:45:09 +00:00
alfred
708aac7550 nuke caddr_t. 2002-06-28 23:17:36 +00:00
jeff
1ed9e0f375 Improve the VOP locking asserts
- Add vfs_badlock_print to control whether or not we print lock violations
 - Add vfs_badlock_panic to control whether we panic on lock violations

Both default to on to mimic the original behavior if DEBUG_VFS_LOCKS is on.
2002-06-28 20:58:14 +00:00
green
62d02a6b93 Fix a case where a vnode got explicitly unlocked after the pointer to it
got set to NULL.

Revision 1.355: in the box
2002-06-28 16:17:47 +00:00
mux
3770ca4156 Change the way we internally store the mount options to
a linked list.  This is to allow the merging of the mount
options in the MNT_UPDATE case, as the current data structure
is unsuitable for this.

There are no functional differences in this commit.

Reviewed by:	phk
2002-06-20 20:03:42 +00:00
mux
49532dbc77 Change vfs_copyopt() so that the length argument passed to it
must be the exact same size as the mount option.  This makes
vfs_copyopt() much more useful.
2002-06-14 20:04:21 +00:00
des
936333132d Move some sysctls from the debug tree to the vfs tree. 2002-06-06 15:50:22 +00:00
des
8aef2ace20 Gratuitous whitespace cleanup. 2002-06-06 15:46:38 +00:00
trhodes
28d42899b7 More s/file system/filesystem/g 2002-05-16 21:28:32 +00:00
mux
84d9baf797 o Fix vfs_copyopt(), the first argument to bcopy() is the source,
not the destination.
o Remove some code from vfs_getopt() which was making the interface
  more complicated to use for a very slight gain.
2002-05-16 17:09:41 +00:00
jeff
74069a30ee Switch from just holding the interlock to holding the standard lock throughout
getnewvnode().  This is safer.  In the future, we should investigate requiring
only the interlock to get the vnode object.
2002-05-07 02:44:06 +00:00
jeff
bfe0870a56 Hold the currently selected vnode's lock across the call to VOP_GETVOBJECT.
Don't try to create a vm object before the file system has a chance to finish
initializing it.  This is incorrect for a number of reasons.  Firstly, that
VOP requires a lock which the file system may not have initialized yet. Also,
open and others will create a vm object if it is necessary later.
2002-05-06 04:47:43 +00:00
phk
5020d62430 Expand the one-line function pbreassignbuf() the only place it is or could
be used.
2002-05-05 20:37:08 +00:00
dillon
226cd40e3d Remove obsolete code (that was already #if 0'd out).
Requested by: Hiten Pandya <hitmaster2k@yahoo.com>
2002-05-04 17:10:15 +00:00
jhb
db9aa81e23 Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on:	i386, alpha, sparc64
2002-04-04 21:03:38 +00:00
jhb
dc2e474f79 Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
mux
124c6d3a26 As discussed in -arch, add the new nmount(2) system call and the
new vfs_getopt()/vfs_copyopt() API.  This is intended to be used
later, when there will be filesystems implementing the VFS_NMOUNT
operation.  The mount(2) system call will disappear when all
filesystems will be converted to the new API.  Documentation will
be committed in a while.

Reviewed by:	phk
2002-03-26 15:33:44 +00:00
jeff
318cbeeecf Remove references to vm_zone.h and switch over to the new uma API.
Also, remove maxsockets.  If you look carefully you'll notice that the old
zone allocator never honored this anyway.
2002-03-20 04:09:59 +00:00
alfred
357e37e023 Remove __P. 2002-03-19 21:25:46 +00:00
rwatson
8d5b7b21f3 Three p_ucred -> td_ucred's missed in jhb's earlier pass; all appear to
be safe.
2002-03-05 19:45:45 +00:00
jhb
3706cd3509 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
phk
68389bd8ba Make v_addpollinfo() visible and non-inline.
Have callers only call it as needed.
Add necessary call in ufs_kqfilter().

Test-case found by:	Andrew Gallatin <gallatin@cs.duke.edu>
2002-02-18 16:18:02 +00:00