freebsd-dev

Author	SHA1	Message	Date
Alan Cox	ecde4b3218	Various changes to vm_object_page_remove(): - Eliminate an odd, special-case feature: if start == end == 0 then all pages are removed. Only one caller used this feature and that caller can trivially pass the object's size. - Assert that the vm_object is locked on entry; don't bother testing for a NULL vm_object. - Style: Fix lines that are longer than 80 characters.	2003-04-26 23:41:30 +00:00
Alan Cox	1ca5895341	- Convert vm_object_pip_wait() from using tsleep() to msleep(). - Make vm_object_pip_sleep() static. - Lock the vm_object when performing vm_object_pip_wait().	2003-04-26 18:33:18 +00:00
Alan Cox	b6e48e0372	- Acquire the vm_object's lock when performing vm_object_page_clean(). - Add a parameter to vm_pageout_flush() that tells vm_pageout_flush() whether its caller has locked the vm_object. (This is a temporary measure to bootstrap vm_object locking.)	2003-04-24 04:31:25 +00:00
Alan Cox	49281fbf68	Update locking around vm_object_page_remove() to use the new macros.	2003-04-18 16:39:03 +00:00
Alan Cox	e96c181d16	Use vm_object_pip_wait() rather than reimplementing it.	2003-04-13 05:10:44 +00:00
Tor Egge	6b08046175	Adjust the number of vnodes scanned by vlrureclaim() according to the size of the vnode list.	2003-03-26 22:15:58 +00:00
Yaroslav Tykhiy	17ce5b94d6	We shouldn't assert that a vnode is locked in vop_lock_post() if VOP_LOCK() has failed. Reviewed by: jeff	2003-03-22 13:21:54 +00:00
Jeff Roberson	e99215a614	- Remove a dead check for bp->b_vp == vp in vtruncbuf(). This has not been possible for some time. - Lock the buf before accessing fields. This should very rarely be locked. - Assert that B_DELWRI is set after we acquire the buf. This should always be the case now.	2003-03-13 07:22:53 +00:00
Jeff Roberson	09f11da5a3	- Remove a race between fsync like functions and flushbufqueues() by requiring locked bufs in vfs_bio_awrite(). Previously the buf could have been written out by fsync before we acquired the buf lock if it weren't for giant. The cluster_wbuild() handles this race properly but the single write at the end of vfs_bio_awrite() would not. - Modify flushbufqueues() so there is only one copy of the loop. Pass a parameter in that says whether or not we should sync bufs with deps. - Call flushbufqueues() a second time and then break if we couldn't find any bufs without deps.	2003-03-13 07:19:23 +00:00
Alan Cox	09c80124a3	Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@	2003-03-06 03:41:02 +00:00
Nate Lawson	99648386d3	Finish cleanup of vprint() which was begun with changing v_tag to a string. Remove extraneous uses of vop_null, instead deferring to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.	2003-03-03 19:15:40 +00:00
Jeff Roberson	491081fabf	- Hold the vnode interlock across calls to bgetvp instead of acquiring it internally. This is required to stop multiple bufs from being associated with a single lblkno.	2003-03-02 06:05:23 +00:00
Jeff Roberson	bff5362bf2	- gc USE_BUFHASH. The smp locking of the buf cache renders this useless.	2003-03-01 05:55:03 +00:00
Kirk McKusick	3a7053cb60	Prevent large files from monopolizing the system buffers. Keep track of the number of dirty buffers held by a vnode. When a bdwrite is done on a buffer, check the existing number of dirty buffers associated with its vnode. If the number rises above vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one of the other (hopefully older) dirty buffers associated with the vnode is written (using bawrite). In the event that this approach fails to curb the growth in the vnode's number of dirty buffers (due to soft updates rollback dependencies), the more drastic approach of doing a VOP_FSYNC on the vnode is used. This code primarily affects very large and actively written files such as snapshots. This change should eliminate hanging when taking snapshots or doing background fsck on very large filesystems. Hopefully, one day it will be possible to cache filesystem metadata in the VM cache as is done with file data. As it stands, only the buffer cache can be used which limits total metadata storage to about 20Mb no matter how much memory is available on the system. This rather small memory gets badly thrashed causing a lot of extra I/O. For example, taking a snapshot of a 1Tb filesystem minimally requires about 35,000 write operations, but because of the cache thrashing (we only have about 350 buffers at our disposal) ends up doing about 237,540 I/O's thus taking twenty-five minutes instead of four if it could run entirely in the cache. Reported by: Attila Nagy <bra@fsn.hu> Sponsored by: DARPA & NAI Labs.	2003-02-25 06:44:42 +00:00
Jeff Roberson	17661e5ac4	- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK. - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead. - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock. Reviewed by: arch, mckusick	2003-02-25 03:37:48 +00:00
Poul-Henning Kamp	acb18acfec	Bracket the kern.vnode sysctl in #ifdef notyet because it results in massive locking issues on diskless systems. It is also not clear that this sysctl is non-dangerous in its requirements for locked down memory on large RAM systems.	2003-02-23 18:09:05 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Ian Dowse	6a1b2a22ef	Add a new vnode flag VI_DOINGINACT to indicate that a VOP_INACTIVE call is in progress on the vnode. When vput() or vrele() sees a 1->0 reference count transition, it now returns without any further action if this flag is set. This flag is necessary to avoid recursion into VOP_INACTIVE if the filesystem inactive routine causes the reference count to increase and then drop back to zero. It is also used to guarantee that an unlocked vnode will not be recycled while blocked in VOP_INACTIVE(). There are at least two cases where the recursion can occur: one is that the softupdates code called by ufs_inactive() via ffs_truncate() can call vput() on the vnode. This has been reported by many people as "lockmgr: draining against myself" panics. The other case is that nfs_inactive() can call vget() and then vrele() on the vnode to clean up a sillyrename file. Reviewed by: mckusick (an older version of the patch)	2002-12-29 18:30:49 +00:00
Poul-Henning Kamp	371400cf2e	Use a timeout of one second while we wait for the vnode washer, this prevents a potential race and makes the system a little bit less jerky under extreme loads.	2002-12-29 11:18:25 +00:00
Poul-Henning Kamp	851a87ea1a	Vnodes pull in 800-900 bytes these days, all things counted, so we need to treat desiredvnodes much more like a limit than as a vague concept. On a 2GB RAM machine where desired vnodes is 130k, we run out of kmem_map space when we hit about 190k vnodes. If we wake up the vnode washer in getnewvnode(), sleep until it is done, so that it has a chance to offer us a washed vnode. If we don't sleep here we'll just race ahead and allocate yet a vnode which will never get freed. In the vnodewasher, instead of doing 10 vnodes per mountpoint per rotation, do 10% of the vnodes distributed evenly across the mountpoints.	2002-12-29 10:39:05 +00:00
Poul-Henning Kamp	9f16282798	KASSERT that vop_revoke() gets a VCHR.	2002-12-28 22:27:14 +00:00
Alan Cox	475e8011ab	Perform vm_object_lock() and vm_object_unlock() around vm_object_page_remove().	2002-12-15 05:41:56 +00:00
Alan Cox	2e29a1f21f	To avoid lock order reversals in getnewvnode(), the call to uma_zfree() must be delayed until the vnode interlock is released. Reported by: kris@ Approved by: re (jhb)	2002-12-08 05:06:50 +00:00
Robert Drehmel	f85a961930	Do not set a variable (vp->p_pollinfo) to NULL if we know it already has that value. Approved by: re	2002-11-27 16:45:54 +00:00
Robert Watson	763bbd2f4f	Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects semantics for shared vnode locks, which were not previously present in the system. This changes the cache coherency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a work around for this shortly. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-26 14:38:24 +00:00
Poul-Henning Kamp	0d6dc414b4	In vrele() we can actually have a VCHR with v_rdev == NULL if we came from the bottom of addaliasu(). Don't panic.	2002-10-25 07:58:25 +00:00
Kirk McKusick	9ab73fd11a	Within ufs, the ffs_sync and ffs_fsync functions did not always check for and/or report I/O errors. The result is that a VFS_SYNC or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in the presence of a hard error writing a disk sector or in a filesystem full condition. This patch ensures that I/O errors will always be checked and returned. This patch also ensures that every call to VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes appropriate action when an error is returned. Sponsored by: DARPA & NAI Labs.	2002-10-25 00:20:37 +00:00
Poul-Henning Kamp	a2fb4feded	Fix the spechash lock order reversal by keeping an updated sum of v_usecount in the dev_t which vcount() can return without locking any vnodes. Seen by: jhb	2002-10-24 19:38:56 +00:00
Kirk McKusick	a6b9f47b31	When scanning the freelist looking for candidate vnodes to recycle, be sure to exit the loop with vp == NULL if no candidates are found. Formerly, this bug would cause the last vnode inspected to be used, even if it was not available. The result was a panic "vn_finished_write: neg cnt". Sponsored by: DARPA & NAI Labs.	2002-10-14 19:54:39 +00:00
Kirk McKusick	e04a020067	Unconditionally reset vp->v_vnlock back to the default in the vclean() function (e.g., vp->v_vnlock = &vp->v_lock) rather than requiring filesystems that use alternate locks to do so in their vop_reclaim functions. This change is a further cleanup of the vop_stdlock interface. Submitted by: Poul-Henning Kamp <phk@critter.freebsd.dk> Sponsored by: DARPA & NAI Labs.	2002-10-14 19:44:51 +00:00
Kirk McKusick	a5b65058d5	Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.	2002-10-14 03:20:36 +00:00
Kirk McKusick	192e439ed4	When considering a vnode for reuse in getnewvnode, we call vcanrecycle to check a free vnode's availability. If it is available, vcanrecycle returns an error code of zero and the vnode in question locked. The getnewvnode routine then used to call vn_start_write with the V_NOWAIT flag. If the filesystem was suspended while taking a snapshot, the vn_start_write would fail but getnewvnode would fail to unlock the vnode, instead leaving it locked on the freelist. The result would be that the vnode would be locked forever and would eventually hang the system with a race to the root when it was attempted to recycle it. This fix moves the vn_start_write check into vcanrecycle where it will properly unlock the vnode if it is unavailable for recycling due to filesystem suspension. Sponsored by: DARPA & NAI Labs.	2002-10-11 01:04:14 +00:00
Maxim Sobolev	790a8088d0	Fix problem introduced in rev.1.406, which can cause already unlocked mutex being unlocked again causing system panic.	2002-10-05 12:56:10 +00:00
Poul-Henning Kamp	8d3574c7a4	Fix some harmless mis-indents. Spotted by: FlexeLint	2002-10-01 15:48:31 +00:00
Robert Watson	0626774f08	Move vnode MAC label initialization to after the release of the vnode interlock in getnewvnode() to avoid possible sleeps while holding the mutex. Note that the warning from Witness is a slight false positive since we know there will be no contention on the interlock since we haven't made the vnode available for use yet, but the theory is not a bad one. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-09-30 20:51:48 +00:00
Poul-Henning Kamp	37c841831f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
Jeff Roberson	6423c9433c	- Move ASSERT_VOP_LOCK functionality into functions in vfs_subr.c - Make the VI asserts more orthogonal to the rest of the asserts by using a new, common vfs_badlock() function and adding a 'str' arg. - Adjust generated ASSERTS to match the new prototype. - Adjust explicit ASSERTS to match the new prototype.	2002-09-26 04:48:44 +00:00
Jeff Roberson	6cb8bf2027	- Lock down the syncer with sync_mtx. - Enable vfs_badlock_mutex by default. - Assert that the vp is locked in VOP_UNLOCK. - Use standard interlock macros in remaining code. - Correct a race in getnewvnode(). - Lock access to v_numoutput with interlock. - Lock access to buf lists and splay tree with interlock. - Add VOP and VI asserts. - Lock b_vnbufs with the vnode interlock. - Add vrefcnt() for callers who want to retrieve the vnode ref without holding a lock. Add a comment that describes when this is safe. - Add vholdl() and vdropl() so that callers who already own the interlock can avoid race conditions and unnecessary unlocking. - Move the VOP_GETATTR() in vflush() into the WRITECLOSE conditional case. - Hold the interlock before dropping the mntlist_mtx in vflush() to avoid a race. - Fix locking in vfs_msync().	2002-09-25 02:22:21 +00:00
Nate Lawson	86ed6d45ac	Remove any VOP_PRINT that redundantly prints the tag. Move lockmgr_printinfo() into vprint() for everyone's benefit. Suggested by: bde	2002-09-18 20:42:04 +00:00
Nate Lawson	06be2aaa83	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
Julian Elischer	85e40eaf26	Indentation does not make a block.. need curly braces too. Submitted by: Eagle-eyes evans <bde@freebsd.org>	2002-09-11 18:15:26 +00:00
Julian Elischer	71fad9fdee	Completely redo thread states. Reviewed by: davidxu@freebsd.org	2002-09-11 08:13:56 +00:00
Poul-Henning Kamp	f8b663614d	Fix an inherited style bug: compare with NOCRED instead of NULL. Sponsored by: DARPA & NAI Labs.	2002-09-05 20:46:19 +00:00
Poul-Henning Kamp	c1a925a637	Introduce new extattr_check_cred() function which implements the canonical credential washing for extended attributes. Sponsored by: DARPA & NAI Labs.	2002-09-05 20:38:57 +00:00
Philippe Charnier	93b0017f88	Replace various spelling with FALLTHROUGH which is lint()able	2002-08-25 13:23:09 +00:00
Jeff Roberson	ad32f726db	- Fix a mistake in my last few commits. The PDROP flag stops msleep from re-acquiring the mutex. Pointy hat to: me Noticed by: tegge	2002-08-23 00:32:03 +00:00
Jeff Roberson	9abf54f032	- Make vn_lock() vget() and VOP_LOCK() all behave the same way WRT LK_INTERLOCK. The interlock will never be held on return from these functions even when there is an error. Errors typically only occur when the XLOCK is held which means this isn't the vnode we want anyway. Almost all users of these interfaces expected this behavior even though it was not provided before.	2002-08-22 07:44:45 +00:00
Jeff Roberson	183158485a	- Fix interlock handling in vn_lock(). Previously, vn_lock() could return with interlock held in error conditions when the caller did not specify LK_INTERLOCK. - Add several comments to vn_lock() describing the rationale behind the code flow since it was not immediately obvious.	2002-08-22 06:51:06 +00:00
Jeff Roberson	0b600db425	- Document two cases, one in vget and the other in vn_lock, where the state of interlock on exit is not consistent. There are probably several bugs relating to this.	2002-08-21 08:34:48 +00:00
Jeff Roberson	88cf6b94bd	- If vn_lock fails with the LK_INTERLOCK flag set, interlock will not be released. vcanrecycle() failed to unlock interlock under this condition. - Remove an extra VOP_UNLOCK from a failure case in vcanrecycle(). Pointed out by: rwatson	2002-08-21 06:40:34 +00:00
Jeff Roberson	71ea4ba57c	- Add two new debugging macros: ASSERT_VI_LOCKED and ASSERT_VI_UNLOCKED - Use the new VI asserts in place of the old mtx_assert checks. - Add the VI asserts to the automated lock checking in the VOP calls. The interlock should not be held across vops with a few exceptions. - Add the vop_(un)lock_{pre,post} functions to assert that interlock is held when LK_INTERLOCK is set.	2002-08-21 06:19:29 +00:00
Jeff Roberson	055c012332	- Extend the vnode_free_list_mtx to cover numvnodes and freevnodes. This was done only some of the time before, and now it is uniformly applied.	2002-08-13 05:29:48 +00:00
Maxime Henrion	5965373e69	- Introduce a new struct xvfsconf, the userland version of struct vfsconf. - Make getvfsbyname() take a struct xvfsconf *. - Convert several consumers of getvfsbyname() to use struct xvfsconf. - Correct the getvfsbyname.3 manpage. - Create a new vfs.conflist sysctl to dump all the struct xvfsconf in the kernel, and rewrite getvfsbyname() to use this instead of the weird existing API. - Convert some {set,get,end}vfsent() consumers to use the new vfs.conflist sysctl. - Convert a vfsload() call in nfsiod.c to kldload() and remove the useless vfsisloadable() and endvfsent() calls. - Add a warning printf() in vfs_sysctl() to tell people they are using an old userland. After these changes, it's possible to modify struct vfsconf without breaking the binary compatibility. Please note that these changes don't break this compatibility either. When bp will have updated mount_smbfs(8) with the patch I sent him, there will be no more consumers of the {set,get,end}vfsent(), vfsisloadable() and vfsload() API, and I will promptly delete it.	2002-08-10 20:19:04 +00:00
Jeff Roberson	8947be9ba0	- Move some logic from getnewvnode() to a new function vcanrecycle() - Unlock the free list mutex around vcanrecycle to prevent a lock order reversal.	2002-08-05 10:15:56 +00:00
Jeff Roberson	e6e370a7fe	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
Robert Watson	f9d0d52459	Include file cleanup; mac.h and malloc.h at one point had ordering relationship requirements, and no longer do. Reminded by: bde	2002-08-01 17:47:56 +00:00
Dag-Erling Smørgrav	3072197229	Nit in previous commit: the correct sysctl type is "S,xvnode"	2002-07-31 12:25:28 +00:00
Dag-Erling Smørgrav	217b2a0b61	Initialize v_cachedid to -1 in getnewvnode(). Reintroduce the kern.vnode sysctl and make it export xvnodes rather than vnodes. Sponsored by: DARPA, NAI Labs	2002-07-31 12:24:35 +00:00
Robert Watson	07bdba7e2d	Note that the privilege indicating flag to vaccess() originally used by the process accounting system is now deprecated.	2002-07-31 02:05:12 +00:00
Robert Watson	a0ee6ed1c0	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke the necessary MAC entry points to maintain labels on vnodes. In particular, initialize the label when the vnode is allocated or reused, and destroy the label when the vnode is going to be released, or reused. Wow, an object where there really is exactly one place where it's allocated, and one other where it's freed. Amazing. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 02:03:46 +00:00
Jeff Roberson	a562685f65	- Backout the patch made in revision 1.75 of vfs_mount.c. The vputs here were hiding the real problem of the missing unlock in sync_inactive. - Add the missing unlock in sync_inactive. Submitted by: iedowse	2002-07-29 06:26:55 +00:00
Don Lewis	5c38b6dbce	Wire the sysctl output buffer before grabbing any locks to prevent SYSCTL_OUT() from blocking while locks are held. This should only be done when it would be inconvenient to make a temporary copy of the data and defer calling SYSCTL_OUT() until after the locks are released.	2002-07-28 19:59:31 +00:00
Robert Watson	b02aac465d	Teach discretionary access control methods for files about VAPPEND and VALLPERM. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-22 03:57:07 +00:00
Kirk McKusick	7aca6291e3	Add support to UFS2 to provide storage for extended attributes. As this code is not actually used by any of the existing interfaces, it seems unlikely to break anything (famous last words). The internal kernel interface to manipulate these attributes is invoked using two new IO_ flags: IO_NORMAL and IO_EXT. These flags may be specified in the ioflags word of VOP_READ, VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that you want to do I/O to the normal data part of the file and IO_EXT means that you want to do I/O to the extended attributes part of the file. IO_NORMAL and IO_EXT are mutually exclusive for VOP_READ and VOP_WRITE, but may be specified individually or together in the case of VOP_TRUNCATE. For example, when removing a file, VOP_TRUNCATE is called with both IO_NORMAL and IO_EXT set. For backward compatibility, if neither IO_NORMAL nor IO_EXT is set, then IO_NORMAL is assumed. Note that the BA_ and IO_ flags have been `merged' so that they may both be used in the same flags word. This merger is possible by assigning the IO_ flags to the low sixteen bits and the BA_ flags the high sixteen bits. This works because the high sixteen bits of the IO_ word is reserved for read-ahead and help with write clustering so will never be used for flags. This merge lets us get away from code of the form: if (ioflags & IO_SYNC) flags |= BA_SYNC; For the future, I have considered adding a new field to the vattr structure, va_extsize. This addition could then be exported through the stat structure to allow applications to find out the size of the extended attribute storage and also would provide a more standard interface for truncating them (via VOP_SETATTR rather than VOP_TRUNCATE). I am also contemplating adding a pathconf parameter (for concreteness, let's call it _PC_MAX_EXTSIZE) which would let an application determine the maximum size of the extended attribute storage. Sponsored by: DARPA & NAI Labs.	2002-07-19 07:29:39 +00:00
Kirk McKusick	fb36a3d847	Change utimes to set the file creation time (for filesystems that support creation times such as UFS2) to the value of the modification time if the value of the modification time is older than the current creation time. See utimes(2) for further details. Sponsored by: DARPA & NAI Labs.	2002-07-17 02:03:19 +00:00
Matthew Dillon	d331c5d43f	Replace the global buffer hash table with per-vnode splay trees using a methodology similar to the vm_map_entry splay and the VM splay that Alan Cox is working on. Extensive testing has appeared to have shown no increase in overhead. Disadvantages: Dirties more cache lines during lookups. Not as fast as a hash table lookup (but still N log N and optimal when there is locality of reference). Advantages: vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem syncer operate more efficiently. I get to rip out all the old hacks (some of which were mine) that tried to keep the v_dirtyblkhd tailq sorted. The per-vnode splay tree should be easier to lock / SMPng pushdown on vnodes will be easier. This commit along with another that Alan is working on for the VM page global hash table will allow me to implement ranged fsync(), optimize server-side nfs commit rpcs, and implement partial syncs by the filesystem syncer (aka filesystem syncer would detect that someone is trying to get the vnode lock, remembers its place, and skip to the next vnode). Note that the buffer cache splay is somewhat more complex than other splays due to special handling of background bitmap writes (multiple buffers with the same lblkno in the same vnode), and B_INVAL discontinuities between the old hash table and the existence of the buffer on the v_cleanblkhd list. Suggested by: alc	2002-07-10 17:02:32 +00:00
Jeff Roberson	25b286d6db	- Use standard locking functions in syncer's opv - vput instead of vrele syncer vnodes in vfs_mount - Add vop_lookup_{pre,post} to verify locking in VOP_LOOKUP	2002-07-09 19:54:20 +00:00
Jeff Roberson	18c48f437f	- Don't hold the vn lock while calling VOP_CLOSE in vclean().	2002-07-07 06:38:22 +00:00
Jeff Roberson	bed75d4627	- BUF_REFCNT() seems to be the preferred method for verifying a locked buf. Tell vop_strategy_pre() to use this instead. - Ignore B_CLUSTER bufs. Their components are locked but they don't really exist so they don't have to be. This isn't ideal but it is safe.	2002-07-07 05:29:45 +00:00
Jeff Roberson	c031d11bb4	Fix a mistake in my last commit. Don't grab an extra reference to the object in bp->b_object.	2002-07-06 21:27:20 +00:00
Jeff Roberson	9a236af3ad	Fixup uses of GETVOBJECT. - Cache a pointer to the vnode's object in the buf. - Hold a reference to that object in addition to the vnode's reference just to be consistent. - Cleanup code that got the object indirectly through the vp and VOP calls. This fixes at least one case where we were calling GETVOBJECT without a lock. It also avoids an expensive layered call at the cost of another pointer in struct buf.	2002-07-06 08:59:52 +00:00
Jeff Roberson	302c7aaab9	- Add vop_strategy_pre to validate VOP_STRATEGY locking. - Disable original vop_strategy lock specification. - Switch to the new vop_strategy_pre for lock validation. VOP_STRATEGY requires only that the buf is locked UNLESS the block numbers need to be translated. There may be other reasons, but as long as the underlying layer uses a VOP to perform the operations they will be caught later.	2002-07-06 05:21:12 +00:00
Jeff Roberson	cc8662b0f9	Add "vop_rename_pre" to do pre rename lock verification. This is enabled only with DEBUG_VFS_LOCKS.	2002-07-06 04:39:48 +00:00
Maxime Henrion	d7f9ecc86b	Move vfs_rootmountalloc() in vfs_mount.c and remove lite2_vfs_mountroot() which was #if 0'd and is not likely to be used now.	2002-07-03 09:27:24 +00:00
Maxime Henrion	2b4edb69f1	Move every code related to mount(2) in a new file, vfs_mount.c. The file vfs_conf.c which was dealing with root mounting has been repo-copied into vfs_mount.c to preserve history. This makes nmount related development easier, and help reducing the size of vfs_syscalls.c, which is still an enormous file. Reviewed by: rwatson Repo-copy by: peter	2002-07-02 17:09:22 +00:00
Ian Dowse	6bd521df93	Use indirect function pointer hooks instead of #ifdef SOFTUPDATES direct calls for the two places where the kernel calls into soft updates code. Set up the hooks in softdep_initialize() and NULL them out in softdep_uninitialize(). This change allows soft updates to function correctly when ufs is loaded as a module. Reviewed by: mckusick	2002-07-01 17:59:40 +00:00
David E. O'Brien	87e1503e2c	Rename the db command lockedvnodes to lockedvnods so that it fits on the help screen and one doesn't think we have a lockedvnodesmap command.	2002-06-29 04:45:09 +00:00
Alfred Perlstein	210a5a7169	nuke caddr_t.	2002-06-28 23:17:36 +00:00
Jeff Roberson	90769c9ed0	Improve the VOP locking asserts - Add vfs_badlock_print to control whether or not we print lock violations - Add vfs_badlock_panic to control whether we panic on lock violations Both default to on to mimic the original behavior if DEBUG_VFS_LOCKS is on.	2002-06-28 20:58:14 +00:00
Brian Feldman	aac12bcfbc	Fix a case where a vnode got explicitly unlocked after the pointer to it got set to NULL. Revision 1.355: in the box	2002-06-28 16:17:47 +00:00
Maxime Henrion	7d2d440991	Change the way we internally store the mount options to a linked list. This is to allow the merging of the mount options in the MNT_UPDATE case, as the current data structure is unsuitable for this. There are no functional differences in this commit. Reviewed by: phk	2002-06-20 20:03:42 +00:00
Maxime Henrion	fe93750656	Change vfs_copyopt() so that the length argument passed to it must be the exact same size as the mount option. This makes vfs_copyopt() much more useful.	2002-06-14 20:04:21 +00:00
Dag-Erling Smørgrav	edad3af28d	Move some sysctls from the debug tree to the vfs tree.	2002-06-06 15:50:22 +00:00
Dag-Erling Smørgrav	4a357a32e0	Gratuitous whitespace cleanup.	2002-06-06 15:46:38 +00:00
Tom Rhodes	d394511de3	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
Maxime Henrion	34e53231d0	o Fix vfs_copyopt(), the first argument to bcopy() is the source, not the destination. o Remove some code from vfs_getopt() which was making the interface more complicated to use for a very slight gain.	2002-05-16 17:09:41 +00:00
Jeff Roberson	f0d73b3e5f	Switch from just holding the interlock to holding the standard lock throughout getnewvnode(). This is safer. In the future, we should investigate requiring only the interlock to get the vnode object.	2002-05-07 02:44:06 +00:00
Jeff Roberson	6953f5da1a	Hold the currently selected vnode's lock across the call to VOP_GETVOBJECT. Don't try to create a vm object before the file system has a chance to finish initializing it. This is incorrect for a number of reasons. Firstly, that VOP requires a lock which the file system may not have initialized yet. Also, open and others will create a vm object if it is necessary later.	2002-05-06 04:47:43 +00:00
Poul-Henning Kamp	81e017430a	Expand the one-line function pbreassignbuf() the only place it is or could be used.	2002-05-05 20:37:08 +00:00
Matthew Dillon	9f9435545b	Remove obsolete code (that was already #if 0'd out). Requested by: Hiten Pandya <hitmaster2k@yahoo.com>	2002-05-04 17:10:15 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
John Baldwin	44731cab3b	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
Maxime Henrion	17594b936b	As discussed in -arch, add the new nmount(2) system call and the new vfs_getopt()/vfs_copyopt() API. This is intended to be used later, when there will be filesystems implementing the VFS_NMOUNT operation. The mount(2) system call will disappear when all filesystems will be converted to the new API. Documentation will be committed in a while. Reviewed by: phk	2002-03-26 15:33:44 +00:00
Jeff Roberson	c897b81311	Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.	2002-03-20 04:09:59 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
Robert Watson	89e1164ee2	Three p_ucred -> td_ucred's missed in jhb's earlier pass; all appear to be safe.	2002-03-05 19:45:45 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
Poul-Henning Kamp	68edc1b939	Make v_addpollinfo() visible and non-inline. Have callers only call it as needed. Add necessary call in ufs_kqfilter(). Test-case found by: Andrew Gallatin <gallatin@cs.duke.edu>	2002-02-18 16:18:02 +00:00
Poul-Henning Kamp	90737495aa	Remove yet a redundant VN_KNOTE() macro.	2002-02-18 08:24:48 +00:00

493 Commits