Commit Graph

482 Commits

Author SHA1 Message Date
jeff
8e533783f3 - Add information about the buf lock to db_show_buffer.
- Add a 'show lockedbufs' command that is similar to show lockedvnods.

Sponsored by:	Isilon Systems, Inc.
2005-03-25 00:20:37 +00:00
jeff
893b010525 - Lock access to the buffer_map with the vm_map lock. In 4.x this was
done with splbio, in 5.x this was done with Giant.

Discussed with:		alc
Reported by:		julian, pho
2005-03-08 09:34:54 +00:00
phk
5dd8d30575 Make various vnode related functions static 2005-02-10 12:28:58 +00:00
jeff
480b60be3c - Add more information to the getnewbuf() recycling KTR.
Sponsored by:	Isilon Systems, Inc.
2005-02-10 02:22:56 +00:00
jeff
ede81ae242 - Remove an invalid KASSERT added in recent background write reshuffling.
Sponsored by:	Isilon Systems, Inc.
2005-02-08 23:25:08 +00:00
phk
af5ef3f262 Background writes are entirely an FFS/Softupdates thing.
Give FFS vnodes a specific bufwrite method which contains all the
background write stuff and then calls into the default bufwrite()
for the rest of the job.

Remove all the background write related stuff from the normal bufwrite.

This drags the softdep_move_dependencies() back into FFS.

Long term, it is worth looking at simply copying the data into
allocated memory and issuing the bio directly and not create the
"shadow buf" in the first place (just like copy-on-write is done
in snapshots for instance).  I don't think we really gain anything
but complexity from doing this with a buf.
2005-02-08 20:29:10 +00:00
jeff
0a084a15e2 - Don't release BKGRDINPROG until after we've bufdone'd the copy.
Sponsored by:	Isilon Systems, Inc.
2005-02-05 01:26:14 +00:00
jeff
da8e6b049d - Don't drop the wref on the bufobj until after bufdone() has completed.
Without this, threads waiting in bufobj_wwait() may wakeup prior to
   bufdone() completing.

Sponsored by:	Isilon Systems, Inc.
2005-01-28 17:48:58 +00:00
phk
796d435574 Don't use VOP_GETVOBJECT, use vp->v_object directly. 2005-01-25 00:40:01 +00:00
phk
d5c135375c Kill the VV_OBJBUF and test the v_object for NULL instead. 2005-01-24 13:13:57 +00:00
jeff
39bf4e6e67 - Add CTR calls to trace the lifecycle of a buffer.
- Remove some KASSERTs which are invalid if the appropriate lock is
   not held.
 - Slightly restructure bremfree() so that it is more sane.
 - Change the flush code in bdwrite() to avoid acquiring a mutex
   whenever possible.
 - Change the flush code in bdwrite() to avoid holding the bufobj mutex
   while calling buf_countdeps().  This introduces a lock-order
   relationship with the softdep lock that can not otherwise be resolved.
 - Don't set B_DONE until bufdone() is complete, otherwise another
   processor may believe the buf is done before it is.
 - Only acquire Giant if the caller has set b_iodone.  Don't grab giant
   around normal bufdone() calls.

Sponsored By:	Isilon Systems, Inc.
2005-01-24 10:47:04 +00:00
phk
5a497775d6 Add BO_SYNC() and add a default which uses the secret vnode pointer
and VOP_FSYNC() for now.
2005-01-11 10:43:08 +00:00
phk
da2718f1af Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().
I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

	The credentials for syncing a file (ability to write to the
	file) should be checked at the system call level.

	Credentials for syncing one or more filesystems ("none")
	should be checked at the system call level as well.

	If the filesystem implementation needs a particular credential
	to carry out the syncing it would logically have to the
	cached mount credential, or a credential cached along with
	any delayed write data.

Discussed with:	rwatson
2005-01-11 07:36:22 +00:00
jeff
9caab2e843 - Eliminate the acquisition and release of the bqlock in bremfree() by
setting the B_REMFREE flag in the buf.  This is done to prevent lock order
   reversals with code that must call bremfree() with a local lock held.
   This also reduces overhead by removing two lock operations per buf for
   fsync() and similar.
 - Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock
   has been acquired so that we may remove ourself from the free-list.
 - Provide a bremfreef() function to immediately remove a buf from a
   free-list for use only by NFS.  This is done because the nfsclient code
   overloads the b_freelist queue for its own async. io queue.
 - Simplify the numfreebuffers accounting by removing a switch statement
   that executed the same code in every possible case.
 - getnewbuf() can encounter locked bufs on free-lists once Giant is removed.
   Remove a panic associated with this condition and delay asserts that
   inspect the buf until after it is locked.

Reviewed by:	phk
Sponsored by:	Isilon Systems, Inc.
2004-11-18 08:44:09 +00:00
phk
e5715b2cc1 Retire b_magic now, we have the bufobj containing the same hint. 2004-11-04 09:48:18 +00:00
phk
e9aa533e84 Change buf->b_object to buf->b_bufobj->bo_object
some whitespace fixes.
2004-11-04 09:06:54 +00:00
phk
bb0cfa35bf whitespace 2004-11-04 08:25:52 +00:00
phk
1e4caea88c Remove buf->b_dev field. 2004-11-04 07:59:57 +00:00
alc
25b80a64b9 The synchronization provided by vm object locking has eliminated the
need for most calls to vm_page_busy().  Specifically, most calls to
vm_page_busy() occur immediately prior to a call to vm_page_remove().
In such cases, the containing vm object is locked across both calls.
Consequently, the setting of the vm page's PG_BUSY flag is not even
visible to other threads that are following the synchronization
protocol.

This change (1) eliminates the calls to vm_page_busy() that
immediately precede a call to vm_page_remove() or functions, such as
vm_page_free() and vm_page_rename(), that call it and (2) relaxes the
requirement in vm_page_remove() that the vm page's PG_BUSY flag is
set.  Now, the vm page's PG_BUSY flag is set only when the vm object
lock is released while the vm page is still in transition.  Typically,
this is when it is undergoing I/O.
2004-11-03 20:17:31 +00:00
phk
546ea57ed3 Remove the last call in the system to VOP_SPECSTRATEGY(): We can no
longer come through the VNODE layer to the disks since all the filesystems
now go via geom_vfs to GEOM.
2004-10-29 10:52:31 +00:00
phk
86cc21c765 Give dev_strategy() an explict cdev argument in preparation for removing
buf->b-dev.

Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.

Assert copyright on vfs_bio.c and update copyright message to canonical
text.  There is no legal difference between John Dysons two-clause
abbreviated BSD license and the canonical text.
2004-10-29 07:16:37 +00:00
phk
08ed0626b7 Lock bp->b_bufobj->b_object instead of bp->b_object 2004-10-28 08:38:46 +00:00
phk
fd2239c999 The island council met and voted buf_prewrite() home.
Give ffs it's own bufobj->bo_ops vector and create a private strategy
routine, (currently misnamed for forwards compatibility), which is
just a copy of the generic bufstrategy routine except we call
softdep_disk_prewrite() directly instead of through the buf_prewrite()
indirection.

Teach UFS about the need for softdep_disk_prewrite() and call the
function directly in FFS.

Remove buf_prewrite() from the default bufstrategy() and from the
global bio_ops method vector.
2004-10-26 10:44:10 +00:00
phk
c66aa10c8e Put the I/O block size in bufobj->bo_bsize.
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open().  The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
2004-10-26 07:39:12 +00:00
alc
343104d2b1 Hold the lock on the containing vm object when calling
vm_page_sleep_if_busy().
2004-10-26 06:58:26 +00:00
phk
3a8a530155 Remove vnode->v_bsize. This was a dead-end. 2004-10-25 07:50:59 +00:00
alc
c0959741f7 Use VM_ALLOC_NOBUSY to eliminate vm_page_wakeup() calls and the acquisition
and release of the global page queues lock required to make the call.

Remove GIANT_REQUIRED from vm_hold_free_pages().  All of its VM operations
are properly synchronized.
2004-10-25 06:34:14 +00:00
phk
4ba53ec41b Collapse vnode->v_object and buf->b_object into bufobj->bo_object. 2004-10-25 06:02:57 +00:00
phk
1b25a59886 Move the buffer method vector (buf->b_op) to the bufobj.
Extend it with a strategy method.

Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.

Rename ibwrite to bufwrite().

Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.

Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().

Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
2004-10-24 20:03:41 +00:00
phk
52a089c526 Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.
Initialize b_bufobj for all buffers.

Make incore() and gbincore() take a bufobj instead of a vnode.

Make inmem() local to vfs_bio.c

Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj)
also VI_MTX() to BO_MTX(),

Make buf_vlist_add() take a bufobj instead of a vnode.

Eliminate other uses of bp->b_vp where bp->b_bufobj will do.

Various minor polishing: remove "register", turn panic into KASSERT,
use new function declarations, TAILQ_FOREACH_SAFE() etc.
2004-10-22 08:47:20 +00:00
phk
3833976d12 Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAIT
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write
count on a bufobj.  Bufobj_wdrop() replaces vwakeup().

Use these functions all relevant places except in ffs_softdep.c where
the use if interlocked_sleep() makes this impossible.

Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
2004-10-21 15:53:54 +00:00
phk
18b8697aaa use dev_re[fl]thread() rather than home rolled versions. 2004-09-24 05:55:03 +00:00
phk
1d992e18ec Eliminate DEV_STRATEGY() macro: call dev_strategy() directly.
Make dev_strategy() handle errors and departing devices properly.
2004-09-23 14:45:04 +00:00
phk
3947e54e89 Do not refcount the cdevsw, but rather maintain a cdev->si_threadcount
of the number of threads which are inside whatever is behind the
cdevsw for this particular cdev.

Make the device mutex visible through dev_lock() and dev_unlock().
We may want finer granularity later.

Replace spechash_mtx use with dev_lock()/dev_unlock().
2004-09-23 07:17:41 +00:00
phk
02df7323ee Remove unused B_WRITEINPROG flag 2004-09-15 21:49:22 +00:00
phk
7bf1722b65 undent some functions a bit. 2004-09-15 21:08:58 +00:00
phk
b2c5cf5b2a stylistic polishing. 2004-09-15 20:54:23 +00:00
phk
2806321da1 Remove the buffercache/vnode side of BIO_DELETE processing in
preparation for integration of p4::phk_bufwork.  In the future,
local filesystems will talk to GEOM directly and they will consequently
be able to issue BIO_DELETE directly.  Since the removal of the fla
driver, BIO_DELETE has effectively been a no-op anyway.
2004-09-13 06:50:42 +00:00
phk
5297516e02 Eliminate unused second argument to reassignbuf() and simplify it
accordingly.
2004-07-25 21:24:23 +00:00
phk
c65194c810 Neuter this warning for now, I think I know the remaining issues. 2004-07-25 08:09:21 +00:00
alc
e479e5c263 Remove GIANT_REQUIRED from vmapbuf(). 2004-07-18 04:57:49 +00:00
peadar
adb7022709 Fix bug introduced in rev 1.434:
When avoiding the zeroing of "bogus_page" when it appears in a buf,
be sure to advance the pointers into the data for successive pages.

The bug caused file corruption when read(2)ing from a "hole" in a
file where a previous page of the read block had already been faulted
in: fsx tripped up on this pretty quickly. The particular access
pattern is probably pretty unusual, so other applications probably
wouldn't have had problems, but you'd never know.

Reviewed By: alc@
2004-07-06 23:40:40 +00:00
phk
59c88fd71a Make the last commit handle non-phk root devices better. 2004-07-04 19:42:25 +00:00
stefanf
9dea8aeba1 Consistently use __inline instead of __inline__ as the former is an empty macro
in <sys/cdefs.h> for compilers without support for inline.
2004-07-04 16:11:03 +00:00
phk
b52c81e5db Blocksize for I/O should be a property of the vnode and not found by groping
around in the vnodes surroundings when we allocate a block.

Assign a blocksize when we create a vnode, and yell a warning (and ignore it)
if we got the wrong size.

Please email all such warnings to me.
2004-07-04 12:49:04 +00:00
phk
7af2451eed Remove stale comment 2004-07-03 19:37:06 +00:00
phk
40dd98a3bd Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
phk
dfd1f7fd50 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
alc
b0b9413238 Avoid pointless zeroing of the bogus page in vfs_bio_clrbuf().
Suggested by:	tegge@	(from October of last year)
2004-05-08 06:46:40 +00:00
alc
b57e5e03fd Make vm_page's PG_ZERO flag immutable between the time of the page's
allocation and deallocation.  This flag's principal use is shortly after
allocation.  For such cases, clearing the flag is pointless.  The only
unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
page.

Reviewed by:	tegge@
2004-05-06 05:03:23 +00:00