Commit Graph

232 Commits

Author SHA1 Message Date
Poul-Henning Kamp
bd8a0d70f4 Remove devsw() call missed in last commit. 2004-09-24 07:08:33 +00:00
Poul-Henning Kamp
5ef8cac184 Use def_re[fl]thread().
Retire various old compatibility helpers.
2004-09-24 05:58:06 +00:00
Poul-Henning Kamp
1a52a73d68 Eliminate DEV_STRATEGY() macro: call dev_strategy() directly.
Make dev_strategy() handle errors and departing devices properly.
2004-09-23 14:45:04 +00:00
Poul-Henning Kamp
a0e78d2eb0 Do not refcount the cdevsw, but rather maintain a cdev->si_threadcount
of the number of threads which are inside whatever is behind the
cdevsw for this particular cdev.

Make the device mutex visible through dev_lock() and dev_unlock().
We may want finer granularity later.

Replace spechash_mtx use with dev_lock()/dev_unlock().
2004-09-23 07:17:41 +00:00
Poul-Henning Kamp
d705e025d0 The getpages VOP was a good stab at getting scatter/gather I/O without
too much kernel copying, but it is not the right way to do it, and it is
in the way for straightening out the buffer cache.

The right way is to pass the VM page array down through the struct
bio to the disk device driver and DMA directly in to/out off the
physical memory.  Once the VM/buf thing is sorted out it is next on
the list.

Retire most of vnode method. ffs_getpages().  It is not clear if what is
left shouldn't be in the default implementation which we now fall back to.

Retire specfs_getpages() as well, as it has no users now.
2004-09-19 08:14:55 +00:00
Poul-Henning Kamp
883d3c0c07 Remove the buffercache/vnode side of BIO_DELETE processing in
preparation for integration of p4::phk_bufwork.  In the future,
local filesystems will talk to GEOM directly and they will consequently
be able to issue BIO_DELETE directly.  Since the removal of the fla
driver, BIO_DELETE has effectively been a no-op anyway.
2004-09-13 06:50:42 +00:00
Poul-Henning Kamp
7ac439fec4 use bufdone() not biodone(). 2004-08-08 13:23:05 +00:00
Poul-Henning Kamp
f3732fd15b Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Julian Elischer
fa88511615 Nice, is a property of a process as a whole..
I mistakenly moved it to the ksegroup when breaking up the process
structure. Put it back in the proc structure.
2004-06-16 00:26:31 +00:00
Alan Cox
5a32489377 Make vm_page's PG_ZERO flag immutable between the time of the page's
allocation and deallocation.  This flag's principal use is shortly after
allocation.  For such cases, clearing the flag is pointless.  The only
unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
page.

Reviewed by:	tegge@
2004-05-06 05:03:23 +00:00
Poul-Henning Kamp
bc20ced763 Do not drop Giant around the poll method yet, we're not ready for it. 2004-04-12 21:52:52 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Poul-Henning Kamp
ceb58ca58f When I was a kid my work table was one cluttered mess an cleaning it up
were a rather overwhelming task.  I soon learned that if you don't know
where you're going to store something, at least try to pile it next to
something slightly related in the hope that a pattern emerges.

Apply the same principle to the ffs/snapshot/softupdates code which have
leaked into specfs:  Add yet a buf-quasi-method and call it from the
only two places I can see it can make a difference and implement the
magic in ffs_softdep.c where it belongs.

It's not pretty, but at least it's one less layer violated.
2004-03-11 18:50:33 +00:00
Poul-Henning Kamp
39a78f8cf4 Don't call devsw() more than we need to, and in particular do not expose
ourselves to device removal by not checking for it the second time.

Use count_dev(dev) rather than vcount(vp)
2004-03-10 20:56:28 +00:00
Poul-Henning Kamp
ad3917e8e6 Do not attempt to open NODEV 2004-02-24 09:59:35 +00:00
Poul-Henning Kamp
cd690b60de Device megapatch 6/6:
This is what we came here for:  Hang dev_t's from their cdevsw,
refcount cdevsw and dev_t and generally keep track of things a lot
better than we used to:

Hold a cdevsw reference around all entrances into the device driver,
this will be necessary to safely determine when we can unload driver
code.

Hold a dev_t reference while the device is open.

KASSERT that we do not enter the driver on a non-referenced dev_t.

Remove old D_NAG code, anonymous dev_t's are not a problem now.

When destroy_dev() is called on a referenced dev_t, move it to
dead_cdevsw's list.  When the refcount drops, free it.

Check that cdevsw->d_version is correct.  If not, set all methods
to the dead_*() methods to prevent entrance into driver.  Print
warning on console to this effect.  The device driver may still
explode if it is also incompatible with newbus, but in that case
we probably didn't get this far in the first place.
2004-02-21 21:57:26 +00:00
Poul-Henning Kamp
dc08ffec87 Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
2004-02-21 21:10:55 +00:00
Poul-Henning Kamp
98d87dfecd Initialize b_iooffset correctly. 2003-11-13 09:58:09 +00:00
Poul-Henning Kamp
01758670e9 Initialize b_iooffset before calling strategy 2003-10-18 19:48:21 +00:00
Poul-Henning Kamp
c87b01a0fd Initialize b_offset before calling VOP_STRATEGY/VOP_SPECSTRATEGY.
Remove various comments of KASSERTS and comments about B_PHYS which
does not apply anymore.
2003-10-18 11:06:15 +00:00
Poul-Henning Kamp
0023f61848 Introduce a new optional memberfunction for cdevsw, fdopen() which
passes the fdidx from VOP_OPEN down.

This is for all I know the final API for this functionality, but
the locking semantics for messing with the filedescriptor from
the device driver are not settled at this time.
2003-10-15 20:00:59 +00:00
Alan Cox
10e9e2d1b9 Synchronize access to a page's valid field by using the lock from its
containing object.
2003-10-04 09:20:00 +00:00
Marcel Moolenaar
fccf82902d The valid field in struct vm_page can be of type unsigned long when
32K pages are selected. In spec_getpages() change the printf format
specifier and add an explicit cast so that we always print the field
as a long type.
2003-08-28 01:52:14 +00:00
Alan Cox
49dc7ac17d Use the requested page's object field instead of the vnode's. In some
cases, the vnode's object field is not initialized leading to a NULL
pointer dereference when the object is locked.

Tested by:	rwatson
2003-08-22 17:50:32 +00:00
Poul-Henning Kamp
291faa1677 Don't drop giant around ->d_strategy(), too much code explodes. 2003-08-06 06:49:18 +00:00
Poul-Henning Kamp
f4a3d9da6e Only drop Giant around the drivers ->d_strategy() if the buffer is not
marked to prevent this.
2003-08-05 06:43:56 +00:00
Alan Cox
a38918cdbd Lock the vm object when freeing a vm page. 2003-06-19 17:56:12 +00:00
Poul-Henning Kamp
e04393d6de In specfs::vop_specstratey(), assert that the vnode and buffer agree about
the device.
2003-06-15 20:31:04 +00:00
Poul-Henning Kamp
2a0f8aeb52 I have not had any reports of trouble for a long time, so remove the
gentle versions of the vop_strategy()/vop_specstrategy() mismatch methods
and use vop_panic() instead.
2003-06-15 19:49:14 +00:00
Poul-Henning Kamp
dc81367d8d Take 2: Remove _both_ KASSERTS. 2003-06-15 19:16:34 +00:00
Poul-Henning Kamp
d5bde314e9 Duh! I misread my handwritte notes: We do _not_ want to asser that
vp == bp->b_vp in specfs, that was the entire point of VOP_SPECSTRATEGY().
2003-06-15 19:14:03 +00:00
Poul-Henning Kamp
cefb5754dd Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations
to check that the buffer points to the correct vnode.
2003-06-15 18:53:00 +00:00
Jeff Roberson
749ffa4ecd - Add a lock for protecting against msleep(bp, ...) wakeup(bp) races.
- Create a new function bdone() which sets B_DONE and calls wakup(bp). This
   is suitable for use as b_iodone for buf consumers who are not going
   through the buf cache.
 - Create a new function bwait() which waits for the buf to be done at a set
   priority and with a specific wmesg.
 - Replace several cases where the above functionality was implemented
   without locking with the new functions.
2003-03-13 07:31:45 +00:00
Nate Lawson
99648386d3 Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
2003-03-03 19:15:40 +00:00
Poul-Henning Kamp
182a9f7455 Make nokqfilter() return the correct return value.
Ditch the D_KQFILTER flag which was used to prevent calling NULL pointers.
2003-03-03 16:24:47 +00:00
Poul-Henning Kamp
88b1af7730 Use the SI_CANDELETE flag on the dev_t rather than the D_CANFREE flag
on the cdevsw to determine ability to handle the BIO_DELETE request.
2003-02-11 12:49:58 +00:00
Jeff Roberson
767b9a529d - Cleanup unlocked accesses to buf flags by introducing a new b_vflag member
that is protected by the vnode lock.
 - Move B_SCANNED into b_vflags and call it BV_SCANNED.
 - Create a vop_stdfsync() modeled after spec's sync.
 - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
   fs specific processing.  This gives all of these filesystems proper
   behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
 - Annotate the locking in buf.h
2003-02-09 11:28:35 +00:00
Poul-Henning Kamp
7a976ae838 Don't override the vop_lock, vop_unlock and vop_isunlocked methods.
Previously all filesystems which relied on specfs to do devices
would have private overrides for vop_std*, so the vop_no* overrides
here had no effect.  I overlooked the transitive nature of the vop
vectors when I removed the vop_std* in those filesystems.

Removing the override here restores device node locking to it's
previous modus operandi.

Spotted by:	bde
2003-01-05 19:14:44 +00:00
Poul-Henning Kamp
f576a06e22 Don't take the detour over VOP_STRATEGY from spec_getpages, call our
own strategy directly.
2003-01-05 10:03:57 +00:00
Poul-Henning Kamp
973418dbf5 Split out the vnode and buf arguments to the internal strategy worker
routine instead of doing evil casts.
2003-01-05 09:55:26 +00:00
Poul-Henning Kamp
f5b11b6e2d Temporarily introduce a new VOP_SPECSTRATEGY operation while I try
to sort out disk-io from file-io in the vm/buffer/filesystem space.

The intent is to sort VOP_STRATEGY calls into those which operate
on "real" vnodes and those which operate on VCHR vnodes.  For
the latter kind, the call will be changed to VOP_SPECSTRATEGY,
possibly conditionally for those places where dual-use happens.

Add a default VOP_SPECSTRATEGY method which will call the normal
VOP_STRATEGY.  First time it is called it will print debugging
information.  This will only happen if a normal vnode is passed
to VOP_SPECSTRATEGY by mistake.

Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY
does on a VCHR vnode today.

Add a new VOP_STRATEGY method in specfs to catch instances where
the conversion to VOP_SPECSTRATEGY has not yet happened.  Handle
the request just like we always did, but first time called print
debugging information.

Apart up to two instances of console messages per boot, this amounts
to a glorified no-op commit.

If you get any of the messages on your console I would very much
like a copy of them mailed to phk@freebsd.org
2003-01-04 22:10:36 +00:00
Poul-Henning Kamp
646e95fe69 resort vnode ops list 2003-01-04 20:32:03 +00:00
Poul-Henning Kamp
2ff9e749be Replace spec_bmap() with vop_panic: We should never BMAP a device backed
vnode only filesystem backed vnodes.
2003-01-04 11:29:44 +00:00
Poul-Henning Kamp
862702306b Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since
all BUF_STRATEGY did in the first place was call VOP_STRATEGY.
2003-01-03 06:32:15 +00:00
Poul-Henning Kamp
e2a3ea1c45 Remove unused second argument from DEV_STRATEGY(). 2003-01-03 05:57:35 +00:00
Kirk McKusick
5878393060 Add debug.doslowdown to enable/disable niced slowdown on I/O. Default
to off until locking interference issues get sorted out.

Sponsored by:   DARPA & NAI Labs.
2002-11-04 07:29:20 +00:00
Poul-Henning Kamp
45c5890054 Put a KASSERT in specfs::strategy() to check that the incoming buffer
has a valid b_iocmd.  Valid is any one of BIO_{READ,WRITE,DELETE}.

I have seen at least one case where the bio_cmd field was zero once the
request made it into GEOM.  Putting the KASSERT here allows us to spot
the culprit in the backtrace.
2002-11-01 15:32:12 +00:00
Kirk McKusick
9ab73fd11a Within ufs, the ffs_sync and ffs_fsync functions did not always
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned.  This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.

Sponsored by:   DARPA & NAI Labs.
2002-10-25 00:20:37 +00:00
Kirk McKusick
e03486d198 This checkin reimplements the io-request priority hack in a way
that works in the new threaded kernel. It was commented out of
the disksort routine earlier this year for the reasons given in
kern/subr_disklabel.c (which is where this code used to reside
before it moved to kern/subr_disk.c):

----------------------------
revision 1.65
date: 2002/04/22 06:53:20;  author: phk;  state: Exp;  lines: +5 -0
Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
----------------------------

As suggested in this comment, it is no longer located in the disk sort
routine, but rather now resides in spec_strategy where the disk operations
are being queued by the thread that is associated with the process that
is really requesting the I/O. At that point, the disk queues are not
visible, so the I/O for positively niced processes is always slowed
down whether or not there is other activity on the disk.

On the issue of scaling HZ, I believe that the current scheme is
better than using a fixed quantum of time. As machines and I/O
subsystems get faster, the resolution on the clock also rises.
So, ten years from now we will be slowing things down for shorter
periods of time, but the proportional effect on the system will
be about the same as it is today. So, I view this as a feature
rather than a drawback. Hence this patch sticks with using HZ.

Sponsored by:	DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
2002-10-22 00:59:49 +00:00