doesn't have one. The test was bogus on these architectures, but
recent changes broke it altogether.
Prompted by: phk
This should fix the recent SPARC 64 build problems.
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
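A minimal sketch of how a minor number can carry an object type plus an object number with room for about 4 million objects. The bit layout and all toy_* names are assumptions made for illustration; this is not Vinum's actual encoding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout for illustration only, NOT Vinum's real encoding:
 *   bits  0..21  object number (2^22 = 4194304 objects per type)
 *   bits 22..23  object type
 */
enum toy_objtype { TOY_VOLUME = 0, TOY_PLEX = 1, TOY_SD = 2 };

#define TOY_NUM_BITS 22
#define TOY_NUM_MASK ((UINT32_C(1) << TOY_NUM_BITS) - 1)

/* Pack an object type and number into one minor number. */
static inline uint32_t toy_minor(enum toy_objtype type, uint32_t num)
{
    assert(num <= TOY_NUM_MASK);    /* number must fit in 22 bits */
    return ((uint32_t)type << TOY_NUM_BITS) | num;
}

/* Recover the object type from a minor number. */
static inline enum toy_objtype toy_minor_type(uint32_t minor)
{
    return (enum toy_objtype)((minor >> TOY_NUM_BITS) & 0x3);
}

/* Recover the object number from a minor number. */
static inline uint32_t toy_minor_num(uint32_t minor)
{
    return minor & TOY_NUM_MASK;
}
```

In this sketch a plex or subdisk number stands on its own and says nothing about the volume or plex that owns it, which is the property the change describes.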
Tidy up comments.
Check for null rqgs. This continues to be reported, though I can't
work out why.
Correct formats for some error messages. Don't cast the value to
match the format.
Use microtime, not getmicrotime, for timing debug entries.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
these fields instead.
- Hold the vnode interlock while locking bufs on the clean/dirty queues.
This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
BUF_LOCK with a LK_TIMEFAIL to a single lock.
Reviewed by: arch, mckusick
is not long long on all archs. (They happen to be longs on 64-bit archs
and gcc considers that significant enough to warn about it.) These should
probably be uintmax_t but I didn't feel like adding all the extra includes.
instead of %llx when %j is available).
Changed nearby output formats from %x to %#x so that it is obvious that the
numbers are in hex (vinum mostly uses 0x%x elsewhere).
Didn't fix nearby format printf errors (long lines).
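A short userland sketch of the two format conventions mentioned above: widen a 64-bit value to uintmax_t and print it with %j, and use %#x so hex output is recognizable. The toy_* helper names are hypothetical and snprintf stands in for the kernel printf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Print a 64-bit block number portably. uint64_t may be long or
 * long long depending on the architecture, so widen it to uintmax_t
 * and use the j length modifier instead of guessing %lx or %llx.
 */
static int toy_format_blkno(char *buf, size_t len, uint64_t blkno)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%#jx", (uintmax_t)blkno);
}

/* Print flags with %#x so a nonzero value carries a 0x prefix. */
static int toy_format_flags(char *buf, size_t len, unsigned int flags)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%#x", flags);
}
```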
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
with more than one plex, the data will be accessed
multiple times. During this time, userland code could
potentially modify the buffer, thus causing data
corruption. In the case of a multi-plexed volume this
might be cosmetic, but in the case of a RAID-[45] plex it
can cause severe data corruption which only becomes
evident after a drive failure. Avoid this situation by
making a copy of the data buffer before using it.
Note that this solution does not guarantee any particular
content of the buffer, just that it remains unchanged for
the duration of the request.
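The idea can be sketched in userland as follows. The toy_request structure and its helpers are hypothetical and are not the Vinum data structures; the point is only that every later access reads the private copy, so changes to the caller's buffer during the request have no effect.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request that owns a private snapshot of the caller's data. */
struct toy_request {
    void   *data;   /* private copy, stable for the life of the request */
    size_t  len;
};

/* Take the snapshot once, before any plex is written. Returns 0 on success. */
static int toy_request_init(struct toy_request *rq, const void *user_buf, size_t len)
{
    assert(rq != NULL && user_buf != NULL && len > 0);
    rq->data = malloc(len);
    if (rq->data == NULL)
        return -1;
    memcpy(rq->data, user_buf, len);
    rq->len = len;
    return 0;
}

/* Release the snapshot when the whole request has completed. */
static void toy_request_done(struct toy_request *rq)
{
    free(rq->data);
    rq->data = NULL;
    rq->len = 0;
}
```

As the note above says, the copy does not pin down which version of the data is written, only that all plexes and the parity calculation see the same bytes.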
Suggested by: alfred
Count volume I/Os correctly.
launch_requests: Be macho, throw away the safety net and walk the
tightrope with no splbio().
Add some comments explaining the smoke and mirrors.
Remove some redundant braces.
sdio: Set the state of an accessed but down subdisk correctly. This
appears to duplicate an earlier commit that I hadn't seen.
(Much of this done by script)
Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.
Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.
Add bio_queue field for struct bio aware disksort.
Address a lot of stylistic issues brought up by bde.
set properly in the struct buf with vinum:
Fix locations where B_READ was cleared in the old code but
b.b_iocmd wasn't set to BIO_WRITE
Fix propagation of b_iocmd
Correct comments to reflect reality
Don't compare b_flags with BIO_READ, it's in b_iocmd.
Submitted by: Bernd Walter <ticso@cicely.de>
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
This patch is machine generated except for the ccd.c and buf.h parts.
field in struct buf: b_iocmd. The b_iocmd is enforced to have
exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.
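A small sketch of why a zero-valued write flag invites bugs and how a one-bit command field can be checked. All constants and toy_* names here are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Old style (illustrative values): read is a flag bit, write is "no bit". */
#define TOY_OLD_B_READ  0x00100000u
#define TOY_OLD_B_WRITE 0x00000000u

/* New style (illustrative values): every command has its own bit. */
#define TOY_CMD_READ  0x1u
#define TOY_CMD_WRITE 0x2u

/* The obvious-looking test can never succeed, because the mask is zero. */
static bool toy_old_is_write(unsigned int flags)
{
    return (flags & TOY_OLD_B_WRITE) != 0;   /* always false */
}

/* A command is valid only if exactly one bit is set. */
static bool toy_iocmd_valid(unsigned int cmd)
{
    return cmd != 0 && (cmd & (cmd - 1)) == 0;
}

/* With a dedicated command field the write test is a plain comparison. */
static bool toy_is_write(unsigned int cmd)
{
    assert(toy_iocmd_valid(cmd));
    return cmd == TOY_CMD_WRITE;
}
```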
Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.
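For illustration, a completion routine that needs no separate "call me back" flag might look like this. The toy_buf structure and helpers are hypothetical stand-ins, not struct buf or biodone().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer header: a non-NULL b_iodone means "call me back". */
struct toy_buf {
    void (*b_iodone)(struct toy_buf *);
    int   done;        /* set when the transfer has completed */
    int   callbacks;   /* number of times the callback ran */
};

/* Example callback used to observe that completion invoked it. */
static void toy_count_callback(struct toy_buf *bp)
{
    bp->callbacks++;
}

/* Mark the buffer done and run the callback only if one is registered. */
static void toy_biodone(struct toy_buf *bp)
{
    assert(bp != NULL);
    bp->done = 1;
    if (bp->b_iodone != NULL)
        bp->b_iodone(bp);
}
```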
Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue. It is likely to write on your disk
where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch was machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
transferred, do it in complete_rqe instead.
launch_requests: Replace the inadvertently removed splbio() around the
main loop. It may not be necessary, but the biggest
test of this stuff is IDE disks, which I'm not
using.
Remove throttling code, I'm pretty sure it's not
needed any more.
Don't set B_ORDERED, it's not necessary either.
Objected-to-by: alfred
build_rq_buffer: Don't lose the B_ORDERED bit, it still has some
residual meaning. To do this right, Vinum needs to
look at the B_ORDERED bit and order the transfer
across all disks involved. That's an exercise for
another day.
Objected-to-by: alfred
Implicitly-sanctioned-by: jkh
the tsleep call flags.
Submitted-by: Bernd Walter <ticso@cicely.de>
Remove references to vnode pointers, including debug output. Vinum
now talks directly to the device driver.
bre: Add case for RAID-4.
sdio: Don't try to write to a down drive. Set the sd state instead.
Approved-by: jkh
alpha.
Modify the manner in which we lock RAID-5 plexes. This appears to
solve some of the elusive panics we have seen with corrupted buffer
headers (specifically the zeroed-out b_iodone field).
Submitted-by: Bernd Walter <ticso@cicely.de>
Put splbio protection around the main launch loop. We've seen cases where
the bottom half was cutting off the branch on which we're sitting.
Experienced-by: Michael Reifenberger <root@nihil.plaut.de>
limit the number of outstanding requests on a specific drive and
overall.
Change the way we set the active request count. This enables us to
start the requests without being in splbio for the duration, which
could be very long for IDE drives in PIO mode.
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
Please see the comment in sys/conf.h about the flag argument.
Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.
Remove bogus and unused strategy1 routines.
Remove open/close arguments from dssize(). Pick them up from dev_t.
Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
Move the declaration of freerq() to request.h.
logrq: add support for lock events.
vinumstart: solve a problem where removing a plex from an active
volume could cause attempts to access non-existent plexes.
launch_requests: don't set a request group active until we're sure we
can launch it. This caused some hangs under unusual
circumstances.
bre: don't set XFR_BAD_SUBDISK if we're not going to use it.
build_read_request: correct recovery, which caused some hangs under
(other) unusual circumstances.
build_rq_buffer: don't set bp->b_dev if we don't have a dev.
sdio: clean up, remove obsolete code.
deallocrqg: unlock any locks the rqg may have.
Add Cybernet copyright.
OK'd-by: Chuck Jacobus <chuck@cybernet.com>
logrq: save device major and minor numbers to compensate for lost
dev_t.
launch_requests: Don't issue requests which are marked
XFR_BAD_SUBDISK. This may make things easier in bre().
bre:
Rearrange.
- Change some comments
- Recognize holes in plex structure. Formerly this could lead to
incorrect writes to the plex. Return REQUEST_DEGRADED on a read
request, but carry on to the bitter end on a write request, and
mark the requests for the inaccessible subdisks with
XFR_BAD_SUBDISK.
- Return REQUEST_EOF if the requested transfer goes beyond the end
of the plex. This is not an error, since other plexes may go
further into the volume address space.
build_read_request:
Handle REQUEST_DEGRADED returned from bre().
sdio:
Lock buffer before issuing the requests.
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.
Don't bzero the buffer structure, it's been done already by
allocrqg.
sdio:
Build up a correct buffer header, don't steal linkages from system
buffer headers.
Noticed-by: mckusick
Made a new (inline) function devsw(dev_t dev) and substituted it.
Changed the BDEV variant to this format as well: bdevsw(dev_t dev)
DEVFS will eventually benefit from this change too.
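The shape of such an accessor can be sketched as below. Everything here (the toy_* types, the table size, the major-number extraction) is a hypothetical stand-in and not the kernel's devsw() implementation; the point is that callers go through a function, so the lookup behind it can change later without touching them.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NMAJOR 8                    /* hypothetical table size */

typedef unsigned int toy_dev_t;         /* hypothetical device number */

struct toy_cdevsw {
    const char *d_name;
};

static struct toy_cdevsw *toy_cdevsw_table[TOY_NMAJOR];

/* Assumed encoding for this sketch: the major number lives in bits 8..15. */
static inline unsigned int toy_major(toy_dev_t dev)
{
    return (dev >> 8) & 0xff;
}

/* Register a switch entry for a major number. */
static void toy_register(unsigned int maj, struct toy_cdevsw *sw)
{
    assert(maj < TOY_NMAJOR);
    toy_cdevsw_table[maj] = sw;
}

/* Callers use this instead of indexing the table directly. */
static inline struct toy_cdevsw *toy_devsw(toy_dev_t dev)
{
    unsigned int maj = toy_major(dev);

    return maj < TOY_NMAJOR ? toy_cdevsw_table[maj] : NULL;
}
```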
Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline)
function.
Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
to the order of the cmaj/bmaj arguments!)
Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
(ditto!)
(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)