If the client requests that the error recovery code retry a selection
timeout, it will be retried after half a second. The delay is to give the
device time to recover.
For most of these drivers, I only added selection timeout retries where
they were also retrying unit attention type errors. The sa(4) driver calls
saerror() in a number of places, but most of them don't request retrying
unit attentions.
Also, bump the default minimum CD changer timeout from 2 to 5 seconds and
the maximum timeout from 10 to 15 seconds. Some Pioneer changers seem to
have trouble with the shorter timeout.
Reviewed by: gibbs
NOTE: These changes will require recompilation of any userland
applications, like cdrecord, xmcd, etc., that use the CAM passthrough
interface. A make world is recommended.
camcontrol.[c8]:
- We now support two new commands, "tags" and "negotiate".
- The tags commands allows users to view the number of tagged
openings for a device as well as a number of other related
parameters, and it allows users to set tagged openings for
a device.
- The negotiate command allows users to enable and disable
disconnection and tagged queueing, set sync rates, offsets
and bus width. Note that not all of those features are
available for all controllers. Only the adv, ahc, and ncr
drivers fully support all of the features at this point.
Some cards do not allow the setting of sync rates, offsets and
the like, and some of the drivers don't have any facilities to
do so. Some drivers, like the adw driver, only support enabling
or disabling sync negotiation, but do not support setting sync
rates.
- new description in the camcontrol man page of how to format a disk
- cleanup of the camcontrol inquiry command
- add support in the 'devlist' command for skipping unconfigured devices if
-v was not specified on the command line.
- make use of the new base_transfer_speed in the path inquiry CCB.
- fix CCB bzero cases
cam_xpt.c, cam_sim.[ch], cam_ccb.h:
- new flags on many CCB function codes to designate whether they're
non-immediate, use a user-supplied CCB, and can only be passed from
userland programs via the xpt device. Use these flags in the transport
layer and pass driver to categorize CCBs.
- new flag in the transport layer device matching code for device nodes
that indicates whether a device is unconfigured
- bump the CAM version from 0x10 to 0x11
- Change the CAM ioctls to use the version as their group code, so we can
force users to recompile code even when the CCB size doesn't change.
- add + fill in a new value in the path inquiry CCB, base_transfer_speed.
Remove a corresponding field from the cam_sim structure, and add code to
every SIM to set this field to the proper value.
- Fix the set transfer settings code in the transport layer.
scsi_cd.c:
- make some variables volatile instead of just casting them in various
places
- fix a race condition in the changer code
- attach unless we get a "logical unit not supported" error. This should
fix all of the cases where people have devices that return weird errors
when they don't have media in the drive.
scsi_da.c:
- attach unless we get a "logical unit not supported" error
scsi_pass.c:
- for immediate CCBs, just malloc a CCB to send the user request in. This
gets rid of the 'held' count problem in camcontrol tags.
scsi_pass.h:
- change the CAM ioctls to use the CAM version as their group code.
adv driver:
- Allow changing the sync rate and offset separately.
adw driver
- Allow changing the sync rate and offset separately.
aha driver:
- Don't return CAM_REQ_CMP for SET_TRAN_SETTINGS CCBs.
ahc driver:
- Allow setting offset and sync rate separately
bt driver:
- Don't return CAM_REQ_CMP for SET_TRAN_SETTINGS CCBs.
NCR driver:
- Fix the ultra/ultra 2 negotiation bug
- allow setting both the sync rate and offset separately
Other HBA drivers:
- Put code in to set the base_transfer_speed field for
XPT_GET_TRAN_SETTINGS CCBs.
Reviewed by: gibbs, mjacob (isp), imp (aha)
peripheral drivers can determine where in the devstat(9) list they are
inserted.
This requires recompilation of libdevstat, systat, vmstat, rpc.rstatd, and
any ports that depend on the devstat code, since the size of the devstat
structure has changed. The devstat version number has been incremented as
well to reflect the change.
This sorts devices in the devstat list in "more interesting" to "less
interesting" order. So, for instance, da devices are now more important
than floppy drives, and so will appear before floppy drives in the default
output from systat, iostat, vmstat, etc.
The order of devices is, for now, kept in a central table in devicestat.h.
If individual drivers were able to make a meaningful decision on what
priority they should be at attach time, we could consider splitting the
priority information out into the various drivers. For now, though, they
have no way of knowing that, so it's easier to put them in an easy to find
table.
Also, move the checkversion() call in vmstat(8) to a more logical place.
Thanks to Bruce and David O'Brien for suggestions, for reviewing this, and
for putting up with the long time it has taken me to commit it. Bruce did
object somewhat to the central priority table (he would rather the
priorities be distributed in each driver), so his objection is duly noted
here.
Reviewed by: bde, obrien
CAPACITY fail for a non-removable media device. There's a race
condition where the device entry is removed and then
xpt_release_ccb is called which attempts to give back the ccb
to a device that's now gone. In this bandaid release the ccb
early and then remember to not call xpt_release_ccb later.
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.
These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.
Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Mike Spengler <mks@networkcs.com>
not like the 6-byte read and write commands! It returns illegal request,
with the field pointer pointing to byte 9 of a 6 byte CDB.
In any case, the work around is to put in a quirk mechanism that makes sure
that we don't send 6-byte reads or writes to this device. It's rather sad
that this is necessary. You'd think that they would be able to get
something that basic to work right in their firmware...
Reviewed by: gibbs
Reported by: Adam McDougall <bsdx@spawnet.com>
to a device failed.
In theory, the same steps that happen when we get an AC_LOST_DEVICE async
notification should have been taken when a driver fails to attach. In
practice, that wasn't the case.
This only affected the da, cd and ch drivers, but the fix affects all
peripheral drivers.
There were several possible problems:
- In the da driver, we didn't remove the peripheral's softc from the da
driver's linked list of softcs. Once the peripheral and softc got
removed, we'd get a kernel panic the next time the timeout routine
called dasendorderedtag().
- In the da, cd and possibly ch drivers, we didn't remove the
peripheral's devstat structure from the devstat queue. Once the
peripheral and softc were removed, this could cause a panic if anyone
tried to access device statistics. (one component of the linked list
wouldn't exist anymore)
- In the cd driver, we didn't take the peripheral off the changer run
queue if it was scheduled to run. In practice, it's highly unlikely,
and maybe impossible that the peripheral would have been on the
changer run queue at that stage of the probe process.
The fix is:
- Add a new peripheral callback function (the "oninvalidate" function)
that is called the first time cam_periph_invalidate() is called for a
peripheral.
- Create new foooninvalidate() routines for each peripheral driver. This
routine is always called at splsoftcam(), and contains all the stuff
that used to be in the AC_LOST_DEVICE case of the async callback
handler.
- Move the devstat cleanup call to the destructor/cleanup routines, since
some of the drivers do I/O in their close routines.
- Make sure that when we're flushing the buffer queue, we traverse it at
splbio().
- Add a check for the invalid flag in the pt driver's open routine.
Reviewed by: gibbs
1) The vnode pager wasn't properly tracking the file size due to
"size" being page rounded in some cases and not in others.
This sometimes resulted in corrupted files. First noticed by
Terry Lambert.
Fixed by changing the "size" pager_alloc parameter to be a 64bit
byte value (as opposed to a 32bit page index) and changing the
pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
caused some 64bit offsets and sizes to be scrambled. Removing
the cast required adding casts at a few dozen callers.
There may be problems with other bogus casts in close-by
macros. A quick check seemed to indicate that those were okay,
however.
2217's (reported by Matthew Jacob in NetBSD PR kern/6027) and Fujitsu
M2954's (reported by Tom Jackson).
Some of the Fujitsus at least hang when they get a cache sync command.
(Others just return illegal request.)
Also, make error printing in dashutdown() a little more selective. Don't
print any error when the sense key is illegal request. Drives that don't
support the synchronize cache command usually return illegal request.
Also, make sure the scsi status is check condition before going into
scsi_sense_print().
Reviewed by: gibbs
command on drives that don't like it. Right now, there's just a bogus
quirk entry in the table that doesn't do anything, but that should be
changed once we get actual inquiry data for drives that don't like the
synchronize cache command.
Also, add a shutdown hook that runs through all direct access peripherals
and runs a synchronize cache on them if they're still open, and if
synchronize cache isn't disabled via a quirk entry.
Add a synchronize cache call at the end of dadump() (again, conditionalized
on the quirk entry), so we can insure that the disk cache contents get
flushed to physical media after a dump.
Check the new quirk entry in daclose() to decide whether or not to
synchronize the cache for a disk at final close.
Reviewed by: gibbs
JAZ drive happy. This shouldn't impact fast drives, and will keep cam
from failing on very slow ones (that are spinning up, say). 20
seconds was almost long enough, but not in all cases.
Suggested by: gibbs
well) Among them:
[ cd driver ]
1. Old labeling code was still there.
2. Error handling for dsopen() was broken (no test for the `error'
returned by dsopen(); bogus test of an `error' that is known to be 0).
3. cdopen() closed the physical device after certain errors although there
may still be open partitions on it.
4. cdclose() closed the physical device although there may still be open
partitions on it.
5. Some printf format fixes was incomplete or missing.
6. cdioctl() truncated unit numbers mod 256.
7. cdioctl() was missing locking.
[ da driver ]
1. daclose() closed the physical device although there may still be open
partitions on it. This was fixed many years ago in sd.c rev.1.57.
2. A minor optimization (the dk_slices != NULL test) in sdopen() became
uglier in daopen(). It is not worth doing. da only regressed compared
with od and my version of sd, since I never committed the change to sd.
daopen() should probably do less if some partition is already open.
This is not addressed by the diffs.
[ ... ]
5. "opt_hw_wdog.h" was not included, so the HW_WDOG code was unreachable.
- Added a getdev CCB call in the cdopen() and daopen() calls so that the
vendor name and device name are available for the disklabel. (suggested
by bde)
- Removed vestigal devfs support in both drivers, since we can't properly
work with devfs yet. (ask bde for details on this)
- Cleaned up the probe code in both drivers in the failure cases. There
were a number of things wrong here. The peripheral driver instances
weren't getting properly cleaned up. Sometimes the wrong probe message
would get printed out (with the failure message appended), so it wasn't
very clear that we failed to attach. SCSI sense information was printed,
even when the error in question wasn't a SCSI error. I put similar fixes
into the changer driver in revision 1.2 of scsi_ch.c.
Reviewed by: gibbs
Submitted by: bde (partially)
a perfect world, we'd notice the UA and do some device validation to ensure
that the device hasn't changed. We may get this before the year ends,
but not before 3.0R. This change gives the adminstrator ample ammunition
to take off a foot or two, but hey this *is* UN*X.
without the DA driver.
The problem was that the CD driver depended on scsi_read_write() and
scsi_start_stop(), which were defined in scsi_da.c.
I moved both functions, and their associated data structures and defines
from scsi_da.* to scsi_all.*. This is technically the "wrong" thing to do
since those commands are really only for direct-access type devices, not
for all SCSI devices. I think, though, that the advantage (allowing people
to compile kernels without the disk driver) outweighs any architectural
purity arguments.
PR: kern/7969
Reviewed by: gibbs