This is useful if you want to dynamically move into a Fibre Channel
or Multi-initiator environment that happens to be particularly noisy
and ugly that requires a lot of retries (with shorter I/O timeouts)
for commands destried by LIPs or Bus Resets.
Reviewed by: deafening silence on audit && scsi on the retry counts
MFC after: 2 weeks
1. Add SA_IO_TIMEOUT as an option (4 minutes default) to cover reads,
writes, wfm, test unit ready.
2. Add internal SCSIOP_TIMEOUT (e.g., for mode sense) at 1 minute. This
should not require an option, but is cleaner to parameterize.
MFC after: 1 week
- Replace some very poorly thought out API hacks that should have been
fixed a long while ago.
- Provide some much more flexible search functions (resource_find_*())
- Use strings for storage instead of an outgrowth of the rather
inconvenient temporary ioconf table from config(). We already had a
fallback to using strings before malloc/vm was running anyway.
drivers.
- change daprevent() to set CAM_RETRY_SELTO and SF_RETRY_UA when it calls
cam_periph_runccb().
- change the pt(4) driver to ignore unit attentions
- change the targ(4) driver to retry selection timeouts
- clean up a few formatting glitches in the targ(4) driver
Reviewed by: gibbs
prevent scsi_sense_desc() from deferencing a NULL pointer when a drive
happens to return one of these sense keys.
Reported by: Michael Samuel <michael@miknet.net>
With the recent changes in the CAM error handling, some problems in
the error handling of sa(4) have been uncovered. Basically, a number
of conditions that are not actually errors have been mistreated as
genuine errors. In particular:
. Trying to read in variable length mode with a mismatched blocksize
between the on-tape (virtual) blocks and the read(2) supplied buffer
size, causing an ILI SCSI condition, have caused an attempt to retry
the supposedly `errored' transfer, causing the tape to be read
continuously until it eventually hit EOM. Since by default any
simple mt(1) operation does an initial test read, an `mt stat' was
sufficient to trigger this bug.
Note that it's Justin's opinion that treating a NO SENSE as an EIO
is another bug in CAM. I feel not authorized to fix cam_periph.c
without another confirmation that i'm on the right track, however.
. Hitting a filemark caused the read(2) syscall to return EIO, instead
of returning a `short read'. Note that the current fix only solves
this problem in variable length mode. Fixed length mode uses a
different code path, and since i didn't grok all the intentions behind
that handling, i did not touch it (IOW: it's still broken, and you get
an EIO upon hitting a filemark).
The solution is to keep track of those conditions inside saerror(),
and upon completion to not call cam_periph_error() in that case. We
need to make sure that the device gets unfrozen if needed though (in
case of actual errors, cam_periph_error() does this on our behalf).
Not objected by: mjacob (who currently doesn't have the time to
review the patch)
bcopy would go off the end of the array by two elements, which sometimes
causes a panic if it happens to cross into a page that isn't mapped.
Submitted by: gibbs
Reviewed by: peter
Some of the major changes include:
- The SCSI error handling portion of cam_periph_error() has
been broken out into a number of subfunctions to better
modularize the code that handles the hierarchy of SCSI errors.
As a result, the code is now much easier to read.
- String handling and error printing has been significantly
revamped. We now use sbufs to do string formatting instead
of using printfs (for the kernel) and snprintf/strncat (for
userland) as before.
There is a new catchall error printing routine,
cam_error_print() and its string-based counterpart,
cam_error_string() that allow the kernel and userland
applications to pass in a CCB and have errors printed out
properly, whether or not they're SCSI errors. Among other
things, this helped eliminate a fair amount of duplicate code
in camcontrol.
We now print out more information than before, including
the CAM status and SCSI status and the error recovery action
taken to remedy the problem.
- sbufs are now available in userland, via libsbuf. This
change was necessary since most of the error printing code
is shared between libcam and the kernel.
- A new transfer settings interface is included in this checkin.
This code is #ifdef'ed out, and is primarily intended to aid
discussion with HBA driver authors on the final form the
interface should take. There is example code in the ahc(4)
driver that implements the HBA driver side of the new
interface. The new transfer settings code won't be enabled
until we're ready to switch all HBA drivers over to the new
interface.
src/Makefile.inc1,
lib/Makefile: Add libsbuf. It must be built before libcam,
since libcam uses sbuf routines.
libcam/Makefile: libcam now depends on libsbuf.
libsbuf/Makefile: Add a makefile for libsbuf. This pulls in the
sbuf sources from sys/kern.
bsd.libnames.mk: Add LIBSBUF.
camcontrol/Makefile: Add -lsbuf. Since camcontrol is statically
linked, we can't depend on the dynamic linker
to pull in libsbuf.
camcontrol.c: Use cam_error_print() instead of checking for
CAM_SCSI_STATUS_ERROR on every failed CCB.
sbuf.9: Change the prototypes for sbuf_cat() and
sbuf_cpy() so that the source string is now a
const char *. This is more in line wth the
standard system string functions, and helps
eliminate warnings when dealing with a const
source buffer.
Fix a typo.
cam.c: Add description strings for the various CAM
error status values, as well as routines to
look up those strings.
Add new cam_error_string() and
cam_error_print() routines for userland and
the kernel.
cam.h: Add a new CAM flag, CAM_RETRY_SELTO.
Add enumerated types for the various options
available with cam_error_print() and
cam_error_string().
cam_ccb.h: Add new transfer negotiation structures/types.
Change inq_len in the ccb_getdev structure to
be "reserved". This field has never been
filled in, and will be removed when we next
bump the CAM version.
cam_debug.h: Fix typo.
cam_periph.c: Modularize cam_periph_error(). The SCSI error
handling part of cam_periph_error() is now
in camperiphscsistatuserror() and
camperiphscsisenseerror().
In cam_periph_lock(), increase the reference
count on the periph while we wait for our lock
attempt to succeed so that the periph won't go
away while we're sleeping.
cam_xpt.c: Add new transfer negotiation code. (ifdefed
out)
Add a new function, xpt_path_string(). This
is a string/sbuf analog to xpt_print_path().
scsi_all.c: Revamp string handing and error printing code.
We now use sbufs for much of the string
formatting code. More of that code is shared
between userland the kernel.
scsi_all.h: Get rid of SS_TURSTART, it wasn't terribly
useful in the first place.
Add a new error action, SS_REQSENSE. (Send a
request sense and then retry the command.)
This is useful when the controller hasn't
performed autosense for some reason.
Change the default actions around a bit.
scsi_cd.c,
scsi_da.c,
scsi_pt.c,
scsi_ses.c: SF_RETRY_SELTO -> CAM_RETRY_SELTO. Selection
timeouts shouldn't be covered by a sense flag.
scsi_pass.[ch]: SF_RETRY_SELTO -> CAM_RETRY_SELTO.
Get rid of the last vestiges of a read/write
interface.
libkern/bsearch.c,
sys/libkern.h,
conf/files: Add bsearch.c, which is needed for some of the
new table lookup routines.
aic7xxx_freebsd.c: Define AHC_NEW_TRAN_SETTINGS if
CAM_NEW_TRAN_CODE is defined.
sbuf.h,
subr_sbuf.c: Add the appropriate #ifdefs so sbufs can
compile and run in userland.
Change sbuf_printf() to use vsnprintf()
instead of kvprintf(), which is only available
in the kernel.
Change the source string for sbuf_cpy() and
sbuf_cat() to be a const char *.
Add __BEGIN_DECLS and __END_DECLS around
function prototypes since they're now exported
to userland.
kdump/mkioctls: Include stdio.h before cam.h since cam.h now
includes a function with a FILE * argument.
Submitted by: gibbs (mostly)
Reviewed by: jdp, marcel (libsbuf makefile changes)
Reviewed by: des (sbuf changes)
Reviewed by: ken
inq_len member of the ccb_getdev structure, but we've never filled that
value in..
So we now get the length from the inquiry data returned by the drive.
(Since we will fetch as much inquiry data as the drive claims to support.)
Reviewed by: mjacob
Reported by: Andrzej Tobola <san@iem.pw.edu.pl>
offset is set to 0.
Re-arrange the DT limiting code so that we don't end up setting the period
to 0xa if the user really wants async. The previous behavior seemed to
confuse the aic(4) driver.
PR: kern/22733
Reviewed by: gibbs
o Offset and period in synch messages and width negotiation should be
done for per target not per lun. Move these from *lun_info to
*targ_info.
o Change in handling XPT_RESET_DEV and XPT_GET_TRAN_SETTINGS .
o Change CAM_* xpt_done return values.
o Busy loop did not timeout. Change this to timeout as original NetBSD/pc98.
Reviewed by: bsd-nomads ML
not be retried. It is an indication that there was an error that was
corrected during the execution of the command. This is per ANSI SCSI2
spec.
It's possible that these should also be noted to the console (as indicative,
perhaps, of growing media defect lists in drives), but the default of
printing errors out if bootverbose in this case is probably enough.
Also, there'd been a missing ERESTART for that clause anyway.
2. If you have an ABORTED COMMAND, it's almost invariably a SCSI parity
error. You should never be silent about these since users should do something
about this if it occurs (moving that power cord *away* from the SCSI cable is
always a good first start). This should print irrespective of bootverbose
because it's an actual real error even if we retry a transmission.
Reviewed by: audit@freebsd.org, gibbs@freebsd.org
- Use swi_* function names.
- Use void * to hold cookies to handlers instead of struct intrhand *.
- In sio.c, use 'driver_name' instead of "sio" as the name of the driver
lock to minimize diffs with cy(4).
folks.
My guess is that reducing the number of tags is just masking the real
problem for the PR submitter. I'll re-open the PR and see if I can work
with the submitter to diagnose the problem.
PR: 21139
we *really* are.
It should be noted that there is a degenerate case where soft tape
location will be lost (not causing a frozen state- but causing
the loss of reporting fileno/blockno)- that's where you backspace
over a filemark- you stop backspacing as soon as you cross the
filemark, but you have no idea what the record number now is because
you have no idea how many records you are into the file you just
backed into. Such is life.
While I'm at it, also pick up residuals from writing filemarks.
PR: 24222
only CCB type but also extra flags- one of which can be "position
updated".
In other changes: Add in a SA_QUIRK_NO_CPAGE quirk so that it's possible
to avoid using a (broken) device's implementation of he DEVICE COMPRESSION
page.
Also do a couple of printout cleanups.
As per some discussion on FreeBSD-scsi, skip doing tape flushing
if we're reading tape logical block location (MTIOCRDSPOS).
MS will be treated as having this quirk. In the event that we falsely
identify one that doesn't need it, no harm will be done. Ken
suggested that we make this more generic since there may be more
needed in the future.
Reported by: TERAMOTO Masahiro <teramoto@comm.eng.osaka-u.ac.jp>
PR: kern/23378
Reviewed by: ken
field in CDB' error when attempting to start a caddy-type CD drive,
since those drives apparently in general refuse to load a medium. Since
we never advertised the feature to load the medium upon calling
cdstartunit() (i. e. upon receipt of a CDIOCSTART ioctl command), nobody
should have relied on it. Besides, nobody noticed so far at all that
this command is failing for caddy-type drives... Only few applications
seem to use it at all (among them is workman, which made me notice it).
Reviewed by: ken
CDs.
With audio CDs, you can't just do a READ(10) call on most drives without
first setting the blocksize with a mode select command. The disklabel code
does a read of the first sector of the media to find a label if it exists.
This caused drives to return an error when an audio CD was in the drive,
due to the problem described above.
The solution is to read the table of contents on the CD, and only attempt
to read the disklabel if the first track is a data track.
This works on all the various CD and DVD media I have tried, but further
testing (especially with Video CDs and other mode 2 media) will be
needed to determine if this is a universal solution.
Return ENOTSUP for any opcode that is not supported by the XPT
device.
Add back a missing local declaration that seems to have been deleted
by my last commit.
The XPT uses this to prevent tags from being used on parallel SCSI
interfaces immediately after a bus reset or BDR so that controllers
have an oportunity to renegotiate without tag messages in the way.
Somehow this got disabled... the functionality has been here for
quite some time.
Noticed by: my SCSI bus analyzer
This allows writing to DVD-RAM, PD and similar drives that probe as CD
devices. Note that these are randomly writeable devices, not
sequential-only devices like CD-R drives, which are supported by cdrecord.
Add a new flag value for dsopen(), DSO_COMPATLABEL. The cd(4) driver now
uses this flag instead of the DSO_NOLABELS flag. The DSO_NOLABELS always
used a "fake" disklabel for the entire disk, provided by the caller.
With the DSO_COMPATLABEL flag, dsopen() will first search the media for a
label, and if it finds a label, it will use that label. Otherwise it will
use the fake disklabel provided by the caller. This provides backwards
compatibility, since we will still have labels for ISO9660 media.
It also provides new functionality, since you can now have a regular BSD
disklabel on read-only media, or on writeable media (e.g. DVD-RAM).
Bruce and I both think that we should eventually (in a few years) get
away from using disklabels for ISO9660 media, and just use the whole disk
device (/dev/cd0). At that point disklabel handling in the cd(4) driver
could follow the "normal" model, as used in the da(4) driver.
Also, clean up the path in a couple of places in cdregister(). (Thanks to
Nick Hibma for catching that bug.)
Reviewed by: bde
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.
Define __offsetof() in <machine/ansi.h>
Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
Remove myriad of local offsetof() definitions.
Remove includes of <stddef.h> in kernel code.
NB: Kernelcode should *never* include from /usr/include !
Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
Deprecate <struct.h> with a warning. The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.
Paritials reviews by: various.
Significant brucifications by: bde
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.
Submitted by: cp
(a NetBSD port for NEC PC-98x1 machines). They are ncv for NCR 53C500,
nsp for Workbit Ninja SCSI-3, and stg for TMC 18C30 and 18C50.
I thank NetBSD/pc98 and bsd-nomads people.
Obtained from: NetBSD/pc98
write caching is disabled on both SCSI and IDE disks where large
memory dumps could take up to an hour to complete.
Taking an i386 scsi based system with 512MB of ram and timing (in
seconds) how long it took to complete a dump, the following results
were obtained:
Before: After:
WCE TIME WCE TIME
------------------ ------------------
1 141.820972 1 15.600111
0 797.265072 0 65.480465
Obtained from: Yahoo!
Reviewed by: peter
- Make softinterrupts (SWI's) almost completely MI, and divorce them
completely from the x86 hardware interrupt code.
- The ihandlers array is now gone. Instead, there is a MI shandlers array
that just contains SWI handlers.
- Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.
Submitted by: dfr
newbus for referencing device interrupt handlers.
- Move the 'struct intrec' type which describes interrupt sources into
sys/interrupt.h instead of making it just be a x86 structure.
- Don't create 'ithd' and 'intrec' typedefs, instead, just use 'struct ithd'
and 'struct intrec'
- Move the code to translate new-bus interrupt flags into an interrupt thread
priority out of the x86 nexus code and into a MI ithread_priority()
function in sys/kern/kern_intr.c.
- Remove now-uneeded x86-specific headers from sys/dev/ata/ata-all.c and
sys/pci/pci_compat.c.
also mention the peripheral name, bus, target and lun of the device we
attempted to put in that slot. This gives the user a little more
information about what is going on.
Tested by: Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Discussed with: gibbs
for the Quantum "MAVERICK 540S" and "LPS525S".
Also, add common string variables, since we seem to have a few Quantum and
Micropolis drives in here.
Fix the 'quantum' variable usage in scsi_all.c that likely got broken when
someone staticized things in cam_xpt.c. (That particular problem would
cause Quantum Fireball ST drives to not get spun up if they were not
already spinning.)
Submitted by: Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Make the umass driver depend on this module.
Makes it possible to compile the kernel without SCSI support and load it
when for example a USB floppy is conencted.
to be obeying the original spec as to what the numeric value means.
Temperature flags are unaffected- these are still the 'pseudo-thermometers'
and overtemp/undertemp warnings will be caught and translated to SES objects
here.
PR: 20475