1. Fixed timeout specification for the msleep in mps_wait_command().
Added 30 second timeout for mps_wait_command() calls in mps_user.c.
2. Make sure we call mps_detach_user() from the kldunload path.
3. Raid Hotplug behavior change.
The driver now removes a volume when it goes to a failed state,
so we also need to add volume back to the OS when it goes to
opitimal/degraded/online from failed/missing.
Handle raid volume add and remove from the IR_Volume event.
4. Added some more debugging information.
5. Replace xpt_async(AC_LOST_DEVICE, path, NULL) with
mpssas_rescan_target().
This is to work around a panic in CAM that shows up when adding a
drive with a rescan and removing another device from the driver thread
with an AC_LOST_DEVICE async notification.
This problem was encountered in testing with the LSI sas2ircu utility,
which was used to create a RAID volume from physical disks. The driver
has to create the RAID volume target and remove the physical disk
targets, and triggered a panic in the process.
The CAM issue needs to be fully diagnosed and fixed, but this works
around the issue for now.
6. Fix some memory initialization issues in mps_free_command().
7. Resolve the "devq freeze forever" issue. This was caused by the
internal read capacity command issued in the non-head version of the
driver. When the command completed with an error, the driver wasn't
unfreezing thd device queue.
The version in head uses the CAM infrastructure for getting the read
capacity information, and therefore doesn't have the same issue.
8. Bump the version to 13.00.00.00-fbsd. (this is very close to LSI's
internal stable driver 13.00.00.00)
Submitted by: Kashyap Desai <Kashyap.Desai@lsi.com>
MFC after: 3 days
This involves significant changes to the mps(4) driver, but is not a
complete rewrite.
Some of the changes in this version of the driver:
- Integrated RAID (IR) support.
- Support for WarpDrive controllers.
- Support for SCSI protection information (EEDP).
- Support for TLR (Transport Level Retries), needed for tape drives.
- Improved error recovery code.
- ioctl interface compatible with LSI utilities.
mps.4: Update the mps(4) driver man page somewhat for the driver
changes. The list of supported hardware still needs to be
updated to reflect the full list of supported cards.
conf/files: Add the new driver files.
mps/mpi/*: Updated version of the MPI header files, with a BSD style
copyright.
mps/*: See above for a description of the new driver features.
modules/mps/Makefile:
Add the new mps(4) driver files.
Submitted by: Kashyap Desai <Kashyap.Desai@lsi.com>
Reviewed by: ken
MFC after: 1 week
no reason why it should be limited to 64K of DFLTPHYS. DMA data tag is any
way set to allow MAXPHYS, S/G lists (chain elements) are sufficient and
overflows are also handled. On my tests even 1MB I/Os are working fine.
Reviewed by: ken@
one. Interestingly, these are actually the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.
Zero any sense not transferred by the device as the SCSI specification
mandates that any untransferred data should be assumed to be zero.
Reviewed by: ken
CAM.
Desriptor sense is a new sense data format that originated in SPC-3. Among
other things, it allows for an 8-byte info field, which is necessary to
pass back block numbers larger than 4 bytes.
This change adds a number of new functions to scsi_all.c (and therefore
libcam) that abstract out most access to sense data.
This includes a bump of CAM_VERSION, because the CCB ABI has changed.
Userland programs that use the CAM pass(4) driver will need to be
recompiled.
camcontrol.c: Change uses of scsi_extract_sense() to use
scsi_extract_sense_len().
Use scsi_get_sks() instead of accessing sense key specific
data directly.
scsi_modes: Update the control mode page to the latest version (SPC-4).
scsi_cmds.c,
scsi_target.c: Change references to struct scsi_sense_data to struct
scsi_sense_data_fixed. This should be changed to allow the
user to specify fixed or descriptor sense, and then use
scsi_set_sense_data() to build the sense data.
ps3cdrom.c: Use scsi_set_sense_data() instead of setting sense data
manually.
cam_periph.c: Use scsi_extract_sense_len() instead of using
scsi_extract_sense() or accessing sense data directly.
cam_ccb.h: Bump the CAM_VERSION from 0x15 to 0x16. The change of
struct scsi_sense_data from 32 to 252 bytes changes the
size of struct ccb_scsiio, but not the size of union ccb.
So the version must be bumped to prevent structure
mis-matches.
scsi_all.h: Lots of updated SCSI sense data and other structures.
Add function prototypes for the new sense data functions.
Take out the inline implementation of scsi_extract_sense().
It is now too large to put in a header file.
Add macros to calculate whether fields are present and
filled in fixed and descriptor sense data
scsi_all.c: In scsi_op_desc(), allow the user to pass in NULL inquiry
data, and we'll assume a direct access device in that case.
Changed the SCSI RESERVED sense key name and description
to COMPLETED, as it is now defined in the spec.
Change the error recovery action for a number of read errors
to prevent lots of retries when the drive has said that the
block isn't accessible. This speeds up reconstruction of
the block by any RAID software running on top of the drive
(e.g. ZFS).
In scsi_sense_desc(), allow for invalid sense key numbers.
This allows calling this routine without checking the input
values first.
Change scsi_error_action() to use scsi_extract_sense_len(),
and handle things when invalid asc/ascq values are
encountered.
Add a new routine, scsi_desc_iterate(), that will call the
supplied function for every descriptor in descriptor format
sense data.
Add scsi_set_sense_data(), and scsi_set_sense_data_va(),
which build descriptor and fixed format sense data. They
currently default to fixed format sense data.
Add a number of scsi_get_*() functions, which get different
types of sense data fields from either fixed or descriptor
format sense data, if the data is present.
Add a number of scsi_*_sbuf() functions, which print
formatted versions of various sense data fields. These
functions work for either fixed or descriptor sense.
Add a number of scsi_sense_*_sbuf() functions, which have a
standard calling interface and print the indicated field.
These functions take descriptors only.
Add scsi_sense_desc_sbuf(), which will print a formatted
version of the given sense descriptor.
Pull out a majority of the scsi_sense_sbuf() function and
put it into scsi_sense_only_sbuf(). This allows callers
that don't use struct ccb_scsiio to easily utilize the
printing routines. Revamp that function to handle
descriptor sense and use the new sense fetching and
printing routines.
Move scsi_extract_sense() into scsi_all.c, and implement it
in terms of the new function, scsi_extract_sense_len().
The _len() version takes a length (which should be the
sense length - residual) and can indicate which fields are
present and valid in the sense data.
Add a couple of new scsi_get_*() routines to get the sense
key, asc, and ascq only.
mly.c: Rename struct scsi_sense_data to struct
scsi_sense_data_fixed.
sbp_targ.c: Use the new sense fetching routines to get sense data
instead of accessing it directly.
sbp.c: Change the firewire/SCSI sense data transformation code to
use struct scsi_sense_data_fixed instead of struct
scsi_sense_data. This should be changed later to use
scsi_set_sense_data().
ciss.c: Calculate the sense residual properly. Use
scsi_get_sense_key() to fetch the sense key.
mps_sas.c,
mpt_cam.c: Set the sense residual properly.
iir.c: Use scsi_set_sense_data() instead of building sense data by
hand.
iscsi_subr.c: Use scsi_extract_sense_len() instead of grabbing sense data
directly.
umass.c: Use scsi_set_sense_data() to build sense data.
Grab the sense key using scsi_get_sense_key().
Calculate the sense residual properly.
isp_freebsd.h: Use scsi_get_*() routines to grab asc, ascq, and sense key
values.
Calculate and set the sense residual.
MFC after: 3 days
Sponsored by: Spectra Logic Corporation
mps.c: Hide the 'out of chain frames' warning behind MPS_INFO.
mps_sas.c: Hide the SIM queue freeze/unfreeze messages behind MPS_INFO.
mpsvar.h: Bump the number of chain frames from 1024 to 2048. From
testing, it looks like this makes it less likely that we'll
run out of chain frames, and it doesn't cost much memory
(32K).
MFC after: 3 days
When the driver ran out of DMA chaining buffers, it kept the timeout for
the I/O, and I/O would stall.
The driver was not freezing the device queue on errors.
mps.c: Pull command completion logic into a separate
function, and call the callback/wakeup for commands
that are never sent due to lack of chain buffers.
Add a number of extra diagnostic sysctl variables.
Handle pre-hardware errors for configuration I/O.
This doesn't panic the system, but it will fail the
configuration I/O and there is no retry mechanism.
So the device probe will not succeed. This should
be a very uncommon situation, however.
mps_sas.c: Freeze the SIM queue when we run out of chain
buffers, and unfreeze it when more commands
complete.
Freeze the device queue when errors occur, so that
CAM can insure proper command ordering.
Report pre-hardware errors for task management
commands. In general, that shouldn't be possible
because task management commands don't have S/G
lists, and that is currently the only error path
before we get to the hardware.
Handle pre-hardware errors (like out of chain
elements) for SMP requests. That shouldn't happen
either, since we should have enough space for two
S/G elements in the standard request.
For commands that end with
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and
MPI2_IOCSTATUS_SCSI_EXT_TERMINATED, return them
with CAM_REQUEUE_REQ to retry them unconditionally.
These seem to be related to back end, transport
related problems that are hopefully transient. We
don't want to go through the retry count for
something that is not a permanent error.
Keep track of the number of outstanding I/Os.
mpsvar.h: Track the number of free chain elements.
Add variables for the number of outstanding I/Os,
and I/O high water mark.
Add variables to track the number of free chain
buffers and the chain low water mark, as well as
the number of chain allocation failures.
Add I/O state flags and an attach done flag.
MFC after: 3 days
the controller firmware will return all of our commands. Instead, keep
track of outstanding I/Os and return them to CAM once device removal
processing completes.
mpsvar.h: Declare the new "io_list" in the mps_softc.
mps.c: Initialize the new "io_list" in the mps softc.
mps_sas.c: o Track SCSI I/O requests on the io_list from the
time of mpssas_action() through mpssas_scsiio_complete().
o Zero out the request structures used for device
removal commands prior to filling them out.
o Once the target reset task management function completes
during device removal processing, assume any SCSI I/O
commands that are still oustanding will never return
from the controller, and process them manually.
Submitted by: gibbs
MFC after: 3 days
Prior to this change, the addressing method wasn't getting set, and
so the LUN field could be set incorrectly in some instances.
This fix should allow for LUN numbers up to 16777215 (and return an error
for anything larger, which wouldn't fit into the flat addressing model).
Submitted by: scottl (in part)
This bug manifested itself after repeated device arrivals and
departures. The root of the problem was that the last entry in the
reply array wasn't initialized/allocated. So every time we got
around to that event, we had a bogus address.
There were a couple more problems with the code that are also fixed:
- The reply mechanism was being treated as sequential (indexed by
sc->replycurindex) even though the spec says that the driver
should use the ReplyFrameAddress field of the post queue
descriptor to figure out where the reply is. There is no
guarantee that the reply descriptors will be used in sequential
order.
- The second word of the reply post queue descriptor wasn't being
checked in mps_intr_locked() to make sure that it wasn't
0xffffffff. So the driver could potentially come across a
partially DMAed descriptor.
- The number of replies allocated was one less than the actual
size of the queue. Instead, it was the size of the number of
replies that can be used at one time. (Which is one less than
the size of the queue.)
mps.c: When initializing the entries in the reply free
queue, make sure we initialize the full number that
we tell the chip we have (sc->fqdepth), not the
number that can be used at any one time (sc->num_replies).
When allocating replies, make sure we allocate the
number of replies that we've told the chip exist,
not just the number that can be used simultaneously.
Use the ReplyFrameAddress field of the post queue
descriptor to figure out which reply is being
referenced. This is what the spec says to do, and
the spec doesn't guarantee that the replies will be
used in order.
Put a check in to verify that the reply address passed
back from the card is valid. (Panic if it isn't, we'll
panic when we try to deference the reply pointer in any
case.)
In mps_intr_locked(), verify that the second word of the
post queue descriptor is not 0xffffffff in addition to
verifying that the unused flag is not set, so we can
make sure we didn't get a partially DMAed descriptor.
Remove references to sc->replycurindex, it isn't needed
now.
mpsvar.h: Remove replycurindex from the softc, it isn't needed now.
Reviewed by: scottl
This includes support in the kernel, camcontrol(8), libcam and the mps(4)
driver for SMP passthrough.
The CAM SCSI probe code has been modified to fetch Inquiry VPD page 0x00
to determine supported pages, and will now fetch page 0x83 in addition to
page 0x80 if supported.
Add two new CAM CCBs, XPT_SMP_IO, and XPT_GDEV_ADVINFO. The SMP CCB is
intended for SMP requests and responses. The ADVINFO is currently used to
fetch cached VPD page 0x83 data from the transport layer, but is intended
to be extensible to fetch other types of device-specific data.
SMP-only devices are not currently represented in the CAM topology, and so
the current semantics are that the SIM will route SMP CCBs to either the
addressed device, if it contains an SMP target, or its parent, if it
contains an SMP target. (This is noted in cam_ccb.h, since it will change
later once we have the ability to have SMP-only devices in CAM's topology.)
smp_all.c,
smp_all.h: New helper routines for SMP. This includes
SMP request building routines, response parsing
routines, error decoding routines, and structure
definitions for a number of SMP commands.
libcam/Makefile: Add smp_all.c to libcam, so that SMP functionality
is available to userland applications.
camcontrol.8,
camcontrol.c: Add smp passthrough support to camcontrol. Several
new subcommands are now available:
'smpcmd' functions much like 'cmd', except that it
allows the user to send generic SMP commands.
'smprg' sends the SMP report general command, and
displays the decoded output. It will automatically
fetch extended output if it is available.
'smppc' sends the SMP phy control command, with any
number of potential options. Among other things,
this allows the user to reset a phy on a SAS
expander, or disable a phy on an expander.
'smpmaninfo' sends the SMP report manufacturer
information and displays the decoded output.
'smpphylist' displays a list of phys on an
expander, and the CAM devices attached to those
phys, if any.
cam.h,
cam.c: Add a status value for SMP errors
(CAM_SMP_STATUS_ERROR).
Add a missing description for CAM_SCSI_IT_NEXUS_LOST.
Add support for SMP commands to cam_error_string().
cam_ccb.h: Rename the CAM_DIR_RESV flag to CAM_DIR_BOTH. SMP
commands are by nature bi-directional, and we may
need to support bi-directional SCSI commands later.
Add the XPT_SMP_IO CCB. Since SMP commands are
bi-directional, there are pointers for both the
request and response.
Add a fill routine for SMP CCBs.
Add the XPT_GDEV_ADVINFO CCB. This is currently
used to fetch cached page 0x83 data from the
transport later, but is extensible to fetch many
other types of data.
cam_periph.c: Add support in cam_periph_mapmem() for XPT_SMP_IO
and XPT_GDEV_ADVINFO CCBs.
cam_xpt.c: Add support for executing XPT_SMP_IO CCBs.
cam_xpt_internal.h: Add fields for VPD pages 0x00 and 0x83 in struct
cam_ed.
scsi_all.c: Add scsi_get_sas_addr(), a function that parses
VPD page 0x83 data and pulls out a SAS address.
scsi_all.h: Add VPD page 0x00 and 0x83 structures, and a
prototype for scsi_get_sas_addr().
scsi_pass.c: Add support for mapping buffers in XPT_SMP_IO and
XPT_GDEV_ADVINFO CCBs.
scsi_xpt.c: In the SCSI probe code, first ask the device for
VPD page 0x00. If any VPD pages are supported,
that page is required to be implemented. Based on
the response, we may probe for the serial number
(page 0x80) or device id (page 0x83).
Add support for the XPT_GDEV_ADVINFO CCB.
sys/conf/files: Add smp_all.c.
mps.c: Add support for passing in a uio in mps_map_command(),
so we can map a S/G list at once.
Add support for SMP passthrough commands in
mps_data_cb(). SMP is a special case, because the
first buffer in the S/G list is outbound and the
second buffer is inbound.
Add support for warning the user if the busdma code
comes back with more buffers than will work for the
command. This will, for example, help the user
determine why an SMP command failed if busdma comes
back with three buffers.
mps_pci.c: Add sys/uio.h.
mps_sas.c: Add the SAS address and the parent handle to the
list of fields we pull from device page 0 and cache
in struct mpssas_target. These are needed for SMP
passthrough.
Add support for the XPT_SMP_IO CCB. For now, this
CCB is routed to the addressed device if it supports
SMP, or to its parent if it does not and the parent
does. This is necessary because CAM does not
currently support SMP-only nodes in the topology.
Make SMP passthrough support conditional on
__FreeBSD_version >= 900026. This will make it
easier to MFC this change to the driver without
MFCing the CAM changes as well.
mps_user.c: Un-staticize mpi_init_sge() so we can use it for
the SMP passthrough code.
mpsvar.h: Add a uio and iovecs into struct mps_command for
SMP passthrough commands.
Add a cm_max_segs field to struct mps_command so
that we can warn the user if busdma comes back with
too many segments.
Clear the cm_reply when a command gets freed. If
it is not cleared, reply frames will eventually get
freed into the pool multiple times and corrupt the
pool. (This fix is from scottl.)
Add a prototype for mpi_init_sge().
sys/param.h: Bump __FreeBSD_version to 900026 for the for the
inclusion of the XPT_GDEV_ADVINFO and XPT_SMP_IO
CAM CCBs.
- fix the leak of command struct on error
- simplify the cleanup logic
- EINPROGRESS is not a fatal error
- buggy comment and error message
Reviewed by: ken
controller, but make it optional.
After a problem report from Andrew Boyer, it looks like the LSI
chip may have issues (the watchdog timer fired) if too many aborts
are sent down to the chip at the same time. We know that task
management commands are serialized, and although the manual doesn't
say it, it may be a good idea to just send one at a time.
But, since I'm not certain that this is necessary, add a tunable
and sysctl variable (hw.mps.%d.allow_multiple_tm_cmds) to control
the driver's behavior.
mps.c: Add support for the sysctl and tunable, and add a
comment about the possible return values to
mps_map_command().
mps_sas.c: Run all task management commands through two new
routines, mpssas_issue_tm_request() and
mpssas_complete_tm_request().
This allows us to optionally serialize task
management commands. Also, change things so that
the response to a task management command always
comes back through the callback. (Before it could
come via the callback or the return value.)
mpsvar.h: Add softc variables for the list of active task
management commands, the number of active commands,
and whether we should allow multiple active task
management commands. Add an active command flag.
mps.4: Describe the new sysctl/loader tunable variable.
Sponsored by: Spectra Logic Corporation
When the driver is completely saturated with commands (1024 in the
case of the SAS2008 in my test system), I/O stops. If we tell CAM
that we have one less command slot than we have actually allocated,
everything works fine. We also need a few extra command slots to
allow for aborts and other task management commands to be sent down.
This needs more investigation to determine the root cause, but for
now this fixes things in my testing.
mps.c: Change a printf() to mps_printf().
mps_sas.c: Subtract 5 command slots when we tell CAM how many
commands we can handle.
Add some commented-out logic to print the contents
the CDBs for timed-out commands. This can help
in debugging devices that are timing out. This
will be uncommented once I bring some CAM changes in.
Reported by: Andrew Boyer <aboyer at averesystems dot com>
According to the MPT2 spec, task management commands are
serialized, and so no I/O should start while task management
commands are active.
So, to comply with that, freeze the SIM queue before we send any
task management commands (abort, target reset, etc.) down to the
IOC. We unfreeze the queue once the task management command
completes.
It isn't clear from the spec whether multiple simultaneous task
management commands are supported. Right now it is possible to
have multiple outstanding task management commands, especially in
the abort case. Multiple outstanding aborts do complete
successfully, so it may be supported.
We also don't yet have any recovery mechanism (e.g. reset the IOC)
if the task management command fails.
Bring in a driver for the LSI Logic MPT2 6Gb SAS controllers.
This driver supports basic I/O, and works with SAS and SATA drives and
expanders.
Basic error recovery works (i.e. timeouts and aborts) as well.
Integrated RAID isn't supported yet, and there are some known bugs.
So this isn't ready for production use, but is certainly ready for
testing and additional development. For the moment, new commits to this
driver should go into the FreeBSD Perforce repository first
(//depot/projects/mps/...) and then get merged into -current once
they've been vetted.
This has only been added to the amd64 GENERIC, since that is the only
architecture I have tested this driver with.
Submitted by: scottl
Discussed with: imp, gibbs, will
Sponsored by: Yahoo, Spectra Logic Corporation