Commit Graph

8 Commits

Author SHA1 Message Date
Scott Long
d9802deb4e Sometimes a device misbehaves so badly that it disrupts the entire system.
Add a tunable that allows such a device to be excluded from the driver.
The id parameter is the target id that the driver assigns to a given device.

dev.mps.X.exclude_ids=<id>,<id>

Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:09:02 +00:00
Kenneth D. Merry
9b91b192fb Merge in phase 14+ -> 16 mps driver fixes from LSI:
---------------------------------------------------------------
System panics during a Port reset with ouststanding I/O
---------------------------------------------------------------
It is possible to call mps_mapping_free_memory after this
memory is already freed, causing a panic. Removed this extra
call to mps_mappiing_free_memory and call mps_mapping_exit
in place of the mps_mapping_free_memory call so that any
outstanding mapping items can be flushed before memory is
freed.

---------------------------------------------------------------
Correct memory leak during a Port reset with ouststanding I/O
---------------------------------------------------------------
In mps_reinit function, the mapping memory was not being
freed before being re-allocated. Added line to call the
memory free function for mapping memory.

---------------------------------------------------------------
Use CAM_SIM_QUEUED flag in Driver IO path.
---------------------------------------------------------------
This flag informs the XPT that successful abort of a CCB
requires an abort ccb to be issued to the SIM.  While
processing SCSI IO's, set the CAM_SIM_QUEUED flag in the
status for the IO. When the command completes, clear this
flag.

---------------------------------------------------------------
Check for CAM_REQ_INPROG in I/O path.
---------------------------------------------------------------
Added a check in mpssas_action_scsiio for the In Progress
status for the IO. If this flag is set, the IO has already
been aborted by the upper layer (before CAM_SIM_QUEUED was
set) and there is no need to send the IO. The request will
be completed without error.

---------------------------------------------------------------
Improve "doorbell handshake method" for mps_get_iocfacts
---------------------------------------------------------------
Removed call to get Port Facts since this information is
not used currently.

Added mps_iocfacts_allocate function to allocate memory
that is based on IOC Facts data.  Added mps_iocfacts_free
function to free memory that is based on IOC Facts data.
Both of the functions are used when a Diag Reset is performed
or when the driver is attached/detached. This is needed in
case IOC Facts changes after a Diag Reset, which could
happen if FW is upgraded.

Moved call of mps_bases_static_config_pages from the attach
routine to after the IOC is ready to process accesses based
on the new memory allocations (instead of polling through
the Doorbell).

---------------------------------------------------------------
Set TimeStamp in INIT message in millisecond format Set the IOC
---------------------------------------------------------------

---------------------------------------------------------------
Prefer mps_wait_command to mps_request_polled
---------------------------------------------------------------
Instead of using mps_request_polled, call mps_wait_command
whenever possible. Change the mps_wait_command function to
check the current context and either use interrupt context
or poll if required by using the pause or DELAY function.
Added a check after waiting 50mSecs to see if the command
has timed out. This is only done if polliing, the msleep
command will automatically timeout if the command has taken
too long to complete.

---------------------------------------------------------------
Integrated RAID: Volume Activation Failed error message is
displayed though the volume has been activated.
---------------------------------------------------------------
Instead of failing an IOCTL request that does not have a
large enough buffer to hold the complete reply, copy as
much data from the reply as possible into the user's buffer
and log a message saying that the user's buffer was smaller
than the returned data.

---------------------------------------------------------------
mapping_add_new_device failure due to persistent table FULL
---------------------------------------------------------------
When a new device is added, if it is determined that the
device persistent table is being used and is full, instead
of displaying a message for this condition every time, only
log a message if the MPS_INFO bit is set in the debug_flags.

Submitted by:	LSI
MFC after:	1 week
2013-07-22 18:41:53 +00:00
Kenneth D. Merry
b01773b0f1 CAM and mps(4) driver scanning changes.
Add a PIM_NOSCAN flag to the CAM path inquiry CCB.  This tells CAM
not to perform a rescan on a bus when it is registered.

We now use this flag in the mps(4) driver.  Since it knows what
devices it has attached, it is more efficient for it to just issue
a target rescan on the targets that are attached.

Also, remove the private rescan thread from the mps(4) driver in
favor of the rescan thread already built into CAM.  Without this
change, but with the change above, the MPS scanner could run before
or during CAM's initial setup, which would cause duplicate device
reprobes and announcements.

sys/param.h:
	Bump __FreeBSD_version to 1000039 for the inclusion of the
	PIM_RESCAN CAM path inquiry flag.

sys/cam/cam_ccb.h:
sys/cam/cam_xpt.c:
	Added a PIM_NOSCAN flag.  If a SIM sets this in the path
	inquiry ccb, then CAM won't rescan the bus in
	xpt_bus_regsister.

sys/dev/mps/mps_sas.c
	For versions of FreeBSD that have the PIM_NOSCAN path
	inquiry flag, don't freeze the sim queue during scanning,
	because CAM won't be scanning this bus.  Instead, hold
	up the boot.  Don't call mpssas_rescan_target in
	mpssas_startup_decrement; it's redundant and I don't
	know why it was in there.

	Set PIM_NOSCAN in path inquiry CCBs.

	Remove methods related to the internal rescan daemon.

	Always use async events to trigger a probe for EEDP support.
	In older versions of FreeBSD where AC_ADVINFO_CHANGED is
	not available, use AC_FOUND_DEVICE and issue the
	necessary READ CAPACITY manually.

	Provide a path to xpt_register_async() so that we only
	receive events for our own SCSI domain.

	Improve error reporting in cases where setup for EEDP
	detection fails.

sys/dev/mps/mps_sas.h:
	Remove softc flags and data related to the scanner thread.

sys/dev/mps/mps_sas_lsi.c:
	Unconditionally rescan the target whenever a device is added.

Sponsored by:	Spectra Logic
MFC after:	1 week
2013-07-22 18:37:07 +00:00
Scott Long
1610f95c56 Overhaul error, information, and debug logging.
Obtained from:	Netflix
MFC after:	3 days
2013-07-19 00:12:41 +00:00
Christian Brueffer
1d50dd1f08 Fix a small memory leak in mpssas_get_sata_identify(). The change has been
submitted upstream as well.

Reviewed by:	ken, scottl
Obtained from:	DragonFly BSD (change df8658e030226dd015cff9749452666d8fe1e87b)
MFC after:	5 days
2012-07-18 09:06:07 +00:00
Kenneth D. Merry
be4aa869c1 Bring in LSI's latest mps(4) 6Gb SAS and WarpDrive driver, version
14.00.00.01-fbsd.

Their description of the changes is as follows:

1.	Copyright contents has been changed in all respective .c
	and .h files

2.	Support for WRITE12 and READ12 for direct-io (warpdrive only)
	has been added.

3.      Driver has added checks to see if Drive has READ_CAP_16
	support before sending it down to the device.
	If SPC3_SID_PROTECT flag is set in the inquiry data, the
	device supports protection information, and must support
	the 16 byte read capacity command, otherwise continue without
	sending read cap 16. This will optimize driver performance,
	since it will not send READ_CAP_16 to the drive which does
	not have support of READ_CAP_16.

4.      With new approach, "MPTIOCTL_RESET_ADAPTER" IOCTL will not
	use DELAY() which is busy loop implementation.
	It will use <msleep> (Better way to sleep without busy
	loop). Also from the HBA reset code path and some other
	places, DELAY() is replaced with msleep() or "pause()",
	which is based on sleep/wakeup style calls.  Driver use
	msleep()/pause() instead of DELAY based on CAN_SLEEP/NO_SLEEP
	flags to avoid busy loop which is not required all the
	time.e.a

	a. While driver is getting loaded, driver calls most of the
	   commands with NO_SLEEP.
	b. When Driver is functional and it needs Reinit of HBA,
	   CAN_SLEEP flag is used.

5.	<mpslsi> driver is not Endian safe. It will not work on Big
	Endian machines	like Sparc and PowerPC platforms because it
	assumes it is running on a Little Endian machine.

	Driver code is modified such way that it does not assume CPU
	arch is Little Endian.
	a. All places where Driver interacts from HBA to Host, it
	   converts Little Endian format to CPU format.
	b. All places where Driver interacts from Host to HBA, it
	   converts CPU format to Little Endian.

6.	Findout memory leaks in FreeBSD Driver and resolve those,
	such as memory leak in targ's luns creation/deletion.
	Also added additional checks to see memory allocation
	success/fail.

7.	Add loginfo prints as debug message, i.e. When FW sends any
	loginfo, Driver should print those as debug message.
	This will help for debugging purpose.

8.	There is possibility to get config request timeout. Current
	driver is able to detect config request timetout, but it does
	not do anything on config_request timeout.  Driver should
	call mps_reinit() if any request_poll (which is called as
	part of config_request) is time out.

9.	cdb length check is required for 32 byte CDB. Add correct mpi
	control value for 32 bit CDB as below while submitting SCSI IO
	Request to controller.
	mpi_control |= 4 << MPI2_SCSIIO_CONTROL_ADDCDBLEN_SHIFT;

10.	Check the actual status of Message unit reset
	(mps_message_unit_reset).Previously FreeBSD Driver just writes
	MPI2_FUNCTION_IOC_MESSAGE_UNIT_RESET and never check the ack
	(it just wait for 50 millisecond).  So, Driver now check the
	status of "MPI2_FUNCTION_IOC_MESSAGE_UNIT_RESET" after writing
	it to the FW.

	Now it also checking for whether doorbell ack uses msleep with
	proper sleep flags, instead of <DELAY>.

11.	Previously CAM does not detect Multi-Lun Devices. In order to
	detect Multi-Lun Devices by CAM the driver needs following change
	set:
	a. There is "max_lun" field which Driver need to set based on
	   hw/fw support. Currently LSI released driver does not set
	   this field.
	b. Default of "max_lun" should not be 0 in OS, but it is
	   currently set to 0 in CAM layer.
	c. Export max_lun capacity to 255

12.	Driver will not reset target info after port enable complete and
	also do Device removal when Device remove from FW.  The detail
	description is as follows
	a. When Driver receive WD PD add events, it will add all
	   information in driver local data structure.
	b. Only for WD, we have below checks after port enable
	   completes, where driver clear off all information retrieved
	   at #1.
	if ((sc->WD_available &&
             (sc->WD_hide_expose == MPS_WD_HIDE_ALWAYS)) ||
             (sc->WD_valid_config && (sc->WD_hide_expose ==
                            MPS_WD_HIDE_IF_VOLUME)) {
		  // clear off target data structure.
	}
	It is mainly not to attach PDs to OS.

	FreeBSD does bus rescan as older Parallel scsi style. So Driver
	needs to handle which Drive is visible to OS.  That is a reason
	we have to clear off targ information for PDs.

	Again, above logic was implemented long time ago. Similar concept
	we have for non-wd also. For that, LSI have introduced different
	logic to hide PDs.

	Eventually, because of above gap, when Phy goes offline, we
	observe below failure. That is what Driver is not doing complete
	removal of device with FW. (which was pointed by Scott)
	Apr  5 02:39:24 Freebsd7 kernel: mpslsi0: mpssas_prepare_remove
	Apr  5 02:39:24 Freebsd7 kernel: mpssas_prepare_remove 497 : invalid handle 0xe

	Now Driver will not reset target info after port enable complete
	and also will do Device removal when Device remove from FW.

13.	Returning "CAM_SEL_TIMEOUT" instead of "CAM_TID_INVALID"
	error code on request to the Target IDs that have no devices
	conected at that moment.  As if "CAM_TID_INVALID" error code
	is returned to the CAM Layaer then it results in a huge chain
	of errors in verbose kernel messages on boot and every
	hot-plug event.

Submitted by:	Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
MFC after:	3 days
2012-06-28 03:48:54 +00:00
Kenneth D. Merry
653c521f8d Bring in a number of mps(4) driver fixes from LSI:
1.  Fixed timeout specification for the msleep in mps_wait_command().
    Added 30 second timeout for mps_wait_command() calls in mps_user.c.

2.  Make sure we call mps_detach_user() from the kldunload path.

3.  Raid Hotplug behavior change.

    The driver now removes a volume when it goes to a failed state,
    so we also need to add volume back to the OS when it goes to
    opitimal/degraded/online from failed/missing.

    Handle raid volume add and remove from the IR_Volume event.
4.  Added some more debugging information.

5.  Replace xpt_async(AC_LOST_DEVICE, path, NULL) with
    mpssas_rescan_target().

    This is to work around a panic in CAM that shows up when adding a
    drive with a rescan and removing another device from the driver thread
    with an AC_LOST_DEVICE async notification.

    This problem was encountered in testing with the LSI sas2ircu utility,
    which was used to create a RAID volume from physical disks.  The driver
    has to create the RAID volume target and remove the physical disk
    targets, and triggered a panic in the process.

    The CAM issue needs to be fully diagnosed and fixed, but this works
    around the issue for now.

6.  Fix some memory initialization issues in mps_free_command().

7.  Resolve the "devq freeze forever" issue.  This was caused by the
    internal read capacity command issued in the non-head version of the
    driver.  When the command completed with an error, the driver wasn't
    unfreezing thd device queue.

    The version in head uses the CAM infrastructure for getting the read
    capacity information, and therefore doesn't have the same issue.

8.  Bump the version to 13.00.00.00-fbsd. (this is very close to LSI's
    internal stable driver 13.00.00.00)

Submitted by:	Kashyap Desai <Kashyap.Desai@lsi.com>
MFC after:	3 days
2012-02-09 00:16:12 +00:00
Kenneth D. Merry
d043c56453 Bring in the LSI-supported version of the mps(4) driver.
This involves significant changes to the mps(4) driver, but is not a
complete rewrite.

Some of the changes in this version of the driver:
 - Integrated RAID (IR) support.
 - Support for WarpDrive controllers.
 - Support for SCSI protection information (EEDP).
 - Support for TLR (Transport Level Retries), needed for tape drives.
 - Improved error recovery code.
 - ioctl interface compatible with LSI utilities.

mps.4:		Update the mps(4) driver man page somewhat for the driver
		changes.  The list of supported hardware still needs to be
		updated to reflect the full list of supported cards.

conf/files:	Add the new driver files.

mps/mpi/*:	Updated version of the MPI header files, with a BSD style
		copyright.

mps/*:		See above for a description of the new driver features.

modules/mps/Makefile:
		Add the new mps(4) driver files.

Submitted by:	Kashyap Desai <Kashyap.Desai@lsi.com>
Reviewed by:	ken
MFC after:	1 week
2012-01-26 18:17:21 +00:00