freebsd-dev

Author	SHA1	Message	Date
Matt Jacob	162eef1f09	Fix some negotiation issues (like not being able to negotiate async)	2006-11-02 23:19:00 +00:00
Matt Jacob	bd3fd815a7	2nd and final commit that moves us to CAM_NEW_TRAN_CODE as the default. Reviewed by multitudes.	2006-11-02 00:54:38 +00:00
Matt Jacob	fa9ed86506	The first of 3 major steps to move the CAM layer forward to using the CAM_NEW_TRAN_CODE that has been in the tree for some years now. This first step consists solely of adding to or correcting CAM_NEW_TRAN_CODE pieces in the kernel source tree such that both a GENERIC (at least on i386) and a LINT build with CAM_NEW_TRAN_CODE as an option will compile correctly and run (at least with some of the h/w I have). After a short settle time, the other pieces (making CAM_NEW_TRAN_CODE the default and updating libcam and camcontrol) will be brought in. This will be an incompatible change in that the structures related to XPT_PATH_INQ and XPT_{GET,SET}_TRAN_SETTINGS change in both size and content. However, basic system operation and basic system utilities work well enough with this change. Reviewed by: freebsd-scsi and specific stakeholders	2006-10-31 05:53:29 +00:00
Matt Jacob	4542a3798e	Connect up a QUEUE FULL event with CAM and adjust openings. Unfortunately, the QUEUE FULL event only tells you Bus && Target. It doesn't tell you lun. In order for the XPT_REL_SIMQ action to work, we have to have a real lun. But which one? For now, just iterate over MPT_MAX_LUNS. Practically speaking, this is only going to be happening for lower quality SAS or SATA drives behind the SAS controller, which means only lun 0, so it's not so bad. Helpful Reminder Nagging from: John Baldwin, Fred Whiteside MFC after: 5 days	2006-09-21 20:35:12 +00:00
Matt Jacob	a7303be1a8	Create a 'ready' handler for each personality. The purpose of this handler is to be able to be called after all attach and enable events are done. We establish a SYSINIT hook to call this handler. The current usage for it is to add scsi target resources after all enables are done. There seem to be some dependencies between different halves of a dual-port with respect to target mode. Put in more meaningful event messages for some events- in particular QUEUE FULL events so we can see what the queue depth was when the IOC sent us this message. MFC after: 1 week	2006-09-07 23:08:21 +00:00
Matt Jacob	b2d24734cd	The poison pill of death: adding a target mode reply handler and target resources to a non-FC card killed us dead. Sorry for the breakage since last July 12.	2006-09-05 23:53:07 +00:00
Matt Jacob	1dad8bb0ba	When probing to attach the CAM functionality, check against desired role configuration instead of existing role. This gets us out of the mess where we configured a role of NONE (or were LAN only, for example), but didn't continue to attach the CAM module (because we had neither initiator nor target role set). Unfortunately, the code that rewrites NVRAM to match actual to desired role only works if the CAM module attaches. MFC after: 2 weeks	2006-07-25 00:59:54 +00:00
Matt Jacob	970043d7cd	Add sysctl information about things like WWNN/WWPN. MFC after: 2 weeks	2006-07-16 06:05:44 +00:00
Matt Jacob	6621d786eb	If we're in mpt_wait_req and the command times out, mark it as timed out. Don't try and free the config request for read_cfg_header that times out because it's still active. Put in code for the config reply handler that will then free up timed out requests. Fix the FC_PRIMITIVE_SEND completion to not try and free a command twice. Dunno how this possibly could have been working for awhile. MFC after: 2 weeks	2006-07-16 03:34:55 +00:00
Matt Jacob	73651fd1ef	If the card has target mode enabled, and we hang out ELS buffers but don't hang out commands, we hang folks on the SAN because the LSI-Logic f/w apparently sends back BUSY or QFULL or some darn thing. If we add command buffers, we have to respond to them sensibly even if we don't have any upstream listeners (scsi_targ or scsi_targ_bh), so put in some local command response stuff. MFC after: 2 weeks	2006-07-15 22:58:09 +00:00
Matt Jacob	b4c618c099	Fix config page writes to not strip out the attributes when you actually go write the config page. This fixes the long standing problem about updating NVRAM on Fibre Channel cards and seems so far to not break SPI config page writes. Put back role setting into mpt. That is, you can set a desired role for mpt as a hint. On the next reboot, it'll pick that up and redo the NVRAM settings appropriately and warn you that this won't take effect until the next reboot. This saves people the step of having to find a BIOS utilities disk to set target and/or initiator role for the MPT cards.	2006-07-12 07:48:50 +00:00
Matt Jacob	8ca0124685	VMWare ESX reports > 16 targets for the LSI-Logic U320 model it emulates. Then it crashes and burns when you probe that high.	2006-06-26 05:44:18 +00:00
Matt Jacob	9fe6d25444	Major Fixes: Don't enable/disable I/O space except for SAS adapters. This fixes a problem with VMware 4.5 Workstation. Fix an egregious bug introduced to target mode so it actually will not panic when you first enable a lun. Minor fixes: Take more info from port facts and configuration pages. MFC after: 1 week	2006-06-25 04:23:26 +00:00
Jung-uk Kim	672e707a61	Add ability to reset individual devices and fix SCSI speed negotiation. Reviewed by: mjacob (initial version)	2006-06-09 23:11:43 +00:00
Matt Jacob	fcd9a16b1f	Do some source && comment cleanup. Clean out the abortive start to homegrown, per-mpt, Domain Validation. This should really be done at a higher level. Use the PIM_SEQSCAN flag for U320- this seems to correct cases of being unable to consistently negotiate U320 in the cases where I'd seen this before. Between this and other recent checkins, this driver is pretty close to being ready for MFC. Reviewed by: scottl, ken, scsi@ MFC after: 1 week	2006-06-05 22:25:49 +00:00
Matt Jacob	5580ce963e	More checkpointing on the way toward really (finally) fixing speed negotiation. Also fix the mpt_execute_req function to actually match mpt_execute_req_a64. This may explain why i386 users were having more grief.	2006-06-02 18:50:39 +00:00
Matt Jacob	ec5fe39d39	Add acknowledgements to LSI-Logic for support	2006-05-29 20:34:28 +00:00
Matt Jacob	800d362b5d	+ Change some debug messages to MPT_PRT_NEGOTIATE level (so we can see the results of SPI negotiation w/o being overwhelmed with other crap). + For U320 devices, check against both Settings and DV flags before deciding whether we need to skip actual SPI settings for a device. + Go back to creating a 'physical disk' side of a raid/passthru bus that is limited to the number of maximum physical disks. Actually, this isn't probably quite right yet for one RAID volume, and if we ever end up with finding a device that supports more than one RAID volume (not likely), it probably won't quite be right either. The problem here is that the creating of this 'physical' passthru sim is just a cheap way to leverage off the CAM midlayer to do our negotiation for us on the subentities that make up a RAID volume. It almost causes more trouble than it is worth because we have to remember which side we're talking to in terms of forming commands and which target ids are real and so on. Bleah. + Skip trying to actually do SPI settings for the RAID volumes on the real side of the raid/passthru bus pair- this just confuses the issue. The underlying real physical devices will have the negotiation performed and the Raid volume will inherit the resultant settings. At the same time, non-RAID devices can be on the same real bus, so do perform negotiations with them. + At the end of doing all of the settings twiddling, ahem, remember to go update the settings on the card itself (dunno how this got nuked). At this point, negotiations seem to be being done (again) correctly for both RAID volumes and their subentities. And they seem to be mostly now right for other non-RAID entities on the same bus (I ended up with 3 out of 8 other disks still at narrow/async- haven't the slightest idea why yet). Finally, negotiations on a normal bus seem to work (again). There's still more work coming into this area, but we're in the final stretch.	2006-05-29 20:30:40 +00:00
Matt Jacob	1d79ca0e46	Work in progress toward fixing IM checked in after having lost one set to a peninsula power failure last night. After this, I can see both submembers and the raid volumes again, but speed negotiation is still broken. Add a mpt_raid_free_mem function to centralize the resource reclaim and fix a small memory leak. Remove restriction on number of targets for systems with IM enabled- you can have setups that have both IM volumes as well as other devices. Fix target id selection for passthru and nonpassthru cases. Move complete command dump to MPT_PRT_DEBUG1 level so that just setting debug level gets mostly informative albeit less verbose dumping.	2006-05-27 17:26:57 +00:00
Matt Jacob	a3116b5a27	Get most of the way back to having Integrated Mirroring work again- the addition of target mode support broke it massively.	2006-05-26 05:54:21 +00:00
Matt Jacob	f69149626c	Remove MPT_PRT_INVARIANT- it was a silly idea.	2006-05-04 02:34:18 +00:00
Matt Jacob	54302f8e50	Change some order of the way we do some target mode ops. Found by Coverity.	2006-04-21 18:31:21 +00:00
Matt Jacob	2901a7b7d4	In receiving a new ATIO, don't record the associated CCB in the target state structure. This field is only for CCBs that are associated with actions that are occurring on the HBA (i.e., XPT_CONT_IO actions). This way we also don't get confused when the upstream listener stalls try and look at a CCB which has already been freed (by CAM).	2006-04-18 21:52:00 +00:00
Matt Jacob	5089bd63bd	A large set of changes: + Add boatloads of KASSERTs and really check out more locking issues (to catch recursions when we actually go to real locking in CAM soon). The KASSERTs also caught lots of other issues like using commands that were put back on free lists, etc. + Target mode: role setting is derived directly from port capabilities. There is no need to set a role any more. Some target mode resources are allocated early on (ELS), but target command buffer allocation is deferred until the first lun enable. + Fix some breakages I introduced with target mode in that some commands are repeating commands. That is, the reply shows up but the command isn't really done (we don't free it). We still need to take it off the pending list because when we resubmit it, bad things then happen. + Fix more of the way that timed out commands and bus reset is done. The actual TMF response code was being ignored. + For SPI, honor BIOS settings. This doesn't quite fix the problems we've seen where we can't seem to (re)negotiate U320 on all drives but avoids it instead by letting us honor the BIOS settings. I'm sure this is not quite right and will have to change again soon.	2006-04-11 16:47:30 +00:00
Matt Jacob	5e073106d5	Fix some of the previous changes 'better'. There's something strange going on with async events. They seem to be treated differently for different Fusion implementations. Some will really tell you when it's okay to free the request that started them. Some won't. Very disconcerting. This is particularly bad when the chip (FC in this case) tells you in the reply that it's not a continuation reply, which means you can free the request that it's associated with. However, if you do that, I've found that additional async event replies come back for that message context after you freed it. Very Bad Things Happen. Put in a reply register debounce. Warn about out of range context indices. Use more MPILIB defines where possible. Replace bzero with memset. Add tons more KASSERTS. Do a lot more request free list auditing and serial number usages. Get rid of the warning about the short IOC Facts Reply. Go back to 16 bits of context index. Do a lot more target state auditing as well. Make a tag out of not only the ioindex but the request index as well and worry less about keeping a full serial number.	2006-04-01 07:12:18 +00:00
Matt Jacob	c87e3f833c	Some fairly major changes to this driver. A) Fibre Channel Target Mode support mostly works (SAS/SPI won't be too far behind). I'd say that this probably works just about as well as isp(4) does right now. Still, it and isp(4) and the whole target mode stack need a bit of tightening. B) The startup sequence has been changed so that after all attaches are done, a set of enable functions are called. The idea here is that the attaches do whatever needs to be done prior to a port being enabled and the enables do what needs to be done for enabling stuff for a port after it's been enabled. This means that we also have events handled by their proper handlers as we start up. C) Conditional code that means that this driver goes back all the way to RELENG_4 in terms of support. D) Quite a lot of little nitty bug fixes- some discovered by doing RELENG_4 support. We've been living under Giant waaaayyyyy too long and it's made some of us (me) sloppy. E) Some shutdown hook stuff that makes sure we don't blow up during a reboot (like by the arrival of a new command from an initiator). There's been some testing and LINT checking, but not as complete as would be liked. Regression testing with Fusion RAID instances has not been possible. Caveat Emptor. Sponsored by: LSI-Logic.	2006-03-25 07:08:27 +00:00
Matt Jacob	a4ca1e0bb0	If we actually succeed in the Task Management Function where we are aborting timed out commands, pull the request off the TAILQ.	2006-03-17 04:54:06 +00:00
Matt Jacob	7a49a0d1fb	Add a serial number for requests so we don't just depend on a request pointer to try and do forensics on what has occurred.	2006-03-07 17:56:40 +00:00
Matt Jacob	29ae59edff	Fix mpt_reset to try mpt_hard_reset more than once, and to try mpt_soft_reset more than once. And to wait for MPT_DB_STATE_READY twice. I mean, this is crucial- give the IOC a chance to get ready. If mpt_reset is called to reinit things, and we succeed, make sure to re-enable interrupts. This is what has mostly led to system lockup after having to hard reset the chip. Also, if we think that interrupts aren't functioning in mpt_cam_timeout, for goodness sake, turn them on again. In read_cfg_header, return distinguishing errnos so the caller can decide what's an error. It's not an error to fail to read a RAID page from a non-RAID capable device like the FC929X. Some whitespace fixes (removing spaces from ends of lines).	2006-02-28 07:44:50 +00:00
Matt Jacob	6a9fa0152c	Remove the ill-considered effect of using the type definitions as distributed by LSI-Logic. For FreeBSD, just use the posix defines instead of trying to figure out how wide an int is. Apologies to all.	2006-02-26 22:50:14 +00:00
Matt Jacob	8b14319c98	Shorten the time for waiting for TMF commands to complete- let's not hang the system for 5 seconds. If a TMF doesn't complete within, oh, say 500ms, that's enough. Put in a printout to catch mpt_recover_commands being activated with no commands.	2006-02-26 07:46:09 +00:00
Matt Jacob	0b80d21bdf	Roll a microrev of the MPI Library in preparation for target mode work. Make my portions of the license clearer. Thank Chris Ellsworth for his support in getting a bunch of this done.	2006-02-25 07:45:54 +00:00
Matt Jacob	696e0ce44d	Remove commented out qualifier to dumping a message.	2006-02-22 05:19:50 +00:00
Matt Jacob	444dd2b669	Do initial cut of SAS HBA support. These controllers (106X) seem to support automatically both SATA and SAS drives. The async SAS event handling we catch but ignore at present (so automagic attach/detach isn't hooked up yet). Do 64 bit PCI support- we can now work on systems with > 4GB of memory. Do large transfer support- we now can support up to reported chain depth, or the length of our request area. We simply allocate additional request elements when we would run out of room for chain lists. Tested on Ultra320, FC and SAS controllers on AMD64 and i386 platforms. There were no RAID cards available for me to regression test. The error recovery for this driver still is pretty bad.	2006-02-11 01:35:29 +00:00
Ruslan Ermilov	f4e9888107	Fix -Wundef.	2005-12-04 02:12:43 +00:00
Justin T. Gibbs	286e947fee	Correct attribution in clause three to address the correct copyright holders. The license that was approved for my changes to this driver originally came from LSI, but the changes to the driver core are not owned by LSI. MFC: 1 day	2005-08-03 14:08:41 +00:00
Scott Long	b0a2fdee0d	Massive overhaul of MPT Fusion driver: o Add timeout error recovery (from a thread context to avoid the deferral of other critical interrupts). o Properly recover commands across controller reset events. o Update the driver to handle events and status codes that have been added to the MPI spec since the driver was originally written. o Make the driver more modular to improve maintainability and support dynamic "personality" registration (e.g. SCSI Initiator, RAID, SAS, FC, etc). o Shorten and simplify the common I/O path to improve driver performance. o Add RAID volume and RAID member state/settings reporting. o Add periodic volume resynchronization status reporting. o Add support for sysctl tunable resync rate, member write cache enable, and volume transaction queue depth. Sponsored by ---------------- Avid Technologies Inc: SCSI error recovery, driver re-organization, update of MPI library headers, portions of dynamic personality registration, and misc bug fixes. Wheel Open Technologies: RAID event notification, RAID member pass-thru support, firmware upload/download support, enhanced RAID resync speed, portions of dynamic personality registration, and misc bug fixes. Detailed Changes ================ mpt.c mpt_cam.c mpt_raid.c mpt_pci.c: o Add support for personality modules. Each module exports load, and unload module scope methods as well as probe, attach, event, reset, shutdown, and detach per-device instance methods mpt.c mpt.h mpt_pci.c: o The driver now associates a callback function (via an index) with every transaction submitted to the controller. This allows the main interrupt handler to absolve itself of any knowledge of individual transaction/response types by simply calling the callback function "registered" for the transaction. We use a callback index instead of a callback function pointer in each requests so we can properly handle responses (e.g. event notifications) that are not associated with a transaction. 
Personality modules dynamically register their callbacks with the driver core to receive the callback index to use for their handlers. o Move the interrupt handler into mpt.c. The ISR algorithm is bus transport and OS independent and thus had no reason to be in mpt_pci.c. o Simplify configuration message reply handling by copying reply frame data for the requester and storing completion status in the original request structure. o Add the mpt_complete_request_chain() helper method and use it to implement reset handlers that must abort transactions. o Keep track of all pending requests on the new requests_pending_list in the softc. o Add default handlers to mpt.c to handle generic event notifications and controller reset activities. The event handler code is largely the same as in the original driver. The reset handler is new and terminates any pending transactions with a status code indicating the controller needs to be re-initialized. o Add some endian support to the driver. A complete audit is still required for this driver to have any hope of operating in a big-endian environment. o Use inttypes.h and __inline. Come closer to being style(9) compliant. o Remove extraneous use of typedefs. o Convert request state from a strict enumeration to a series of flags. This allows us to, for example, tag transactions that have timed-out while retaining the state that the transaction is still in-flight on the controller. o Add mpt_wait_req() which allows a caller to poll or sleep for the completion of a request. Use this to simplify and factor code out from many initialization routines. We also use this to sleep for task management request completions in our CAM timeout handler. mpt.c: o Correct a bug in the event handler where request structures were freed even if the request reply was marked as a continuation reply. Continuation replies indicate that the controller still owns the request and freeing these replies prematurely corrupted controller state. 
o Implement firmware upload and download. On controllers that do not have dedicated NVRAM (as in the Sun v20/v40z), the firmware image is downloaded to the controller by the system BIOS. This image occupies precious controller RAM space until the host driver fetches the image, reducing the number of concurrent I/Os the controller can process. The uploaded image is used to re-program the controller during hard reset events since the controller cannot fetch the firmware on its own. Implementing this feature allows much higher queue depths when RAID volumes are configured. o Changed configuration page accessors to allow threads to sleep rather than busy wait for completion. o Removed hard coded data transfer sizes from configuration page routines so that RAID configuration page processing is possible. mpt_reg.h: o Move controller register definitions into a separate file. mpt.h: o Re-arrange includes to allow inlined functions to be defined in mpt.h. o Add reply, event, and reset handler definitions. o Add softc fields for handling timeout and controller reset recovery. mpt_cam.c: o Move mpt_freebsd.c to mpt_cam.c. Move all core functionality, such as event handling, into mpt.c leaving only CAM SCSI support here. o Revamp completion handler to provide correct CAM status for all currently defined SCSI MPI message result codes. o Register event and reset handlers with the MPT core. Modify the event handler to notify CAM of bus reset events. The controller reset handler will abort any transactions that have timed out. All other pending CAM transactions are correctly aborted by the core driver's reset handler. o Allocate a single request up front to perform task management operations. This guarantees that we can always perform a TMF operation even when the controller is saturated with other operations. 
The single request also serves as a perfect mechanism of guaranteeing that only a single TMF is in flight at a time - something that is required according to the MPT Fusion documentation. o Add a helper function for issuing task management requests to the controller. This is used to abort individual requests or perform a bus reset. o Modify the CAM XPT_BUS_RESET ccb handler to wait for and properly handle the status of the bus reset task management frame used to reset the bus. The previous code assumed that the reset request would always succeed. o Add timeout recovery support. When a timeout occurs, the timed-out request is added to a queue to be processed by our recovery thread and the thread is woken up. The recovery thread processes timed-out commands serially, attempting first to abort them and then falling back to a bus reset if an abort fails. o Add calls to mpt_reset() to reset the controller if any handshake command, bus reset attempt or abort attempt fails due to a timeout. o Export a secondary "bus" to CAM that exposes all volume drive members as pass-thru devices, allowing CAM to perform proper speed negotiation to hidden devices. o Add a CAM async event handler tracking the AC_FOUND_DEVICE event. Use this to trigger calls to set the per-volume queue depth once the volume is fully registered with CAM. This is required to avoid hitting firmware limits on volume queue depth. Exceeding the limit causes the firmware to hang. mpt_cam.h: o Add several helper functions for interfacing to CAM and performing timeout recovery. mpt_pci.c: o Disable interrupts on the controller before registering and enabling interrupt delivery to the OS. Otherwise we risk receiving interrupts before the driver is ready to receive them. o Make use of compatibility macros that allow the driver to be compiled under 4.x and 5.x. mpt_raid.c: o Add a per-controller instance RAID thread to perform settings changes and query status (minimizes CPU busy wait loops). 
o Use a shutdown handler to disable "Member Write Cache Enable" (MWCE) setting for RAID arrays set to enable MWCE During Rebuild. o Change reply handler function signature to allow handlers to defer the deletion of reply frames. Use this to allow the event reply handler to queue up events that need to be acked if no resources are available to immediately ack an event. Queued events are processed in mpt_free_request() where resources are freed. This avoids a panic on resource shortage. o Parse and print out RAID controller capabilities during driver probe. o Define, allocate, and maintain RAID data structures for volumes, hidden member physical disks and spare disks. o Add dynamic sysctls for per-instance setting of the log level, array resync rate, array member cache enable, and volume queue depth. mpt_debug.c: o Add mpt_lprt and mpt_lprtc for printing diagnostics conditioned on a particular log level to aid in tracking down driver issues. o Add mpt_decode_value() which parses the bits in an integer value based on a parsing table (mask, value, name string, tuples). mpilib/*: o Update mpi library header files to latest distribution from LSI. Submitted by: gibbs Approved by: re	2005-07-10 15:05:39 +00:00

37 Commits