8 Commits

Author SHA1 Message Date
Mark Johnston
dee3e845aa Add support for getting and setting BBU properties related to battery
relearning. Specifically, add subcommands to mfiutil(8) which allow the
user to set the BBU and autolearn modes when the firmware supports it,
and add a subcommand which kicks off a battery relearn.

Reviewed by:	sbruno, rstone
Tested by:	sbruno
Approved by:	rstone (co-mentor)
MFC after:	2 weeks
Sponsored by:	Sandvine Incorporated
2013-04-08 17:46:45 +00:00
Steven Hartland
08c89430bd Fixes queuing issues where mfi_release_command blindly sets the cm_flags = 0
without first removing the command from the relavent queue.

This was causing panics in the queue functions which check to ensure a command
is not on another queue.

Fixed some cases where the error from mfi_mapcmd was lost and where the command
was never released / dequeued in error cases.

Ensure that all failures to mfi_mapcmd are logged.

Fixed possible null pointer exception in mfi_aen_setup if mfi_get_log_state
failed.

Fixed mfi_parse_entries & mfi_aen_setup not returning possible errors.

Corrected MFI_DUMP_CMDS calls with invalid vars SC vs sc.

Commands which have timed out now set cm_error to ETIMEDOUT and call
mfi_complete which prevents them getting stuck in the busy queue forever.

Fixed possible use of NULL pointer in mfi_tbolt_get_cmd.

Changed output formats to be more easily recognisable when debugging.

Optimised mfi_cmd_pool_tbolt cleanup.

Made information about driver limiting commands always display as for modern
cards this can be severe.

Fixed mfi_tbolt_alloc_cmd out of memory case which previously didnt return an
error.

Added malloc checks for request_desc_pool including free when subsiquent errors
are detected.

Fixed overflow error in SIMD reply descriptor check.

Fixed tbolt_cmd leak in mfi_build_and_issue_cmd if there's an error during IO
build.

Elimintated double checks on sc->mfi_aen_cm & sc->mfi_map_sync_cm in
mfi_shutdown.

Move local hdr calculation after error check in mfi_aen_complete.

Fixed wakeup on NULL in mfi_aen_complete.

Fixed mfi_aen_cm cleanup in mfi_process_fw_state_chg_isr not checking if it was
NULL.

Changed mfi_alloc_commands to error if bus_dmamap_create fails. Previously we
would try to continue with the number of allocated commands but lots of places
in the driver assume sc->mfi_max_fw_cmds is whats available so its unsafe to do
this without lots of changes.

Removed mfi_total_cmds as its no longer used due the above change.

Corrected mfi_tbolt_alloc_cmd to return ENOMEM where appropriate.

Fixed timeouts actually firing at double what they should.

Setting hw.mfi.max_cmds=-1 now configures to use the controller max.

A few style (9) fixes e.g. braced single line conditions and double blank lines

Cleaned up queuing macros

Removed invalid queuing tests for multiple queues

Trap and deal with errors when doing sends in mfi_data_cb

Refactored frame sending into one method with error checking of the return
code so we can ensure commands aren't left on the queue after error. This
ensures that mfi_mapcmd & mfi_data_cb leave the queue in a valid state.

Refactored how commands are cleaned up, mfi_release_command now ensures
that all queues and command state is maintained in a consistent state.

Prevent NULL pointer use in mfi_tbolt_complete_cmd

Fixed use of NULL sc->mfi_map_sync_cm in wakeup

Added defines to help with output of mfi_cmd and header flags.

Fixed mfi_tbolt_init_MFI_queue invalidating cm_index of the acquired mfi_cmd.

Reset now reinitialises sync map as well as AEN.

Fixed possible use of NULL pointer in mfi_build_and_issue_cmd

Fixed mfi_tbolt_init_MFI_queue call to mfi_process_fw_state_chg_isr causing
panic on failure.

Ensure that tbolt cards always initialise next_host_reply_index and
free_host_reply_index (based off mfi_max_fw_cmds) on both startup and
reset as per the linux driver.

Fixed mfi_tbolt_complete_cmd not acknowledging unknown commands so
it didn't clear the controller.

Prevent locks from being dropped and re-acquired in the following functions
which was allowing multiple threads to enter critical methods such as
mfi_tbolt_complete_cmd & mfi_process_fw_state_chg_isr:-
* mfi_tbolt_init_MFI_queue
* mfi_aen_complete / mfi_aen_register
* mfi_tbolt_sync_map_info
* mfi_get_log_state
* mfi_parse_entries

The locking for these functions was promoting to higher level methods. This
also fixed MFI_LINUX_SET_AEN_2 which was already acquiring the lock, so would
have paniced for recursive lock.

This also required changing malloc of ld_sync in mfi_tbolt_sync_map_info to
M_NOWAIT which can hence now fail but this was already expected as its return
was being tested.

Removed the assignment of cm_index in mfi_tbolt_init_MFI_queue which breaks
the world if the cmd returned by mfi_dequeue_free isn't the first cmd.

Fixed locking in mfi_data_cb, this is an async callback from bus_dmamap_load
which could hence be called after the caller has dropped the lock. If we
don't have the lock we aquire it and ensure we unlock before returning.

Fixed locking mfi_comms_init when mfi_dequeue_free fails.

Fixed mfi_build_and_issue_cmd not returning tbolt cmds aquired to the pool
on error.

Fixed mfi_abort not dropping the io lock when mfi_dequeue_free fails.

Added hw.mfi.polled_cmd_timeout sysctl that enables tuning of polled
timeouts. This shouldn't be reduced below 50 seconds as its used for
firmware patching which can take quite some time.

Added hw.mfi.fw_reset_test sysctl which is avaliable when compiled with
MFI_DEBUG and allows the testing of controller reset that was provoking a
large number of the issues encountered here.

Reviewed by:	Doug Ambrisko
Approved by:	pjd (mentor)
MFC after:	1 month
2013-02-27 02:21:10 +00:00
Doug Ambrisko
ddbffe7f70 First fix pr 167226:
ThunderBolt cannot read sector >= 2^32 or 2^21
with supplied patch.

Second the bigger change, fix RAID operation on ThunderBolt base
card such as physically removing a disk from a RAID and replacing
it.  The current situation is the RAID firmware effectively hangs
waiting for an acknowledgement from the driver.  This is due to
the firmware support of the driver actually accessing the RAID
from under the firmware.  This is an interesting feature that
the FreeBSD driver does not use.  However, when the firmare
detects the driver has attached it then expects the driver will
synchronize LD's with the firmware.  If the driver does not sync.
then the management part of the firmware will hang waiting for
it so a pulled driver will listed as still there.

The fix for this problem isn't extremely difficult.  However,
figuring out why some of the code was the way it was and then
redoing it was involved.  Not have a spec. made it harder to
try to figure out.  The existing driver would send a
MFI_DCMD_LD_MAP_GET_INFO command in write mode to acknowledge
a LD state change.  In read mode it gets the RAID map from the
firmware.  The FreeBSD driver doesn't do that currently.  It
could be added in the future with the appropriate structures.
To simplify things, get the current LD state and then build
the MFI_DCMD_LD_MAP_GET_INFO/write command so that it sends
an acknowledgement for each LD.  The map would probably state
which LD's changed so then the driver could probably just
acknowledge the LD's that changed versus all.  This doesn't seem
to be a problem.  When a MFI_DCMD_LD_MAP_GET_INFO/write command
is sent to the firmware, it will complete later when a change
to the LD's happen.  So it is very much like an AEN command
returning when something happened.  When the
MFI_DCMD_LD_MAP_GET_INFO/write command completes, we refire the
sync'ing of the LD state.  This needs to be done in as an event
so that MFI_DCMD_LD_GET_LIST can wait for that command to
complete before issuing the MFI_DCMD_LD_MAP_GET_INFO/write.
The prior code didn't use the call-back function and tried
to intercept the MFI_DCMD_LD_MAP_GET_INFO/write command when
processing an interrupt.  This added a bunch of code complexity
to the interrupt handler.  Using the call-back that is done
for other commands got rid of this need.  So the interrupt
handler is greatly simplified.  It seems that even commands
that shouldn't be acknowledged end up in the interrupt handler.
To deal with this, code was added to check to see if a command
is in the busy queue or not.  This might have contributed to the
interrupt storm happening without MSI enabled on these cards.

Note that MFI_DCMD_LD_MAP_GET_INFO/read returns right away.

It would be interesting to see what other complexity could
be removed from the ThunderBolt driver that really isn't
needed in our mode of operation.  Letting the RAID firmware
do all of the I/O to disks is a lot faster since it can
use its caches.  It greatly simplifies what the driver has
to do and potential bugs if the driver and firmware are
not in sync.

Simplify the aen_abort/cm_map_abort and put it in the softc
versus in the command structure.

This should get merged to 9 before the driver is merged to
8.

PR:		167226
Submitted by:	Petr Lampa
MFC after:	3 days
2012-05-04 16:00:39 +00:00
Doug Ambrisko
a6ba0fd64d MFhead_mfi r227068
First cut of new HW support from LSI and merge into FreeBSD.
	Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
	Style
MFhead_mfi r227579
	Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
	MSI support
MFhead_mfi r227612
	More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
	Improved timeout support from Scott.
MFhead_mfi r228108
	Make file.
MFhead_mfi r228208
	Fixed botched merge of Skinny support and enhanced handling
	in call back routine.
MFhead_mfi r228279
	Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
	Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
	Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
	Restore structure layout by reverting the array header to
	use [0] instead of [1].
MFhead_mfi r232412
	Put wildcard pattern later in the match table.
MFhead_mfi r232413
	Use lower case for hexadecimal numbers to match surrounding
	style.
MFhead_mfi r232414
	Add more Thunderbolt variants.
MFhead_mfi r232888
	Don't act on events prior to boot or when shutting down.
	Add hw.mfi.detect_jbod_change to enable or disable acting
	on JBOD type of disks being added on insert and removed on
	removing.  Switch hw.mfi.msi to 1 by default since it works
	better on newer cards.
MFhead_mfi r233016
	Release driver lock before taking Giant when deleting children.
	Use TAILQ_FOREACH_SAFE when items can be deleted.  Make code a
	little simplier to follow.  Fix a couple more style issues.
MFhead_mfi r233620
	Update mfi_spare/mfi_array with the actual number of elements
	for array_ref and pd.  Change these max. #define names to avoid
	name space collisions.  This will require an update to mfiutil
	It avoids mfiutil having to do a magic calculation.

	Add a note and #define to state that a "SYSTEM" disk is really
	what the firmware calls a "JBOD" drive.

Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
2012-03-30 23:05:48 +00:00
Konstantin Belousov
d185f187b4 The sys/sysctl.h header is needed when MFI_DEBUG is defined.
Nod from:	jhb
2011-11-16 18:42:39 +00:00
Scott Long
441f6d5dca - Add a command validator for use in debugging.
- Fix the locking protocol to eliminate races between normal I/O and AENs.
- Various small improvements and usability tweaks.

Sponsored by: IronPort
Portions Submitted by: Doug Ambrisko
2006-10-16 04:18:38 +00:00
Scott Long
420a5a0e25 Fix a bad #include statment 2006-09-27 04:54:23 +00:00
Scott Long
5ba21ff136 Add a command debugging module and a periodic watchdog timer.
Sponsored by: IronPort
2006-09-25 11:35:34 +00:00