r276839, r276842, r277513, r277514, r277515
------------------------------------------------------------------------
r276839 | ken | 2015-01-08 10:41:28 -0700 (Thu, 08 Jan 2015) | 49 lines
Fix Fibre Channel Command Reference Number handling in the isp(4) driver.
The Command Reference Number is used for precise delivery of
commands, and is part of the FC-Tape functionality set. (This is
only enabled for devices that support precise delivery of commands.)
It is an 8-bit unsigned number that increments from 1 to 255. The
commands sent by the initiator must be processed by the target in
CRN order if the CRN is non-zero.
There are certain scenarios where the Command Reference Number
sequence needs to be reset. When the target is power cycled, for
instance, the initiator needs to reset the CRN to 1. The initiator
will know this because it will see a LIP (when directly connected)
or get a logout/login event (when connected to a switch).
The isp(4) driver was not resetting the CRN when a target
went away and came back. When it saw the target again after a
power cycle, it would continue the CRN sequence where it left off.
The target would ignore the command because the CRN sequence is
supposed to be reset to 1 after a power cycle or other similar
event.
The symptom that the user would see is that there would be lots of
aborted INQUIRY commands after a tape library was power cycled, and
the library would fail to probe. The INQUIRY commands were being
ignored by the tape drive due to the CRN issue mentioned above.
isp_freebsd.c:
Add a new function, isp_fcp_reset_crn(). This will reset
all of the CRNs for a given port, or the CRNs for all LUNs
on a target.
Reset the CRNs for all targets on a port when we get a LIP,
loop reset, or loop down event.
Reset the CRN for a particular target when it arrives, is changed
or departs. This is less precise behavior than the
clearing behavior specified in the FCP-4 spec (which says
that it should be reset for PRLI, PRLO, PLOGI and LOGO),
but this is the level of information we have here. If this
is insufficient, then we will need to add more precise
notification from the lower level isp(4) code.
isp_freebsd.h:
Add a prototype for isp_fcp_reset_crn().
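The CRN behavior described above can be sketched roughly as follows. This is
an illustration only, not the code in isp_freebsd.c: struct crn_state,
crn_next(), crn_reset() and SKETCH_MAX_LUNS are hypothetical names invented
for the example.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_LUNS 8	/* hypothetical table size */

/* Hypothetical per-target CRN state; not the driver's real structure. */
struct crn_state {
	uint8_t next_crn[SKETCH_MAX_LUNS];	/* 0 means "restart at 1" */
};

/* Return the CRN for the next command on a LUN: 1..255, never 0. */
static uint8_t
crn_next(struct crn_state *st, unsigned lun)
{
	uint8_t crn = st->next_crn[lun];

	if (crn == 0)
		crn = 1;
	/* Wrap from 255 back to 1, skipping 0. */
	st->next_crn[lun] = (crn == 255) ? 1 : (uint8_t)(crn + 1);
	return (crn);
}

/* Forget the sequence for every LUN, e.g. after a LIP or logout/login. */
static void
crn_reset(struct crn_state *st)
{
	memset(st->next_crn, 0, sizeof(st->next_crn));
}
```

After crn_reset(), the next command on any LUN of that target carries CRN 1,
which is what a target expects after it has been power cycled.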
Sponsored by: Spectra Logic
MFC after: 1 week
------------------------------------------------------------------------
r276842 | ken | 2015-01-08 10:51:12 -0700 (Thu, 08 Jan 2015) | 44 lines
Close a race in the isp(4) driver that caused devices to disappear
and not automatically come back if they were gone for a short
period of time.
The isp(4) driver has a 30 second gone device timer that gets
activated whenever a device goes away. If the device comes back
before the timer expires, we don't send a notification to CAM that
it has gone away. If, however, there is a command sent to the
device while it is gone and before it comes back, the isp(4) driver
sends the command back with CAM_SEL_TIMEOUT status.
CAM responds to the CAM_SEL_TIMEOUT status by removing the device.
In the case where a device comes back within the 30 second gone
device timer window, though, we weren't telling CAM the device
came back.
So, fix this by tracking whether we have told CAM the device is
gone, and if we have, send a rescan if it comes back within the 30
second window.
ispvar.h:
In the fcportdb_t structure, add a new bitfield,
reported_gone. This gets set whenever we return a command
with CAM_SEL_TIMEOUT status on a Fibre Channel device.
isp_freebsd.c:
In isp_done(), if we're sending CAM_SEL_TIMEOUT for a
command sent to a FC device, set the reported_gone bit.
In isp_async(), in the ISPASYNC_DEV_STAYED case, rescan the
device in question if it is mapped to a target ID and has
been reported gone.
In isp_make_here(), take a port database entry argument,
and clear the reported_gone bit when we send a rescan to
CAM.
In isp_make_gone(), take a port database entry as an
argument, and set the reported_gone bit when we send an
async event telling CAM consumers that the device is gone.
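The reported_gone bookkeeping can be sketched as a small state machine. The
structure and function names below are hypothetical and much simpler than
fcportdb_t and the real isp_done()/isp_async() paths. Only the reported_gone
name comes from the change itself.

```c
#include <stdbool.h>

/* Hypothetical, simplified port database entry. */
struct portdb_sketch {
	bool mapped;		/* entry is mapped to a CAM target ID */
	bool reported_gone;	/* CAM has been told the device is gone */
};

/* A command completed with selection timeout while the device was away. */
static void
sketch_cmd_sel_timeout(struct portdb_sketch *p)
{
	p->reported_gone = true;
}

/*
 * The device came back before the gone-device timer fired.
 * Returns true when a rescan should be sent to CAM.
 */
static bool
sketch_dev_stayed(struct portdb_sketch *p)
{
	if (p->mapped && p->reported_gone) {
		p->reported_gone = false;	/* cleared when the rescan is sent */
		return (true);
	}
	return (false);
}
```

A device that returns without any command having timed out needs no rescan,
so sketch_dev_stayed() returns false in that case.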
Sponsored by: Spectra Logic
MFC after: 1 week
------------------------------------------------------------------------
r277514 | will | 2015-01-21 13:27:11 -0700 (Wed, 21 Jan 2015) | 18 lines
Force commit to record the correct log for r277513.
If the user sends an XPT_RESET_DEV CCB, make sure to reset the
Fibre Channel Command Reference Number if we're running on a FC
controller.
We send a SCSI Target Reset when we get this CCB, and as a result
need to reset the CRN to 1 on the next command.
isp_freebsd.c:
In the XPT_RESET_DEV implementation in isp_action(), reset
the CRN if we're on a FC controller.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112787 on 2015/01/15
------------------------------------------------------------------------
r277515 | will | 2015-01-21 13:32:36 -0700 (Wed, 21 Jan 2015) | 25 lines
Fix SCSI status byte reporting on 4Gb and 8Gb Qlogic boards.
The newer boards don't have the response field that indicates
whether the SCSI status byte is present. You have to just look to
see whether it is non-zero.
The code was looking to see whether the sense length was valid
before propagating the SCSI status byte (and sense information) up
the stack. With a status like Reservation Conflict, there is no
sense information, only the SCSI status byte. So it wasn't getting
correctly returned.
isp.c:
In isp_intr(), if we are on a 2400 or 2500 type board and
get a response, look at the actual contents of the
SCSI status value and set the RQSF_GOT_STATUS flag
accordingly so that we return any SCSI status value we get. The
RQSF_GOT_SENSE flag will get set later on if there is
actual sense information returned.
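A minimal sketch of the flag logic, assuming made-up flag values and a
made-up helper name (the real RQSF_* definitions and the isp_intr() code are
not reproduced here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values standing in for RQSF_GOT_STATUS/RQSF_GOT_SENSE. */
#define SK_GOT_STATUS	0x01u
#define SK_GOT_SENSE	0x02u

/*
 * On newer (2400/2500 type) boards, derive the flags from what the
 * response actually contains; on older boards trust the firmware flags.
 */
static unsigned
sk_response_flags(bool new_board, unsigned fw_flags, uint8_t scsi_status,
    uint32_t sense_len)
{
	unsigned flags = 0;

	if (!new_board)
		return (fw_flags);
	if (scsi_status != 0)
		flags |= SK_GOT_STATUS;	/* status byte alone is enough */
	if (sense_len != 0)
		flags |= SK_GOT_SENSE;	/* set only when sense data exists */
	return (flags);
}
```

With a Reservation Conflict status (0x18) and no sense data, the sketch
reports the status flag alone, which is the case the old code dropped.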
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112791 on 2015/01/15
------------------------------------------------------------------------
Sponsored by: Spectra Logic
Make isp_find_pdb_by_*() search for targets in portdb in reverse order.
Records with target_mode == 1 are allocated from the end of portdb, so it
makes sense to start the search from the end rather than traverse the whole
array.
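A reverse scan of this kind might look like the sketch below. The structure
and function names are hypothetical. Only the target_mode field and the idea
of searching from the end come from the change description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified port database record. */
struct pdb_sketch {
	uint8_t target_mode;	/* 1 for target-mode records */
	uint32_t portid;
};

/* Search from the last slot down to slot 0; return NULL if not found. */
static struct pdb_sketch *
sk_find_by_portid(struct pdb_sketch *db, size_t n, uint32_t portid)
{
	for (size_t i = n; i-- > 0; ) {
		if (db[i].target_mode == 1 && db[i].portid == portid)
			return (&db[i]);
	}
	return (NULL);
}
```

The `i-- > 0` idiom keeps the index unsigned while still visiting slot 0.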
Pass correct command that should be aborted to ISPCTL_ABORT_CMD.
This makes XPT_ABORT work for me on the initiator side of isp(4).
Previous code was trying to abort the XPT_ABORT itself and failed.
Update isp_tgt_map and send a new arrival notification if a target that
departed earlier has returned. Previously that code worked only once,
confusing CTL.
Harvest one no longer used constant string.
Remove another and place it into play in the
normally ifdef protected zone it would be used
in.
Noticed by: dim
Fix I/O freezes in some cases, caused by r257916.
Since the isp_reqodx update is delayed, we should be ready to update it
every time we read it. Otherwise requests using several indexes may be
requeued indefinitely without ever updating the variable.
locking support for CAM
r256826:
Fix several target mode SIMs to not blindly clear ccb_h.flags field of
ATIO CCBs. Not all CCB flags there belong to them.
r256836:
Remove hard limit on number of BIOs handled with one ATA TRIM request.
r256843:
Merge CAM locking changes from the projects/camlock branch to radically
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
r256888:
Unconditionally acquire periph reference on CCB allocation failure.
r256895:
Fix memory and references leak due to unfreed path.
r256960:
Move CAM_UNQUEUED_INDEX setting to the last moment and under the periph lock.
This fixes race condition with cam_periph_ccbwait(), causing use-after-free.
r256975:
Minor (mostly cosmetic) addition to r256960.
r257054:
Some microoptimizations for da and ada drivers:
- Replace ordered_tag_count counter with single flag;
- From da remove outstanding_cmds counter, duplicating pending_ccbs list;
- From da_softc remove unused links field.
r257482:
Fix lock recursion, triggered by `smartctl -a /dev/adaX`.
r257501:
Make getenv_*() functions and respectively TUNABLE_*_FETCH() macros not
allocate memory and so not require sleepable environment. getenv() has
already used on-stack temporary storage, so just use it more rationally.
getenv_string() receives buffer as argument, so don't need another one.
r257914:
Some CAM locks polishing:
- Fix LOR and possible lock recursion when handling high-power commands.
Introduce new lock to protect left power quota and list of frozen devices.
- Correct locking around xpt periph creation.
- Remove the apparently never used XPT_FLAG_OPEN xpt periph flag.
Again, Netflix assisted with testing the merge, but all of the credit goes
to Alexander and iX Systems.
Submitted by: mav
Sponsored by: iX Systems
Use relaxed (write-only) memory barriers when writing some of the queue index
registers (for now on ISP2400+). We never read those registers back and
AFAIK their semantics do not require any immediate reaction on write.
Some more registers access optimizations:
- Process ATIO queue only if interrupt status tells so;
- Do not update queue out pointers after each processed command, do it
only once at the end of the loop.
Save one more register read per command by not reading the rqstoutrp register
every time. The purpose of that register is to detect the unlikely case of
output queue overflow, so read it only when its last known (and probably
stale now) value signals overflow.
Optimize isp(4) to reduce CPU usage, especially in target mode:
- Remove two excessive and slow register reads from isp_intr(). Instead
of rereading value every time, assume that registers contain what we have
written there.
- Avoid sequential search through 4096 array elements when looking for
command tag. Use hash of lists to store active tags separately from free
ones and so greatly speed up the searches.
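The "hash of lists" idea can be sketched as below. This is an assumed,
scaled-down layout (16 entries, 4 buckets) with hypothetical names, not the
driver's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_NTAGS	16	/* hypothetical pool size (the log mentions 4096) */
#define SK_NBUCKETS	4	/* hypothetical bucket count, a power of two */

struct sk_tag {
	uint32_t tag;
	struct sk_tag *next;
};

struct sk_tagpool {
	struct sk_tag pool[SK_NTAGS];
	struct sk_tag *free;			/* unused entries */
	struct sk_tag *active[SK_NBUCKETS];	/* in-use entries, hashed by tag */
};

static void
sk_tag_init(struct sk_tagpool *tp)
{
	tp->free = NULL;
	for (size_t i = 0; i < SK_NBUCKETS; i++)
		tp->active[i] = NULL;
	for (size_t i = 0; i < SK_NTAGS; i++) {
		tp->pool[i].next = tp->free;
		tp->free = &tp->pool[i];
	}
}

/* Take an entry off the free list and file it under its hash bucket. */
static struct sk_tag *
sk_tag_alloc(struct sk_tagpool *tp, uint32_t tag)
{
	struct sk_tag *t = tp->free;

	if (t == NULL)
		return (NULL);
	tp->free = t->next;
	t->tag = tag;
	t->next = tp->active[tag & (SK_NBUCKETS - 1)];
	tp->active[tag & (SK_NBUCKETS - 1)] = t;
	return (t);
}

/* Walk only one bucket instead of the whole array. */
static struct sk_tag *
sk_tag_find(struct sk_tagpool *tp, uint32_t tag)
{
	struct sk_tag *t = tp->active[tag & (SK_NBUCKETS - 1)];

	while (t != NULL && t->tag != tag)
		t = t->next;
	return (t);
}

/* Unlink an active entry from its bucket and return it to the free list. */
static void
sk_tag_release(struct sk_tagpool *tp, struct sk_tag *t)
{
	struct sk_tag **pp = &tp->active[t->tag & (SK_NBUCKETS - 1)];

	while (*pp != NULL && *pp != t)
		pp = &(*pp)->next;
	if (*pp == NULL)
		return;		/* not active; nothing to do */
	*pp = t->next;
	t->next = tp->free;
	tp->free = t;
}
```

Lookups touch only the entries that share a bucket with the tag, and free
entries are never scanned at all.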
driver.
This tells consumers up the stack the maximum I/O size that the
controller can handle.
The I/O size is bounded by the number of scatter/gather segments
the controller can handle and the page size. For an amd64 system,
it works out to around 5MB.
Reviewed by: mjacob
MFC after: 3 days
Sponsored by: Spectra Logic
command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.
This change fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.
Submitted by: jhb
Reviewed by: jfv, marius, achadd, achim
MFC after: 1 day
a mailbox command and which registers to copy back in when
the command completes, the bits being set need to not only
specify what bits you want to add from the default from the
table but also what bits you want to *subtract* (mask) from the
default from the table.
A failing ISP2200 command pointed this out.
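As a rough illustration of the add-and-mask idea, assuming a hypothetical
helper and assuming that a masked bit wins over an added one (the change
description does not specify the actual order used in isp.c):

```c
#include <stdint.h>

/*
 * Combine the per-command default register bitmap from the table with
 * caller-requested additions, then drop any bits the caller masks out.
 * All three parameter names are hypothetical.
 */
static uint32_t
sk_effective_bits(uint32_t table_default, uint32_t add, uint32_t mask)
{
	return ((table_default | add) & ~mask);
}
```

For example, a table default of 0x0f with register 6 added (0x40) and
register 3 masked (0x08) yields 0x47.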
Much appreciation to: marius, who persisted and narrowed down what
the failure delta was, and shamed me into actually fixing it.
MFC after: 1 week
Stop abusing xpt_periph in random places that really have no periph related
to the CCB, for example, bus scanning. A NULL value is fine in such cases and
it is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (similar to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph is now.
might have been enabled for them- now that we use all 32 bits of handle.
Fast Posting doesn't pass the full 32 bits.
Noticed by: Bugs in NetBSD. Only a NetBSD user might actually still use such old hardware.
MFC after: 1 week
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
CCB at a time outstanding reliable. It's not there yet, but this
is the direction to go in so might as well commit. So far,
multiple at a time CCBs work (see ISP_INTERNAL_TARGET test mode),
but it fails if there are more downstream than the SIM wants
to handle and SRR is sort of confused when this happens, plus
it is not entirely clear what one does if a CCB/CTIO fails
and you have more in flight (that don't fail, say) and more queued
up at the SIM level that haven't been started yet.
Some of this is driven because there apparently is no flow control
to requeue XPT_CONTINUE_IO requests like there are for XPT_SCSI_IO
requests. It is also more driven in that the few target mode
periph drivers there are are not really set up for handling pushback-
heck most of them don't even check for errors (and what would they
really do with them anyway? It's the initiator's problem, really....).
The data transfer arithmetic has been worked over again to handle
multiple outstanding commands, so you have a notion of what's been
moved already as well as what's currently in flight. It turns out that
this led to uncovering a REPORT_LUNS bug in the ISP_INTERNAL_TARGET
code which was sending back 24 bytes of rpl data instead of the
specified 16. What happened furthermore here is that sending back
16 bytes and reporting an overrun of 8 bytes made the initiator
(running FC-Tape aware f/w) mad enough to request, and keep
requesting, another FCP response (I guess it didn't like the answer
so kept asking for it again).
Sponsored by: Spectralogic
MFC after: 1 month
a tinderbox myself and caught the error.
Change to isp_send_cmd needs a final ecmd argument.
Sponsored by: Spectralogic
MFC after: 1 month
X-MFC: 238869
MISC CHANGES
Add a new async event- ISP_TARGET_NOTIFY_ACK, that will guarantee
eventual delivery of a NOTIFY ACK. This is tons better than just
ignoring the return from isp_notify_ack and hoping for the best.
Clean up the lower level lun enable code to be a bit more sensible.
Fix a botch in isp_endcmd which was messing up the sense data.
Fix notify ack for SRR to use a sensible error code in the case
of a reject.
Clean up and make clear what kind of firmware we've loaded and
what capabilities it has.
-----------
FULL (252 byte) SENSE DATA
In CTIOs for the ISP, there's only a limited amount of space
to load SENSE DATA for associated CHECK CONDITIONS (24 or 26
bytes). This makes it difficult to send full SENSE DATA that can
be up to 252 bytes.
Implement MODE 2 responses which have us build the FCP Response
in system memory which the ISP will put onto the wire directly.
On the initiator side, the same problem occurs in that a command
status response only has a limited amount of space for SENSE DATA.
This data is supplemented by status continuation responses that
the ISP pushes onto the response queue after the status response.
We now pull them all together so that full sense data can be
returned to the periph driver.
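On the initiator side, gathering the pieces might be sketched like this. The
names and the chunk interface are hypothetical. Only the 252 byte limit and
the idea of appending continuation data come from the text above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_SENSE_MAX	252	/* full sense data size named above */

/* Hypothetical accumulator for one command's sense data. */
struct sk_sense {
	uint8_t buf[SK_SENSE_MAX];
	size_t have;	/* bytes collected so far */
	size_t total;	/* bytes expected, capped at SK_SENSE_MAX */
};

/* Begin collecting; 'total' is the sense length the response reported. */
static void
sk_sense_start(struct sk_sense *s, size_t total)
{
	s->have = 0;
	s->total = (total > SK_SENSE_MAX) ? SK_SENSE_MAX : total;
}

/*
 * Append one chunk (from the status response or a continuation entry).
 * Returns the number of bytes still outstanding.
 */
static size_t
sk_sense_append(struct sk_sense *s, const uint8_t *chunk, size_t len)
{
	size_t room = s->total - s->have;

	if (len > room)
		len = room;	/* ignore padding past the reported length */
	memcpy(s->buf + s->have, chunk, len);
	s->have += len;
	return (s->total - s->have);
}
```

When sk_sense_append() returns 0, the buffer holds everything that was
reported and can be handed up in one piece.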
This is supported on 23XX, 24XX and 25XX cards.
This is also preparation for doing >16 byte CDBs.
-----------
FC TAPE
Implement full FC-TAPE on both initiator and target mode side. This
capability is driven by firmware loaded, board type, board NVRAM
settings, or hint configuration options to enable or disable. This
is supported for 23XX, 24XX and 25XX cards.
On the initiator side, we pretty much just have to generate a command
reference number for each command we send out. This is FCP-4 compliant
in that we do this per ITL nexus to generate the allowed 1 thru 255
CRN.
In order to support the target side of FC-TAPE, we now pay attention
to more of the PRLI word 3 parameters which will tell us whether
an initiator wants confirmed responses. While we're at it, we'll
pay attention to the initiator view too and report it.
On sending back CTIOs, we will notice whether the initiator wants
confirmed responses and we'll set up flags to do so.
If a response or data frame is lost the initiator sends us an SRR
(Sequence Retransmit Request) ELS which shows up as an SRR notify
and all outstanding CTIOs are nuked with SRR Received status. The
SRR notify contains the offset that the initiator wants us to restart
the data transfer from or to retransmit the response frame.
If the ISP driver still has the CCB around for which the data segment
or response applies, it will retransmit.
However, we typically don't know about a lost data frame until we
send the FCP Response and the initiator totes up counters for data
moved and notices missing segments. In this case we've already
completed the data CCBs and sent them back up to the periph
driver. Because there's no really clean mechanism yet in CAM to
handle this, a hack has been put into place to complete the CTIO
CCB with the CAM_MESSAGE_RECV status which will have a MODIFY DATA
POINTER extended message in it. The internal ISP target groks this
and ctl(8) will be modified to deal with this as well.
At any rate, the data is retransmitted and an FCP response is
sent. The whole point here is to successfully complete a command
so that you don't have to depend on ULP (SCSI) to have to recover,
which in the case of tape is not really possible (hence the name
FC-TAPE).
Sponsored by: Spectralogic
MFC after: 1 month
not by some hint setting. Do more preparations for FC-Tape.
Clean up resource counting for 24XX or later chipsets so
we find out after EXEC_FIRMWARE what is actually supported.
Set target mode exchange count based upon whether or not
we are supporting simultaneous target/initiator mode. Clean
up some old (pre-24XX) xfwoption and zfwoption issues.
Sponsored by: Spectralogic
MFC after: 3 days
and crosschecks against firmware documentation. We now check and report
FC firmware attributes and at least are now prepared for the upper 48 bits
of f/w attributes (which are probably for the 8100 or later cards). This
involved changing how inbits and outbits are calculated for various commands,
hopefully clearer and cleaner. This also caused me to clean up the actual
mailbox register usage. Finally, we are now unconditionally using a CRN
for initiator mode.
A longstanding issue with the 2400/2500 is that they do *not* support
a "Prefer PTP followed by loop", which explains why enabling that
caused the f/w to crash.
A slightly more invasive change is to let the firmware load entirely
drive whether multi_id support is enabled or not.
Sponsored by: Spectralogic
MFC after: 1 week
Make the default role NONE if target mode is selected. This
allows ctl(8) to switch to/from target mode via knob settings.
If we default to role 'none', this causes a reset of the
24XX f/w which then causes initiators to wake up and notice
when we come online.
Reviewed by: kdm
MFC after: 2 weeks
Sponsored by: Spectralogic
- in destroy_lun_state() assert hold == 1 instead of 0, as it should
receive hold taken by the create_lun_state() or get_lun_statep() before;
- fix hold count leak inside rls_lun_statep() that also fired above assert;
- in destroy_lun_state() use SIM bus number instead of SIM path id for
ISP_GET_PC_ADDR(), as it was before r196008;
- make isp_disable_lun() set status in the CCB;
- make isp_target_mark_aborted() set status into the proper CCB.
Reviewed by: mjacob
Sponsored by: iXsystems, inc.
MFC after: 1 month
is actually broken, or needs a BIOS upgrade for 64 bit loads, but this uncovered
a couple of misplaced opcode definitions and some missing continual mbox command
cases, so might as well update them here.