The newer boards don't have the response field that indicates
whether the SCSI status byte is present. You have to just look to
see whether it is non-zero.
The code was looking to see whether the sense length was valid
before propagating the SCSI status byte (and sense information) up
the stack. With a status like Reservation Conflict, there is no
sense information, only the SCSI status byte. So it wasn't getting
correctly returned.
isp.c:
In isp_intr(), if we are on a 2400 or 2500 type board and
get a response, look at the actual contents of the
SCSI status value and set the RQSF_GOT_STATUS flag
accordingly so that return any SCSI status value we get. The
RQSF_GOT_SENSE flag will get set later on if there is
actual sense information returned.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112791 on 2015/01/15
If the user sends an XPT_RESET_DEV CCB, make sure to reset the
Fibre Channel Command Reference Number if we're running on a FC
controller.
We send a SCSI Target Reset when we get this CCB, and as a result
need to reset the CRN to 1 on the next command.
isp_freebsd.c:
In the XPT_RESET_DEV implementation in isp_action(), reset
the CRN if we're on a FC controller.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112787 on 2015/01/15
Fix SCSI status byte reporting on 4Gb and 8Gb Qlogic boards.
The newer boards don't have the response field that indicates
whether the SCSI status byte is present. You have to just look to
see whether it is non-zero.
The code was looking to see whether the sense length was valid
before propagating the SCSI status byte (and sense information) up
the stack. With a status like Reservation Conflict, there is no
sense information, only the SCSI status byte. So it wasn't getting
correctly returned.
isp.c:
In isp_intr(), if we are on a 2400 or 2500 type board and
get a response, look at the actual contents of the
SCSI status value and set the RQSF_GOT_STATUS flag
accordingly so that return any SCSI status value we get. The
RQSF_GOT_SENSE flag will get set later on if there is
actual sense information returned.
Submitted by: ken
MFC after: 1 week
Sponsored by: Spectra Logic
MFSpectraBSD: 1112791 on 2015/01/15
and not automatically come back if they were gone for a short
period of time.
The isp(4) driver has a 30 second gone device timer that gets
activated whenever a device goes away. If the device comes back
before the timer expires, we don't send a notification to CAM that
it has gone away. If, however, there is a command sent to the
device while it is gone and before it comes back, the isp(4) driver
sends the command back with CAM_SEL_TIMEOUT status.
CAM responds to the CAM_SEL_TIMEOUT status by removing the device.
In the case where a device comes back within the 30 second gone
device timer window, though, we weren't telling CAM the device
came back.
So, fix this by tracking whether we have told CAM the device is
gone, and if we have, send a rescan if it comes back within the 30
second window.
ispvar.h:
In the fcportdb_t structure, add a new bitfield,
reported_gone. This gets set whenever we return a command
with CAM_SEL_TIMEOUT status on a Fibre Channel device.
isp_freebsd.c:
In isp_done(), if we're sending CAM_SEL_TIMEOUT for for a
command sent to a FC device, set the reported_gone bit.
In isp_async(), in the ISPASYNC_DEV_STAYED case, rescan the
device in question if it is mapped to a target ID and has
been reported gone.
In isp_make_here(), take a port database entry argument,
and clear the reported_gone bit when we send a rescan to
CAM.
In isp_make_gone(), take a port database entry as an
argument, and set the reported_gone bit when we send an
async event telling CAM consumers that the device is gone.
Sponsored by: Spectra Logic
MFC after: 1 week
The Command Reference Number is used for precise delivery of
commands, and is part of the FC-Tape functionality set. (This is
only enabled for devices that support precise delivery of commands.)
It is an 8-bit unsigned number that increments from 1 to 255. The
commands sent by the initiator must be processed by the target in
CRN order if the CRN is non-zero.
There are certain scenarios where the Command Reference Number
sequence needs to be reset. When the target is power cycled, for
instance, the initiator needs to reset the CRN to 1. The initiator
will know this because it will see a LIP (when directly connected)
or get a logout/login event (when connected to a switch).
The isp(4) driver was not resetting the CRN when a target
went away and came back. When it saw the target again after a
power cycle, it would continue the CRN sequence where it left off.
The target would ignore the command because the CRN sequence is
supposed to be reset to 1 after a power cycle or other similar
event.
The symptom that the user would see is that there would be lots of
aborted INQUIRY commands after a tape library was power cycled, and
the library would fail to probe. The INQUIRY commands were being
ignored by the tape drive due to the CRN issue mentioned above.
isp_freebsd.c:
Add a new function, isp_fcp_reset_crn(). This will reset
all of the CRNs for a given port, or the CRNs for all LUNs
on a target.
Reset the CRNs for all targets on a port when we get a LIP,
loop reset, or loop down event.
Reset the CRN for a particular target when it arrives, is changed
or departs. This is less precise behavior than the
clearing behavior specified in the FCP-4 spec (which says
that it should be reset for PRLI, PRLO, PLOGI and LOGO),
but this is the level of information we have here. If this
is insufficient, then we will need to add more precise
notification from the lower level isp(4) code.
isp_freebsd.h:
Add a prototype for isp_fcp_reset_crn().
Sponsored by: Spectra Logic
MFC after: 1 week
Records with target_mode == 1 are allocated from the end of portdb, so it
seems logical to start search from the end not traverse whole array.
MFC after: 1 month
In the current implementation, the isp_kthread() threads never exit.
The target threads do have an exit mode from isp_attach(), but it is
not invoked from isp_detach().
Ensure isp_detach() notifies threads started for each channel, such
that they exit before their parent device softc detaches, and thus
before the module does. Otherwise, a page fault panic occurs later in:
sysctl_kern_proc
sysctl_out_proc
kern_proc_out
fill_kinfo_proc
fill_kinfo_thread
strlcpy(kp->ki_wmesg, td->td_wmesg, sizeof(kp->ki_wmesg));
For isp_kthread() (and isp(4) target threads), td->td_wmesg references
now-unmapped memory after the module has been unloaded. These threads
are typically msleep()ing at the time of unload, but they could also
attempt to execute now-unmapped code segments.
MFC after: 1 month
Sponsored by: Spectra Logic
MFSpectraBSD: r1070921 on 2014/06/22 13:01:17
Delaying isp_reqodx update, we should be ready to update it every time
we read it. Otherwise requests using several indexes may be requeued
ndefinitely without ever updating the variable.
MFC after: 3 days
- Process ATIO queue only if interrupt status tells so;
- Do not update queue out pointers after each processed command, do it
only once at the end of the loop.
every time. The purpose of that register is unlikely output queue overflow
detection, so read it only when its last known (and probably stale now)
value signals overflow.
This reduces CPU load and lock congestion and rises bottleneck in CTL
while doing target mode via two 8Gbps ports from 100K to 120K IOPS.
mostly by adjustments to debugging printf() format specifiers. For high
numbered LUNs, also switch to printing them in hex as per SAM-5.
MFC after: 2 weeks
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
Replace big per-SIM locks with bunch of smaller ones:
- per-LUN locks to protect device and peripheral drivers state;
- per-target locks to protect list of LUNs on target;
- per-bus locks to protect reference counting;
- per-send queue locks to protect queue of CCBs to be sent;
- per-done queue locks to protect queue of completed CCBs;
- remaining per-SIM locks now protect only HBA driver internals.
While holding LUN lock it is allowed (while not recommended for performance
reasons) to take SIM lock. The opposite acquisition order is forbidden.
All the other locks are leaf locks, that can be taken anywhere, but should
not be cascaded. Many functions, such as: xpt_action(), xpt_done(),
xpt_async(), xpt_create_path(), etc. are no longer require (but allow) SIM
lock to be held.
To keep compatibility and solve cases where SIM lock can't be dropped, all
xpt_async() calls in addition to xpt_done() calls are queued to completion
threads for async processing in clean environment without SIM lock held.
Instead of single CAM SWI thread, used for commands completion processing
before, use multiple (depending on number of CPUs) threads. Load balanced
between them using "hash" of the device B:T:L address.
HBA drivers that can drop SIM lock during completion processing and have
sufficient number of completion threads to efficiently scale to multiple
CPUs can use new function xpt_done_direct() to avoid extra context switch.
Make ahci(4) driver to use this mechanism depending on hardware setup.
Sponsored by: iXsystems, Inc.
MFC after: 2 months
- Remove two excessive and slow register reads from isp_intr(). Instead
of rereading value every time, assume that registers contain what we have
written there.
- Avoid sequential search through 4096 array elements when looking for
command tag. Use hash of lists to store active tags separately from free
ones and so greatly speedup the searches.
Reviewed by: mjacob
driver.
This tells consumers up the stack the maximum I/O size that the
controller can handle.
The I/O size is bounded by the number of scatter/gather segments
the controller can handle and the page size. For an amd64 system,
it works out to around 5MB.
Reviewed by: mjacob
MFC after: 3 days
Sponsored by: Spectra Logic
command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.
This changes fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.
Submitted by: jhb
Reviewed by: jfv, marius, achadd, achim
MFC after: 1 day
a mailbox command and which registers to copy back in when
the command completes, the bits being set need to not only
specify what bits you want to add from the default from the
table but also what bits you want *subtract* (mask) from the
default from the table.
A failing ISP2200 command pointed this out.
Much appreciation to: marius, who persisted and narrowed down what
the failure delta was, and shamed me into actually fixing it.
MFC after: 1 week
Stop abusing xpt_periph in random plases that really have no periph related
to CCB, for example, bus scanning. NULL value is fine in such cases and it
is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (alike to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph now.
might have been enabled for them- now that we use all 32 bits of handle.
Fast Posting doesn't pass the full 32 bits.
Noticed by: Bugs in NetBSD. Only a NetBSD user might actually still use such old hardware.
MFC after: 1 week
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
CCB at a time outstanding reliable. It's not there yet, but this
is the direction to go in so might as well commit. So far,
multiple at a time CCBs work (see ISP_INTERNAL_TARGET test mode),
but it fails if there are more downstream than the SIM wants
to handle and SRR is sort of confused when this happens, plus
it is not entirely quite clear what one does if a CCB/CTIO fails
and you have more in flight (that don't fail, say) and more queued
up at the SIM level that haven't been started yet.
Some of this is driven because there apparently is no flow control
to requeue XPT_CONTINUE_IO requests like there are for XPT_SCSI_IO
requests. It is also more driven in that the few target mode
periph drivers there are are not really set up for handling pushback-
heck most of them don't even check for errors (and what would they
really do with them anyway? It's the initiator's problem, really....).
The data transfer arithmetic has been worked over again to handle
multiple outstanding commands, so you have a notion of what's been
moved already as well as what's currently in flight. It turns that
this led to uncovering a REPORT_LUNS bug in the ISP_INTERNAL_TARGET
code which was sending back 24 bytes of rpl data instead of the
specified 16. What happened furthermore here is that sending back
16 bytes and reporting an overrun of 8 bytes made the initiator
(running FC-Tape aware f/w) mad enough to request, and keep
requesting, another FCP response (I guess it didn't like the answer
so kept asking for it again).
Sponsored by: Spectralogic
MFC after: 1 month
a tinderbox myself and caught the error.
Change to isp_send_cmd needs a final ecmd argument.
Sponsored by: Spectralogic
MFC after: 1 month
X-MFC: 238869
MISC CHANGES
Add a new async event- ISP_TARGET_NOTIFY_ACK, that will guarantee
eventual delivery of a NOTIFY ACK. This is tons better than just
ignoring the return from isp_notify_ack and hoping for the best.
Clean up the lower level lun enable code to be a bit more sensible.
Fix a botch in isp_endcmd which was messing up the sense data.
Fix notify ack for SRR to use a sensible error code in the case
of a reject.
Clean up and make clear what kind of firmware we've loaded and
what capabilities it has.
-----------
FULL (252 byte) SENSE DATA
In CTIOs for the ISP, there's only a limimted amount of space
to load SENSE DATA for associated CHECK CONDITIONS (24 or 26
bytes). This makes it difficult to send full SENSE DATA that can
be up to 252 bytes.
Implement MODE 2 responses which have us build the FCP Response
in system memory which the ISP will put onto the wire directly.
On the initiator side, the same problem occurs in that a command
status response only has a limited amount of space for SENSE DATA.
This data is supplemented by status continuation responses that
the ISP pushes onto the response queue after the status response.
We now pull them all together so that full sense data can be
returned to the periph driver.
This is supported on 23XX, 24XX and 25XX cards.
This is also preparation for doing >16 byte CDBs.
-----------
FC TAPE
Implement full FC-TAPE on both initiator and target mode side. This
capability is driven by firmware loaded, board type, board NVRAM
settings, or hint configuration options to enable or disable. This
is supported for 23XX, 24XX and 25XX cards.
On the initiator side, we pretty much just have to generate a command
reference number for each command we send out. This is FCP-4 compliant
in that we do this per ITL nexus to generate the allowed 1 thru 255
CRN.
In order to support the target side of FC-TAPE, we now pay attention
to more of the PRLI word 3 parameters which will tell us whether
an initiator wants confirmed responses. While we're at it, we'll
pay attention to the initiator view too and report it.
On sending back CTIOs, we will notice whether the initiator wants
confirmed responses and we'll set up flags to do so.
If a response or data frame is lost the initiator sends us an SRR
(Sequence Retransmit Request) ELS which shows up as an SRR notify
and all outstanding CTIOs are nuked with SRR Received status. The
SRR notify contains the offset that the initiator wants us to restart
the data transfer from or to retransmit the response frame.
If the ISP driver still has the CCB around for which the data segment
or response applies, it will retransmit.
However, we typically don't know about a lost data frame until we
send the FCP Response and the initiator totes up counters for data
moved and notices missing segments. In this case we've already
completed the data CCBs already and sent themn back up to the periph
driver. Because there's no really clean mechanism yet in CAM to
handle this, a hack has been put into place to complete the CTIO
CCB with the CAM_MESSAGE_RECV status which will have a MODIFY DATA
POINTER extended message in it. The internal ISP target groks this
and ctl(8) will be modified to deal with this as well.
At any rate, the data is retransmitted and an an FCP response is
sent. The whole point here is to successfully complete a command
so that you don't have to depend on ULP (SCSI) to have to recover,
which in the case of tape is not really possible (hence the name
FC-TAPE).
Sponsored by: Spectralogic
MFC after: 1 month