This should close the race between request arriving on new target mode
virtual port and its scanner thread finally fetch its address for request
routing.
For some reason firmware sends Port Database Changed notifications in case
of explicit login requests from the driver when target port is unavailabe.
Those notifications don't give driver any new information, but only cause
infinite scan loop.
Usually IOCBs should be put on queue for asynchronous processing and should
not require additional DMA memory. But there are some cases like aborts and
resets that for external reasons has to be synchronous. Give those cases
separate 2*64 byte DMA area to decouple them from other DMA scratch area
users, using it for asynchronous requests.
This space does not require DMA syncing. It reduces lock scope of the DMA
scratch space. It allows whole DMA scratch space to be used to I/O, so now
we can fetch up to ~1000 ports from SNS.
Due to the last fact, increase maximal number of ports from 256 to 1024.
Before this change virtual ports control IOCBs were executed synchronously
via Execute IOCB mailbox command. It required exclusive use of scratch
space of driver and mailbox registers of the hardware. Because of that
shared resources use this code could not really sleep, having to spin for
completion, blocking any other operation.
This change introduces new asynchronous design, sending the IOCBs directly
on request queue and gracefully waiting for their return on response queue.
Returned IOCBs are identified with unified handle space from r292725.
I am not sure why this was split long ago, but I see no reason for it.
At this point this unification just slightly reduces memory usage, but
as next step I plan to reuse shared handle space for other IOCB types.
- Make scan aborted by event restart immediately and infinitely.
- Improve handling of some loop events from firmware.
- Remove loop down timer, adding its functionality to scanner thread.
- Some more unification and simplification.
Hacks to enable target mode there complicated code, while didn't really
work. And for outdated hardware fixing it is not really interesting.
Initiator mode tested with Qlogic 1080 adapter is still working fine.
For those chips we are not receiving login events, adding initiators
based on ATIO requests. But there is no port ID in that structure, so
in fabric mode we have to explicitly fetch it from firmware to be able
to do normal scan after that.
This change simplifies and unifies port adding/updating for loop and
fabric scanners. It also fixes problems with scanning restarts due to
concurrent port databases changes. It also fixes many cosmetic issues.
Aside from cleaner and more consistent code, this allows ports to be both
target and initiator same time, and easily switch from any role to any.
Sponsored by: iXsystems, Inc.
FreeBSD never had limitation on number of target IDs, and there is no
any other requirement to allocate them densely. Since slots of port
database already populated just sequentially, there is no much need
for another indirection to allocate sequentially too.
and not automatically come back if they were gone for a short
period of time.
The isp(4) driver has a 30 second gone device timer that gets
activated whenever a device goes away. If the device comes back
before the timer expires, we don't send a notification to CAM that
it has gone away. If, however, there is a command sent to the
device while it is gone and before it comes back, the isp(4) driver
sends the command back with CAM_SEL_TIMEOUT status.
CAM responds to the CAM_SEL_TIMEOUT status by removing the device.
In the case where a device comes back within the 30 second gone
device timer window, though, we weren't telling CAM the device
came back.
So, fix this by tracking whether we have told CAM the device is
gone, and if we have, send a rescan if it comes back within the 30
second window.
ispvar.h:
In the fcportdb_t structure, add a new bitfield,
reported_gone. This gets set whenever we return a command
with CAM_SEL_TIMEOUT status on a Fibre Channel device.
isp_freebsd.c:
In isp_done(), if we're sending CAM_SEL_TIMEOUT for for a
command sent to a FC device, set the reported_gone bit.
In isp_async(), in the ISPASYNC_DEV_STAYED case, rescan the
device in question if it is mapped to a target ID and has
been reported gone.
In isp_make_here(), take a port database entry argument,
and clear the reported_gone bit when we send a rescan to
CAM.
In isp_make_gone(), take a port database entry as an
argument, and set the reported_gone bit when we send an
async event telling CAM consumers that the device is gone.
Sponsored by: Spectra Logic
MFC after: 1 week
- Remove two excessive and slow register reads from isp_intr(). Instead
of rereading value every time, assume that registers contain what we have
written there.
- Avoid sequential search through 4096 array elements when looking for
command tag. Use hash of lists to store active tags separately from free
ones and so greatly speedup the searches.
Reviewed by: mjacob
MISC CHANGES
Add a new async event- ISP_TARGET_NOTIFY_ACK, that will guarantee
eventual delivery of a NOTIFY ACK. This is tons better than just
ignoring the return from isp_notify_ack and hoping for the best.
Clean up the lower level lun enable code to be a bit more sensible.
Fix a botch in isp_endcmd which was messing up the sense data.
Fix notify ack for SRR to use a sensible error code in the case
of a reject.
Clean up and make clear what kind of firmware we've loaded and
what capabilities it has.
-----------
FULL (252 byte) SENSE DATA
In CTIOs for the ISP, there's only a limimted amount of space
to load SENSE DATA for associated CHECK CONDITIONS (24 or 26
bytes). This makes it difficult to send full SENSE DATA that can
be up to 252 bytes.
Implement MODE 2 responses which have us build the FCP Response
in system memory which the ISP will put onto the wire directly.
On the initiator side, the same problem occurs in that a command
status response only has a limited amount of space for SENSE DATA.
This data is supplemented by status continuation responses that
the ISP pushes onto the response queue after the status response.
We now pull them all together so that full sense data can be
returned to the periph driver.
This is supported on 23XX, 24XX and 25XX cards.
This is also preparation for doing >16 byte CDBs.
-----------
FC TAPE
Implement full FC-TAPE on both initiator and target mode side. This
capability is driven by firmware loaded, board type, board NVRAM
settings, or hint configuration options to enable or disable. This
is supported for 23XX, 24XX and 25XX cards.
On the initiator side, we pretty much just have to generate a command
reference number for each command we send out. This is FCP-4 compliant
in that we do this per ITL nexus to generate the allowed 1 thru 255
CRN.
In order to support the target side of FC-TAPE, we now pay attention
to more of the PRLI word 3 parameters which will tell us whether
an initiator wants confirmed responses. While we're at it, we'll
pay attention to the initiator view too and report it.
On sending back CTIOs, we will notice whether the initiator wants
confirmed responses and we'll set up flags to do so.
If a response or data frame is lost the initiator sends us an SRR
(Sequence Retransmit Request) ELS which shows up as an SRR notify
and all outstanding CTIOs are nuked with SRR Received status. The
SRR notify contains the offset that the initiator wants us to restart
the data transfer from or to retransmit the response frame.
If the ISP driver still has the CCB around for which the data segment
or response applies, it will retransmit.
However, we typically don't know about a lost data frame until we
send the FCP Response and the initiator totes up counters for data
moved and notices missing segments. In this case we've already
completed the data CCBs already and sent themn back up to the periph
driver. Because there's no really clean mechanism yet in CAM to
handle this, a hack has been put into place to complete the CTIO
CCB with the CAM_MESSAGE_RECV status which will have a MODIFY DATA
POINTER extended message in it. The internal ISP target groks this
and ctl(8) will be modified to deal with this as well.
At any rate, the data is retransmitted and an an FCP response is
sent. The whole point here is to successfully complete a command
so that you don't have to depend on ULP (SCSI) to have to recover,
which in the case of tape is not really possible (hence the name
FC-TAPE).
Sponsored by: Spectralogic
MFC after: 1 month
not by some hint setting. Do more preparations for FC-Tape.
Clean up resource counting for 24XX or later chipsets so
we find out after EXEC_FIRMWARE what is actually supported.
Set target mode exchange count based upon whether or not
we are supporting simultaneous target/initiator mode. Clean
up some old (pre-24XX) xfwoption and zfwoption issues.
Sponsored by: Spectralogic
MFC after: 3 days
and crosschecks against firmware documentation. We now check and report
FC firmware attributes and at least are now prepared for the upper 48 bits
of f/w attributes (which are probably for the 8100 or later cards). This
involed changing how inbits and outbits are calculated for varios commands,
hopefully clearer and cleaner. This also caused me to clean up the actual
mailbox register usage. Finally, we are now unconditionally using a CRN
for initiator mode.
A longstanding issue with the 2400/2500 is that they do *not* support
a "Prefer PTP followed by loop", which explains why enabling that
caused the f/w to crash.
A slightly more invasive change is to let the firmware load entirely
drive whether multi_id support is enabled or not.
Sponsored by: Spectralogic
MFC after: 1 week
Make the default role NONE if target mode is selected. This
allows ctl(8) to switch to/from target mode via knob settings.
If we default to role 'none', this causes a reset of the
24XX f/w which then causes initiators to wake up and notice
when we come online.
Reviewed by: kdm
MFC after: 2 weeks
Sponsored by: Spectralogic
We also revive loop down freezes. We also externaliz within isp
isp_prt_endcmd so something outside the core module can print
something about a command completing. Also some work in progress to
assist in handling timed out commands better.
Partially Sponsored by: Panasas
Approved by: re (kib)
MFC after: 1 month
- Allocate coherent DMA memory for the request/response queue area and
and the FC scratch area.
These changes allow isp(4) to work properly on sparc64 with usage of the
IOMMU streaming buffers enabled.
Approved by: mjacob
MFC after: 2 weeks
on debug output. Add a new platform function requirement to allow
for printing based upon the ITL nexus instead of the isp unit plus
channel, target and lun. This allows some printouts and error messages
from the core code to appear in the same format as the platform's
subsystem (in FreeBSD's case, CAM path).
MFC after: 1 week
numbers and handle types in rational way. This will better protect from
(unwittingly) dealing with stale handles/commands.
Fix the watchdog timeout code to better protect itself from mistakes.
If we run an abort on a putatively timed out command, the command
may in fact get completed, so check to make sure the command we're
timing it out is still around. If the abort succeeds, btw, the command
should get returned via a different path.
firmware loading bugs.
Target mode support has received some serious attention to make it
more usable and stable.
Some backward compatible additions to CAM have been made that make
target mode async events easier to deal with have also been put
into place.
Further refinement and better support for NP-IV (N-port Virtualization)
is now in place.
Code for release prior to RELENG_7 has been stripped away for code clarity.
Sponsored by: Copan Systems
Reviewed by: scottl, ken, jung-uk kim
Approved by: re
First, we were never correctly checking for a 24XX Status Type 0
response- that cased us to fall through to evaluate status for
commands as if this were a 2100/2200/2300 Status Type 0 response.
This is *close*, but not quite the same. This has been reported
to be apparent with some wierd lun configuration problems with
some arrays. It became glaringly apparent on sparc64 where none
of the correct byte swap things were done.
Fixing this omission then caused a whole universe shifting debug
cycle of endian issues for the 2400. The manual for 24XX f/w turns
out to be wrong about the endianness of a couple of entities. The
lun and cdb fields for the type 7 request are *not* unconditionally
big endian- they happen to be opposite of whatever the endian of
the current machine type is. Same with the sense data for the
24XX type 0 response.
While we're at it investigate and resolve some NVRAM endian
issues.
Approved by: re (ken)
MFC after: 3 days
front of isp_init so we can read NVRAM even if we're role ISP_NONE.
Prepare for reintroduction of channels (for FC) for N-Port
Virtualization.
Fix a botch in handle assignment that caused us to nuke one device
when a new one arrives and end up with two devices with the same
identity in the virtual target mapping table.
with- not hope for the best. Change some things which were gated
off of 24XX to be gated off of 2K login support. Convert some
isp_prt calls to xpt_print calls.
and provied an isp_control entry point so that the outer layers can
do PLOGI/LOGO explicitly. Add MS IOCB support. This completes the cycle
for base support for SMI-S.
gone device timers and zombie state entries. There are tunables
that can be used to select a number of parameters.
loop_down_limit - how long to wait for loop to come back up before
declaring
all devices dead (default 300 seconds)
gone_device_time- how long to wait for a device that has appeared
to leave the loop or fabric to reappear (default 30 seconds)
Internal tunables include (which should be externalized):
quick_boot_time- how long to wait when booting for loop to come up
change_is_bad- whether or not to accept devices with the same
WWNN/WWPN that reappear at a different PortID as being the 'same'
device.
Keen students of some of the subtle issues here will ask how
one can keep devices from being re-accepted at all (the answer
is to set a gone_device_time to zero- that effectively would
be the same thing).
(and by extension, the 2422).
One peculiar thing I've found with the 2322 is that if you
don't force it to do Hard LoopID acquisition, the firmware
crashes. This took a while to figure out.
While we're at it, fix various bugs having to do with NVRAM
reading and option setting with respect to pieces of NVRAM.
to getting rid u_int for uint and so on).
b) Turn back on 64 bit DAC support. Cheeze it a bit in that we have two
DMA callback functions- one when we have bus_addr_t > 4 bits in width and
the other which should be normal. Even Cheezier in that we turn off setting
up DMA maps to be BUS_SPACE_MAXADDR if we're in ISP_TARGET_MODE. More work
on this in a week or so.
c) Tested under amd64 and 1MB DFLTPHYS, sparc64, i386 (PAE, but insufficient
memory to really test > 4GB). LINT check under amd64.
MFC after: 1 month
up to date. Principle changes for this reelase is to support 2K Port Login
firmware. This allows us to support the 2322 (and 2422 4Gb) cards which only
come with the 2K Port Login firmware. The 2322 should now work- but we don't
have firmware sets for it in ispfw (as the change to load 2K Port Login f/w
hasn't been made- that f/w is so big it has to be loaded in more than one
chunk).
Other changes are the beginnings of cleaning up some long standing target
mode issues. The next changes here will incorporate a lot of bug fixes
from others.
Finally, some copyright cleanup and attempts to make the parts of the
driver that are FreeBSD specific start conforming more to FreeBSD style.
MFC after: 1 month