Until we can have perfect knowledge that all callers above us think it's okay
for us to sleep, releasing *our* locks of course, we don't dare try and sleep.
provide no methods does not make any sense, and is not used by any
driver.
It is a pretty hard to come up with even a theoretical concept of
a device driver which would always fail open and close with ENODEV.
Change the defaults to be nullopen() and nullclose() which simply
does nothing.
Remove explicit initializations to these from the drivers which
already used them.
fixes a longstanding issue WRT resetting the chip after startup- it
would fail if we were connected as an F-port to a switch. If we
were connected as an F-port, we got assigned a hard loop ID of 255,
which is really a bogus loop id. Then when we turned around to
reset ourselves, the firmware would reject the ICB_INIT request
because the loop id was bogus. *sputter*
Minor fixlet from somebody in NetBSD with too much time on their
hands (dma -> DMA).
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.
sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.
If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.
Reviewed by: tmm, gibbs
Devices below may experience a change in geometry.
* Due to a bug, aic(4) never used extended geometry. Changes all drives
>1G to now use extended translation.
* sbp(4) drives exactly 1 GB in size now no longer use extended geometry.
* umass(4) drives exactly 1 GB in size now no longer use extended geometry.
For all other controllers in this commit, this should be a no-op.
Looked over by: scottl
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create. Passing the
value 0 prevents the alternate kstack from being created. Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.
Reviewed by: jake, peter, jhb
Instead, based upon whether ISP_DAC_SUPPORTED is defined, typedef
isp_dma_addr_t appropriately.
If ISP_DAC_SUPPORTRED is defined, the DMA_WD2/DMA_WD3 macros do something
useful, else they define to '0'.
defined, we set the address space limitation to BUS_SPACE_UNRESTRICTED,
otherwise to BUS_SPACE_MAXADDR_32BIT.
If we have a 1240, ULTRA2 or better, or an FC card, the boundary limit
is BUS_SPACE_UNRESTRICTED and segment limit is BUS_SPACE_MAXADDR_32BIT.
The older 1020/1040 cards have boundary and segment limits of
BUS_SPACE_MAXADDR_24BIT.
load f/w images > 0x7fff words), set ISP_FW_ATTR_SCCLUN. We explicitly
don't believe we can find attributes if f/w is < 1.17.0, so we have to
set SCCLUN for the 1.15.37 f/w we're using manually- otherwise every
target will replicate itself across all 16 supported luns for non-SCCLUN
f/w.
Correctly set things up for 23XX and either fast posting or ZIO. The
23XX, it turns out, does not support RIO. If you put a non-zero value
in xfwoptions, this will disable fast posting. If you put ICBXOPT_ZIO
in xfwoptions, then the 23XX will do interrupt delays but post to the
response queue- apparently QLogic *now* believes that reading multiple
handles from registers is less of a win than writing (and delaying)
multiple 64 byte responses to the response queue.
At the end of taking a a good f/w crash dump, send the ISPASYNC_FW_DUMPED
event to the outer layers (who can then do things like wake a user
daemon to *fetch* the crash image, etc.).
fast posting command completion, and fast post CTIO completion,
the upper half of Risc2Host is a copy of mailbox #1- *not*
mailbox #0.
MFC after: 1 day
Through the PITA of endiannness, clock has to be MHz freq << 8.
Don't trust NVRAM on SBus cards.
Set a default initiator ID sensibly.
SBus/ISP now working, what with the change to sbus.c earlier today.
flags include INTR_MPSAFE. Put the flags in a common place so that
both isp_sbus && isp_pci DTRT.
In isp_mbxdma setup, drop any locks prior to calling things like
bus_dmatag_create. This gets rid of these obnoxious WITNESS messages
about 'sleeping with locks held' blah blah blah blah blah.
This code does not imply that SBus cards work yet. They hang for me.
But I can't netboot the latest snapshot on my ultra1e, and things
hang at bus_setup_intr time.
Since I'm offline for a while, I thought I'd toss this in in case somebody
else who has a bit better luck wants to fart around with it. Please try
and wait until I get back to check things in.
Oops; I forgot for previous delta... If we're and FC or ULTRA2 or better
card, we can have a 1024 element request queue instead of 256.
MFC after: 1 week
Remove sim queue freezes for resource shortages. I've had too many
strange race conditions where I freeze on a resource shortage but
never get unfrozen.
Consolidate the remaining sim queue freeze condition (for loopdown)
into an inline with debug messages that allows us to track problems
at ISP_LOGDEBUG0 level easier. Change a bunch of debug messages about
loop down/up conditions to ISP_LOGDEBUG0 level.
Remove dead isp_relsim code.
Change some internal flag stuff for efficiency.
Complain vociferously if we try and use our FC scratch area while it's
busy being used already (I mean, if we don't have solaris' ability
to sleep as an interrupt thread which would allow us to just use
a p/v semaphore, at least *say* when you've just borked yourself).
Add infrastructure to allow overrides of hard loopid && initiator
id from boot variables.
Fix the usual quota of silly bugs:
+ 'ktmature' needs to be per-instance. Argh.
+ When entering isp_watchdog, set intsok to zero, preserving
old value to restore later. It's not nice to try and sleep
from splsoftclock.
+ Fix tick overflow buglet in checking timeout value.
MFC after: 1 week
turns out that there's something of a hole in our new fabric name
server stuff. We ask the name server for entities that have
registered as a specific type. That type is FC-SCSI. If the entity
hasn't performed a REGISTER FC4 TYPES, the fabric nameserver won't
return it.
This brings this driver to a bit of a fork in the road as to what
the right thing to do is. For servicing the needs of accessing
FC-SCSI devices, this method is fine, and to be preferred. It is
extremely unlikely we're interested in fabric devices that *don't*
register correctly. If I ever get around to adding an FC-IP stack,
then asking for devices that have registers as FC-IP types is also
the right thing to do.
So- asking the fabric nameserver for a specific type is fine, *as
long as you are only interested in specific types*. If, on the other
hand, you want to create (as for management tool support) a picture
of everything on the fabric, this is *not* so fine. There are a
large class of FC-SCSI *initiators* who *don't* correctly register,
so we never will *see* them.
Is this a problem? Yes, but only a little one. If we want to do such
management tool support, we should probably run a *different* fabric
nameserver query algorithm. Better yet, we should talk to the management
nameserver in Brocade switches instead of the standard FC-GS-2 fabric
nameserver (which can be unwieldy).
Other changes: if we've overrrides marked, don't set some default
values from reading NVRAM. This allows us to override things like
EXEC throttle without having to ignore NVRAM entirely.
MFC after: 1 week
CAM_QUIRK_HILUN devices we loop thru 32bits of lun. Oops.
Switch to using USEC_DELAY rather than USEC_SLEEP at isp_reset time.
Try to paper around a defect in clients that don't correctly registers
themeselves with the fabric nameserver.
Minor updates for Mirapoint support- they still use code that is not
HANDLE_LOOPSTATE_IN_OUTER_LAYERS, and, surprise surprise, this old
stuff had some bugs in it.
Clean up some target mode stuff.
MFC after: 1 week
topology, speed, loopid, WWPN/WWNN, etc.
Beef up target mode. Add isp_handle_platform_notify_scsi and
isp_handle_platform_notify_fc platform handlers to handle immediate
notifies (isp_handle_platform_notify_scsi is still stubbed out).
In implementation of isp_handle_platform_notify_fc, for IN_ABORT_TASK,
peel off a pending XPT_IMMED_NOTIFY and call xpt_done on it and hope
that somebody upstream is listening.
Make sure on final CTIO2s that we set residual correctly. These are
absolutely crucial. Make sure we set relative offset for each CTIO2
based upon bytes we've already xferred. This is what the private
adjunct datat to the original ATIO is. Note state of command so
we can figure out where to find it if we get an ABORT from the firmware.
Make sure we *always* set CAM_TAG_ACTION_VALID for ATIO2s. Make sure
we keep track of the original lun.
If se sent status (or we're otherwise done with the command), don't
forget to free the adjunct structure.
(so we can, when things get lost, find out who currently is processing
on behalf of this open exchange. Invariably, when things are lost and
wedged, it's CAM).
Keep an atio resource counter locally.
MFC after: 1 week
running ABOUT FIRMWARE with some that were started by BIOS downloads).
Redo CTIO2 dma mapping- use continuation segments instead of multiple
CTIO2s. Thanks to Veritas for sponsoring this work (in a different
context).
MFC after: 1 week
to *not* do flow control based upon resource counts for the firmware.
Increase default immediate notify count to 16.
Change isp_target_async to a function returning an integer.
is not set in the scsi completion status, or if the residual is clearly
nonsense, then this was a command that suffered the loss of one or more
FC frames in the middle of the exchange.
Set HBA_BOTCH and hope it will get retried. It's the only thing we can do.
MFC after: 1 day
lun address modifier of sorts. Only an HP XP-512 seems to have cared.
Fix a few misplaced pointers for the new fabric goop, which has been
demonstrated to work on newer Brocades and McData switches now.
Put in commented out code which would run GFF_ID if the QLogic f/w
allowed it.
Don't whine about not being able to find a handle for a command if it
was a command aborted (by us).
Grumble. I've seen better documented architectures out of Redmond.
Redo fabric evaluation to not use GET ALL NEXT (GA_NXT). Switches seem
to be trying to wriggle out of supporting this well. Instead, use
GID_FT to get a list of Port IDs and then use GPN_ID/GNN_ID to find the
port and node wwn. This should make working on fabrics a bit cleaner and
more stable.
This also caused some cleanup of SNS subcommand canonicalization so that
we can actually check for FS_ACC and FS_RJT, and if we get an FS_RJT,
print out the reason and explanation codes.
We'll keep the old GA_NXT method around if people want to uncomment a
controlling definition in ispvar.h.
This also had us clean up ISPASYNC_FABRICDEV to use a local lportdb argument
and to have the caller explicitly say that a device is at the end of the
fabric list.
MFC after: 1 week
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
stuff was right, but the busdma stuff was massively not right.
Didn't really test on ia64 or i386- don't have the former h/w and my
FreeBSD-current disk is unwell right now. Hope that this is okay.
MFC after: 1 week
soon because it's just getting harder and harder to find switches
that correctly implement the GET ALL NEXT subcommands for the SNS
protocol.
Latch up result out pointer and set a busy flag when we're looking
at the response queue. This allows for a cleaner way to make sure
we don't get multiple CPUs trying to read the same response queue
entries.
Change how isp_handle_other_response returns values (clarity).
Make PORT UNAVAILABLE the same as PORT LOGOUT (force a LIP).
Do some formatting changes.
MFC after: 0 days
it worked- but I ran into a case with a 2204 where commands were being lost
right and left. Best be safe.
For target mode, or things called if we call isp_handle_other response- note
that we might have dropped locks by changing the output pointer so we bail
from the loop. It's the responsibility of the entity dropping the lock to
make sure that we let the f/w know we've read thus far into the response
queue (else we begin processing the same entries again- blech!).
MFC after: 1 day
OUT status. We are, apparently, required to force the f/w to log back in
if we want to try and talk to that disk again. This means either issuing
a LOGIN LOCAL LOOP PORT mailbox command, or by issuing a LIP. I've elected
to issue a LIP because this has a better chance of waking up the disk which
clearly just crashed and burned.
These should not occur at all. If they do, they should be darned rare.
MFC after: 1 week
If you want QLogic to look at a potential f/w problem for FC cards, you really
have to provide them info in the format they expect. This involves dumping
a lot of hardware registers (> 300 16 bit registers) and a lot of SRAM
(> 128KB minimum). Thus all of this code is #ifdef protected which will
become an option so that the memory allocation of where to dump the crash
image is pretty expensive. It's worth it if you have a reproducible problem
because they have some tools that can tell them, given the f/w version,
the precise state of everything.
MFC after: 1 week
disable MWI on 2300
based on function code, set an 'isp_port' for the 2312- it's a
separate instance, but the NVRAM is shared, and the second port's
NVRAM is at offset 256.
+ Enable RIO operation for LVD SCSI cards. This makes a *big* difference
as even under reasonable load we get batched completions of about 30
commands at a time on, say, an ISP1080.
+ Do 'continuation' mailbox commands- this allows us to specify a work
area within the softc and 'continue' repeated mailbox commands. This is
more or less on an ad hoc basis and is currently only used for firmware
loading (which f/w now loads substantially faster becuase the calling
thread is only woken when all the f/w words are loaded- not for each
one of the 40000 f/w words that gets loaded).
+ If we're about to return from isp_intr with a 'bogus interrupt' indication,
and we're not a 23XX card, check to see whether the semaphore register is
currently *2* (not *1* as it should be) and whether there's an async completion
sitting in outgoing mailbox0. This seems to capture cases of lost fast posting
and RIO interrupts that the 12160 && 1080 have been known to pump out under
extreme load (extreme, as in > 250 active commands).
+ FC_SCRATCH_ACQUIRE/FC_SCRATCH_RELEASE macros.
+ Endian correct swizzle/unswizzle of an ATIO2 that has a WWPN in it.
MFC after: 1 week
firmware to delay completion of commands so that it can attempt to batch
a bunch of completions at once- either returning 16 bit handles in mailbox
registers, or in a resposne queue entry that has a whole wad of 16 bit handles.
Distinguish between 2300 and 2312 chipsets- if only because the revisions
on the chips have different meanings.
Add more instrumentation plus ISP_GET_STATS and ISP_CLR_STATS ioctls.
Run up the maximum number of response queue entities we'll look at
per interrupt.
If we haven't set HBA role yet, always return success from isp_fc_runstate.
MFC after: 2 weeks
a GetAllNext response. Otherwise, we won't unswizzle
it correctly. This was found on linux/PPC.
This mandated creating another inline: isp_get_gan_response.
the response queue. Instead of the ad hoc ISP_SWIZZLE_REQUEST, we now have
a complete set of inline functions in isp_inline.h. Each platform is
responsible for providing just one of a set of ISP_IOX_{GET,PUT}{8,16,32}
macros.
The reason this needs to be done is that we need to have a single set of
functions that will work correctly on multiple architectures for both little
and big endian machines. It also needs to work correctly in the case that
we have the request or response queues in memory that has to be treated
specially (e.g., have ddi_dma_sync called on it for Solaris after we update
it or before we read from it). It also has to handle the SBus cards (for
platforms that have them) which, while on a Big Endian machine, do *not*
require *most* of the request/response queue entry fields to be swizzled
or unswizzled.
One thing that falls out of this is that we no longer build requests in the
request queue itself. Instead, we build the request locally (e.g., on the
stack) and then as part of the swizzling operation, copy it to the request
queue entry we've allocated. I thought long and hard about whether this was
too expensive a change to make as it in a lot of cases requires an extra
copy. On balance, the flexbility is worth it. With any luck, the entry that
we build locally stays in a processor writeback cache (after all, it's only
64 bytes) so that the cost of actually flushing it to the memory area that is
the shared queue with the PCI device is not all that expensive. We may examine
this again and try to get clever in the future to try and avoid copies.
Another change that falls out of this is that MEMORYBARRIER should be taken
a lot more seriously. The macro ISP_ADD_REQUEST does a MEMORYBARRIER on the
entry being added. But there had been many other places this had been missing.
It's now very important that it be done.
Additional changes:
Fix a longstanding buglet of sorts. When we get an entry via isp_getrqentry,
the iptr value that gets returned is the value we intend to eventually plug
into the ISP registers as the entry *one past* the last one we've written-
*not* the current entry we're updating. All along we've been calling sync
functions on the wrong index value. Argh. The 'fix' here is to rename all
'iptr' variables as 'nxti' to remember that this is the 'next' pointer-
not the current pointer.
Devote a single bit to mboxbsy- and set aside bits for output mbox registers
that we need to pick up- we can have at least one command which does not
have any defined output registers (MBOX_EXECUTE_FIRMWARE).
MFC after: 2 weeks
If we get a completion status of RQCS_QUEUE_FULL, it means
that the internal queues are full. Other QLogic boards set
the QFULL SCSI status. But *nooooooooooo*, not the 2300.
MFC after: 1 day
appropriate cache flush that provides MEMORY_BARRIER in between handoffs
between host && RISC processor for the shared memory request/response
queues.
Submitted by: dfr@nlsystems.com
to see if there's an interrupt (avoids PCI parity errors
which can occur on the 2312 if you access some registers
from the host at the same time the RISC on the 2312 is
C accessing them).
MFC after: 1 day
per-command component that we *don't* try and pass thru CAM. CAM just
is too risky and too much of a pain- structures get copied, but not
all info of interest can be considered safely transported thru all
consumers (including user space) from the incoming ATIO to the outgoing
CTIO- it's just much safer to have a buddy structure, identified by the
command's tag which *does* make it thru safely.
Pay attention to link speed and report 200MB/s xfer speed for a
23XX card in 2GPs mode.
MFC after: 1 week
once so there isn't a window with the ones for the 23XX cards being wrong.
When being verbose, print out some more FC NVRAM values (like framesize).
MFC after: 1 week
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
SIM (as is true for the 1280 and the 12160), then I have to have separate
flags && status for *both* busses. *Whap*.
Implement condition variables for coordination with some target mode
events. It's nice to use these and not panic in obscure little places
in the kernel like 'propagate_priority' just because we went to sleep
holding a mutex, or some other absurd thing.
Remove some bogus ISP_UNLOCK calls. *Whap*.
No longer require that somebody do a lun enable on the wildcard device
to enable target mode. They are, in fact, orthogonal. A wildcard open
is a statement that somebody upstream is willing to accept commands which
are otherwise unrouteable. Now, for QLogic regular SCSI target mode, this
won't matter for a damn because we'll never see ATIOs for luns we haven't
enabled (are listening for, if you will). But for SCCLUN fibre channel
SCSI, we get all kinds of ATIOs. We can either reflect them back here
with minimal info (which is isp_target.c:isp_endcmd() is for), or the
wildcard device (nominally targbh) can handle them.
Do further checking against firmware attributes to see whether we can,
in fact, support target mode in Fibre Channel. For now, require SCCLUN
f/w to supoprt FC target mode.
This is an awful lot of change, but target mode *still* isn't quite right.
MFC after: 4 weeks
applies to. Do more bus # foo things.
Acknowledge Immediate Notifies right away prior to throwing events upstream
(where they're currently being ignored, *groan*)
Capture ASYNC_LIP_F8 as with ASYNC_LIP_OCCURRED. Don't percolate them
upstream as if they were BUS RESETS- they're not.
and cv_wait for mailbox commands to complete if we start them from
here.
Fix residuals for target mode such that we only check the residual and
set it in the CTIO if this is the last CTIO (when we're sending status).
MFC after: 4 weeks
SIM (as is true for the 1280 and the 12160), then I have to have separate
flags && status for *both* busses. *Whap*.
Implement condition variables for coordination with some target mode
events. It's nice to use these and not panic in obscure little places
in the kernel like 'propagate_priority' just because we went to sleep
holding a mutex, or some other absurd thing.
MFC after: 4 weeks
luns) firmware for the Fibre Channel cards.
We used to assume that if we didn't download firmware, we couldn't know
what the firmware capability with respect to SCCLUNs is- and it's important
because the lun field changes in the request queue entry based upon which
firmware it is.
At any rate, we *do* get back firmware attributes in mailbox register 6
when we do ABOUT FIRMWARE for all 2200/2300 cards- and for 2100 cards
with at least 1.17.0 firmware. So- we now assume non-SCCLUN behaviour
for 2100 cards with firmware < 1.17.0- and we check the firmware attributes
for other cards (loaded firmware or not).
This also allows us to get rid of the crappy test of isp_maxluns > 16-
we simply can check firmware attributes for SCCLUN behaviour.
This required an 'oops' fix to the outgoing mailbox count field for
ABOUT FIRMWARE for FC cards.
Also- while here, hardwire firmware revisions for loaded code for SBus
cards. Apparently the 1.35 or 1.37 f/w we've been loading into isp1000
just doesn't report firmware revisions out to mailbox regs 1, 2 and 3
like everyone else. Grumble. Not that this fix hardly matters for FreeBSD.
MFC after: 4 weeks
some reworking (and consequent cleanup) of the interrupt service code.
Also begin to start a cleanup of target mode support that will (eventually)
not require more inforamtion routed with the ATIO to come back with the
CTIO other than tag.
MFC after: 4 weeks
that do not have valid NVRAM. In particular, we were leaving
a retry count set (to retry selection timeouts) when thats
not really what we want. Do some constant string additions
so that LOGDEBUG0 info is useful across all cards.
MFC after: 2 weeks
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).
This probably isn't entirely final as yet- but it's a lot closer to now
being what it should be, including allowing camcontrol to actually set
specific settings.
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).
Roll core minor.
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).
Correctly reintroduce loop_seen_once semantics- that is, if we've never
seen good link, start bouncing commands with CAM_SEL_TIMEOUT. But we
have to be careful to have let ourselves try (in isp_kthread) to check
for loop up at least once.
PR: 28992
MFC after: 1 week
We originally had it such that if the connection topology was FL-loop
(public loop), we never looked at any local loop addresses. The reason
for not doing that was fear or concern that we'd see the same local
loop disks reflected from the name server and we'd attach them twice.
However, when I recently hooked up a JBOD and a system to an ANCOR SA-8
switch, the disks did *not* show up on the fabric. So at least the
ANCOR is screening those disks from appearing on the fabric. Now, it's
possible this is a 'feature' of the ANCOR. When I get a chance, I'll
check the Brocade (it's hard to do this on a low budget).
In any case, if they *do* also show up on the fabric, we should
simply elect to not log into them because we already have an
entry for the local loop. There is relatively unexercised code
just for this case.
MFC after: 2 weeks
For fibre channel, start going for the gusto and using AC_FOUND_DEVICE
and AC_LOST_DEVICE calls to xpt_async when devices appear and disappear
as the loop or fabric changes.
ISPASYNC_FW_CRASH is the async event code where the platform layer
deals with a firmware crash.
some of the RIO (reduced interrupt operation) stuff. Add 64 bit
data list (DSD type 1) and arbitrary data list (DSD type 2)
data structure defines.
Add macros that parameterize usage of the Request/Response in/out
queue pointers. When we finish 2300 support, different registers
will be accessed for the 2300.
part of the PCI block for the 2300- not software convention usage
of the mailbox registers- so we macrosize in/out pointer usage.
Only report that a LIP destroyed commands if it actually destroyed
commands. Get the chan/tgt/lun order correct. Fix a longstanding
stupid bug that caused us to try and issue a command with a tag on
Channel B because we were checking the tagged capability for the
target against Channel A.
A firmware crash is now vectored out to platform specific code
as an async event.
Some minor formatting tweaks.
554: passing arg 4 of `resource_string_value' from incompatible pointer type
576: passing arg 4 of `resource_string_value' from incompatible pointer type
593: passing arg 4 of `resource_string_value' from incompatible pointer type
commands that complete (with no apparent error) after
we receive a LIP. This has been observed mostly on
Local Loop topologies. To be safe, let's just mark
all active commands as dead if we get a LIP and we're
on a private or public loop.
MFC after: 4 weeks
----
Make a device for each ISP- really usable only with devfs and add an ioctl
entry point (this can be used to (re)set debug levels, reset the HBA,
rescan the fabric, issue lips, etc).
----
Add in a kernel thread for Fibre Channel cards. The purpose of this
thread is to be woken up to clean up after Fibre Channel events
block things. Basically, any FC event that casts doubt on the
location or identify of FC devices blocks the queues. When, and
if, we get the PORT DATABASE CHANGED or NAME SERVER DATABASE CHANGED
async event, we activate the kthread which will then, in full thread
context, re-evaluate the local loop and/or the fabric. When it's
satisfied that things are stable, it can then release the blocked
queues and let commands flow again.
The prior mechanism was a lazy evaluation. That is, the next command
to come down the pipe after change events would pay the full price
for re-evaluation. And if this was done off of a softcall, it really
could hang up the system.
These changes brings the FreeBSD port more in line with the Solaris,
Linux and NetBSD ports. It also, more importantly, gets us being
more proactive about topology changes which could then be reflected
upwards to CAM so that the periph driver can be informed sooner
rather than later when things arrive or depart.
---
Add in the (correct) usage of locking macros- we now have lock transition
macros which allow us to transition from holding the CAM lock (Giant)
and grabbing the softc lock and vice versa. Switch over to having this
HBA do real locking. Some folks claim this won't be a win. They're right.
But you have to start somewhere, and this will begin to teach us how
to DTRT for HBAs, etc.
--
Start putting in prototype 2300 support. Add back in LIP
and Loop Reset as async events that each platform will handle.
Add in another int_bogus instrumentation point.
Do some more substantial target mode cleanups.
MFC after: 8 weeks
overflow the request queue. The reason we want to do this is that we
now push out completed CTIOs as we complete them- this gets the QLogic
working on them quicker. So we need to know whether we can put the entire
burrito out before we start.
We now support conjoint status with data for the last CTIO for both Fibre
Channel and SCSI. Leave the old code in place in case we need to go back
(minor 3 line ifdef).
Ultra-ultra important- *don't* set rq->req_seg_count for non-data
target mode requests in isp_pci_dmasetup. D'oh- this is actually
the tag value area for a CTIO. What *was* I thinking? Boy howdy
does both aic7xxx and sym get awfully unhappy when on reconnect
you give them a constant '1' for a tag value.
function- we did it a bit cleaner. We only use this if a CTIO completes with
!CT_OK state. We now have managed to get away without having to poke around
and trying to find the original ATIO- the csio we're using has the tag_id
and lun values with it which is mostly what we need when we do the putback.
Make sure we correctly propagate AT_TQAE->CT_TQAE for tags. Make sure
we call ISP_DMAFREE only if we had DATA to move.
tag is active for an ATIO, and say that you want to reconnect with
a tag value in a CTIO have *never* been exercised until now. This lossage
derived from Solaris code where this stuff originally came from that is
about 7 years old. Amazing.
We now bundle the incoming tag (legal values are 0..256) as the low
16 bits of the ccb_accept_tio's at_tagid while we put the firmware
handle for this ATIO in the top 16 bits- define some macros to make
this cleaner.
Complete some Ansification.
Redo establishment of default SCSI parameters whether or not
we've been compiled for target mode. Unfortunately, the Qlogic
f/w is confused so that if we set all targets to be 'safe' (i.e.,
narrow/async), it will also then report narrow, async if we're
contacted in target mode from that target (acting in initiator
role). D'oh!
Fix ISPCTL_TOGGLE_TMODE to correctly enable the right channel for
dual channel cards. Add some more opcodes. Fix a stupid NULL
pointer bug.
I was hanging after sending a xfer CTIO and a status CTIO for a non-discon
INQUIRY- the xfer CTIO was returned as completed OK, but the status CTIO
was dropped on the floor. All the fields looked good. I don't know why
it got dropped. But allowing status to go back with data xfer seemed to
work. I also noticed that with a non-disconnecting command that the
firmware handle in the ATIO is zero- this leads me to believe that the
f/w really can only handle one CTIO at a time in the discon case, and
it had no idea what to do with the second (status) CTIO.
CAM_SEND_STATUS. Set a timeout of 2 seconds per CTIO. Make sure
that the 'real' tag value is being checked against- not the
one that also carries the firmware handle.
the target mode code or outer layers.
Increase cd_tagval to be 32 bits since it will have to now carry 16
bits of parallel SCSI ATIO handle as well as a normal tag (if any).
Solaris (which, for reasons unknown to me, chokes on u_int16_t
as a typedef of unsigned short if used in a transitional (mixed K&R
and ANSI) way), we'll go the extra mile and fully ANSIfy things.
This is a pretty invasive change, but there are three good
reasons to do this:
1. We'll never have > 16 bits of handle.
2. We can (eventually) enable the RIO (Reduced Interrupt Operation)
bits which return multiple completing 16 bit handles in mailbox
registers.
3. The !)$*)$*~)@$*~)$* Qlogic target mode for parallel SCSI spec
changed such that at_reserved (which was 32 bits) was split into
two pieces- and one of which was a 16 bit handle id that functions
like the at_rxid for Fibre Channel (a tag for the f/w to correlate
CTIOs with a particular command). Since we had to muck with that
and this changed the whole handler architecture, we might as well...
Propagate new at_handle on through int ct_fwhandle. Follow
implications of changing to 16 bit handles.
These above changes at least get Qlogic 1040 cards working in target
mode again. 1080/12160 cards don't work yet.
In isp.c:
Prepare for doing all loop management in outer layers.
for selecting unit). Instead, use the resource hints mechanism.
One unfortunate situation here is that there is no resource_quad_value
function- which is what I needed for WWN boot time replacement. Worse-
you can't store the hint as just plain
hint.isp.0.nodewwn="0x50000000aaaa0001"
because this gets interpreted as an int- incorrectly because it can't
be converted to an int. I can't even get this as a string. To work
around this particular case for nodewwn && portwwn setting, this
rather grotesque form will be used:
hint.isp.0.nodewwn="w50000000aaaa0001"
hint.isp.0.portwwn="w50000000aaaa0002"
At the same time, if we have no hinted WWN, set the default WWN (which, btw,
gets overridden if the card has valid NVRAM, which is usual) to
0x400000007F000009ull (which translates to NAA == IPv4, 127.0.0.9).
Eliminate more printf's and replace them either with device_printf or
isp_prt calls.
ones where we have a CAM path) and replacing them with calls to isp_prt.,
Eliminate isp_unit references- we no longer have an isp_unit- we now
have an isp_dev that device_get_unit can work with.
for the ICB firmware options meant- *I* had taken it to
mean that if you set it, Node Name would be ignored and
derived from Port Name. Actually, it meant the opposite.
As a consequence- change ICBOPT_USE_PORTNAME to the
define ICBOPT_BOTH_WWNS- makes more sense.
Fix wrong input bitmap for MBOX_DUMP_RAM command. Call
ISP_DUMPREGS if we get a f/w crash. Add ISPCTL_RUN_MBOXCMD
control command (so outer layers can run a mailbox command
directly) and add a ISPASYNC_UNHANDLED_RESPONSE hook so
outer layers can understand response queue entries we
might not know about.
isp_iid_set/isp_iid for fibre channel- this is because we now
fake a port database entry for ourselves. Add the additional loop
states between LOOP_PDB_RCVD and LOOP_READY.
Change and comment on a wad of Fibre Channel isp_control functions.
Change and comment on some of the ISPASYNC Fibre Channel events.
the unit number doesn't get reused.
Make sure that if we've compiled for ISP_TARGET_MODE we set the
default role to be ISP_ROLE_INITIATOR|ISP_ROLE_TARGET.
Do some misc other cleanups.
and depending on role, make sure link is up, scan the fabric (if we're
connected to a fabric), scan the local loop (if appropriate), merge
the results into the local port database then, check once again
to make sure we have f/w at FW_READY state and the the loopstate
is LOOP_READY.
Comment out usage of ISP_SMPLOCK- I have my doubts that this works sanely
as yet because CAM itself still needs Giant. I *was* dropping my lock
and grabbing Giant when doing the upcall for completion, but this is all
seems ridiculous until CAM is fixed.
if we're ISP_ROLE_NONE. Change ISPASYNC_LOGGED_INOUT to ISPASYNC_PROMENADE.
Make sure we note if something is a fabric device.
Target mode:
Finally fix (to a first approximation) SCSI Target Mode again- we needed
to correctly check against CAM_TARGET_WILDCARD and CAM_LUN_WILDCARD
so that targbh won't confuse us. Comment out the drainqueue stuff for
now. Use isp_fc_runstate instead if isp_control/ISPCTL_FCLINK_TEST.
Remove ISP2100_FABRIC defines- we always handle fabric now. Insert
isp_getmap helper function (for getting Loop Position map). Make
sure we (for our own benefit) mark req_state_flags with RQSF_GOT_SENSE
for Fibre Channel if we got sense data- the !*$)!*$)~*$)*$ Qlogic
f/w doesn't do so. Add ISPCTL_SCAN_FABRIC, ISPCTL_SCAN_LOOP, ISPCTL_SEND_LIP,
and ISPCTL_GET_POSMAP isp_control functions. Correctly send async notifications
upstream for changes in the name server, changes in the port database, and
f/w crashes. Correctly set topology when we get a ASYNC_PTPMODE event.
Major stuff:
Quite massively redo how we handle Loop events- we've now added several
intermediate states between LOOP_PDB_RCVD and LOOP_READY. This allows us
a lot finer control about how we scan fabric, whether we go further
than scanning fabric, how we look at the local loop, and whether we
merge entries at the level or not. This is the next to last step for
moving managing loop state out of the core module entirely (whereupon
loop && fabric events will simply freeze the command queue and a thread
will run to figure out what's changed and *it* will re-enable the queu).
This fine amount of control also gets us closer to having an external
policy engine decide which fabric devices we really want to log into.
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.
Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
(so we can see rapidly whether something was a fabric device but is
now gone).
Add a tag which says what role this adapter should take. It can take
on the value of None, Target, Initiator or Both. None is useful for
warm failover purposes. Remove the ISP_CFG_NOINIT silliness since
a role of "None" does this.
Add a isp_lastmbxcmd tag to store the opcode for the last mailbox
command used.
Module) and FBM (Fibre Buffer Modules). Also remember to clear the
semaphore registers. Tell the RISC processor to not halt on FPM
parity errors.
Throw out the ISP_CFG_NOINIT silliness and instead go to the use of
adapter 'roles' to see whether one completes initialization or not
(mostly for Fibre Channel). The ultimate intent, btw, of all of this
is to have a warm standby adapter for failover reasons. Because
we do roles now, setting of Target Capable Class 3 service parameters
in the ICB for the 2x00 cards reflects from role. Also, in isp_start,
if we're not supporting an initiator role, we bounce outgoing commands
with a Selection Timeout error. Also clean out the TOGGLE_TMODE
goop for FC- there is no toggling of target mode like there is
for parallel SCSI cards.
Do more cleanup with respect to using target ids 0..125 in F-port
topologies. Also keep track of things which *were* fabric devices
so that when you rescan the fabric you can notify the outer layers
when fabric devices go away.
Only force a LOGOUT for fabric devices if they're still logged in
(i.e., you cat their Port Database entry. Clean up the Get All Next
scanning.
Finally, use a new tag in the softc to store the opcode for the
last mailbox command used so we can report which opcode timed
out.
that require us to register our FC4 types of interest. Allow ourselves, in
F-port topologies, to start logging in fabric devices in the target 0..125
range. Change ISPASYNC_PDB_CHANGED (misnamed) to ISPASYNC_LOGGED_INOUT.
Fix (*SMACK*) again some default WWN stuff. This is *really* hard to get
right across all the range of platforms.
WWNs correctly (Again!) - this time for the case that we're not going
to fully init the adapter if isp_init is called (with ISP_CFG_NOINIT
set in options). The pupose for this is to bring the adapter up to
almost ready to go, get info out of NVRAM, but to not start it up- leaving
it until later to actually start things up if wanted (and possibly with
different roles selected).
Add a test against isp->isp_osinfo.islocked prior to trying to see
whether --isp->isp_osinfo.islocked is zero to cause us to unlock
(non-SMPLOCK case).
(specifically, how many entries we've looked at so far). Maintain
interrupt instrumentation. Use USEC_SLEEP instead of USEC_DELAY in
a number of places (this allows us to drop locks and sleep instead
of spin). Track changes to configuration options for topology preference.
Fix botched order of printout for Channel, Target, Lun.
compile time will build in mutex locks, otherwise the old locking (splcam/splx
with a recursion counter) will be compiled in.
We still depend on config_intr_hook to tell us when it's okay to call
msleep instead of polling. It'd be real nice if we could do this early
enough to not hang up a machine struggling with a bad Fibre Channel loop,
but that's still to come.
the drivers.
* Remove legacy inx/outx support from chipset and replace with macros
which call busspace.
* Rework pci config accesses to route through the pcib device instead of
calling a MD function directly.
With these changes it is possible to cleanly support machines which have
more than one independantly numbered PCI busses. As a bonus, the new
busspace implementation should be measurably faster than the old one.
changes is that there's now a Solaris port of this driver, so some things
in the core version had to change (not much, but some).
In order, from the top.....:
A lot of error strings are gathered in one place at the head of the file.
This caused me to rewrite them to look consistent (with respect to
things like 'Port 0x%' and 'Target %d' and 'Loop ID 0x%x'.
The major mailbox function, isp_mboxcmd, now takes a third argument,
which is a mask that selectively says whether mailbox command failures
will be logged. This will substantially reduce a lot of spurious noise
from the driver.
At the first run through isp_reset we used to try and get the current
running firmware's revision by issuing a mailbox command. This would
invariably fail on alpha's with anything but a Qlogic 1040 since SRM
doesn't *start* the f/w on these cards. Instead, we now see whether we're
sitting ROM state before trying to get a running BIOS loaded f/w version.
All CFGPRINTF/PRINTF/IDPRINTF macros have been replaced with calls to
isp_prt. There are seperate print levels that can be independently
set (see ispvar.h), which include debugging, etc.
All SYS_DELAY macros are now USEC_DELAY macros. RQUEST_QUEUE_LEN and
RESULT_QUEUE_LEN now take ispsoftc as a parameter- the Fibre Channel
cards and the Ultra2/Ultra3 cards can have 16 bit request queue entry
indices, so we can make a 1024 entry index for them instead of the
256 entries we've had until now.
A major change it to fix isp_fclink_test to actually only wait the
delay of time specified in the microsecond argument being passed.
The problem has always been that a call to isp_mboxcmd to get he
current firmware state takes an unknown (sometimes long) amount of
time- this is if the firmware is busy doing PLOGIs while we ask
it what's up. So, up until now, the usdelay argument has been
a joke. The net effect has been that if you boot without being plugged
into a good loop or into a switch, you hang. Massively annonying, and
hard to fix because the actual time delta was impossible to know
from just guessing. Now, using the new GET_NANOTIME macros, a precise
and measured amount of USEC_DELAY calls are done so that only the
specified usecdelay is allowed to pass. This means that if the initial
startup of the firmware if followed by a call from isp_freebsd.c:isp_attach
to isp_control(isp, ISP_FCLINK_TEST, &tdelay) where tdelay is 2 * 1000000,
no more than two seconds will actually elapse before we leave concluding
that the cable is unhooked. Jeez. About time....
Change the ispscsicmd entry point to isp_start, and the XS_CMD_DONE
macro to a call to the platform supplied isp_done (sane naming).
Limit our size of request queue completions we'll look at at interrupt
time. Since we've increased the size of the Request Queue (and the
size of the Response Queue proportionally), let's not create an
interrupt stack overflow by having to keep a max completion list
(forw links are not an option because this is common code with
some platforms that don't have link space in their XS_T structures).
A limit of 32 is not unreasonable- I doubt there'd be even this many
request queue completions at a time- remember, most boards now use
fast posting for normal command completion instead of filling out
response queue entries.
In the isp_mboxcmd cleanup, also create an array of command
names so that "ABOUT FIRMWARE" can be printed instead of "CMD #8".
Remove the isp_lostcmd function- it's been deprecated for a while.
Remove isp_dumpregs- the ISP_DUMPREGS goes to the specific bus
register dump fucntion.
Various other cleanups.
isp_prt calls. We now use an argument to the ISPCTL_FCLINK_TEST
call. We change all IDPRINTF macros to isp_prt calls. We add
the isp_prt function here.
quite a bit so that all of the ports have a similar set of required
macros/definitions (and in similar places in the isp_<platform>.h
file).
Some new macros/functions added- Mailbox Acquire/Relase macros,
NANOTIME macros, SNPRINTf and STRNCAT. MemoryBarrier beomes
MEMORYBARRIER with much stronger types.
isp2100_fw_statename as an INLINE (now a function in isp.c). Remove
isp2100_pdb_statename (unused). Redo all ISP_SCSI_XFER_T as XS_T types.
Change all RQUEST_QUEUE_LEN/RESULT_QUEUE_LEN macros to take a parameter.
Add isp_print_bytes function.
when we're done reading it (makes checking things easier).
Before calling isp_notify_ack make sure we're at RUNSTATE-
elsewise we can be responding to LIPs or SCSI bus resets
before we've finished some of the wiring.
we need a function that tells the Qlogic f/w that a target mode command
is done, so increase the resource count for that lun. Add in a timeout
function to kick the putback again if we fail to do it the first time (we
may not have the request queue space for ATIO push). Split the function
isp_handle_platform_ctio into two parts so that the timeout function for
the ATIO push or isp_handle_platform_ctio can inform CAM that the requested
CTIO(s) are now done.
Clean up (cough) residual handling. What we need for Fibre Channel
is to preserve the at_datalen field from the original incoming ATIO
so we can calculate a 'true' residual. Unfortunately, we're not
guaranteed to get that back from CAM. We'll *try* to find it hiding
in the periph_priv field (layering violation)- but if an ATIO was
passed in from user land- forget it. This means that we'll probably
get residuals wrong for Fibre Channel commands we're completing
with an error. It's too late to 4.1 release to fix this- too bad.
Luckily the only device we'd really care about this occurring on
is a tape device and they're still so rare as FC attached devices
that this can be considered an untested combination anyway.
Remove all CCINCR usage (resource autoreplenish). When we've proved
to ourself that things are working properly, we can add it back
in.
Make sure we propage 'suggested' sense data from the incoming ATIO
into the created system ATIO- and set sense_len appropriately.
Correctly propagate tag values.
Fall back to the model of generating (well, the functions in isp_pci.c
do the work) multiple CTIOs based upon what we get from XPT. Instead
of being able to pair Qlogic generated ATIOs with CAM ATIOs, and then
to pair CAM CTIOs with Qlogic CTIOs, we have to take the CTIO passed
to us from XPT, and if it implies that we have to generate extra
Qlogic CTIOs, so be it. This means that we have to wait until the
last CTIO in a sequence we generated completes before calling xpt_done.
Executive summary- target mode actually now pretty much works well
enough to tell folks about.
sure that it really is by issuing a ISPCTL_ABORT_CMD just on the
off chance the f/w will start it up again and, ha ha, start using
the DMA resources we gave it but are now taking away.
us to not the ints are ok and also to (re)ENABLE isp interrupts. Remove
all splcam()/splx() invocates and replace them with ISP_LOCK/ISP_UNLOCK
macros.
to isp_osinfo substructure (all in prep for SMP). Define MBOX_WAIT_COMPLETE
and MBOX_NOTIFY_COMPLETE macros so that we can now (temp) use tsleep
to wait for mailbox completion. Requires us to guess whether we're
servicing an interrupt or not- will use intr_nesting_level.
Add local strncat function.
define). Fix stupidity wrt checking whether we've gone to
LOOP_PDB_RCVD loopstate- it's okay to be greater than this state.
D'oh! Protect calls to isp_pdb_sync and isp_fclink_state with IS_FC
macros.
Completely redo mailbox command routine (in preparation to make this
possibly wait rather than poll for completion).
Make a major attempt to solve the 'lost interrupt' problem
1. Problem
The Qlogic cards would appear to 'lose' interrupts, i.e., a legitimate
regular SCSI command placed on the request queue would never complete
and the watchdog routine in the driver would eventually wakeup and
catch it. This would typically only happen on Alphas, although a
couple folks with 700MHz Intel platforms have also seen this.
For a long time I thought it was a foulup with f/w negotiations of
SYNC and/or WIDE as it always seemed to happen right after the
platform it was running on had done a SET TARGET PARAMETERS mailbox
command to (re)enable sync && wide (after initially forcing
ASYNC/NARROW at startup). However, occasionally, the same thing
would also occur for the Fibre Channel cards as well (which, ahem,
have no SET TARGET PARAMETERS for transfer mode).
After finally putting in a better set of watchdog routines for the
platforms for this driver, it seemed to be the case that the command
in question (usually a READ CAPACITY) just had up and died- the
watchdog routine would catch it after ~10 seconds. For some platforms
(NetBSD/OpenBSD)- an ABORT COMMAND mailbox command was sent (which
would always fail- indicating that the f/w denied knowledge of this
command, i.e., the f/w thought it was a done command). In any case,
retrying the command worked. But this whole problem needed to be
really fixed.
2. A False Step That Went in The Right Direction
The mailbox code was completely rewritten to no longer try and grab
the mailbox semaphore register and to try and 'by hand' complete
async fast posting completions. It was also rewritten to now have
separate in && out bitpatterns for registers to load to start and
retrieve to complete. This means that isp_intr now handles mailbox
completions.
This substantially simplifies the mailbox handling code, and carries
things 90% toward getting this to be a non-polled routine for this
driver.
This did not solve the problem, though.
3. Register Debouncing
I saw some comments in some errata sheets and some notes in a Qlogic
produced Linux driver (for the Qlogic 2100) that seemed to indicate
that debouncing of reads of the mailbox registers might be needed,
so I added this. This did not affect the problem. In fact, it made
the problem worse for non-2100 cards.
5. Interrupt masking/unmasking
The driver *used* to do a substantial amount of masking/unmasking
of the interrupt control register. This was done to make sure that
the core common code could just assume it would never get pre-empted.
This apparently substantially contributed to the lost interrupt
problem. The rewrite of the ICR (Interrupt Control Register),
which is a separate register from the ISR (Interrupt Status Register)
should not have caused any change to interrupt assertions pending.
The manual does not state that it will, and the register layout
seems to imply that the ICR is just an active route gate. We only
enable PCI Interrupts and RISC Interrupts- this should mean that
when the f/w asserts a RISC interrupt and (and the ICR allows RISC
Interrupts) and we have PCI Interrupts enabled, we should get a
PCI interrupt. Apparently this is a latch- not a signal route.
Removing this got rid of *most* but not all, lost interrupts.
5. Watchdog Smartening
I made sure that the watchdog routine would catch cases where the
Qlogic's ISR showed an interrupt assertion. The watchdog routine
now calls the interrupt service routine if it sees this. Some
additional internal state flags were added so that the watchdog
routine could then know whether the command it was in the middle
of burying (because we had time it out) was in fact completed by
the interrupt service routine.
6. Occasional Constipation Of Commands..
In running some very strenous high IOPs tests (generating about
11000 interrupts/second across one Qlogic 1040, one Qlogic 1080
and one Qlogic 2200 on an Alpha PC164), I found that I would get
occasional but regular 'watchdog timeouts' on both the 1080 and
the 2100 cards. This is under FreeBSD, and the watchdog timeout
routine just marks the command in error and retries it.
Invariably, right after this 'watchdog timeout' error, I'd get a
command completion for the command that I had thought timed out.
That is, I'd get a command completion, but the handle returned by
the firmware mapped to no current command. The frequency of this
problem is low under such a load- it would usually take an 30
minutes per 'lost' interrupt.
I doubled the timeout for commands to see if it just was an edge
case of waiting too short a period. This has no effect.
I gathered and printed out microtimes for the watchdog completed
command and the completion that couldn't find a command- it was
always the case that the order of occurrence was "timeout, completion"
separated by a time on the order of 100 to 150 ms.
This caused me to consider 'firmware constipation' as to be a
possible culprit. That is, resubmission of a command to the device
that had suffered a watchdog timeout seemed to cause the presumed
dead command to show back up.
I added code in the watchdog routine that, when first entered for
the command, marks the command with a flag, reissues a local timeout
call for one second later, but also then issues a MARKER Request
Queue entry to the Qlogic f/w. A MARKER entry is used typically
after a Bus Reset to cause the f/w to get synchronized with respect
to either a Bus, a Nexus or a Target.
Since I've added this code, I always now see the occasional watchdog
timeout, but the command that was about to be terminated always
now seems to be completed after the MARKER entry is issued (and
before the timeout extension fires, which would come back and
*really* terminate the command).
comment. Check against firmware state- not loop state when enabling
target mode. Other changes have to do with no longer enabling/disabling
interrupts at will.
Rearchitect command watchdog timeouts-
First of all, set the timeout period for a command that has a
timeout (in isp_action) to the period of time requested *plus* two
seconds. We don't want the Qlogic firmware and the host system to
race each other to report a dead command (the watchdog is there to
catch dead and/or broken firmware).
Next, make sure that the command being watched isn't done yet. If
it's not done yet, check for INT_PENDING and call isp_intr- if that
said it serviced an interrupt, check to see whether the command is
now done (this is what the "IN WATCHDOG" private flag is for- if
isp_intr completes the command, it won't call xpt_done on it because
isp_watchdog is still looking at the command).
If no interrupt was pending, or the command wasn't completed, check
to see if we've set the private 'grace period' flag. If so, the
command really *is* dead, so report it as dead and complete it with
a CAM_CMD_TIMEOUT value.
If the grace period flag wasn't set, set it and issue a SYNCHRONIZE_ALL
Marker Request Queue entry and re-set the timeout for one second
from now (see Revision 1.45 isp.c notes for more on this) to give
the firmware a final chance to complete this command.
store a bitmask of whether we've set a value into ccb->ccb_h.status,
whether we're in the watchdog routine for this command now, whether
we've set a grace period for this command and whether this command is
actually done.
See comments of rev 1.45 of isp.c for more complete information.
output mailbox values we want to get back out of the chip once a mailbox
command is done. Add storage for the maximum number of output mailbox
registers to the softc.
Roll minor version number.
the handle (i.e., generation number), so we will now need a function that
will take a handle and return a flat index [ 0 .. maxhandles-1 ] for
auxillary routines that need an index to get at buddy store values
(like dma maps or xflist pointers).
Force alphas to prefer mem mapping as the default.
Basically, we have a pointer to a function which we can call which will
return us a pointer to firmware for the card we have. We call this function
(if it's non-NULL) with the address of our mdvec f/w pointer.
The way this works is that if ispfw (as a module or a static) is loaded,
it initializes the pointer in isp_pci, so we can call into to it to fetch
a pointer to a f/w set.
If ispfw is MOD_UNLOADed, it's retained a pointer to our mdvec f/w pointers,
which then get zeroed out so we don't have any references to data that's
now gone from kernel memory. Removing the f/w saves ~360KBytes.
Alas, there is no autounload mechanism that works for is here.
through, establish what our LUN width is. Unfortunately, we can't ask
the f/w. If we loaded the f/w, we'll now assume we have expanded LUNs
(SCCLUN for fibre channel, just plain 32 LUN for SCSI). If we didn't
load firmware, assume 8 LUNs for SCSI and 1 LUN for Fibre Channel. We
have to assume only one LUN for Fibre Channel because the LUN setting
in Request Queue entries is in different places whether we have SCCLUN
firmware or not, so the only LUN guaranteed to work for both is LUN 0.
Clean up the rest of isp.c so that ISP2100_SCCLUN defines aren't used-
instead use run time determinants based upon isp->isp_maxluns.
After starting firmware, delay 500us to give it a chance to get rolling.
Fix the interrupt service routine to check for both isr && sema being zero
before thinking this was a spurious interrupt. Following the manuals,
allow for both Mailbox as well as Queue Reponse type interrupts for regular
SCSI.
(we always support fabric now). Remove SCCLUN definition (we always
support SCCLUN now, if we load the f/w). Add typedef definition of an
external firmware fetch function.
What we'd like to know is whether or not we have a listener
upstream that really hasn't configured yet. If we do, then
we can give a more sensible reply here. If not, then we can
reject this out of hand.
Choices for what to send were
Not Ready, Unit Not Self-Configured Yet
(0x2,0x3e,0x00)
for the former and
Illegal Request, Logical Unit Not Supported
(0x5,0x25,0x00)
for the latter.
We used to decide whether there was at least one listener
based upon whether the black hole driver was configured.
However, recent config(8) changes have made this hard to do
at this time.
Actually, we didn't use the above quite yet, but were sure considering it.
changes: consider a new PDB entry different if Class 3 service parameter
roles change (!!!). Do some checking as we're getting a port database
that traps whether things change while we're doing so. Handle N-port
and F-ports correctly. Fix the fabric login loop to retain a login/binding
if things haven't changed (I mean, why logout a device only to log it back
in). No longer accept, after fabric logins, garbage if we can't get a PDB
entry that matches the device we've just logged into- if it doesn't, log
it out as it is very unlikely to still be what we thought it was. Get rid
of some of the debounce loops because we could get stuck there.
Apparently the f/w has finished the command, but somehow an interrupt is
being lost. So, we just plain wedge when booting alphas.
This is a general routine we've needed for a while.
where we can have targets (based on topology).
Much more importantly, make sure all mods to isp_sendmarker or |= so
we don't lose the marking of a bus that needs to have a marker sent for it.
require full logins after a LIP, which always led to loop resets, and
various other perturbations.
Update 2200 f/w from 2.01.00 release to 2.01.09 release.
seriously- only attempt to logout a previously logged in fabric device.
Fix a longstanding bug for aborting overtime commands- handle halves
have always been reversed.
Clean up some error messages to indicate channel number.
Approved:jkh
Andrew's problems with SCSI on some alphas- do not call isp_update
directly to update parameters- just mark them as being ready to
update for the next command- the system would just hang on a READ
CAPACITY for a drive. Really annoying because it wouldn't even timeout
(and it has a timeout) so either the SET PARAMETERS call was nuking
things or the f/w was really dropping the ball.
approved: jkh
Reviewed by: gallatin@freebsd.org
is gone as a define. We just don't support fast posting for anything less
than the 1240/1080/1280/12160 or Fibre Channel cards.
Put in support for CDB's larger than 12 bytes for parallel SCSI (up to 44
bytes are allowed).
Approved: jkh
for 1020/1X80/12160/2X00- for readability. Add in 12160 (Ultra3)
support- but not with PPR just yet. Fix and clarify fetching of
return parameter for getting firmware rev which for the 2200 contains
the connection topology (Private Loop (NL-port), N-port, FL-port,
F-port). Synthesize the connection topology for the 2100 which can
only be Private Loop or FL-port. Handle a couple of new async
mailbox commands which signify connection in Point-to-Point mode
(N-port or F-port) or indicate various toe stubbing getting to same.
Approved: jkh@freebsd.org