aic7xxx parts. This problem could result in data corruption
during periods of my PCI bus load by busmasters other than the
aic7xxx.
Many thanks to Andrew Gallatin <gallatin@cs.duke.edu> for characterizing
the symptoms of this problem and testing this fix.
Honor the 'bus reset at startup' option now that the XPT properly
handles transfer negotiation in this scenario.
Honor the sync rate settings on Ultra2 controllers. We would
always negotiate at the fastest speed. Oops.
aic7xxx.h:
Whitespace.
aic7xxx.seq:
Fix a minor nit that would cause the controller to miss the update
of the negotiation required bitmask causing the negotiation to
be delayed by a command.
tell the sequencer to pause itself for a target msg variable update. This
avoids the pause race entirely as HS_MAILBOX can be accessed without
pausing the chip.
3.2 Merge candidate.
Don't mess with the IRQMS bit in the host control register unless
we are an aic7770 chip.
Use calling context to determine if the card is already paused when
we update the target message request bit field in controller scratch
ram. Looking at the paused bit in the HCNTRL register opened up a
race condition.
Insert delays in the target message request update routine as a temporary
work around for what looks like a chip bug. I'm still investigating this
one.
Fix the Abort/Abort Tag/BDR handler to pull its message from the message
buffer in our softc instead of attempting to get it from a register on
the controller. The message is never recorded by the controller in the
new message scheme.
Don't rely on having an SCB when a BDR occurs. We can issue these during
invalid reconnects to.
Fix a few cases where we were restarting the sequencer but then still
falling out of a switch statement to unpause the sequencer again.
This could cause us to mess up sequencer state if it generated another
pausing interrupt between the time of the restart and unpause.
Kill the 'transceiver settle' loop during card initialization. I
failed to realize that a controller that is not connected to any
cables will never settle or enable the SCSI transceivers at all.
The correct solution is to monitor the IOERR interrupt which indicates
that the transceiver state has changed (UW<->LVD).
Modify the aic7xxx assembler to properly echo input when stdin is not
a tty.
connection.
Clean up support for devices featuring the multiple target SCSI ID feature.
On aic7890/91/96/97 chips, we can now assume the target role on multiple
target ids simultaneously. Although these chips also have sufficient
instruction space to hold to support the initiator and target role at the
same time, the initiator role is currently disabled as it will conflict
(chip design restriction) with the multi-tid feature. I'll probably add
a nob to enable the initiator (there-by disabling multi-tid) some time
in the future.
Return queue full or busy, depending on the tagged nature of the incoming
request, if our command input queue fills up in host memeory.
Deal with accept target I/O resource shortages.
If we get an underrun on a transaction that wasn't supposed to transmit
any data, don't attempt to print out the S/G list. The code would
run until hitting a non-present page. (oops)
black hole device. The controller will now only accept selections if
the black hole device is present and some other target/lun is enabled
for target mode.
Handle the IGNORE WIDE RESIDUE message. This support has not been tested.
Checkpoint work on handling ABORT, BUS DEVICE RESET, TERMINATE I/O PROCESS,
and CLEAR QUEUE messages as a target.
Fix a few problems with tagged command handling in target mode.
Wait until the sync offset counter falls to 0 before changing phase
after a data-in transfer completes as the DMA logic seems to indicate
transfer complete as soon as our last REQ is issued.
Simplify some of the target mode message handling code in the sequencer.
Use the host message loop for any unknown message types instead of performing
a reject message in the sequencer. Pass reject messages to the host
message loop too which frees up a sequencer interrupt type slot.
Default to issuing a bus reset if initiator mode is enabled. It seems
that the reset scsi bus bit is not defined in the same location for
all aic78xx BIOSes, so attempting to honor this setting will have to
wait until I get more information on how to detect it.
Nuke some unused variables.
in target mode, but we are not completing the command.
Use a template of allowed bus arbitration phases to selectively and
dynamically enable/disable initiator or target (re)selection.
Properly handle timeouts for target role transactions - just go to the
bus free state and report the error to the peripheral driver.
Checkpoint support for the XPT_ABORT_CCB function code. This currently
handles the accept tio and immediate notify ccb types, but does not
handle the continue target I/O or SCSI I/O ccb types. This is enough
to handle dynamic target enable/disable events.
Clean up the SCSI reset code so that we perform at most 1 SCSI bus
reset at initialization, the reset requested by the XPT layer.
is more robust and common code can be used for both the target and iniator
roles. The mechanism for tracking negotiation state has also been simplified.
Add support for sync/wide negotiation in target mode and fix many of
the target mode bugs running at higher speeds uncovered. Make a first
stab at getting all of the bus skew delays correct. Sync+Wide dataout
transfers still cause problems, but this may be an initiator problem.
Ensure that we exit BITBUCKET mode if the controller is restarted.
Add support for target mode only firmware downloads. This has been
tested on the aic7880, but should mean that we can perform target mode
on any aic7xxx controller. Mixed mode (initiator and target roles in
the same firmware load) is currently only supported on the aic7890, but
with optimization, may fit on chips with less instruction space.
use a 256 entry ring buffer of descriptersfor this purpose. This allows
the use of a simple 8bit counter in the sequencer code for tracking start
location.
Entries in the ring buffer now contain a "cmd_valid" byte at their tail.
As an entry is serviced, this byte is cleared by the kernel and set by
the sequencer during its dma of a new entry. Since this byte is the last
portion of the command touched during a dma, the kernel can use this
byte to ensure the command it processes is completely valid.
The new command format requires a fixed sized DMA from the controller
to deliver which allowed for additional simplification of the sequencer
code. The hack that required 1 SCB slot to be stolen for incoming
command delivery notification is also gone.
- Convert to CAM
- Use a new DMA based queuing and paging scheme
- Add preliminary target mode support
- Add support for the aic789X chips
- Take advantage of external SRAM on more controllers.
- Numerous bug fixes and performance improvements.
data fifo is full, but the PCI input latch is not empty,
HDMAEN cannot be cleared. The fix used here is to attempt
to drain the data fifo until there is space for the input
latch to drain and HDMAEN de-asserts.
This is a 1 instruction fix, so it should have no performance
impact.
operands that are set during seqeuncer program download instead of at
assembly time.
Convert the sequencer code to use" downloaded constants" for four run time
constants that vary depending on the board type. This frees up 4 bytes
of sequencer scratch ram space where these constants used to be stored and
also removes the additional instructions required to load their values
into the accumulator prior to using them.
Remove the REJBYTE sram variable. The host driver can just as easly
read the accumulator to get this value.
The scratch ram savings is important as the old code used to clober the
SCSICONF register on 274X cards which sits near the top of scratch ram
space. The SCSICONF register controls bus termination, and clobbering
it is not a good thing. Now we have 4 bytes to spare.
This should fix the reported problems with cards that don't have devices
attached to them failing with a stream of "Somone reset bus X" messages.
Doug Ledford determined the cause of the problem, fixes by me.
entry to the QOUTFIFO when it is full. This should eliminate the
"Timed out while idle" problems that many have reported.
In truth, this is somewhat of a hack. Although are interrupt latency is
low enough that we should be able to always service the queue in time,
since each entry must be passed up to the higher SCSI layer for what can
be a large amount of processing (perhaps even resulting in a new command
being queued) with interrupts disabled, we need this mechanism to avoid
overflow. In the future, these additional tasks will be offloaded to a
software interrupt handler which should make this hack unnecessary.
if SCB Paging was enabled:
disconnect with more data to transfer
disconnected SCB gets paged out
target reconnects so we page SCB back in
target completes transfer so residual is 0
target disconnects
SCB gets reused but not paged out since the residual is 0 (optimization)
target reconnects so we page the SCB back in
we report a residual because of stale residual information.
The fix for this is to set a flag that forces the SCB to be paged back
up to the host if we page in an SCB with a residual
Pointed out by: Doug Ledford <dledford@dialnet.net>
to fix a selection timeout problem.
If we can't find an SCB for the reconnecting target, issue a bus device
reset as the SCSI2 spec suggests.
Add a missing call to "add_scb_to_free_list" in the non paging case. In
the non-paging case, the SCBs don't really need to be on the free list,
but putting them there clears the tag field which is something the recovery
code depends on.
Be consistant about testing for parity errors after waiting for a
REQ on the bus.
Don't ack the last byte in a transaction until after we've cleared
all target state.
aic7xxx_asm.c:
Test the return value of getopt against -1 not EOF. (Yet another
shameless victum of the style guide being wrong).
loop, test for them separately. The bug report from David Malone showed that
even though we had been reselected (SELDI was true), we sat in the poll for
work loop until the selection timeout timer expired. It may be that the
SSTAT0 register doesn't like to have more than one bit tested at a time.
I've seen stranger things than this on these parts.
either by looking it up in the array of pending, per target, untagged
transactions, or by using the tag value passed in during the identify. The
old code only direct indexed for tagged transactions. This makes the
"findSCB" routine only necessary when SCB paging is enabled, so appropriately
conditionalize it. This greatly simplifies the non SCB paging code flow.
Stick 4 more, twin channel only, instructions behind
.if ( TWIN_CHANNEL)
aic7xxx_asm.c:
Add the -O options which allows the specification of which options
to include in a program listing. This makes it possible to easily
determine the address of any instruction in the program across
different hardware/option configurations. Updated usage() as well.
New sequencer assembler for the aic7xxx adapters. This assembler
performs some amount of register type checking, allows bit
manipulation of symbolic constants, and generates "patch tables"
for conditionalized downloading of portions of the program.
This makes it easier to take full advantage of the different
features of the aic7xxx cards without imposing run time penalies
or being bound to the small memory footprints of the low end
cards for features like target mode.
aic7xxx.reg:
New, assembler parsed, register definitions fo the aic7xxx cards.
This was done primarily in anticipation of 7810 support which
will have a different register layout, but should be able to use
the same assembler. The kernel aic7xxx driver consumes a generated
file in the compile directory to get the definitions of the register
locations.
aic7xxx.seq:
Convert to the slighly different syntax of the new assembler.
Conditionalize SCB_PAGING, ultra, and twin features which shaves
quite a bit of space once the program is downloaded.
Add code to leave the selection hardware enabled during reconnects
that win bus arbitration. This ensures that we will rearbitrate
as soon as the bus goes free instead of delaying for a bit.
When we expect the bus to go free, perform all of the cleanup
associated with that event "up front" and enter a loop awaiting
bus free. If we see a REQ first, complain, but attempt to
continue. This will hopefully address, or at least help diagnose,
the "target didn't send identify" messages that have been reported.
Spelling corrections obtained from NetBSD.
time that we really want to do this is when a bus reset causes the sequencer
to be reset and the kernel driver now handles this case.
Remove some reordering in the select2 routine that wasn't necessary.
It was an experimental fix for a race condition I fixed elsewhere, and
confused the code flow.
Don't bother looping on a parity error in the mesgout loop since we can't
see parity errors on out phases.
Clean up the mesgin_identify code. In the old days, we "snooped" for tag
messages and used this as an indicator of whether or not the target was
using tagged transactions. This forced the sequencer to ack the identify
before determining if a valid SCB matched the target meaning that an abort
message to handle this case might not be seen before the target entered a
data phase. Since we can determin the "tagged-ness" of a target by looking
it up in the array of busy targets (recently introduced), we can determine
this up front simplifying the search code as well as ensuring we can follow
the SCSI specs method for rejecting a reselection.
When an SCB is placed on the free list, set its SCB_TAG to SCB_LIST_NULL.
This makes it much easier for the kernel driver to find active SCBs on the
card during error recovery.
negotiation messages may be tagged, we were overrunning the old buffer.
The variable that was getting squashed is updated before the message goes
out, causing corrupted SDTR or WDTR messages. Depending on the phases
traversed before message out, this could cause the wrong offset to be
negotiated allowing data overruns to occur. The problem is easier to
detect with wide targets on the chain since the allowed offset is smaller.
Also removed the unnecessary clearing of SPIORDY during the message out
phase. We don't rely on SPIORDY any more.
When setting the HCNT registers, do so in ascending order.
When performing tagged queueing in non-paging mode, also check the
disconnected bit in the SCB as extra sanity during a reconection.
Make the labels in the DMA routine more sane.
When doing a DMA, if we see the DMADONE condition come true, we can
simply turn of the DMA enable bits in DFCNTRL without testing the FIFO
state as HDONE is true when DMADONE is true and this emplies the FIFO is
empty.
These changes clear up the data overrun error messages and seem to prevent
the "timed out in data-in phase" problems.
free.
When we clear SCSIRATE, also clear the FAST20 bit in SXFRCTL0. This also
allowed me to clean up some of the ULTRA code.
ULTRAENB->FAST20 to follow the convention in the Adaptec data books.
Fix the data-overrun code to set both stcnt and hcnt otherwise, the transfer
will just hang until we get a timeout.
Add implicit support for the NOOP message. I've never heard of the driver
issueing a reject for one, but its silly to reject NOOP and who knows how a
device might react.
In the dma routine, check SDONE before cleaing SDMAEN. The data books mention
SDONE possibly being cleared when SDMAEN is reset. Clients of dma now need
to check if SINDEX is cleared to know if a phasemis occured.
Fix some comments to be correct.
host DMAs. The additional test to ensure that the DMA has stopped is also
unnecessary since we've already waited for the DMA to complete.
Update my copyright for the new year.
Expand the boundaries of a pause disabled region to close of possible race
condition.
Revert a portion of the DMA code to fix false overruns.
Add a missing "add_scb_to_free_list" so we don't leak SCBs.
SDONE, not HDONE.
In the data phase dma handler, mask off just the enable bits instead of
clearing the whole register. Clearing the direction bit could be bad.
Also don't stop a DMA until MREQPEND goes false. Doing this may cause
an ABORT on the PCI bus although I have yet to see this happen.
Add definitions for MREQPEND and the BRDCTL register. The BRDCTL register
is used to handle high byte termination and automatic termination testing.
Only enable reselections once the channel and SCSIRATE have been cleared.
Add a pause block around the test busy code in the non-tagged case to simplify
error recovery in the corner case of aborting an SCB that just got started.
Simplify reselection processing by removing the call to initialize_scsiid.
Clear the scsiseq re/select control bits and setup for catching bogus
busfrees earlier in the re/select process.
Improve the automatic PIO code. It turns out that SPIORDY is not a reliable
hardware condition bit, so use REQINIT intstead. Don't rely on PHASEMIS
either since it can take too long to come true. Use a brute force comparison
instead.
Remove some unnecessary overhead in the command complete processing. It
should be nearly impossible to overflow the QOUTFIFO (worst case 9 command
have to complete with at least 6 of them requiring paging on an aic7850),
so don't take the additional PIO hit to guard against this condition. If we
don't see our interrupt in time, the system has bigger problems elsewhere.
If this ever does happen, the timeout handler will notice and retry the
command.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
to miss reselections from some devices and since the reselection response
timeout is only 200ns, enabling reselections too late may be the cause of our
problem.
Immediate SCBs, since they always send messages that tell the target to
transition to bus free now rely on the busfree interrupt instead of the
IMMEDDONE sequencer interrupt that was generated before.
Rearrange some code in the message out loop to give ATN a little more time
to drop before we ACK the last byte.
Use SPIORDY instead of REQINIT when snooping for a tag message on a reconnect.
This is done for the same reasons we use SPIORDY in the inb functions.
When going into BITBUCKET mode, turn off HDMAEN in the DFCNTRL register so
that we can "not care" what the value of HCNT is. If HCNT is 0, BITBUCKET
mode won't transfer any data if HDMAEN is set. Seeing as we don't want the
transfer to even think about touching the host, this seems more sane anyway.
Thanks to "Dan Willis" <dan@plutotech.com> for pointing out that this was
a problem.
SPIORDY should go active on any REQ of the bus, so testing for REQINIT is
not necessary. It also seems that testing for SPIORDY is more robust then
REQINIT since SPIORDY comes active after REQINIT and PHASEMIS seems to take
some time to come true after REQ is asserted if the phase has changed. Of
course, none of this is documented.
This should give the code savings of my original changes, without breaking the
driver on fast peripherals.
initial selection when entering the status phase. This is the same assertion
we use for all the other data transfer phases.
Hopefully fix the hangs in the mesgin and mesgout phases that I introduced
last week during some code cleanup. I need to get some of these 12MB/s
drives so I can reproduce these hangs here...
Add a pause disable in the SCB paging case around our manipulation of the
QOUTQCNT variable. This is simply extra sanity.
Set LASTPHASE to P_BUSFREE once we see a busfree so that the kernel driver can
differentiate this from a data out phase.
1) get_free_or_disc_scb was not being passed its argument correctly
in one case
2) Add protection in the form of the QOUTQCNT variable to prevent
overflowing the QOUTFIFO.
This should make SCB Paging work. Really, I mean it now. 8-)
used mvi instead of mov. Luckily this code is most likely never executed
since it is only there for sanity should a target goes into the data phase
twice during a single selection or reselection.
SCB paging is now handled almost entirely by the sequencer and also uses
DMA. This should make SCB paging at least an order of magnitude more
efficient and vastly simplifies the implementation.
Add a few space optimizations so this code still fits on aic7770 chips.
Update comments.
mode when this occurs and allow the target to complete the transaction.
Force a retry on overruns since they are usually caused by termination or
cable problems.