Commit Graph

133 Commits

Author SHA1 Message Date
Justin T. Gibbs
539fc95753 Print out some more diagnostic information when we reject a message.
When we request sense, don't allow disconnection.  This closes a window
where we might allow an overlapped tagged and non-tagged transaction.
The correct fix is to freeze the queue for the target that requests sense
which is what will happen in the new CAM framework.
1997-04-26 05:03:18 +00:00
Justin T. Gibbs
ba5da33265 No longer use AAP for queueing SCBs to the QINFIFO.
Clean up the unexpected busfree handler.  We now look directly at the
message that might have caused the bus free to occur instead of looking
at an SCB flag.  This makes the handling more robust and also allows for
recovery actions that might cause an "unexpected busfree" to be performed
even if an SCB is not availible to "tag".  Most notably, this happens
when we don't find an SCB for a reconnecting target.
1997-04-18 16:34:36 +00:00
Justin T. Gibbs
085059c3ea Be more careful about how SCBs are cleaned up during error recovery.
Add some more diagnostic information to timeouts.
1997-04-14 02:27:50 +00:00
Justin T. Gibbs
f91ddb3f3c Drop the number of allowed tags back down to 8. Pluto uses a higher value
which mistakenly got committed.

Fix two bugs in the ahc_reset_device code:
	Limit search for SCBs to process to those that are active and
	are not queued for done processing.

	It's okay for an SCB to not have a waiting next SCB.
1997-04-10 19:14:58 +00:00
Justin T. Gibbs
8d94b804a1 Fix an infinite loop caused by calling ahc_run_done queue while the
driver is waiting a bus settle delay.  There should really be a facility
for the controller driver to "freeze" its queue during recovery operations
which would make all of this gymnastics unnecessary.
1997-04-07 18:32:47 +00:00
Justin T. Gibbs
844b7c2d86 Fix a bug in the selection timeout handler that was introduced when the
selection loop was merged with the poll_for_work loop.  We cannot assume
that the SCB for the selection timeout is the current SCB.  Instead we
must look at the SCB at the head of the waiting for selection list.

This fixes part of a problem reported by David Malone, but does not explain
why he was getting selection timeouts in the first place.
1997-04-05 21:41:13 +00:00
Justin T. Gibbs
0f3a99aade Now that we use AAP, we have to explicitly unpause the sequencer when
queueing an abort SCB.
1997-04-04 19:36:04 +00:00
Justin T. Gibbs
f98b4a3bdc NOOP commit to correct the comment for the last commit:
Bump the timeout for an "ordered tag" recovery action from 1 to 5 seconds.

Remove the multiple timeout panic.  Its very easy to get into a situation
where a timedout command will time out a second time even though the
recovery code is working fine.  A good example is:

1) Command times out during recovery
2) reset the timeout for the command
3) Recovery actions complete and all transactions are requeued
4) second timeout fires off which puts us back into recovery bogusly
5) another transaction that timedout once during the first recovery action
   times out causing the panic.

In essence, the correct solution to the problem is to put every transaction
back up into the work queue and have their timeout handling done in the same
way that all commands are handled.  The CAM layer makes this easy, so it
will have to wait until then.
1997-04-04 04:21:43 +00:00
Justin T. Gibbs
5ea0ae111d When not using SCB paging, we can always directly index the SCB of interest
either by looking it up in the array of pending, per target, untagged
transactions, or by using the tag value passed in during the identify.  The
old code only direct indexed for tagged transactions.  This makes the
"findSCB" routine only necessary when SCB paging is enabled, so appropriately
conditionalize it.  This greatly simplifies the non SCB paging code flow.
1997-04-04 04:09:29 +00:00
Justin T. Gibbs
c2df2e4bc2 Fix a fencepost error in ahc_find_scb that could cause us to wrongfully
find an SCB still down on the card that was paged out.  This only affects
error recovery.

Submitted by: Daniel M. Eischen <deischen@iworks.InterWorks.org>
1997-03-24 17:42:25 +00:00
Justin T. Gibbs
7d951713e8 Fix a nasty bug that meant a QUEUE_FULL status would result in a lost
SCB.  This is probably a main reason for the recent reports of timeouts.
1997-03-24 05:05:18 +00:00
Bruce Evans
6617929964 Removed nested #includes of <scsi/scsi_debug.h> and <scsi/scsi_driver.h>
from <scsi/scsiconf.h> and fixed everything that depended on them.
1997-03-23 06:33:55 +00:00
Justin T. Gibbs
f9380a619e Adapt to some changes in the register definitions. Clear the selection
enable in SCSISEQ during error recovery to deal with the way the
sequencer leaves selections enabled now.  Add code to perform "patching"
during sequencer program download.

Spelling fixes obtained from NetBSD.
1997-03-16 07:12:07 +00:00
Justin T. Gibbs
8a709f7876 When we perform an "automatic request sense", we issue an untagged command.
The sequencer expects untagged transactions to have the SCBID of the
transaction in the "busy target" array.  So, ensure that the busy entry
is up to date for the target in this case.  The new identify code in the
sequencer that performs additional sanity checking got caught up when a
tagged transaction created an untagged request sense.

In ahc_handle_seqint, ensure that the target ID is taken from the right
place.  In the case of a selection, the ID is in SCSIID.  In the case of
a reconnection it is found in SELID.
1997-03-01 06:50:41 +00:00
Justin T. Gibbs
bccca068ce Functionalize some code that was repeated throughout the driver.
Fix a bug in the initialization of the busreset_args that left the B channel
args unitialized and the A channel ones initialized to B's vales.  Oops.

If we get a NO_IDENT sequencer interrupt (the reconnecting target didn't
issue an identify or botched it), reset the bus instead of panicing.  We
should be able to recover from this error.

In the AWAITING_MSG handler, order messages by severity.  Since the message
we send is based on a flag on the SCB, it is possible, during error recovery,
to get more than one flag set.  This is fine since any time a new flag is
set, it is meant to take us to a more draconian level of recovery.  This
also ensures that we don't lose any "history" of what the command has gone
through.

When we reset the bus, reset the "send ordered tag" bitmask.

Clear some additional interrupt status when we perform a bus reset.
1997-02-28 03:58:21 +00:00
Justin T. Gibbs
5a02006d3e Fix numerous problems with the abort/recovery code. Highlights include fixing
a race condition in how SDTR and WDTR negotiation are handled, fixes for multi-lun
non-tagged device recovery, and ensuring that the timedout scbs in the waiting queue
are cleaned up.

Fix a problem with SCB paging that caused bogus residuals to be reported.
1997-02-25 03:05:35 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Justin T. Gibbs
4d04269a37 Fix some more problems in the recovery code.
Cleanup of the disconnected list was broken in the SCB paging case
    (confusion of NULLand SCB_LIST_NULL)
Implement a clean mechanism for determining that we have exited the timeout
    state and test for this in ahc_done instead of all over the place.
Bring back the use of AAP (Auto Access Pause) I don't think it was the
    true cause of the bus hangs people were reporting.
We want to reset the bus if we've been through an Abort action, not if
    we are a recovery SCB (one implies the other, but not vice-versa).
1997-02-19 01:00:51 +00:00
Justin T. Gibbs
df98fb1bb2 Kill the initialization of two old scratch ram variables. They were removed
to make space for the larger message buffer.
1997-02-18 20:23:09 +00:00
Justin T. Gibbs
4e605384b5 Don't rely on AAP(Auto Access Pause) when queueing SCBs to the card. This
will increase the overhead of queueing a command, but some recent bug reports
make me believe that AAP isn't really working and that we are losing some
SCBs from the input queue.  Hopefully this will cure that problem.

Fix some bugs in the error recovery code.  Mainly these could cause us to
inadvertantly forget to untimeout an SCB that was recovered causing later
confusion.
1997-02-18 04:25:31 +00:00
David Greenman
0305f20493 Changed timeout for requesting sense from 100ms to 1 second.
Submitted by:	gibbs
1997-02-14 03:13:37 +00:00
Justin T. Gibbs
a042d32c0b Fix a bug in the reporting of residuals. The code was relying on the SG_COUNT
filed in the hardware SCB not changing during the course of a transaction.
Since the sequencer now DMAs the hardware SCB back up to the host when it
detects a residual, this is no longer the case.  I added a field to the
"software" scb to mirror this information and it is now used for doing the
residual calculation.
1997-02-11 17:10:37 +00:00
Justin T. Gibbs
21c89fbab3 ahc_search_qinfo->ahc_search_qinfifo
ULTRAENB->FAST20

Add a missing ahc_run_done_queue if a BRKADDRINT occurs.  This should never
happen (haven't heard of one happening), but it was still a bug.

Brought the ordered tag sending code up into the tag code to be clearer.

If we decide we should send an ordered tag, only do so for the target that
timed out instead of all targets.

Initialize the STAILQ in ahc_serach__qinfifo.  This was causing a panic
during some recovery operations.

Remove the unused varable maxtarget.
1997-02-09 03:26:56 +00:00
Justin T. Gibbs
a42f819712 Initialization of a variable got lost in the last commit when I moved
a piece of code into a subroutine.
1997-02-03 17:24:25 +00:00
Justin T. Gibbs
c646d245b2 Fix an oversight in the handling of non-tagged abort requests. We need
to search the QINFIFO to remove any possible command that is waiting
otherwise our abort request may not be held up still waiting for the
first command to complete.
1997-02-03 16:29:07 +00:00
Justin T. Gibbs
8ecec21da6 White space cleanup and other cosmetic style changes.
Fix a few panics during error recovery:
1) Stupid mistake in the "no SCB match handler"  where I was using the wrong
   variable (busy_scbid instead of scb_index).
2) Unbusy the target of an abort request if the command we are trying to
   abort is an untagged transaction.  If we don't, we get a fatal NO_MATCH_BUSY
   condition which "should never happen".
3) When an abort completes, turn off ahc->in_timeout or else the next timeout
   will hit the protective "scb timesout again" panic.
4) Fix a typo that caused the requeued "abort" SCB to have its TAG_ENB and
   disconnect bits to be cleared (missing ~) so that devices would complain
   about overlapped commands.

Be sure to turn off the unexpected busfree interrupt after we do a bus
reset since we are expecting the bus to go free in that case.

Return XS_TIMEOUT instead of XS_DRIVERSTUFFUP in certain scenarios.  XS_TIMEOUT
allows for retries, XS_DRIVERSTUFFUP does not.

Allow commands with SDTR and WDTR negotiation to be tagged.  The SCSI II spec
says that you probably should not do this for fear of hitting bogus devices.
The driver did this in the past for almost two years without any problem,
and not doing it causes problems during error recovery to a tag capable device
as the number of openings is higher than two and we'll start sending it
tagged commands causing "overlapped commands attempted" type errors.  The
real fix needs to happen in the generic SCSI layer which can limit the
number and type of transactions to a device during error recovery efficiently.

Give ourselves at least 100ms to perform a request sense instead of relying
on the original timeout to be long enough to complete this new command as
well as the one that generated the condition.

Removed some redundant code.
1997-02-03 02:16:16 +00:00
Justin T. Gibbs
54dd351d93 Add 1997 to my copyright.
If we can, use timeouts instead of DELAYs when dealing with a bus reset.
This prevents us from holding up the whole machine for a noticible amount
of time (especially for a real time app).

Make a pass over the timeout/error handling code.  Aborts are more
reliable.  We actually handle parity errors correctly now instead of
locking up the bus.  Added code to properly clean up disconnected SCBs
down on the card during error handling.  Improved robustness in several
areas.

If we are using defaults, but are an Ultra card, negotiate at 20MHz instead
of 10.

Don't attempt to handle any commands for 100ms after a reset has occured.
This is the minimum time before a target will respond to selection.  Also
disable the busfree interrupt before doing a bus reset.  This prevents the
driver from getting confused by an "unexpected busfree".
1997-01-29 05:27:03 +00:00
Justin T. Gibbs
c2f69d249e 93cx6.c:
Style nit.  Backslashes in macro weren't aligned.

aic7xxx.c:
Preserve the value of STPWEN in SXFRCTL1 during initialization.  STPWEN
controls low byte termination and is setup by the PCI probe front end.
1997-01-24 21:59:32 +00:00
Justin T. Gibbs
71318ca3ea Remove some unnecessary overhead in the command complete processing. It
should be nearly impossible to overflow the QOUTFIFO (worst case 9 command
have to complete with at least 6 of them requiring paging on an aic7850),
so don't take the additional PIO hit to guard against this condition.  If we
don't see our interrupt in time, the system has bigger problems elsewhere.
If this ever does happen, the timeout handler will notice and retry the
command.

Remove the ABORT_TAG sequencer interrupt handler.  This condition can't happen
with the new SCB paging scheme.

Fix a few bugs noticed by Dan Eischen <deischen@iworks.InterWorks.org>
that could prevent ULTRA from being negotiated to drives above ID 7 and
also could allow an SCB to be passed to ahc_done twice during error recovery.

Fix a bug notice by Rory Bolt <rory@atBackup.com>.  It turns out that a
sequencer reset will actually start the sequencer running regardless of the
state of the pause bit.  This could lead to strange problems with loading
the sequencer.
1997-01-22 18:05:31 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Justin T. Gibbs
2ca3e32777 Clear the busfree interrupt when one occurs, after a SELTO, or a bus reset. 1996-12-03 17:06:00 +00:00
Justin T. Gibbs
ccddabb0c7 Add code to the SCSIINT handler for unexpected busfrees and handling immediate
SCBs in response to a busfree.

When re-queueing an SCB that returned with QUEUE FULL status, reset its timeout.

Ensure that aborted SCBs have an error code set in there xs before it gets
passed back up with scsi_done.

Fix a few KNF nits.
1996-11-22 08:28:45 +00:00
Justin T. Gibbs
74bb76f011 Be even more careful in how we manipulate the QOUTQCNT variable. Now we
do reset it from the QOUTCNT register inside a pause/unpause.  This now happens
once per command complete interrupt in the paging case (one interrupt can be
for multiple completed commands).  I may introduce a counter and do a lazy
update in the future, similar to what is done with the QINCNT.

Enhance the QUEUE FULL condition handling so that the number of openings will
be reduced.  This has become more important now that the driver is faster.
This code really belongs in the gerneric SCSI layer, as will be the case once
3.0 gets the code from the 'SCSI' branch.

Add some #if 0'd out trace code I've been using to help debug sequencer
problems.

Fix the SCB paging problem that I was seeing.  This was only on my 7850
controller and stems from the fact that its QINFIFO can only handle 3bit
SCB identifiers.  This means that you can only have 8 transactions open at
a time with the current paging scheme to these controllers.  The code added
to enforce this is generic in that it tests for the number of relevent bits
that the QINFIFO can store and adjusts the max accordingly.  It may be possible
to come up with a scheme that allows for more than 8 commands at a time, but
I don't know that it is worth the effort simply to fix a low end card.  The
aic7880 still can do 255.

This problem may be related to what Andrey was seeing since I don't have n
aic7770 rev E chip here to test on, but as soon as someone probes one of these
cards with this new code, the dmesg output will tell the whole story.
1996-11-16 01:19:14 +00:00
Bruce Evans
11874b34bf Fixed pessimized (short) i/o port type.
Obtained from:	SCSI branch
1996-11-11 15:29:15 +00:00
Justin T. Gibbs
30136d82b2 Clean up the memory mapped/Programmed I/O stuff so that the driver completely
uses one or the other.  This required some changes to the ahc_reset()
function, and how early the probes had to allocate their softc.

Turn the AHC_IN/OUT* macros into inline functions and lowercase their names
to indicate this change.  Geting AHC_OUTSB to work as a macro doing
conditional memory mapped I/O would have been too gross.

Stop setting STPWEN in the main driver and let the PCI front end do it
instead.  It knows better.

Add the clearing of the QOUTQCNT variable during command complete processing
in the SCB paging case.

Go back to doing unconditional retries for the QUEUE FULL status condition.
This is really a kludge, but the code to handle it properly is on the SCSI
branch and will not make it into 2.2.
1996-11-11 05:24:46 +00:00
Justin T. Gibbs
ebcf5ea509 Bzero the kernel scb array after it is allocated otherwise the control byte
used on the first transaction on an SCB is indeterminate.

Spaces -> tabs.
1996-11-07 06:39:44 +00:00
Justin T. Gibbs
16da2fbab4 Update to changes in generic SCSI layer. 1996-11-05 09:20:08 +00:00
Justin T. Gibbs
bee7495e7a Add missing parenthesis. That's what I get for having e different versions
of this driver in three different trees. <sigh>
1996-11-05 08:39:33 +00:00
Justin T. Gibbs
bf850955ed Move the include opt_aic7xxx in aic7xxx.h so that all of the driver files get
it automatically.  The AHC_FORCE_PIO option wasn't having any effect because
the PCI probe code didn't include this file.

Fix some problems with the new sync and wide negotiation code.  First off,
go back to async transfers by using a message reject again.  The SCSI II and
III spec indicate that if a target's response to an initiater does not suit
(i.e. its too low), then performing a message reject is the appropriate
response.  If, on the other hand, the initiator begins the negotiation and
we want to go async, we will send back an SDTR message with a 0 period and
offset.

Also fix a really bad negotiation problem caused by a missing "break".  This
would usually hit people that had "smart" wide devices that immediately
attempt sync negotiation after a successful wide negotiation.

2.2 Candidate.
1996-11-05 07:57:29 +00:00
Justin T. Gibbs
8c64b9a600 Add basic support for the 398X cards as multi-channel SCSI host adapters.
This involves expanding the support of the SEEPROM routines to deal with
the larger SEEPROMs on these cards and providing a mechanism to share
SCB arrays between multiple controllers.

Most of the 398X support came from Dan Eischer.

ahc_data -> ahc_softc

Clean up some more type bogons I missed from the last pass.

Be more clear when handing the NO_MATCH condition.  NO_MATCH can also
happen when the sequencer encounters an SCB we've asked to be aborted.
1996-10-28 06:10:02 +00:00
Justin T. Gibbs
df9ab5b30f - KNF cleanup.
- Add support for memory mapped I/O.
- Use DMA to get SCBs down to the adapters.
- Remove old paging code.
- Be much smarter about how we allocate SCB space.  The old, simple method
  wasted almost half a page per SCB. Ooops.
- Make command complete interrupt processing more efficient.
- Break the monolithic ahc_intr into sub-routines.  The sub-routines handle
  rare, special case events so the function call is not a penalty and the
  removal of the code from the main routine most likely improves performance
  instruction prefech will work better and less code is pushed into the cache.
- Never, ever allow tagged queueing if a device has disconnection disabled.
- Clean up and simplify timeout code.  Many of the changes are to handle the
  new DMA scheme.
1996-10-25 06:42:53 +00:00
Justin T. Gibbs
0a42ab8379 Advanced Systems Inc. SCSI Controller driver and ISA/VL front end.
I have only tested the ABP5140 card and only with a single CDROM drive
but it seems to work fine.  This driver relies on features found only in
the SCSI branch so will not work in -current until those changes
are brought in.  It also doesn't have any error handling code *yet*.
The goal is to use this driver as the development platform for the new
generic SCSI layer error recovery/handling code.

PCI and EISA front ends will show up as soon as I get my hands on
the cards.  There are also a few issues in the driver that I need
to clear up with AdvanSys before I can suggest sticking one of
these cards in your server. 8-)

Thanks to AdvanSys for releasing this code under a suitable copyright.

Obtained from:  Ported from the Linux driver writen by
		bobf@advansys.com (Bob Frey).
1996-10-07 02:07:07 +00:00
Justin T. Gibbs
b109ceda2f Bring in bug fix from 'SCSI' branch. 1996-10-06 22:50:56 +00:00
Justin T. Gibbs
020f6d4ee5 If, during an SDTR negotiation, the target comes back with a response
that is too low for the aic7xxx chip to handle with sync transfers,
negotiate async transfers.
1996-10-06 19:43:36 +00:00
Justin T. Gibbs
2ea8799246 Bring aic7xxx driver bug fixes from 'SCSI' into current. 1996-10-06 16:38:45 +00:00
Paul Traina
6e702c9964 Document the Adaptec driver options for tagged command queueing and SCB
paging (with a warning not to use SCB paging) and create an opt_aic7xxx.h
file for these options.
1996-10-01 03:01:06 +00:00
Julian Elischer
516d61eba9 oops somehow this dissppeared along the way..
now I've started working on this again, I discovered it..
1996-08-19 02:42:40 +00:00
Justin T. Gibbs
356cbcce49 Fix problem with scb flag handing that crept in with the SCB paging support.
This only affected userland initiated device resets (using the reset command
from cdplay for instance).

Convert some spaces to tabs.
1996-06-23 20:02:37 +00:00
Justin T. Gibbs
1ffa43c082 Detect and report dataphase overruns. Put the adapter into 'Bit Bucket'
mode when this occurs and allow the target to complete the transaction.
Force a retry on overruns since they are usually caused by termination or
cable problems.
1996-06-09 17:33:18 +00:00
Justin T. Gibbs
41f6ef7ee2 Bring back the loop in RESTART_SEQUENCER. It seems to be necessary for
the aic7850.  Go back to autoATN on parity errors.
1996-06-08 06:55:01 +00:00