freebsd kernel with SKQ
Go to file
mjacob e4ec9bc987 Fix usage of DELAY (SYS_DELAY is the platform independent local
define).  Fix stupidity wrt checking whether we've gone to
LOOP_PDB_RCVD loopstate- it's okay to be greater than this state.
D'oh! Protect calls to isp_pdb_sync and isp_fclink_state with IS_FC
macros.

Completely redo mailbox command routine (in preparation to make this
possibly wait rather than poll for completion).

Make a major attempt to solve the 'lost interrupt' problem

1. Problem

The Qlogic cards would appear to 'lose' interrupts, i.e., a legitimate
regular SCSI command placed on the request queue would never complete
and the watchdog routine in the driver would eventually wakeup and
catch it. This would typically only happen on Alphas, although a
couple folks with 700MHz Intel platforms have also seen this.

For a long time I thought it was a foulup with f/w negotiations of
SYNC and/or WIDE as it always seemed to happen right after the
platform it was running on had done a SET TARGET PARAMETERS mailbox
command to (re)enable sync && wide (after initially forcing
ASYNC/NARROW at startup). However, occasionally, the same thing
would also occur for the Fibre Channel cards as well (which, ahem,
have no SET TARGET PARAMETERS for transfer mode).

After finally putting in a better set of watchdog routines for the
platforms for this driver, it seemed to be the case that the command
in question (usually a READ CAPACITY) just had up and died- the
watchdog routine would catch it after ~10 seconds. For some platforms
(NetBSD/OpenBSD)- an ABORT COMMAND mailbox command was sent (which
would always fail- indicating that the f/w denied knowledge of this
command, i.e., the f/w thought it was a done command). In any case,
retrying the command worked. But this whole problem needed to be
really fixed.

2. A False Step That Went in The Right Direction

The mailbox code was completely rewritten to no longer try and grab
the mailbox semaphore register and to try and 'by hand' complete
async fast posting completions. It was also rewritten to now have
separate in && out bitpatterns for registers to load to start and
retrieve to complete. This means that isp_intr now handles mailbox
completions.

This substantially simplifies the mailbox handling code, and carries
things 90% toward getting this to be a non-polled routine for this
driver.

This did not solve the problem, though.

3. Register Debouncing

I saw some comments in some errata sheets and some notes in a Qlogic
produced Linux driver (for the Qlogic 2100) that seemed to indicate
that debouncing of reads of the mailbox registers might be needed,
so I added this.  This did not affect the problem. In fact, it made
the problem worse for non-2100 cards.

5. Interrupt masking/unmasking

The driver *used* to do a substantial amount of masking/unmasking
of the interrupt control register. This was done to make sure that
the core common code could just assume it would never get pre-empted.

This apparently substantially contributed to the lost interrupt
problem.  The rewrite of the ICR (Interrupt Control Register),
which is a separate register from the ISR (Interrupt Status Register)
should not have caused any change to interrupt assertions pending.
The manual does not state that it will, and the register layout
seems to imply that the ICR is just an active route gate. We only
enable PCI Interrupts and RISC Interrupts- this should mean that
when the f/w asserts a RISC interrupt and (and the ICR allows RISC
Interrupts) and we have PCI Interrupts enabled, we should get a
PCI interrupt. Apparently this is a latch- not a signal route.

Removing this got rid of *most* but not all, lost interrupts.

5. Watchdog Smartening

I made sure that the watchdog routine would catch cases where the
Qlogic's ISR showed an interrupt assertion. The watchdog routine
now calls the interrupt service routine if it sees this. Some
additional internal state flags were added so that the watchdog
routine could then know whether the command it was in the middle
of burying (because we had time it out) was in fact completed by
the interrupt service routine.

6. Occasional Constipation Of Commands..

In running some very strenous high IOPs tests (generating about
11000 interrupts/second across one Qlogic 1040, one Qlogic 1080
and one Qlogic 2200 on an Alpha PC164), I found that I would get
occasional but regular 'watchdog timeouts' on both the 1080 and
the 2100 cards. This is under FreeBSD, and the watchdog timeout
routine just marks the command in error and retries it.

Invariably, right after this 'watchdog timeout' error, I'd get a
command completion for the command that I had thought timed out.
That is, I'd get a command completion, but the handle returned by
the firmware mapped to no current command. The frequency of this
problem is low under such a load- it would usually take an 30
minutes per 'lost' interrupt.

I doubled the timeout for commands to see if it just was an edge
case of waiting too short a period. This has no effect.

I gathered and printed out microtimes for the watchdog completed
command and the completion that couldn't find a command- it was
always the case that the order of occurrence was "timeout, completion"
separated by a time on the order of 100 to 150 ms.

This caused me to consider 'firmware constipation' as to be a
possible culprit. That is, resubmission of a command to the device
that had suffered a watchdog timeout seemed to cause the presumed
dead command to show back up.

I added code in the watchdog routine that, when first entered for
the command, marks the command with a flag, reissues a local timeout
call for one second later, but also then issues a MARKER Request
Queue entry to the Qlogic f/w. A MARKER entry is used typically
after a Bus Reset to cause the f/w to get synchronized with respect
to either a Bus, a Nexus or a Target.

Since I've added this code, I always now see the occasional watchdog
timeout, but the command that was about to be terminated always
now seems to be completed after the MARKER entry is issued (and
before the timeout extension fires, which would come back and
*really* terminate the command).
2000-06-27 19:44:31 +00:00
bin Use Dq Li (double-quoted literal) instead of Ic (internal command) to 2000-06-27 18:22:13 +00:00
contrib Change $FreeBSD$ placement. 2000-06-26 23:03:37 +00:00
crypto Also make sure to close the socket that exceeds your rate limit. 2000-06-26 23:39:26 +00:00
etc Add weekly_status_pkg_enable (defaults to NO) 2000-06-27 11:20:08 +00:00
games Remove garbage. 2000-06-02 12:49:57 +00:00
gnu Fix the upgrade-build case. 2000-06-27 15:28:14 +00:00
include Moving forward on my commitment to always make at least one commit from 2000-06-22 01:46:25 +00:00
kerberos5 Properly separate the K5-only buld from K4. 2000-03-23 14:56:47 +00:00
kerberosIV Remove the last vestiges of libRSAglue now that it's an empty stub. 2000-03-11 22:34:10 +00:00
lib Fixed PunchFWHole(): 2000-06-27 14:56:07 +00:00
libexec Fix a problem in the virtual host address compare code which caused 2000-06-26 05:36:09 +00:00
release Tiny manual correction; add mention of Kerberos 5. 2000-06-25 10:48:40 +00:00
sbin Added new option (-punch_fw) which allows to `punch holes' 2000-06-27 15:26:24 +00:00
secure MFI. This is a documentation-only, diffreducing patch, that if 2000-06-24 06:50:58 +00:00
share Reword the description of weekly_status_pkg_enable (although not 2000-06-27 12:04:43 +00:00
sys Fix usage of DELAY (SYS_DELAY is the platform independent local 2000-06-27 19:44:31 +00:00
tools Back out the previous change to the queue(3) interface. 2000-05-26 02:09:24 +00:00
usr.bin - Reflect `gateport' variable type change. 2000-06-24 15:34:31 +00:00
usr.sbin Use libfetch instead of libftpio. This adds support for http and IPv6. 2000-06-27 11:00:07 +00:00
COPYRIGHT Update to add the July 22, 1999 addendum. 1999-09-05 21:33:47 +00:00
Makefile We have a new world order in libraries. 2000-02-24 23:03:16 +00:00
Makefile.inc1 Rearrange Perl's build priority; it needs to get made earlier. 2000-06-25 15:02:18 +00:00
Makefile.upgrade $Id$ -> $FreeBSD$ 1999-08-28 01:35:59 +00:00
README $Id$ -> $FreeBSD$ 1999-08-28 01:35:59 +00:00
UPDATING Add warning about /dev/random disconnecting entropy for a few days while 2000-06-26 05:54:02 +00:00

This is the top level of the FreeBSD source directory.  This file
was last revised on:
$FreeBSD$

For copyright information, please see the file COPYRIGHT in this
directory (additional copyright information also exists for some
sources in this tree - please see the specific source directories for
more information).

The Makefile in this directory supports a number of targets for
building components (or all) of the FreeBSD source tree, the most
commonly used one being ``world'', which rebuilds and installs
everything in the FreeBSD system from the source tree except the
kernel and the contents of /etc.  Please see the top of the Makefile
in this directory for more information on the standard build targets
and compile-time flags.

Building a kernel with config(8) is a somewhat more involved process,
documentation for which can be found at:
   http://www.freebsd.org/handbook/kernelconfig.html
And in the config(8) man page.

The sample kernel configuration files reside in the sys/i386/conf
sub-directory (assuming that you've installed the kernel sources), the
file named GENERIC being the one used to build your initial installation
kernel.  The file LINT contains entries for all possible devices, not
just those commonly used, and is meant more as a general reference
than an actual kernel configuration file (a kernel built from it
wouldn't even run).


Source Roadmap:
---------------
bin		System/User commands.

contrib		Packages contributed by 3rd parties.

crypto		Export controlled stuff (see crypto/README).

etc		Template files for /etc

games		Amusements.

gnu		Various commands and libraries under the GNU Public License.
		Please see gnu/COPYING* for more information.

include		System include files.

kerberosIV	Kerberos package.

lib		System libraries.

libexec		System daemons.

release		Release building Makefile & associated tools.

sbin		System commands.

secure		DES and DES-related utilities - NOT FOR EXPORT!

share		Shared resources.

sys		Kernel sources.

tools		Utilities for regression testing and miscellaneous tasks.

usr.bin		User commands.

usr.sbin	System administration commands.


For information on synchronizing your source tree with one or more of
the FreeBSD Project's development branches, please see:

  http://www.freebsd.org/handbook/synching.html