freebsd-skq

Author	SHA1	Message	Date
Matt Jacob	81babfd043	Make the Not RESPONSE in RESPONSE QUEUE message have a bit more info (specifically, how many entries we've looked at so far). Maintain interrupt instrumentation. Use USEC_SLEEP instead of USEC_DELAY in a number of places (this allows us to drop locks and sleep instead of spin). Track changes to configuration options for topology preference. Fix botched order of printout for Channel, Target, Lun.	2000-12-02 18:08:35 +00:00
Matt Jacob	c914d4237d	Redo how default Node and Port WWNs are determined (again!). This is so we don't stomp on the differences between ports for a Qlogic 2202.	2000-10-12 23:49:09 +00:00
Matt Jacob	aa57fd6fa5	some copyright cleanups	2000-09-21 20:16:04 +00:00
Matt Jacob	c0cfc79790	Initialize the queue index stuff from what the f/w sends back- just in case it's insane enough to not do what you tell it to. Print out (LOGINFO level) initiator ID.	2000-09-21 17:06:45 +00:00
Matt Jacob	b6b6ad2f23	various fixes	2000-08-27 23:38:44 +00:00
Matt Jacob	d0d5832ac7	Major whacking for core version 2.0. A major motivator for 2.0 and these changes is that there's now a Solaris port of this driver, so some things in the core version had to change (not much, but some). In order, from the top.....: A lot of error strings are gathered in one place at the head of the file. This caused me to rewrite them to look consistent (with respect to things like 'Port 0x%' and 'Target %d' and 'Loop ID 0x%x'). The major mailbox function, isp_mboxcmd, now takes a third argument, which is a mask that selectively says whether mailbox command failures will be logged. This will substantially reduce a lot of spurious noise from the driver. At the first run through isp_reset we used to try and get the current running firmware's revision by issuing a mailbox command. This would invariably fail on Alphas with anything but a Qlogic 1040 since SRM doesn't start the f/w on these cards. Instead, we now see whether we're sitting in ROM state before trying to get a running BIOS loaded f/w version. All CFGPRINTF/PRINTF/IDPRINTF macros have been replaced with calls to isp_prt. There are separate print levels that can be independently set (see ispvar.h), which include debugging, etc. All SYS_DELAY macros are now USEC_DELAY macros. RQUEST_QUEUE_LEN and RESULT_QUEUE_LEN now take ispsoftc as a parameter- the Fibre Channel cards and the Ultra2/Ultra3 cards can have 16 bit request queue entry indices, so we can make a 1024 entry index for them instead of the 256 entries we've had until now. A major change is to fix isp_fclink_test to actually only wait the delay of time specified in the microsecond argument being passed. The problem has always been that a call to isp_mboxcmd to get the current firmware state takes an unknown (sometimes long) amount of time- this is if the firmware is busy doing PLOGIs while we ask it what's up. So, up until now, the usdelay argument has been a joke. 
The net effect has been that if you boot without being plugged into a good loop or into a switch, you hang. Massively annoying, and hard to fix because the actual time delta was impossible to know from just guessing. Now, using the new GET_NANOTIME macros, a precise and measured amount of USEC_DELAY calls are done so that only the specified usecdelay is allowed to pass. This means that if the initial startup of the firmware is followed by a call from isp_freebsd.c:isp_attach to isp_control(isp, ISP_FCLINK_TEST, &tdelay) where tdelay is 2 * 1000000, no more than two seconds will actually elapse before we leave concluding that the cable is unhooked. Jeez. About time.... Change the ispscsicmd entry point to isp_start, and the XS_CMD_DONE macro to a call to the platform supplied isp_done (sane naming). Limit our size of request queue completions we'll look at at interrupt time. Since we've increased the size of the Request Queue (and the size of the Response Queue proportionally), let's not create an interrupt stack overflow by having to keep a max completion list (forw links are not an option because this is common code with some platforms that don't have link space in their XS_T structures). A limit of 32 is not unreasonable- I doubt there'd be even this many request queue completions at a time- remember, most boards now use fast posting for normal command completion instead of filling out response queue entries. In the isp_mboxcmd cleanup, also create an array of command names so that "ABOUT FIRMWARE" can be printed instead of "CMD #8". Remove the isp_lostcmd function- it's been deprecated for a while. Remove isp_dumpregs- the ISP_DUMPREGS goes to the specific bus register dump function. Various other cleanups.	2000-08-01 06:51:05 +00:00
Matt Jacob	c77d11d0cc	Raise debug level for some messages. Fix botched inversion about MBOX_COMMAND_ERROR vs. MBOX_COMMAND_PARAM_ERROR.	2000-07-18 06:46:48 +00:00
Matt Jacob	3e97a5b432	Clean up ISPCTL_ABORT_CMD function to not be too chatty if it succeeds, or even if it fails with INVALID_PARM (which just means that the handle doesn't refer to an active command).	2000-07-05 06:41:36 +00:00
Matt Jacob	1d460ef8d5	Change delay loop in new isp_mboxcmd to the use of the new MBOX_WAIT_COMPLETE macro. Change notification of completion of a mailbox command in isp_intr to MBOX_NOTIFY_COMPLETE macro.	2000-07-04 01:02:38 +00:00
Matt Jacob	28445eef28	Fix usage of DELAY (SYS_DELAY is the platform independent local define). Fix stupidity wrt checking whether we've gone to LOOP_PDB_RCVD loopstate- it's okay to be greater than this state. D'oh! Protect calls to isp_pdb_sync and isp_fclink_state with IS_FC macros. Completely redo mailbox command routine (in preparation to make this possibly wait rather than poll for completion). Make a major attempt to solve the 'lost interrupt' problem 1. Problem The Qlogic cards would appear to 'lose' interrupts, i.e., a legitimate regular SCSI command placed on the request queue would never complete and the watchdog routine in the driver would eventually wakeup and catch it. This would typically only happen on Alphas, although a couple folks with 700MHz Intel platforms have also seen this. For a long time I thought it was a foulup with f/w negotiations of SYNC and/or WIDE as it always seemed to happen right after the platform it was running on had done a SET TARGET PARAMETERS mailbox command to (re)enable sync && wide (after initially forcing ASYNC/NARROW at startup). However, occasionally, the same thing would also occur for the Fibre Channel cards as well (which, ahem, have no SET TARGET PARAMETERS for transfer mode). After finally putting in a better set of watchdog routines for the platforms for this driver, it seemed to be the case that the command in question (usually a READ CAPACITY) just had up and died- the watchdog routine would catch it after ~10 seconds. For some platforms (NetBSD/OpenBSD)- an ABORT COMMAND mailbox command was sent (which would always fail- indicating that the f/w denied knowledge of this command, i.e., the f/w thought it was a done command). In any case, retrying the command worked. But this whole problem needed to be really fixed. 2. 
A False Step That Went in The Right Direction The mailbox code was completely rewritten to no longer try and grab the mailbox semaphore register and to try and 'by hand' complete async fast posting completions. It was also rewritten to now have separate in && out bitpatterns for registers to load to start and retrieve to complete. This means that isp_intr now handles mailbox completions. This substantially simplifies the mailbox handling code, and carries things 90% toward getting this to be a non-polled routine for this driver. This did not solve the problem, though. 3. Register Debouncing I saw some comments in some errata sheets and some notes in a Qlogic produced Linux driver (for the Qlogic 2100) that seemed to indicate that debouncing of reads of the mailbox registers might be needed, so I added this. This did not affect the problem. In fact, it made the problem worse for non-2100 cards. 4. Interrupt masking/unmasking The driver used to do a substantial amount of masking/unmasking of the interrupt control register. This was done to make sure that the core common code could just assume it would never get pre-empted. This apparently substantially contributed to the lost interrupt problem. The rewrite of the ICR (Interrupt Control Register), which is a separate register from the ISR (Interrupt Status Register) should not have caused any change to interrupt assertions pending. The manual does not state that it will, and the register layout seems to imply that the ICR is just an active route gate. We only enable PCI Interrupts and RISC Interrupts- this should mean that when the f/w asserts a RISC interrupt (and the ICR allows RISC Interrupts) and we have PCI Interrupts enabled, we should get a PCI interrupt. Apparently this is a latch- not a signal route. Removing this got rid of most, but not all, lost interrupts. 5. Watchdog Smartening I made sure that the watchdog routine would catch cases where the Qlogic's ISR showed an interrupt assertion. 
The watchdog routine now calls the interrupt service routine if it sees this. Some additional internal state flags were added so that the watchdog routine could then know whether the command it was in the middle of burying (because we had timed it out) was in fact completed by the interrupt service routine. 6. Occasional Constipation Of Commands.. In running some very strenuous high IOPs tests (generating about 11000 interrupts/second across one Qlogic 1040, one Qlogic 1080 and one Qlogic 2200 on an Alpha PC164), I found that I would get occasional but regular 'watchdog timeouts' on both the 1080 and the 2100 cards. This is under FreeBSD, and the watchdog timeout routine just marks the command in error and retries it. Invariably, right after this 'watchdog timeout' error, I'd get a command completion for the command that I had thought timed out. That is, I'd get a command completion, but the handle returned by the firmware mapped to no current command. The frequency of this problem is low under such a load- it would usually take about 30 minutes per 'lost' interrupt. I doubled the timeout for commands to see if it just was an edge case of waiting too short a period. This had no effect. I gathered and printed out microtimes for the watchdog completed command and the completion that couldn't find a command- it was always the case that the order of occurrence was "timeout, completion" separated by a time on the order of 100 to 150 ms. This caused me to consider 'firmware constipation' as a possible culprit. That is, resubmission of a command to the device that had suffered a watchdog timeout seemed to cause the presumed dead command to show back up. I added code in the watchdog routine that, when first entered for the command, marks the command with a flag, reissues a local timeout call for one second later, but also then issues a MARKER Request Queue entry to the Qlogic f/w. 
A MARKER entry is used typically after a Bus Reset to cause the f/w to get synchronized with respect to either a Bus, a Nexus or a Target. Since I've added this code, I always now see the occasional watchdog timeout, but the command that was about to be terminated always now seems to be completed after the MARKER entry is issued (and before the timeout extension fires, which would come back and really terminate the command).	2000-06-27 19:44:31 +00:00
Matt Jacob	fb1d37adcd	Once we have firmware running (if isp_reset) and this is the first time through, establish what our LUN width is. Unfortunately, we can't ask the f/w. If we loaded the f/w, we'll now assume we have expanded LUNs (SCCLUN for fibre channel, just plain 32 LUN for SCSI). If we didn't load firmware, assume 8 LUNs for SCSI and 1 LUN for Fibre Channel. We have to assume only one LUN for Fibre Channel because the LUN setting in Request Queue entries is in different places whether we have SCCLUN firmware or not, so the only LUN guaranteed to work for both is LUN 0. Clean up the rest of isp.c so that ISP2100_SCCLUN defines aren't used- instead use run time determinants based upon isp->isp_maxluns. After starting firmware, delay 500us to give it a chance to get rolling. Fix the interrupt service routine to check for both isr && sema being zero before thinking this was a spurious interrupt. Following the manuals, allow for both Mailbox as well as Queue Response type interrupts for regular SCSI.	2000-06-18 04:56:17 +00:00
Matt Jacob	6d1d7d4c87	Fix some breakage about how we build WWNs. Do some other fabric related changes: consider a new PDB entry different if Class 3 service parameter roles change (!!!). Do some checking as we're getting a port database that traps whether things change while we're doing so. Handle N-port and F-ports correctly. Fix the fabric login loop to retain a login/binding if things haven't changed (I mean, why logout a device only to log it back in). No longer accept, after fabric logins, garbage if we can't get a PDB entry that matches the device we've just logged into- if it doesn't, log it out as it is very unlikely to still be what we thought it was. Get rid of some of the debounce loops because we could get stuck there.	2000-05-09 01:14:43 +00:00
Matt Jacob	c88f65e2c0	Pick up topology more sanely at f/w startup. Change the restrictions of where we can have targets (based on topology). Much more importantly, make sure all mods to isp_sendmarker are |= so we don't lose the marking of a bus that needs to have a marker sent for it.	2000-04-21 02:04:34 +00:00
Matt Jacob	cf74f2682e	Slightly cleaner fabric support (whiter whites! redder reds!).. No, seriously- only attempt to logout a previously logged in fabric device. Fix a longstanding bug for aborting overtime commands- handle halves have always been reversed. Clean up some error messages to indicate channel number. Approved:jkh	2000-02-29 05:52:14 +00:00
Matt Jacob	fe4d046167	Clean out residual bogosity for fast posting stuff- ISP_NO_FASTPOST_SCSI is gone as a define. We just don't support fast posting for anything less than the 1240/1080/1280/12160 or Fibre Channel cards. Put in support for CDB's larger than 12 bytes for parallel SCSI (up to 44 bytes are allowed). Approved: jkh	2000-02-15 00:35:00 +00:00
Matt Jacob	0f38a25b52	Restructure nvram reading routine to split out to separate functions for 1020/1X80/12160/2X00- for readability. Add in 12160 (Ultra3) support- but not with PPR just yet. Fix and clarify fetching of return parameter for getting firmware rev which for the 2200 contains the connection topology (Private Loop (NL-port), N-port, FL-port, F-port). Synthesize the connection topology for the 2100 which can only be Private Loop or FL-port. Handle a couple of new async mailbox commands which signify connection in Point-to-Point mode (N-port or F-port) or indicate various toe stubbing getting to same. Approved: jkh@freebsd.org	2000-02-11 19:31:32 +00:00
Matt Jacob	0719e3345c	clean up for SBus Ultra (yes, we do not do that here yet)	2000-01-15 01:52:01 +00:00
Matt Jacob	e85919b9f8	change debug printout levels for a couple of places	2000-01-09 21:47:39 +00:00
Matt Jacob	3da7ba4d41	Make Fibre Channel cards correctly note the presence/absence of ARQ data and punt the dealing with its presence/absence to the platform layers.	2000-01-04 03:44:21 +00:00
Matt Jacob	ac1fd1487e	Raise default FCP logintime to 60 seconds. Move the position of where we could have seen the loop up at least once so it makes sense. Change some stuff in ispscsicmd so we don't get stuck there if the loop has never come up yet. Add in some target mode support code.	2000-01-03 23:52:41 +00:00
Matt Jacob	9ee303fb46	Clean up some f/w revision checking wrt enabling fast posting. Make sure we set defaults sanely for dual-bus adapters.	1999-12-20 01:34:01 +00:00
Matt Jacob	22e1dc858b	Add Dual LVD bus (1280) support	1999-12-16 05:42:02 +00:00
Matt Jacob	7457966f26	turn some messages into CFGPRINT messages	1999-12-03 06:55:39 +00:00
Matt Jacob	38dace9790	Clean up stupidity in the isp_handle_other_response function- indexes of queue entries have to be at least 16 bits now! If we're running a 2100 less than rev 5, turn off loop fairness (per Qlogic errata). Fix typo in checking against 2200 F/W revision. Slightly fix/reorder fabric login stuff. Change to usage of isp_getrqentry for code clarity. Add some defensive dual bus assumptions. Various cleanups, etc...	1999-11-21 03:18:22 +00:00
Matt Jacob	fdc79fd3fc	correct moronic typo	1999-11-01 04:39:52 +00:00
Matt Jacob	03322f8625	Use pointer to f/w in md structure as to whether f/w exists or not. If firmware length isn't specified, extract from the 4th short into the firmware.	1999-10-30 19:32:44 +00:00
Matt Jacob	2668d67e9c	I was misinformed. I cannot get away from specifying tags for FC. Some devices are happy w/o them- some are unhappy (IBM drives).	1999-10-28 02:48:42 +00:00
Matt Jacob	83d62096d6	nuke a debug printout I thought I had already nuked	1999-10-26 22:25:13 +00:00
Matt Jacob	5e73516b7b	remember to initialize mailbox 2 for FC isp bus resets	1999-10-22 17:03:03 +00:00
Matt Jacob	fc0685ea06	Remove some target mode stuff. It will get re-introduced in a different file later. Do some pencil-sharpening types of minor changes. Change how active commands are remembered (using new inline functions to get handles, etc..). Now do a GET FIRMWARE STATUS after firing up the f/w as outgoing mailbox 2 will tell you the f/w's notion of the max commands that can be supported. Attempt to retrieve loop topology. Add in the appropriate SWIZZLE/UNSWIZZLE macro calls (this is a no-op on Little Endian machines but is needed for sparc (on other platforms)). Move the temp port database we use to find out where things have moved to after a LIP to the softc and off the kernel stack. Follow Qlogic's hint and don't bother setting a tag for commands that don't have this enabled (presumably the f/w will do its own selection then). Use an INT_PENDING macro to check for an interrupt. The call to ISP_DMAFREE now just takes the handle- not the 'handle-1' which was a layering violation. Use CFGPRINTF in a couple of places to make things less chatty if not booting verbose, or CAMDEBUG compiles, etc..	1999-10-17 18:58:22 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Matt Jacob	ce7f792d94	More code cleanup. Go back to using FULL_LOGIN Fibre Chan if f/w is less than 1.17.0 level. Change where we do the loop database init. Add in the CMD_RQLATER return. Add some register debounce.	1999-08-16 19:59:55 +00:00
Matt Jacob	3692397b0d	add 2200 f/w; fix botched define	1999-07-05 20:42:08 +00:00
Matt Jacob	83cdc1a2b0	Roll revision levels. Add support for the Qlogic 2200 (warn about not having SCSI_ISP_SCCLUN config defined if we don't have f/w for the 2200- its resident firmware uses SCCLUN (65535 luns)). Change the way the default LoopID is gathered (it's now a platform specific define so that some attempt at a synthetic WWN can be made in case NVRAM isn't readable). Change initialization of options a bit- don't use ADISC. Set FullDuplex mode if config options tells us to do so. Do not use FULL_LOGIN after LIP- it's the right thing to do but it causes too much loop disruption (Loop Resets). Sanity check some default values. Redo construction of port and node WWNs based upon what we have- if we have 2 in the top nibble, we can have distinct port and node WWNs. Clean up some SCCLUN related code that we obviously had never compiled (:-(). Audit commands coming into ispscsicmd and don't throw commands at Fibre devices that do not have Class 3 service parameters TARGET ROLE defined. Clean up f/w initialization a bit. Add Fabric support (or at least the first blush of it). Whew - way too much to describe here. Basically, after a LIP, hang out until we see a Loop Up or a Port DataBase Change async event, then see if we're on a Fabric (GET_PORT_NAME of FL_PORT_ID). If we are, try and scan the fabric controller for fabric devices using the GetAllNext SNS subcommand. As we find devices, announce them to the outer layer. Try and do some guard code for broken (Brocade) SNS servers (that get stuck in loops- gotta maybe do this a different way using the GP_ID3 cmd instead). Then do a scan of the lower (local loop) ids using a GET_PORT_NAME to see if the f/w has logged into anything at that loop id. If so, then do a GET_PORT_DATABASE command. Do this scan into a local database. At this point we can say the loop is 'Ready'. 
After this, we merge our local loop port database with our stored port database- in an as yet to be really fully exercised fashion we try and follow the logic of something having moved around. The first time we see something at a Loop ID, we fix it, for the purpose of this system instance, at that Loop ID. If things shift around so it ends up somewhere else, we still keep it at this Loop ID (our 'Target') but use the new (moved) Loop ID when we actually throw commands at it. Check for insane cases of different Loop IDs both claiming to have the same WWN- if that happens, invalidate both. Notify the outer layer of devices that have arrived and devices that have gone away. Finally, when this is done, search the softc's database of Fabric devices and perform logout/login actions. The Qlogic f/w maintains logout/login for all local loop devices. We have to maintain logout/login for fabric devices- total PITA. Expect to see this area undergo more change over time.	1999-07-02 23:06:38 +00:00
Matt Jacob	442257d9c5	be a bit more chatty about some speed negotiations	1999-05-12 18:56:55 +00:00
Matt Jacob	5a025c82c6	Some massive thwunking in initialization to handle dual bus adapters. More massive thwunking to include an XS_CHANNEL value. Some changes of how parameters are reported to outer layers (including bus, e.g.). Yet more stirring around in isp_mboxcmd to try and get it right. Decode of 1080/1240 NVRAM.	1999-05-11 05:06:55 +00:00
Matt Jacob	17b1ea0341	temp fix for internal queue overflow problem	1999-04-14 17:37:36 +00:00
Matt Jacob	3c6e29e07a	Make firmware revision a triple. Clean up some FC init stuff for board versions with no BIOS. Separate mailbox interrupts from IOCB interrupts. Read OUTMAILBOX5 while RISC_INT is active- not after you clear it (potential race condition). Clear out older broken BIG_ENDIAN goop. Don't negotiate narrow/async for LVD busses at startup if already in LVD mode. Note usage of presumptive 1040C revision. For all the LIP, PDB Changed, Loop UP/DOWN async events, mark fw state as unknown as well as marking the need to do a getpdb on targets- after a LIP for certain the f/w has to do PRLI/PLOGI for all targets again and marking f/w state as unknown gives us a fighting chance to (start to) hold up for that to complete.	1999-04-04 02:28:29 +00:00
Matt Jacob	3bd28825dd	Annoying little nigglet- apparently some Qlogic temporarily ignore settings you've just sent them and return random values if you follow the set by a get. This causes problems when you later run a Tag-enabled command when you have command tagged mode off.	1999-03-26 00:33:13 +00:00
Matt Jacob	4394c92f52	Add in 1080 LVD support and some basis also for the 1240. The port database printout is now enabled.	1999-03-25 22:52:45 +00:00
Matt Jacob	57c801f5cf	A wad of changes- prepping for 1080/1240 support (which caused a massive thwank in register layout goop). A different mboxcmd approach. Some PDB change infrastructure. Some better management of loopdown/loopup events (keep them distinct from resource starvation for simq freeze/unfreeze actions).	1999-03-17 05:04:39 +00:00
Matt Jacob	3c688670f5	Roll internal release tag. Print out if we're in a 64 bit PCI slot. Use fast memory timing NVRAM parameter. Clean up and fix establishment of default target parameters. Don't use NVRAM if we are flagged as not to do so (I had a busted NVRAM setup which I couldn't edit that enabled SYNC mode but disabled disconnect/reconnect and wide!!). Fix delays after resets. BUS resets not done in isp_init anymore- relegated to OS specific outer layers. Fix a buglet where you can get in a loop for a NULL xs in the completion list in isp_intr. Add in some defines that can disable fast posting. Add in code for Loop Up/Loop Down events that call into the outer layers as to what to do.	1999-02-09 01:07:06 +00:00
Matt Jacob	cbf57b472d	Implement and use Fast Posting for both parallel && fibre. Redo a bit of the startup code. Implement a call to outer framework function so that asynchronous events can be handled (e.g., speed negotiation, target mode). Roll internal release tags.	1999-01-30 07:29:00 +00:00
Matt Jacob	bff85e290f	Suggested by bde@freebsd.org- memcpy not necessarily good to use. D'oh- not in the BSD DKI. Stop being lazy and finish the defines so MEMCPY becomes bzero for FreeBSD.	1999-01-10 11:15:23 +00:00
Matt Jacob	f7102a20c6	Add some prototype deadchip detection. Set FIFO bursting (1XX0 only- it's already on for the 2XX0) and detect the broken 1040A FIFO. Change bzero to MEMZERO (portability with **nux). Use memcpy for same reason. Finally detect QUEUE FULL conditions and return this as an error that will get cam_periph_error to do its 'tagged openings now XXX' dance.	1999-01-10 02:55:10 +00:00
Matt Jacob	c305536317	clarify headers;move uninit to outer layer;remove watchdog	1998-12-28 19:22:27 +00:00
Matt Jacob	cc56928232	oops on last	1998-12-05 01:46:40 +00:00
Matt Jacob	e205b454ac	Remove the Target mode functions until they're in better shape. Implement some suggested compilation cleanups from Eklund. Wire down a hard loop id if we are not on a platform that has the ability to get to a PCI BIOS (it still will float to the ID it gets after a LIP but at least we can try). Clarify that the expanded lun is based upon SCCLUN defines (in f/w).	1998-12-05 01:33:57 +00:00
Matt Jacob	a70ac9fd83	per bde (who is right about this) that an inlined function with const char * strings being returned defined in a header file included several places but only used in one module, is, uh, silly.	1998-09-17 23:20:29 +00:00
Matt Jacob	7ac6b5a314	Cleanliness. Don't leave defined a const char array that's only used if target mode is defined (which it isn't, yet).	1998-09-17 22:53:35 +00:00
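
Note on commit d0d5832ac7 (2000-08-01): the message describes changing isp_fclink_test so that the wait is bounded by measured elapsed time rather than by a count of delay calls, because each firmware state query can take an unknown amount of time. The following is only a minimal sketch of that idea, not the driver's code. Every name in it (`now_usec`, `delay_usec`, `probe_link`, `wait_for_link`, and the simulated clock variables) is hypothetical and stands in for whatever time source, delay macro, and mailbox query the real driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated clock in microseconds. A real driver would read a
 * kernel or hardware time source here instead. */
static uint64_t fake_now_usec;

static uint64_t now_usec(void) { return fake_now_usec; }

/* Hypothetical stand-in for a fixed-length delay. */
static void delay_usec(uint64_t usec) { fake_now_usec += usec; }

/* Hypothetical link probe. Each call costs probe_cost_usec of simulated
 * time, modelling a state query whose duration the caller cannot predict.
 * The link is reported up once the clock reaches link_up_at_usec. */
static uint64_t probe_cost_usec;
static uint64_t link_up_at_usec;

static bool probe_link(void)
{
    fake_now_usec += probe_cost_usec;
    return fake_now_usec >= link_up_at_usec;
}

/* Wait for the link, giving up once budget_usec of measured time has
 * passed. Elapsed time is re-measured after every probe, so slow probes
 * consume the budget instead of silently extending the wait. */
static bool wait_for_link(uint64_t budget_usec, uint64_t step_usec)
{
    assert(step_usec > 0);
    uint64_t start = now_usec();

    for (;;) {
        if (probe_link())
            return true;

        uint64_t elapsed = now_usec() - start;
        if (elapsed >= budget_usec)
            return false;

        /* Sleep for one step, but never past the end of the budget. */
        uint64_t remaining = budget_usec - elapsed;
        delay_usec(remaining < step_usec ? remaining : step_usec);
    }
}
```

In this sketch the total wait can exceed the budget by at most the cost of one probe, since a probe that has already started is allowed to finish. A loop that simply counted delay calls would instead overrun by the sum of all probe costs, which is the behaviour the commit message complains about.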

54 Commits