lock up under moderate to heavy load.
The status & command fields share a 32-bit longword. The programming
API of the eepro apparently requires that you update the command field
of a transmit slot that you've already given to the card. This means
the card could be updating the status field of the same longword at
the same time. Since alphas can only operate on 32-bit chunks of
memory, both the status & command fields are loaded from memory &
operated on in registers when the following line of C is executed:
sc->cbl_last->cb_command &= ~FXP_CB_COMMAND_S;
The race is caused by the card DMA'ing up the status at just the wrong
time -- after it has been loaded into a register & before it has been
written back. The old value of the status is written back, clobbering
the status the card just DMA'ed up. The fact that the card has sent
this frame is missed & the transmit engine appears to hang.
Luckily, as numerous people on the freebsd-alpha list pointed out, the
load-locked/store-conditional instructions used by the atomic
functions work with respect changes in memory due to I/O devices. We
now use them to safely update the command field.
Tested by: Bernd Walter <ticso@mail.cicely.de>
errors that plagued those cards with XFree86 4.0. They have two memory
ranges as well as an IO port range to them. Also cleaned up the three
warning messages that I got, from inb(), outb() and linuxulator. Also, I
noticed that the DRI and Glide support for the Voodoo4 and 5 has been
placed upon linux.3dfx.com, too bad they haven't released the tech docs
yet. Apparently, they are still pushing glide for all of us, so I will try
and add support once those tech docs are up.
other systems.
o Normalize copyright text.
o Clean up probe code function interfaces by passing around a single
structure of common arguments instead of passing "too many" args
in each function call.
o Add support for the AAA-131 as a SCSI adapter.
o Add support for the AHA-4944 courtesy of "Matthew N. Dodd" <winter@jurai.net
o Correct manual termination support for PCI cards. The bit definitions
for manual termination control in the SEEPROM were incorrect.
o Add support for extracting NVRAM information from SCB 2 for BIOSen
that use this mechanism to pass this data to OS drivers.
o Properly set the STPWLEVEL bit in PCI config space based on the
setting in an SEEPROM.
o Go back to useing 32byte SCBs for all controllers. The current
firmware allows us to embed 12byte cdbs on all controllers in
a 32byte SCB, and larger cdbs are rarely used, so it is a
better use of this space to offer more SCBs (32).
o Add support for U160 transfers.
o Add an idle loop executed during data transfers that prefetches
S/G segments on controllers that have a secondary DMA engine
(aic789X).
o Improve the performance of reselections by avoiding an extra
one byte DMA in the case of an SCB lookup miss for the reselecting
target. We now keep a 16byte "untagged target" array on the card
for dealing with untagged reselections. If the controller has
external SCB ram and can support 64byte SCBs, then we use an
"untagged target/lun" array to maximize concurrency. Without
external SCB ram, the controller is limited to one untagged
transaction per target, auto-request sense operations excluded.
o Correct the setup of the STPWEN bit in SXFRCTL1. This control
line is tri-stated until set to one, so set it to one and then
set it to the desired value.
o Add tagged queuing support to our target role implementation.
o Handle the common cases of the ignore wide residue message
in firmware.
o Add preliminary support for 39bit addressing.
o Add support for assembling on big-endian machines. Big-endian
support is not complete in the driver.
o Correctly remove SCBs in the waiting for selection queue when
freezing a device queue.
o Now that we understand more about the autoflush bug on the
aic7890, only use the workaround on devices that need it.
o Add a workaround for the "aic7890 hangs the system when you
attempt to pause it" problem. We can now pause the aic7890
safely regardless of what instruction it is executing.
allocate a short port range in some alpha configurations.
Submitted by: "Andrew M. Miklic" <miklic@udlkern.fc.hp.com>,
Mark Abene <phiber@radicalmedia.com>
when we're done reading it (makes checking things easier).
Before calling isp_notify_ack make sure we're at RUNSTATE-
elsewise we can be responding to LIPs or SCSI bus resets
before we've finished some of the wiring.
we need a function that tells the Qlogic f/w that a target mode command
is done, so increase the resource count for that lun. Add in a timeout
function to kick the putback again if we fail to do it the first time (we
may not have the request queue space for ATIO push). Split the function
isp_handle_platform_ctio into two parts so that the timeout function for
the ATIO push or isp_handle_platform_ctio can inform CAM that the requested
CTIO(s) are now done.
Clean up (cough) residual handling. What we need for Fibre Channel
is to preserve the at_datalen field from the original incoming ATIO
so we can calculate a 'true' residual. Unfortunately, we're not
guaranteed to get that back from CAM. We'll *try* to find it hiding
in the periph_priv field (layering violation)- but if an ATIO was
passed in from user land- forget it. This means that we'll probably
get residuals wrong for Fibre Channel commands we're completing
with an error. It's too late to 4.1 release to fix this- too bad.
Luckily the only device we'd really care about this occurring on
is a tape device and they're still so rare as FC attached devices
that this can be considered an untested combination anyway.
Remove all CCINCR usage (resource autoreplenish). When we've proved
to ourself that things are working properly, we can add it back
in.
Make sure we propage 'suggested' sense data from the incoming ATIO
into the created system ATIO- and set sense_len appropriately.
Correctly propagate tag values.
Fall back to the model of generating (well, the functions in isp_pci.c
do the work) multiple CTIOs based upon what we get from XPT. Instead
of being able to pair Qlogic generated ATIOs with CAM ATIOs, and then
to pair CAM CTIOs with Qlogic CTIOs, we have to take the CTIO passed
to us from XPT, and if it implies that we have to generate extra
Qlogic CTIOs, so be it. This means that we have to wait until the
last CTIO in a sequence we generated completes before calling xpt_done.
Executive summary- target mode actually now pretty much works well
enough to tell folks about.
There is a number of devices that are compliant, of which the 3Com 5605 is
has been verified to work.
The driver is not perfect yet, but should be able to get you somewhere.
The driver was originally written by Lennart Augustsson, but Mike Smith
and Mike Meyer <mwm@mired.org> did the porting.
3.3volt PCI/cardbus chipsets similar to the 98715 (and they have
512-bit hash tables). Also update the man page to mention the 98727/98732
and the SOHOware SFA110A Rev B4 card with the 98715AEC-C chip.
entropy estimation, but causes an immediate reseed after the input
(read in sizeof(u_int64_t) chunks) is "harvested".
This will be used in the reboot "reseeder", coming in another
commit. This can be used very effectively at any time you think
your randomness is compromised; something like
# (ps -gauxwww; netstat -an; dmesg; vmstat -c10 1) > /dev/random
will give the attacker something to think about.
non-disconnecting command. Interestingly enough, of the other flavors
of the 7.65 f/w (the dual-id and multi-id flavor)- the dual-id doesn't
hang (they're also supposed to be the same except for supporting dual
or multi-id capture!), but other things are questionable as well.
which differ slightly from the Macronix MX98715AEC chip on the sample
adapter that I have in that the multicast hash table is only 128 bits
wide instead of 512. New adapters are popping up with this chip, and
due to improper handling of the smaller hash table, broadcast packets
were not being received correctly.
ether_ifdetach().
The former consolidates the operations of if_attach(), ng_ether_attach(),
and bpfattach(). The latter consolidates the corresponding detach operations.
Reviewed by: julian, freebsd-net
associated patch to XFree86 allows the X server to work with this chipset
on FreeBSD. Additional work will include porting the Linux 3D driver.
Submitted by: Ruslan Ermilov <ru@FreeBSD.org>
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.
Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
with splhigh(). However, the entropy-harvesting routine needs pretty
serious irq-protection, as it is called out of irq handlers etc.
Clues given by: bde
structure; remind myself in the cooments. Also regroup all the Yarrow
variables at the top of the variable list; they are "special".
(no functional change).
(I had been busy for my own research activity until the last weekend)
Supported devices:
SB Midi Port (sbc + midi)
SB OPL3 (sbc + midi)
16550 UART (midi, needs a trick in your hint)
CS461x Midi Port (csa + midi)
OSS-compatible sequencer (seq)
Supported playing software:
playmidi (We definitely need more)
Notes:
/dev/midistat now reports installed midi drivers. /dev/sndstat reports
only pcm drivers. We need the new name(pcmstat?).
EMU8000(SB AWE) does not sound yet but does get probed so that the OPL3
synth on an AWE card works.
TODO:
MSS/PCI bridge drivers
Midi-tty interface to support general serial devices
Modules
stack. It's bad for your machine's health.
Make the two huge structs in reseed() static to prevent crashes. This
is the bug that people have been running into and panic()ing on for the
past few days.
Reviewed by: phk
are working. Add a small blurb about XE_DEBUG as it might be useful
to some people troubelshooting problems in the future.
Submitted by: "Kevin Oberman" <oberman@es.net>
sure that it really is by issuing a ISPCTL_ABORT_CMD just on the
off chance the f/w will start it up again and, ha ha, start using
the DMA resources we gave it but are now taking away.
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Even if it fails to get the lock it would still truncate the
file even though its open failed. With this change, the truncation
is done only after the lock is successfully acquired.
Obtained from: BSD/OS
us to not the ints are ok and also to (re)ENABLE isp interrupts. Remove
all splcam()/splx() invocates and replace them with ISP_LOCK/ISP_UNLOCK
macros.
to isp_osinfo substructure (all in prep for SMP). Define MBOX_WAIT_COMPLETE
and MBOX_NOTIFY_COMPLETE macros so that we can now (temp) use tsleep
to wait for mailbox completion. Requires us to guess whether we're
servicing an interrupt or not- will use intr_nesting_level.
Add local strncat function.
- Add 2 explicit (paranoid?) memory barriers in the
interrupt code (After the reading of the `flag' and
prior to looking at the data, of course. :-) ).
- Remove obsolete informations from the README.sym file.
This commit actually results in no object difference
for IA32, but 2x`mb' added for Alpha.
define). Fix stupidity wrt checking whether we've gone to
LOOP_PDB_RCVD loopstate- it's okay to be greater than this state.
D'oh! Protect calls to isp_pdb_sync and isp_fclink_state with IS_FC
macros.
Completely redo mailbox command routine (in preparation to make this
possibly wait rather than poll for completion).
Make a major attempt to solve the 'lost interrupt' problem
1. Problem
The Qlogic cards would appear to 'lose' interrupts, i.e., a legitimate
regular SCSI command placed on the request queue would never complete
and the watchdog routine in the driver would eventually wakeup and
catch it. This would typically only happen on Alphas, although a
couple folks with 700MHz Intel platforms have also seen this.
For a long time I thought it was a foulup with f/w negotiations of
SYNC and/or WIDE as it always seemed to happen right after the
platform it was running on had done a SET TARGET PARAMETERS mailbox
command to (re)enable sync && wide (after initially forcing
ASYNC/NARROW at startup). However, occasionally, the same thing
would also occur for the Fibre Channel cards as well (which, ahem,
have no SET TARGET PARAMETERS for transfer mode).
After finally putting in a better set of watchdog routines for the
platforms for this driver, it seemed to be the case that the command
in question (usually a READ CAPACITY) just had up and died- the
watchdog routine would catch it after ~10 seconds. For some platforms
(NetBSD/OpenBSD)- an ABORT COMMAND mailbox command was sent (which
would always fail- indicating that the f/w denied knowledge of this
command, i.e., the f/w thought it was a done command). In any case,
retrying the command worked. But this whole problem needed to be
really fixed.
2. A False Step That Went in The Right Direction
The mailbox code was completely rewritten to no longer try and grab
the mailbox semaphore register and to try and 'by hand' complete
async fast posting completions. It was also rewritten to now have
separate in && out bitpatterns for registers to load to start and
retrieve to complete. This means that isp_intr now handles mailbox
completions.
This substantially simplifies the mailbox handling code, and carries
things 90% toward getting this to be a non-polled routine for this
driver.
This did not solve the problem, though.
3. Register Debouncing
I saw some comments in some errata sheets and some notes in a Qlogic
produced Linux driver (for the Qlogic 2100) that seemed to indicate
that debouncing of reads of the mailbox registers might be needed,
so I added this. This did not affect the problem. In fact, it made
the problem worse for non-2100 cards.
5. Interrupt masking/unmasking
The driver *used* to do a substantial amount of masking/unmasking
of the interrupt control register. This was done to make sure that
the core common code could just assume it would never get pre-empted.
This apparently substantially contributed to the lost interrupt
problem. The rewrite of the ICR (Interrupt Control Register),
which is a separate register from the ISR (Interrupt Status Register)
should not have caused any change to interrupt assertions pending.
The manual does not state that it will, and the register layout
seems to imply that the ICR is just an active route gate. We only
enable PCI Interrupts and RISC Interrupts- this should mean that
when the f/w asserts a RISC interrupt and (and the ICR allows RISC
Interrupts) and we have PCI Interrupts enabled, we should get a
PCI interrupt. Apparently this is a latch- not a signal route.
Removing this got rid of *most* but not all, lost interrupts.
5. Watchdog Smartening
I made sure that the watchdog routine would catch cases where the
Qlogic's ISR showed an interrupt assertion. The watchdog routine
now calls the interrupt service routine if it sees this. Some
additional internal state flags were added so that the watchdog
routine could then know whether the command it was in the middle
of burying (because we had time it out) was in fact completed by
the interrupt service routine.
6. Occasional Constipation Of Commands..
In running some very strenous high IOPs tests (generating about
11000 interrupts/second across one Qlogic 1040, one Qlogic 1080
and one Qlogic 2200 on an Alpha PC164), I found that I would get
occasional but regular 'watchdog timeouts' on both the 1080 and
the 2100 cards. This is under FreeBSD, and the watchdog timeout
routine just marks the command in error and retries it.
Invariably, right after this 'watchdog timeout' error, I'd get a
command completion for the command that I had thought timed out.
That is, I'd get a command completion, but the handle returned by
the firmware mapped to no current command. The frequency of this
problem is low under such a load- it would usually take an 30
minutes per 'lost' interrupt.
I doubled the timeout for commands to see if it just was an edge
case of waiting too short a period. This has no effect.
I gathered and printed out microtimes for the watchdog completed
command and the completion that couldn't find a command- it was
always the case that the order of occurrence was "timeout, completion"
separated by a time on the order of 100 to 150 ms.
This caused me to consider 'firmware constipation' as to be a
possible culprit. That is, resubmission of a command to the device
that had suffered a watchdog timeout seemed to cause the presumed
dead command to show back up.
I added code in the watchdog routine that, when first entered for
the command, marks the command with a flag, reissues a local timeout
call for one second later, but also then issues a MARKER Request
Queue entry to the Qlogic f/w. A MARKER entry is used typically
after a Bus Reset to cause the f/w to get synchronized with respect
to either a Bus, a Nexus or a Target.
Since I've added this code, I always now see the occasional watchdog
timeout, but the command that was about to be terminated always
now seems to be completed after the MARKER entry is issued (and
before the timeout extension fires, which would come back and
*really* terminate the command).
comment. Check against firmware state- not loop state when enabling
target mode. Other changes have to do with no longer enabling/disabling
interrupts at will.
Rearchitect command watchdog timeouts-
First of all, set the timeout period for a command that has a
timeout (in isp_action) to the period of time requested *plus* two
seconds. We don't want the Qlogic firmware and the host system to
race each other to report a dead command (the watchdog is there to
catch dead and/or broken firmware).
Next, make sure that the command being watched isn't done yet. If
it's not done yet, check for INT_PENDING and call isp_intr- if that
said it serviced an interrupt, check to see whether the command is
now done (this is what the "IN WATCHDOG" private flag is for- if
isp_intr completes the command, it won't call xpt_done on it because
isp_watchdog is still looking at the command).
If no interrupt was pending, or the command wasn't completed, check
to see if we've set the private 'grace period' flag. If so, the
command really *is* dead, so report it as dead and complete it with
a CAM_CMD_TIMEOUT value.
If the grace period flag wasn't set, set it and issue a SYNCHRONIZE_ALL
Marker Request Queue entry and re-set the timeout for one second
from now (see Revision 1.45 isp.c notes for more on this) to give
the firmware a final chance to complete this command.
store a bitmask of whether we've set a value into ccb->ccb_h.status,
whether we're in the watchdog routine for this command now, whether
we've set a grace period for this command and whether this command is
actually done.
See comments of rev 1.45 of isp.c for more complete information.
output mailbox values we want to get back out of the chip once a mailbox
command is done. Add storage for the maximum number of output mailbox
registers to the softc.
Roll minor version number.
the handle (i.e., generation number), so we will now need a function that
will take a handle and return a flat index [ 0 .. maxhandles-1 ] for
auxillary routines that need an index to get at buddy store values
(like dma maps or xflist pointers).
device with Yarrow, and although I coded for that in dev/MAKEDEV, I forgot
to _tell_ folks.
This commit adds back the /dev/urandom device (as a duplicate) of /dev/random,
until such time as it can be properly announced.
This will help the openssl users quite a lot.
(Reported by Matthew Jacob)
- Fix a couple of __inline__ (changed to __inline).
- Check also against DT_DATA_IN phase on parity/crc error.
(Merged from Pamela Delaney's changes in the Linux driver)
- Fix support for phase mismatch handling from the C code for
the C1010 (only useful for testing issue).
- Add an asynchonous notification handler for `lost device'
(AC_LOST).
This merges in changes from NetBSD which ensure bktr0
(actually bktr%d) is printed at the start of any output lines.
Submitted by: Thomas Klausner <wiz@danbala.ifoer.tuwien.ac.at>
This is work-in-progress, and the entropy-gathering routines are not
yet present. As such, this should be viewed as a pretty reasonable
PRNG with _ABSOLUTELY_NO_ security!!
Entropy gathering will be the subject of ongoing work.
This is written as a module, and as such is unloadable, but there is
no refcounting done. I would like to use something like device_busy(9)
to achieve this (eventually).
Lots of useful ideas from: bde, phk, Jeroen van Gelderen
Reviewed by: dfr
severely stripped down compared with its predecessor, and is measurably
a _lot_ faster.
Many thanks to Jeroen van Gelderen for lots of good ideas.
There is still a problem with this; it is written as a mudule, and as
such is theoretically unloadable. However, there is no refcounting done
as I would prefer to do that a'la device_busy(9), rather than some
"home-rolled" scheme. The point is pretty moot, as /dev/null is
effectively compulsory.
Reviewed by: dfr
the PnP probe is merely a stub as we make assumptions about some of this
hardware before we have probed it.
Since these devices (with the exception of the speaker) are 'standard',
suppress output in the !bootverbose case to clean up the probe messages
somewhat.
Renamed varible dst in ray_rx to mp as it is a pointer to an mbuf.
Correctly grok addresses in data packets.
Promte a couple of RECERRs to real errors.
Rewrote intro at top of file to reflect my better understanding of how it
the memory mapping works.
Clear the DONE list and move some thoughts into the TODO list.
Remove RECERR from RAY_DEBUG
Start to use a desired network parameter structure, only used in download
code as I've realised that there are some problems with the idea.
Break up ray_rx, and move the data packet handler into a seperate function. This meant some knock on changes in ray_rx_mgt/ray_rx_ctl to do with
mbuf freeing.
Remove some debug code/XXX comments that are out of date.
Force alphas to prefer mem mapping as the default.
Basically, we have a pointer to a function which we can call which will
return us a pointer to firmware for the card we have. We call this function
(if it's non-NULL) with the address of our mdvec f/w pointer.
The way this works is that if ispfw (as a module or a static) is loaded,
it initializes the pointer in isp_pci, so we can call into to it to fetch
a pointer to a f/w set.
If ispfw is MOD_UNLOADed, it's retained a pointer to our mdvec f/w pointers,
which then get zeroed out so we don't have any references to data that's
now gone from kernel memory. Removing the f/w saves ~360KBytes.
Alas, there is no autounload mechanism that works for is here.
through, establish what our LUN width is. Unfortunately, we can't ask
the f/w. If we loaded the f/w, we'll now assume we have expanded LUNs
(SCCLUN for fibre channel, just plain 32 LUN for SCSI). If we didn't
load firmware, assume 8 LUNs for SCSI and 1 LUN for Fibre Channel. We
have to assume only one LUN for Fibre Channel because the LUN setting
in Request Queue entries is in different places whether we have SCCLUN
firmware or not, so the only LUN guaranteed to work for both is LUN 0.
Clean up the rest of isp.c so that ISP2100_SCCLUN defines aren't used-
instead use run time determinants based upon isp->isp_maxluns.
After starting firmware, delay 500us to give it a chance to get rolling.
Fix the interrupt service routine to check for both isr && sema being zero
before thinking this was a spurious interrupt. Following the manuals,
allow for both Mailbox as well as Queue Reponse type interrupts for regular
SCSI.
(we always support fabric now). Remove SCCLUN definition (we always
support SCCLUN now, if we load the f/w). Add typedef definition of an
external firmware fetch function.
need this RSN.
Remove a pointless warning in the root device locating code.
Remove the "wd" compatibility name from the "ad" driver.
WARNING: If you have not updated to use /dev/wd* in your /etc/fstab
and modern bootblocks, it would be a very good idea to do so BEFORE
you upgrade your kernel.
Implement the Solaris way to break into DDB over a serial console
instead of sending a break. Sending the character sequence
CR ~ ^b will break the kernel into DDB (if DDB is enabled).
Reviewed by: peter
layout introduced in driver 1.5.3. The driver was
confused by the bogus TEKRAM table used to translate
user sync. setting to SCSI sync. factor.
Btw, the new TEKRAM DC-390 U3D and U3W Ultra-160
controllers seem to be using BIOS from SYMBIOS/LSI
and thus SYMBIOS NVRAM layout.
If that means that TEKRAM will now offer real
SYMBIOS software compatible SCSI controllers, then
it is a *GREAT NEWS*.
What we'd like to know is whether or not we have a listener
upstream that really hasn't configured yet. If we do, then
we can give a more sensible reply here. If not, then we can
reject this out of hand.
Choices for what to send were
Not Ready, Unit Not Self-Configured Yet
(0x2,0x3e,0x00)
for the former and
Illegal Request, Logical Unit Not Supported
(0x5,0x25,0x00)
for the latter.
We used to decide whether there was at least one listener
based upon whether the black hole driver was configured.
However, recent config(8) changes have made this hard to do
at this time.
Actually, we didn't use the above quite yet, but were sure considering it.