at. I don't think this will make a huge difference, but I have
received a report of a interrupt storm on one 16-bit card that this
might fix (chances are it won't, since I think that we may need to
check both the CBB registers for the 16-bit card as well as the PCIC
registers for power state change).
Submitted by: jhb@
interrupt code to be more robust. I've been running these changes for
over a year... With these changes, I don't see the ath card going
into reset like the code in the tree.
on a variety of cards. Adjust the comments accordingly to match the
code. Even if the vendor chose 0xffff for the device ID, the vendor
ID can't be 0xffff, so the test is still valid from a standards
perspective.
that I have. Wait up to 1.1s for the card to become ready. Document
what the standards say, and use that to justify the behavior in the
code: PCI standard says that a card must respond to configuration
cycles within 2^25 cycles after reset goes high, which is
approximately 1s. Therefore, give cards a little break and wait for
up to 1.1s for VENDOR to become valid. Only look at the vendor part
of the ID, since only it can't be 0xffff (although in practice
vendor/device will always be != 0xfffffffff). Include detailed
pointers to standards so epople understand why we're doing what we're
doing and why it just might be OK. Make it clear in the timeout
message that it is just a warning, sinc we try to soldier on as best
we can anyway.
This should eliminate an error message that r181453 produced on
certain Atheros cards.
and also holds things up, check every 20ms to see if we can read the
vendor of device 0.0. It will be 0xffffffff until the card is out of
reset. Always wait at least 20ms, for safety.
I think this is a better fix to the reset problem. However, I did it
as a separate commit in case something bad happens, people can roll
back to the commit before this one to see if that gives them reliable
behavior. I don't have FreeBSD up on enough machines to do exhaustive
testing on all known bridges...
some bridge + card combinations that take longer for reasons unknown.
Adjust the timeout to be 100ms on all !RICOH bridges, but leave RICOH
at 400ms. The 400ms is "lore" from other open source projects, and
I've never see my ricoh bridge chips take this long. Maybe it is the
same thing? Maybe a bit should be read instead of a hard-wired pause?
After this adjustment, a few cards that I'd insert and get only:
cbb0: card_power: 3V
cbb0: card_power: 0V
with full debugging enabled would actually try to attach.
Reported by: sam@ (I think)
MFC after: 3 days
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.
Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)
own entry in the softc. This should allow more of cbb_pci_intr() to
migrate to a new cbb_pci_filt() so that we don't have to run cbb's ISR
in almost every case we get an interrupt. We can't just move
cbb_pci_intr into cbb_pci_filt because it does things that aren't safe
to do from a fast interrupt handler, err I mean from a filter. This is
an important first step.
# I wonder if I need to make cardok volatile or not.
o If we don't have a filter, also check to make sure the card is there before
calling the scheduled ISR. This is necessary to help old drivers whose
ISRs can't cope with being called with the hardware missing, which sadly
still exist in the tree. This is the main reason why we have an extra
layer of indirection for cardbus interrupts.
o If the card is no longer present, mark the interrupt as 'handled' rather
than 'stray' because this accounts for why the interrupt happened. Stray
isn't all bad, since there are other filters that would claim it...
o Fix some comments
+ Add comment about why we check for CARD_OK and touch the hardware in both
the filter and ISR.
+ add a note about why we don't care about Giant
+ also note that giant can't be taken out in a filter...
+ Some minor formatting nits on very long comments.
Add some comments to explain how 10 was picked. 20 was completely
arbitrary, at least 10 has some reasoning behind it.
Also, update the comments about how long we sleep to reflect the new,
shorter timeout we use.
(1) change debounce period from 1s to 250ms. This appears to be fine and
speeds things up a little.
(2) In the middle of cbb_pcic_power_disable_socket we write 0 to the EXCA_INTR
register to put the card into reset. However, this turns off CSC
interrupts for TI bridges (and maybe others). So no further card
insertion events would be noticed. To compensate, after we've gone
through the entire power down sequence, turn on EXCA_INTR_ENABLE so
that CSC events happen.
#2 should fix the 'dead slot' problem that has been reported after
card ejection (but only 16-bit cards).
device pointers. They don't change as the children device drivers
come and go. Rather, check to see if the device is attached where we
would have checked ! NULL. This solves many asymmetries in the code
that likely could lead to crashes when loading/unloading cbb without
one or more of the expected children's driver not present.
o When detaching all children, try really hard to get all the children
list before giving up. This is based on an observation by hans petter
selasky in his usb p4 branch.
o When rescanning devices after a driver is added, abort if we can't get
the child list with a message.
o when rescanning devices, if the reprobe/attach is successful, save the
device for cardbus/pccard.
o when turning off the socket for a 16-bit card, write 0 to INTR register
rather than just tying to just clear the rest bit. this seems to fix
card insert detection after an eject on TI bridges (ricoh bridges work
either way, apparently). This is a MFp4.
o Cope better with TOPIC95 bridges on powerup. According to NetBSD driver,
these bridges don't set POWER_STATE, so cope accordingly in our power
code. They also need a little extra time to settle, so do that as well.
o It appears that we need to turn on/off one of the clocks to the card
when we power up/down that socket on a TOPIC97, also from NetBSD.
o TOPIC97 bridges need to specifically enable LV card support. Unconditionally
do this in the hopes that all laptops that have these chips support LV
voltages (they should, since they are required for CardBus).
o TOPIC register name regularization. Registers specific to models of TOPIC
are now called out as such.
# I need a machine with a TOPIC95 for testing.
was done, I believe, to work around some cards having issues in the
suspend case. I think that this helped my Sony VAIO TS505 work better
when it had certain wireless cards in it and I did a apm -z. I've not
tested suspend/resume on other laptops in a long time, so I hope this
doesn't cause greif. Please let me know if it does.
of cases where we didn't take out the lock before setting or clearing
a bit. This apparently can lead to a race at kldunload time (at least
on my Turion64 laptop, never saw it on my Sony Vaio).
in the ISR doesn't read the actual socket event register, but instead
reads garbage (usually 0xffffffff, but other times other things).
This totally violates the PCI spec, but happens rarely enough that a
workaround is in order. This adds one test when we have a real
interrupt to service (which is very rare), and doesn't affect the
usualy 'nothing to see here' case at all.
Problem reported by many, but sam@ gave me this workaround after
diagnosing the problem.
socket also supports the voltage. Some XV cards have appeared on the
scene (or cards that report they support XV), and in older machines
that have sockets that do not support XV, we were bogusly trying to
power them at XV rather than at 3.3V. Now, power up the card at the
lowest voltage supported by both the card and the socket.
MFC After: 3 days
than just deleting them. Also add comments about why we do this.
Given the current behavior of delete_child, I don't think this changes
anything. It just feels cleaner.
try very hard to be perfect. However, these attempts broke down when
there were large numbers of resources. We'd not be able to map them all.
Instead, accept that we might pass more range to thse subbus than
might be optimal be able to compute. However, there's little harm in
this and it allows us to pass greater resources through.
# it has been suggested that we allocate a fixed amount of resources
# on attach and give it out upon request. This might not be a bad idea...
o Rather than just try to turn off EXCA_INTR_RESET, set the entire register
to 0. This is slightly faster, and a better hammer.
o Move attempted clearing of the output enable (EXCA_PWRCTL_OE) back to
after we turn off the power. Modify it to write 0 so that we don't get
Bad Vcc messages on TI bridges (untested, but ru@ sent me a similar patch)
while at the same time avoiding interrupt storms on Ricoh bridges (tested
by me on my Sony).
# Many of my observations of 'breakage' for this patch are due to some bug
# in the load/unload of cbb.ko unlreated to this change. I'll be investigating
# and fixing that bug in the fullness of time.