cardbus_cis.c to this file, some code was not merged and thus resource
list entries were invalid. They didn't have a resources attached to
them.
However, the problem was masked for some time later, because newer
resources list entries were added to the head of the list, and
resource_list_find() always returned the first matching resource list
entry. Usually the underlying driver allocated a valid resource and
added it to the head of the list, and invalid one wasn't used.
In rev. 1.174 of subr_bus.c the sorting of resource list entries was
reversed demasking the problem in cardbus_alloc_resources().
This commit fixes the problem returning back some code from
cardbus_cis.c, pre-1.49 revisions.
PR: kern/87114
PR: kern/90441
Hardware provided by: Vasily Olekhov <olekhov yandex.ru>
Reviewed by: imp
. remove unnecessay header files after Scott's bus_dma(9) commit.
. remove global variable tis which was introduced at the time of
zero_copy(9) changes. The variable tis was not used at all. The
same applyes to ti_links in softc so axe it.
. deregister variables.
. axe ti_vhandle and switch to use explicit register access for
accessing NIC local memory. Creates three variants of ti_mem to
read/write NIC local memory(ti_mem_read, ti_mem_write) and clearing
NIC local memory(ti_mem_zero). This greatly enhances code
readability and have ti(4) drop using shared memory scheme for
Tigon 1. As Tigon 1 switched to use explicit register access for Tx,
axe ti_tx_ring_nic/ti_cmd_ring in softc.(Tigon 2 used to host ring
scheme which means there is no need to access NIC local memory via
register access for Tx and NIC would DMA the modified Tx rings into
its local memory.) [1]
. introduce new macro TI_EVENT_*/TI_CMD_* to handle NIC envent/command.
Instead of using bit fields assginment for accessing the event, use
shift operations to set/get it. [1]
. add additional check for valid DMA tags in ti_free_dmamaps().
. add missing bus_dmamap_sync/bus_dmamap_unload in ti_free_*_ring_*.
. fix locking nits(MTX_RECURSE mutex) and make ti(4) MPSAFE.
. change data type of ti_rdata_phys to bus_addr_t and don't blindly
cast to uint32_t.
. rearrange detach path and make ti(4) survive during device detach.
. for Tigon 1, use explicit register access for checking Tx descriptors
in ti_encap()/ti_txeof(). [1]
. properly call bus_dmamap_sync(9) for updating statistics.
. remove extra semicolon in ti_encap()
. rewrite loading MAC address to work on strict-alignment architectures.
. move TI_RD_OFF macro to if_tireg.h
. axe ETHER_ALIGN as it's already defined in <net/ethernet.h>.
. make macros immuine from expansion by adding parenthesis and do-while.
. remove alpha specific hack as vtophys(9) is no longer used in ti(4)
after Scott's bus_dma(9) fix.
Reviewed by: scottl
Obtained from: OpenBSD [1]
POSIX. This also makes the struct correct we ever implement an i386-time64
architecture. Not that we need too.
Reviewed by: imp, brooks
Approved by: njl (acpica), des (no objects, touches procfs)
Tested with: make universe
if we need a valid MAC address (for probing the media for example) before
ether_ifattach() has been called since IF_LLADDR() is NULL then.
Tested by: tisco
except for BGE_CHIPID_BCM5700_B0, which is buggy.
- All bge(4) supported hardware, has a bug that produces incorrect checksums
on Ethernet runts. However, in case of a transmitted packet, the latter can
be padded with zeroes, and the checksum would be correct. (Probably chip
includes the pad data into checksum). In case of receive, we just don't
trust checksum data in received runts.
Obtained from: NetBSD (jonathan) via Mihail Balikov
Previously it always returned 0 which means success regardless of
EEPROM status.
While here, add a check whether EEPROM read is successful.
Submitted by: jkim
- removed unused funtion bge_handle_events().
- removed bus_dmamap_destroy(9) calls for DMA maps created by
bus_dmamem_alloc(9). This should fix panics seen on sparc64
in device detach.
- added check for parent DMA tag creation.
- switched to use __NO_STRICT_ALIGNMENT as bge(4) supports all
architectures.
- added missing bus_dmamap_sync(9) in bge_txeof().
- added missing bus_dmamap_sync(9) in bge_encap().
- corrected memory synchronization operation on status block.
As the driver just read status block that was DMAed by NIC it
should use BUS_DMASYNC_POSTREAD. Likewise the driver does not
need to write status block back, so remove unnecessary
bus_dmamap_sync(9) calls in bge_intr().
- corrected memory synchronization operation on RX return ring.
The driver only read the block so remove unnecessary
bus_dmamap_sync(9) in bge_rxeof().
- force bus_dmamap_sync(9) for only modified descriptors. Blindly
synching all desciptor rings would reduce performance.
- call bus_dmamap_sync(9) for DMA maps that were modified in bge_rxeof().
Reviewed by: jkim(initial version)
Tested by: glebius(i386), jkim(amd64 initial version)
the addition of pci_find_extcap().
- Change the drm drivers to attach to vgapci. This is #ifdef'd so the
code can be shared across branches.
- Use pci_find_extcap() to look for AGP and PCIE capabilities in drm.
- GC all the drmsub stuff for i810/i830/i915. The agp and drm devices are
now both children of vgapci.
as a bus so that other drivers such as drm(4), acpi_video(4), and agp(4)
can attach to it thus allowing multiple drivers for the same device. It
also removes the need for the drmsub hack for the i8[13]0/i915 drm and agp
drivers.
attach to the hostb driver instead. This means that agp can now be loaded
at runtime (in theory at least). Also, the drivers no longer have to
explicity call device_verbose() to cancel out any earlier calls to
device_quiet() by the hostb(4) driver (this shows a limitation in new-bus,
drivers really shouldn't be doing device_quiet() until they know they are
going to drive that device, i.e. in attach).
duplicated anyways) and into a single MI driver. Extend the driver a bit
to implement the bus and PCI kobj interfaces such that other drivers can
attach to it and transparently act as if their parent device is the PCI
bus (for the most part).
drivers already map sections into KVA as needed anyway. Note that this
will probably break the nvidia driver, but I will coordinate to get that
fixed.
MFC after: 2 weeks
to search for a specific extended capability. If the specified capability
is found for the given device, then the function returns success and
optionally returns the offset of that capability. If the capability is
not found, the function returns an error.
we can cache its value in the softc. Eliminates one PCI register
write per call to bge_start().
A 1.8% speedup for UDP_RR test on my old box.
Obtained from: NetBSD(jonathan) via delphij
case if memory allocation failed.
- Remove fourth argument from VLAN_INPUT_TAG(), that was used
incorrectly in almost all drivers. Indicate failure with
mbuf value of NULL.
In collaboration with: yongari, ru, sam
command. This fixes some weird booting issues on newer versions
of the firmware on the MSA20.
Reported by: Philippe Pegon <Philippe dot Pegon at crc dot u-strasbg dot fr>
- Give up endianess support and switch to native-endian format for
accessing hardware structures. In fact embedded processor for
BCM57xx is big-endian architure(MIPS) and it requires native-endian
format for NIC structures.The NIC performs necessary byte/word
swapping depending on programmed endian type.
- With above changes all htole16/htole32 calls were gone.
- Remove bge_vhandle member in softc and changed to use explicit
register access. This may add additional performance penalty
that than that of previous memory access. But most of the access
is performed on initialization phase(e.g. RCB setup), it would be
negligible.
Due to incorrect use of bus_dma(9) in bge(4) it still panics sparc64
system in device detach path. The issue would be fixed in next patch.
Reviewed by: jkim (initial version)
Silence from: ps
Tested by: glebius
Obtained from: NetBSD via OpenBSD
1. Implement a large set of ioctl shims so that the Linux management apps
from LSI will work. This includes infrastructure to support adding, deleting
and rescanning arrays at runtime. This is based on work from Doug Ambrosko,
heavily augmented by LSI and Yahoo.
2. Implement full 64-bit DMA support. Systems with more than 4GB of RAM
can now operate without the cost of bounce buffers. Cards that cannot do
64-bit DMA will automatically revert to using bounce buffers. This option
can be forced off by setting the 'hw.amr.force_sg32" tunable in the loader.
It should only be turned off for debugging purposes. This work was sponsored
by Yahoo.
3. Streamline the command delivery and interrupt handler paths after
much discussion with Dell and LSI. The logic now closely matches the
intended design, making it both more robust and much faster. Certain
i/o failures under heavy load should be fixed with this.
4. Optimize the locking. In the interrupt handler, the card can be checked
for completed commands without any locks held, due to the handler being
implicitely serialized and there being no need to look at any shared data.
Only grab the lock to return the command structure to the free pool. A
small optimization can still be made to collect all of the completions
together and then free them together under a single lock.
Items 3 and 4 significantly increase the performance of the driver. On an
LSI 320-2X card, transactions per second went from 13,000 to 31,000 in my
testing with these changes. However, these changes are still fairly
experimental and shouldn't be merged to 6.x until there is more testing.
Thanks to Doug Ambrosko, LSI, Dell, and Yahoo for contributing towards
this.
to use busdma. Unlike most of the other drivers, but similar to the
if_em driver, pre-allocate the dmamaps at init time instead of allocating
them on the fly when descriptors need to be filled. This isn't ideal right
now because a map is allocated for every descriptor slot in the tx, rx, mini,
and jumbo rings (which is a lot!) in order to simplify the bookkeeping, even
though the driver might support filling only a subset of those slots.
Luckily, maps are typically NULL on i386 and amd64, so the cost isn't
very high. It could be an issue with sparc64, but the driver isn't endian
clean either, and that is a much bigger problem to solve first.
Note that jumbo frame support is under-tested, and I'm not even sure if
it till really works correctly given the evil VM magic that is does.
The changes here attempt to preserve the existing semanitcs.
Thanks to Martin Nillson for contributing the Netgear card for this work.
MFC-After: 3 weeks
compilation of kernels without ns8250 support but using the uart framework.
These kernels will be for machines where size matters more, so including code
that can never be executed is undesriable...
working at all and only saw "nve0: device timeout (N)" messages.
- Setup PHY before handing control to NVidia API setting
speed, duplex, enabling interrupts, etc.
- Add restriction of MAXADDR_32BIT for high address to contigmalloc
to make the driver work on machines with 4+GB of memory.
PR: kern/85583, kern/88045
Tested by: scottl, others earlier version
MFC after: 10 days
transmitted bits was between 8.6180us and 8.6200us when we used a RCLK
of 16.500MHz. This is a little low (should be 8.6805us). This error
is exactly the error one would expect if it actually had a 16.384MHz
watch oscillator (as suggested by garrett) instead of using the PCI
RCLK. Assume that the pci clock therefore wasn't really used, but
instead the cheap 16.384MH watch quartz oscillator. This gives bits
in the 8.6800us to 8.6810us ranage, which matches theoretical.
Submitted by: garrett
cluster allocator, that wasn't MPSAFE. Instead, utilize our new generic
UMA jumbo cluster allocator. Since UMA gives us a 9k piece that is contigous
in virtual memory, but isn't contigous in physical memory we need to handle
a few segments. To deal with this we utilize Tigon chip feature - extended
RX descriptors, that can handle up to four DMA segments for one frame.
Details:
o Remove bge_alloc_jumbo_mem(), bge_free_jumbo_mem(),
bge_jalloc(), bge_jfree() functions.
o Remove SLIST heads, bge_jumbo_tag, bge_jumbo_map from softc.
o Use extended RX BDs for Jumbo receive producer ring, and
initialize it appropriately.
o New bge_newbuf_jumbo():
- Allocate an mbuf with Jumbo cluster with help of m_cljget().
- Load the cluster for DMA with help of bus_dmamap_load_mbuf_sg().
- Assert that we got 3 segments in the DMA mapping.
- Fill in these 3 segments into the extended RX descriptor.
2) rework link state detection code & use it in POLLING mode
3) fix 2 bugs in link state detection code:
a) driver unable to detect link loss on bcm5721
b) on bcm570x chips (tested on bcm5700 bcm5701 bcm5702) driver fails
to detect link loss with probability 1/6 (solved in brgphy.c)
Devices working in TBI mode should not be affected by this change.
Approved by: glebius (mentor)
MFC after: 1 month
"done" method so that for non-repeat operations we have completely
finished with the transfer by the time the callback is invoked.
This makes it possible to recycle a transfer from within the callback
routine for the same transfer. Previously this almost worked, but
with OHCI controllers calling the "done" method after the callback
would zero out some important fields needed by the recycled transfer.
Only some usb peripheral drivers such as ucom appear to rely on the
ability to reuse a transfer from its callback.
MFC after: 1 week
time ago appears to be based not on the typical 1.8432MHz clock, or
the other more typical multiple of 8 of this (14.7456MHz), but instead
it appears to be 1/2 the PCI clock rate or 16.50000MHz. I'm not 100%
sure that this is right, but since I did the original entry, I'm going
to go ahead and modify it. With the 14.7456MHz value, I was getting
bits that were ~7.3us instead of ~8.6us like they are supposed to be.
My measuring gear for today is a stupid handheld scope with two
signficant digits. So I don't know if it is 33.000000/2 MHz or some
other value close to 16.5MHz, but 16.5MHz works well enough for me to
use a couple of different devices at 115200 baud, and is a nice even
multiple of a well known clock frequency...
acquired anywhere in the driver now.
- Axe the spin mutex used for the nve_oslock*() routines. The driver lock
already provides sufficient synchronization.
- Don't mess around with IFF_UP when the link state changes. IFF_UP is
an administrative flag, not a link status indicator.
MFC after: 1 week
immediately from acpi_pci_link_route_interrupt() since we aren't going
to have a valid pci_link device to talk to try to route interrupts. This
fixes a page fault if you disable just pci_link. Note that trying to use
ACPI without pci_link is probably not advised however.
MFC after: 1 week
Tested by: Eugene Grosbein eugen at kuzbass dot ru
the eaddr array (introduced in rev. 1.174) prior to writing to it. As
dc_read_eeprom() is told to write only 3 16-bit words to eaddr but eaddr
in fact is somewhat larger removal of the zeroing defeated the check
whether the MAC address is all zero as there can be some random garbage
in eaddr past the 3 words written to it and the check verifys all bits
in eaddr. Solve this by changing the check to verify only the 3 words
(happenning to be ETHER_ADDR_LEN bytes) written to eaddr.
- While here change the notation of "FCode" in a nearby comment to the
official way.
Ok'ed by: marcel, ru
polarity. Some machines route PCI IRQs to an ISA IRQ but fail to include
an interrupt override entry to set the polarity and trigger of the given
ISA IRQ in their MADT table.
PR: usb/74989
Reported by: Julien Gabel jpeg at thilelli dot net
MFC after: 1 week
and some fixes from Motomichi Matsuzaki. Testing involved many people, but the
final, successful testing was from rwatson who endured several rounds of "it
crashes at XYZ stage" "oh, please correct this typo and try again." The Linux
driver, and to a small extent the limited specs, were both used as a reference
for how to program the chipset.
PR: kern/80396
Submitted by: Martin Mersberger
Update Intel MatrixRAID support to be able to pick up RAID0+1 (RAID10)
and RAID5 arrays without panic'ing.
This has the side effect of now also supporting multiple volumes on
MatrixRAID's now I have the metadata better understood..
HW sponsored by: Mullet Scandinavia AB
commit. Copy the ethernet address into a local buffer, which we know
is sufficiently aligned for the width of the memory accesses that we
do. This also eliminates all suspicious and potentionally harmful
casts.
In collaboration with: ru
- S3 Savage driver ported.
- Added support for ATI_fragment_shader registers for r200.
- Improved r300 support, needed for latest r300 DRI driver.
- (possibly) r300 PCIE support, needs X.Org server from CVS.
- Added support for PCI Matrox cards.
- Software fallbacks fixed for Rage 128, which used to render badly or hang.
- Some issues reported by WITNESS are fixed.
- i915 module Makefile added, as the driver may now be working, but is untested.
- Added scripts for copying and preprocessing DRM CVS for inclusion in the
kernel. Thanks to Daniel Stone for getting me started on that.
appeared to rely on all kinds of non-guaranteed behaviours: the
transfer abort code assumed that TDs with no interrupt timeout
configured would end up on the done queue within 20ms, the done
queue processing assumed that all TDs from a transfer would appear
at the same time, and there were access-after-free bugs triggered
on failed transfers.
Attempt to fix these problems by the following changes:
- Use a maximum (6-frame) interrupt delay instead of no interrupt
delay to ensure that the 20ms wait in ohci_abort_xfer() is enough
for the TDs to have been taken off the hardware done queue.
- Defer cancellation of timeouts and freeing of TDs until we either
hit an error or reach the final TD.
- Remove TDs from the done queue before freeing them so that it
is safe to continue traversing the done queue.
This appears to fix a hang that was reproducable with revision 1.67
or 1.68 of ulpt.c (earlier revisions had a different transfer
pattern). With certain HP printers, the command "true > /dev/ulpt0"
would cause ohci_add_done() to spin because the done queue had a
loop. The list corruption was caused by a 3-TD transfer where the
first TD completed but remained on the internal host controller
done queue because it had no interrupt timeout. When the transfer
timed out, the TD got freed and reused, so it caused a loop in the
done queue when it was inserted a second time from a different
transfer.
Reported by: Alex Pivovarov
MFC after: 1 week
This shouldn't happen as far as the self-id buffer is vaild but
some people have this problem.
PR: kern/83999
Submitted by: Markus Wild <fbsd-lists@dudes.ch>
MFC after: 3 days