IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs. This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.
For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.
Approved by: re (kib)
MFC after: 6 weeks
- In bce_rx_intr(), use BUS_DMASYNC_POSTREAD instead of
BUS_DMASYNC_POSTWRITE, as we want to "read" from the
rx page chain pages.
- Document why we need to do PREWRITE after we have updated
the rx page chain pages.
- In bce_intr(), use BUS_DMASYNC_POSTREAD and
BUS_DMASYNC_PREREAD when before and after CPU "reading"
the status block.
- Adjust some nearby style mismatches/etc.
Pointed out by: yongari
Approved by: davidch (no objection) but bugs are mine :)
- Added missing firmware for 5709 A1 controllers.
- Changed some debug statistic variable names to be more consistent.
Submitted by: davidch
MFC after: Two weeks
a real packet error but simply indicate that an unexpected unicast or multicast
error was received by the NIC, which was not counted in the past as well.
Reported by: many (on -stable@)
Reviewed by: davidch
MFC after: 3 days
change. As a side effect, this makes the excessive interrupts to
disappear which has been observed as a regression in recent stable/7.
Reported by: many (on -stable@)
Reviewed by: davidch
and ifnet functions
- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own
- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers
- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues
This work was supported by Bitgravity Inc. and Chelsio Inc.
- Added some additional code for debug builds.
- Fixed a problem printing physical memory on 64bit system during debugging.
- Modified some of the context memory and mailbox register names to more
clearly distinguish their use.
- Added memory barriers for Intel CPUs when accessing host memory data
structures which are written by hardware.
MFC after: Two weeks.
- Fixed a problem on i386 architecture when using split header/jumbo frame
firmware caused by hardware alignment requirements.
- Added #define BCE_USE_SPLIT_HEADER to allow the feature to be enabled/
disabled. Enabled by default.
PR: kern/123696
MFC after: 2 weeks
aligned on an 8 byte boundary. Prior to rev 1.36 this wasn't a problem
because mbuf clusters tend be naturally aligned. The switch to using
split buffers with the first buffer being the embedded data area of the
mbuf has broken this assumption, at least on i386, causing a complete
failure of RX functionality. Fix this for now by using a full cluster for
the first RX buffer. A more sophisticated approach could be done with the
old buffer scheme to realign the m_data pointer with m_adj(), but I'm also
not clear on performance benefits of this old scheme or the performance
implications of adding an m_adj() call to every allocation.
TX traffic to sit in the send chain until a received packet kick
started the interrupt handler. This would cause extremely slow
performance when used with NFS over UDP.
- Removed untested polling code.
- Updated copyright year in the file header.
- Removed inadvertent ^M's created by DOS text editor.
MFC after: 2 weeks
- Added loose RX MTU functionality to allow frames larger than 1500 bytes
to be accepted even though the interface MTU is set to 1500.
- Implemented new TCP header splitting/jumbo frame support which uses
two chains for receive traffic rather than the original single recevie
chain.
- Added additional debug support code.
errors (especially when jumbo frames are enabled or in low memory systems)
because the RX chain was corrupted when an mbuf was mapped to an unexpected
number of buffers.
- Fixed a problem that would cause kernel panics when an excessively
fragmented TX mbuf couldn't be defragmented and was released by
bce_tx_encap().
Approved by: re(hrs)
MFC after: 7 days
- Updated firmware to latest release (v3.4.8) to fix TSO + jumbo frame lockup
- Added MSI (hw.bce.msi_enable) and TSO (hw.bce.tso_enable) sysctls
- Fixed kernel panic when MSI is used and module is unloaded
- Added several new debug routines
- Removed slack space for RX/TX chains since it only covers sloppy coding
- Fixed a potential problem when programming jumbo MTU size in hardware
- Various other comment changes
MFC after: 4 weeks
Updated copyright date to 2007.
Tested with BCM5706 A3.
Added ID for BCM5708 B2.
Removed unused driver version string.
Modified BCE_PRINTF macro to automatically fill-in the sc pointer.
Fixed a kernel panic when the driver was loaded as a module from the
command-line because the MII bus pointer was null (i.e. the MII bus
hadn't been enumerated yet).
Added fix proposed by Vladimir Ivanov <wawa@yandex-team.ru> to prevent
driver state corruption when releasing the lock during the ISR in
bce_rx_intr() to send packets up the stack.
Added new TX chain and register read sysctl interfaces for debugging.
Cleaned up formatting for various other debug routines.
Added a new statistic maintained by firmware which tracks the number
of received packets dropped because no receive buffers are available.
If these drivers are setting M_VLANTAG because they are stripping the
layer 2 802.1Q headers, then they need to be re-inserting them so any
bpf(4) peers can properly decode them.
It should be noted that this is compiled tested only.
MFC after: 3 weeks
blade systems, such as the Dell 1955 and the Intel SBXD132.
Development hardware for this work was provided by Broadcom and iXsystems.
A SBXD132 blade for testing was provided by Iron Systems.
that piggybacks on bce_tick() callout.
- Instead of unconditionally resetting the controller, try to
skip the reset in case we got a pause frame, like em(4) did.
- Lock bce_tick() using callout_init_mtx().
Discussed with/Reviewed by: glebius, scottl, davidch
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.
Sponsored by: Ironport
MFC After: 3 days
descriptor's mbuf pointer to see if the transmit ring is full. The
mbuf pointer is set only in the last descriptor of a
multi-descriptor packet. By relying on the mbuf pointers of the
earlier descriptors, the driver would sometimes overwrite a
descriptor belonging to a packet that wasn't completed yet. Also,
tx_chain_prod wasn't updated inside the loop, causing the wrong
descriptor to be checked after the first iteration. The upshot of
all this was the loss of some transmitted packets at medium to high
packet rates.
In bce_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.
Correct a couple of inaccurate comments.
MFC after: 1 month
and TCP checksum offloading fine, it only has a problem with IP checksums on
IP fragments.. Barring a fix or workaround available from the hardware, the
real solution would be to have finer grained control in the stack over what
can and cannot be assisted in hardware.
kept unused in the ring. This check should probably be moved up to
bce_start_locked at some point, as it'll make the loop up there slightly
more efficient, and will eliminate a costly set of busdma operations when
the ring is full. But this works for now.
This makes all of my UDP torture tests work. I'll cautiously say that
it might even work for other users now. Feedback is appreciated.
- Use bus_dmamap_load_mbuf_sg() to eliminate the need for the callback and
all of the extra bookkeeping associated with it.
- Eliminate the bce_dmamap_arg structure and streamline the memory allocation
routines to not need it. This does change some of the debugging messages.
- Refactor the loop that fills the buffer descriptor so that it can be done
with a single set of logic in a single loop instead of two sets of logic.
- Eliminate the need to cache and pass descriptor indexes between the start
loop and the encap function.
- Change the start loop to always check the ifnet sendq for more work.
This significantly helps the driver withstand large UDP workloads, though
it's still not perfect. I suspect the remaining work lies with handling
the OACTIVE flag, and also in possibly streamlining the interrupt handler
some. It is, however, nearly on par with the other popular gigabit drivers
in terms of stability now.