1) Restore the PCI Express control register after a watchdog
reset. This is required because the device will come out
of watchdog reset with the pectl reg at its default state,
and important BIOS configuration (like max payload size)
could be lost.
2) Call mxge_start_locked() for every tx queue before dropping
the lock in the watchdog handler. This is required, as
the queue's buf ring may have filled during the reset.
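As a rough illustration of item 1, the save-and-restore idea can be sketched in plain C against a mock register. This is not the driver's code: `struct mock_nic` and every function name below are hypothetical, and the reset default of 0 is an assumption. The only hardware fact relied on is that Max_Payload_Size occupies bits 7:5 of the PCIe Device Control register, encoded as 128 << n bytes.

```c
#include <stdint.h>

/* Max_Payload_Size field: bits 7:5 of the PCIe Device Control register. */
#define PECTL_MAX_PAYLOAD_MASK  0x00e0u
#define PECTL_MAX_PAYLOAD_SHIFT 5

/* Hypothetical stand-in for the NIC; the real driver reads and writes
 * PCI config space instead of a struct field. */
struct mock_nic {
	uint16_t pectl;        /* live PCIe control register */
	uint16_t saved_pectl;  /* copy captured while the BIOS setup is intact */
};

/* Capture the BIOS-configured value, e.g. at attach time. */
static void nic_save_pectl(struct mock_nic *nic)
{
	nic->saved_pectl = nic->pectl;
}

/* Model a watchdog reset: the register falls back to its default
 * (assumed to be 0 here, i.e. a 128-byte max payload). */
static void nic_watchdog_reset(struct mock_nic *nic)
{
	nic->pectl = 0;
}

/* Put the saved value back once the reset has completed. */
static void nic_restore_pectl(struct mock_nic *nic)
{
	nic->pectl = nic->saved_pectl;
}

/* Decode the max payload size in bytes from a control register value. */
static unsigned max_payload_bytes(uint16_t pectl)
{
	return 128u << ((pectl & PECTL_MAX_PAYLOAD_MASK) >> PECTL_MAX_PAYLOAD_SHIFT);
}
```

Without the restore step, a NIC that the BIOS had configured for 512-byte payloads would run at the 128-byte default after every watchdog reset.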
- Mark the link as down, so if watchdog reset fails, link watching
failover software can notice it.
- Don't send MXGEFW_CMD_ETHERNET_DOWN if the NIC has been reset; it is
not needed, and will fail on a freshly reset NIC.
- Ensure the transmit routines aren't attempting to PIO write to doorbells
while the NIC is being reset.
- Download the correct f/w, rather than using the EEPROM f/w after reset.
- Export a count of the number of watchdog resets via sysctl
- Zero all f/w stats at reset. This will lead to less confusing
diagnostic output when investigating NIC failures.
MFC after: 3 days
loader, because it uses a reserved suffix (_type). Fix
this by removing the "_" and renaming the tunable to
hw.mxge.rss_hashtype. The old (rss_hash_type) tunable is
still fetched, in case people load the driver via scripts.
When both are present in the kernel environment,
the new value (hw.mxge.rss_hashtype) overrides the old
value.
Approved by: re (kib)
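The precedence rule for the two tunable names (the new hw.mxge.rss_hashtype overrides the old hw.mxge.rss_hash_type when both are set) can be sketched in userland C as follows. The `kenv_entry` table and both helper functions are hypothetical stand-ins for the kernel environment; in the kernel the lookups would presumably go through the tunable fetch macros instead.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one kernel environment variable. */
struct kenv_entry {
	const char *name;
	const char *value;
};

/* Store the integer value of `name` in *out if it is present.
 * Returns 1 when found, 0 otherwise (leaving *out untouched). */
static int fetch_int(const struct kenv_entry *env, size_t n,
    const char *name, int *out)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(env[i].name, name) == 0) {
			*out = atoi(env[i].value);
			return 1;
		}
	}
	return 0;
}

/* Fetch the old name first and the new name second, so the new
 * name wins whenever both are present. */
static int rss_hash_type_from_env(const struct kenv_entry *env, size_t n,
    int default_value)
{
	int value = default_value;

	fetch_int(env, n, "hw.mxge.rss_hash_type", &value); /* legacy name */
	fetch_int(env, n, "hw.mxge.rss_hashtype", &value);  /* overrides */
	return value;
}
```

Fetching in this order means no explicit "is the new one set?" check is needed: a missing variable simply leaves the earlier value in place.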
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs. This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.
For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.
Approved by: re (kib)
MFC after: 6 weeks
I tried re-ordering ether_ifdetach(), but this created a new race
where sometimes, when under heavy receive load (>1Mpps) and running
tcpdump, the machine would panic. At panic, the ithread was still in
the original (not dead) if_input() path, and was accessing stale BPF
data structs. By using a dying flag, I can close the interface prior
to if_detach() to be certain the interface cannot send packets up in
the middle of ether_ifdetach.
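The dying-flag approach can be sketched roughly as below. This is a userland model using C11 atomics, not the driver's code: `struct mock_softc`, its fields, and the function names are all hypothetical, and the real receive path hands packets to if_input() rather than bumping a counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state. */
struct mock_softc {
	atomic_bool dying;    /* set once detach has begun */
	unsigned delivered;   /* packets handed up the stack */
};

static void softc_init(struct mock_softc *sc)
{
	atomic_init(&sc->dying, false);
	sc->delivered = 0;
}

/* Receive path: refuse to pass packets up once the device is dying.
 * Returns true if the packet was delivered. */
static bool rx_deliver(struct mock_softc *sc)
{
	if (atomic_load(&sc->dying))
		return false;      /* drop instead of touching stale state */
	sc->delivered++;           /* stands in for the if_input() call */
	return true;
}

/* Detach path: raise the flag first, so that the later teardown steps
 * (closing the interface, ether_ifdetach()) cannot race with delivery. */
static void detach_begin(struct mock_softc *sc)
{
	atomic_store(&sc->dying, true);
}
```

The flag alone does not wait for a receive handler that is already past the check; in the commit's description that gap is covered by closing the interface before if_detach().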
- Support for 10G-PCIE*-8B*-C (dual-port CX4) NICs
- For dual-port NICs, f/w failover is now a few microsecs
instead of a few millisecs.
- On failover, f/w sends RARP broadcast to make the change
immediately known to the network
- Fixed a bug observed on IBM X3 architecture where
some spurious ecrc errors would be reported when OS enabled
ecrc support.
Sponsored by: Myricom Inc.
- always maintain byte/mcast/drop stats via drbr
- move #define of IFNET_BUF_RING so that it's picked
up by all files in the driver
- conditionalize IFNET_BUF_RING on the FreeBSD_version
bump just after it appeared in the tree.
Sponsored by: Myricom Inc.
using bus_dmamap_load_mbuf_sg() on it. This
prevents data corruption when the mxge MTU is
between 4076 and 8172 on machines with 4KB
pages and MXGE_VIRT_JUMBOS is in use (which it
isn't, in -current or -stable)
- Update mxge to use if_transmit(), and the new buf_ring
interfaces, so as to enable multiple transmit queues.
Use of if_transmit() is conditional on IFNET_BUF_RING,
and is enabled by default (as in if_em).
- Record a flow id on receive if receive hashing is active.
I currently only record the rx ring id (0..8) rather than
the 32-bit Toeplitz hash result, as doing the latter would
require shifting the driver to use a larger rx return ring.
Sponsored by: Myricom, Inc.
- Update to firmware 1.4.39 for dual-chip NIC (10G-PCIE2-xxx)
support, and SFP+ i2c support
- Identify newer "B" NICs (10G-PCIEx-8B-x) correctly, rather than
mis-identifying them as "A" NICs (cosmetic only)
- Identify the IFM_10G_LRM ifmedia type, where applicable.
- Identify ifmedia types for SFP+ based NICs
- Update copyright
Sponsored by: Myricom
MFC after: 1 week
and ifnet functions
- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own
- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers
- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues
This work was supported by Bitgravity Inc. and Chelsio Inc.
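To illustrate what a lockless producer / consumer ring looks like and why it depends on memory barriers, here is a minimal single-producer, single-consumer sketch in C11. It is a simplification for illustration only and is not the ring buffer added by this commit; the type, the function names, and the fixed size are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u  /* must be a power of two */

struct spsc_ring {
	_Atomic unsigned head;   /* next slot to fill; written by producer only */
	_Atomic unsigned tail;   /* next slot to drain; written by consumer only */
	void *slots[RING_SIZE];
};

static void ring_init(struct spsc_ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Producer side. Returns false when the ring is full. */
static bool ring_enqueue(struct spsc_ring *r, void *item)
{
	unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SIZE)
		return false;
	r->slots[head & (RING_SIZE - 1)] = item;
	/* Release: the slot write must be visible before the new head. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side. Returns NULL when the ring is empty. */
static void *ring_dequeue(struct spsc_ring *r)
{
	unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *item;

	if (head == tail)
		return NULL;
	item = r->slots[tail & (RING_SIZE - 1)];
	/* Release: the slot read must complete before the slot is reused. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return item;
}
```

The indices are free-running counters masked on use, which distinguishes a full ring from an empty one without sacrificing a slot. Supporting several concurrent producers would need a compare-and-swap on the producer index, which this sketch deliberately omits.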
This update fixes a transmit bug in the multi-queue (MSI-X) firmware
which happens when RDMAs complete out of order, and provides
improved support for the new Myri10GE NIC models (10G-PCIE-8Bx)
Sponsored by: Myricom Inc.
MFC after: 3 days
- Support for Myricom 10G-PCIE-8B NICs
- multi-slice firmware: fix a bug when the presence of 32-bit or
64-bit DMA addresses for interrupt queues and data is not uniform across
slices.
- Improves automatic selection between ethp_z8e/eth_z8e
Sponsored by: Myricom Inc.
and handle NIC hardware watchdog resets.
- remove buggy code at the top of mxge_tick() which tried
to detect a race which is already detected in the kernel's
callout code.
- move callout_stop() and callout_reset() into mxge_close() and
mxge_open() rather than doing the callout manipulation
all over the place.
- use callout_drain(), rather than callout_stop() to prevent
a potential race between mxge_tick() and mxge_detach()
which could lead to softclock using a destroyed mutex
- restructure the mxge_tick() and mxge_watchdog_reset()
routines to avoid resetting a callout, and then
immediately stopping it if the watchdog reset routine
is called, and fails.
- enable the driver to handle NIC hardware watchdog
resets by restoring the NIC's PCI config space, which is
lost when the NIC hardware watchdog triggers.
Reviewed by: jhb (previous version)
legacy interrupts rather than MSI as a special case. Prior to this
commit, the interrupt handler was doing the slow handshaking with
the device to ensure the legacy interrupt was lowered in both
the legacy and MSI-X case. This handshaking was not
required for MSI-X.
queues (which we call slices). The NIC will steer traffic into up to
hw.mxge.max_slices different receive rings based on a configurable
hash type (hw.mxge.rss_hash_type).
Currently the driver defaults to using a single slice, so the default
behavior is unchanged. Also, transmit from non-zero slices is
disabled currently.
- Fix a bug introduced in 1.4.20 where speculative read by the processor in the
write-only doorbell region would cause a target-abort (as opposed to simply
returning random data). This could manifest itself as NMI or machine freeze
depending on how the BIOS/OS/chipset configuration handles target-abort.
- Add support for new revisions of -R cards (with AEL1002/AEL1010 xaui->xfi)
- Increase an internal timing (dispatch engine): fix possible spurious reset
(seen on very few cards).
on a down mxge interface
- Fix a bug where mxge reported the link state as
active when it wasn't (after ifconfig down).
- Prevent spurious watchdog resets when link partner is not consuming
- Add support for CX4 and popular XFP media detection
- Update the firmware and associated header files to 1.4.25
Approved by: re (kensmith)
- Add custom .c wrappers for the firmware, rather than the standard
firmware(9) generated firmware objects to work around toolchain
problems on ia64 involving linking objects produced by
ld -b -binary into the kernel.
- Move from using Myricom's ".dat" firmware blobs to using Myricom's
zlib compressed ".h" firmware header files. This is done to
facilitate the custom wrappers, and saves a fair amount of wired
memory in the case where the firmware is built in, or preloaded.
- Fix two compile issues in mxge which only appear on non-i386/amd64.
Reviewed by: mlaier, mav (earlier version with just zlib support)
Glanced at by: sam
Approved by: re (kensmith)
- Allow LRO to be enabled / disabled at runtime
- Fix a double-free at module unload time.
- Only update timestamp in lro merge when it is present in the frame
Sponsored by: Myricom
clusters. This helps quite a bit on my low end machines (improves
performance by about 300Kpps when being blasted by a hardware
packet generator).
- Include one extended f/w counter forgotten in earlier commit
Sponsored by: Myricom Inc.
- Remove code to use the special wc_fifo. It has been disabled by default
in our other drivers as it actually slows down transmit by a small amount
- Dynamically determine the amount of space required for the rx_done
ring rather than hardcoding it.
- Compute the number of tx descriptors we are willing to transmit per
frame as the minimum of 128 and 1/4 the tx ring size.
- Fix a typo in the tx dma tag setup which could lead to unnecessary
defragging of TSO packets (and potentially even dropping TSO packets
due to EFBIG being returned).
- Add a counter to keep track of how many times we've needed to
defragment a frame. It should always be zero.
- Export new extended f/w counters via sysctl
Sponsored by: Myricom, Inc.
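The per-frame tx descriptor limit described in the list above reduces to a one-line computation. A small sketch, with a hypothetical function name and the ring size given in descriptors:

```c
/* Cap on descriptors used by a single frame: the smaller of 128
 * and one quarter of the tx ring size (in descriptors). */
static unsigned tx_max_desc_per_frame(unsigned tx_ring_entries)
{
	unsigned quarter = tx_ring_entries / 4;

	return quarter < 128u ? quarter : 128u;
}
```

With this rule the fixed 128 cap applies to any ring of 512 entries or more, while smaller rings are limited by the quarter-ring bound, so one frame can never occupy more than a quarter of the ring.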
- Update to the latest (1.4.18) f/w. This f/w introduces a new
receive mode which allows us to use FreeBSD's physically discontiguous
MJUM9BYTES clusters.
- Switch the driver from chaining MJUMPAGESIZE clusters to using
MJUM9BYTES clusters to avoid mbuf chaining overheads. Due to this
change, people running obsolete f/w images will be limited to an MTU of
PAGE_SIZE - 16.
- Add (disabled by default) support for Large Receive Offload.
Sponsored by: Myricom, Inc.