Commit Graph

34 Commits

Author SHA1 Message Date
Andrew Gallatin
c406ad2ecc Cleanup if_media handling in mxge(4)
- Re-probe xfp / sfp+ socket on link events, in case user
    has changed transceiver
- correctly report current media to avoid confusing lagg (reported by Panasas)
- Report link speed  (submitted by yongari)

Reviewed by: yongari (earlier version)

MFC after: 7 days
2010-04-15 13:50:55 +00:00
Andrew Gallatin
72c042dfb5 Move mxge(4)'s NIC watchdog reset handler from
a callout to a taskqueue
2009-10-19 20:51:27 +00:00
Andrew Gallatin
83d54b590f Two more mxge watchdog fixes:
1) Restore the PCI Express control register after a watchdog
   reset.  This is required because the device will come out
   of watchdog reset with the pectl reg at its default state,
   and important BIOS configuration (like max payload size)
   could be lost.

2) Call mxge_start_locked() for every tx queue before dropping
   the lock in the watchdog handler.   This is required, as
   the queue's buf ring may have filled during the reset.
2009-09-30 14:42:06 +00:00
Andrew Gallatin
65c690668b Add support for throttling transmit bandwidth. This is most commonly
used to reduce packet loss on high delay (WAN) paths with a
slow link.
2009-09-21 14:41:07 +00:00
Andrew Gallatin
8c5d766cd7 Add a dying flag to prevent races at detach.
I tried re-ordering ether_ifdetach(), but this created a new race
where sometimes, when under heavy receive load (>1Mpps) and running
tcpdump, the machine would panic.  At panic, the ithread was still in
the original (not dead) if_input() path, and was accessing stale BPF
data structs.  By using a dying flag, I can close the interface prior
to if_detach() to be certain the interface cannot send packets up in
the middle of ether_ifdetach.
2009-06-24 21:09:56 +00:00
Andrew Gallatin
710328322a Revert most of 193311 so as to track mxge transmit stats
on a per-ring basis and avoid racy (and costly) updates
to the ifp stats via drbr by defining NO_SLOW_STATS

Discussed with: kmacy
2009-06-23 19:04:25 +00:00
Andrew Gallatin
869c7348e5 Buf-ring fixes for mxge
- always maintain byte/mcast/drop stats via drbr
- move #define of IFNET_BUF_RING so that its picked
  up by all files in the driver
- conditionalize IFNET_BUF_RING on the FreeBSD_version
  bump just after it appeared in the tree.

Sponsored by: Myricom Inc.
2009-06-02 16:52:33 +00:00
Andrew Gallatin
4d9a5852aa Set an rx jumbo cluster to the correct size before
using bus_dmamap_load_mbuf_sg() on it. This
prevents data corruption when the mxge MTU is
between 4076 and 8172 on machines with 4KB
pages and MXGE_VIRT_JUMBOS is in use (which it
isn't, in -current or -stable)
2009-06-01 19:16:57 +00:00
Andrew Gallatin
950f22e38d Fix build: Make forgotten IFNET_MULTIQUEUE to IFNET_BUF_RING
ifdef change.
2009-04-27 17:24:47 +00:00
Andrew Gallatin
c6cb3e3fe8 Updates to mxge for multiple tx/rx rings:
- Update mxge to use if_transmit(), and the new buf_ring
  interfaces, so as to enable multiple transmit queues.
  Use of if_transmit() is conditional on IFNET_BUF_RING,
  and is enabled by default (as in if_em).

- Record a flow id on receive if receive hashing is active.
  I currently only record the rx ring id (0..8) rather than
  the 32-bit topelitz hash result, as doing the latter would
  require shifting the driver to use a larger rx return ring.

Sponsored by:  Myricom, Inc.
2009-04-27 15:45:54 +00:00
Andrew Gallatin
016385508f Better support for recent Myricom 10GbE NICs
- Update to firmware 1.4.39 for dual-chip NIC (10G-PCIE2-xxx)
  support, and SFP+ i2c support

- Identify newer "B" NICs (10G-PCIEx-8B-x) correctly, rather than
  mis-identifying them as "A" NICs (cosmetic only)

- Identify the IFM_10G_LRM ifmedia type, where applicable.

- Identify ifmedia types for SFP+ based NICs

- Update copyright

Sponsored by: Myricom
MFC after: 1 week
2009-02-17 22:15:58 +00:00
Andrew Gallatin
73c7c83f1e Restore sfence semantics in mxge after the introduction
of a global mfence based mb() in r185162
2008-11-24 19:00:57 +00:00
Kip Macy
db7f0b974f - bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
  allow drivers to efficiently manage multiple hardware queues
  (i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.
2008-11-22 05:55:56 +00:00
Andrew Gallatin
7b9f20ea8d Adapt mxge shims to detect phys contig jumbo frames
in RELENG_7

MFC after: 3 days
2008-08-21 16:35:34 +00:00
Andrew Gallatin
91ed89132c Now that mxge supports MSI-X interrupts, reverse the logic and flag
legacy interrupts rather than MSI as a special case.  Prior to this
commit, the interrupt handler was doing the slow handshaking with
the device to ensure the legacy interrupt was lowered in both
the legacy and MSI-X case.  This handshaking was not
required for MSI-X.
2008-02-14 16:24:14 +00:00
Andrew Gallatin
37d89b0c5e Add minimally invasive shims to ease MFCs of mxge back as far
as RELENG_6

Sponsored by: Myricom, Inc.
2008-02-14 00:09:59 +00:00
Andrew Gallatin
b0f7b92207 Take advantage of the new physically contiguous 9K jumbos in 8. 2008-01-22 22:04:31 +00:00
Andrew Gallatin
1e413cf932 Add optional support to mxge for MSI-X interrupts and multiple receive
queues (which we call slices).  The NIC will steer traffic into up to
hw.mxge.max_slices different receive rings based on a configurable
hash type (hw.mxge.rss_hash_type).

Currently the driver defaults to using a single slice, so the default
behavior is unchanged.  Also, transmit from non-zero slices is
disabled currently.
2008-01-15 20:34:49 +00:00
Andrew Gallatin
f15444982b Add support for a new device id (9). Mxge NICs with the new
device id support MSI-X.

Approved by: re (bmah)
2007-09-13 21:29:02 +00:00
Andrew Gallatin
c587e59f20 - Fix a bug which could cause a panic when enabling LRO
on an down mxge interface
- Fix a bug where mxge reported the link state as
   active when it wasn't (after ifconfig down).
- Prevent spurious watchdog resets when link partner is not consuming
- Add support for CX4 and popular XFP media detection
- Update the firmware and associated header files to 1.4.25

Approved by: re (kensmith)
2007-08-22 13:22:12 +00:00
Andrew Gallatin
eb8e82f5fd Update the mxge(4) driver's copyright to 2007, and drop
the binary distribution clause.

Approved by: re (bmah)
2007-07-12 16:04:55 +00:00
Andrew Gallatin
c792928f01 Add support for "hardware" vlan tag insertion & removal emulation
in the mxge driver so as to be able to do checksum offload
on vlans.  This is good enough to achieve 10GbE line rate on vlans.
2007-05-23 16:25:40 +00:00
Andrew Gallatin
adae7080fb mxge cleanups:
- Remove code to use the special wc_fifo.  It has been disabled by default
  in our other drivers as it actually slows down transmit by a small amount

- Dynamically determine the amount of space required for the rx_done
  ring rather than hardcoding it.

- Compute the number of tx descriptors we are willing to transmit per
  frame as the minimum of 128 or 1/4 the tx ring size.

- Fix a typo in the tx dma tag setup which could lead to unnecessary
  defragging of TSO packets (and potentially even dropping TSO packets
  due to EFBIG being returned).

- Add a counter to keep track of how many times we've needed to
  defragment a frame.  It should always be zero.

- Export new extended f/w counters via sysctl

Sponsored by: Myricom, Inc.
2007-05-22 15:57:49 +00:00
Andrew Gallatin
053e637f86 Improve mxge receive performance:
- Update to the latest (1.4.18) f/w.  This f/w introduces a new
  receive mode which allows us to use FreeBSD's physically discontinuous
  MJUM9BYTES clusters.

- Switch the driver from chaining MJUMPAGESIZE clusters to using
  MJUM9BYTES clusters to avoid mbuf chaining overheads.  Due to this
  change, people running obsolete f/w images will be limited to an MTU of
  PAGE_SIZE - 16.

- Add (disabled by default) support for Large Receive Offload.

Sponsored by: Myricom, Inc.
2007-05-21 18:32:27 +00:00
Andrew Gallatin
b824b7d86a Work around a firmware bug where broadcast frames would be incorrectly
treated as multicast frames and filtered, but when only when "adopting"
running firmware.  By "adopting", I mean using pre-existing firmware
loaded from eeprom at PCI reset, rather than firmware loaded by the
driver.
2007-02-21 17:34:05 +00:00
Andrew Gallatin
dce01b9b27 - Add 99% of a callout based watchdog. The remaining 1% is waiting
for pci_cfg_restore() to be exported.  It was tested using a
  hackily accessed pci_cfg_restore().

- Add ifmedia_removeall() to mxge_detach() in order to stop leaking
  an ifaddr

- Fix a small acounting bug introduced by the locking code shuffle
  which could cause spurious watchdog resets now that we have a
  watchdog.

Sponsored by: Myricom
2007-01-31 19:53:36 +00:00
Andrew Gallatin
a98d6cd71c Abandon using sleepable locks in favor of mutexes for mxge's if_ioctl
locking in preparation for adding a watchdog handler (callouts must
not use sleepable locks).  This required shuffling memory and
interrupt allocation to the attach routine rather than if_ioctl so as
to avoid potential sleeps while bringing up the interface.
2007-01-31 15:29:31 +00:00
Andrew Gallatin
a82c2581b5 Minor updates:
- initialize ifq_drv_maxlen correctly
- mark the interface as jumbo capable
- keep stats on the number of times the hw transmit queue filled and
  was restarted.
2007-01-30 08:39:44 +00:00
Andrew Gallatin
d91b1b4976 Initialization bugfixes and enhancements:
- Fix bug preventing adoption of running firmware
- Set PCIe max read request size to 4KB
- Read PCIe link width from config space
- Assume aligned completions from the southbridge ports
  of intel E5000 chips
- Use aligned firmware when link width is x4 or less
- Add hw.mxge.force_firmware tunable to allow user to force
  selection of aligned (or unaligned) firmware
2006-11-22 16:33:40 +00:00
Andrew Gallatin
0fa7f68197 - Updated to the latest myri10ge firmware
- Added support for multicast filtering, now that the firmware
  supports it.  Note that this is not yet tested, as multicast
  seems to panic -current (even w/o mxge loaded)
- Added workaround to cope with different irq data struct size on
  pre-multicast firmware which can found running on nics.
- Added Intel E5000 PCIe chipsets to list providing aligned completions.
- Replaced various magic constants with #defines, now that they are
  defined in the firmware headers.
2006-09-15 19:24:45 +00:00
Andrew Gallatin
aed8e389c1 - Added TSO support. This entailed increasing the number of send descriptors
in the transmit busdma tag, so I moved the segment list off the
stack.

- Fixed transmit routine to ensure it doesn't read past the end
of an mbuf when parsing headers.

- Corrected handling of odd length segments.  Setting MXGEFW_FLAGS_ALIGN_ODD
is required only when offloading the checksum of that frame.

Sponsored by: Myricom Inc.
2006-09-15 15:41:28 +00:00
Andrew Gallatin
5e7d85418b Update the mxge driver.
- Update the firmware to the latest released firmware
  (1.4.3), which corresponds to the firmware in the
  latest shipping drivers from Myricom.  This firmware
  fixes several bugs in the firmware's PCI-e implementation,
  and it also changes the driver/firmware interface:

  o TSO was added, and changed the format of the transmit
    descriptors.
  o The firmware no longer counts transmits descriptors,
    but frames.  So the driver needs to keep a count
    of the number of frames sent.
  o The weird interrupt strategy changed to a normal receive
    return ring.  This ring is much bigger, and we may be
    able to support DEVICE_POLLING.
  o Myricom's header files changed the name of firmware
    related #define's and enums (s/_MCP_/FW_).

- Stopped spamming the console with lots of printfs unless
  mxge_verbose (or bootverbose) is set.

- Made additional information available via sysctl, including
  the results of a PCI-e DMA benchmark run at device reset.

- Decreased the excessively long timeouts when sending commands
  from 2 seconds to 20ms.

Sponsored by: Myricom Inc.
2006-06-14 16:23:17 +00:00
Andrew Gallatin
6d87a65da4 - Complete the myri10ge -> mxge name change by doing a mechanical
s/myri10ge/mxge/g replacement in the myri10ge files.  A few contuation
  lines were joined because of the regained columns.
- Hook the mxge driver back to the build.
2006-06-13 13:53:52 +00:00
Andrew Gallatin
b2fc195e1b 10GbE mode driver and binary firmware for Myricom's PCI-express NICs.
More info regarding these nics can be found at http://www.myri.com.

Please note that the files
sys/dev/myri10ge/{mcp_gen_header.h,myri10ge_mcp.h} are internally
shared between all our drivers (solaris, macosx, windows, linux, etc).
I'd like to keep these files unchanged, so I can just import newer
versions of them when the firmware API/ABI changes.  This means I'm
stuck with some of the crazy-long #define names, and possibly
non-style(9) characteristics of these files.

Many thanks to mlaier for doing firmware(9) just as I
needed it, and to scottl for his helpful review.

Reviewed by: scottl, glebius
Sponsored by: Myricom Inc.
2006-02-19 22:39:19 +00:00