interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.
o Remove the if_baudrate_pf crutch.
o Make all fields of struct if_data fixed machine independent size. The
notion of data (packet counters, etc) are by no means MD. And it is a
bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
which at modern speeds overflow within a second.
This also removes quite a lot of COMPAT_FREEBSD32 code.
o Give 16 bit for the ifi_datalen field. This field was provided to
make future changes to if_data less ABI breaking. Unfortunately the
8 bit size of it had effectively limited sizeof if_data to 256 bytes.
o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.
__FreeBSD_version bumped.
Discussed with: emax
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
- Add support for IPv6 rx csum offload
- Finally switch mxge from using its own driver lro, to
using tcp_lro
MFC after: 7 days
Sponsored by: Myricom Inc.
- Re-probe xfp / sfp+ socket on link events, in case user
has changed transceiver
- correctly report current media to avoid confusing lagg (reported by Panasas)
- Report link speed (submitted by yongari)
Reviewed by: yongari (earlier version)
MFC after: 7 days
1) Restore the PCI Express control register after a watchdog
reset. This is required because the device will come out
of watchdog reset with the pectl reg at its default state,
and important BIOS configuration (like max payload size)
could be lost.
2) Call mxge_start_locked() for every tx queue before dropping
the lock in the watchdog handler. This is required, as
the queue's buf ring may have filled during the reset.
I tried re-ordering ether_ifdetach(), but this created a new race
where sometimes, when under heavy receive load (>1Mpps) and running
tcpdump, the machine would panic. At panic, the ithread was still in
the original (not dead) if_input() path, and was accessing stale BPF
data structs. By using a dying flag, I can close the interface prior
to if_detach() to be certain the interface cannot send packets up in
the middle of ether_ifdetach.
- always maintain byte/mcast/drop stats via drbr
- move #define of IFNET_BUF_RING so that its picked
up by all files in the driver
- conditionalize IFNET_BUF_RING on the FreeBSD_version
bump just after it appeared in the tree.
Sponsored by: Myricom Inc.
using bus_dmamap_load_mbuf_sg() on it. This
prevents data corruption when the mxge MTU is
between 4076 and 8172 on machines with 4KB
pages and MXGE_VIRT_JUMBOS is in use (which it
isn't, in -current or -stable)
- Update mxge to use if_transmit(), and the new buf_ring
interfaces, so as to enable multiple transmit queues.
Use of if_transmit() is conditional on IFNET_BUF_RING,
and is enabled by default (as in if_em).
- Record a flow id on receive if receive hashing is active.
I currently only record the rx ring id (0..8) rather than
the 32-bit topelitz hash result, as doing the latter would
require shifting the driver to use a larger rx return ring.
Sponsored by: Myricom, Inc.
- Update to firmware 1.4.39 for dual-chip NIC (10G-PCIE2-xxx)
support, and SFP+ i2c support
- Identify newer "B" NICs (10G-PCIEx-8B-x) correctly, rather than
mis-identifying them as "A" NICs (cosmetic only)
- Identify the IFM_10G_LRM ifmedia type, where applicable.
- Identify ifmedia types for SFP+ based NICs
- Update copyright
Sponsored by: Myricom
MFC after: 1 week
and ifnet functions
- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own
- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers
- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues
This work was supported by Bitgravity Inc. and Chelsio Inc.
legacy interrupts rather than MSI as a special case. Prior to this
commit, the interrupt handler was doing the slow handshaking with
the device to ensure the legacy interrupt was lowered in both
the legacy and MSI-X case. This handshaking was not
required for MSI-X.
queues (which we call slices). The NIC will steer traffic into up to
hw.mxge.max_slices different receive rings based on a configurable
hash type (hw.mxge.rss_hash_type).
Currently the driver defaults to using a single slice, so the default
behavior is unchanged. Also, transmit from non-zero slices is
disabled currently.
on an down mxge interface
- Fix a bug where mxge reported the link state as
active when it wasn't (after ifconfig down).
- Prevent spurious watchdog resets when link partner is not consuming
- Add support for CX4 and popular XFP media detection
- Update the firmware and associated header files to 1.4.25
Approved by: re (kensmith)
- Remove code to use the special wc_fifo. It has been disabled by default
in our other drivers as it actually slows down transmit by a small amount
- Dynamically determine the amount of space required for the rx_done
ring rather than hardcoding it.
- Compute the number of tx descriptors we are willing to transmit per
frame as the minimum of 128 or 1/4 the tx ring size.
- Fix a typo in the tx dma tag setup which could lead to unnecessary
defragging of TSO packets (and potentially even dropping TSO packets
due to EFBIG being returned).
- Add a counter to keep track of how many times we've needed to
defragment a frame. It should always be zero.
- Export new extended f/w counters via sysctl
Sponsored by: Myricom, Inc.
- Update to the latest (1.4.18) f/w. This f/w introduces a new
receive mode which allows us to use FreeBSD's physically discontinuous
MJUM9BYTES clusters.
- Switch the driver from chaining MJUMPAGESIZE clusters to using
MJUM9BYTES clusters to avoid mbuf chaining overheads. Due to this
change, people running obsolete f/w images will be limited to an MTU of
PAGE_SIZE - 16.
- Add (disabled by default) support for Large Receive Offload.
Sponsored by: Myricom, Inc.
treated as multicast frames and filtered, but when only when "adopting"
running firmware. By "adopting", I mean using pre-existing firmware
loaded from eeprom at PCI reset, rather than firmware loaded by the
driver.
for pci_cfg_restore() to be exported. It was tested using a
hackily accessed pci_cfg_restore().
- Add ifmedia_removeall() to mxge_detach() in order to stop leaking
an ifaddr
- Fix a small acounting bug introduced by the locking code shuffle
which could cause spurious watchdog resets now that we have a
watchdog.
Sponsored by: Myricom
locking in preparation for adding a watchdog handler (callouts must
not use sleepable locks). This required shuffling memory and
interrupt allocation to the attach routine rather than if_ioctl so as
to avoid potential sleeps while bringing up the interface.
- initialize ifq_drv_maxlen correctly
- mark the interface as jumbo capable
- keep stats on the number of times the hw transmit queue filled and
was restarted.
- Fix bug preventing adoption of running firmware
- Set PCIe max read request size to 4KB
- Read PCIe link width from config space
- Assume aligned completions from the southbridge ports
of intel E5000 chips
- Use aligned firmware when link width is x4 or less
- Add hw.mxge.force_firmware tunable to allow user to force
selection of aligned (or unaligned) firmware
- Added support for multicast filtering, now that the firmware
supports it. Note that this is not yet tested, as multicast
seems to panic -current (even w/o mxge loaded)
- Added workaround to cope with different irq data struct size on
pre-multicast firmware which can found running on nics.
- Added Intel E5000 PCIe chipsets to list providing aligned completions.
- Replaced various magic constants with #defines, now that they are
defined in the firmware headers.
in the transmit busdma tag, so I moved the segment list off the
stack.
- Fixed transmit routine to ensure it doesn't read past the end
of an mbuf when parsing headers.
- Corrected handling of odd length segments. Setting MXGEFW_FLAGS_ALIGN_ODD
is required only when offloading the checksum of that frame.
Sponsored by: Myricom Inc.
- Update the firmware to the latest released firmware
(1.4.3), which corresponds to the firmware in the
latest shipping drivers from Myricom. This firmware
fixes several bugs in the firmware's PCI-e implementation,
and it also changes the driver/firmware interface:
o TSO was added, and changed the format of the transmit
descriptors.
o The firmware no longer counts transmits descriptors,
but frames. So the driver needs to keep a count
of the number of frames sent.
o The weird interrupt strategy changed to a normal receive
return ring. This ring is much bigger, and we may be
able to support DEVICE_POLLING.
o Myricom's header files changed the name of firmware
related #define's and enums (s/_MCP_/FW_).
- Stopped spamming the console with lots of printfs unless
mxge_verbose (or bootverbose) is set.
- Made additional information available via sysctl, including
the results of a PCI-e DMA benchmark run at device reset.
- Decreased the excessively long timeouts when sending commands
from 2 seconds to 20ms.
Sponsored by: Myricom Inc.
s/myri10ge/mxge/g replacement in the myri10ge files. A few contuation
lines were joined because of the regained columns.
- Hook the mxge driver back to the build.
More info regarding these nics can be found at http://www.myri.com.
Please note that the files
sys/dev/myri10ge/{mcp_gen_header.h,myri10ge_mcp.h} are internally
shared between all our drivers (solaris, macosx, windows, linux, etc).
I'd like to keep these files unchanged, so I can just import newer
versions of them when the firmware API/ABI changes. This means I'm
stuck with some of the crazy-long #define names, and possibly
non-style(9) characteristics of these files.
Many thanks to mlaier for doing firmware(9) just as I
needed it, and to scottl for his helpful review.
Reviewed by: scottl, glebius
Sponsored by: Myricom Inc.