command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.
This changes fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.
Submitted by: jhb
Reviewed by: jfv, marius, achadd, achim
MFC after: 1 day
- Remove vestigial null pointer tests after malloc(..., M_WAITOK).
- Remove vestigal qualhack union
- Use strlcpy() instead of the error-prone strncpy() when parsing
EEPROM and copying strings
- Check the MAC address in the EEPROM strings more strictly.
- Expand the macro MXGE_NEXT_STRING() at its only user. Due to a typo,
the macro was very confusing.
- Remove unnecessary buffer limit check. The buffer is double-NUL
terminated per construction.
PR: kern/176369
Submitted by: Christoph Mallon <christoph.mallon gmx.de>
- Some mxge nics may store the serial number in the SN2 field of the
EEPROM. These will also have an SN=0 field, so parse the SN2 field,
and give it precedence.
- Skip MXGEFW_CMD_UNALIGNED_TEST on mxge nics which do not require it.
This saves roughly 10ms per port at device attach time.
Sponsored by: Myricom
MFC After: 7 days
- Re-fix build by restoring local removed in r247151, but protected
by #if defined(INET) || defined(INET6) so that the compile
succeeds in the !(INET||INET6) case.
- Protect call to in_pseudo() with an #ifdef INET, to allow
a kernel to link with mxge when INET is not compiled in.
- Also remove an errant (improperly commented) obsolete debugging printf
Thanks to Glebius for pointing out the !(INET||INET6) build issue.
Sponsored by: Myricom
MFC After: 7 days
- Add support for IPv6 rx csum offload
- Finally switch mxge from using its own driver lro, to
using tcp_lro
MFC after: 7 days
Sponsored by: Myricom Inc.
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.
o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
ALTQ queueing or buf_ring(9) queueing.
- It doesn't touch the buf_ring stats any more.
- It doesn't touch ifnet stats anymore.
- drbr_stats_update() no longer exists.
o buf_ring(9) handles its stats itself:
- It handles br_drops itself.
- br_prod_bytes stats are dropped. Rationale: no one ever
reads them but update of a common counter on every packet
negatively affects performance due to excessive cache
invalidation.
- buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
we no longer account bytes.
o Drivers handle their stats theirselves: if_obytes, if_omcasts.
o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer
use drbr_stats_update(), and update ifnet stats theirselves.
o bxe(4) was the most correct driver, it didn't call
drbr_stats_update(), thus it was the only driver accurate under
moderate load. Now it also maintains stats itself.
o ixgbe(4) had already taken stats from hardware, so just
- drop software stats updating.
- take multicast packet count from hardware as well.
o mxge(4) just no longer needs NO_SLOW_STATS define.
o cxgb(4), cxgbe(4) need no change, since they obtain stats
from hardware.
Reviewed by: jfv, gnn
It seems strchr() and strrchr() are used more often than index() and
rindex(). Therefore, simply migrate all kernel code to use it.
For the XFS code, remove an empty line to make the code identical to
the code in the Linux kernel.
This fixes a long standing bug in mxge(4) where "ifconfig mxge0 $IP"
did not bring the interface into a RUNNING state, like it does on
most (all?) other FreeBSD NIC drivers.
Thanks to gnn for mentioning the bug, and yongari for pointing out that
ether_ioctl() invokes ifp->if_init() in SIOCSIFADDR.
MFC after: 7 days
The Myri10GE NIC will assume all TSO frames contain partial checksum,
and will emit TSO segments with bad TCP checksums if a TSO frame
contains a full checksum. The mxge driver takes care to make sure
that TSO is disabled when checksum offload is disabled for this
reason. However, modules that modify packet contents (like pf) may
end up completing a checksum on a TSO frame, leading to the NIC emitting
TSO segments with bad checksums.
To workaround this, restore the partial checksum in the mxge driver
when we're fed a TSO frame with a full checksum.
Reported by: Bob Healey
MFC after: 3 days
- Re-probe xfp / sfp+ socket on link events, in case user
has changed transceiver
- correctly report current media to avoid confusing lagg (reported by Panasas)
- Report link speed (submitted by yongari)
Reviewed by: yongari (earlier version)
MFC after: 7 days
- Don't leak slice resources when mxge_alloc_rings() fails
- Start taskq threads only after we know attach will succeed. At
boot time, taskqueue_terminate() will loop infinately, waiting
for the threads to exit, and hang the system.
Submitted by: Panasas
MFC After: 3 days
- introduce drbr_needs_enqueue that returns whether the interface/br needs
an enqueue operation: returns true if altq is enabled or there are
already packets in the ring (as we need to maintain packet order)
- update all drbr consumers
- fix drbr_flush
- avoid using the driver queue (IFQ_DRV_*) in the altq case as the
multiqueue consumer does not provide enough protection, serialize altq
interaction with the main queue lock
- make drbr_dequeue_cond work with altq
Discussed with: kmacy, yongari, jfv
MFC after: 4 weeks
by checking PCI config space when the NIC is not
transmitting. Previously, a h/w fault would not have been
detected if the NIC was down, or handling an RX only
workload.
1) Restore the PCI Express control register after a watchdog
reset. This is required because the device will come out
of watchdog reset with the pectl reg at its default state,
and important BIOS configuration (like max payload size)
could be lost.
2) Call mxge_start_locked() for every tx queue before dropping
the lock in the watchdog handler. This is required, as
the queue's buf ring may have filled during the reset.
- Mark the link as down, so if watchdog reset fails, link watching
failover software can notice it
- Don't send MXGEFW_CMD_ETHERNET_DOWN if the NIC has been reset, it is
not needed, and will fail on a freshly reset NIC.
- Ensure the transmit routines aren't attempting to PIO write to doorbells
while the NIC is being reset.
- Download the correct f/w, rather than using the EEPROM f/w after reset.
- Export a count of the number of watchdog resets via sysctl
- Zero all f/w stats at reset. This will lead to less confusing
diagnostic output when investigating NIC failures.
MFC after: 3 days
loader, because it uses a reserved suffix (_type). Fix
this by removing the "_" and renaming the tunable to
hw.mxge.rss_hashtype. The old (rss_hash_type) tunable is
still fetched, in case people load the driver via scripts.
When both are present in the kernel environment,
the new value (hw.mxge.rss_hashtype) overrides the old
value.
Approved by: re (kib)
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs. This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.
For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.
Approved by: re (kib)
MFC after: 6 weeks
I tried re-ordering ether_ifdetach(), but this created a new race
where sometimes, when under heavy receive load (>1Mpps) and running
tcpdump, the machine would panic. At panic, the ithread was still in
the original (not dead) if_input() path, and was accessing stale BPF
data structs. By using a dying flag, I can close the interface prior
to if_detach() to be certain the interface cannot send packets up in
the middle of ether_ifdetach.
- always maintain byte/mcast/drop stats via drbr
- move #define of IFNET_BUF_RING so that its picked
up by all files in the driver
- conditionalize IFNET_BUF_RING on the FreeBSD_version
bump just after it appeared in the tree.
Sponsored by: Myricom Inc.
using bus_dmamap_load_mbuf_sg() on it. This
prevents data corruption when the mxge MTU is
between 4076 and 8172 on machines with 4KB
pages and MXGE_VIRT_JUMBOS is in use (which it
isn't, in -current or -stable)
- Update mxge to use if_transmit(), and the new buf_ring
interfaces, so as to enable multiple transmit queues.
Use of if_transmit() is conditional on IFNET_BUF_RING,
and is enabled by default (as in if_em).
- Record a flow id on receive if receive hashing is active.
I currently only record the rx ring id (0..8) rather than
the 32-bit topelitz hash result, as doing the latter would
require shifting the driver to use a larger rx return ring.
Sponsored by: Myricom, Inc.
- Update to firmware 1.4.39 for dual-chip NIC (10G-PCIE2-xxx)
support, and SFP+ i2c support
- Identify newer "B" NICs (10G-PCIEx-8B-x) correctly, rather than
mis-identifying them as "A" NICs (cosmetic only)
- Identify the IFM_10G_LRM ifmedia type, where applicable.
- Identify ifmedia types for SFP+ based NICs
- Update copyright
Sponsored by: Myricom
MFC after: 1 week