- At Intel it is believed that most of their products support "only"
40 DMA segments so lower {EM,IGB}_MAX_SCATTER accordingly. Actually,
40 is more than plenty to handle full size TSO packets so it doesn't
make sense to further distinguish between MAC variants that really
can do 64 DMA segments. Moreover, capping at 40 DMA segments limits
the stack usage of {em,igb}_xmit() that - given the rare use of more
than these - previously hardly was justifiable, while still being
sufficient to avoid the problems seen with em(4) and EM_MAX_SCATTER
set to 32.
- In igb(4), pass the actually supported TSO parameters up the stack.
Previously, the defaults set in if_attach_internal() were applied,
i. e. a maximum of 35 TSO segments, which made supporting more than
these in the driver pointless. However, this might explain why no
problems were seen with IGB_MAX_SCATTER at 64.
- In em(4), take the 5 m_pullup(9) invocations performed by em_xmit()
in the TSO case into account when reporting TSO parameters upwards.
In the worst case, each of these calls will add another mbuf and,
thus, the requirement for an additional DMA segment. So for best
performance, it doesn't make sense to advertize a maximum of TSO
segments that typically will require defragmentation in em_xmit().
Again, this leaves enough room to handle full size TSO packets.
- Drop TSO macros from if_lem.h given that corresponding MACS don't
support TSO in the first place.
Reviewed by: erj, sbruno, jeffrey.e.pieper_intel.com
Approved by: erj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D5238
Major changes:
- Add i219/i219(2) hardware support. (Found on Skylake generation and newer
chipsets.)
- Further to the last Skylake support diff, this one also includes support for
the Lewisburg chipset (i219(3)).
- Add a workaround to an igb hardware errata.
All 1G server products need to have IPv6 extension header parsing turned off.
This should be listed in the specification updates for current 1G server
products, e.g. for i350 it's errata #37 in this document:
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/ethernet-controller-i350-spec-update.pdf
- Avoton (i354) PHY errata workaround added
And a bunch of minor fixes, as well as #defines for things that the current
em(4)/igb(4) drivers don't implement.
Differential Revision: https://reviews.freebsd.org/D3162
Reviewed by: sbruno, marius, gnn
Approved by: gnn
MFC after: 2 weeks
Sponsored by: Intel Corporation
alignment guarantees provided by m_defrag(9), use m_collapse(9)
instead for performance reasons.
While at it, sanitize the statistics softc members, i. e. retire
unused ones and add SYSCTL nodes missing for actually used ones.
Differential Revision: https://reviews.freebsd.org/D4717
e1000/e1000e split in linux.
Split rxbuffer and txbuffer apart to support the new RX descriptor format
structures. Move rxbuffer manipulation to em_setup_rxdesc() to unify the
new behavior changes.
Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.
Change em_receive_checksum() to process the new rxdescriptor format
status bit.
MFC after: 2 weeks
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D3447
doesn't get overrun by things like NFS that can and do shove more than 32 segs when
being used with em(4) and TSO4.
Update tso handling code in em_xmit() with update from jhb@ in email thread:
https://lists.freebsd.org/pipermail/freebsd-net/2014-July/039306.html
set ifp->if_hw_tsomax, ifp->if_hw_tsomaxsegcount & ifp->if_hw_tsomaxsegsize
to appropriate values.
Define a TSO workaround "magic" number of 4 that is used to avoid an
alignment issue in hardware.
Change a couple of integer values that were used as booleans to actual
bool types.
Ensure that em_enable_intr() enables the appropriate mask of interrupts
and not just a hardcoded define of values.
PR: 200221 199174 195078
Differential Revision: https://reviews.freebsd.org/D3192
Reviewed by: erj jhb hiren
MFC after: 2 weeks
Sponsored by: Limelight Networks
up to 2 rx/tx queues for the 82574.
Program the 82574 to enable 5 msix vectors, assign 1 to each rx queue,
1 to each tx queue and 1 to the link handler.
Inspired by DragonFlyBSD, enable some RSS logic for handling tx queue
handling/processing.
Move multiqueue handler functions so that they line up better in a diff
review to if_igb.c
Always enqueue tx work to be done in em_mq_start, if unable to acquire
the TX lock, then this will be processed in the background later by the
taskqueue. Remove mbuf argument from em_start_mq_locked() as the work
is always enqueued. (stolen from igb)
Setup TARC, TXDCTL and RXDCTL registers for better performance and stability
in multiqueue and singlequeue implementations. Handle Intel errata 3 and
generic multiqueue behavior with the initialization of TARC(0) and TARC(1)
Bind interrupt threads to cpus in order. (stolen from igb)
Add 2 new DDB functions, one to display the queue(s) and their settings and
one to reset the adapter. Primarily used for debugging.
In the multiqueue configuration, bump RXD and TXD ring size to max for the
adapter (4096). Setup an RDTR of 64 and an RADV of 128 in multiqueue configuration
to cut down on the number of interrupts. RADV was arbitrarily set to 2x RDTR
and can be adjusted as needed.
Cleanup the display in top a bit to make it clearer where the taskqueue threads
are running and what they should be doing.
Ensure that both queues are processed by em_local_timer() by writing them both
to the IMS register to generate soft interrupts.
Ensure that an soft interrupt is generated when em_msix_link() is run so that
any races between assertion of the link/status interrupt and a rx/tx interrupt
are handled.
Document existing tuneables: hw.em.eee_setting, hw.em.msix, hw.em.smart_pwr_down, hw.em.sbp
Document use of hw.em.num_queues and the new kernel option EM_MULTIQUEUE
Thanks to Intel for their continued support of FreeBSD.
Reviewed by: erj jfv hiren gnn wblock
Obtained from: Intel Corporation
MFC after: 2 weeks
Relnotes: Yes
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D1994
applying them to em(4).
Rely on iterations through the local timer, and the tx queue state to
determine if an actual hang has occurred. Any time a descriptor is used
(packet sent), the tx queue is flagged as busy. Then when txeof runs, it
either clears the flag when all is clean, or resets it to 1 if ANY are
cleaned, if nothing is cleaned it increments the flag.
Local timer simply checks to see if busy ever reaches MAX (10, which
is compile time configurable), and then sets it as HUNG, at that point
there is one more timer cycle in which to have any cleans, if not a
watchdog reset will occur.
Differential Revision: https://reviews.freebsd.org/D2019
Submitted by: jfv
Reviewed by: hiren
Obtained from: Intel Corporation
MFC after: 2 weeks
Relnotes: Yes
Sponsored by: Limelight Networks
irrespective of the setting of lem_rx_process_limit, while
giving a chance to the taskqueue scheduler to act after
each chunk.
This makes lem_rxeof similar to the one in if_em.c and if_igb.c .
if_lem.c and if_em.c: add a sysctl to manually configure the
'itr' moderation register.
Approved by: Jack Vogel
There have still been intermittent problems with apparent TX
hangs for some customers. These have been problematic to reproduce
but I believe these changes will address them. Testing on a number
of fronts have been positive.
EM: there is an important 'chicken bit' fix for 82574 in the shared
code this is supported in the core here.
- The TX path has been tightened up to improve performance. In
particular UDP with jumbo frames was having problems, and the
changes here have improved that.
- OACTIVE has been used more carefully on the theory that some
hangs may be due to a problem in this interaction
- Problems with the RX init code, the "lazy" allocation and
ring initialization has been found to cause problems in some
newer client systems, and as it really is not that big a win
(its not in a hot path) it seems best to remove it.
- HWTSO was broken when VLAN HWTAGGING or HWFILTER is used, I
found this was due to an error in setting up the descriptors
in em_xmit.
IGB:
- TX is also improved here. With multiqueue I realized its very
important to handle OACTIVE only under the CORE lock so there
are no races between the queues.
- Flow Control handling was broken in a couple ways, I have changed
and I hope improved that in this delta.
- UDP also had a problem in the TX path here, it was change to
improve that.
- On some hardware, with the driver static, a weird stray interrupt
seems to sometimes fire and cause a panic in the RX mbuf refresh
code. This is addressed by setting interrupts late in the init
path, and also to set all interrupts bits off at the start of that.
to calculate the outstanding descriptors that need to be
refreshed at any time, and use THAT in rxeof to determine
if refreshing needs to be done. Also change the local_timer
to simply fire off the appropriate interrupt rather than
schedule a tasklet, its simpler.
MFC in two weeks
been undergoing test for some weeks. This improves the RX
mbuf handling to avoid system hang due to depletion. Thanks
to all those who have been testing the code, and to Beezar
Liu for the design changes.
Next the igb driver is updated for similar RX changes, but
also to add new features support for our upcoming i350 family
of adapters.
MFC after a week
- Chasin down bogus watchdogs has led to an improved
design to this handling, the hang decision takes
place in the tx cleanup, with only a simple report
check in local_timer. Our tests have shown no false
watchdogs with this code.
- VLAN fixes from jhb, the shadow vfta should be per
interface, but as global it was not. Thanks John.
- Bug fixes in the support for new PCH2 hardware.
- Thanks for all the help and feedback on the driver,
changes to lem with be coming shortly as well.
- Sync shared code with Intel internal
- New client chipset support added
- em driver - fixes to 82574, limit queues to 1 but use MSIX
- em driver - large changes in TX checksum offload and tso
code, thanks to yongari.
- some small changes for watchdog issues.
- igb driver - local timer watchdog code was missing locking
this and a couple other watchdog related fixes.
- bug in rx discard found by Andrew Boyer, check for null pointer
MFC: a week
discrepencies from the igb version which was the target.
Change the message when neither MSI or MSIX are enabled
and a fallback to Legacy interrupts happen, the existing
message was confusing.
configuration function. For failed memory allocations, em(4)/lem(4)
called panic(9) which is not acceptable on production box.
igb(4)/ixgb(4)/ix(4) allocated the required memory in stack which
consumed 768 bytes of stack memory which looks too big.
To address these issues, allocate multicast array memory in device
attach time and make multicast configuration success under any
conditions. This change also removes the excessive use of memory in
stack.
Reviewed by: jfv
- code flow in handler could let interrupt be
reenabled when not wanted.
- change where the RX lock is taken to improve
performance.
- adapter->msix is true for MSI systems also,
it needs to explicitly test for 82574, good one :)
em revision 7.0.0:
- Using driver devclass, seperate legacy (pre-pcie) code
into a seperate source file. This will at least help
protect against regression issues. It compiles along
with em, and is transparent to end use, devices in each
appear to be 'emX'. When using em in a modular form this
also allows the legacy stuff to be defined out.
- Add tx and rx rings as in igb, in the 82574 this becomes
actual multiqueue for the first time (2 queues) while in
other PCIE adapters its just make code cleaner.
- Add RX mbuf handling logic that matches igb, this will
eliminate packet drops due to temporary mbuf shortage.
igb revision 1.9.3:
- Following the ixgbe code, use a new approach in what
was called 'get_buf', the routine now has been made
independent of rxeof, it now does the update to the
engine TDT register, this design allows temporary
mbuf resources to become non-critical, not requiring
a packet to be discarded, instead it just returns and
does not increment the tail pointer.
- With the above change it was also unnecessary to keep
'spare' maps around, since we do not have the discard
issue.
- Performance tweaks and improvements to the code also.
MFC in a week
igb now has a queue notion that has a single interrupt
with an RX/TX pair, this will reduce the total interrupts
seen on a system. Both em and igb have a new watchdog
method. igb has fixes from Pyun Yong-Hyeon that have
improved stability, thank you :)
I wish to MFC this for 7.3 asap, please test if able.
drivers. These add new hardware support, most importantly
the pch (i5 chipset) in the em driver. Also, both drivers
now have the simplified (and I hope improved) watchdog
code. The igb driver uses the new RX cleanup that I
first implemented in ixgbe.
em - version 6.9.24
igb - version 1.8.4
will sometimes fail to initialize problem due to a lock
contention with management hardware. However, in order to
deliver that fix it was necessary to take a shared code
update as a whole, and this required scattered changes in
the core code to be compatible.
The em driver now has VLAN HW support added as the igb
driver had previously.
MFC after: ASAP - in time for 7.1 RELEASE
proved to be necessary to make the static drivers work
in EITHER/OR or BOTH configurations. Modules will still
build in sys/modules/igb or em as before.
This also updates the igb driver for support for the 82576
adapter, adds shared code fixes, and etc....
MFC after: ASAP