Commit Graph

190 Commits

Author SHA1 Message Date
Stephen Hurd
c5cf217261 Some small packet performance improvements
If the packet is smaller than MTU, disable the TSO flags.
Move TCP header parsing inside the IS_TSO?() test.
Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need to be zeroed before TX.

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12442
2017-09-23 01:33:20 +00:00
Stephen Hurd
ab2e3f7958 Revert r323516 (iflib rollup)
This was really too big of a commit even if everything worked, but there
are multiple new issues introduced in the one huge commit, so it's not
worth keeping this until it's fixed.

I'll work on splitting this up into logical chunks and introduce them one
at a time over the next week or two.

Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
2017-09-16 02:41:38 +00:00
Stephen Hurd
d300df0182 Roll up iflib commits from github. This pulls in most of the work done
by Matt Macy as well as other changes which he has accepted via pull
request to his github repo at https://github.com/mattmacy/networking/

This should bring -CURRENT and the github repo into close enough sync to
allow small feature branches rather than a large chain of interdependant
patches being developed out of tree.  The reset of the synchronization
should be able to be completed on github by splitting the remaining
changes that are not yet ready into short feature branches for later
review as smaller commits.

Here is a summary of changes included in this patch:

1)  More checks when INVARIANTS are enabled for eariler problem
    detection
2)  Group Task Queue cleanups
    - Fix use of duplicate shortdesc for gtaskqueue malloc type.
      Some interfaces such as memguard(9) use the short description to
      identify malloc types, so duplicates should be avoided.
3)  Allow gtaskqueues to use ithreads in addition to taskqueues
    - In some cases, this can improve performance
4)  Better logging when taskqgroup_attach*() fails to set interrupt
    affinity.
5)  Do not start gtaskqueues until they're needed
6)  Have mp_ring enqueue function enter the ABDICATED rather than BUSY
    state.  This moves the TX to the gtaskq and allows processing to
    continue faster as well as make TX batching more likely.
7)  Add an ift_txd_errata function to struct if_txrx.  This allows
    drivers to inspect/modify mbufs before transmission.
8)  Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need
    checksums zeroed for checksum offload to work.  This avoids modifying
    packet data in the TX path when possible.
9)  Use ithreads for iflib I/O instead of taskqueues
10) Clean up ioctl and support async ioctl functions
11) Prefetch two cachlines from each mbuf instead of one up to 128B.  We
    often need to parse packet header info beyond 64B.
12) Fix potential memory corruption due to fence post error in
    bit_nclear() usage.
13) Improved hang detection and handling
14) If the packet is smaller than MTU, disable the TSO flags.
    This avoids extra packet parsing when not needed.
15) Move TCP header parsing inside the IS_TSO?() test.
    This avoids extra packet parsing when not needed.
16) Pass chains of mbufs that are not consumed by lro to if_input()
    rather call if_input() for each mbuf.
17) Re-arrange packet header loads to get as much work as possible done
    before a cache stall.
18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/
    IFDI_DETACH();
19) Attempt to distribute RX/TX tasks across cores more sensibly,
    especially when RX and TX share an interrupt.  RX will attempt to
    take the first threads on a core, and TX will attempt to take
    successive threads.
20) Allow iflib_softirq_alloc_generic() to request affinity to the same
    cpus an interrupt has affinity with.  This allows TX queues to
    ensure they are serviced by the socket the device is on.
21) Add new iflib sysctls to net.iflib:
    - timer_int - interval at which to run per-queue timers in ticks
    - force_busdma
22) Add new per-device iflib sysctls to dev.X.Y.iflib
    - rx_budget allows tuning the batch size on the RX path
    - watchdog_events Count of watchdog events seen since load
23) Fix error where netmap_rxq_init() could get called before
    IFDI_INIT()
24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY
    when waiting for firmware
    - After interrupts are enabled, convert all waits to sleeps
    - Eliminates e1000 software/firmware synchronization busy waits after
      startup
25) e1000: Remove special case for budget=1 in em_txrx.c
    - Premature optimization which may actually be incorrect with
      multi-segment packets
26) e1000: Split out TX interrupt rather than share an interrupt for
    RX and TX.
    - Allows better performance by keeping RX and TX paths separate
27) e1000: Separate igb from em code where suitable
    Much easier to understand separate functions and "if (is_igb)" than
    previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))"

#blamebruno

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12235
2017-09-13 01:18:42 +00:00
Konstantin Belousov
dc63dc00cb Fix malloc() uses in em_get_regs().
Do not use malloc(M_NOWAIT), wait is possible there, and the malloc
failures where not checked.  Do not forget to free malloced memory.

Reported and tested by:	pho
Approved by:	sbruno
Sponsored by:	The FreeBSD Foundation
2017-09-08 14:54:07 +00:00
Sean Bruno
a969350226 Revert r323008 and its conversion of e1000/iflib to using SX locks.
This seems to be missing something on the 82574L causing NFS root mounts
to hang.

Reported by:	kib
2017-08-30 18:56:24 +00:00
Sean Bruno
e17e5b4134 Continuation of lock cleanup in e1000.
Post-cold sleep instead of DELAY when waiting for firmware.

Convert softc mutex to an SX lock.  Change all waits to sleeps
once interrupts are enabled (and it is safe to sleep).

Submitted by:	Matt Macy <matt@mattmacy.io>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12101
2017-08-30 00:20:43 +00:00
Marius Strobl
e023501c6a Don't set any WOL enabling hardware bits if WOL isn't requested
according to the enabled interface capability bits. Also remove
some dead code, which tried to preserve already set contents of
E1000_WUC while that register is completely overwritten shortly
after in all cases.
2017-08-28 22:09:12 +00:00
Sean Bruno
7d0d648446 Add a different #define for the maximum number of transmit and
recieve descriptors for the igb(4) class of devices.  This will
allow a better definition for maximum going forward.  Some igb(4)
devices support more than the default 4K.

Reported by:	Jason (j@nitrology.com)
Sponsored by:	Limelight Networks
2017-08-25 22:38:55 +00:00
Sean Bruno
5eedcb0969 Do not update stats counter in SWI context. Defer to the already existing
admin thread.

Submitted by:	Matt Macy <mmacy@mattmacy.io>
Sponsored by:	Limelight Networks
2017-07-21 17:42:54 +00:00
Enji Cooper
2df7231dbf Some trivial style(9) fixes
- Delete trailing whitespace.
- Fix leading indentation (convert single column spaces to tabs).
- Convert "[Ff]all through" to "FALLTHROUGH", per implicit project
  style/spelling.

Reviewed by:	sbruno
Differential Revision:	D11665
2017-07-20 04:32:06 +00:00
Sean Bruno
d8c2808fcf Restore igb(4) code dropped during iflib conversion
- restore newer code for vf, i350, i210, i211
 - restore dmac init code for i354 and i350
 - restore WUC/WUFC update
 - check for igb mac type before attempting trying to assert
   a media changed event.
 - handle link events for igb(4) and em(4) devices differently
   and appropriately for their respective model types.

Submitted by:	Matt Macy <mmacy@mattmacy.io>
Sponsored by:	Limelight Networks
2017-07-19 22:41:22 +00:00
Sean Bruno
60596476cf Move pause frame counter out of struct if_ctx and into struct if_softc_ctx_t
so that we can use it in iflib to detect pause frames.

The igb(4) driver definitely used to use this in its old timer function and
I see no reason to restrict it to that driver only.

Sponsored by:	Limelight Networks
2017-04-07 00:33:03 +00:00
Sean Bruno
fd70242dbc no_desc_avail is tracked in iflib now making this redundant.
Sponsored by:	Limelight Networks
2017-04-04 16:54:46 +00:00
Sean Bruno
548b549ade Remove unsafe and non-functional DDB functions that I added long ago
for debugging.
2017-04-03 21:19:12 +00:00
Sean Bruno
c45420ccac Remove rx_processing_limit sysctl and now orphaned function em_set_sysctl_value
Sponsored by:	Limelight Networks
2017-04-03 21:17:57 +00:00
Sean Bruno
935ca1ae7c Access *correct* ifp data structure when debug sysctl is invoked.
Submitted by:	Kevin Bowling <kevin.bowling@kev009.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D10126
2017-03-27 15:08:02 +00:00
Sean Bruno
cb101e1257 Add missing 'else' to conditional. This doesn't really affect the code
flow or configuration in any way.
2017-03-24 14:27:29 +00:00
Sean Bruno
fdb25f38cb Add missing 'else' to 3-state conditional during setup of interrupts.
We don't want to overwrite the 82574 interrupt setup with a different
configuration.

PR:		218041
Submitted by:	razmyslov@viva64.com
2017-03-24 14:25:56 +00:00
Sean Bruno
b207ed2b4a Actually set the MTU to the requested value and fixup handling of jumbo
frames.

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Reported by:	pho
Sponsored by:	Limelight Networks
2017-03-15 14:44:59 +00:00
Sean Bruno
95246abb21 IFLIB updates
- unconditionally enable BUS_DMA on non-x86 architectures
- speed up rxd zeroing via customized function
- support out of order updates to rxd's
- add prefetching to hardware descriptor rings
- only prefetch on 10G or faster hardware
- add seperate tx queue intr function
- preliminary rework of NETMAP interfaces, WIP

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Sponsored by:	Limelight Networks
2017-03-13 22:53:06 +00:00
Sean Bruno
38b7de95a1 Restore PBA setup for igb(4) class devices.
Do no write to PBA register on igb(4) devices unless we need
to make adjustments for the 82575 and jumbo frames.

Remove redundant LPE/~LPE assignments.

Move e1000_lv_jumbo_workaround_ich8lan() invokcation into a block
so that its not executed in the igb case.

Move em(4) class assignments of RCTL values to its own code block.

Adjust a few direct accesses of ifp->mtu to use accessor functions.

PR:		216734
Submitted by:	Kaho Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
2017-02-19 05:06:29 +00:00
Sean Bruno
9c030a21d0 Push max_frame_len back into iflib so that jumbo frame sizes work.
Reported by:	pho
2017-02-17 21:30:16 +00:00
Sean Bruno
bcc537c59d Only trigger em_local_timer on queue index 0. This was causing continuous
em_local_timer() executions during normal operation and was very likely
to cause a lock up on igb(4) devices.

Submitted by:	Matt Macy (mmacy@nextbsd.org)
Reported by:	jtl
Reviewed by:	gallatin
Sponsored by:	Limelight Networks & Netflix
2017-02-12 23:06:41 +00:00
Andrew Turner
2a3c5de4f7 Add support for the Intel 82572EI back to em(4), it seems it was dropped
when oving to iflib.

Reviewed by:	sbruno
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D9511
2017-02-09 17:48:33 +00:00
Stephen J. Kiernan
d0b2cad1ca Add the folowing set accessor functions for recently-added members of ifnet
structure:

if_gethwtsomax(), if_sethwtsomax()                 - if_hw_tsomax
if_gethwtsomaxsegcount(), if_sethwtsomaxsegcount() - if_hw_tsomaxsegcount
if_gethwtsomaxsegsize(), if_sethwtsomaxsegsize()   - if_hw_tsomaxsegsize

Update em and vnic drivers which had already been coverted to use accessor
functions for the other ifnet structure members.

Reviewed by:	erj
Approved by:	sjg (mentor)
Obtained from:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D8544
2017-01-31 16:12:31 +00:00
Sean Bruno
db569544d1 igb(4) enable WOL features for this class of devices.
PR:	208343
Submitted by:	Kaho Tashikazu <kaho@elam.kais.kyoto-u.ac.jp>
2017-01-22 18:04:57 +00:00
Eric Joyner
295df609df e1000: Add support for Kaby Lake generation i219 (4) and i219 (5) devices
MFC after:	1 week
Sponsored by:	Intel Corporation
2017-01-19 18:52:38 +00:00
Sean Bruno
653e35e69f Restore fixup for newer em(4) devices WOL capabilities post iflib integration.
PR:	208343
2017-01-12 14:47:44 +00:00
Sean Bruno
0aa7d3ff9e Reset the EIAC register to include the LINK status bit and restore
link up/down notifications.

Submitted by:	Franco Fichtner <franco@opnsense.org>
2017-01-12 14:28:32 +00:00
Sean Bruno
8bc3dfc424 Attempt to use the "new" BAR address for newer igb(4) devices. This code
was dropped during the IFLIB migration.

Reported by:	olivier
Reviewed by:	mmacy@nextbsd.org
2017-01-12 14:18:52 +00:00
Sean Bruno
d37cece2b0 Add copywrite notices, 2-clause BSD.
Reported by:	jmallett
2017-01-10 04:50:26 +00:00
Sean Bruno
f2d6ace4a6 Migrate e1000 to the IFLIB framework:
- em(4) igb(4) and lem(4)
- deprecate the igb device from kernel configurations
- create a symbolic link in /boot/kernel from if_em.ko to if_igb.ko

Devices tested:
- 82574L
- I218-LM
- 82546GB
- 82579LM
- I350
- I217

Please report problems to freebsd-net@freebsd.org

Partial review from jhb and suggestions on how to *not* brick folks who
originally would have lost their igbX device.

Submitted by:	mmacy@nextbsd.org
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Limelight Networks and Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8299
2017-01-10 03:23:22 +00:00
Sean Bruno
751e35548b Update WOL support for newer em(4) devices.
Do not overwrite the contents of the WUC register, add E1000_WUC_PME_EN
to the register contents, leaving the default contents intact.

PR:		208343
Submitted by:	Kaho Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
Reviewed by:	jeffrey piper <jeffrey.e.pieper@intel.com>
Approved by:	erj@
MFC after:	2 weeks
2016-11-14 17:19:03 +00:00
Sean Bruno
b1b6afa587 r295133 attempted to deactivate TSO in the 100Mbit link case with this
adapter to work around bugs in TSO handling at this speed.

em_init_locked is called during first boot of the adapter and will
see that link_speed is unitialized, effectively turning off tso for
all cards at all speeds, which I believe was *not* the intent.

Move the handling of TSO deactivation to the link handler where we can
more effectively make the decision about what to do.  In addition,
completely purge the TSO capabilities instead of disabling just CSUM_TSO.

Thanks to jhb for explanation of the hw capabilites api.

Thanks to royger and cognet for testing the 100Mbit failure case to
ensure that their adapters do indeed still work.

MFC after:	1 week
Sponsored by:	Limelight Networks
2016-11-05 16:30:42 +00:00
Sean Bruno
7846f73cc8 Removed unused M_TSO_LEN.
MFC after:	2 weeks
2016-11-02 14:25:30 +00:00
Sean Bruno
e760e292ae e1000: Add support for Kaby Lake IDs
Fixup some errors when transitioning to/from low power states.

Submitted by:	erj
Reviewed by:	Jeffery Piper (jeffrey.e.piper@intel.com)
MFC after:	3 days
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D7478
2016-08-15 11:24:30 +00:00
Sean Bruno
e72a746a6e r293331 mistakingly failed to add an assignment of paddr to the rxbuf
but only in the NETMAP code.  This lead to the NETMAP code paths
passing nothing up to userland.

Submitted by:	Ad Schellevis <ad@opnsense.org>
Reported by:	Franco Fichtner <franco@opnsense.org>
MFC after:	1 day
2016-08-01 21:19:51 +00:00
Sean Bruno
761e526185 Do not initialize the adapter on MTU change when adapter status is down.
This fixes long-standing problems when changing settings of the adapter.

Discussed in:
https://lists.freebsd.org/pipermail/freebsd-net/2016-June/045509.html

Submitted by:	arnaud.ysmal@stormshield.eu
Reviewed by:	erj@freebsd.org
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D7030
2016-07-07 03:39:18 +00:00
Pedro F. Giffuni
363089d89e dev/e1000,ixgbe: minor spelling fixes.
No functional change.

Differential Revision:	https://reviews.freebsd.org/D6177
2016-05-06 22:54:56 +00:00
Sean Bruno
ae6bd5b7be Correct possible underflow conditions when checking for available space
in the tx h/w ring buffer.

Reviewed by:	gnn jeb.j.cramer@intel.com
MFC after:	1 week
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D5918
2016-04-18 20:33:44 +00:00
Marius Strobl
0a01ff4d7d Fix and clean up usage of DMA and TSO segments:
- At Intel it is believed that most of their products support "only"
  40 DMA segments so lower {EM,IGB}_MAX_SCATTER accordingly. Actually,
  40 is more than plenty to handle full size TSO packets so it doesn't
  make sense to further distinguish between MAC variants that really
  can do 64 DMA segments. Moreover, capping at 40 DMA segments limits
  the stack usage of {em,igb}_xmit() that - given the rare use of more
  than these - previously hardly was justifiable, while still being
  sufficient to avoid the problems seen with em(4) and EM_MAX_SCATTER
  set to 32.
- In igb(4), pass the actually supported TSO parameters up the stack.
  Previously, the defaults set in if_attach_internal() were applied,
  i. e. a maximum of 35 TSO segments, which made supporting more than
  these in the driver pointless. However, this might explain why no
  problems were seen with IGB_MAX_SCATTER at 64.
- In em(4), take the 5 m_pullup(9) invocations performed by em_xmit()
  in the TSO case into account when reporting TSO parameters upwards.
  In the worst case, each of these calls will add another mbuf and,
  thus, the requirement for an additional DMA segment. So for best
  performance, it doesn't make sense to advertize a maximum of TSO
  segments that typically will require defragmentation in em_xmit().
  Again, this leaves enough room to handle full size TSO packets.
- Drop TSO macros from if_lem.h given that corresponding MACS don't
  support TSO in the first place.

Reviewed by:	erj, sbruno, jeffrey.e.pieper_intel.com
Approved by:	erj
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D5238
2016-02-23 01:19:26 +00:00
Eric Joyner
c80429cedb Update em(4) to 7.6.1; update igb(4) to 2.5.3.
Major changes:

- Add i219/i219(2) hardware support. (Found on Skylake generation and newer
  chipsets.)
  - Further to the last Skylake support diff, this one also includes support for
    the Lewisburg chipset (i219(3)).

- Add a workaround to an igb hardware errata.
  All 1G server products need to have IPv6 extension header parsing turned off.
  This should be listed in the specification updates for current 1G server
  products, e.g. for i350 it's errata #37 in this document:
  http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/ethernet-controller-i350-spec-update.pdf

- Avoton (i354) PHY errata workaround added

And a bunch of minor fixes, as well as #defines for things that the current
em(4)/igb(4) drivers don't implement.

Differential Revision: https://reviews.freebsd.org/D3162
Reviewed by: sbruno, marius, gnn
Approved by: gnn
MFC after: 2 weeks
Sponsored by: Intel Corporation
2016-02-05 17:14:37 +00:00
Marius Strobl
d233a81b5a As it turns out, one of the more or less recent changes to em(4)
causes watchdog timeouts when using TSO4 at link speeds below
Gigabit, at least with 82573E. So disable the assist automatically
when at lower speeds.

Submitted by:	jfv
Approved by:	erj
Obtained from:	D3162
MFC after:	3 days
2016-02-01 23:51:30 +00:00
Marius Strobl
e12a9f251e Given that em(4), lem(4) and igb(4) hardware doesn't require the
alignment guarantees provided by m_defrag(9), use m_collapse(9)
instead for performance reasons.
While at it, sanitize the statistics softc members, i. e. retire
unused ones and add SYSCTL nodes missing for actually used ones.

Differential Revision:	https://reviews.freebsd.org/D4717
2016-01-13 21:47:27 +00:00
Sean Bruno
676822acb5 Disable the reuse of checksum offload context descriptors in the case
of multiple queues in em(4).  Document errata in the code.

MFC after:	2 weeks
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D3995
2016-01-07 16:48:47 +00:00
Sean Bruno
b834dcea9a Switch em(4) to the extended RX descriptor format. This matches the
e1000/e1000e split in linux.

Split rxbuffer and txbuffer apart to support the new RX descriptor format
structures. Move rxbuffer manipulation to em_setup_rxdesc() to unify the
new behavior changes.

Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.

Change em_receive_checksum() to process the new rxdescriptor format
status bit.

MFC after:	2 weeks
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D3447
2016-01-07 16:42:48 +00:00
Sean Bruno
8061e8bb1e Wow, um ... sorry about that. The commit log for this code should have
read that it was for EM_MULTIQUEUE.  Revert this and try again.
2016-01-07 16:24:18 +00:00
Sean Bruno
712b97a630 Switch em(4) to the extended RX descriptor format. This matches the
e1000/e1000e split in linux.

MFC after:	2 weeks
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D3447
2016-01-07 16:20:55 +00:00
Sean Bruno
e373323fe2 Revert 287914,287762.
Reports of breakage on igb(4) have been narrowed down to 287762 and 287914
is an dependant change.

Submitted by:	erj
2015-09-19 18:22:59 +00:00
Sean Bruno
25fd5dd9a0 Add Intel Skylake/I219 Support
- New em(4) device in currently shipping products

Differential Revision:	https://reviews.freebsd.org/D3163
Submitted by:	erj@freebsd.org
Reviewed by:	jfv@freebsd.org
MFC after:	2 weeks
Sponsored by:	Intel Corporation
2015-09-17 15:11:45 +00:00