12 Commits

Author SHA1 Message Date
Stephen Hurd
d300df0182 Roll up iflib commits from github. This pulls in most of the work done
by Matt Macy as well as other changes which he has accepted via pull
request to his github repo at https://github.com/mattmacy/networking/

This should bring -CURRENT and the github repo into close enough sync to
allow small feature branches rather than a large chain of interdependent
patches being developed out of tree.  The rest of the synchronization
should be able to be completed on github by splitting the remaining
changes that are not yet ready into short feature branches for later
review as smaller commits.

Here is a summary of changes included in this patch:

1)  More checks when INVARIANTS are enabled for earlier problem
    detection
2)  Group Task Queue cleanups
    - Fix use of duplicate shortdesc for gtaskqueue malloc type.
      Some interfaces such as memguard(9) use the short description to
      identify malloc types, so duplicates should be avoided.
3)  Allow gtaskqueues to use ithreads in addition to taskqueues
    - In some cases, this can improve performance
4)  Better logging when taskqgroup_attach*() fails to set interrupt
    affinity.
5)  Do not start gtaskqueues until they're needed
6)  Have mp_ring enqueue function enter the ABDICATED rather than BUSY
    state.  This moves the TX to the gtaskq, allows processing to
    continue faster, and makes TX batching more likely.
7)  Add an ift_txd_errata function to struct if_txrx.  This allows
    drivers to inspect/modify mbufs before transmission.
8)  Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need
    checksums zeroed for checksum offload to work.  This avoids modifying
    packet data in the TX path when possible.
9)  Use ithreads for iflib I/O instead of taskqueues
10) Clean up ioctl and support async ioctl functions
11) Prefetch two cache lines (up to 128B) from each mbuf instead of one.
    We often need to parse packet header info beyond 64B.
12) Fix potential memory corruption due to fence post error in
    bit_nclear() usage.
13) Improved hang detection and handling
14) If the packet is smaller than MTU, disable the TSO flags.
    This avoids extra packet parsing when not needed.
15) Move TCP header parsing inside the IS_TSO?() test.
    This avoids extra packet parsing when not needed.
16) Pass chains of mbufs that are not consumed by LRO to if_input()
    rather than calling if_input() for each mbuf.
17) Re-arrange packet header loads to get as much work as possible done
    before a cache stall.
18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/
    IFDI_DETACH().
19) Attempt to distribute RX/TX tasks across cores more sensibly,
    especially when RX and TX share an interrupt.  RX will attempt to
    take the first threads on a core, and TX will attempt to take
    successive threads.
20) Allow iflib_softirq_alloc_generic() to request affinity to the same
    cpus an interrupt has affinity with.  This allows TX queues to
    ensure they are serviced by the socket the device is on.
21) Add new iflib sysctls to net.iflib:
    - timer_int - interval at which to run per-queue timers in ticks
    - force_busdma
22) Add new per-device iflib sysctls to dev.X.Y.iflib
    - rx_budget - allows tuning the batch size on the RX path
    - watchdog_events - count of watchdog events seen since load
23) Fix error where netmap_rxq_init() could get called before
    IFDI_INIT()
24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY
    when waiting for firmware
    - After interrupts are enabled, convert all waits to sleeps
    - Eliminates e1000 software/firmware synchronization busy waits after
      startup
25) e1000: Remove special case for budget=1 in em_txrx.c
    - Premature optimization which may actually be incorrect with
      multi-segment packets
26) e1000: Split out TX interrupt rather than share an interrupt for
    RX and TX.
    - Allows better performance by keeping RX and TX paths separate
27) e1000: Separate igb from em code where suitable
    Much easier to understand separate functions and "if (is_igb)" than
    previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))"
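
Item 12 can be illustrated with a small sketch.  bitstring(3)
bit_nclear() clears bits start through stop with both endpoints
included, so clearing N bits needs stop = start + N - 1.  The commit
message does not show the offending call, so this only shows the general
hazard.  demo_bit_nclear(), demo_clear_count() and demo_bit_test() are
hypothetical userland stand-ins, not the iflib code.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for bitstring(3) bit_nclear(): clears bits
 * start..stop, with BOTH endpoints included.
 */
static void
demo_bit_nclear(unsigned char *bits, int start, int stop)
{
	assert(start >= 0 && start <= stop);
	for (int i = start; i <= stop; i++)
		bits[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/* Clear exactly `count` bits beginning at bit `start`. */
static void
demo_clear_count(unsigned char *bits, int start, int count)
{
	assert(count > 0);
	/*
	 * The last bit to clear is start + count - 1.  Passing
	 * start + count instead would clear one bit too many, which
	 * is the fence post error and can touch a neighbouring byte.
	 */
	demo_bit_nclear(bits, start, start + count - 1);
}

/* Return 1 if bit `i` is set, 0 otherwise. */
static int
demo_bit_test(const unsigned char *bits, int i)
{
	return ((bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1);
}
```

With a two-byte map of all ones, demo_clear_count(bits, 4, 4) clears
bits 4 through 7 and leaves bit 8, the first bit of the next byte,
untouched.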

#blamebruno

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12235
2017-09-13 01:18:42 +00:00
Sean Bruno
a969350226 Revert r323008 and its conversion of e1000/iflib to using SX locks.
This seems to be missing something on the 82574L causing NFS root mounts
to hang.

Reported by:	kib
2017-08-30 18:56:24 +00:00
Sean Bruno
e17e5b4134 Continuation of lock cleanup in e1000.
Post-cold sleep instead of DELAY when waiting for firmware.

Convert softc mutex to an SX lock.  Change all waits to sleeps
once interrupts are enabled (and it is safe to sleep).

Submitted by:	Matt Macy <matt@mattmacy.io>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12101
2017-08-30 00:20:43 +00:00
Sean Bruno
9bc7588cb6 Deprecate unused int isc_max_txqsets and int isc_max_rxqsets as they
were redundant and not being used to set anything up.

Submitted by:	Matt Macy <mmacy@mattmacy.io>
Reported by:	Jeb Cramer <cramerj@intel.com>
Sponsored by:	Limelight Networks
2017-07-27 21:21:43 +00:00
Jason A. Harmening
eb36b1d0bc Clean up MD pollution of bus_dma.h:
--Remove special-case handling of sparc64 bus_dmamap* functions.
  Replace with a more generic mechanism that allows MD busdma
  implementations to generate inline mapping functions by
  defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>.  This
  is currently useful for sparc64, x86, and arm64, which all
  implement non-load dmamap operations as simple wrappers
  around map objects which may be bus- or device-specific.

--Remove NULL-checked bus_dmamap macros.  Implement the
  equivalent NULL checks in the inlined x86 implementation.
  For non-x86 platforms, these checks are a minor pessimization
  as those platforms do not currently allow NULL maps.  NULL
  maps were originally allowed on arm64, which appears to have
  been the motivation behind adding arm[64]-specific barriers
  to bus_dma.h, but that support was removed in r299463.

--Simplify the internal interface used by the bus_dmamap_load*
  variants and move it to bus_dma_internal.h

--Fix some drivers that directly include sys/bus_dma.h
  despite the recommendations of bus_dma(9)

Reviewed by:	kib (previous revision), marius
Differential Revision:	https://reviews.freebsd.org/D10729
2017-07-01 05:35:29 +00:00
Sean Bruno
60596476cf Move pause frame counter out of struct if_ctx and into struct if_softc_ctx_t
so that we can use it in iflib to detect pause frames.

The igb(4) driver definitely used to use this in its old timer function and
I see no reason to restrict it to that driver only.

Sponsored by:	Limelight Networks
2017-04-07 00:33:03 +00:00
Sean Bruno
ea351d3f14 Allow MSIX to be turned off by tunable per interface, per driver.
Sponsored by:	Limelight Networks
2017-04-04 21:03:34 +00:00
Sean Bruno
95246abb21 IFLIB updates
- unconditionally enable BUS_DMA on non-x86 architectures
- speed up rxd zeroing via customized function
- support out of order updates to rxd's
- add prefetching to hardware descriptor rings
- only prefetch on 10G or faster hardware
- add separate tx queue intr function
- preliminary rework of NETMAP interfaces, WIP

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Sponsored by:	Limelight Networks
2017-03-13 22:53:06 +00:00
Sean Bruno
e035717e57 IFLIB updates:
We found that routing performance dropped significantly when configuring
FreeBSD as a router.  We are applying the following changes in order to
resolve those issues and hopefully perform better.
 - don't prefetch the flags array, we usually don't need it
 - prefetch the next cache line of each of the software descriptor arrays as
   well as the first cache line of each of the next four packets' mbufs and
   clusters
 - reduce max copy size to 63 bytes
 - convert rx soft descriptors from array of structures to a structure of arrays
 - update copyrights
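
The array-of-structures to structure-of-arrays conversion listed above
can be sketched roughly as follows.  All names (demo_rxsd_aos,
demo_rxsd_soa and the helpers) are hypothetical and the field set is an
assumption; the real iflib software descriptors differ.  The point is
only that each field gets its own dense array, so a loop that needs the
mbuf pointers never pulls the flags into cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Before: array of structures, all fields of one descriptor adjacent. */
struct demo_rxsd_aos {
	void		*m;	/* mbuf pointer */
	void		*cl;	/* cluster pointer */
	unsigned char	 flags;
};

/* After: structure of arrays, one dense array per field. */
struct demo_rxsd_soa {
	void		**m;
	void		**cl;
	unsigned char	 *flags;
	size_t		  n;
};

/* Allocate zeroed per-field arrays for n descriptors; 0 on success. */
static int
demo_soa_alloc(struct demo_rxsd_soa *sd, size_t n)
{
	assert(n > 0);
	sd->m = calloc(n, sizeof(*sd->m));
	sd->cl = calloc(n, sizeof(*sd->cl));
	sd->flags = calloc(n, sizeof(*sd->flags));
	sd->n = n;
	if (sd->m == NULL || sd->cl == NULL || sd->flags == NULL) {
		free(sd->m);
		free(sd->cl);
		free(sd->flags);
		return (-1);
	}
	return (0);
}

static void
demo_soa_free(struct demo_rxsd_soa *sd)
{
	free(sd->m);
	free(sd->cl);
	free(sd->flags);
}

/* Scans only the m[] array; cl[] and flags[] are never loaded. */
static size_t
demo_soa_count_mbufs(const struct demo_rxsd_soa *sd)
{
	size_t i, cnt = 0;

	for (i = 0; i < sd->n; i++)
		if (sd->m[i] != NULL)
			cnt++;
	return (cnt);
}
```

With 8-byte pointers, a 64-byte cache line holds eight entries of m[],
whereas the array-of-structures layout spends part of every line on
fields the loop does not read.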

Submitted by:	Matt Macy <mmacy@nextbsd.org>
2017-01-27 23:08:06 +00:00
Sean Bruno
1248952a50 2017 IFLIB updates in preparation for commits to e1000 and ixgbe.
- iflib - add checksum in place support (mmacy)
- iflib - initialize IP for TSO (going to be needed for e1000) (mmacy)
- iflib - move isc_txrx from shared context to softc context (mmacy)
- iflib - Normalize checks in TXQ drainage. (shurd)
- iflib - Fix queue capping checks (mmacy)
- iflib - Fix invalid assert, em can need 2 sentinels (mmacy)
- iflib - let the driver determine what capabilities are set and what
          tx csum flags are used (mmacy)
- add INVARIANTS debugging hooks to gtaskqueue enqueue (mmacy)
- update bnxt(4) to support the changes to iflib (shurd)

Some other various, sundry updates.  Slightly more verbose changelog:

Submitted by:	mmacy@nextbsd.org
Reviewed by:	shurd
MFC after:
Sponsored by:	Limelight Networks and Dell EMC Isilon
2017-01-02 00:56:33 +00:00
Stephen Hurd
23ac9029f9 Update iflib to support more NIC designs
- Move group task queue into kern/subr_gtaskqueue.c
- Change intr_enable to return an int so it can be detected if it's not
  implemented
- Allow different TX/RX queues per set to be different sizes
- Don't split up TX mbufs before transmit
- Allow a completion queue for TX as well as RX
- Pass the RX budget to isc_rxd_available() to allow an earlier return
  and avoid multiple calls
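
A rough sketch of why passing the budget helps: the availability check
can stop counting as soon as it has found as many ready descriptors as
the caller is willing to process.  struct demo_rx_ring and
demo_rxd_available() are hypothetical; the real isc_rxd_available()
signature and the way a driver detects a completed descriptor are not
shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RX ring: done[i] != 0 means descriptor i is ready. */
struct demo_rx_ring {
	const uint8_t	*done;
	uint16_t	 size;
};

/*
 * Count ready descriptors starting at idx, wrapping at the end of the
 * ring, and return early once `budget` descriptors have been found.
 */
static int
demo_rxd_available(const struct demo_rx_ring *r, uint16_t idx, int budget)
{
	int cnt = 0;

	assert(r->size > 0 && idx < r->size && budget >= 0);
	while (cnt < budget && cnt < r->size && r->done[idx]) {
		cnt++;
		if (++idx == r->size)
			idx = 0;
	}
	return (cnt);
}
```

A caller with a budget of 2 gets an answer after inspecting two
descriptors, even if hundreds are ready, and does not need to call the
function again to learn whether more work exists within that budget.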

Submitted by:	shurd
Reviewed by:	gallatin
Approved by:	scottl
Differential Revision:	https://reviews.freebsd.org/D7393
2016-08-12 21:29:44 +00:00
Scott Long
4c7070db25 Import the 'iflib' API library for network drivers. From the author:
"iflib is a library to eliminate the need for frequently duplicated device
independent logic propagated (poorly) across many network drivers."

Participation is purely optional.  The IFLIB kernel config option is
provided for drivers that want to transition between legacy and iflib
modes of operation.  ixl and ixgbe driver conversions will be committed
shortly.  We hope to see participation from the Broadcom and maybe
Chelsio drivers in the near future.

Submitted by:   mmacy@nextbsd.org
Reviewed by:    gallatin
Differential Revision:  D5211
2016-05-18 04:35:58 +00:00