Commit Graph

17 Commits

Author SHA1 Message Date
Eric Joyner
088a0b27ad intel iflib drivers: correct initialization of tx_cidx_processed
From Jake:

In r341156 ("Fix first-packet completion", 2018-11-28) a hack to work
around a delta calculation determining how many descriptors were used
was added to ixl_isc_tx_credits_update_dwb.

The same fix was also applied to the em and igb drivers in r340310, and
to ix in r341156.

The hack checked the case where prev and cur were equal, and then added
one. This works, because by the time we do the delta check, we already
know there is at least one packet available, so the delta should be at
least one.

However, it's not a complete fix, and as indicated by the comment is
really a hack to work around the real bug.

The real problem is that the first time that we transmit a packet,
tx_cidx_processed will be set to point to the start of the ring.
Ultimately, the credits_update function expects it to point to the
*last* descriptor that was processed. Since we haven't yet processed any
descriptors, pointing it to 0 results in this incorrect calculation.

Fix the initialization code to have it point to the end of the ring
instead. One way to think about this, is that we are setting the value
to be one prior to the first available descriptor.

Doing so, corrects the delta calculation in all cases. The original fix
only works if the first packet has exactly one descriptor. Otherwise, we
will report 1 less than the correct value.

As part of this fix, also update the MPASS assertions to match the real
expectations. First, ensure that prev is not equal to cur, since this
should never happen. Second, remove the assertion about prev==0 || delta
!= 0. It looks like that originated from when the em driver was
converted to iflib. It seems like it was supposed to ensure that delta
was non-zero. However, because we originally returned 0 delta for the
first calculation, the "prev == 0" was tacked on.

Instead, replace this with a check that delta is greater than zero,
after the correction necessary when the ring pointers wrap around.

This new solution should fix the same bug as r341156 did, but in a more
robust way.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	shurd@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18545
2019-01-24 01:03:00 +00:00
Stephen Hurd
cf49cdd5a3 Fix first-packet completion
The first packet after the ring is initialized was never
completed as isc_txd_credits_update() would not include it in the
count of completed packets. This caused netmap to never complete
a batch. See PR 233022 for more details.

PR:		233022
Reported by:	lev
Reviewed by:	lev
MFC after:	3 days
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D17931
2018-11-09 22:18:43 +00:00
Eric Joyner
adf93b56dc em/igb/ix(4): Port two Tx/Rx fixes made to ixl in r339338
- Fix assert/panic on receive when Jumbo Frames are enabled.

From the commit I made to ixl:
"It turns out that *_isc_rxd_available is supposed to return how many
packets are available to be cleaned on the rx ring. This patch removes
a section of code where if the budget argument is 1, the function would return
one if there was a descriptor available, not necessarily a packet.

This is okay in regular mtu 1500 traffic since the max frame size is less
than the configured receive buffer size (2048), but this doesn't work when
received packets can span more than one  descriptor, as is the case when the
mtu is 9000 and the receive buffer size is 4096."

- Fix possible Tx hang because *_isc_txd_credits_update returns incorrect result

From the commit by Krzysztof Galazka to ixl: "Function isc_txd_update_credits
called with clear set to false should return 1 if there are TX descriptors
already handled by HW. It was always returning 0 causing troubles with UDP TX
traffic."

PR:             231659
Reported by:    lev@
Approved by:	re (gjb@)
Sponsored by:   Intel Corporation
2018-10-14 05:09:43 +00:00
Matt Macy
8fd222ebb4 fix case where pidx_last might be used uninitialized
Reviewed by:	sbruno
2018-05-04 18:59:01 +00:00
Mark Johnston
fbf8b74c10 Use C99 initializers for iflib function tables.
Reviewed by:	sbruno
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D15041
2018-04-11 15:15:34 +00:00
Stephen Hurd
7021bf0569 Update copyright per Matthew Macy
"Under my tutelage Nicole did 85% of the work. At the time it seemed
simplest for a number of reasons to put my copyright on it. I now consider
that to have been a mistake."

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Approved by:	shurd
Differential Revision:	https://reviews.freebsd.org/D14766
2018-03-21 15:57:36 +00:00
Stephen Hurd
96fc97c81f Update Matthew Macy contact info
Email address has changed, uses consistent name (Matthew, not Matt)

Reported by:	Matthew Macy <mmacy@mattmacy.io>
Differential Revision:	https://reviews.freebsd.org/D13537
2017-12-19 17:59:00 +00:00
Stephen Hurd
ab2e3f7958 Revert r323516 (iflib rollup)
This was really too big of a commit even if everything worked, but there
are multiple new issues introduced in the one huge commit, so it's not
worth keeping this until it's fixed.

I'll work on splitting this up into logical chunks and introduce them one
at a time over the next week or two.

Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
2017-09-16 02:41:38 +00:00
Stephen Hurd
d300df0182 Roll up iflib commits from github. This pulls in most of the work done
by Matt Macy as well as other changes which he has accepted via pull
request to his github repo at https://github.com/mattmacy/networking/

This should bring -CURRENT and the github repo into close enough sync to
allow small feature branches rather than a large chain of interdependant
patches being developed out of tree.  The reset of the synchronization
should be able to be completed on github by splitting the remaining
changes that are not yet ready into short feature branches for later
review as smaller commits.

Here is a summary of changes included in this patch:

1)  More checks when INVARIANTS are enabled for eariler problem
    detection
2)  Group Task Queue cleanups
    - Fix use of duplicate shortdesc for gtaskqueue malloc type.
      Some interfaces such as memguard(9) use the short description to
      identify malloc types, so duplicates should be avoided.
3)  Allow gtaskqueues to use ithreads in addition to taskqueues
    - In some cases, this can improve performance
4)  Better logging when taskqgroup_attach*() fails to set interrupt
    affinity.
5)  Do not start gtaskqueues until they're needed
6)  Have mp_ring enqueue function enter the ABDICATED rather than BUSY
    state.  This moves the TX to the gtaskq and allows processing to
    continue faster as well as make TX batching more likely.
7)  Add an ift_txd_errata function to struct if_txrx.  This allows
    drivers to inspect/modify mbufs before transmission.
8)  Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need
    checksums zeroed for checksum offload to work.  This avoids modifying
    packet data in the TX path when possible.
9)  Use ithreads for iflib I/O instead of taskqueues
10) Clean up ioctl and support async ioctl functions
11) Prefetch two cachlines from each mbuf instead of one up to 128B.  We
    often need to parse packet header info beyond 64B.
12) Fix potential memory corruption due to fence post error in
    bit_nclear() usage.
13) Improved hang detection and handling
14) If the packet is smaller than MTU, disable the TSO flags.
    This avoids extra packet parsing when not needed.
15) Move TCP header parsing inside the IS_TSO?() test.
    This avoids extra packet parsing when not needed.
16) Pass chains of mbufs that are not consumed by lro to if_input()
    rather call if_input() for each mbuf.
17) Re-arrange packet header loads to get as much work as possible done
    before a cache stall.
18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/
    IFDI_DETACH();
19) Attempt to distribute RX/TX tasks across cores more sensibly,
    especially when RX and TX share an interrupt.  RX will attempt to
    take the first threads on a core, and TX will attempt to take
    successive threads.
20) Allow iflib_softirq_alloc_generic() to request affinity to the same
    cpus an interrupt has affinity with.  This allows TX queues to
    ensure they are serviced by the socket the device is on.
21) Add new iflib sysctls to net.iflib:
    - timer_int - interval at which to run per-queue timers in ticks
    - force_busdma
22) Add new per-device iflib sysctls to dev.X.Y.iflib
    - rx_budget allows tuning the batch size on the RX path
    - watchdog_events Count of watchdog events seen since load
23) Fix error where netmap_rxq_init() could get called before
    IFDI_INIT()
24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY
    when waiting for firmware
    - After interrupts are enabled, convert all waits to sleeps
    - Eliminates e1000 software/firmware synchronization busy waits after
      startup
25) e1000: Remove special case for budget=1 in em_txrx.c
    - Premature optimization which may actually be incorrect with
      multi-segment packets
26) e1000: Split out TX interrupt rather than share an interrupt for
    RX and TX.
    - Allows better performance by keeping RX and TX paths separate
27) e1000: Separate igb from em code where suitable
    Much easier to understand separate functions and "if (is_igb)" than
    previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))"

#blamebruno

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12235
2017-09-13 01:18:42 +00:00
Sean Bruno
95246abb21 IFLIB updates
- unconditionally enable BUS_DMA on non-x86 architectures
- speed up rxd zeroing via customized function
- support out of order updates to rxd's
- add prefetching to hardware descriptor rings
- only prefetch on 10G or faster hardware
- add seperate tx queue intr function
- preliminary rework of NETMAP interfaces, WIP

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Sponsored by:	Limelight Networks
2017-03-13 22:53:06 +00:00
Sean Bruno
cbf1505df9 Implement RSS queue tagging for em(4) class devices from a copy and massage
of functions from igb(4).  This enables 2 queue routing on 82574L class
devices again.
2017-01-25 23:12:03 +00:00
Sean Bruno
bd84f70044 iflib:
Add internal tracking of smp startup status to reliably figure out
     what methods are to be used to get gtaskqueue up and running.

e1000:
     Calculating this pointer gives undefined behaviour when (last == -1)
     (it is before the buffer).  The pointer is always followed.  Panics
     occurred when it points to an unmapped page.  Otherwise, the pointed-to
     garbage tends to not have the E1000_TXD_STAT_DD bit set in it, so in the
     broken case the loop was usually null and the function just returned, and
     this was acidentally correct.

Submitted by:	bde
Reported by:	Matt Macy <mmacy@nextbsd.org>
2017-01-24 16:05:42 +00:00
Sean Bruno
36fa5d5b64 Revert 312696 due to build tests. 2017-01-24 15:55:52 +00:00
Sean Bruno
562a3182f6 iflib:
Add internal tracking of smp startup status to reliably figure out
   what methods are to be used to get gtaskqueue up and running.

e1000:
   Calculating this pointer gives undefined behaviour when (last == -1)
   (it is before the buffer).  The pointer is always followed.  Panics
   occurred when it points to an unmapped page.  Otherwise, the pointed-to
   garbage tends to not have the E1000_TXD_STAT_DD bit set in it, so in the
   broken case the loop was usually null and the function just returned, and
   this was acidentally correct.

Submitted by:	bde
Reviewed by:	Matt Macy <mmacy@nextbsd.org>
2017-01-24 14:48:32 +00:00
Sean Bruno
8237905651 Restore v6 offload caps for igb(4) class devices.
Reported by:	tuxen
2017-01-11 19:29:33 +00:00
Sean Bruno
d37cece2b0 Add copywrite notices, 2-clause BSD.
Reported by:	jmallett
2017-01-10 04:50:26 +00:00
Sean Bruno
f2d6ace4a6 Migrate e1000 to the IFLIB framework:
- em(4) igb(4) and lem(4)
- deprecate the igb device from kernel configurations
- create a symbolic link in /boot/kernel from if_em.ko to if_igb.ko

Devices tested:
- 82574L
- I218-LM
- 82546GB
- 82579LM
- I350
- I217

Please report problems to freebsd-net@freebsd.org

Partial review from jhb and suggestions on how to *not* brick folks who
originally would have lost their igbX device.

Submitted by:	mmacy@nextbsd.org
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Limelight Networks and Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8299
2017-01-10 03:23:22 +00:00