Commit Graph

90 Commits

Author SHA1 Message Date
Jack F Vogel
ab5d036272 Sync with Intel internal source:
shared code update and small changes in core required
Add support for new i210/i211 devices
Improve queue calculation based on mac type

MFC after:	5 days
2012-07-05 20:26:57 +00:00
Kevin Lo
4d8b94d278 Initialize "error" to zero when it's declared in em_setup_receive_ring() 2012-05-11 03:15:22 +00:00
John Baldwin
d8a8648379 Fix a few issues with transmit handling in em(4) and igb(4):
- Do not define the foo_start() methods or set if_start in the ifnet if
  multiq transmit is enabled.  Also, set if_transmit and if_qflush before
  ether_ifattach rather than after when multiq transmit is enabled.  This
  helps to ensure that the drivers never try to mix different transmit
  methods.
- Properly restart transmit during resume.  igb(4) was not restarting it
  at all, and em(4) was restarting even if the link was down and was
  calling the wrong method if multiq transmit was enabled.
- Remove all the 'more' handling for transmit completions.  Transmit
  completion processing does not have a processing limit, so it always
  runs to completion and never has more work to do when it returns.
  Instead, the previous code was returning 'true' anytime there were
  packets in the queue that weren't still in the process of being
  transmitted.  The effect was that the driver would continuously
  reschedule a task to process TX completions in effect running at 100%
  CPU polling the hardware until it finished transmitting all of the
  packets in the ring.  Now it will just wait for the next TX completion
  interrupt.
- Restart packet transmission when the link becomes active.
- Fix the MSI-X queue interrupt handlers to restart packet transmission if
  there are pending packets in the relevant software queue (IFQ or buf_ring)
  after processing TX completions.  This is the root cause for the OACTIVE
  hangs as if the MSI-X queue handler drained all the pending packets from
  the TX ring, nothing would ever restart it.  As such, remove some
  previously-added workarounds to reschedule a task to poll the TX ring
  anytime OACTIVE was set.

Tested by:	sbruno
Reviewed by:	jfv
MFC after:	1 week
2012-03-30 19:54:48 +00:00
Luigi Rizzo
64ae02c365 A bunch of netmap fixes:
USERSPACE:
1. add support for devices with different number of rx and tx queues;

2. add better support for zero-copy operation, adding an extra field
   to the netmap ring to indicate how many buffers we have already processed
   but not yet released (with help from Eddie Kohler);

3. The two changes above unfortunately require an API change, so while
   at it add a version field and some spares to the ioctl() argument
   to help detect mismatches.

4. update the manual page for the two changes above;

5. update sample applications in tools/tools/netmap

KERNEL:

1. simplify the internal structures moving the global wait queues
   to the 'struct netmap_adapter';

2. simplify the functions that map kring<->nic ring indexes

3. normalize device-specific code, which helps maintenance;

4. start exploring the impact of micro-optimizations (prefetch etc.)
   in the ixgbe driver.
   Using 'legacy' descriptors on the tx ring and prefetching slots gives
   about a 20% speedup at 900 MHz. Another 7-10% would come from removing
   the explicit calls to bus_dmamap* in the core (they are effectively
   NOPs in this case, but it takes an expensive load of the per-buffer
   dma maps to figure out that they are all NULL).

   Rx performance not investigated.

I am postponing the MFC so I can import a few more improvements
before merging.
2012-02-27 19:05:01 +00:00
Luigi Rizzo
5644ccec61 (This commit only touches code within the DEV_NETMAP blocks)
Introduce some functions to map NIC ring indexes into netmap ring
indexes and vice versa. This way we can implement the bound
checks only in one place (and hopefully in a correct way).

In passing, make the code and comments more uniform across the
various drivers.
2012-02-15 23:13:29 +00:00
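
The idea in the commit above (translate between NIC ring indexes and netmap ring indexes in one place, with the bounds checks there too) can be sketched roughly as follows. This is a user-space illustration only: `struct ring_map`, `nic_to_netmap_idx()` and `netmap_to_nic_idx()` are hypothetical names, and the assumption that the two rings differ by a fixed rotation is made for the example rather than taken from the commit.

```c
#include <stdint.h>

/*
 * Hypothetical description of how two equally sized rings line up.
 * Not the actual netmap structures; the fixed rotation is an assumption.
 */
struct ring_map {
	uint32_t num_slots;	/* slots in both rings, must be > 0 */
	uint32_t offset;	/* netmap index of NIC slot 0, < num_slots */
};

/* Single place for the sanity checks shared by both directions. */
static int
ring_map_valid(const struct ring_map *m, uint32_t idx)
{
	return (m->num_slots > 0 && m->offset < m->num_slots &&
	    idx < m->num_slots);
}

/* Map a NIC ring index to a netmap ring index; -1 if out of range. */
static int
nic_to_netmap_idx(const struct ring_map *m, uint32_t nic_idx)
{
	if (!ring_map_valid(m, nic_idx))
		return (-1);
	/* 64-bit sum so the addition cannot wrap before the modulo. */
	return ((int)(((uint64_t)nic_idx + m->offset) % m->num_slots));
}

/* Inverse mapping: netmap ring index back to NIC ring index. */
static int
netmap_to_nic_idx(const struct ring_map *m, uint32_t nm_idx)
{
	if (!ring_map_valid(m, nm_idx))
		return (-1);
	return ((int)(((uint64_t)nm_idx + m->num_slots - m->offset) %
	    m->num_slots));
}
```

Keeping the range check in one helper means a driver cannot forget it in one direction and apply it in the other.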
Luigi Rizzo
ce9f43b467 clear the pointer after freeing the mbuf. Without that, we
risk a double free if the subsequent mbuf allocation fails.
This bug is not netmap-related and was introduced in rev. 228387
2012-01-12 17:30:44 +00:00
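
A minimal user-space sketch of the pattern the commit above describes: clear the pointer right after freeing, so that a failed re-allocation cannot lead to a second free later. The `struct rx_slot` type, the function names, and the injectable allocator are all hypothetical stand-ins; the real driver works with mbufs, not `malloc()`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one receive buffer slot. */
struct rx_slot {
	void *buf;
};

/* Allocator passed in so a caller can simulate allocation failure. */
typedef void *(*alloc_fn)(size_t);

/* Replace the slot's buffer; returns 0 on success, -1 on failure. */
static int
rx_slot_refresh(struct rx_slot *slot, size_t len, alloc_fn alloc)
{
	free(slot->buf);
	slot->buf = NULL;	/* without this, a failed alloc leaves a stale pointer */
	slot->buf = alloc(len);
	return (slot->buf == NULL ? -1 : 0);
}

/* Teardown path; safe after a failed refresh because buf is NULL. */
static void
rx_slot_teardown(struct rx_slot *slot)
{
	free(slot->buf);	/* free(NULL) is a no-op */
	slot->buf = NULL;
}

/* Test helper that always fails, to exercise the error path. */
static void *
failing_alloc(size_t len)
{
	(void)len;
	return (NULL);
}
```

If the assignment to NULL were omitted, `rx_slot_teardown()` after a failed refresh would free the same buffer twice.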
Luigi Rizzo
467bd5c2cb fix the initialization of the rings when netmap is used,
to adapt it to the changes in rev. 228387.
Now the code is similar to the one used in other drivers.
Not applicable to stable/9 and stable/8
2012-01-12 17:28:00 +00:00
Luigi Rizzo
6e10c8b8c5 small code cleanup in preparation for future modifications in
the memory allocator used by netmap. No functional change,
two small bug fixes:
- in if_re.c add a missing bus_dmamap_sync()
- in netmap.c comment out a spurious free() in an error handling block
2012-01-10 19:57:23 +00:00
Kevin Lo
5bbe0c5357 ether_ifattach() sets if_mtu to ETHERMTU, so don't bother setting it again
Reviewed by:	yongari
2012-01-07 09:41:57 +00:00
Robert Watson
19d52de5f4 When extracting the VLAN tag from if_em and if_lem receive descriptor
rings, copy the whole VLAN tag, not just the VLAN ID.  This fixes a
problem in which VLAN priority information was dropped when using
offloaded VLAN processing with these drivers.

Discussed with:	jfv, rrs
Sponsored by:	ADARA Networks, Inc.
MFC after:	3 days
2012-01-05 17:30:15 +00:00
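
To illustrate why copying only the VLAN ID loses priority information, as the commit above describes: in the 802.1Q tag control field the top 3 bits carry the priority (PCP), the next bit is DEI, and the low 12 bits are the VLAN ID. The helper names below are hypothetical and this is a sketch of the bit layout, not the driver code.

```c
#include <stdint.h>

/* 802.1Q tag control information: PCP (3 bits) | DEI (1 bit) | VID (12 bits). */
#define VLAN_VID_MASK	0x0FFFu
#define VLAN_PCP_SHIFT	13

static uint16_t
vlan_vid(uint16_t tci)
{
	return (tci & VLAN_VID_MASK);
}

static uint8_t
vlan_pcp(uint16_t tci)
{
	return ((uint8_t)(tci >> VLAN_PCP_SHIFT));
}

/* Lossy extraction: keeps only the VLAN ID, so the priority bits are dropped. */
static uint16_t
extract_vid_only(uint16_t desc_tag)
{
	return (desc_tag & VLAN_VID_MASK);
}

/* Lossless extraction: hands the whole tag up, priority included. */
static uint16_t
extract_full_tag(uint16_t desc_tag)
{
	return (desc_tag);
}
```

For a tag with priority 5 and VLAN ID 100, the masked value still reports VLAN 100 but its priority reads back as 0.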
Jack F Vogel
62aca36544 Last change still had an issue, one more time... 2011-12-11 18:46:14 +00:00
Jack F Vogel
133f283b45 Correct LINT build issues in the ioctl code. 2011-12-11 09:37:25 +00:00
Jack F Vogel
96b38ade36 Fix NETMAP code problem in the build. 2011-12-10 18:00:53 +00:00
Jack F Vogel
fd33ce416e Part 2 of 2 New deltas for the 1G drivers.
There have still been intermittent problems with apparent TX
hangs for some customers. These have been problematic to reproduce
but I believe these changes will address them. Testing on a number
of fronts has been positive.

EM: there is an important 'chicken bit' fix for 82574 in the shared
code; this is supported in the core here.
    - The TX path has been tightened up to improve performance. In
      particular UDP with jumbo frames was having problems, and the
      changes here have improved that.
    - OACTIVE has been used more carefully on the theory that some
      hangs may be due to a problem in this interaction
    - Problems with the RX init code: the "lazy" allocation and
      ring initialization have been found to cause problems in some
      newer client systems, and as it really is not that big a win
      (it's not in a hot path) it seems best to remove it.
    - HWTSO was broken when VLAN HWTAGGING or HWFILTER is used, I
      found this was due to an error in setting up the descriptors
      in em_xmit.

IGB:
    - TX is also improved here. With multiqueue I realized it's very
      important to handle OACTIVE only under the CORE lock so there
      are no races between the queues.
    - Flow Control handling was broken in a couple of ways; I have changed
      and I hope improved that in this delta.
    - UDP also had a problem in the TX path here; it was changed to
      improve that.
    - On some hardware, with the driver static, a weird stray interrupt
      seems to sometimes fire and cause a panic in the RX mbuf refresh
      code. This is addressed by setting interrupts late in the init
      path, and also to set all interrupts bits off at the start of that.
2011-12-10 07:08:52 +00:00
Luigi Rizzo
579a6e3c4e add netmap support for "em", "lem", "igb" and "re".
On my hardware, "em" in netmap mode does about 1.388 Mpps
on one card (on an Asus motherboard), and 1.1 Mpps on another
card (PCIe bus). Both seem to be NIC-limited, because
I have the same rate even with the CPU running at 150 MHz.

On the "re" driver the tx throughput is around 420-450 Kpps
on various (8111C and the like) chipsets. On the Rx side
performance seems much better, and I can receive the full
load generated by the "em" cards.

"igb" is untested as I don't have the hardware.
2011-12-05 15:33:13 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
John Baldwin
b37e0f6e9a - Add read-only sysctls for all of the tunables supported by the igb and
em drivers.
- Make the per-instance 'enable_aim' sysctl truly per-instance by having it
  change a per-instance variable (which is used to control AIM) rather
  than having all of the per-instance sysctls operate on a single global
  variable.

Reviewed by:	jfv (earlier version)
MFC after:	1 week
2011-06-29 16:20:52 +00:00
Jack F Vogel
3cec53b8a7 Add an initialization to the error variable, without
this there is a rare return path that bogusly appears
to fail when it should not.  Also white space correction.

Thanks to Arnaud Lacombe for noticing the problem.
2011-05-05 17:28:45 +00:00
Jack F Vogel
62d8da8c3a Fix to an error condition case, when an mbuf chain
gets defragged due to a mapping failure the header
pointers will be invalidated and can result in a
TSO or other failure down the line. So, when the
remapping occurs force a retry through the offload
calculation code. Thanks to Andrew Boyer for discovering
this and cooking up the fix!!
2011-04-01 20:24:51 +00:00
Jack F Vogel
e61e0b91af Change the refresh_mbuf logic slightly, add an inline
to calculate the outstanding descriptors that need to be
refreshed at any time, and use THAT in rxeof to determine
if refreshing needs to be done. Also change the local_timer
to simply fire off the appropriate interrupt rather than
schedule a tasklet; it's simpler.

MFC in two weeks
2011-04-01 18:48:31 +00:00
John Baldwin
3b0a4aef96 Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.
2011-03-23 13:10:15 +00:00
Jack F Vogel
1fd3c44f77 This delta updates the em driver to version 7.2.2 which has
been undergoing test for some weeks. This improves the RX
mbuf handling to avoid system hang due to depletion. Thanks
to all those who have been testing the code, and to Beezar
Liu for the design changes.

Next the igb driver is updated for similar RX changes, but
also to add new features support for our upcoming i350 family
of adapters.

MFC after a week
2011-03-18 18:54:00 +00:00
Jack F Vogel
fbfbce8ae9 Fix for kern/152853, pullup at the wrong point
is breaking UDP. Thanks to Petr Lampa for the
patch.
2011-01-19 18:20:11 +00:00
Matthew D Fleming
5bc0787f29 Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need
to rely on the format string.
2011-01-18 21:14:23 +00:00
Matthew D Fleming
8c49f18771 sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the Intel drivers.
2011-01-12 19:53:23 +00:00
Jack F Vogel
599564e633 A couple problems discovered by Andrew Boyer:
- failure code in em_xmit got mangled along the way
  and was not properly handling errors.
- local timer code had a leftover UNLOCK call that
  should be removed.

MFC after 3 days
2011-01-12 00:23:47 +00:00
Jack F Vogel
1ce42f7249 Correct build error. 2010-12-04 06:38:21 +00:00
Jack F Vogel
9d43b64dbf Small cut and paste bug in flow control string fixed.
Second, correct the discard/refresh_mbufs code to behave
more like igb, there have been panics due to discards and
this should fix them.

MFC after: 3 days
2010-12-04 01:59:58 +00:00
Jack F Vogel
12203744da The purpose of this change is to add a routine to
disable ASPM L0S and L1 LINK states on 82573, 82574,
and 82583. The theory is that this is behind certain
hangs being experienced by some customers.

Also included a small optimization in the rxeof routine
that was in my internal code.

Change the PBA size for pchlan, it was incorrect.

MFC after: 3 days
2010-11-24 22:24:07 +00:00
Jack F Vogel
e4c690b4f0 Sync the lem code up with the vlan and other fixes in em.
Delete an unneeded test from the beginning of em_xmit.
CRITICAL: shared code fix for 82574, a mutex might not be
          released, this can cause hangs.
2010-11-01 20:19:25 +00:00
Jack F Vogel
35928b338e In the data setup code for doing offloads the
ip and tcp pointers were not reset after some
pullups. In practice this led to an NFS mount
failure when using UDP reported by Kevin Lo,
thanks Kevin. Fix from yongari, thank you!
2010-10-28 00:16:54 +00:00
Jack F Vogel
7deff7f9b4 Bug fix delta to the em driver:
- Chasing down bogus watchdogs has led to an improved
  design to this handling, the hang decision takes
  place in the tx cleanup, with only a simple report
  check in local_timer. Our tests have shown no false
  watchdogs with this code.
- VLAN fixes from jhb, the shadow vfta should be per
  interface, but as global it was not. Thanks John.
- Bug fixes in the support for new PCH2 hardware.
- Thanks for all the help and feedback on the driver,
  changes to lem will be coming shortly as well.
2010-10-26 00:07:58 +00:00
Jack F Vogel
7d9119bdc4 Update code from Intel:
- Sync shared code with Intel internal
- New client chipset support added
- em driver - fixes to 82574, limit queues to 1 but use MSIX
- em driver - large changes in TX checksum offload and tso
  code, thanks to yongari.
- some small changes for watchdog issues.
- igb driver - local timer watchdog code was missing locking;
  this and a couple of other watchdog-related fixes.
- bug in rx discard found by Andrew Boyer, check for null pointer

MFC: a week
2010-09-28 00:13:15 +00:00
John Baldwin
8385f4cf94 Tweak the stats exported by the e1000 drivers:
- Add a single sysctl procedure to all three drivers to read an arbitrary
  register (the register is passed as arg2).  Use it to replace existing
  routines in igb(4) that used a separate routine for each register, and
  to add support for missing stats in em(4) and lem(4).
- Move the 'rx_overruns' and 'watchdog_timeouts' stats out of the MAC stats
  section as they are driver stats, not MAC counters.
- Simplify the code that creates per-queue stats in igb(4) to use a single
  loop and remove duplicated code.
- Properly read all 64 bits of the 'good octets received/transmitted' in
  em(4) and lem(4).
- Actually read the interrupt count registers in em(4), and drop the
  'host to card' sysctl stats from em(4) as they are not implemented in
  any of the hardware this driver supports.
- Restore several stats to em(4) that were lost in the earlier stats
  conversion including per-queue stats.
- Export several MAC stats in em(4) that were exported in igb(4) but not
  in em(4).
- Export stats in lem(4) using individual sysctls as in em(4) and igb(4).

Reviewed by:	jfv
MFC after:	1 week
2010-09-20 16:04:44 +00:00
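
The "read all 64 bits" item in the commit above concerns counters that are split across two 32-bit registers. A minimal sketch of combining the halves, assuming a hypothetical register accessor and a fake register file (the names `reg_read_fn`, `read_counter64()` and `fake_read()` are illustrative; the read order and any clear-on-read behaviour of the real hardware are not covered by the commit message and are not modelled here):

```c
#include <stdint.h>

/* Hypothetical 32-bit register accessor; ctx identifies the device. */
typedef uint32_t (*reg_read_fn)(void *ctx, uint32_t reg);

/* Combine a low/high register pair into one 64-bit counter value. */
static uint64_t
read_counter64(reg_read_fn rd, void *ctx, uint32_t reg_lo, uint32_t reg_hi)
{
	uint64_t lo = rd(ctx, reg_lo);	/* assumed: low half read first */
	uint64_t hi = rd(ctx, reg_hi);

	return ((hi << 32) | lo);
}

/* Fake register file for demonstration: ctx is an array of uint32_t. */
static uint32_t
fake_read(void *ctx, uint32_t reg)
{
	const uint32_t *regs = ctx;

	return (regs[reg]);
}
```

Reading only the low register would silently wrap the octet count every 4 GiB of traffic.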
Jack F Vogel
26c88ee828 Code correction in refresh_mbufs, just continuing
without index recalc was wrong.
2010-09-07 21:28:45 +00:00
Jack F Vogel
d9f1a5aa8e Tighten up the rx mbuf refresh code, there were some
discrepancies from the igb version which was the target.

Change the message when neither MSI nor MSIX is enabled
and a fallback to Legacy interrupts happens, the existing
message was confusing.
2010-09-07 20:13:08 +00:00
Pyun YongHyeon
dd20cce19a Do not allocate multicast array memory in the multicast filter
configuration function. On failed memory allocations, em(4)/lem(4)
called panic(9), which is not acceptable on a production box.
igb(4)/ixgb(4)/ix(4) allocated the required memory on the stack, which
consumed 768 bytes of stack memory; that looks too big.

To address these issues, allocate the multicast array memory at device
attach time and make multicast configuration succeed under any
conditions. This change also removes the excessive use of stack
memory.

Reviewed by:	jfv
2010-08-28 00:34:22 +00:00
Pyun YongHyeon
880a50b513 If em(4) failed to allocate RX buffers, do not call panic(9).
Just showing a buffer allocation error is a more appropriate
action for drivers. This should fix the occasional panic reported on
em(4) when the driver encountered a resource shortage.

Reviewed by:	jfv
2010-08-28 00:16:49 +00:00
Pyun YongHyeon
ad1917be37 Do not call voluntary panic(9) in case of if_alloc() failure.
Reviewed by:	jfv
2010-08-28 00:09:19 +00:00
Jack F Vogel
9886a800fc Fix for a panic when TX checksum offload is done and
a packet has only a header in the first mbuf, the
checksum code will dereference a pointer into the
non-existing IP header. Do a check for the size and
pullup if needed. Thanks to Michael Tuexen for this
fix.

MFC: asap - should be in 8.1 IMHO
2010-07-12 21:47:30 +00:00
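
The failure mode in the commit above is dereferencing a header that is not actually inside the first buffer of a chain. The sketch below is a user-space analogue, not the kernel fix: instead of pulling data up into the first mbuf, it checks the size and, if the header is not contiguous in the first segment, gathers the bytes into a caller-supplied scratch buffer. `struct seg` and `header_at()` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a chain of packet buffers. */
struct seg {
	const struct seg *next;
	size_t len;
	const unsigned char *data;
};

/*
 * Return a pointer to `need` contiguous bytes starting at offset `off`
 * in the chain. Points into the first segment when the bytes are all
 * there; otherwise copies them into `scratch` (at least `need` bytes).
 * Returns NULL if the chain is too short.
 */
static const unsigned char *
header_at(const struct seg *s, size_t off, size_t need,
    unsigned char *scratch)
{
	size_t copied = 0;

	/* Fast path: size check passes, no copy needed. */
	if (s != NULL && off <= s->len && need <= s->len - off)
		return (s->data + off);

	/* Skip whole segments that lie before the offset. */
	while (s != NULL && off >= s->len) {
		off -= s->len;
		s = s->next;
	}
	/* Gather the header bytes across segment boundaries. */
	while (s != NULL && copied < need) {
		size_t n = s->len - off;

		if (n > need - copied)
			n = need - copied;
		memcpy(scratch + copied, s->data + off, n);
		copied += n;
		off = 0;
		s = s->next;
	}
	return (copied == need ? scratch : NULL);
}
```

With a 14-byte Ethernet header alone in the first segment, asking for the IP header at offset 14 takes the gather path rather than reading past the end of the first buffer.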
Jack F Vogel
b7741e7a13 Two stats were duplicated, thanks to Andrew Boyer
for pointing this out.
2010-06-17 17:38:39 +00:00
George V. Neville-Neil
fdbf7e3c5e Move statistics into the sysctl tree making it easier to find
and use them.
Add previously hidden statistics, some of which include interrupt
and host/card communication counters.
2010-06-16 20:57:41 +00:00
Jack F Vogel
dfc14ce06e Changes from John Baldwin adding to last commit,
change rxeof api for poll friendliness, and
eliminate unnecessary link tasklet use. Thanks John!
2010-06-16 16:37:36 +00:00
Marius Strobl
876ab8b5e4 Fix a mismerge in r206001.
PR:		146614
Approved by:	jfv (implicit)
MFC after:	3 days
2010-05-15 19:46:16 +00:00
Jack F Vogel
46168c5453 Small changes preparing for MFC, need to conditionalize
the buf_ring_free call, and lem is missing the WOL change
put into em.
2010-05-14 22:18:34 +00:00
Jack F Vogel
beef45ff88 Address the LOD that some are seeing, put the RX lock
back in rxeof (I could see little point in taking it out),
and now release it before the stack entry.

Also, make it so the 82574 does not configure for multiqueue
when it's not used in the stack.
2010-04-28 19:22:52 +00:00
Jack F Vogel
1655af0a72 Change default WOL back to MAGIC only, having
multicast enabled causes problems in many environments.
2010-04-28 17:37:30 +00:00
Jack F Vogel
d43a118797 Add a missing fragment in the tx msix handler to invoke
another if all work is not done.

Sync the igb driver with changes suggested by yongari and
made in em, these made sense to be in both drivers.
2010-04-14 20:55:33 +00:00
Jack F Vogel
3b4e5df82c The lock move in rxeof necessitated a couple
more places to do the locking, fixes a panic.
2010-04-10 19:25:55 +00:00
Jack F Vogel
b4ab02b842 Correct broken build. 2010-04-10 07:26:51 +00:00