Commit Graph

6 Commits

Author SHA1 Message Date
Hans Petter Selasky
e936121d31 Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets
  according to the hash type and flowid. This prevents exhaustion of
  the LRO entries due to too many connections at the same time.
  Testing using a larger number of higher bandwidth TCP connections
  showed that the incoming ACK packet aggregation rate increased from
  ~1.3:1 to almost 3:1. Another test showed that for a number of TCP
  connections greater than 16 per hardware receive ring, where 8 TCP
  connections was the LRO active entry limit, there was a significant
  improvement in throughput due to being able to fully aggregate more
  than 8 TCP stream. For very few very high bandwidth TCP streams, the
  optimizing LRO wrapper will add CPU usage instead of reducing CPU
  usage. This is expected. Network drivers which want to use the
  optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
  of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
  "tcp_lro_flush()". Further the LRO control structure must be
  initialized using "tcp_lro_init_args()" passing a non-zero number
  into the "lro_mbufs" argument.

- Make LRO statistics 64-bit. Previously 32-bit integers were used for
  statistics which can be prone to wrap-around. Fix this while at it
  and update all SYSCTL's which expose LRO statistics.

- Ensure all data is freed when destroying a LRO control structures,
  especially leftover LRO entries.

- Reduce number of memory allocations needed when setting up a LRO
  control structure by precomputing the total amount of memory needed.

- Add own memory allocation counter for LRO.

- Bump the FreeBSD version to force recompilation of all KLDs due to
  change of the LRO control structure size.

Sponsored by:	Mellanox Technologies
Reviewed by:	gallatin, sbruno, rrs, gnn, transport
Tested by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D4914
2016-01-19 15:33:28 +00:00
Navdeep Parhar
7127e6acf0 Merge r254336 from user/np/cxl_tuning.
Add a last-modified timestamp to each LRO entry and provide an interface
to flush all inactive entries.  Drivers decide when to flush and what
the inactivity threshold should be.

Network drivers that process an rx queue to completion can enter a
livelock type situation when the rate at which packets are received
reaches equilibrium with the rate at which the rx thread is processing
them.  When this happens the final LRO flush (normally when the rx
routine is done) does not occur.  Pure ACKs and segments with total
payload < 64K can get stuck in an LRO entry.  Symptoms are that TCP
tx-mostly connections' performance falls off a cliff during heavy,
unrelated rx on the interface.

Flushing only inactive LRO entries works better than any of these
alternates that I tried:
- don't LRO pure ACKs
- flush _all_ LRO entries periodically (every 'x' microseconds or every
  'y' descriptors)
- stop rx processing in the driver periodically and schedule remaining
  work for later.

Reviewed by:	andre
2013-08-28 23:00:34 +00:00
Bjoern A. Zeeb
62b5b6ecd0 MFp4 bz_ipv6_fast:
Significantly update tcp_lro for mostly two things:
  1) introduce basic support for IPv6 without extension headers.
  2) try hard to also get the incremental checksum updates right,
     especially also in the IPv4 case for the IP and TCP header.

  Move variables around for better locality, factor things out into
  functions, allow checksum updates to be compiled out, ...

  Leave a few comments on further things to look at in the future,
  though that is not the full list.

  Update drivers with appropriate #includes as needed for IPv6 data
  type in LRO.

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-24 23:03:23 +00:00
Bjoern A. Zeeb
27f190a3ca Switch to a standard 2 clause BSD license (from bsd-style-copyright).
Approved by:	Myricom Inc. (gallatin)
Approved by:	Intel Corporation (jfv)
2012-05-15 13:23:44 +00:00
John Baldwin
79e955ed63 Trim extra spaces before tabs. 2011-01-07 21:40:34 +00:00
Jack F Vogel
6c5087a818 Add generic TCP LOR into netinet 2008-06-11 22:12:50 +00:00