Add a last-modified timestamp to each LRO entry and provide an interface
to flush all inactive entries. Drivers decide when to flush and what
the inactivity threshold should be.
Network drivers that process an rx queue to completion can enter a
livelock-like situation when the rate at which packets are received
reaches equilibrium with the rate at which the rx thread is processing
them. When this happens the final LRO flush (normally when the rx
routine is done) does not occur. Pure ACKs and segments with total
payload < 64K can get stuck in an LRO entry. Symptoms are that TCP
tx-mostly connections' performance falls off a cliff during heavy,
unrelated rx on the interface.
Flushing only inactive LRO entries works better than any of these
alternatives that I tried:
- don't LRO pure ACKs
- flush _all_ LRO entries periodically (every 'x' microseconds or every
'y' descriptors)
- stop rx processing in the driver periodically and schedule remaining
work for later.
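The inactive-entry flush described above can be sketched roughly as follows. This is a simplified illustration, not the code in tcp_lro: the struct, its field names, and flush_inactive_sketch() are hypothetical, and timestamps are assumed to be monotonic microsecond counts supplied by the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an LRO entry (not the kernel struct). */
struct lro_entry_sketch {
	uint64_t last_modified_us;	/* time the last segment was merged */
	int	 active;		/* nonzero while the entry holds data */
};

/*
 * Flush every active entry that has been idle for at least threshold_us.
 * Entries that are still receiving segments are left alone so they can
 * keep aggregating. Returns the number of entries flushed.
 */
static size_t
flush_inactive_sketch(struct lro_entry_sketch *entries, size_t n,
    uint64_t now_us, uint64_t threshold_us)
{
	size_t flushed = 0;

	for (size_t i = 0; i < n; i++) {
		struct lro_entry_sketch *e = &entries[i];

		if (!e->active)
			continue;
		if (now_us - e->last_modified_us >= threshold_us) {
			/* A real driver would pass the packet up the stack here. */
			e->active = 0;
			flushed++;
		}
	}
	return (flushed);
}
```

A driver would call something like this from inside its rx loop, choosing both the call frequency and the threshold, which is the split of responsibility the commit describes.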
Reviewed by: andre
Specifically, in_cksum_hdr() returns 0 (not 0xffff) when the IPv4
checksum is correct. Without this fix, the tcp_lro code will reject
good IPv4 traffic from drivers that do not implement IPv4 header
checksum offload.
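To illustrate why 0 is the "good" result: summing a valid IPv4 header, checksum field included, in one's complement arithmetic gives 0xffff, and a routine that returns the complement of that sum therefore returns 0. The helper below is an illustrative stand-in and not the kernel's in_cksum_hdr(); the sample header bytes in the usage note are a commonly used textbook example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): one's complement checksum over
 * an IPv4 header, computed on big-endian 16-bit words. Returns the
 * complemented sum, so a header carrying a correct checksum yields 0.
 */
static uint16_t
ip_hdr_cksum_sketch(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
	/* Fold carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 b8 61 c0 a8 00 01 c0 a8 00 c7 the helper returns 0. A caller that compares the result against 0xffff would treat this valid header as corrupt.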
Sponsored by: Myricom Inc.
MFC after: 7 days
queue the packet for LRO and tell the driver to directly pass it on.
This avoids re-assembly and later re-fragmentation problems when
forwarding.
It's not the best solution but the simplest and most effective for
the moment.
Should have been done: ages ago
Discussed with and by: many
MFC after: 3 days
Significantly update tcp_lro for mostly two things:
1) introduce basic support for IPv6 without extension headers.
2) try hard to get the incremental checksum updates right,
especially in the IPv4 case for the IP and TCP headers.
Move variables around for better locality, factor things out into
functions, allow checksum updates to be compiled out, ...
Leave a few comments on further things to look at in the future,
though that is not the full list.
Update drivers with appropriate #includes as needed for IPv6 data
type in LRO.
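As a rough illustration of what an incremental checksum update involves, the sketch below applies the standard RFC 1624 identity HC' = ~(~HC + ~m + m') when one 16-bit header field changes, for example the IP total length growing as segments are merged. The function name is hypothetical and this is not the code in tcp_lro; values are treated as host-order 16-bit words for simplicity.

```c
#include <stdint.h>

/*
 * Hypothetical helper: update a one's complement checksum after a single
 * 16-bit field changes from old_val to new_val, without re-summing the
 * whole header (RFC 1624, equation 3).
 */
static uint16_t
cksum_update16_sketch(uint16_t old_cksum, uint16_t old_val, uint16_t new_val)
{
	uint32_t sum;

	sum = (uint16_t)~old_cksum;
	sum += (uint16_t)~old_val;
	sum += new_val;
	/* Fold carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}
```

For a header whose checksum is 0xb861, changing the length field from 0x0073 to 0x0574 gives 0xb360, the same value a full recomputation over the modified header produces.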
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC after: 3 days
when len is inserted back into the synthetic IP packet and cause a
multiple of 2^16 bytes of TCP "packet loss".
This improves Linux->FreeBSD netperf bandwidth by a factor of 300 in
testing on Amazon EC2.
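The start of this message is missing, so the exact trigger is not stated here. Assuming the problem is an accumulated length that no longer fits the 16-bit IP total-length field, the arithmetic behind "a multiple of 2^16 bytes" can be sketched as below. The function name is hypothetical and only models the truncation, not the LRO code.

```c
#include <stdint.h>

/*
 * Hypothetical model: an accumulated payload length is written into a
 * 16-bit field. Whatever does not fit is silently dropped, and the
 * difference is always a multiple of 65536.
 */
static uint32_t
bytes_lost_to_truncation_sketch(uint32_t accumulated_len)
{
	uint16_t stored = (uint16_t)accumulated_len;	/* what the field holds */

	return (accumulated_len - stored);
}
```

With an accumulated length of 70000 the field holds 4464 and 65536 bytes go missing; any length below 65536 loses nothing.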
Reviewed by: jfv
MFC after: 2 weeks