The receive buffer autoscaling for TCP is based on linear growth, which
is acceptable in the congestion avoidance phase, but not during slow start.
The MTU is also not taken into account. Use a method based on exponential
growth instead, which also works in slow start and is independent of the MTU.

This is joint work with rrs@.

Reviewed by: rrs@, Richard Scheffenegger
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D18375
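To make the difference between the two growth schemes concrete, the following minimal sketch counts how many resize steps each needs to reach the cap. The 16 KiB increment and the 2 MiB maximum come from the diff; the helper names and the 64 KiB starting size used in the note below are assumptions for illustration only, not part of the commit.

```c
#include <stdint.h>

/* Values from the diff: the removed 16 KiB increment and the 2 MiB cap. */
#define AUTORCVBUF_INC (16u * 1024u)
#define AUTORCVBUF_MAX (2u * 1024u * 1024u)

/* Hypothetical helper, not a kernel function. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Old scheme: add a fixed increment on every resize step. */
static unsigned steps_linear(uint32_t size)
{
    unsigned steps = 0;

    while (size < AUTORCVBUF_MAX) {
        size = min_u32(size + AUTORCVBUF_INC, AUTORCVBUF_MAX);
        steps++;
    }
    return steps;
}

/*
 * New scheme: grow by half of the current size on every resize step.
 * Precondition (assumption of this sketch): size >= 2, otherwise
 * size / 2 is 0 and the loop would never make progress.
 */
static unsigned steps_exponential(uint32_t size)
{
    unsigned steps = 0;

    while (size < AUTORCVBUF_MAX) {
        size = min_u32(size + size / 2, AUTORCVBUF_MAX);
        steps++;
    }
    return steps;
}
```

Starting from an assumed 64 KiB buffer, `steps_linear(65536)` returns 124 while `steps_exponential(65536)` returns 9, which is why the fixed increment cannot keep pace with a sender that doubles its window every RTT in slow start.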
parent bdffe3b5bf
commit 560c058683
@@ -212,11 +212,6 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, recvbuf_auto, CTLFLAG_VNET | CTLFLAG_RW,
     &VNET_NAME(tcp_do_autorcvbuf), 0,
     "Enable automatic receive buffer sizing");
 
-VNET_DEFINE(int, tcp_autorcvbuf_inc) = 16*1024;
-SYSCTL_INT(_net_inet_tcp, OID_AUTO, recvbuf_inc, CTLFLAG_VNET | CTLFLAG_RW,
-    &VNET_NAME(tcp_autorcvbuf_inc), 0,
-    "Incrementor step size of automatic receive buffer");
-
 VNET_DEFINE(int, tcp_autorcvbuf_max) = 2*1024*1024;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, recvbuf_max, CTLFLAG_VNET | CTLFLAG_RW,
     &VNET_NAME(tcp_autorcvbuf_max), 0,
@@ -1449,13 +1444,16 @@ tcp_input(struct mbuf **mp, int *offp, int proto)
  * The criteria to step up the receive buffer one notch are:
  *  1. Application has not set receive buffer size with
  *     SO_RCVBUF. Setting SO_RCVBUF clears SB_AUTOSIZE.
- *  2. the number of bytes received during the time it takes
- *     one timestamp to be reflected back to us (the RTT);
- *  3. received bytes per RTT is within seven eighth of the
- *     current socket buffer size;
- *  4. receive buffer size has not hit maximal automatic size;
+ *  2. the number of bytes received during 1/2 of an sRTT
+ *     is at least 3/8 of the current socket buffer size.
+ *  3. receive buffer size has not hit maximal automatic size;
  *
- * This algorithm does one step per RTT at most and only if
+ * If all of the criteria are met we increaset the socket buffer
+ * by a 1/2 (bounded by the max). This allows us to keep ahead
+ * of slow-start but also makes it so our peer never gets limited
+ * by our rwnd which we then open up causing a burst.
+ *
+ * This algorithm does two steps per RTT at most and only if
  * we receive a bulk stream w/o packet losses or reorderings.
  * Shrinking the buffer during idle times is not necessary as
  * it doesn't consume any memory when idle.
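The change in the fill criterion described in the comment above can be checked with a little integer arithmetic. This is a minimal sketch: the function names are hypothetical, and only the two threshold expressions are taken from the code hunk that follows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old criterion: bytes received in one RTT exceed 7/8 of the buffer. */
static bool old_threshold_met(uint32_t bytes_per_rtt, uint32_t sb_hiwat)
{
    return bytes_per_rtt > sb_hiwat / 8 * 7;
}

/* New criterion: bytes received in half an sRTT exceed 3/8 of the buffer. */
static bool new_threshold_met(uint32_t bytes_per_half_rtt, uint32_t sb_hiwat)
{
    /* Same expression as in the diff: half the buffer, then 3/4 of that. */
    return bytes_per_half_rtt > (sb_hiwat / 2) / 4 * 3;
}
```

For a 65536-byte buffer the old threshold is 57344 bytes per RTT and the new one is 24576 bytes per half sRTT. Both comparisons are strict, so the code requires more than 3/8 of the buffer even though the comment says "at least".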
@@ -1472,11 +1470,10 @@ tcp_autorcvbuf(struct mbuf *m, struct tcphdr *th, struct socket *so,
 	if (V_tcp_do_autorcvbuf && (so->so_rcv.sb_flags & SB_AUTOSIZE) &&
 	    tp->t_srtt != 0 && tp->rfbuf_ts != 0 &&
 	    TCP_TS_TO_TICKS(tcp_ts_getticks() - tp->rfbuf_ts) >
-	    (tp->t_srtt >> TCP_RTT_SHIFT)) {
-		if (tp->rfbuf_cnt > (so->so_rcv.sb_hiwat / 8 * 7) &&
+	    ((tp->t_srtt >> TCP_RTT_SHIFT)/2)) {
+		if (tp->rfbuf_cnt > ((so->so_rcv.sb_hiwat / 2)/ 4 * 3) &&
 		    so->so_rcv.sb_hiwat < V_tcp_autorcvbuf_max) {
-			newsize = min(so->so_rcv.sb_hiwat +
-			    V_tcp_autorcvbuf_inc, V_tcp_autorcvbuf_max);
+			newsize = min((so->so_rcv.sb_hiwat + (so->so_rcv.sb_hiwat/2)), V_tcp_autorcvbuf_max);
 		}
 		TCP_PROBE6(receive__autoresize, NULL, tp, m, tp, th, newsize);
 
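The new decision logic can be exercised outside the kernel with a small userland model. This is only a sketch under stated assumptions: `struct rcvbuf_model` and `autorcvbuf_step` are hypothetical names, the sRTT is assumed to be already converted to ticks, and the `V_tcp_do_autorcvbuf`, `SB_AUTOSIZE`, and `rfbuf_ts` checks from the real condition are left out.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields the hunk reads. */
struct rcvbuf_model {
    uint32_t srtt_ticks;     /* smoothed RTT in ticks (0 = not measured) */
    uint32_t elapsed_ticks;  /* ticks since the measurement window began */
    uint32_t rfbuf_cnt;      /* bytes received in the current window */
    uint32_t sb_hiwat;       /* current receive buffer limit in bytes */
    uint32_t autorcvbuf_max; /* upper bound for automatic sizing */
};

/* Returns the new buffer size, or 0 if the buffer should stay as it is. */
static uint32_t autorcvbuf_step(const struct rcvbuf_model *m)
{
    uint32_t newsize = 0;

    /* Evaluate once more than half an sRTT has passed. */
    if (m->srtt_ticks != 0 && m->elapsed_ticks > m->srtt_ticks / 2) {
        /* Grow only if over 3/8 of the buffer was used and the cap is not hit. */
        if (m->rfbuf_cnt > (m->sb_hiwat / 2) / 4 * 3 &&
            m->sb_hiwat < m->autorcvbuf_max) {
            uint32_t grown = m->sb_hiwat + m->sb_hiwat / 2;

            newsize = grown < m->autorcvbuf_max ? grown : m->autorcvbuf_max;
        }
    }
    return newsize;
}
```

Because the window is evaluated after half an sRTT rather than a full one, up to two growth steps of 1.5x each can happen per RTT, a combined factor of 2.25 that stays ahead of a sender doubling its window in slow start.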