Add support for the experimental Internet-Draft "TCP Alternative Backoff with
ECN (ABE)" proposal to the New Reno congestion control algorithm module.
ABE reduces the amount of congestion window reduction in response to
ECN-signalled congestion relative to the loss-inferred congestion response.

More details about ABE can be found in the Internet-Draft:
https://tools.ietf.org/html/draft-ietf-tcpm-alternativebackoff-ecn

The implementation introduces four new sysctls:

- net.inet.tcp.cc.abe defaults to 0 (disabled) and can be set to non-zero to
  enable ABE for ECN-enabled TCP connections.

- net.inet.tcp.cc.newreno.beta and net.inet.tcp.cc.newreno.beta_ecn set the
  multiplicative window decrease factor, specified as a percentage, applied to
  the congestion window in response to a loss-based or ECN-based congestion
  signal respectively. They default to the values specified in the draft, i.e.
  beta=50 and beta_ecn=80.

- net.inet.tcp.cc.abe_frlossreduce defaults to 0 (disabled) and can be set to
  non-zero to enable the use of standard beta (50% by default) when repairing
  loss during an ECN-signalled congestion recovery episode. It enables a more
  conservative congestion response and is provided for the purposes of
  experimentation as a result of some discussion at IETF 100 in Singapore.
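
As a rough illustration of the arithmetic behind these sysctls, the sketch
below scales a congestion window by a percentage factor, rounds it down to
whole segments and keeps it at two segments or more, which is the shape of the
computation in newreno_cong_signal(). The helper names abe_reduce() and
abe_frloss_rescale() are hypothetical and exist only for this example. The
rescale helper assumes the intent of abe_frlossreduce is to turn a threshold
derived with beta_ecn into the one standard beta would have produced.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: scale cwnd (bytes) by "factor" percent, rounded down
 * to a whole number of mss-sized segments and never below two segments.
 * Assumes factor is in the 1..100 range so the result fits in 32 bits.
 */
static uint32_t
abe_reduce(uint32_t cwnd, uint32_t mss, uint32_t factor)
{
	uint64_t segs;

	assert(mss > 0);
	segs = ((uint64_t)cwnd * factor) / (100ULL * mss);
	if (segs < 2)
		segs = 2;
	return ((uint32_t)(segs * mss));
}

/*
 * Hypothetical helper for the abe_frlossreduce case: rescale a threshold
 * that was computed with beta_ecn to the value beta would have given
 * (assumption: result = ssthresh * beta / beta_ecn).
 */
static uint32_t
abe_frloss_rescale(uint32_t ssthresh, uint32_t beta, uint32_t beta_ecn)
{
	assert(beta_ecn > 0);
	return ((uint32_t)(((uint64_t)ssthresh * beta) / beta_ecn));
}
```

With a 100-segment window and a 1460-byte MSS, abe_reduce() yields 50 segments
for beta=50 and 80 segments for beta_ecn=80, and rescaling the latter with
abe_frloss_rescale() brings it back to 50 segments.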

The values of beta and beta_ecn can also be set per-connection by way of the
TCP_CCALGOOPT TCP-level socket option and the new CC_NEWRENO_BETA or
CC_NEWRENO_BETA_ECN CC algo sub-options.
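
A minimal userland sketch of the per-connection interface follows. It assumes
the option is passed as a struct cc_newreno_opts at the IPPROTO_TCP level. The
structure and sub-option values are copied locally from the cc_newreno.h added
by this commit so the example stands alone; real code would include that
header instead. The helper name set_newreno_beta() and the local range check
are assumptions for illustration, and the fallback branch only exists so the
sketch builds on systems without TCP_CCALGOOPT.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <errno.h>
#include <stdint.h>

/* Local copies of the definitions in netinet/cc/cc_newreno.h. */
struct cc_newreno_opts {
	int		name;
	uint32_t	val;
};
#define	CC_NEWRENO_BETA		1
#define	CC_NEWRENO_BETA_ECN	2

/*
 * Hypothetical helper: set beta or beta_ecn (a percentage) on socket s.
 * Returns 0 on success, or -1 with errno set.
 */
static int
set_newreno_beta(int s, int name, uint32_t pct)
{
	struct cc_newreno_opts opt;

	/* Mirror the 1..100 range accepted by the sysctls. */
	if (pct == 0 || pct > 100) {
		errno = EINVAL;
		return (-1);
	}
	opt.name = name;
	opt.val = pct;
#ifdef TCP_CCALGOOPT
	return (setsockopt(s, IPPROTO_TCP, TCP_CCALGOOPT, &opt, sizeof(opt)));
#else
	/* No TCP_CCALGOOPT on this platform; report it as unsupported. */
	(void)s;
	(void)opt;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

A caller would use set_newreno_beta(s, CC_NEWRENO_BETA_ECN, 80) on a connected
TCP socket. Per the diff, that sub-option is refused with EACCES while
net.inet.tcp.cc.abe is 0.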

Submitted by:	Tom Jones <tj@enoti.me>
Tested by:	Tom Jones <tj@enoti.me>, Grenville Armitage <garmitage@swin.edu.au>
Relnotes:	Yes
Differential Revision:	https://reviews.freebsd.org/D11616
Lawrence Stewart 2018-03-19 16:37:47 +00:00
parent 2c58a19393
commit 370efe5ac8
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=331214
6 changed files with 282 additions and 11 deletions

@ -30,17 +30,69 @@
.\"
.\" $FreeBSD$
.\"
-.Dd September 15, 2011
.Dd March 19, 2018
.Dt CC_NEWRENO 4
.Os
.Sh NAME
.Nm cc_newreno
.Nd NewReno Congestion Control Algorithm
.Sh SYNOPSIS
.In netinet/cc/cc_newreno.h
.Sh DESCRIPTION
The NewReno congestion control algorithm is the default for TCP.
Details about the algorithm can be found in RFC5681.
.Sh Socket Options
The
.Nm
module supports a number of socket options under TCP_CCALGOOPT (refer to
.Xr tcp 4
and
.Xr mod_cc 9
for details) which can be set with
.Xr setsockopt 2
and tested with
.Xr getsockopt 2 .
The
.Nm
socket options use this structure defined in
<sys/netinet/cc/cc_newreno.h>:
.Bd -literal
struct cc_newreno_opts {
	int		name;
	uint32_t	val;
};
.Ed
.Bl -tag -width ".Va CC_NEWRENO_BETA_ECN"
.It Va CC_NEWRENO_BETA
Multiplicative window decrease factor, specified as a percentage, applied to
the congestion window in response to a congestion signal per: cwnd = (cwnd *
CC_NEWRENO_BETA) / 100.
Default is 50.
.It Va CC_NEWRENO_BETA_ECN
Multiplicative window decrease factor, specified as a percentage, applied to
the congestion window in response to an ECN congestion signal when
.Va net.inet.tcp.cc.abe=1
per: cwnd = (cwnd * CC_NEWRENO_BETA_ECN) / 100.
Default is 80.
.El
.Sh MIB Variables
-There are currently no tunable MIB variables.
The algorithm exposes these variables in the
.Va net.inet.tcp.cc.newreno
branch of the
.Xr sysctl 3
MIB:
.Bl -tag -width ".Va beta_ecn"
.It Va beta
Multiplicative window decrease factor, specified as a percentage, applied to
the congestion window in response to a congestion signal per: cwnd = (cwnd *
beta) / 100.
Default is 50.
.It Va beta_ecn
Multiplicative window decrease factor, specified as a percentage, applied to
the congestion window in response to an ECN congestion signal when
.Va net.inet.tcp.cc.abe=1
per: cwnd = (cwnd * beta_ecn) / 100.
Default is 80.
.El
.Sh SEE ALSO
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
@ -50,6 +102,24 @@ There are currently no tunable MIB variables.
.Xr mod_cc 4 ,
.Xr tcp 4 ,
.Xr mod_cc 9
.Rs
.%A "Mark Allman"
.%A "Vern Paxson"
.%A "Ethan Blanton"
.%T "TCP Congestion Control"
.%O "RFC 5681"
.Re
.Rs
.%A "Naeem Khademi"
.%A "Michael Welzl"
.%A "Grenville Armitage"
.%A "Gorry Fairhurst"
.%T "TCP Alternative Backoff with ECN (ABE)"
.%R "internet draft"
.%D "February 2018"
.%N "draft-ietf-tcpm-alternativebackoff-ecn"
.%O "work in progress"
.Re
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
@ -78,5 +148,8 @@ congestion control module was written by
and
.An David Hayes Aq Mt david.hayes@ieee.org .
.Pp
Support for TCP ABE was added by
.An Tom Jones Aq Mt tj@enoti.me .
.Pp
This manual page was written by
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org .

@ -30,7 +30,7 @@
.\"
.\" $FreeBSD$
.\"
-.Dd January 21, 2016
.Dd March 19, 2018
.Dt MOD_CC 4
.Os
.Sh NAME
@ -73,7 +73,7 @@ The framework exposes the following variables in the
branch of the
.Xr sysctl 3
MIB:
-.Bl -tag -width ".Va available"
.Bl -tag -width ".Va abe_frlossreduce"
.It Va available
Read-only list of currently available congestion control algorithms by name.
.It Va algorithm
@ -83,6 +83,15 @@ When attempting to change the default algorithm, this variable should be set to
one of the names listed by the
.Va net.inet.tcp.cc.available
MIB variable.
.It Va abe
Enable support for draft-ietf-tcpm-alternativebackoff-ecn,
which alters the window decrease factor applied to the congestion window in
response to an ECN congestion signal.
Refer to individual congestion control man pages to determine if they implement
support for ABE and for configuration details.
.It Va abe_frlossreduce
If non-zero, apply standard beta instead of ABE-beta during ECN-signalled
congestion recovery episodes if loss also needs to be repaired.
.El
.Sh SEE ALSO
.Xr cc_cdg 4 ,

@ -327,3 +327,14 @@ SYSCTL_PROC(_net_inet_tcp_cc, OID_AUTO, algorithm,
SYSCTL_PROC(_net_inet_tcp_cc, OID_AUTO, available, CTLTYPE_STRING|CTLFLAG_RD,
NULL, 0, cc_list_available, "A",
"List available congestion control algorithms");
VNET_DEFINE(int, cc_do_abe) = 0;
SYSCTL_INT(_net_inet_tcp_cc, OID_AUTO, abe, CTLFLAG_VNET | CTLFLAG_RW,
&VNET_NAME(cc_do_abe), 0,
"Enable draft-ietf-tcpm-alternativebackoff-ecn (TCP Alternative Backoff with ECN)");
VNET_DEFINE(int, cc_abe_frlossreduce) = 0;
SYSCTL_INT(_net_inet_tcp_cc, OID_AUTO, abe_frlossreduce, CTLFLAG_VNET | CTLFLAG_RW,
&VNET_NAME(cc_abe_frlossreduce), 0,
"Apply standard beta instead of ABE-beta during ECN-signalled congestion "
"recovery episodes if loss also needs to be repaired");

@ -64,6 +64,12 @@ extern struct cc_algo newreno_cc_algo;
VNET_DECLARE(struct cc_algo *, default_cc_ptr);
#define V_default_cc_ptr VNET(default_cc_ptr)
VNET_DECLARE(int, cc_do_abe);
#define V_cc_do_abe VNET(cc_do_abe)
VNET_DECLARE(int, cc_abe_frlossreduce);
#define V_cc_abe_frlossreduce VNET(cc_abe_frlossreduce)
/* Define the new net.inet.tcp.cc sysctl tree. */
SYSCTL_DECL(_net_inet_tcp_cc);

@ -3,7 +3,7 @@
*
* Copyright (c) 1982, 1986, 1988, 1990, 1993, 1994, 1995
* The Regents of the University of California.
-* Copyright (c) 2007-2008,2010
* Copyright (c) 2007-2008,2010,2014
* Swinburne University of Technology, Melbourne, Australia.
* Copyright (c) 2009-2010 Lawrence Stewart <lstewart@freebsd.org>
* Copyright (c) 2010 The FreeBSD Foundation
@ -48,6 +48,11 @@
* University Research Program Fund at Community Foundation Silicon Valley.
* More details are available at:
* http://caia.swin.edu.au/urp/newtcp/
*
* Dec 2014 garmitage@swin.edu.au
* Borrowed code fragments from cc_cdg.c to add modifiable beta
* via sysctls.
*
*/
#include <sys/cdefs.h>
@ -69,20 +74,54 @@ __FBSDID("$FreeBSD$");
#include <netinet/tcp_var.h>
#include <netinet/cc/cc.h>
#include <netinet/cc/cc_module.h>
#include <netinet/cc/cc_newreno.h>
static MALLOC_DEFINE(M_NEWRENO, "newreno data",
"newreno beta values");
#define CAST_PTR_INT(X) (*((int*)(X)))
static int newreno_cb_init(struct cc_var *ccv);
static void newreno_cb_destroy(struct cc_var *ccv);
static void newreno_ack_received(struct cc_var *ccv, uint16_t type);
static void newreno_after_idle(struct cc_var *ccv);
static void newreno_cong_signal(struct cc_var *ccv, uint32_t type);
static void newreno_post_recovery(struct cc_var *ccv);
static int newreno_ctl_output(struct cc_var *ccv, struct sockopt *sopt, void *buf);
static VNET_DEFINE(uint32_t, newreno_beta) = 50;
static VNET_DEFINE(uint32_t, newreno_beta_ecn) = 80;
#define V_newreno_beta VNET(newreno_beta)
#define V_newreno_beta_ecn VNET(newreno_beta_ecn)
struct cc_algo newreno_cc_algo = {
	.name = "newreno",
	.cb_init = newreno_cb_init,
	.cb_destroy = newreno_cb_destroy,
	.ack_received = newreno_ack_received,
	.after_idle = newreno_after_idle,
	.cong_signal = newreno_cong_signal,
	.post_recovery = newreno_post_recovery,
	.ctl_output = newreno_ctl_output,
};
struct newreno {
	uint32_t beta;
	uint32_t beta_ecn;
};
static int
newreno_cb_init(struct cc_var *ccv)
{
	struct newreno *nreno;

	nreno = malloc(sizeof(struct newreno), M_NEWRENO, M_NOWAIT|M_ZERO);
	if (nreno != NULL) {
		nreno->beta = V_newreno_beta;
		nreno->beta_ecn = V_newreno_beta_ecn;
	}
	/*
	 * Publish the per-connection state. If the allocation failed,
	 * cc_data stays NULL and the sysctl default beta is used instead.
	 */
	ccv->cc_data = nreno;
	return (0);
}
static void
newreno_cb_destroy(struct cc_var *ccv)
{
	/* Release the state allocated in newreno_cb_init(). */
	free(ccv->cc_data, M_NEWRENO);
}
static void
newreno_ack_received(struct cc_var *ccv, uint16_t type)
{
@ -184,27 +223,48 @@ newreno_after_idle(struct cc_var *ccv)
static void
newreno_cong_signal(struct cc_var *ccv, uint32_t type)
{
-u_int win;
struct newreno *nreno;
uint32_t cwin, factor;
u_int mss;
factor = V_newreno_beta;
nreno = ccv->cc_data;
if (nreno != NULL) {
if (V_cc_do_abe)
factor = (type == CC_ECN ? nreno->beta_ecn: nreno->beta);
else
factor = nreno->beta;
}
cwin = CCV(ccv, snd_cwnd);
mss = CCV(ccv, t_maxseg);
/* Catch algos which mistakenly leak private signal types. */
KASSERT((type & CC_SIGPRIVMASK) == 0,
("%s: congestion signal type 0x%08x is private\n", __func__, type));
-win = max(CCV(ccv, snd_cwnd) / 2 / CCV(ccv, t_maxseg), 2) *
-CCV(ccv, t_maxseg);
cwin = max(((uint64_t)cwin * (uint64_t)factor) / (100ULL * (uint64_t)mss),
2) * mss;
switch (type) {
case CC_NDUPACK:
if (!IN_FASTRECOVERY(CCV(ccv, t_flags))) {
if (IN_CONGRECOVERY(CCV(ccv, t_flags)) &&
V_cc_do_abe && V_cc_abe_frlossreduce &&
nreno != NULL && nreno->beta_ecn > 0) {
/* Rescale the ECN-derived ssthresh to the standard beta. */
CCV(ccv, snd_ssthresh) =
((uint64_t)CCV(ccv, snd_ssthresh) *
(uint64_t)nreno->beta) /
(uint64_t)nreno->beta_ecn;
}
if (!IN_CONGRECOVERY(CCV(ccv, t_flags)))
-CCV(ccv, snd_ssthresh) = win;
CCV(ccv, snd_ssthresh) = cwin;
ENTER_RECOVERY(CCV(ccv, t_flags));
}
break;
case CC_ECN:
if (!IN_CONGRECOVERY(CCV(ccv, t_flags))) {
-CCV(ccv, snd_ssthresh) = win;
-CCV(ccv, snd_cwnd) = win;
CCV(ccv, snd_ssthresh) = cwin;
CCV(ccv, snd_cwnd) = cwin;
ENTER_CONGRECOVERY(CCV(ccv, t_flags));
}
break;
@ -242,5 +302,75 @@ newreno_post_recovery(struct cc_var *ccv)
}
}
static int
newreno_ctl_output(struct cc_var *ccv, struct sockopt *sopt, void *buf)
{
	struct newreno *nreno;
	struct cc_newreno_opts *opt;

	if (sopt->sopt_valsize != sizeof(struct cc_newreno_opts))
		return (EMSGSIZE);

	nreno = ccv->cc_data;
	/* No per-connection state to read or update. */
	if (nreno == NULL)
		return (ENOMEM);
	opt = buf;

	switch (sopt->sopt_dir) {
	case SOPT_SET:
		/* Accept the same 1-100 range as the sysctl handler. */
		if (opt->val == 0 || opt->val > 100)
			return (EINVAL);
		switch (opt->name) {
		case CC_NEWRENO_BETA:
			nreno->beta = opt->val;
			break;
		case CC_NEWRENO_BETA_ECN:
			if (!V_cc_do_abe)
				return (EACCES);
			nreno->beta_ecn = opt->val;
			break;
		default:
			return (ENOPROTOOPT);
		}
		break;	/* Do not fall through into SOPT_GET. */
	case SOPT_GET:
		switch (opt->name) {
		case CC_NEWRENO_BETA:
			opt->val = nreno->beta;
			break;
		case CC_NEWRENO_BETA_ECN:
			opt->val = nreno->beta_ecn;
			break;
		default:
			return (ENOPROTOOPT);
		}
		break;	/* Do not fall through into the EINVAL default. */
	default:
		return (EINVAL);
	}
	return (0);
}
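static int
newreno_beta_handler(SYSCTL_HANDLER_ARGS)
{
	int error;
	uint32_t new;

	/*
	 * Let sysctl_handle_int() copy the request into a local value and
	 * validate that; req->newptr must not be dereferenced directly.
	 */
	new = *(uint32_t *)arg1;
	error = sysctl_handle_int(oidp, &new, 0, req);
	if (error == 0 && req->newptr != NULL) {
		if (arg1 == &VNET_NAME(newreno_beta_ecn) && !V_cc_do_abe)
			error = EACCES;
		else if (new == 0 || new > 100)
			error = EINVAL;
		else
			*(uint32_t *)arg1 = new;
	}
	return (error);
}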
SYSCTL_DECL(_net_inet_tcp_cc_newreno);
SYSCTL_NODE(_net_inet_tcp_cc, OID_AUTO, newreno, CTLFLAG_RW, NULL,
"New Reno related settings");
SYSCTL_PROC(_net_inet_tcp_cc_newreno, OID_AUTO, beta,
CTLFLAG_VNET | CTLTYPE_UINT | CTLFLAG_RW,
&VNET_NAME(newreno_beta), 3, &newreno_beta_handler, "IU",
"New Reno beta, specified as number between 1 and 100");
SYSCTL_PROC(_net_inet_tcp_cc_newreno, OID_AUTO, beta_ecn,
CTLFLAG_VNET | CTLTYPE_UINT | CTLFLAG_RW,
&VNET_NAME(newreno_beta_ecn), 3, &newreno_beta_handler, "IU",
"New Reno beta ecn, specified as number between 1 and 100");
DECLARE_CC_MODULE(newreno, &newreno_cc_algo);

@ -0,0 +1,42 @@
/*-
* Copyright (c) 2017 Tom Jones <tj@enoti.me>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* $FreeBSD$
*/
#ifndef _CC_NEWRENO_H
#define _CC_NEWRENO_H
#define CCALGONAME_NEWRENO "newreno"
struct cc_newreno_opts {
int name;
uint32_t val;
};
#define CC_NEWRENO_BETA 1
#define CC_NEWRENO_BETA_ECN 2
#endif /* _CC_NEWRENO_H */