Limiters and sanity checks for TCP MSS (maximum segment size)
resource exhaustion attacks.

For network link optimization TCP can adjust its MSS and thus
packet size according to the observed path MTU.  This is done
dynamically based on feedback from the remote host and network
components along the packet path.  This feedback can be
abused to feign an extremely low path MTU.

The resource exhaustion works in two ways:

 o During TCP connection setup the advertised local MSS is
   exchanged between the endpoints.  The remote endpoint can
   set this arbitrarily low (except for a minimum MTU of 64
   octets enforced in the BSD code).  When the local host is
   sending data it is forced to send many small IP packets
   instead of a large one.

   For example, instead of the normal TCP payload size of 1448
   octets it forces a TCP payload size of 12 octets (MTU 64),
   and thus we have a roughly 120-fold increase in workload and
   packets (1448 / 12 is about 120).  On fast links this quickly
   saturates the local CPU and may also hit the pps processing
   limits of network components along the path.

   This type of attack is particularly effective for servers
   where the attacker can download large files (WWW and FTP).

   We mitigate it by enforcing a minimum MSS settable by sysctl
   net.inet.tcp.minmss, defaulting to 256 octets.

 o The local host is receiving data on a TCP connection from
   the remote host.  The local host has no control over the
   packet size the remote host is sending.  The remote host
   may choose to do what is described in the first attack and
   send the data in packets with a TCP payload of as little as
   one byte.  For each packet the tcp_input() function will
   be entered, the packet is processed and a sowakeup() is
   signalled to the connected process.

   For example an attack with 2 Mbit/s gives 4716 packets per
   second and the same amount of sowakeup()s to the process
   (and context switches).

   This type of attack is particularly effective for servers
   where the attacker can upload large amounts of data.
   Normally this is the case with WWW servers where large POSTs
   can be made.

   We mitigate this by calculating the average payload per
   packet over each second.  If it goes below
   'net.inet.tcp.minmss' and the pps rate is above
   'net.inet.tcp.minmssoverload' (defaulting to 1000), this
   particular TCP connection is reset and dropped.
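
The two mitigations described above can be sketched in userspace C.
This is a simplified illustration, not the kernel code from this
commit: the names clamp_mss, packets_for, rx_window and rx_account are
hypothetical, the constants mirror the stated sysctl defaults, and the
time source is an abstract tick counter passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the sysctl defaults named in the text. */
#define MINMSS          256   /* net.inet.tcp.minmss */
#define MINMSSOVERLOAD  1000  /* net.inet.tcp.minmssoverload */

/* Sending side: round a peer's MSS offer up to the configured floor. */
static inline unsigned clamp_mss(unsigned offer, unsigned minmss)
{
    return offer < minmss ? minmss : offer;
}

/* Packets needed to carry `bytes` of data at a given payload size. */
static inline unsigned long packets_for(unsigned long bytes, unsigned payload)
{
    return (bytes + payload - 1) / payload;
}

/* Receiving side: per-connection counters for a one-second window. */
struct rx_window {
    uint64_t window_end;  /* tick at which the current window closes */
    uint64_t packets;     /* data packets seen in this window */
    uint64_t bytes;       /* bytes seen in this window */
};

/*
 * Account for one data packet of `len` bytes arriving at tick `now`,
 * with `hz` ticks per second.  Returns true when the connection looks
 * like a small-packet flood: more than MINMSSOVERLOAD packets in the
 * window and an average size below MINMSS.
 */
static inline bool rx_account(struct rx_window *w, uint64_t now,
                              uint64_t hz, uint64_t len)
{
    if (now >= w->window_end) {
        /* Window expired: start a new one with this packet. */
        w->window_end = now + hz;
        w->packets = 1;
        w->bytes = len;
        return false;
    }
    w->packets++;
    w->bytes += len;
    return w->packets > MINMSSOVERLOAD &&
           w->bytes / w->packets < MINMSS;
}
```

With these helpers, packets_for(1448, 12) is 121 while
packets_for(1448, clamp_mss(12, MINMSS)) is 6, which shows why the
floor on the sending side removes most of the amplification.  On the
receiving side, a caller would invoke rx_account() once per data
segment and reset the connection when it returns true.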

MITRE CVE:	CAN-2004-0002
Reviewed by:	sam (mentor)
MFC after:	1 day
Andre Oppermann 2004-01-08 17:40:07 +00:00
parent 2aae662479
commit 53369ac9bb
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=124258
8 changed files with 200 additions and 4 deletions

View File

@ -409,7 +409,8 @@ icmp_input(m, off)
* notice that the MTU has changed and adapt accordingly.
* If no new MTU was suggested, then we guess a new one
* less than the current value. If the new MTU is
- * unreasonably small, then we don't update the MTU value.
+ * unreasonably small (defined by sysctl tcp_minmss), then
+ * we don't update the MTU value.
*
* XXX: All this should be done in tcp_mtudisc() because
* the way we do it now, everyone can send us bogus ICMP
@ -431,7 +432,8 @@ icmp_input(m, off)
if (!mtu)
mtu = ip_next_mtu(mtu, 1);
- if (mtu >= 256 + sizeof(struct tcpiphdr))
+ if (mtu >= max(296, (tcp_minmss +
+ sizeof(struct tcpiphdr))))
tcp_hc_updatemtu(&inc, mtu);
#ifdef DEBUG_MTUDISC

View File

@ -105,11 +105,29 @@ struct tcphdr {
/*
* Default maximum segment size for TCP.
- * With an IP MSS of 576, this is 536,
+ * With an IP MTU of 576, this is 536,
* but 512 is probably more convenient.
* This should be defined as MIN(512, IP_MSS - sizeof (struct tcpiphdr)).
*/
#define TCP_MSS 512
/*
* TCP_MINMSS is defined to be 256 which is fine for the smallest
* link MTU (296 bytes, SLIP interface) in the Internet.
* However it is very unlikely to come across such low MTU interfaces
* these days (anno dato 2003).
* Probably it can be set to 512 without ill effects. But we play safe.
* See tcp_subr.c tcp_minmss SYSCTL declaration for more comments.
* Setting this to "0" disables the minmss check.
*/
#define TCP_MINMSS 256
/*
* TCP_MINMSSOVERLOAD is defined to be 1000 which should cover any type
* of interactive TCP session.
* See tcp_subr.c tcp_minmssoverload SYSCTL declaration and tcp_input.c
* for more comments.
* Setting this to "0" disables the minmssoverload check.
*/
#define TCP_MINMSSOVERLOAD 1000
/*
* Default maximum segment size for TCP6.

View File

@ -918,6 +918,61 @@ tcp_input(m, off0)
if (tp->t_state == TCPS_LISTEN)
panic("tcp_input: TCPS_LISTEN");
/*
* This is the second part of the MSS DoS prevention code (after
* minmss on the sending side) and it deals with too many too small
* tcp packets in a too short timeframe (1 second).
*
* For every full second we count the number of received packets
* and bytes. If we get a lot of packets per second for this connection
* (tcp_minmssoverload) we take a closer look at it and compute the
* average packet size for the past second. If that is less than
* tcp_minmss we get too many packets with very small payload which
* is not good and burdens our system (and every packet generates
* a wakeup to the process connected to our socket). We can reasonable
* expect this to be small packet DoS attack to exhaust our CPU
* cycles.
*
* Care has to be taken for the minimum packet overload value. This
* value defines the minimum number of packets per second before we
* start to worry. This must not be too low to avoid killing for
* example interactive connections with many small packets like
* telnet or SSH.
*
* Setting either tcp_minmssoverload or tcp_minmss to "0" disables
* this check.
*
* Account for packet if payload packet, skip over ACK, etc.
*/
if (tcp_minmss && tcp_minmssoverload &&
tp->t_state == TCPS_ESTABLISHED && tlen > 0) {
if (tp->rcv_second > ticks) {
tp->rcv_pps++;
tp->rcv_byps += tlen + off;
if (tp->rcv_pps > tcp_minmssoverload) {
if ((tp->rcv_byps / tp->rcv_pps) < tcp_minmss) {
printf("too many small tcp packets from "
"%s:%u, av. %lubyte/packet, "
"dropping connection\n",
#ifdef INET6
isipv6 ?
ip6_sprintf(&inp->inp_inc.inc6_faddr) :
#endif
inet_ntoa(inp->inp_inc.inc_faddr),
inp->inp_inc.inc_fport,
tp->rcv_byps / tp->rcv_pps);
tp = tcp_drop(tp, ECONNRESET);
tcpstat.tcps_minmssdrops++;
goto drop;
}
}
} else {
tp->rcv_second = ticks + hz;
tp->rcv_pps = 1;
tp->rcv_byps = tlen + off;
}
}
/*
* Segment received on connection.
* Reset idle time and keep-alive timer.
@ -2690,6 +2745,11 @@ tcp_mss(tp, offer)
/* FALLTHROUGH */
default:
/*
* Prevent DoS attack with too small MSS. Round up
* to at least minmss.
*/
offer = max(offer, tcp_minmss);
/*
* Sanity check: make sure that maxopd will be large
* enough to allow some data on segments even if the

View File

@ -918,6 +918,61 @@ tcp_input(m, off0)
if (tp->t_state == TCPS_LISTEN)
panic("tcp_input: TCPS_LISTEN");
/*
* This is the second part of the MSS DoS prevention code (after
* minmss on the sending side) and it deals with too many too small
* tcp packets in a too short timeframe (1 second).
*
* For every full second we count the number of received packets
* and bytes. If we get a lot of packets per second for this connection
* (tcp_minmssoverload) we take a closer look at it and compute the
* average packet size for the past second. If that is less than
* tcp_minmss we get too many packets with very small payload which
* is not good and burdens our system (and every packet generates
* a wakeup to the process connected to our socket). We can reasonable
* expect this to be small packet DoS attack to exhaust our CPU
* cycles.
*
* Care has to be taken for the minimum packet overload value. This
* value defines the minimum number of packets per second before we
* start to worry. This must not be too low to avoid killing for
* example interactive connections with many small packets like
* telnet or SSH.
*
* Setting either tcp_minmssoverload or tcp_minmss to "0" disables
* this check.
*
* Account for packet if payload packet, skip over ACK, etc.
*/
if (tcp_minmss && tcp_minmssoverload &&
tp->t_state == TCPS_ESTABLISHED && tlen > 0) {
if (tp->rcv_second > ticks) {
tp->rcv_pps++;
tp->rcv_byps += tlen + off;
if (tp->rcv_pps > tcp_minmssoverload) {
if ((tp->rcv_byps / tp->rcv_pps) < tcp_minmss) {
printf("too many small tcp packets from "
"%s:%u, av. %lubyte/packet, "
"dropping connection\n",
#ifdef INET6
isipv6 ?
ip6_sprintf(&inp->inp_inc.inc6_faddr) :
#endif
inet_ntoa(inp->inp_inc.inc_faddr),
inp->inp_inc.inc_fport,
tp->rcv_byps / tp->rcv_pps);
tp = tcp_drop(tp, ECONNRESET);
tcpstat.tcps_minmssdrops++;
goto drop;
}
}
} else {
tp->rcv_second = ticks + hz;
tp->rcv_pps = 1;
tp->rcv_byps = tlen + off;
}
}
/*
* Segment received on connection.
* Reset idle time and keep-alive timer.
@ -2690,6 +2745,11 @@ tcp_mss(tp, offer)
/* FALLTHROUGH */
default:
/*
* Prevent DoS attack with too small MSS. Round up
* to at least minmss.
*/
offer = max(offer, tcp_minmss);
/*
* Sanity check: make sure that maxopd will be large
* enough to allow some data on segments even if the

View File

@ -121,6 +121,30 @@ SYSCTL_INT(_net_inet_tcp, TCPCTL_V6MSSDFLT, v6mssdflt,
"Default TCP Maximum Segment Size for IPv6");
#endif
/*
* Minimum MSS we accept and use. This prevents DoS attacks where
* we are forced to a ridiculous low MSS like 20 and send hundreds
* of packets instead of one. The effect scales with the available
* bandwidth and quickly saturates the CPU and network interface
* with packet generation and sending. Set to zero to disable MINMSS
* checking. This setting prevents us from sending too small packets.
*/
int tcp_minmss = TCP_MINMSS;
SYSCTL_INT(_net_inet_tcp, OID_AUTO, minmss, CTLFLAG_RW,
&tcp_minmss , 0, "Minmum TCP Maximum Segment Size");
/*
* Number of TCP segments per second we accept from remote host
* before we start to calculate average segment size. If average
* segment size drops below the minimum TCP MSS we assume a DoS
* attack and reset+drop the connection. Care has to be taken not to
* set this value too small to not kill interactive type connections
* (telnet, SSH) which send many small packets.
*/
int tcp_minmssoverload = TCP_MINMSSOVERLOAD;
SYSCTL_INT(_net_inet_tcp, OID_AUTO, minmssoverload, CTLFLAG_RW,
&tcp_minmssoverload , 0, "Number of TCP Segments per Second allowed to"
"be under the MINMSS Size");
#if 0
static int tcp_rttdflt = TCPTV_SRTTDFLT / PR_SLOWHZ;
SYSCTL_INT(_net_inet_tcp, TCPCTL_RTTDFLT, rttdflt, CTLFLAG_RW,

View File

@ -121,6 +121,30 @@ SYSCTL_INT(_net_inet_tcp, TCPCTL_V6MSSDFLT, v6mssdflt,
"Default TCP Maximum Segment Size for IPv6");
#endif
/*
* Minimum MSS we accept and use. This prevents DoS attacks where
* we are forced to a ridiculous low MSS like 20 and send hundreds
* of packets instead of one. The effect scales with the available
* bandwidth and quickly saturates the CPU and network interface
* with packet generation and sending. Set to zero to disable MINMSS
* checking. This setting prevents us from sending too small packets.
*/
int tcp_minmss = TCP_MINMSS;
SYSCTL_INT(_net_inet_tcp, OID_AUTO, minmss, CTLFLAG_RW,
&tcp_minmss , 0, "Minmum TCP Maximum Segment Size");
/*
* Number of TCP segments per second we accept from remote host
* before we start to calculate average segment size. If average
* segment size drops below the minimum TCP MSS we assume a DoS
* attack and reset+drop the connection. Care has to be taken not to
* set this value too small to not kill interactive type connections
* (telnet, SSH) which send many small packets.
*/
int tcp_minmssoverload = TCP_MINMSSOVERLOAD;
SYSCTL_INT(_net_inet_tcp, OID_AUTO, minmssoverload, CTLFLAG_RW,
&tcp_minmssoverload , 0, "Number of TCP Segments per Second allowed to"
"be under the MINMSS Size");
#if 0
static int tcp_rttdflt = TCPTV_SRTTDFLT / PR_SLOWHZ;
SYSCTL_INT(_net_inet_tcp, TCPCTL_RTTDFLT, rttdflt, CTLFLAG_RW,

View File

@ -1102,7 +1102,8 @@ tcp_ctloutput(so, sopt)
if (error)
break;
- if (optval > 0 && optval <= tp->t_maxseg)
+ if (optval > 0 && optval <= tp->t_maxseg &&
+ optval + 40 >= tcp_minmss)
tp->t_maxseg = optval;
else
error = EINVAL;
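
The tightened option check in the hunk above can be read as a small
predicate.  The following is a minimal sketch, assuming a hypothetical
helper check_maxseg() whose parameters stand in for optval,
tp->t_maxseg and tcp_minmss.  The constant 40 is taken verbatim from
the diff and is presumably the 20-byte IP header plus the 20-byte TCP
header without options; the commit does not say so explicitly.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the condition in the hunk above.
 * Returns 0 if a user-supplied segment size is acceptable, or EINVAL
 * otherwise.  cur_maxseg stands in for tp->t_maxseg and minmss for
 * the tcp_minmss sysctl variable.
 */
static inline int check_maxseg(int optval, int cur_maxseg, int minmss)
{
    if (optval > 0 && optval <= cur_maxseg && optval + 40 >= minmss)
        return 0;
    return EINVAL;
}
```

With the default minmss of 256, a request for 216 is the smallest
value that still passes (216 + 40 = 256), while a request for 100 is
rejected with EINVAL.  Setting minmss to 0 makes the new clause always
true, so only the two original conditions remain in effect.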

View File

@ -179,6 +179,10 @@ struct tcpcb {
tcp_seq snd_recover_prev; /* snd_recover prior to retransmit */
u_long t_badrxtwin; /* window for retransmit recovery */
u_char snd_limited; /* segments limited transmitted */
/* anti DoS counters */
u_long rcv_second; /* start of interval second */
u_long rcv_pps; /* received packets per second */
u_long rcv_byps; /* received bytes per second */
};
#define IN_FASTRECOVERY(tp) (tp->t_flags & TF_FASTRECOVERY)
@ -332,6 +336,7 @@ struct tcpstat {
u_long tcps_connects; /* connections established */
u_long tcps_drops; /* connections dropped */
u_long tcps_conndrops; /* embryonic connections dropped */
u_long tcps_minmssdrops; /* average minmss too low drops */
u_long tcps_closed; /* conn. closed (includes drops) */
u_long tcps_segstimed; /* segs where we tried to get rtt */
u_long tcps_rttupdated; /* times we succeeded */
@ -473,6 +478,8 @@ extern struct inpcbhead tcb; /* head of queue of active tcpcb's */
extern struct inpcbinfo tcbinfo;
extern struct tcpstat tcpstat; /* tcp statistics */
extern int tcp_mssdflt; /* XXX */
extern int tcp_minmss;
extern int tcp_minmssoverload;
extern int tcp_delack_enabled;
extern int tcp_do_newreno;
extern int path_mtu_discovery;