Teach the loopback interface about checksum generation and validation
avoidance:

- Enable setting the RXCSUM and TXCSUM flags for loopback interfaces;
  set both by default.
- When RXCSUM is set, flag packets sent over the loopback interface as
  having checked and valid IP, UDP, TCP checksums so that higher
  protocol layers won't check them.
- Always clear the CSUM_{IP,UDP,TCP} checksum-required flags on transmit,
  as they will only have been set as a result of TXCSUM being enabled.
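
As a rough illustration of the rules listed above, the following standalone
C sketch models the flag handling on the loopback transmit path. The
X_-prefixed constants, struct pkt, and lo_mark_checksums() are hypothetical
stand-ins invented for this example; their values are not the kernel's.

```c
#include <stdint.h>

/*
 * Stand-in flag values for illustration only. The real CSUM_* and
 * IFCAP_* definitions live in the kernel headers and differ from these.
 */
enum {
	X_CSUM_IP         = 0x001,	/* "IP checksum still required" */
	X_CSUM_TCP        = 0x002,	/* "TCP checksum still required" */
	X_CSUM_UDP        = 0x004,	/* "UDP checksum still required" */
	X_CSUM_IP_CHECKED = 0x010,
	X_CSUM_IP_VALID   = 0x020,
	X_CSUM_DATA_VALID = 0x040,
	X_CSUM_PSEUDO_HDR = 0x080,
	X_CSUM_SCTP_VALID = 0x100
};

enum {
	X_IFCAP_RXCSUM = 0x1,
	X_IFCAP_TXCSUM = 0x2
};

/* Minimal stand-in for the checksum fields of a packet header. */
struct pkt {
	uint32_t csum_flags;
	uint32_t csum_data;
};

/*
 * Model of the per-packet handling: with RXCSUM enabled, mark the
 * packet as already checked and valid; in every case, clear the
 * "checksum required" flags that TXCSUM caused the stack to set.
 */
static void
lo_mark_checksums(uint32_t capenable, struct pkt *p)
{
	if (capenable & X_IFCAP_RXCSUM) {
		p->csum_data = 0xffff;
		p->csum_flags = X_CSUM_DATA_VALID | X_CSUM_PSEUDO_HDR |
		    X_CSUM_IP_CHECKED | X_CSUM_IP_VALID | X_CSUM_SCTP_VALID;
	}
	p->csum_flags &= ~(uint32_t)(X_CSUM_IP | X_CSUM_TCP | X_CSUM_UDP);
}
```

Calling lo_mark_checksums() with both capabilities set leaves only the
"valid" flags on the packet; with neither set, it simply strips the
"required" flags and leaves validation to the upper layers.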

This is done only for packets explicitly sent over the loopback, not
simulated loopback via if_simloop() due to !SIMPLEX interfaces, etc.

Note that enabling TXCSUM but not RXCSUM will cause packets to be dropped,
as checksums won't be generated but will still be validated.
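
The four possible flag combinations can be reasoned about with a small
truth-table sketch. lo_packet_survives() is a hypothetical helper written
for this note, not a kernel function; it assumes a packet is dropped exactly
when no checksum was generated but the receiver still validates it.

```c
#include <stdbool.h>

/*
 * Model: TXCSUM enabled means the sender skips checksum generation;
 * RXCSUM enabled means the receiver skips checksum validation.
 * A packet is lost only if validation runs on a missing checksum.
 */
static bool
lo_packet_survives(bool txcsum, bool rxcsum)
{
	bool checksum_present = !txcsum;
	bool checksum_validated = !rxcsum;

	return (checksum_present || !checksum_validated);
}
```

Under this model only the TXCSUM-without-RXCSUM combination fails, which
matches the warning above.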

Kris reports significant throughput improvements in loopback benchmarking
with TCP and UDP:

	RXCSUM 	RXCSUM+TXCSUM
TCP	15%	37%
UDP	10%	74%

Update man page.

Reviewed by:	sam
Tested by:	kris
MFC after:	1 week
Robert Watson 2009-03-15 20:17:44 +00:00
parent b41a7787e1
commit 3cb73e3d8b
2 changed files with 42 additions and 12 deletions

@@ -1,5 +1,7 @@
 .\" Copyright (c) 1983, 1991, 1993
-.\" The Regents of the University of California. All rights reserved.
+.\" The Regents of the University of California.
+.\" Copyright (c) 2009 Robert N. M. Watson
+.\" All rights reserved.
 .\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
@@ -9,10 +11,6 @@
 .\" 2. Redistributions in binary form must reproduce the above copyright
 .\" notice, this list of conditions and the following disclaimer in the
 .\" documentation and/or other materials provided with the distribution.
-.\" 3. All advertising materials mentioning features or use of this software
-.\" must display the following acknowledgement:
-.\" This product includes software developed by the University of
-.\" California, Berkeley and its contributors.
 .\" 4. Neither the name of the University nor the names of its contributors
 .\" may be used to endorse or promote products derived from this software
 .\" without specific prior written permission.
@@ -32,7 +30,7 @@
 .\" @(#)lo.4 8.1 (Berkeley) 6/5/93
 .\" $FreeBSD$
 .\"
-.Dd June 5, 1993
+.Dd March 15, 2009
 .Dt LO 4
 .Os
 .Sh NAME
@@ -58,6 +56,20 @@ The loopback should
 .Em never
 be configured first unless no hardware
 interfaces exist.
+.Pp
+If the transmit checksum offload capability flag is enabled on a loopback
+interface, checksums will not be generated by IP, UDP, or TCP for packets
+sent on the interface.
+.Pp
+If the receive checksum offload capability flag is enabled on a loopback
+interface, checksums will not be validated by IP, UDP, or TCP for packets
+received on the interface.
+.Pp
+By default, both receive and transmit checksum flags will be enabled, in
+order to avoid the overhead of checksumming for local communication where
+data corruption is unlikely.
+If transmit checksum generation is disabled, then validation should also be
+disabled in order to avoid packets being dropped due to invalid checksums.
 .Sh DIAGNOSTICS
 .Bl -diag
 .It lo%d: can't handle af%d.
@@ -74,8 +86,5 @@ The
 .Nm
 device appeared in
 .Bx 4.2 .
-.Sh BUGS
-Previous versions of the system enabled the loopback interface
-automatically, using a nonstandard Internet address (127.1).
-Use of that address is now discouraged; a reserved host address
-for the local network should be used instead.
+The current checksum generation and validation avoidance policy appeared in
+.Fx 8.0 .

@@ -138,6 +138,8 @@ lo_clone_create(struct if_clone *ifc, int unit, caddr_t params)
 	ifp->if_ioctl = loioctl;
 	ifp->if_output = looutput;
 	ifp->if_snd.ifq_maxlen = ifqmaxlen;
+	ifp->if_hwassist = ifp->if_capabilities = ifp->if_capenable =
+	    IFCAP_HWCSUM;
 	if_attach(ifp);
 	bpfattach(ifp, DLT_NULL, sizeof(u_int32_t));
 	if (V_loif == NULL)
@@ -212,6 +214,13 @@ looutput(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst,
 #if 1 /* XXX */
 	switch (dst->sa_family) {
 	case AF_INET:
+		if (ifp->if_capenable & IFCAP_RXCSUM) {
+			m->m_pkthdr.csum_data = 0xffff;
+			m->m_pkthdr.csum_flags = CSUM_DATA_VALID |
+			    CSUM_PSEUDO_HDR | CSUM_IP_CHECKED |
+			    CSUM_IP_VALID | CSUM_SCTP_VALID;
+		}
+		m->m_pkthdr.csum_flags &= ~(CSUM_IP | CSUM_TCP | CSUM_UDP);
 	case AF_INET6:
 	case AF_IPX:
 	case AF_APPLETALK:
@@ -348,7 +357,7 @@ loioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
 	struct ifaddr *ifa;
 	struct ifreq *ifr = (struct ifreq *)data;
-	int error = 0;
+	int error = 0, mask;
 
 	switch (cmd) {
 	case SIOCSIFADDR:
@@ -391,6 +400,18 @@ loioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 	case SIOCSIFFLAGS:
 		break;
 
+	case SIOCSIFCAP:
+		mask = ifp->if_capenable ^ ifr->ifr_reqcap;
+		if ((mask & IFCAP_RXCSUM) != 0)
+			ifp->if_capenable ^= IFCAP_RXCSUM;
+		if ((mask & IFCAP_TXCSUM) != 0)
+			ifp->if_capenable ^= IFCAP_TXCSUM;
+		if (ifp->if_capenable & IFCAP_TXCSUM)
+			ifp->if_hwassist = CSUM_IP | CSUM_TCP | CSUM_UDP;
+		else
+			ifp->if_hwassist = 0;
+		break;
+
 	default:
 		error = EINVAL;
 	}
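
The SIOCSIFCAP handling in the final hunk can be exercised outside the
kernel with a small model. This is only a sketch: struct lo_caps,
lo_set_caps(), and the X_-prefixed constants are hypothetical names with
made-up values, standing in for the ifnet fields and kernel flags used in
the diff.

```c
#include <stdint.h>

/* Stand-in values for illustration; not the kernel's definitions. */
enum {
	X_IFCAP_RXCSUM = 0x1,
	X_IFCAP_TXCSUM = 0x2
};
enum {
	X_CSUM_IP  = 0x1,
	X_CSUM_TCP = 0x2,
	X_CSUM_UDP = 0x4
};

/* Minimal stand-in for the two ifnet fields the ioctl touches. */
struct lo_caps {
	uint32_t capenable;
	uint32_t hwassist;
};

/*
 * Toggle only the RXCSUM and TXCSUM bits that differ from the request,
 * ignore every other requested bit, then derive hwassist from TXCSUM.
 */
static void
lo_set_caps(struct lo_caps *lc, uint32_t reqcap)
{
	uint32_t mask = lc->capenable ^ reqcap;

	if ((mask & X_IFCAP_RXCSUM) != 0)
		lc->capenable ^= X_IFCAP_RXCSUM;
	if ((mask & X_IFCAP_TXCSUM) != 0)
		lc->capenable ^= X_IFCAP_TXCSUM;
	if (lc->capenable & X_IFCAP_TXCSUM)
		lc->hwassist = X_CSUM_IP | X_CSUM_TCP | X_CSUM_UDP;
	else
		lc->hwassist = 0;
}
```

Because only the two checksum bits are ever toggled, a request carrying
unrelated capability bits leaves them unset in this model, and hwassist
always tracks the TXCSUM bit.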