Final commit to round out the "Five New TCP Congestion Control Algorithms for
FreeBSD" FreeBSD Foundation funded project.

- Add new man pages for the modular congestion control, Khelp and Hhook
  frameworks (cc.4, cc.9, khelp.9 and hhook.9).
- Add new man pages for each available congestion control algorithm
  (cc_chd.4, cc_cubic.4, cc_hd.4, cc_htcp.4, cc_newreno.4 and cc_vegas.4).
- Add a new man page for the Enhanced Round Trip Time (ERTT) Khelp module
  (h_ertt.4).
- Update the TCP (tcp.4) man page to mention the TCP_CONGESTION socket option,
  cross reference to cc.4 and remove references to the retired
  "net.inet.tcp.newreno" sysctl MIB variable.

In collaboration with: David Hayes <dahayes at swin edu au> and
                       Grenville Armitage <garmitage at swin edu au>
Sponsored by: FreeBSD Foundation
MFC after: 3 months
parent bcaa6ebc45
commit 29f269dc1f
@@ -69,6 +69,13 @@ MAN= aac.4 \
cardbus.4 \
carp.4 \
cas.4 \
cc.4 \
cc_chd.4 \
cc_cubic.4 \
cc_hd.4 \
cc_htcp.4 \
cc_newreno.4 \
cc_vegas.4 \
ccd.4 \
cd.4 \
cdce.4 \
@@ -131,6 +138,7 @@ MAN= aac.4 \
gif.4 \
gpib.4 \
gre.4 \
h_ertt.4 \
harp.4 \
hatm.4 \
hfa.4 \
share/man/man4/cc.4 (Normal file, 118 lines)
@@ -0,0 +1,118 @@
.\"
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" This documentation was written at the Centre for Advanced Internet
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes and
.\" Lawrence Stewart under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
.Dt CC 4
.Os
.Sh NAME
.Nm cc
.Nd Modular congestion control
.Sh DESCRIPTION
The modular congestion control framework allows the TCP implementation to
dynamically change the congestion control algorithm used by new and existing
connections.
Algorithms are identified by a unique
.Xr ascii 7
name.
Algorithm modules can be compiled into the kernel or loaded as kernel modules
using the
.Xr kld 4
facility.
.Pp
The default algorithm is NewReno, and all connections use the default unless
explicitly overridden using the TCP_CONGESTION socket option (see
.Xr tcp 4
for details).
The default can be changed using a
.Xr sysctl 3
MIB variable detailed in the
.Sx MIB Variables
section below.
.Sh MIB Variables
The framework exposes the following variables in the
.Va net.inet.tcp.cc
branch of the
.Xr sysctl 3
MIB:
.Bl -tag -width ".Va available"
.It Va available
Read-only list of currently available congestion control algorithms by name.
.El
.Bl -tag -width ".Va algorithm"
.It Va algorithm
Returns the current default congestion control algorithm when read, and changes
the default when set.
When attempting to change the default algorithm, this variable should be set to
one of the names listed by the
.Va net.inet.tcp.cc.available
MIB variable.
.El
.Sh SEE ALSO
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr tcp 4 ,
.Xr cc 9
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh HISTORY
The
.Nm
modular congestion control framework first appeared in
.Fx 9.0 .
.Pp
The framework was first released in 2007 by James Healy and Lawrence Stewart
whilst working on the NewTCP research project at Swinburne University's Centre
for Advanced Internet Architectures, Melbourne, Australia, which was made
possible in part by a grant from the Cisco University Research Program Fund at
Community Foundation Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
facility was written by
.An Lawrence Stewart Aq lstewart@FreeBSD.org ,
.An James Healy Aq jimmy@deefa.com
and
.An David Hayes Aq david.hayes@ieee.org .
.Pp
This manual page was written by
.An David Hayes Aq david.hayes@ieee.org
and
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
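To illustrate the per-connection override that cc.4 describes, here is a minimal user-space sketch in Python (not part of this commit). It assumes the platform's Python build exposes `socket.TCP_CONGESTION` and that the option value is the algorithm's ASCII name, possibly NUL-padded. The helper names and the 16-byte buffer size are hypothetical choices made for the example.

```python
import socket


def decode_algorithm_name(raw):
    """Strip any NUL padding from a TCP_CONGESTION value and decode it."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def get_congestion_algorithm(sock, bufsize=16):
    """Return the name of the congestion control algorithm used by sock.

    bufsize is an assumed upper bound on the length of an algorithm name.
    """
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, bufsize)
    return decode_algorithm_name(raw)


def set_congestion_algorithm(sock, name):
    """Ask the kernel to use the named algorithm for this connection only."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                    name.encode("ascii"))
```

A caller would normally pass one of the names reported by the net.inet.tcp.cc.available MIB variable. The system-wide default is changed through net.inet.tcp.cc.algorithm, not through this socket option.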
share/man/man4/cc_chd.4 (Normal file, 127 lines)
@@ -0,0 +1,127 @@
.\"
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" This documentation was written at the Centre for Advanced Internet
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes
.\" under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
.Dt CC_CHD 4
.Os
.Sh NAME
.Nm cc_chd
.Nd CHD Congestion Control Algorithm
.Sh DESCRIPTION
CHD enhances the HD algorithm implemented in
.Xr cc_hd 4 .
It provides tolerance to non-congestion related packet loss and improvements to
coexistence with traditional loss-based TCP flows, especially when the
bottleneck link is lightly multiplexed.
.Pp
Like HD, the algorithm aims to keep network queuing delays below a particular
threshold (queue_threshold) and decides to reduce the congestion window (cwnd)
probabilistically based on its estimate of the network queuing delay.
.Pp
It differs from HD in three key aspects:
.Bl -bullet
.It
The probability of cwnd reduction due to congestion is calculated once per round
trip time instead of each time an acknowledgement is received as done by
.Xr cc_hd 4 .
.It
Packet losses that occur while the queuing delay is less than queue_threshold
do not cause cwnd to be reduced.
.It
CHD uses a shadow window to help regain lost transmission opportunities when
competing with loss-based TCP flows.
.El
.Sh MIB Variables
The algorithm exposes the following tunable variables in the
.Va net.inet.tcp.cc.chd
branch of the
.Xr sysctl 3
MIB:
.Bl -tag -width ".Va queue_threshold"
.It Va queue_threshold
Queuing congestion threshold (qth) in ticks.
Default is 20.
.It Va pmax
Per RTT maximum backoff probability as a percentage.
Default is 50.
.It Va qmin
Minimum queuing delay threshold (qmin) in ticks.
Default is 5.
.It Va loss_fair
If 1, cwnd is adjusted using the shadow window when a congestion
related loss is detected.
Default is 1.
.It Va use_max
If 1, the maximum RTT seen within the measurement period is used as the basic
delay measurement for the algorithm, otherwise a sampled RTT measurement
is used.
Default is 1.
.El
.Sh SEE ALSO
.Xr cc 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr h_ertt 4 ,
.Xr tcp 4 ,
.Xr cc 9 ,
.Xr khelp 9
.Rs
.%A "D. A. Hayes"
.%A "G. Armitage"
.%T "Improved coexistence and loss tolerance for delay based TCP congestion control"
.%J "in 35th Annual IEEE Conference on Local Computer Networks"
.%D "October 2010"
.%P "24-31"
.Re
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh HISTORY
The
.Nm
congestion control module first appeared in
.Fx 9.0 .
.Pp
The module was first released in 2010 by David Hayes whilst working on the
NewTCP research project at Swinburne University's Centre for Advanced Internet
Architectures, Melbourne, Australia.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
congestion control module and this manual page were written by
.An David Hayes Aq david.hayes@ieee.org .
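The loss handling that cc_chd.4 describes can be sketched roughly as follows, in Python for illustration only. The rule that losses below queue_threshold leave cwnd untouched comes from the page. The halving step and the way the shadow window is folded in when loss_fair is set are assumptions made for this sketch, not a transcription of the kernel module. The function name is hypothetical.

```python
def cwnd_after_loss(cwnd, shadow_wnd, queue_delay,
                    queue_threshold=20, loss_fair=True):
    """Return the congestion window (in segments) after a packet loss.

    queue_delay and queue_threshold are in ticks, matching the
    net.inet.tcp.cc.chd.queue_threshold default of 20.
    """
    if queue_delay < queue_threshold:
        # Loss is treated as unrelated to congestion: leave cwnd alone.
        return cwnd
    # Congestion-related loss. ASSUMPTION: back off by half, starting from
    # the shadow window when loss_fair is enabled and it is the larger value.
    base = max(cwnd, shadow_wnd) if loss_fair else cwnd
    return max(base // 2, 1)
```

With a queuing delay of 10 ticks the window is unchanged. At 25 ticks the sketch halves either the shadow window or cwnd, depending on loss_fair.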
share/man/man4/cc_cubic.4 (Normal file, 114 lines)
@@ -0,0 +1,114 @@
.\"
.\" Copyright (c) 2009 Lawrence Stewart <lstewart@FreeBSD.org>
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this documentation were written at the Centre for Advanced
.\" Internet Architectures, Swinburne University, Melbourne, Australia by
.\" David Hayes under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
.Dt CC_CUBIC 4
.Os
.Sh NAME
.Nm cc_cubic
.Nd CUBIC Congestion Control Algorithm
.Sh DESCRIPTION
The CUBIC congestion control algorithm was designed to provide increased
throughput in fast and long-distance networks.
It attempts to maintain fairness when competing with legacy NewReno TCP in lower
speed scenarios where NewReno is able to operate adequately.
.Pp
The congestion window is increased as a function of the time elapsed since the
last congestion event.
During regular operation, the window increase function follows a cubic function,
with the inflection point set to be the congestion window value reached at the
last congestion event.
CUBIC also calculates an estimate of the congestion window that NewReno would
have achieved at a given time after a congestion event.
When updating the congestion window, the algorithm will choose the larger of the
calculated CUBIC and estimated NewReno windows.
.Pp
CUBIC also backs off less on congestion by changing the multiplicative decrease
factor from 1/2 (used by standard NewReno TCP) to 4/5.
.Pp
The implementation was done in a clean-room fashion, and is based on the
Internet Draft and paper referenced in the
.Sx SEE ALSO
section below.
.Sh MIB Variables
There are currently no tunable MIB variables.
.Sh SEE ALSO
.Xr cc 4 ,
.Xr cc_chd 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr tcp 4 ,
.Xr cc 9
.Rs
.%A "Sangtae Ha"
.%A "Injong Rhee"
.%A "Lisong Xu"
.%T "CUBIC for Fast Long-Distance Networks"
.%U "http://tools.ietf.org/id/draft-rhee-tcpm-cubic-02.txt"
.Re
.Rs
.%A "Sangtae Ha"
.%A "Injong Rhee"
.%A "Lisong Xu"
.%T "CUBIC: a new TCP-friendly high-speed TCP variant"
.%J "SIGOPS Oper. Syst. Rev."
.%V "42"
.%N "5"
.%D "July 2008"
.%P "64-74"
.Re
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh HISTORY
The
.Nm
congestion control module first appeared in
.Fx 9.0 .
.Pp
The module was first released in 2009 by Lawrence Stewart whilst studying at
Swinburne University's Centre for Advanced Internet Architectures, Melbourne,
Australia.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
congestion control module and this manual page were written by
.An Lawrence Stewart Aq lstewart@FreeBSD.org
and
.An David Hayes Aq david.hayes@ieee.org .
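The window update described in cc_cubic.4 can be sketched numerically as below, in floating-point Python, not as kernel code. The 4/5 decrease factor and the "larger of the two windows" rule come from the page. The scaling constant C = 0.4 and the exact form of the NewReno estimate are assumptions modelled on the referenced Internet Draft and should be checked against it. Windows are in segments and times in seconds. All function names are hypothetical.

```python
C = 0.4      # cubic scaling constant; assumed value, not stated in cc_cubic.4
BETA = 0.8   # multiplicative decrease factor (4/5), as stated in cc_cubic.4


def cubic_k(w_max):
    """Time in seconds for the cubic curve to climb back to w_max."""
    return (w_max * (1.0 - BETA) / C) ** (1.0 / 3.0)


def cubic_window(t, w_max):
    """Cubic window t seconds after the last congestion event.

    The curve starts at BETA * w_max and has its inflection point at w_max.
    """
    return C * (t - cubic_k(w_max)) ** 3 + w_max


def newreno_estimate(t, w_max, rtt):
    """Assumed estimate of the window NewReno would have reached by time t."""
    return w_max * BETA + 3.0 * (1.0 - BETA) / (1.0 + BETA) * (t / rtt)


def next_window(t, w_max, rtt):
    """Choose the larger of the CUBIC and estimated NewReno windows."""
    return max(cubic_window(t, w_max), newreno_estimate(t, w_max, rtt))
```

With a short RTT the NewReno estimate grows quickly and wins, which is the low-speed fairness case the page mentions. With a long RTT the cubic curve dominates.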
share/man/man4/cc_hd.4 (Normal file, 120 lines)
@@ -0,0 +1,120 @@
.\"
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" This documentation was written at the Centre for Advanced Internet
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes
.\" under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
.Dt CC_HD 4
.Os
.Sh NAME
.Nm cc_hd
.Nd HD Congestion Control Algorithm
.Sh DESCRIPTION
The HD congestion control algorithm is an implementation of the Hamilton
Institute's delay-based congestion control which aims to keep network queuing
delays below a particular threshold (queue_threshold).
.Pp
HD probabilistically reduces the congestion window (cwnd) based on its estimate
of the network queuing delay.
The probability of reducing cwnd is zero at qmin or less, rising to a maximum
at queue_threshold, and then back to zero at the maximum queuing delay.
.Pp
Loss-based congestion control algorithms such as NewReno probe for network
capacity by filling queues until there is a packet loss.
HD competes with loss-based congestion control algorithms by allowing its
probability of reducing cwnd to drop from a maximum at queue_threshold to be
zero at the maximum queuing delay.
This has been shown to work well when the bottleneck link is highly multiplexed.
.Sh MIB Variables
The algorithm exposes the following tunable variables in the
.Va net.inet.tcp.cc.hd
branch of the
.Xr sysctl 3
MIB:
.Bl -tag -width ".Va queue_threshold"
.It Va queue_threshold
Queuing congestion threshold (qth) in ticks.
Default is 20.
.It Va pmax
Per packet maximum backoff probability as a percentage.
Default is 5.
.It Va qmin
Minimum queuing delay threshold (qmin) in ticks.
Default is 5.
.El
.Sh SEE ALSO
.Xr cc 4 ,
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr h_ertt 4 ,
.Xr tcp 4 ,
.Xr cc 9 ,
.Xr khelp 9
.Rs
.%A "L. Budzisz"
.%A "R. Stanojevic"
.%A "R. Shorten"
.%A "F. Baker"
.%T "A strategy for fair coexistence of loss and delay-based congestion control algorithms"
.%J "IEEE Commun. Lett."
.%D "Jul 2009"
.%V "13"
.%N "7"
.%P "555-557"
.Re
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh FUTURE WORK
The Hamilton Institute have recently made some improvements to the algorithm
implemented by this module and have called it Coexistent-TCP (C-TCP).
The improvements should be evaluated and potentially incorporated into this
module.
.Sh HISTORY
The
.Nm
congestion control module first appeared in
.Fx 9.0 .
.Pp
The module was first released in 2010 by David Hayes whilst working on the
NewTCP research project at Swinburne University's Centre for Advanced Internet
Architectures, Melbourne, Australia.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
congestion control module and this manual page were written by
.An David Hayes Aq david.hayes@ieee.org .
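The backoff probability curve that cc_hd.4 describes (zero at qmin, peaking at queue_threshold, back to zero at the maximum queuing delay) might be sketched as below. The defaults for qmin, queue_threshold and pmax are the ones documented on the page. The straight-line interpolation between the three points and the q_max value of 100 ticks are assumptions for the sake of the example, and the function name is hypothetical.

```python
def backoff_probability(queue_delay, q_max=100.0,
                        qmin=5.0, queue_threshold=20.0, pmax_percent=5.0):
    """Return the probability (0..1) of reducing cwnd for one packet.

    All delays are in ticks. q_max stands for the maximum queuing delay
    observed on the path; 100 is only a placeholder value.
    """
    pmax = pmax_percent / 100.0
    if queue_delay <= qmin or queue_delay >= q_max:
        return 0.0
    if queue_delay <= queue_threshold:
        # Rising edge: 0 at qmin up to pmax at queue_threshold.
        return pmax * (queue_delay - qmin) / (queue_threshold - qmin)
    # Falling edge: pmax at queue_threshold down to 0 at q_max.
    return pmax * (q_max - queue_delay) / (q_max - queue_threshold)
```

The falling edge is what lets HD hold its ground against loss-based flows: once queues are driven well past queue_threshold, it backs off less often.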
share/man/man4/cc_htcp.4 (Normal file, 136 lines)
@@ -0,0 +1,136 @@
.\"
.\" Copyright (c) 2008 Lawrence Stewart <lstewart@FreeBSD.org>
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this documentation were written at the Centre for Advanced
.\" Internet Architectures, Swinburne University, Melbourne, Australia by
.\" David Hayes under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
.Dt CC_HTCP 4
.Os
.Sh NAME
.Nm cc_htcp
.Nd H-TCP Congestion Control Algorithm
.Sh DESCRIPTION
The H-TCP congestion control algorithm was designed to provide increased
throughput in fast and long-distance networks.
It attempts to maintain fairness when competing with legacy NewReno TCP in lower
speed scenarios where NewReno is able to operate adequately.
.Pp
The congestion window is increased as a function of the time elapsed since the
last congestion event.
The window increase algorithm operates like NewReno for the first second after a
congestion event, and then switches to a high-speed mode based on a quadratic
increase function.
.Pp
The implementation was done in a clean-room fashion, and is based on the
Internet Draft and other documents referenced in the
.Sx SEE ALSO
section below.
.Sh MIB Variables
The algorithm exposes the following tunable variables in the
.Va net.inet.tcp.cc.htcp
branch of the
.Xr sysctl 3
MIB:
.Bl -tag -width ".Va adaptive_backoff"
.It Va adaptive_backoff
Controls use of the adaptive backoff algorithm, which is designed to keep
network queues non-empty during congestion recovery episodes.
Default is 0 (disabled).
.It Va rtt_scaling
Controls use of the RTT scaling algorithm, which is designed to make congestion
window increase during congestion avoidance mode invariant with respect to RTT.
Default is 0 (disabled).
.El
.Sh SEE ALSO
.Xr cc 4 ,
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr tcp 4 ,
.Xr cc 9
.Rs
.%A "D. Leith"
.%A "R. Shorten"
.%T "H-TCP: TCP Congestion Control for High Bandwidth-Delay Product Paths"
.%U "http://tools.ietf.org/id/draft-leith-tcp-htcp-06.txt"
.Re
.Rs
.%A "D. Leith"
.%A "R. Shorten"
.%A "T. Yee"
.%T "H-TCP: A framework for congestion control in high-speed and long-distance networks"
.%B "Proc. PFLDnet"
.%D "2005"
.Re
.Rs
.%A "G. Armitage"
.%A "L. Stewart"
.%A "M. Welzl"
.%A "J. Healy"
.%T "An independent H-TCP implementation under FreeBSD 7.0: description and observed behaviour"
.%J "SIGCOMM Comput. Commun. Rev."
.%V "38"
.%N "3"
.%D "July 2008"
.%P "27-38"
.Re
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh HISTORY
The
.Nm
congestion control module first appeared in
.Fx 9.0 .
.Pp
The module was first released in 2007 by James Healy and Lawrence Stewart whilst
working on the NewTCP research project at Swinburne University's Centre for
Advanced Internet Architectures, Melbourne, Australia, which was made possible
in part by a grant from the Cisco University Research Program Fund at Community
Foundation Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
congestion control module was written by
.An James Healy Aq jimmy@deefa.com
and
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
.Pp
This manual page was written by
.An Lawrence Stewart Aq lstewart@FreeBSD.org
and
.An David Hayes Aq david.hayes@ieee.org .
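The two-phase increase that cc_htcp.4 describes (NewReno-like for the first second, quadratic afterwards) can be illustrated with a short Python sketch. The one-second switch-over is taken from the page. The coefficients of the quadratic are assumed to follow the referenced Internet Draft and should be verified against it. The sketch ignores the adaptive_backoff and rtt_scaling options, and the function name is hypothetical.

```python
DELTA_L = 1.0  # seconds of NewReno-like behaviour after a congestion event


def htcp_alpha(delta):
    """Additive increase, in segments per RTT, delta seconds after the
    last congestion event."""
    if delta <= DELTA_L:
        # Low-speed mode: grow like NewReno, one segment per RTT.
        return 1.0
    x = delta - DELTA_L
    # High-speed mode: assumed quadratic increase function.
    return 1.0 + 10.0 * x + (x / 2.0) ** 2
```

For example, two seconds after a congestion event the sketch gives an increase of 11.25 segments per RTT, compared with NewReno's constant 1.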
share/man/man4/cc_newreno.4 (Normal file, 82 lines)
@@ -0,0 +1,82 @@
.\"
.\" Copyright (c) 2009 Lawrence Stewart <lstewart@FreeBSD.org>
.\" Copyright (c) 2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this documentation were written at the Centre for Advanced
.\" Internet Architectures, Swinburne University, Melbourne, Australia by
.\" Lawrence Stewart under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
.Dt CC_NEWRENO 4
.Os
.Sh NAME
.Nm cc_newreno
.Nd NewReno Congestion Control Algorithm
.Sh DESCRIPTION
The NewReno congestion control algorithm is the default for TCP.
Details about the algorithm can be found in RFC 5681.
.Sh MIB Variables
There are currently no tunable MIB variables.
.Sh SEE ALSO
.Xr cc 4 ,
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_vegas 4 ,
.Xr tcp 4 ,
.Xr cc 9
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh HISTORY
The
.Nm
congestion control algorithm first appeared in its modular form in
.Fx 9.0 .
.Pp
The module was first released in 2007 by James Healy and Lawrence Stewart whilst
working on the NewTCP research project at Swinburne University's Centre for
Advanced Internet Architectures, Melbourne, Australia, which was made possible
in part by a grant from the Cisco University Research Program Fund at Community
Foundation Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
congestion control module was written by
.An James Healy Aq jimmy@deefa.com ,
.An Lawrence Stewart Aq lstewart@FreeBSD.org
and
.An David Hayes Aq david.hayes@ieee.org .
.Pp
This manual page was written by
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
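Since cc_newreno.4 defers to RFC 5681 for the algorithm itself, the following Python sketch shows the window arithmetic that RFC lays out: slow start, congestion avoidance, and the slow start threshold after a loss. It is an illustration in byte units with hypothetical function names, not the module's code. It omits fast recovery and NewReno's handling of partial acknowledgements.

```python
def on_ack(cwnd, ssthresh, acked_bytes, smss):
    """Return cwnd (bytes) after an ACK covering acked_bytes of new data."""
    if cwnd < ssthresh:
        # Slow start: grow by at most one segment per ACK.
        return cwnd + min(acked_bytes, smss)
    # Congestion avoidance: about one segment per round trip time,
    # never rounding the per-ACK increment down to zero.
    return cwnd + max(smss * smss // cwnd, 1)


def ssthresh_after_loss(flight_size, smss):
    """Slow start threshold after loss is detected by duplicate ACKs."""
    return max(flight_size // 2, 2 * smss)
```

Halving on loss is the factor of 1/2 that cc_cubic.4 contrasts with CUBIC's 4/5.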
share/man/man4/cc_vegas.4 (Normal file, 138 lines)
@@ -0,0 +1,138 @@
.\"
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" This documentation was written at the Centre for Advanced Internet
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes
.\" under sponsorship from the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 15, 2011
|
||||
.Dt CC_VEGAS 4
|
||||
.Os
|
||||
.Sh NAME
|
||||
.Nm cc_vegas
|
||||
.Nd Vegas Congestion Control Algorithm
|
||||
.Sh DESCRIPTION
|
||||
The Vegas congestion control algorithm uses what the authors term the actual and
|
||||
expected transmission rates to determine whether there is congestion along the
|
||||
network path, that is:
|
||||
.Pp
|
||||
.Bl -item -offset indent
|
||||
.It
|
||||
actual rate = (total data sent in a RTT) / RTT
|
||||
.It
|
||||
expected rate = cwnd / RTTmin
|
||||
.It
|
||||
diff = expected - actual
|
||||
.El
|
||||
.Pp
|
||||
where RTT is the measured instantaneous round trip time and RTTmin is the
|
||||
smallest round trip time observed during the connection.
|
||||
.Pp
|
||||
The algorithm aims to keep diff between two parameters alpha and beta, such
|
||||
that:
|
||||
.Pp
|
||||
.Bl -item -offset indent
|
||||
.It
|
||||
alpha < diff < beta
|
||||
.El
|
||||
.Pp
|
||||
If diff > beta, congestion is inferred and cwnd is decremented by one packet (or
|
||||
the maximum TCP segment size).
|
||||
If diff < alpha, then cwnd is incremented by one packet.
|
||||
Alpha and beta govern the amount of buffering along the path.
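The update rule above can be sketched in C as follows.
This is a minimal illustration under stated assumptions, not the code of the cc_vegas module: the struct, its field names and the function name are hypothetical, times are taken to be in microseconds, and the rate difference is converted into a count of segments (diff * RTTmin / mss) so that it can be compared with alpha and beta, which are expressed as numbers of buffers.
The floor of one segment on the decrement is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection inputs; the names are illustrative only. */
struct vegas_sample {
	uint64_t bytes_in_rtt;	/* data sent during the measured RTT, bytes */
	uint64_t rtt_us;	/* measured RTT, microseconds */
	uint64_t rttmin_us;	/* smallest RTT observed, microseconds */
	uint64_t cwnd;		/* congestion window, bytes */
	uint64_t mss;		/* maximum segment size, bytes */
};

/*
 * Return the new congestion window in bytes.
 * diff is scaled from a rate to a number of segments so that it can be
 * compared with alpha and beta (assumption of this sketch).
 */
static uint64_t
vegas_update(const struct vegas_sample *s, uint64_t alpha, uint64_t beta)
{
	uint64_t actual, expected, ndiff;

	assert(s->rtt_us > 0 && s->rttmin_us > 0 && s->mss > 0);
	assert(alpha > 0 && alpha < beta);

	/* Both rates are in bytes per second. */
	actual = s->bytes_in_rtt * 1000000 / s->rtt_us;
	expected = s->cwnd * 1000000 / s->rttmin_us;

	if (expected <= actual)
		ndiff = 0;
	else
		ndiff = (expected - actual) * s->rttmin_us /
		    (s->mss * 1000000);

	if (ndiff > beta && s->cwnd > s->mss)
		return (s->cwnd - s->mss);	/* congestion inferred */
	if (ndiff < alpha)
		return (s->cwnd + s->mss);	/* path is under-used */
	return (s->cwnd);			/* diff lies within [alpha, beta] */
}
```

With a 20000 byte window, 1000 byte segments and RTTmin of 100 ms, a measured RTT of 200 ms yields a diff of 10 segments, which exceeds the default beta of 3, so the window shrinks by one segment.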
|
||||
.Pp
|
||||
The implementation was done in a clean-room fashion, and is based on the
|
||||
paper referenced in the
|
||||
.Sx SEE ALSO
|
||||
section below.
|
||||
.Sh IMPLEMENTATION NOTES
|
||||
The time from the transmission of a marked packet until the receipt of an
|
||||
acknowledgement for that packet is measured once per RTT.
|
||||
This implementation does not implement Brakmo's and Peterson's original
|
||||
duplicate ACK policy since clock ticks in today's machines are not as coarse as
|
||||
they were (i.e. 500ms) when Vegas was originally designed.
|
||||
Note that modern TCP recovery processes such as fast retransmit and SACK are
|
||||
enabled by default in the TCP stack.
|
||||
.Sh MIB Variables
|
||||
The algorithm exposes the following tunable variables in the
|
||||
.Va net.inet.tcp.cc.vegas
|
||||
branch of the
|
||||
.Xr sysctl 3
|
||||
MIB:
|
||||
.Bl -tag -width ".Va alpha"
|
||||
.It Va alpha
|
||||
Query or set the Vegas alpha parameter as a number of buffers on the path.
|
||||
When setting alpha, the value must satisfy: 0 < alpha < beta.
|
||||
Default is 1.
|
||||
.It Va beta
|
||||
Query or set the Vegas beta parameter as a number of buffers on the path.
|
||||
When setting beta, the value must satisfy: 0 < alpha < beta.
|
||||
Default is 3.
|
||||
.El
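Because each variable is validated against the other, the order in which the two are changed matters.
The following C sketch models only the documented constraint 0 < alpha < beta; the struct, the function names and the use of EINVAL as the rejection code are assumptions made for illustration and are not taken from the module's sysctl handlers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two tunables. */
struct vegas_params {
	unsigned int alpha;
	unsigned int beta;
};

/* Accept a new alpha only if 0 < alpha < beta would still hold. */
static int
vegas_set_alpha(struct vegas_params *p, unsigned int val)
{
	assert(p != NULL);
	if (val == 0 || val >= p->beta)
		return (EINVAL);
	p->alpha = val;
	return (0);
}

/* Accept a new beta only if 0 < alpha < beta would still hold. */
static int
vegas_set_beta(struct vegas_params *p, unsigned int val)
{
	assert(p != NULL);
	if (val <= p->alpha)
		return (EINVAL);
	p->beta = val;
	return (0);
}
```

Starting from the defaults (alpha 1, beta 3), moving to alpha 4 and beta 6 only succeeds in this model if beta is raised first.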
|
||||
.Sh SEE ALSO
|
||||
.Xr cc 4 ,
|
||||
.Xr cc_chd 4 ,
|
||||
.Xr cc_cubic 4 ,
|
||||
.Xr cc_hd 4 ,
|
||||
.Xr cc_htcp 4 ,
|
||||
.Xr cc_newreno 4 ,
|
||||
.Xr h_ertt 4 ,
|
||||
.Xr tcp 4 ,
|
||||
.Xr cc 9 ,
|
||||
.Xr khelp 9
|
||||
.Rs
|
||||
.%A "L. S. Brakmo"
|
||||
.%A "L. L. Peterson"
|
||||
.%T "TCP Vegas: end to end congestion avoidance on a global internet"
|
||||
.%J "IEEE J. Sel. Areas Commun."
|
||||
.%D "October 1995"
|
||||
.%V "13"
|
||||
.%N "8"
|
||||
.%P "1465-1480"
|
||||
.Re
|
||||
.Sh ACKNOWLEDGEMENTS
|
||||
Development and testing of this software were made possible in part by grants
|
||||
from the FreeBSD Foundation and Cisco University Research Program Fund at
|
||||
Community Foundation Silicon Valley.
|
||||
.Sh HISTORY
|
||||
The
|
||||
.Nm
|
||||
congestion control module first appeared in
|
||||
.Fx 9.0 .
|
||||
.Pp
|
||||
The module was first released in 2010 by David Hayes whilst working on the
|
||||
NewTCP research project at Swinburne University's Centre for Advanced Internet
|
||||
Architectures, Melbourne, Australia.
|
||||
More details are available at:
|
||||
.Pp
|
||||
http://caia.swin.edu.au/urp/newtcp/
|
||||
.Sh AUTHORS
|
||||
.An -nosplit
|
||||
The
|
||||
.Nm
|
||||
congestion control module and this manual page were written by
|
||||
.An David Hayes Aq david.hayes@ieee.org .
|
143
share/man/man4/h_ertt.4
Normal file
@ -0,0 +1,143 @@
|
||||
.\"
|
||||
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
|
||||
.\" All rights reserved.
|
||||
.\"
|
||||
.\" This documentation was written at the Centre for Advanced Internet
|
||||
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes
|
||||
.\" under sponsorship from the FreeBSD Foundation.
|
||||
.\"
|
||||
.\" Redistribution and use in source and binary forms, with or without
|
||||
.\" modification, are permitted provided that the following conditions
|
||||
.\" are met:
|
||||
.\" 1. Redistributions of source code must retain the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer.
|
||||
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer in the
|
||||
.\" documentation and/or other materials provided with the distribution.
|
||||
.\"
|
||||
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
||||
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
|
||||
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
||||
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
||||
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||||
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||||
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
||||
.\" SUCH DAMAGE.
|
||||
.\"
|
||||
.\" $FreeBSD$
|
||||
.\"
|
||||
.Dd February 15, 2011
|
||||
.Dt H_ERTT 4
|
||||
.Os
|
||||
.Sh NAME
|
||||
.Nm h_ertt
|
||||
.Nd Enhanced Round Trip Time Khelp module
|
||||
.Sh SYNOPSIS
|
||||
.In netinet/khelp/h_ertt.h
|
||||
.Sh DESCRIPTION
|
||||
The
|
||||
.Nm
|
||||
Khelp module works within the
|
||||
.Xr khelp 9
|
||||
framework to provide TCP with a per-connection, low noise estimate of the
|
||||
instantaneous RTT.
|
||||
The implementation attempts to be robust in the face of delayed
|
||||
acknowledgements, TCP Segmentation Offload (TSO), receivers who manipulate TCP
|
||||
timestamps and lack of the TCP timestamp option altogether.
|
||||
.Pp
|
||||
TCP receivers using delayed acknowledgements either acknowledge every second packet
|
||||
(reflecting the time stamp of the first) or use a timeout to trigger the
|
||||
acknowledgement if no second packet arrives.
|
||||
If the heuristic used by
|
||||
.Nm
|
||||
determines that the receiver is using delayed acknowledgements, it measures the
|
||||
RTT using the second packet (the one that triggers the acknowledgement).
|
||||
It does not measure the RTT if the acknowledgement is for the
|
||||
first packet, since it cannot be accurately determined.
|
||||
.Pp
|
||||
When TSO is in use,
|
||||
.Nm
|
||||
will momentarily disable TSO whilst marking a packet to use for a new
|
||||
measurement.
|
||||
The process has negligible impact on the connection.
|
||||
.Pp
|
||||
.Nm
|
||||
associates the following struct with each connection's TCP control block:
|
||||
.Bd -literal
|
||||
struct ertt {
|
||||
TAILQ_HEAD(txseginfo_head, txseginfo) txsegi_q; /* Private. */
|
||||
long bytes_tx_in_rtt; /* Private. */
|
||||
long bytes_tx_in_marked_rtt;
|
||||
unsigned long marked_snd_cwnd;
|
||||
int rtt;
|
||||
int maxrtt;
|
||||
int minrtt;
|
||||
int dlyack_rx; /* Private. */
|
||||
int timestamp_errors; /* Private. */
|
||||
int markedpkt_rtt; /* Private. */
|
||||
uint32_t flags;
|
||||
};
|
||||
.Ed
|
||||
.Pp
|
||||
The fields marked as private should not be manipulated by any code outside of
|
||||
the
|
||||
.Nm
|
||||
implementation.
|
||||
The non-private fields provide the following data:
|
||||
.Bl -tag -width ".Va bytes_tx_in_marked_rtt" -offset indent
|
||||
.It Va bytes_tx_in_marked_rtt
|
||||
The number of bytes transmitted in the
|
||||
.Va markedpkt_rtt .
|
||||
.It Va marked_snd_cwnd
|
||||
The value of cwnd for the marked RTT measurement.
|
||||
.It Va rtt
|
||||
The most recent RTT measurement.
|
||||
.It Va maxrtt
|
||||
The longest RTT measurement that has been taken.
|
||||
.It Va minrtt
|
||||
The shortest RTT measurement that has been taken.
|
||||
.It Va flags
|
||||
The ERTT_NEW_MEASUREMENT flag will be set by the implementation when a new
|
||||
measurement is available.
|
||||
It is the responsibility of
|
||||
.Nm
|
||||
consumers to unset the flag if they wish to use it as a notification method for
|
||||
new measurements.
|
||||
.El
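The notification pattern described for the flags field might be used by a consumer roughly as follows.
This is a hedged sketch: struct ertt_view is a cut-down stand-in holding only the public fields needed here, the numeric value given to ERTT_NEW_MEASUREMENT is an assumption (the real definition lives in the header named in the SYNOPSIS), and the function name and the choice to compute rtt - minrtt are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value; consult netinet/khelp/h_ertt.h for the real definition. */
#define	ERTT_NEW_MEASUREMENT	0x01

/* Stand-in containing only the public fields used by this sketch. */
struct ertt_view {
	int		rtt;
	int		minrtt;
	uint32_t	flags;
};

/*
 * If a fresh measurement is available, clear the notification flag, store
 * rtt - minrtt in *delay and return 1.  Otherwise return 0 and leave
 * *delay untouched.
 */
static int
ertt_consume(struct ertt_view *e, int *delay)
{
	assert(e != NULL && delay != NULL);

	if ((e->flags & ERTT_NEW_MEASUREMENT) == 0)
		return (0);

	/* The consumer, not the producer, unsets the flag. */
	e->flags &= ~(uint32_t)ERTT_NEW_MEASUREMENT;
	*delay = e->rtt - e->minrtt;
	return (1);
}
```

Clearing the flag before returning means a second call without an intervening measurement reports nothing new.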
|
||||
.Sh SEE ALSO
|
||||
.Xr cc 4 ,
|
||||
.Xr cc_chd 4 ,
|
||||
.Xr cc_hd 4 ,
|
||||
.Xr cc_vegas 4 ,
|
||||
.Xr hhook 9 ,
|
||||
.Xr khelp 9
|
||||
.Sh ACKNOWLEDGEMENTS
|
||||
Development and testing of this software were made possible in part by grants
|
||||
from the FreeBSD Foundation and Cisco University Research Program Fund at
|
||||
Community Foundation Silicon Valley.
|
||||
.Sh HISTORY
|
||||
The
|
||||
.Nm
|
||||
module first appeared in
|
||||
.Fx 9.0 .
|
||||
.Pp
|
||||
The module was first released in 2010 by David Hayes whilst working on the
|
||||
NewTCP research project at Swinburne University's Centre for Advanced Internet
|
||||
Architectures, Melbourne, Australia.
|
||||
More details are available at:
|
||||
.Pp
|
||||
http://caia.swin.edu.au/urp/newtcp/
|
||||
.Sh AUTHORS
|
||||
.An -nosplit
|
||||
The
|
||||
.Nm
|
||||
Khelp module and this manual page were written by
|
||||
.An David Hayes Aq david.hayes@ieee.org .
|
||||
.Sh BUGS
|
||||
The module maintains enhanced RTT estimates for all new TCP connections created
|
||||
after the time at which the module was loaded.
|
||||
It might be beneficial to see if it is possible to have the module only affect
|
||||
connections which actually care about ERTT estimates.
|
@ -1,5 +1,11 @@
|
||||
.\" Copyright (c) 1983, 1991, 1993
|
||||
.\" The Regents of the University of California. All rights reserved.
|
||||
.\" The Regents of the University of California.
|
||||
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
|
||||
.\" All rights reserved.
|
||||
.\"
|
||||
.\" Portions of this documentation were written at the Centre for Advanced
|
||||
.\" Internet Architectures, Swinburne University of Technology, Melbourne,
|
||||
.\" Australia by David Hayes under sponsorship from the FreeBSD Foundation.
|
||||
.\"
|
||||
.\" Redistribution and use in source and binary forms, with or without
|
||||
.\" modification, are permitted provided that the following conditions
|
||||
@ -32,7 +38,7 @@
|
||||
.\" From: @(#)tcp.4 8.1 (Berkeley) 6/5/93
|
||||
.\" $FreeBSD$
|
||||
.\"
|
||||
.Dd January 8, 2011
|
||||
.Dd February 15, 2011
|
||||
.Dt TCP 4
|
||||
.Os
|
||||
.Sh NAME
|
||||
@ -116,7 +122,7 @@ supports a number of socket options which can be set with
|
||||
.Xr setsockopt 2
|
||||
and tested with
|
||||
.Xr getsockopt 2 :
|
||||
.Bl -tag -width ".Dv TCP_NODELAY"
|
||||
.Bl -tag -width ".Dv TCP_CONGESTION"
|
||||
.It Dv TCP_INFO
|
||||
Information about a socket's underlying TCP session may be retrieved
|
||||
by passing the read-only option
|
||||
@ -134,6 +140,12 @@ send window size,
|
||||
receive window size,
|
||||
and
|
||||
bandwidth-controlled window space.
|
||||
.It Dv TCP_CONGESTION
|
||||
Select or query the congestion control algorithm that TCP will use for the
|
||||
connection.
|
||||
See
|
||||
.Xr cc 4
|
||||
for details.
|
||||
.It Dv TCP_NODELAY
|
||||
Under most circumstances,
|
||||
.Tn TCP
|
||||
@ -231,6 +243,14 @@ see
|
||||
.Xr ip 4 .
|
||||
Incoming connection requests that are source-routed are noted,
|
||||
and the reverse source route is used in responding.
|
||||
.Pp
|
||||
The default congestion control algorithm for
|
||||
.Tn TCP
|
||||
is
|
||||
.Xr cc_newreno 4 .
|
||||
Other congestion control algorithms can be made available using the
|
||||
.Xr cc 4
|
||||
framework.
|
||||
.Ss MIB Variables
|
||||
The
|
||||
.Tn TCP
|
||||
@ -322,11 +342,6 @@ See
|
||||
Delay ACK to try and piggyback it onto a data packet.
|
||||
.It Va delacktime
|
||||
Maximum amount of time, in milliseconds, before a delayed ACK is sent.
|
||||
.It Va newreno
|
||||
Enable
|
||||
.Tn TCP
|
||||
NewReno Fast Recovery algorithm,
|
||||
as described in RFC 2582.
|
||||
.It Va path_mtu_discovery
|
||||
Enable Path MTU Discovery.
|
||||
.It Va tcbhashsize
|
||||
@ -495,6 +510,7 @@ address.
|
||||
.Xr socket 2 ,
|
||||
.Xr sysctl 3 ,
|
||||
.Xr blackhole 4 ,
|
||||
.Xr cc 4 ,
|
||||
.Xr inet 4 ,
|
||||
.Xr intro 4 ,
|
||||
.Xr ip 4 ,
|
||||
|
@ -43,6 +43,7 @@ MAN= accept_filter.9 \
|
||||
BUS_SETUP_INTR.9 \
|
||||
bus_space.9 \
|
||||
byteorder.9 \
|
||||
cc.9 \
|
||||
cd.9 \
|
||||
condvar.9 \
|
||||
config_intrhook.9 \
|
||||
@ -122,6 +123,7 @@ MAN= accept_filter.9 \
|
||||
hash.9 \
|
||||
hashinit.9 \
|
||||
hexdump.9 \
|
||||
hhook.9 \
|
||||
ieee80211.9 \
|
||||
ieee80211_amrr.9 \
|
||||
ieee80211_beacon.9 \
|
||||
@ -144,6 +146,7 @@ MAN= accept_filter.9 \
|
||||
KASSERT.9 \
|
||||
kernacc.9 \
|
||||
kernel_mount.9 \
|
||||
khelp.9 \
|
||||
kobj.9 \
|
||||
kproc.9 \
|
||||
kqueue.9 \
|
||||
|
333
share/man/man9/cc.9
Normal file
@ -0,0 +1,333 @@
|
||||
.\"
|
||||
.\" Copyright (c) 2008-2009 Lawrence Stewart <lstewart@FreeBSD.org>
|
||||
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
|
||||
.\" All rights reserved.
|
||||
.\"
|
||||
.\" Portions of this documentation were written at the Centre for Advanced
|
||||
.\" Internet Architectures, Swinburne University, Melbourne, Australia by
|
||||
.\" David Hayes and Lawrence Stewart under sponsorship from the
|
||||
.\" FreeBSD Foundation.
|
||||
.\"
|
||||
.\" Redistribution and use in source and binary forms, with or without
|
||||
.\" modification, are permitted provided that the following conditions
|
||||
.\" are met:
|
||||
.\" 1. Redistributions of source code must retain the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer.
|
||||
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer in the
|
||||
.\" documentation and/or other materials provided with the distribution.
|
||||
.\"
|
||||
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
||||
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
|
||||
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
||||
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
||||
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||||
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||||
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
||||
.\" SUCH DAMAGE.
|
||||
.\"
|
||||
.\" $FreeBSD$
|
||||
.\"
|
||||
.Dd February 15, 2011
|
||||
.Dt CC 9
|
||||
.Os
|
||||
.Sh NAME
|
||||
.Nm cc ,
|
||||
.Nm DECLARE_CC_MODULE ,
|
||||
.Nm CC_VAR
|
||||
.Nd Modular Congestion Control
|
||||
.Sh SYNOPSIS
|
||||
.In netinet/cc.h
|
||||
.In netinet/cc/cc_module.h
|
||||
.Fn DECLARE_CC_MODULE "ccname" "ccalgo"
|
||||
.Fn CC_VAR "ccv" "what"
|
||||
.Sh DESCRIPTION
|
||||
The
|
||||
.Nm
|
||||
framework allows congestion control algorithms to be implemented as dynamically
|
||||
loadable kernel modules via the
|
||||
.Xr kld 4
|
||||
facility.
|
||||
Transport protocols can select from the list of available algorithms on a
|
||||
connection-by-connection basis, or use the system default (see
|
||||
.Xr cc 4
|
||||
for more details).
|
||||
.Pp
|
||||
.Nm
|
||||
modules are identified by an
|
||||
.Xr ascii 7
|
||||
name and set of hook functions encapsulated in a
|
||||
.Vt "struct cc_algo" ,
|
||||
which has the following members:
|
||||
.Bd -literal -offset indent
|
||||
struct cc_algo {
|
||||
char name[TCP_CA_NAME_MAX];
|
||||
int (*mod_init) (void);
|
||||
int (*mod_destroy) (void);
|
||||
int (*cb_init) (struct cc_var *ccv);
|
||||
void (*cb_destroy) (struct cc_var *ccv);
|
||||
void (*conn_init) (struct cc_var *ccv);
|
||||
void (*ack_received) (struct cc_var *ccv, uint16_t type);
|
||||
void (*cong_signal) (struct cc_var *ccv, uint32_t type);
|
||||
void (*post_recovery) (struct cc_var *ccv);
|
||||
void (*after_idle) (struct cc_var *ccv);
|
||||
};
|
||||
.Ed
|
||||
.Pp
|
||||
The
|
||||
.Va name
|
||||
field identifies the unique name of the algorithm, and should be no longer than
|
||||
TCP_CA_NAME_MAX-1 characters in length (the TCP_CA_NAME_MAX define lives in
|
||||
.In netinet/tcp.h
|
||||
for compatibility reasons).
|
||||
.Pp
|
||||
The
|
||||
.Va mod_init
|
||||
function is called when a new module is loaded into the system but before the
|
||||
registration process is complete.
|
||||
It should be implemented if a module needs to set up some global state prior to
|
||||
being available for use by new connections.
|
||||
Returning a non-zero value from
|
||||
.Va mod_init
|
||||
will cause the loading of the module to fail.
|
||||
.Pp
|
||||
The
|
||||
.Va mod_destroy
|
||||
function is called prior to unloading an existing module from the kernel.
|
||||
It should be implemented if a module needs to clean up any global state before
|
||||
being removed from the kernel.
|
||||
The return value is currently ignored.
|
||||
.Pp
|
||||
The
|
||||
.Va cb_init
|
||||
function is called when a TCP control block
|
||||
.Vt struct tcpcb
|
||||
is created.
|
||||
It should be implemented if a module needs to allocate memory for storing
|
||||
private per-connection state.
|
||||
Returning a non-zero value from
|
||||
.Va cb_init
|
||||
will cause connection setup to be aborted, terminating the connection as a
|
||||
result.
|
||||
.Pp
|
||||
The
|
||||
.Va cb_destroy
|
||||
function is called when a TCP control block
|
||||
.Vt struct tcpcb
|
||||
is destroyed.
|
||||
It should be implemented if a module needs to free memory allocated in
|
||||
.Va cb_init .
|
||||
.Pp
|
||||
The
|
||||
.Va conn_init
|
||||
function is called when a new connection has been established and variables are
|
||||
being initialised.
|
||||
It should be implemented to initialise congestion control algorithm variables
|
||||
for the newly established connection.
|
||||
.Pp
|
||||
The
|
||||
.Va ack_received
|
||||
function is called when a TCP acknowledgement (ACK) packet is received.
|
||||
Modules use the
|
||||
.Fa type
|
||||
argument as an input to their congestion management algorithms.
|
||||
The ACK types currently reported by the stack are CC_ACK and CC_DUPACK.
|
||||
CC_ACK indicates the received ACK acknowledges previously unacknowledged data.
|
||||
CC_DUPACK indicates the received ACK acknowledges data we have already received
|
||||
an ACK for.
|
||||
.Pp
|
||||
The
|
||||
.Va cong_signal
|
||||
function is called when a congestion event is detected by the TCP stack.
|
||||
Modules use the
|
||||
.Fa type
|
||||
argument as an input to their congestion management algorithms.
|
||||
The congestion event types currently reported by the stack are CC_ECN, CC_RTO,
|
||||
CC_RTO_ERR and CC_NDUPACK.
|
||||
CC_ECN is reported when the TCP stack receives an explicit congestion notification
|
||||
(RFC3168).
|
||||
CC_RTO is reported when the retransmission time out timer fires.
|
||||
CC_RTO_ERR is reported if the retransmission time out timer fired in error.
|
||||
CC_NDUPACK is reported if N duplicate ACKs have been received back-to-back,
|
||||
where N is the fast retransmit duplicate ACK threshold (N=3 currently as per
|
||||
RFC5681).
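A cong_signal hook typically dispatches on the event type.
The C sketch below shows one possible shape for such a handler; it does not reproduce the behaviour of any shipped module.
The EX_ constants are placeholders for the framework's CC_ECN, CC_RTO, CC_RTO_ERR and CC_NDUPACK values, the state struct is hypothetical, and the window arithmetic is a simplified RFC5681-style response that uses cwnd where the RFC uses the amount of data in flight.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values; the framework defines the real constants. */
enum ex_signal { EX_CC_ECN = 1, EX_CC_RTO, EX_CC_RTO_ERR, EX_CC_NDUPACK };

/* Hypothetical per-connection state, all sizes in bytes. */
struct ex_cong_state {
	uint32_t cwnd;
	uint32_t ssthresh;
	uint32_t mss;
	uint32_t prev_cwnd;
	uint32_t prev_ssthresh;
};

/* Half the window, but never less than two segments. */
static uint32_t
ex_half_window(const struct ex_cong_state *s)
{
	uint32_t half = s->cwnd / 2;

	return (half < 2 * s->mss ? 2 * s->mss : half);
}

static void
ex_cong_signal(struct ex_cong_state *s, uint32_t type)
{
	assert(s != NULL);

	switch (type) {
	case EX_CC_ECN:
	case EX_CC_NDUPACK:
		/* Loss or mark without a timeout: halve the window. */
		s->ssthresh = s->cwnd = ex_half_window(s);
		break;
	case EX_CC_RTO:
		/* Remember the old values in case the timeout was spurious. */
		s->prev_cwnd = s->cwnd;
		s->prev_ssthresh = s->ssthresh;
		s->ssthresh = ex_half_window(s);
		s->cwnd = s->mss;
		break;
	case EX_CC_RTO_ERR:
		/* The timer fired in error: undo the reduction. */
		s->cwnd = s->prev_cwnd;
		s->ssthresh = s->prev_ssthresh;
		break;
	default:
		break;
	}
}
```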
|
||||
.Pp
|
||||
The
|
||||
.Va post_recovery
|
||||
function is called after the TCP connection has recovered from a congestion event.
|
||||
It should be implemented to adjust state as required.
|
||||
.Pp
|
||||
The
|
||||
.Va after_idle
|
||||
function is called when data transfer resumes after an idle period.
|
||||
It should be implemented to adjust state as required.
|
||||
.Pp
|
||||
The
|
||||
.Fn DECLARE_CC_MODULE
|
||||
macro provides a convenient wrapper around the
|
||||
.Xr DECLARE_MODULE 9
|
||||
macro, and is used to register a
|
||||
.Nm
|
||||
module with the
|
||||
.Nm
|
||||
framework.
|
||||
The
|
||||
.Fa ccname
|
||||
argument specifies the module's name.
|
||||
The
|
||||
.Fa ccalgo
|
||||
argument points to the module's
|
||||
.Vt struct cc_algo .
|
||||
.Pp
|
||||
.Nm
|
||||
modules must instantiate a
|
||||
.Vt struct cc_algo ,
|
||||
but are only required to set the name field, and optionally any of the function
|
||||
pointers.
|
||||
The stack will skip calling any function pointer which is NULL, so there is no
|
||||
requirement to implement any of the function pointers.
|
||||
Using the C99 designated initialiser feature to set fields is encouraged.
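The following self-contained C sketch illustrates the designated initialiser style and the effect of leaving hooks unset.
The ex_ types are reduced stand-ins for struct cc_var and struct cc_algo so that the fragment compiles outside the kernel; EX_NAME_MAX, the hook bodies and the dispatch helper are assumptions made for illustration, and a real module would additionally register itself with DECLARE_CC_MODULE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	EX_NAME_MAX	16	/* stand-in for TCP_CA_NAME_MAX */

/* Reduced stand-in for struct cc_var. */
struct ex_cc_var {
	uint32_t cwnd;
	uint32_t mss;
};

/* Reduced stand-in for struct cc_algo with three of the hooks. */
struct ex_cc_algo {
	char	name[EX_NAME_MAX];
	int	(*cb_init)(struct ex_cc_var *ccv);
	void	(*ack_received)(struct ex_cc_var *ccv, uint16_t type);
	void	(*after_idle)(struct ex_cc_var *ccv);
};

/* Illustrative hook: grow the window by one segment per ACK. */
static void
ex_ack_received(struct ex_cc_var *ccv, uint16_t type)
{
	(void)type;
	ccv->cwnd += ccv->mss;
}

/* Only the fields that are needed are named; the rest default to NULL. */
static struct ex_cc_algo ex_algo = {
	.name = "example",
	.ack_received = ex_ack_received,
};

/* How a caller might skip a hook that the module did not implement. */
static void
ex_run_after_idle(struct ex_cc_algo *algo, struct ex_cc_var *ccv)
{
	assert(algo != NULL);
	if (algo->after_idle != NULL)
		algo->after_idle(ccv);
}
```

Members omitted from a designated initialiser are zero-initialised, which is why the unset hooks compare equal to NULL.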
|
||||
.Pp
|
||||
Each function pointer which deals with congestion control state is passed a
|
||||
pointer to a
|
||||
.Vt struct cc_var ,
|
||||
which has the following members:
|
||||
.Bd -literal -offset indent
|
||||
struct cc_var {
|
||||
void *cc_data;
|
||||
int bytes_this_ack;
|
||||
tcp_seq curack;
|
||||
uint32_t flags;
|
||||
int type;
|
||||
union ccv_container {
|
||||
struct tcpcb *tcp;
|
||||
struct sctp_nets *sctp;
|
||||
} ccvc;
|
||||
};
|
||||
.Ed
|
||||
.Pp
|
||||
.Vt struct cc_var
|
||||
groups congestion control related variables into a single, embeddable structure
|
||||
and adds a layer of indirection to accessing transport protocol control blocks.
|
||||
The eventual goal is to allow a single set of
|
||||
.Nm
|
||||
modules to be shared between all congestion aware transport protocols, though
|
||||
currently only
|
||||
.Xr tcp 4
|
||||
is supported.
|
||||
.Pp
|
||||
To aid the eventual transition towards this goal, direct use of variables from
|
||||
the transport protocol's data structures is strongly discouraged.
|
||||
However, modules currently cannot avoid accessing some of these
|
||||
variables, and so the
|
||||
.Fn CC_VAR
|
||||
macro exists as a convenience accessor.
|
||||
The
|
||||
.Fa ccv
|
||||
argument points to the
|
||||
.Vt struct cc_var
|
||||
passed into the function by the
|
||||
.Nm
|
||||
framework.
|
||||
The
|
||||
.Fa what
|
||||
argument specifies the name of the variable to access.
|
||||
.Pp
|
||||
Apart from the
|
||||
.Va type
|
||||
and
|
||||
.Va ccv_container
|
||||
fields, the remaining fields in
|
||||
.Vt struct cc_var
|
||||
are for use by
|
||||
.Nm
|
||||
modules.
|
||||
.Pp
|
||||
The
|
||||
.Va cc_data
|
||||
field is available for algorithms requiring additional per-connection state to
|
||||
attach a dynamic memory pointer to.
|
||||
The memory should be allocated and attached in the module's
|
||||
.Va cb_init
|
||||
hook function.
|
||||
.Pp
|
||||
The
|
||||
.Va bytes_this_ack
|
||||
field specifies the number of new bytes acknowledged by the most recently
|
||||
received ACK packet.
|
||||
It is only valid in the
|
||||
.Va ack_received
|
||||
hook function.
|
||||
.Pp
|
||||
The
|
||||
.Va curack
|
||||
field specifies the sequence number of the most recently received ACK packet.
|
||||
It is only valid in the
|
||||
.Va ack_received ,
|
||||
.Va cong_signal
|
||||
and
|
||||
.Va post_recovery
|
||||
hook functions.
|
||||
.Pp
|
||||
The
|
||||
.Va flags
|
||||
field is used to pass useful information from the stack to a
|
||||
.Nm
|
||||
module.
|
||||
The CCF_ABC_SENTAWND flag is relevant in
|
||||
.Va ack_received
|
||||
and is set when appropriate byte counting (RFC3465) has counted a window's worth
|
||||
of bytes sent.
|
||||
It is the module's responsibility to clear the flag after it has processed the
|
||||
signal.
|
||||
The CCF_CWND_LIMITED flag is relevant in
|
||||
.Va ack_received
|
||||
and is set when the connection's ability to send data is currently constrained
|
||||
by the value of the congestion window.
|
||||
Algorithms should use the absence of this flag being set to avoid accumulating
|
||||
a large difference between the congestion window and send window.
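One way an ack_received hook could combine the two flags is sketched below in C.
The EX_ flag values are placeholders for CCF_ABC_SENTAWND and CCF_CWND_LIMITED, the struct is a hypothetical reduction of struct cc_var with the window stored inline, and the growth policy (one segment per window of acknowledged data) is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values; the framework defines the real flag bits. */
#define	EX_CCF_ABC_SENTAWND	0x0001
#define	EX_CCF_CWND_LIMITED	0x0002

/* Hypothetical reduction of struct cc_var. */
struct ex_ccv {
	uint32_t flags;
	uint32_t cwnd;	/* bytes */
	uint32_t mss;	/* bytes */
};

static void
ex_flags_ack_received(struct ex_ccv *ccv)
{
	uint32_t sent_awnd;

	assert(ccv != NULL);

	/* The module clears the byte counting signal once it has seen it. */
	sent_awnd = ccv->flags & EX_CCF_ABC_SENTAWND;
	ccv->flags &= ~(uint32_t)EX_CCF_ABC_SENTAWND;

	/*
	 * Grow only while the congestion window is what limits sending,
	 * so that cwnd does not run far ahead of the send window.
	 */
	if (sent_awnd != 0 && (ccv->flags & EX_CCF_CWND_LIMITED) != 0)
		ccv->cwnd += ccv->mss;
}
```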
|
||||
.Sh SEE ALSO
|
||||
.Xr cc 4 ,
|
||||
.Xr cc_chd 4 ,
|
||||
.Xr cc_cubic 4 ,
|
||||
.Xr cc_hd 4 ,
|
||||
.Xr cc_htcp 4 ,
|
||||
.Xr cc_newreno 4 ,
|
||||
.Xr cc_vegas 4 ,
|
||||
.Xr tcp 4
|
||||
.Sh ACKNOWLEDGEMENTS
|
||||
Development and testing of this software were made possible in part by grants
|
||||
from the FreeBSD Foundation and Cisco University Research Program Fund at
|
||||
Community Foundation Silicon Valley.
|
||||
.Sh FUTURE WORK
|
||||
Integrate with
|
||||
.Xr sctp 4 .
|
||||
.Sh HISTORY
|
||||
The modular Congestion Control (CC) framework first appeared in
|
||||
.Fx 9.0 .
|
||||
.Pp
|
||||
The framework was first released in 2007 by James Healy and Lawrence Stewart
|
||||
whilst working on the NewTCP research project at Swinburne University's Centre
|
||||
for Advanced Internet Architectures, Melbourne, Australia, which was made
|
||||
possible in part by a grant from the Cisco University Research Program Fund at
|
||||
Community Foundation Silicon Valley.
|
||||
More details are available at:
|
||||
.Pp
|
||||
http://caia.swin.edu.au/urp/newtcp/
|
||||
.Sh AUTHORS
|
||||
.An -nosplit
|
||||
The
|
||||
.Nm
|
||||
framework was written by
|
||||
.An Lawrence Stewart Aq lstewart@FreeBSD.org ,
|
||||
.An James Healy Aq jimmy@deefa.com
|
||||
and
|
||||
.An David Hayes Aq david.hayes@ieee.org .
|
||||
.Pp
|
||||
This manual page was written by
|
||||
.An David Hayes Aq david.hayes@ieee.org
|
||||
and
|
||||
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
|
387
share/man/man9/hhook.9
Normal file
@ -0,0 +1,387 @@
|
||||
.\"
|
||||
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
|
||||
.\" All rights reserved.
|
||||
.\"
|
||||
.\" This documentation was written at the Centre for Advanced Internet
|
||||
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes and
|
||||
.\" Lawrence Stewart under sponsorship from the FreeBSD Foundation.
|
||||
.\"
|
||||
.\" Redistribution and use in source and binary forms, with or without
|
||||
.\" modification, are permitted provided that the following conditions
|
||||
.\" are met:
|
||||
.\" 1. Redistributions of source code must retain the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer.
|
||||
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer in the
|
||||
.\" documentation and/or other materials provided with the distribution.
|
||||
.\"
|
||||
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
||||
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
|
||||
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
||||
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
||||
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||||
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||||
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
||||
.\" SUCH DAMAGE.
|
||||
.\"
|
||||
.\" $FreeBSD$
|
||||
.\"
|
||||
.Dd February 15, 2011
|
||||
.Dt HHOOK 9
|
||||
.Os
|
||||
.Sh NAME
|
||||
.Nm hhook ,
|
||||
.Nm hhook_head_register ,
|
||||
.Nm hhook_head_deregister ,
|
||||
.Nm hhook_head_deregister_lookup ,
|
||||
.Nm hhook_run_hooks ,
|
||||
.Nm HHOOKS_RUN_IF ,
|
||||
.Nm HHOOKS_RUN_LOOKUP_IF
|
||||
.Nd Helper Hook Framework
|
||||
.Sh SYNOPSIS
|
||||
.In sys/hhook.h
|
||||
.Ft typedef int
|
||||
.Fn "\*(lp*hhook_func_t\*(rp" "int32_t hhook_type" "int32_t hhook_id" \
|
||||
"void *udata" "void *ctx_data" "void *hdata" "struct osd *hosd"
|
||||
.Fn "int hhook_head_register" "int32_t hhook_type" "int32_t hhook_id" \
|
||||
"struct hhook_head **hhh" "uint32_t flags"
|
||||
.Fn "int hhook_head_deregister" "struct hhook_head *hhh"
|
||||
.Fn "int hhook_head_deregister_lookup" "int32_t hhook_type" "int32_t hhook_id"
|
||||
.Fn "void hhook_run_hooks" "struct hhook_head *hhh" "void *ctx_data" \
|
||||
"struct osd *hosd"
|
||||
.Fn HHOOKS_RUN_IF "hhh" "ctx_data" "hosd"
|
||||
.Fn HHOOKS_RUN_LOOKUP_IF "hhook_type" "hhook_id" "ctx_data" "hosd"
|
||||
.Sh DESCRIPTION
|
||||
.Nm
|
||||
provides a framework for managing and running arbitrary hook functions at
|
||||
defined hook points within the kernel.
|
||||
The KPI was inspired by
|
||||
.Xr pfil 9 ,
|
||||
and in many respects can be thought of as a more generic superset of pfil.
|
||||
.Pp
|
||||
The
|
||||
.Xr khelp 9
|
||||
and
|
||||
.Nm
|
||||
frameworks are tightly integrated.
|
||||
Khelp is responsible for registering and deregistering Khelp module hook
|
||||
functions with
|
||||
.Nm
|
||||
points.
|
||||
The KPI functions used by
|
||||
.Xr khelp 9
|
||||
to do this are not documented here as they are not relevant to consumers wishing
|
||||
to instantiate hook points.
|
||||
.Ss Information for Khelp Module Implementors
|
||||
Khelp modules indirectly interact with
|
||||
.Nm
|
||||
by defining appropriate hook functions for insertion into hook points.
|
||||
Hook functions must conform to the
|
||||
.Ft hhook_func_t
|
||||
function pointer declaration
|
||||
outlined in the
|
||||
.Sx SYNOPSIS .
|
||||
.Pp
|
||||
The
|
||||
.Fa hhook_type
|
||||
and
|
||||
.Fa hhook_id
|
||||
arguments identify the hook point which has called into the hook function.
|
||||
These are useful when a single hook function is registered for multiple hook
|
||||
points and wants to know which hook point has called into it.
|
||||
.In sys/hhook.h
|
||||
lists available
|
||||
.Fa hhook_type
|
||||
defines, and subsystems which export hook points are responsible for defining
|
||||
the
|
||||
.Fa hhook_id
|
||||
value in appropriate header files.
|
||||
.Pp
|
||||
The
|
||||
.Fa udata
|
||||
argument will be passed to the hook function if it was specified in the
|
||||
.Vt struct hookinfo
|
||||
at hook registration time.
|
||||
.Pp
|
||||
The
|
||||
.Fa ctx_data
|
||||
argument contains context specific data from the hook point call site.
|
||||
The data type passed is subsystem dependent.
|
||||
.Pp
|
||||
The
|
||||
.Fa hdata
|
||||
argument is a pointer to the persistent per-object storage allocated for use by
|
||||
the module if required.
|
||||
The pointer will only ever be NULL if the module did not request per-object
|
||||
storage.
|
||||
.Pp
|
||||
The
|
||||
.Fa hosd
|
||||
argument can be used with the
|
||||
.Xr khelp 9
|
||||
framework's
|
||||
.Fn khelp_get_osd
|
||||
function to access data belonging to a different Khelp module.
|
||||
.Pp
|
||||
Khelp modules instruct the Khelp framework to register their hook functions with
|
||||
.Nm
|
||||
points by creating a
|
||||
.Vt "struct hookinfo"
|
||||
per hook point, which contains the following members:
|
||||
.Bd -literal -offset indent
|
||||
struct hookinfo {
|
||||
hhook_func_t hook_func;
|
||||
struct helper *hook_helper;
|
||||
void *hook_udata;
|
||||
int32_t hook_id;
|
||||
int32_t hook_type;
|
||||
};
|
||||
.Ed
|
||||
.Pp
|
||||
Khelp modules are responsible for setting all members of the struct except
|
||||
.Va hook_helper
|
||||
which is handled by the Khelp framework.
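.Pp
The following is a minimal userland sketch, not kernel code, of one hook
function serving two hook points, with one
.Vt struct hookinfo
per hook point.
The type and structure definitions are local stand-ins that mirror the
declarations described above (the int return type is assumed), and the
EX_ identifiers, ex_hook and ex_fire_all are hypothetical names used only
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct osd;	/* opaque stand-in; the kernel provides the real type */
struct helper;	/* opaque stand-in; managed by the Khelp framework */

/* Local mirror of the hhook_func_t argument list described above. */
typedef int (*hhook_func_t)(int32_t hhook_type, int32_t hhook_id,
    void *udata, void *ctx_data, void *hdata, struct osd *hosd);

struct hookinfo {
	hhook_func_t	hook_func;
	struct helper	*hook_helper;
	void		*hook_udata;
	int32_t		hook_id;
	int32_t		hook_type;
};

/* Hypothetical identifiers; real values come from subsystem headers. */
#define	EX_HHOOK_TYPE	1
#define	EX_HHOOK_ID_IN	0
#define	EX_HHOOK_ID_OUT	1

static int ex_in_calls, ex_out_calls;

/* One hook function for two hook points, told apart by hhook_id. */
static int
ex_hook(int32_t hhook_type, int32_t hhook_id, void *udata, void *ctx_data,
    void *hdata, struct osd *hosd)
{
	(void)hhook_type; (void)udata; (void)ctx_data;
	(void)hdata; (void)hosd;

	if (hhook_id == EX_HHOOK_ID_IN)
		ex_in_calls++;
	else
		ex_out_calls++;
	return (0);
}

/* One struct hookinfo per hook point; hook_helper is left unset. */
static struct hookinfo ex_hooks[] = {
	{ .hook_func = ex_hook, .hook_udata = NULL,
	  .hook_id = EX_HHOOK_ID_IN, .hook_type = EX_HHOOK_TYPE },
	{ .hook_func = ex_hook, .hook_udata = NULL,
	  .hook_id = EX_HHOOK_ID_OUT, .hook_type = EX_HHOOK_TYPE },
};

/* Stand-in for the framework calling each registered hook once. */
static int
ex_fire_all(void)
{
	size_t i;

	for (i = 0; i < sizeof(ex_hooks) / sizeof(ex_hooks[0]); i++)
		ex_hooks[i].hook_func(ex_hooks[i].hook_type,
		    ex_hooks[i].hook_id, ex_hooks[i].hook_udata,
		    NULL, NULL, NULL);
	return (ex_in_calls * 10 + ex_out_calls);
}
```

In a real module the array would be handed to the Khelp framework rather
than walked by hand, and the framework would fill in
.Va hook_helper .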
|
||||
.Ss Creating and Managing Hook Points
|
||||
Kernel subsystems that wish to provide
|
||||
.Nm
|
||||
points typically need to make four and possibly five key changes to their
|
||||
implementation:
|
||||
.Bl -bullet
|
||||
.It
|
||||
Define a list of
|
||||
.Va hhook_id
|
||||
mappings in an appropriate subsystem header.
|
||||
.It
|
||||
Register each hook point with the
|
||||
.Fn hhook_head_register
|
||||
function during initialisation of the subsystem.
|
||||
.It
|
||||
Select or create a standardised data type to pass to hook functions as
|
||||
contextual data.
|
||||
.It
|
||||
Add a call to
|
||||
.Fn HHOOKS_RUN_IF
|
||||
or
|
||||
.Fn HHOOKS_RUN_LOOKUP_IF
|
||||
at the point in the subsystem's code where the hook point should be executed.
|
||||
.It
|
||||
If the subsystem can be dynamically added/removed at runtime, each hook
|
||||
point registered with the
|
||||
.Fn hhook_head_register
|
||||
function when the subsystem was initialised needs to be deregistered with the
|
||||
.Fn hhook_head_deregister
|
||||
or
|
||||
.Fn hhook_head_deregister_lookup
|
||||
functions when the subsystem is being deinitialised prior to removal.
|
||||
.El
|
||||
.Pp
|
||||
The
|
||||
.Fn hhook_head_register
|
||||
function registers a hook point with the
|
||||
.Nm
|
||||
framework.
|
||||
The
|
||||
.Fa hhook_type
|
||||
argument defines the high level type for the hook point.
|
||||
Valid types are defined in
|
||||
.In sys/hhook.h
|
||||
and new types should be added as required.
|
||||
The
|
||||
.Fa hhook_id
|
||||
argument specifies a unique, subsystem specific identifier for the hook point.
|
||||
The
|
||||
.Fa hhh
|
||||
argument will, if not NULL, be used to store a reference to the
|
||||
.Vt struct hhook_head
|
||||
created as part of the registration process.
|
||||
Subsystems will generally want to store a local copy of the
|
||||
.Vt struct hhook_head
|
||||
so that they can use the
|
||||
.Fn HHOOKS_RUN_IF
|
||||
macro to instantiate hook points.
|
||||
The HHOOK_WAITOK flag may be passed in via the
|
||||
.Fa flags
|
||||
argument if
|
||||
.Xr malloc 9
|
||||
is allowed to sleep waiting for memory to become available.
|
||||
If the hook point is within a virtualised subsystem (e.g. the network stack),
|
||||
the HHOOK_HEADISINVNET flag should be passed in via the
|
||||
.Fa flags
|
||||
argument so that the
|
||||
.Vt struct hhook_head
|
||||
created during the registration process will be added to a virtualised list.
|
||||
.Pp
|
||||
The
|
||||
.Fn hhook_head_deregister
|
||||
function deregisters a previously registered hook point from the
|
||||
.Nm
|
||||
framework.
|
||||
The
|
||||
.Fa hhh
|
||||
argument is the pointer to the
|
||||
.Vt struct hhook_head
|
||||
returned by
|
||||
.Fn hhook_head_register
|
||||
when the hook point was registered.
|
||||
.Pp
|
||||
The
|
||||
.Fn hhook_head_deregister_lookup
|
||||
function can be used instead of
|
||||
.Fn hhook_head_deregister
|
||||
in situations where the caller does not have a cached copy of the
|
||||
.Vt struct hhook_head
|
||||
and wants to deregister a hook point using the appropriate
|
||||
.Fa hhook_type
|
||||
and
|
||||
.Fa hhook_id
|
||||
identifiers instead.
|
||||
.Pp
|
||||
The
|
||||
.Fn hhook_run_hooks
|
||||
function should normally not be called directly and should instead be called
|
||||
indirectly via the
|
||||
.Fn HHOOKS_RUN_IF
|
||||
macro.
|
||||
However, there may be circumstances where it is preferable to call the function
|
||||
directly, and so it is documented here for completeness.
|
||||
The
|
||||
.Fa hhh
|
||||
argument references the
|
||||
.Nm
|
||||
point whose registered hook functions are to be called.
|
||||
The
|
||||
.Fa ctx_data
|
||||
argument specifies a pointer to the contextual hook point data to pass into the
|
||||
hook functions.
|
||||
The
|
||||
.Fa hosd
|
||||
argument should be the pointer to the appropriate object's
|
||||
.Vt struct osd
|
||||
if the subsystem provides the ability for Khelp modules to associate per-object
|
||||
data.
|
||||
Subsystems which do not should pass NULL.
|
||||
.Pp
|
||||
The
|
||||
.Fn HHOOKS_RUN_IF
|
||||
macro is the preferred way to implement hook points.
|
||||
It only calls the
|
||||
.Fn hhook_run_hooks
|
||||
function if at least one hook function is registered for the hook point.
|
||||
By checking for registered hook functions, the macro minimises the cost
|
||||
associated with adding hook points to frequently used code paths by reducing to
|
||||
a simple if test in the common case where no hook functions are registered.
|
||||
The arguments are as described for the
|
||||
.Fn hhook_run_hooks
|
||||
function.
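.Pp
The guard described above can be modelled in userland as follows.
This is only an illustrative sketch: struct mock_hhook_head,
mock_run_hooks and MOCK_HHOOKS_RUN_IF are hypothetical stand-ins, not the
kernel definitions, and the real macro may differ in detail.

```c
#include <stddef.h>

/* Userland stand-in for the kernel's struct hhook_head. */
struct mock_hhook_head {
	unsigned int	nhooks;	/* number of registered hook functions */
	unsigned int	runs;	/* times the slow path was entered */
};

/* Stand-in for hhook_run_hooks(): the comparatively costly path. */
static void
mock_run_hooks(struct mock_hhook_head *hhh, void *ctx_data)
{
	(void)ctx_data;
	hhh->runs++;
}

/* A cheap test guards the call, as described for HHOOKS_RUN_IF. */
#define	MOCK_HHOOKS_RUN_IF(hhh, ctx_data) do {				\
	if ((hhh) != NULL && (hhh)->nhooks > 0)				\
		mock_run_hooks((hhh), (ctx_data));			\
} while (0)

/* Returns how often the slow path ran for a head with n hooks. */
static unsigned int
mock_hook_point(unsigned int n)
{
	struct mock_hhook_head head = { .nhooks = n, .runs = 0 };
	struct mock_hhook_head *hhh = &head;
	int ctx = 0;

	MOCK_HHOOKS_RUN_IF(hhh, &ctx);
	return (head.runs);
}
```

With no hook functions registered the hook point costs one comparison and
the slow path is never entered.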
|
||||
.Pp
|
||||
The
|
||||
.Fn HHOOKS_RUN_LOOKUP_IF
|
||||
macro performs the same function as the
|
||||
.Fn HHOOKS_RUN_IF
|
||||
macro, but performs an additional step to look up the
|
||||
.Vt struct hhook_head
|
||||
for the specified
|
||||
.Fa hhook_type
|
||||
and
|
||||
.Fa hhook_id
|
||||
identifiers.
|
||||
Because of the reference counting overhead associated with the look up, it
|
||||
should only be used in code paths which are infrequently executed.
|
||||
.Sh IMPLEMENTATION NOTES
|
||||
Each
|
||||
.Vt struct hhook_head
|
||||
protects its internal list of hook functions with a
|
||||
.Xr rmlock 9 .
|
||||
Therefore, anytime
|
||||
.Fn hhook_run_hooks
|
||||
is called directly or indirectly via the
|
||||
.Fn HHOOKS_RUN_IF
|
||||
or
|
||||
.Fn HHOOKS_RUN_LOOKUP_IF
|
||||
macros, a non-sleepable read lock will be acquired and held across the calls to
|
||||
all registered hook functions.
|
||||
.Sh RETURN VALUES
|
||||
.Fn hhook_head_register
|
||||
returns 0 if no errors occurred.
|
||||
It returns EEXIST if a hook point with the same
|
||||
.Fa hhook_type
|
||||
and
|
||||
.Fa hhook_id
|
||||
is already registered.
|
||||
It returns EINVAL if the HHOOK_HEADISINVNET flag is not set in
|
||||
.Fa flags
|
||||
because the implementation does not yet support hook points in non-virtualised
|
||||
subsystems (see the
|
||||
.Sx BUGS
|
||||
section for details).
|
||||
It returns ENOMEM if
|
||||
.Xr malloc 9
|
||||
failed to allocate memory for the new
|
||||
.Vt struct hhook_head .
|
||||
.Pp
|
||||
.Fn hhook_head_deregister
|
||||
and
|
||||
.Fn hhook_head_deregister_lookup
|
||||
return 0 if no errors occurred.
|
||||
They return ENOENT if
|
||||
.Fa hhh
|
||||
is NULL.
|
||||
They return EBUSY if the reference count of
|
||||
.Fa hhh
|
||||
is greater than one.
|
||||
.Sh EXAMPLES
|
||||
A well-commented example Khelp module can be found at:
|
||||
.Pa /usr/share/examples/kld/khelp/h_example.c
|
||||
.Pp
|
||||
The
|
||||
.Xr tcp 4
|
||||
implementation provides two
|
||||
.Nm
|
||||
points which are called for packets sent/received when a connection is in the
|
||||
established phase.
|
||||
Search for HHOOK in the following files:
|
||||
.Pa sys/netinet/tcp_var.h ,
|
||||
.Pa sys/netinet/tcp_input.c ,
|
||||
.Pa sys/netinet/tcp_output.c
|
||||
and
|
||||
.Pa sys/netinet/tcp_subr.c .
|
||||
.Sh SEE ALSO
|
||||
.Xr khelp 9
|
||||
.Sh ACKNOWLEDGEMENTS
|
||||
Development and testing of this software were made possible in part by grants
|
||||
from the FreeBSD Foundation and Cisco University Research Program Fund at
|
||||
Community Foundation Silicon Valley.
|
||||
.Sh HISTORY
|
||||
The
|
||||
.Nm
|
||||
framework first appeared in
|
||||
.Fx 9.0 .
|
||||
.Pp
|
||||
The
|
||||
.Nm
|
||||
framework was first released in 2010 by Lawrence Stewart whilst studying at
|
||||
Swinburne University's Centre for Advanced Internet Architectures, Melbourne,
|
||||
Australia.
|
||||
More details are available at:
|
||||
.Pp
|
||||
http://caia.swin.edu.au/urp/newtcp/
|
||||
.Sh AUTHORS
|
||||
.An -nosplit
|
||||
The
|
||||
.Nm
|
||||
framework was written by
|
||||
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
|
||||
.Pp
|
||||
This manual page was written by
|
||||
.An David Hayes Aq david.hayes@ieee.org
|
||||
and
|
||||
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
|
||||
.Sh BUGS
|
||||
The framework does not currently support registering hook points in subsystems
|
||||
which have not been virtualised with VIMAGE.
|
||||
Fairly minimal internal changes to the
|
||||
.Nm
|
||||
implementation are required to address this.
|
437
share/man/man9/khelp.9
Normal file
@ -0,0 +1,437 @@
|
||||
.\"
|
||||
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
|
||||
.\" All rights reserved.
|
||||
.\"
|
||||
.\" This documentation was written at the Centre for Advanced Internet
|
||||
.\" Architectures, Swinburne University, Melbourne, Australia by David Hayes and
|
||||
.\" Lawrence Stewart under sponsorship from the FreeBSD Foundation.
|
||||
.\"
|
||||
.\" Redistribution and use in source and binary forms, with or without
|
||||
.\" modification, are permitted provided that the following conditions
|
||||
.\" are met:
|
||||
.\" 1. Redistributions of source code must retain the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer.
|
||||
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||||
.\" notice, this list of conditions and the following disclaimer in the
|
||||
.\" documentation and/or other materials provided with the distribution.
|
||||
.\"
|
||||
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
||||
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
|
||||
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
||||
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
||||
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||||
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||||
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
||||
.\" SUCH DAMAGE.
|
||||
.\"
|
||||
.\" $FreeBSD$
|
||||
.\"
|
||||
.Dd February 15, 2011
|
||||
.Dt khelp 9
|
||||
.Os
|
||||
.Sh NAME
|
||||
.Nm khelp ,
|
||||
.Nm khelp_init_osd ,
|
||||
.Nm khelp_destroy_osd ,
|
||||
.Nm khelp_get_id ,
|
||||
.Nm khelp_get_osd ,
|
||||
.Nm khelp_add_hhook ,
|
||||
.Nm khelp_remove_hhook ,
|
||||
.Nm KHELP_DECLARE_MOD ,
|
||||
.Nm KHELP_DECLARE_MOD_UMA
|
||||
.Nd Kernel Helper Framework
|
||||
.Sh SYNOPSIS
|
||||
.In sys/khelp.h
|
||||
.In sys/module_khelp.h
|
||||
.Fn "int khelp_init_osd" "uint32_t classes" "struct osd *hosd"
|
||||
.Fn "int khelp_destroy_osd" "struct osd *hosd"
|
||||
.Fn "int32_t khelp_get_id" "char *hname"
|
||||
.Fn "void * khelp_get_osd" "struct osd *hosd" "int32_t id"
|
||||
.Fn "int khelp_add_hhook" "struct hookinfo *hki" "uint32_t flags"
|
||||
.Fn "int khelp_remove_hhook" "struct hookinfo *hki"
|
||||
.Fn KHELP_DECLARE_MOD "hname" "hdata" "hhooks" "version"
|
||||
.Fn KHELP_DECLARE_MOD_UMA "hname" "hdata" "hhooks" "version" "ctor" "dtor"
|
||||
.Sh DESCRIPTION
|
||||
.Nm
|
||||
provides a framework for managing
|
||||
.Nm
|
||||
modules, which indirectly use the
|
||||
.Xr hhook 9
|
||||
KPI to register their hook functions with hook points of interest within the
|
||||
kernel.
|
||||
Khelp modules aim to provide a structured way to dynamically extend the kernel
|
||||
at runtime in an ABI preserving manner.
|
||||
Depending on the subsystem providing hook points, a
|
||||
.Nm
|
||||
module may be able to associate per-object data for maintaining relevant state
|
||||
between hook calls.
|
||||
The
|
||||
.Xr hhook 9
|
||||
and
|
||||
.Nm
|
||||
frameworks are tightly integrated and anyone interested in
|
||||
.Nm
|
||||
should also read the
|
||||
.Xr hhook 9
|
||||
manual page thoroughly.
|
||||
.Ss Information for Khelp Module Implementors
|
||||
.Nm
|
||||
modules are represented within the
|
||||
.Nm
|
||||
framework by a
|
||||
.Vt struct helper
|
||||
which has the following members:
|
||||
.Bd -literal -offset indent
|
||||
struct helper {
|
||||
int (*mod_init) (void);
|
||||
int (*mod_destroy) (void);
|
||||
#define HELPER_NAME_MAXLEN 16
|
||||
char h_name[HELPER_NAME_MAXLEN];
|
||||
uma_zone_t h_zone;
|
||||
struct hookinfo *h_hooks;
|
||||
uint32_t h_nhooks;
|
||||
uint32_t h_classes;
|
||||
int32_t h_id;
|
||||
volatile uint32_t h_refcount;
|
||||
uint16_t h_flags;
|
||||
TAILQ_ENTRY(helper) h_next;
|
||||
};
|
||||
.Ed
|
||||
.Pp
|
||||
Modules must instantiate a
|
||||
.Vt struct helper ,
|
||||
but are only required to set the
|
||||
.Va h_classes
|
||||
field, and may optionally set the
|
||||
.Va h_flags ,
|
||||
.Va mod_init
|
||||
and
|
||||
.Va mod_destroy
|
||||
fields where required.
|
||||
The framework takes care of all other fields and modules should refrain from
|
||||
manipulating them.
|
||||
Using the C99 designated initialiser feature to set fields is encouraged.
|
||||
.Pp
|
||||
If specified, the
|
||||
.Va mod_init
|
||||
function will be run by the
|
||||
.Nm
|
||||
framework prior to completing the registration process.
|
||||
Returning a non-zero value from the
|
||||
.Va mod_init
|
||||
function will abort the registration process and fail to load the module.
|
||||
If specified, the
|
||||
.Va mod_destroy
|
||||
function will be run by the
|
||||
.Nm
|
||||
framework during the deregistration process, after the module has been
|
||||
deregistered by the
|
||||
.Nm
|
||||
framework.
|
||||
The return value is currently ignored.
|
||||
Valid
|
||||
.Nm
|
||||
classes are defined in
|
||||
.In sys/khelp.h .
|
||||
Valid flags are defined in
|
||||
.In sys/module_khelp.h .
|
||||
The HELPER_NEEDS_OSD flag should be set in the
|
||||
.Va h_flags
|
||||
field if the
|
||||
.Nm
|
||||
module requires persistent per-object data storage.
|
||||
There is no programmatic way (yet) to check if a
|
||||
.Nm
|
||||
class provides the ability for
|
||||
.Nm
|
||||
modules to associate persistent per-object data, so a manual check is required.
|
||||
.Pp
|
||||
The
|
||||
.Fn KHELP_DECLARE_MOD
|
||||
and
|
||||
.Fn KHELP_DECLARE_MOD_UMA
|
||||
macros provide convenient wrappers around the
|
||||
.Xr DECLARE_MODULE 9
|
||||
macro, and are used to register a
|
||||
.Nm
|
||||
module with the
|
||||
.Nm
|
||||
framework.
|
||||
.Fn KHELP_DECLARE_MOD_UMA
|
||||
should only be used by modules which require the use of persistent per-object
|
||||
storage, i.e. modules which set the HELPER_NEEDS_OSD flag in their
|
||||
.Vt struct helper Ns 's
|
||||
.Va h_flags
|
||||
field.
|
||||
.Pp
|
||||
The first four arguments common to both macros are as follows.
|
||||
The
|
||||
.Fa hname
|
||||
argument specifies the unique
|
||||
.Xr ascii 7
|
||||
name for the
|
||||
.Nm
|
||||
module.
|
||||
It should be no longer than HELPER_NAME_MAXLEN-1 characters in length.
|
||||
The
|
||||
.Fa hdata
|
||||
argument is a pointer to the module's
|
||||
.Vt struct helper .
|
||||
The
|
||||
.Fa hhooks
|
||||
argument points to a static array of
|
||||
.Vt struct hookinfo
|
||||
structures.
|
||||
The array should contain a
|
||||
.Vt struct hookinfo
|
||||
for each
|
||||
.Xr hhook 9
|
||||
point the module wishes to hook, even when using the same hook function multiple
|
||||
times for different
|
||||
.Xr hhook 9
|
||||
points.
|
||||
The
|
||||
.Fa version
|
||||
argument specifies a version number for the module which will be passed to
|
||||
.Xr MODULE_VERSION 9 .
|
||||
The
|
||||
.Fn KHELP_DECLARE_MOD_UMA
|
||||
macro takes the additional
|
||||
.Fa ctor
|
||||
and
|
||||
.Fa dtor
|
||||
arguments, which specify optional
|
||||
.Xr uma 9
|
||||
constructor and destructor functions.
|
||||
NULL should be passed where the functionality is not required.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_get_id
|
||||
function returns the numeric identifier for the
|
||||
.Nm
|
||||
module with name
|
||||
.Fa hname .
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_get_osd
|
||||
function is used to obtain the per-object data pointer for a specified
|
||||
.Nm
|
||||
module.
|
||||
The
|
||||
.Fa hosd
|
||||
argument is a pointer to the underlying subsystem object's
|
||||
.Vt struct osd .
|
||||
This is provided by the
|
||||
.Xr hhook 9
|
||||
framework when calling into a
|
||||
.Nm
|
||||
module's hook function.
|
||||
The
|
||||
.Fa id
|
||||
argument specifies the numeric identifier for the
|
||||
.Nm
|
||||
module whose data pointer should be extracted from
|
||||
.Fa hosd .
|
||||
The
|
||||
.Fa id
|
||||
is obtained using the
|
||||
.Fn khelp_get_id
|
||||
function.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_add_hhook
|
||||
and
|
||||
.Fn khelp_remove_hhook
|
||||
functions allow a
|
||||
.Nm
|
||||
module to dynamically hook/unhook
|
||||
.Xr hhook 9
|
||||
points at run time.
|
||||
The
|
||||
.Fa hki
|
||||
argument specifies a pointer to a
|
||||
.Vt struct hookinfo
|
||||
which encapsulates the required information about the
|
||||
.Xr hhook 9
|
||||
point and hook function being manipulated.
|
||||
The HHOOK_WAITOK flag may be passed in via the
|
||||
.Fa flags
|
||||
argument of
|
||||
.Fn khelp_add_hhook
|
||||
if
|
||||
.Xr malloc 9
|
||||
is allowed to sleep waiting for memory to become available.
|
||||
.Ss Integrating Khelp Into a Kernel Subsystem
|
||||
Most of the work required to allow
|
||||
.Nm
|
||||
modules to do useful things relates to defining and instantiating suitable
|
||||
.Xr hhook 9
|
||||
points for
|
||||
.Nm
|
||||
modules to hook into.
|
||||
The only additional decision a subsystem needs to make is whether it wants to
|
||||
allow
|
||||
.Nm
|
||||
modules to associate persistent per-object data.
|
||||
Providing support for persistent data storage can allow
|
||||
.Nm
|
||||
modules to perform more complex functionality which may be desirable.
|
||||
Subsystems which want to allow Khelp modules to associate
|
||||
persistent per-object data with one of the subsystem's data structures need to
|
||||
make the following two key changes:
|
||||
.Bl -bullet
|
||||
.It
|
||||
Embed a
|
||||
.Vt struct osd
|
||||
pointer in the structure definition for the object.
|
||||
.It
|
||||
Add calls to
|
||||
.Fn khelp_init_osd
|
||||
and
|
||||
.Fn khelp_destroy_osd
|
||||
to the subsystem code paths which are responsible for respectively initialising
|
||||
and destroying the object.
|
||||
.El
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_init_osd
|
||||
function initialises the per-object data storage for all currently loaded
|
||||
.Nm
|
||||
modules of appropriate classes which have set the HELPER_NEEDS_OSD flag in their
|
||||
.Va h_flags
|
||||
field.
|
||||
The
|
||||
.Fa classes
|
||||
argument specifies a bitmask of
|
||||
.Nm
|
||||
classes which this subsystem associates with.
|
||||
If a
|
||||
.Nm
|
||||
module matches any of the classes in the bitmask, that module will be associated
|
||||
with the object.
|
||||
The
|
||||
.Fa hosd
|
||||
argument specifies the pointer to the object's
|
||||
.Vt struct osd
|
||||
which will be used to provide the persistent storage for use by
|
||||
.Nm
|
||||
modules.
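.Pp
The class matching rule above can be sketched as a simple bitmask test.
This is a userland illustration under stated assumptions: struct
ex_helper, the EX_ constants and ex_wants_osd are hypothetical, and the
real class and flag values are those defined in the headers named earlier.

```c
#include <stdint.h>

/* Hypothetical class bits and flag value, for illustration only. */
#define	EX_CLASS_A	0x1u
#define	EX_CLASS_B	0x2u
#define	EX_NEEDS_OSD	0x1u

/* Only the two fields relevant to the matching rule are modelled. */
struct ex_helper {
	uint32_t	h_classes;
	uint16_t	h_flags;
};

/*
 * Nonzero when per-object storage would be set up for this helper:
 * it shares at least one class with the subsystem's bitmask and has
 * asked for per-object storage.
 */
static int
ex_wants_osd(const struct ex_helper *h, uint32_t classes)
{
	return ((h->h_classes & classes) != 0 &&
	    (h->h_flags & EX_NEEDS_OSD) != 0);
}
```

A single shared class bit is enough for a match; the module does not need
to belong to every class in the bitmask.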
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_destroy_osd
|
||||
function frees all memory that was associated with an object's
|
||||
.Vt struct osd
|
||||
by a previous call to
|
||||
.Fn khelp_init_osd .
|
||||
The
|
||||
.Fa hosd
|
||||
argument specifies the pointer to the object's
|
||||
.Vt struct osd
|
||||
which will be purged in preparation for destruction.
|
||||
.Sh IMPLEMENTATION NOTES
|
||||
.Nm
|
||||
modules are protected from being prematurely unloaded by a reference count.
|
||||
The count is incremented each time a subsystem calls
|
||||
.Fn khelp_init_osd
|
||||
causing persistent storage to be allocated for the module, and decremented for
|
||||
each corresponding call to
|
||||
.Fn khelp_destroy_osd .
|
||||
Only when a module's reference count has dropped to zero can the module be
|
||||
unloaded.
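.Pp
The reference counting rule can be modelled in userland as shown below.
This is a sketch only: struct ex_module and the ex_ functions are
hypothetical, and returning EBUSY from a refused unload is an assumption
made for the example rather than documented behaviour.

```c
#include <errno.h>
#include <stdint.h>

/* Models only the reference count of a loaded module. */
struct ex_module {
	uint32_t	refcount;
};

/* An object gained storage for this module: take a reference. */
static void
ex_init_osd(struct ex_module *m)
{
	m->refcount++;
}

/* The object's storage was destroyed: drop the reference. */
static void
ex_destroy_osd(struct ex_module *m)
{
	if (m->refcount > 0)
		m->refcount--;
}

/* Unload is refused while any object still holds storage. */
static int
ex_try_unload(const struct ex_module *m)
{
	return (m->refcount == 0 ? 0 : EBUSY);	/* EBUSY is assumed */
}
```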
|
||||
.Sh RETURN VALUES
|
||||
The
|
||||
.Fn khelp_init_osd
|
||||
function returns zero if no errors occurred.
|
||||
It returns ENOMEM if a
|
||||
.Nm
|
||||
module which requires per-object storage fails to allocate the necessary memory.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_destroy_osd
|
||||
function always returns zero to indicate that no errors occurred.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_get_id
|
||||
function returns the unique numeric identifier for the registered
|
||||
.Nm
|
||||
module with name
|
||||
.Fa hname .
|
||||
It returns -1 if no module with the specified name is currently registered.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_get_osd
|
||||
function returns the pointer to the
|
||||
.Nm
|
||||
module's persistent object storage memory.
|
||||
If the module identified by
|
||||
.Fa id
|
||||
does not have persistent object storage registered with the object's
|
||||
.Fa hosd
|
||||
.Vt struct osd ,
|
||||
NULL is returned.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_add_hhook
|
||||
function returns zero if no errors occurred.
|
||||
It returns ENOENT if it could not find the requested
|
||||
.Xr hhook 9
|
||||
point.
|
||||
It returns ENOMEM if
|
||||
.Xr malloc 9
|
||||
failed to allocate memory.
|
||||
It returns EEXIST if attempting to register the same hook function more than
|
||||
once for the same
|
||||
.Xr hhook 9
|
||||
point.
|
||||
.Pp
|
||||
The
|
||||
.Fn khelp_remove_hhook
|
||||
function returns zero if no errors occurred.
|
||||
It returns ENOENT if it could not find the requested
|
||||
.Xr hhook 9
|
||||
point.
|
||||
.Sh EXAMPLES
|
||||
A well-commented example Khelp module can be found at:
|
||||
.Pa /usr/share/examples/kld/khelp/h_example.c
|
||||
.Pp
|
||||
The Enhanced Round Trip Time (ERTT)
|
||||
.Xr h_ertt 4
|
||||
.Nm
|
||||
module provides a more complex example of what is possible.
|
||||
.Sh SEE ALSO
|
||||
.Xr h_ertt 4 ,
|
||||
.Xr hhook 9 ,
|
||||
.Xr osd 9
|
||||
.Sh ACKNOWLEDGEMENTS
|
||||
Development and testing of this software were made possible in part by grants
|
||||
from the FreeBSD Foundation and Cisco University Research Program Fund at
|
||||
Community Foundation Silicon Valley.
|
||||
.Sh HISTORY
|
||||
The
|
||||
.Nm
|
||||
kernel helper framework first appeared in
|
||||
.Fx 9.0 .
|
||||
.Pp
|
||||
The
|
||||
.Nm
|
||||
framework was first released in 2010 by Lawrence Stewart whilst studying at
|
||||
Swinburne University's Centre for Advanced Internet Architectures, Melbourne,
|
||||
Australia.
|
||||
More details are available at:
|
||||
.Pp
|
||||
http://caia.swin.edu.au/urp/newtcp/
|
||||
.Sh AUTHORS
|
||||
.An -nosplit
|
||||
The
|
||||
.Nm
|
||||
framework was written by
|
||||
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
|
||||
.Pp
|
||||
This manual page was written by
|
||||
.An David Hayes Aq david.hayes@ieee.org
|
||||
and
|
||||
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
|