Commit Graph

13 Commits

Author SHA1 Message Date
glebius
99f4ec50e8 Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.
Sponsored by:	Nginx, Inc.
2014-11-07 09:39:05 +00:00
hselasky
a0b8ff0c54 The SYSCTL data pointers can come from userspace and must not be
directly accessed. Although this will work on some platforms, it can
throw an exception if the pointer is invalid and then panic the kernel.

Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-28 12:00:39 +00:00
hselasky
144232fef2 Preserve limitation of "TCP_CA_NAME_MAX" when matching the algorithm
name.

MFC after:	3 days
Suggested by:	gnn @
2014-10-27 16:08:41 +00:00
hselasky
d92cafa3fc Make assignments to "net.inet.tcp.cc.algorithm" work by fixing a bad
string comparison.

MFC after:	3 days
Reported by:	Jukka Ukkonen <jau789@gmail.com>
Sponsored by:	Mellanox Technologies
2014-10-27 11:21:47 +00:00
hselasky
e253e43882 Fix string length argument passed to "sysctl_handle_string()" so that
the complete string is returned by the function and not just only one
byte.

PR:	192544
MFC after:	2 weeks
2014-08-10 07:51:55 +00:00
lstewart
545f7c0ca7 Use the full and proper company name for Swinburne University of Technology
throughout the source tree.

Requested by:	Grenville Armitage, Director of CAIA at Swinburne University of
			Technology
MFC after:	3 days
2011-04-12 08:13:18 +00:00
lstewart
3087aa2fa9 An sbuf configured with SBUF_AUTOEXTEND will call malloc with M_WAITOK when a
write to the buffer causes it to overflow. We therefore can't hold the CC list
rwlock over a call to sbuf_printf() for an sbuf configured with SBUF_AUTOEXTEND.

Switch to a fixed length sbuf which should be of sufficient size except in the
very unlikely event that the sysctl is being processed as one or more new
algorithms are loaded. If that happens, we accept the race and may fail the
sysctl gracefully if there is insufficient room to print the names of all the
algorithms.

This should address a WITNESS warning and the potential panic that would occur
if the sbuf call to malloc did sleep whilst holding the CC list rwlock.

Sponsored by:	FreeBSD Foundation
Reported by:	Nick Hibma
Reviewed by:	bz
MFC after:	3 weeks
X-MFC with:	r215166
2011-01-23 13:00:25 +00:00
lstewart
d54d1011fb Make the CC framework more VIMAGE friendly by adding the machinery to allow
vnets to select their own default CC algorithm independent of each other and the
base system. If the base system or a vnet has set a default which gets unloaded,
we reset that netstack's default to NewReno.

Sponsored by:	FreeBSD Foundation
Tested by:	Mikolaj Golub <to.my.trociny at gmail com>
Reviewed by:	bz (briefly)
MFC after:	3 months
2010-11-16 09:34:31 +00:00
lstewart
c63666dcd4 - Querying the default CC algo is more common than setting it and the function
is small, so there is no good reason not to declare the buffer at the top.

- Fix a whitespace nit.

Sponsored by:	FreeBSD Foundation
MFC after:	11 weeks
X-MFC with:	r215166
2010-11-16 08:43:25 +00:00
lstewart
4cb0a4f8c6 Move protocol specific implementation detail out of the core CC framework.
Sponsored by:	FreeBSD Foundation
Tested by:	Mikolaj Golub <to.my.trociny at gmail com>
MFC after:	11 weeks
X-MFC with:	r215166
2010-11-16 08:30:39 +00:00
lstewart
b114c25bf8 On CC algorithm module unload, we walk the list of active TCP control blocks.
Any found to be using the algorithm that is about to go away are switched back
to NewReno to avoid leaving dangling pointers which would trigger a panic. For
VIMAGE kernels, there is a list per vnet to walk, yet the implementation was
only examining one of the vnet lists.

Fix the implementation of the above feature for VIMAGE kernels by looping
through all active TCP control blocks across all vnets.

Sponsored by:	FreeBSD Foundation
Tested by:	Mikolaj Golub <to.my.trociny at gmail com>
Reviewed by:	bz (briefly)
MFC after:	11 weeks
2010-11-16 07:57:56 +00:00
lstewart
c28db57abb cc_init() should only be run once on system boot, but with VIMAGE kernels it
runs on boot and each time a vnet jail is created. Running cc_init() multiple
times results in a panic when attempting to initialise the cc_list lock again,
and so r215166 effectively broke the use of vnet jails.

Switch to using a SYSINIT to run cc_init() on boot. CC algorithm modules loaded
on boot register in the same SI_SUB_PROTO_IFATTACHDOMAIN category as is used in
this patch, so cc_init() is run at SI_ORDER_FIRST to ensure the framework is
initialised before module registration is attempted.

Sponsored by:	FreeBSD Foundation
Reported and tested by:	Mikolaj Golub <to.my.trociny at gmail com>
MFC after:	11 weeks
X-MFC with:	r215166
2010-11-16 07:09:05 +00:00
lstewart
df9f23bf3f This commit marks the first formal contribution of the "Five New TCP Congestion
Control Algorithms for FreeBSD" FreeBSD Foundation funded project. More details
about the project are available at: http://caia.swin.edu.au/freebsd/5cc/

- Add a KPI and supporting infrastructure to allow modular congestion control
  algorithms to be used in the net stack. Algorithms can maintain per-connection
  state if required, and connections maintain their own algorithm pointer, which
  allows different connections to concurrently use different algorithms. The
  TCP_CONGESTION socket option can be used with getsockopt()/setsockopt() to
  programmatically query or change the congestion control algorithm respectively
  from within an application at runtime.

- Integrate the framework with the TCP stack in as least intrusive a manner as
  possible. Care was also taken to develop the framework in a way that should
  allow integration with other congestion aware transport protocols (e.g. SCTP)
  in the future. The hope is that we will one day be able to share a single set
  of congestion control algorithm modules between all congestion aware transport
  protocols.

- Introduce a new congestion recovery (TF_CONGRECOVERY) state into the TCP stack
  and use it to decouple the meaning of recovery from a congestion event and
  recovery from packet loss (TF_FASTRECOVERY) a la RFC2581. ECN and delay based
  congestion control protocols don't generally need to recover from packet loss
  and need a different way to note a congestion recovery episode within the
  stack.

- Remove the net.inet.tcp.newreno sysctl, which simplifies some portions of code
  and ensures the stack always uses the appropriate mechanisms for recovering
  from packet loss during a congestion recovery episode.

- Extract the NewReno congestion control algorithm from the TCP stack and
  massage it into module form. NewReno is always built into the kernel and will
  remain the default algorithm for the forseeable future. Implementations of
  additional different algorithms will become available in the near future.

- Bump __FreeBSD_version to 900025 and note in UPDATING that rebuilding code
  that relies on the size of "struct tcpcb" is required.

Many thanks go to the Cisco University Research Program Fund at Community
Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work
at the Centre for Advanced Internet Architectures, Swinburne University of
Technology is greatly appreciated.

In collaboration with:	David Hayes <dahayes at swin edu au> and
			Grenville Armitage <garmitage at swin edu au>
Sponsored by:	Cisco URP, FreeBSD Foundation
Reviewed by:	rpaulo
Tested by:	David Hayes (and many others over the years)
MFC after:	3 months
2010-11-12 06:41:55 +00:00