Commit Graph

93 Commits

Author SHA1 Message Date
Gleb Smirnoff
779f106aa1 Listening sockets improvements.
o Separate fields of struct socket that belong to listening from
  fields that belong to normal dataflow, and unionize them.  This
  shrinks the structure a bit.
  - Take out selinfo's from the socket buffers into the socket. The
    first reason is to support braindamaged scenario when a socket is
    added to kevent(2) and then listen(2) is cast on it. The second
    reason is that there is future plan to make socket buffers pluggable,
    so that for a dataflow socket a socket buffer can be changed, and
    in this case we also want to keep same selinfos through the lifetime
    of a socket.
  - Remove struct struct so_accf. Since now listening stuff no longer
    affects struct socket size, just move its fields into listening part
    of the union.
  - Provide sol_upcall field and enforce that so_upcall_set() may be called
    only on a dataflow socket, which has buffers, and for listening sockets
    provide solisten_upcall_set().

o Remove ACCEPT_LOCK() global.
  - Add a mutex to socket, to be used instead of socket buffer lock to lock
    fields of struct socket that don't belong to a socket buffer.
  - Allow to acquire two socket locks, but the first one must belong to a
    listening socket.
  - Make soref()/sorele() to use atomic(9).  This allows in some situations
    to do soref() without owning socket lock.  There is place for improvement
    here, it is possible to make sorele() also to lock optionally.
  - Most protocols aren't touched by this change, except UNIX local sockets.
    See below for more information.

o Reduce copy-and-paste in kernel modules that accept connections from
  listening sockets: provide function solisten_dequeue(), and use it in
  the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4),
  infiniband, rpc.

o UNIX local sockets.
  - Removal of ACCEPT_LOCK() global uncovered several races in the UNIX
    local sockets.  Most races exist around spawning a new socket, when we
    are connecting to a local listening socket.  To cover them, we need to
    hold locks on both PCBs when spawning a third one.  This means holding
    them across sonewconn().  This creates a LOR between pcb locks and
    unp_list_lock.
  - To fix the new LOR, abandon the global unp_list_lock in favor of global
    unp_link_lock.  Indeed, separating these two locks didn't provide us any
    extra parralelism in the UNIX sockets.
  - Now call into uipc_attach() may happen with unp_link_lock hold if, we
    are accepting, or without unp_link_lock in case if we are just creating
    a socket.
  - Another problem in UNIX sockets is that uipc_close() basicly did nothing
    for a listening socket.  The vnode remained opened for connections.  This
    is fixed by removing vnode in uipc_close().  Maybe the right way would be
    to do it for all sockets (not only listening), simply move the vnode
    teardown from uipc_detach() to uipc_close()?

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D9770
2017-06-08 21:30:34 +00:00
Michael Tuexen
5d08768a2b Use the SCTP_PCB_FLAGS_ACCEPTING flags to check for listeners.
While there, use a macro for checking the listen state to allow for
easier changes if required.

This done to help glebius@ with his listen changes.
2017-05-26 16:29:00 +00:00
Michael Tuexen
b7b84c0e02 Whitespace changes.
The toolchain for processing the sources has been updated. No functional
change.

MFC after:	3 days
2016-12-26 11:06:41 +00:00
Michael Tuexen
5b495f17a5 Whitespace changes.
The tools using to generate the sources has been updated and produces
different whitespaces. Commit this seperately to avoid intermixing
these with real code changes.

MFC after:	3 days
2016-12-06 10:21:25 +00:00
Michael Tuexen
0ba4c8abe0 netstat and sockstat expect the IPv6 link local addresses to
have an embedded scope. So don't recover.

MFC after:	3 days
2016-07-19 09:48:08 +00:00
Michael Tuexen
fd60718d17 Retire net.inet.sctp.strict_sacks and net.inet.sctp.strict_data_order
sysctl's, since they where only there to interop with non-conformant
implementations. This should not be a problem anymore.
2016-05-12 16:34:59 +00:00
Michael Tuexen
b028cf319e Use 4 spaces instead of a tab. 2016-02-11 18:35:46 +00:00
Alfred Perlstein
7325dfbb59 Increase max allowed backlog for listen sockets
from short to int.

PR: 203922
Submitted by: White Knight <white_knight@2ch.net>
MFC After: 4 weeks
2016-02-02 05:57:59 +00:00
Michael Tuexen
e92c2a8d6a Fix the exporting of SCTP association states to userland. Without this,
associations in SHUTDOWN-PENDING were never reported correctly.

MFC after:	3 weeks
2015-08-29 09:14:32 +00:00
Michael Tuexen
29b9533b43 Export the ssthresh value per SCTP path via the sysctl interface.
MFC after: 1 month
2015-07-07 06:34:28 +00:00
Michael Tuexen
0694a1bc74 Export a pointer to the SCTP socket. This is needed to add SCTP support
to sockstat.

MFC after: 3 days
2015-06-04 12:46:56 +00:00
Michael Tuexen
bcbf8c2105 Remove comparisons which are not necessary.
Reported by:	Coverity
CID:		1237826, 1237844, 1237847
MFC after:	1 week
2015-01-20 19:08:55 +00:00
Michael Tuexen
f885296d70 Don't zero the stats before they are read out.
MFC after: 3 days
2014-11-01 10:35:45 +00:00
Hans Petter Selasky
0e1152fcc2 The SYSCTL data pointers can come from userspace and must not be
directly accessed. Although this will work on some platforms, it can
throw an exception if the pointer is invalid and then panic the kernel.

Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-28 12:00:39 +00:00
Michael Tuexen
24aaac8d59 Use union sctp_sockstore instead of struct sockaddr_storage. This
eliminiates some warnings when building in userland.
Thanks to Patrick Laimbock for reporting this issue.
Remove also some unnecessary casts.
There should be no functional change.

MFC after: 1 week
2014-09-07 09:06:26 +00:00
Michael Tuexen
95e550801c Use SYSCTL_PROC instead of SYSCTL_VNET_PROC.
Suggested by: glebius@
MFC after: 1 week
2014-09-07 07:49:49 +00:00
Michael Tuexen
f47f328dc5 Fix the handling of sysctl variables when used with VIMAGE.
While there do some cleanup of the code.

MFC after: 1 week
2014-09-06 19:12:14 +00:00
Michael Tuexen
76031b19ef Announce SCTP support in the kern.features sysctl variables.
MFC after: 3 days
2014-08-26 21:15:34 +00:00
Michael Tuexen
97a0ca5b3e Change SCTP sysctl from auth_disable to auth_enable. This is
consistent with other similar sysctl variable used in SCTP.
2014-08-12 13:13:11 +00:00
Michael Tuexen
c79bec9c75 Add support for the SCTP_AUTH_SUPPORTED and SCTP_ASCONF_SUPPORTED
socket options. Add also a sysctl to control the support of ASCONF.

MFC after: 1 week
2014-08-12 11:30:16 +00:00
Michael Tuexen
317e00ef86 Add support for the SCTP_RECONFIG_SUPPORTED and the corresponding
sysctl controlling the negotiation of the RE-CONFIG extension.

MFC after: 3 days
2014-08-04 20:07:35 +00:00
Michael Tuexen
cb9b8e6f7d Add support for the SCTP_PKTDROP_SUPPORTED socket option and
the corresponding sysctl variable.
The default is off, since the specification is not an RFC yet.

MFC after: 1 week
2014-08-03 18:12:55 +00:00
Michael Tuexen
2fdf7a7a35 Use consistent names for SCTP sysctls. Rename
nr_sack_on_off to nrsack_enable.
Please note that this extension is off by default
since it is not specified in an RFC (yet).
2014-08-03 15:09:13 +00:00
Michael Tuexen
caea98793f Add SCTP socket option SCTP_NRSACK_SUPPORTED to control the
NRSACK extension. The default will still be off, since it
it not an RFC (yet).
Changing the sysctl name will be in a separate commit.

MFC after: 1 week
2014-08-03 14:10:10 +00:00
Michael Tuexen
dd973b0e15 Add support for the SCTP_PR_SUPPORTED socket option as specified in
http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-prpolicies
Add also a sysctl controlling the default of the end-points.

MFC after: 1 week
2014-08-02 21:36:40 +00:00
Michael Tuexen
47aac6fa4b Remove the asconf_auth_nochk sysctl. This was off by default and only
existed to be able to test with non-compliant peers a long time ago.
2014-08-01 20:49:27 +00:00
Michael Tuexen
6ba22f19ca Honor jails for unbound SCTP sockets when selecting source addresses,
reporting IP-addresses to the peer during the handshake, adding
addresses to the host, reporting the addresses via the sysctl
interface (used by netstat, for example) and reporting the
addresses to the application via socket options.
This issue was reported by Bernd Walter.

MFC after: 3 days
2014-06-20 13:26:49 +00:00
Michael Tuexen
ff1ffd7499 * Provide information in error causes in ASCII instead of
proprietary binary format.
* Add support for a diagnostic information error cause.
  The code is sysctlable and the default is 0, which
  means it is not sent.

This is joint work with rrs@.

MFC after: 1 week
2014-03-16 12:32:16 +00:00
Mikolaj Golub
db2f5a2461 Fixup for r261590 (vnet sysctl handlers cleanup).
Reviewed by:	glebius
2014-02-09 08:13:17 +00:00
Michael Tuexen
6be15a24c4 Export the inpcb features as a 64-bit entity.
Bump __FreeBSD_version to 1000048 since the
modified structure is user visible and used
by netstat, for example.
2013-08-22 20:29:57 +00:00
Michael Tuexen
06c9f9bddf Make also the features of the association 64-bit.
When exporting to xinpcb, just export the lower
32-bit. Using there also 64-bits will break the
ABI and will be committed separetly.

MFC after: 2 weeks
X-MFC with: 254248
2013-08-22 19:28:13 +00:00
Michael Tuexen
ee1ccd9258 Fix a bug were only 2048 streams where usable even though more than
2048 streams were negotiated on the wire. While there, remove the
hard coded limit of 2048 streams.

MFC after: 3 days
2013-07-05 10:08:49 +00:00
Michael Tuexen
a1cb341b5d Cleanup the handling of address scopes. Announce in the INIT/INIT-ACK
only the supported address types. While there, do some whitespace
cleanups.

MFC after: 1 week
2013-02-09 17:26:14 +00:00
Michael Tuexen
3a51a2647a Add support for SCTP/UDP/IPV6.
This completes the support of
http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-udp-encaps

MFC after: 1 week
2012-11-17 20:04:04 +00:00
Michael Tuexen
39803b8c58 Whitespace cleanup.
MFC after: 3 days
2012-06-25 17:15:09 +00:00
Michael Tuexen
8d9638ab33 Get rid of SCTP specific code to avoid CRC32C computations on loopback.
Just just offloading.
MFC after: 3 days
2012-05-26 09:16:33 +00:00
Michael Tuexen
807aad636f Use consistent text at the begining of the files.
MFC after: 3 days
2012-05-23 11:26:28 +00:00
Michael Tuexen
c58e60be43 Add an SCTP sysctl "blackhole", similar to the one for TCP.
If set to 1, no ABORT is sent back in response to an incoming
INIT. If set to 2, no ABORT is sent back in response to
an out of the blue packet. If set to 0 (the default), ABORTs
are sent.
Discussed with rrs@.

MFC after: 1 month.
2012-01-08 09:56:24 +00:00
Michael Tuexen
ab6174d587 Retire the SCTP sysctl "strict_init". We always perform the validation
and there is no reason to make is configuarable.
Discussed with rrs@.
2012-01-07 14:04:00 +00:00
Michael Tuexen
60990c0c06 Address issues found by clang. While there, fix also some style
issues.

MFC after: 3 months.
2011-12-27 10:16:24 +00:00
Michael Tuexen
7215cc1b74 Fix unused parameter warnings.
While there, fix some whitespace issues.

MFC after: 3 months.
2011-12-17 19:21:40 +00:00
Michael Tuexen
c9c5805975 Add support for the SCTP_REMOTE_UDP_ENCAPS_PORT socket option.
Retire the the now unused sctp_udp_tunneling_for_client_enable
sysctl variable.

MFC after: 3 months.
2011-11-20 15:00:45 +00:00
Michael Tuexen
ca85e9482a The result of a joint work between rrs@ and myself at the IETF:
* Decouple the path supervision using a separate HB timer per path.
* Add support for potentially failed state.
* Bring back RTO.min to 1 second.
* Accept packets on IP-addresses already announced via an ASCONF
* While there: do some cleanups.

Approved by: re@
MFC after: 2 months.
2011-08-03 20:21:00 +00:00
Michael Tuexen
e6194c2ed4 Improve compilation of SCTP code without INET support.
Some bugs where fixed while doing this:
* ASCONF-ACK messages might use wrong port number when using
  IPv6.
* Checking for additional addresses takes the correct address
  into account and also does not do more comparisons than
  necessary.

This patch is based on one received from bz@ who was
sponsored by The FreeBSD Foundation and iXsystems.

MFC after: 1 week
2011-04-30 11:18:16 +00:00
Randall Stewart
f79aab1866 Tunes and fixes the new DC-CC to seem to hit the
right mix.  Still may need some tweaks but it
appears to almost not give away too much to an
RFC2581 flow, but can really minimize the amount of
buffers used in the net.

MFC after:	3 months
2011-03-08 11:58:25 +00:00
Randall Stewart
48b6c64938 Adds a new Congestion Control that helps reduce
the RTT that a flow will build up in buffers in
transit. It is a slight modification to RFC2581
but is more friendly i.e. less aggressive.

MFC after:	3 months
2011-03-01 00:37:46 +00:00
Dimitry Andric
cb8750c269 Fix breakage in sys/netinet/sctp_sysctl.c, introduced by r219057. If
SCTP_HAS_RTTC is not defined, this file fails to compile.  Insert the
necessary #ifdefs to make it work.

Pointy hat to:	rrs
2011-02-26 22:45:40 +00:00
Randall Stewart
299108c5a2 Improvements to CC modules:
1) Add four new points that allow you to get more information
   to cc algo's
2) Fix the case where user changes module on a existing TCB, in
   such a case, the initialization module needs to be called on all nets.
3) Move htcp_cc structure to a union that other modules can use.
4) Add 5th point for get/set socket options for cc_module specific options

MFC after:	2 months
2011-02-26 15:23:46 +00:00
Michael Tuexen
be1d917696 * Cleanup the code computing the retransmission timeout.
* Fix an initialization bug for the scaled variance of the RTO.

MFC after: 3 months.
2011-02-24 22:36:40 +00:00
Michael Tuexen
f0878bdcc5 Bugfix: Get per vnet sysctl variables and statistics working.
MFC after:3 months.
2011-02-18 20:30:58 +00:00