Commit Graph

6529 Commits

Author SHA1 Message Date
Michael Tuexen
413c3db101 Allow the TCP backhole detection to be disabled at all, enabled only
for IPv4, enabled only for IPv6, and enabled for IPv4 and IPv6.
The current blackhole detection might classify a temporary outage as
an MTU issue and reduces permanently the MSS. Since the consequences of
such a reduction due to a misclassification are much more drastically
for IPv4 than for IPv6, allow the administrator to enable it for IPv6 only.

Reviewed by:		bcr@ (man page), Richard Scheffenegger
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D24219
2020-03-31 15:54:54 +00:00
Mark Johnston
9b1d850be8 Remove the "config" taskqgroup and its KPIs.
Equivalent functionality is already provided by taskqueue(9), just use
that instead.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-03-30 14:24:03 +00:00
Michael Tuexen
9aca687811 Small cleanup by using a variable just assigned.
MFC after:		1 week
2020-03-28 22:35:04 +00:00
Michael Tuexen
25ec355353 Handle integer overflows correctly when converting msecs and secs to
ticks and vice versa.
These issues were caught by recently added panic() calls on INVARIANTS
systems.

Reported by:		syzbot+b44787b4be7096cd1590@syzkaller.appspotmail.com
Reported by:		syzbot+35f82d22805c1e899685@syzkaller.appspotmail.com
MFC after:		1 week
2020-03-28 20:25:45 +00:00
Ed Maste
c012cfe68a sys/netinet: remove spurious doubled ;s 2020-03-27 23:10:18 +00:00
Michael Tuexen
d5d190f2f9 Some more uint32_t cleanups, no functional change.
MFC after:		1 week
2020-03-27 21:48:52 +00:00
Michael Tuexen
239e5865df Use uint32_t where it is expected to be used. No functional change.
MFC after:		1 week
2020-03-27 11:08:11 +00:00
Michael Tuexen
7c63520c42 Remove an optimization, which was incorrect a couple of times and
therefore doesn't seem worth to be there.
In this case COOKIE where not retransmitted anymore, when the
socket was already closed.

MFC after:		1 week
2020-03-25 18:20:37 +00:00
Michael Tuexen
37686ccf08 Improve consistency in debug output.
MFC after:		1 week
2020-03-25 18:14:12 +00:00
Michael Tuexen
24187cfe72 Revert https://svnweb.freebsd.org/changeset/base/357829
This introduces a regression reported by koobs@ when running a pyhton
test suite on a loaded system.

This patch resulted in a failing accept() call, when the association
was setup and gracefully shutdown by the peer before accept was called.
So the following packetdrill script would fail:

+0.0 socket(..., SOCK_STREAM, IPPROTO_SCTP) = 3
+0.0 bind(3, ..., ...) = 0
+0.0 listen(3, 1) = 0
+0.0 < sctp: INIT[flgs=0, tag=1, a_rwnd=15000, os=1, is=1, tsn=1]
+0.0 > sctp: INIT_ACK[flgs=0, tag=2, a_rwnd=..., os=..., is=..., tsn=1, ...]
+0.1 < sctp: COOKIE_ECHO[flgs=0, len=..., val=...]
+0.0 > sctp: COOKIE_ACK[flgs=0]
+0.0 < sctp: DATA[flgs=BE, len=116, tsn=1, sid=0, ssn=0, ppid=0]
+0.0 > sctp: SACK[flgs=0, cum_tsn=1, a_rwnd=..., gaps=[], dups=[]]
+0.0 < sctp: SHUTDOWN[flgs=0, cum_tsn=0]
+0.0 > sctp: SHUTDOWN_ACK[flgs=0]
+0.0 < sctp: SHUTDOWN_COMPLETE[flgs=0]
+0.0 accept(3, ..., ...) = 4
+0.0 close(3) = 0
+0.0 recv(4, ..., 4096, 0) = 100
+0.0 recv(4, ..., 4096, 0) = 0
+0.0 close(4) = 0

Reported by:		koops@
2020-03-25 15:29:01 +00:00
Michael Tuexen
23e3c0880d Use consistent debug output.
MFC after:		1 week
2020-03-25 13:19:41 +00:00
Michael Tuexen
e056fafd92 Don't restore the vnet too early in error cases.
MFC after:		1 week
2020-03-25 13:18:37 +00:00
Michael Tuexen
7522682e5e Only call panic when building with INVARIANTS.
MFC after:		1 week
2020-03-24 23:04:07 +00:00
Michael Tuexen
a412576e36 Another cleanup of the timer code. Also be more pedantic about the
parameters of the timer start and stop routines. Several inconsistencies
have been fixed in earlier commits. Now they will be catched when running
an INVARIANTS system.

MFC after:	1 week
2020-03-24 22:44:36 +00:00
Michael Tuexen
d084818d9d Cleanup the file and add two ASSERT variants for locks, which will be
used shortly.

MFC after:		1 week
2020-03-23 12:17:13 +00:00
Michael Tuexen
a57fb68b92 More timer cleanups, no functional change.
MFC after:		1 week
2020-03-21 16:12:19 +00:00
Michael Tuexen
fa8ceba9ca Remove a set, but unused variable.
MFC after:		1 week
2020-03-20 14:49:44 +00:00
Michael Tuexen
2bdebd0ce3 A a missing NET_EPOCH_ENTER/NET_EPOCH_EXIT pair. This was affecting
implicit connection setups via sendmsg().

Reported by:		syzbot+febbe3383a0e9b700c1b@syzkaller.appspotmail.com
Reported by:		syzbot+dca98631455d790223ca@syzkaller.appspotmail.com
Reported by:		syzbot+5a71a7760d6bcf11b8cd@syzkaller.appspotmail.com
Reported by:		syzbot+da64217e140444c49f00@syzkaller.appspotmail.com
2020-03-19 23:07:52 +00:00
Michael Tuexen
6fb7b4fbdb Consistently provide arguments for timer start and stop routines.
This is another step in cleaning up timer handling.
MFC after:		1 week
2020-03-19 21:01:16 +00:00
Michael Tuexen
e95b3d7faf Cleanup the stream reset and asconf timer.
MFC after:		1 week
2020-03-19 18:55:54 +00:00
Michael Tuexen
42078d5ada The MTU candidates MUST be a multiple of 4, so make them so.
MFC after:		1 week
2020-03-19 14:37:28 +00:00
Michael Tuexen
0554e01d8b Handle the timers in a consistent sequence according to the definition
of the timer type.
Just a cleanup, no functional change intended.

MFC after:		1 week
2020-03-17 19:20:12 +00:00
Andrew Gallatin
ee7a9e506e Avoid a cache miss accessing an mbuf ext_pgs pointer when doing SW kTLS.
For a Netflix 90Gb/s 100% TLS software kTLS workload, this reduces
the CPI of tcp_m_copym() from ~3.5 to ~2.5 as reported by vtune.

Reviewed by:	jtl, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D23998
2020-03-16 14:03:27 +00:00
Michael Tuexen
7ca6e2963f Use KMOD_TCPSTAT_INC instead of TCPSTAT_INC for RACK and BBR, since
these are kernel modules. Also add a KMOD_TCPSTAT_ADD and use that
instead of TCPSTAT_ADD.

Reviewed by:		jtl@, rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D23904
2020-03-12 15:37:41 +00:00
Andrew Gallatin
98085bae8c make lacp's use_numa hashing aware of send tags
When I did the use_numa support, I missed the fact that there is
a separate hash function for send tag nic selection. So when
use_numa is enabled, ktls offload does not work properly, as it
does not reliably allocate a send tag on the proper egress nic
since different egress nics are selected for send-tag allocation
and packet transmit. To fix this, this change:

- refectors lacp_select_tx_port_by_hash() and
     lacp_select_tx_port() to make lacp_select_tx_port_by_hash()
     always called by lacp_select_tx_port()

-   pre-shifts flowids to convert them to hashes when calling lacp_select_tx_port_by_hash()

-   adds a numa_domain field to if_snd_tag_alloc_params

-   plumbs the numa domain into places where we allocate send tags

In testing with NIC TLS setup on a NUMA machine, I see thousands
of output errors before the change when enabling
kern.ipc.tls.ifnet.permitted=1. After the change, I see no
errors, and I see the NIC sysctl counters showing active TLS
offload sessions.

Reviewed by:	rrs, hselasky, jhb
Sponsored by:	Netflix
2020-03-09 13:44:51 +00:00
Hiroki Sato
d726e6331b Fix an issue of net.inet.igmp.stats handler.
The header of (struct igmpstat) could be cleared by sysctl(3).
This can be reproduced by "netstat -s -z -p igmp".

PR:	244584
MFC after:	1 week
2020-03-07 08:41:10 +00:00
Michael Tuexen
9c04fdfd34 When using automatically generated flow labels and using TCP SYN
cookies, use the same flow label for the segments sent during the
handshake and after the handshake.
This fixes a bug by making sure that sc_flowlabel is always stored in
network byte order.

Reviewed by:		bz@
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D23957
2020-03-04 16:41:25 +00:00
Bjoern A. Zeeb
d2b8fd0da1 Add new ICMPv6 counters for Anti-DoS limits.
Add four new counters for ND6 related Anti-DoS measures.
We split these out into a separate upfront commit so that we only
change the struct size one time.  Implementations using them will
follow.

PR:		157410
Reviewed by:	melifaro
MFC after:	2 weeks
X-MFC:		cannot really MFC this without breaking netstat
Sponsored by:	Netflix (initially)
Differential Revision:	https://reviews.freebsd.org/D22711
2020-03-04 16:20:59 +00:00
Michael Tuexen
6605e5791f Don't send an uninitilised traffic class in the IPv6 header, when
sending a TCP segment from the TCP SYN cache (like a SYN-ACK).
This fix initialises it to zero. This is correct for the ECN bits,
but is does not honor the DSCP what an application might have set via
the IPPROTO_IPV6 level socket options IPV6_TCLASS. That will be
fixed separately.

Reviewed by:		Richard Scheffenegger
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D23900
2020-03-04 12:22:53 +00:00
Bjoern A. Zeeb
4e1a3ff884 tcp_hpts: make RSS kernel compile again.
Add proper #includes, and #ifdefs and some style fixes to make RSS
kernels compile again.  There are still possible issues with uin16_t
vs. uint_t cpuid which I am not going near.

Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D23726
2020-03-03 14:15:30 +00:00
Michael Tuexen
7e1e491f60 Remove stale definitions. The removed definitions are not used right
now and are incompatible with the correct ones in RFC 3168.

Submitted by:		Richard Scheffenegger
Differential Revision:	https://reviews.freebsd.org/D23903
2020-03-01 12:34:27 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Randall Stewart
d7313dc6f5 This commit expands tcp_ratelimit to be able to handle cards
like the mlx-c5 and c6 that require a "setup" routine before
the tcp_ratelimit code can declare and use a rate. I add the
setup routine to if_var as well as fix tcp_ratelimit to call it.
I also revisit the rates so that in the case of a mlx card
of type c5/6 we will use about 100 rates concentrated in the range
where the most gain can be had (1-200Mbps). Note that I have
tested these on a c5 and they work and perform well. In fact
in an unloaded system they pace right to the correct rate (great
job mlx!). There will be a further commit here from Hans that
will add the respective changes to the mlx driver to support this
work (which I was testing with).

Sponsored by:	Netflix Inc.
Differential Revision:	ttps://reviews.freebsd.org/D23647
2020-02-26 13:48:33 +00:00
Pawel Biernacki
295a18d184 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (14 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Approved by:	kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23639
2020-02-24 10:47:18 +00:00
Pawel Biernacki
10b49b2302 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (6 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

Mark all nodes in pf, pfsync and carp as MPSAFE.

Reviewed by:	kp
Approved by:	kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23634
2020-02-21 16:23:00 +00:00
Michael Tuexen
64f29eb1df Remove an unused timer type.
MFC after:		1 week
2020-02-20 15:37:44 +00:00
Michael Tuexen
868b51f234 Epochify SCTP. 2020-02-18 21:25:17 +00:00
Michael Tuexen
ba0d525006 Remove unused function. 2020-02-18 19:41:55 +00:00
Michael Tuexen
a610bb2120 Fix the non-default stream schedulers such that do not interleave
user messages when it is now allowed.

Thanks to Christian Wright for reporting the issue for the userland
stack and providing a fix for the priority scheduler.

MFC after:		1 week
2020-02-17 18:05:03 +00:00
Michael Tuexen
6b8fba3c5c Don't use uninitialised stack memory if the sysctl variable
net.inet.tcp.hostcache.enable is set to 0.
The bug resulted in using possibly a too small MSS value or wrong
initial retransmission timer settings. Possibly the value used
for ssthresh was also wrong.

Submitted by:		Richard Scheffenegger
Reviewed by:		Cheng Cui, rgrimes@, tuexen@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D23687
2020-02-17 14:54:21 +00:00
Hans Petter Selasky
bacb11c9ed Fix kernel panic while trying to read multicast stream.
When VIMAGE is enabled make sure the "m_pkthdr.rcvif" pointer is set
for all mbufs being input by the IGMP/MLD6 code. Else there will be a
NULL-pointer dereference in the netisr code when trying to set the
VNET based on the incoming mbuf. Add an assert to catch this when
queueing mbufs on a netisr to make debugging of similar cases easier.

Found by:	Vladislav V. Prodan
PR:		244002
Reviewed by:	bz@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-17 09:46:32 +00:00
Mateusz Guzik
6b25673f3f sctp: use new capsicum helpers 2020-02-15 01:29:40 +00:00
Michael Tuexen
a357466592 sack_newdata and snd_recover hold the same value. Therefore, use only
a single instance: use snd_recover also where sack_newdata was used.

Submitted by:		Richard Scheffenegger
Differential Revision:	https://reviews.freebsd.org/D18811
2020-02-13 15:14:46 +00:00
Michael Tuexen
33f8cfdfe4 Whitespace cleanup. No functional change.
Sponsored by:		Netflix, Inc.
2020-02-13 13:58:34 +00:00
Michael Tuexen
56ccb48fd6 Don't panic under INVARIANTS when we can't allocate memory for storing
a vtag in time wait.
This issue was found by running syzkaller.

MFC after:		1 week
2020-02-12 17:05:10 +00:00
Michael Tuexen
ca3de626ec Mark the socket as disconnected when freeing the association the first
time.
This issue was found by running syzkaller.

MFC after:		1 week
2020-02-12 17:02:15 +00:00
Randall Stewart
348404bce1 Lets get the real correct version.. gessh. I need
more coffee evidently.

Sponsored by:	Netflix
2020-02-12 15:26:56 +00:00
Randall Stewart
b8f8a6b719 Opps committed the wrong ratelimit version in the
whitespace cleanup.. Restore it to the proper version.

Sponsored by:	Netfilx Inc.
2020-02-12 13:37:53 +00:00
Randall Stewart
481be5de9d White space cleanup -- remove trailing tab's or spaces
from any line.

Sponsored by:	Netflix Inc.
2020-02-12 13:31:36 +00:00
Randall Stewart
df341f5986 Whitespace, remove from three files trailing white
space (leftover presents from emacs).

Sponsored by:	Netflix Inc.
2020-02-12 13:07:09 +00:00