No functional changes.
net/route/shared.h was created in the inital phases of nexthop conversion.
It was intended to serve the same purpose as route_var.h - share definitions
of functions and structures between the routing subsystem components. At
that time route_var.h was included by many files external to the routing
subsystem, which largerly defeats its purpose.
As currently this is not the case anymore and amount of route_var.h includes
is roughly the same as shared.h, retire the latter in favour of the former.
of acked bytes as described in Section 2.2 of that document.
This patch ensures that this limit is not also applied in congestion
avoidance. Applying this limit also in congestion avoidance can result in
using less bandwidth than allowed.
Reported by: l.tian.email@gmail.com
Reviewed by: rrs, rscheff
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D26120
Remove most special treatment for ifnet TLS in the TCP stack, except
for code to avoid mixing handshakes and bulk data.
This code made heroic efforts to send down entire TLS records to
NICs. It was added to improve the PCIe bus efficiency of older TLS
offload NICs which did not keep state per-session, and so would need
to re-DMA the first part(s) of a TLS record if a TLS record was sent
in multiple TCP packets or TSOs. Newer TLS offload NICs do not need
this feature.
At Netflix, we've run extensive QoE tests which show that this feature
reduces client quality metrics, presumably because the effort to send
TLS records atomically causes the server to both wait too long to send
data (leading to buffers running dry), and to send too much data at
once (leading to packet loss).
Reviewed by: hselasky, jhb, rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26103
Since cubic calculates cwnd based on absolute
time, retaining RFC3465 (ABC) once-per-window updates
can lead to dramatic changes of cwnd in the convex
region. Updating cwnd for each incoming ack minimizes
this delta, preventing unintentional line-rate bursts.
Reviewed by: chengc_netapp.com, tuexen (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D26060
Remove unused arguments from dom_rtattach/dom_rtdetach functions and make
them return/accept 'struct rib_head' instead of 'void **'.
Declare inet/inet6 implementations in the relevant _var.h headers similar
to domifattach / domifdetach.
Add rib_subscribe_internal() function to accept subscriptions to the rnh
directly.
Differential Revision: https://reviews.freebsd.org/D26053
The Cubic concave region was not aligned nicely for the very first exit from
slow start, where a 50% cwnd reduction is done instead of the normal 30%.
This addresses an issue, where a short line-rate burst could result from that
sudden jump of cwnd.
In addition, the Fast Convergence Heuristic has been expanded to work also
with ECN induced congestion response.
Submitted by: chengc_netapp.com
Reported by: chengc_netapp.com
Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D25976
Initializing K to zero in D23655 introduced a miscalculation,
where cwnd would suddenly jump to cwnd_max instead of gradually
increasing, after leaving slow-start.
Properly calculating K instead of resetting it to zero resolves
this issue. Also making sure, that cwnd is recalculated at the
earliest opportunity once slow-start is over.
Reported by: chengc_netapp.com
Reviewed by: chengc_netapp.com, tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D25746
Adding proper accounting of sacked_bytes and (per-ACK)
delivered data to the SACK scoreboard. This will
allow more aspects of RFC6675 to be implemented as well
as Proportional Rate Reduction (RFC6937).
Prior to this change, the pipe calculation controlled with
net.inet.tcp.rfc6675_pipe was also susceptible to incorrect
results when more than 3 (or 4) holes in the sequence space
were present, which can no longer all fit into a single
ACK's SACK option.
Reviewed by: kbowling, rgrimes (mentor)
Approved by: rgrimes (mentor, blanket)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D18624
* Let the accepted TCP/IPv4 socket inherit the configured TTL and
TOS value.
* Let the accepted TCP/IPv6 socket inherit the configured Hop Limit.
* Use the configured Hop Limit and Traffic Class when sending
IPv6 packets.
Reviewed by: rrs, lutz_donnerhacke.de
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D25909
are where we are now. The main thing is to try to get rid of the delayed
freeing to avoid blocking on the taskq when shutting down vnets.
X-Timeout: if you still see this before 14-RELEASE remove it.
destroying a VNET or a network interface.
Else the inm release tasks, both IPv4 and IPv6 may cause a panic
accessing a freed VNET or network interface.
Reviewed by: jmg@
Discussed with: bz@
Differential Revision: https://reviews.freebsd.org/D24914
MFC after: 1 week
Sponsored by: Mellanox Technologies
sure that
* ECN is disabled if the client sends an non-ECN-setup SYN segment.
* ECN is disabled is the ECN-setup SYN-ACK segment is retransmitted more
than net.inet.tcp.ecn.maxretries times.
Reviewed by: rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D26008
When using v4-mapped IPv6 sockets with IPV6_PKTINFO we do not
respect the given v4-mapped src address on the IPv4 socket.
Implement the needed functionality. This allows single-socket
UDP applications (such as OpenVPN) to work better on FreeBSD.
Requested by: Gert Doering (gert greenie.net), pfsense
Tested by: Gert Doering (gert greenie.net)
Reviewed by: melifaro
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24135
back from the end of the function created an issue.
If one of the routines returns NULL during setup
we have inp's with extra references (which is why
the increment was at the end).
Also the stack switch return code was being ignored
and actually has meaning if the stack cannot take over
it should return NULL.
Fix both of these situation by being sure to test the
return code and of course in any case of return NULL (there
are 3) make sure we properly reduce the ref count.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D25903
an ungraceful operation.
This fixes a use-after-free bug found and reported by Taylor
Brandstetter of Google by testing the userland stack.
MFC after: 1 week
In cubic, the concave region is when snd_cwnd starts growing slower
towards max_cwnd (cwnd at the time of the congestion event), and
the convex region is when snd_cwnd starts to grow faster and
eventually appearing like slow-start like growth.
PR: 238478
Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D24657
Without versioning information, using preexisting loader /
linker code is not easily possible when another module may
have dependencies on pre-loaded modules, and also doesn't
allow the automatic loading of dependent modules.
No functional change of the actual modules.
Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D25744
This avoids a use-after-free reported for the userland stack.
Thanks to Taylor Brandstetter for suggesting a patch for
the userland stack.
MFC after: 1 week
In r361752 an error handling was introduced for using 0.0.0.0 or
255.255.255.255 as the address in connect() for TCP, since both
addresses can't be used. However, the stack maps 0.0.0.0 implicitly
to a local address and at least two regressions were reported.
Therefore, re-allow the usage of 0.0.0.0.
While there, change the error indicated when using 255.255.255.255
from EAFNOSUPPORT to EACCES as mentioned in the man-page of connect().
Reviewed by: rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D25401
In case of errors, the cleanup was not consistent.
Thanks to Felix Weinrank for fuzzing the userland stack and making
me aware of the issue.
MFC after: 1 week
This file is only compiled if INET or INET6 is defined. So there
is no need for checking that.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25635
r363079 introduced the possibility of loading the SCTP stack as a module in
addition to compiling it into the kernel. As part of this, the registration
of the system calls was removed and put into the loading of the module.
Therefore, the system calls are not registered anymore when compiling the
SCTP into the kernel. This patch addresses that.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25632
With this change, a kernel compiled with "options SCTP_SUPPORT" and
without "options SCTP" supports dynamic loading of the SCTP stack.
Currently sctp.ko cannot be unloaded since some prerequisite teardown
logic is not yet implemented. Attempts to unload the module will return
EOPNOTSUPP.
Discussed with: tuexen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21997
While testing with system default cc set to cubic, and
running a memory exhaustion validation, FreeBSD panics for a
missing inpcb reference / lock.
Reviewed by: rgrimes (mentor), tuexen (mentor)
Approved by: rgrimes (mentor), tuexen (mentor)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D25583
fib[46]_lookup_nh_ represents pre-epoch generation of fib api, providing less guarantees
over pointer validness and requiring on-stack data copying.
With no callers remaining, remove fib[46]_lookup_nh_ functions.
Submitted by: Neel Chauhan <neel AT neelc DOT org>
Differential Revision: https://reviews.freebsd.org/D25445
It may be possible to fix this by deferring the lookup, but let's
keep the initial change simple to make MFCs easier.
PR: 246951
Reviewed by: melifaro
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25519
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().
Suggested by: cem
Reviewed by: cem, delphij
Approved by: csprng (cem)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25435