persist state and the retransmit timer. However, the code that implements
"bad retransmit recovery" only checks t_rxtshift to see if an ACK has been
received in during the first retransmit timeout window. As a result, if
ticks has wrapped over to a negative value and a socket is in the persist
state, it can incorrectly treat an ACK from the remote peer as a
"bad retransmit recovery" and restore saved values such as snd_ssthresh and
snd_cwnd. However, if the socket has never had a retransmit timeout, then
these saved values will be zero, so snd_ssthresh and snd_cwnd will be set
to 0.
If the socket is in fast recovery (this can be caused by excessive
duplicate ACKs such as those fixed by 220794), then each ACK that arrives
triggers either NewReno or SACK partial ACK handling which clamps snd_cwnd
to be no larger than snd_ssthresh. In effect, the socket's send window
is permamently stuck at 0 even though the remote peer is advertising a
much larger window and pending data is only sent via TCP window probes
(so one byte every few seconds).
Fix this by adding a new TCP pcb flag (TF_PREVVALID) that indicates that
the various snd_*_prev fields in the pcb are valid and only perform
"bad retransmit recovery" if this flag is set in the pcb. The flag is set
on the first retransmit timeout that occurs and is cleared on subsequent
retransmit timeouts or when entering the persist state.
Reviewed by: bz
MFC after: 2 weeks
Expose ip_icmp.c to INET6 as well and only export badport_bandlim()
along with the two sysctls in the non-INET case.
The bandlim types work for all cases I reviewed in IPv6 as well and
the sysctls are available as we export net.inet.* from in_proto.c.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
Move ip_defttl to raw_ip.c where it is actually used. In an IPv6
only world we do not want to compile ip_input.c in for that and
it is a shared default with INET6.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
adding appropriate #ifdefs. For module builds the framework needs
adjustments for at least carp.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
As long as this is a costy function, even when compiled in (along with
the option TCP_SIGNATURE), it can be disabled via the
net.inet.tcp.signature_verify_input sysctl.
Sponsored by: Sandvine Incorporated
Reviewed by: emaste, bz
MFC after: 2 weeks
INET or INET6 as it holds all the IPPROTO_* definitions needed
for the SYSCTL_NODE definitions.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 5 days
When compiling out INET we still need the initialization routines
as well as the tuning and montoring sysctls shared with IPv6.
Move the two send/recvspace variables up from the middle of the
file to ease compiling out the INET only code.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 3 days
Move the ipport_tick_callout and related functions from ip_input.c
to in_pcb.c. The random source port allocation code has been merged
and is now local to in_pcb.c only.
Use a SYSINIT to get the callout started and no longer depend on
initialization from the inet code, which would not work in an IPv6
only setup.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
Move fw_one_pass to where it belongs: it is a property of ipfw,
not of ip_input.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 3 days
calculate required memory size dynamically.
- Fix races on chain re-lock.
- Introduce new field to ip_fw_chain - generation count. Now utilized
only in the NAT configuration, but can be utilized wider in ipfw.
- Get rid of NAT_BUF_LEN in ip_fw.h
PR: kern/143653
specified minimum and maximum. In case when specified default value
is out of bounds it does not work as expected and does not limit
variable. Check that default value is in range and limit it if needed.
Also bump max_hash_size value to 65536 to correspond with manual page.
PR: kern/152887
MFC after: 2 weeks
don't force a window update if the window would not actually grow due to
window scaling. Specifically, if the window scaling factor is larger than
2 * MSS, then after the local reader has drained 2 * MSS bytes from the
socket, a window update can end up advertising the same window. If this
happens, the supposed window update actually ends up being a duplicate ACK.
This can result in an excessive number of duplicate ACKs when using a
higher maximum socket buffer size.
Reviewed by: bz
MFC after: 1 month
While it does not provide any functionality for IPv6, it provides
the sysctl nodes for net.inet.* that a lot of functionality shared
between IPv4 and IPv6 depends on. We cannot change these anymore
without breaking a lot of management and tuning.
In case of IPv6 only, we compile out everything but the sysctl node
declarations.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC After: 5 days
to the maximum allowed window. Growing the window too large would cause
an underflow in the calculations in tcp_output() to decide if a window
update should be sent which would prevent the persist timer from being
started if data was pending and the other end of the connection advertised
an initial window size of 0.
PR: kern/154006
Submitted by: Stefan `Sec` Zehl sec 42 org
Reviewed by: bz
MFC after: 1 week
from another context at the moment of later access.
PR: kern/155555
Submitted by: Andrew Boyer <aboyer att averesystems.com>
Approved by: avg (mentor)
MFC after: 2 weeks
In some cases as udp6_connect() without an earlier bind(2) to an
address, v4-mapped scokets allowed and a non mapped destination
address, we can end up here with both v4 and v6 indicated:
inp_vflag = (INP_IPV4|INP_IPV6|INP_IPV6PROTO)
In that case however laddrp is NULL as the IPv6 path does not
pass in a copy currently.
Reported by: Pawel Worach (pawel.worach gmail.com)
Tested by: Pawel Worach (pawel.worach gmail.com)
MFC after: 6 days
X-MFC with: r219579
right mix. Still may need some tweaks but it
appears to almost not give away too much to an
RFC2581 flow, but can really minimize the amount of
buffers used in the net.
MFC after: 3 months
the RTT that a flow will build up in buffers in
transit. It is a slight modification to RFC2581
but is more friendly i.e. less aggressive.
MFC after: 3 months
1) Add four new points that allow you to get more information
to cc algo's
2) Fix the case where user changes module on a existing TCB, in
such a case, the initialization module needs to be called on all nets.
3) Move htcp_cc structure to a union that other modules can use.
4) Add 5th point for get/set socket options for cc_module specific options
MFC after: 2 months
VNET socket push back:
try to minimize the number of places where we have to switch vnets
and narrow down the time we stay switched. Add assertions to the
socket code to catch possibly unset vnets as seen in r204147.
While this reduces the number of vnet recursion in some places like
NFS, POSIX local sockets and some netgraph, .. recursions are
impossible to fix.
The current expectations are documented at the beginning of
uipc_socket.c along with the other information there.
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb
Tested by: zec
Tested by: Mikolaj Golub (to.my.trociny gmail.com)
MFC after: 2 weeks
at the Univ-of-Del. Basically when a 1-to-1 socket did a
socket/bind/send(data)/close. If the timing was right
we would dereference a socket that is NULL.
MFC after: 1 month
* Store the flowid when receiving an SCTP/IPv6 packet.
* Store the flowid when receiving an SCTP packet with wrong CRC.
* Initilize flowid correctly.
* Put test code under INVARIANTS.
MFC after: 3 months.