These are needed for the forthcoming vxlan implementation. The context
pointer means we do not have to use a spare pointer field in the inpcb,
and the source address is required to populate vxlan's forwarding table.
While I highly doubt there is an out of tree consumer of the UDP
tunneling callback, this change may be a difficult to eventually MFC.
Phabricator: https://reviews.freebsd.org/D383
Reviewed by: gnn
syscalls themselves are tightly coupled with the network stack and
therefore should not be in the generic socket code.
The following four syscalls have been marked as NOSTD so they can be
dynamically registered in sctp_syscalls_init() function:
sys_sctp_peeloff
sys_sctp_generic_sendmsg
sys_sctp_generic_sendmsg_iov
sys_sctp_generic_recvmsg
The syscalls are also set up to be dynamically registered when COMPAT32
option is configured.
As a side effect of moving the SCTP syscalls, getsock_cap needs to be
made available outside of the uipc_syscalls.c source file. A proper
prototype has been added to the sys/socketvar.h header file.
API tests from the SCTP reference implementation have been run to ensure
compatibility. (http://code.google.com/p/sctp-refimpl/source/checkout)
Submitted by: Steve Kiernan <stevek@juniper.net>
Reviewed by: tuexen, rrs
Obtained from: Juniper Networks, Inc.
This partitular case is the only path where the mbuf could be NULL.
udp_append() checked for a NULL mbuf only after invoking the tunneling
callback. Our only in tree tunneling callback - SCTP - assumed a non
NULL mbuf, and it is a bit odd to make the callbacks responsible for
checking this condition.
This also reduces the differences between the IPv4 and IPv6 code.
MFC after: 1 month
outer header, consider it as new packet and clear the protocols flags.
This fixes problems when IPSEC traffic goes through various tunnels and router
doesn't send ICMP/ICMPv6 errors.
PR: 174602
Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC
consistent with the amount of data provided in the SCTP_RESET_STREAMS
socket option.
Thanks to Peter Bostroem from Google for drawing my attention to
this part of the code.
from xnu sources. If we encounter a network where ICMP is blocked
the Needs Frag indicator may not propagate back to us. Attempt to
downshift the mss once to a preconfigured value.
Default this feature to off for now while we do not have a full PLPMTUD
implementation in our stack.
Adds the following new sysctl's for control:
net.inet.tcp.pmtud_blackhole_detection -- turns on/off this feature
net.inet.tcp.pmtud_blackhole_mss -- mss to try for ipv4
net.inet.tcp.v6pmtud_blackhole_mss -- mss to try for ipv6
Adds the following new sysctl's for monitoring:
-- Number of times the code was activated to attempt a mss downshift
net.inet.tcp.pmtud_blackhole_activated
-- Number of times the blackhole mss was used in an attempt to downshift
net.inet.tcp.pmtud_blackhole_min_activated
-- Number of times that we failed to connect after we downshifted the mss
net.inet.tcp.pmtud_blackhole_failed
Phabricator: https://reviews.freebsd.org/D506
Reviewed by: rpaulo bz
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Limelight Networks
'if'/'else' case: it matches the simple 'else' case that follows.
This reduces awareness of external-storage mechanics outside of the
mbuf allocator.
Reviewed by: bz
MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D900
that this means full checksum coverage for received packets.
If an application is willing to accept packets with partial
coverage, it is expected to use the socekt option and provice
the minimum coverage it accepts.
Reviewed by: kevlo
MFC after: 3 days
- tcp_get_sav() - SADB key lookup
- tcp_signature_do_compute() - actual computation
* Fix TCP signature case for listening socket:
do not assume EVERY connection coming to socket
with TCP_SIGNATURE set to be md5 signed regardless
of SADB key existance for particular address. This
fixes the case for routing software having _some_
BGP sessions secured by md5.
* Simplify TCP_SIGNATURE handling in tcp_input()
MFC after: 2 weeks
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
Reviewed by: adrian, rmacklem
Sponsored by: Mellanox Technologies
MFC after: 1 week
fibs. Use the mbuf's or the socket's fib instead of RT_ALL_FIBS. Fixes PR
187553. Also fixes netperf's UDP_STREAM test on a nondefault fib.
sys/netinet/ip_output.c
In ip_output, lookup the source address using the mbuf's fib instead
of RT_ALL_FIBS.
sys/netinet/in_pcb.c
in in_pcbladdr, lookup the source address using the socket's fib,
because we don't seem to have the mbuf fib. They should be the same,
though.
tests/sys/net/fibs_test.sh
Clear the expected failure on udp_dontroute.
PR: 187553
CR: https://reviews.freebsd.org/D772
MFC after: 3 weeks
Sponsored by: Spectra Logic
This fixes a bug which resulted in a warning on the userland
stack, when compiled on Windows.
Thanks to Peter Kasting from Google for reporting the issue and
provinding a potential fix.
MFC after: 3 days
towards blind SYN/RST spoofed attack.
Originally our stack used in-window checks for incoming SYN/RST
as proposed by RFC793. Later, circa 2003 the RST attack was
mitigated using the technique described in P. Watson
"Slipping in the window" paper [1].
After that, the checks were only relaxed for the sake of
compatibility with some buggy TCP stacks. First, r192912
introduced the vulnerability, just fixed by aforementioned SA.
Second, r167310 had slightly relaxed the default RST checks,
instead of utilizing net.inet.tcp.insecure_rst sysctl.
In 2010 a new technique for mitigation of these attacks was
proposed in RFC5961 [2]. The idea is to send a "challenge ACK"
packet to the peer, to verify that packet arrived isn't spoofed.
If peer receives challenge ACK it should regenerate its RST or
SYN with correct sequence number. This should not only protect
against attacks, but also improve communication with broken
stacks, so authors of reverted r167310 and r192912 won't be
disappointed.
[1] http://bandwidthco.com/whitepapers/netforensics/tcpip/TCP Reset Attacks.pdf
[2] http://www.rfc-editor.org/rfc/rfc5961.txt
Changes made:
o Revert r167310.
o Implement "challenge ACK" protection as specificed in RFC5961
against RST attack. On by default.
- Carefully preserve r138098, which handles empty window edge
case, not described by the RFC.
- Update net.inet.tcp.insecure_rst description.
o Implement "challenge ACK" protection as specificed in RFC5961
against SYN attack. On by default.
- Provide net.inet.tcp.insecure_syn sysctl, to turn off
RFC5961 protection.
The changes were tested at Netflix. The tested box didn't show
any anomalies compared to control box, except slightly increased
number of TCP connection in LAST_ACK state.
Reviewed by: rrs
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
MFC after: 1 week
Sponsored by: Mellanox Technologies
ifa_ifwithdstaddr. For the sake of backwards compatibility, the new
arguments were added to new functions named ifa_ifwithnet_fib and
ifa_ifwithdstaddr_fib, while the old functions became wrappers around the
new ones that passed RT_ALL_FIBS for the fib argument. However, the
backwards compatibility is not desired for FreeBSD 11, because there are
numerous other incompatible changes to the ifnet(9) API. We therefore
decided to remove it from head but leave it in place for stable/9 and
stable/10. In addition, this commit adds the fib argument to
ifa_ifwithbroadaddr for consistency's sake.
sys/sys/param.h
Increment __FreeBSD_version
sys/net/if.c
sys/net/if_var.h
sys/net/route.c
Add fibnum argument to ifa_ifwithbroadaddr, and remove the _fib
versions of ifa_ifwithdstaddr, ifa_ifwithnet, and ifa_ifwithroute.
sys/net/route.c
sys/net/rtsock.c
sys/netinet/in_pcb.c
sys/netinet/ip_options.c
sys/netinet/ip_output.c
sys/netinet6/nd6.c
Fixup calls of modified functions.
share/man/man9/ifnet.9
Document changed API.
CR: https://reviews.freebsd.org/D458
MFC after: Never
Sponsored by: Spectra Logic
A non-global IPv6 address can be used in more than one zone of the same
scope. This zone index is used to identify to which zone a non-global
address belongs.
Also we can have many foreign hosts with equal non-global addresses,
but from different zones. So, they can have different metrics in the
host cache.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC