4974 Commits

Author SHA1 Message Date
jhb
93f2095877 Update ip_divert.ko to depend on version 3 of ipfw. 2014-10-11 16:08:54 +00:00
bryanv
ed3bbe0a29 Add context pointer and source address to the UDP tunnel callback
These are needed for the forthcoming vxlan implementation. The context
pointer means we do not have to use a spare pointer field in the inpcb,
and the source address is required to populate vxlan's forwarding table.

While I highly doubt there is an out of tree consumer of the UDP
tunneling callback, this change may be a difficult to eventually MFC.

Phabricator:	https://reviews.freebsd.org/D383
Reviewed by:	gnn
2014-10-10 06:08:59 +00:00
bryanv
61ddedd3c6 Add missing UDP multicast receive dtrace probes
Phabricator:	https://reviews.freebsd.org/D924
Reviewed by:	rpaulo markj
MFC after:	1 month
2014-10-09 22:36:21 +00:00
tuexen
a83188df55 Ensure that the flags field of sctp_tmit_chunks is initialized.
Thanks to Peter Bostroem from Google for reporting the issue.

MFC after: 3 days
2014-10-09 20:08:12 +00:00
melifaro
d23efba7dd Sync to HEAD@r272825. 2014-10-09 15:35:28 +00:00
marcel
42d9d5479e Move the SCTP syscalls to netinet with the rest of the SCTP code. The
syscalls themselves are tightly coupled with the network stack and
therefore should not be in the generic socket code.

The following four syscalls have been marked as NOSTD so they can be
dynamically registered in sctp_syscalls_init() function:
  sys_sctp_peeloff
  sys_sctp_generic_sendmsg
  sys_sctp_generic_sendmsg_iov
  sys_sctp_generic_recvmsg

The syscalls are also set up to be dynamically registered when COMPAT32
option is configured.

As a side effect of moving the SCTP syscalls, getsock_cap needs to be
made available outside of the uipc_syscalls.c source file.  A proper
prototype has been added to the sys/socketvar.h header file.

API tests from the SCTP reference implementation have been run to ensure
compatibility. (http://code.google.com/p/sctp-refimpl/source/checkout)

Submitted by:	Steve Kiernan <stevek@juniper.net>
Reviewed by:	tuexen, rrs
Obtained from:	Juniper Networks, Inc.
2014-10-09 15:16:52 +00:00
bryanv
5dee4f91b5 Check for mbuf copy failure when there are multiple multicast sockets
This partitular case is the only path where the mbuf could be NULL.
udp_append() checked for a NULL mbuf only after invoking the tunneling
callback. Our only in tree tunneling callback - SCTP - assumed a non
NULL mbuf, and it is a bit odd to make the callbacks responsible for
checking this condition.

This also reduces the differences between the IPv4 and IPv6 code.

MFC after:	1 month
2014-10-09 05:17:47 +00:00
ae
d5044f2234 When tunneling interface is going to insert mbuf into netisr queue after stripping
outer header, consider it as new packet and clear the protocols flags.

This fixes problems when IPSEC traffic goes through various tunnels and router
doesn't send ICMP/ICMPv6 errors.

PR:		174602
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2014-10-08 21:23:34 +00:00
tuexen
dd7ab00a4f Ensure that the list of streams sent in a stream reset parameter fits
in an mbuf-cluster.
Thanks to Peter Bostroem for drawing my attention to this part of the code.
2014-10-08 15:30:59 +00:00
tuexen
532fe9e857 Ensure that the number of stream reported in srs_number_streams is
consistent with the amount of data provided in the SCTP_RESET_STREAMS
socket option.
Thanks to Peter Bostroem from Google for drawing my attention to
this part of the code.
2014-10-08 15:29:49 +00:00
melifaro
13cd9545ed Add IP_FW_DUMP_SOPTCODES sopt to be able to determine
which opcodes are currently available in kernel.
2014-10-08 11:12:14 +00:00
sbruno
79bb3d3455 Implement PLPMTUD blackhole detection (RFC 4821), inspired by code
from xnu sources.  If we encounter a network where ICMP is blocked
the Needs Frag indicator may not propagate back to us.  Attempt to
downshift the mss once to a preconfigured value.

Default this feature to off for now while we do not have a full PLPMTUD
implementation in our stack.

Adds the following new sysctl's for control:
net.inet.tcp.pmtud_blackhole_detection -- turns on/off this feature
net.inet.tcp.pmtud_blackhole_mss       -- mss to try for ipv4
net.inet.tcp.v6pmtud_blackhole_mss     -- mss to try for ipv6

Adds the following new sysctl's for monitoring:
-- Number of times the code was activated to attempt a mss downshift
net.inet.tcp.pmtud_blackhole_activated
-- Number of times the blackhole mss was used in an attempt to downshift
net.inet.tcp.pmtud_blackhole_min_activated
-- Number of times that we failed to connect after we downshifted the mss
net.inet.tcp.pmtud_blackhole_failed

Phabricator:	https://reviews.freebsd.org/D506
Reviewed by:	rpaulo bz
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Limelight Networks
2014-10-07 21:50:28 +00:00
melifaro
bbf0fe2f55 Sync to HEAD@r272609. 2014-10-06 11:29:50 +00:00
hselasky
f1f39047b6 Minor code styling.
Suggested by:	glebius @
2014-10-06 06:19:54 +00:00
tuexen
7384b54f43 Remove unused MC_ALIGN macro as suggested by Robert.
MFC after: 1 week
2014-10-05 20:30:49 +00:00
rwatson
b363d98dc7 Eliminate use of M_EXT in IP6_EXTHDR_CHECK() by trimming a redundant
'if'/'else' case: it matches the simple 'else' case that follows.

This reduces awareness of external-storage mechanics outside of the
mbuf allocator.

Reviewed by:	bz
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D900
2014-10-05 06:28:53 +00:00
melifaro
e8d559896c Sync to HEAD@r272516. 2014-10-04 12:42:37 +00:00
hrs
43ddffd06b Add an additional routing table lookup when m->m_pkthdr.fibnum is changed
at a PFIL hook in ip{,6}_output().  IPFW setfib rule did not perform
a routing table lookup when the destination address was not changed.

CR:	D805
2014-10-02 00:25:57 +00:00
markj
77fb3d6bbe Add a sysctl, net.inet.icmp.tstamprepl, which can be used to disable replies
to ICMP Timestamp packets.

PR:		193689
Submitted by:	Anthony Cornehl <accornehl@gmail.com>
MFC after:	3 weeks
Sponsored by:	EMC / Isilon Storage Division
2014-10-01 18:07:34 +00:00
melifaro
d8b683d70f Remove lock init from radix.c.
Radix has never managed its locking itself.
The only consumer using radix with embeded rwlock
is system routing table. Move per-AF lock inits there.
2014-10-01 14:39:06 +00:00
tuexen
911b31bff0 The default for UDPLITE_RECV_CSCOV is zero. RFC 3828 recommend
that this means full checksum coverage for received packets.
If an application is willing to accept packets with partial
coverage, it is expected to use the socekt option and provice
the minimum coverage it accepts.

Reviewed by: kevlo
MFC after: 3 days
2014-10-01 05:43:29 +00:00
tuexen
e642e7ce0e UDPLite requires a checksum. Therefore, discard a received packet if
the checksum is 0.

MFC after: 3 days
2014-09-30 20:29:58 +00:00
tuexen
ed4e0bcac5 If the checksum coverage field in the UDPLITE header is the length
of the complete UDPLITE packet, the packet has full checksum coverage.
SO fix the condition.

Reviewed by: kevlo
MFC after: 3 days
2014-09-30 18:17:28 +00:00
jhb
9d6dc35e9d Only define the full inm_print() if KTR_IGMPV3 is enabled at compile time. 2014-09-30 17:26:34 +00:00
tuexen
186c8f6391 Checksum coverage values larger than 65535 for UDPLite are invalid.
Check for this when the user calls setsockopt using UDPLITE_{SEND,RECV}CSCOV.

Reviewed by: kevlo
MFC after: 3 days
2014-09-28 17:22:45 +00:00
melifaro
e58ee21a5e * Split tcp_signature_compute() into 2 pieces:
- tcp_get_sav() - SADB key lookup
 - tcp_signature_do_compute() - actual computation
* Fix TCP signature case for listening socket:
  do not assume EVERY connection coming to socket
  with TCP_SIGNATURE set to be md5 signed regardless
  of SADB key existance for particular address. This
  fixes the case for routing software having _some_
  BGP sessions secured by md5.
* Simplify TCP_SIGNATURE handling in tcp_input()

MFC after:	2 weeks
2014-09-27 07:04:12 +00:00
adrian
34f5e94b25 Remove an un-needed bit of pre-processor work - it all lives inside
#ifdef RSS.
2014-09-27 05:14:02 +00:00
jmg
bb3d6ada62 drop unnecessary ifdef IPSEC's. This file is only compiled when IPSEC
is defined...

Differential Revision:	D839
Reviewed by:	bz, glebius, gnn
Sponsered by:	EuroBSDCon DevSummit
2014-09-26 12:48:54 +00:00
np
ae32468228 Catch up with r271119. 2014-09-24 20:12:40 +00:00
hselasky
bdacf9ba4d Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

Reviewed by:	adrian, rmacklem
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-22 08:27:27 +00:00
hrs
a313921d44 Add a change missing in r271916. 2014-09-21 04:38:50 +00:00
hrs
ffad09823e - Virtualize interface cloner for gre(4). This fixes a panic when destroying
a vnet jail which has a gre(4) interface.

- Make net.link.gre.max_nesting vnet-local.
2014-09-21 03:56:06 +00:00
glebius
fec1da5cbc Mechanically convert to if_inc_counter(). 2014-09-19 10:19:51 +00:00
glebius
2bb30b3418 Remove disabled code, that is very unlikely to be ever enabled again,
as well as the comment that explains why is it disabled.
2014-09-19 05:23:47 +00:00
asomers
c1b1b15f41 Fix source address selection on unbound sockets in the presence of multiple
fibs. Use the mbuf's or the socket's fib instead of RT_ALL_FIBS. Fixes PR
187553. Also fixes netperf's UDP_STREAM test on a nondefault fib.

sys/netinet/ip_output.c
	In ip_output, lookup the source address using the mbuf's fib instead
	of RT_ALL_FIBS.

sys/netinet/in_pcb.c
	in in_pcbladdr, lookup the source address using the socket's fib,
	because we don't seem to have the mbuf fib. They should be the same,
	though.

tests/sys/net/fibs_test.sh
	Clear the expected failure on udp_dontroute.

PR:		187553
CR:		https://reviews.freebsd.org/D772
MFC after:	3 weeks
Sponsored by:	Spectra Logic
2014-09-16 15:28:19 +00:00
tuexen
996f057a5f Add a explict cast to silence a warning when building
the userland stack on Windows.
This issue was reported by Peter Kasting from Google.

MFC after: 3 days
2014-09-16 14:39:24 +00:00
tuexen
47bba753f6 Use a consistent type for the number of HMAC algorithms.
This fixes a bug which resulted in a warning on the userland
stack, when compiled on Windows.
Thanks to Peter Kasting from Google for reporting the issue and
provinding a potential fix.

MFC after: 3 days
2014-09-16 14:20:33 +00:00
tuexen
d7760b7171 Small cleanup which addresses a warning regaring the truncation
of a 64-bit entity to a 32-bit entity. This issue was reported by
Peter Kasting from Google.

MFC after: 3 days
2014-09-16 13:48:46 +00:00
glebius
f716889a49 FreeBSD-SA-14:19.tcp raised attention to the state of our stack
towards blind SYN/RST spoofed attack.

Originally our stack used in-window checks for incoming SYN/RST
as proposed by RFC793. Later, circa 2003 the RST attack was
mitigated using the technique described in P. Watson
"Slipping in the window" paper [1].

After that, the checks were only relaxed for the sake of
compatibility with some buggy TCP stacks. First, r192912
introduced the vulnerability, just fixed by aforementioned SA.
Second, r167310 had slightly relaxed the default RST checks,
instead of utilizing net.inet.tcp.insecure_rst sysctl.

In 2010 a new technique for mitigation of these attacks was
proposed in RFC5961 [2]. The idea is to send a "challenge ACK"
packet to the peer, to verify that packet arrived isn't spoofed.
If peer receives challenge ACK it should regenerate its RST or
SYN with correct sequence number. This should not only protect
against attacks, but also improve communication with broken
stacks, so authors of reverted r167310 and r192912 won't be
disappointed.

[1] http://bandwidthco.com/whitepapers/netforensics/tcpip/TCP Reset Attacks.pdf
[2] http://www.rfc-editor.org/rfc/rfc5961.txt

Changes made:

o Revert r167310.
o Implement "challenge ACK" protection as specificed in RFC5961
  against RST attack. On by default.
  - Carefully preserve r138098, which handles empty window edge
    case, not described by the RFC.
  - Update net.inet.tcp.insecure_rst description.
o Implement "challenge ACK" protection as specificed in RFC5961
  against SYN attack. On by default.
  - Provide net.inet.tcp.insecure_syn sysctl, to turn off
    RFC5961 protection.

The changes were tested at Netflix. The tested box didn't show
any anomalies compared to control box, except slightly increased
number of TCP connection in LAST_ACK state.

Reviewed by:	rrs
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-09-16 11:07:25 +00:00
tuexen
fe03a49a25 Make a type conversion explicit. When compiling this code on
Windows as part of the SCTP userland stack, this fixes a
warning reported by Peter Kasting from Google.

MFC after: 3 days
2014-09-16 10:57:55 +00:00
delphij
a90097904b Fix Denial of Service in TCP packet processing.
Submitted by:	glebius
Security:	FreeBSD-SA-14:19.tcp
2014-09-16 09:48:24 +00:00
tuexen
cce669e356 The MTU is handled as a 32-bit entity within the SCTP stack.
This was reported by Peter Kasting from Google.

MFC after: 3 days
2014-09-16 09:22:43 +00:00
adrian
3bc90623ca Ensure the correct software IPv4 hash is done based on the configured
RSS parameters, rather than assuming we're hashing IPv4+UDP and IPv4+TCP.
2014-09-16 03:26:42 +00:00
tuexen
da9214b7fd Chunk IDs are 8 bit entities, not 16 bit.
Thanks to Peter Kasting from Google for drawing
my attention to it.

MFC after: 3 days
2014-09-15 19:38:34 +00:00
hrs
0d175c82cb Use generic SYSCTL_* macro instead of deprecated SYSCTL_VNET_*.
Suggested by:	glebius
2014-09-15 14:43:58 +00:00
hrs
0d57c69d7b Make net.inet.ip.sourceroute, net.inet.ip.accept_sourceroute, and
net.inet.ip.process_options vnet-aware.  Revert changes in r271545.

Suggested by:	bz
2014-09-15 07:20:40 +00:00
hselasky
727760a4e4 Revert r271504. A new patch to solve this issue will be made.
Suggested by:	adrian @
2014-09-13 20:52:01 +00:00
hselasky
3d04a989df Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-09-13 08:26:09 +00:00
asomers
081aa8a15c Revisions 264905 and 266860 added a "int fib" argument to ifa_ifwithnet and
ifa_ifwithdstaddr. For the sake of backwards compatibility, the new
arguments were added to new functions named ifa_ifwithnet_fib and
ifa_ifwithdstaddr_fib, while the old functions became wrappers around the
new ones that passed RT_ALL_FIBS for the fib argument. However, the
backwards compatibility is not desired for FreeBSD 11, because there are
numerous other incompatible changes to the ifnet(9) API. We therefore
decided to remove it from head but leave it in place for stable/9 and
stable/10. In addition, this commit adds the fib argument to
ifa_ifwithbroadaddr for consistency's sake.

sys/sys/param.h
	Increment __FreeBSD_version

sys/net/if.c
sys/net/if_var.h
sys/net/route.c
	Add fibnum argument to ifa_ifwithbroadaddr, and remove the _fib
	versions of ifa_ifwithdstaddr, ifa_ifwithnet, and ifa_ifwithroute.

sys/net/route.c
sys/net/rtsock.c
sys/netinet/in_pcb.c
sys/netinet/ip_options.c
sys/netinet/ip_output.c
sys/netinet6/nd6.c
	Fixup calls of modified functions.

share/man/man9/ifnet.9
	Document changed API.

CR:		https://reviews.freebsd.org/D458
MFC after:	Never
Sponsored by:	Spectra Logic
2014-09-11 20:21:03 +00:00
ae
1576b695b6 Add scope zone id to the in_endpoints and hc_metrics structures.
A non-global IPv6 address can be used in more than one zone of the same
scope. This zone index is used to identify to which zone a non-global
address belongs.

Also we can have many foreign hosts with equal non-global addresses,
but from different zones. So, they can have different metrics in the
host cache.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-09-10 16:26:18 +00:00