4876 Commits

Author SHA1 Message Date
hiren
8a453508e4 MFC: r292011
MFC: r292012

Add an option to use rfc6675 based pipe/inflight bytes calculation in cubic and
newreno.
2016-01-11 23:37:31 +00:00
hiren
d2c2a54e08 MFC: r292003
Improve tcp duplicate ack processing when SACK is present.
2016-01-11 23:34:29 +00:00
hiren
4c421d2677 MFC: r290122
Calculate the correct amount of bytes that are in-flight for a connection as
suggested by RFC 6675.

MFC: r292046
r290122 added 4 bytes and removed 8 in struct sackhint. Add a pad entry of 4
bytes to restore the size.
2016-01-11 23:31:13 +00:00
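The in-flight ("pipe") estimate that r290122 refers to can be illustrated with a small model. This is a minimal sketch in the spirit of RFC 6675's SetPipe(), not the FreeBSD code: the `Segment` record and `compute_pipe` are hypothetical names, the scoreboard is kept per segment rather than per octet, and the "lost" and "retransmitted" marks are supplied as inputs instead of being derived from IsLost() and HighRxt.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    length: int              # payload bytes in this segment
    sacked: bool = False     # covered by a received SACK block
    lost: bool = False       # judged lost (an input in this sketch)
    rexmitted: bool = False  # retransmitted since it was first sent


def compute_pipe(outstanding):
    """Estimate the bytes in flight across the unacknowledged segments."""
    pipe = 0
    for seg in outstanding:
        if seg.sacked:
            continue              # SACKed data has left the network
        if not seg.lost:
            pipe += seg.length    # original copy assumed still in flight
        if seg.rexmitted:
            pipe += seg.length    # retransmitted copy assumed in flight
    return pipe
```

With four 1000-byte segments (one untouched, one SACKed, one lost, one lost and retransmitted), the estimate counts only the untouched segment and the retransmission.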
stas
e4679b3dbb MFC r277938 (by hiren): make syncookie_mac() use 'tcp_seq irs' in computing hash.
2016-01-08 00:46:28 +00:00
pkelsey
ca800abb78 MFC r292706:
Implementation of server-side TCP Fast Open (TFO) [RFC7413].

TFO is disabled by default in the kernel build.  See the top comment
in sys/netinet/tcp_fastopen.c for implementation particulars.

Differential Revision:	https://reviews.freebsd.org/D4350
Sponsored by:	Verisign, Inc.
2015-12-28 02:43:12 +00:00
ae
58974a9f1d MFC r291579:
In the same way fix the problem described in r291578 for IGMPv3.

  When a router has a lot of multicast groups, the reply can take
  several packets due to the MTU limitation.
  Also we have a limit IGMP_MAX_RESPONSE_BURST == 4, that limits the number
  of packets we send in one shot. Then we recalculate the timer value and
  schedule the remaining packets for sending.
  The problem is that when we call igmp_v3_dispatch_general_query() to send
  remaining packets, we queue new reply in the same mbuf queue. And when
  number of packets is bigger than IGMP_MAX_RESPONSE_BURST, we get endless
  reply of IGMPv3 reports.
  To fix this, add the check for remaining packets in the queue.
2015-12-08 07:43:12 +00:00
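The burst-limit behaviour described in r291579 can be modelled roughly as follows. This is a sketch under assumptions, not the kernel logic: `dispatch_general_query` and `build_reports` are hypothetical stand-ins, the queue is a plain `deque` instead of an mbuf queue, and only the constant mirrors `IGMP_MAX_RESPONSE_BURST` from the message. The point it shows is the added check: a new reply is queued only when no packets from the previous one remain.

```python
from collections import deque

MAX_RESPONSE_BURST = 4  # mirrors IGMP_MAX_RESPONSE_BURST == 4 from the message


def dispatch_general_query(queue, build_reports):
    """Send at most one burst; return (sent_packets, more_pending)."""
    if not queue:
        # Only build a fresh reply when nothing is left over. Without this
        # check every timer run would append a full reply to the same queue,
        # so the queue would never drain.
        queue.extend(build_reports())
    sent = []
    while queue and len(sent) < MAX_RESPONSE_BURST:
        sent.append(queue.popleft())
    return sent, bool(queue)
```

A caller would reschedule the timer while `more_pending` is true and call the function again; a ten-packet reply then goes out as bursts of 4, 4 and 2.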
fabient
ccce6feaa4 MFC r291301:
The r241129 description was wrong that the scenario is possible
 only for read locks on pcbs. The same race can happen with write
 lock semantics as well.

 The race scenario:

 - Two threads (1 and 2) locate pcb with writer semantics (INPLOOKUP_WLOCKPCB)
  and do in_pcbref() on it.
 - 1 and 2 both drop the inp hash lock.
 - Another thread (3) grabs the inp hash lock. Then it runs in_pcbfree(),
  which wlocks the pcb. This must happen before 1 or 2 reach INP_WLOCK()!
 - 1 and 2 congest in INP_WLOCK().
 - 3 does in_pcbremlists(), drops hash lock, and runs in_pcbrele_wlocked(),
  which doesn't free the pcb due to two references on it.
  Then it unlocks the pcb.
 - 1 (or 2) gets wlock on the pcb, runs in_pcbrele_wlocked(), which doesn't
  report inp as freed, due to 2 (or 1) still holding an extra reference on it.
  The thread tries to do something with a disconnected pcb and crashes.

 Submitted by:	emeric.poupon@stormshield.eu
 Reviewed by:	glebius@
 Sponsored by: Stormshield
 Tested by: Cassiano Peixoto, Stormshield
2015-12-02 17:26:37 +00:00
gnn
3b2535b452 MFC 290028:
Turning on IPSEC used to introduce a slight amount of performance
degradation (7%) for host-to-host TCP connections over 10Gbps links,
even when there were no security policies in place. There is no
change in performance on 1Gbps network links. Testing GENERIC vs.
GENERIC-NOIPSEC vs. GENERIC with this change shows that the new
code removes any overhead introduced by having IPSEC always in the
kernel.

Differential Revision:	D3993
Sponsored by:	Rubicon Communications (Netgate)
2015-11-26 02:24:45 +00:00
hiren
07e8014e2b MFC r289293
Fix an unnecessarily aggressive behavior where mtu clamping begins on first
retransmission timeout (rto) when blackhole detection is enabled.  Make
sure it only happens when the second attempt to send the same segment also fails
with rto.

Also make sure that each mtu probing stage (usually 1448 -> 1188 -> 524) follows
the same pattern and gets 2 chances (rto) before further clamping down.

Note: RFC4821 doesn't specify implementation details on how this situation
should be handled.
2015-10-29 00:36:10 +00:00
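The staged clamping described in r289293 can be pictured with a small lookup. This is an illustrative sketch only: `mss_after_rtos` is a hypothetical helper, the stage values are the "usually 1448 -> 1188 -> 524" figures quoted in the message, and the real code depends on more state (for example, whether blackhole detection is enabled) than a single counter.

```python
MSS_STAGES = [1448, 1188, 524]  # the "usual" stages quoted in the message
RTOS_PER_STAGE = 2              # each stage gets two chances before clamping


def mss_after_rtos(consecutive_rtos):
    """MSS in effect after this many consecutive RTOs for the same segment."""
    stage = min(consecutive_rtos // RTOS_PER_STAGE, len(MSS_STAGES) - 1)
    return MSS_STAGES[stage]
```

Under these assumptions the first timeout leaves the MSS untouched, the second clamps to 1188, and the fourth clamps to 524, where it stays.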
hiren
cb1e85a62a MFC r288914
Add a comment specifying how we implement rfc3042.
2015-10-14 05:29:33 +00:00
hselasky
7ff6dd508c MFC r287775:
Update TSO limits to include all headers.

To make driver programming easier the TSO limits are changed to
reflect the values used in the BUSDMA tag a network adapter driver is
using. The TCP/IP network stack will subtract space for all linklevel
and protocol level headers and ensure that the full mbuf chain passed
to the network adapter fits within the given limits. See r287775
for a more detailed description.

Differential Revision:	https://reviews.freebsd.org/D3477
Reviewed by:		rmacklem
2015-10-08 08:30:40 +00:00
hiren
ef46b98fe1 MFC r287830
Remove unnecessary tcp state transition call.
2015-10-02 16:30:54 +00:00
melifaro
5496fd3096 MFC r266310
Fix wrong formatting of 0.0.0.0/X table records in ipfw(8).

  Add `flags` u16 field to the hole in ipfw_table_xentry structure.
  Kernel has been guessing address family for supplied record based
  on xent length size.
  Userland, however, has been getting fixed-size ipfw_table_xentry structures
  guessing address family by checking address by IN6_IS_ADDR_V4COMPAT().

  Fix this behavior by providing specific IPFW_TCF_INET flag for IPv4 records.

PR:		bin/189471,kern/200169
2015-09-18 17:29:24 +00:00
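Why guessing the address family breaks for 0.0.0.0/X can be shown with a model of the check. The sketch below assumes the IPv4 address sits in the last four bytes of a 16-byte field; `is_v4compat` imitates the usual definition of `IN6_IS_ADDR_V4COMPAT()` (first 96 bits zero, low 32 bits neither 0 nor 1), and `embed_v4` is a hypothetical helper, not part of ipfw.

```python
import ipaddress


def is_v4compat(addr16: bytes) -> bool:
    """Model of IN6_IS_ADDR_V4COMPAT: ::a.b.c.d, excluding :: and ::1."""
    if len(addr16) != 16:
        raise ValueError("expected a 16-byte address")
    low = int.from_bytes(addr16[12:], "big")
    return addr16[:12] == bytes(12) and low not in (0, 1)


def embed_v4(text: str) -> bytes:
    """Place an IPv4 address in the low 4 bytes of a zeroed 16-byte field."""
    return bytes(12) + ipaddress.IPv4Address(text).packed
```

An embedded 10.0.0.0 passes the check, but an embedded 0.0.0.0 is indistinguishable from `::`, so the record is taken for IPv6. An explicit per-record flag removes the need to guess.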
kevlo
aeb145cca6 MFC r285669:
Since the IETF has redefined the meaning of the tos field to accommodate
a set of differentiated services, set IPTOS_PREC_* macros using
IPTOS_DSCP_* macro definitions.

While here, add IPTOS_DSCP_VA macro according to RFC 5865.

Differential Revision:	https://reviews.freebsd.org/D3119
Reviewed by:	gnn
2015-07-31 03:40:09 +00:00
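The relationship between the precedence values and the DSCP class selectors in r285669 is plain bit arithmetic: the DSCP occupies the upper six bits of the TOS byte, and class selector CSn is DSCP n << 3. The sketch below recomputes the values; the dictionary keys are the suffixes of the `IPTOS_PREC_*` names, and `dscp_to_tos` is a hypothetical helper.

```python
def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP into its position in the 8-bit TOS/DS field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2


# Class selectors CS0..CS7 are DSCP values 0, 8, 16, ..., 56.
CS = {n: dscp_to_tos(n << 3) for n in range(8)}

# The legacy precedence names map one-to-one onto the class selectors.
IPTOS_PREC = {
    "ROUTINE": CS[0],
    "PRIORITY": CS[1],
    "IMMEDIATE": CS[2],
    "FLASH": CS[3],
    "FLASHOVERRIDE": CS[4],
    "CRITIC_ECP": CS[5],
    "INTERNETCONTROL": CS[6],
    "NETCONTROL": CS[7],
}

# RFC 5865 VOICE-ADMIT is DSCP 44 (binary 101100).
IPTOS_DSCP_VA = dscp_to_tos(44)
```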
eri
3082d2efe1 MFC 285325
Correct issue presented in r285051 by properly initializing variable.

Differential Revision:	 https://reviews.freebsd.org/D3036
2015-07-29 17:50:14 +00:00
eri
b2cc8257fa MFC r285051
Avoid doing multiple route lookups for the same destination IP during forwarding.

Differential Revision:	https://reviews.freebsd.org/D2964
2015-07-29 17:46:16 +00:00
delphij
d3ecbb9019 Fix patch(1) shell injection vulnerability. [SA-15:14]
Fix resource exhaustion in TCP reassembly. [SA-15:15]

Fix OpenSSH multiple vulnerabilities. [SA-15:16]
2015-07-28 19:58:44 +00:00
hiren
141df3ac09 MFC r284941:
Avoid a situation where we do not set persist timer after a zero window
condition.
If you send a 0-length packet, but there is data in the socket buffer, and
neither the rexmt nor the persist timer is already set, then activate the persist
timer.

PR:		192599
Approved by:	re (delphij)
2015-07-22 15:05:45 +00:00
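The condition stated in r284941 reduces to a single predicate. This is a sketch of the rule as worded in the message, with hypothetical parameter names; it does not reproduce the surrounding tcp_output() logic.

```python
def should_start_persist(sent_len, sockbuf_bytes, rexmt_active, persist_active):
    """True when a zero-length send leaves queued data with no timer armed."""
    return (
        sent_len == 0               # nothing could be sent (zero window)
        and sockbuf_bytes > 0       # but data is still waiting in the buffer
        and not rexmt_active        # and no retransmit timer is running
        and not persist_active      # and no persist timer is running
    )
```

Without a timer armed in this state, nothing would prompt the sender to probe the window again.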
delphij
457165363e Fix resource exhaustion due to sessions stuck in LAST_ACK state.
Security:	CVE-2015-5358
Security:	SA-15:13.tcp
Submitted by:	Jonathan Looney (Juniper SIRT)
Approved by:	re (so blanket)
2015-07-21 23:42:17 +00:00
pkelsey
04fd70b7f2 MFC r285567:
Check TCP timestamp option flag so that the automatic receive buffer
scaling code does not use an uninitialized timestamp echo reply value
from the stack when timestamps are not enabled.

Approved by: re (gjb)
2015-07-21 18:38:31 +00:00
eri
5d11dcc720 MFC r284512: Properly handle locking on the ARP protocol request sending.
2015-06-24 19:06:54 +00:00
tuexen
b52cec9e00 MFC r284596:
When setting the primary address, return an error whenever it fails.
2015-06-22 05:34:13 +00:00
tuexen
094ec02dc8 MFC r284526:
Fix a bug related to flow assignment I introduced in
https://svnweb.freebsd.org/base?view=revision&revision=275483
2015-06-20 08:28:47 +00:00
tuexen
103d48d60e MFC r284515:
Add FIB support for SCTP.
This fixes https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200379

PR:	200379
2015-06-20 08:25:27 +00:00
hiren
4aefd42d20 MFC r266420 (by adrian)
Ensure that the flowid hashtype is assigned to the inp if the flowid
is also assigned.

Spotted by:	gallatin
Tested by:	gallatin
2015-06-19 19:36:21 +00:00
tuexen
6d0b3bca71 MFC r284393:
Correctly detect the case where the last address is removed.
2015-06-17 09:44:02 +00:00
tuexen
1c50eac2f6 MFC r284384:
Stop the heartbeat timer when removing a net.
Thanks to the reporter of
https://code.google.com/p/sctp-refimpl/issues/detail?id=14
for reporting the issue.
2015-06-17 09:39:40 +00:00
hselasky
27c0e6c5bd MFC r280991:
Extend fixes made in r278103 and r38754 by copying the complete packet
header and not only partial flags and fields. Firewalls can attach
classification tags to the outgoing mbufs which should be copied to
all the new fragments. Else only the first fragment will be let
through by the firewall. This can easily be tested by sending a large
ping packet through a firewall. It was also discovered that VLAN
related flags and fields should be copied for packets traversing
through VLANs. This is all handled by "m_dup_pkthdr()".

Regarding the MAC policy check in ip_fragment(), the tag provided by
the originating mbuf is copied instead of using the default one
provided by m_gethdr().

Tested by:		Karim Fodil-Lemelin <fodillemlinkarim at gmail.com>
Sponsored by:		Mellanox Technologies
PR:			7802
2015-06-17 07:21:43 +00:00
tuexen
7a06d6261d MFC r284332:
Fix the reporting of the PMTUD state for specific paths.
2015-06-16 08:26:33 +00:00
tuexen
5e6d50bc8c MFC r284331:
Code cleanup.
2015-06-16 08:24:08 +00:00
tuexen
74eb079379 MFC r284326:
In case of an output error, continue with the next net, don't try to
continue sending on the same net.

This fixes a bug where an invalid mbuf chain was constructed, if a
full size frame of control chunks should be sent and there is an
output error.

Based on a discussion with rrs@, change to move to the next net. This fixes
the bug and improves the behaviour.

Thanks to Irene Ruengeler for spending a lot of time in narrowing this
problem down.
2015-06-16 08:21:58 +00:00
jch
c82317e784 MFC r284245:
Fix a callout race condition introduced in TCP timers callouts with r281599.
In TCP timer context, it is not enough to check callout_stop() return value
to decide if a callout is still running or not, previous callout_reset()
return values have also to be checked.
2015-06-11 13:44:06 +00:00
tuexen
8b8722aae0 MFC r283988:
Export a pointer to the SCTP socket. This is needed to add SCTP support
to sockstat.
2015-06-08 07:06:33 +00:00
ian
ac9f7bc3c6 MFC r279236:
Change struct attribute to avoid aligned operations mismatch
  Previous __alignment(4) allowed compiler to assume that operations are
  performed on aligned region. On ARM processor, this led to alignment fault
2015-06-06 20:14:58 +00:00
ae
f1be259e6a MFC r276148:
Remove in_gif.h and in6_gif.h files. They only contain function
  declarations used by gif(4). Instead declare these functions in C files.
  Also make some variables static.

MFC r276215:
  Extern declarations in C files lose compile-time checking that
  the functions' calls match their definitions. Move them to header files.
2015-06-06 13:26:13 +00:00
ae
920800a21f MFC r274246:
Overhaul if_gre(4).

  Split it into two modules: if_gre(4) for GRE encapsulation and
  if_me(4) for minimal encapsulation within IP.

  gre(4) changes:
  * convert to if_transmit;
  * rework locking: protect access to softc with rmlock,
    protect from concurrent ioctls with sx lock;
  * correct interface accounting for outgoing datagrams (count only payload size);
  * implement generic support for using IPv6 as delivery header;
  * make implementation conform to the RFC 2784 and partially to RFC 2890;
  * add support for GRE checksums - calculate for outgoing datagrams and check
    for incoming datagrams;
  * add support for sending sequence number in GRE header;
  * remove support of cached routes. This fixes a problem where gre(4) doesn't
    work at system startup. But this also removes support for having tunnels with
    the same addresses for inner and outer header.
  * deprecate support for various GREXXX ioctls that aren't used in FreeBSD.
    Use our standard ioctls for tunnels.

  me(4):
  * implementation conforms to RFC 2004;
  * use if_transmit;
  * use the same locking model as gre(4);

  PR:		164475

MFC r274289 (by bz):
  gcc requires variables to be initialised in two places.  One of them
  is correctly  used only under the same conditional though.

  For module builds properly check if the kernel supports INET or INET6,
  as otherwise various mips kernels without IPv6 support would fail to build.

MFC r274964:
  Add ip_gre.h to ObsoleteFiles.inc.
2015-06-06 12:44:42 +00:00
ae
c84e575eec MFC r271918 (by hrs):
- Virtualize interface cloner for gre(4).  This fixes a panic when destroying
    a vnet jail which has a gre(4) interface.

  - Make net.link.gre.max_nesting vnet-local.
2015-06-05 08:10:08 +00:00
ae
fcbaea9548 MFC r275392:
Remove route caching support from ipsec code. It hasn't been used for some time.
  * remove sa_route_union declaration and route_cache member from struct secashead;
  * remove key_sa_routechange() call from ICMP and ICMPv6 code;
  * simplify ip_ipsec_mtu();
  * remove #include <net/route.h>;

Sponsored by:	Yandex LLC
2015-06-02 03:14:42 +00:00
ae
8272d42d32 MFC r282965:
Add the ability to accept encapsulated packets from different sources by one
  gif(4) interface. Add new option "ignore_source" for gif(4) interface.
  When it is enabled, gif's encapcheck function requires match only for
  packet's destination address.

  Differential Revision:	https://reviews.freebsd.org/D2004
  Sponsored by:	Yandex LLC
2015-05-31 22:58:41 +00:00
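The effect of the "ignore_source" option from r282965 can be sketched as a match predicate. This is a simplified model, not gif(4)'s encapcheck function: `encap_matches` and its parameters are hypothetical, addresses are compared as plain strings, and the example addresses come from the documentation ranges.

```python
def encap_matches(local, remote, pkt_src, pkt_dst, ignore_source=False):
    """Decide whether an incoming outer header belongs to this tunnel."""
    if pkt_dst != local:
        return False       # the destination must always be our endpoint
    if ignore_source:
        return True        # any sender is accepted
    return pkt_src == remote
```

For a tunnel with local endpoint 192.0.2.1 and remote 198.51.100.7, a packet from 203.0.113.9 to 192.0.2.1 is rejected by default and accepted once `ignore_source` is set. A packet to any other destination is rejected either way.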
tuexen
baef3fdea0 MFC r283691:
Remove printf() noise...
2015-05-31 13:07:22 +00:00
tuexen
5cacc4c33e MFC r283666:
Report the MTU consistently as specified in
https://tools.ietf.org/html/rfc6458
Thanks to Irene Ruengeler for helping me to fix this bug.
2015-05-31 13:04:38 +00:00
tuexen
bbe91198b3 MFC r283665:
Take source and destination address into account when determining
the scope.
This fixes a problem when a client with a global address
connects to a server with a private address.
Thanks to Irene Ruengeler for helping me to find the issue.
2015-05-31 13:01:58 +00:00
tuexen
3a8e5cd397 MFC r283664:
Retire SCTP_DONT_DO_PRIVADDR_SCOPE which was never defined.
2015-05-31 12:59:22 +00:00
tuexen
f2a969c70c MFC r283662:
Fix a bug where messages would not be sent in SHUTDOWN_RECEIVED state.
This problem was reported by Mark Bonnekessel and Markus Boese.
Thanks to Irene Ruengeler for helping me to fix the cause of
the problem. It can be tested with the following packetdrill script:

+0.0 socket(..., SOCK_STREAM, IPPROTO_SCTP) = 3
+0.0 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
+0.0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
// Check the handshake with an empty(!) cookie
+0.1 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0.0 > sctp: INIT[flgs=0, tag=1, a_rwnd=..., os=..., is=..., tsn=0, ...]
+0.1 < sctp: INIT_ACK[flgs=0, tag=2, a_rwnd=10000, os=1, is=1, tsn=0, STATE_COOKIE[len=4, val=...]]
+0.0 > sctp: COOKIE_ECHO[flgs=0, len=4, val=...]
+0.1 < sctp: COOKIE_ACK[flgs=0]
+0.0 getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
+0.0 write(3, ..., 1024) = 1024
+0.0 > sctp: DATA[flgs=BE, len=1040, tsn=0, sid=0, ssn=0, ppid=0]
+0.0 write(3, ..., 1024) = 1024 // Pending due to Nagle
+0.0 < sctp: SHUTDOWN[flgs=0, cum_tsn=0]
+0.0 > sctp: DATA[flgs=BE, len=1040, tsn=1, sid=0, ssn=1, ppid=0]
+0.0 < sctp: SACK[flgs=0, cum_tsn=1, a_rwnd=10000, gaps=[], dups=[]] // Do we need another SHUTDOWN here?
+0.0 > sctp: SHUTDOWN_ACK[flgs=0]
+0.0 < sctp: SHUTDOWN_COMPLETE[flgs=0]
+0.0 close(3) = 0
2015-05-31 12:56:22 +00:00
tuexen
c07a13950d MFC r283658:
Use macros for overhead in a consistent way. No functional change.
Thanks to Irene Ruengeler for suggesting the change.
2015-05-31 12:53:16 +00:00
tuexen
4cc7a6bb7a MFC r283654:
Some more debug info cleanup.
2015-05-31 12:50:10 +00:00
tuexen
beb47cc587 MFC r283650:
Fix and cleanup the debug information. This has no user-visible changes.
Thanks to Irene Ruengeler for providing a patch.
2015-05-31 12:46:40 +00:00
tuexen
b583c15e81 MFC r283648:
Address some compiler warnings. No functional change.
2015-05-31 12:42:34 +00:00
tuexen
09fd91eb46 MFC r282810:
Ensure that the COOKIE-ACK can be sent over UDP if the COOKIE-ECHO was
received over UDP.
Thanks to Felix Weinrank for making me aware of the problem and to
Irene Ruengeler for providing the fix.
2015-05-29 13:37:04 +00:00
tuexen
a2dbf980ab MFC r282042:
Don't panic under INVARIANTS when receiving a SACK which cumacks
a TSN never sent.
While there, fix two typos.
2015-05-29 13:34:31 +00:00