Commit Graph

174158 Commits

Author SHA1 Message Date
Andre Oppermann
322181c98e If the user has closed the socket then drop a persisting connection
after a much reduced timeout.

Typically web servers close their sockets quickly under the assumption
that the TCP connection goes away as well.  That is not entirely true
however.  If the peer closed the window we're going to wait for a long
time with lots of data in the send buffer.

MFC after:	2 weeks
2012-10-28 19:58:20 +00:00
Andre Oppermann
09440655fe Increase the initial CWND to 10 segments as defined in IETF TCPM
draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
window improves the overall performance of many web services without
risking congestion collapse.

As long as it remains a draft it is placed under a sysctl marking it
as experimental:
 net.inet.tcp.experimental.initcwnd10 = 1
When it becomes an official RFC soon the sysctl will be changed to
the RFC number and moved to net.inet.tcp.

This implementation differs from the RFC draft in that it is a bit
more conservative in the case of packet loss on SYN or SYN|ACK because
we haven't reduced the default RTO to 1 second yet.  Also the restart
window isn't yet increased as allowed.  Both will be adjusted with
upcoming changes.

It is enabled by default.  In Linux it has been enabled since kernel 3.0.

MFC after:	2 weeks
2012-10-28 19:47:46 +00:00
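
Note on 09440655fe: the initial window proposed by draft-ietf-tcpm-initcwnd
is min(10*MSS, max(2*MSS, 14600)) bytes.  The sketch below only evaluates
that formula.  The function name is hypothetical and this is not the FreeBSD
code, which by the commit's own description is more conservative after a
lost SYN or SYN|ACK.

```c
#include <stdint.h>

/*
 * Hypothetical helper: initial congestion window in bytes according to
 * the formula in the initcwnd draft, min(10*MSS, max(2*MSS, 14600)).
 * It does not model the SYN-loss handling described in the commit.
 */
static uint32_t
initcwnd10_bytes(uint32_t mss)
{
	uint32_t upper = 10 * mss;
	uint32_t lower = (2 * mss > 14600) ? 2 * mss : 14600;

	return (upper < lower) ? upper : lower;
}
```

With a typical Ethernet MSS of 1460 bytes this yields exactly ten segments;
with jumbo-sized segments the 14600-byte term caps the window at two
segments.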
Edward Tomasz Napierala
05d43d9882 Declare functions as static and move global variables to the top;
no functional changes.
2012-10-28 19:38:42 +00:00
Andre Oppermann
77339e1cdc Update comment to reflect the change made in r242263.
MFC after:	2 weeks
2012-10-28 19:22:18 +00:00
Andre Oppermann
c4ab59c1a1 Add SACK_PERMIT to the list of TCP options that are switched off after
retransmitting a SYN three times.

MFC after:	2 weeks
2012-10-28 19:20:23 +00:00
Andre Oppermann
79ce26a08c Simplify and enhance the window change/update acceptance logic,
especially in the presence of bi-directional data transfers.

snd_wl1 tracks the right edge, including data in the reassembly
queue, of valid incoming data.  This makes it like rcv_nxt plus
reassembly.  It never goes backwards to prevent older, possibly
reordered segments from updating the window.

snd_wl2 tracks the left edge of sent data.  This makes it a duplicate
of snd_una.  However joining them right now is difficult due to
separate update dependencies in different places in the code flow.

snd_wnd tracks the send window currently advertised by the peer.  In
tcp_output() the effective window is calculated by subtracting the
already in-flight data, snd_nxt less snd_una, from it.

ACKs become the main clock of window updates and will always update
the window when the left edge of what we sent is advanced.  The ACK
clock is the primary signaling mechanism in ongoing data transfers.
This works reliably even in the presence of reordering, reassembly
and retransmitted segments.  The ACK clock is most important because
it determines how much data we are allowed to inject into the network.

Zero window updates that get us out of persist mode are crucial.  Here
a segment that neither moves ACK nor SEQ but enlarges WND is accepted.

When the ACK clock is not active (that is we're not or no longer
sending any data) any segment that moves the extended right SEQ edge,
including out-of-order segments, updates the window.  This gives us
updates especially during ping-pong transfers where the peer isn't
done consuming the already acknowledged data from the receive buffer
while responding with data.

The SSH protocol is a prime candidate to benefit from the improved
bi-directional window update logic as it has its own windowing
mechanism on top of TCP and is frequently sending back protocol ACKs.

Tcpdump provided by:	darrenr
Tested by:	darrenr
MFC after:	2 weeks
2012-10-28 19:16:22 +00:00
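
Note on 79ce26a08c: a minimal sketch of two rules stated in the message,
assuming 32-bit sequence numbers that wrap around.  The struct, the function
names and the boolean inputs are hypothetical simplifications.  The real
logic lives in the TCP input and output paths and also tracks snd_wl1 and
snd_wl2, which this sketch leaves out.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Hypothetical subset of the send-side connection state. */
struct snd_state {
	tcp_seq  snd_una;	/* oldest unacknowledged byte */
	tcp_seq  snd_nxt;	/* next byte to send */
	uint32_t snd_wnd;	/* window advertised by the peer */
};

/*
 * Effective window: advertised window minus data already in flight.
 * Unsigned subtraction keeps the in-flight count correct across
 * sequence number wraparound.
 */
static uint32_t
effective_send_window(const struct snd_state *s)
{
	uint32_t inflight = s->snd_nxt - s->snd_una;

	return (s->snd_wnd > inflight) ? s->snd_wnd - inflight : 0;
}

/*
 * Simplified reading of the acceptance rules in the commit message:
 *  1. an ACK that advances the left edge always updates the window;
 *  2. a segment that moves neither ACK nor SEQ but enlarges the window
 *     is accepted (this covers leaving persist mode);
 *  3. with no ACK clock running, a segment that advances the right SEQ
 *     edge updates the window.
 */
static int
accept_window_update(int ack_advanced, int seq_advanced,
    int ack_clock_active, uint32_t new_wnd, uint32_t cur_wnd)
{
	if (ack_advanced)
		return 1;
	if (!seq_advanced && new_wnd > cur_wnd)
		return 1;
	if (!ack_clock_active && seq_advanced)
		return 1;
	return 0;
}
```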
Andre Oppermann
024fd5b6bb For retransmits of SYN|ACK from the syncache use the slightly more
aggressive special tcp_syn_backoff[] retransmit schedule instead of
the normal tcp_backoff[] schedule for established connections.

MFC after:	2 weeks
2012-10-28 19:02:07 +00:00
Andre Oppermann
f4748ef5fb When retransmitting SYN in TCPS_SYN_SENT state use TCPTV_RTOBASE,
the default retransmit timeout, as base to calculate the backoff
time until next try instead of the TCP_REXMTVAL() macro which only
works correctly when we already have measured an actual RTT+RTTVAR.

Before it would cause the first retransmit at RTOBASE, the next
four at the same time (!) about 200ms later, and then another one
again RTOBASE later.

MFC after:	2 weeks
2012-10-28 18:56:57 +00:00
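
Note on 024fd5b6bb and f4748ef5fb: both commits compute the SYN retransmit
delay as a base timeout multiplied by a per-attempt backoff factor.  The
sketch below shows that calculation.  The table contents, the function name
and the millisecond units are assumptions for illustration and should not be
read as the kernel's tcp_syn_backoff[] values.

```c
#include <stddef.h>

/*
 * Illustrative backoff multipliers (assumed values).  The early entries
 * stay flat so the first few SYN retransmits go out at the base interval.
 */
static const int syn_backoff_example[] = {
	1, 1, 1, 1, 1, 2, 4, 8, 16, 32, 64, 64, 64
};

/*
 * Hypothetical helper: delay before the next SYN retransmit, using a
 * fixed base timeout rather than one derived from a measured RTT, which
 * does not exist yet in SYN_SENT.
 */
static int
syn_rexmt_timeout_ms(int rtobase_ms, int rxtshift)
{
	size_t n = sizeof(syn_backoff_example) / sizeof(syn_backoff_example[0]);

	if (rxtshift < 0)
		rxtshift = 0;
	if ((size_t)rxtshift >= n)
		rxtshift = (int)n - 1;	/* clamp to the last table entry */
	return rtobase_ms * syn_backoff_example[rxtshift];
}
```

Using a constant base keeps the schedule monotonic, which avoids the burst
of near-simultaneous retransmits described in f4748ef5fb.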
Edward Tomasz Napierala
f1988d463c Fix two problems that caused instant panic when the device mounted
with softupdates went away.  Note that this does not fix the problem
entirely; I'm committing it now to make it easier for someone to pick
up the work.

Reviewed by:	mckusick
2012-10-28 18:53:28 +00:00
Adrian Chadd
a93c5097c9 Add a temporary (for values of "temporary") work around for hotplug
support with ath(4) and VIMAGE.

Right now the VIMAGE code doesn't supply a default vnet context during:

* hotplug attach;
* any device detach.

It special cases kldload/boot time probing (by setting the context to
vnet0) but that doesn't occur when probing devices during a bus rescan -
eg, adding a cardbus card.

These will eventually go away when the VIMAGE support extends to providing
default contexts to hotplug attach/detach.
2012-10-28 18:46:06 +00:00
Andre Oppermann
602e8e45ee Remove bogus 'else' in #ifdef that prevented the rttvar from being reset
in tcp_timer_rexmt() on retransmit for IPv6 sessions.

MFC after:	2 weeks
2012-10-28 18:45:04 +00:00
Andre Oppermann
14d7c5b11c Improve m_cat() by being able to also merge contents from M_EXT
mbuf's by doing proper testing with M_WRITABLE().

In m_collapse() replace an incomplete manual check for M_RDONLY
with the M_WRITABLE() macro that also tests for shared buffers
and other cases that make a particular mbuf immutable.

MFC after:	2 weeks
2012-10-28 18:38:51 +00:00
Andre Oppermann
4faaea5505 Allow arbitrary MSS sizes and don't mind about the cluster size anymore.
We've got more cluster sizes for quite some time now and the originally
imposed limits and the previously codified thoughts on efficiency gains
are no longer true.

MFC after:	2 weeks
2012-10-28 18:33:52 +00:00
Andre Oppermann
f3a10d7954 Change the syncache count reporting the current number of entries
from an unprotected u_int that reports garbage on SMP to a function
based sysctl obtaining the current value from UMA.

Also read back the actual cache_limit after page size rounding by UMA.

PR:		kern/165879
MFC after:	2 weeks
2012-10-28 18:07:34 +00:00
Andre Oppermann
aafa0b4164 Simplify implementation of net.inet.tcp.reass.maxsegments and
net.inet.tcp.reass.cursegments.

MFC after:	2 weeks
2012-10-28 17:59:46 +00:00
Andre Oppermann
f62563d33c Prevent a flurry of forced window updates when an application is
doing small reads on a (partially) filled receive socket buffer.

Normally one would send a window update every time the available
space in the socket buffer increases by two times MSS.  This leads
to a flurry of window updates that do not provide any meaningful
new information to the sender.  There still is available space in
the window and the sender can continue sending data.  All window
updates then get carried by the regular ACKs.  Only when the socket
buffer was (almost) full and the window closed accordingly does a window
update deliver new information and allow the sender to start
sending more data again.

Send window updates only every two MSS when the socket buffer
has less than 1/8 space available, or the available space in the
socket buffer increased by 1/4 its full capacity, or the socket
buffer is very small.  The next regular data ACK will carry and
report the exact window size again.

Reported by:	sbruno
Tested by:	darrenr
Tested by:	Darren Baginski
PR:		kern/116335
MFC after:	2 weeks
2012-10-28 17:40:35 +00:00
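
Note on f62563d33c: the three conditions in the message can be read as a
single predicate.  This is a sketch under assumptions: the function and
parameter names are hypothetical, and "very small" is taken to mean a buffer
of at most eight segments, which the commit message does not specify.

```c
#include <stdint.h>

/*
 * Hypothetical predicate: should a forced window update be sent now?
 *   capacity  - full size of the receive socket buffer, in bytes
 *   free_now  - space currently available in the buffer
 *   increase  - growth in available space since the last advertisement
 *   mss       - maximum segment size
 */
static int
should_force_window_update(uint32_t capacity, uint32_t free_now,
    uint32_t increase, uint32_t mss)
{
	/* Never consider an update for less than two segments of growth. */
	if (increase < 2 * mss)
		return 0;
	/* Window was nearly closed: the peer may be waiting for space. */
	if (free_now < capacity / 8)
		return 1;
	/* A large jump in available space is worth announcing. */
	if (increase >= capacity / 4)
		return 1;
	/* Assumed threshold for a "very small" socket buffer. */
	if (capacity <= 8 * mss)
		return 1;
	return 0;
}
```

In every other case the next regular data ACK carries the current window, so
no separate update is needed.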
Andre Oppermann
4249614cb0 When SYN or SYN/ACK had to be retransmitted RFC5681 requires us to
reduce the initial CWND to one segment.  This reduction got lost
some time ago due to a change in initialization ordering.

Additionally in tcp_timer_rexmt() avoid entering fast recovery when
we're still in TCPS_SYN_SENT state.

MFC after:	2 weeks
2012-10-28 17:30:28 +00:00
Andre Oppermann
cf8f04f4c0 When SYN or SYN/ACK had to be retransmitted RFC5681 requires us to
reduce the initial CWND to one segment.  This reduction got lost
some time ago due to a change in initialization ordering.

Additionally in tcp_timer_rexmt() avoid entering fast recovery when
we're still in TCPS_SYN_SENT state.

MFC after:	2 weeks
2012-10-28 17:25:08 +00:00
Andre Oppermann
22efabd40c Adjust the initial default CWND upon connection establishment to the
new and increased values specified by RFC5681 Section 3.1.

The even larger initial CWND per RFC3390, if enabled, is not affected.

MFC after:	2 weeks
2012-10-28 17:16:09 +00:00
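
Note on 22efabd40c, cf8f04f4c0 and 4249614cb0: RFC 5681 Section 3.1 sets the
initial window from the sender MSS and limits it to one segment when the SYN
or SYN/ACK was lost.  The sketch below evaluates those rules only.  The
function name and the boolean flag are hypothetical, and the larger RFC 3390
window is not modelled.

```c
#include <stdint.h>

/*
 * Hypothetical helper: initial congestion window in bytes per
 * RFC 5681 Section 3.1.
 *   smss              - sender maximum segment size
 *   syn_retransmitted - nonzero if the SYN or SYN/ACK had to be resent
 */
static uint32_t
rfc5681_initial_cwnd(uint32_t smss, int syn_retransmitted)
{
	if (syn_retransmitted)
		return smss;		/* one segment after a lost SYN */
	if (smss > 2190)
		return 2 * smss;	/* at most 2 segments */
	if (smss > 1095)
		return 3 * smss;	/* at most 3 segments */
	return 4 * smss;		/* at most 4 segments */
}
```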
Hans Petter Selasky
b4380da796 Implement support for the so-called USB feedback endpoint for USB
audio devices. This endpoint gives clues to the USB host about the
actual data rate on asynchronous endpoints and makes the more
expensive USB audio devices usable under FreeBSD.
The Linux USB audio driver was used as reference for the
automagic shift of the received value.

MFC after:	1 week
2012-10-28 14:37:17 +00:00
Konstantin Belousov
28854834f4 Fix compilation on ia64 when page size is configured for 16KB.
Reviewed by:	alc, marcel
2012-10-28 11:53:54 +00:00
Edwin Groothuis
2029f58919 Merge of vendor import of tzdata2012h
- Bahia no longer has DST.
- Tocantins has DST.
- Israel has new DST rules next year.
- Jordan stays on DST this winter.
2012-10-28 09:14:42 +00:00
Edwin Groothuis
0320df653b Vendor import of tzdata2012h
- Bahia no longer has DST.
- Tocantins has DST.
- Israel has new DST rules next year.
- Jordan stays on DST this winter.

Obtained from:	ftp://ftp.iana.org/tz/release
2012-10-28 09:12:50 +00:00
Adrian Chadd
0ef1bc21bc Add some further BAR TX debugging; it was useful when figuring out
when BAR TX was actually failing.
2012-10-28 04:18:49 +00:00
Warner Losh
8385f6bfc6 Better comments.
2012-10-28 02:55:51 +00:00
Nathan Whitehorn
643c87ca3d Extend dim's hack from r228978: not only clang but gcc on non-x86 platforms
warns about unused variables in this code, so always add -Wno-unused to
the warning flags. Why gcc on x86 *doesn't* warn about this, I will never
know. The code itself should probably be fixed at some point.
2012-10-28 02:15:35 +00:00
Davide Italiano
ba4be2110a The fields of struct timespec32 should be int32_t and not uint32_t.
Make this change.

Reviewed by:	bde, davidxu
Tested by:	pho
MFC after:	1 week
2012-10-27 23:42:41 +00:00
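
Note on ba4be2110a: a small sketch of why signedness matters when a 32-bit
time value is widened to 64 bits.  Both struct names are hypothetical
stand-ins for the layout before and after the change, not the kernel's
definition.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two field type choices. */
struct ts32_signed   { int32_t  tv_sec; int32_t  tv_nsec; };
struct ts32_unsigned { uint32_t tv_sec; uint32_t tv_nsec; };

/* Widening a signed field preserves negative values. */
static int64_t
widen_signed(struct ts32_signed ts)
{
	return (int64_t)ts.tv_sec;
}

/* Widening an unsigned field turns -1 into a large positive number. */
static int64_t
widen_unsigned(struct ts32_unsigned ts)
{
	return (int64_t)ts.tv_sec;
}
```

A tv_sec of -1 stored in the unsigned layout widens to 4294967295, so
negative seconds values would not survive the conversion.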
Juli Mallett
5eceedc5a0 Add missing return that broke 8-bit CF support in refactoring in r222671.
Tested on a Cavium CN5860-EVB-NIC4.  This was broken for over a year.
2012-10-27 23:36:41 +00:00
Nathan Whitehorn
111d36dc7b Don't try to build Linux compatibility stuff on platforms without
COMPAT_LINUX.
2012-10-27 23:14:37 +00:00
Alan Cox
e3978f3316 Eliminate a redundant TLB invalidation from pmap_pv_reclaim().
2012-10-27 22:43:30 +00:00
Tim Kientzle
e9e4f18efc Missing paren.
Pointy hat:	me
2012-10-27 22:13:42 +00:00
Devin Teske
c0d1bdc0b4 Fix bug introduced by r241902 (MANIFEST uses TAB delimiter).
PR:		bin/173140
Approved by:	adrian (co-mentor)
2012-10-27 19:56:57 +00:00
Hiroki Sato
c58c2dc7d5 Add setfib(1) support for services as <name>_fib in rc.conf.
2012-10-27 19:09:09 +00:00
Chris Rees
b2de5bffb6 Allow spaces in _chroot
Noticed by:	adj (IRC/#bsdports)
Approved by:	hrs
MFC after:	1 month
2012-10-27 17:43:30 +00:00
Alexander Kabaev
31e8efde08 Follow clang lead and include mm_malloc.h only in hosted configurations.
This makes the use of intrinsics easier in kernel environment, according
to the submitter.

Requested by: jmg
2012-10-27 17:39:36 +00:00
Hiroki Sato
274b8658fc Fix an issue when ipv6_enable=YES && ipv6_gateway_enable=YES which could
prevent rtadvd(8) from working as intended.

Spotted by:	brian
Discussed with:	brian
2012-10-27 17:06:26 +00:00
Nathan Whitehorn
86b32c0887 drm(4) works just fine on PowerPC, so connect it to the build.
MFC after:	2 weeks
2012-10-27 16:07:38 +00:00
Alexander Motin
8cff7eb82f Remove priority enforcement from xpt_action(). It is not good and not even
safe in some cases to reduce CCB priority after it was scheduled with high
priority.  This fixes a reproducible deadlock when a command is sent through the
pass interface while ATA XPT recovers from a command timeout.

Instead of that enforce priority at passioctl().  libcam provides no obvious
interface to specify CCB priority and so much (all?) code specifies zero
(highest) priority.  This change limits pass CCBs priority to NORMAL run
level, allowing XPT to complete bus and device recovery after reset before
running any payload.
2012-10-27 10:14:12 +00:00
Alexander Motin
15a2601b29 Remove several uses of numeric priorities from immediate CCB setups.
2012-10-27 09:40:29 +00:00
Alexander Motin
e1c2df4d30 Remove one more numeric priority constant.
2012-10-27 08:52:33 +00:00
Tim Kientzle
27da503bd6 Comment out the other BOOTP option
This should make PANDABOARD suitable for building
bootable SD images.

Submitted by:	Giovanni Trematerra
2012-10-27 04:02:12 +00:00
Warner Losh
648f9a3ccd stack_machdep.c is dependent on ddb or stack options, not standard.
2012-10-26 21:25:10 +00:00
Gleb Smirnoff
078468ede4 o Remove last argument to ip_fragment(), and obtain all needed information
on checksums directly from mbuf flags. This simplifies code.
o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in
  hardware. Some drivers may not announce CSUM_IP in their if_hwassist,
  although they try to do checksums if CSUM_IP is set on the mbuf. Example is em(4).
o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP.
  After this change CSUM_DELAY_IP vanishes from the stack.

Submitted by:	Sebastian Kuzminsky <seb lineratesystems.com>
2012-10-26 21:06:33 +00:00
Warner Losh
5c0adc7db8 Siba, in theory, is an architecturally neutral bus, so place it in
files.  It used to be in files.mips before the clean-room rewrite and
really doesn't belong there.  If we need to grow arch specific code,
we can move it into $ARCH/$ARCH/siba_machdep.c.
2012-10-26 20:43:30 +00:00
Eitan Adler
e8c6172f10 While 'make universe' passed this didn't work with clang.
This reverts r242120

Submitted by:	Jan Beich
Approved by: cperciva (implicit)
2012-10-26 20:25:05 +00:00
David E. O'Brien
4e359efd5b A little bit easier to read.
2012-10-26 20:24:13 +00:00
David E. O'Brien
aa1e1e87d0 Test both active and non-active cases.
2012-10-26 20:14:40 +00:00
Alexander Motin
23d9e39c1a Implement CAM_ATAIO_NEEDRESULT (fetching full set of result registers) for
ata(4) driver in ATA_CAM mode.  That slightly improves error reporting and
also should fix `smartctl -l scterc /dev/adaX` operation.

MFC after:	3 weeks
2012-10-26 20:03:08 +00:00
Adrian Chadd
fdf3d9543a Oops, missed in my last commit.
2012-10-26 19:46:55 +00:00
Adrian Chadd
f397643e36 Allow net80211 to be built on -9 and -8.
There are some people who use the -HEAD net80211 and wireless drivers
on earlier FreeBSD versions in order to get the updated 802.11n support.
The previous if_clone API changes broke this.
2012-10-26 19:06:24 +00:00