freebsd-dev

Author	SHA1	Message	Date
Matt Macy	6573d7580b	epoch(9): allow preemptible epochs to compose - Add tracker argument to preemptible epochs - Inline epoch read path in kernel and tied modules - Change in_epoch to take an epoch as argument - Simplify tfb_tcp_do_segment to not take a ti_locked argument, there's no longer any benefit to dropping the pcbinfo lock and trying to do so just adds an error prone branchfest to these functions - Remove cases of same function recursion on the epoch as recursing is no longer free. - Remove the the TAILQ_ENTRY and epoch_section from struct thread as the tracker field is now stack or heap allocated as appropriate. Tested by: pho and Limelight Networks Reviewed by: kbowling at llnw dot com Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16066	2018-07-04 02:47:16 +00:00
Randall Stewart	89e560f441	This commit brings in a new refactored TCP stack called Rack. Rack includes the following features: - A different SACK processing scheme (the old sack structures are not used). - RACK (Recent acknowledgment) where counting dup-acks is no longer done instead time is used to knwo when to retransmit. (see the I-D) - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt to try not to take a retransmit time-out. (see the I-D) - Burst mitigation using TCPHTPS - PRR (partial rate reduction) see the RFC. Once built into your kernel, you can select this stack by either socket option with the name of the stack is "rack" or by setting the global sysctl so the default is rack. Note that any connection that does not support SACK will be kicked back to the "default" base FreeBSD stack (currently known as "default"). To build this into your kernel you will need to enable in your kernel: makeoptions WITH_EXTRA_TCP_STACKS=1 options TCPHPTS Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15525	2018-06-07 18:18:13 +00:00
Matt Macy	10d20c84ed	Fix spurious retransmit recovery on low latency networks TCP's smoothed RTT (SRTT) can be much larger than an actual observed RTT. This can be either because of hz restricting the calculable RTT to 10ms in VMs or 1ms using the default 1000hz or simply because SRTT recently incorporated a larger value. If an ACK arrives before the calculated badrxtwin (now + SRTT): tp->t_badrxtwin = ticks + (tp->t_srtt >> (TCP_RTT_SHIFT + 1)); We'll erroneously reset snd_una to snd_max. If multiple segments were dropped and this happens repeatedly the transmit rate will be limited to 1MSS per RTO until we've retransmitted all drops. Reported by: rstone Reviewed by: hiren, transport Approved by: sbruno MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8556	2018-05-08 02:22:34 +00:00
Michael Tuexen	4c6a10903f	SImplify the call to tcp_drop(), since the handling of soft error is also done in tcp_drop(). No functional change. Sponsored by: Netflix, Inc.	2018-05-02 20:04:31 +00:00
Jonathan T. Looney	2529f56ed3	Add the "TCP Blackbox Recorder" which we discussed at the developer summits at BSDCan and BSDCam in 2017. The TCP Blackbox Recorder allows you to capture events on a TCP connection in a ring buffer. It stores metadata with the event. It optionally stores the TCP header associated with an event (if the event is associated with a packet) and also optionally stores information on the sockets. It supports setting a log ID on a TCP connection and using this to correlate multiple connections that share a common log ID. You can log connections in different modes. If you are doing a coordinated test with a particular connection, you may tell the system to put it in mode 4 (continuous dump). Or, if you just want to monitor for errors, you can put it in mode 1 (ring buffer) and dump all the ring buffers associated with the connection ID when we receive an error signal for that connection ID. You can set a default mode that will be applied to a particular ratio of incoming connections. You can also manually set a mode using a socket option. This commit includes only basic probes. rrs@ has added quite an abundance of probes in his TCP development work. He plans to commit those soon. There are user-space programs which we plan to commit as ports. These read the data from the log device and output pcapng files, and then let you analyze the data (and metadata) in the pcapng files. Reviewed by: gnn (previous version) Obtained from: Netflix, Inc. Relnotes: yes Differential Revision: https://reviews.freebsd.org/D11085	2018-03-22 09:40:08 +00:00
John Baldwin	f17985319d	Export tcp_always_keepalive for use by the Chelsio TOM module. This used to work by accident with ld.bfd even though always_keepalive was marked as static. LLD honors static more correctly, so export this variable properly (including moving it into the tcp_* namespace). Reviewed by: bz, emaste MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14129	2018-01-30 23:01:37 +00:00
Pedro F. Giffuni	51369649b0	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
Gleb Smirnoff	e29c55e4bb	Declare pmtud_blackhole global variables in tcp_timer.h, so that alternative TCP stacks can legally use them.	2017-10-06 20:33:40 +00:00
Michael Tuexen	3d5af7a127	Fix blackhole detection. There were two bugs related to the blackhole detection: * The smalles size was tried more than two times. * The restored MSS was not the original one, but the second candidate. MFC after: 1 week Sponsored by: Netflix, Inc.	2017-08-28 11:41:18 +00:00
Sean Bruno	32a04bb81d	Use counter(9) for PLPMTUD counters. Remove unused PLPMTUD sysctl counters. Bump UPDATING and FreeBSD Version to indicate a rebuild is required. Submitted by: kevin.bowling@kev009.com Reviewed by: jtl Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12003	2017-08-25 19:41:38 +00:00
Gleb Smirnoff	cc65eb4e79	Hide struct inpcb, struct tcpcb from the userland. This is a painful change, but it is needed. On the one hand, we avoid modifying them, and this slows down some ideas, on the other hand we still eventually modify them and tools like netstat(1) never work on next version of FreeBSD. We maintain a ton of spares in them, and we already got some ifdef hell at the end of tcpcb. Details: - Hide struct inpcb, struct tcpcb under _KERNEL \|\| _WANT_FOO. - Make struct xinpcb, struct xtcpcb pure API structures, not including kernel structures inpcb and tcpcb inside. Export into these structures the fields from inpcb and tcpcb that are known to be used, and put there a ton of spare space. - Make kernel and userland utilities compilable after these changes. - Bump __FreeBSD_version. Reviewed by: rrs, gnn Differential Revision: D10018	2017-03-21 06:39:49 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Ryan Stone	5ede40dcf2	Don't zero out srtt after excess retransmits If the TCP stack has retransmitted more than 1/4 of the total number of retransmits before a connection drop, it decides that its current RTT estimate is hopelessly out of date and decides to recalculate it from scratch starting with the next ACK. Unfortunately, it implements this by zeroing out the current RTT estimate. Drop this hack entirely, as it makes it significantly more difficult to debug connection issues. Instead check for excessive retransmits at the point where srtt is updated from an ACK being received. If we've exceeded 1/4 of the maximum retransmits, discard the previous srtt estimate and replace it with the latest rtt measurement. Differential Revision: https://reviews.freebsd.org/D9519 Reviewed by: gnn Sponsored by: Dell EMC Isilon	2017-02-11 17:05:08 +00:00
Jonathan T. Looney	6d172f58a2	The code currently resets the keepalive timer each time a packet is received on a TCP session that has entered the ESTABLISHED state. This results in a lot of calls to reset the keepalive timer. This patch changes the behavior so we set the keepalive timer for the keepalive idle time (TP_KEEPIDLE). When the keepalive timer fires, it will first check to see if the session has been idle for TP_KEEPIDLE ticks. If not, it will reschedule the keepalive timer for the time the session will have been idle for TP_KEEPIDLE ticks. For a session with regular communication, the keepalive timer should fire approximately once every TP_KEEPIDLE ticks. For sessions with irregular communication, the keepalive timer might fire more often. But, the disruption from a periodic keepalive timer should be less than the regular cost of resetting the keepalive timer on every packet. (FWIW, this change saved approximately 1.73% of the busy CPU cycles on a particular test system with a heavy TCP output load. Of course, the actual impact is very specific to the particular hardware and workload.) Reviewed by: gallatin, rrs MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D8243	2016-10-14 14:57:43 +00:00
Randall Stewart	eadd00f81a	A few more wording tweaks as suggested (with some modifications as well) by Ravi Pokala. Thanks for the comments :-) Sponsored by: Netflix Inc.	2016-08-16 15:17:36 +00:00
Randall Stewart	0fa047b98c	Comments describing how to properly use the new lock_add functions and its respective companion. Sponsored by: Netflix Inc.	2016-08-16 13:08:03 +00:00
Randall Stewart	b07fef500b	This cleans up the timer code in TCP and also makes it so we do not take the INFO lock unless we are really going to delete the TCB. Differential Revision: D7136	2016-08-16 12:40:56 +00:00
Randall Stewart	5105a92c49	This small change adopts the excellent suggestion for using named structures in the add of a new tcp-stack that came in late to me via email after the last commit. It also makes it so that a new stack may optionally get a callback during a retransmit timeout. This allows the new stack to clear specific state (think sack scoreboards or other such structures). Sponsored by: Netflix Inc. Differential Revision: http://reviews.freebsd.org/D6303	2016-05-17 09:53:22 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Randall Stewart	e5ad64562a	This cleans up the timers code in TCP to start using the new async_drain functionality. This as been tested in NF as well as by Verisign. Still to do in here is to remove all the old flags. They are currently left being maintained but probably are no longer needed. Sponsored by: Netflix Inc. Differential Revision: http://reviews.freebsd.org/D5924	2016-04-28 13:27:12 +00:00
George V. Neville-Neil	84cc0778d0	FreeBSD previously provided route caching for TCP (and UDP). Re-add route caching for TCP, with some improvements. In particular, invalidate the route cache if a new route is added, which might be a better match. The cache is automatically invalidated if the old route is deleted. Submitted by: Mike Karels Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D4306	2016-03-24 07:54:56 +00:00
Gleb Smirnoff	4644fda3f7	Rename netinet/tcp_cc.h to netinet/cc/cc.h. Discussed with: lstewart	2016-01-27 17:59:39 +00:00
Hiren Panchasara	0645c6049d	Persist timers TCPTV_PERSMIN and TCPTV_PERSMAX are hardcoded with 5 seconds and 60 seconds, respectively. Turn them into sysctls that can be tuned live. The default values of 5 seconds and 60 seconds have been retained. Submitted by: Jason Wolfe (j at nitrology dot com) Reviewed by: gnn, rrs, hiren, bz MFC after: 1 week Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D5024	2016-01-26 16:33:38 +00:00
Gleb Smirnoff	2de3e790f5	- Rename cc.h to more meaningful tcp_cc.h. - Declare it a kernel only include, which it already is. - Don't include tcp.h implicitly from tcp_cc.h	2016-01-21 22:34:51 +00:00
Gleb Smirnoff	0c39d38d21	Historically we have two fields in tcpcb to describe sender MSS: t_maxopd, and t_maxseg. This dualism emerged with T/TCP, but was not properly cleaned up after T/TCP removal. After all permutations over the years the result is that t_maxopd stores a minimum of peer offered MSS and MTU reduced by minimum protocol header. And t_maxseg stores (t_maxopd - TCPOLEN_TSTAMP_APPA) if timestamps are in action, or is equal to t_maxopd otherwise. That's a very rough estimate of MSS reduced by options length. Throughout the code it was used in places, where preciseness was not important, like cwnd or ssthresh calculations. With this change: - t_maxopd goes away. - t_maxseg now stores MSS not adjusted by options. - new function tcp_maxseg() is provided, that calculates MSS reduced by options length. The functions gives a better estimate, since it takes into account SACK state as well. Reviewed by: jtl Differential Revision: https://reviews.freebsd.org/D3593	2016-01-07 00:14:42 +00:00
Patrick Kelsey	281a0fd4f9	Implementation of server-side TCP Fast Open (TFO) [RFC7413]. TFO is disabled by default in the kernel build. See the top comment in sys/netinet/tcp_fastopen.c for implementation particulars. Reviewed by: gnn, jch, stas MFC after: 3 days Sponsored by: Verisign, Inc. Differential Revision: https://reviews.freebsd.org/D4350	2015-12-24 19:09:48 +00:00
Randall Stewart	55bceb1e2b	First cut of the modularization of our TCP stack. Still to do is to clean up the timer handling using the async-drain. Other optimizations may be coming to go with this. Whats here will allow differnet tcp implementations (one included). Reviewed by: jtl, hiren, transports Sponsored by: Netflix Inc. Differential Revision: D4055	2015-12-16 00:56:45 +00:00
Randall Stewart	7c4676ddee	This fixes several places where callout_stops return is examined. The new return codes of -1 were mistakenly being considered "true". Callout_stop now returns -1 to indicate the callout had either already completed or was not running and 0 to indicate it could not be stopped. Also update the manual page to make it more consistent no non-zero in the callout_stop or callout_reset descriptions. MFC after: 1 Month with associated callout change.	2015-11-13 22:51:35 +00:00
Hiren Panchasara	adf43a9279	Fix an unnecessarily aggressive behavior where mtu clamping begins on first retransmission timeout (rto) when blackhole detection is enabled. Make sure it only happens when the second attempt to send the same segment also fails with rto. Also make sure that each mtu probing stage (usually 1448 -> 1188 -> 524) follows the same pattern and gets 2 chances (rto) before further clamping down. Note: RFC4821 doesn't specify implementation details on how this situation should be handled. Differential Revision: https://reviews.freebsd.org/D3434 Reviewed by: sbruno, gnn (previous version) MFC after: 2 weeks Sponsored by: Limelight Networks	2015-10-14 06:57:28 +00:00
George V. Neville-Neil	5d06879adb	dd DTrace probe points, translators and a corresponding script to provide the TCPDEBUG functionality with pure DTrace. Reviewed by: rwatson MFC after: 2 weeks Sponsored by: Limelight Networks Differential Revision: D3530	2015-09-13 15:50:55 +00:00
Julien Charbon	d6de19ac2f	Put r284245 back in place: If at first this fix was seen as a temporary workaround for a callout(9) issue, it turns out it is instead the right way to use callout in mpsafe mode without using callout_drain(). r284245 commit message: Fix a callout race condition introduced in TCP timers callouts with r281599. In TCP timer context, it is not enough to check callout_stop() return value to decide if a callout is still running or not, previous callout_reset() return values have also to be checked. Differential Revision: https://reviews.freebsd.org/D2763	2015-08-30 13:44:39 +00:00
Julien Charbon	bcf9b91395	Revert r284245: "Fix a callout race condition introduced in TCP timers callouts with r281599." r281599 fixed a TCP timer race condition, but due a callout(9) bug it also introduced another race condition workaround-ed with r284245. The callout(9) bug being fixed with r286880, we can now revert the workaround (r284245). Differential Revision: https://reviews.freebsd.org/D2079 (Initial change) Differential Revision: https://reviews.freebsd.org/D2763 (Workaround) Differential Revision: https://reviews.freebsd.org/D3078 (Fix) Sponsored by: Verisign, Inc. MFC after: 2 weeks	2015-08-24 09:30:27 +00:00
Julien Charbon	31a7749d4b	Make clear that TIME_WAIT timeout expiration is managed solely by tcp_tw_2msl_scan(). Sponsored by: Verisign, Inc.	2015-08-18 08:27:26 +00:00
Julien Charbon	ff9b006d61	Decompose TCP INP_INFO lock to increase short-lived TCP connections scalability: - The existing TCP INP_INFO lock continues to protect the global inpcb list stability during full list traversal (e.g. tcp_pcblist()). - A new INP_LIST lock protects inpcb list actual modifications (inp allocation and free) and inpcb global counters. It allows to use TCP INP_INFO_RLOCK lock in critical paths (e.g. tcp_input()) and INP_INFO_WLOCK only in occasional operations that walk all connections. PR: 183659 Differential Revision: https://reviews.freebsd.org/D2599 Reviewed by: jhb, adrian Tested by: adrian, nitroboost-gmail.com Sponsored by: Verisign, Inc.	2015-08-03 12:13:54 +00:00
Julien Charbon	cad814eedf	Fix a callout race condition introduced in TCP timers callouts with r281599. In TCP timer context, it is not enough to check callout_stop() return value to decide if a callout is still running or not, previous callout_reset() return values have also to be checked. Differential Revision: https://reviews.freebsd.org/D2763 Reviewed by: hiren Approved by: hiren MFC after: 1 day Sponsored by: Verisign, Inc.	2015-06-10 20:43:07 +00:00
Julien Charbon	5571f9cf81	Fix an old and well-documented use-after-free race condition in TCP timers: - Add a reference from tcpcb to its inpcb - Defer tcpcb deletion until TCP timers have finished Differential Revision: https://reviews.freebsd.org/D2079 Submitted by: jch, Marc De La Gueronniere <mdelagueronniere@verisign.com> Reviewed by: imp, rrs, adrian, jhb, bz Approved by: jhb Sponsored by: Verisign, Inc.	2015-04-16 10:00:06 +00:00
Julien Charbon	033749179f	Provide better debugging information in tcp_timer_activate() and tcp_timer_active() Differential Revision: https://reviews.freebsd.org/D2179 Suggested by: bz Reviewed by: jhb Approved by: jhb	2015-04-02 14:43:07 +00:00
Julien Charbon	18832f1fd1	Use appropriate timeout_t* instead of void* in tcp_timer_activate() Suggested by: imp Differential Revision: https://reviews.freebsd.org/D2154 Reviewed by: imp, jhb Approved by: jhb	2015-03-31 10:17:13 +00:00
Adrian Chadd	b2bdc62a95	Refactor / restructure the RSS code into generic, IPv4 and IPv6 specific bits. The motivation here is to eventually teach netisr and potentially other networking subsystems a bit more about how RSS work queues / buckets are configured so things have a hope of auto-configuring in the future. * net/rss_config.[ch] takes care of the generic bits for doing configuration, hash function selection, etc; * topelitz.[ch] is now in net/ rather than netinet/; * (and would be in libkern if it didn't directly include RSS_KEYSIZE; that's a later thing to fix up.) * netinet/in_rss.[ch] now just contains the IPv4 specific methods; * and netinet/in6_rss.[ch] now just contains the IPv6 specific methods. This should have no functional impact on anyone currently using the RSS support. Differential Revision: D1383 Reviewed by: gnn, jfv (intel driver bits)	2015-01-18 18:06:40 +00:00
Julien Charbon	cea40c4888	Fix a race condition in TCP timewait between tcp_tw_2msl_reuse() and tcp_tw_2msl_scan(). This race condition drives unplanned timewait timeout cancellation. Also simplify implementation by holding inpcb reference and removing tcptw reference counting. Differential Revision: https://reviews.freebsd.org/D826 Submitted by: Marc De la Gueronniere <mdelagueronniere@verisign.com> Submitted by: jch Reviewed By: jhb (mentor), adrian, rwatson Sponsored by: Verisign, Inc. MFC after: 2 weeks X-MFC-With: r264321	2014-10-30 08:53:56 +00:00
Hans Petter Selasky	f0188618f2	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
Sean Bruno	882ac53ed7	Handle small file case with regards to plpmtud blackhole detection. Submitted by: Mikhail <mp@lenta.ru> MFC after: 2 weeks Relnotes: yes	2014-10-13 21:06:21 +00:00
Sean Bruno	f6f6703f27	Implement PLPMTUD blackhole detection (RFC 4821), inspired by code from xnu sources. If we encounter a network where ICMP is blocked the Needs Frag indicator may not propagate back to us. Attempt to downshift the mss once to a preconfigured value. Default this feature to off for now while we do not have a full PLPMTUD implementation in our stack. Adds the following new sysctl's for control: net.inet.tcp.pmtud_blackhole_detection -- turns on/off this feature net.inet.tcp.pmtud_blackhole_mss -- mss to try for ipv4 net.inet.tcp.v6pmtud_blackhole_mss -- mss to try for ipv6 Adds the following new sysctl's for monitoring: -- Number of times the code was activated to attempt a mss downshift net.inet.tcp.pmtud_blackhole_activated -- Number of times the blackhole mss was used in an attempt to downshift net.inet.tcp.pmtud_blackhole_min_activated -- Number of times that we failed to connect after we downshifted the mss net.inet.tcp.pmtud_blackhole_failed Phabricator: https://reviews.freebsd.org/D506 Reviewed by: rpaulo bz MFC after: 2 weeks Relnotes: yes Sponsored by: Limelight Networks	2014-10-07 21:50:28 +00:00
Adrian Chadd	8f7e75cbbd	If we're doing RSS then ensure the TCP timer selection uses the multi-CPU callwheel setup, rather than just dumping all the timers on swi0.	2014-06-30 04:26:29 +00:00
Adrian Chadd	883831c675	When RSS is enabled and per cpu TCP timers are enabled, do an RSS lookup for the inp flowid/flowtype to destination CPU. This only modifies the case where RSS is enabled and the per-cpu tcp timer option is enabled. Otherwise the behaviour should be the same as before.	2014-05-18 22:39:01 +00:00
John Baldwin	66eefb1eae	Currently, the TCP slow timer can starve TCP input processing while it walks the list of connections in TIME_WAIT closing expired connections due to contention on the global TCP pcbinfo lock. To remediate, introduce a new global lock to protect the list of connections in TIME_WAIT. Only acquire the TCP pcbinfo lock when closing an expired connection. This limits the window of time when TCP input processing is stopped to the amount of time needed to close a single connection. Submitted by: Julien Charbon <jcharbon@verisign.com> Reviewed by: rwatson, rrs, adrian MFC after: 2 months	2014-04-10 18:15:35 +00:00
Davide Italiano	5b999a6be0	- Make callout(9) tickless, relying on eventtimers(4) as backend for precise time event generation. This greatly improves granularity of callouts which are not anymore constrained to wait next tick to be scheduled. - Extend the callout KPI introducing a set of callout_reset_sbt* functions, which take a sbintime_t as timeout argument. The new KPI also offers a way for consumers to specify precision tolerance they allow, so that callout can coalesce events and reduce number of interrupts as well as potentially avoid scheduling a SWI thread. - Introduce support for dispatching callouts directly from hardware interrupt context, specifying an additional flag. This feature should be used carefully, as long as interrupt context has some limitations (e.g. no sleeping locks can be held). - Enhance mechanisms to gather informations about callwheel, introducing a new sysctl to obtain stats. This change breaks the KBI. struct callout fields has been changed, in particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t' (8 bytes) and another 'sbintime_t' field was added for precision. Together with: mav Reviewed by: attilio, bde, luigi, phk Sponsored by: Google Summer of Code 2012, iXsystems inc. Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm), markj (amd64), mav, Fabian Keil	2013-03-04 11:09:56 +00:00
John Baldwin	6c0ef8957f	Don't drop options from the third retransmitted SYN by default. If the SYNs (or SYN/ACK replies) are dropped due to network congestion, then the remote end of the connection may act as if options such as window scaling are enabled but the local end will think they are not. This can result in very slow data transfers in the case of window scaling disagreements. The old behavior can be obtained by setting the net.inet.tcp.rexmit_drop_options sysctl to a non-zero value. Reviewed by: net@ MFC after: 2 weeks	2013-01-09 20:27:06 +00:00
Navdeep Parhar	825fd1e437	Make sure that tcp_timer_activate() correctly sees TCP_OFFLOAD (or not).	2012-11-27 06:42:44 +00:00
Andre Oppermann	322181c98e	If the user has closed the socket then drop a persisting connection after a much reduced timeout. Typically web servers close their sockets quickly under the assumption that the TCP connections goes away as well. That is not entirely true however. If the peer closed the window we're going to wait for a long time with lots of data in the send buffer. MFC after: 2 weeks	2012-10-28 19:58:20 +00:00

1 2 3 4

179 Commits