freebsd-dev

Author	SHA1	Message	Date
Michael Tuexen	d6194c562f	Remove a KASSERT which is not always true. In case of the empty queue tp->snd_holes and tcp_sackhole_insert() failing due to memory shortage, tp->snd_holes will be empty. This problem was hit when stress tests were performed by pho. PR: 215513 Reported by: pho Tested by: pho Sponsored by: Netflix, Inc.	2016-12-25 17:37:18 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Randall Stewart	55bceb1e2b	First cut of the modularization of our TCP stack. Still to do is to clean up the timer handling using the async-drain. Other optimizations may be coming to go with this. What's here will allow different tcp implementations (one included). Reviewed by: jtl, hiren, transports Sponsored by: Netflix Inc. Differential Revision: D4055	2015-12-16 00:56:45 +00:00
Hiren Panchasara	021eaf7996	One of the ways to detect loss is to count duplicate acks coming back from the other end till it reaches predetermined threshold which is 3 for us right now. Once that happens, we trigger fast-retransmit to do loss recovery. Main problem with the current implementation is that we don't honor SACK information well to detect whether an incoming ack is a dupack or not. RFC6675 has latest recommendations for that. According to it, dupack is a segment that arrives carrying a SACK block that identifies previously unknown information between snd_una and snd_max even if it carries new data, changes the advertised window, or moves the cumulative acknowledgment point. With the prevalence of Selective ACK (SACK) these days, improper handling can lead to delayed loss recovery. With the fix, new behavior looks like following: 0) th_ack < snd_una --> ignore Old acks are ignored. 1) th_ack == snd_una, !sack_changed --> ignore Acks with SACK enabled but without any new SACK info in them are ignored. 2) th_ack == snd_una, window == old_window --> increment Increment on a good dupack. 3) th_ack == snd_una, window != old_window, sack_changed --> increment When SACK enabled, it's okay to have advertized window changed if the ack has new SACK info. 4) th_ack > snd_una --> reset to 0 Reset to 0 when left edge moves. 5) th_ack > snd_una, sack_changed --> increment Increment if left edge moves but there is new SACK info. Here, sack_changed is the indicator that incoming ack has previously unknown SACK info in it. Note: This fix is not fully compliant to RFC6675. That may require a few changes to current implementation in order to keep per-sackhole dupack counter and change to the way we mark/handle sack holes. PR: 203663 Reviewed by: jtl MFC after: 3 weeks Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D4225	2015-12-08 21:21:48 +00:00
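The six cases listed in the commit above can be read as a small decision function. The following is a minimal sketch, not the code from the commit: `classify_ack()`, `enum dupack_action`, and the `sack_enabled` / `window_changed` parameters are hypothetical names, and the precedence between cases 1 to 3 (with SACK enabled, only new SACK information counts; without SACK, only an unchanged window counts) is an assumption drawn from the wording of the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe sequence comparison, in the style of the kernel's SEQ_LT(). */
#define	SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)

/* Hypothetical result type: what to do with the dupack counter. */
enum dupack_action { DUPACK_IGNORE, DUPACK_INCREMENT, DUPACK_RESET };

enum dupack_action
classify_ack(uint32_t th_ack, uint32_t snd_una, bool sack_enabled,
    bool sack_changed, bool window_changed)
{
	/* Case 0: old acks are ignored. */
	if (SEQ_LT(th_ack, snd_una))
		return (DUPACK_IGNORE);

	if (th_ack == snd_una) {
		/*
		 * Cases 1 and 3 (assumed precedence): with SACK enabled,
		 * only new SACK info makes the ack a dupack, whether or
		 * not the advertised window changed.
		 */
		if (sack_enabled)
			return (sack_changed ? DUPACK_INCREMENT :
			    DUPACK_IGNORE);
		/* Case 2: without SACK, require an unchanged window. */
		return (window_changed ? DUPACK_IGNORE : DUPACK_INCREMENT);
	}

	/* Cases 4 and 5: left edge moved; new SACK info still counts. */
	return (sack_changed ? DUPACK_INCREMENT : DUPACK_RESET);
}
```

A caller would bump, clear, or leave alone its duplicate-ack counter according to the returned action; the fast-retransmit threshold check itself is outside this sketch.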
Hiren Panchasara	12eeb81fc1	Calculate the correct amount of bytes that are in-flight for a connection as suggested by RFC 6675. Currently different places in the stack try to guess this in suboptimal ways. The main problem is that current calculations don't take sacked bytes into account. Sacked bytes are the bytes receiver acked via SACK option. This is suboptimal because it assumes that network has more outstanding (unacked) bytes than the actual value and thus sends less data by setting congestion window lower than what's possible which in turn may cause slower recovery from losses. As an example, one of the current calculations looks something like this: snd_nxt - snd_fack + sackhint.sack_bytes_rexmit New proposal from RFC 6675 is: snd_max - snd_una - sackhint.sacked_bytes + sackhint.sack_bytes_rexmit which takes sacked bytes into account which is a new addition to the sackhint struct. Only thing we are missing from RFC 6675 is isLost() i.e. segment being considered lost and thus adjusting pipe based on that which makes this calculation a bit on conservative side. The approach is very simple. We already process each ack with sack info in tcp_sack_doack() and extract sack blocks/holes out of it. We'd now also track this new variable sacked_bytes which keeps track of total sacked bytes reported. One downside to this approach is that we may get incorrect count of sacked_bytes if the other end decides to drop sack info in the ack because of memory pressure or some other reasons. But in this (not very likely) case also the pipe calculation would be conservative which is okay as opposed to being aggressive in sending packets into the network. Next step is to use this more accurate pipe estimation to drive congestion window adjustments. In collaboration with: rrs Reviewed by: jason_eggnet dot com, rrs MFC after: 2 weeks Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D3971	2015-10-28 22:57:51 +00:00
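The two in-flight estimates quoted in the commit above can be compared side by side. This is a minimal sketch under stated assumptions: `struct sack_hint_sketch`, `pipe_old()`, and `pipe_rfc6675()` are hypothetical names that mirror only the two `sackhint` fields the message mentions, and unsigned 32-bit arithmetic stands in for TCP sequence-space subtraction.

```c
#include <stdint.h>

/* Hypothetical subset of the sackhint fields named in the commit. */
struct sack_hint_sketch {
	uint32_t sack_bytes_rexmit;	/* bytes retransmitted this episode */
	uint32_t sacked_bytes;		/* bytes the receiver has SACKed */
};

/* Older estimate quoted in the commit: snd_nxt - snd_fack + rexmit. */
uint32_t
pipe_old(uint32_t snd_nxt, uint32_t snd_fack,
    const struct sack_hint_sketch *h)
{
	return ((snd_nxt - snd_fack) + h->sack_bytes_rexmit);
}

/* Estimate quoted from RFC 6675: outstanding minus SACKed plus rexmit. */
uint32_t
pipe_rfc6675(uint32_t snd_max, uint32_t snd_una,
    const struct sack_hint_sketch *h)
{
	return ((snd_max - snd_una) - h->sacked_bytes +
	    h->sack_bytes_rexmit);
}
```

Because the subtraction is modulo 2^32, the second function still yields the right distance when `snd_max` has wrapped past zero and `snd_una` has not.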
Gleb Smirnoff	6df8a71067	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.	2014-11-07 09:39:05 +00:00
Gleb Smirnoff	76039bc84f	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare for this event by adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 17:58:36 +00:00
Weongyo Jeong	c45e1b3cad	Covers values if (BYTES_THIS_ACK(tp, th) / tp->t_maxseg) value is from 2.0 to 3.0. Reviewed by: lstewart	2011-03-28 19:03:56 +00:00
Lawrence Stewart	bee9ab2bc5	Add a new sack hint to track the most recent and highest sacked sequence number. This will be used by the incoming Enhanced RTT Khelp module. Sponsored by: FreeBSD Foundation Submitted by: David Hayes <dahayes at swin edu au> Reviewed by: bz and others (as part of a larger patch) MFC after: 3 months	2010-12-28 03:27:20 +00:00
Lawrence Stewart	dbc4240942	This commit marks the first formal contribution of the "Five New TCP Congestion Control Algorithms for FreeBSD" FreeBSD Foundation funded project. More details about the project are available at: http://caia.swin.edu.au/freebsd/5cc/ - Add a KPI and supporting infrastructure to allow modular congestion control algorithms to be used in the net stack. Algorithms can maintain per-connection state if required, and connections maintain their own algorithm pointer, which allows different connections to concurrently use different algorithms. The TCP_CONGESTION socket option can be used with getsockopt()/setsockopt() to programmatically query or change the congestion control algorithm respectively from within an application at runtime. - Integrate the framework with the TCP stack in as least intrusive a manner as possible. Care was also taken to develop the framework in a way that should allow integration with other congestion aware transport protocols (e.g. SCTP) in the future. The hope is that we will one day be able to share a single set of congestion control algorithm modules between all congestion aware transport protocols. - Introduce a new congestion recovery (TF_CONGRECOVERY) state into the TCP stack and use it to decouple the meaning of recovery from a congestion event and recovery from packet loss (TF_FASTRECOVERY) a la RFC2581. ECN and delay based congestion control protocols don't generally need to recover from packet loss and need a different way to note a congestion recovery episode within the stack. - Remove the net.inet.tcp.newreno sysctl, which simplifies some portions of code and ensures the stack always uses the appropriate mechanisms for recovering from packet loss during a congestion recovery episode. - Extract the NewReno congestion control algorithm from the TCP stack and massage it into module form. NewReno is always built into the kernel and will remain the default algorithm for the foreseeable future.
Implementations of additional different algorithms will become available in the near future. - Bump __FreeBSD_version to 900025 and note in UPDATING that rebuilding code that relies on the size of "struct tcpcb" is required. Many thanks go to the Cisco University Research Program Fund at Community Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work at the Centre for Advanced Internet Architectures, Swinburne University of Technology is greatly appreciated. In collaboration with: David Hayes <dahayes at swin edu au> and Grenville Armitage <garmitage at swin edu au> Sponsored by: Cisco URP, FreeBSD Foundation Reviewed by: rpaulo Tested by: David Hayes (and many others over the years) MFC after: 3 months	2010-11-12 06:41:55 +00:00
Bjoern A. Zeeb	82cea7e6f3	MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days	2010-04-29 11:52:42 +00:00
Robert Watson	530c006014	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
Robert Watson	1e77c1056a	Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)	2009-07-16 21:13:04 +00:00
Robert Watson	eddfbb763d	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
Lawrence Stewart	91a5ebde45	Fix a race in the manipulation of the V_tcp_sack_globalholes global variable, which is currently not protected by any type of lock. When triggered, the bug would sometimes cause a panic when the TCP activity to an affected machine eventually slowed during a lull. The panic only occurs if INVARIANTS is compiled into the kernel, and has laid dormant for some time as a result of INVARIANTS being off by default except in FreeBSD-CURRENT. Switch to atomic operations in the locations where the variable is changed. Reads have not been updated to be protected by atomics, so there is a possibility of accounting errors in any given calculation where the variable is read. This is considered unlikely to occur in the wild, and will not cause serious harm on rare occasions where it does. Thanks to Robert Watson for debugging help. Reported by: Kamigishi Rei <spambox at haruhiism dot net> Tested by: Kamigishi Rei <spambox at haruhiism dot net> Reviewed by: silby Approved by: re (rwatson), kensmith (mentor temporarily unavailable)	2009-07-13 11:59:38 +00:00
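The fix described in the commit above replaces plain read-modify-write updates of a shared counter with atomic operations. As a userland illustration only, the sketch below uses C11 `<stdatomic.h>` rather than the kernel's atomic(9) primitives; the counter and the three helper names are hypothetical stand-ins for `V_tcp_sack_globalholes` and its update sites.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the global count of allocated SACK holes. */
static atomic_int sack_globalholes;

/* Called where a hole is allocated: atomic increment, no lock needed. */
void
globalholes_inc(void)
{
	atomic_fetch_add(&sack_globalholes, 1);
}

/* Called where a hole is freed: atomic decrement. */
void
globalholes_dec(void)
{
	atomic_fetch_sub(&sack_globalholes, 1);
}

/*
 * Snapshot of the counter.  This sketch loads atomically; the commit
 * notes that its reads were left unprotected, so a limit check made
 * from such a value may be slightly stale either way.
 */
int
globalholes_read(void)
{
	return (atomic_load(&sack_globalholes));
}
```

With plain `++` and `--`, two CPUs can both read the same old value and one update is lost, which is how a counter like this can drift until an INVARIANTS assertion fires.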
Robert Watson	78b5071407	Update stats in struct tcpstat using two new macros, TCPSTAT_ADD() and TCPSTAT_INC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures. MFC after: 3 days	2009-04-11 22:07:19 +00:00
Marko Zec	1ed81b739e	First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et. al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)	2009-04-06 22:29:41 +00:00
Marko Zec	385195c062	Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instantiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions.
Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-12-10 23:12:39 +00:00
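The `V_rt_tables` mapping quoted in the commit above can be tried out in isolation. This is a simplified sketch, not the kernel headers: `struct vnet_net_sketch` is a hypothetical one-field container, `_rt_tables` is reduced to an `int` (the real variable has a different type), and the two accessor functions exist only to show that calling code is identical under either setting.

```c
/*
 * Build with -DVIMAGE_GLOBALS to select the plain-global variant;
 * without it, V_rt_tables resolves to a field of the container.
 */
struct vnet_net_sketch {
	int _rt_tables;		/* simplified: an int stands in here */
};

#ifdef VIMAGE_GLOBALS
int rt_tables;
#define	V_rt_tables	rt_tables
#else
struct vnet_net_sketch vnet_net_0;
#define	V_rt_tables	vnet_net_0._rt_tables
#endif

/* Callers only ever spell the V_ name, never the storage behind it. */
void
rt_tables_set(int v)
{
	V_rt_tables = v;
}

int
rt_tables_get(void)
{
	return (V_rt_tables);
}
```

The point of the indirection is that consumers are written once against the `V_` name, so the storage can later move from a single container to per-vnet instances without touching them.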
Bjoern A. Zeeb	4b79449e2f	Rather than using hidden includes (with circular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation	2008-12-02 21:37:28 +00:00
Marko Zec	44e33a0758	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instantiation, assign initial values to them in initializer functions. As a rule, initialization at instantiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentially, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methodology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to initialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-19 09:39:34 +00:00
Robert Watson	4c95fd23d6	Remove endearing but syntactically unnecessary "return;" statements directly before the final closing brackets of some TCP functions. MFC after: 3 days	2008-10-26 19:33:22 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Robert Watson	8501a69cc9	Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to explicitly select write locking for all use of the inpcb mutex. Update some pcbinfo lock assertions to assert locked rather than write-locked, although in practice almost all uses of the pcbinfo rwlock remain exclusive, and all instances of inpcb lock acquisition are exclusive. This change should introduce (ideally) little functional change. However, it lays the groundwork for significantly increased parallelism in the TCP/IP code. MFC after: 3 months Tested by: kris (superset of committed patch)	2008-04-17 21:38:18 +00:00
Robert Watson	632bbf0f5b	Coalesce two identical UCB licenses into a single license instance with one set of copyright years. White space and comment cleanup. Export $FreeBSD$ via __FBSDID.	2007-05-11 11:21:43 +00:00
Robert Watson	f2565d68a4	Move universally to ANSI C function declarations, with relatively consistent style(9)-ish layout.	2007-05-10 15:58:48 +00:00
Andre Oppermann	b8152ba793	Change the TCP timer system from using the callout system five times directly to a merged model where only one callout, the next to fire, is registered. Instead of callout_reset(9) and callout_stop(9) the new function tcp_timer_activate() is used which then internally manages the callout. The single new callout is a mutex callout on inpcb simplifying the locking a bit. tcp_timer() is the called function which handles all race conditions in one place and then dispatches the individual timer functions. Reviewed by: rwatson (earlier version)	2007-04-11 09:45:16 +00:00
Andre Oppermann	5dd9dfefd6	Retire unused TCP_SACK_DEBUG.	2007-04-04 14:44:15 +00:00
Andre Oppermann	07b64b901a	In tcp_sack_doack() remove too tight KASSERT() added in last revision. This function may be called without any TCP SACK option blocks present. Protect iteration over SACK option blocks by checking for SACK options present flag first. Bug reported by: wkoszek, keramida, Nicolas Blais	2007-03-25 23:27:26 +00:00
Andre Oppermann	fc30a25199	Bring SACK option handling in tcp_dooptions() in line with all other options and adjust users accordingly.	2007-03-23 18:33:21 +00:00
Andre Oppermann	ad3f9ab320	ANSIfy function declarations and remove register keywords for variables. Consistently apply style to all function declarations.	2007-03-21 19:37:55 +00:00
Andre Oppermann	85c497918c	Make TCP_DROP_SYNFIN a standard part of TCP. Disabled by default it doesn't impede normal operation negatively and is only a few lines of code. Its close relatives blackhole and log_in_vain aren't options either.	2007-03-21 18:25:28 +00:00
Andre Oppermann	6489fe6553	Match up SYSCTL declaration style.	2007-03-19 19:00:51 +00:00
Mohan Srinivasan	1714e18e79	Eliminate debug code that catches bugs in the hinting of sack variables (tcp_sack_output_debug checks cached hints against computed values by walking the scoreboard and reports discrepancies). The sack hinting code has been stable for many months now so it is time for the debug code to go. Leaving tcp_sack_output_debug ifdef'ed out in case we need to resurrect it at a later point.	2006-04-06 17:21:16 +00:00
Mohan Srinivasan	1f65c2cd31	Certain (bad) values of sack blocks can end up corrupting the sack scoreboard. Make the checks in tcp_sack_doack() more robust to prevent this. Submitted by: Raja Mukerji (raja@mukerji.com) Reviewed by: Mohan Srinivasan	2006-04-05 00:11:04 +00:00
Andre Oppermann	8e8aab7aec	Remove unneeded includes and provide more accurate description to others. Submitted by: garys PR: kern/86437	2006-02-18 17:05:00 +00:00
Paul Saab	d0a14f55c3	Fix for a bug that causes SACK scoreboard corruption when the limit on holes per connection is reached. Reported by: Patrik Roos Submitted by: Mohan Srinivasan Reviewed by: Raja Mukerji, Noritoshi Demizu	2005-11-21 19:22:10 +00:00
Andre Oppermann	ef8fd90476	Remove unnecessary IPSEC includes. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-23 14:42:40 +00:00
Paul Saab	5a53ca1627	- Postpone SACK option processing until after PAWS checks. SACK option processing is now done in the ACK processing case. - Merge tcp_sack_option() and tcp_del_sackholes() into a new function called tcp_sack_doack(). - Test (SEG.ACK < SND.MAX) before processing the ACK. Submitted by: Noritoshi Demizu Reviewed by: Mohan Srinivasan, Raja Mukerji Approved by: re	2005-06-27 22:27:42 +00:00
Paul Saab	9004ded9df	Fix for a bug in tcp_sack_option() causing crashes. Submitted by: Noritoshi Demizu, Mohan Srinivasan. Approved by: re (scottl blanket SACK)	2005-06-23 00:18:54 +00:00
Paul Saab	e912f906d0	Fix a mis-merge. Remove a redundant call to tcp_sackhole_insert Submitted by: Mohan Srinivasan	2005-06-09 17:55:29 +00:00
Paul Saab	8b9bbaaa94	Fix for a crash in tcp_sack_option() caused by hitting the limit on the number of sack holes. Reported by: Andrey Chernov Submitted by: Noritoshi Demizu Reviewed by: Raja Mukerji	2005-06-09 14:01:04 +00:00
Paul Saab	db4b83fe49	Fix for a bug in the change that walks the scoreboard backwards from the tail (in tcp_sack_option()). The bug was caused by incorrect accounting of the retransmitted bytes in the sackhint. Reported by: Kris Kennaway. Submitted by: Noritoshi Demizu.	2005-06-06 19:46:53 +00:00
Paul Saab	9d17a7a64a	Changes to tcp_sack_option() that - Walks the scoreboard backwards from the tail to reduce the number of comparisons for each sack option received. - Introduce functions to add/remove sack scoreboard elements, making the code more readable. Submitted by: Noritoshi Demizu Reviewed by: Raja Mukerji, Mohan Srinivasan	2005-06-04 08:03:28 +00:00
Paul Saab	808f11b768	This conforms with the terminology in M.Mathis and J.Mahdavi, "Forward Acknowledgement: Refining TCP Congestion Control" SIGCOMM'96, August 1996. Submitted by: Noritoshi Demizu, Raja Mukerji	2005-05-25 17:55:27 +00:00
Paul Saab	64b5fbaa04	Rewrite of tcp_sack_option(). Kentaro Kurahone (NetBSD) pointed out that if we sort the incoming SACK blocks, we can update the scoreboard in one pass of the scoreboard. The added overhead of sorting up to 4 sack blocks is much lower than traversing (potentially) large scoreboards multiple times. The code was updating the scoreboard with multiple passes over it (once for each sack option). The rewrite fixes that, reducing the complexity of the main loop from O(n^2) to O(n). Submitted by: Mohan Srinivasan, Noritoshi Demizu. Reviewed by: Raja Mukerji.	2005-05-23 19:22:48 +00:00
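The idea in the commit above, sort the few incoming SACK blocks first and then walk the scoreboard once, can be sketched as follows. This is an illustration under stated assumptions, not the rewritten tcp_sack_option(): the struct and function names are hypothetical, the scoreboard is reduced to a sorted array of holes, blocks and holes are assumed non-overlapping among themselves, and the walk only counts holes that a block fully covers instead of editing the scoreboard.

```c
#include <stdint.h>

/* Wrap-safe sequence comparisons in the style of SEQ_LT()/SEQ_LEQ(). */
#define	SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)
#define	SEQ_LEQ(a, b)	((int32_t)((a) - (b)) <= 0)

/* Hypothetical half-open range [start, end) in sequence space. */
struct seqrange_sketch {
	uint32_t start;
	uint32_t end;
};

/* Insertion sort by start; cheap because a segment carries few blocks. */
void
sort_sack_blocks(struct seqrange_sketch *blk, int n)
{
	for (int i = 1; i < n; i++) {
		struct seqrange_sketch key = blk[i];
		int j = i - 1;

		while (j >= 0 && SEQ_LT(key.start, blk[j].start)) {
			blk[j + 1] = blk[j];
			j--;
		}
		blk[j + 1] = key;
	}
}

/* One pass over sorted holes and sorted blocks: O(nholes + nblk). */
int
count_covered_holes(const struct seqrange_sketch *hole, int nholes,
    const struct seqrange_sketch *blk, int nblk)
{
	int covered = 0, i = 0, j = 0;

	while (i < nholes && j < nblk) {
		if (SEQ_LEQ(blk[j].end, hole[i].start)) {
			j++;		/* block lies before this hole */
		} else if (SEQ_LEQ(hole[i].end, blk[j].start)) {
			i++;		/* hole lies before this block */
		} else {
			if (SEQ_LEQ(blk[j].start, hole[i].start) &&
			    SEQ_LEQ(hole[i].end, blk[j].end))
				covered++;
			/* Advance whichever range ends first. */
			if (SEQ_LEQ(hole[i].end, blk[j].end))
				i++;
			else
				j++;
		}
	}
	return (covered);
}
```

Without the sort, each block would need its own scan of the hole list, which is the multi-pass behaviour the commit message describes replacing.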
Paul Saab	4fc5324557	Introduce routines to alloc/free sack holes. This cleans up the code considerably. Submitted by: Noritoshi Demizu. Reviewed by: Raja Mukerji, Mohan Srinivasan.	2005-05-16 19:26:46 +00:00
Paul Saab	fdace17f81	Fix for a bug where the "nexthole" sack hint is out of sync with the real next hole to retransmit from the scoreboard, caused by a bug which did not update the "nexthole" hint in one case in tcp_sack_option(). Reported by: Daniel Eriksson Submitted by: Mohan Srinivasan	2005-05-13 18:02:02 +00:00
Paul Saab	0077b0163f	When looking for the next hole to retransmit from the scoreboard, or to compute the total retransmitted bytes in this sack recovery episode, the scoreboard is traversed. While in sack recovery, this traversal occurs on every call to tcp_output(), every dupack and every partial ack. The scoreboard could potentially get quite large, making this traversal expensive. This change optimizes this by storing hints (for the next hole to retransmit and the total retransmitted bytes in this sack recovery episode) reducing the complexity to find these values from O(n) to constant time. The debug code that sanity checks the hints against the computed value will be removed eventually. Submitted by: Mohan Srinivasan, Noritoshi Demizu, Raja Mukerji.	2005-05-11 21:37:42 +00:00
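The hinting optimization in the commit above trades a scoreboard walk for a cached total that is updated whenever bytes are retransmitted. The sketch below shows that trade on a hypothetical singly linked scoreboard; the struct layouts and both function names are assumptions for illustration, with only the `sack_bytes_rexmit` name borrowed from the later commit messages in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hole: [start, end) missing, [start, rxmit) already resent. */
struct hole_sketch {
	uint32_t start;
	uint32_t rxmit;
	uint32_t end;
	struct hole_sketch *next;
};

/* Hypothetical scoreboard with a cached total of retransmitted bytes. */
struct scoreboard_sketch {
	struct hole_sketch *head;
	uint32_t sack_bytes_rexmit;	/* the hint */
};

/* O(n) reference computation: walk every hole and sum what was resent. */
uint32_t
rexmit_bytes_walk(const struct scoreboard_sketch *sb)
{
	uint32_t total = 0;

	for (const struct hole_sketch *h = sb->head; h != NULL; h = h->next)
		total += h->rxmit - h->start;
	return (total);
}

/* Record a retransmission from a hole and keep the hint in step: O(1). */
void
note_rexmit(struct scoreboard_sketch *sb, struct hole_sketch *h,
    uint32_t len)
{
	uint32_t room = h->end - h->rxmit;

	if (len > room)
		len = room;	/* never resend past the end of the hole */
	h->rxmit += len;
	sb->sack_bytes_rexmit += len;
}
```

Comparing `sb->sack_bytes_rexmit` against `rexmit_bytes_walk()` is the kind of sanity check the commit says its debug code performed before that code was retired.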
Paul Saab	a6235da61e	- Make the sack scoreboard logic use the TAILQ macros. This improves code readability and facilitates some anticipated optimizations in tcp_sack_option(). - Remove tcp_print_holes() and TCP_SACK_DEBUG. Submitted by: Raja Mukerji. Reviewed by: Mohan Srinivasan, Noritoshi Demizu.	2005-04-21 20:11:01 +00:00

64 Commits