freebsd-dev

Author	SHA1	Message	Date
Ulrich Spörlein	7cc1fde083	mdoc: drop even more redundant .Pp calls No change in rendered output, less mandoc lint warnings. Tool provided by: Nobuyuki Koganemaru n-kogane at syd.odn.ne.jp	2010-10-19 12:35:40 +00:00
Bjoern A. Zeeb	12112cf676	MfP4 CH182763 (original version): Make it harder to exploit certain in_control() related races between the initial lookup at the beginning and the time we will remove the entry from the lists by re-checking that entry is still in the list before trying to remove it. (*) It is believed that with the current code and locking strategy we cannot completely fix all races. Reported by: Nima Misaghian (nima_misa hotmail.com) on net@ 20100817 Tested by: Nima Misaghian (nima_misa hotmail.com) (original version) PR: kern/146250 Submitted by: Mikolaj Golub (to.my.trociny gmail.com) (different version) MFC after: 1 week	2010-10-16 19:53:22 +00:00
Lawrence Stewart	ca09d7728b	Retire the system-wide, per-reassembly queue segment limit. The mechanism is far too coarse grained to be useful and the default value significantly degrades TCP performance on moderate to high bandwidth-delay product paths with non-zero loss (e.g. 5+Mbps connections across the public Internet often suffer). Replace the outgoing mechanism with an individual per-queue limit based on the number of MSS segments that fit into the socket's receive buffer. This should strike a good balance between performance and the potential for resource exhaustion when FreeBSD is acting as a TCP receiver. With socket buffer autotuning (which is enabled by default), the reassembly queue tracks the socket buffer and benefits too. As the XXX comment suggests, my testing uncovered some unexpected behaviour which requires further investigation. By using so->so_rcv.sb_hiwat instead of sbspace(&so->so_rcv), we allow more segments to be held across both the socket receive buffer and reassembly queue than we probably should. The tradeoff is better performance in at least one common scenario, versus a devious sender's ability to consume more resources on a FreeBSD receiver. Sponsored by: FreeBSD Foundation Reviewed by: andre, gnn, rpaulo MFC after: 2 weeks	2010-10-16 07:12:39 +00:00
Lawrence Stewart	c8dc0ab886	- Switch the "net.inet.tcp.reass.cursegments" and "net.inet.tcp.reass.maxsegments" sysctl variables to be based on UMA zone stats. The value returned by the cursegments sysctl is approximate owing to the way in which uma_zone_get_cur is implemented. - Discontinue use of V_tcp_reass_qsize as a global reassembly segment count variable in the reassembly implementation. The variable was used without proper synchronisation and was duplicating accounting done by UMA already. The lack of synchronisation was particularly problematic on SMP systems terminating many TCP sessions, resulting in poor TCP performance for connections with non-zero packet loss. Sponsored by: FreeBSD Foundation Reviewed by: andre, gnn, rpaulo (as part of a larger patch) MFC after: 2 weeks	2010-10-16 05:37:45 +00:00
Bjoern A. Zeeb	dc699bac75	Use ifa_ifwithaddr_check() rather than ifa_ifwithaddr() as we are not interested in the result and would leak a reference otherwise. PR: kern/151435 Submitted by: Andrew Boyer (aboyer averesystems.com) MFC after: 3 days	2010-10-14 12:32:49 +00:00
Luigi Rizzo	3f18b51c8d	put back the assignment to sched_time. It was correct, and it was necessary. Submitted by: Riccardo Panicucci	2010-10-01 15:38:35 +00:00
Bjoern A. Zeeb	544794507a	Proper bracketing. PR: kern/151100 Submitted by: SunMinghao (sunminghao hotmail.com) MFC after: 3 days	2010-10-01 11:48:14 +00:00
Luigi Rizzo	e53a34a766	remove an unnecessary (and wrong) assignment. It was meant to reset idle_time (and it was not needed), but i even used the wrong field. Obtained from: Oleg MFC after: 3 days	2010-09-29 21:02:31 +00:00
Luigi Rizzo	38cc301f9f	whitespace changes in preparation for future commits	2010-09-29 09:40:20 +00:00
Luigi Rizzo	a47ee22718	fix handling of initial credit for an idle pipe. This fixes the bug where setting bw > 1 MTU/tick resulted in infinite bandwidth if io_fast=1 PR: 147245 148429 Obtained from: Riccardo Panicucci MFC after: 3 days	2010-09-29 09:22:12 +00:00
Luigi Rizzo	8d74ca8ce9	fix breakage in in-kernel NAT: the code did not honor net.inet.ip.fw.one_pass and always moved to the next rule in case of a successful nat. This should fix several related PR (waiting for feedback before closing them) PR: 145167 149572 150141 MFC after: 3 days	2010-09-28 23:23:23 +00:00
Luigi Rizzo	c08e545e99	Whitespace changes to reduce diffs wrt the most recent ipfw/dummynet code: + remove an unused macro, + adjust the constants in an enum + small whitespace changes MFC after: 3 days	2010-09-28 22:46:13 +00:00
Xin LI	64e0f48e7c	Add a bandaid for a long-standing race condition during route entry un-expiring. The previous version of the code had no locking when testing rt_refcnt. The lack of locking could result in a condition where a routing entry has a reference count but at the same time has the RTPRF_OURS bit set and an expiration timer. This would eventually lead to a panic: panic: rtqkill route really not free This happens, for instance, when the system accepts ICMP redirects from a local gateway at a moderate frequency. Commit this workaround for now until we have some better solution. PR: kern/149804 Reviewed by: bz Tested by: Zhao Xin, Pete French MFC after: 2 weeks	2010-09-27 19:26:56 +00:00
Lawrence Stewart	d4d3e21865	Log the number of segments currently in the reassembly queue. Sponsored by: FreeBSD Foundation	2010-09-25 09:16:46 +00:00
Lawrence Stewart	0c236c4ebd	Internalise reassembly queue related functionality and variables which should not be used outside of the reassembly queue implementation. Provide a new function to flush all segments from a reassembly queue and call it from the appropriate places instead of manipulating the queue directly. Sponsored by: FreeBSD Foundation Reviewed by: andre, gnn, rpaulo MFC after: 2 weeks	2010-09-25 04:58:46 +00:00
Attilio Rao	109c1de8ba	Make the RPC specific __rpc_inet_ntop() and __rpc_inet_pton() general in the kernel (just as inet_ntoa() and inet_aton()) are and sync their prototype accordingly with already mentioned functions. Sponsored by: Sandvine Incorporated Reviewed by: emaste, rstone Approved by: dfr MFC after: 2 weeks	2010-09-24 15:01:45 +00:00
Attilio Rao	5f6bf4518d	IP_BINDANY is not correctly handled in getsockopt() case. Fix it by specifying the correct bits. Sponsored by: Sandvine Incorporated Reviewed by: bz, emaste, rstone Obtained from: Sandvine Incorporated MFC after: 10 days	2010-09-24 14:38:54 +00:00
Gleb Smirnoff	6baf7a243a	Do not convert some meaningful error value to EINVAL. Reviewed by: will	2010-09-20 12:23:10 +00:00
Michael Tuexen	1ea735c802	Fix a locking issue which resulted in aborted associations due to a corrupted nr-mapping array. MFC after: 2 weeks.	2010-09-20 12:19:11 +00:00
Michael Tuexen	231b700b17	Allow the initial congestion window to be configured to one MTU. Improve the description. MFC after: 2 weeks.	2010-09-19 11:57:21 +00:00
Michael Tuexen	f8faf20cf6	Fix a locking issue which shows up when the code is used on Mac OS X. MFC after: 2 weeks.	2010-09-19 11:42:16 +00:00
Andre Oppermann	ed42031102	Rearrange the TSO code to make it more readable and to clearly separate the decision logic, of whether we can do TSO, and the calculation of the burst length into two distinct parts. Change the way the TSO burst length calculation is done. While TSO could do bursts of 65535 bytes that can't be represented in ip_len together with the IP and TCP header. Account for that and use IP_MAXPACKET instead of TCP_MAXWIN as base constant (both have the same value of 64K). When more data is available prevent less than MSS sized segments from being sent during the current TSO burst. Add two more KASSERTs to ensure the integrity of the packets. Tested by: Ben Wilber <ben-at-desync com> MFC after: 10 days	2010-09-17 22:05:27 +00:00
Michael Tuexen	99ddc825f3	Fix a bug where the wrong PR-SCTP policy was considered. While there, use always the same code for the check of TTL expiration. MFC after: 2 weeks.	2010-09-17 19:20:39 +00:00
Michael Tuexen	dcfc062535	Make the initial congestion window configurable via sysctl. MFC after: 2 weeks.	2010-09-17 18:53:07 +00:00
Michael Tuexen	25a2a18706	* Implement initial version of send buffer splitting. * Make send/recv buffer splitting switchable via sysctl. * While there: Fix some comments.	2010-09-17 16:20:29 +00:00
Andre Oppermann	1c18314d17	Remove the TCP inflight bandwidth limiter as announced in r211315 to give way for the pluggable congestion control framework. It is the task of the congestion control algorithm to set the congestion window and amount of inflight data without external interference. In 'struct tcpcb' the variables previously used by the inflight limiter are renamed to spares to keep the ABI intact and to have some more space for future extensions. In 'struct tcp_info' the variable 'tcpi_snd_bwnd' is not removed to preserve the ABI. It is always set to 0. In siftr.c in 'struct pkt_node' the variable 'snd_bwnd' is not removed to preserve the ABI. It is always set to 0. These unused variable in the various structures may be reused in the future or garbage collected before the next release or at some other point when an ABI change happens anyway for other reasons. No MFC is planned. The inflight bandwidth limiter stays disabled by default in the other branches but remains available.	2010-09-16 21:06:45 +00:00
Andre Oppermann	2c9879e8d3	Improve comment to TCP_MINMSS by taking the wording from lstewart (with a small difference in the last paragraph though) as suggested by jhb. Clarify that the 'reviewed by' in r212653 by lstewart was for the functional change, not the comments in the committed version.	2010-09-16 12:13:06 +00:00
Michael Tuexen	b3f7949dc5	Remove old debug code. MFC after: 2 weeks.	2010-09-15 23:56:25 +00:00
Michael Tuexen	94b0d96992	Remove unused variable/assignment. MFC after: 3 weeks.	2010-09-15 23:40:36 +00:00
Michael Tuexen	9eea4a2da7	Delay the assignment of a path for DATA chunk until they hit the sent_queue. Honor a given path when the SCTP_ADDR_OVER flag is set. MFC after: 2 weeks.	2010-09-15 23:10:45 +00:00
Michael Tuexen	24f52bbd9b	Use TAILQ_EMPTY() for testing if a tail queue is empty. Set whoFrom to NULL after freeing whoFrom.	2010-09-15 21:53:10 +00:00
Michael Tuexen	3c8c191bae	Remove unused variable/assignment. MFC after: 2 weeks.	2010-09-15 21:19:54 +00:00
Michael Tuexen	b90b577ff3	Remove assignment without effect. MFC after: 2 weeks.	2010-09-15 21:08:57 +00:00
Michael Tuexen	107cad7449	* Use !TAILQ_EMPTY() for checking if a tail queue is not empty. * Remove assignment without any effect. MFC after: 2 weeks.	2010-09-15 20:53:20 +00:00
Andre Oppermann	c183b9c683	Change the default MSS for IPv4 and IPv6 TCP connections from an artificial power-of-2 rounded number to their real values specified in RFC879 and RFC2460. From the history and existing comments it appears that the rounded numbers were intended to be advantageous for the kernel and mbuf system. However this hasn't been the case for a long time. The mbuf clusters used in tcp_output() have enough space to hold the larger real value for the default MSS for both IPv4 and IPv6. Note that the default MSS is only used when path MTU discovery is disabled. Update and expand related comments. Reviewed by: lsteward (including some word-smithing) MFC after: 2 weeks	2010-09-15 10:39:30 +00:00
Qing Li	a458eaa039	Adding an address on an interface also requires the loopback route to that address be installed. PR: kern/150481 Submitted by: Ingo Flaschberger <if at xip.at> MFC after: 5 days	2010-09-12 18:04:47 +00:00
Michael Tuexen	e95307c5c5	* Remove code which has no effect. * Clean up the handling in sctp_lower_sosend(). MFC after: 3 weeks.	2010-09-09 20:51:23 +00:00
Will Andrews	15249f73e9	Fix CARP in backup mode by properly registering its hooks for INET and INET6 using ipproto_{un,}register() and the newly created ip6proto_{un,}register() so that it can again receive IPPROTO_CARP packets allowing its state machine to work. Reviewed by: bz Approved by: ken (mentor)	2010-09-06 21:06:06 +00:00
Will Andrews	e24fa11d3e	Fix static kernel builds with carp(4) by changing its SYSINIT order so that it is initialized after basic protocol initialization, which allows it to register via pf_proto_register(). Reviewed by: bz Approved by: ken (mentor)	2010-09-06 21:03:30 +00:00
Gleb Smirnoff	14a268a073	in_delayed_cksum() requires host byte order. Reported by: Alexander Levin <amindomao googlemail.com> MFC after: 1 week	2010-09-06 13:17:01 +00:00
Michael Tuexen	049640c1f0	Implement correct handling of address parameter and sendinfo for SCTP send calls. MFC after: 4 weeks.	2010-09-05 20:13:07 +00:00
Randall Stewart	52129fcd78	Fix some CLANG warnings. One clang warning is left due to the fact that it's bogus. nam->sa_family will not change from AF_INET6 to AF_INET (but clang thinks it does ;-D)	2010-09-05 13:41:45 +00:00
Bjoern A. Zeeb	42db1b87d6	In case of RADIX_MPATH do not leak the IN_IFADDR read lock on early return. MFC after: 3 days	2010-09-04 16:06:01 +00:00
Bjoern A. Zeeb	1b48d24533	MFp4 CH=183052 183053 183258: In protosw we define pr_protocol as short, while on the wire it is an uint8_t. That way we can have "internal" protocols like DIVERT, SEND or gaps for modules (PROTO_SPACER). Switch ipproto_{un,}register to accept a short protocol number(*) and do an upfront check for valid boundaries. With this we also consistently report EPROTONOSUPPORT for out of bounds protocols, as we did for proto == 0. This allows a caller to not error for this case, which is especially important if we want to automatically call these from domain handling. (*) the functions have been without any in-tree consumer since the initial introduction, so this is considered safe. Implement ip6proto_{un,}register() similarly to their legacy IP counterparts to allow modules to hook up dynamically. Reviewed by: philip, will MFC after: 1 week	2010-09-02 17:43:44 +00:00
Michael Tuexen	fc0487080a	Fix a bug which results in peer IPv4 addresses a.b.c.d with 224<=d<=239 incorrectly being detected as multicast addresses on little endian systems. MFC after: 2 weeks	2010-09-01 16:11:26 +00:00
Maxim Konovalov	5a47f206a1	o Some programs could send broadcast/multicast traffic to ipfw pseudo-interface. This leads to a panic due to uninitialized if_broadcastaddr address. Initialize it and implement ip_output() method to prevent mbuf leak later. ipfw pseudo-interface should never send anything therefore call panic(9) in if_start() method. PR: kern/149807 Submitted by: Dmitrij Tejblum MFC after: 2 weeks	2010-08-30 09:29:51 +00:00
Michael Tuexen	9c7635e18b	Fix the SCTP_WITH_NO_CSUM option when used in combination with interface supporting CRC offload. While at it, make use of the feature that the loopback interface provides CRC offloading. MFC after: 4 weeks	2010-08-29 18:50:30 +00:00
Michael Tuexen	e24ea413e0	Bugfix: Do not send a packet drop report in response to a received INIT-ACK with incorrect CRC.	2010-08-28 21:15:00 +00:00
Michael Tuexen	20083c2eb1	Fix the switching on/off of CMT using sysctl and socket option. Fix the switching on/off of PF and NR-SACKs using sysctl. Add minor improvement in handling malloc failures. Improve the address checks when sending. MFC after: 4 weeks	2010-08-28 17:59:51 +00:00
John Baldwin	98b9eb0db2	Simplify the tcp pcblist estimate logic slightly. MFC after: 3 days	2010-08-27 18:17:46 +00:00
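
Commit fc0487080a above describes peer addresses a.b.c.d with 224<=d<=239 being taken for multicast on little endian systems. The following is a minimal sketch of how that class of bug can arise, assuming the cause is a class D test applied to an address that is still in network byte order. It is not the SCTP code from the commit; the helper names `is_multicast_host_order` and `check` are hypothetical and exist only for this illustration.

```python
import socket
import struct


def is_multicast_host_order(addr: int) -> bool:
    """Class D test on a host-order 32-bit address: top four bits are 1110."""
    return (addr & 0xF0000000) == 0xE0000000


def check(dotted: str) -> tuple:
    """Return (correct_result, misread_result) for a dotted-quad address.

    correct_result: the test applied after converting from network order.
    misread_result: the test applied to the raw wire bytes interpreted as a
    little-endian integer, which is what happens when the byte order
    conversion is skipped on a little endian machine (assumed bug shape).
    """
    wire = socket.inet_aton(dotted)           # 4 bytes, network byte order
    correct = struct.unpack("!I", wire)[0]    # proper network-to-host conversion
    misread = struct.unpack("<I", wire)[0]    # bytes read as little endian
    return is_multicast_host_order(correct), is_multicast_host_order(misread)


if __name__ == "__main__":
    for address in ("10.0.0.230", "224.0.0.1", "10.0.0.5"):
        print(address, check(address))
```

In this sketch the unicast address 10.0.0.230 passes the multicast test only in the misread case, because its last octet ends up in the most significant byte.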


3895 Commits