freebsd-dev

Author	SHA1	Message	Date
Michael Tuexen	e045904fdc	Fix a potential race in returning setting errno when an association goes down. Reported by Mozilla in https://bugzilla.mozilla.org/show_bug.cgi?id=845513 MFC after: 3 days	2013-02-27 19:51:47 +00:00
Andrew Gallatin	e5ca1ffab5	Fix tcp_lro_rx_ipv4() for drivers that do not set CSUM_IP_CHECKED. Specifically, in_cksum_hdr() returns 0 (not 0xffff) when the IPv4 checksum is correct. Without this fix, the tcp_lro code will reject good IPv4 traffic from drivers that do not implement IPv4 header hardware csum offload. Sponsored by: Myricom Inc. MFC after: 7 days	2013-02-21 17:00:35 +00:00
Sergey Kandaurov	46f2df9c13	ip_savecontrol() style fixes. No functional changes. - fix indentation - put the operator at the end of the line for long statements - remove spaces between the type and the variable in a cast - remove excessive parentheses Tested by: md5	2013-02-20 15:44:40 +00:00
Michael Tuexen	2416af26a0	Send the adaptation layer indication only if set by the user. MFC after: 3 days Discussed with: rrs	2013-02-11 21:02:49 +00:00
Michael Tuexen	c53f854a17	Don't send kernel provided information in the User Initiated ABORT cause, since the user can also provide this kind of information. So the receiver doesn't know who provided the information. While there: Fix a bug where the stack would send a malformed ABORT chunk when using a send() call with SCTP_ABORT|SCTP_SENDALL flags. MFC after: 3 days	2013-02-11 13:57:03 +00:00
Gleb Smirnoff	24421c1c32	Resolve source address selection in presence of CARP. Add a couple of helper functions: - carp_master() - boolean function which is true if an address is in the MASTER state. - ifa_preferred() - boolean function that compares two addresses, and is aware of CARP. Utilize ifa_preferred() in ifa_ifwithnet(). The previous version of patch also changed source address selection logic in jails using carp_master(), but we failed to negotiate this part with Bjoern. Maybe we will approach this problem again later. Reported & tested by: Anton Yuzhaninov <citrin citrin.ru> Sponsored by: Nginx, Inc	2013-02-11 10:58:22 +00:00
Michael Tuexen	f0d44a49a0	Make sure that received packets for removed addresses are handled consistently. While there, make variable names consistent. MFC after: 3 days	2013-02-10 19:57:19 +00:00
Michael Tuexen	a1cb341b5d	Cleanup the handling of address scopes. Announce in the INIT/INIT-ACK only the supported address types. While there, do some whitespace cleanups. MFC after: 1 week	2013-02-09 17:26:14 +00:00
Michael Tuexen	c39cfa1f7e	Fix a bug where HEARTBEATs were still sent in SHUTDOWN_SENT or SHUTDOWN_ACK_SENT state. While there, make the corresponding code consistent. MFC after: 1 week	2013-02-09 08:27:08 +00:00
John Baldwin	0d25fab44d	Add placeholder constants to reserve a portion of the socket option name space for use by downstream vendors to add custom options. MFC after: 2 weeks	2013-02-01 15:32:20 +00:00
Andre Oppermann	cda3447bb0	uma_zone_set_max() directly returns the rounded effective zone limit. Use the return value directly instead of doing a second uma_zone_set_max() step. MFC after: 1 week	2013-02-01 14:21:09 +00:00
Gleb Smirnoff	498944374f	- Move AUTHORS and ACKNOWLEDGEMENTS to the end of the page. - Add myself to list of authors.	2013-01-31 10:29:22 +00:00
Gleb Smirnoff	9711a168b9	Retire struct sockaddr_inarp. Since ARP and routing are separated, "proxy only" entries don't have any meaning, thus we don't need additional field in sockaddr to pass SIN_PROXY flag. New kernel is binary compatible with old tools, since sizes of sockaddr_inarp and sockaddr_in match, and sa_family are filled with same value. The structure declaration is left for compatibility with third party software, but in-tree code no longer uses it. Reviewed by: ru, andre, net@	2013-01-31 08:55:21 +00:00
Gleb Smirnoff	ea26ed7eea	Utilize m_get2() to get mbuf of appropriate size.	2013-01-30 18:40:19 +00:00
Navdeep Parhar	adfaf8f6ad	Add checks for SO_NO_OFFLOAD in a couple of places that I missed earlier in r245915.	2013-01-26 01:41:42 +00:00
Navdeep Parhar	20be068c8a	Teach toe_l2_resolve to resolve IPv6 destinations too. Reviewed by: bz@	2013-01-26 00:57:29 +00:00
Navdeep Parhar	4364ec0852	Move lle_event to if_llatbl.h lle_event replaced arp_update_event after the ARP rewrite and ended up in if_ether.h simply because arp_update_event used to be there too. IPv6 neighbor discovery is going to grow lle_event support and this is a good time to move it to if_llatbl.h. The two in-tree consumers of this event - OFED and toecore - are not affected. Reviewed by: bz@	2013-01-25 23:58:21 +00:00
Navdeep Parhar	460cf046c2	There is no need to call into the TOE driver twice in pru_rcvd (tod_rcvd and then tod_output right after that). Reviewed by: bz@	2013-01-25 22:50:52 +00:00
Navdeep Parhar	464dfeb43f	Add TCP_OFFLOAD hook in syncache_respond for IPv6 too, just like the one that exists for IPv4. Reviewed by: bz@	2013-01-25 22:16:35 +00:00
Navdeep Parhar	b218348bc3	Teach toe_4tuple_check() to deal with IPv6 4-tuples too. Reviewed by: bz@	2013-01-25 20:45:24 +00:00
Navdeep Parhar	37cc0ecb1b	Heed SO_NO_OFFLOAD. MFC after: 1 week	2013-01-25 20:23:33 +00:00
Navdeep Parhar	5cd3dcaa25	Remove redundant test, we know inp_lport is 0. MFC after: 1 week	2013-01-25 20:14:27 +00:00
John Baldwin	1d77fa5a26	Use decimal values for UDP and TCP socket options rather than hex to avoid implying that these constants should be treated as bit masks. Reviewed by: net MFC after: 1 week	2013-01-22 19:45:04 +00:00
Lawrence Stewart	5b648e797b	Simplify and fix a bug in cc_ack_received()'s "are we congestion window limited" logic (refer to [1] for associated discussion). snd_cwnd and snd_wnd are unsigned long and on 64 bit hosts, min() will truncate them to 32 bits and could therefore potentially corrupt the result (although under normal operation, neither variable should legitimately exceed 32 bits). [1] http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034297.html Submitted by: jhb MFC after: 1 week	2013-01-22 09:44:21 +00:00
John Baldwin	6c0ef8957f	Don't drop options from the third retransmitted SYN by default. If the SYNs (or SYN/ACK replies) are dropped due to network congestion, then the remote end of the connection may act as if options such as window scaling are enabled but the local end will think they are not. This can result in very slow data transfers in the case of window scaling disagreements. The old behavior can be obtained by setting the net.inet.tcp.rexmit_drop_options sysctl to a non-zero value. Reviewed by: net@ MFC after: 2 weeks	2013-01-09 20:27:06 +00:00
Peter Wemm	8a1163e82f	Temporarily revert rev 244678. This is causing loopback problems with the lo (loopback) interfaces.	2013-01-03 10:21:28 +00:00
Michael Tuexen	11e03b3200	Some cleanups. MFC after: 3 days	2012-12-27 08:10:58 +00:00
Michael Tuexen	72c123a8b4	Minor cleanups of debug messages. MFC after: 3 days	2012-12-27 08:06:58 +00:00
Michael Tuexen	2c2e3218cb	Fix a copy and paste error. MFC after: 3 days	2012-12-27 08:02:58 +00:00
Gleb Smirnoff	c4d0697685	Garbage collect carp_cksum().	2012-12-25 14:29:38 +00:00
Gleb Smirnoff	7951008b47	Change net.inet.carp.demotion sysctl to add the supplied value to the current demotion factor instead of assigning it. This allows external scripts to control demotion factor together with kernel in a raceless manner.	2012-12-25 14:08:13 +00:00
Gleb Smirnoff	e8db9937f3	Fix sysctl_handle_int() usage. Either arg1 or arg2 should be supplied, and arg2 doesn't pass size of arg1.	2012-12-25 13:55:21 +00:00
Gleb Smirnoff	468e45f3bd	The SIOCSIFFLAGS ioctl handler runs if_up()/if_down() that notify all interested parties in case if interface flag IFF_UP has changed. However, not only SIOCSIFFLAGS can raise the flag, but SIOCAIFADDR and SIOCAIFADDR_IN6 can, too. The actual |= is done not in the protocol code, but in code of interface drivers. To fix this historical layering violation, we will check whether ifp->if_ioctl(SIOCSIFADDR) raised the IFF_UP flag, and if it did, run the if_up() handler. This fixes configuring an address under CARP control on an interface that was initially !IFF_UP. P.S. I intentionally omitted handling the IFF_SMART flag. This flag was never ever used in any driver since it was introduced, and since it means another layering violation, it should be garbage collected instead of pretended to be supported.	2012-12-25 13:01:58 +00:00
Gleb Smirnoff	3e6c8b5366	Minor style(9) changes: - Remove declaration in initializer. - Add empty line between logical blocks.	2012-12-24 21:35:48 +00:00
Gleb Smirnoff	b8056fae06	Fix !INET6 build after r244365.	2012-12-18 08:14:16 +00:00
Gleb Smirnoff	dd029d52fa	Clear correct flag in INET6 case.	2012-12-18 08:09:44 +00:00
Andrey V. Elsukov	f491274582	Since we use different flags to detect tcp forwarding, and we share the same code for IPv4 and IPv6 in tcp_input, we should check both M_IP_NEXTHOP and M_IP6_NEXTHOP flags. MFC after: 3 days	2012-12-17 20:55:33 +00:00
Gleb Smirnoff	b1ec2940af	Fix problem in r238990. The LLE_LINKED flag should be tested prior to entering llentry_free(), and in case if we lose the race, we should simply perform LLE_FREE_LOCKED(). Otherwise, if the race is lost by the thread performing arptimer(), it will remove two references from the lle instead of one. Reported by: Ian FREISLICH <ianf clue.co.za>	2012-12-13 11:11:15 +00:00
Gleb Smirnoff	78a7880f64	Fix a crash in tcp_input(), that happens when mbuf has a fwd_tag on it, but later after processing and freeing the tag, we need to jump back again to the findpcb label. Since the fwd_tag pointer wasn't NULL we tried to process and free the tag for second time. Reported & tested by: Pawel Tyll <ptyll nitronet.pl> MFC after: 3 days	2012-12-12 17:41:21 +00:00
Michael Tuexen	cca6f4a8f3	Get it compiling without INET and INET6 support (mainly userland stack). MFC after: 2 weeks	2012-12-08 15:11:09 +00:00
Pawel Jakub Dawidek	6acd596efb	More warnings for zones that depend on the kern.ipc.maxsockets limit. Obtained from: WHEEL Systems	2012-12-08 12:51:06 +00:00
Michael Tuexen	b11f07d86c	Use correct padding of the ABORT chunk in case of an user initiated abort cause is used. MFC after: 2 weeks	2012-12-08 09:50:38 +00:00
Michael Tuexen	3fb7827628	Ensure that the padding of the last parameter of an INIT chunk is not included in the chunk length as required by RFC 4960. While there, cleanup sctp_send_initiate(). MFC after: 2 weeks	2012-12-08 08:22:33 +00:00
Gleb Smirnoff	eb1b1807af	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually	2012-12-05 08:04:20 +00:00
Andre Oppermann	da2299c5c7	Remove unused and unnecessary CSUM_IP_FRAGS checksumming capability. Checksumming the IP header of fragments is no different from doing normal IP headers. Discussed with: yongari MFC after: 1 week	2012-11-27 19:31:49 +00:00
Andre Oppermann	13feab8286	Add DELACK to list of timers. MFC after: 1 week	2012-11-27 19:07:28 +00:00
Navdeep Parhar	825fd1e437	Make sure that tcp_timer_activate() correctly sees TCP_OFFLOAD (or not).	2012-11-27 06:42:44 +00:00
Alfred Perlstein	08373e0bc4	Auto size the tcbhashsize structure based on max sockets. While here, also make the code that enforces power-of-two more forgiving, instead of just resetting to 512, graciously round-down to the next lower power of two.	2012-11-27 03:04:24 +00:00
Michael Tuexen	a50f0e3152	Add support for sctp_peeloff() also in the front states of the association. MFC after: 3 days	2012-11-26 16:44:03 +00:00
Michael Tuexen	e3976bb8d7	Find the endpoint for an incoming packet also if the endpoint comes from sctp_peeloff(). MFC after: 3 days	2012-11-26 16:43:32 +00:00
Michael Tuexen	440da2d35b	Allow shutdown() to be used on fds returned from sctp_peeloff(). MFC after: 3 days	2012-11-26 08:50:00 +00:00
Michael Tuexen	a3158782c2	Remove unused function. MFC after: 1 week	2012-11-25 14:25:08 +00:00
Michael Tuexen	3a51a2647a	Add support for SCTP/UDP/IPV6. This completes the support of http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-udp-encaps MFC after: 1 week	2012-11-17 20:04:04 +00:00
Michael Tuexen	325c8c46b1	Get the accounting working. We now have counters how many chunks for each SCTP outgoing stream are in the send and sent queue. While there, improve the naming of NR-SACK related constants recently introduced. MFC after: 1 week	2012-11-16 19:39:10 +00:00
Roman Divacky	8252626fb4	Initialize hdrlen to 0 to avoid clang warning in NOINET case.	2012-11-10 10:41:00 +00:00
Bjoern A. Zeeb	ec89d0398b	Cleanup some whitespace in this file to get it out of an upcoming patch. MFC after: 10 days	2012-11-08 03:29:55 +00:00
Michael Tuexen	a7ad6026e0	Add per outgoing stream accounting for chunks in the send and sent queue. This provides no functional change, but is a preparation for an upcoming stream reset improvement. Done with rrs@. MFC after: 1 week	2012-11-07 22:11:38 +00:00
Michael Tuexen	2a4985847a	Add some missing changes missed in the last commit. MFC after: 1 week X-MFC with: 242708	2012-11-07 21:25:32 +00:00
Michael Tuexen	98f2956c11	Improve PR-SCTP if used in combination with NR-SACK. Based on work done by Mohammad Rajiullah. MFC after: 1 week	2012-11-07 20:59:00 +00:00
Kevin Lo	0f5e7edc14	Fix typo; s/ouput/output	2012-11-07 07:00:59 +00:00
Mateusz Guzik	8e1e6e5f4a	Fix possible spurious sbunlock in sctp_sorecvmsg. Reviewed by: tuexen Approved by: trasz (mentor) MFC after: 3 days	2012-11-06 23:04:23 +00:00
Michael Tuexen	f3b05218ea	Move from early SSN assignment to late SSN assignment. This doesn't change functionality, but makes upcoming change much easier. Developed with rrs@ at the IETF 85. MFC after: 1 week	2012-11-05 20:55:17 +00:00
Andre Oppermann	60ee3bb213	Back out r242262. The simplified window change/update logic wasn't complete and ready for production use. PR: kern/173309	2012-11-05 09:13:06 +00:00
Andrey V. Elsukov	ffdbf9da3b	Remove the recently added sysctl variable net.pfil.forward. Instead, add protocol specific mbuf flags M_IP_NEXTHOP and M_IP6_NEXTHOP. Use them to indicate that the mbuf's chain contains the PACKET_TAG_IPFORWARD tag. And do a tag lookup only when this flag is set. Suggested by: andre	2012-11-02 01:20:55 +00:00
Michael Tuexen	21f67da7c4	Whitespace changes due to upstream integration of SCTP changes in the FreeBSD code base.	2012-10-29 20:47:32 +00:00
Michael Tuexen	24d4ce2c87	Add braces (as used elsewhere in the SCTP code).	2012-10-29 20:44:29 +00:00
Michael Tuexen	09c1c8563a	Use ntohs() and htons() in correct order. However, this doesn't change functionality.	2012-10-29 20:42:48 +00:00
Andre Oppermann	78f59b4bfd	Forced commit to provide the correct commit message to r242251: Defer sending an independent window update if a delayed ACK is pending saving a packet. The window update then gets piggy-backed on the next already scheduled ACK. Added grammar fixes as well. MFC after: 2 weeks	2012-10-29 13:16:33 +00:00
Andre Oppermann	8d045dbdf3	Define the delayed ACK timeout value directly as hz/10 instead of obfuscating it by going through PR_FASTHZ. No functional change. MFC after: 2 weeks	2012-10-29 12:17:02 +00:00
Andre Oppermann	322181c98e	If the user has closed the socket then drop a persisting connection after a much reduced timeout. Typically web servers close their sockets quickly under the assumption that the TCP connection goes away as well. That is not entirely true however. If the peer closed the window we're going to wait for a long time with lots of data in the send buffer. MFC after: 2 weeks	2012-10-28 19:58:20 +00:00
Andre Oppermann	09440655fe	Increase the initial CWND to 10 segments as defined in IETF TCPM draft-ietf-tcpm-initcwnd-05. It explains why the increased initial window improves the overall performance of many web services without risking congestion collapse. As long as it remains a draft it is placed under a sysctl marking it as experimental: net.inet.tcp.experimental.initcwnd10 = 1 When it becomes an official RFC soon the sysctl will be changed to the RFC number and moved to net.inet.tcp. This implementation differs from the RFC draft in that it is a bit more conservative in the case of packet loss on SYN or SYN|ACK because we haven't reduced the default RTO to 1 second yet. Also the restart window isn't yet increased as allowed. Both will be adjusted with upcoming changes. It is enabled by default. In Linux it is enabled since kernel 3.0. MFC after: 2 weeks	2012-10-28 19:47:46 +00:00
Andre Oppermann	77339e1cdc	Update comment to reflect the change made in r242263. MFC after: 2 weeks	2012-10-28 19:22:18 +00:00
Andre Oppermann	c4ab59c1a1	Add SACK_PERMIT to the list of TCP options that are switched off after retransmitting a SYN three times. MFC after: 2 weeks	2012-10-28 19:20:23 +00:00
Andre Oppermann	79ce26a08c	Simplify and enhance the window change/update acceptance logic, especially in the presence of bi-directional data transfers. snd_wl1 tracks the right edge, including data in the reassembly queue, of valid incoming data. This makes it like rcv_nxt plus reassembly. It never goes backwards to prevent older, possibly reordered segments from updating the window. snd_wl2 tracks the left edge of sent data. This makes it a duplicate of snd_una. However joining them right now is difficult due to separate update dependencies in different places in the code flow. snd_wnd tracks the current advertized send window by the peer. In tcp_output() the effective window is calculated by subtracting the already in-flight data, snd_nxt less snd_una, from it. ACK's become the main clock of window updates and will always update the window when the left edge of what we sent is advanced. The ACK clock is the primary signaling mechanism in ongoing data transfers. This works reliably even in the presence of reordering, reassembly and retransmitted segments. The ACK clock is most important because it determines how much data we are allowed to inject into the network. Zero window updates that get us out of persistence mode are crucial. Here a segment that neither moves ACK nor SEQ but enlarges WND is accepted. When the ACK clock is not active (that is we're not or no longer sending any data) any segment that moves the extended right SEQ edge, including out-of-order segments, updates the window. This gives us updates especially during ping-pong transfers where the peer isn't done consuming the already acknowledged data from the receive buffer while responding with data. The SSH protocol is a prime candidate to benefit from the improved bi-directional window update logic as it has its own windowing mechanism on top of TCP and is frequently sending back protocol ACK's. Tcpdump provided by: darrenr Tested by: darrenr MFC after: 2 weeks	2012-10-28 19:16:22 +00:00
Andre Oppermann	024fd5b6bb	For retransmits of SYN|ACK from the syncache use the slightly more aggressive special tcp_syn_backoff[] retransmit schedule instead of the normal tcp_backoff[] schedule for established connections. MFC after: 2 weeks	2012-10-28 19:02:07 +00:00
Andre Oppermann	f4748ef5fb	When retransmitting SYN in TCPS_SYN_SENT state use TCPTV_RTOBASE, the default retransmit timeout, as base to calculate the backoff time until next try instead of the TCP_REXMTVAL() macro which only works correctly when we already have measured an actual RTT+RTTVAR. Before it would cause the first retransmit at RTOBASE, the next four at the same time (!) about 200ms later, and then another one again RTOBASE later. MFC after: 2 weeks	2012-10-28 18:56:57 +00:00
Andre Oppermann	602e8e45ee	Remove bogus 'else' in #ifdef that prevented the rttvar from being reset in tcp_timer_rexmt() on retransmit for IPv6 sessions. MFC after: 2 weeks	2012-10-28 18:45:04 +00:00
Andre Oppermann	4faaea5505	Allow arbitrary MSS sizes and don't mind about the cluster size anymore. We've got more cluster sizes for quite some time now and the originally imposed limits and the previously codified thoughts on efficiency gains are no longer true. MFC after: 2 weeks	2012-10-28 18:33:52 +00:00
Andre Oppermann	f3a10d7954	Change the syncache count reporting the current number of entries from an unprotected u_int that reports garbage on SMP to a function based sysctl obtaining the current value from UMA. Also read back the actual cache_limit after page size rounding by UMA. PR: kern/165879 MFC after: 2 weeks	2012-10-28 18:07:34 +00:00
Andre Oppermann	aafa0b4164	Simplify implementation of net.inet.tcp.reass.maxsegments and net.inet.tcp.reass.cursegments. MFC after: 2 weeks	2012-10-28 17:59:46 +00:00
Andre Oppermann	f62563d33c	Prevent a flurry of forced window updates when an application is doing small reads on a (partially) filled receive socket buffer. Normally one would send a window update every time the available space in the socket buffer increases by two times MSS. This leads to a flurry of window updates that do not provide any meaningful new information to the sender. There still is available space in the window and the sender can continue sending data. All window updates then get carried by the regular ACKs. Only when the socket buffer was (almost) full and the window closed accordingly does a window update deliver new information and allow the sender to start sending more data again. Send window updates only every two MSS when the socket buffer has less than 1/8 space available, or the available space in the socket buffer increased by 1/4 its full capacity, or the socket buffer is very small. The next regular data ACK will carry and report the exact window size again. Reported by: sbruno Tested by: darrenr Tested by: Darren Baginski PR: kern/116335 MFC after: 2 weeks	2012-10-28 17:40:35 +00:00
Andre Oppermann	4249614cb0	When SYN or SYN/ACK had to be retransmitted RFC5681 requires us to reduce the initial CWND to one segment. This reduction got lost some time ago due to a change in initialization ordering. Additionally in tcp_timer_rexmt() avoid entering fast recovery when we're still in TCPS_SYN_SENT state. MFC after: 2 weeks	2012-10-28 17:30:28 +00:00
Andre Oppermann	cf8f04f4c0	When SYN or SYN/ACK had to be retransmitted RFC5681 requires us to reduce the initial CWND to one segment. This reduction got lost some time ago due to a change in initialization ordering. Additionally in tcp_timer_rexmt() avoid entering fast recovery when we're still in TCPS_SYN_SENT state. MFC after: 2 weeks	2012-10-28 17:25:08 +00:00
Andre Oppermann	22efabd40c	Adjust the initial default CWND upon connection establishment to the new and increased values specified by RFC5681 Section 3.1. The even larger initial CWND per RFC3390, if enabled, is not affected. MFC after: 2 weeks	2012-10-28 17:16:09 +00:00
Gleb Smirnoff	078468ede4	o Remove last argument to ip_fragment(), and obtain all needed information on checksums directly from mbuf flags. This simplifies code. o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in hardware. Some driver may not announce CSUM_IP in their if_hwassist, although try to do checksums if CSUM_IP set on mbuf. Example is em(4). o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP. After this change CSUM_DELAY_IP vanishes from the stack. Submitted by: Sebastian Kuzminsky <seb lineratesystems.com>	2012-10-26 21:06:33 +00:00
Andrey V. Elsukov	c1de64a495	Remove the IPFIREWALL_FORWARD kernel option and make it possible to turn on the related functionality at runtime via the sysctl variable net.pfil.forward. It is turned off by default. Sponsored by: Yandex LLC Discussed with: net@ MFC after: 2 weeks	2012-10-25 09:39:14 +00:00
Gleb Smirnoff	a7f707cd37	After r241923 the updated ip_len no longer needed.	2012-10-25 09:02:21 +00:00
Gleb Smirnoff	b6fcf6f9f5	Fix error in r241913 that had broken fragment reassembly.	2012-10-25 09:00:57 +00:00
Gleb Smirnoff	9e2a372fd2	Use ip_stripoptions() instead of handrolled version.	2012-10-23 10:30:09 +00:00
Gleb Smirnoff	4937a6561f	Simplify ip_stripoptions() reducing number of intermediate variables.	2012-10-23 10:29:31 +00:00
Gleb Smirnoff	8ad458a471	Do not reduce ip_len by size of IP header in the ip_input() before passing a packet to protocol input routines. For several protocols this means that now protocol needs to do subtraction itself, and for another half this means that we do not need to add header length back to the packet. Make ip_stripoptions() to adjust ip_len, since now we enter this function with a packet header whose ip_len does represent length of entire packet, not payload only.	2012-10-23 08:33:13 +00:00
Xin LI	6f56329a25	Remove __P. Submitted by: kevlo Reviewed by: md5(1) MFC after: 2 months	2012-10-22 21:49:56 +00:00
Gleb Smirnoff	8f134647ca	Switch the entire IPv4 stack to keep the IP packet header in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet. After this change a packet processed by the stack isn't modified at all[2] except for TTL. After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack. [1] One exception still remains. The raw sockets convert host byte order before passing a packet to an application. Probably this would remain for ages for compatibility. [2] The ip_input() still subtracts header len from ip->ip_len, but this is planned to be fixed soon. Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>	2012-10-22 21:09:03 +00:00
Andrey Zonov	32fe38f123	- Update cachelimit after hashsize and bucketlimit were set. Reported by: az Reviewed by: melifaro Approved by: kib (mentor) MFC after: 1 week	2012-10-19 14:00:03 +00:00
Andre Oppermann	c9b652e3e8	Mechanically remove the last stray remains of spl* calls from net/. They have been Noop's for a long time now.	2012-10-18 13:57:24 +00:00
Ed Maste	983731268c	Avoid potential bad pointer dereference. Previously RuleAdd would leave entry->la unset for the first entry in the proxyList. Sponsored by: ADARA Networks MFC After: 1 week	2012-10-17 20:23:07 +00:00
Gleb Smirnoff	e76163a539	We don't need to convert ip6_len to host byte order before ip6_output(), the IPv6 stack is working in net byte order. The reason this code worked before is that ip6_output() doesn't look at ip6_plen at all and recalculates it based on mbuf length.	2012-10-15 07:57:55 +00:00
Gleb Smirnoff	347d90acff	Fix a miss from r241344: in ip_mloopback() we need to go to net byte order prior to calling in_delayed_cksum(). Reported by: Olivier Cochard-Labbe <olivier cochard.me>	2012-10-14 15:08:07 +00:00
Alexander V. Chernikov	3bff27cd67	Cleanup documentation: cloning route support has been removed in r186119. MFC after: 2 weeks	2012-10-13 09:31:01 +00:00
Gleb Smirnoff	86b61e4748	Revert fixup of ip_len from r241480. Now stack isn't yet ready for that change.	2012-10-12 09:32:38 +00:00
Gleb Smirnoff	105bd2113b	In ip_stripoptions(): - Remove unused argument and incorrect comment. - Fixup ip_len after stripping.	2012-10-12 09:24:24 +00:00
Alexander V. Chernikov	3c2824b9ef	Do not check if found IPv4 rte is dynamic if net.inet.icmp.drop_redirect is enabled. This eliminates one mtx_lock() per each routing lookup thus improving performance in several cases (routing to directly connected interface or routing to default gateway). Icmp redirects should not be used to provide routing direction nowadays, even for end hosts. Routers should not use them too (and this is explicitly restricted in IPv6, see RFC 4861, clause 8.2). Current commit changes rnh_matchaddr function to 'stock' rn_match (and back) for every AF_INET routing table in given VNET instance on drop_redirect sysctl change. This change is part of bigger patch eliminating rte locking. Sponsored by: Yandex LLC MFC after: 2 weeks	2012-10-10 19:06:11 +00:00
Kevin Lo	9823d52705	Revert previous commit... Pointyhat to: kevlo (myself)	2012-10-10 08:36:38 +00:00
Kevin Lo	a10cee30c9	Prefer NULL over 0 for pointers	2012-10-09 08:27:40 +00:00
Gleb Smirnoff	23e9c6dc1e	After r241245 it appeared that in_delayed_cksum(), which still expects host byte order, was sometimes called with net byte order. Since we are moving towards net byte order throughout the stack, the function was converted to expect net byte order, and its consumers fixed appropriately: - ip_output(), ipfilter(4) not changed, since already call in_delayed_cksum() with header in net byte order. - divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order there and back. - mrouting code and IPv6 ipsec now need to switch byte order there and back, but I hope, this is temporary solution. - In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum(). - pf_route() catches up on r241245 changes to ip_output().	2012-10-08 08:03:58 +00:00
Gleb Smirnoff	b7fb54d8ae	No reason to play with IP header before calling sctp_delayed_cksum() with offset beyond the IP header.	2012-10-08 07:21:32 +00:00
Gleb Smirnoff	21d172a3f1	A step in resolving mess with byte ordering for AF_INET. After this change: - All packets in NETISR_IP queue are in net byte order. - ip_input() is entered in net byte order and converts packet to host byte order right _after_ processing pfil(9) hooks. - ip_output() is entered in host byte order and converts packet to net byte order right _before_ processing pfil(9) hooks. - ip_fragment() accepts and emits packet in net byte order. - ip_forward(), ip_mloopback() use host byte order (untouched actually). - ip_fastforward() no longer modifies packet at all (except ip_ttl). - Swapping of byte order there and back removed from the following modules: pf(4), ipfw(4), enc(4), if_bridge(4). - Swapping of byte order added to ipfilter(4), based on __FreeBSD_version - __FreeBSD_version bumped. - pfil(9) manual page updated. Reviewed by: ray, luigi, eri, melifaro Tested by: glebius (LE), ray (BE)	2012-10-06 10:02:11 +00:00
Gleb Smirnoff	df4e91d386	There is a complex race in in_pcblookup_hash() and in_pcblookup_group(). Both functions need to obtain lock on the found PCB, and they can't do classic inter-lock with the PCB hash lock, due to lock order reversal. To keep the PCB stable, these functions put a reference on it and after PCB lock is acquired drop it. If the reference was the last one, this means we've raced with in_pcbfree() and the PCB is no longer valid. This approach works okay only if we are acquiring writer-lock on the PCB. In case of reader-lock, the following scenario can happen: - 2 threads locate pcb, and do in_pcbref() on it. - These 2 threads drop the inp hash lock. - Another thread comes to delete pcb via in_pcbfree(), it obtains hash lock, does in_pcbremlists(), drops hash lock, and runs in_pcbrele_wlocked(), which doesn't free the pcb due to two references on it. Then it unlocks the pcb. - 2 aforementioned threads acquire reader lock on the pcb and run in_pcbrele_rlocked(). One gets 1 from in_pcbrele_rlocked() and continues, second gets 0 and considers pcb freed, returns. - The thread that got 1 continues working with detached pcb, which later leads to panic in the underlying protocol level. To plumb that problem an additional INPCB flag is introduced - INP_FREED. We check for that flag in the in_pcbrele_rlocked() and if it is set, we pretend that that was the last reference. Discussed with: rwatson, jhb Reported by: Vladimir Medvedkin <medved rambler-co.ru>	2012-10-02 12:03:02 +00:00
Gleb Smirnoff	891122d180	carp_send_ad() should never return without rescheduling next run.	2012-09-29 05:52:19 +00:00
Gleb Smirnoff	85c05144f1	Fix bug in TCP_KEEPCNT setting, which slipped in in the last round of reviewing of r231025. Unlike other options from this family TCP_KEEPCNT doesn't specify time interval, but a count, thus parameter supplied doesn't need to be multiplied by hz. Reported & tested by: amdmi3	2012-09-27 07:13:21 +00:00
Michael Tuexen	e06f3469e0	Whitespace change. MFC after: 3 days	2012-09-23 07:43:10 +00:00
Michael Tuexen	a98809db78	Declare a static function as such. MFC after: 3 days	2012-09-23 07:23:18 +00:00
Michael Tuexen	efb0814c24	Fix a bug related to handling Re-config chunks. It is not true that the association can be removed if the socket is gone. MFC after: 3 days	2012-09-22 22:04:17 +00:00
Michael Tuexen	2089750009	Small cleanups. No functional change. MFC after: 10 days	2012-09-22 14:39:20 +00:00
Kevin Lo	b7e1113e8f	Fix typo: s/pakcet/packet	2012-09-20 03:29:43 +00:00
Eitan Adler	582212fa04	s/teh/the/g Approved by: cperciva MFC after: 3 days	2012-09-14 21:59:55 +00:00
Michael Tuexen	dcb68fba2d	Small cleanups. No functional change. MFC after: 10 days	2012-09-14 18:32:20 +00:00
Gleb Smirnoff	3b3a8eb937	o Create directory sys/netpfil, where all packet filters should reside, and move there ipfw(4) and pf(4). o Move most modified parts of pf out of contrib. Actual movements: sys/contrib/pf/net/.c -> sys/netpfil/pf/ sys/contrib/pf/net/.h -> sys/net/ contrib/pf/pfctl/.c -> sbin/pfctl contrib/pf/pfctl/.h -> sbin/pfctl contrib/pf/pfctl/pfctl.8 -> sbin/pfctl contrib/pf/pfctl/.4 -> share/man/man4 contrib/pf/pfctl/.5 -> share/man/man5 sys/netinet/ipfw -> sys/netpfil/ipfw The arguable movement is pf/net/*.h -> sys/net. There are future plans to refactor pf includes, so I decided not to break things twice. Not modified bits of pf left in contrib: authpf, ftp-proxy, tftp-proxy, pflogd. The ipfw(4) movement is planned to be merged to stable/9, to make head and stable match. Discussed with: bz, luigi	2012-09-14 11:51:49 +00:00
Michael Tuexen	8225a9bc85	Whitespace changes. MFC after: 10 days	2012-09-09 08:14:04 +00:00
Michael Tuexen	fe6bb0a788	Whitespace cleanup. MFC after: 10 days	2012-09-08 20:54:54 +00:00
Gleb Smirnoff	d6d3f01e0a	Merge the projects/pf/head branch, that was worked on for last six months, into head. The most significant achievements in the new code: o Fine grained locking, thus much better performance. o Fixes to many problems in pf, that were specific to FreeBSD port. New code doesn't have that many ifdefs and much less OpenBSDisms, thus is more attractive to our developers. Those interested in details, can browse through SVN log of the projects/pf/head branch. And for reference, here is exact list of revisions merged: r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330, r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656, r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782, r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868, r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223, r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456, r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505, r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168, r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230, r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398, r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548, r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672, r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169, r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442, r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522, r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661, r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212. I'd like to thank people who participated in early testing: Tested by: Florian Smeets <flo freebsd.org> Tested by: Chekaluk Vitaly <artemrts ukr.net> Tested by: Ben Wilber <ben desync.com> Tested by: Ian FREISLICH <ianf cloudseed.co.za>	2012-09-08 06:41:54 +00:00
Michael Tuexen	a169d6ec2b	Don't include a structure containing a flexible array in another structure. MFC after: 10 days	2012-09-07 13:36:42 +00:00
Michael Tuexen	12780a595e	Get rid of a gcc'ism. MFC after: 10 days	2012-09-06 07:03:56 +00:00
Michael Tuexen	dd294dcec6	Using %p in a format string requires a void *. MFC after: 10 days	2012-09-05 18:52:01 +00:00
Michael Tuexen	2899aa8f65	Consistently use the size of a variable. This helps to keep the code simpler for the userland implementation. MFC after: 3 days	2012-09-04 22:45:00 +00:00
Michael Tuexen	c6328f940e	Whitespace change. MFC after: 3 days	2012-09-04 22:40:49 +00:00
Alexander V. Chernikov	7d4317bd40	Introduce new link-layer PFIL hook V_link_pfil_hook. Merge ether_ipfw_chk() and part of bridge_pfil() into unified ipfw_check_frame() function called by PFIL. This change was suggested by rwatson? @ DevSummit. Remove ipfw headers from ether/bridge code since they are unneeded now. Note this change introduces some (temporary) performance penalty since PFIL read lock has to be acquired for every link-level packet. MFC after: 3 weeks	2012-09-04 19:43:26 +00:00
Gleb Smirnoff	478df1d534	Provide a sysctl switch that allows to install ARP entries with multicast bit set. FreeBSD refuses to install such entries since 9.0, and this broke installations running Microsoft NLB, which are violating standards. Tested by: Tarasov Oleg <oleg_tarasov sg-tea.com>	2012-09-03 14:29:28 +00:00
Michael Tuexen	81eb4e6351	Fix a typo which results in RTT to be off by a factor of 10, if the RTT is larger than 1 second. MFC after: 3 days	2012-09-02 12:37:30 +00:00
Eitan Adler	64baf9fbe0	Mark the ipfw interface type as not being ether. This fixes an issue where uuidgen tried to obtain a ipfw device's mac address which was always zero. PR: 170460 Submitted by: wxs Reviewed by: bdrewery Reviewed by: delphij Approved by: cperciva MFC after: 1 week	2012-09-01 23:33:49 +00:00
Randall Stewart	ec03d5433f	This small change takes care of a race condition that can occur when both sides close at the same time. If that occurs, without this fix the connection enters FIN1 on both sides and they will forever send FIN\|ACK at each other until the connection times out. This is because we stopped processing the FIN\|ACK and thus did not advance the sequence and so never ACK'd each others FIN. This fix adjusts it so we do process the FIN properly and the race goes away ;-) MFC after: 1 month	2012-08-25 09:26:37 +00:00
Navdeep Parhar	06fd9875aa	Correctly handle the case where an inp has already been dropped by the time the TOE driver reports that an active open failed. toe_connect_failed is supposed to handle this but it should be provided the inpcb instead of the tcpcb which may no longer be around.	2012-08-21 18:09:33 +00:00
Randall Stewart	7db496de2c	Though I disagree, I concede to jhb & Rui. Note that we still have a problem with this whole structure of locks and in_input.c [it does not lock which it should not, but this can lead to crashes]. (I have seen it in our SQA testbed.. besides the one with a refcnt issue that I will have SQA work on next week ;-)	2012-08-19 11:54:02 +00:00
Randall Stewart	9424879158	Ok jhb, lets move the ifa_free() down to the bottom to assure that all tables and such are removed before we start to free. This won't protect the Hash in ip_input.c but in theory should protect any other uses that do use locks. MFC after: 1 week (or more)	2012-08-17 05:51:46 +00:00
Lawrence Stewart	ee24d3b840	The TCP PAWS fix for kernels with fast tick rates (r231767) changed the TCP timestamp related stack variables to reference ms directly instead of ticks. The h_ertt(4) Khelp module relies on TCP timestamp information in order to calculate its enhanced RTT estimates, but was not updated as part of r231767. Consequently, h_ertt has not been calculating correct RTT estimates since r231767 was committed, which in turn broke all delay-based congestion control algorithms because they rely on the h_ertt RTT estimates. Fix the breakage by switching h_ertt to use tcp_ts_getticks() in place of all previous uses of the ticks variable. This ensures all timestamp related variables in h_ertt use the same units as the TCP stack and therefore results in meaningful comparisons and RTT estimate calculations. Reported & tested by: Naeem Khademi (naeemk at ifi uio no) Discussed with: bz MFC after: 3 days	2012-08-17 01:49:51 +00:00
Randall Stewart	184749821f	It's never a good idea to double free the same address. MFC after: 1 week (after the other commits ahead of this get MFC'd)	2012-08-16 17:55:16 +00:00
Luigi Rizzo	e5813a3bce	s/lenght/length/ in comments	2012-08-07 07:52:25 +00:00
Luigi Rizzo	17369272e4	move functions outside the SYSBEGIN/SYSEND block (SYSBEGIN/SYSEND are specific to ipfw/dummynet and are used to emulate sysctl on platforms that do not have them, and they work by creating an array which contains all the sysctl-ed symbols.)	2012-08-06 11:02:23 +00:00
Luigi Rizzo	00c4633285	use FREE_PKT instead of m_freem to free an mbuf. The former is the standard form used in ipfw/dummynet, so that it is easier to remap it to different memory managers depending on the platform.	2012-08-06 10:50:43 +00:00
Michael Tuexen	55b175e747	Fix a bug found by dim@: Don't use an uninitialized variable, if INVARIANTS is on and an illegal packet with destination 0 is received. MFC after: 3 days X-MFC with: 238003	2012-08-06 10:50:23 +00:00
Mikolaj Golub	655f934b78	In tcp timers, check INP_DROPPED flag a little later, after callout_deactivate(), so if INP_DROPPED is set we return with the timer active flag cleared. For me this fixes negative keep timer values reported by `netstat -x' for connections in CLOSE state. Approved by: net (silence) MFC after: 2 weeks	2012-08-05 17:30:17 +00:00
Michael Tuexen	63c6726e05	Fix a refcount issue. The called function only decrements if stcb is NULL. MFC after: 3 days Discussed with: rrs	2012-08-05 10:47:18 +00:00
Michael Tuexen	832208514f	Fix a bug reported by Simon L. B. Nielsen: If an SCTP endpoint receives an ASCONF with a wildcard lookup address and incorrect verification tag, the system crashes. MFC after: 3 days.	2012-08-04 20:40:36 +00:00
Michael Tuexen	173be2b6cd	Testing an interface property should depend on the interface, not on an address. MFC after: 3 days	2012-08-04 08:03:30 +00:00
Gleb Smirnoff	ea53792942	Fix races between in_lltable_prefix_free(), lla_lookup(), llentry_free() and arptimer(): o Use callout_init_rw() for lle timeout, this allows us safely disestablish them. - This allows us to simplify the arptimer() and make it race safe. o Consistently use ifp->if_afdata_lock to lock access to linked lists in the lle hashes. o Introduce new lle flag LLE_LINKED, which marks an entry that is attached to the hash. - Use LLE_LINKED to avoid double unlinking via consequent calls to llentry_free(). - Mark lle with LLE_DELETED via \|= operation instead of =, so that other flags won't be lost. o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more consistent and provide more informative KASSERTs. The patch is a collaborative work of all submitters and myself. PR: kern/165863 Submitted by: Andrey Zonov <andrey zonov.org> Submitted by: Ryan Stone <rysto32 gmail.com> Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>	2012-08-02 13:57:49 +00:00
Luigi Rizzo	46f2f751e1	replace __unused with a portable construct; fix a couple of signed/unsigned warnings.	2012-08-02 12:45:13 +00:00
Luigi Rizzo	f5705b527d	replace inet_ntoa_r with the more standard inet_ntop(). As discussed on -current, inet_ntoa_r() is non standard, has different arguments in userspace and kernel, and almost unused (no clients in userspace, only net/flowtable.c, net/if_llatbl.c, netinet/in_pcb.c, netinet/tcp_subr.c in the kernel)	2012-08-01 18:52:07 +00:00
Luigi Rizzo	71ca24f182	add a cast to avoid a signed/unsigned warning (to be removed when we will have TUNABLE_UINT constructors)	2012-08-01 18:49:00 +00:00
Gleb Smirnoff	b9aee262e5	Some more whitespace cleanup.	2012-08-01 09:00:26 +00:00
Gleb Smirnoff	ea50c13ebe	Some style(9) and whitespace changes. Together with: Andrey Zonov <andrey zonov.org>	2012-07-31 11:31:12 +00:00
Luigi Rizzo	9df9e62789	nobody uses this file except the userspace ipfw code, but the cast of a pointer to an integer needs a cast to prevent a warning for size mismatch. MFC after: 1 week	2012-07-31 08:04:49 +00:00
Michael Tuexen	77db9ed99e	Fix the sctp_sockstore union such that userland programs don't depend on INET and/or INET6 to be defined and in-tune with how the kernel was compiled. MFC after: 3 days Discussed with: rrs	2012-07-26 08:10:29 +00:00
Bjoern A. Zeeb	8253dcabe7	Fix a problem when CARP is enabled on the interface for IPv4 but not for IPv6. The current checks in nd6_nbr.c along with the old version will result in ifa being NULL and subsequently the packet will be dropped. This prevented NS/NA, from working and with that IPv6. Now return the ifa from the carp lookup function in two cases: 1) if the address matches, is a carp address, and we are MASTER (as before), 2) if the address matches but it is not a carp address at all (new). Reported by: Peter Wemm (new Y! FreeBSD cluster, eating our own dogfood) Tested on: New Y! FreeBSD cluster machines Reviewed by: glebius	2012-07-25 12:14:39 +00:00
Robert Watson	0989f56cff	Update some stale comments regarding tcbinfo locking in the TCP input path: read locks on tcbinfo are no longer used, so won't happen. No functional change. MFC after: 3 days	2012-07-22 17:31:36 +00:00
Gleb Smirnoff	3c73180f55	Plug a reference leak: before doing 'goto again' we need to unref ia->ia_ifa if there is any. Submitted by: Andrey Zonov <andrey zonov.org>	2012-07-18 08:58:30 +00:00
Gleb Smirnoff	b9abeb9d99	When traversing global in_ifaddr list in the IFP_TO_IA() macro, we need to obtain IN_IFADDR_RLOCK().	2012-07-18 08:41:00 +00:00
Michael Tuexen	d07b2ac6c6	Fix a refcount bug when freeing an association. While there: Change code to be consistent. Discussed with rrs@. MFC after: 3 days	2012-07-17 13:03:47 +00:00
Gleb Smirnoff	df0633a145	If ip_output() returns EMSGSIZE to tcp_output(), then the latter calls tcp_mtudisc(), which in its turn may call tcp_output(). Under certain conditions (must admit they are very special) an infinite recursion can happen. To avoid recursion we can pass struct route to ip_output() and obtain correct mtu. This allows us not to use tcp_mtudisc() but call tcp_mss_update() directly. PR: kern/155585 Submitted by: Andrey Zonov <andrey zonov.org> (original version of patch)	2012-07-16 07:08:34 +00:00
Michael Tuexen	5e20b91dbe	Changes which improve compilation if neither INET nor INET6 is defined. MFC after: 3 days	2012-07-15 20:16:17 +00:00
Michael Tuexen	e0e00a4d0f	#ifdef INET and INET6 consistently. This also fixes a bug, where it was done wrong. MFC after: 3 days	2012-07-15 11:04:49 +00:00
Michael Tuexen	ad83c8a5a5	Provide the correct notification type (SCTP_SEND_FAILED_EVENT) for unsent messages. MFC after: 3 days	2012-07-14 21:25:14 +00:00
Michael Tuexen	b5e0cd79c2	Use case for selecting the address family (as in other places). MFC after: 3 days	2012-07-14 20:08:03 +00:00
Michael Tuexen	c9afdc3aa9	Use case for selecting the address family (as in other places). MFC after: 3 days	2012-07-14 19:44:39 +00:00
Michael Tuexen	5db75fc724	Fix a bug introduced in r237715. MFC after: 3 days.	2012-07-09 10:59:39 +00:00
Hiroki Sato	8efbd296e0	Make ipfw0 logging pseudo-interface clonable. It can be created automatically by $firewall_logif rc.conf(5) variable at boot time or manually by ifconfig(8) after a boot. Discussed on: freebsd-ipfw@	2012-07-09 07:16:19 +00:00
Alexander V. Chernikov	30835e6d50	Finally fix lookup (account remaining '\0') and deletion (provide valid key length for radix lookup). Submitted by: Ihor Kaharlichenko<madkinder at gmail.com> (prev version) Approved by: kib(mentor) MFC after: 3 days Sponsored by: Shtorm ISP	2012-07-08 21:13:04 +00:00
Michael Tuexen	1ebccfd0a2	Use consistent method to determine IPV4_OUTPUT/IPV6_OUTPUT. MFC after: 3 days	2012-07-04 20:59:30 +00:00
Michael Tuexen	c76b949968	Use CSUM_SCTP_IPV6 for IPv6. MFC after: 3 days	2012-07-04 20:29:16 +00:00
Gleb Smirnoff	bf9840512a	When ip_output()/ip6_output() is supplied a struct route *ro argument, it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning here: it may be supplied to provide route, and it may be supplied to store and return to caller the route that ip_output()/ip6_output() finds. In the latter case skipping FLOWTABLE lookup is pessimisation. The difference between struct route filled by FLOWTABLE and filled by rtalloc() family is that the former doesn't hold a reference on its rtentry. Reference is held by flow entry, and it is about to be released in future. Thus, route filled by FLOWTABLE shouldn't be passed to RTFREE() macro. - Introduce new flag for struct route/route_in6, that marks route not holding a reference on rtentry. - Introduce new macro RO_RTFREE() that cleans up a struct route depending on its kind. - All callers to ip_output()/ip6_output() that do supply non-NULL but empty route should use RO_RTFREE() to free results of lookup. - ip_output()/ip6_output() now do FLOWTABLE lookup always when ro->ro_rt == NULL. Tested by: tuexen (SCTP part)	2012-07-04 07:37:53 +00:00
Michael Tuexen	e3d6ef0b03	Initialize a variable. MFC after: 3 days	2012-07-03 21:41:19 +00:00
Mikolaj Golub	b7d61ba317	Don't check for ifp != NULL before KASSERT, as ifp may not be NULL here (it is dereferenced below). Discussed with: jhb MFC after: 1 week	2012-07-03 19:04:18 +00:00
Mikolaj Golub	3a288e901f	Fix RTTVAR scale in net.inet.tcp.hostcache.list sysctl. Reviewed by: andre MFC after: 3 days	2012-07-03 18:59:13 +00:00
Isabell Long	82cecbea69	- Make ipfw's sched rules case insensitive, for user-friendliness. - Add a note to the ipfw(8) man page about the rules no longer being case sensitive. - Fix some typos in the man page. PR: docs/164772 Reviewed by: bz Approved by: gabor (doc mentor, src committer) MFC after: 2 weeks	2012-07-03 08:42:48 +00:00
Gleb Smirnoff	3df6468a2d	Remove route caching from IP multicast routing code. There is no reason to do that, and also, cached route never got unreferenced, which meant a reference leak. Reviewed by: bms	2012-07-02 19:44:18 +00:00
Michael Tuexen	a8775ad93d	Move common code parts to sctp_common_input_processing(). MFC after: 3 days	2012-07-02 16:44:09 +00:00
Michael Tuexen	d8d544892c	Remove dead code (on FreeBSD) as suggested by glebius@. MFC after: 3 days	2012-07-02 16:40:11 +00:00
Michael Tuexen	b1754ad17b	Pass the src and dst address of a received packet explicitly around. MFC after: 3 days	2012-06-28 16:01:08 +00:00
Michael Tuexen	6dc5aabcb7	Unify sctp_input() and sctp6_input(). MFC after: 3 days	2012-06-25 19:13:43 +00:00
Michael Tuexen	39803b8c58	Whitespace cleanup. MFC after: 3 days	2012-06-25 17:15:09 +00:00
Michael Tuexen	20cc2188f3	Pass the packet length explicitly around. MFC after: 3 days	2012-06-24 23:12:24 +00:00
Michael Tuexen	329204ff00	Remove redundant check. MFC after: 3 days	2012-06-24 22:22:44 +00:00
Michael Tuexen	f938425253	Do packet logging in a consistent way. MFC after: 3 days	2012-06-24 21:25:54 +00:00
Alexander V. Chernikov	0bfbe5f660	Fix interface matching by ipfw table Submitted by: Ihor Kaharlichenko <madkinder@gmail.com> Tested by: Ihor Kaharlichenko <madkinder@gmail.com> Approved by: kib(mentor) MFC after: 3 days	2012-06-23 12:40:24 +00:00
Michael Tuexen	9f4ab62e92	Remove redundant #ifdef. Reported by gnn@. MFC after: 3 days	2012-06-21 12:51:24 +00:00
Navdeep Parhar	09fe63205c	- Updated TOE support in the kernel. - Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features. - iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the works and will follow soon. Build-tested with make universe. 30s overview ============ What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface: # ifconfig -m \| grep TOE Enable/disable TCP offload on an interface (just like any other ifnet capability): # ifconfig cxgbe0 toe # ifconfig cxgbe0 -toe Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat: # netstat -np tcp \| grep toe # sockstat -46c \| grep toe Reviewed by: bz, gnn Sponsored by: Chelsio communications. MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)	2012-06-19 07:34:13 +00:00
Michael Tuexen	5ad9e57b3f	Add rate limitation for SCTP OOTB responses. MFC after: 3 days	2012-06-18 17:11:24 +00:00
Michael Tuexen	285052f0aa	Cleanup the UDP decapsulation code. MFC after: 3 days	2012-06-18 17:09:39 +00:00
Michael Tuexen	f30ac43257	Pass flowid explicitly through the stack instead of taking it from the mbuf chain at different places. While there: Fix several bugs related to VRFs. MFC after: 3 days	2012-06-14 06:54:48 +00:00
Joel Dahl	6ae742a549	mdoc: avoid nested displays. Fixes mandoc warnings.	2012-06-13 18:57:27 +00:00
Michael Tuexen	f584d74b4d	Add a cmsg of type IP_TOS for UDP/IPv4 sockets to specify the TOS byte. MFC after: 3 days	2012-06-12 14:56:08 +00:00
Michael Tuexen	3cca425b29	Add a IP_RECVTOS socket option to receive for received UDP/IPv4 packets a cmsg of type IP_RECVTOS which contains the TOS byte. Much like IP_RECVTTL does for TTL. This allows to implement a protocol on top of UDP and implementing ECN. MFC after: 3 days	2012-06-12 14:02:38 +00:00
Michael Tuexen	c9e089587c	Unify the sending of ABORT, SHUTDOWN-COMPLETE and ERROR chunks. While there: Fix also some minor bugs and prepare for SCTP/DTLS. MFC after: 3 days	2012-06-12 13:15:27 +00:00
Michael Tuexen	17267b32d8	Small cleanup. MFC after: 3 days	2012-06-12 09:18:28 +00:00
Alexander V. Chernikov	5154b2dffc	Validate IPv4 network mask being passed to ipfw kernel interface. Incorrect mask can possibly be one of the reasons for kern/127209 existence. Approved by: kib(mentor) MFC after: 3 days	2012-06-09 17:39:05 +00:00
Eitan Adler	3e0efd2ec4	Fix style nit: don't use leading zero for dates in .Dd Prompted by: brueffer Approved by: brueffer MFC after: 3 days	2012-06-05 03:14:39 +00:00
Maksim Yevmenkin	77d396fd18	Plug more refcount leaks and possible NULL deref for interface address list. Submitted by: scottl@ MFC after: 3 days	2012-06-04 18:43:51 +00:00
Michael Tuexen	2faa5be555	Remove code which is not needed. MFC after: 3 days	2012-06-03 18:14:57 +00:00
Michael Tuexen	b82bd838f6	Use an existing function to get the source address. MFC after: 3 days	2012-06-03 14:54:50 +00:00
Michael Tuexen	2566e071ec	Honor sysctl for TTL. MFC after: 3 days	2012-06-02 21:22:26 +00:00
Michael Tuexen	962cef4089	Don't request data from the IPv6 layer, which is not used. MFC after: 3 days	2012-06-02 20:53:23 +00:00
Michael Tuexen	021416fb85	Remove an unused parameter. MFC after: 3 days	2012-06-02 13:13:38 +00:00
Bjoern A. Zeeb	5fa2656e55	Make TCP LRO work properly with VIMAGE kernels rather than just panicking. There's no VIMAGE context set there yet as this is before if_ethersubr.c. MFC after: 3 days X-MFC with: r235981	2012-06-01 11:42:50 +00:00
Michael Tuexen	11ef11db47	Small cleanups. No functional change. MFC after: 3 days	2012-06-01 08:26:50 +00:00
Michael Tuexen	a6cff10f2a	Separate SCTP checksum offloading for IPv4 and IPv6. While there: remove some trailing whitespace. MFC after: 3 days X-MFC with: 236170	2012-05-30 20:56:07 +00:00
Gleb Smirnoff	eaf151c49d	Improve style(9) of bcopy() to and from mbuf tag. Submitted by: bde	2012-05-30 13:51:00 +00:00
Gleb Smirnoff	a856ddc665	After r228571 carp_output() expects carp_softc * pointer in the mtag. Noticed by: thompsa	2012-05-30 07:11:27 +00:00
Bjoern A. Zeeb	356ab07e2d	It turns out that too many drivers are not only parsing the L2/3/4 headers for TSO but also for generic checksum offloading. Ideally we would only have one common function shared amongst all drivers, and perhaps when updating them for IPv6 we should introduce that. Eventually we should provide the meta information along with mbufs to avoid (re-)parsing entirely. To not break IPv6 (checksums and offload) and to be able to MFC the changes without risking to hurt 3rd party drivers, duplicate the v4 framework, as other OSes have done as well. Introduce interface capability flags for TX/RX checksum offload with IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6 flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6 fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and add an alias for CSUM_DATA_VALID_IPV6. This pretty much brings IPv6 handling in line with IPv4. TSO is still handled in a different way and not via if_hwassist. Update ifconfig to allow (un)setting of the new capability flags. Update loopback to announce the new capabilities and if_hwassist flags. Individual driver updates will have to follow, as will SCTP. Reported by: gallatin, dim, .. Reviewed by: gallatin (glanced at?) MFC after: 3 days X-MFC with: r235961,235959,235958	2012-05-28 09:30:13 +00:00
Ed Maste	0615959a81	Add IPPROTO_MPLS (rfc4023) IP protocol definition There are currently no in-tree consumers; I'm adding it now for use by vendor code. This matches the change OpenBSD made while implementing MPLS in gif(4).	2012-05-27 16:16:28 +00:00
Bjoern A. Zeeb	cace7064fc	Trim the extra $FreeBSD$ from the comment below the license. We use the __FBSDID() macro on the file now instead. MFC after: 3 days	2012-05-26 10:28:11 +00:00
Michael Tuexen	8d9638ab33	Get rid of SCTP specific code to avoid CRC32C computations on loopback. Just use offloading. MFC after: 3 days	2012-05-26 09:16:33 +00:00
Michael Tuexen	430061cf18	Undefine SCTP_PACKED before including sctp_uio.h, which doesn't use it. Spotted by Irene Ruengeler. MFC after: 3 days	2012-05-25 11:14:08 +00:00
Bjoern A. Zeeb	0cfdff24d2	MFp4 bz_ipv6_fast: Properly protect the inp read access when handling the control code. In the past this was expensive but given the rlock it's not so much anymore. Spotted while: optimizing udp6 Discussed with: rwatson (a few months ago) Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 09:24:45 +00:00
Bjoern A. Zeeb	31bfc56ecd	In case forwarding is turned on for a given address family, refuse to queue the packet for LRO and tell the driver to directly pass it on. This avoids re-assembly and later re-fragmentation problems when forwarding. It's not the best solution but the simplest and most effective for the moment. Should have been done: ages ago Discussed with and by: many MFC after: 3 days	2012-05-25 08:17:59 +00:00
Bjoern A. Zeeb	45747ba53c	MFp4 bz_ipv6_fast: Add code to handle pre-checked TCP checksums as indicated by mbuf flags to save the entire computation for validation if not needed. In the IPv6 TCP output path only compute the pseudo-header checksum, set the checksum offset in the mbuf field along the appropriate flag as done in IPv4. In tcp_respond() just initialize the IPv6 payload length to 0 as ip6_output() will properly set it. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 02:23:26 +00:00
Bjoern A. Zeeb	3a9391defb	MFp4 bz_ipv6_fast: Factor out the tcp_hc_getmtu() call. As the comments say it applies to both v4 and v6, so only write it once making it easier to read the protocol family specific code. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 01:13:39 +00:00
Bjoern A. Zeeb	62b5b6ecd0	MFp4 bz_ipv6_fast: Significantly update tcp_lro for mostly two things: 1) introduce basic support for IPv6 without extension headers. 2) try hard to also get the incremental checksum updates right, especially also in the IPv4 case for the IP and TCP header. Move variables around for better locality, factor things out into functions, allow checksum updates to be compiled out, ... Leave a few comments on further things to look at in the future, though that is not the full list. Update drivers with appropriate #includes as needed for IPv6 data type in LRO. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-24 23:03:23 +00:00
Michael Tuexen	14ce0c0c80	Add sn_send_failed_event to sctp_notification. MFC after: 3 days	2012-05-24 11:52:57 +00:00
Michael Tuexen	807aad636f	Use consistent text at the beginning of the files. MFC after: 3 days	2012-05-23 11:26:28 +00:00
Marcel Moolenaar	157bb65afd	Remove unused inclusion of curses.h	2012-05-19 05:14:24 +00:00
Michael Tuexen	635b6ff793	Use a default for max_burst of 4 and l2var of 2. This was discussed with rrs@. MFC after: 3 days	2012-05-17 18:12:24 +00:00
Michael Tuexen	8b42cfa708	Support SCTP_EOF also for 1-to-1 style sockets. MFC after: 3 days	2012-05-17 17:02:47 +00:00
Bjoern A. Zeeb	27f190a3ca	Switch to a standard 2 clause BSD license (from bsd-style-copyright). Approved by: Myricom Inc. (gallatin) Approved by: Intel Corporation (jfv)	2012-05-15 13:23:44 +00:00
Michael Tuexen	389b1b118c	Support SCTP_REMOTE_ERROR notification. MFC after: 3 days	2012-05-13 22:27:54 +00:00
Michael Tuexen	1edc9dbae5	Provide in the SCTP_SEND_FAILED and SCTP_SEND_FAILED_EVENT notifications the correct ssf_error or ssfe_error as required by RFC 6458. MFC after: 3 days	2012-05-13 19:32:49 +00:00
Michael Tuexen	4b1f78e1af	Provide the error code in SCTP_PEER_ADDR_CHANGE notifications as specified in RFC 6458. MFC after: 3 days	2012-05-13 17:36:04 +00:00
Michael Tuexen	b342faf4bf	Remove unused constants. MFC after: 3 days	2012-05-13 17:17:13 +00:00
Michael Tuexen	410a3b1ef0	Use ECONNABORTED in cases where the ABORT was sent to the peer. MFC after: 3 days	2012-05-13 16:56:16 +00:00
Michael Tuexen	58411b0821	Ensure the user can read COMM_LOST notifications on 1-to-1 style sockets. MFC after: 3 days	2012-05-13 16:07:53 +00:00
Michael Tuexen	a2b42326b5	Provide in the association change notification the received ABORT chunk in case of SCTP_COMM_LOST or SCTP_CANT_STR_ASSOC as required by RFC 6458. MFC after: 3 days	2012-05-12 20:11:35 +00:00
Glen Barber	a3fb6da9ba	General mdoc(7) and typo fixes. PR: 167734 Submitted by: Nobuyuki Koganemaru (kogane!jp.freebsd.org) MFC after: 3 days	2012-05-11 20:06:46 +00:00
Michael Tuexen	ce228dab70	Fix a bug in the handling of association reset request. MFC after: 3 days	2012-05-11 19:15:33 +00:00
Michael Tuexen	c5b5675d10	Only provide the supported features in the SCTP_ASSOC_CHANGE notif if the state is SCTP_COMM_UP or SCTP_RESTART. While there, do some cleanups. MFC after: 3 days	2012-05-11 18:07:36 +00:00
Michael Tuexen	07d249d661	Remove a constant which is only used on non-FreeBSD platform. (The actual code for the socket option handling has been #ifdefed out forever...) MFC after: 3 days.	2012-05-11 17:50:51 +00:00
Michael Tuexen	8c501e516b	Address clang warnings. MFC after: 3 days	2012-05-06 14:50:54 +00:00
Michael Tuexen	e06b67c756	Add support for the sac_info field in struct sctp_assoc_change as required by RFC 6458. MFC after: 3 days	2012-05-06 13:09:13 +00:00
Michael Tuexen	3f826ed2f3	Remove debug code. MFC after: 3 days	2012-05-06 11:37:41 +00:00
Michael Tuexen	9935403ae1	Add support for SCTP_SEND_FAILED_EVENT as required by RFC 6458. MFC after: 3 days	2012-05-06 11:02:53 +00:00
Michael Tuexen	d426064666	Provide the flags in the SCTP stream reconfig related notification as specified in RFC 6525. MFC after: 3 days	2012-05-05 21:41:16 +00:00
Michael Tuexen	f3ebe64c9e	Honor SCTP_ENABLE_STREAM_RESET socket option when processing incoming requests. Fix also the provided result in the response and use names as specified in RFC 6525. MFC after: 3 days	2012-05-05 20:07:33 +00:00
Michael Tuexen	f6faece611	Do error checking for the SCTP_RESET_STREAMS, SCTP_RESET_ASSOC, and SCTP_ADD_STREAMS socket options as specified by RFC 6525. MFC after: 3 days	2012-05-05 14:06:15 +00:00
Xin LI	74882260fa	Add ToS definitions for DiffServ Codepoints as per RFC2474. Obtained from: OpenBSD MFC after: 2 weeks	2012-05-04 21:00:32 +00:00
Michael Tuexen	cdedb5425c	Add support for the SCTP_ENABLE_STREAM_RESET socket option to getsockopt(). This improves the support of RFC 6525. MFC after: 3 days	2012-05-04 17:18:02 +00:00
Michael Tuexen	88d935dad7	Add support for SCTP_STREAM_CHANGE_EVENT, SCTP_ASSOC_RESET_EVENT as required by RFC 6525. This also fixes SCTP_STREAM_RESET_EVENT. MFC after: 3 days	2012-05-04 15:49:08 +00:00
Michael Tuexen	7d44bb8b14	Call panic() only under INVARIANTS. MFC after: 3 days	2012-05-04 10:06:08 +00:00
Michael Tuexen	cd3fd53188	Use SCTP_PRINTF() instead of printf() in all SCTP sources. MFC after: 3 days	2012-05-04 09:27:00 +00:00
Michael Tuexen	254945d75a	Fix another RFC 6458 issue. Spotted by Irene Ruengeler. MFC after: 3 days	2012-05-03 10:26:33 +00:00
Alexander V. Chernikov	bdf942c3f0	Revert r234834 per luigi@ request. Cleaner solution (e.g. adding another header) should be done here. Original log: Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h. Remove ipfw/ip_fw_private.h header from non-ipfw code. Requested by: luigi Approved by: kib(mentor)	2012-05-03 08:56:43 +00:00
Alexander V. Chernikov	7bd5e9b143	Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h. Remove ipfw/ip_fw_private.h header from non-ipfw code. Approved by: ae(mentor) MFC after: 2 weeks	2012-04-30 10:22:23 +00:00
Michael Tuexen	5fc8d746ee	Add support for missing gauth_number_of_chunks field. This Bug was found by Irene Ruengeler. MFC after: 1 week	2012-04-30 08:44:21 +00:00
Michael Tuexen	ed72abc07d	Whitespace changes. MFC after: 3 days	2012-04-28 16:32:49 +00:00


4763 Commits