freebsd-dev

Author	SHA1	Message	Date
Andre Oppermann	e6b9152d20	Use monotonic 'time_uptime' instead of 'time_second' as timebase for timeouts.	2005-09-19 22:31:45 +00:00
Robert Watson	b1c53bc9c0	Take a first cut at cleaning up ifnet removal and multicast socket panics, which occur when stale ifnet pointers are left in struct moptions hung off of inpcbs: - Add in_ifdetach(), which matches in6_ifdetach(), and allows the protocol to perform early tear-down on the interface early in if_detach(). - Annotate that if_detach() needs careful consideration. - Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR -- this is not the place to detect interface removal! This also removes what is basically a nasty (and now unnecessary) hack. - Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP IPv4 sockets. It is now possible to run the msocket_ifnet_remove regression test using HEAD without panicking. MFC after: 3 days	2005-09-18 17:36:28 +00:00
Andre Oppermann	db1240661f	Do not ignore all other TCP options (eg. timestamp, window scaling) when responding to TCP SYN packets with TCP_MD5 enabled and set. PR: kern/82963 Submitted by: <demizu at dd.iij4u.or.jp> MFC after: 3 days	2005-09-14 15:06:22 +00:00
Bjoern A. Zeeb	75398603ad	Fix panic when kernel compiled without INET6 by rejecting IPv6 opcodes which are behind #if(n)def INET6 now. PR: kern/85826 MFC after: 3 days	2005-09-14 07:53:54 +00:00
Andre Oppermann	ffabe3dce8	In tcp_ctlinput() do not swap ip->ip_len a second time. It has been done in icmp_input() already. This fixes the ICMP_UNREACH_NEEDFRAG case where no MTU was proposed in the ICMP reply. PR: kern/81813 Submitted by: Vitezslav Novy <vita at fio.cz> MFC after: 3 days	2005-09-10 07:43:29 +00:00
Gleb Smirnoff	a20e25385c	- Do not hold route entry lock, when calling arprequest(). One such call was introduced by me in 1.139, the other one was present before. - Do all manipulations with rtentry and la before dropping the lock. - Copy interface address from route into local variable before dropping the lock. Supply this copy as argument to arprequest() LORs fixed: http://sources.zabbadoz.net/freebsd/lor/003.html http://sources.zabbadoz.net/freebsd/lor/037.html http://sources.zabbadoz.net/freebsd/lor/061.html http://sources.zabbadoz.net/freebsd/lor/062.html http://sources.zabbadoz.net/freebsd/lor/064.html http://sources.zabbadoz.net/freebsd/lor/068.html http://sources.zabbadoz.net/freebsd/lor/071.html http://sources.zabbadoz.net/freebsd/lor/074.html http://sources.zabbadoz.net/freebsd/lor/077.html http://sources.zabbadoz.net/freebsd/lor/093.html http://sources.zabbadoz.net/freebsd/lor/135.html http://sources.zabbadoz.net/freebsd/lor/140.html http://sources.zabbadoz.net/freebsd/lor/142.html http://sources.zabbadoz.net/freebsd/lor/145.html http://sources.zabbadoz.net/freebsd/lor/152.html http://sources.zabbadoz.net/freebsd/lor/158.html	2005-09-09 10:06:27 +00:00
Gleb Smirnoff	5d40d65b5a	When a carp(4) interface is being destroyed and is in promiscuous mode, first interface is detached from parent and then bpfdetach() is called. If the interface was the last carp(4) interface attached to parent, then the mutex on parent is destroyed. When bpfdetach() calls if_setflags() we panic on destroyed mutex. To prevent the above scenario, clear pointer to parent, when we detach ourselves from parent.	2005-09-09 08:41:39 +00:00
Sam Leffler	245c31ccaf	clear lock on error in O_LIMIT case of install_state Submitted by: Ted Unangst MFC after: 3 days	2005-09-04 17:33:40 +00:00
Andre Oppermann	e0aec68255	Use the correct mbuf type for MGET().	2005-08-30 16:35:27 +00:00
Gleb Smirnoff	e3ea67a077	Add newline to debugging printf. PR: kern/85271 Submitted by: Simon Morgan	2005-08-26 15:27:18 +00:00
Gleb Smirnoff	360856f60e	- Refuse hashsize of 0, since it is invalid. - Use defined constant instead of 512.	2005-08-25 13:57:00 +00:00
Gleb Smirnoff	510b360fc0	When we have a published ARP entry for some IP address, reply to ARP requests only on the network to which this IP address belongs. Before this change we replied on all interfaces. This could lead to an IP address conflict with the host we are doing ARP proxy for. PR: kern/75634 Reviewed by: andre	2005-08-25 13:25:57 +00:00
Paul Saab	4d3b134633	Remove a KASSERT in the sack path that fails because of an interaction between sack and a bug in the "bad retransmit recovery" logic. This is a workaround, the underlying bug will be fixed later. Submitted by: Mohan Srinivasan, Noritoshi Demizu	2005-08-24 02:48:45 +00:00
Paul Saab	b24de0e665	Fix up the comment for MAX_SACK_BLKS. Submitted by: Noritoshi Demizu	2005-08-24 02:47:16 +00:00
Andre Oppermann	ef8fd90476	Remove unnecessary IPSEC includes. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-23 14:42:40 +00:00
Andre Oppermann	23655387e9	o Fix a logic error when not doing mbuf cluster allocation. o Change an old panic() to a clean function exit. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-22 22:13:41 +00:00
Andre Oppermann	936cd18dad	Add socketoption IP_MINTTL. May be used to set the minimum acceptable TTL a packet must have when received on a socket. All packets with a lower TTL are silently dropped. Works on already connected/connecting and listening sockets for RAW/UDP/TCP. This option is only really useful when set to 255 preventing packets from outside the directly connected networks reaching local listeners on sockets. Allows userland implementation of 'The Generalized TTL Security Mechanism (GTSM)' according to RFC3682. Examples of such use include the Cisco IOS BGP implementation command "neighbor ttl-security". MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-22 16:13:08 +00:00
Andre Oppermann	6b773dff30	Always quote the entire TCP header when responding and allocate an mbuf cluster if needed. Fixes the TCP issues raised in I-D draft-gont-icmp-payload-00.txt. This aids in-the-wild debugging a lot and allows the receiver to do more elaborate checks on the validity of the response. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-22 14:12:18 +00:00
Andre Oppermann	d56ea155bd	Handle pure layer 2 broad- and multicasts properly and simplify related checks. PR: kern/85052 Submitted by: Dmitrij Tejblum <tejblum at yandex-team.ru> MFC after: 3 days	2005-08-22 12:06:26 +00:00
Andre Oppermann	bb10780f9f	Commit correct version of the change and note the name of the new sysctl: net.inet.icmp.quotelen and defaults to 8 bytes. Pointy hat to: andre	2005-08-21 15:18:00 +00:00
Andre Oppermann	e875dfb826	Add a sysctl to change the length of the quotation of the original packet in an ICMP reply. The minimum of 8 bytes is internally enforced. The maximum quotation is the remaining space in the reply mbuf. This option is added in response to the issues raised in I-D draft-gont-icmp-payload-00.txt. MFC after: 2 weeks Sponsored by: TCP/IP Optimizations Fundraise 2005	2005-08-21 15:09:07 +00:00
Andre Oppermann	a0866c8d4e	Add an option to have ICMP replies to non-local packets generated with the IP address the packet came through in. This is useful for routers to show in traceroutes the actual path a packet has taken instead of the possibly different return path. The new sysctl is named net.inet.icmp.reply_from_interface and defaults to off. MFC after: 2 weeks	2005-08-21 12:29:39 +00:00
Gleb Smirnoff	1ae954096e	In order to support CARP interfaces kernel was taught to handle more than one interface in one subnet. However, some userland apps rely on the belief that this configuration is impossible. Add a sysctl switch net.inet.ip.same_prefix_carp_only. If the switch is on, then kernel will refuse to add an additional interface to already connected subnet unless the interface is CARP. Default value is off. PR: bin/82306 In collaboration with: mlaier	2005-08-18 10:34:30 +00:00
Bjoern A. Zeeb	bd2e5495d1	Fix broken build of rev. 1.108 in case of no INET6 and IPFIREWALL compiled into kernel. Spotted and tested by: Michal Mertl <mime at traveller.cz>	2005-08-14 18:20:33 +00:00
Bjoern A. Zeeb	9066356ba1	* Add dynamic sysctl for net.inet6.ip6.fw. * Correct handling of IPv6 Extension Headers. * Add unreach6 code. * Add logging for IPv6. Submitted by: sysctl handling derived from patch from ume needed for ip6fw Obtained from: is_icmp6_query and send_reject6 derived from similar functions of netinet6,ip6fw Reviewed by: ume, gnn; silence on ipfw@ Test setup provided by: CK Software GmbH MFC after: 6 days	2005-08-13 11:02:34 +00:00
Craig Rodrigues	eee9fe3078	Add NATM_LOCK() and NATM_UNLOCK() in places where npcb_add() and npcb_free() are called, in order to eliminate witness panics. This was overlooked in removal of GIANT from ATM. Reviewed by: rwatson	2005-08-12 02:38:20 +00:00
Gleb Smirnoff	1ed7bf1e3b	o Fix a race between three threads: output path, incoming ARP packet and route request adding/removing ARP entries. The root of the problem is that struct llinfo_arp was accessed without any locks. To close race we will use locking provided by rtentry, that references this llinfo_arp: - Make arplookup() return a locked rtentry. - In arpresolve() hold the lock provided by rt_check()/arplookup() until the end of function, covering all accesses to the rtentry itself and llinfo_arp it refers to. - In in_arpinput() do not drop lock provided by arplookup() during first part of the function. - Simplify logic in the first part of in_arpinput(), removing one level of indentation. - In the second part of in_arpinput() hold rtentry lock while copying address. o Fix a condition when route entry is destroyed, while another thread is contested on its lock: - When storing a pointer to rtentry in llinfo_arp list, always add a reference to this rtentry, to prevent rtentry being destroyed via RTM_DELETE request. - Remove this reference when removing entry from llinfo_arp list. o Further cleanup of arptimer(): - Inline arptfree() into arptimer(). - Use official queue(3) way to pass LIST. - Hold rtentry lock while reading its structure. - Do not check that sdl_family is AF_LINK, but assert this. Reviewed by: sam Stress test: http://www.holm.cc/stress/log/cons141.html Stress test: http://people.freebsd.org/~pho/stress/log/cons144.html	2005-08-11 08:25:48 +00:00
David E. O'Brien	c11ba30c9a	Remove public declarations of variables that were forgotten when they were made static.	2005-08-10 07:10:02 +00:00
David E. O'Brien	31793d594b	Match IPv6 and use a static struct pr_usrreqs nousrreqs.	2005-08-10 06:41:04 +00:00
Robert Watson	a2dc1f5021	Add helper function ip_findmoptions(), which accepts an inpcb, and attempts to atomically return either an existing set of IP multicast options for the PCB, or a newly allocated set with default values. The inpcb is returned locked. This function may sleep. Call ip_moptions() to acquire a reference to a PCB's socket options, and perform the update of the options while holding the PCB lock. Release the lock before returning. Remove garbage collection of multicast options when values return to the default, as this complicates locking substantially. Most applications allocate a socket either to be multicast, or not, and don't tend to keep around sockets that have previously been used for multicast, then used for unicast. This closes a number of race conditions involving multiple threads or processes modifying the IP multicast state of a socket simultaneously. MFC after: 7 days	2005-08-09 17:19:21 +00:00
Robert Watson	13f4c340ae	Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field. Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so. Reviewed by: pjd, bz MFC after: 7 days	2005-08-09 10:20:02 +00:00
Gleb Smirnoff	9bd8ca3014	In preparation for fixing races in ARP (and probably in other L2/L3 mappings) make rt_check() return a locked rtentry.	2005-08-09 08:39:56 +00:00
Robert Watson	dd5a318ba3	Introduce in_multi_mtx, which will protect IPv4-layer multicast address lists, as well as accessor macros. For now, this is a recursive mutex due to code sequences where IPv4 multicast calls into IGMP calls into ip_output(), which then tests for a multicast forwarding case. For support macros in in_var.h to check multicast address lists, assert that in_multi_mtx is held. Acquire in_multi_mtx around iteration over the IPv4 multicast address lists, such as in ip_input() and ip_output(). Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses, as well as over the manipulation of ifnet multicast address lists in order to keep the two layers in sync. Lock down accesses to IPv4 multicast addresses in IGMP, or assert the lock when performing IGMP join/leave events. Eliminate spl's associated with IPv4 multicast addresses, portions of IGMP that weren't previously expunged by IGMP locking. Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded lock order in WITNESS, in that order. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 10 days	2005-08-03 19:29:47 +00:00
Robert Watson	bccb41014a	Modify network protocol consumers of the ifnet multicast address lists to lock if_addr_mtx. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 1 week	2005-08-02 23:51:22 +00:00
Hajimu UMEMOTO	4dad226e45	recover the line which was wrongly dropped during scope cleanup. tcpdrop(8) should work for IPv6 again.	2005-08-01 12:08:49 +00:00
Bjoern A. Zeeb	9e669156d4	Add support for IPv6 over GRE [1]. PR kern/80340 includes the FreeBSD specific ip_newid() changes NetBSD does not have. Correct handling of non AF_INET packets passed to bpf [2]. PR: kern/80340[1], NetBSD PRs 29150[1], 30844[2] Obtained from: NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56 Submitted by: Gert Doering <gert at greenie.muc.de>[2] MFC after: 4 days	2005-08-01 08:14:21 +00:00
Hajimu UMEMOTO	c85ed85b1c	include scope6_var.h for in6_clearscope().	2005-07-26 00:19:58 +00:00
Hajimu UMEMOTO	29da8af658	include netinet6/scope6_var.h.	2005-07-25 12:36:43 +00:00
Hajimu UMEMOTO	a1f7e5f8ee	scope cleanup. with this change - most of the kernel code will not care about the actual encoding of scope zone IDs and won't touch "s6_addr16[1]" directly. - similarly, most of the kernel code will not care about link-local scoped addresses as a special case. - scope boundary check will be stricter. For example, the current BSD code allows a packet with src=::1 and dst=(some global IPv6 address) to be sent outside of the node, if the application does: s = socket(AF_INET6); bind(s, "::1"); sendto(s, some_global_IPv6_addr); This is clearly wrong, since ::1 is only meaningful within a single node, but the current implementation of the BSD kernel cannot reject this attempt. Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp> Obtained from: KAME	2005-07-25 12:31:43 +00:00
Giorgos Keramidas	a09ad79379	Misc spelling and/or English fixes in comments. Reviewed by: glebius, andre	2005-07-23 00:59:13 +00:00
Hajimu UMEMOTO	6c4eaa873f	move RFC3542 related definitions into ip6.h. Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Reviewed by: mlaier Obtained from: KAME	2005-07-20 10:30:52 +00:00
Hajimu UMEMOTO	77b6f9ed40	add missing RFC3542 definition. Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Obtained from: KAME	2005-07-20 09:17:41 +00:00
Hajimu UMEMOTO	18b35df8fe	update comments: - RFC2292bis -> RFC3542 - typo fixes Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Obtained from: KAME	2005-07-20 08:59:45 +00:00
Robert Watson	de35559f82	Remove no-op spl references in in_pcb.c, since in_pcb locking has been basically complete for several years now. Update one spl comment to reference the locking strategy. MFC after: 3 days	2005-07-19 12:24:27 +00:00
Robert Watson	f59a9ebf10	Remove no-op spl's and most comment references to spls, as TCP locking is believed to be basically done (modulo any remaining bugs). MFC after: 3 days	2005-07-19 12:21:26 +00:00
Robert Watson	b77634d046	Remove spl() calls from ip_slowtimo(), as IP fragment queue locking was merged several years ago. Submitted by: gnn MFC after: 1 day	2005-07-19 12:14:22 +00:00
Max Laier	6de8d9dc52	Export pfsyncstats via sysctl "net.inet.pfsync" in order to print them with netstat (separate commit). Requested by: glebius MFC after: 1 week	2005-07-14 22:22:51 +00:00
Robert Watson	3c308b091f	Eliminate MAC entry point mac_create_mbuf_from_mbuf(), which is redundant with respect to existing mbuf copy label routines. Expose a new mac_copy_mbuf() routine at the top end of the Framework and use that; use the existing mpo_copy_mbuf_label() routine on the bottom end. Obtained from: TrustedBSD Project Sponsored by: SPARTA, SPAWAR Approved by: re (scottl)	2005-07-05 23:39:51 +00:00
Paul Saab	d758711729	Fix for a bug in newreno partial ack handling where if a large amount of data is partial acked, snd_cwnd underflows, causing a burst. Found, Submitted by: Noritoshi Demizu Approved by: re	2005-07-05 19:23:02 +00:00
Max Laier	b4373150d9	Remove ambiguity from hlen. IPv4 is now indicated by is_ipv4 and we need a proper hlen value for IPv6 to implement O_REJECT and O_LOG. Reviewed by: glebius, brooks, gnn Approved by: re (scottl)	2005-07-03 15:42:22 +00:00
Andrew Thompson	2fcb030ad5	Check the alignment of the IP header before passing the packet up to the packet filter. This would cause a panic on architectures that require strict alignment such as sparc64 (tier1) and ia64/ppc (tier2). This adds two new macros that check the alignment, these are compile time dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where alignment isn't needed so the cost is avoided. IP_HDR_ALIGNED_P() IP6_HDR_ALIGNED_P() Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment is checked for ipfw and dummynet too. PR: ia64/81284 Obtained from: NetBSD Approved by: re (dwhite), mlaier (mentor)	2005-07-02 23:13:31 +00:00
Paul Saab	482ac96888	Fix for a bug in the change that defers sack option processing until after PAWS checks. The symptom of this is an inconsistency in the cached sack state, caused by the fact that the sack scoreboard was not being updated for an ACK handled in the header prediction path. Found by: Andrey Chernov. Submitted by: Noritoshi Demizu, Raja Mukerji. Approved by: re	2005-07-01 22:54:18 +00:00
Paul Saab	69e0362019	Fix for a SACK crash caused by a bug in tcp_reass(). tcp_reass() does not clear tlen and frees the mbuf (leaving th pointing at freed memory), if the data segment is a complete duplicate. This change works around that bug. A fix for the tcp_reass() bug will appear later (that bug is benign for now, as neither th nor tlen is referenced in tcp_input() after the call to tcp_reass()). Found by: Pawel Jakub Dawidek. Submitted by: Raja Mukerji, Noritoshi Demizu. Approved by: re	2005-07-01 22:52:46 +00:00
Gleb Smirnoff	a196a3c8aa	When doing ARP load balancing source IP is taken in network byte order, so residue of division for all hosts on net is the same, and thus only one VHID answers. Change source IP in host byte order. Reviewed by: mlaier Approved by: re (scottl)	2005-07-01 08:22:13 +00:00
Simon L. B. Nielsen	0a389eab22	Fix ipfw packet matching errors with address tables. The ipfw tables lookup code caches the result of the last query. The kernel may process multiple packets concurrently, performing several concurrent table lookups. Due to insufficient locking, a cached result can become corrupted, which could cause some addresses to be incorrectly matched against a lookup table. Submitted by: ru Reviewed by: csjp, mlaier Security: CAN-2005-2019 Security: FreeBSD-SA-05:13.ipfw Correct bzip2 permission race condition vulnerability. Obtained from: Steve Grubb via RedHat Security: CAN-2005-0953 Security: FreeBSD-SA-05:14.bzip2 Approved by: obrien Correct TCP connection stall denial of service vulnerability. A TCP packet with the SYN flag set is accepted for established connections, allowing an attacker to overwrite certain TCP options. Submitted by: Noritoshi Demizu Reviewed by: andre, Mohan Srinivasan Security: CAN-2005-2068 Security: FreeBSD-SA-05:15.tcp Approved by: re (security blanket), cperciva	2005-06-29 21:36:49 +00:00
Paul Saab	5a53ca1627	- Postpone SACK option processing until after PAWS checks. SACK option processing is now done in the ACK processing case. - Merge tcp_sack_option() and tcp_del_sackholes() into a new function called tcp_sack_doack(). - Test (SEG.ACK < SND.MAX) before processing the ACK. Submitted by: Noritoshi Demizu Reviewed by: Mohan Srinivasan, Raja Mukerji Approved by: re	2005-06-27 22:27:42 +00:00
Poul-Henning Kamp	dca9c930da	Libalias incorrectly applies proxy rules to the global divert socket: it should only look for existing translation entries, not create new ones (no matter how it got the idea). Approved by: re(scottl)	2005-06-27 22:21:42 +00:00
Gleb Smirnoff	59dde15e82	Disable checksum processing in LibAlias, when it works as a kernel module. LibAlias is not aware of checksum offloading, so the caller should provide checksum calculation. (The only current consumer is ng_nat(4)). When TCP packet internals have been changed and checksum recalculation is required, a cookie is set in th_x2 field of TCP packet, to inform caller that it needs to recalculate checksum. This ugly hack would be removed when LibAlias is made more kernel friendly. Incremental checksum updates are left as is, since they don't conflict with offloading. Approved by: re (scottl)	2005-06-27 07:36:02 +00:00
David Malone	01399f34a5	Fix some long standing bugs in writing to the BPF device attached to a DLT_NULL interface. In particular: 1) Consistently use type u_int32_t for the header of a DLT_NULL device - it continues to represent the address family as always. 2) In the DLT_NULL case get bpf_movein to store the u_int32_t in a sockaddr rather than in the mbuf, to be consistent with all the DLT types. 3) Consequently fix a bug in bpf_movein/bpfwrite which only permitted packets up to 4 bytes less than the MTU to be written. 4) Fix all DLT_NULL devices to have the code required to allow writing to their bpf devices. 5) Move the code to allow writing to if_lo from if_simloop to looutput, because it only applies to DLT_NULL devices but was being applied to other devices that use if_simloop possibly incorrectly. PR: 82157 Submitted by: Matthew Luckie <mjl@luckie.org.nz> Approved by: re (scottl)	2005-06-26 18:11:11 +00:00
Stephan Uphoff	68d376254c	Fix a timer ticks wrap around bug for minmssoverload processing. Approved by: re (scottl,dwhite) MFC after: 4 weeks	2005-06-25 22:24:45 +00:00
Warner Losh	d980b05275	Add back missing copyright and license statement. This is identical to the statement in ip_mroute.h, as well as being the same as what OpenBSD has done with this file. It matches the copyright in NetBSD's 1.1 through 1.14 versions of the file as well, which they subsequently added back. It appears to have been lost in the 4.4-lite1 import for FreeBSD 2.0, but where and why I've not investigated further. OpenBSD had the same problem. NetBSD had a copyright notice until Multicast 3.5 was integrated verbatim back in 1995. This appears to be the version that made it into 4.4-lite1. Approved by: re (scottl) MFC after: 3 days	2005-06-23 18:42:58 +00:00
Paul Saab	9004ded9df	Fix for a bug in tcp_sack_option() causing crashes. Submitted by: Noritoshi Demizu, Mohan Srinivasan. Approved by: re (scottl blanket SACK)	2005-06-23 00:18:54 +00:00
Bjoern A. Zeeb	67df9f3896	Fix IP(v6) over IP tunneling most likely broken with ifnet changes. Reviewed by: gnn Approved by: re (dwhite), rwatson (mentor)	2005-06-20 08:39:30 +00:00
Gleb Smirnoff	72f2d6578c	- Don't use legacy function in a non-legacy one. This gives us possibility to compile libalias without legacy support. - Use correct way to mark variable as unused. Approved by: re (dwhite)	2005-06-20 08:31:48 +00:00
Max Laier	e4c959952b	In verify_rev_path6(): - do not use static memory as we are under a shared lock only - properly rtfree routes allocated with rtalloc - rename to verify_path6() - implement the full functionality of the IPv4 version Also make O_ANTISPOOF work with IPv6. Reviewed by: gnn Approved by: re (blanket)	2005-06-16 14:55:58 +00:00
Max Laier	ad7abe197d	Fix indentation in INET6 section in preparation of more serious work. Approved by: re (blanket ip6fw removal)	2005-06-16 13:20:36 +00:00
Max Laier	cf21d53cbf	When doing matching based on dst_ip/src_ip make sure we are really looking on an IPv4 packet as these variables are uninitialized if not. This used to allow arbitrary IPv6 packets depending on the value in the uninitialized variables. Some opcodes (most notably O_REJECT) do not support IPv6 at all right now. Reviewed by: brooks, glebius Security: IPFW might pass IPv6 packets depending on stack contents. Approved by: re (blanket)	2005-06-12 16:27:10 +00:00
Brooks Davis	fc74a9f93a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
Brian Feldman	b34d56f1ef	Modify send_pkt() to return the generated packet and have the caller do the subsequent ip_output() in IPFW. In ipfw_tick(), the keep-alive packets must be generated from the data that resides under the stateful lock, but they must not be sent at that time, as this would cause a lock order reversal with the normal ordering (interface's lock, then locks belonging to the pfil hooks). In practice, this caused deadlocks when using IPFW and if_bridge(4) together to do stateful transparent filtering. MFC after: 1 week	2005-06-10 12:28:17 +00:00
Andrew Thompson	c8b0129238	Add dummynet(4) support to if_bridge, this code is largely based on bridge.c. This is the final piece to match bridge.c in functionality, we can now be a drop-in replacement. Approved by: mlaier (mentor)	2005-06-10 01:25:22 +00:00
Paul Saab	e912f906d0	Fix a mis-merge. Remove a redundant call to tcp_sackhole_insert Submitted by: Mohan Srinivasan	2005-06-09 17:55:29 +00:00
Paul Saab	8b9bbaaa94	Fix for a crash in tcp_sack_option() caused by hitting the limit on the number of sack holes. Reported by: Andrey Chernov Submitted by: Noritoshi Demizu Reviewed by: Raja Mukerji	2005-06-09 14:01:04 +00:00
Paul Saab	db4b83fe49	Fix for a bug in the change that walks the scoreboard backwards from the tail (in tcp_sack_option()). The bug was caused by incorrect accounting of the retransmitted bytes in the sackhint. Reported by: Kris Kennaway. Submitted by: Noritoshi Demizu.	2005-06-06 19:46:53 +00:00
Andrew Thompson	8f86751705	Add hooks into the networking layer to support if_bridge. This changes struct ifnet so a buildworld is necessary. Approved by: mlaier (mentor) Obtained from: NetBSD	2005-06-05 03:13:13 +00:00
Brian Feldman	5278d40bcc	Better explain, then actually implement the IPFW ALTQ-rule first-match policy. It may be used to provide more detailed classification of traffic without actually having to decide its fate at the time of classification. MFC after: 1 week	2005-06-04 19:04:31 +00:00
Paul Saab	9d17a7a64a	Changes to tcp_sack_option() that - Walks the scoreboard backwards from the tail to reduce the number of comparisons for each sack option received. - Introduce functions to add/remove sack scoreboard elements, making the code more readable. Submitted by: Noritoshi Demizu Reviewed by: Raja Mukerji, Mohan Srinivasan	2005-06-04 08:03:28 +00:00
Max Laier	57cd6d263b	Add support for IPv4 only rules to IPFW2 now that it supports IPv6 as well. This is the last requirement before we can retire ip6fw. Reviewed by: dwhite, brooks(earlier version) Submitted by: dwhite (manpage) Silence from: -ipfw	2005-06-03 01:10:28 +00:00
Ian Dowse	ba5da2a06f	Use IFF_LOCKGIANT/IFF_UNLOCKGIANT around calls to the interface if_ioctl routine. This should fix a number of code paths through soo_ioctl() that could call into Giant-locked network drivers without first acquiring Giant.	2005-06-02 00:04:08 +00:00
Robert Watson	303939942c	When aborting tcp_attach() due to a problem allocating or attaching the tcpcb, lock the inpcb before calling in_pcbdetach() or in6_pcbdetach(), as they expect the inpcb to be passed locked. MFC after: 7 days	2005-06-01 12:14:56 +00:00
Robert Watson	e6e0b5ffd1	Assert tcbinfo lock, inpcb lock in tcp_disconnect(). Assert tcbinfo lock, inpcb lock in tcp_usrclosed(). MFC after: 7 days	2005-06-01 12:08:15 +00:00
Robert Watson	e3d5315d01	Assert tcbinfo lock in tcp_drop() due to its call of tcp_close() Assert tcbinfo lock in tcp_close() due to its call to in{,6}_detach() Assert tcbinfo lock in tcp_drop_syn_sent() due to its call to tcp_drop() MFC after: 7 days	2005-06-01 12:06:07 +00:00
Robert Watson	1e2d989d0d	Assert that tcbinfo is locked in tcp_input() before calling into tcp_drop(). MFC after: 7 days	2005-06-01 12:03:18 +00:00
Robert Watson	416738a781	Assert the tcbinfo lock whenever tcp_close() is to be called by tcp_input(). MFC after: 7 days	2005-06-01 11:49:14 +00:00
Robert Watson	7609aad7d9	Assert tcbinfo lock in tcp_attach(), as it is required; the caller (tcp_usr_attach()) currently grabs it. MFC after: 7 days	2005-06-01 11:44:43 +00:00
Robert Watson	fe6bfc3730	Commit correct version of previous commit (in_pcb.c:1.164). Use the local variables as currently named. MFC after: 7 days	2005-06-01 11:43:39 +00:00
Robert Watson	6b348152be	Assert pcbinfo lock in in_pcbdisconnect() and in_pcbdetach(), as the global pcb lists are modified. MFC after: 7 days	2005-06-01 11:39:42 +00:00
Robert Watson	3ca1570c82	Slight white space tweak. MFC after: 7 days	2005-06-01 11:38:35 +00:00
Robert Watson	277afaff66	De-spl UDP. MFC after: 3 days	2005-06-01 11:24:00 +00:00
Seigo Tanimura	29ea671b36	Let OSPFv3 go through ipfw. Some more additional checks would be desirable, though.	2005-05-28 07:46:44 +00:00
Paul Saab	808f11b768	This conforms to the terminology in M.Mathis and J.Mahdavi, "Forward Acknowledgement: Refining TCP Congestion Control" SIGCOMM'96, August 1996. Submitted by: Noritoshi Demizu, Raja Mukerji	2005-05-25 17:55:27 +00:00
Paul Saab	64b5fbaa04	Rewrite of tcp_sack_option(). Kentaro Kurahone (NetBSD) pointed out that if we sort the incoming SACK blocks, we can update the scoreboard in one pass of the scoreboard. The added overhead of sorting up to 4 sack blocks is much lower than traversing (potentially) large scoreboards multiple times. The code was updating the scoreboard with multiple passes over it (once for each sack option). The rewrite fixes that, reducing the complexity of the main loop from O(n^2) to O(n). Submitted by: Mohan Srinivasan, Noritoshi Demizu. Reviewed by: Raja Mukerji.	2005-05-23 19:22:48 +00:00
Paul Saab	2cdbfa66ee	Replace t_force with a t_flag (TF_FORCEDATA). Submitted by: Raja Mukerji. Reviewed by: Mohan, Silby, Andre Oppermann.	2005-05-21 00:38:29 +00:00
Paul Saab	4fc5324557	Introduce routines to alloc/free sack holes. This cleans up the code considerably. Submitted by: Noritoshi Demizu. Reviewed by: Raja Mukerji, Mohan Srinivasan.	2005-05-16 19:26:46 +00:00
Gleb Smirnoff	32247f8629	- When carp interface is destroyed, and it affects global preemption suppression counter, decrease the latter. [1] - Add sysctl to monitor preemption suppression. PR: kern/80972 [1] Submitted by: Frank Volf [1] MFC after: 1 week	2005-05-15 01:44:26 +00:00
Paul Saab	fdace17f81	Fix for a bug where the "nexthole" sack hint is out of sync with the real next hole to retransmit from the scoreboard, caused by a bug which did not update the "nexthole" hint in one case in tcp_sack_option(). Reported by: Daniel Eriksson Submitted by: Mohan Srinivasan	2005-05-13 18:02:02 +00:00
Gleb Smirnoff	b3cf6808ce	In div_output() explicitly set m->m_nextpkt to NULL. If divert socket is not userland, but ng_ksocket, then m->m_nextpkt may be non-NULL. In this case we would panic in sbappend.	2005-05-13 11:44:37 +00:00
Paul Saab	0077b0163f	When looking for the next hole to retransmit from the scoreboard, or to compute the total retransmitted bytes in this sack recovery episode, the scoreboard is traversed. While in sack recovery, this traversal occurs on every call to tcp_output(), every dupack and every partial ack. The scoreboard could potentially get quite large, making this traversal expensive. This change optimizes this by storing hints (for the next hole to retransmit and the total retransmitted bytes in this sack recovery episode) reducing the complexity to find these values from O(n) to constant time. The debug code that sanity checks the hints against the computed value will be removed eventually. Submitted by: Mohan Srinivasan, Noritoshi Demizu, Raja Mukerji.	2005-05-11 21:37:42 +00:00
Colin Percival	fe2eee8231	Fix two issues which were missed in FreeBSD-SA-05:08.kmem. Reported by: Uwe Doering	2005-05-07 00:41:36 +00:00
Gleb Smirnoff	cbfbc555e0	Add a workaround for 64-bit archs: store unsigned long return value in temporary variable, check it and then cast to in_addr_t.	2005-05-06 13:01:31 +00:00
Gleb Smirnoff	6293e003c9	s/DEBUG/LIBALIAS_DEBUG/, since DEBUG is defined in LINT and not supported for kernel build.	2005-05-06 11:07:49 +00:00
Colin Percival	fd94099ec2	If we are going to 1. Copy a NULL-terminated string into a fixed-length buffer, and 2. copyout that buffer to userland, we really ought to 0. Zero the entire buffer first. Security: FreeBSD-SA-05:08.kmem	2005-05-06 02:50:00 +00:00
Gleb Smirnoff	e9d5db2888	More bits for kernel version: - copy inet_aton() from libc - disable getservbyname() lookup and accept only numeric port	2005-05-05 22:00:32 +00:00
Gleb Smirnoff	75bc262006	Always include alias.h before alias_local.h	2005-05-05 21:55:17 +00:00
Gleb Smirnoff	f87fe393ce	When used in kernel define NO_FW_PUNCH, NO_LOGGING, NO_USE_SOCKETS.	2005-05-05 21:53:17 +00:00
Gleb Smirnoff	c8d3ca728f	Fix argument order for bcopy() in last commit. Noticed by: njl Pointy hat to: glebius	2005-05-05 21:40:49 +00:00
Gleb Smirnoff	efdc8fbf79	Use bcopy() instead of memmove().	2005-05-05 21:10:51 +00:00
Gleb Smirnoff	ae0440572f	Hide fflush(3) under ifdef DEBUG.	2005-05-05 21:07:34 +00:00
Gleb Smirnoff	c8564bffd2	Things required to build libalias as kernel module: - kernel module declarations and handler. - macros to map malloc(3) calls to malloc(9) ones. - malloc(9) declarations. - call finishoff() from module handler MOD_UNLOAD case instead of atexit(3). - use panic(9) instead of abort(3) - take time from time_second instead of gettimeofday(2) - define INADDR_NONE	2005-05-05 21:05:38 +00:00
Gleb Smirnoff	00fc9a5bb9	Add NO_USE_SOCKETS knob, which cuts off socket binding functionality.	2005-05-05 20:25:12 +00:00
Gleb Smirnoff	40106c140f	Add NO_LOGGING knob, which cuts off functionality of debug logging to a file.	2005-05-05 20:22:09 +00:00
Gleb Smirnoff	c649a2e033	Play with includes so that libalias can be compiled both as userland library and kernel module.	2005-05-05 19:27:32 +00:00
Andre Oppermann	9e4ca6315d	If we don't get a suggested MTU during path MTU discovery look up the packet size of the packet that generated the response, step down the MTU by one step through ip_next_mtu() and try again. Suggested by: dwmalone	2005-05-04 13:48:44 +00:00
Gleb Smirnoff	1f8f08e1c9	Cleanup IPFW2 ifdefs.	2005-05-04 13:24:37 +00:00
Gleb Smirnoff	c3c2f9a9ba	Makefile is not needed here.	2005-05-04 13:24:12 +00:00
Andre Oppermann	4c037f8d6e	Add another step of 1280 (gif(4) tunnels) to ip_next_mtu().	2005-05-04 13:23:54 +00:00
Gleb Smirnoff	a1429ad928	IPFW version 2 is the only option in HEAD and RELENG_5. Thus, cleanup unnecessary now ifdefs.	2005-05-04 13:12:52 +00:00
Andre Oppermann	c773494edd	Pass icmp_error() the MTU argument directly instead of an interface pointer. This simplifies a couple of uses and removes some XXX workarounds.	2005-05-04 13:09:19 +00:00
Robert Watson	b60d26c9b9	Remove now unused inirw variable from previous use of COMMON_END(). Reported by: csjp	2005-05-01 14:01:38 +00:00
Peter Grehan	73fddedac8	Fix typo in last commit. Approved by: rwatson	2005-05-01 13:06:05 +00:00
Robert Watson	d1401c9000	Slide unlocking of the tcbinfo lock earlier in tcp_usr_send(), as it's needed only for implicit connect cases. Under load, especially on SMP, this can greatly reduce contention on the tcbinfo lock. NB: Ambiguities about the state of so_pcb need to be resolved so that all use of the tcbinfo lock in non-implicit connection cases can be eliminated. Submitted by: Kazuaki Oda <kaakun at highway dot ne dot jp>	2005-05-01 11:11:38 +00:00
Brooks Davis	31519b13c8	Introduce a struct icmphdr which contains the type, code, and cksum fields of an ICMP packet. Use this to allow ipfw to pullup only these values since it does not use the rest of the packet and it was failing on ICMP packets because they were not long enough. struct icmp should probably be modified to use these at some point, but that will break a fair bit of code so it can wait for another day. On the off chance that adding this struct breaks something in ports, bump __FreeBSD_version. Reported by: Randy Bush <randy at psg dot com> Tested by: Randy Bush <randy at psg dot com>	2005-04-26 18:10:21 +00:00
Paul Saab	91232d6ccc	Remove some code that snuck in by accident. Submitted by: Mohan Srinivasan	2005-04-21 20:29:40 +00:00
Paul Saab	be3f3b5ead	Fix for interaction problems between TCP SACK and TCP Signature. If TCP Signatures are enabled, the maximum allowed sack blocks aren't going to fit. The fix is to compute how many sack blocks fit and tack these on last. Also on SYNs, defer padding until after the SACK PERMITTED option has been added. Found by: Mohan Srinivasan. Submitted by: Mohan Srinivasan, Noritoshi Demizu. Reviewed by: Raja Mukerji.	2005-04-21 20:26:07 +00:00
Paul Saab	97b76190eb	Undo rev 1.71 as it is the wrong change.	2005-04-21 20:24:43 +00:00
Paul Saab	a6235da61e	- Make the sack scoreboard logic use the TAILQ macros. This improves code readability and facilitates some anticipated optimizations in tcp_sack_option(). - Remove tcp_print_holes() and TCP_SACK_DEBUG. Submitted by: Raja Mukerji. Reviewed by: Mohan Srinivasan, Noritoshi Demizu.	2005-04-21 20:11:01 +00:00
Paul Saab	a3047bc036	Fix for 2 bugs related to TCP Signatures : - If the peer sends the Signature option in the SYN, use of Timestamps and Window Scaling were disabled (even if the peer supports them). - The sender must not disable signatures if the option is absent in the received SYN. (See comment in syncache_add()). Found, Submitted by: Noritoshi Demizu <demizu at dd dot ij4u dot or dot jp>. Reviewed by: Mohan Srinivasan <mohans at yahoo-inc dot com>.	2005-04-21 20:09:09 +00:00
Andre Oppermann	1aedbd9c80	Move Path MTU discovery ICMP processing from icmp_input() to tcp_ctlinput() and subject it to active tcpcb and sequence number checking. Previously any ICMP unreachable/needfrag message would cause an update to the TCP hostcache. Now only ICMP PMTU messages belonging to an active TCP session with the correct src/dst/port and sequence number will update the hostcache and complete the path MTU discovery process. Note that we don't entirely implement the recommended counter measures of Section 7.2 of the paper. However we close down the possible degradation vector from trivially easy to really complex and resource intensive. In addition we have limited the smallest acceptable MTU with net.inet.tcp.minmss sysctl for some time already, further reducing the effect of any degradation due to an attack. Security: draft-gont-tcpm-icmp-attacks-03.txt Section 7.2 MFC after: 3 days	2005-04-21 14:29:34 +00:00
Andre Oppermann	1600372b6b	Ignore ICMP Source Quench messages for TCP sessions. Source Quench is ineffective, deprecated and can be abused to degrade the performance of active TCP sessions if spoofed. Replace a bogus call to tcp_quench() in tcp_output() with the direct equivalent tcpcb variable assignment. Security: draft-gont-tcpm-icmp-attacks-03.txt Section 7.1 MFC after: 3 days	2005-04-21 12:37:12 +00:00
Gleb Smirnoff	9dc1f8e41e	Remove anti-LOR bandaid, it is not needed now. Sponsored by: Rambler	2005-04-20 09:32:05 +00:00
Poul-Henning Kamp	6196e2db3e	Make DUMMYNET compile without INET6	2005-04-19 10:12:21 +00:00
Poul-Henning Kamp	d137deac11	typo	2005-04-19 10:04:38 +00:00
Poul-Henning Kamp	7292e12676	Make IPFIREWALL compile without INET6	2005-04-19 09:56:14 +00:00
Brooks Davis	8195404bed	Add IPv6 support to IPFW and Dummynet. Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)	2005-04-18 18:35:05 +00:00
Paul Saab	b7c755717c	Rewrite of tcp_update_sack_list() to make it simpler and more readable than our original OpenBSD derived version. Submitted by: Noritoshi Demizu Reviewed by: Mohan Srinivasan, Raja Mukerji	2005-04-18 18:10:56 +00:00
Brooks Davis	27a2f39bcf	Centralized finding the protocol header in IP packets in preparation for IPv6 support. The header in IPv6 is more complex than in IPv4 so we want to handle skipping over it in one location. Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)	2005-04-15 00:47:44 +00:00
Paul Saab	25e6f9ed4b	Fix for a TCP SACK bug where more than (win/2) bytes could have been in flight in SACK recovery. Found by: Noritoshi Demizu Submitted by: Mohan Srinivasan <mohans at yahoo-inc dot com> Noritoshi Demizu <demizu at dd dot ij4u dot or dot jp> Raja Mukerji <raja at moselle dot com>	2005-04-14 20:09:52 +00:00
Paul Saab	cf09195ba5	- Tighten up the Timestamp checks to prevent a spoofed segment from setting ts_recent to an arbitrary value, stopping further communication between the two hosts. - If the Echoed Timestamp is greater than the current time, fall back to the non RFC 1323 RTT calculation. Submitted by: Raja Mukerji (raja at moselle dot com) Reviewed by: Noritoshi Demizu, Mohan Srinivasan	2005-04-10 05:24:59 +00:00
Paul Saab	e346eeff65	- If the reassembly queue limit was reached or if we couldn't allocate a reassembly queue state structure, don't update (receiver) sack report. - Similarly, if tcp_drain() is called, freeing up all items on the reassembly queue, clean the sack report. Found, Submitted by: Noritoshi Demizu <demizu at dd dot iij4u dot or dot jp> Reviewed by: Mohan Srinivasan (mohans at yahoo-inc dot com), Raja Mukerji (raja at moselle dot com).	2005-04-10 05:21:29 +00:00
Paul Saab	b962fa74b5	When the rightmost SACK block expands, rcv_lastsack should be updated. (Fix for kern/78226). Submitted by : Noritoshi Demizu <demizu at dd dot iij4u dot or dot jp> Reviewed by : Mohan Srinivasan (mohans at yahoo-inc dot com), Raja Mukerji (raja at moselle dot com).	2005-04-10 05:20:10 +00:00
Paul Saab	da39f5b963	Remove some unused sack fields. Submitted by : Noritoshi Demizu, Mohan Srinivasan.	2005-04-10 05:19:22 +00:00
Maxim Konovalov	800af1fb81	o Nano optimize ip_reass() code path for the first fragment: do not try to reassemble the packet from the fragments queue with the only fragment, finish with the first fragment as soon as we create a queue. Spotted by: Vijay Singh o Drop the fragment if maxfragsperpacket == 0, no chances we will be able to reassemble the packet in future. Reviewed by: silby	2005-04-08 10:25:13 +00:00
Maxim Konovalov	29f2a6ec18	o Tweak the comment a bit.	2005-04-08 08:43:21 +00:00
Maxim Konovalov	e99971bf2f	o Disable random port allocation when ip.portrange.first == ip.portrange.last and there is the only port for that because: a) it is not wise; b) it leads to a panic in the random ip port allocation code. In general we need to disable ip port allocation randomization if the last - first delta is ridiculously small. PR: kern/79342 Spotted by: Anjali Kulkarni Glanced at by: silby MFC after: 2 weeks	2005-04-08 08:42:10 +00:00
Gleb Smirnoff	8351d04f34	When a packet has been reinjected into ipfw(4) after dummynet(4) processing we have a non-NULL args.rule. If the same packet later is subject to "tee" rule, its original is sent again into ipfw_chk() and it reenters at the same rule. This leads to infinite loop and frozen router. Assign args.rule to NULL, any time we are going to send packet back to ipfw_chk() after a tee rule. This is a temporary workaround, which we will leave for RELENG_5. In HEAD we are going to make divert(4) save next rule the same way as dummynet(4) does. PR: kern/79546 Submitted by: Oleg Bulyzhin Reviewed by: maxim, andre MFC after: 3 days	2005-04-06 14:00:33 +00:00
Brooks Davis	a0d17f7e98	Use ACTION_PTR(r) instead of (r->cmd + r->act_ofs). Reviewed by: md5	2005-04-06 00:26:08 +00:00
Brooks Davis	f4ff11976d	Make dummynet_flush() match its prototype.	2005-04-05 23:38:16 +00:00
Poul-Henning Kamp	a8bc22b47a	natd core dumps when -reverse switch is used because of a bug in libalias. In /usr/src/lib/libalias/alias.c, the functions LibAliasIn and LibAliasOutTry call the legacy PacketAliasIn/PacketAliasOut instead of LibAliasIn/LibAliasOut when the PKT_ALIAS_REVERSE option is set. In this case, the context variable "la" gets lost because the legacy compatibility routines expect "la" to be global. This was obviously an oversight when rewriting the PacketAlias* functions to the LibAlias* functions. The fix (as shown in the patch below) is to remove the legacy subroutine calls and replace with the new ones using the "la" struct as the first arg. Submitted by: Gil Kloepfer <fgil@kloepfer.org> Confirmed by: <nicolai@catpipe.net> PR: 76839 MFC after: 3 days	2005-04-05 13:04:35 +00:00
Gleb Smirnoff	4cb39345c0	When several carp interfaces are attached to Ethernet interface, carp_carpdev_state_locked() is called every time carp interface is attached. The first call backs up flags of the first interface, and the second call backs up them again, erasing correct values. To solve this, a carp_sc_state_locked() function is introduced. It is called when interface is attached to parent, instead of calling carp_carpdev_state_locked. carp_carpdev_state_locked() calls carp_sc_state_locked() for each sc in chain. Reported by: Yuriy N. Shkandybin, sem	2005-03-30 11:44:43 +00:00
Gleb Smirnoff	d1a4742962	- Don't free mbuf, passed to interface output method if the latter returns error. In this case mbuf has already been freed. [1] - Remove redundant declaration. PR: kern/78893 [1] Submitted by: Liang Yi [1] Reviewed by: sam MFC after: 1 day	2005-03-29 13:43:09 +00:00
Sam Leffler	812d865346	eliminate extraneous null ptr checks Noticed by: Coverity Prevent analysis tool	2005-03-29 01:10:46 +00:00
Sam Leffler	5309f84168	deal with malloc failures Noticed by: Coverity Prevent analysis tool Together with: mdodd	2005-03-26 22:20:22 +00:00
Maxim Konovalov	6ee79c59d2	o Document net.inet.ip.portrange.random* sysctls. o Correct a comment about random port allocation threshold implementation. Reviewed by: silby, ru MFC after: 3 days	2005-03-23 09:26:38 +00:00
Gleb Smirnoff	d4d2297060	ifma_protospec is a pointer. Use NULL when assigning or comparing it.	2005-03-20 14:31:45 +00:00
Gleb Smirnoff	50bb170471	Remove a workaround from previous revision. It proved to be incorrect. Add two other workarounds for carp(4) interfaces: - do not add connected route when address is assigned to carp(4) interface - do not add connected route when other interface goes down Embrace workarounds with #ifdef DEV_CARP	2005-03-20 10:27:17 +00:00
Gleb Smirnoff	ee6f227017	If vhid exists return more informative EEXIST instead of EINVAL. While here remove redundant brackets.	2005-03-18 13:41:38 +00:00
Gleb Smirnoff	9860bab349	Fix a potential crash that could occur when CARP_LOG is being used. Obtained from: OpenBSD (pat)	2005-03-18 13:18:34 +00:00
Sam Leffler	6a9909b5e6	plug resource leak Noticed by: Coverity Prevent analysis tool	2005-03-16 05:27:19 +00:00
Robert Watson	d2bc35ab29	In tcp_usr_send(), broaden coverage of the socket buffer lock in the non-OOB case so that the sbspace() check is performed under the same lock instance as the append to the send socket buffer. MFC after: 1 week	2005-03-14 22:15:14 +00:00
Gleb Smirnoff	422a115a4a	Embrace with #ifdef DEV_CARP carp-related code.	2005-03-13 11:23:22 +00:00
Gleb Smirnoff	0504a89fdd	Add antifootshooting workaround, which will make all routes "connected" to carp(4) interfaces host routes. This prevents a problem, when connected network is routed to carp(4) interface.	2005-03-10 15:26:45 +00:00
Paul Saab	e891d82b56	Add limits on the number of elements in the sack scoreboard both per-connection and globally. This eliminates potential DoS attacks where SACK scoreboard elements tie up too much memory. Submitted by: Raja Mukerji (raja at moselle dot com). Reviewed by: Mohan Srinivasan (mohans at yahoo-inc dot com).	2005-03-09 23:14:10 +00:00
Gleb Smirnoff	2ef4a436e0	Make ARP not complain about wrong interface if correct interface is a carp one and address matched it. Reviewed by: brooks	2005-03-09 10:00:01 +00:00
Joe Marcus Clarke	70037e98c4	Fix a problem in the Skinny ALG where a specially crafted packet could cause a libalias application (e.g. natd, ppp, etc.) to crash. Note: Skinny support is not enabled in natd or ppp by default. Approved by: secteam (nectar) MFC after: 1 day Security: This fixes a remote DoS exploit	2005-03-03 03:06:37 +00:00
Gleb Smirnoff	b82936c5d4	Fix typo. Unbreak build. Take pointy hat.	2005-03-02 09:11:18 +00:00
Gleb Smirnoff	d220759b41	Add more locking when reading/writing to carp softc. When carp softc is attached to a parent interface we use its mutex to lock the softc. This means that in several places like carp_ioctl() we lock softc conditionally. This should be redesigned. To avoid LORs when MII announces us a link state change, we schedule a quick callout and call carp_carpdev_state_locked() from it. Initialize callouts using NET_CALLOUT_MPSAFE. Sponsored by: Rambler Reviewed by: mlaier	2005-03-01 13:14:33 +00:00
Gleb Smirnoff	d92d54d54d	- Add carp_mtx. Use it to protect list of all carp interfaces. - In carp_send_ad_all() walk through list of all carp interfaces instead of walking through list of all interfaces. Sponsored by: Rambler Reviewed by: mlaier	2005-03-01 12:36:07 +00:00
Gleb Smirnoff	31199c8463	Use NET_CALLOUT_MPSAFE macro.	2005-03-01 12:01:17 +00:00
Gleb Smirnoff	3a84d72a78	Revert change to struct ifnet. Use ifnet pointer in softc. Embedding ifnet into smth will soon be removed. Requested by: brooks	2005-03-01 10:59:14 +00:00
Gleb Smirnoff	4358dfc32f	Remove debugging printf. Reviewed by: mlaier	2005-03-01 09:31:36 +00:00
Yaroslav Tykhiy	630481bb92	Support running carp(4) over a vlan(4) parent interface. Encouraged by: glebius	2005-02-28 16:19:11 +00:00
Gleb Smirnoff	1d0a237660	Remove unused field from carp softc. OK'ed by: mcbride@OpenBSD	2005-02-28 11:57:03 +00:00
Gleb Smirnoff	3e07def4cd	Fix tcpdump(8) on carp(4) interface: - Use our loop DLT type, not OpenBSD. [1] - The fields that are converted to network byte order are not 32-bit fields but 16-bit fields, so htons should be used instead of htonl. [1] - Secondly, ip_input changes ip->ip_len into its value without the ip-header length. So, restore the length to make bpf happy. [1] - Use bpf_mtap2(), use temporary af1, since bpf_mtap2 doesn't understand uint8_t af identifier. Submitted by: Frank Volf [1]	2005-02-28 11:54:36 +00:00
Paul Saab	8291294024	If the receiver sends an ack that is out of [snd_una, snd_max], ignore the sack options in that segment. Else we'd end up corrupting the scoreboard. Found by: Raja Mukerji (raja at moselle dot com) Submitted by: Mohan Srinivasan	2005-02-27 20:39:04 +00:00
Max Laier	a4e5390551	Unbreak the build. carp_iamatch6 and carp_macmatch6 are not supposed to be static as they are used elsewhere.	2005-02-27 11:32:26 +00:00
Gleb Smirnoff	e8c34a71eb	Remove carp_softc.sc_ifp member in favor of union pointers in struct ifnet. Obtained from: OpenBSD	2005-02-26 13:55:07 +00:00
Gleb Smirnoff	5c1f0f6de5	Staticize local functions.	2005-02-26 10:33:14 +00:00
Gleb Smirnoff	88bf82a62e	New lines when logging.	2005-02-25 11:26:39 +00:00
Gleb Smirnoff	947b7cf3c6	Embrace macros with do {} while (0) Submitted by: maxim	2005-02-25 10:49:47 +00:00
Gleb Smirnoff	39aeaa0eb5	Call carp_carpdev_state() from carp_set_addr6(). See log for rev 1.4. Sponsored by: Rambler	2005-02-25 10:12:11 +00:00
Gleb Smirnoff	1e9e65729b	Improve logging: - Simplify CARP_LOG() and making it working (we don't have addlog in FreeBSD). - Introduce CARP_DEBUG() which logs with LOG_DEBUG severity when net.inet.carp.log > 1 - Use CARP_DEBUG to log state changes of carp interfaces. After CARP_LOG() cleanup it appeared that carp_input_c() does not need sc argument. Remove it. Sponsored by: Rambler	2005-02-25 10:09:44 +00:00
Gleb Smirnoff	6fba4c0bae	Fix problem when master comes up with one interface down, and preempts mastering on all other interfaces: - call carp_carpdev_state() on initialize instead of just setting to INIT - in carp_carpdev_state() check that interface is UP, instead of checking that it is not DOWN, because a rebooted machine may have interface in UNKNOWN state. Sponsored by: Rambler Obtained from: OpenBSD (partially)	2005-02-24 09:05:28 +00:00
Sam Leffler	db77984c5b	fix potential invalid index into ip_protox array Noticed by: Coverity Prevent analysis tool	2005-02-23 00:38:12 +00:00
Maxime Henrion	2368737719	Unbreak CARP build on 64-bit architectures. Tested on: sparc64	2005-02-23 00:20:33 +00:00
Andre Oppermann	099dd0430b	Bring back the full packet destination manipulation for 'ipfw fwd' with the kernel compile time option: options IPFIREWALL_FORWARD_EXTENDED This option has to be specified in addition to IPFIREWALL_FORWARD. With this option even packets targeted for an IP address local to the host can be redirected. All restrictions to ensure proper behaviour for locally generated packets are turned off. Firewall rules have to be carefully crafted to make sure that things like PMTU discovery do not break. Document the two kernel options. PR: kern/71910 PR: kern/73129 MFC after: 1 week	2005-02-22 17:40:40 +00:00
Gleb Smirnoff	67df421496	Remove promisc counter from parent interface in carp_clone_destroy(), so that parent interface is not left in promiscuous mode after carp interface is destroyed. This is not perfect, since promisc counter is added when carp interface is assigned an IP address. However, when address is removed parent interface is still in promiscuous mode. Only removal of carp interface removes promisc from parent. Same way in OpenBSD. Sponsored by: Rambler	2005-02-22 16:24:55 +00:00
Gleb Smirnoff	a97719482d	Add CARP (Common Address Redundancy Protocol), which allows multiple hosts to share an IP address, providing high availability and load balancing. Original work on CARP done by Michael Shalayeff, with many additions by Marco Pfatschbacher and Ryan McBride. FreeBSD port done solely by Max Laier. Patch by: mlaier Obtained from: OpenBSD (mickey, mcbride)	2005-02-22 13:04:05 +00:00
Gleb Smirnoff	797127a9bf	We can make code simpler after last change. Noticed by: Andrew Thompson	2005-02-22 08:35:24 +00:00
Gleb Smirnoff	3a1757b9c0	In in_pcbconnect_setup() jailed sockets are treated specially: if local address is not supplied, then jail IP is chosen and in_pcbbind() is called. Since udp_output() does not save local addr after call to in_pcbconnect_setup(), in_pcbbind() is called for each packet, and this is incorrect. So, we shall treat jailed sockets specially in udp_output(), we will save their local address. This fixes a long standing bug with broken sendto() system call in jails. PR: kern/26506 Reviewed by: rwatson MFC after: 2 weeks	2005-02-22 07:50:02 +00:00
Gleb Smirnoff	914d092f5d	In in_pcbconnect_setup() remove a check that route points at loopback interface. Nobody has explained to me the sense of this check. It breaks connect() system call to a destination address which is loopback routed (e.g. blackholed). Reviewed by: silence on net@ MFC after: 2 weeks	2005-02-22 07:39:15 +00:00
Robert Watson	0daccb9c94	In the current world order, solisten() implements the state transition of a socket from a regular socket to a listening socket able to accept new connections. As part of this state transition, solisten() calls into the protocol to update protocol-layer state. There were several bugs in this implementation that could result in a race wherein a TCP SYN received in the interval between the protocol state transition and the shortly following socket layer transition would result in a panic in the TCP code, as the socket would be in the TCPS_LISTEN state, but the socket would not have the SO_ACCEPTCONN flag set. This change does the following: - Pushes the socket state transition from the socket layer solisten() to socket "library" routines called from the protocol. This permits the socket routines to be called while holding the protocol mutexes, preventing a race exposing the incomplete socket state transition to TCP after the TCP state transition has completed. The check for a socket layer state transition is performed by solisten_proto_check(), and the actual transition is performed by solisten_proto(). - Holds the socket lock for the duration of the socket state test and set, and over the protocol layer state transition, which is now possible as the socket lock is acquired by the protocol layer, rather than vice versa. This prevents additional state related races in the socket layer. This permits the dual transition of socket layer and protocol layer state to occur while holding locks for both layers, making the two changes atomic with respect to one another. Similar changes are likely required elsewhere in the socket/protocol code. Reported by: Peter Holm <peter@holm.cc> Review and fixes from: emax, Antoine Brodin <antoine.brodin@laposte.net> Philosophical head nod: gnn	2005-02-21 21:58:17 +00:00
Paul Saab	7643c37cf2	Remove 2 (SACK) fields from the tcpcb. These are only used by a function that is called from tcp_input(), so they oughta be passed on the stack instead of stuck in the tcpcb. Submitted by: Mohan Srinivasan	2005-02-17 23:04:56 +00:00
Paul Saab	7776346f83	Fix for a SACK (receiver) bug where incorrect SACK blocks are reported to the sender - in the case where the sender sends data outside the window (as WinXP does :(). Reported by: Sam Jensen <sam at wand dot net dot nz> Submitted by: Mohan Srinivasan	2005-02-16 01:46:17 +00:00
Paul Saab	8db456bf17	- Retransmit just one segment on initiation of SACK recovery. Remove the SACK "initburst" sysctl. - Fix bugs in SACK dupack and partialack handling that can cause large bursts while in SACK recovery. Submitted by: Mohan Srinivasan	2005-02-14 21:01:08 +00:00
Maxim Konovalov	9945c0e21f	o Add handling of an IPv4-mapped IPv6 address. o Use SYSCTL_IN() macro instead of direct call of copyin(9). Submitted by: ume o Move sysctl_drop() implementation to sys/netinet/tcp_subr.c where most of tcp sysctls live. o There are net.inet[6].tcp[6].getcred sysctls already, no need for a separate struct tcp_ident_mapping. Suggested by: ume	2005-02-14 07:37:51 +00:00
Gleb Smirnoff	1af305441d	Jump to common action checks after doing specific ones. This fixes adding of divert rules, which I broke in previous commit. Pointy hat to: glebius	2005-02-06 11:13:59 +00:00
Maxim Konovalov	212a79b010	o Implement net.inet.tcp.drop sysctl and userland part, tcpdrop(8) utility: The tcpdrop command drops the TCP connection specified by the local address laddr, port lport and the foreign address faddr, port fport. Obtained from: OpenBSD Reviewed by: rwatson (locking), ru (man page), -current MFC after: 1 month	2005-02-06 10:47:12 +00:00
Gleb Smirnoff	670742a102	Add a ng_ipfw node, implementing a quick and simple interface between ipfw(4) and netgraph(4) facilities. Reviewed by: andre, brooks, julian	2005-02-05 12:06:33 +00:00
Hajimu UMEMOTO	6d0a982bdf	teach scope of IPv6 address to net.inet6.tcp6.getcred. MFC after: 1 week	2005-02-04 14:43:05 +00:00
Robert Watson	06456da2c6	Update an additional reference to the rate of ISN tick callouts that was missed in tcp_subr.c:1.216: projected_offset must also reflect how often the tcp_isn_tick() callout will fire. MFC after: 2 weeks Submitted by: silby	2005-01-31 01:35:01 +00:00
Christian S.J. Peron	0ba04c87b3	Change the state allocator from using regular malloc to using a UMA zone instead. This should eliminate a bit of the locking overhead associated with malloc and reduce the memory consumption associated with each new state. Reviewed by: rwatson, andre Silence on: ipfw@ MFC after: 1 week	2005-01-31 00:48:39 +00:00
Robert Watson	54082796aa	Have tcp_isn_tick() fire 100 times a second, rather than HZ times a second; since the default hz has changed to 1000 times a second, this resulted in unnecessary work being performed. MFC after: 2 weeks Discussed with: phk, cperciva General head nod: silby	2005-01-30 23:30:28 +00:00
Robert Watson	024105493d	Prefer (NULL) spelling of (0) for pointers. MFC after: 3 days	2005-01-30 19:29:47 +00:00
Robert Watson	77c16eed7c	Remove clause three from tcp_syncache.c license per permission of McAfee. Update copyright to McAfee from NETA.	2005-01-30 19:28:27 +00:00
Alan Cox	7258e9687b	Correctly move the packet header in ip_insertoptions(). Reported by: Anupam Chanda Reviewed by: sam@ MFC after: 2 weeks	2005-01-23 19:43:46 +00:00
Ruslan Ermilov	24a0682c64	Sort sections.	2005-01-20 09:17:07 +00:00
Gleb Smirnoff	28935658c4	- Reduce number of arguments passed to dummynet_io(), we already have cookie in struct ip_fw_args itself. - Remove redundant &= 0xffff from dummynet_io().	2005-01-16 11:13:18 +00:00
Gleb Smirnoff	6c69a7c30b	o Clean up interface between ip_fw_chk() and its callers: - ip_fw_chk() returns action as function return value. Field retval is removed from args structure. Action is not flag any more. It is one of integer constants. - Any action-specific cookies are returned either in new "cookie" field in args structure (dummynet, future netgraph glue), or in mbuf tag attached to packet (divert, tee, some future action). o Convert parsing of return value from ip_fw_chk() in ipfw_check_{in,out}() to a switch structure, so that the functions are more readable, and future actions can be added with fewer modifications. Approved by: andre MFC after: 2 months	2005-01-14 09:00:46 +00:00
Paul Saab	8d03f2b53b	Fix a TCP SACK related crash resulting from incorrect computation of len in tcp_output(), in the case where the FIN has already been transmitted. The mis-computation of len is because of a gcc optimization issue, which this change works around. Submitted by: Mohan Srinivasan	2005-01-12 21:40:51 +00:00
Brian Somers	2a4cd52421	include "alias.h", not <alias.h> MFC after: 3 days	2005-01-10 10:54:06 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
Mike Silbersack	a69968ee4e	Add a sysctl (net.inet.tcp.insecure_rst) which allows one to specify that the RFC 793 specification for accepting RST packets should be followed. When followed, this makes one vulnerable to the attacks described in "slipping in the window", but it may be necessary in some odd circumstances.	2005-01-03 07:08:37 +00:00
Mike Silbersack	5f311da2cc	Port randomization leads to extremely fast port reuse at high connection rates, which is causing problems for some users. To retain the security advantage of random ports and ensure correct operation for high connection rate users, disable port randomization during periods of high connection rates. Whenever the connection rate exceeds randomcps (10 by default), randomization will be disabled for randomtime (45 by default) seconds. These thresholds may be tuned via sysctl. Many thanks to Igor Sysoev, who proved the necessity of this change and tested many preliminary versions of the patch. MFC After: 20 seconds	2005-01-02 01:50:57 +00:00
Robert Watson	74d4630b71	Remove an errant blank line apparently introduced in ip_output.c:1.194.	2004-12-25 22:59:42 +00:00
Robert Watson	42cf3289c3	In the dropafterack case of tcp_input(), it's OK to release the TCP pcbinfo lock before calling tcp_output(), as holding just the inpcb lock is sufficient to prevent garbage collection.	2004-12-25 22:26:13 +00:00
Robert Watson	e0bef1cb35	Revert parts of tcp_input.c:1.255 associated with the header predicted cases for tcp_input(): While it is true that the pcbinfo lock provides a pseudo-reference to inpcbs, both the inpcb and pcbinfo locks are required to free an un-referenced inpcb. As such, we can release the pcbinfo lock as long as the inpcb remains locked with the confidence that it will not be garbage-collected. This leads to a less conservative locking strategy that should reduce contention on the TCP pcbinfo lock. Discussed with: sam	2004-12-25 22:23:13 +00:00
Robert Watson	452d9f5b1c	Attempt to consistently use () around return values in calls to return() in newer code (sysctl, ISN, timewait).	2004-12-23 01:34:26 +00:00
Robert Watson	06da46b241	Remove an XXXRW comment relating to whether or not the TCP timers are MPSAFE: they are now believed to be. Correct a typo in a second comment. MFC after: 2 weeks	2004-12-23 01:27:13 +00:00
Robert Watson	db0aae38b6	Remove the now unused tcp_canceltimers() function. tcpcb timers are now stopped as part of tcp_discardcb(). MFC after: 2 weeks	2004-12-23 01:25:59 +00:00
Robert Watson	950ab1e470	Remove an annotation of a minor race relating to the update of multiple MIB entries using sysctl in short order, which might result in unexpected values for tcp_maxidle being generated by tcp_slowtimo. In practice, this will not happen, or at least, doesn't require an explicit comment. MFC after: 2 weeks	2004-12-23 01:21:54 +00:00
Gleb Smirnoff	5e5da86597	In certain cases ip_output() can free our route, so check for its presence before RTFREE(). Noticed by: ru	2004-12-10 07:51:14 +00:00
Gleb Smirnoff	d2a09f901a	Revert last change. Andre: First let's get major new features into the kernel in a clean and nice way, and then start optimizing. In this case we don't have any obfuscation that makes later profiling and/or optimizing difficult in any way. Requested by: csjp, sam	2004-12-10 07:47:17 +00:00
Christian S.J. Peron	fbf2edb6e4	This commit adds a shared locking mechanism very similar to the mechanism used by pfil. This shared locking mechanism will remove a nasty lock order reversal which occurs when ucred based rules are used which results in hard locks while mpsafenet=1. So this removes the debug.mpsafenet=0 requirement when using ucred based rules with IPFW. It should be noted that this locking mechanism does not guarantee fairness between read and write locks, and that it will favor firewall chain readers over writers. This seemed acceptable since write operations to firewall chains protected by this lock tend to be less frequent than reads. Reviewed by: andre, rwatson Tested by: myself, seanc Silence on: ipfw@ MFC after: 1 month	2004-12-10 02:17:18 +00:00
Gleb Smirnoff	f5a19d3909	Check that DUMMYNET_LOADED before seeking dummynet m_tag. Reviewed by: andre MFC after: 1 week	2004-12-09 16:41:47 +00:00
Max Laier	067a8bab8a	More fixing of multiple addresses in the same prefix. This time do not try to arp resolve "secondary" local addresses. Found and submitted by: ru With additions from: OpenBSD (rev. 1.47) Reviewed by: ru	2004-12-09 00:12:41 +00:00
Ruslan Ermilov	5cae05ad33	Time out routes created by redirect.	2004-12-06 22:27:22 +00:00
Gleb Smirnoff	98335aa976	- Make route caching optional, configurable via IFF_LINK0 flag. - Turn it off by default. Requested by: many Reviewed by: andre Approved by: julian (mentor) MFC after: 3 days	2004-12-06 19:02:43 +00:00
Robert Watson	79a9e59c89	Assert the tcptw inpcb lock in tcp_timer_2msl_reset(), as fields in the tcptw undergo non-atomic read-modify-writes. MFC after: 2 weeks	2004-12-05 22:47:29 +00:00
Robert Watson	b9155d92b2	Assert inpcb lock in: tcpip_fillheaders() tcp_discardcb() tcp_close() tcp_notify() tcp_new_isn() tcp_xmit_bandwidth_limit() Fix a locking comment in tcp_twstart(): the pcbinfo will be locked (and is asserted). MFC after: 2 weeks	2004-12-05 22:27:53 +00:00
Robert Watson	6fbed4af22	Minor grammar fix in comment.	2004-12-05 22:20:59 +00:00
Robert Watson	89924e5865	Pass the inpcb reference into ip_getmoptions() rather than just the inp->inp_moptions pointer, so that ip_getmoptions() can perform necessary locking when doing non-atomic reads. Lock the inpcb by default to copy any data to local variables, then unlock before performing sooptcopyout(). MFC after: 2 weeks	2004-12-05 22:08:37 +00:00
Robert Watson	92c71ab30b	Define INP_UNLOCK_ASSERT() to assert that an inpcb is unlocked. MFC after: 2 weeks	2004-12-05 22:07:14 +00:00
Robert Watson	5c918b56d8	Push the inpcb argument into ip_setmoptions() when setting IP multicast socket options, so that it is available for locking.	2004-12-05 21:38:33 +00:00
Robert Watson	993d9505d4	Start working through inpcb locking for ip_ctloutput() by cleaning up modifications to the inpcb IP options mbuf: - Lock the inpcb before passing it into ip_pcbopts() in order to prevent simultaneous reads and read-modify-writes that could result in races. - Pass the inpcb reference into ip_pcbopts() instead of the option chain pointer in the inpcb. - Assert the inpcb lock in ip_pcbopts(). - Convert one or two uses of a pointer as a boolean or an integer comparison to a comparison with NULL for readability.	2004-12-05 19:11:09 +00:00
Paul Saab	7d5ed1ceea	Fixes a bug in SACK causing us to send data beyond the receive window. Found by: Pawel Worach and Daniel Hartmeier Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-11-29 18:47:27 +00:00
Robert Watson	2be3bf2244	Assert the inpcb lock in tcp_xmit_timer() as it performs read-modify- write of various time/rtt-related fields in the tcpcb.	2004-11-28 11:06:22 +00:00
Robert Watson	18ad5842c5	Expand coverage of the receive socket buffer lock when handling urgent pointer updates: test available space while holding the socket buffer mutex, and continue to hold until the pointer update has been performed. MFC after: 2 weeks	2004-11-28 11:01:31 +00:00
Robert Watson	c8443a1dc0	Do export the advertised receive window via the tcpi_rcv_space field of struct tcp_info.	2004-11-27 20:20:11 +00:00
Robert Watson	b8af5dfa81	Implement parts of the TCP_INFO socket option as found in Linux 2.6. This socket option allows processes to query a TCP socket for some low level transmission details, such as the current send, bandwidth, and congestion windows. Linux provides a 'struct tcp_info' structure containing various variables, rather than separate socket options; this makes the API somewhat fragile as it makes it difficult to add new entries of interest as requirements and implementation evolve. As such, I've included a large pad at the end of the structure. Right now, relatively few of the Linux API fields are filled in, and some contain no logical equivalent on FreeBSD. I've included __'d entries in the structure to make it easier to figure out what is and isn't omitted. This API/ABI should be considered unstable for the time being.	2004-11-26 18:58:46 +00:00
Mike Silbersack	6a220ed80a	Fix a problem where our TCP stack would ignore RST packets if the receive window was 0 bytes in size. This may have been the cause of unsolved "connection not closing" reports over the years. Thanks to Michiel Boland for providing the fix and providing a concise test program for the problem. Submitted by: Michiel Boland MFC after: 2 weeks	2004-11-25 19:04:20 +00:00
Robert Watson	de30ea131f	In tcp_reass(), assert the inpcb lock on the passed tcpcb, since the contents of the tcpcb are read and modified in volume. In tcp_input(), replace th comparison with 0 with a comparison with NULL. At the 'findpcb', 'dropafterack', and 'dropwithreset' labels in tcp_input(), assert 'headlocked'. Try to improve consistency between various assertions regarding headlocked to be more informative. MFC after: 2 weeks	2004-11-23 23:41:20 +00:00
Robert Watson	cce83ffb5a	tcp_timewait() performs multiple non-atomic reads on the tcptw structure, so assert the inpcb lock associated with the tcptw. Also assert the tcbinfo lock, as tcp_timewait() may call tcp_twclose() or tcp_timer_2msl_reset(), which require it. Since tcp_timewait() is already called with that lock from tcp_input(), this doesn't change current locking, merely documents reasons for it. In tcp_twstart(), assert the tcbinfo lock, as tcp_timer_2msl_reset() is called, which requires that lock. In tcp_twclose(), assert the tcbinfo lock, as tcp_timer_2msl_stop() is called, which requires that lock. Document the locking strategy for the time wait queues in tcp_timer.c, which consists of protecting the time wait queues in the same manner as the tcbinfo structure (using the tcbinfo lock). In tcp_timer_2msl_reset(), assert the tcbinfo lock, as the time wait queues are modified. In tcp_timer_2msl_stop(), assert the tcbinfo lock, as the time wait queues may be modified. In tcp_timer_2msl_tw(), assert the tcbinfo lock, as the time wait queues may be modified. MFC after: 2 weeks	2004-11-23 17:21:30 +00:00
Robert Watson	b42ff86e73	De-spl tcp_slowtimo; tcp_maxidle assignment is subject to possible but unlikely races that could be corrected by having tcp_keepcnt and tcp_keepintvl modifications go through handler functions via sysctl, but probably is not worth doing. Updates to multiple sysctls within evaluation of a single addition are unlikely. Annotate that tcp_canceltimers() is currently unused. De-spl tcp_timer_delack(). De-spl tcp_timer_2msl(). MFC after: 2 weeks	2004-11-23 16:45:07 +00:00
Robert Watson	7258e91f0f	Assert the inpcb lock in tcp_twstart(), which does both read-modify-write on the tcpcb, but also calls into tcp_close() and tcp_twrespond(). Annotate that tcp_twrecycleable() requires the inpcb lock because it does a series of non-atomic reads of the tcpcb, but is currently called without the inpcb lock by the caller. This is a bug. Assert the inpcb lock in tcp_twclose() as it performs a read-modify-write of the timewait structure/inpcb, and calls in_pcbdetach() which requires the lock. Assert the inpcb lock in tcp_twrespond(), as it performs multiple non-atomic reads of the tcptw and inpcb structures, as well as calling mac_create_mbuf_from_inpcb(), tcpip_fillheaders(), which require the inpcb lock. MFC after: 2 weeks	2004-11-23 16:23:13 +00:00
Robert Watson	8263bab34d	Assert inpcb lock in tcp_quench(), tcp_drop_syn_sent(), tcp_mtudisc(), and tcp_drop(), due to read-modify-write of TCP state variables. MFC after: 2 weeks	2004-11-23 16:06:15 +00:00
Robert Watson	8438db0f59	Assert the tcbinfo write lock in tcp_new_isn(), as the tcbinfo lock protects access to the ISN state variables. Acquire the tcbinfo write lock in tcp_isn_tick() to synchronize timer-driven isn bumping. Staticize internal ISN variables since they're not used outside of tcp_subr.c. MFC after: 2 weeks	2004-11-23 15:59:43 +00:00
Robert Watson	ca127a3e80	Remove "Unlocked read" annotations associated with previously unlocked use of socket buffer fields in the TCP input code. These references are now protected by use of the receive socket buffer lock. MFC after: 1 week	2004-11-22 13:16:27 +00:00
Robert Watson	98734750b4	s/send/sent/ in comment describing TCPS_SYN_RECEIVED.	2004-11-21 14:38:04 +00:00
Gleb Smirnoff	c1384b5ae2	- Since divert protocol is not connection oriented, remove SS_ISCONNECTED flag from divert sockets. - Remove div_disconnect() method, since it shouldn't be called now. - Remove div_abort() method. It was never called directly, since protocol doesn't have listen queue. It was called only from div_disconnect(), which is removed now. Reviewed by: rwatson, maxim Approved by: julian (mentor) MT5 after: 1 week MT4 after: 1 month	2004-11-18 13:49:18 +00:00
Max Laier	9a6a6eeba2	Fix host route addition for more than one address to a loopback interface after allowing more than one address with the same prefix. Reported by: Vladimir Grebenschikov <vova NO fbsd SPAM ru> Submitted by: ru (also NetBSD rev. 1.83) Pointyhat to: mlaier	2004-11-17 23:14:03 +00:00
Max Laier	81d96ce8a4	Merge copyright notices. Requested by: njl	2004-11-13 17:05:40 +00:00

2562 Commits