freebsd-nq

Author	SHA1	Message	Date
Andrew Thompson	5feebeeb53	Enable proxy ARP answers on any of the bridged interfaces if proxy record belongs to another interface within the bridge group. PR: kern/94408 Submitted by: Eygene A. Ryabinkin MFC after: 1 month	2006-06-09 00:33:30 +00:00
Oleg Bulyzhin	458009ae93	install_state() should properly initialize 'addr_type' field of newly created flows for O_LIMIT rules. Otherwise 'ipfw -d show' is unable to display PARENT rules properly. (This bug was exposed by ipfw2.c rev.1.90) Approved by: glebius (mentor) MFC after: 2 weeks	2006-06-08 11:27:45 +00:00
Oleg Bulyzhin	d2dc1907e8	Fix following rules: pipe X (tag\|altq) Y ... Approved by: glebius (mentor) MFC after: 2 weeks	2006-06-08 11:13:23 +00:00
Robert Watson	f2de87fec4	Push acquisition of pcbinfo lock out of tcp_usr_attach() into tcp_attach() after the call to soreserve(), as it doesn't require the global lock. Rearrange inpcb locking here also. MFC after: 1 month	2006-06-04 09:31:34 +00:00
Robert Watson	d8ab0ec661	When entering a timer on a tcpcb, don't continue processing if it has been dropped. This prevents a bug introduced during the socket/pcb refcounting work from occuring, in which occasionally the retransmit timer may fire after a connection has been reset, resulting in the resulting R\|A TCP packet having a source port of 0, as the port reservation has been released. While here, fixing up some RUNLOCK->WUNLOCK bugs. MFC after: 1 month	2006-06-03 19:37:08 +00:00
Robert Watson	f24618aaf0	Acquire udbinfo lock after call to soreserve() rather than before, as it is not required. This simplifies error-handling, and reduces the time that this lock is held. MFC after: 1 month	2006-06-03 19:29:26 +00:00
Christian S.J. Peron	16d878cc99	Fix the following bpf(4) race condition which can result in a panic: (1) bpf peer attaches to interface netif0 (2) Packet is received by netif0 (3) ifp->if_bpf pointer is checked and handed off to bpf (4) bpf peer detaches from netif0 resulting in ifp->if_bpf being initialized to NULL. (5) ifp->if_bpf is dereferenced by bpf machinery (6) Kaboom This race condition likely explains the various different kernel panics reported around sending SIGINT to tcpdump or dhclient processes. But really this race can result in kernel panics anywhere you have frequent bpf attach and detach operations with high packet per second load. Summary of changes: - Remove the bpf interface's "driverp" member - When we attach bpf interfaces, we now set the ifp->if_bpf member to the bpf interface structure. Once this is done, ifp->if_bpf should never be NULL. [1] - Introduce bpf_peers_present function, an inline operation which will do a lockless read bpf peer list associated with the interface. It should be noted that the bpf code will pickup the bpf_interface lock before adding or removing bpf peers. This should serialize the access to the bpf descriptor list, removing the race. - Expose the bpf_if structure in bpf.h so that the bpf_peers_present function can use it. This also removes the struct bpf_if; hack that was there. - Adjust all consumers of the raw if_bpf structure to use bpf_peers_present Now what happens is: (1) Packet is received by netif0 (2) Check to see if bpf descriptor list is empty (3) Pickup the bpf interface lock (4) Hand packet off to process From the attach/detach side: (1) Pickup the bpf interface lock (2) Add/remove from bpf descriptor list Now that we are storing the bpf interface structure with the ifnet, there is is no need to walk the bpf interface list to locate the correct bpf interface. We now simply look up the interface, and initialize the pointer. This has a nice side effect of changing a bpf interface attach operation from O(N) (where N is the number of bpf interfaces), to O(1). [1] From now on, we can no longer check ifp->if_bpf to tell us whether or not we have any bpf peers that might be interested in receiving packets. In collaboration with: sam@ MFC after: 1 month	2006-06-02 19:59:33 +00:00
Robert Watson	ad3a630f7e	Minor restyling and cleanup around ipport_tick(). MFC after: 1 month	2006-06-02 08:18:27 +00:00
Oleg Bulyzhin	6a7d5cb645	Implement internal (i.e. inside kernel) packet tagging using mbuf_tags(9). Since tags are kept while packet resides in kernelspace, it's possible to use other kernel facilities (like netgraph nodes) for altering those tags. Submitted by: Andrey Elsukov <bu7cher at yandex dot ru> Submitted by: Vadim Goncharov <vadimnuclight at tpu dot ru> Approved by: glebius (mentor) Idea from: OpenBSD PF MFC after: 1 month	2006-05-24 13:09:55 +00:00
Maxim Konovalov	d45e4f9945	o In udp\|rip_disconnect() acquire a socket lock before the socket state modification. To prevent races do that while holding inpcb lock. Reviewed by: rwatson	2006-05-21 19:28:46 +00:00
Maxim Konovalov	635354c446	o Add missed error check: in ip_ctloutput() sooptcopyin() returns a result but we never examine it. Reviewed by: rwatson MFC after: 2 weeks	2006-05-21 17:52:08 +00:00
Bruce M Simpson	8d7d85149e	Initialize the new members of struct ip_moptions as a defensive programming measure. Note that whilst these members are not used by the ip_output() path, we are passing an instance of struct ip_moptions here which is declared on the stack (which could be considered a bad thing). ip_output() does not consume struct ip_moptions, but in case it does in future, declare an in_multi vector on the stack too to behave more like ip_findmoptions() does.	2006-05-18 19:51:08 +00:00
Gleb Smirnoff	e5f88c4492	Since m_pullup() can return a new mbuf, change gre_input2() to return mbuf back to gre_input(). If the former returns mbuf back to the latter, then pass it to raw_input(). Coverity ID: 829	2006-05-16 11:15:22 +00:00
Gleb Smirnoff	ffb761f624	- Backout one line from 1.78. The tp can be freed by tcp_drop(). - Style next line. Coverity ID: 912	2006-05-16 10:51:26 +00:00
Maxim Konovalov	eb16472f74	o In rip_disconnect() do not call rip_abort(), just mark a socket as not connected. In soclose() case rip_detach() will kill inpcb for us later. It makes rawconnect regression test do not panic a system. Reviewed by: rwatson X-MFC after: with all 1th April inpcb changes	2006-05-15 09:28:57 +00:00
Max Laier	0e7185f6e7	Use only lower 64bit of src/dest (and src/dest port) for hashing of IPv6 connections and get rid of the flow_id as it is not guaranteed to be stable some (most?) current implementations seem to just zero it out. PR: kern/88664 Reported by: jylefort Submitted by: Joost Bekkers (w/ changes) Tested by "regisr" <regisrApoboxDcom>	2006-05-14 23:42:24 +00:00
Bruce M Simpson	3548bfc964	Fix a long-standing limitation in IPv4 multicast group membership. By making the imo_membership array a dynamically allocated vector, this minimizes disruption to existing IPv4 multicast code. This change breaks the ABI for the kernel module ip_mroute.ko, and may cause a small amount of churn for folks working on the IGMPv3 merge. Previously, sockets were subject to a compile-time limitation on the number of IPv4 group memberships, which was hard-coded to 20. The imo_membership relationship, however, is 1:1 with regards to a tuple of multicast group address and interface address. Users who ran routing protocols such as OSPF ran into this limitation on machines with a large system interface tree.	2006-05-14 14:22:49 +00:00
Max Laier	656faadcb8	Remove ip6fw. Since ipfw has full functional IPv6 support now and - in contrast to ip6fw - is properly lockes, it is time to retire ip6fw.	2006-05-12 20:39:23 +00:00
Max Laier	e93187482d	Reintroduce net.inet6.ip6.fw.enable sysctl to dis/enable the ipv6 processing seperately. Also use pfil hook/unhook instead of keeping the check functions in pfil just to return there based on the sysctl. While here fix some whitespace on a nearby SYSCTL_ macro.	2006-05-12 04:41:27 +00:00
Max Laier	432288dcb6	Don't claim "(+ipv6)" if we didn't build with INET6.	2006-05-11 15:22:38 +00:00
Robert Watson	59b8854eee	Modify UDP to use sosend_dgram() instead of sosend(). This allows for signicantly optimized UDP socket I/O when using a single UDP socket from many threads or processes that share it, by avoiding significant locking and other overhead in the general sosend() path that isn't necessary for simple datagram sockets. Specifically, this change results in a significant performance improvement for threaded name service in BIND9 under load. Suggested by: Jinmei_Tatsuya at isc dot org	2006-05-06 11:24:59 +00:00
Bjoern A. Zeeb	91b309a1c4	Make sure the ip data pointer is correct before touching it again after ipsec4_output processing else KAME IPSec using the handbook configuration with gif(4) will panic the kernel. Problem reported by: t. patterson <tp lot.org> Tested by: t. patterson <tp lot.org>	2006-05-05 07:31:03 +00:00
Robert Watson	3127286870	Only return (tw) from tcp_twclose() if reuse is passed, otherwise return NULL. In principle this shouldn't change the behavior, but avoids returning a potentially invalid/inappropriate pointer to the caller. Found with: Coverity Prevent (tm) Submitted by: pjd MFC after: 3 months	2006-05-05 06:50:23 +00:00
Pawel Jakub Dawidek	1d7d0bfe5e	/tmp/cvsTXPIwQ	2006-05-05 06:24:34 +00:00
Marcel Moolenaar	7c5a8ab212	In in_pcbdrop(), fix !INVARIANTS build.	2006-04-25 23:23:13 +00:00
Robert Watson	8e3f3b169e	Rename 'last' to 'inp' in udp_append(): the name 'last' is due to the fact that the loop through inpcb's in udp_input() tracks the last inpcb while looping. We keep that name in the calling loop but not in the delivery routine itself. MFC after: 3 months	2006-04-25 17:38:08 +00:00
Robert Watson	10702a2840	Abstract inpcb drop logic, previously just setting of INP_DROPPED in TCP, into in_pcbdrop(). Expand logic to detach the inpcb from its bound address/port so that dropping a TCP connection releases the inpcb resource reservation, which since the introduction of socket/pcb reference count updates, has been persisting until the socket closed rather than being released implicitly due to prior freeing of the inpcb on TCP drop. MFC after: 3 months	2006-04-25 11:17:35 +00:00
Robert Watson	c78cbc7b1d	Instead of calling tcp_usr_detach() from tcp_usr_abort(), break out common pcb tear-down logic into tcp_detach(), which is called from either. Invoke tcp_drop() from the tcp_usr_abort() path rather than tcp_disconnect(), as we want to drop it immediately not perform a FIN sequence. This is one reason why some people were experiencing panics in sodealloc(), as the netisr and aborting thread were simultaneously trying to tear down the socket. This bug could often be reproduced using repeated runs of the listenclose regression test. MFC after: 3 months PR: 96090 Reported by: Peter Kostouros <kpeter at melbpc dot org dot au>, kris Tested by: Peter Kostouros <kpeter at melbpc dot org dot au>, kris	2006-04-24 08:20:02 +00:00
Robert Watson	9106a6d6b0	Replace isn_mtx direct use with ISN_*() lock macros so that locking details/strategy can be changed without touching every use. MFC after: 3 months	2006-04-23 12:27:42 +00:00
Robert Watson	4c0e8f41f6	Introduce a new TCP mutex, isn_mtx, which protects the initial sequence number state, rather than re-using pcbinfo. This introduces some additional mutex operations during isn query, but avoids hitting the TCP pcbinfo lock out of yet another frequently firing TCP timer. MFC after: 3 months	2006-04-22 19:23:24 +00:00
Robert Watson	602cc7f12b	Assert the inpcb lock when rehashing an inpcb. Improve consistency of style around some current assertions. MFC after: 3 months	2006-04-22 19:15:20 +00:00
Robert Watson	6466b28a40	Remove pcbinfo locking from in_setsockaddr() and in_setpeeraddr(); holding the inpcb lock is sufficient to prevent races in reading the address and port, as both the inpcb lock and pcbinfo lock are required to change the address/port. Improve consistency of spelling in assertions about inp != NULL. MFC after: 3 months	2006-04-22 19:10:02 +00:00
Paul Saab	4f590175b7	Allow for nmbclusters and maxsockets to be increased via sysctl. An eventhandler is used to update all the various zones that depend on these values.	2006-04-21 09:25:40 +00:00
Gleb Smirnoff	4cbb118526	Merge rev. 1.240 of ip_output.c, so that IPFIREWALL_FORWARD_EXTENDED kernel option will affect both forwarding methods - classic and fast.	2006-04-18 09:20:16 +00:00
Robert Watson	3cbe7fafa5	Modify tcp_timewait() to accept an inpcb reference, not a tcptw reference. For now, we allow the possibility that the in_ppcb pointer in the inpcb may be NULL if a timewait socket has had its tcptw structure recycled. This allows tcp_timewait() to consistently unlock the inpcb. Reported by: Kazuaki Oda <kaakun at highway dot ne dot jp> MFC after: 3 months	2006-04-09 16:59:19 +00:00
Mohan Srinivasan	1714e18e79	Eliminate debug code that catches bugs in the hinting of sack variables (tcp_sack_output_debug checks cached hints aginst computed values by walking the scoreboard and reports discrepancies). The sack hinting code has been stable for many months now so it is time for the debug code to go. Leaving tcp_sack_output_debug ifdef'ed out in case we need to resurrect it at a later point.	2006-04-06 17:21:16 +00:00
Robert Watson	a460ae4b4c	Don't unlock a timewait structure if the pointer is NULL in tcp_timewait(). This corrects a bug (or lack of fixing of a bug) in tcp_input.c:1.295. Submitted by: Kazuaki Oda <kaakun at highway dot ne dot jp> MFC after: 3 months	2006-04-05 08:45:59 +00:00
Mohan Srinivasan	1f65c2cd31	Certain (bad) values of sack blocks can end up corrupting the sack scoreboard. Make the checks in tcp_sack_doack() more robust to prevent this. Submitted by: Raja Mukerji (raja@mukerji.com) Reviewed by: Mohan Srinivasan	2006-04-05 00:11:04 +00:00
Gleb Smirnoff	a73b656763	Add a tunable net.inet.tcp.maxtcptw, that allows to set a limit on tcptw zone independently from setting a limit on socket zone.	2006-04-04 14:31:37 +00:00
Robert Watson	ae0e714308	Before dereferencing intotw() when INP_TIMEWAIT, check for inp_ppcb being NULL. We currently do allow this to happen, but may want to remove that possibility in the future. This case can occur when a socket is left open after TCP wraps up, and the timewait state is recycled. This will be cleaned up in the future. Found by: Kazuaki Oda <kaakun at highway dot ne dot jp> MFC after: 3 months	2006-04-04 12:26:07 +00:00
Robert Watson	cb895fb9b0	In TCP notify routines, check inpcb for INP_TIMEWAIT and INP_DROPPED. The INP_DROPPED check replaces the current NULL checks; the INP_TIMEWAIT checks appear to have always been required, but not been there, which is/was a bug. This avoids unconditionally casting of in_ppcb to a tcpcb, when it may be a twtcb, which may have resulted in obscure ICMP-related panics in earlier releases. MFC after: 3 months	2006-04-03 14:07:50 +00:00
Robert Watson	afa39e25c4	Change inp_ppcb from caddr_t to void , fix/remove associated related casts. Consistently use intotw() to cast inp_ppcb pointers to struct tcptw pointers. Consistently use intotcpcb() to cast inp_ppcb pointers to struct tcpcb * pointers. Don't assign tp to the results to intotcpcb() during variable declation at the top of functions, as that is before the asserts relating to locking have been performed. Do this later in the function after appropriate assertions have run to allow that operation to be conisdered safe. MFC after: 3 months	2006-04-03 13:33:55 +00:00
Robert Watson	43f56a32a0	Style tweaks: convert to ANSI from K&R function prototypes. MFC after: 3 months	2006-04-03 12:59:27 +00:00
Robert Watson	2fc5ae87d0	Update comment on tcp_close() for new world order. MFC after: 3 months	2006-04-03 12:52:13 +00:00
Robert Watson	e6e65783d6	Clarify comment on handling of non-timewait TCP states in tcp_usr_detach(). MFC after: 3 months	2006-04-03 12:43:56 +00:00
Robert Watson	fa38deac65	Fix up locking surrounding tcp_drop sysctl: in the new world order, we don't free inpcbs until after the socket is closed, so we always need to unlock an inpcb after calling tcp_drop() on it. MFC after: 3 months	2006-04-03 11:57:12 +00:00
Robert Watson	3d2d3ef434	After checking for SO_ISDISCONNECTED in tcp_usr_accept(), return immediately rather than jumping to the normal output handling, which assumes we've pulled out the inpcb, which hasn't happened at this point (and isn't necessary). Return ECONNABORTED instead of EINVAL when the inpcb has entered INP_TIMEWAIT or INP_DROPPED, as this is the documented error value. This may correct the panic seen by Ganbold. MFC after: 1 month Reported by: Ganbold <ganbold at micom dot mng dot net>	2006-04-03 09:52:55 +00:00
Robert Watson	a34f6c1e1d	Correct incorrect assertion in div_bind(): inp must not be NULL here. Reported by: tegge MFC after: 3 months	2006-04-03 09:01:17 +00:00
Robert Watson	953b5606df	During reformulation of tcp_usr_detach(), the call to initiate TCP disconnect for fully connected sockets was dropped, meaning that if the socket was closed while the connection was alive, it would be leaked. Structure tcp_usr_detach() so that there are two clear parts: initiating disconnect, and reclaiming state, and reintroduce the tcp_disconnect() call in the first part. MFC after: 3 months	2006-04-02 16:42:51 +00:00
Robert Watson	34af7bae80	Properly handle an edge case previously not handled correctly: a socket can have a tcp connection that has entered time wait attached to it, in the event that shutdown() is called on the socket and the FINs properly exchange before close(). In this case we don't detach or free the inpcb, just leave the tcptw detached and freed, but we must release the inpcb lock (which we didn't previously). MFC after: 3 months	2006-04-01 23:53:25 +00:00

1 2 3 4 5 ...

2525 Commits