freebsd-skq

Author	SHA1	Message	Date
rwatson	a761d4fb36	Reorganize in_control() so that invariants are more obvious, and so that it is easier to lock: - Handle the unsupported ioctl case at the beginning of in_control(), handing off to ifp->if_ioctl, rather than looking up interfaces and addresses unnecessarily in this case. - Make it an invariant that ifp is always non-NULL when running in_control()-implemented ioctls, simplifying the code structure. MFC after: 3 weeks	2009-04-23 21:41:37 +00:00
bms	d3ee88b536	Bracket struct mfc and struct rtdetq with #ifdef _KERNEL. Match the bracketing in netstat. Since the cleanup of MROUTING, ports have broken because they expect to include <netinet/ip_mroute.h> without including <sys/queue.h>. Fix breakage at source. The real fix, of course, is to fix the MROUTING APIs by blowing them away and replacing them with something else...	2009-04-21 12:47:09 +00:00
bms	424b3895c7	remove IFF_ASSERTGIANT	2009-04-21 09:43:51 +00:00
rwatson	c869a90c83	Prefer actual field names (if_addrhead, ifa_link) to macros aliasing those field names in FreeBSD code. MFC after: 2 weeks	2009-04-20 22:40:44 +00:00
rwatson	cbaf9d767f	In ip_input(), cache the received mbuf's network interface in a local variable. Acquire the interface address list lock when iterating over the interface address list searching for a matching received broadcast address. MFC after: 2 weeks	2009-04-20 14:35:42 +00:00
rwatson	bb7780fc12	In icmp_reflect(), acquire the inteface address list lock when searching for a source address to use. MFC after: 2 weeks Reviewed by: bz	2009-04-20 13:45:39 +00:00
rwatson	901df80bd6	Lock the interface address list when searching for a matching interface by address, or when implementing 'me' rules on IPv6. Prefer the field name if_addrhead to the macro if_addrlist. MFC after: 2 weeks	2009-04-19 22:34:35 +00:00
rwatson	ee528abe46	In divert_packet(), lock the interface address list before iterating over it in search of an address. MFC after: 2 weeks	2009-04-19 22:29:16 +00:00
rwatson	12c8c21e5c	Lock interface address lists in in_pcbladdr() when searching for a source address for a connection and there's no route or now interface for the route. MFC after: 2 weeks	2009-04-19 22:25:09 +00:00
rwatson	38c104950f	Protect against some writer-writer races in in_control() by acquiring the interface address list lock around interface address list modifications. More to do here. MFC after: 2 weeks	2009-04-19 22:16:19 +00:00
bms	1b10d27b7f	Now that IFF_NEEDSGIANT has been removed from the network stack, catch up with this in IGMPv3 and remove dead code. This has the side-effect of not being back-portable to RELENG_7 w/o further changes.	2009-04-19 08:14:21 +00:00
kmacy	96ce69a748	- Allocate a small flowtable in ip_input.c (changeable by tuneable) - Use for accelerating ip_output	2009-04-19 04:44:05 +00:00
kmacy	84c8e0e755	s/void/void */	2009-04-16 23:02:56 +00:00
kmacy	f35899195a	restore spare pointers for MFCing	2009-04-16 22:47:43 +00:00
kmacy	24b38efdce	Change if_output to take a struct route as its fourth argument in order to allow passing a cached struct llentry * down to L2 Reviewed by: rwatson	2009-04-16 20:30:28 +00:00
kmacy	5767ab2087	- convert pspare pointers in inpcb to an llentry and rtentry cache - add flags to indicate their validity	2009-04-15 22:22:00 +00:00
kmacy	a895457456	- add second flags field to to inpcb - update comments in vflag	2009-04-15 22:09:42 +00:00
kmacy	d35e8eca03	provide additional convenience macros for inpcb locking (upgrade, downgrade, exclusive)	2009-04-15 21:39:56 +00:00
kmacy	af3258259b	make LLTABLE visible to netinet	2009-04-15 20:49:59 +00:00
kmacy	52b9562b83	add an llentry to struct route{_in6} to allow it to be passed around with the rtentry	2009-04-15 20:34:19 +00:00
rrs	1a40206ce3	Add missing address lock when we look at the ifa list	2009-04-14 19:20:27 +00:00
rrs	d2ba7ebed4	Move the flight size reduction to right after we recognize its a retransmit, ahead of the PR-SCTP work. Without this fix, we end up NOT reducing flight size and causing an miscalculation when PR-SCTP is active and data is skipped. Obtained from: Michael Tuexen.	2009-04-14 07:50:29 +00:00
rwatson	67227c12c0	Put TCPSTAT_ADD() and TCPSTAT_INC() behind _KERNEL. MFC after: 3 days	2009-04-12 21:28:35 +00:00
rwatson	2c81196a78	Update stats in struct carpstats using two new macros: CARPSTATS_ADD() and CARPSTATS_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structure. MFC after: 3 days	2009-04-12 14:19:37 +00:00
rwatson	0d601ec4b5	Update stats in struct pimstat using two new macros: PIMSTAT_ADD() and PIMSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structure. MFC after: 3 days	2009-04-12 14:06:26 +00:00
rwatson	7dde6eabe7	Update stats in struct mrtstat using two new macros: MRTSTAT_ADD() and MRTSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structure. MFC after: 3 days	2009-04-12 14:00:36 +00:00
rwatson	b6c3ef293c	Update stats in struct igmpstat using two new macros: IGMPSTAT_ADD() and IGMPSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures. MFC after: 3 days	2009-04-12 13:41:13 +00:00
rwatson	4801e9aee9	Update stats in struct icmpstat and icmp6stat using four new macros: ICMPSTAT_ADD(), ICMPSTAT_INC(), ICMP6STAT_ADD(), and ICMP6STAT_INC(), rather than directly manipulating the fields of these structures across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures. In on case, icmp6stat members are manipulated indirectly, by icmp6_errcount(), and this will require further work to fix for per-CPU stats. MFC after: 3 days	2009-04-12 13:22:33 +00:00
rwatson	5dc0256f14	Update stats in struct udpstat using two new macros, UDPSTAT_ADD() and UDPSTAT_INC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures. MFC after: 3 days	2009-04-12 11:42:40 +00:00
rwatson	692f8aa2fa	Update stats in struct ipstat using four new macros, IPSTAT_ADD(), IPSTAT_INC(), IPSTAT_SUB(), and IPSTAT_DEC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures. MFC after: 3 days	2009-04-11 23:35:20 +00:00
rwatson	b79ff9a30d	Update stats in struct tcpstat using two new macros, TCPSTAT_ADD() and TCPSTAT_INC(), rather than directly manipulating the fields across the kernel. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures. MFC after: 3 days	2009-04-11 22:07:19 +00:00
piso	dc27114478	What's the point of adjusting a checksum if we are going to toss the packet? Anticipate the check/return code.	2009-04-11 15:26:31 +00:00
piso	a36990c3db	Plug two bugs introduced with modules conversion: -UdpAliasIn(): correctly check return code after modules ran. -alias_nbt: in case of malformed packets (or some other unrecoverable error), toss the packet.	2009-04-11 15:19:09 +00:00
piso	e605b01ef4	Remove stale comments.	2009-04-11 15:05:19 +00:00
zec	b39b54e6de	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)	2009-04-11 05:58:58 +00:00
kmacy	fbd3646842	Import "flowid" support for serializing flows across transmit queues Reviewed by: rwatson and jeli	2009-04-10 06:16:14 +00:00
luigi	5c7675fccb	Add emulation of delay profiles, which lets you model various types of MAC overheads such as preambles, link level retransmissions and more. Note- this commit changes the userland/kernel ABI for pipes (but not for ordinary firewall rules) so you need to rebuild kernel and /sbin/ipfw to use dummynet features. Please check the manpage for details on the new feature. The MFC would be trivial but it breaks the ABI, so it will be postponed until after 7.2 is released. Interested users are welcome to apply the patch manually to their RELENG_7 tree. Work supported by the European Commission, Projects Onelab and Onelab2 (contract 224263).	2009-04-09 12:46:00 +00:00
rrs	1b546b35c3	Fix a FR bug. When doing PR-SCTP with number rtx set to a low number. The check for skipping was in the incorrect place. Which meant we would FR chunks we should not. MFC after: 1 Month	2009-04-08 12:52:05 +00:00
rrs	0d84cb3f8b	Add more padding and a new variable. This will help us be able to keep ABI compatibility between 8 and 9. MFC after: Never	2009-04-08 12:49:36 +00:00
piso	5551ee2e3d	-don't pass down, to module's fingerprint function, unused data like a pointer to the ip header. -style -spacing	2009-04-08 11:56:49 +00:00
bz	055adbbfb8	With the right comparison we get a proper wscale value and thus more adequate TCP performance with IPv6. Changes for IPv4, r166403 and r172795, both ignored the IPv6 counterpart and left it in the state of art of year 2000. The same logic in syncache already shares code between v4 and v6 so things do not need to be adapted there. Reported by: Steinar Haug (sthaug nethelp.no) Tested by: Steinar Haug (sthaug nethelp.no) MFC after: 3 days	2009-04-07 14:42:40 +00:00
zec	c85551e0bc	First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et. al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)	2009-04-06 22:29:41 +00:00
kan	6f5bda3790	If KTR_SUBSYS is compiled in, it does not necessarily mean that user is interested in being spammed by mcast-related printfs. Use proper check against ktr_mask instead KTR_COMPILE.	2009-04-05 23:25:06 +00:00
bms	d3a98ed39c	Fix mbuf chain layout pessimization: in the case where a single mbuf is allocated due to m_getcl() returning NULL, we already call MH_ALIGN, so do not increment m->m_data in this case. Found during MLDv2 port.	2009-04-04 15:32:23 +00:00
bms	b8d7b83e60	Do not obliterate QQI with MAXRESP. Found during MLDv2 port.	2009-04-04 15:26:32 +00:00
rrs	f72ef579b2	Many bug fixes (from the IETF hack-fest): - PR-SCTP had major issues when skipping through a multi-part message. o Did not look at socket buffer. o Did not properly handle the reassmebly queue. o The MARKED segments could interfere and un-skip a chunk causing a problem with the proper FWD-TSN. o No FR of FWD-TSN's was being done. - NR-Sack code was basically disabled. It needed fixes that never got into the real code. - CMT code had issues when the two paths were NOT the same b/w. We found a few small bugs, but also the critcal one here was not dividing the rwnd amongst the paths. Obtained from: Michael Tuexen and myself at the IETF hack-fest ;-)	2009-04-04 11:43:32 +00:00
piso	c9b4c10995	Implement an ipfw action to reassemble ip packets: reass.	2009-04-01 20:23:47 +00:00
bms	635596f1db	Don't call m_freem() after ip_output(), as it always consumes the mbuf chain provided to it. Found by: Pierre Guinoiseau	2009-03-24 01:22:12 +00:00
jmallett	4cae140c41	Remove local in6_addr variables for local and foreign addresses in sysctl_drop, they were passed uninitialized to in6_pcblookup_hash. Instead, do as is done for IPv4 and use the addresses within the sockaddr structure, which are correctly populated. This fixes tcpdrop(8) for IPv6 address pairs. Reviewed by: bz	2009-03-22 00:45:47 +00:00
bms	76ed51bd5a	Fix brainos introduced during mechanical KTR change. Pointy hat to: bms	2009-03-20 13:13:50 +00:00
bms	70bfeb48b5	Cleanup: Nuke debug.mrtdebug, and replace it with KTR.	2009-03-19 14:14:21 +00:00
bms	76f193cd69	Introduce a number of changes to the MROUTING code. This is purely a forwarding plane cleanup; no control plane code is involved. Summary: * Split IPv4 and IPv6 MROUTING support. The static compile-time kernel option remains the same, however, the modules may now be built for IPv4 and IPv6 separately as ip_mroute_mod and ip6_mroute_mod. * Clean up the IPv4 multicast forwarding code to use BSD queue and hash table constructs. Don't build our own timer abstractions when ratecheck() and timevalclear() etc will do. * Expose the multicast forwarding cache (MFC) and virtual interface table (VIF) as sysctls, to reduce netstat's dependence on libkvm for this information for running kernels. * bandwidth meters however still require libkvm. * Make the MFC hash table size a boot/load-time tunable ULONG, net.inet.ip.mfchashsize (defaults to 256). * Remove unused members from struct vif and struct mfc. * Kill RSVP support, as no current RSVP implementation uses it. These stubs could be moved to raw_ip.c. * Don't share locks or initialization between IPv4 and IPv6. * Don't use a static struct route_in6 in ip6_mroute.c. The v6 code is still using a cached struct route_in6, this is moved to mif6 for the time being. * More cleanup remains to be merged from ip_mroute.c to ip6_mroute.c. v4 path tested using ports/net/mcast-tools. v6 changes are mostly mechanical locking and have not been tested. As these changes partially break some kernel ABIs, they will not be MFCed. There is a lot more work to be done here. Reviewed by: Pavlin Radoslavov	2009-03-19 01:43:03 +00:00
bms	be7293a078	Comment IGMP_PIM as being very historic, as in, don't use.	2009-03-19 01:15:26 +00:00
bms	f2006dd38e	Deal with the case where ifma_protospec may be NULL, during any IPv4 multicast operations which reference it. There is a potential race because ifma_protospec is set to NULL when we discover the underlying ifnet has gone away. This write is not covered by the IF_ADDR_LOCK, and it's difficult to widen its scope without making it a recursive lock. It isn't clear why this manifests more quickly with 802.11 interfaces, but does not seem to manifest at all with wired interfaces. With this change, the 802.11 related panics reported by sam@ and cokane@ should go away. It is not the right fix, that requires more thought before 8.0. Idea from: sam Tested by: cokane	2009-03-17 14:41:54 +00:00
rwatson	70b6a8119c	Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced in FreeBSD 5.x to allow network device drivers to run with Giant despite the network stack being Giant-free. This significantly simplifies calls into ioctl() on network interfaces, especially in the multicast code, as well as eliminates deferred invocation of interface if_start routines. Disable the build on device drivers still depending on IFF_NEEDSGIANT as they no longer compile. They will be removed in a few weeks if they haven't been made MPSAFE in that time. Disabled drivers: if_ar if_axe if_aue if_cdce if_cue if_kue if_ray if_rue if_rum if_sr if_udav if_ural if_zyd Drivers that were already disabled because of tty changes: if_ppp if_sl Discussed on: arch@	2009-03-15 14:21:05 +00:00
rwatson	038bfe209e	Correct a number of evolved problems with inp_vflag and inp_flags: certain flags that should have been in inp_flags ended up in inp_vflag, meaning that they were inconsistently locked, and in one case, interpreted. Move the following flags from inp_vflag to gaps in the inp_flags space (and clean up the inp_flags constants to make gaps more obvious to future takers): INP_TIMEWAIT INP_SOCKREF INP_ONESBCAST INP_DROPPED Some aspects of this change have no effect on kernel ABI at all, as these are UDP/TCP/IP-internal uses; however, netstat and sockstat detect INP_TIMEWAIT when listing TCP sockets, so any MFC will need to take this into account. MFC after: 1 week (or after dependencies are MFC'd) Reviewed by: bz	2009-03-15 09:58:31 +00:00
rrs	723e44888d	Opps.. I missed a file on the commit :-)	2009-03-14 23:13:16 +00:00
das	58fce43140	Namespace: Defining htonl() and friends here instead of arpa/inet.h is a BSD extension.	2009-03-14 20:16:54 +00:00
rrs	f25b21ac98	Fixes several PR-SCTP releated bugs. - When sending large PR-SCTP messages over a lossy link we would incorrectly calculate the fwd-tsn - When receiving large multipart pr-sctp packets we would incorrectly send back a SACK that would renege improperly on already received packets thus causing unneeded retransmissions.	2009-03-14 13:42:13 +00:00
rwatson	fae6f1ab82	Add INP_INHASHLIST flag for inpcb->inp_flags to indicate whether or not the inpcb is currenty on various hash lookup lists, rather than using (lport != 0) to detect this. This means that the full 4-tuple of a connection can be retained after close, which should lead to more sensible netstat output in the window between TCP close and socket close. MFC after: 2 weeks	2009-03-11 00:29:22 +00:00
rwatson	f0bf25503d	Remove unused v6 macro aliases for inpcb fields: in6p_ip6_nxt in6p_vflag in6p_flags in6p_socket in6p_lport in6p_fport in6p_ppcb Remove unused v6 macro aliases for inpcb flags: IN6P_HIGHPORT IN6P_LOWPORT IN6P_ANONPORT IN6P_RECVIF IN6P_MTUDISC IN6P_FAITH IN6P_CONTROLOPTS References to in6p_lport and in6_fport in sockstat are also replaced with normal inp_lport and inp_fport references. MFC after: 3 days Reviewed by: bz	2009-03-10 17:57:41 +00:00
bms	b00c7265e0	Don't print inm_print() chatter when KTR_IGMPV3 is not enabled in the KTR_COMPILE mask. Found by: gnn	2009-03-10 17:48:49 +00:00
rwatson	d085935615	Remove now-unused INP_UNMAPPABLEOPTS. MFC after: 3 days Discussed with: bz	2009-03-10 11:04:19 +00:00
bms	4961c33f6b	Fix uninitialized use of ifp for ii. Found by: Peter Holm	2009-03-09 22:54:17 +00:00
bms	71233409ea	Merge IGMPv3 and Source-Specific Multicast (SSM) to the FreeBSD IPv4 stack. Diffs are minimized against p4. PCS has been used for some protocol verification, more widespread testing of recorded sources in Group-and-Source queries is needed. sizeof(struct igmpstat) has changed. __FreeBSD_version is bumped to 800070.	2009-03-09 17:53:05 +00:00
marius	74f63d4ce1	On architectures with strict alignment requirements compensate the misalignment of the IP header that prepending the EtherIP header might have caused. PR: 131921 MFC after: 1 week	2009-03-07 19:08:58 +00:00
rrs	41426e5a37	Fixes for window probes: 1) WP should never be marked unless flight size is 0 2) When recovering from wp if the peer ack's it we don't mark for retran 3) When recovering, we must assure a timer is still running.	2009-03-06 11:03:52 +00:00
rrs	f2254573b9	- PR-SCTP bug, where the CUM-ACK was not being updated into the advance_peer_ack point so we would incorrectly send a wrong value in the FWD-TSN - PR-SCTP bug, where an PR packet is used for a window probe which could incorrectly get the packet moved back into the send_queue, which will cause major issues and should not happen. - Fix a trace to use the proper macro.	2009-03-04 20:54:42 +00:00
bms	6370f947cc	In ip_output(), do not acquire the IN_MULTI_LOCK(), and do not attempt to perform a group lookup. This is a socket layer lock, and the bottom half of IP really has no business taking it. Use the value of the in_mcast_loop sysctl to determine if we should loop back by default, in the absence of any multicast socket options. Because the check on group membership is now deferred to the input path, an m_copym() is now required. This should increase multicast send performance where the source has not requested loopback, although this has not been benchmarked or measured. It is also a necessary change for IN_MULTI_LOCK to become non-recursive, which is required in order to implement IGMPv3 in a thread-safe way.	2009-03-04 03:45:34 +00:00
bms	ea92a55057	Add sysctl net.inet.ip.mcast.loop. This controls whether or not IPv4 multicast sends are looped back to senders by default on a stack-wide basis, rather than relying on the socket option. Note that the sysctl only applies to newly created multicast sockets.	2009-03-04 03:40:02 +00:00
bms	24af7b630b	Merge header file definitions used by the new IGMPv3 implementation. This is a partial merge. Compatibility defines are retained for the existing IGMPv2 implementation.	2009-03-04 03:22:03 +00:00
bms	28b42e9e29	Add various defines/macros required by IGMPv3: * MCAST_UNDEFINED state. * in_allhosts() macro (group is 224.0.0.1). This uses a const endian comparison. * IP_MAX_GROUP_SRC_FILTER, IP_MAX_SOCK_SRC_FILTER default resource limits.	2009-03-04 03:01:05 +00:00
bms	0bcee3a566	Add function ip_checkrouteralert(), which will be used by IGMPv3 to check for the IPv4 Router Alert [RFC2113] option in a pulled-up IP mbuf chain.	2009-03-04 02:51:22 +00:00
bz	59d53a5bdb	Start removing IPv6 Type 0 Routing header code. RH0 was deprecated by RFC 5095. While most of the code had been disabled by #if 0 already, leave a bit of infrastructure for possible RH2 code and a log message under BURN_BRIDGES in case a user still tries to send RH0 packets. Reviewed by: gnn (a bit back, earlier version)	2009-03-03 13:12:12 +00:00
luigi	12e34d9850	curr_time is a 64 bit variable so SYSCTL_LONG is not appropriate as a handler. The variable was exported only for debugging, but there is little reason to do it now that the timekeeping is supported by various other variables. For the time being just comment out the sysctl, but I think this should go away.	2009-03-02 22:16:50 +00:00
luigi	6580dcca60	fw_debug has been unused for ages, so remove it from the list of sysctl_variables. I would also remove it from the VNET record but I am unsure if there is any ABI issue -- so for the time being just mark it as unused in ip_fw.h, and then we will collect the garbage at some appropriate time in the future. MFC after: 3 days	2009-03-02 22:11:48 +00:00
bz	4321e2a8f4	Add size-guards evaluated at compile-time to the main struct vnet_* which are not in a module of their own like gif. Single kernel compiles and universe will fail if the size of the struct changes. Th expected values are given in sys/vimage.h. See the comments where how to handle this. Requested by: peter	2009-03-01 11:01:00 +00:00
rwatson	ac5e328817	Remove unreachable code for generating RST segments from tcp_twcheck(); this code became stale when T/TCP support was removed. Discussed with: bz, sam MFC after: 1 month	2009-02-28 22:58:52 +00:00
rrs	57c49bea03	Fix the add stream feature of strm-reset to really work: - Fix the copy, we can't do a blind copy but must transfer the data from the old to the new. - Fix the ACK processing so we properly stop retransmitting the thing. - Fix it so if we get a retran we will properly reply with the saved response without doing anything. MFC after: 1 month	2009-02-27 20:54:45 +00:00
bz	df2be82cec	For all files including net/vnet.h directly include opt_route.h and net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.	2009-02-27 14:12:05 +00:00
rdivacky	e5bfcba080	Change the functions to ANSI in those cases where it breaks promotion to int rule. See ISO C Standard: SS6.7.5.3:15. Approved by: kib (mentor) Reviewed by: warner Tested by: silence on -current	2009-02-24 18:09:31 +00:00
rwatson	313ce0df5b	In tcp_usr_shutdown() and tcp_usr_send(), I missed converting NULL checks for the tcpcb, previously used to detect complete disconnection, with INP_DROPPED checks. Correct that, preventing shutdown() from improperly generating a TCP segment with destination IP and port of 0.0.0.0:0. PR: kern/132050 Reported by: david gueluy <david.gueluy at netasq.com> MFC after: 3 weeks	2009-02-24 11:17:50 +00:00
rwatson	73b78f4e11	In in_rtqkill(), assert the radix head lock, and pass RTF_RNH_LOCKED to in_rtrequest(); the radix head lock is already acquired before rnh_walktree is called in in_rtqtimo_one(). This avoids a recursive acquisition that is no longer permitted in 8.x due to use of an rwlock for the radix head lock. Reported by: dikshie <dikshie at gmail.com> MFC after: 3 days	2009-02-23 22:57:55 +00:00
rrs	fedf855989	Add the add-stream capability. Still needs more testing.. MFC after: 1 month	2009-02-20 15:03:54 +00:00
rrs	ced5185c01	Fix a bug. The sending was being restricted improperly by the max_burst. It should only be gated by cwnd in the lower level send. Obtained from: Michael Tuexen MFC after: 1 week.	2009-02-20 14:33:45 +00:00
luigi	18ef459d66	correct some #include	2009-02-16 15:10:51 +00:00
luigi	0a0d3f132a	remove dependency on eventhandler.h, we only need a forward declaration	2009-02-16 15:08:41 +00:00
luigi	d5904787c0	remove dependency on net/if.h of this header	2009-02-16 15:07:40 +00:00
luigi	e8429cbcf1	use a const format string in the log message so we can check the arguments (if/when we enable those checks)	2009-02-16 12:09:52 +00:00
luigi	fe66a1d80c	remove unnecessary #include from vnet.h and vinet.h Approved by: Marko Zec	2009-02-15 00:28:28 +00:00
rrs	d22f1350d2	This commit fixes the issue with alias_sctp.c. No longer do we require SCTP to be in the kernel for the lib to be able to handle SCTP. We do this by moving the CRC32c checksum into libkern/crc32.c and then adjusting all routines to use the common methods. Note that this will improve the performance of iSCSI since they were using the old single 256 bit table lookup versus the slicing 8 algorithm (which gives a 4x speed up in CRC32c calculation :-D) Reviewed by:rwatson, gnn, scottl, paolo MFC after: 4 week? (assuming we MFC the alias_sctp changes)	2009-02-14 11:34:57 +00:00
rrs	968b33ff72	Have the jail code use the error returned to pass not constant errors. Obtained from: jamie@freebsd.org	2009-02-13 18:44:30 +00:00
luigi	98d62453db	remove unnecessary #include, and document some of the others	2009-02-13 15:37:14 +00:00
luigi	f34bfbb655	Use uint32_t instead of n_long and n_time, and uint16_t instead of n_short. Add a note next to fields in network format. The n_* types are not enough for compiler checks on endianness, and their use often requires an otherwise unnecessary #include <netinet/in_systm.h> The typedef in in_systm.h are still there.	2009-02-13 15:14:43 +00:00
rrs	a15ded2637	Move the new rwnd field down to the very end of the xsctp structure. This is where all new fields belong (not that we will be ABI compatiable with 7.x anyway.. sigh).	2009-02-13 14:43:46 +00:00
rrs	b885a20f24	Add padding to then end of the xsctp_xxx structures to allow future changes to be able to maintain ABI compatibility	2009-02-09 17:37:17 +00:00
rrs	e7e0acd223	Fix minor spacing problem found by s9indent from last commit.	2009-02-09 11:42:23 +00:00
rrs	b2d67dbfe6	Fix INET only build breakage with SCTP - pointy hat to me :-)	2009-02-09 11:41:54 +00:00
bz	8d30abae87	Try to remove/assimilate as much of formerly IPv4/6 specific (duplicate) code in sys/netipsec/ipsec.c and fold it into common, INET/6 independent functions. The file local functions ipsec4_setspidx_inpcb() and ipsec6_setspidx_inpcb() were 1:1 identical after the change in r186528. Rename to ipsec_setspidx_inpcb() and remove the duplicate. Public functions ipsec[46]_get_policy() were 1:1 identical. Remove one copy and merge in the factored out code from ipsec_get_policy() into the other. The public function left is now called ipsec_get_policy() and callers were adapted. Public functions ipsec[46]_set_policy() were 1:1 identical. Rename file local ipsec_set_policy() function to ipsec_set_policy_internal(). Remove one copy of the public functions, rename the other to ipsec_set_policy() and adapt callers. Public functions ipsec[46]_hdrsiz() were logically identical (ignoring one questionable assert in the v6 version). Rename the file local ipsec_hdrsiz() to ipsec_hdrsiz_internal(), the public function to ipsec_hdrsiz(), remove the duplicate copy and adapt the callers. The v6 version had been unused anyway. Cleanup comments. Public functions ipsec[46]_in_reject() were logically identical apart from statistics. Move the common code into a file local ipsec46_in_reject() leaving vimage+statistics in small AF specific wrapper functions. Note: unfortunately we already have a public ipsec_in_reject(). Reviewed by: sam Discussed with: rwatson (renaming to *_internal) MFC after: 26 days X-MFC: keep wrapper functions for public symbols?	2009-02-08 09:27:07 +00:00
piso	6ae5f1f310	Silent LINT: add 2 stubs (update_crc32 and sctp_finalize_crc32) to fix LIBALIAS + SCTP_NO_CSUM case.	2009-02-08 03:03:55 +00:00
piso	d34fad2923	Add SCTP NAT support. Submitted by: CAIA (http://caia.swin.edu.au)	2009-02-07 18:49:42 +00:00
jamie	bbcda547da	Remove redundant calls of prison_local_ip4 in in_pcbbind_setup, and of prison_local_ip6 in in6_pcbbind. Approved by: bz (mentor)	2009-02-05 14:25:53 +00:00
jamie	12bbe1869f	Standardize the various prison_foo_ip[46] functions and prison_if to return zero on success and an error code otherwise. The possible errors are EADDRNOTAVAIL if an address being checked for doesn't match the prison, and EAFNOSUPPORT if the prison doesn't have any addresses in that address family. For most callers of these functions, use the returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or EINVAL. Always include a jailed() check in these functions, where a non-jailed cred always returns success (and makes no changes). Remove the explicit jailed() checks that preceded many of the function calls. Approved by: bz (mentor)	2009-02-05 14:06:09 +00:00
rrs	51b2f38816	LOR fix - Lock only when calling the actual code that is messing with the UDP tunnel. This means that if two users actually tried to change the tunnel port at the same time interesting things COULD result, but its probably very unlikely to happen :-)	2009-02-03 20:33:28 +00:00
rrs	520c389cb4	- Cleanup checksum code. - Prepare for CRC offloading, add MIB counters (RS/MT). - Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT). - Bugfix: Handle close() with SO_LINGER correctly when notifications are generated during the close() call(MT). - Bugfix: Generate DRY event when sender is dry during subscription. Only for 1-to-1 style sockets (RS/MT) - Bugfix: Put vtags for the correct amount of time into time-wait (MT). - Bugfix: Clear vtag entries correctly on expiration (MT). - Bugfix: shutdown() indicates ENOTCONN when called for unconnected 1-to-1 style sockets (MT). - Bugfix: In sctp Auth code (PL). - Add support for devices that support SCTP csum offload (igb). - Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS) Obtained from: With help from Peter Lei and Michael Tuexen	2009-02-03 11:04:03 +00:00
rrs	2688c69180	Adds support for SCTP checksum offload. This means we, like TCP and UDP, move the checksum calculation into the IP routines when there is no hardware support we call into the normal SCTP checksum routine. The next round of SCTP updates will use this functionality. Of course the IGB driver needs a few updates to support the new intel controller set that actually does SCTP csum offload too. Reviewed by: gnn, rwatson, kmacy	2009-02-03 11:00:43 +00:00
luigi	5c21ddfc6f	initialize a couple of variables, gcc 4.2.4-4 (linux) reports some possible uninitialized uses and the warning does make sense.	2009-01-28 13:39:01 +00:00
luigi	b179ef61df	For some reason (probably dating ages ago) an #ifdef SYSCTL_NODE / #endif section included a lot of stuff that did not belong there. So split the block in multiple components each around the relevant stuff. This said, I wonder if building a kernel where SYSCTL_NODE is not defined is supported at all. Submitted by: Marta Carbone	2009-01-28 13:11:22 +00:00
bz	6dddd78341	For consistency with prison_{local,remote,check}_ipN rename prison_getipN to prison_get_ipN. Submitted by: jamie (as part of a larger patch) MFC after: 1 week	2009-01-25 10:11:58 +00:00
bz	44ed405ccd	Add externs to fix build with VIMAGE_GLOBALS after r187289.	2009-01-22 10:29:09 +00:00
sam	b278e68100	remove too noisy DIAGNOSTIC code Reviewed by: qingli	2009-01-18 07:20:02 +00:00
piso	a5f2b191e0	Silent userland warnings about missing prototypes. Submitted by: Roman Divacky <rdivacky@freebsd.org>	2009-01-15 19:35:23 +00:00
lstewart	d5deb43d0f	Add TCP Appropriate Byte Counting (RFC 3465) support to kernel. The new behaviour is on by default, and can be disabled by setting the net.inet.tcp.rfc3465 sysctl to 0 to obtain previous behaviour. The patch changes struct tcpcb in sys/netinet/tcp_var.h which breaks the ABI. Bump __FreeBSD_version to 800061 accordingly. User space tools that rely on the size of struct tcpcb (e.g. sockstat) need to be recompiled. Reviewed by: rpaulo, gnn Approved by: gnn, kmacy (mentors) Sponsored by: FreeBSD Foundation	2009-01-15 06:44:22 +00:00
rwatson	903b355d85	Since we allow conditional allocation of labels on syncache entries, remove historic assertion that labels are always present.	2009-01-11 20:01:43 +00:00
bz	ffd2421407	Restrict arp, ndp and theoretically the FIB listing (if not read with libkvm) to the addresses of a prison, when inside a jail. [1] As the patch from the PR was pre-'new-arp', add checks to the llt_dump handlers as well. While touching RTM_GET in route_output(), consistently use curthread credentials rather than the creds from the socket there. [2] PR: kern/68189 Submitted by: Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1] Discussed with: rwatson [2] Reviewed by: rwatson MFC after: 4 weeks	2009-01-09 21:57:49 +00:00
adrian	5500b82820	Fix fat-fingered comment. Noticed-by: julian	2009-01-09 18:38:57 +00:00
adrian	2b9fd92371	Fix indentation; add FALLTHROUGH. Thanks Max!	2009-01-09 17:21:22 +00:00
adrian	24f22ace02	Better comment what the socket option does. Thanks to Sam Leffler for suggesting this.	2009-01-09 17:18:17 +00:00
adrian	3d9557d3bd	Comment some potentially confusing logic. Nitpicking by: mlaier MFC after: 2 weeks	2009-01-09 17:16:18 +00:00
adrian	e2eee65f21	Implement a new IP option (not compiled/enabled by default) to allow applications to specify a non-local IP address when bind()'ing a socket to a local endpoint. This allows applications to spoof the client IP address of connections if (obviously!) they somehow are able to receive the traffic normally destined to said clients. This patch doesn't include any changes to ipfw or the bridging code to redirect the client traffic through the PCB checks so TCP gets a shot at it. The normal behaviour is that packets with a non-local destination IP address are not handled locally. This can be dealth with some IPFW hackery; modifications to IPFW to make this less hacky will occur in subsequent commmits. Thanks to Julian Elischer and others at Ironport. This work was approved and donated before Cisco acquired them. Obtained from: Julian Elischer and others MFC after: 2 weeks	2009-01-09 16:02:19 +00:00
bz	60c950d4ff	Make SIOCGIFADDR and related, as well as SIOCGIFADDR_IN6 and related jail-aware. Up to now we returned the first address of the interface for SIOCGIFADDR w/o an ifr_addr in the query. This caused problems for programs querying for an address but running inside a jail, as the address returned usually did not belong to the jail. Like for v6, if there was an ifr_addr given on v4, you could probe for more addresses on the interfaces that you were not allowed to see from inside a jail. Return an error (EADDRNOTAVAIL) in that case now unless the address is on the given interface and valid for the jail. PR: kern/114325 Reviewed by: rwatson MFC after: 4 weeks	2009-01-09 13:06:56 +00:00
harti	d5349ad3cd	Set a minimum of information in the routing message (like version and type) so that generic routing message parsing code can parse the messages for L2 info that are retrieved via the sysctl interface.	2009-01-09 10:58:59 +00:00
rrs	fcaf24fb54	Addresses Roberts comments on comments. Also adds the KASSERT and checks suggested. Reviewed by: The udp tunneling was discussed on net@ under the thread entitled "Heads up -- Thinking about UDP and tunneling"	2009-01-06 13:27:56 +00:00
rrs	8bff422255	Add the ability of an alternate transport protocol to easily tunnel over udp by providing a hook function that will be called instead of appending to the socket buffer.	2009-01-06 12:13:40 +00:00
rwatson	b56f7e98e0	Allow the IP_MINTTL socket option to be set to 0 so that it can be disabled entirely, which is its default state before set to a non-zero value. PR: 128790 Submitted by: Nick Hilliard <nick at foobar dot org> MFC after: 3 weeks	2009-01-03 11:35:31 +00:00
qingli	efe3f87721	Some modules such as SCTP supplies a valid route entry as an input argument to ip_output(). The destionation is represented in a sockaddr{} object that may contain other pieces of information, e.g., port number. This same destination sockaddr{} object may be passed into L2 code, which could be used to create a L2 entry. Since there exists a L2 table per address family, the L2 lookup function can make address family specific comparison instead of the generic bcmp() operation over the entire sockaddr{} structure. Note in the IPv6 case the sin6_scope_id is not compared because the address is currently stored in the embedded form inside the kernel. The in6_lltable_lookup() has to account for the scope-id if this storage format were to change in the future.	2009-01-03 00:27:28 +00:00
bz	e46dfc4a18	For consistency use LLE_IS_VALID() in this 4th place that is actually interested in the (void *)-1 return value hack. This way we can easily identify those special parts of the code.	2008-12-28 21:18:01 +00:00
piso	8af3d78dd2	Import sctp nat support in ipfw obtained from CAIA - http://caia.swin.edu.au .	2008-12-28 17:16:32 +00:00
qingli	1d851edfc0	This checkin addresses a couple of issues: 1. The "route" command allows route insertion through the interface-direct option "-iface". During if_attach(), an sockaddr_dl{} entry is created for the interface and is part of the interface address list. This sockaddr_dl{} entry describes the interface in detail. The "route" command selects this entry as the "gateway" object when the "-iface" option is present. The "arp" and "ndp" commands also interact with the kernel through the routing socket when adding and removing static L2 entries. The static L2 information is also provided through the "gateway" object with an AF_LINK family type, similar to what is provided by the "route" command. In order to differentiate between these two types of operations, a RTF_LLDATA flag is introduced. This flag is set by the "arp" and "ndp" commands when issuing the add and delete commands. This flag is also set in each L2 entry returned by the kernel. The "arp" and "ndp" command follows a convention where a RTM_GET is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills in the fields for a "rtm" object, which is reinjected into the kernel by a subsequent RTM_ADD/DELETE command. The entry returend from RTM_GET is a prefix route, so the RTF_LLDATA flag must be specified when issuing the RTM_ADD/DELETE messages. 2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the specification for retrieving L2 information. Also optimized the code logic. Reviewed by: julian	2008-12-26 19:45:24 +00:00
kmacy	bc73e3c369	Fix missed unlock and reference drop of lle Found by: pho	2008-12-24 05:31:26 +00:00
bz	46b52a2d38	Remove long unused netinet/ipprotosw.h (basically since r82884). Discussed with: rwatson MFC after: 4 weeks	2008-12-23 16:52:03 +00:00
qingli	95b2457edf	Don't create a bogus ARP entry for 0.0.0.0.	2008-12-23 03:33:32 +00:00
qingli	e1a252e123	The proxy-arp code was broken and responds to ARP requests for addresses that are not proxied locally.	2008-12-19 11:07:34 +00:00
bz	b1db56aa98	Another step assimilating IPv[46] PCB code: normalize IN6P_* compat flags usage to their equialent INP_* counterpart. Discussed with: rwatson Reviewed by: rwatson MFC after: 4 weeks	2008-12-17 13:00:18 +00:00
bz	ea0d9d2e9a	Use inc_flags instead of the inc_isipv6 alias which so far had been the only flag with random usage patterns. Switch inc_flags to be used as a real bit field by using INC_ISIPV6 with bitops to check for the 'isipv6' condition. While here fix a place or two where in case of v4 inc_flags were not properly initialized before.[1] Found by: rwatson during review [1] Discussed with: rwatson Reviewed by: rwatson MFC after: 4 weeks	2008-12-17 12:52:34 +00:00
kmacy	ea41cce94c	default to doing lla_lookup with shared afdata lock and returning a shared lock on the lle - thus restoring parallel performance to pre-arpv2 level	2008-12-17 00:14:28 +00:00
rwatson	662640d1ee	IPFW's pfil hook/unhook code ignores the return values of pfil_add_hook() and pfil_remove_hook(), so cast them to (void). MFC after: pretty soon	2008-12-16 15:05:35 +00:00
kmacy	bf1d63fc61	ipfw doesn't use the radix node head lock to protect the radix tree - remove acquisition	2008-12-16 11:06:30 +00:00
kmacy	9be993627e	check pointer against NULL add new line after declaration for style	2008-12-16 03:18:59 +00:00
kmacy	81140652bf	don't unlock lle if it is NULL	2008-12-16 02:48:12 +00:00
kmacy	c9eebde165	unlock and destroy an llentry's lock before freeing Found by: sam	2008-12-16 00:20:49 +00:00
bz	03f6bb9dc9	Another step assimilating IPv[46] PCB code - directly use the inpcb names rather than the following IPv6 compat macros: in6pcb,in6p_sp, in6p_ip6_nxt,in6p_flowinfo,in6p_vflag, in6p_flags,in6p_socket,in6p_lport,in6p_fport,in6p_ppcb and sotoin6pcb(). Apart from removing duplicate code in netipsec, this is a pure whitespace, not a functional change. Discussed with: rwatson Reviewed by: rwatson (version before review requested changes) MFC after: 4 weeks (set the timer and see then)	2008-12-15 21:50:54 +00:00
qingli	ec826ad5c7	This main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion	2008-12-15 06:10:57 +00:00
bz	f7adea0f1e	Add a check, that is currently under discussion for 8 but that we need to keep for 7-STABLE when MFCing in_pcbladdr() to not change the behaviour there. With this a destination route via a loopback interface is treated as a valid and reachable thing for IPv4 source address selection, even though nothing of that network is ever directly reachable, but it is more like a blackhole route. With this the source address will be selected and IPsec can grab the packets before we would discard them at a later point, encapsulate them and send them out from a different tunnel endpoint IP. Discussed on: net Reported by: Frank Behrens <frank@harz.behrens.de> Tested by: Frank Behrens <frank@harz.behrens.de> MFC after: 4 weeks (just so that I get the mail)	2008-12-14 17:47:33 +00:00
bz	ef0e31ac36	De-virtualize the MD5 context for TCP initial seq number generation and make it a function local variable like we do almost everywhere inside the kernel. Discussed with: rwatson, silby MFC after: 4 weeks	2008-12-13 21:59:18 +00:00
kmacy	6e26928458	version that will compile	2008-12-13 20:34:41 +00:00
kmacy	658f51b3ec	radix node head lock needs to be held when calling rnh_addaddr	2008-12-13 20:18:05 +00:00
kmacy	2a8fe85d73	don't acquire lock recursively	2008-12-13 20:16:03 +00:00
bz	98e7fe0e6a	Second round of putting global variables, which were virtualized but formerly missed under VIMAGE_GLOBAL. Put the extern declarations of the virtualized globals under VIMAGE_GLOBAL as the globals themsevles are already. This will help by the time when we are going to remove the globals entirely. Sponsored by: The FreeBSD Foundation	2008-12-13 19:13:03 +00:00
bz	83a32f8750	Put a global variables, which were virtualized but formerly missed under VIMAGE_GLOBAL. Start putting the extern declarations of the virtualized globals under VIMAGE_GLOBAL as the globals themsevles are already. This will help by the time when we are going to remove the globals entirely. While there garbage collect a few dead externs from ip6_var.h. Sponsored by: The FreeBSD Foundation	2008-12-11 16:26:38 +00:00

1 2 3 4 5 ...

3515 Commits