freebsd-skq

Author	SHA1	Message	Date
kevlo	2b7da5612e	If RTF_HOST flag is specified, then we are interested in destination address. PR: kern/159600 Submitted by: Svatopluk Kraus <onwahe at gmail dot com> Approved by: re (hrs)	2011-08-10 06:17:06 +00:00
zec	99a0b299b3	Permit ARP to proceed for IPv4 host routes for which the gateway is the same as the host address. This already works fine for INET6 and ND6. While here, remove two function pointers from struct lltable which are only initialized but never used. MFC after: 3 days	2011-07-08 09:38:33 +00:00
qingli	d7fa164fc4	Supply the LLE_STATIC flag bit to in_ifscrub() when scrubbing interface address so that proper clean up will take place in the routing code. This patch fixes the bootp panic on startup problem. Also, added more error handling and logging code in function in_scrubprefix(). MFC after: 5 days	2011-05-29 02:21:35 +00:00
qingli	a1bf1a2582	The statically configured (permanent) ARP entries are removed when an interface is brought down, even though the interface address is still valid. This patch maintains the permanent ARP entries as long as the interface address (having the same prefix as that of the ARP entries) is valid. Reviewed by: delphij MFC after: 5 days	2011-05-20 19:12:20 +00:00
pluknet	1231af61c3	Reference ifaddr object before unlocking as it can be freed from another context at the moment of later access. PR: kern/155555 Submitted by: Andrew Boyer <aboyer att averesystems.com> Approved by: avg (mentor) MFC after: 2 weeks	2011-03-21 14:19:40 +00:00
glebius	bdd7d886f9	Use time_uptime instead of non-monotonic time_second to drive ARP timeouts. Suggested by: bde	2010-11-30 15:57:00 +00:00
dim	fb307d7d1d	After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.	2010-11-22 19:32:54 +00:00
dim	fda4020a88	Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.	2010-11-14 20:38:11 +00:00
gnn	c3225b5eaa	Add a queue to hold packets while we await an ARP reply. When a fast machine first brings up some non-TCP networking program it is quite possible that we will drop packets due to the fact that only one packet can be held per ARP entry. This leads to packets being missed when a program starts or restarts if the ARP data is not currently in the ARP cache. This code adds a new sysctl, net.link.ether.inet.maxhold, which defines a system wide maximum number of packets to be held in each ARP entry. Up to maxhold packets are queued until an ARP reply is received or the ARP times out. The default setting is the old value of 1 which has been part of the BSD networking code since time immemorial. Expose the time we hold an incomplete ARP entry by adding the sysctl net.link.ether.inet.wait, which defaults to 20 seconds, the value used when the new ARP code was added. Reviewed by: bz, rpaulo MFC after: 3 weeks	2010-11-12 22:03:02 +00:00
bz	afcba250aa	MfP4 CH182763 (original version): Make it harder to exploit certain in_control() related races between the initial lookup at the beginning and the time we will remove the entry from the lists by re-checking that entry is still in the list before trying to remove it. (*) It is believed that with the current code and locking strategy we cannot completely fix all races. Reported by: Nima Misaghian (nima_misa hotmail.com) on net@ 20100817 Tested by: Nima Misaghian (nima_misa hotmail.com) (original version) PR: kern/146250 Submitted by: Mikolaj Golub (to.my.trociny gmail.com) (different version) MFC after: 1 week	2010-10-16 19:53:22 +00:00
bz	ef87b56ba4	In case of RADIX_MPATH do not leak the IN_IFADDR read lock on early return. MFC after: 3 days	2010-09-04 16:06:01 +00:00
will	aa4e762c4a	Allow carp(4) to be loaded as a kernel module. Follow precedent set by bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack. Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure. This commit requires IP6PROTOSPACER, introduced in r211115. Reviewed by: bz, simon Approved by: ken (mentor) MFC after: 2 weeks	2010-08-11 00:51:50 +00:00
qingli	f6ab4a6810	This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface. MFC after: 3 days	2010-05-25 20:42:35 +00:00
bz	0a90ef1728	MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitespace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOBALS (before r185088) and try to reduce the diff to stable/7 and earlier as much as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days	2010-04-29 11:52:42 +00:00
bz	d7a91dc6bf	Plug reference leaks in the link-layer code ("new-arp") that previously prevented the link-layer entry from being freed. In both in.c and in6.c (though that code path seems to be basically dead) plug a reference leak in case of a pending callout being drained. In if_ether.c consistently add a reference before resetting the callout and in case we canceled a pending one remove the reference for that. In the final case in arptimer, before freeing the expired entry, remove the reference again and explicitly call callout_stop() to clear the active flag. In nd6.c:nd6_free() we are only ever called from the callout function and thus need to remove the reference there as well before calling into llentry_free(). In if_llatbl.c when freeing entire tables make sure that in case we cancel a pending callout to remove the reference as well. Reviewed by: qingli (earlier version) MFC after: 10 days Problem observed, patch tested by: simon on ipv6gw.f.o, Christian Kratzer (ck cksoft.de), Evgenii Davidov (dado korolev-net.ru) PR: kern/144564 Configurations still affected: with options FLOWTABLE	2010-04-11 16:04:08 +00:00
qingli	93013817b0	One of the advantages of enabling ECMP (a.k.a RADIX_MPATH) is to allow for connection load balancing across interfaces. Currently the address alias handling method is colliding with the ECMP code. For example, when two interfaces are configured on the same prefix, only one prefix route is installed. So connection load balancing among the available interfaces is not possible. The other advantage of ECMP is for failover. The issue with the current code, is that the interface link-state is not reflected in the route entry. For example, if there are two interfaces on the same prefix, the cable on one interface is unplugged, new and existing connections should switch over to the other interface. This is not done today and packets go into a black hole. Also, there is a small bug in the kernel where deleting ECMP routes in the userland will always return an error even though the command is successfully executed. MFC after: 5 days	2010-03-09 01:11:45 +00:00
qingli	4d8ba24be3	Some of the existing ppp and vpn related scripts create and set the IP addresses of the tunnel end points to the same value. In these cases the loopback route is not installed for the local end. Verified by: avg MFC after: 5 days	2010-02-02 20:38:30 +00:00
qingli	316634c7ad	Ensure an address is removed from the interface address list when the installation of that address fails. PR: 139559	2010-01-08 17:49:24 +00:00
qingli	0897bcc8ad	Consolidate the route message generation code for when address aliases were added or deleted. The announced route entry for an address alias is no longer empty because this empty route entry was causing some route daemon to fail and exit abnormally. MFC after: 5 days	2009-12-30 22:13:01 +00:00
qingli	ed965a92bc	The proxy arp entries could not be added into the system over the IFF_POINTOPOINT link types. The reason was that the routing entry returned from the kernel covering the remote end is of an interface type that does not support ARP. This patch fixes this problem by providing a hint to the kernel routing code, which indicates the prefix route instead of the PPP host route should be returned to the caller. Since a host route to the local end point is also added into the routing table, and there could be multiple such instantiations because multiple PPP links can be created with the same local end IP address, this patch also fixes the loopback route installation failure problem observed prior to this patch. The reference count of loopback route to local end would be either incremented or decremented. The first instantiation would create the entry and the last removal would delete the route entry. MFC after: 5 days	2009-12-30 21:35:34 +00:00
qingli	c96d27ad80	Use the correct option name in the preprocessor command to enable or disable diagnostic messages. Reviewed by: ru MFC after: 3 days	2009-10-23 18:27:34 +00:00
qingli	7d73ff246e	This patch fixes the following issues in the ARP operation: 1. There is a regression issue in the ARP code. The incomplete ARP entry was timing out too quickly (1 second timeout), as such, a new entry is created each time arpresolve() is called. Therefore the maximum attempts made is always 1. Consequently the error code returned to the application is always 0. 2. Set the expiration of each incomplete entry to a 20-second lifetime. 3. Return "incomplete" entries to the application. Reviewed by: kmacy MFC after: 3 days	2009-10-15 06:12:04 +00:00
qingli	a1020b6c31	Remove a log message from production code. This log message can be triggered by a misconfigured host that is sending out gratuitous ARPs. This log message can also be triggered during a network renumbering event when multiple prefixes co-exist on a single network segment. MFC after: immediately	2009-10-02 01:45:11 +00:00
qingli	81ff2327e3	Previously, if an address alias is configured on an interface, and this address alias has a prefix matching that of another address configured on the same interface, then the ARP entry for the alias is not deleted from the ARP table when that address alias is removed. This patch fixes the aforementioned issue. PR: kern/139113 MFC after: 3 days	2009-10-02 01:34:55 +00:00
qingli	3a82e44273	Self pointing routes are installed for configured interface addresses and address aliases. After an interface is brought down and brought back up again, those self pointing routes disappeared. This patch ensures after an interface is brought back up, the loopback routes are reinstalled properly. Reviewed by: bz MFC after: immediately	2009-09-15 19:18:34 +00:00
qingli	d073699112	The bootp code installs an interface address and the nfs client module tries to install the same address again. This extra code is removed, which was discovered by the removal of a call to in_ifscrub() in r196714. This call to in_ifscrub is put back here because the SIOCAIFADDR command can be used to change the prefix length of an existing alias. Reviewed by: kmacy	2009-09-15 01:01:03 +00:00
np	ba75578c03	Add arp_update_event. This replaces route_arp_update_event, which has not worked since the arp-v2 rewrite. The event handler will be called with the llentry write-locked and can examine la_flags to determine whether the entry is being added or removed. Reviewed by: gnn, kmacy Approved by: gnn (mentor) MFC after: 1 month	2009-09-08 21:17:17 +00:00
qingli	58b42709a0	This patch fixes the following issues: - Routing messages are not generated when adding and removing interface address aliases. - Loopback route installed for an interface address alias is not deleted from the routing table when that address alias is removed from the associated interface. - Function in_ifscrub() is called extraneously. Reviewed by: gnn, kmacy, sam MFC after: 3 days	2009-08-31 21:02:48 +00:00
rwatson	544dfa0789	Use locks specific to the lltable code, rather than borrow the ifnet list/index locks, to protect link layer address tables. This avoids lock order issues during interface teardown, but maintains the bug that sysctl copy routines may be called while a non-sleepable lock is held. Reviewed by: bz, kmacy MFC after: 3 days	2009-08-25 09:52:38 +00:00
rwatson	ef8d755d4d	Rework global locks for interface list and index management, correcting several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stabilize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days	2009-08-23 20:40:19 +00:00
rwatson	fb9ffed650	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
qingli	8c1899d934	This patch does the following: - Allow loopback route to be installed for address assigned to interface of IFF_POINTOPOINT type. - Install loopback route for an IPv4 interface address when the "useloopback" sysctl variable is enabled. Similarly, install loopback route for an IPv6 interface address when the sysctl variable "nd6_useloopback" is enabled. Deleting loopback routes for interface addresses is unconditional in case these sysctl variables were disabled after an interface address has been assigned. Reviewed by: bz Approved by: re	2009-07-27 17:08:06 +00:00
rwatson	88f8de4d40	Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)	2009-07-16 21:13:04 +00:00
rwatson	57ca4583e7	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
rwatson	ea70a3542d	Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the in_ifaddrhead and INADDR_HASH address lists. Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently). For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists. MFC after: 6 weeks Reviewed by: bz	2009-06-25 11:52:33 +00:00
rwatson	c9ef486fe1	Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)	2009-06-23 20:19:09 +00:00
rwatson	1f7e54e8c5	Clean up common ifaddr management: - Unify reference count and lock initialization in a single function, ifa_init(). - Move tear-down from a macro (IFAFREE) to a function ifa_free(). - Move reference count bump from a macro (IFAREF) to a function ifa_ref(). - Instead of using a u_int protected by a mutex, use refcount(9) for reference count management. The ifa_mtx is now used for exactly one ioctl, and possibly should be removed. MFC after: 3 weeks	2009-06-21 19:30:33 +00:00
bz	b7ff2bdc20	After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.	2009-06-08 19:57:35 +00:00
bz	f09c1df7a5	If including vnet.h one has to include opt_route.h as well. This is because struct vnet_net holds the rt_tables[][] for MRT and array size is compile time dependent. If you had ROUTETABLES set to >1 after r192011 V_loif was pointing into nonsense leading to strange results or even panics for some people. Reviewed by: mz	2009-05-22 23:03:15 +00:00
qingli	e6b86b7c8f	When an interface address is removed and the last prefix route is also being deleted, the link-layer address table (arp or nd6) will flush those L2 llinfo entries that match the removed prefix. Reviewed by: kmacy	2009-05-20 21:07:15 +00:00
bz	5243547e6f	Unbreak options VIMAGE builds, in a followup to r192011 which did not introduce INIT_VNET_NET() initializers necessary for accessing V_loif. Submitted by: zec Reviewed by: julian	2009-05-17 20:53:10 +00:00
qingli	be1f376f27	Ignore the INADDR_ANY address inserted/deleted by DHCP when installing a loopback route to the interface address.	2009-05-14 05:27:09 +00:00
qingli	7f344a91ea	This patch adds a host route to an interface address (that is assigned to a non loopback/ppp link types) through the loopback interface. Prior to the new L2/L3 rewrite, this host route is implicitly added by the L2 code during RTM_RESOLVE of that interface address. This host route is deleted when that interface is removed. Reviewed by: kmacy	2009-05-12 07:41:20 +00:00
zec	8d976eab5c	In preparation for turning on options VIMAGE in next commits, rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace. Reviewed by: bz (an older version of the patch) Approved by: julian (mentor)	2009-04-26 22:06:42 +00:00
rwatson	a5976db964	Expand coverage of IF_ADDR_LOCK() in in_control() from point of initial lookup of 'ia' from if_addrhead through most use. Note that we currently have to drop it prematurely in some cases due to calls out to the routing and interface code while using 'ia', but this closes many races. Annotate several potential races that persist after this change. Move to using M_NOWAIT for allocating new interface addresses due to lock(s) being held. MFC after: 3 weeks	2009-04-25 23:02:57 +00:00
rwatson	98b615af71	In in_purgemaddrs(), remove the inm being freed from the address list before freeing it, rather than vice versa, to avoid potential use after free. Reviewed by: bms	2009-04-24 22:11:53 +00:00
rwatson	5eaa7f128c	Relocate permissions checking code in in_control() to before the body of the implementation of ioctls. This makes the mapping of ioctls to specific privileges more explicit, and also simplifies the implementation by reducing the use of FALLTHROUGH handling in switch. While this is not intended to be a functional change, it does mean that certain privilege checks are now performed earlier, so EPERM might be returned in preference to EADDRNOTAVAIL for management ioctls that could have failed for both reasons. MFC after: 3 weeks	2009-04-24 09:54:46 +00:00
rwatson	a761d4fb36	Reorganize in_control() so that invariants are more obvious, and so that it is easier to lock: - Handle the unsupported ioctl case at the beginning of in_control(), handing off to ifp->if_ioctl, rather than looking up interfaces and addresses unnecessarily in this case. - Make it an invariant that ifp is always non-NULL when running in_control()-implemented ioctls, simplifying the code structure. MFC after: 3 weeks	2009-04-23 21:41:37 +00:00
rwatson	38c104950f	Protect against some writer-writer races in in_control() by acquiring the interface address list lock around interface address list modifications. More to do here. MFC after: 2 weeks	2009-04-19 22:16:19 +00:00
bms	f2006dd38e	Deal with the case where ifma_protospec may be NULL, during any IPv4 multicast operations which reference it. There is a potential race because ifma_protospec is set to NULL when we discover the underlying ifnet has gone away. This write is not covered by the IF_ADDR_LOCK, and it's difficult to widen its scope without making it a recursive lock. It isn't clear why this manifests more quickly with 802.11 interfaces, but does not seem to manifest at all with wired interfaces. With this change, the 802.11 related panics reported by sam@ and cokane@ should go away. It is not the right fix, that requires more thought before 8.0. Idea from: sam Tested by: cokane	2009-03-17 14:41:54 +00:00
rwatson	70b6a8119c	Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced in FreeBSD 5.x to allow network device drivers to run with Giant despite the network stack being Giant-free. This significantly simplifies calls into ioctl() on network interfaces, especially in the multicast code, as well as eliminates deferred invocation of interface if_start routines. Disable the build on device drivers still depending on IFF_NEEDSGIANT as they no longer compile. They will be removed in a few weeks if they haven't been made MPSAFE in that time. Disabled drivers: if_ar if_axe if_aue if_cdce if_cue if_kue if_ray if_rue if_rum if_sr if_udav if_ural if_zyd Drivers that were already disabled because of tty changes: if_ppp if_sl Discussed on: arch@	2009-03-15 14:21:05 +00:00
bms	4961c33f6b	Fix uninitialized use of ifp for ii. Found by: Peter Holm	2009-03-09 22:54:17 +00:00
bms	71233409ea	Merge IGMPv3 and Source-Specific Multicast (SSM) to the FreeBSD IPv4 stack. Diffs are minimized against p4. PCS has been used for some protocol verification, more widespread testing of recorded sources in Group-and-Source queries is needed. sizeof(struct igmpstat) has changed. __FreeBSD_version is bumped to 800070.	2009-03-09 17:53:05 +00:00
jamie	12bbe1869f	Standardize the various prison_foo_ip[46] functions and prison_if to return zero on success and an error code otherwise. The possible errors are EADDRNOTAVAIL if an address being checked for doesn't match the prison, and EAFNOSUPPORT if the prison doesn't have any addresses in that address family. For most callers of these functions, use the returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or EINVAL. Always include a jailed() check in these functions, where a non-jailed cred always returns success (and makes no changes). Remove the explicit jailed() checks that preceded many of the function calls. Approved by: bz (mentor)	2009-02-05 14:06:09 +00:00
sam	b278e68100	remove too noisy DIAGNOSTIC code Reviewed by: qingli	2009-01-18 07:20:02 +00:00
bz	ffd2421407	Restrict arp, ndp and theoretically the FIB listing (if not read with libkvm) to the addresses of a prison, when inside a jail. [1] As the patch from the PR was pre-'new-arp', add checks to the llt_dump handlers as well. While touching RTM_GET in route_output(), consistently use curthread credentials rather than the creds from the socket there. [2] PR: kern/68189 Submitted by: Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1] Discussed with: rwatson [2] Reviewed by: rwatson MFC after: 4 weeks	2009-01-09 21:57:49 +00:00
bz	60c950d4ff	Make SIOCGIFADDR and related, as well as SIOCGIFADDR_IN6 and related jail-aware. Up to now we returned the first address of the interface for SIOCGIFADDR w/o an ifr_addr in the query. This caused problems for programs querying for an address but running inside a jail, as the address returned usually did not belong to the jail. Like for v6, if there was an ifr_addr given on v4, you could probe for more addresses on the interfaces that you were not allowed to see from inside a jail. Return an error (EADDRNOTAVAIL) in that case now unless the address is on the given interface and valid for the jail. PR: kern/114325 Reviewed by: rwatson MFC after: 4 weeks	2009-01-09 13:06:56 +00:00
harti	d5349ad3cd	Set a minimum of information in the routing message (like version and type) so that generic routing message parsing code can parse the messages for L2 info that are retrieved via the sysctl interface.	2009-01-09 10:58:59 +00:00
qingli	efe3f87721	Some modules such as SCTP supply a valid route entry as an input argument to ip_output(). The destination is represented in a sockaddr{} object that may contain other pieces of information, e.g., port number. This same destination sockaddr{} object may be passed into L2 code, which could be used to create a L2 entry. Since there exists a L2 table per address family, the L2 lookup function can make address family specific comparison instead of the generic bcmp() operation over the entire sockaddr{} structure. Note in the IPv6 case the sin6_scope_id is not compared because the address is currently stored in the embedded form inside the kernel. The in6_lltable_lookup() has to account for the scope-id if this storage format were to change in the future.	2009-01-03 00:27:28 +00:00
bz	e46dfc4a18	For consistency use LLE_IS_VALID() in this 4th place that is actually interested in the (void *)-1 return value hack. This way we can easily identify those special parts of the code.	2008-12-28 21:18:01 +00:00
qingli	1d851edfc0	This checkin addresses a couple of issues: 1. The "route" command allows route insertion through the interface-direct option "-iface". During if_attach(), a sockaddr_dl{} entry is created for the interface and is part of the interface address list. This sockaddr_dl{} entry describes the interface in detail. The "route" command selects this entry as the "gateway" object when the "-iface" option is present. The "arp" and "ndp" commands also interact with the kernel through the routing socket when adding and removing static L2 entries. The static L2 information is also provided through the "gateway" object with an AF_LINK family type, similar to what is provided by the "route" command. In order to differentiate between these two types of operations, a RTF_LLDATA flag is introduced. This flag is set by the "arp" and "ndp" commands when issuing the add and delete commands. This flag is also set in each L2 entry returned by the kernel. The "arp" and "ndp" commands follow a convention where a RTM_GET is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills in the fields for a "rtm" object, which is reinjected into the kernel by a subsequent RTM_ADD/DELETE command. The entry returned from RTM_GET is a prefix route, so the RTF_LLDATA flag must be specified when issuing the RTM_ADD/DELETE messages. 2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the specification for retrieving L2 information. Also optimized the code logic. Reviewed by: julian	2008-12-26 19:45:24 +00:00
kmacy	c9eebde165	unlock and destroy an llentry's lock before freeing Found by: sam	2008-12-16 00:20:49 +00:00
qingli	ec826ad5c7	The main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as many locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplifying the logic in the routing code. The most notable end result is the obsolescence of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improve and refactor the code, and provided valuable reviews - Julian Elischer set up the perforce tree for me and has helped me maintain that branch before the svn conversion	2008-12-15 06:10:57 +00:00
bz	604d89458a	Rather than using hidden includes (with circular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation	2008-12-02 21:37:28 +00:00
zec	7ecd715d48	Unhide declarations of network stack virtualization structs from underneath #ifdef VIMAGE blocks. This change introduces some churn in #include ordering and nesting throughout the network stack and drivers but is not expected to cause any additional issues. In the next step this will allow us to instantiate the virtualization container structures and switch from using global variables to their "containerized" counterparts. Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-28 23:30:51 +00:00
zec	815d52c5df	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instantiation, assign initial values to them in initializer functions. As a rule, initialization at instantiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentially, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methodology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to initialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-19 09:39:34 +00:00
bz	a2ae61fc14	Style changes only: - Consistently add parentheses to return statements. - Use NULL instead of 0 when comparing pointers, also avoiding unnecessary casts. - Do not use pointers as booleans. Reviewed by: rwatson (earlier version) MFC after: 2 months	2008-10-26 19:17:25 +00:00
zec	8797d4caec	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
bz	1021d43b56	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
gonzo	afeaa2fd28	In case of interface initialization failure remove struct in_ifaddr* from in_ifaddrhashtbl in in_ifinit because error handler in in_control removes entries only for AF_INET addresses. If in_ifinit is called for the cloned interface that has just been created its address family is not AF_INET and therefore LIST_REMOVE is not called for respective LIST_INSERT_HEAD and freed entries remain in in_ifaddrhashtbl and lead to memory corruption. PR: kern/124384	2008-06-24 13:58:28 +00:00
bz	ca561e0217	Differentiate between addifaddr and delifaddr for the privilege check. Reviewed by: rwatson MFC after: 2 weeks	2008-01-24 08:14:38 +00:00
silby	f965c7bdc4	Add FBSDID to all files in netinet so that people can more easily include file version information in bug reports. Approved by: re (kensmith)	2007-10-07 20:44:24 +00:00
mjacob	49d2064d40	Simplification to quiet a gcc4.2 warning. Just by setting match.s_addr to nonzero you fulfill the same function as the variable 'cmp', so you might as well zero match and test against it later. Reviewed by: timeout on review request	2007-06-17 00:31:24 +00:00
bms	ffd77d9ba5	Import rewrite of IPv4 socket multicast layer to support source-specific and protocol-independent host mode multicast. The code is written to accommodate IPv6, IGMPv3 and MLDv2 with only a little additional work. This change only pertains to FreeBSD's use as a multicast end-station and does not concern multicast routing; for an IGMPv3/MLDv2 router implementation, consider the XORP project. The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html Summary * IPv4 multicast socket processing is now moved out of ip_output.c into a new module, in_mcast.c. * The in_mcast.c module implements the IPv4 legacy any-source API in terms of the protocol-independent source-specific API. * Source filters are lazy allocated as the common case does not use them. They are part of per inpcb state and are covered by the inpcb lock. * struct ip_mreqn is now supported to allow applications to specify multicast joins by interface index in the legacy IPv4 any-source API. * In UDP, an incoming multicast datagram only requires that the source port matches the 4-tuple if the socket was already bound by source port. An unbound socket SHOULD be able to receive multicasts sent from an ephemeral source port. * The UDP socket multicast filter mode defaults to exclusive, that is, sources present in the per-socket list will be blocked from delivery. * The RFC 3678 userland functions have been added to libc: setsourcefilter, getsourcefilter, setipv4sourcefilter, getipv4sourcefilter. * Definitions for IGMPv3 are merged but not yet used. * struct sockaddr_storage is now referenced from <netinet/in.h>. It is therefore defined there if not already declared in the same way as for the C99 types. * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF which are then interpreted as interface indexes) is now deprecated.
* A patch for the Rhyolite.com routed in the FreeBSD base system is available in the -net archives. This only affects individuals running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces. * Make IPv6 detach path similar to IPv4's in code flow; functionally same. * Bump __FreeBSD_version to 700048; see UPDATING. This work was financially supported by another FreeBSD committer. Obtained from: p4://bms_netdev Submitted by: Wilbert de Graaf (original work) Reviewed by: rwatson (locking), silence from fenner, net@ (but with encouragement)	2007-06-12 16:24:56 +00:00
rwatson	a25f94b5ae	Move universally to ANSI C function declarations, with relatively consistent style(9)-ish layout.	2007-05-10 15:58:48 +00:00
bms	44f999134f	Fix a bug in IPv4 address configuration exposed by refcounting. * Join the IPv4 all-hosts multicast group 224.0.0.1 once only; that is, when an IPv4 address is first configured on an interface. * Do not join it for subsequent IPv4 addresses as this violates IGMP. * Be sure to leave the group when all IPv4 addresses have been removed from the interface. * Add two DIAGNOSTIC printfs related to the issue. Further care and attention is needed in this area; it is suggested that netinet's attachment to the ifnet structure be compartmentalized and non-implicit. Bug found by: andre MFC after: 1 month	2007-03-29 21:39:22 +00:00
bms	4ffc004901	Implement reference counting for ifmultiaddr, in_multi, and in6_multi structures. Detect when ifnet instances are detached from the network stack and perform appropriate cleanup to prevent memory leaks. This has been implemented in such a way as to be backwards ABI compatible. Kernel consumers are changed to use if_delmulti_ifma(); in_delmulti() is unable to detect interface removal by design, as it performs searches on structures which are removed with the interface. With this architectural change, the panics FreeBSD users have experienced with carp and pfsync should be resolved. Obtained from: p4 branch bms_netdev Reviewed by: andre Sponsored by: Garance A Drosehn Idea from: NetBSD MFC after: 1 month	2007-03-20 00:36:10 +00:00
bms	b6b883252e	In regular forwarding path, reject packets destined for 169.254.0.0/16 link-local addresses. See RFC 3927 section 2.7.	2007-02-03 06:45:51 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
bms	b8c88cc531	The IPv4 code should clean up multicast group state when an interface goes away. Without this change, it leaks in_multi (and often ether_multi state) if many clonable interfaces are created and destroyed in quick succession. The concept of this fix is borrowed from KAME. Detailed information about this behaviour, as well as test cases, are available in the PR. PR: kern/78227 MFC after: 1 week	2006-09-28 10:04:07 +00:00
andre	26e03a9ecb	In in_control() remove the temporary in_ifaddr structure from the ia_hash only if it actually is an AF_INET address. All other places test for sa_family == AF_INET but this one. PR: kern/92091 Submitted by: Seth Kingsley <sethk-at-meowfishies.com> MFC after: 3 days	2006-01-24 16:19:31 +00:00
glebius	ed194f4ef3	First fill in structure with valid values, and only then attach it to the global list. Reviewed by: rwatson	2005-10-28 20:29:42 +00:00
glebius	9ae88c3a40	In in_addprefix() compare not only route addresses, but their masks, too. This fixes problem when connected prefixes overlap. Obtained from: OpenBSD (rev. 1.40 by claudio); [ I came to this fix myself, and then found out that OpenBSD had already fixed it the same way.]	2005-10-22 14:50:27 +00:00
rwatson	dc2be38288	Unlock Giant symmetrically with respect to lock acquire order as that's generally nicer. Spotted by: johan MFC after: 1 week	2005-10-03 11:34:29 +00:00
rwatson	2d9ef0269a	Acquire Giant conditionally in in_addmulti() and in_delmulti() based on whether the interface being accessed is IFF_NEEDSGIANT or not. This avoids lock order reversals when calling into the interface ioctl handler, which could potentially lead to deadlock. The long term solution is to eliminate non-MPSAFE network drivers. Discussed with: jhb MFC after: 1 week	2005-10-03 11:09:39 +00:00
rwatson	64eedb0310	Take a first cut at cleaning up ifnet removal and multicast socket panics, which occur when stale ifnet pointers are left in struct moptions hung off of inpcbs: - Add in_ifdetach(), which matches in6_ifdetach(), and allows the protocol to perform early tear-down on the interface early in if_detach(). - Annotate that if_detach() needs careful consideration. - Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR -- this is not the place to detect interface removal! This also removes what is basically a nasty (and now unnecessary) hack. - Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP IPv4 sockets. It is now possible to run the msocket_ifnet_remove regression test using HEAD without panicking. MFC after: 3 days	2005-09-18 17:36:28 +00:00
glebius	b2a592be59	In order to support CARP interfaces kernel was taught to handle more than one interface in one subnet. However, some userland apps rely on the belief that this configuration is impossible. Add a sysctl switch net.inet.ip.same_prefix_carp_only. If the switch is on, then kernel will refuse to add an additional interface to already connected subnet unless the interface is CARP. Default value is off. PR: bin/82306 In collaboration with: mlaier	2005-08-18 10:34:30 +00:00
rwatson	7504160c1e	Introduce in_multi_mtx, which will protect IPv4-layer multicast address lists, as well as accessor macros. For now, this is a recursive mutex due to code sequences where IPv4 multicast calls into IGMP calls into ip_output(), which then tests for a multicast forwarding case. For support macros in in_var.h to check multicast address lists, assert that in_multi_mtx is held. Acquire in_multi_mtx around iteration over the IPv4 multicast address lists, such as in ip_input() and ip_output(). Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses, as well as over the manipulation of ifnet multicast address lists in order to keep the two layers in sync. Lock down accesses to IPv4 multicast addresses in IGMP, or assert the lock when performing IGMP join/leave events. Eliminate spl's associated with IPv4 multicast addresses, portions of IGMP that weren't previously expunged by IGMP locking. Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded lock order in WITNESS, in that order. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 10 days	2005-08-03 19:29:47 +00:00
iedowse	7de9a3957f	Use IFF_LOCKGIANT/IFF_UNLOCKGIANT around calls to the interface if_ioctl routine. This should fix a number of code paths through soo_ioctl() that could call into Giant-locked network drivers without first acquiring Giant.	2005-06-02 00:04:08 +00:00
glebius	d2dabbb47c	ifma_protospec is a pointer. Use NULL when assigning or comparing it.	2005-03-20 14:31:45 +00:00
glebius	3adfed4ac4	Remove a workaround from previous revision. It proved to be incorrect. Add two other workarounds for carp(4) interfaces: - do not add connected route when address is assigned to carp(4) interface - do not add connected route when other interface goes down Embrace workarounds with #ifdef DEV_CARP	2005-03-20 10:27:17 +00:00
glebius	53e474b9c4	Add antifootshooting workaround, which will make all routes "connected" to carp(4) interfaces host routes. This prevents a problem, when connected network is routed to carp(4) interface.	2005-03-10 15:26:45 +00:00
imp	a50ffc2912	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
mlaier	4603a76576	Fix host route addition for more than one address to a loopback interface after allowing more than one address with the same prefix. Reported by: Vladimir Grebenschikov <vova NO fbsd SPAM ru> Submitted by: ru (also NetBSD rev. 1.83) Pointyhat to: mlaier	2004-11-17 23:14:03 +00:00
mlaier	5780422cd7	Merge copyright notices. Requested by: njl	2004-11-13 17:05:40 +00:00
mlaier	583a3d8244	Change the way we automatically add prefix routes when adding a new address. This makes it possible to have more than one address with the same prefix. The first address added is used for the route. On deletion of an address with IFA_ROUTE set, we try to find a "fallback" address and hand over the route if possible. I plan to MFC this in 4 weeks, hence I keep the - now obsolete - argument to in_ifscrub as it must be considered KAPI as it is not static in in.c. I will clean this after the MFC. Discussed on: arch, net Tested by: many testers of the CARP patches Nits from: ru, Andrea Campi <andrea+freebsd_arch webcom it> Obtained from: WIDE via OpenBSD MFC after: 1 month	2004-11-12 20:53:51 +00:00
rwatson	87aa99bbbb	White space cleanup for netinet before branch: - Trailing tab/space cleanup - Remove spurious spaces between or before tabs This change avoids touching files that Andre likely has in his working set for PFIL hooks changes for IPFW/DUMMYNET. Approved by: re (scottl) Submitted by: Xin LI <delphij@frontfree.net>	2004-08-16 18:32:07 +00:00
andre	a93503bce5	Add the function in_localip() which returns 1 if an internet address is for the local host and configured on one of its interfaces.	2004-08-11 11:49:48 +00:00
imp	b49b7fe799	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 20:46:16 +00:00
mlaier	d937176b34	Bring eventhandler callbacks for pf. This enables pf to track dynamic address changes on interfaces (dialup) with the "on (<ifname>)"-syntax. This also brings hooks in anticipation of tracking cloned interfaces, which will be in future versions of pf. Approved by: bms(mentor)	2004-02-26 04:27:55 +00:00

223 Commits