freebsd-dev

Author	SHA1	Message	Date
Alexander V. Chernikov	9479029b1f	* Add lltable llt_hash callback * Move lltable items insertions/deletions to generic llt code.	2014-11-23 12:15:28 +00:00
Alexander V. Chernikov	7c066c18db	Use less-invasive approach for IF_AFDATA lock: convert into 2 locks: use rwlock accessible via external functions (IF_AFDATA_CFG_* -> if_afdata_cfg_()) for all control plane tasks use rmlock (IF_AFDATA_RUN_) for fast-path lookups.	2014-11-22 19:53:36 +00:00
Alexander V. Chernikov	27688dfe1d	Temporarily revert r274774.	2014-11-22 17:57:54 +00:00
Alexander V. Chernikov	8f465f6690	Convert &in_ifaddr_lock to dual-locking model: use rwlock accessible via external functions (IN_IFADDR_CFG_* -> in_ifaddr_cfg_()) for all control plane tasks use rmlock (IN_IFADDR_RUN_) for fast-path lookups.	2014-11-22 16:27:51 +00:00
Alexander V. Chernikov	86b94cffe4	Finish r274774: add more headers/fix build for non-debug case.	2014-11-21 23:36:21 +00:00
Alexander V. Chernikov	9883e41b4b	Switch IF_AFDATA lock to rmlock	2014-11-21 02:28:56 +00:00
Alexander V. Chernikov	f9723c7705	Simplify API: use new NHOP_LOOKUP_AIFP flag to select what ifp we need to return. Rename fib[64]_lookup_nh_basic to fib[64]_lookup_nh, add flags fields for all relevant functions.	2014-11-20 22:41:59 +00:00
Alexander V. Chernikov	df629abf3e	Rework LLE code locking: * struct llentry is now basically split into 2 pieces: all fields within 64 bytes (amd64) are now protected by both ifdata lock AND lle lock, e.g. you require both locks to be held exclusively for modification. All data necessary for fast path operations is kept here. Some fields were added: - r_l3addr - makes lookup key liev within first 64 bytes. - r_flags - flags, containing pre-compiled decision whether given lle contains usable data or not. Current the only flag is RLLE_VALID. - r_len - prepend data len, currently unused - r_kick - used to provide feedback to control plane (see below). All other fields are protected by lle lock. * Add simple state machine for ARP to handle "about to expire" case: Current model (for the fast path) is the following: - rlock afdata - find / rlock rte - runlock afdata - see if "expire time" is approaching (time_uptime + la->la_preempt > la->la_expire) - if true, call arprequest() and decrease la_preempt - store MAC and runlock rte New model (data plane): - rlock afdata - find rte - check if it can be used using r_* fields only - if true, store MAC - if r_kick field != 0 set it to 0. - runlock afdata New mode (control plane): - schedule arptimer to be called in (V_arpt_keep - V_arp_maxtries) seconds instead of V_arpt_keep. - on first timer invocation change state from ARP_LLINFO_REACHABLE to ARP_LLINFO_VERIFY, sets r_kick to 1 and shedules next call in V_arpt_rexmit (default to 1 sec). - on subsequent timer invocations in ARP_LLINFO_VERIFY state, checks for r_kick value: reschedule if not changed, and send arprequest() if set to zero (e.g. entry was used). * Convert IPv4 path to use new single-lock approach. IPv6 bits to follow. * Slow down in_arpinput(): now valid reply will (in most cases) require acquiring afdata WLOCK twice. This is requirement for storing changed lle data. This change will be slightly optimized in future. * Provide explicit hash link/unlink functions for both ipv4/ipv6 code. This will probably be moved to generic lle code once we have per-AF hashing callback inside lltable. * Perform lle unlink on deletion immediately instead of delaying it to the timer routine. * Make r244183 more explicit: use new LLE_CALLOUTREF flag to indicate the presence of lle reference used for safe callout calls.	2014-11-16 20:12:49 +00:00
Alexander V. Chernikov	b4b1367ae4	* Move lle creation/deletion from lla_lookup to separate functions: lla_lookup(LLE_CREATE) -> lla_create lla_lookup(LLE_DELETE) -> lla_delete Assume lla_create to return LLE_EXCLUSIVE lock for lle. * Rework lla_rt_output to perform all lle changes under afdata WLOCK. * change arp_ifscrub() ackquire afdata WLOCK, the same as arp_ifinit().	2014-11-15 18:54:07 +00:00
Alexander V. Chernikov	a9413f6ca0	Sync to HEAD@r274297.	2014-11-08 18:13:35 +00:00
Gleb Smirnoff	6df8a71067	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.	2014-11-07 09:39:05 +00:00
Alexander V. Chernikov	064b1bdb2d	Convert lle rtchecks to use new routing API. For inet/ case, this involves reverting r225947 which seem to be pretty strange commit and should be reverted in HEAD ad well.	2014-11-06 23:35:22 +00:00
Alexander V. Chernikov	8c3cfe0be0	Hide 'struct rtentry' and all its macro inside new header: net/route_internal.h The goal is to make its opaque for all code except route/rtsock and proto domain _rmx.	2014-11-04 17:28:13 +00:00
Hiroki Sato	cc45ae406d	Add a change missing in r271916.	2014-09-21 04:38:50 +00:00
Xin LI	a7f77a3950	Restore historical behavior of in_control, which, when no matching address is found, the first usable address is returned for legacy ioctls like SIOCGIFBRDADDR, SIOCGIFDSTADDR, SIOCGIFNETMASK and SIOCGIFADDR. While there also fix a subtle issue that a caller from a jail asking for INADDR_ANY may get the first IP of the host that do not belong to the jail. Submitted by: glebius Differential Revision: https://reviews.freebsd.org/D667	2014-08-22 19:08:12 +00:00
Steven Hartland	5af464bbe0	Ensure that IP's added to CARP always use the CARP MAC Previously there was a race condition between the address addition and associating it with the CARP which resulted in the interface MAC, instead of the CARP MAC, being used for a brief amount of time. This caused "is using my IP address" warnings as well as data being sent to the wrong machine due to incorrect ARP entries being recorded by other devices on the network.	2014-07-31 16:43:56 +00:00
Steven Hartland	d34165f759	Only check error if one could have been generated	2014-07-31 09:18:29 +00:00
Gleb Smirnoff	9753faf553	Garbage collect couple of unused fields from struct ifaddr: - ifa_claim_addr() unused since removal of NetAtalk - ifa_metric seems to be never utilized, always a copy of if_metric	2014-07-29 15:01:29 +00:00
Alan Somers	7278b62aee	Fix a panic when removing an IP address from an interface, if the same address exists on another interface. The panic was introduced by change 264887, which changed the fibnum parameter in the call to rtalloc1_fib() in ifa_switch_loopback_route() from RT_DEFAULT_FIB to RT_ALL_FIBS. The solution is to use the interface fib in that call. For the majority of users, that will be equivalent to the legacy behavior. PR: kern/189089 Reported by: neel Reviewed by: neel MFC after: 3 weeks X-MFC with: 264887 Sponsored by: Spectra Logic	2014-04-29 14:46:45 +00:00
Alan Somers	0cfee0c223	Fix subnet and default routes on different FIBs on the same subnet. These two bugs are closely related. The root cause is that ifa_ifwithnet does not consider FIBs when searching for an interface address. sys/net/if_var.h sys/net/if.c Add a fib argument to ifa_ifwithnet and ifa_ifwithdstadddr. Those functions will only return an address whose interface fib equals the argument. sys/net/route.c Update calls to ifa_ifwithnet and ifa_ifwithdstaddr with fib arguments. sys/netinet/in.c Update in_addprefix to consider the interface fib when adding prefixes. This will prevent it from not adding a subnet route when one already exists on a different fib. sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/netinet6/nd6.c Add RT_DEFAULT_FIB arguments to ifa_ifwithdstaddr and ifa_ifwithnet. In some cases it there wasn't a clear specific fib number to use. In others, I was unable to test those functions so I chose RT_DEFAULT_FIB to minimize divergence from current behavior. I will fix some of the latter changes along with PR kern/187553. tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c tests/sys/netinet/Makefile Revert r263738. The udp_dontroute test was right all along. However, bugs kern/187550 and kern/187553 cancelled each other out when it came to this test. Because of kern/187553, ifa_ifwithnet searched the default fib instead of the requested one, but because of kern/187550, there was an applicable subnet route on the default fib. The new test added in r263738 doesn't work right, however. I can verify with dtrace that ifa_ifwithnet returned the wrong address before I applied this commit, but route(8) miraculously found the correct interface to use anyway. I don't know how. Clear expected failure messages for kern/187550 and kern/187552. PR: kern/187550 PR: kern/187552 Reviewed by: melifaro MFC after: 3 weeks Sponsored by: Spectra Logic	2014-04-24 23:56:56 +00:00
Alan Somers	0489b8916e	Fix host and network routes for new interfaces when net.add_addr_allfibs=0 sys/net/route.c In rtinit1, use the interface fib instead of the process fib. The latter wasn't very useful because ifconfig(8) is usually invoked with the default process fib. Changing ifconfig(8) to use setfib(2) would be redundant, because it already sets the interface fib. tests/sys/netinet/fibs_test.sh Clear the expected ATF failure sys/net/if.c Pass the interface fib in calls to rtrequest1_fib and rtalloc1_fib sys/netinet/in.c sys/net/if_var.h Add a fibnum argument to ifa_switch_loopback_route, a subroutine of in_scrubprefix. Pass it the interface fib. PR: kern/187549 Reviewed by: melifaro MFC after: 3 weeks Sponsored by: Spectra Logic Corporation	2014-04-24 17:23:16 +00:00
Kevin Lo	e06e816f67	Add support for UDP-Lite protocol (RFC 3828) to IPv4 and IPv6 stacks. Tested with vlc and a test suite [1]. [1] http://www.erg.abdn.ac.uk/~gerrit/udp-lite/files/udplite_linux.tar.gz Reviewed by: jhb, glebius, adrian	2014-04-07 01:53:03 +00:00
Alan Somers	743c072a09	Correct ARP update handling when the routes for network interfaces are restricted to a single FIB in a multifib system. Restricting an interface's routes to the FIB to which it is assigned (by setting net.add_addr_allfibs=0) causes ARP updates to fail with "arpresolve: can't allocate llinfo for x.x.x.x". This is due to the ARP update code hard coding it's lookup for existing routing entries to FIB 0. sys/netinet/in.c: When dealing with RTM_ADD (add route) requests for an interface, use the interface's assigned FIB instead of the default (FIB 0). sys/netinet/if_ether.c: In arpresolve(), enhance error message generated when an lla_lookup() fails so that the interface causing the error is visible in logs. tests/sys/netinet/fibs_test.sh Clear ATF expected error. PR: kern/167947 Submitted by: Nikolay Denev <ndenev@gmail.com> (previous version) Reviewed by: melifaro MFC after: 3 weeks Sponsored by: Spectra Logic Corporation	2014-03-26 22:46:03 +00:00
Alexander V. Chernikov	a49b317c41	Fix refcount leak on netinet ifa. Reviewed by: glebius MFC after: 2 weeks Sponsored by: Yandex LLC	2014-01-16 12:35:18 +00:00
Alexander V. Chernikov	d375edc9b5	Simplify inet alias handling code: if we're adding/removing alias which has the same prefix as some other alias on the same interface, use newly-added rt_addrmsg() instead of hand-rolled in_addralias_rtmsg(). This eliminates the following rtsock messages: Pinned RTM_ADD for prefix (for alias addition). Pinned RTM_DELETE for prefix (for alias withdrawal). Example (got 10.0.0.1/24 on vlan4, playing with 10.0.0.2/24): before commit, addition: got message of size 116 on Fri Jan 10 14:13:15 2014 RTM_NEWADDR: address being added to iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255 got message of size 192 on Fri Jan 10 14:13:15 2014 RTM_ADD: Add Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK> 10.0.0.0 10.0.0.2 (255) ffff ffff ff after commit, addition: got message of size 116 on Fri Jan 10 13:56:26 2014 RTM_NEWADDR: address being added to iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 14.0.0.2 14.0.0.255 before commit, wihdrawal: got message of size 192 on Fri Jan 10 13:58:59 2014 RTM_DELETE: Delete Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK> 10.0.0.0 10.0.0.2 (255) ffff ffff ff got message of size 116 on Fri Jan 10 13:58:59 2014 RTM_DELADDR: address being removed from iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255 adter commit, withdrawal: got message of size 116 on Fri Jan 10 14:14:11 2014 RTM_DELADDR: address being removed from iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255 Sending both RTM_ADD/RTM_DELETE messages to rtsock is completely wrong (and requires some hacks to keep prefix in route table on RTM_DELETE). I've tested this change with quagga (no change) and bird (). bird alias handling is already broken in BSD sysdep code, so nothing changes here, too. I'm going to MFC this change if there will be no complains about behavior change. While here, fix some style(9) bugs introduced by r260488 (pointed by glebius and bde). Sponsored by: Yandex LLC MFC after: 4 weeks	2014-01-10 12:13:55 +00:00
Andrey V. Elsukov	e2d14d9317	Add IF_AFDATA_WLOCK_ASSERT() in case lla_lookup() is called with LLE_CREATE flag. MFC after: 1 week	2014-01-03 02:32:05 +00:00
Gleb Smirnoff	9706c950a2	Fix couple of bugs from r257692 related to scan of address list on an interface: - in in_control() skip over not AF_INET addresses. - in in_aifaddr_ioctl() and in_difaddr_ioctl() do correct check of address family, w/o accessing memory beyond struct ifaddr. Sponsored by: Nginx, Inc.	2013-12-29 22:20:06 +00:00
Gleb Smirnoff	c1f7c3f500	In r257692 I intentionally deleted code that handled P2P interfaces with equal addresses on both sides. It appeared that OpenVPN uses such configutations. Submitted by: trociny	2013-11-17 15:14:07 +00:00
Gleb Smirnoff	555036b5f6	Remove never used ioctls that originate from KAME. The proof of their zero usage was exp-run from misc/183538.	2013-11-11 05:39:42 +00:00
Gleb Smirnoff	77b89ad837	Provide compat layer for OSIOCAIFADDR.	2013-11-06 19:46:20 +00:00
Gleb Smirnoff	821b5caf7a	Fix my braino in r257692. For SIOCG*ADDR we don't need exact match on specified address, actually in most cases the address isn't specified. Reported by: peter	2013-11-06 08:36:08 +00:00
Nathan Whitehorn	6224cd89c0	Fix build on GCC.	2013-11-06 01:14:00 +00:00
Gleb Smirnoff	f7a39160c2	Rewrite in_control(), so that it is comprehendable without getting mad. o Provide separate functions for SIOCAIFADDR and for SIOCDIFADDR, with clear code flow from beginning to the end. After that the rest of in_control() gets very small and clear. o Provide sx(9) lock to protect against parallel ioctl() invocations. o Reimplement logic from r201282, that tried to keep localhost route in table when multiple P2P interfaces with same local address are created and deleted. Discussed with: pluknet, melifaro Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-11-05 07:44:15 +00:00
Gleb Smirnoff	b1b9dcae46	Remove net.link.ether.inet.useloopback sysctl tunable. It was always on by default from the very beginning. It was placed in wrong namespace net.link.ether, originally it had been at another wrong namespace. It was incorrectly documented at incorrect manual page arp(8). Since new-ARP commit, the tunable have been consulted only on route addition, and ignored on route deletion. Behaviour of a system with tunable turned off is not fully correct, and has no advantages comparing to normal behavior.	2013-11-05 07:32:09 +00:00
Gleb Smirnoff	237bf7f773	Cleanup in_ifscrub(), which is just an entry to in_scrubprefix().	2013-11-01 10:18:41 +00:00
Gleb Smirnoff	c3322cb91c	Include necessary headers that now are available due to pollution via if_var.h. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-28 07:29:16 +00:00
Gleb Smirnoff	4675896098	Remove ifa_init() and provide ifa_alloc() that will allocate and setup struct ifaddr internally. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-15 10:31:42 +00:00
Andrey V. Elsukov	5b7cb97c2b	Migrate structs arpstat, icmpstat, mrtstat, pimstat and udpstat to PCPU counters.	2013-07-09 09:50:15 +00:00
Oleg Bulyzhin	1571132f14	Plug static llentry leak (ipv4 & ipv6 were affected). PR: kern/172985 MFC after: 1 month	2013-04-21 21:28:38 +00:00
Gleb Smirnoff	9711a168b9	Retire struct sockaddr_inarp. Since ARP and routing are separated, "proxy only" entries don't have any meaning, thus we don't need additional field in sockaddr to pass SIN_PROXY flag. New kernel is binary compatible with old tools, since sizes of sockaddr_inarp and sockaddr_in match, and sa_family are filled with same value. The structure declaration is left for compatibility with third party software, but in tree code no longer use it. Reviewed by: ru, andre, net@	2013-01-31 08:55:21 +00:00
Peter Wemm	8a1163e82f	Temporarily revert rev 244678. This is causing loopback problems with the lo (loopback) interfaces.	2013-01-03 10:21:28 +00:00
Gleb Smirnoff	468e45f3bd	The SIOCSIFFLAGS ioctl handler runs if_up()/if_down() that notify all interested parties in case if interface flag IFF_UP has changed. However, not only SIOCSIFFLAGS can raise the flag, but SIOCAIFADDR and SIOCAIFADDR_IN6 can, too. The actual \|= is done not in the protocol code, but in code of interface drivers. To fix this historical layering violation, we will check whether ifp->if_ioctl(SIOCSIFADDR) raised the IFF_UP flag, and if it did, run the if_up() handler. This fixes configuring an address under CARP control on an interface that was initially !IFF_UP. P.S. I intentionally omitted handling the IFF_SMART flag. This flag was never ever used in any driver since it was introduced, and since it means another layering violation, it should be garbage collected instead of pretended to be supported.	2012-12-25 13:01:58 +00:00
Gleb Smirnoff	3e6c8b5366	Minor style(9) changes: - Remove declaration in initializer. - Add empty line between logical blocks.	2012-12-24 21:35:48 +00:00
Randall Stewart	7db496de2c	Though I disagree, I conceed to jhb & Rui. Note that we still have a problem with this whole structure of locks and in_input.c [it does not lock which it should not, but this can lead to crashes]. (I have seen it in our SQA testbed.. besides the one with a refcnt issue that I will have SQA work on next week ;-)	2012-08-19 11:54:02 +00:00
Randall Stewart	9424879158	Ok jhb, lets move the ifa_free() down to the bottom to assure that all tables and such are removed before we start to free. This won't protect the Hash in ip_input.c but in theory should protect any other uses that do use locks. MFC after: 1 week (or more)	2012-08-17 05:51:46 +00:00
Randall Stewart	184749821f	Its never a good idea to double free the same address. MFC after: 1 week (after the other commits ahead of this gets MFC'd)	2012-08-16 17:55:16 +00:00
Gleb Smirnoff	ea53792942	Fix races between in_lltable_prefix_free(), lla_lookup(), llentry_free() and arptimer(): o Use callout_init_rw() for lle timeout, this allows us safely disestablish them. - This allows us to simplify the arptimer() and make it race safe. o Consistently use ifp->if_afdata_lock to lock access to linked lists in the lle hashes. o Introduce new lle flag LLE_LINKED, which marks an entry that is attached to the hash. - Use LLE_LINKED to avoid double unlinking via consequent calls to llentry_free(). - Mark lle with LLE_DELETED via \|= operation istead of =, so that other flags won't be lost. o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more consistent and provide more informative KASSERTs. The patch is a collaborative work of all submitters and myself. PR: kern/165863 Submitted by: Andrey Zonov <andrey zonov.org> Submitted by: Ryan Stone <rysto32 gmail.com> Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>	2012-08-02 13:57:49 +00:00
Gleb Smirnoff	b9aee262e5	Some more whitespace cleanup.	2012-08-01 09:00:26 +00:00
Gleb Smirnoff	ea50c13ebe	Some style(9) and whitespace changes. Together with: Andrey Zonov <andrey zonov.org>	2012-07-31 11:31:12 +00:00
Navdeep Parhar	09fe63205c	- Updated TOE support in the kernel. - Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features. - iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the works and will follow soon. Build-tested with make universe. 30s overview ============ What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface: # ifconfig -m \| grep TOE Enable/disable TCP offload on an interface (just like any other ifnet capability): # ifconfig cxgbe0 toe # ifconfig cxgbe0 -toe Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat: # netstat -np tcp \| grep toe # sockstat -46c \| grep toe Reviewed by: bz, gnn Sponsored by: Chelsio communications. MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)	2012-06-19 07:34:13 +00:00
Gleb Smirnoff	90b357f6ec	M_DONTWAIT is a flag from historical mbuf(9) allocator, not malloc(9) or uma(9) flag.	2012-04-10 06:52:39 +00:00
Kip Macy	a93cda789a	When using flowtable llentrys can outlive the interface with which they're associated at which the lle_tbl pointer points to freed memory and the llt_free pointer is no longer valid. Move the free pointer in to the llentry itself and update the initalization sites. MFC after: 2 weeks	2012-02-23 18:21:37 +00:00
Bjoern A. Zeeb	81d5d46b3c	Add multi-FIB IPv6 support to the core network stack supplementing the original IPv4 implementation from r178888: - Use RT_DEFAULT_FIB in the IPv4 implementation where noticed. - Use rtfib() KPI with explicit RT_DEFAULT_FIB where applicable in the NFS code. - Use the new in6_rt KPI in TCP, gif(4), and the IPv6 network stack where applicable. - Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally prevent multiple initializations of callouts in in6_inithead(). - Use wrapper functions where needed to preserve the current KPI to ease MFCs. Use BURN_BRIDGES to indicate expected future cleanup. - Fix (related) comments (both technical or style). - Convert to rtinit() where applicable and only use custom loops where currently not possible otherwise. - Multicast group, most neighbor discovery address actions and faith(4) are locked to the default FIB. Individual IPv6 addresses will only appear in the default FIB, however redirect information and prefixes of connected subnets are automatically propagated to all FIBs by default (mimicking IPv4 behavior as closely as possible). Sponsored by: Cisco Systems, Inc.	2012-02-03 13:08:44 +00:00
Gleb Smirnoff	56cf9dc1f6	Drop support for SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFBRDADDR, SIOCSIFDSTADDR ioctl commands. PR: 163524 Reviewed by: net	2012-01-16 09:53:24 +00:00
John Baldwin	137f91e80f	Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks	2012-01-05 19:00:36 +00:00
John Baldwin	0f188ebb1d	Use a helper variable to wrap a long line.	2012-01-04 13:29:26 +00:00
John Baldwin	56d6e1292d	In the handling of the SIOC[DG]LIFADDR icotls in in_lifaddr_ioctl(), add missing interface address list locking and grab a reference on the matching interface address after dropping the lock while it is used to avoid a potential use after free. Reviewed by: bz MFC after: 1 week	2012-01-04 13:26:56 +00:00
John Baldwin	0823c29b81	Fix the SIOC[DG]LIFADDR ioctls in in_lifaddr_ioctl() to work with IPv4 interface address rather than IPv6. Submitted by: hrs Reviewed by: bz MFC after: 1 week	2012-01-04 13:23:51 +00:00
Gleb Smirnoff	7121247312	Provide ABI compatibility shim to enable configuring of addresses with ifconfig(8) prior to r228571. Requested by: brooks	2011-12-21 12:39:08 +00:00
Gleb Smirnoff	92ed4e1a24	Since size of struct in_aliasreq has just been changed in r228571, and thus ifconfig(8) needs recompile, it is a good chance to make parameter checks on SIOCAIFADDR arguments more strict.	2011-12-16 13:30:17 +00:00
Gleb Smirnoff	08b68b0e4c	A major overhaul of the CARP implementation. The ip_carp.c was started from scratch, copying needed functionality from the old implemenation on demand, with a thorough review of all code. The main change is that interface layer has been removed from the CARP. Now redundant addresses are configured exactly on the interfaces, they run on. The CARP configuration itself is, as before, configured and read via SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or SIOCAIFADDR_IN6 may now be configured to a particular virtual host id, which makes the prefix redundant. ifconfig(8) semantics has been changed too: now one doesn't need to clone carpXX interface, he/she should directly configure a vhid on a Ethernet interface. To supply vhid data from the kernel to an application the getifaddrs(8) function had been changed to pass ifam_data with each address. [1] The new implementation definitely closes all PRs related to carp(4) being an interface, and may close several others. It also allows to run a single redundant IP per interface. Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for idea on using ifam_data and for several rounds of reviewing! PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448 Reviewed by: bz Submitted by: bz [1]	2011-12-16 12:16:56 +00:00
Gleb Smirnoff	55174c34ef	Belatedly catch up with r151555. in_scrubprefix() also needs this fix. We should compare not only addresses, but their masks, too, when searching for matching prefix.	2011-12-13 06:56:43 +00:00
Gleb Smirnoff	f769e5b0fa	Fix a very special case when SIOCAIFADDR supplies mask of 0.0.0.0, don't overwrite the mask with autoguessing based on classes.	2011-12-06 20:55:20 +00:00
Gleb Smirnoff	89b9325530	Fix one more fallout from r227791: do not overwrite trimmed sa_len on the ia_sockmask when doing SIOCSIFNETMASK. Reported by: Stefan Bethke <stb lassitu.de>, gonzo Pointy hat to: glebius	2011-11-28 13:30:14 +00:00
Gleb Smirnoff	c6e5c71116	Remove superfluous check: SIOCAIFADDR must have ifra_addr supplied.	2011-11-24 22:46:11 +00:00
Gleb Smirnoff	bd47ae58a6	Fix stupid typo in r227830. PR: 162806 Pointy hat to: glebius	2011-11-24 22:43:48 +00:00
Gleb Smirnoff	e278f44bb5	style(9) nit	2011-11-22 19:39:27 +00:00
Gleb Smirnoff	bbaa3f944e	Fix SIOCDIFADDR semantics: if no address is specified, then delete first one.	2011-11-22 19:37:57 +00:00
Gleb Smirnoff	cf00e5c6b7	This check isn't needed now, sanity checking done in the beginning. Missed it in last commit.	2011-11-21 20:07:12 +00:00
Gleb Smirnoff	6d00fd9c2d	Historically in_control() did not check sockaddrs supplied with structs ifreq/in_aliasreq and there've been several panics due to that problem. All these panics were fixed just a couple of lines above the panicing code. Take a more general approach: sanity check sockaddrs supplied with SIOCAIFADDR and SIOCSIF*ADDR at the beggining of the function and drop all checks below. One check is now disabled due to strange code in ifconfig(8) that I've removed recently. I'm going to enable it with next __FreeBSD_version bump. Historically in_ifinit() was able to recover from an error and restore old address. Nowadays this feature isn't working for all error cases, but for some of them. I suppose no software relies on this behavior, so I'd like to remove it, since this simplifies code a lot. Also, move if_scrub() earlier in the in_ifinit(). It is more correct to wipe routes before removing address from local address list, and interface address list. Silence from: bz, brooks, andre, rwatson, 3 weeks	2011-11-21 14:10:13 +00:00
Qing Li	b3664a14cc	Exclude host routes when checking for prefix coverage on multiple interfaces. A host route has a NULL mask so check for that condition. I have also been told by developers who customize the packet output path with direct manipulation of the route entry (or the outgoing interface to be specific). This patch checks for the route mask explicitly to make sure custom code will not panic. PR: kern/161805 MFC after: 3 days	2011-10-25 04:06:29 +00:00
Gleb Smirnoff	53883e0c24	Add support for IPv4 /31 prefixes, as described in RFC3021. To run a /31 network, participating hosts MUST drop support for directed broadcasts, and treat the first and last addresses on subnet as unicast. The broadcast address for the prefix should be the link local broadcast address, INADDR_BROADCAST.	2011-10-15 18:41:25 +00:00
Gleb Smirnoff	b365d954cc	Remove last remnants of classful addressing: - Remove ia_net, ia_netmask, ia_netbroadcast from struct in_ifaddr. - Remove net.inet.ip.subnetsarelocal, I bet no one need it in 2011. - fix bug when we were not forwarding to a host which matches classful net address. For example router having 192.168.x.y/16 network attached, would not forward traffic to 192.168.*.0, which are legal IPs in CIDR world. - For compatibility, leave autoguessing of mask based on class. Reviewed by: andre, bz, rwatson	2011-10-15 16:28:06 +00:00
Gleb Smirnoff	a0b5928b29	De-spl(9).	2011-10-13 13:30:41 +00:00
Qing Li	15d2521975	All indirect routes will fail the rtcheck, except for a special host route where the destination IP and the gateway IP is the same. This special case handling is only meant for backward compatibility reason. The last commit introduced a bug in the route check logic, where a valid special case is treated as an error. This patch fixes that bug along with some code cleanup. Suggested by: gleb Reviewed by: kmacy, discussed with gleb MFC after: 1 day	2011-10-10 17:41:11 +00:00
Qing Li	6703e7ea10	Do not try removing an ARP entry associated with a given interface address if that interface does not support ARP. Otherwise the system will generate error messages unnecessarily due to the missing entry. PR: kern/159602 Submitted by: pluknet MFC after: 3 days	2011-10-07 22:22:19 +00:00
Qing Li	41b210c6f6	Remove the reference held on the loopback route when the interface address is being deleted. Only the last reference holder deletes the loopback route. All other delete operations just clear the IFA_RTSELF flag. PR: kern/159601 Submitted by: pluknet Reviewed by: discussed on net@ MFC after: 3 days	2011-10-07 18:01:34 +00:00
Qing Li	db92413e6a	A system may have multiple physical interfaces, all of which are on the same prefix. Since a single route entry is installed for the prefix (without RADIX_MPATH), incoming packets on the interfaces that are not associated with the prefix route may trigger an error message about unable to allocation LLE entry, and fails L2. This patch makes sure a valid route is present in the system, and allow the aforementioned condition to exist and treats as valid. Reviewed by: bz MFC after: 5 days	2011-10-03 19:51:18 +00:00
Qing Li	6cf8e3300e	This patch allows ARP to work properly in the presence of self-referencing routes. This patch is a rework of r223862. Reviewed by: bz, zec MFC after: 5 days	2011-10-03 19:06:55 +00:00
Qing Li	1184509858	When an interface address route is removed from the system, another route with the same prefix is searched for as a replacement. The current code did not bypass routes that have non-operational interfaces. This patch fixes that bug and will find a replacement route with an active interface. PR: kern/159603 Submitted by: pluknet, ambrisko at ambrisko dot com Reviewed by: discussed on net@ Approved by: re (bz) MFC after: 3 days	2011-08-28 00:14:40 +00:00
Kevin Lo	7236660627	If RTF_HOST flag is specified, then we are interested in destination address. PR: kern/159600 Submitted by: Svatopluk Kraus <onwahe at gmail dot com> Approved by: re (hrs)	2011-08-10 06:17:06 +00:00
Marko Zec	13e255fab7	Permit ARP to proceed for IPv4 host routes for which the gateway is the same as the host address. This already works fine for INET6 and ND6. While here, remove two function pointers from struct lltable which are only initialized but never used. MFC after: 3 days	2011-07-08 09:38:33 +00:00
Qing Li	92322284cd	Supply the LLE_STATIC flag bit to in_ifscurb() when scrubbing interface address so that proper clean up will take place in the routing code. This patch fixes the bootp panic on startup problem. Also, added more error handling and logging code in function in_scrubprefix(). MFC after: 5 days	2011-05-29 02:21:35 +00:00
Qing Li	5b84dc789a	The statically configured (permanent) ARP entries are removed when an interface is brought down, even though the interface address is still valid. This patch maintains the permanent ARP entries as long as the interface address (having the same prefix as that of the ARP entries) is valid. Reviewed by: delphij MFC after: 5 days	2011-05-20 19:12:20 +00:00
Sergey Kandaurov	79d514355c	Reference ifaddr object before unlocking as it can be freed from another context at the moment of later access. PR: kern/155555 Submitted by: Andrew Boyer <aboyer att averesystems.com> Approved by: avg (mentor) MFC after: 2 weeks	2011-03-21 14:19:40 +00:00
Gleb Smirnoff	a98c06f1c8	Use time_uptime instead of non-monotonic time_second to drive ARP timeouts. Suggested by: bde	2010-11-30 15:57:00 +00:00
Dimitry Andric	3e288e6238	After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 \| dim \| 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) \| 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 \| dim \| 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) \| 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 \| dim \| 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) \| 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.	2010-11-22 19:32:54 +00:00
Dimitry Andric	31c6a0037e	Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.	2010-11-14 20:38:11 +00:00
George V. Neville-Neil	e162ea60d4	Add a queue to hold packets while we await an ARP reply. When a fast machine first brings up some non TCP networking program it is quite possible that we will drop packets due to the fact that only one packet can be held per ARP entry. This leads to packets being missed when a program starts or restarts if the ARP data is not currently in the ARP cache. This code adds a new sysctl, net.link.ether.inet.maxhold, which defines a system wide maximum number of packets to be held in each ARP entry. Up to maxhold packets are queued until an ARP reply is received or the ARP times out. The default setting is the old value of 1 which has been part of the BSD networking code since time immemorial. Expose the time we hold an incomplete ARP entry by adding the sysctl net.link.ether.inet.wait, which defaults to 20 seconds, the value used when the new ARP code was added.. Reviewed by: bz, rpaulo MFC after: 3 weeks	2010-11-12 22:03:02 +00:00
Bjoern A. Zeeb	12112cf676	MfP4 CH182763 (original version): Make it harder to exploit certain in_control() related races between the intiial lookup at the beginning and the time we will remove the entry from the lists by re-checking that entry is still in the list before trying to remove it. (*) It is believed that with the current code and locking strategy we cannot completely fix all race. Reported by: Nima Misaghian (nima_misa hotmail.com) on net@ 20100817 Tested by: Nima Misaghian (nima_misa hotmail.com) (original version) PR: kern/146250 Submitted by: Mikolaj Golub (to.my.trociny gmail.com) (different version) MFC after: 1 week	2010-10-16 19:53:22 +00:00
Bjoern A. Zeeb	42db1b87d6	In case of RADIX_MPATH do not leak the IN_IFADDR read lock on early return. MFC after: 3 days	2010-09-04 16:06:01 +00:00
Will Andrews	54bfbd5153	Allow carp(4) to be loaded as a kernel module. Follow precedent set by bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack. Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure. This commit requires IP6PROTOSPACER, introduced in r211115. Reviewed by: bz, simon Approved by: ken (mentor) MFC after: 2 weeks	2010-08-11 00:51:50 +00:00
Qing Li	0ed6142b31	This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface. MFC after: 3 days	2010-05-25 20:42:35 +00:00
Bjoern A. Zeeb	82cea7e6f3	MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days	2010-04-29 11:52:42 +00:00
Bjoern A. Zeeb	becba438d2	Plug reference leaks in the link-layer code ("new-arp") that previously prevented the link-layer entry from being freed. In both in.c and in6.c (though that code path seems to be basically dead) plug a reference leak in case of a pending callout being drained. In if_ether.c consistently add a reference before resetting the callout and in case we canceled a pending one remove the reference for that. In the final case in arptimer, before freeing the expired entry, remove the reference again and explicitly call callout_stop() to clear the active flag. In nd6.c:nd6_free() we are only ever called from the callout function and thus need to remove the reference there as well before calling into llentry_free(). In if_llatbl.c when freeing entire tables make sure that in case we cancel a pending callout to remove the reference as well. Reviewed by: qingli (earlier version) MFC after: 10 days Problem observed, patch tested by: simon on ipv6gw.f.o, Christian Kratzer (ck cksoft.de), Evgenii Davidov (dado korolev-net.ru) PR: kern/144564 Configurations still affected: with options FLOWTABLE	2010-04-11 16:04:08 +00:00
Qing Li	c7ea0aa648	One of the advantages of enabling ECMP (a.k.a RADIX_MPATH) is to allow for connection load balancing across interfaces. Currently the address alias handling method is colliding with the ECMP code. For example, when two interfaces are configured on the same prefix, only one prefix route is installed. So connection load balancing among the available interfaces is not possible. The other advantage of ECMP is for failover. The issue with the current code, is that the interface link-state is not reflected in the route entry. For example, if there are two interfaces on the same prefix, the cable on one interface is unplugged, new and existing connections should switch over to the other interface. This is not done today and packets go into a black hole. Also, there is a small bug in the kernel where deleting ECMP routes in the userland will always return an error even though the command is successfully executed. MFC after: 5 days	2010-03-09 01:11:45 +00:00
Qing Li	d577d18a00	Some of the existing ppp and vpn related scripts create and set the IP addresses of the tunnel end points to the same value. In these cases the loopback route is not installed for the local end. Verified by: avg MFC after: 5 days	2010-02-02 20:38:30 +00:00
Qing Li	646c800540	Ensure an address is removed from the interface address list when the installation of that address fails. PR: 139559	2010-01-08 17:49:24 +00:00
Qing Li	ccbb9c359d	Consolidate the route message generation code for when address aliases were added or deleted. The announced route entry for an address alias is no longer empty because this empty route entry was causing some route daemon to fail and exit abnormally. MFC after: 5 days	2009-12-30 22:13:01 +00:00
Qing Li	c7ab66020f	The proxy arp entries could not be added into the system over the IFF_POINTOPOINT link types. The reason was due to the routing entry returned from the kernel covering the remote end is of an interface type that does not support ARP. This patch fixes this problem by providing a hint to the kernel routing code, which indicates the prefix route instead of the PPP host route should be returned to the caller. Since a host route to the local end point is also added into the routing table, and there could be multiple such instantiations due to multiple PPP links can be created with the same local end IP address, this patch also fixes the loopback route installation failure problem observed prior to this patch. The reference count of loopback route to local end would be either incremented or decremented. The first instantiation would create the entry and the last removal would delete the route entry. MFC after: 5 days	2009-12-30 21:35:34 +00:00

1 2 3 4 5 ...

303 Commits