freebsd-skq

Author	SHA1	Message	Date
markj	e89e3efaa6	Rename ndpr_refcnt to ndpr_addrcnt. This field counts derived addresses and is not a true refcount for prefix objects, so the previous name was misleading. MFC after: 1 week	2016-09-24 01:14:25 +00:00
bz	7a1c0b1ad1	Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enahnce helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747	2016-06-21 13:48:49 +00:00
gnn	d75e0c471e	This change re-adds L2 caching for TCP and UDP, as originally added in D4306 but removed due to other changes in the system. Restore the llentry pointer to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as appropriate. Submitted by: Mike Karels Differential Revision: https://reviews.freebsd.org/D6262	2016-06-02 17:51:29 +00:00
markj	a09fd6097b	Move IPv6 malloc tag definitions into the IPv6 code.	2016-05-20 04:45:08 +00:00
bz	d4cf530887	Mfp4 @180378: Factor out nd6 and in6_attach initialization to their own files. Also move destruction into those files though still called from the central initialization. Sponsored by: CK Software GmbH Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D5033	2016-03-22 15:43:47 +00:00
markj	8c873a2b1b	Modify defrouter_remove() to perform the router lookup before removal. This allows some simplification of its callers. No functional change intended. Tested by: Larry Rosenman (as part of a larger change) MFC after: 1 month	2016-03-17 19:01:44 +00:00
markj	9524b1b571	Lock the NDP default router list and count defrouter references. This addresses a number of race conditions that can cause crashes as a result of unsynchronized access to the list. PR: 206904 Tested by: Larry Rosenman <ler@lerctr.org>, Kevin Bowling <kevin.bowling@kev009.com> MFC after: 2 months Differential Revision: https://reviews.freebsd.org/D5315	2016-02-25 20:12:05 +00:00
markj	7f716b8a59	Remove a prototype for the non-existent prelist_del(). MFC after: 1 week	2016-02-17 23:53:24 +00:00
markj	9ec0a4438f	Rename the flags field of struct nd_defrouter to "raflags". This field contains the flags inherited from the corresponding router advertisement message and is not for storing private state. MFC after: 1 week	2016-02-12 21:15:57 +00:00
melifaro	93152c67c9	Implement interface link header precomputation API. Add if_requestencap() interface method which is capable of calculating various link headers for given interface. Right now there is support for INET/INET6/ARP llheader calculation (IFENCAP_LL type request). Other types are planned to support more complex calculation (L2 multipath lagg nexthops, tunnel encap nexthops, etc..). Reshape 'struct route' to be able to pass additional data (with is length) to prepend to mbuf. These two changes permits routing code to pass pre-calculated nexthop data (like L2 header for route w/gateway) down to the stack eliminating the need for other lookups. It also brings us closer to more complex scenarios like transparently handling MPLS nexthops and tunnel interfaces. Last, but not least, it removes layering violation introduced by flowtable code (ro_lle) and simplifies handling of existing if_output consumers. ARP/ND changes: Make arp/ndp stack pre-calculate link header upon installing/updating lle record. Interface link address change are handled by re-calculating headers for all lles based on if_lladdr event. After these changes, arpresolve()/nd6_resolve() returns full pre-calculated header for supported interfaces thus simplifying if_output(). Move these lookups to separate ether_resolve_addr() function which ether returs error or fully-prepared link header. Add <arp\|nd6_>resolve_addr() compat versions to return link addresses instead of pre-calculated data. BPF changes: Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT. Despite the naming, both of there have ther header "complete". The only difference is that interface source mac has to be filled by OS for AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside BPF and not pollute if_output() routines. Convert BPF to pass prepend data via new 'struct route' mechanism. Note that it does not change non-optimized if_output(): ro_prepend handling is purely optional. Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI. It is not needed for ethernet anymore. The only remaining FDDI user is dev/pdq mostly untouched since 2007. FDDI support was eliminated from OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65). Flowtable changes: Flowtable violates layering by saving (and not correctly managing) rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated header data from that lle. Differential Revision: https://reviews.freebsd.org/D4102	2015-12-31 05:03:27 +00:00
smh	45d5617154	Revert r292275 & r292379 glebius has concerns about these changes so reverting those can be discussed and addressed. Sponsored by: Multiplay	2015-12-17 14:41:30 +00:00
smh	864cf18128	Fix lagg failover due to missing notifications When using lagg failover mode neither Gratuitous ARP (IPv4) or Unsolicited Neighbour Advertisements (IPv6) are sent to notify other nodes that the address may have moved. This results is slow failover, dropped packets and network outages for the lagg interface when the primary link goes down. We now use the new if_link_state_change_cond with the force param set to allow lagg to force through link state changes and hence fire a ifnet_link_event which are now monitored by rip and nd6. Upon receiving these events each protocol trigger the relevant notifications: * inet4 => Gratuitous ARP * inet6 => Unsolicited Neighbour Announce This also fixes the carp IPv6 NA's that stopped working after r251584 which added the ipv6_route__llma route. The new behavour can be controlled using the sysctls: * net.link.ether.inet.arp_on_link * net.inet6.icmp6.nd6_on_link Also removed unused param from lagg_port_state and added descriptions for the sysctls while here. PR: 156226 MFC after: 1 month Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D4111	2015-12-15 16:02:11 +00:00
melifaro	2bf2184989	Bring back the ability of passing cached route via nd6_output_ifp().	2015-11-15 16:02:22 +00:00
melifaro	0dee60f8c8	Eliminate nd6_llinfo_settimer(). All consumers were converted to use nd6_llinfo_settimer_locked() in r216022. Make nd6_llinfo_settimer_locked() static: last external consumer was converted in r288124.	2015-10-04 08:33:16 +00:00
melifaro	91b3356875	Eliminate nd6_nud_hint() and its TCP bindings. Initially function was introduced in r53541 (KAME initial commit) to "provide hints from upper layer protocols that indicate a connection is making "forward progress"" (quote from RFC 2461 7.3.1 Reachability Confirmation). However, it was converted to do nothing (e.g. just return) in r122922 (tcp_hostcache implementation) back in 2003. Some defines were moved to tcp_var.h in r169541. Then, it was broken (for non-corner cases) by r186119 (L2<>L3 split) in 2008 (NULL ifp in nd6_lookup). So, right now this code is broken and has no "real" base users. Differential Revision: https://reviews.freebsd.org/D3699	2015-09-27 05:29:34 +00:00
melifaro	88b54de46b	Use standard lle LLE_EXCLUSIVE request flags instead of its redefined version.	2015-09-22 20:45:04 +00:00
melifaro	536c20752f	Unify nd6 state switching by using newly-created nd6_llinfo_setstate() function. The change is mostly mechanical with the following exception: Last piece of nd6_resolve_slow() was refactored: ND6_LLINFO_PERMANENT condition was removed as always-true, explicit ND6_LLINFO_NOSTATE -> ND6_LLINFO_INCOMPLETE state transition was removed as duplicate. Reviewed by: ae Sponsored by: Yandex LLC	2015-09-21 11:19:53 +00:00
melifaro	493325342d	Simplify the way of attaching IPv6 link-layer header. Problem description: How do we currently perform layer 2 resolution and header imposition: For IPv4 we have the following chain: ip_output() -> (ether\|atm\|whatever)_output() -> arpresolve() Lookup is done in proper place (link-layer output routine) and it is possible to provide cached lle data. For IPv6 situation is more complex: ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_storelladdr() We have ip6_ouput() which calls nd6_output() instead of link output routine. nd6_output() does the following: * checks if lle exists, creates it if needed (similar to arpresolve()) * performes lle state transitions (similar to arpresolve()) * calls nd6_output_ifp() which pushes packets to link output routine along with running SeND/MAC hooks regardless of lle state (e.g. works as run-hooks placeholder). After that, iface output routine like ether_output() calls nd6_storelladdr() which performs lle lookup once again. As a result, we perform lookup twice for each outgoing packet for most types of interfaces. We also need to maintain runtime-checked table of 'nd6-free' interfaces (see nd6_need_cache()). Fix this behavior by eliminating first ND lookup. To be more specific: * make all nd6_output() consumers use nd6_output_ifp() instead * rename nd6_output[_slow]() to nd6_resolve_[slow]() * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics, e.g. copy L2 address to buffer instead of pushing packet towards lower layers * Make all nd6_storelladdr() users use nd6_resolve() * eliminate nd6_storelladdr() The resulting callchain is the following: ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve() Error handling: Currently sending packet to non-existing la results in ip6_<output\|forward> -> nd6_output() -> nd6_output _lle() which returns 0. In new scenario packet is propagated to <ether\|whatever>_output() -> nd6_resolve() which will return EWOULDBLOCK, and that result will be converted to 0. (And EWOULDBLOCK is actually used by IB/TOE code). Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D1469	2015-09-16 14:26:28 +00:00
melifaro	daa0d6c584	Constantify lookup key in several nd6_* functions.	2015-09-16 11:06:07 +00:00
melifaro	f6e576f283	Simplify nd6_cache_lladdr: * Move isRouter calculation code to separate nd6_is_router() function. * Make nd6_cache_lladdr() return void: its return value hasn't been used since r53541 KAME import in 1999. Sponsored by: Yandex LLC	2015-09-15 17:16:31 +00:00
melifaro	5ad1f2444d	* Do more fine-grained locking: call eventhandlers/free_entry without holding afdata wlock * convert per-af delete_address callback to global lltable_delete_entry() and more low-level "delete this lle" per-af callback * fix some bugs/inconsistencies in IPv4/IPv6 ifscrub procedures Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D3573	2015-09-14 16:48:19 +00:00
melifaro	e31cc5ffc6	Do not pass lle to nd6_ns_output(). Use newly-added nd6_llinfo_get_holdsrc() to extract desired IPv6 source from holdchain and pass it to the nd6_ns_output().	2015-09-05 14:14:03 +00:00
melifaro	54b3b78856	* Split allocation and table linking for lle's. Before that, the logic besides lle_create() was the following: return existing if found, create if not. This behaviour was error-prone since we had to deal with 'sudden' static<>dynamic lle changes. This commit fixes bunch of different issues like: - refcount leak when lle is converted to static. Simple check case: console 1: while true; do for i in `arp -an\|awk '$4~/incomp/{print$2}'\|tr -d '()'`; do arp -s $i 00:22:44:66:88:00 ; arp -d $i; done; done console 2: ping -f any-dead-host-in-L2 console 3: # watch for memory consumption: vmstat -m \| awk '$1~/lltable/{print$2}' - possible problems in arptimer() / nd6_timer() when dropping/reacquiring lock. New logic explicitly handles use-or-create cases in every lla_create user. Basically, most of the changes are purely mechanical. However, we explicitly avoid using existing lle's for interface/static LLE records. * While here, call lle_event handlers on all real table lle change. * Create lltable_free_entry() calling existing per-lltable lle_free_t callback for entry deletion	2015-08-20 12:05:17 +00:00
melifaro	20bb5966e2	MFP r274553: * Move lle creation/deletion from lla_lookup to separate functions: lla_lookup(LLE_CREATE) -> lla_create lla_lookup(LLE_DELETE) -> lla_delete lla_create now returns with LLE_EXCLUSIVE lock for lle. * Provide typedefs for new/existing lltable callbacks. Reviewed by: ae	2015-08-08 17:48:54 +00:00
hrs	64bd31eb61	- Remove ND6_IFF_IGNORELOOP. This functionality was useless in practice because a link where looped back NS messages are permanently observed does not work with either NDP or ARP for IPv4. - draft-ietf-6man-enhanced-dad is now RFC 7527. Discussed with: hiren MFC after: 3 days	2015-05-12 03:31:57 +00:00
hrs	218de7f1d6	- Implement loopback probing state in enhanced DAD algorithm. - Add no_dad and ignoreloop per-IF knob. no_dad disables DAD completely, and ignoreloop is to prevent infinite loop in loopback probing state when loopback is permanently expected.	2015-03-05 21:27:49 +00:00
hrs	9d87f6a3c3	Implement Enhanced DAD algorithm for IPv6 described in draft-ietf-6man-enhanced-dad-13. This basically adds a random nonce option (RFC 3971) to NS messages for DAD probe to detect a looped back packet. This looped back packet prevented DAD on some pseudo-interfaces which aggregates multiple L2 links such as lagg(4). The length of the nonce is set to 6 bytes. This algorithm can be disabled by setting net.inet6.ip6.dad_enhanced sysctl to 0 in a per-vnet basis. Reported by: hiren Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D1835	2015-03-02 17:30:26 +00:00
melifaro	abec420f04	* Use newly-created nd6_grab_holdchain() function to retrieve lle hold mbuf chain instead of calling full-blown nd6_output_lle() for each packet. This simplifies both callers and nd6_output_lle() implementation. * Make nd6_output_lle() static and remove now-unused lle and chain arguments. * Rename nd6_output_flush() -> nd6_flush_holdchain() to be consistent. * Move all pre-send transmit hooks to newly-created nd6_output_ifp(). Now nd6_output(), nd6_output_lle() and nd6_flush_holdchain() are using it to send mbufs to if_output. * Remove SeND hook from nd6_na_input() because it was implemented incorrectly since the beginning (r211501): - it tagged initial input mbuf (m) instead of m_hold - tagging _all_ mbufs in holdchain seems to be wrong anyway.	2015-01-08 18:02:05 +00:00
markj	f0265d5d83	Add refcounting to IPv6 DAD objects and simplify the DAD code to fix a number of races which could cause double frees or use-after-frees when performing DAD on an address. In particular, an IPv6 address can now only be marked as a duplicate from the DAD callout. Differential Revision: https://reviews.freebsd.org/D1258 Reviewed by: ae, hrs Reported by: rstone MFC after: 1 month	2014-12-08 04:44:40 +00:00
melifaro	95c680b9a3	Do not return unlocked/unreferenced lle in arpresolve/nd6_storelladdr - return lle flags IFF needed. Do not pass rte to arpresolve - pass is_gateway flag instead.	2014-11-27 23:06:25 +00:00
melifaro	89eb9b8cc2	* Make nd6_dad_duplicated() constant. * Simplify refcounting by using nd6_dad_add() / nd6_dad_del(). Reviewed by: ae MFC after: 2 weeks Sponsored by: Yandex LLC	2014-11-10 16:01:39 +00:00
melifaro	94efd07208	Remove unused 'struct route *' argument from nd6_output_flush().	2014-11-09 16:20:27 +00:00
melifaro	6ee59753ec	Further rework netinet6 address handling code: * Set ia address/mask values BEFORE attaching to address lists. Inet6 address assignment is not atomic, so the simplest way to do this atomically is to fill in ia before attach. * Validate irfa->ia_addr field before use (we permit ANY sockaddr in old code). * Do some renamings: in6_ifinit -> in6_notify_ifa (interaction with other subsystems is here) in6_setup_ifa -> in6_broadcast_ifa (LLE/Multicast/DaD code) in6_ifaddloop -> nd6_add_ifa_lle in6_ifremloop -> nd6_rem_ifa_lle * Split working with LLE and route announce code for last two. Add temporary in6_newaddrmsg() function to mimic current rtsock behaviour. * Call device SIOCSIFADDR handler IFF we're adding first address. In IPv4 we have to call it on every address change since ARP record is installed by arp_ifinit() which is called by given handler. IPv6 stack, on the opposite is responsible to call nd6_add_ifa_lle() so there is no reason to call SIOCSIFADDR often.	2014-01-19 16:07:27 +00:00
glebius	b4bc270e8f	Add const qualifier to the dst parameter of the ifnet if_output method.	2013-04-26 12:50:32 +00:00
ume	e33acd92c3	Add no_prefer_iface option. It stops treating the address on the interface as special by source address selection rule even when the interface is outgoing interface. This is desired in some situation. Requested by: hrs Reviewed by: IHANet folks including hrs MFC after: 1 week	2013-01-09 18:18:08 +00:00
delphij	3948ce713c	Remove __P. Submitted by: kevlo Reviewed by: md5(1) MFC after: 2 months	2012-10-22 21:49:56 +00:00
hrs	43a17c6d76	Remove "prefer_source" address selection option. FreeBSD has had an implementation of RFC 3484 for this purpose for a long time and "prefer_source" was never implemented actually. ND6_IFF_PREFER_SOURCE macro is left intact.	2012-07-09 06:21:46 +00:00
bz	bfb18e3280	Mark a couple of file local functions static and stop exporting them. MFC after: 1 week	2012-01-05 01:14:35 +00:00
jhb	7a0151720c	Use queue(3) macros instead of home-rolled versions in several places in the INET6 code. This includes retiring the 'ndpr_next' and 'pfr_next' macros. Submitted by: pluknet (earlier version) Reviewed by: pluknet	2011-12-29 18:25:18 +00:00
qingli	3b996bbc11	A default route learned from the RAs could be deleted manually after its installation. This removal may be accidental and can prevent the default route from being installed in the future if the associated default router has the best preference. The cause is the lack of status update in the default router on the state of its route installation in the kernel FIB. This patch fixes the described problem. Reviewed by: hrs, discussed with hrs MFC after: 5 days	2011-11-11 23:22:38 +00:00
hrs	4c2206b625	- Accept Router Advertisement messages even when net.inet6.ip6.forwarding=1. - A new per-interface knob IFF_ND6_NO_RADR and sysctl IPV6CTL_NO_RADR. This controls if accepting a route in an RA message as the default route. The default value for each interface can be set by net.inet6.ip6.no_radr. The system wide default value is 0. - A new sysctl: net.inet6.ip6.norbit_raif. This controls if setting R-bit in NA on RA accepting interfaces. The default is 0 (R-bit is set based on net.inet6.ip6.forwarding). Background: IPv6 host/router model suggests a router sends an RA and a host accepts it for router discovery. Because of that, KAME implementation does not allow accepting RAs when net.inet6.ip6.forwarding=1. Accepting RAs on a router can make the routing table confused since it can change the default router unintentionally. However, in practice there are cases where we cannot distinguish a host from a router clearly. For example, a customer edge router often works as a host against the ISP, and as a router against the LAN at the same time. Another example is a complex network configurations like an L2TP tunnel for IPv6 connection to Internet over an Ethernet link with another native IPv6 subnet. In this case, the physical interface for the native IPv6 subnet works as a host, and the pseudo-interface for L2TP works as the default IP forwarding route. Problem: Disabling processing RA messages when net.inet6.ip6.forwarding=1 and accepting them when net.inet6.ip6.forward=0 cause the following practical issues: - A router cannot perform SLAAC. It becomes a problem if a box has multiple interfaces and you want to use SLAAC on some of them, for example. A customer edge router for IPv6 Internet access service using an IPv6-over-IPv6 tunnel sometimes needs SLAAC on the physical interface for administration purpose; updating firmware and so on (link-local addresses can be used there, but GUAs by SLAAC are often used for scalability). - When a host has multiple IPv6 interfaces and it receives multiple RAs on them, controlling the default route is difficult. Router preferences defined in RFC 4191 works only when the routers on the links are under your control. Details of Implementation Changes: Router Advertisement messages will be accepted even when net.inet6.ip6.forwarding=1. More precisely, the conditions are as follow: (ACCEPT_RTADV && !NO_RADR && !ip6.forwarding) => Normal RA processing on that interface. (as IPv6 host) (ACCEPT_RTADV && (NO_RADR \|\| ip6.forwarding)) => Accept RA but add the router to the defroute list with rtlifetime=0 unconditionally. This effectively prevents from setting the received router address as the box's default route. (!ACCEPT_RTADV) => No RA processing on that interface. ACCEPT_RTADV and NO_RADR are per-interface knob. In short, all interface are classified as "RA-accepting" or not. An RA-accepting interface always processes RA messages regardless of ip6.forwarding. The difference caused by NO_RADR or ip6.forwarding is whether the RA source address is considered as the default router or not. R-bit in NA on the RA accepting interfaces is set based on net.inet6.ip6.forwarding. While RFC 6204 W-1 rule (for CPE case) suggests a router should disable the R-bit completely even when the box has net.inet6.ip6.forwarding=1, I believe there is no technical reason with doing so. This behavior can be set by a new sysctl net.inet6.ip6.norbit_raif (the default is 0). Usage: # ifconfig fxp0 inet6 accept_rtadv => accept RA on fxp0 # ifconfig fxp0 inet6 accept_rtadv no_radr => accept RA on fxp0 but ignore default route information in it. # sysctl net.inet6.ip6.norbit_no_radr=1 => R-bit in NAs on RA accepting interfaces will always be set to 0.	2011-06-06 02:14:23 +00:00
bz	0a90ef1728	MFP4: @176978-176982, 176984, 176990-176994, 177441 "Whitspace" churn after the VIMAGE/VNET whirls. Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed. Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOABLS (before r185088) and try to reduce the diff to stable/7 and earlier as good as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9. This also removes some header file pollution for putatively static global variables. Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed. Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days	2010-04-29 11:52:42 +00:00
hrs	2eb62239d7	Improve flexibility of receiving Router Advertisement and automatic link-local address configuration: - Convert a sysctl net.inet6.ip6.accept_rtadv to one for the default value of a per-IF flag ND6_IFF_ACCEPT_RTADV, not a global knob. The default value of the sysctl is 0. - Add a new per-IF flag ND6_IFF_AUTO_LINKLOCAL and convert a sysctl net.inet6.ip6.auto_linklocal to one for its default value. The default value of the sysctl is 1. - Make ND6_IFF_IFDISABLED more robust. It can be used to disable IPv6 functionality of an interface now. - Receiving RA is allowed if ip6_forwarding==0 and ND6_IFF_ACCEPT_RTADV is set on that interface. The former condition will be revisited later to support a "host + router" box like IPv6 CPE router. The current behavior is compatible with the older releases of FreeBSD. - The ifconfig(8) now supports these ND6 flags as well as "nud", "prefer_source", and "disabled" in ndp(8). The ndp(8) now supports "auto_linklocal". Discussed with: bz and jinmei Reviewed by: bz MFC after: 3 days	2009-09-12 22:08:20 +00:00
rwatson	88f8de4d40	Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)	2009-07-16 21:13:04 +00:00
rwatson	57ca4583e7	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
zec	8b1f38241a	Introduce an infrastructure for dismantling vnet instances. Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)	2009-06-08 17:15:40 +00:00
kmacy	24b38efdce	Change if_output to take a struct route as its fourth argument in order to allow passing a cached struct llentry * down to L2 Reviewed by: rwatson	2009-04-16 20:30:28 +00:00
qingli	c6a0a000ca	in6_clsroute() was applied to prefix routes causing some of them to expire. in6_clsroute() was only applied to cloned routes that are no longer applicable after the arp-v2 commit.	2008-12-17 10:03:49 +00:00
qingli	ec826ad5c7	This main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion	2008-12-15 06:10:57 +00:00
bz	83a32f8750	Put a global variables, which were virtualized but formerly missed under VIMAGE_GLOBAL. Start putting the extern declarations of the virtualized globals under VIMAGE_GLOBAL as the globals themsevles are already. This will help by the time when we are going to remove the globals entirely. While there garbage collect a few dead externs from ip6_var.h. Sponsored by: The FreeBSD Foundation	2008-12-11 16:26:38 +00:00

1 2

74 Commits