/*-
 * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the name of the project nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	$KAME: in6_src.c,v 1.132 2003/08/26 04:42:27 keiichi Exp $
 */

/*-
 * Copyright (c) 1982, 1986, 1991, 1993
 *	The Regents of the University of California.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	@(#)in_pcb.c	8.2 (Berkeley) 1/4/94
 */

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include "opt_inet.h"
#include "opt_inet6.h"
This patch provides the back end support for equal-cost multi-path
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
was disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion would trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testing on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
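
The add and delete behaviour described in this message can be modelled in userland. What follows is a minimal sketch under stated assumptions, not the kernel's radix-tree code: `toy_table`, `toy_route_add` and `toy_route_delete` are hypothetical names, and the fixed-size array with string-keyed routes is a simplification chosen for brevity. A delete without a gateway succeeds only when the destination has exactly one route; otherwise it returns ESRCH, the errno whose message is "No such process".

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ROUTES 16

/* Hypothetical userland model of a multipath-capable route table. */
struct toy_route {
	char dst[32];	/* destination, e.g. "192.103.54.0/24" or "default" */
	char gw[32];	/* gateway address */
	int used;	/* non-zero when the slot holds a route */
};

struct toy_table {
	struct toy_route r[TOY_MAX_ROUTES];
};

/* Add a route; only an identical (dst, gw) pair counts as a duplicate. */
static int
toy_route_add(struct toy_table *t, const char *dst, const char *gw)
{
	int i, slot = -1;

	if (strlen(dst) >= sizeof(t->r[0].dst) ||
	    strlen(gw) >= sizeof(t->r[0].gw))
		return (EINVAL);
	for (i = 0; i < TOY_MAX_ROUTES; i++) {
		if (!t->r[i].used) {
			if (slot < 0)
				slot = i;	/* remember first free slot */
			continue;
		}
		if (strcmp(t->r[i].dst, dst) == 0 &&
		    strcmp(t->r[i].gw, gw) == 0)
			return (EEXIST);
	}
	if (slot < 0)
		return (ENOMEM);
	strcpy(t->r[slot].dst, dst);
	strcpy(t->r[slot].gw, gw);
	t->r[slot].used = 1;
	return (0);
}

/* Delete a route; gw may be NULL only when dst has a single route. */
static int
toy_route_delete(struct toy_table *t, const char *dst, const char *gw)
{
	int i, match = -1, nmatch = 0;

	for (i = 0; i < TOY_MAX_ROUTES; i++) {
		if (!t->r[i].used || strcmp(t->r[i].dst, dst) != 0)
			continue;
		if (gw != NULL && strcmp(t->r[i].gw, gw) != 0)
			continue;
		match = i;
		nmatch++;
	}
	if (nmatch != 1)
		return (ESRCH);	/* no match, or ambiguous without a gateway */
	t->r[match].used = 0;
	return (0);
}
```

With two default routes installed, `toy_route_delete(&t, "default", NULL)` fails with ESRCH, while naming the gateway succeeds; once a single route remains, the gateway may be omitted.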
2008-04-13 05:45:14 +00:00
#include "opt_mpath.h"

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/malloc.h>
#include <sys/mbuf.h>
#include <sys/priv.h>
#include <sys/protosw.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <sys/sockio.h>
#include <sys/sysctl.h>
#include <sys/errno.h>
#include <sys/time.h>
MFp4:
Bring in updated jail support from bz_jail branch.
This enhances the current jail implementation to permit multiple
addresses per jail. In addition to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking, etc.
SCTP support was updated and supports IPv6 in jails as well.
Cpuset support permits jails to be bound to specific processor
sets after creation.
Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.
DDB 'show jails' command was added to aid debugging.
Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.
Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around in recent years.
Bump __FreeBSD_version for the aforementioned and in-kernel changes.
Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
on cluster machines as well as all the testers and people
who provided feedback the last months on freebsd-jail and
other channels.
- My employer, CK Software GmbH, for the support so I could work on this.
Reviewed by: (see above)
MFC after: 3 months (this is just so that I get the mail)
X-MFC Before: 7.2-RELEASE if possible
2008-11-29 14:32:14 +00:00
#include <sys/jail.h>
#include <sys/kernel.h>
#include <sys/sx.h>

#include <net/if.h>
#include <net/if_dl.h>
#include <net/route.h>
The main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as many locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplifying the logic in the routing code
The most notable end result is the obsolescence of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.
Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improve and refactor the code, and
provided valuable reviews
- Julian Elischer set up the perforce tree for me and has helped
me maintain that branch before the svn conversion
2008-12-15 06:10:57 +00:00
#include <net/if_llatbl.h>
#ifdef RADIX_MPATH
#include <net/radix_mpath.h>
#endif

#include <netinet/in.h>
#include <netinet/in_var.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/in_pcb.h>
#include <netinet/ip_var.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>

#include <netinet6/in6_var.h>
#include <netinet/ip6.h>
#include <netinet6/in6_pcb.h>
#include <netinet6/ip6_var.h>
#include <netinet6/scope6_var.h>
#include <netinet6/nd6.h>

static struct mtx addrsel_lock;
#define	ADDRSEL_LOCK_INIT()	mtx_init(&addrsel_lock, "addrsel_lock", NULL, MTX_DEF)
#define	ADDRSEL_LOCK()		mtx_lock(&addrsel_lock)
#define	ADDRSEL_UNLOCK()	mtx_unlock(&addrsel_lock)
#define	ADDRSEL_LOCK_ASSERT()	mtx_assert(&addrsel_lock, MA_OWNED)

static struct sx addrsel_sxlock;
#define	ADDRSEL_SXLOCK_INIT()	sx_init(&addrsel_sxlock, "addrsel_sxlock")
#define	ADDRSEL_SLOCK()		sx_slock(&addrsel_sxlock)
#define	ADDRSEL_SUNLOCK()	sx_sunlock(&addrsel_sxlock)
#define	ADDRSEL_XLOCK()		sx_xlock(&addrsel_sxlock)
#define	ADDRSEL_XUNLOCK()	sx_xunlock(&addrsel_sxlock)

#define ADDR_LABEL_NOTAPP (-1)
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing them in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)
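
The "reference copy plus per-instance copy" idea in this message can be illustrated in userland. What follows is a minimal sketch under stated assumptions, not the kernel's allocator or its real VNET() macro: `toy_vnet`, `toy_vnet_globals`, `toy_curvnet` and `TOY_VNET()` are hypothetical names, a plain struct stands in for the linker-set region, and the initial values are illustrative only. Each instance is initialized by copying the reference region, and the accessor macro resolves a symbol against whichever instance is current.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the contiguous region that holds virtualized globals. */
struct toy_vnet_globals {
	int prefer_tempaddr;
	int use_deprecated;
};

/* Reference copy: statically initialized, never used at run time. */
static const struct toy_vnet_globals toy_vnet_reference = {
	.prefer_tempaddr = 0,	/* illustrative default */
	.use_deprecated = 1,	/* illustrative default */
};

struct toy_vnet {
	void *data;	/* this instance's private copy of the region */
};

/* The "current" network stack instance (hypothetical). */
static struct toy_vnet *toy_curvnet;

/* Create an instance whose globals start out equal to the reference copy. */
static struct toy_vnet *
toy_vnet_alloc(void)
{
	struct toy_vnet *v;

	v = malloc(sizeof(*v));
	if (v == NULL)
		return (NULL);
	v->data = malloc(sizeof(toy_vnet_reference));
	if (v->data == NULL) {
		free(v);
		return (NULL);
	}
	memcpy(v->data, &toy_vnet_reference, sizeof(toy_vnet_reference));
	return (v);
}

static void
toy_vnet_free(struct toy_vnet *v)
{
	if (v != NULL) {
		free(v->data);
		free(v);
	}
}

/* Accessor: map (current instance, symbol) to the per-instance lvalue. */
#define	TOY_VNET(field) \
	(((struct toy_vnet_globals *)toy_curvnet->data)->field)

/* Mirrors the V_ naming convention used for virtualized globals. */
#define	V_toy_prefer_tempaddr	TOY_VNET(prefer_tempaddr)
```

Writing `V_toy_prefer_tempaddr = 1` while one instance is current leaves the value seen through another instance unchanged, which is the isolation property the accessor macro is meant to provide.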
2009-07-14 22:48:30 +00:00
static VNET_DEFINE(struct in6_addrpolicy, defaultaddrpolicy);
VNET_DEFINE(int, ip6_prefer_tempaddr);

#define	V_defaultaddrpolicy		VNET(defaultaddrpolicy)

static int selectroute __P((struct sockaddr_in6 *, struct ip6_pktopts *,
	struct ip6_moptions *, struct route_in6 *, struct ifnet **,
	struct rtentry **, int));
static int in6_selectif __P((struct sockaddr_in6 *, struct ip6_pktopts *,
	struct ip6_moptions *, struct route_in6 *ro, struct ifnet **));

static struct in6_addrpolicy *lookup_addrsel_policy(struct sockaddr_in6 *);

static void init_policy_queue(void);
static int add_addrsel_policyent(struct in6_addrpolicy *);
static int delete_addrsel_policyent(struct in6_addrpolicy *);
static int walk_addrsel_policy __P((int (*)(struct in6_addrpolicy *, void *),
	void *));
static int dump_addrsel_policyent(struct in6_addrpolicy *, void *);
static struct in6_addrpolicy *match_addrsel_policy(struct sockaddr_in6 *);

/*
 * Return an IPv6 address, which is the most appropriate for a given
 * destination and user specified options.
 * If necessary, this function looks up the routing table and returns
 * an entry to the caller for later use.
 */
#define REPLACE(r) do {\
Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course
of the next few weeks.
Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
2008-08-17 23:27:27 +00:00
	if ((r) < sizeof(V_ip6stat.ip6s_sources_rule) / \
		sizeof(V_ip6stat.ip6s_sources_rule[0])) /* check for safety */ \
		V_ip6stat.ip6s_sources_rule[(r)]++; \
	/* { \
	char ip6buf[INET6_ADDRSTRLEN], ip6b[INET6_ADDRSTRLEN]; \
	printf("in6_selectsrc: replace %s with %s by %d\n", ia_best ? ip6_sprintf(ip6buf, &ia_best->ia_addr.sin6_addr) : "none", ip6_sprintf(ip6b, &ia->ia_addr.sin6_addr), (r)); \
	} */ \
	goto replace; \
} while(0)
#define NEXT(r) do {\
	if ((r) < sizeof(V_ip6stat.ip6s_sources_rule) / \
		sizeof(V_ip6stat.ip6s_sources_rule[0])) /* check for safety */ \
		V_ip6stat.ip6s_sources_rule[(r)]++; \
	/* { \
	char ip6buf[INET6_ADDRSTRLEN], ip6b[INET6_ADDRSTRLEN]; \
	printf("in6_selectsrc: keep %s against %s by %d\n", ia_best ? ip6_sprintf(ip6buf, &ia_best->ia_addr.sin6_addr) : "none", ip6_sprintf(ip6b, &ia->ia_addr.sin6_addr), (r)); \
	} */ \
	goto next;		/* XXX: we can't use 'continue' here */ \
} while(0)
#define BREAK(r) do { \
	if ((r) < sizeof(V_ip6stat.ip6s_sources_rule) / \
		sizeof(V_ip6stat.ip6s_sources_rule[0])) /* check for safety */ \
		V_ip6stat.ip6s_sources_rule[(r)]++; \
	goto out;		/* XXX: we can't use 'break' here */ \
} while(0)
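
The three macros above share one pattern: a bounds-checked increment of a per-rule counter, wrapped in `do { ... } while (0)` so the macro behaves as a single statement. The following is a minimal standalone sketch of that pattern only, without the `goto`. The names `toy_ip6stat`, `TOY_NITEMS` and `TOY_COUNT_RULE` are hypothetical, and the array length of 16 is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical statistics structure with one counter per rule. */
struct toy_stat {
	unsigned long sources_rule[16];	/* length is an assumption */
};

static struct toy_stat toy_ip6stat;

/* Number of elements in a true array (not a pointer). */
#define	TOY_NITEMS(a)	(sizeof(a) / sizeof((a)[0]))

/*
 * Count a hit for rule r.  Indexes outside the array, including
 * negative ones, are silently ignored.  The do/while(0) wrapper lets
 * the macro be used as one statement, e.g. in an unbraced if/else.
 */
#define	TOY_COUNT_RULE(r) do {						\
	if ((r) >= 0 &&							\
	    (size_t)(r) < TOY_NITEMS(toy_ip6stat.sources_rule))		\
		toy_ip6stat.sources_rule[(r)]++;			\
} while (0)
```

Because of the wrapper, `if (cond) TOY_COUNT_RULE(0); else TOY_COUNT_RULE(1);` compiles as intended; a bare `if` or braced block in the macro body would not combine cleanly with the trailing semicolon and `else`.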

int
in6_selectsrc(struct sockaddr_in6 *dstsock, struct ip6_pktopts *opts,
    struct inpcb *inp, struct route_in6 *ro, struct ucred *cred,
    struct ifnet **ifpp, struct in6_addr *srcp)
{
	struct in6_addr dst;
	struct ifnet *ifp = NULL;
	struct in6_ifaddr *ia = NULL, *ia_best = NULL;
	struct in6_pktinfo *pi = NULL;
	int dst_scope = -1, best_scope = -1, best_matchlen = -1;
	struct in6_addrpolicy *dst_policy = NULL, *best_policy = NULL;
	u_int32_t odstzone;
	int prefer_tempaddr;
	int error;
	struct ip6_moptions *mopts;

	KASSERT(srcp != NULL, ("%s: srcp is NULL", __func__));

	dst = dstsock->sin6_addr; /* make a copy for local operation */
	if (ifpp)
		*ifpp = NULL;

	if (inp != NULL) {
		INP_LOCK_ASSERT(inp);
		mopts = inp->in6p_moptions;
	} else {
		mopts = NULL;
	}

	/*
	 * If the source address is explicitly specified by the caller,
	 * check if the requested source address is indeed a unicast address
	 * assigned to the node, and can be used as the packet's source
	 * address.  If everything is okay, use the address as source.
	 */
	if (opts && (pi = opts->ip6po_pktinfo) &&
	    !IN6_IS_ADDR_UNSPECIFIED(&pi->ipi6_addr)) {
		struct sockaddr_in6 srcsock;
		struct in6_ifaddr *ia6;

		/* get the outgoing interface */
		if ((error = in6_selectif(dstsock, opts, mopts, ro, &ifp)) != 0)
			return (error);

		/*
		 * determine the appropriate zone id of the source based on
		 * the zone of the destination and the outgoing interface.
		 * If the specified address is ambiguous wrt the scope zone,
		 * the interface must be specified; otherwise, ifa_ifwithaddr()
		 * will fail matching the address.
		 */
		bzero(&srcsock, sizeof(srcsock));
		srcsock.sin6_family = AF_INET6;
		srcsock.sin6_len = sizeof(srcsock);
		srcsock.sin6_addr = pi->ipi6_addr;
		if (ifp) {
			error = in6_setscope(&srcsock.sin6_addr, ifp, NULL);
			if (error)
				return (error);
		}
		if (cred != NULL && (error = prison_local_ip6(cred,
		    &srcsock.sin6_addr, (inp != NULL &&
		    (inp->inp_flags & IN6P_IPV6_V6ONLY) != 0))) != 0)
			return (error);

		ia6 = (struct in6_ifaddr *)ifa_ifwithaddr(
		    (struct sockaddr *)&srcsock);
		if (ia6 == NULL ||
		    (ia6->ia6_flags & (IN6_IFF_ANYCAST | IN6_IFF_NOTREADY))) {
			if (ia6 != NULL)
				ifa_free(&ia6->ia_ifa);
			return (EADDRNOTAVAIL);
		}
		pi->ipi6_addr = srcsock.sin6_addr; /* XXX: this overrides pi */
		if (ifpp)
			*ifpp = ifp;
		bcopy(&ia6->ia_addr.sin6_addr, srcp, sizeof(*srcp));
		ifa_free(&ia6->ia_ifa);
		return (0);
	}

	/*
	 * Otherwise, if the socket has already bound the source, just use it.
	 */
	if (inp != NULL && !IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_laddr)) {
		if (cred != NULL &&
		    (error = prison_local_ip6(cred, &inp->in6p_laddr,
		    ((inp->inp_flags & IN6P_IPV6_V6ONLY) != 0))) != 0)
			return (error);
		bcopy(&inp->in6p_laddr, srcp, sizeof(*srcp));
		return (0);
	}

	/*
	 * Bypass source address selection and use the primary jail IP
	 * if requested.
	 */
	if (cred != NULL && !prison_saddrsel_ip6(cred, srcp))
		return (0);

	/*
	 * If the address is not specified, choose the best one based on
	 * the outgoing interface and the destination address.
	 */
	/* get the outgoing interface */
	if ((error = in6_selectif(dstsock, opts, mopts, ro, &ifp)) != 0)
		return (error);

#ifdef DIAGNOSTIC
	if (ifp == NULL)	/* this should not happen */
		panic("in6_selectsrc: NULL ifp");
#endif
	error = in6_setscope(&dst, ifp, &odstzone);
	if (error)
		return (error);

	IN6_IFADDR_RLOCK();
	TAILQ_FOREACH(ia, &V_in6_ifaddrhead, ia_link) {
		int new_scope = -1, new_matchlen = -1;
		struct in6_addrpolicy *new_policy = NULL;
		u_int32_t srczone, osrczone, dstzone;
		struct in6_addr src;
		struct ifnet *ifp1 = ia->ia_ifp;

		/*
		 * We'll never take an address that breaks the scope zone
		 * of the destination.  We also skip an address if its zone
		 * does not contain the outgoing interface.
		 * XXX: we should probably use sin6_scope_id here.
		 */
		if (in6_setscope(&dst, ifp1, &dstzone) ||
		    odstzone != dstzone) {
			continue;
		}
		src = ia->ia_addr.sin6_addr;
		if (in6_setscope(&src, ifp, &osrczone) ||
		    in6_setscope(&src, ifp1, &srczone) ||
		    osrczone != srczone) {
			continue;
		}

		/* avoid unusable addresses */
		if ((ia->ia6_flags &
		    (IN6_IFF_NOTREADY | IN6_IFF_ANYCAST | IN6_IFF_DETACHED))) {
			continue;
		}
		if (!V_ip6_use_deprecated && IFA6_IS_DEPRECATED(ia))
			continue;

		if (cred != NULL &&
		    prison_local_ip6(cred, &ia->ia_addr.sin6_addr,
		    (inp != NULL &&
		    (inp->inp_flags & IN6P_IPV6_V6ONLY) != 0)) != 0)
			continue;

		/* Rule 1: Prefer same address */
		if (IN6_ARE_ADDR_EQUAL(&dst, &ia->ia_addr.sin6_addr)) {
			ia_best = ia;
			BREAK(1); /* there should be no better candidate */
		}

		if (ia_best == NULL)
			REPLACE(0);

		/* Rule 2: Prefer appropriate scope */
		if (dst_scope < 0)
			dst_scope = in6_addrscope(&dst);
		new_scope = in6_addrscope(&ia->ia_addr.sin6_addr);
		if (IN6_ARE_SCOPE_CMP(best_scope, new_scope) < 0) {
			if (IN6_ARE_SCOPE_CMP(best_scope, dst_scope) < 0)
				REPLACE(2);
			NEXT(2);
		} else if (IN6_ARE_SCOPE_CMP(new_scope, best_scope) < 0) {
			if (IN6_ARE_SCOPE_CMP(new_scope, dst_scope) < 0)
				NEXT(2);
			REPLACE(2);
		}

		/*
		 * Rule 3: Avoid deprecated addresses.  Note that the case of
		 * !ip6_use_deprecated is already rejected above.
		 */
		if (!IFA6_IS_DEPRECATED(ia_best) && IFA6_IS_DEPRECATED(ia))
			NEXT(3);
		if (IFA6_IS_DEPRECATED(ia_best) && !IFA6_IS_DEPRECATED(ia))
			REPLACE(3);

		/* Rule 4: Prefer home addresses */
		/*
		 * XXX: This is a TODO.  We should probably merge the MIP6
		 * case above.
		 */

		/* Rule 5: Prefer outgoing interface */
		if (ia_best->ia_ifp == ifp && ia->ia_ifp != ifp)
			NEXT(5);
		if (ia_best->ia_ifp != ifp && ia->ia_ifp == ifp)
			REPLACE(5);

		/*
		 * Rule 6: Prefer matching label
		 * Note that best_policy should be non-NULL here.
		 */
		if (dst_policy == NULL)
			dst_policy = lookup_addrsel_policy(dstsock);
		if (dst_policy->label != ADDR_LABEL_NOTAPP) {
			new_policy = lookup_addrsel_policy(&ia->ia_addr);
			if (dst_policy->label == best_policy->label &&
			    dst_policy->label != new_policy->label)
				NEXT(6);
			if (dst_policy->label != best_policy->label &&
			    dst_policy->label == new_policy->label)
				REPLACE(6);
		}

		/*
		 * Rule 7: Prefer public addresses.
		 * We allow users to reverse the logic by configuring
		 * a sysctl variable, so that privacy conscious users can
		 * always prefer temporary addresses.
		 */
		if (opts == NULL ||
		    opts->ip6po_prefer_tempaddr == IP6PO_TEMPADDR_SYSTEM) {
Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course
of the next few weeks.
Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
2008-08-17 23:27:27 +00:00
prefer_tempaddr = V_ip6_prefer_tempaddr ;
2003-11-04 20:22:33 +00:00
} else if ( opts - > ip6po_prefer_tempaddr = =
IP6PO_TEMPADDR_NOTPREFER ) {
prefer_tempaddr = 0 ;
} else
prefer_tempaddr = 1 ;
if ( ! ( ia_best - > ia6_flags & IN6_IFF_TEMPORARY ) & &
( ia - > ia6_flags & IN6_IFF_TEMPORARY ) ) {
if ( prefer_tempaddr )
REPLACE ( 7 ) ;
else
NEXT ( 7 ) ;
}
if ( ( ia_best - > ia6_flags & IN6_IFF_TEMPORARY ) & &
! ( ia - > ia6_flags & IN6_IFF_TEMPORARY ) ) {
if ( prefer_tempaddr )
NEXT ( 7 ) ;
else
REPLACE ( 7 ) ;
2000-07-04 16:35:15 +00:00
}
2003-11-04 20:22:33 +00:00
/*
* Rule 8 : prefer addresses on alive interfaces .
* This is a KAME specific rule .
*/
if ( ( ia_best - > ia_ifp - > if_flags & IFF_UP ) & &
! ( ia - > ia_ifp - > if_flags & IFF_UP ) )
NEXT ( 8 ) ;
if ( ! ( ia_best - > ia_ifp - > if_flags & IFF_UP ) & &
( ia - > ia_ifp - > if_flags & IFF_UP ) )
REPLACE ( 8 ) ;
/*
* Rule 14 : Use longest matching prefix .
* Note : in the address selection draft , this rule is
* documented as " Rule 8 " . However , since it is also
* documented that this rule can be overridden , we assign
* a large number so that it is easy to assign smaller numbers
* to more preferred rules .
*/
2005-07-25 12:31:43 +00:00
new_matchlen = in6_matchlen ( & ia - > ia_addr . sin6_addr , & dst ) ;
2003-11-04 20:22:33 +00:00
if ( best_matchlen < new_matchlen )
REPLACE ( 14 ) ;
if ( new_matchlen < best_matchlen )
NEXT ( 14 ) ;
/* Rule 15 is reserved. */
/*
* Last resort : just keep the current candidate .
* Or , do we need more rules ?
*/
continue ;
replace :
ia_best = ia ;
best_scope = ( new_scope > = 0 ? new_scope :
in6_addrscope ( & ia_best - > ia_addr . sin6_addr ) ) ;
best_policy = ( new_policy ? new_policy :
lookup_addrsel_policy ( & ia_best - > ia_addr ) ) ;
best_matchlen = ( new_matchlen > = 0 ? new_matchlen :
in6_matchlen ( & ia_best - > ia_addr . sin6_addr ,
2005-07-25 12:31:43 +00:00
& dst ) ) ;
2003-11-04 20:22:33 +00:00
next :
continue ;
out :
break ;
}
2009-06-25 16:35:28 +00:00
if ( ( ia = ia_best ) = = NULL ) {
IN6_IFADDR_RUNLOCK ( ) ;
2009-06-23 22:08:55 +00:00
return ( EADDRNOTAVAIL ) ;
2009-06-25 16:35:28 +00:00
}
2003-11-04 20:22:33 +00:00
2005-07-25 12:31:43 +00:00
if ( ifpp )
* ifpp = ifp ;
2009-06-23 22:08:55 +00:00
bcopy ( & ia - > ia_addr . sin6_addr , srcp , sizeof ( * srcp ) ) ;
2009-06-25 16:35:28 +00:00
IN6_IFADDR_RUNLOCK ( ) ;
2009-06-23 22:08:55 +00:00
return ( 0 ) ;
2003-11-04 20:22:33 +00:00
}
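Rule 14 above prefers the candidate that shares the longest prefix with the destination. A minimal user-space sketch of that comparison follows. The names `struct addr128`, `common_prefix_len()` and `prefer_by_matchlen()` are hypothetical and chosen for illustration only; this is not the kernel's `in6_matchlen()`, just one way the same idea could be written.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct in6_addr: 16 bytes in network order. */
struct addr128 {
	uint8_t b[16];
};

/* Number of leading bits the two addresses have in common (0..128). */
static int
common_prefix_len(const struct addr128 *x, const struct addr128 *y)
{
	int i, len = 0;

	for (i = 0; i < 16; i++) {
		uint8_t diff = x->b[i] ^ y->b[i];

		if (diff == 0) {
			len += 8;
			continue;
		}
		/* Count matching high-order bits of the first differing byte. */
		while ((diff & 0x80) == 0) {
			len++;
			diff <<= 1;
		}
		break;
	}
	return (len);
}

/*
 * Longest-prefix comparison in the spirit of Rule 14:
 * returns 1 if cand should replace best, -1 if best is kept,
 * and 0 if this rule cannot decide.
 */
static int
prefer_by_matchlen(const struct addr128 *dst, const struct addr128 *best,
    const struct addr128 *cand)
{
	int best_len = common_prefix_len(best, dst);
	int cand_len = common_prefix_len(cand, dst);

	if (best_len < cand_len)
		return (1);
	if (cand_len < best_len)
		return (-1);
	return (0);
}
```

In the loop above, a result of 1 corresponds to REPLACE(14) and -1 to NEXT(14); a tie falls through to the last-resort case, which keeps the current candidate.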
2007-07-05 16:23:49 +00:00
/*
* clone - meaningful only for bsdi and freebsd
*/
2003-11-04 20:22:33 +00:00
static int
2007-07-05 16:23:49 +00:00
selectroute ( struct sockaddr_in6 * dstsock , struct ip6_pktopts * opts ,
struct ip6_moptions * mopts , struct route_in6 * ro ,
The main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as many locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplifying the logic in the routing code.
The most notable end result is the obsolescence of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.
Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle; Kip has also been conducting
active functional testing
- Sam Leffler has helped me improve/refactor the code, and
provided valuable reviews
- Julian Elischer set up the Perforce tree for me and has helped
me maintain that branch before the svn conversion
2008-12-15 06:10:57 +00:00
struct ifnet * * retifp , struct rtentry * * retrt , int norouteok )
2003-11-04 20:22:33 +00:00
{
int error = 0 ;
struct ifnet * ifp = NULL ;
struct rtentry * rt = NULL ;
struct sockaddr_in6 * sin6_next ;
struct in6_pktinfo * pi = NULL ;
struct in6_addr * dst = & dstsock - > sin6_addr ;
#if 0
2006-12-12 12:17:58 +00:00
char ip6buf [ INET6_ADDRSTRLEN ] ;
2003-11-04 20:22:33 +00:00
if ( dstsock - > sin6_addr . s6_addr32 [ 0 ] = = 0 & &
dstsock - > sin6_addr . s6_addr32 [ 1 ] = = 0 & &
! IN6_IS_ADDR_LOOPBACK ( & dstsock - > sin6_addr ) ) {
printf ( " in6_selectroute: strange destination %s \n " ,
2006-12-12 12:17:58 +00:00
ip6_sprintf ( ip6buf , & dstsock - > sin6_addr ) ) ;
2003-11-04 20:22:33 +00:00
} else {
printf ( " in6_selectroute: destination = %s%%%d \n " ,
2006-12-12 12:17:58 +00:00
ip6_sprintf ( ip6buf , & dstsock - > sin6_addr ) ,
2003-11-04 20:22:33 +00:00
dstsock - > sin6_scope_id ) ; /* for debug */
}
# endif
/* If the caller specifies the outgoing interface explicitly, use it. */
if ( opts & & ( pi = opts - > ip6po_pktinfo ) ! = NULL & & pi - > ipi6_ifindex ) {
/* XXX boundary check is assumed to be already done. */
ifp = ifnet_byindex ( pi - > ipi6_ifindex ) ;
if ( ifp ! = NULL & &
2005-07-25 12:31:43 +00:00
( norouteok | | retrt = = NULL | |
IN6_IS_ADDR_MULTICAST ( dst ) ) ) {
2003-11-04 20:22:33 +00:00
/*
2005-08-12 15:27:25 +00:00
* we do not have to check or get the route for
2003-11-04 20:22:33 +00:00
* multicast .
*/
goto done ;
} else
goto getroute ;
}
/*
* If the destination address is a multicast address and the outgoing
* interface for the address is specified by the caller , use it .
*/
if ( IN6_IS_ADDR_MULTICAST ( dst ) & &
mopts ! = NULL & & ( ifp = mopts - > im6o_multicast_ifp ) ! = NULL ) {
goto done ; /* we do not need a route for multicast. */
}
getroute :
/*
* If the next hop address for the packet is specified by the caller ,
* use it as the gateway .
*/
if ( opts & & opts - > ip6po_nexthop ) {
struct route_in6 * ron ;
2008-12-15 06:10:57 +00:00
struct llentry * la ;
2003-11-04 20:22:33 +00:00
sin6_next = satosin6 ( opts - > ip6po_nexthop ) ;
2008-12-15 06:10:57 +00:00
2003-11-04 20:22:33 +00:00
/* at this moment, we only support AF_INET6 next hops */
if ( sin6_next - > sin6_family ! = AF_INET6 ) {
error = EAFNOSUPPORT ; /* or should we proceed? */
goto done ;
}
/*
* If the next hop is an IPv6 address , then the node identified
* by that address must be a neighbor of the sending host .
*/
ron = & opts - > ip6po_nextroute ;
2008-12-15 06:10:57 +00:00
/*
* XXX what do we do here ?
* PLZ to be fixing
*/
if ( ron - > ro_rt = = NULL ) {
rtalloc ( ( struct route * ) ron ) ; /* multi path case? */
if ( ron - > ro_rt = = NULL ) {
/* rtalloc() found no route, so there is nothing to free. */
error = EHOSTUNREACH ;
goto done ;
}
}
rt = ron - > ro_rt ;
ifp = rt - > rt_ifp ;
IF_AFDATA_LOCK ( ifp ) ;
la = lla_lookup ( LLTABLE6 ( ifp ) , 0 , ( struct sockaddr * ) sin6_next ) ;
IF_AFDATA_UNLOCK ( ifp ) ;
2008-12-16 02:30:42 +00:00
if ( la ! = NULL )
2008-12-15 06:10:57 +00:00
LLE_RUNLOCK ( la ) ;
else {
error = EHOSTUNREACH ;
goto done ;
}
#if 0
2003-11-04 20:22:33 +00:00
if ( ( ron - > ro_rt & &
( ron - > ro_rt - > rt_flags & ( RTF_UP | RTF_LLINFO ) ) ! =
( RTF_UP | RTF_LLINFO ) ) | |
2005-10-19 16:53:24 +00:00
! IN6_ARE_ADDR_EQUAL ( & satosin6 ( & ron - > ro_dst ) - > sin6_addr ,
& sin6_next - > sin6_addr ) ) {
2003-11-04 20:22:33 +00:00
if ( ron - > ro_rt ) {
RTFREE ( ron - > ro_rt ) ;
ron - > ro_rt = NULL ;
2000-07-04 16:35:15 +00:00
}
2003-11-04 20:22:33 +00:00
* satosin6 ( & ron - > ro_dst ) = * sin6_next ;
}
if ( ron - > ro_rt = = NULL ) {
rtalloc ( ( struct route * ) ron ) ; /* multi path case? */
if ( ron - > ro_rt = = NULL | |
! ( ron - > ro_rt - > rt_flags & RTF_LLINFO ) ) {
if ( ron - > ro_rt ) {
RTFREE ( ron - > ro_rt ) ;
ron - > ro_rt = NULL ;
}
error = EHOSTUNREACH ;
goto done ;
2000-07-04 16:35:15 +00:00
}
}
2008-12-15 06:10:57 +00:00
# endif
2003-11-04 20:22:33 +00:00
/*
* When cloning is required , try to allocate a route to the
* destination so that the caller can store path MTU
* information .
*/
2008-12-15 06:10:57 +00:00
goto done ;
2000-07-04 16:35:15 +00:00
}
/*
2003-11-04 20:22:33 +00:00
* Use a cached route if it exists and is valid , else try to allocate
* a new one . Note that we should check the address family of the
* cached destination , in case of sharing the cache with IPv4 .
2000-07-04 16:35:15 +00:00
*/
if ( ro ) {
if ( ro - > ro_rt & &
2002-01-21 20:02:36 +00:00
( ! ( ro - > ro_rt - > rt_flags & RTF_UP ) | |
2003-11-04 20:22:33 +00:00
( ( struct sockaddr * ) ( & ro - > ro_dst ) ) - > sa_family ! = AF_INET6 | |
2002-01-21 20:02:36 +00:00
! IN6_ARE_ADDR_EQUAL ( & satosin6 ( & ro - > ro_dst ) - > sin6_addr ,
2004-02-04 12:55:45 +00:00
dst ) ) ) {
2000-07-04 16:35:15 +00:00
RTFREE ( ro - > ro_rt ) ;
2003-11-04 20:22:33 +00:00
ro - > ro_rt = ( struct rtentry * ) NULL ;
2000-07-04 16:35:15 +00:00
}
2003-11-04 20:22:33 +00:00
if ( ro - > ro_rt = = ( struct rtentry * ) NULL ) {
2001-06-11 12:39:29 +00:00
struct sockaddr_in6 * sa6 ;
2000-07-04 16:35:15 +00:00
/* No route yet, so try to acquire one */
bzero ( & ro - > ro_dst , sizeof ( struct sockaddr_in6 ) ) ;
2001-06-11 12:39:29 +00:00
sa6 = ( struct sockaddr_in6 * ) & ro - > ro_dst ;
2003-11-04 20:22:33 +00:00
* sa6 = * dstsock ;
2003-11-05 16:09:21 +00:00
sa6 - > sin6_scope_id = 0 ;
2003-11-20 20:07:39 +00:00
This patch provides the back end support for equal-cost multi-path
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
was disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testing on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
2008-04-13 05:45:14 +00:00
# ifdef RADIX_MPATH
rtalloc_mpath ( ( struct route * ) ro ,
ntohl ( sa6 - > sin6_addr . s6_addr32 [ 3 ] ) ) ;
2008-12-15 06:10:57 +00:00
# else
2000-07-04 16:35:15 +00:00
ro - > ro_rt = rtalloc1 ( & ( ( struct route * ) ro )
2008-12-15 06:10:57 +00:00
- > ro_dst , 0 , 0UL ) ;
2003-10-23 21:41:00 +00:00
if ( ro - > ro_rt )
RT_UNLOCK ( ro - > ro_rt ) ;
2008-12-15 06:10:57 +00:00
# endif
2000-07-04 16:35:15 +00:00
}
2008-12-15 06:10:57 +00:00
2000-07-04 16:35:15 +00:00
/*
2003-11-04 20:22:33 +00:00
* do not care about the result if we have the nexthop
* explicitly specified .
2000-07-04 16:35:15 +00:00
*/
2003-11-04 20:22:33 +00:00
if ( opts & & opts - > ip6po_nexthop )
goto done ;
2000-07-04 16:35:15 +00:00
if ( ro - > ro_rt ) {
2003-11-04 20:22:33 +00:00
ifp = ro - > ro_rt - > rt_ifp ;
if ( ifp = = NULL ) { /* can this really happen? */
RTFREE ( ro - > ro_rt ) ;
ro - > ro_rt = NULL ;
}
2000-07-04 16:35:15 +00:00
}
2003-11-04 20:22:33 +00:00
if ( ro - > ro_rt = = NULL )
error = EHOSTUNREACH ;
rt = ro - > ro_rt ;
/*
* Check if the outgoing interface conflicts with
* the interface specified by ipi6_ifindex ( if specified ) .
* Note that loopback interface is always okay .
* ( this may happen when we are sending a packet to one of
* our own addresses . )
*/
2005-05-15 02:28:30 +00:00
if ( ifp & & opts & & opts - > ip6po_pktinfo & &
2004-02-04 12:55:45 +00:00
opts - > ip6po_pktinfo - > ipi6_ifindex ) {
2003-11-04 20:22:33 +00:00
if ( ! ( ifp - > if_flags & IFF_LOOPBACK ) & &
ifp - > if_index ! =
opts - > ip6po_pktinfo - > ipi6_ifindex ) {
error = EHOSTUNREACH ;
goto done ;
}
2000-07-04 16:35:15 +00:00
}
}
2003-11-04 20:22:33 +00:00
done :
if ( ifp = = NULL & & rt = = NULL ) {
/*
* This can happen if the caller did not pass a cached route
* nor any other hints. We treat this case as an error.
*/
error = EHOSTUNREACH ;
}
if ( error = = EHOSTUNREACH )
2008-08-17 23:27:27 +00:00
V_ip6stat . ip6s_noroute + + ;
2003-11-04 20:22:33 +00:00
2009-09-05 16:43:16 +00:00
if ( retifp ! = NULL ) {
2003-11-04 20:22:33 +00:00
* retifp = ifp ;
2009-09-05 16:43:16 +00:00
/*
* Adjust the " outgoing " interface . If we ' re going to loop
* the packet back to ourselves , the ifp would be the loopback
* interface . However , we ' d rather know the interface associated
* to the destination address ( which should probably be one of
* our own addresses . )
*/
if ( rt ) {
if ( ( rt - > rt_ifp - > if_flags & IFF_LOOPBACK ) & &
( rt - > rt_gateway - > sa_family = = AF_LINK ) )
* retifp =
ifnet_byindex ( ( ( struct sockaddr_dl * )
rt - > rt_gateway ) - > sdl_index ) ;
}
}
2003-11-04 20:22:33 +00:00
if ( retrt ! = NULL )
* retrt = rt ; /* rt may be NULL */
return ( error ) ;
2000-07-04 16:35:15 +00:00
}
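The cached-route check in selectroute() reuses ro->ro_rt only when the route is still up, the cached destination is an AF_INET6 address, and that address equals the requested destination; otherwise the entry is freed and a new lookup is made. A minimal sketch of that predicate is shown below. The type `struct sk_route_cache`, the `SK_*` constants and `cache_is_usable()` are hypothetical and exist only for this illustration; they are not kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants standing in for RTF_UP, AF_INET and AF_INET6. */
enum {
	SK_RTF_UP = 0x1,
	SK_AF_INET = 2,
	SK_AF_INET6 = 28
};

/* Hypothetical, flattened view of the fields the check looks at. */
struct sk_route_cache {
	bool has_route;		/* ro_rt != NULL */
	int rt_flags;		/* flags of the cached route */
	int dst_family;		/* address family of the cached destination */
	uint8_t dst_addr[16];	/* cached IPv6 destination */
};

/*
 * Returns true when the cached route may be reused for dst.
 * A cache shared with IPv4 may hold a non-IPv6 destination, so the
 * address family is checked before the addresses are compared.
 */
static bool
cache_is_usable(const struct sk_route_cache *rc, const uint8_t dst[16])
{
	if (!rc->has_route)
		return (false);
	if ((rc->rt_flags & SK_RTF_UP) == 0)
		return (false);
	if (rc->dst_family != SK_AF_INET6)
		return (false);
	return (memcmp(rc->dst_addr, dst, 16) == 0);
}
```

When the predicate is false for an existing entry, the code above releases the route and clears the pointer before allocating a new one, so a stale entry is never handed back to the caller.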
2005-07-25 12:31:43 +00:00
static int
2007-07-05 16:23:49 +00:00
in6_selectif ( struct sockaddr_in6 * dstsock , struct ip6_pktopts * opts ,
struct ip6_moptions * mopts , struct route_in6 * ro , struct ifnet * * retifp )
2005-07-25 12:31:43 +00:00
{
int error ;
struct route_in6 sro ;
struct rtentry * rt = NULL ;
if ( ro = = NULL ) {
bzero ( & sro , sizeof ( sro ) ) ;
ro = & sro ;
}
if ( ( error = selectroute ( dstsock , opts , mopts , ro , retifp ,
2008-12-15 06:10:57 +00:00
& rt , 1 ) ) ! = 0 ) {
2006-05-23 00:32:22 +00:00
if ( ro = = & sro & & rt & & rt = = sro . ro_rt )
2005-07-25 12:31:43 +00:00
RTFREE ( rt ) ;
return ( error ) ;
}
/*
* do not use a rejected or black hole route .
* XXX : this check should be done in the L2 output routine .
* However , if we skipped this check here , we ' d see the following
* scenario :
* - install a rejected route for a scoped address prefix
* ( like fe80 : : / 10 )
* - send a packet to a destination that matches the scoped prefix ,
* with ambiguity about the scope zone .
* - pick the outgoing interface from the route , and disambiguate the
* scope zone with the interface .
* - ip6_output ( ) would try to get another route with the " new "
* destination , which may be valid .
* - we ' d see no error on output .
* Although this may not be very harmful, it would still be confusing.
* We thus reject the case here .
*/
if ( rt & & ( rt - > rt_flags & ( RTF_REJECT | RTF_BLACKHOLE ) ) ) {
int flags = ( rt - > rt_flags & RTF_HOST ? EHOSTUNREACH : ENETUNREACH ) ;
2006-05-23 00:32:22 +00:00
if ( ro = = & sro & & rt & & rt = = sro . ro_rt )
2005-07-25 12:31:43 +00:00
RTFREE ( rt ) ;
return ( flags ) ;
}
2006-05-23 00:32:22 +00:00
if ( ro = = & sro & & rt & & rt = = sro . ro_rt )
2005-07-25 12:31:43 +00:00
RTFREE ( rt ) ;
return ( 0 ) ;
}
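in6_selectif() turns a reject or black-hole route into an error: EHOSTUNREACH for a host route and ENETUNREACH otherwise. The mapping can be sketched in isolation as follows. The `SK_RTF_*` values and `route_reject_error()` are hypothetical names used only here; the errno constants come from the standard `<errno.h>`.

```c
#include <errno.h>

/* Hypothetical flag bits standing in for RTF_HOST, RTF_REJECT, RTF_BLACKHOLE. */
enum {
	SK_RTF_HOST = 0x1,
	SK_RTF_REJECT = 0x2,
	SK_RTF_BLACKHOLE = 0x4
};

/*
 * Returns 0 if the route may be used, otherwise the errno value that
 * matches the kind of route that was rejected.
 */
static int
route_reject_error(int rt_flags)
{
	if ((rt_flags & (SK_RTF_REJECT | SK_RTF_BLACKHOLE)) == 0)
		return (0);
	return ((rt_flags & SK_RTF_HOST) ? EHOSTUNREACH : ENETUNREACH);
}
```

Keeping the mapping in one place makes it clear that the value returned is an errno, not a set of route flags.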
2007-07-05 16:23:49 +00:00
/*
* clone - meaningful only for bsdi and freebsd
*/
2005-07-25 12:31:43 +00:00
int
2007-07-05 16:23:49 +00:00
in6_selectroute ( struct sockaddr_in6 * dstsock , struct ip6_pktopts * opts ,
struct ip6_moptions * mopts , struct route_in6 * ro ,
    struct ifnet **retifp, struct rtentry **retrt)
2005-07-25 12:31:43 +00:00
{
2007-07-05 16:23:49 +00:00
2005-07-25 12:31:43 +00:00
	return (selectroute(dstsock, opts, mopts, ro, retifp,
	    retrt, 0));
2005-07-25 12:31:43 +00:00
}
2000-07-04 16:35:15 +00:00
/*
 * Default hop limit selection. The precedence is as follows:
 * 1. Hoplimit value specified via ioctl.
 * 2. (If the outgoing interface is detected) the current
 *    hop limit of the interface specified by router advertisement.
 * 3. The system default hoplimit.
2003-11-20 20:07:39 +00:00
 */
2000-07-04 16:35:15 +00:00
int
Another step assimilating IPv[46] PCB code - directly use
the inpcb names rather than the following IPv6 compat macros:
in6pcb,in6p_sp, in6p_ip6_nxt,in6p_flowinfo,in6p_vflag,
in6p_flags,in6p_socket,in6p_lport,in6p_fport,in6p_ppcb and
sotoin6pcb().
Apart from removing duplicate code in netipsec, this is a pure
whitespace, not a functional change.
Discussed with: rwatson
Reviewed by: rwatson (version before review requested changes)
MFC after: 4 weeks (set the timer and see then)
2008-12-15 21:50:54 +00:00
in6_selecthlim(struct inpcb *in6p, struct ifnet *ifp)
2000-07-04 16:35:15 +00:00
{
2007-07-05 16:23:49 +00:00
2000-07-04 16:35:15 +00:00
	if (in6p && in6p->in6p_hops >= 0)
2003-10-06 14:02:09 +00:00
		return (in6p->in6p_hops);
2000-07-04 16:35:15 +00:00
	else if (ifp)
2003-10-17 15:46:31 +00:00
		return (ND_IFINFO(ifp)->chlim);
2003-11-20 20:07:39 +00:00
	else if (in6p && !IN6_IS_ADDR_UNSPECIFIED(&in6p->in6p_faddr)) {
		struct route_in6 ro6;
		struct ifnet *lifp;
		bzero(&ro6, sizeof(ro6));
		ro6.ro_dst.sin6_family = AF_INET6;
		ro6.ro_dst.sin6_len = sizeof(struct sockaddr_in6);
		ro6.ro_dst.sin6_addr = in6p->in6p_faddr;
		rtalloc((struct route *)&ro6);
		if (ro6.ro_rt) {
			lifp = ro6.ro_rt->rt_ifp;
			RTFREE(ro6.ro_rt);
			if (lifp)
				return (ND_IFINFO(lifp)->chlim);
		} else
Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course
of the next few weeks.
Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
2008-08-17 23:27:27 +00:00
			return (V_ip6_defhlim);
2003-11-20 20:07:39 +00:00
	}
	return (V_ip6_defhlim);
2000-07-04 16:35:15 +00:00
}
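/*
 * Illustration only, not part of the kernel source: a minimal userland
 * sketch of the hop limit precedence documented for in6_selecthlim().
 * The names sketch_pcb, sketch_if and sketch_selecthlim are hypothetical
 * stand-ins, and the route lookup fallback of the real function is left out.
 */

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct sketch_pcb {
	int hops;	/* per-socket hop limit, -1 when the application set none */
};
struct sketch_if {
	int chlim;	/* hop limit learned from router advertisements */
};

/* Pick a hop limit using the same three-step precedence. */
static int
sketch_selecthlim(const struct sketch_pcb *pcb, const struct sketch_if *ifp,
    int sys_default)
{
	if (pcb != NULL && pcb->hops >= 0)
		return (pcb->hops);	/* 1. value set on the socket */
	if (ifp != NULL)
		return (ifp->chlim);	/* 2. value of the outgoing interface */
	return (sys_default);		/* 3. system-wide default */
}
```

/*
 * A negative per-socket value is treated as "unset", which is why the
 * first test is >= 0 rather than != 0.
 */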
/*
 * XXX: this is borrowed from in6_pcbbind(). If possible, we should
 * share this function among all *bsd*...
 */
int
2007-07-05 16:23:49 +00:00
in6_pcbsetport(struct in6_addr *laddr, struct inpcb *inp, struct ucred *cred)
2000-07-04 16:35:15 +00:00
{
	struct socket *so = inp->inp_socket;
	u_int16_t lport = 0, first, last, *lastport;
2009-02-05 14:06:09 +00:00
	int count, error, wild = 0, dorandom;
2000-07-04 16:35:15 +00:00
	struct inpcbinfo *pcbinfo = inp->inp_pcbinfo;
2006-04-25 12:09:58 +00:00
	INP_INFO_WLOCK_ASSERT(pcbinfo);
2008-04-17 21:38:18 +00:00
	INP_WLOCK_ASSERT(inp);
2006-04-25 12:09:58 +00:00
2009-02-05 14:06:09 +00:00
	error = prison_local_ip6(cred, laddr,
	    ((inp->inp_flags & IN6P_IPV6_V6ONLY) != 0));
	if (error)
		return (error);
MFp4:
Bring in updated jail support from bz_jail branch.
This enhances the current jail implementation to permit multiple
addresses per jail. In addition to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..
SCTP support was updated and supports IPv6 in jails as well.
Cpuset support permits jails to be bound to specific processor
sets after creation.
Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.
DDB 'show jails' command was added to aid debugging.
Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.
Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.
Bump __FreeBSD_version for the aforementioned and in-kernel changes.
Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
on cluster machines as well as all the testers and people
who provided feedback the last months on freebsd-jail and
other channels.
- My employer, CK Software GmbH, for the support so I could work on this.
Reviewed by: (see above)
MFC after: 3 months (this is just so that I get the mail)
X-MFC Before: 7.2-RELEASE if possible
2008-11-29 14:32:14 +00:00
2000-07-04 16:35:15 +00:00
	/* XXX: this is redundant when called from in6_pcbbind */
	if ((so->so_options & (SO_REUSEADDR | SO_REUSEPORT)) == 0)
		wild = INPLOOKUP_WILDCARD;
	inp->inp_flags |= INP_ANONPORT;
	if (inp->inp_flags & INP_HIGHPORT) {
		first = V_ipport_hifirstauto;	/* sysctl */
		last = V_ipport_hilastauto;
2007-04-30 23:12:05 +00:00
		lastport = &pcbinfo->ipi_lasthi;
2000-07-04 16:35:15 +00:00
	} else if (inp->inp_flags & INP_LOWPORT) {
2007-06-12 00:12:01 +00:00
		error = priv_check_cred(cred, PRIV_NETINET_RESERVEDPORT, 0);
2006-11-06 13:42:10 +00:00
		if (error)
2000-07-04 16:35:15 +00:00
			return error;
		first = V_ipport_lowfirstauto;	/* 1023 */
		last = V_ipport_lowlastauto;	/* 600 */
2007-04-30 23:12:05 +00:00
		lastport = &pcbinfo->ipi_lastlow;
2000-07-04 16:35:15 +00:00
	} else {
		first = V_ipport_firstauto;	/* sysctl */
		last = V_ipport_lastauto;
2007-04-30 23:12:05 +00:00
		lastport = &pcbinfo->ipi_lastport;
2000-07-04 16:35:15 +00:00
	}
2008-10-20 18:43:59 +00:00
	/*
	 * For UDP, use random port allocation as long as the user
	 * allows it. For TCP (and as of yet unknown) connections,
	 * use random port allocation only if the user allows it AND
	 * ipport_tick() allows it.
	 */
	if (V_ipport_randomized &&
	    (!V_ipport_stoprandom || pcbinfo == &V_udbinfo))
		dorandom = 1;
	else
		dorandom = 0;
2000-07-04 16:35:15 +00:00
	/*
2008-10-20 18:43:59 +00:00
	 * It makes no sense to do random port allocation if
	 * only one port is available.
	 */
	if (first == last)
		dorandom = 0;
	/* Make sure to not include UDP packets in the count. */
	if (pcbinfo != &V_udbinfo)
		V_ipport_tcpallocs++;
	/*
	 * Instead of having two loops further down counting up or down,
	 * make sure that first is always <= last and go with only one
	 * code path implementing all logic.
2000-07-04 16:35:15 +00:00
	 */
	if (first > last) {
2008-10-20 18:43:59 +00:00
		u_int16_t aux;
		aux = first;
		first = last;
		last = aux;
2000-07-04 16:35:15 +00:00
	}
2008-10-20 18:43:59 +00:00
	if (dorandom)
		*lastport = first + (arc4random() % (last - first));
	count = last - first;
	do {
		if (count-- < 0) {	/* completely used? */
			/* Undo an address bind that may have occurred. */
			inp->in6p_laddr = in6addr_any;
			return (EADDRNOTAVAIL);
		}
		++*lastport;
		if (*lastport < first || *lastport > last)
			*lastport = first;
		lport = htons(*lastport);
	} while (in6_pcblookup_local(pcbinfo, &inp->in6p_laddr,
	    lport, wild, cred));
2000-07-04 16:35:15 +00:00
	inp->inp_lport = lport;
	if (in_pcbinshash(inp) != 0) {
		inp->in6p_laddr = in6addr_any;
		inp->inp_lport = 0;
		return (EAGAIN);
	}
2003-10-06 14:02:09 +00:00
	return (0);
2000-07-04 16:35:15 +00:00
}
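/*
 * Illustration only, not part of the kernel source: a minimal userland
 * sketch of the wrap-around port scan used by in6_pcbsetport(). The
 * names sketch_pick_port, sketch_inuse_fn and sketch_table_inuse are
 * hypothetical; the in-use predicate stands in for in6_pcblookup_local(),
 * and the random starting point, byte swapping and PCB hashing are omitted.
 */

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: returns true if the port is already taken. */
typedef bool (*sketch_inuse_fn)(uint16_t port, void *arg);

/* Example predicate: arg points to a 65536-entry table of "taken" flags. */
static bool
sketch_table_inuse(uint16_t port, void *arg)
{
	const bool *taken = arg;

	return (taken[port]);
}

/*
 * Scan the range for a free port, starting just after *lastport and
 * wrapping around. Returns the port in host byte order, or 0 if every
 * port in the range is in use. *lastport is updated as the scan advances.
 */
static uint16_t
sketch_pick_port(uint16_t first, uint16_t last, uint16_t *lastport,
    sketch_inuse_fn inuse, void *arg)
{
	int count;

	/* Normalize so that one loop direction covers both range orders. */
	if (first > last) {
		uint16_t aux = first;

		first = last;
		last = aux;
	}
	count = last - first;	/* the range holds count + 1 ports */
	do {
		if (count-- < 0)
			return (0);	/* every port was tried once */
		++*lastport;
		if (*lastport < first || *lastport > last)
			*lastport = first;
	} while (inuse(*lastport, arg));
	return (*lastport);
}
```

/*
 * Swapping first and last up front is what lets a descending range such
 * as the low-port one (1023 down to 600) share this single ascending loop.
 */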
2003-10-30 15:29:17 +00:00
void
2007-07-05 16:23:49 +00:00
addrsel_policy_init(void)
2003-10-30 15:29:17 +00:00
{
2008-11-19 09:39:34 +00:00
	V_ip6_prefer_tempaddr = 0;
2003-10-30 15:29:17 +00:00
	init_policy_queue();
	/* initialize the "last resort" policy */
	bzero(&V_defaultaddrpolicy, sizeof(V_defaultaddrpolicy));
	V_defaultaddrpolicy.label = ADDR_LABEL_NOTAPP;
First pass at separating per-vnet initializer functions
from existing functions for initializing global state.
At this stage, the new per-vnet initializer functions are
directly called from the existing global initialization code,
which should in most cases result in compiler inlining those
new functions, hence yielding a near-zero functional change.
Modify the existing initializer functions which are invoked via
protosw, like ip_init() et al., to allow them to be invoked
multiple times, i.e. per each vnet. Global state, if any,
is initialized only if such functions are called within the
context of vnet0, which will be determined via the
IS_DEFAULT_VNET(curvnet) check (currently always true).
While here, V_irtualize a few remaining global UMA zones
used by net/netinet/netipsec networking code. While it is
not yet clear to me or anybody else whether this is the right
thing to do, at this stage this makes the code more readable,
and makes it easier to track uncollected UMA-zone-backed
objects on vnet removal. In the long run, it's quite possible
that some form of shared use of UMA zone pools among multiple
vnets should be considered.
Bump __FreeBSD_version due to changes in layout of structs
vnet_ipfw, vnet_inet and vnet_net.
Approved by: julian (mentor)
2009-04-06 22:29:41 +00:00
	if (!IS_DEFAULT_VNET(curvnet))
		return;
	ADDRSEL_LOCK_INIT();
	ADDRSEL_SXLOCK_INIT();
2003-10-30 15:29:17 +00:00
}
2003-11-04 20:22:33 +00:00
static struct in6_addrpolicy *
2007-07-05 16:23:49 +00:00
lookup_addrsel_policy(struct sockaddr_in6 *key)
2003-11-04 20:22:33 +00:00
{
	struct in6_addrpolicy *match = NULL;
	ADDRSEL_LOCK();
	match = match_addrsel_policy(key);
	if (match == NULL)
		match = &V_defaultaddrpolicy;
2003-11-04 20:22:33 +00:00
	else
		match->use++;
	ADDRSEL_UNLOCK();
	return (match);
}
2003-10-30 15:29:17 +00:00
/*
 * Subroutines to manage the address selection policy table via sysctl.
 */
struct walkarg {
	struct sysctl_req *w_req;
};
static int in6_src_sysctl(SYSCTL_HANDLER_ARGS);
SYSCTL_DECL(_net_inet6_ip6);
SYSCTL_NODE(_net_inet6_ip6, IPV6CTL_ADDRCTLPOLICY, addrctlpolicy,
    CTLFLAG_RD, in6_src_sysctl, "");
static int
in6_src_sysctl(SYSCTL_HANDLER_ARGS)
{
	struct walkarg w;
	if (req->newptr)
		return EPERM;
	bzero(&w, sizeof(w));
	w.w_req = req;
	return (walk_addrsel_policy(dump_addrsel_policyent, &w));
}
int
2007-07-05 16:23:49 +00:00
in6_src_ioctl(u_long cmd, caddr_t data)
2003-10-30 15:29:17 +00:00
{
	int i;
	struct in6_addrpolicy ent0;
	if (cmd != SIOCAADDRCTL_POLICY && cmd != SIOCDADDRCTL_POLICY)
		return (EOPNOTSUPP);	/* check for safety */
	ent0 = *(struct in6_addrpolicy *)data;
	if (ent0.label == ADDR_LABEL_NOTAPP)
		return (EINVAL);
	/* check if the prefix mask is consecutive. */
	if (in6_mask2len(&ent0.addrmask.sin6_addr, NULL) < 0)
		return (EINVAL);
	/* clear trailing garbage (if any) of the prefix address. */
	for (i = 0; i < 4; i++) {
		ent0.addr.sin6_addr.s6_addr32[i] &=
		    ent0.addrmask.sin6_addr.s6_addr32[i];
	}
	ent0.use = 0;
	switch (cmd) {
	case SIOCAADDRCTL_POLICY:
		return (add_addrsel_policyent(&ent0));
	case SIOCDADDRCTL_POLICY:
		return (delete_addrsel_policyent(&ent0));
	}
	return (0);		/* XXX: not reached; placates compilers */
}
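/*
 * Illustration only, not part of the kernel source: a minimal userland
 * sketch of the two validation steps in in6_src_ioctl(), namely checking
 * that a prefix mask is contiguous and clearing address bits outside the
 * mask. sketch_mask2len and sketch_apply_mask are hypothetical names;
 * sketch_mask2len is written from scratch to show the idea and is not a
 * copy of the kernel's in6_mask2len().
 */

```c
#include <stdint.h>

/*
 * Return the prefix length of a 128-bit mask, or -1 if the mask is not
 * contiguous (some 1 bit follows a 0 bit).
 */
static int
sketch_mask2len(const uint8_t mask[16])
{
	int i, len = 0;

	/* Whole bytes of ones. */
	for (i = 0; i < 16 && mask[i] == 0xff; i++)
		len += 8;
	if (i < 16) {
		uint8_t m = mask[i];

		/* Leading ones of the first byte that is not all ones. */
		for (; m & 0x80; m <<= 1)
			len++;
		if (m != 0)
			return (-1);	/* a 1 bit after a 0 bit in this byte */
		/* Every remaining byte must be zero. */
		for (i++; i < 16; i++)
			if (mask[i] != 0)
				return (-1);
	}
	return (len);
}

/* Clear every address bit that the mask does not cover. */
static void
sketch_apply_mask(uint8_t addr[16], const uint8_t mask[16])
{
	int i;

	for (i = 0; i < 16; i++)
		addr[i] &= mask[i];
}
```

/*
 * Normalizing the prefix this way means two requests that differ only in
 * bits beyond the prefix length compare equal in the duplicate check.
 */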
/*
 * The following is an implementation of the policy table using a
 * simple tail queue.
 * XXX such details should be hidden.
 * XXX an implementation using a binary tree should be more efficient.
 */
struct addrsel_policyent {
	TAILQ_ENTRY(addrsel_policyent) ape_entry;
	struct in6_addrpolicy ape_policy;
};
TAILQ_HEAD(addrsel_policyhead, addrsel_policyent);
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing them in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)
2009-07-14 22:48:30 +00:00
static VNET_DEFINE(struct addrsel_policyhead, addrsel_policytab);
2009-07-16 21:13:04 +00:00
#define	V_addrsel_policytab	VNET(addrsel_policytab)
2003-10-30 15:29:17 +00:00
static void
2007-07-05 16:23:49 +00:00
init_policy_queue(void)
2003-10-30 15:29:17 +00:00
{
2007-07-05 16:23:49 +00:00
	TAILQ_INIT(&V_addrsel_policytab);
2003-10-30 15:29:17 +00:00
}
static int
2007-07-05 16:23:49 +00:00
add_addrsel_policyent(struct in6_addrpolicy *newpolicy)
2003-10-30 15:29:17 +00:00
{
	struct addrsel_policyent *new, *pol;
2008-10-23 15:53:51 +00:00
	new = malloc(sizeof(*new), M_IFADDR,
2003-10-30 18:42:25 +00:00
	    M_WAITOK);
2005-08-17 16:46:55 +00:00
	ADDRSEL_XLOCK();
2003-10-30 15:29:17 +00:00
	ADDRSEL_LOCK();
	/* duplication check */
	TAILQ_FOREACH(pol, &V_addrsel_policytab, ape_entry) {
2005-10-19 16:53:24 +00:00
		if (IN6_ARE_ADDR_EQUAL(&newpolicy->addr.sin6_addr,
		    &pol->ape_policy.addr.sin6_addr) &&
		    IN6_ARE_ADDR_EQUAL(&newpolicy->addrmask.sin6_addr,
		    &pol->ape_policy.addrmask.sin6_addr)) {
2003-10-30 18:42:25 +00:00
			ADDRSEL_UNLOCK();
2005-08-17 16:46:55 +00:00
			ADDRSEL_XUNLOCK();
2008-10-23 15:53:51 +00:00
			free(new, M_IFADDR);
2003-10-30 15:29:17 +00:00
			return (EEXIST);	/* or override it? */
		}
	}
	bzero(new, sizeof(*new));
	/* XXX: should validate entry */
	new->ape_policy = *newpolicy;
	TAILQ_INSERT_TAIL(&V_addrsel_policytab, new, ape_entry);
2003-10-30 15:29:17 +00:00
	ADDRSEL_UNLOCK();
2005-08-17 16:46:55 +00:00
	ADDRSEL_XUNLOCK();
2003-10-30 15:29:17 +00:00
	return (0);
}
static int
2007-07-05 16:23:49 +00:00
delete_addrsel_policyent(struct in6_addrpolicy *key)
2003-10-30 15:29:17 +00:00
{
	struct addrsel_policyent *pol;
2005-08-17 16:46:55 +00:00
	ADDRSEL_XLOCK();
2003-10-30 15:29:17 +00:00
	ADDRSEL_LOCK();
	/* search for the entry in the table */
	TAILQ_FOREACH(pol, &V_addrsel_policytab, ape_entry) {
2005-10-19 16:53:24 +00:00
		if (IN6_ARE_ADDR_EQUAL(&key->addr.sin6_addr,
		    &pol->ape_policy.addr.sin6_addr) &&
		    IN6_ARE_ADDR_EQUAL(&key->addrmask.sin6_addr,
		    &pol->ape_policy.addrmask.sin6_addr)) {
2003-10-30 15:29:17 +00:00
			break;
		}
	}
2003-10-30 18:42:25 +00:00
	if (pol == NULL) {
		ADDRSEL_UNLOCK();
2005-08-17 16:46:55 +00:00
		ADDRSEL_XUNLOCK();
2003-10-30 15:29:17 +00:00
		return (ESRCH);
2003-10-30 18:42:25 +00:00
	}
2003-10-30 15:29:17 +00:00
	TAILQ_REMOVE(&V_addrsel_policytab, pol, ape_entry);
2003-10-30 15:29:17 +00:00
	ADDRSEL_UNLOCK();
2005-08-17 16:46:55 +00:00
	ADDRSEL_XUNLOCK();
2003-10-30 15:29:17 +00:00
	return (0);
}
static int
2008-01-08 19:08:58 +00:00
walk_addrsel_policy(int (*callback)(struct in6_addrpolicy *, void *),
2007-07-05 16:23:49 +00:00
    void *w)
2003-10-30 15:29:17 +00:00
{
	struct addrsel_policyent *pol;
	int error = 0;
2005-08-17 16:46:55 +00:00
	ADDRSEL_SLOCK();
	TAILQ_FOREACH(pol, &V_addrsel_policytab, ape_entry) {
2005-08-17 16:46:55 +00:00
		if ((error = (*callback)(&pol->ape_policy, w)) != 0) {
			ADDRSEL_SUNLOCK();
2003-10-30 15:29:17 +00:00
			return (error);
2005-08-17 16:46:55 +00:00
		}
2003-10-30 15:29:17 +00:00
	}
2005-08-17 16:46:55 +00:00
	ADDRSEL_SUNLOCK();
2003-10-30 15:29:17 +00:00
	return (error);
}
static int
2007-07-05 16:23:49 +00:00
dump_addrsel_policyent(struct in6_addrpolicy *pol, void *arg)
2003-10-30 15:29:17 +00:00
{
	int error = 0;
	struct walkarg *w = arg;
	error = SYSCTL_OUT(w->w_req, pol, sizeof(*pol));
	return (error);
}
2003-11-04 20:22:33 +00:00
static struct in6_addrpolicy *
2007-07-05 16:23:49 +00:00
match_addrsel_policy(struct sockaddr_in6 *key)
2003-11-04 20:22:33 +00:00
{
	struct addrsel_policyent *pent;
	struct in6_addrpolicy *bestpol = NULL, *pol;
	int matchlen, bestmatchlen = -1;
	u_char *mp, *ep, *k, *p, m;
	TAILQ_FOREACH(pent, &V_addrsel_policytab, ape_entry) {
2003-11-04 20:22:33 +00:00
		matchlen = 0;
		pol = &pent->ape_policy;
		mp = (u_char *)&pol->addrmask.sin6_addr;
		ep = mp + 16;	/* XXX: scope field? */
		k = (u_char *)&key->sin6_addr;
		p = (u_char *)&pol->addr.sin6_addr;
		for (; mp < ep && *mp; mp++, k++, p++) {
			m = *mp;
			if ((*k & m) != *p)
				goto next;	/* no match */
			if (m == 0xff)	/* short cut for a typical case */
				matchlen += 8;
			else {
				while (m >= 0x80) {
					matchlen++;
					m <<= 1;
				}
			}
		}
		/* matched. check if this is better than the current best. */
		if (bestpol == NULL ||
		    matchlen > bestmatchlen) {
			bestpol = pol;
			bestmatchlen = matchlen;
		}
	next:
		continue;
	}
	return (bestpol);
}
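/*
 * Illustration only, not part of the kernel source: a minimal userland
 * sketch of the longest-prefix match performed by match_addrsel_policy().
 * The names sketch_policy, sketch_match_len and sketch_best_policy are
 * hypothetical; a flat array replaces the tail queue, and, as in the
 * function above, stored prefixes are assumed to be already masked.
 */

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy entry: prefix, mask and a label. */
struct sketch_policy {
	uint8_t addr[16];
	uint8_t mask[16];
	int label;
};

/*
 * Return the number of mask bits if key falls inside the policy prefix,
 * or -1 if it does not. A zero mask byte ends the comparison.
 */
static int
sketch_match_len(const uint8_t key[16], const struct sketch_policy *pol)
{
	int i, len = 0;

	for (i = 0; i < 16 && pol->mask[i] != 0; i++) {
		uint8_t m = pol->mask[i];

		if ((key[i] & m) != pol->addr[i])
			return (-1);
		/* Count the leading one bits of this mask byte. */
		for (; m & 0x80; m <<= 1)
			len++;
	}
	return (len);
}

/* Return the matching entry with the longest prefix, or NULL if none. */
static const struct sketch_policy *
sketch_best_policy(const uint8_t key[16], const struct sketch_policy *tab,
    size_t n)
{
	const struct sketch_policy *best = NULL;
	int bestlen = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		int len = sketch_match_len(key, &tab[i]);

		if (len > bestlen) {
			best = &tab[i];
			bestlen = len;
		}
	}
	return (best);
}
```

/*
 * Starting bestlen at -1 lets a zero-length prefix (::/0) win when nothing
 * more specific matches, while a failed match (-1) never replaces it.
 */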