2005-01-07 01:45:51 +00:00
|
|
|
/*-
|
1994-09-15 10:36:56 +00:00
|
|
|
* Copyright (c) 1982, 1986, 1993
|
|
|
|
* The Regents of the University of California. All rights reserved.
|
|
|
|
*
|
|
|
|
* Redistribution and use in source and binary forms, with or without
|
|
|
|
* modification, are permitted provided that the following conditions
|
|
|
|
* are met:
|
|
|
|
* 1. Redistributions of source code must retain the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer.
|
|
|
|
* 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer in the
|
|
|
|
* documentation and/or other materials provided with the distribution.
|
|
|
|
* 4. Neither the name of the University nor the names of its contributors
|
|
|
|
* may be used to endorse or promote products derived from this software
|
|
|
|
* without specific prior written permission.
|
|
|
|
*
|
|
|
|
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
|
|
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
|
|
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
|
|
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
* SUCH DAMAGE.
|
|
|
|
*
|
1995-09-21 17:58:07 +00:00
|
|
|
* @(#)in_proto.c 8.2 (Berkeley) 2/9/95
|
1994-09-15 10:36:56 +00:00
|
|
|
*/
|
1994-05-24 10:09:53 +00:00
|
|
|
|
2007-10-07 20:44:24 +00:00
|
|
|
#include <sys/cdefs.h>
|
|
|
|
__FBSDID("$FreeBSD$");
|
|
|
|
|
1997-12-15 20:31:25 +00:00
|
|
|
#include "opt_ipx.h"
|
2003-08-07 18:16:59 +00:00
|
|
|
#include "opt_mrouting.h"
|
1999-12-22 19:13:38 +00:00
|
|
|
#include "opt_ipsec.h"
|
|
|
|
#include "opt_inet6.h"
|
2004-06-16 23:24:02 +00:00
|
|
|
#include "opt_pf.h"
|
2005-02-22 13:04:05 +00:00
|
|
|
#include "opt_carp.h"
|
2006-11-03 15:23:16 +00:00
|
|
|
#include "opt_sctp.h"
|
This patch provides the back end support for equal-cost multi-path
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testings on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
2008-04-13 05:45:14 +00:00
|
|
|
#include "opt_mpath.h"
|
1997-11-05 20:17:23 +00:00
|
|
|
|
1994-09-15 10:36:56 +00:00
|
|
|
#include <sys/param.h>
|
2003-02-19 22:32:43 +00:00
|
|
|
#include <sys/systm.h>
|
1995-05-11 00:13:26 +00:00
|
|
|
#include <sys/kernel.h>
|
1994-09-15 10:36:56 +00:00
|
|
|
#include <sys/socket.h>
|
|
|
|
#include <sys/domain.h>
|
Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.
Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.
Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively
#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif
Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.
Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.
Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.
De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.
Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.
Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
2008-12-10 23:12:39 +00:00
|
|
|
#include <sys/proc.h>
|
1995-11-20 12:28:21 +00:00
|
|
|
#include <sys/protosw.h>
|
1999-02-16 10:49:55 +00:00
|
|
|
#include <sys/queue.h>
|
1995-11-09 20:23:09 +00:00
|
|
|
#include <sys/sysctl.h>
|
1994-05-24 10:09:53 +00:00
|
|
|
|
1994-09-15 10:36:56 +00:00
|
|
|
#include <net/if.h>
|
|
|
|
#include <net/route.h>
|
This patch provides the back end support for equal-cost multi-path
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testings on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
2008-04-13 05:45:14 +00:00
|
|
|
#ifdef RADIX_MPATH
|
|
|
|
#include <net/radix_mpath.h>
|
|
|
|
#endif
|
2009-08-01 19:26:27 +00:00
|
|
|
#include <net/vnet.h>
|
1994-05-24 10:09:53 +00:00
|
|
|
|
1994-09-15 10:36:56 +00:00
|
|
|
#include <netinet/in.h>
|
|
|
|
#include <netinet/in_systm.h>
|
This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,
The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.
Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
me maintaining that branch before the svn conversion
2008-12-15 06:10:57 +00:00
|
|
|
#include <netinet/in_var.h>
|
1994-09-15 10:36:56 +00:00
|
|
|
#include <netinet/ip.h>
|
|
|
|
#include <netinet/ip_var.h>
|
|
|
|
#include <netinet/ip_icmp.h>
|
|
|
|
#include <netinet/igmp_var.h>
|
|
|
|
#include <netinet/tcp.h>
|
|
|
|
#include <netinet/tcp_timer.h>
|
|
|
|
#include <netinet/tcp_var.h>
|
|
|
|
#include <netinet/udp.h>
|
|
|
|
#include <netinet/udp_var.h>
|
2000-07-04 16:35:15 +00:00
|
|
|
#include <netinet/ip_encap.h>
|
1999-12-22 19:13:38 +00:00
|
|
|
|
1994-09-15 10:36:56 +00:00
|
|
|
/*
|
|
|
|
* TCP/IP protocol family: IP, ICMP, UDP, TCP.
|
|
|
|
*/
|
1994-05-24 10:09:53 +00:00
|
|
|
|
2005-08-10 06:41:04 +00:00
|
|
|
static struct pr_usrreqs nousrreqs;
|
|
|
|
|
2007-07-03 12:13:45 +00:00
|
|
|
#ifdef IPSEC
|
2002-10-16 02:25:05 +00:00
|
|
|
#include <netipsec/ipsec.h>
|
2007-07-03 12:13:45 +00:00
|
|
|
#endif /* IPSEC */
|
2002-10-16 02:25:05 +00:00
|
|
|
|
2006-11-03 15:23:16 +00:00
|
|
|
#ifdef SCTP
|
|
|
|
#include <netinet/in_pcb.h>
|
|
|
|
#include <netinet/sctp_pcb.h>
|
|
|
|
#include <netinet/sctp.h>
|
|
|
|
#include <netinet/sctp_var.h>
|
|
|
|
#endif /* SCTP */
|
|
|
|
|
2004-06-16 23:24:02 +00:00
|
|
|
#ifdef DEV_PFSYNC
|
|
|
|
#include <net/pfvar.h>
|
|
|
|
#include <net/if_pfsync.h>
|
|
|
|
#endif
|
|
|
|
|
2005-02-22 13:04:05 +00:00
|
|
|
#ifdef DEV_CARP
|
|
|
|
#include <netinet/ip_carp.h>
|
|
|
|
#endif
|
|
|
|
|
1994-09-15 10:36:56 +00:00
|
|
|
extern struct domain inetdomain;
|
2004-10-19 15:58:22 +00:00
|
|
|
|
|
|
|
/* Spacer for loadable protocols. */
|
2005-11-09 13:29:16 +00:00
|
|
|
#define IPPROTOSPACER \
|
|
|
|
{ \
|
|
|
|
.pr_domain = &inetdomain, \
|
|
|
|
.pr_protocol = PROTO_SPACER, \
|
|
|
|
.pr_usrreqs = &nousrreqs \
|
2004-10-19 15:58:22 +00:00
|
|
|
}
|
1994-09-06 22:42:31 +00:00
|
|
|
|
2001-09-03 20:03:55 +00:00
|
|
|
struct protosw inetsw[] = {
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = 0,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_IP,
|
|
|
|
.pr_init = ip_init,
|
|
|
|
.pr_slowtimo = ip_slowtimo,
|
|
|
|
.pr_drain = ip_drain,
|
|
|
|
.pr_usrreqs = &nousrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_DGRAM,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_UDP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = udp_input,
|
|
|
|
.pr_ctlinput = udp_ctlinput,
|
Added support for NAT-Traversal (RFC 3948) in IPsec stack.
Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry
Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele
(julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense
team, and all people who used / tried the NAT-T patch for years and
reported bugs, patches, etc...
X-MFC: never
Reviewed by: bz
Approved by: gnn(mentor)
Obtained from: NETASQ
2009-06-12 15:44:35 +00:00
|
|
|
.pr_ctloutput = udp_ctloutput,
|
2005-11-09 13:29:16 +00:00
|
|
|
.pr_init = udp_init,
|
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.
Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.
Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
|
|
|
#ifdef VIMAGE
|
|
|
|
.pr_destroy = udp_destroy,
|
|
|
|
#endif
|
2005-11-09 13:29:16 +00:00
|
|
|
.pr_usrreqs = &udp_usrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_STREAM,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_TCP,
|
|
|
|
.pr_flags = PR_CONNREQUIRED|PR_IMPLOPCL|PR_WANTRCVD,
|
|
|
|
.pr_input = tcp_input,
|
|
|
|
.pr_ctlinput = tcp_ctlinput,
|
|
|
|
.pr_ctloutput = tcp_ctloutput,
|
|
|
|
.pr_init = tcp_init,
|
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.
Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.
Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
|
|
|
#ifdef VIMAGE
|
|
|
|
.pr_destroy = tcp_destroy,
|
|
|
|
#endif
|
2005-11-09 13:29:16 +00:00
|
|
|
.pr_slowtimo = tcp_slowtimo,
|
|
|
|
.pr_drain = tcp_drain,
|
|
|
|
.pr_usrreqs = &tcp_usrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2006-11-03 15:23:16 +00:00
|
|
|
#ifdef SCTP
|
|
|
|
{
|
|
|
|
.pr_type = SOCK_DGRAM,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_SCTP,
|
|
|
|
.pr_flags = PR_WANTRCVD,
|
|
|
|
.pr_input = sctp_input,
|
|
|
|
.pr_ctlinput = sctp_ctlinput,
|
|
|
|
.pr_ctloutput = sctp_ctloutput,
|
|
|
|
.pr_init = sctp_init,
|
|
|
|
.pr_drain = sctp_drain,
|
|
|
|
.pr_usrreqs = &sctp_usrreqs
|
|
|
|
},
|
|
|
|
{
|
|
|
|
.pr_type = SOCK_SEQPACKET,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_SCTP,
|
|
|
|
.pr_flags = PR_WANTRCVD,
|
|
|
|
.pr_input = sctp_input,
|
|
|
|
.pr_ctlinput = sctp_ctlinput,
|
|
|
|
.pr_ctloutput = sctp_ctloutput,
|
|
|
|
.pr_drain = sctp_drain,
|
|
|
|
.pr_usrreqs = &sctp_usrreqs
|
|
|
|
},
|
|
|
|
|
|
|
|
{
|
|
|
|
.pr_type = SOCK_STREAM,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_SCTP,
|
|
|
|
.pr_flags = PR_WANTRCVD,
|
|
|
|
.pr_input = sctp_input,
|
|
|
|
.pr_ctlinput = sctp_ctlinput,
|
|
|
|
.pr_ctloutput = sctp_ctloutput,
|
|
|
|
.pr_drain = sctp_drain,
|
|
|
|
.pr_usrreqs = &sctp_usrreqs
|
|
|
|
},
|
|
|
|
#endif /* SCTP */
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_RAW,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = rip_input,
|
|
|
|
.pr_ctlinput = rip_ctlinput,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_ICMP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = icmp_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
2008-11-19 09:39:34 +00:00
|
|
|
.pr_init = icmp_init,
|
2005-11-09 13:29:16 +00:00
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_IGMP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = igmp_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_fasttimo = igmp_fasttimo,
|
|
|
|
.pr_slowtimo = igmp_slowtimo,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_RSVP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = rsvp_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1994-09-15 10:36:56 +00:00
|
|
|
},
|
2007-07-03 12:13:45 +00:00
|
|
|
#ifdef IPSEC
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_AH,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = ah4_input,
|
|
|
|
.pr_ctlinput = ah4_ctlinput,
|
|
|
|
.pr_usrreqs = &nousrreqs
|
2002-11-08 23:37:50 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_ESP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = esp4_input,
|
|
|
|
.pr_ctlinput = esp4_ctlinput,
|
|
|
|
.pr_usrreqs = &nousrreqs
|
2002-11-08 23:37:50 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_IPCOMP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = ipcomp4_input,
|
|
|
|
.pr_usrreqs = &nousrreqs
|
2002-11-08 23:37:50 +00:00
|
|
|
},
|
2007-07-03 12:13:45 +00:00
|
|
|
#endif /* IPSEC */
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_IPV4,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = encap4_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_init = encap_init,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1999-12-07 17:39:16 +00:00
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_MOBILE,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = encap4_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_init = encap_init,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
2002-09-06 17:12:50 +00:00
|
|
|
},
|
2005-12-21 21:29:45 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_ETHERIP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = encap4_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_init = encap_init,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
|
|
|
},
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_GRE,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = encap4_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_init = encap_init,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
2002-09-06 17:12:50 +00:00
|
|
|
},
|
1999-12-07 17:39:16 +00:00
|
|
|
# ifdef INET6
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_IPV6,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
|
|
|
.pr_input = encap4_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_init = encap_init,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1999-12-07 17:39:16 +00:00
|
|
|
},
|
|
|
|
#endif
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_PIM,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR|PR_LASTHDR,
|
2007-02-10 13:59:13 +00:00
|
|
|
.pr_input = encap4_input,
|
2005-11-09 13:29:16 +00:00
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
2003-08-07 18:16:59 +00:00
|
|
|
},
|
2004-06-16 23:24:02 +00:00
|
|
|
#ifdef DEV_PFSYNC
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_PFSYNC,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = pfsync_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
2004-06-16 23:24:02 +00:00
|
|
|
},
|
|
|
|
#endif /* DEV_PFSYNC */
|
2005-02-22 13:04:05 +00:00
|
|
|
#ifdef DEV_CARP
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_protocol = IPPROTO_CARP,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = carp_input,
|
|
|
|
.pr_output = (pr_output_t*)rip_output,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_usrreqs = &rip_usrreqs
|
2005-02-22 13:04:05 +00:00
|
|
|
},
|
|
|
|
#endif /* DEV_CARP */
|
2004-10-19 15:58:22 +00:00
|
|
|
/* Spacer n-times for loadable protocols. */
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
IPPROTOSPACER,
|
|
|
|
/* raw wildcard */
|
2005-11-09 13:29:16 +00:00
|
|
|
{
|
|
|
|
.pr_type = SOCK_RAW,
|
|
|
|
.pr_domain = &inetdomain,
|
|
|
|
.pr_flags = PR_ATOMIC|PR_ADDR,
|
|
|
|
.pr_input = rip_input,
|
|
|
|
.pr_ctloutput = rip_ctloutput,
|
|
|
|
.pr_init = rip_init,
|
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.
Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.
Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
|
|
|
#ifdef VIMAGE
|
|
|
|
.pr_destroy = rip_destroy,
|
|
|
|
#endif
|
2005-11-09 13:29:16 +00:00
|
|
|
.pr_usrreqs = &rip_usrreqs
|
1994-05-24 10:09:53 +00:00
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2002-03-19 21:25:46 +00:00
|
|
|
extern int in_inithead(void **, int);
|
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.
Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.
Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
|
|
|
extern int in_detachhead(void **, int);
|
1994-11-02 04:41:39 +00:00
|
|
|
|
2005-11-09 13:29:16 +00:00
|
|
|
struct domain inetdomain = {
|
|
|
|
.dom_family = AF_INET,
|
|
|
|
.dom_name = "internet",
|
|
|
|
.dom_protosw = inetsw,
|
|
|
|
.dom_protoswNPROTOSW = &inetsw[sizeof(inetsw)/sizeof(inetsw[0])],
|
This patch provides the back end support for equal-cost multi-path
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testings on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
2008-04-13 05:45:14 +00:00
|
|
|
#ifdef RADIX_MPATH
|
|
|
|
.dom_rtattach = rn4_mpath_inithead,
|
|
|
|
#else
|
2005-11-09 13:29:16 +00:00
|
|
|
.dom_rtattach = in_inithead,
|
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.
Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.
Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
|
|
|
#endif
|
|
|
|
#ifdef VIMAGE
|
|
|
|
.dom_rtdetach = in_detachhead,
|
This patch provides the back end support for equal-cost multi-path
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testings on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
2008-04-13 05:45:14 +00:00
|
|
|
#endif
|
2005-11-09 13:29:16 +00:00
|
|
|
.dom_rtoffset = 32,
|
This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,
The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.
Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
me maintaining that branch before the svn conversion
2008-12-15 06:10:57 +00:00
|
|
|
.dom_maxrtkey = sizeof(struct sockaddr_in),
|
|
|
|
.dom_ifattach = in_domifattach,
|
|
|
|
.dom_ifdetach = in_domifdetach
|
2005-11-09 13:29:16 +00:00
|
|
|
};
|
1994-05-24 10:09:53 +00:00
|
|
|
|
Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:
- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
occur each time a network stack is instantiated and destroyed. In the
!VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
For the VIMAGE case, we instead use SYSINIT's to track their order and
properties on registration, using them for each vnet when created/
destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
previously, as well as its dependency scheme: we now just use the
SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
they want init functions to be called for each virtual network stack
rather than just once at boot, compiling down to DOMAIN_SET() in the
non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
of modinfo or DOMAIN_SET() for init/uninit events. In some cases,
convert modular components from using modevent to using sysinit (where
appropriate). In some cases, do minor rejuggling of SYSINIT ordering
to make room for or better manage events.
Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup)
Discussed with: jhb, bz, julian, zec
Reviewed by: bz
Approved by: re (VIMAGE blanket)
2009-07-23 20:46:49 +00:00
|
|
|
VNET_DOMAIN_SET(inet);
|
1995-05-11 00:13:26 +00:00
|
|
|
|
1995-12-20 21:53:53 +00:00
|
|
|
SYSCTL_NODE(_net, PF_INET, inet, CTLFLAG_RW, 0,
|
|
|
|
"Internet Family");
|
1995-11-09 20:23:09 +00:00
|
|
|
|
1995-12-20 21:53:53 +00:00
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_IP, ip, CTLFLAG_RW, 0, "IP");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_ICMP, icmp, CTLFLAG_RW, 0, "ICMP");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_UDP, udp, CTLFLAG_RW, 0, "UDP");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_TCP, tcp, CTLFLAG_RW, 0, "TCP");
|
2006-11-03 15:23:16 +00:00
|
|
|
#ifdef SCTP
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_SCTP, sctp, CTLFLAG_RW, 0, "SCTP");
|
|
|
|
#endif
|
1995-12-20 21:53:53 +00:00
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_IGMP, igmp, CTLFLAG_RW, 0, "IGMP");
|
2007-07-03 12:13:45 +00:00
|
|
|
#ifdef IPSEC
|
2002-10-16 02:25:05 +00:00
|
|
|
/* XXX no protocol # to use, pick something "reserved" */
|
|
|
|
SYSCTL_NODE(_net_inet, 253, ipsec, CTLFLAG_RW, 0, "IPSEC");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_AH, ah, CTLFLAG_RW, 0, "AH");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_ESP, esp, CTLFLAG_RW, 0, "ESP");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_IPCOMP, ipcomp, CTLFLAG_RW, 0, "IPCOMP");
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_IPIP, ipip, CTLFLAG_RW, 0, "IPIP");
|
2007-07-03 12:13:45 +00:00
|
|
|
#endif /* IPSEC */
|
1997-02-18 20:46:36 +00:00
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_RAW, raw, CTLFLAG_RW, 0, "RAW");
|
2005-07-14 22:22:51 +00:00
|
|
|
#ifdef DEV_PFSYNC
|
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_PFSYNC, pfsync, CTLFLAG_RW, 0, "PFSYNC");
|
2003-08-07 18:16:59 +00:00
|
|
|
#endif
|
2005-02-22 13:04:05 +00:00
|
|
|
#ifdef DEV_CARP
|
2005-07-14 22:22:51 +00:00
|
|
|
SYSCTL_NODE(_net_inet, IPPROTO_CARP, carp, CTLFLAG_RW, 0, "CARP");
|
2005-02-22 13:04:05 +00:00
|
|
|
#endif
|