Add IPv6 related docs.
Reviewed by: phantom
parent
60843485ad
commit
9b8b207497
@ -177,6 +177,8 @@
|
||||
dict
|
||||
..
|
||||
doc
|
||||
IPv6
|
||||
..
|
||||
bind
|
||||
html
|
||||
..
|
||||
@ -275,6 +277,8 @@
|
||||
examples
|
||||
FreeBSD_version
|
||||
..
|
||||
IPv6
|
||||
..
|
||||
atapi
|
||||
..
|
||||
atm
|
||||
|
998
share/doc/IPv6/IMPLEMENTATION
Normal file
@ -0,0 +1,998 @@
|
||||
Implementation Note
|
||||
|
||||
KAME Project
|
||||
http://www.kame.net/
|
||||
$FreeBSD$
|
||||
|
||||
1. IPv6
|
||||
|
||||
1.1 Conformance
|
||||
|
||||
The KAME kit conforms, or tries to conform, to the latest set of IPv6
|
||||
specifications. For future reference we list some of the relevant documents
|
||||
below (NOTE: this is not a complete list - this is too hard to maintain...).
|
||||
For details please refer to the specific chapter in this document, the RFCs, the
manpages that come with KAME, or comments in the source code.
|
||||
|
||||
Conformance tests have been performed on the KAME STABLE kit
at the TAHI project. Results can be viewed at http://www.tahi.org/report/KAME/.
|
||||
We also attended Univ. of New Hampshire IOL tests (http://www.iol.unh.edu/)
|
||||
in the past, with our past snapshots.
|
||||
|
||||
RFC1639: FTP Operation Over Big Address Records (FOOBAR)
|
||||
* RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428,
  then RFC1639 if that fails.
|
||||
RFC1886: DNS Extensions to support IPv6
|
||||
RFC1933: Transition Mechanisms for IPv6 Hosts and Routers
|
||||
* IPv4 compatible address is not supported.
|
||||
* automatic tunneling (4.3) is not supported.
|
||||
* "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
|
||||
and it covers "configured tunnel" described in the spec.
|
||||
See 1.5 in this document for details.
|
||||
RFC1981: Path MTU Discovery for IPv6
|
||||
RFC2080: RIPng for IPv6
|
||||
* KAME-supplied route6d, bgpd and hroute6d support this.
|
||||
RFC2283: Multiprotocol Extensions for BGP-4
|
||||
* so-called "BGP4+".
|
||||
* KAME-supplied bgpd supports this.
|
||||
RFC2292: Advanced Sockets API for IPv6
|
||||
* For supported library functions/kernel APIs, see sys/netinet6/ADVAPI.
|
||||
RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
|
||||
* RFC2362 defines packet formats for PIM-SM. draft-ietf-pim-ipv6-01.txt
|
||||
is written based on this.
|
||||
RFC2373: IPv6 Addressing Architecture
|
||||
* KAME supports node required addresses, and conforms to the scope
|
||||
requirement.
|
||||
RFC2374: An IPv6 Aggregatable Global Unicast Address Format
|
||||
* KAME supports 64-bit length of Interface ID.
|
||||
RFC2375: IPv6 Multicast Address Assignments
|
||||
* Userland applications use the well-known addresses assigned in the RFC.
|
||||
RFC2428: FTP Extensions for IPv6 and NATs
|
||||
* RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428,
  then RFC1639 if that fails.
|
||||
RFC2460: IPv6 specification
|
||||
RFC2461: Neighbor discovery for IPv6
|
||||
* See 1.2 in this document for details.
|
||||
RFC2462: IPv6 Stateless Address Autoconfiguration
|
||||
* See 1.4 in this document for details.
|
||||
RFC2463: ICMPv6 for IPv6 specification
|
||||
* See 1.9 in this document for details.
|
||||
RFC2464: Transmission of IPv6 Packets over Ethernet Networks
|
||||
RFC2465: MIB for IPv6: Textual Conventions and General Group
|
||||
* Necessary statistics are gathered by the kernel. Actual IPv6 MIB
|
||||
support is provided as patchkit for ucd-snmp.
|
||||
RFC2466: MIB for IPv6: ICMPv6 group
|
||||
* Necessary statistics are gathered by the kernel. Actual IPv6 MIB
|
||||
support is provided as patchkit for ucd-snmp.
|
||||
RFC2467: Transmission of IPv6 Packets over FDDI Networks
|
||||
RFC2472: IPv6 over PPP
|
||||
RFC2492: IPv6 over ATM Networks
|
||||
* only PVC is supported.
|
||||
RFC2497: Transmission of IPv6 Packets over ARCnet Networks
|
||||
RFC2545: Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing
|
||||
RFC2553: Basic Socket Interface Extensions for IPv6
|
||||
* IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind
|
||||
socket (3.8) are supported.
|
||||
see 1.12 in this document for details.
|
||||
RFC2675: IPv6 Jumbograms
|
||||
* See 1.7 in this document for details.
|
||||
RFC2710: Multicast Listener Discovery for IPv6
|
||||
RFC2711: IPv6 router alert option
|
||||
draft-ietf-ipngwg-router-renum-08: Router renumbering for IPv6
|
||||
draft-ietf-ipngwg-icmp-namelookups-02: IPv6 Name Lookups Through ICMP
|
||||
draft-ietf-ipngwg-icmp-name-lookups-03: IPv6 Name Lookups Through ICMP
|
||||
draft-ietf-pim-ipv6-01.txt: PIM for IPv6
|
||||
* pim6dd implements dense mode. pim6sd implements sparse mode.
|
||||
draft-ietf-dhc-dhcpv6-14.txt: DHCPv6
|
||||
draft-ietf-dhc-v6exts-11.txt: Extensions for DHCPv6
|
||||
* kame/dhcp6 has test implementation, which will not be compiled in
|
||||
default compilation.
|
||||
draft-itojun-ipv6-tcp-to-anycast-00:
|
||||
Disconnecting TCP connection toward IPv6 anycast address
|
||||
draft-yamamoto-wideipv6-comm-model-00
|
||||
* See 1.6 in this document for details.
|
||||
draft-ietf-ipngwg-scopedaddr-format-00.txt:
|
||||
An Extension of Format for IPv6 Scoped Addresses
|
||||
|
||||
1.2 Neighbor Discovery
|
||||
|
||||
Neighbor Discovery is fairly stable. Currently Address Resolution,
|
||||
Duplicate Address Detection, and Neighbor Unreachability Detection
|
||||
are supported. In the near future we will be adding Proxy Neighbor
|
||||
Advertisement support in the kernel and Unsolicited Neighbor Advertisement
|
||||
transmission command as an admin tool.
|
||||
|
||||
If DAD fails, the address will be marked "duplicated" and a message will be
generated to syslog (and usually to the console). The "duplicated" mark
can be checked with ifconfig. It is the administrator's responsibility to check
for and recover from DAD failures.
|
||||
The behavior should be improved in the near future.
|
||||
|
||||
Some network drivers loop multicast packets back to themselves,
even if instructed not to do so (especially in promiscuous mode).
In such cases DAD may fail, because the DAD engine sees an inbound NS packet
(actually from the node itself) and considers it a sign of a duplicate.
You may want to look at the #if condition marked "heuristics" in
sys/netinet6/nd6_nbr.c:nd6_dad_timer() as a workaround (note that the code
fragment in the "heuristics" section is not spec conformant).
|
||||
|
||||
Neighbor Discovery specification (RFC2461) does not talk about neighbor
|
||||
cache handling in the following cases:
|
||||
(1) when there was no neighbor cache entry, node received unsolicited
|
||||
RS/NS/NA/redirect packet without link-layer address
|
||||
(2) neighbor cache handling on medium without link-layer address
|
||||
(we need a neighbor cache entry for IsRouter bit)
|
||||
For (1), we implemented a workaround based on discussions on the IETF ipngwg
mailing list. For more details, see the comments in the source code and the
email thread starting from (IPng 7155), dated Feb 6 1999.
|
||||
|
||||
IPv6 on-link determination rule (RFC2461) is quite different from assumptions
|
||||
in BSD network code. At this moment, KAME does not implement on-link
|
||||
determination rule when default router list is empty (RFC2461, section 5.2,
|
||||
last sentence in 2nd paragraph - note that the spec misuses the words "host"
|
||||
and "node" in several places in the section).
|
||||
|
||||
To avoid possible DoS attacks and infinite loops, KAME stack will accept
|
||||
only 10 options on ND packet. Therefore, if you have 20 prefix options
|
||||
attached to RA, only the first 10 prefixes will be recognized.
|
||||
If this troubles you, please contact KAME team and/or modify
|
||||
nd6_maxndopt in sys/netinet6/nd6.c. If there is high demand we may
provide a sysctl knob for the variable.
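
The option limit described above can be sketched as a bounded loop over the
option area of an ND packet. This is only an illustration, not the KAME code:
the function name count_nd_options() and its max_opts parameter (standing in
for nd6_maxndopt) are hypothetical. It assumes the usual ND option layout of a
1-byte type followed by a 1-byte length in units of 8 octets, and it rejects a
zero length, which would otherwise never advance through the buffer.

```c
#include <stddef.h>

/*
 * Walk the options of a Neighbor Discovery message and count how many
 * would be processed under a cap of max_opts (hypothetical parameter,
 * standing in for nd6_maxndopt).
 * Each option starts with: type (1 byte), length (1 byte, 8-octet units).
 * Returns the number of options accepted, or -1 on a malformed option.
 */
int
count_nd_options(const unsigned char *opts, size_t len, int max_opts)
{
	size_t off = 0;
	int n = 0;

	while (off + 2 <= len && n < max_opts) {
		size_t optlen = (size_t)opts[off + 1] * 8;

		/* A zero length would loop forever; a long one overruns. */
		if (optlen == 0 || optlen > len - off)
			return -1;
		off += optlen;
		n++;
	}
	return n;
}
```

With a cap of 10, a buffer that carries 20 well-formed prefix options yields
10, and the remaining options are simply never looked at.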
|
||||
|
||||
1.3 Scope Index
|
||||
|
||||
IPv6 uses scoped addresses. Therefore, it is very important to
|
||||
specify scope index (interface index for link-local address, or
|
||||
site index for site-local address) with an IPv6 address. Without
|
||||
scope index, scoped IPv6 address is ambiguous to the kernel, and
|
||||
kernel will not be able to determine the outbound interface for a
|
||||
packet.
|
||||
|
||||
Ordinary userland applications should use advanced API (RFC2292) to
|
||||
specify scope index, or interface index. For similar purpose,
|
||||
sin6_scope_id member in sockaddr_in6 structure is defined in RFC2553.
|
||||
However, the semantics of sin6_scope_id are rather vague. If you
care about portability of your application, we suggest that you use the
advanced API rather than sin6_scope_id.
|
||||
|
||||
In the kernel, an interface index for link-local scoped address is
|
||||
embedded into 2nd 16bit-word (3rd and 4th byte) in IPv6 address.
|
||||
For example, you may see something like:
|
||||
fe80:1::200:f8ff:fe01:6317
|
||||
in the routing table and interface address structure (struct
|
||||
in6_ifaddr). The address above is a link-local unicast address
|
||||
which belongs to a network interface whose interface index is 1.
|
||||
The embedded index enables us to identify IPv6 link local
|
||||
addresses over multiple interfaces effectively and with only a
|
||||
little code change.
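
As a rough illustration of the embedded form, the two helpers below write a
16-bit interface index into the 3rd and 4th bytes of an address and take it
out again. The names embed_ifindex() and extract_ifindex() are made up for
this note and are not kernel functions. The sketch assumes the index is stored
in network byte order.

```c
#include <stdint.h>
#include <netinet/in.h>

/* Store a 16-bit interface index in bytes 2 and 3 of a link-local address. */
void
embed_ifindex(struct in6_addr *a, uint16_t ifindex)
{
	a->s6_addr[2] = (uint8_t)(ifindex >> 8);
	a->s6_addr[3] = (uint8_t)(ifindex & 0xff);
}

/* Read the index back and clear it, restoring the standard form. */
uint16_t
extract_ifindex(struct in6_addr *a)
{
	uint16_t ifindex = (uint16_t)((a->s6_addr[2] << 8) | a->s6_addr[3]);

	a->s6_addr[2] = 0;
	a->s6_addr[3] = 0;
	return ifindex;
}
```

Applying embed_ifindex() with index 1 to fe80::200:f8ff:fe01:6317 gives the
bytes that print as fe80:1::200:f8ff:fe01:6317.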
|
||||
Routing daemons and configuration programs, like route6d and
|
||||
ifconfig, will need to manipulate the "embedded" scope index.
|
||||
These programs use routing sockets and ioctls (like SIOCGIFADDR_IN6)
|
||||
and the kernel API will return IPv6 addresses with 2nd 16bit-word
|
||||
filled in. The APIs are for manipulating kernel internal structure.
|
||||
Programs that use these APIs have to be prepared for differences
between kernels anyway.
|
||||
|
||||
When you specify a scoped address on the command line, NEVER write the
embedded form (such as ff02:1::1 or fe80:2::fedc). This is not supposed
to work. Always use the standard form, like ff02::1 or fe80::fedc, with a
command line option for specifying the interface (like "ping6 -I ne0 ff02::1").
In general, if a command does not have a command line option to specify the
outgoing interface, that command is not ready to accept scoped addresses.
This may seem contrary to IPv6's premise of supporting the "dentist office"
situation. We believe that the specifications need some improvements for this.
|
||||
|
||||
Some of the userland tools support extended numeric IPv6 syntax, as
|
||||
documented in draft-ietf-ipngwg-scopedaddr-format-00.txt. You can specify
the outgoing link by using the name of the outgoing interface, like "fe80::1%ne0".
|
||||
This way you will be able to specify link-local scoped address without much
|
||||
trouble.
|
||||
To use this extension in your program, you'll need to use getaddrinfo(3),
|
||||
and getnameinfo(3) with NI_WITHSCOPEID.
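
A minimal sketch of the getaddrinfo(3) side follows. The helper name
parse_scoped_addr() is hypothetical. It assumes a getaddrinfo() that
understands the "%" suffix and reports the scope in sin6_scope_id. Whether a
given interface name resolves depends on the system it runs on.
NI_WITHSCOPEID is left out because not every system defines it.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>

/*
 * Parse a numeric IPv6 address, optionally with a "%scope" suffix,
 * into a sockaddr_in6. Returns 0 on success, or a getaddrinfo() error.
 */
int
parse_scoped_addr(const char *text, struct sockaddr_in6 *out)
{
	struct addrinfo hints, *res;
	int error;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET6;
	hints.ai_socktype = SOCK_DGRAM;
	hints.ai_flags = AI_NUMERICHOST;	/* numeric only, no DNS lookup */

	error = getaddrinfo(text, NULL, &hints, &res);
	if (error != 0)
		return error;
	memcpy(out, res->ai_addr, sizeof(*out));
	freeaddrinfo(res);
	return 0;
}
```

On a machine that has an interface named ne0, parse_scoped_addr("fe80::1%ne0",
&sa) would be expected to leave the index of ne0 in sa.sin6_scope_id. Without
a suffix the scope is left at 0.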
|
||||
The implementation currently assumes 1-to-1 relationship between a link and an
|
||||
interface, which is stronger than what specs say.
|
||||
|
||||
1.4 Plug and Play
|
||||
|
||||
The KAME kit implements most of the IPv6 stateless address
|
||||
autoconfiguration in the kernel.
|
||||
Neighbor Discovery functions are implemented in the kernel as a whole.
|
||||
Router Advertisement (RA) input for hosts is implemented in the
|
||||
kernel. Router Solicitation (RS) output for endhosts, RS input
|
||||
for routers, and RA output for routers are implemented in the
|
||||
userland.
|
||||
|
||||
1.4.1 Assignment of link-local, and special addresses
|
||||
|
||||
An IPv6 link-local address is generated from the IEEE802 address (ethernet MAC
address). Each interface is assigned an IPv6 link-local address
automatically when the interface comes up (IFF_UP). Also, a direct route
for the link-local address is added to the routing table.
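
The derivation from a MAC address can be sketched as below. It assumes the
modified EUI-64 construction used for IPv6 over Ethernet (RFC2464): ff:fe is
inserted in the middle of the 48-bit address and the universal/local bit is
inverted. The function name linklocal_from_mac() is made up for this example
and is not the kernel routine.

```c
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Build fe80::/64 plus a modified EUI-64 interface ID from a 48-bit MAC. */
void
linklocal_from_mac(const uint8_t mac[6], struct in6_addr *out)
{
	memset(out, 0, sizeof(*out));
	out->s6_addr[0] = 0xfe;
	out->s6_addr[1] = 0x80;
	out->s6_addr[8] = mac[0] ^ 0x02;	/* invert universal/local bit */
	out->s6_addr[9] = mac[1];
	out->s6_addr[10] = mac[2];
	out->s6_addr[11] = 0xff;		/* ff:fe filler in the middle */
	out->s6_addr[12] = 0xfe;
	out->s6_addr[13] = mac[3];
	out->s6_addr[14] = mac[4];
	out->s6_addr[15] = mac[5];
}
```

For the MAC address 00:00:f8:01:63:17 this produces fe80::200:f8ff:fe01:6317,
the standard form of the example address used in this document.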
|
||||
|
||||
Here is an output of netstat command:
|
||||
|
||||
Internet6:
Destination          Gateway    Flags    Netif  Expire
fe80:1::%ed0/64      link#1     UC       ed0
fe80:2::%ep0/64      link#2     UC       ep0
|
||||
|
||||
Interfaces that have no IEEE802 address (pseudo interfaces like tunnel
interfaces, or ppp interfaces) will borrow an IEEE802 address from other
interfaces, such as ethernet interfaces, whenever possible.
If there is no IEEE802 hardware attached, a last-resort pseudorandom value,
derived from MD5(hostname), will be used as the source of the link-local address.
If this is not suitable for your usage, you will need to configure the
link-local address manually.
|
||||
|
||||
If an interface is not capable of handling IPv6 (for example, because it lacks
multicast support), a link-local address will not be assigned to that interface.
|
||||
See section 2 for details.
|
||||
|
||||
Each interface joins the solicited-node multicast address and the
link-local all-nodes multicast address (e.g. ff02::1:ff01:6317
and ff02::1, respectively, on the link the interface is attached to).
|
||||
In addition to a link-local address, the loopback address (::1) will be
|
||||
assigned to the loopback interface. Also, ::1/128 and ff01::/32 are
|
||||
automatically added to routing table, and loopback interface joins
|
||||
node-local multicast group ff01::1.
|
||||
|
||||
1.4.2 Stateless address autoconfiguration on hosts
|
||||
|
||||
In IPv6 specification, nodes are separated into two categories:
|
||||
routers and hosts. Routers forward packets addressed to others; hosts do
not forward them. net.inet6.ip6.forwarding defines whether this
|
||||
node is router or host (router if it is 1, host if it is 0).
|
||||
|
||||
When a host hears Router Advertisement from the router, a host may
|
||||
autoconfigure itself by stateless address autoconfiguration.
|
||||
This behavior can be controlled by net.inet6.ip6.accept_rtadv
|
||||
(host autoconfigures itself if it is set to 1).
|
||||
By autoconfiguration, network address prefix for the receiving interface
|
||||
(usually global address prefix) is added. Default route is also configured.
|
||||
Routers periodically generate Router Advertisement packets. To request
|
||||
an adjacent router to generate RA packet, a host can transmit Router
|
||||
Solicitation. To generate a RS packet at any time, use the "rtsol" command.
|
||||
"rtsold" daemon is also available. "rtsold" generates Router Solicitation
|
||||
whenever necessary, and it works great for nomadic usage (notebooks/laptops).
|
||||
If one wishes to ignore Router Advertisements, use sysctl to set
|
||||
net.inet6.ip6.accept_rtadv to 0.
|
||||
|
||||
To generate Router Advertisement from a router, use the "rtadvd" daemon.
|
||||
|
||||
Note that, IPv6 specification assumes the following items, and nonconforming
|
||||
cases are left unspecified:
|
||||
- Only hosts will listen to router advertisements
|
||||
- Hosts have single network interface (except loopback)
|
||||
Therefore, it is unwise to enable net.inet6.ip6.accept_rtadv on routers
or multi-interface hosts. A misconfigured node can behave strangely
(KAME code allows nonconforming configurations, for those who would like
to do some experiments).
|
||||
|
||||
To summarize the sysctl knobs:
    accept_rtadv    forwarding    role of the node
    ---             ---           ---
    0               0             host (to be manually configured)
    0               1             router
    1               0             autoconfigured host
                                  (spec assumes that host has single
                                  interface only, autoconfigured host
                                  with multiple interface is
                                  out-of-scope)
    1               1             invalid, or experimental
                                  (out-of-scope of spec)
|
||||
|
||||
RFC2462 has validation rule against incoming RA prefix information option,
|
||||
in 5.5.3 (e). This is to protect hosts from malicious (or misconfigured)
|
||||
routers that advertise very short prefix lifetime.
|
||||
There was an update from Jim Bound to ipngwg mailing list (look
|
||||
for "(ipng 6712)" in the archive) and KAME implements Jim's update.
|
||||
|
||||
See 1.2 in the document for relationship between DAD and autoconfiguration.
|
||||
|
||||
1.4.3 DHCPv6 (not yet put into freebsd4.0)
|
||||
|
||||
We supply a tiny DHCPv6 server/client in kame/dhcp6. However, the
|
||||
implementation is very premature (for example, this does NOT
|
||||
implement address lease/release), and it is not in default compilation
|
||||
tree. If you want to do some experiments, compile it on your own.
|
||||
|
||||
DHCPv6 and autoconfiguration also need more work. "Managed" and "Other"
|
||||
bits in RA have no special effect to stateful autoconfiguration procedure
|
||||
in DHCPv6 client program ("Managed" bit actually prevents stateless
|
||||
autoconfiguration, but no special action will be taken for DHCPv6 client).
|
||||
|
||||
1.5 Generic tunnel interface
|
||||
|
||||
GIF (Generic InterFace) is a pseudo interface for configured tunnel.
|
||||
Details are described in gif(4) manpage.
|
||||
Currently
    v6 in v6
    v6 in v4
    v4 in v6
    v4 in v4
are available. Use "gifconfig" to assign physical (outer) source
and destination address to gif interfaces.
|
||||
Configuration that uses same address family for inner and outer IP
|
||||
header (v4 in v4, or v6 in v6) is dangerous. It is very easy to
|
||||
configure interfaces and routing tables to perform infinite level
|
||||
of tunneling. Please be warned.
|
||||
|
||||
gif can be configured to be ECN-friendly. See 4.5 for ECN-friendliness
|
||||
of tunnels, and gif(4) manpage for how to configure.
|
||||
|
||||
If you would like to configure an IPv4-in-IPv6 tunnel with gif interface,
|
||||
read gif(4) carefully. You will need to remove IPv6 link-local address
|
||||
automatically assigned to the gif interface.
|
||||
|
||||
1.6 Source Address Selection
|
||||
|
||||
Source selection of KAME is scope oriented (there are some exceptions -
|
||||
see below). For a given destination, a source IPv6 address is selected
|
||||
by the following rules:
|
||||
1. If the source address is explicitly specified by the user
|
||||
(e.g. via the advanced API), the specified address is used.
|
||||
2. If there is an address assigned to the outgoing interface
|
||||
(which is usually determined by looking up the routing table)
|
||||
that has the same scope as the destination address, the address
|
||||
is used.
|
||||
This is the most typical case.
|
||||
3. If there is no address that satisfies the above condition,
|
||||
choose a global address assigned to one of the interfaces
|
||||
on the sending node.
|
||||
4. If there is no address that satisfies the above condition,
|
||||
and destination address is site local scope,
|
||||
choose a site local address assigned to one of the interfaces
|
||||
on the sending node.
|
||||
5. If there is no address that satisfies the above condition,
|
||||
choose the address associated with the routing table
|
||||
entry for the destination.
|
||||
This is the last resort, which may cause scope violation.
|
||||
|
||||
For instance, ::1 is selected for ff01::1, fe80:1::200:f8ff:fe01:6317
|
||||
for fe80:1::2a0:24ff:feab:839b (note that embedded interface index -
|
||||
described in 1.3 - helps us choose the right source address. Those
|
||||
embedded indices will not be on the wire).
|
||||
If the outgoing interface has multiple addresses for the scope,
a source is selected on a longest-match basis (rule 3). Suppose
|
||||
3ffe:501:808:1:200:f8ff:fe01:6317 and 3ffe:2001:9:124:200:f8ff:fe01:6317
|
||||
are given to the outgoing interface. 3ffe:501:808:1:200:f8ff:fe01:6317
|
||||
is chosen as the source for the destination 3ffe:501:800::1.
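
The longest-match comparison in this example can be worked through with a
short sketch. The helper names common_prefix_len() and pick_source() are
hypothetical and only show the comparison itself. The scope checks and the
handling of deprecated addresses described in this section are left out.

```c
#include <stddef.h>
#include <netinet/in.h>
#include <arpa/inet.h>	/* inet_pton(), for callers building addresses */

/* Number of leading bits that two IPv6 addresses have in common (0..128). */
int
common_prefix_len(const struct in6_addr *a, const struct in6_addr *b)
{
	int bits = 0;

	for (int i = 0; i < 16; i++) {
		unsigned char diff = a->s6_addr[i] ^ b->s6_addr[i];

		if (diff == 0) {
			bits += 8;
			continue;
		}
		/* Count matching high-order bits in the first differing byte. */
		while ((diff & 0x80) == 0) {
			bits++;
			diff <<= 1;
		}
		break;
	}
	return bits;
}

/* Return the index of the candidate sharing the longest prefix with dst. */
size_t
pick_source(const struct in6_addr *cand, size_t n, const struct in6_addr *dst)
{
	size_t best = 0;
	int best_len = -1;

	for (size_t i = 0; i < n; i++) {
		int len = common_prefix_len(&cand[i], dst);

		if (len > best_len) {
			best_len = len;
			best = i;
		}
	}
	return best;
}
```

For the destination 3ffe:501:800::1, the first candidate above shares 44
leading bits and the second only 18, so the first one is picked.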
|
||||
|
||||
Note that the above rule is not documented in the IPv6 spec. It is
|
||||
considered an "up to implementation" item.
|
||||
There are some cases where we do not use the above rule. One
example is a connected TCP session, where we use the address kept in the tcb
as the source.
|
||||
Another example is source address for Neighbor Advertisement.
|
||||
Under the spec (RFC2461 7.2.2) the NA's source should be the target
address of the corresponding NS. In this case we follow
the spec rather than the above longest-match rule.
|
||||
|
||||
For new connections (when rule 1 does not apply), deprecated addresses
(addresses with preferred lifetime = 0) will not be chosen as the source address
if other choices are available. If no other choices are available, a
deprecated address will be used as a last resort. If there are multiple
deprecated addresses to choose from, the above scope rule will be used to choose
among them. If you would like to prohibit the use
of deprecated addresses for some reason, set net.inet6.ip6.use_deprecated
to 0. The issue related to deprecated addresses is described in RFC2462 5.5.4
(NOTE: there is some debate underway in IETF ipngwg on how to use
"deprecated" addresses).
|
||||
|
||||
1.7 Jumbo Payload
|
||||
|
||||
KAME supports the Jumbo Payload hop-by-hop option used to send IPv6
|
||||
packets with payloads longer than 65,535 octets. But since currently
|
||||
KAME does not support any physical interface whose MTU is more than
|
||||
65,535, such payloads can be seen only on the loopback interface (i.e.
|
||||
lo0).
|
||||
|
||||
If you want to try jumbo payloads, you first have to reconfigure the
|
||||
kernel so that the MTU of the loopback interface is more than 65,535
|
||||
bytes; add the following to the kernel configuration file:
|
||||
options "LARGE_LOMTU" #To test jumbo payload
|
||||
and recompile the new kernel.
|
||||
|
||||
Then you can test jumbo payloads by the ping6 command with -b and -s
|
||||
options. The -b option must be specified to enlarge the size of the
|
||||
socket buffer and the -s option specifies the length of the packet,
|
||||
which should be more than 65,535. For example, type the following:
|
||||
% ping6 -b 70000 -s 68000 ::1
|
||||
|
||||
The IPv6 specification requires that the Jumbo Payload option must not
|
||||
be used in a packet that carries a fragment header. If this condition
|
||||
is broken, an ICMPv6 Parameter Problem message must be sent to the
|
||||
sender. KAME kernel follows the specification, but you cannot usually
|
||||
see an ICMPv6 error caused by this requirement.
|
||||
|
||||
If KAME kernel receives an IPv6 packet, it checks the frame length of
|
||||
the packet and compares it to the length specified in the payload
|
||||
length field of the IPv6 header or in the value of the Jumbo Payload
|
||||
option, if any. If the former is shorter than the latter, KAME kernel
|
||||
discards the packet and increments the statistics. You can see the
|
||||
statistics as output of netstat command with `-s -p ip6' option:
|
||||
% netstat -s -p ip6
|
||||
ip6:
|
||||
(snip)
|
||||
1 with data size < data length
|
||||
|
||||
So, KAME kernel does not send an ICMPv6 error unless the erroneous
|
||||
packet is an actual Jumbo Payload, that is, its packet size is more
|
||||
than 65,535 bytes. As described above, KAME kernel currently does not
|
||||
support physical interface with such a huge MTU, so it rarely returns an
|
||||
ICMPv6 error.
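
The length comparison described above can be sketched as follows. This is a
simplified illustration with made-up names (announced_payload_len(),
payload_truncated()), not the ip6_input() code. It assumes the caller has
already parsed the hop-by-hop header and knows whether a Jumbo Payload option
is present.

```c
#include <stdint.h>

/*
 * Payload length announced by the packet: the 16-bit ip6_plen field,
 * or the 32-bit Jumbo Payload option value when one is present.
 */
uint32_t
announced_payload_len(uint16_t ip6_plen, int has_jumbo, uint32_t jumbo_len)
{
	return has_jumbo ? jumbo_len : ip6_plen;
}

/*
 * Return 1 if fewer payload bytes were actually received than the
 * header announces (the packet would be dropped and counted), else 0.
 */
int
payload_truncated(uint64_t received_payload, uint16_t ip6_plen,
    int has_jumbo, uint32_t jumbo_len)
{
	return received_payload <
	    announced_payload_len(ip6_plen, has_jumbo, jumbo_len);
}
```

A 64-bit type is used for the received length on purpose, so that the
comparison stays correct for jumbograms near the 4G limit.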
|
||||
|
||||
TCP/UDP over jumbogram is not supported at this moment. This is because
|
||||
we have no medium (other than loopback) to test this. Contact us if you
|
||||
need this.
|
||||
|
||||
IPsec does not work on jumbograms. This is due to some specification twists
|
||||
in supporting AH with jumbograms (AH header size influences payload length,
|
||||
and this makes it really hard to authenticate an inbound packet with the jumbo payload
|
||||
option as well as AH).
|
||||
|
||||
There are fundamental issues in *BSD support for jumbograms. We would like to
|
||||
address those, but we need more time to finalize these. To name a few:
|
||||
- mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it will not hold
|
||||
jumbogram with len > 2G on 32bit architecture CPUs. If we would like to
|
||||
support jumbogram properly, the field must be expanded to hold 4G +
|
||||
IPv6 header + link-layer header. Therefore, it must be expanded to at least
|
||||
int64_t (u_int32_t is NOT enough).
|
||||
- We mistakenly use "int" to hold packet length in many places. We need
  to convert them into a larger integral type. This needs great care, as we may
  experience overflow during packet length computation.
|
||||
- We mistakenly check the ip6_plen field of the IPv6 header for packet payload
|
||||
length in various places. We should be checking mbuf pkthdr.len instead.
|
||||
ip6_input() will perform sanity check on jumbo payload option on input,
|
||||
and we can safely use mbuf pkthdr.len afterwards.
|
||||
- TCP code needs a careful update in a bunch of places, of course.
|
||||
|
||||
1.8 Loop prevention in header processing
|
||||
|
||||
IPv6 specification allows arbitrary number of extension headers to
|
||||
be placed onto packets. If we implement IPv6 packet processing
|
||||
code in the way BSD IPv4 code is implemented, kernel stack may
|
||||
overflow due to long function call chain. KAME sys/netinet6 code
|
||||
is carefully designed to avoid kernel stack overflow. Because of
|
||||
this, KAME sys/netinet6 code defines its own protocol switch
|
||||
structure, as "struct ip6protosw" (see netinet6/ip6protosw.h).
|
||||
There is no such update to IPv4 part (sys/netinet) for
|
||||
compatibility, but small change is added to its pr_input()
|
||||
prototype. So "struct ipprotosw" is also defined.
|
||||
Because of this, if you receive IPsec-over-IPv4 packet with massive
|
||||
number of IPsec headers, kernel stack may blow up. IPsec-over-IPv6 is okay.
|
||||
(Of course, for all those IPsec headers to be processed, each
such IPsec header must pass each IPsec check. So an anonymous
attacker won't be able to do such an attack.)
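
To illustrate the idea of walking a header chain with a loop rather than with
nested function calls, here is a small sketch. It is not the ip6protosw
mechanism itself: the function skip_ext_headers() and the NH_* constants are
invented for this example, and only hop-by-hop, routing, destination options
and fragment headers are recognized. The buffer is assumed to start right
after the fixed IPv6 header.

```c
#include <stddef.h>
#include <stdint.h>

#define NH_HOPOPTS	0	/* hop-by-hop options */
#define NH_ROUTING	43
#define NH_FRAGMENT	44
#define NH_DSTOPTS	60

/*
 * Walk the extension header chain with a loop instead of nested calls,
 * so stack usage does not grow with the number of headers.
 * On success, store the upper-layer protocol in *proto and its offset
 * in *offp, and return 0. Return -1 if a header runs past the buffer.
 */
int
skip_ext_headers(const uint8_t *buf, size_t len, uint8_t first_nh,
    uint8_t *proto, size_t *offp)
{
	uint8_t nh = first_nh;
	size_t off = 0;

	for (;;) {
		size_t hlen;

		switch (nh) {
		case NH_HOPOPTS:
		case NH_ROUTING:
		case NH_DSTOPTS:
			if (len - off < 2)
				return -1;
			/* Length field counts 8-octet units after the first. */
			hlen = ((size_t)buf[off + 1] + 1) * 8;
			break;
		case NH_FRAGMENT:
			hlen = 8;	/* fragment header has a fixed size */
			break;
		default:
			/* Not an extension header handled here: done. */
			*proto = nh;
			*offp = off;
			return 0;
		}
		if (hlen > len - off)
			return -1;
		nh = buf[off];		/* Next Header is the first byte */
		off += hlen;
	}
}
```

However many headers are chained, the call depth stays at one, which is the
property the text above is concerned with.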
|
||||
|
||||
1.9 ICMPv6
|
||||
|
||||
After RFC2463 was published, IETF ipngwg has decided to disallow ICMPv6 error
|
||||
packet against ICMPv6 redirect, to prevent ICMPv6 storm on a network medium.
|
||||
KAME already implements this into the kernel.
|
||||
|
||||
1.10 Applications
|
||||
|
||||
For userland programming, we support IPv6 socket API as specified in
|
||||
RFC2553, RFC2292 and upcoming internet drafts.
|
||||
|
||||
TCP/UDP over IPv6 is available and quite stable. You can enjoy "telnet",
|
||||
"ftp", "rlogin", "rsh", "ssh", etc. These applications are protocol
|
||||
independent. That is, they automatically choose IPv4 or IPv6
|
||||
according to DNS.
|
||||
|
||||
1.11 Kernel Internals
|
||||
|
||||
(*) TCP/UDP part is handled differently between operating system platforms.
|
||||
See 1.12 for details.
|
||||
|
||||
The current KAME has escaped from the IPv4 netinet logic. While
|
||||
ip_forward() calls ip_output(), ip6_forward() directly calls
|
||||
if_output() since routers must not divide IPv6 packets into fragments.
|
||||
|
||||
ICMPv6 should contain as much of the original packet as possible, up to
1280 bytes. UDP6/IP6 port unreach, for instance, should contain all
|
||||
extension headers and the *unchanged* UDP6 and IP6 headers.
|
||||
So, all IP6 functions except TCP never convert network byte
|
||||
order into host byte order, to save the original packet.
|
||||
|
||||
tcp_input(), udp6_input() and icmp6_input() can't assume that the IP6
header immediately precedes the transport header, due to extension
headers. So, in6_cksum() was implemented to handle packets whose IP6
header and transport header are not contiguous. Neither TCP/IP6 nor UDP6/IP6
header structures exist for checksum calculation.
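
A checksum that does not need a contiguous IP6 and transport header can be
sketched as below. This is not in6_cksum() and it works on a flat buffer
rather than on mbufs. The names sum_words() and cksum6() are hypothetical. It
assumes the pseudo-header layout of RFC2460 section 8.1: source address,
destination address, 32-bit upper-layer length, three zero bytes, next header.

```c
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>

/* Add len bytes to a running sum of big-endian 16-bit words. */
static uint64_t
sum_words(uint64_t sum, const uint8_t *p, size_t len)
{
	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)p[0] << 8;	/* pad odd byte with zero */
	return sum;
}

/*
 * Checksum of an upper-layer segment (TCP, UDP, ICMPv6) using the IPv6
 * pseudo-header. Only the two addresses are needed, not a contiguous
 * IPv6 header, so extension headers in between do not matter.
 */
uint16_t
cksum6(const struct in6_addr *src, const struct in6_addr *dst,
    uint8_t proto, const uint8_t *seg, uint32_t seglen)
{
	uint8_t tail[8];
	uint64_t sum = 0;

	/* Pseudo-header tail: 32-bit length, 3 zero bytes, next header. */
	tail[0] = (uint8_t)(seglen >> 24);
	tail[1] = (uint8_t)(seglen >> 16);
	tail[2] = (uint8_t)(seglen >> 8);
	tail[3] = (uint8_t)seglen;
	tail[4] = tail[5] = tail[6] = 0;
	tail[7] = proto;

	sum = sum_words(sum, src->s6_addr, 16);
	sum = sum_words(sum, dst->s6_addr, 16);
	sum = sum_words(sum, tail, sizeof(tail));
	sum = sum_words(sum, seg, seglen);

	/* Fold carries back in, then take the one's complement. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

To fill in a checksum, call cksum6() with the checksum field set to zero and
store the result in network byte order. Running it again over the completed
segment should then give 0.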
|
||||
|
||||
To process IP6 header, extension headers and transport headers easily,
|
||||
KAME requires network drivers to store packets in one internal mbuf or
|
||||
one or more external mbufs. A typical old driver prepares two
|
||||
internal mbufs for 96 - 204 bytes data, however, KAME's reference
|
||||
implementation stores it in one external mbuf.
|
||||
|
||||
"netstat -s -p ip6" tells you whether or not your driver conforms
|
||||
to KAME's requirement. In the following example, "cce0" violates the
|
||||
requirement. (For more information, refer to Section 2.)
|
||||
|
||||
Mbuf statistics:
        317 one mbuf
        two or more mbuf::
                lo0 = 8
                cce0 = 10
        3282 one ext mbuf
        0 two or more ext mbuf
|
||||
|
||||
Each input function calls IP6_EXTHDR_CHECK in the beginning to check
|
||||
if the region between IP6 and its header is
contiguous. IP6_EXTHDR_CHECK calls m_pullup() only if the mbuf has the
|
||||
M_LOOP flag, that is, the packet comes from the loopback
|
||||
interface. m_pullup() is never called for packets coming from physical
|
||||
network interfaces.
|
||||
|
||||
Neither the IP nor the IP6 reassembly functions ever call m_pullup().
|
||||
|
||||
1.12 IPv4 mapped address and IPv6 wildcard socket
|
||||
|
||||
RFC2553 describes IPv4 mapped address (3.7) and special behavior
|
||||
of IPv6 wildcard bind socket (3.8). The spec allows you to:
|
||||
- Accept IPv4 connections by AF_INET6 wildcard bind socket.
|
||||
- Transmit IPv4 packet over AF_INET6 socket by using special form of
|
||||
the address like ::ffff:10.1.1.1.
|
||||
but the spec itself is very complicated and does not specify how the
|
||||
socket layer should behave.
|
||||
Here we call the former one "listening side" and the latter one "initiating
|
||||
side", for reference purposes.
|
||||
|
||||
Almost all KAME implementations treat tcp/udp port number space separately
|
||||
between IPv4 and IPv6. You can perform wildcard bind on both of the address
|
||||
families, on the same port.
|
||||
|
||||
The following table shows the behavior of FreeBSD4x.

                listening side              initiating side
                (AF_INET6 wildcard          (connection to ::ffff:10.1.1.1)
                socket gets IPv4 conn.)
                ---                         ---
FreeBSD4x       configurable                supported
                default: enabled
|
||||
|
||||
The following sections will give you more details, and how you can
|
||||
configure the behavior.
|
||||
|
||||
Comments on listening side:
|
||||
|
||||
It looks like RFC2553 says too little about the wildcard bind issue,
especially about the port space issue, failure modes, and the relationship
between AF_INET/INET6 wildcard binds. There can be several separate
interpretations of this RFC that conform to it but behave differently.
So, to implement a portable application you should assume nothing
about the behavior in the kernel. Using getaddrinfo() is the safest way.
Port number space and wildcard bind issues were discussed in detail
on the ipv6imp mailing list in mid March 1999, and it looks like there is
no concrete consensus (meaning it is up to implementers). You may want to
check the mailing list archives.
|
||||
|
||||
If a server application would like to accept IPv4 and IPv6 connections,
|
||||
there will be two alternatives.
|
||||
|
||||
One is using AF_INET and AF_INET6 socket (you'll need two sockets).
|
||||
Use getaddrinfo() with AI_PASSIVE in ai_flags, then call socket(2) and bind(2)
for all the addresses returned.
|
||||
By opening multiple sockets, you can accept connections onto the socket with
|
||||
proper address family. IPv4 connections will be accepted by AF_INET socket,
|
||||
and IPv6 connections will be accepted by AF_INET6 socket.
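
The multiple-socket approach might look roughly like this. The function name
open_listeners() and the fixed backlog of 5 are choices made for this sketch
only. It opens one TCP socket per address that getaddrinfo() returns and skips
any address family the running kernel does not support.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Open one listening TCP socket per address returned by getaddrinfo().
 * Stores up to maxfds descriptors in fds and returns how many were opened.
 */
int
open_listeners(const char *service, int *fds, int maxfds)
{
	struct addrinfo hints, *res, *ai;
	int n = 0;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;		/* do not hardcode the family */
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_PASSIVE;		/* wildcard addresses */

	if (getaddrinfo(NULL, service, &hints, &res) != 0)
		return 0;

	for (ai = res; ai != NULL && n < maxfds; ai = ai->ai_next) {
		int s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);

		if (s < 0)
			continue;	/* family not supported by this kernel */
		if (bind(s, ai->ai_addr, ai->ai_addrlen) < 0 ||
		    listen(s, 5) < 0) {
			close(s);
			continue;
		}
		fds[n++] = s;
	}
	freeaddrinfo(res);
	return n;
}
```

The caller would then wait on all returned descriptors with select(2) or
poll(2). On kernels that share the port space between IPv4 and IPv6, the
second bind(2) may fail and fewer sockets come back, which is one more reason
not to assume a particular count.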
|
||||
|
||||
Another way is using one AF_INET6 wildcard bind socket.
Use getaddrinfo() with AI_PASSIVE in ai_flags and
AF_INET6 in ai_family, and set the 1st argument (hostname) to
NULL. Then call socket(2) and bind(2) on the address returned
(which should be the IPv6 unspecified address).
You can accept both IPv4 and IPv6 packets via this one socket.
|
||||
|
||||
To support only IPv6 traffic on an AF_INET6 wildcard-bound socket portably,
always check the peer address when a connection is made toward the
AF_INET6 listening socket. If the address is an IPv4 mapped address, you may
want to reject the connection. You can check the condition by using the
IN6_IS_ADDR_V4MAPPED() macro.
|
||||
To resolve this issue more easily, there is a system dependent setsockopt()
option, IPV6_BINDV6ONLY, used like below.
	int on = 1;

	if (setsockopt(s, IPPROTO_IPV6, IPV6_BINDV6ONLY,
		       (char *)&on, sizeof (on)) < 0)
		perror("setsockopt");
When this call succeeds, the socket receives only IPv6 packets.
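
The portable peer-address check described above can be sketched as follows.
peer_is_v4mapped() is a hypothetical helper name; only the
IN6_IS_ADDR_V4MAPPED() macro comes from the text.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Return 1 if the peer address is an IPv4 mapped IPv6 address
 * (::ffff:a.b.c.d), 0 otherwise.
 * (peer_is_v4mapped is a hypothetical helper name.)
 */
int
peer_is_v4mapped(const struct sockaddr *sa)
{
	const struct sockaddr_in6 *sin6;

	if (sa->sa_family != AF_INET6)
		return 0;	/* only AF_INET6 peers can be mapped */
	sin6 = (const struct sockaddr_in6 *)sa;
	return IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr) ? 1 : 0;
}
```

A server would call this on the address filled in by accept(2) and close the
new socket when it returns 1.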
|
||||
|
||||
|
||||
Comments on initiating side:
|
||||
|
||||
Advice to application implementers: to implement a portable IPv6 application
(which works on multiple IPv6 kernels), we believe that the following
are the keys to success:
|
||||
- NEVER hardcode AF_INET or AF_INET6.
- Use getaddrinfo() and getnameinfo() throughout the system.
  Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*().
  (To update existing applications to be IPv6 aware easily,
  getipnodeby*() will sometimes be useful. But if possible, try to
  rewrite the code to use getaddrinfo() and getnameinfo().)
- If you would like to connect to a destination, use getaddrinfo() and try
  all the destinations returned, like telnet does.
- Some IPv6 stacks are shipped with a buggy getaddrinfo(). Ship a minimal
  working version with your application and use that as a last resort.
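
The "try all the destinations returned" advice can be sketched as below. This
is an illustration only: connect_any() is a hypothetical helper name, and a
real application would also report which address failed and why.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>

/*
 * Connect to host:port, trying every address getaddrinfo() returns.
 * Returns a connected socket, or -1 if every attempt failed.
 * (connect_any is a hypothetical helper name.)
 */
int
connect_any(const char *host, const char *port)
{
	struct addrinfo hints, *res, *ai;
	int s = -1;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* accept both IPv4 and IPv6 results */
	hints.ai_socktype = SOCK_STREAM;

	if (getaddrinfo(host, port, &hints, &res) != 0)
		return -1;

	for (ai = res; ai != NULL; ai = ai->ai_next) {
		s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (s < 0)
			continue;	/* family not supported, try the next */
		if (connect(s, ai->ai_addr, ai->ai_addrlen) == 0)
			break;		/* connected */
		close(s);
		s = -1;
	}
	freeaddrinfo(res);
	return s;
}
```

Because the address family is taken from each result, the same code works on
IPv4-only, IPv6-only and dual stack nodes.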
|
||||
|
||||
If you would like to use an AF_INET6 socket for both IPv4 and IPv6 outgoing
connections, you will need to use getipnodebyname(). When you would like to
update your existing application to be IPv6 aware with minimal effort,
this approach might be chosen. But please note that it is a temporary
solution, because getipnodebyname() itself is not recommended as it does
not handle scoped IPv6 addresses at all. For IPv6 name resolution,
getaddrinfo() is the preferred API. So you should rewrite your
application to use getaddrinfo() when you get the time to do it.
|
||||
|
||||
When writing applications that make outgoing connections, the story is much
simpler if you treat AF_INET and AF_INET6 as totally separate address families.
{set,get}sockopt issues become simpler, and DNS issues become simpler. We do
not recommend that you rely upon IPv4 mapped addresses.
|
||||
|
||||
1.12.1 FreeBSD4x
|
||||
|
||||
FreeBSD4x uses shared tcp4/6 code (from sys/netinet/tcp*) and separate
udp4/6 code. It uses a unified inpcb/in6pcb structure.
|
||||
|
||||
The platform can be configured to support IPv4 mapped address.
|
||||
Kernel configuration is summarized as follows:
|
||||
- By default, an AF_INET6 socket will grab IPv4 connections under certain
  conditions, and can initiate a connection to an IPv4 destination embedded
  in an IPv4 mapped IPv6 address.
- You can disable it on the entire system with sysctl like below.
|
||||
sysctl -w net.inet6.ip6.mapped_addr=0
|
||||
|
||||
1.12.1.1 FreeBSD4x, listening side
|
||||
|
||||
Each socket can be configured to support special AF_INET6 wildcard bind
|
||||
(enabled by default).
|
||||
You can disable it on a per-socket basis with setsockopt() like below.
	int on = 1;

	if (setsockopt(s, IPPROTO_IPV6, IPV6_BINDV6ONLY,
		       (char *)&on, sizeof (on)) < 0)
		perror("setsockopt");
|
||||
|
||||
Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
|
||||
conditions are satisfied:
|
||||
- there's no AF_INET socket that matches the IPv4 connection
|
||||
- the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
|
||||
getsockopt(IPV6_BINDV6ONLY) returns 0.
|
||||
There's no problem with open/close ordering.
|
||||
|
||||
1.12.1.2 FreeBSD4x, initiating side
|
||||
|
||||
FreeBSD4x supports outgoing connections to IPv4 mapped addresses
(::ffff:10.1.1.1), if the node is configured to support IPv4 mapped addresses.
|
||||
|
||||
1.13 sockaddr_storage
|
||||
|
||||
When RFC2553 was about to be finalized, there was discussion on how struct
sockaddr_storage members should be named. One proposal was to prepend "__" to
the members (like "__ss_len") as they should not be touched. The other proposal
was not to prepend it (like "ss_len") as we need to touch those members
directly. There was no clear consensus on it.
|
||||
|
||||
As a result, RFC2553 defines struct sockaddr_storage as follows:
|
||||
struct sockaddr_storage {
|
||||
u_char __ss_len; /* address length */
|
||||
u_char __ss_family; /* address family */
|
||||
/* and bunch of padding */
|
||||
};
|
||||
In contrast, the XNET draft defines it as follows:
|
||||
struct sockaddr_storage {
|
||||
u_char ss_len; /* address length */
|
||||
u_char ss_family; /* address family */
|
||||
/* and bunch of padding */
|
||||
};
|
||||
|
||||
In December 1999, it was agreed that RFC2553bis should pick the latter (XNET)
|
||||
definition.
|
||||
|
||||
KAME kits prior to December 1999 used the RFC2553 definition. KAME kits from
December 1999 onward conform to the XNET definition,
based on the RFC2553bis discussion.
|
||||
|
||||
If you look at multiple IPv6 implementations, you will be able to see
both definitions. As a userland programmer, the most portable way of
dealing with it is to:
(1) ensure ss_family and/or ss_len are available on the platform, by using
    GNU autoconf,
(2) have -Dss_family=__ss_family to unify all occurrences (including header
    file) into __ss_family, or
(3) never touch __ss_family. Cast to sockaddr * and use sa_family like:
	struct sockaddr_storage ss;
	family = ((struct sockaddr *)&ss)->sa_family;
|
||||
|
||||
2. Network Drivers
|
||||
|
||||
KAME requires two items to be added into the standard drivers:
|
||||
|
||||
(1) mbuf clustering requirement. In this stable release, we changed
|
||||
MINCLSIZE into MHLEN+1 for all the operating systems in order to make
|
||||
all the drivers behave as we expect.
|
||||
|
||||
(2) multicast. If "ifmcstat" yields no multicast group for a
|
||||
interface, that interface has to be patched.
|
||||
|
||||
If a driver doesn't support these requirements, then the driver
can't be used for IPv6 and/or IPsec communication. If you find any
problem with your card using IPv6/IPsec, please report it to
freebsd-bugs@freebsd.org.
|
||||
|
||||
(NOTE: In the past we required all pcmcia drivers to have a call to
|
||||
in6_ifattach(). We have no such requirement any more)
|
||||
|
||||
3. Translator
|
||||
|
||||
We categorize IPv4/IPv6 translator into 4 types.
|
||||
|
||||
Translator A --- It is used in the early stage of transition to make
|
||||
it possible to establish a connection from an IPv6 host in an IPv6
|
||||
island to an IPv4 host in the IPv4 ocean.
|
||||
|
||||
Translator B --- It is used in the early stage of transition to make
|
||||
it possible to establish a connection from an IPv4 host in the IPv4
|
||||
ocean to an IPv6 host in an IPv6 island.
|
||||
|
||||
Translator C --- It is used in the late stage of transition to make it
|
||||
possible to establish a connection from an IPv4 host in an IPv4 island
|
||||
to an IPv6 host in the IPv6 ocean.
|
||||
|
||||
Translator D --- It is used in the late stage of transition to make it
|
||||
possible to establish a connection from an IPv6 host in the IPv6 ocean
|
||||
to an IPv4 host in an IPv4 island.
|
||||
|
||||
KAME provides a TCP relay translator for category A. This is called
"FAITH". We also provide an IP header translator for category A.
(The latter has not yet been put into FreeBSD4.x.)
|
||||
|
||||
3.1 FAITH TCP relay translator
|
||||
|
||||
FAITH system uses TCP relay daemon called "faithd" helped by the KAME kernel.
|
||||
FAITH will reserve an IPv6 address prefix, and relay TCP connection
|
||||
toward that prefix to IPv4 destination.
|
||||
|
||||
For example, if the reserved IPv6 prefix is 3ffe:0501:0200:ffff::, and
|
||||
the IPv6 destination for TCP connection is 3ffe:0501:0200:ffff::163.221.202.12,
|
||||
the connection will be relayed toward IPv4 destination 163.221.202.12.
|
||||
|
||||
destination IPv4 node (163.221.202.12)
|
||||
^
|
||||
| IPv4 TCP toward 163.221.202.12
|
||||
FAITH-relay dual stack node
|
||||
^
|
||||
| IPv6 TCP toward 3ffe:0501:0200:ffff::163.221.202.12
|
||||
source IPv6 node
|
||||
|
||||
faithd must be invoked on FAITH-relay dual stack node.
|
||||
|
||||
For more details, consult src/usr.sbin/faithd/README.
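
The address mapping in the example above can be sketched as below. We assume,
based only on that example, that the IPv4 destination occupies the low-order
32 bits of the IPv6 destination; faith_embedded_v4() is a hypothetical helper
name and is not taken from faithd.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Parse an IPv6 destination in text form and write the IPv4 address
 * found in its low-order 32 bits to out, in dotted decimal form.
 * Returns 0 on success, -1 on a parse or formatting error.
 * (faith_embedded_v4 is a hypothetical helper name.)
 */
int
faith_embedded_v4(const char *dst6_text, char *out, size_t outlen)
{
	struct in6_addr dst6;
	struct in_addr dst4;

	if (inet_pton(AF_INET6, dst6_text, &dst6) != 1)
		return -1;
	/* bytes 12..15 hold the IPv4 address in network byte order */
	memcpy(&dst4, &dst6.s6_addr[12], sizeof(dst4));
	if (inet_ntop(AF_INET, &dst4, out, (socklen_t)outlen) == NULL)
		return -1;
	return 0;
}
```

With the prefix from the example, 3ffe:0501:0200:ffff::163.221.202.12 yields
163.221.202.12.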
|
||||
|
||||
3.2 IPv6-to-IPv4 header translator
|
||||
|
||||
(to be written)
|
||||
|
||||
4. IPsec
|
||||
|
||||
IPsec is mainly organized by three components.
|
||||
|
||||
(1) Policy Management
|
||||
(2) Key Management
|
||||
(3) AH and ESP handling
|
||||
|
||||
4.1 Policy Management
|
||||
|
||||
The kernel implements experimental policy management code. There are two ways
to manage security policy. One is to configure per-socket policy using
setsockopt(2). In this case, policy configuration is described in
ipsec_set_policy(3). The other is to configure kernel packet filter-based
policy using the PF_KEY interface, via setkey(8).

The policy entries are not re-ordered by their
indexes, so the order in which you add entries is very significant.
|
||||
|
||||
4.2 Key Management
|
||||
|
||||
The key management code implemented in this kit (sys/netkey) is a
|
||||
home-brew PFKEY v2 implementation. This conforms to RFC2367.
|
||||
|
||||
The home-brew IKE daemon, "racoon", is included in the kit
(kame/kame/racoon).
Basically you'll need to run racoon as a daemon, then set up a policy
to require keys (like ping -P 'out ipsec esp/transport//use').
The kernel will contact the racoon daemon as necessary to exchange keys.
|
||||
|
||||
4.3 AH and ESP handling
|
||||
|
||||
IPsec module is implemented as "hooks" to the standard IPv4/IPv6
|
||||
processing. When sending a packet, ip{,6}_output() checks if ESP/AH
|
||||
processing is required by checking if a matching SPD (Security
|
||||
Policy Database) is found. If ESP/AH is needed,
|
||||
{esp,ah}{4,6}_output() will be called and mbuf will be updated
|
||||
accordingly. When a packet is received, {esp,ah}4_input() will be
|
||||
called based on protocol number, i.e. (*inetsw[proto])().
|
||||
{esp,ah}4_input() will decrypt/check authenticity of the packet,
and strip off the daisy-chained header and padding for ESP/AH. It is
|
||||
safe to strip off the ESP/AH header on packet reception, since we
|
||||
will never use the received packet in "as is" form.
|
||||
|
||||
By using ESP/AH, TCP4/6 effective data segment size will be affected by
|
||||
extra daisy-chained headers inserted by ESP/AH. Our code takes care of
|
||||
the case.
|
||||
|
||||
Basic crypto functions can be found in directory "sys/crypto". ESP/AH
transforms are listed in {esp,ah}_core.c with wrapper functions. If you
wish to add an algorithm, add a wrapper function in {esp,ah}_core.c, and
add your crypto algorithm code into sys/crypto.
|
||||
|
||||
Tunnel mode is partially supported in this release, with the following
|
||||
restrictions:
|
||||
- IPsec tunnel is not combined with GIF generic tunneling interface.
|
||||
It needs great care because we may create an infinite loop between
|
||||
ip_output() and tunnelifp->if_output(). Opinion varies if it is better
|
||||
to unify them, or not.
|
||||
- MTU and Don't Fragment bit (IPv4) considerations need more checking, but
|
||||
basically works fine.
|
||||
- Authentication model for AH tunnel must be revisited. We'll need to
|
||||
improve the policy management engine, eventually.
|
||||
|
||||
4.4 Conformance to RFCs and IDs
|
||||
|
||||
The IPsec code in the kernel conforms (or, tries to conform) to the
|
||||
following standards:
|
||||
"old IPsec" specification documented in rfc182[5-9].txt
|
||||
"new IPsec" specification documented in rfc240[1-6].txt, rfc241[01].txt,
|
||||
rfc2451.txt and draft-mcdonald-simple-ipsec-api-01.txt (draft expired,
|
||||
but you can take from ftp://ftp.kame.net/pub/internet-drafts/).
|
||||
(NOTE: IKE specifications, rfc241[7-9].txt are implemented in userland,
|
||||
as "racoon" IKE daemon)
|
||||
|
||||
Currently supported algorithms are:
|
||||
old IPsec AH
|
||||
null crypto checksum (no document, just for debugging)
|
||||
keyed MD5 with 128bit crypto checksum (rfc1828.txt)
|
||||
keyed SHA1 with 128bit crypto checksum (no document)
|
||||
HMAC MD5 with 128bit crypto checksum (rfc2085.txt)
|
||||
HMAC SHA1 with 128bit crypto checksum (no document)
|
||||
old IPsec ESP
|
||||
null encryption (no document, similar to rfc2410.txt)
|
||||
DES-CBC mode (rfc1829.txt)
|
||||
new IPsec AH
|
||||
null crypto checksum (no document, just for debugging)
|
||||
keyed MD5 with 96bit crypto checksum (no document)
|
||||
keyed SHA1 with 96bit crypto checksum (no document)
|
||||
HMAC MD5 with 96bit crypto checksum (rfc2403.txt)
|
||||
HMAC SHA1 with 96bit crypto checksum (rfc2404.txt)
|
||||
new IPsec ESP
|
||||
null encryption (rfc2410.txt)
|
||||
DES-CBC with derived IV
|
||||
(draft-ietf-ipsec-ciph-des-derived-01.txt, draft expired)
|
||||
DES-CBC with explicit IV (rfc2405.txt)
|
||||
3DES-CBC with explicit IV (rfc2451.txt)
|
||||
BLOWFISH CBC (rfc2451.txt)
|
||||
CAST128 CBC (rfc2451.txt)
|
||||
RC5 CBC (rfc2451.txt)
|
||||
each of the above can be combined with:
|
||||
ESP authentication with HMAC-MD5(96bit)
|
||||
ESP authentication with HMAC-SHA1(96bit)
|
||||
|
||||
The following algorithms are NOT supported:
|
||||
old IPsec AH
|
||||
HMAC MD5 with 128bit crypto checksum + 64bit replay prevention
|
||||
(rfc2085.txt)
|
||||
keyed SHA1 with 160bit crypto checksum + 32bit padding (rfc1852.txt)
|
||||
|
||||
IPsec (in kernel) and IKE (in userland as "racoon") have been tested
|
||||
at several interoperability test events, and it is known to interoperate
|
||||
with many other implementations well. Also, KAME IPsec has quite wide
|
||||
coverage for IPsec crypto algorithms documented in RFC (we cover
|
||||
algorithms without intellectual property issues only).
|
||||
|
||||
4.5 ECN consideration on IPsec tunnels
|
||||
|
||||
KAME IPsec implements ECN-friendly IPsec tunnel, described in
|
||||
draft-ipsec-ecn-00.txt.
|
||||
Normal IPsec tunnel is described in RFC2401. On encapsulation,
|
||||
IPv4 TOS field (or, IPv6 traffic class field) will be copied from inner
|
||||
IP header to outer IP header. On decapsulation outer IP header
|
||||
will be simply dropped. The decapsulation rule is not compatible
|
||||
with ECN, since ECN bit on the outer IP TOS/traffic class field will be
|
||||
lost.
|
||||
To make IPsec tunnel ECN-friendly, we should modify encapsulation
|
||||
and decapsulation procedure. This is described in
|
||||
http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt, chapter 3.
|
||||
|
||||
KAME IPsec tunnel implementation can give you three behaviors, by setting
|
||||
net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value:
|
||||
- RFC2401: no consideration for ECN (sysctl value -1)
|
||||
- ECN forbidden (sysctl value 0)
|
||||
- ECN allowed (sysctl value 1)
|
||||
Note that the behavior is configurable in per-node manner, not per-SA manner
|
||||
(draft-ipsec-ecn-00 wants per-SA configuration, but it looks too much for me).
|
||||
|
||||
The behavior is summarized as follows (see source code for more detail):
|
||||
|
||||
                encapsulate                     decapsulate
                ---                             ---
RFC2401         copy all TOS bits               drop TOS bits on outer
                from inner to outer.            (use inner TOS bits as is)

ECN forbidden   copy TOS bits except for ECN    drop TOS bits on outer
                (masked with 0xfc) from inner   (use inner TOS bits as is)
                to outer.  set ECN bits to 0.

ECN allowed     copy TOS bits except for ECN    use inner TOS bits with some
                CE (masked with 0xfe) from      change.  if outer ECN CE bit
                inner to outer.                 is 1, enable ECN CE bit on
                set ECN CE bit to 0.            the inner.
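
The table can be read as the following sketch. It follows the table only; the
actual KAME code may perform additional checks. The function and enum names
are hypothetical, and the mode values mirror the sysctl values listed above.

```c
#include <stdint.h>

/* Mode values mirror the sysctl values (-1, 0, 1); names are hypothetical. */
enum ecn_mode { ECN_RFC2401 = -1, ECN_FORBIDDEN = 0, ECN_ALLOWED = 1 };

/* Compute the outer TOS/traffic class from the inner one on encapsulation. */
uint8_t
ecn_encap_tos(int mode, uint8_t inner)
{
	switch (mode) {
	case ECN_ALLOWED:
		return inner & 0xfe;	/* copy all but the ECN CE bit */
	case ECN_FORBIDDEN:
		return inner & 0xfc;	/* copy all but both ECN bits */
	default:
		return inner;		/* RFC2401: copy all TOS bits */
	}
}

/* Compute the inner TOS/traffic class after decapsulation. */
uint8_t
ecn_decap_tos(int mode, uint8_t outer, uint8_t inner)
{
	if (mode == ECN_ALLOWED && (outer & 0x01))
		inner |= 0x01;	/* propagate CE seen on the outer header */
	return inner;		/* otherwise the outer TOS bits are dropped */
}
```

For example, with "ECN allowed" an inner TOS of 0x02 and an outer TOS of 0x01
give 0x03 after decapsulation, while the other two modes leave 0x02 unchanged.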
|
||||
|
||||
General strategy for configuration is as follows:
|
||||
- if both IPsec tunnel endpoints are capable of ECN-friendly behavior,
you'd better configure both ends to "ECN allowed" (sysctl value 1).
|
||||
- if the other end is very strict about TOS bit, use "RFC2401"
|
||||
(sysctl value -1).
|
||||
- in other cases, use "ECN forbidden" (sysctl value 0).
|
||||
The default behavior is "ECN forbidden" (sysctl value 0).
|
||||
|
||||
For more information, please refer to:
|
||||
http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt
|
||||
RFC2481 (Explicit Congestion Notification)
|
||||
KAME sys/netinet6/{ah,esp}_input.c
|
||||
|
||||
(Thanks go to Kenjiro Cho <kjc@csl.sony.co.jp> for detailed analysis)
|
||||
|
||||
4.6 Interoperability
|
||||
|
||||
Here are (some of) the platforms with which we have tested IPsec/IKE
interoperability in the past. Note that both ends (KAME and others) may have
modified their implementation, so use the following list for reference only.
|
||||
Altiga, Ashley-laurent (vpcom.com), Data Fellows (F-Secure), Ericsson
|
||||
ACC, FreeS/WAN, HITACHI, IBM AIX, IIJ, Intel, Microsoft WinNT, NIST
|
||||
(linux IPsec + plutoplus), Netscreen, OpenBSD, RedCreek, Routerware,
|
||||
SSH, Secure Computing, Soliton, Toshiba, VPNet, Yamaha RT100i
|
||||
|
||||
5. IPComp
|
||||
(not yet put into FreeBSD4.x, due to inflate related changes in 4.x.)
|
||||
|
||||
IPComp stands for IP payload compression protocol. It is aimed at
payload compression, not header compression like PPP VJ compression.
This may be useful when you are using a slow serial link (say, a cell phone)
with a powerful CPU (well, recent notebook PCs are really powerful...).
|
||||
The protocol design of IPComp is very similar to IPsec.
|
||||
|
||||
KAME implements the following specifications:
|
||||
- RFC2393: IP Payload Compression Protocol (IPComp)
|
||||
- RFC2394: IP Payload Compression Using DEFLATE
|
||||
|
||||
Here are some points to be noted:
|
||||
- IPComp is treated as part of IPsec protocol suite, and SPI and
|
||||
CPI space is unified. Spec says that there's no relationship
|
||||
between two so they are assumed to be separate.
|
||||
- IPComp association (IPCA) is kept in SAD.
|
||||
- It is possible to use well-known CPI (CPI=2 for DEFLATE for example),
|
||||
for outbound/inbound packet, but for indexing purposes one element from
|
||||
SPI/CPI space will be occupied anyway.
|
||||
- pfkey is modified to support IPComp. However, there's no official
SA type number assignment yet. Portability with other IPComp
stacks is questionable (anyway, who else implements IPComp on UN*X?).
|
||||
- Spec says that IPComp output processing must be performed before IPsec
|
||||
output processing, to achieve better compression ratio and "stir" data
|
||||
stream before encryption. However, with manual SPD setting, you are able to
|
||||
violate the ordering requirement (KAME code is too generic, maybe).
|
||||
- Though MTU can be significantly decreased by using IPComp, no special
|
||||
consideration is made about path MTU (spec talks nothing about MTU
|
||||
consideration). IPComp is designed for serial links, not ethernet-like
|
||||
medium, it seems.
|
||||
- You can change compression ratio on outbound packet, by changing
|
||||
deflate_policy in sys/netinet6/ipcomp_core.c. You can also change history
|
||||
buffer size by changing deflate_window in the same source code.
|
||||
(should it be sysctl accessible? or per-SAD configurable?)
|
||||
- Tunnel mode IPComp is not working right. A KAME box can generate tunneled
IPComp packets, but cannot accept tunneled IPComp packets.
|
||||
|
||||
6. ALTQ
|
||||
(not yet put into FreeBSD4.x)
|
||||
|
||||
<end of IMPLEMENTATION>
|
629
share/examples/IPv6/USAGE
Normal file
@ -0,0 +1,629 @@
|
||||
USAGE
|
||||
|
||||
KAME Project
|
||||
http://www.kame.net/newsletter/
|
||||
$FreeBSD$
|
||||
|
||||
This is an introduction to how to use the commands provided in the KAME
kit. For more information, please refer to each man page.
|
||||
|
||||
<<<ifconfig>>>
|
||||
|
||||
A link-local address is automatically assigned to each interface, when
|
||||
the interface becomes up for the first time. Even if you find an interface
|
||||
without a link-local address, do not panic. The link-local address will be
|
||||
assigned when it becomes up (with "ifconfig IF up").
|
||||
|
||||
Some network drivers allow an interface to become up even without a
|
||||
hardware address (for example, PCMCIA network cards). In such cases, it is
|
||||
possible that an interface has no link-local address even if the
|
||||
interface is up. If you see such a situation, please disable the
|
||||
interface once and then re-enable it (i.e. do `ifconfig IF down;
|
||||
ifconfig IF up').
|
||||
|
||||
Pseudo interfaces (like "gif" tunnel device) will borrow IPv6 interface
|
||||
identifier (lowermost 64bit of the address) from EUI64/IEEE802 sources,
|
||||
like ethernet cards. Pseudo interfaces will be able to get IPv6 link-local
|
||||
address, if you have other "real" interface configured beforehand.
|
||||
If you have no EUI64/IEEE802 sources on the node, you may need to configure
|
||||
link-local address manually. Though we have last-resort code in the kernel,
which generates the interface identifier from MD5(hostname), it may not be
suitable for your usage (for example, if you configure the same hostname on
both sides of a gif tunnel, you will be doomed).
|
||||
|
||||
If you have a router announcing Router Advertisement,
|
||||
global addresses will be assigned automatically. So, "ifconfig" is not
|
||||
necessary for your *host*. (Please refer to "sysctl" section for configuring
|
||||
a host to accept Router Advertisement.)
|
||||
|
||||
If you want to set up a router, you need to assign global addresses
|
||||
for two or more interfaces by "ifconfig" or "prefix". (The prefix command
is described in the next section.)
|
||||
If you want to assign a global address by "ifconfig", don't forget to
|
||||
specify the "alias" argument to keep the link-local address.
|
||||
|
||||
# ifconfig de0 inet6 fec0:0:0:1000:200:f8ff:fe01:6317 alias
|
||||
# ifconfig de0
|
||||
de0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet 172.16.202.12 netmask 0xffffff00 broadcast 172.16.202.255
|
||||
inet6 fe80::200:f8ff:fe01:6317%de0 prefixlen 64
|
||||
inet6 fec0:0:0:1000:200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:1000:: prefixlen 64 anycast
|
||||
ether 00:00:f8:01:63:17
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
|
||||
|
||||
See also "/etc/rc.network6" for actual examples.
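
In the output above, the link-local address fe80::200:f8ff:fe01:6317 is built
from the ethernet address 00:00:f8:01:63:17. The sketch below shows the
standard modified EUI-64 conversion that produces this result. It is an
illustration, not the kernel code, and mac_to_linklocal() is a hypothetical
helper name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build a link-local address (fe80::/64 plus a 64bit interface
 * identifier) from a 48bit IEEE802 (ethernet) address.
 * (mac_to_linklocal is a hypothetical helper name.)
 */
void
mac_to_linklocal(const uint8_t mac[6], uint8_t addr[16])
{
	memset(addr, 0, 16);
	addr[0] = 0xfe;			/* fe80::/64 link-local prefix */
	addr[1] = 0x80;
	addr[8] = mac[0] ^ 0x02;	/* invert the universal/local bit */
	addr[9] = mac[1];
	addr[10] = mac[2];
	addr[11] = 0xff;		/* ff:fe is inserted in the middle */
	addr[12] = 0xfe;
	addr[13] = mac[3];
	addr[14] = mac[4];
	addr[15] = mac[5];
}
```

The 16 bytes written to addr can be copied into a struct in6_addr and printed
with inet_ntop(3).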
|
||||
|
||||
<<<prefix>>>
|
||||
|
||||
In the IPv6 architecture, an IPv6 address of an interface can be generated
from a prefix assigned to it and a link-dependent identifier for the
interface. Assigning a full IPv6 address by ifconfig is no longer
necessary, because the user only has to take care of the prefix and can
let the system take care of the interface identifier.
|
||||
|
||||
The newly added "prefix" command enables the user to just assign prefixes
to interfaces, and let the system automatically generate IPv6
addresses. Prefixes added by the "prefix" command are maintained in
the kernel consistently with prefixes assigned by Router
Renumbering (in the case of routers).
|
||||
|
||||
But the "prefix" command can only be used on a router, because a host should
be able to configure its address automatically. Prefixes added by the "prefix"
command are maintained independently from prefixes assigned by
Router Advertisement. Those two types of prefixes should not coexist on
a machine at the same time; when that happens, it is considered to be a
misconfiguration.
|
||||
|
||||
Manual assignment of prefixes or change of prefix properties takes
|
||||
precedence over ones assigned by Router Renumbering.
|
||||
|
||||
If you want to assign a prefix (and consequently an address) manually, do
|
||||
as follows:
|
||||
|
||||
# prefix de0 fec0:0:0:1000::
|
||||
# ifconfig de0
|
||||
de0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet 172.16.202.12 netmask 0xffffff00 broadcast 172.16.202.255
|
||||
inet6 fe80:1::200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:1000:200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:1000:: prefixlen 64 anycast
|
||||
ether 00:00:f8:01:63:17
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
|
||||
|
||||
To check the assigned prefix, use the "ndp" command. (See the description of
the ndp command for its usage.)
|
||||
|
||||
# ndp -p
|
||||
fec0:0:0:1000::/64 if=de0
|
||||
flags=LA, vltime=2592000, pltime=604800, expire=Never
|
||||
No advertising router
|
||||
|
||||
The "prefix" command also has node internal prefix renumbering
|
||||
ability.
|
||||
|
||||
If you have multiple prefixes which have fec0:0:0:1000::/56 at the top,
and would like to renumber them to fec0:0:0:2000::/56, then use the
"prefix" command with the "matchpr" argument and the "usepr" argument.
|
||||
|
||||
Suppose that the current state before renumbering is as follows:
|
||||
|
||||
# ifconfig de0
|
||||
de0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet 172.16.202.12 netmask 0xffffff00 broadcast 172.16.202.255
|
||||
inet6 fe80:1::200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:1000:200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:1000:: prefixlen 64 anycast
|
||||
ether 00:00:f8:01:63:17
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
|
||||
|
||||
# ifconfig de1
|
||||
de1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet 172.16.203.12 netmask 0xffffff00 broadcast 172.16.203.255
|
||||
inet6 fe80:1::200:f8ff:fe55:7011 prefixlen 64
|
||||
inet6 fec0:0:0:1001:200:f8ff:fe55:7011 prefixlen 64
|
||||
inet6 fec0:0:0:1001:: prefixlen 64 anycast
|
||||
ether 00:00:f8:55:70:11
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
|
||||
|
||||
# ndp -p
|
||||
fec0:0:0:1000::/64 if=de0
|
||||
flags=LA, vltime=2592000, pltime=604800, expire=Never
|
||||
No advertising router
|
||||
fec0:0:0:1001::/64 if=de1
|
||||
flags=LA, vltime=2592000, pltime=604800, expire=Never
|
||||
No advertising router
|
||||
|
||||
Then do as follows:
|
||||
|
||||
# prefix -a matchpr fec0:0:0:1000:: mp_len 56 usepr fec0:0:0:2000:: up_uselen 56 change
|
||||
|
||||
If the command is successful, prefixes and addresses will be renumbered as
|
||||
follows.
|
||||
|
||||
# ifconfig de0
|
||||
de0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet 172.16.202.12 netmask 0xffffff00 broadcast 172.16.202.255
|
||||
inet6 fe80:1::200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:2000:200:f8ff:fe01:6317 prefixlen 64
|
||||
inet6 fec0:0:0:2000:: prefixlen 64 anycast
|
||||
ether 00:00:f8:01:63:17
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
|
||||
# ifconfig de1
|
||||
de1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet 172.16.203.12 netmask 0xffffff00 broadcast 172.16.203.255
|
||||
inet6 fe80:1::200:f8ff:fe55:7011 prefixlen 64
|
||||
inet6 fec0:0:0:2001:200:f8ff:fe55:7011 prefixlen 64
|
||||
inet6 fec0:0:0:2001:: prefixlen 64 anycast
|
||||
ether 00:00:f8:55:70:11
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP
|
||||
# ndp -p
|
||||
fec0:0:0:2000::/64 if=de0
|
||||
flags=LA, vltime=2592000, pltime=604800, expire=Never
|
||||
No advertising router
|
||||
fec0:0:0:2001::/64 if=de1
|
||||
flags=LA, vltime=2592000, pltime=604800, expire=Never
|
||||
No advertising router
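
The effect of the renumbering example above on a single address can be
sketched as below: if the leading 56 bits match fec0:0:0:1000::, they are
replaced by the leading 56 bits of fec0:0:0:2000::, and the remaining bits are
kept. This only illustrates the bit operation implied by the example; it is
not the code of the "prefix" command, and the function names are hypothetical.

```c
#include <stdint.h>

/* Mask covering the prefix bits (out of len) that fall in byte i. */
static uint8_t
mask_byte(int len, int i)
{
	int bits = len - i * 8;

	if (bits >= 8)
		return 0xff;
	if (bits <= 0)
		return 0x00;
	return (uint8_t)(0xff << (8 - bits));
}

/*
 * If the first len bits of addr equal those of matchpr, overwrite them
 * with the first len bits of usepr and return 1; otherwise leave addr
 * untouched and return 0.
 * (renumber_prefix is a hypothetical helper name.)
 */
int
renumber_prefix(uint8_t addr[16], const uint8_t matchpr[16],
    const uint8_t usepr[16], int len)
{
	uint8_t mask;
	int i;

	for (i = 0; i < 16; i++)
		if ((addr[i] ^ matchpr[i]) & mask_byte(len, i))
			return 0;	/* prefix does not match */
	for (i = 0; i < 16; i++) {
		mask = mask_byte(len, i);
		addr[i] = (uint8_t)((usepr[i] & mask) | (addr[i] & ~mask));
	}
	return 1;
}
```

Applied to fec0:0:0:1001:200:f8ff:fe55:7011 with a length of 56, this gives
fec0:0:0:2001:200:f8ff:fe55:7011, as in the de1 output above.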
|
||||
|
||||
See also "/etc/rc.network6" for actual examples.
|
||||
|
||||
<<<route>>>
|
||||
|
||||
If there is a router announcing Router Advertisement on the subnet,
|
||||
you don't need to add a default route for your host by yourself.
|
||||
(Please refer to "sysctl" section to accept Router Advertisement.)
|
||||
|
||||
If you want to add a default route manually, do as follows:
|
||||
|
||||
# route add -inet6 default fe80::200:a2ff:fe0e:7543%de0
|
||||
|
||||
"default" means ::/0.
|
||||
|
||||
Note that, in IPv6, a link-local address should be used as the gateway
("fe80::200:a2ff:fe0e:7543%de0" in the above). If you use global addresses,
icmp6 redirect may not work properly. For ease of configuration we recommend
that you avoid static routes and run a routing daemon (route6d for example)
instead.
|
||||
|
||||
<<<ping6>>> (This might be integrated into "ping" as "ping -6" in the future.)
|
||||
|
||||
Reachability can be checked by "ping6". This "ping6" accepts a multicast
address as its argument.
|
||||
|
||||
% ping6 -I xl0 ff02::1
|
||||
or
|
||||
% ping6 ff02::1%xl0
|
||||
|
||||
PING6(56=40+8+8 bytes) fe80::5254:ff:feda:cb7d --> ff02::1
|
||||
56 bytes from fe80::5254:ff:feda:cb7d, icmp_seq=0 hlim=64 time=0.25 ms
|
||||
56 bytes from fe80::2a0:c9ff:fe84:ed6c, icmp_seq=0 hlim=64 time=1.333 ms(DUP!)
|
||||
56 bytes from fe80::5254:ff:feda:d161, icmp_seq=0 hlim=64 time=1.459 ms(DUP!)
|
||||
56 bytes from fe80::260:97ff:fec2:80bf, icmp_seq=0 hlim=64 time=1.538 ms(DUP!)
|
||||
|
||||
<<<ping6 -w>>>
|
||||
|
||||
Name resolution is possible by ICMPv6 node information query message.
|
||||
This is very convenient for link-local addresses whose host name cannot be
|
||||
resolved by DNS. Specify the "-w" option to "ping6".
|
||||
|
||||
% ping6 -I xl0 -w ff02::1
|
||||
|
||||
64 bytes from fe80::5254:ff:feda:cb7d: fto.kame.net
|
||||
67 bytes from fe80::5254:ff:feda:d161: banana.kame.net
|
||||
69 bytes from fe80::2a0:c9ff:fe84:ebd9: paradise.kame.net
|
||||
66 bytes from fe80::260:8ff:fe8b:447f: taroh.kame.net
|
||||
66 bytes from fe80::2a0:c9ff:fe84:ed6c: ayame.kame.net
|
||||
|
||||
<<<traceroute6>>>
|
||||
|
||||
The route for a target host can be checked by "traceroute6".
|
||||
|
||||
% traceroute6 tokyo.v6.wide.ad.jp
|
||||
|
||||
traceroute to tokyo.v6.wide.ad.jp (3ffe:501:0:401:200:e8ff:fed5:8923), 30 hops max, 12 byte packets
|
||||
1 nr60.v6.kame.net 1.239 ms 0.924 ms 0.908 ms
|
||||
2 otemachi.v6.wide.ad.jp 28.953 ms 31.451 ms 26.567 ms
|
||||
3 tokyo.v6.wide.ad.jp 26.549 ms 26.58 ms 26.186 ms
|
||||
|
||||
If the -l option is specified, both address and name are shown in each line.
|
||||
% traceroute6 -l tokyo.v6.wide.ad.jp
|
||||
|
||||
traceroute to tokyo.v6.wide.ad.jp (3ffe:501:0:401:200:e8ff:fed5:8923), 30 hops max, 12 byte packets
|
||||
1 nr60.v6.kame.net (3ffe:501:4819:2000:260:97ff:fec2:80bf) 1.23 ms 0.952 ms 0.92 ms
|
||||
2 otemachi.v6.wide.ad.jp (3ffe:501:0:1802:260:97ff:feb6:7ff0) 27.345 ms 26.706 ms 26.563 ms
|
||||
3 tokyo.v6.wide.ad.jp (3ffe:501:0:401:200:e8ff:fed5:8923) 26.329 ms 26.36 ms 28.63 ms
|
||||
|
||||
<<<ndp>>>
|
||||
|
||||
To display the current Neighbor cache, use "ndp":
|
||||
|
||||
% ndp -a
|
||||
Neighbor Linklayer Address Netif Expire St Flgs Prbs
|
||||
nr60.v6.kame.net 0:60:97:c2:80:bf xl0 expired S R
|
||||
fec0:0:0:1000:2c0:cff:fe10 0:c0:c:10:3a:53 xl0 permanent R
|
||||
paradise.v6.kame.net 52:54:0:dc:52:17 xl0 expired S R
|
||||
fe80:1::200:eff:fe49:f929 0:0:e:49:f9:29 xl0 expired S R
|
||||
fe80:1::200:86ff:fe05:80da 0:0:86:5:80:da xl0 expired S
|
||||
fe80:1::200:86ff:fe05:c2d8 0:0:86:5:c2:d8 xl0 9s R
|
||||
|
||||
To flush all NDP cache entries, execute the following as root:
|
||||
|
||||
# ndp -c
|
||||
|
||||
To display the prefix list:
|
||||
|
||||
% ndp -p
|
||||
fec0:0:0:1000::/64 if=xl0
|
||||
flags=LA, vltime=2592000, pltime=604800, expire=29d23h59m58s
|
||||
advertised by
|
||||
fe80::5254:ff:fedc:5217
|
||||
fe80::260:97ff:fec2:80bf
|
||||
fe80::200:eff:fe49:f929
|
||||
|
||||
To display the default router list:
|
||||
|
||||
% ndp -r
|
||||
fe80::260:97ff:fec2:80bf if=xl0, flags=, expire=29m55s
|
||||
fe80::5254:ff:fedc:5217 if=xl0, flags=, expire=29m7s
|
||||
fe80::200:eff:fe49:f929 if=xl0, flags=, expire=28m47s
|
||||
|
||||
<<<rtsol>>>
|
||||
|
||||
To send a Router Solicitation message immediately and obtain global
addresses, use "rtsol":
|
||||
|
||||
# ifconfig xl0
|
||||
xl0: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet6 fe80:2::2a0:24ff:feab:839b%xl0 prefixlen 64
|
||||
ether 0:a0:24:ab:83:9b
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP 100baseTX <hw-loopback>
|
||||
|
||||
# rtsol xl0
|
||||
# ifconfig xl0
|
||||
xl0: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
|
||||
inet6 fe80:2::2a0:24ff:feab:839b%xl0 prefixlen 64
|
||||
inet6 fec0:0:0:1000:2a0:24ff:feab:839b prefixlen 64
|
||||
ether 0:a0:24:ab:83:9b
|
||||
media: autoselect (10baseT/UTP) status: active
|
||||
supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex> 10baseT/UTP 100baseTX <hw-loopback>
|
||||
|
||||
|
||||
<<<rtsold>>>
|
||||
|
||||
rtsold is a daemon version of rtsol. If you run KAME IPv6 on a laptop
computer and move it frequently, the daemon is useful because it watches
the interface and sends router solicitations when the status of the interface
changes. Note, however, that this feature is disabled by default. To enable
it, pass the -m option when invoking rtsold.
|
||||
|
||||
rtsold also supports multiple interfaces. For example, you can
|
||||
invoke the daemon as follows:
|
||||
# rtsold -m ep0 cnw0
|
||||
|
||||
<<<netstat>>>
|
||||
|
||||
To see the routing table:
|
||||
|
||||
# netstat -nr
|
||||
# netstat -nrl (long format with Ref and Use)
|
||||
|
||||
<<<sysctl>>>
|
||||
|
||||
If "net.inet6.ip6.accept_rtadv" is 1, Router Advertisements are
accepted. This means that global addresses and the default route are
set up automatically. Otherwise, the advertisements are rejected. The
default value is 0. To set "net.inet6.ip6.accept_rtadv" to 1, execute
the following:
|
||||
|
||||
# sysctl -w net.inet6.ip6.accept_rtadv=1
|
||||
|
||||
<<<gifconfig>>>
|
||||
|
The "gif" interface lets you perform IPv{4,6}-over-IPv{4,6}
protocol tunneling. To use this interface, you must specify the
outer IPv{4,6} addresses with gifconfig, for example:
|
||||
|
||||
# gifconfig gif0 172.16.198.61 172.16.11.21
|
||||
|
"ifconfig gif0" configures the address pair used for the inner
IPv{4,6} header.
|
||||
|
||||
Configuring the inner IPv{4,6} address pair is optional. If
you do not configure it, the tunnel link is treated as an
un-numbered link, and the source address for the inner
IPv{4,6} header is borrowed from another interface.
|
||||
|
||||
The following example configures an un-numbered IPv6-over-IPv4 tunnel:
|
||||
# gifconfig gif0 10.0.0.1 10.0.0.1 netmask 255.255.255.0
|
||||
|
||||
The following example configures a numbered IPv6-over-IPv4 tunnel:
|
||||
# gifconfig gif0 10.0.0.1 10.0.0.1 netmask 255.255.255.0
|
||||
# ifconfig gif0 inet6 fec0:0:0:3000::1 fec0:0:0:3000::2 prefixlen 64 alias
|
||||
|
||||
The IPv6 specification allows a point-to-point link to be used without a
global IPv6 address assigned to the interface, and routing protocols (such
as RIPng) use link-local addresses only. If you configure an IPv6-over-IPv4
tunnel, you therefore do not need to configure an address pair for the inner
IPv6 header. For simplicity, we suggest using the former example (the
un-numbered IPv6-over-IPv4 tunnel) for router-to-router connections to
the 6bone.
|
||||
|
||||
Note that it is very easy to create an infinite routing loop with a gif
interface if you configure a tunnel that uses the same protocol family
for the inner and outer headers (e.g. IPv4-over-IPv4).
|
||||
|
||||
Refer to gifconfig(8) for more details.
|
||||
|
||||
<<<inetd>>>
|
||||
|
||||
Inetd supports AF_INET and AF_INET6 sockets, with IPsec policy
|
||||
configuration support.
|
||||
|
||||
Refer to inetd(8) for more details.
|
||||
|
||||
<<<IPsec>>>
|
||||
|
||||
The current KAME supports both transport mode and tunnel mode.
|
||||
However, tunnel mode comes with some restrictions.
|
||||
http://www.kame.net/newsletter/ has more comprehensive examples.
|
||||
|
||||
Let's set up security associations to deploy a secure channel between
HOST A (10.2.3.4) and HOST B (10.6.7.8). Here we show a slightly
complicated example. From HOST A to HOST B, only old AH is used.
From HOST B to HOST A, new AH and new ESP are combined.
|
||||
|
||||
Now we should choose the algorithms to be used for "AH"/"new
AH"/"ESP"/"new ESP". Please refer to the "setkey" man page for the
algorithm names. Our choice is MD5 for AH, new-HMAC-SHA1 for new AH,
and new-DES-expIV with 8 byte IV for new ESP.
|
||||
|
||||
The key length depends heavily on the algorithm. For example, the key
must be exactly 16 bytes for MD5, 20 bytes for new-HMAC-SHA1,
and 8 bytes for new-DES-expIV. We choose "MYSECRETMYSECRET",
"KAMEKAMEKAMEKAMEKAME", and "PASSWORD", respectively.
|
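
Since a key of the wrong length is an easy mistake to make, it may be worth
checking the byte counts before handing the keys to setkey. The following
is a minimal sketch using only standard shell tools. The helper names
key_len and check_key are hypothetical; the algorithm names are the setkey
spellings used later in this section.

```shell
#!/bin/sh
# key_len KEY: print the length of KEY in bytes.
key_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# check_key ALGORITHM KEY EXPECTED_BYTES: report whether KEY has the
# expected length; returns non-zero on a mismatch.
check_key() {
    actual=$(key_len "$2")
    if [ "$actual" -eq "$3" ]; then
        echo "$1: ok ($actual bytes)"
    else
        echo "$1: expected $3 bytes, got $actual"
        return 1
    fi
}

# The three keys chosen above, with the lengths stated in the text.
check_key keyed-md5 "MYSECRETMYSECRET" 16
check_key hmac-sha1 "KAMEKAMEKAMEKAMEKAME" 20
check_key des-cbc "PASSWORD" 8
```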
||||
|
||||
OK, let's assign an SPI (Security Parameter Index) to each protocol.
Note that we need 3 SPIs for this secure channel, since three
security headers are produced (one from HOST A to HOST B, two
from HOST B to HOST A). Note also that an SPI MUST be greater
than or equal to 256. We choose 1000, 2000, and 3000, respectively.
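
The lower bound on the SPI can be checked mechanically. The sketch below is
an illustration only: the function name spi_ok is hypothetical, and it relies
on shell arithmetic to accept both decimal values and the 0x-prefixed
hexadecimal values used in later examples.

```shell
#!/bin/sh
# spi_ok SPI: succeed if SPI (decimal or 0x-prefixed hex) is >= 256.
spi_ok() {
    [ "$(($1))" -ge 256 ]
}

for spi in 1000 2000 3000 0x10001 255; do
    if spi_ok "$spi"; then
        echo "$spi: usable"
    else
        echo "$spi: too small (must be >= 256)"
    fi
done
```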
|
||||
|
||||
|
||||
          (1)
  HOST A ------> HOST B

  (1)PROTO=AH
      ALG=MD5(RFC1826)
      KEY=MYSECRETMYSECRET
      SPI=1000

          (2.1)
  HOST A <------ HOST B
         <------
          (2.2)

  (2.1)
  PROTO=AH
      ALG=new-HMAC-SHA1(new AH)
      KEY=KAMEKAMEKAMEKAMEKAME
      SPI=2000

  (2.2)
  PROTO=ESP
      ALG=new-DES-expIV(new ESP)
          IV length = 8
      KEY=PASSWORD
      SPI=3000
|
||||
|
||||
Now, let's set up the security associations. Execute "setkey" on both HOST
A and B:
|
||||
|
||||
# setkey -c
|
||||
add 10.2.3.4 10.6.7.8 ah-old 1000 -m transport -A keyed-md5 "MYSECRETMYSECRET" ;
|
||||
add 10.6.7.8 10.2.3.4 ah 2000 -m transport -A hmac-sha1 "KAMEKAMEKAMEKAMEKAME" ;
|
||||
add 10.6.7.8 10.2.3.4 esp 3000 -m transport -E des-cbc "PASSWORD" ;
|
||||
^D
|
||||
|
||||
IPsec processing does not actually take place until security policy
entries are defined. In this case, you must set up the policy on each host.
|
||||
|
||||
At A:
|
||||
# setkey -c
|
||||
spdadd 10.2.3.4 10.6.7.8 any -P out ipsec
|
||||
ah/transport/10.2.3.4-10.6.7.8/require ;
|
||||
^D
|
||||
|
||||
At B:
# setkey -c
|
||||
spdadd 10.6.7.8 10.2.3.4 any -P out ipsec
esp/transport/10.6.7.8-10.2.3.4/require
ah/transport/10.6.7.8-10.2.3.4/require ;
|
||||
^D
|
||||
|
||||
To use the security associations installed in the kernel, you
must set the socket security level with setsockopt().
This is per-application (or per-socket) security. For example,
the "ping" command has a -P option that takes a policy specification
to enable AH and/or ESP.
|
||||
|
||||
For example:
|
||||
% ping -P "out ipsec \
|
||||
ah/transport/10.0.1.1-10.0.2.2/use \
|
||||
esp/tunnel/10.0.1.1-10.0.1.2/require" 10.0.2.2
|
||||
|
||||
If the proper SAs exist, this policy specification causes the ICMP packet
to be protected by AH in transport mode, carried inside ESP in tunnel mode,
as shown below.
|
||||
|
||||
 HOST C -----------> GATEWAY D ----------> HOST E
10.0.1.1         10.0.1.2 10.0.2.1       10.0.2.2
   |                 |       |               |
   | ======= ESP =======                     |
   ==================== AH ==================
|
||||
|
||||
|
||||
|
||||
Another example using IPv6.
|
||||
|
||||
ESP transport mode is recommended for TCP port number 110 between Host-A and
|
||||
Host-B.
|
||||
|
||||
    ============ ESP ============
    |                           |
 Host-A                        Host-B
fec0::10 -------------------- fec0::11
|
||||
|
||||
The encryption algorithm is blowfish-cbc with the key "kamekame", and the
authentication algorithm is hmac-sha1 with the key "this is the test key".
Configuration at Host-A:
|
||||
|
||||
# setkey -c <<EOF
|
||||
spdadd fec0::10[any] fec0::11[110] tcp -P out ipsec
|
||||
esp/transport/fec0::10-fec0::11/use ;
|
||||
spdadd fec0::11[110] fec0::10[any] tcp -P in ipsec
|
||||
esp/transport/fec0::11-fec0::10/use ;
|
||||
add fec0::10 fec0::11 esp 0x10001
|
||||
-m transport
|
||||
-E blowfish-cbc "kamekame"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
add fec0::11 fec0::10 esp 0x10002
|
||||
-m transport
|
||||
-E blowfish-cbc "kamekame"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
EOF
|
||||
|
||||
and at Host-B:
|
||||
|
||||
# setkey -c <<EOF
|
||||
spdadd fec0::11[110] fec0::10[any] tcp -P out ipsec
|
||||
esp/transport/fec0::11-fec0::10/use ;
|
||||
spdadd fec0::10[any] fec0::11[110] tcp -P in ipsec
|
||||
esp/transport/fec0::10-fec0::11/use ;
|
||||
add fec0::10 fec0::11 esp 0x10001 -m transport
|
||||
-E blowfish-cbc "kamekame"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
add fec0::11 fec0::10 esp 0x10002 -m transport
|
||||
-E blowfish-cbc "kamekame"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
EOF
|
||||
|
||||
Note the direction of each security policy (SP): the "in" and "out" entries
are swapped between the two hosts.
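
Because the two hosts need mirror-image policies, generating them from one
template reduces the chance of mixing up the directions. The following is a
minimal sketch under that assumption. The helper name emit_esp_policies is
hypothetical, and the script only prints the spdadd entries for review; it
does not invoke setkey.

```shell
#!/bin/sh
# emit_esp_policies LOCAL REMOTE LOCAL_PORT REMOTE_PORT
# Print the outbound and inbound TCP policy entries as seen from LOCAL.
emit_esp_policies() {
    # Outbound: packets from the local host to the remote host.
    printf 'spdadd %s[%s] %s[%s] tcp -P out ipsec\n' "$1" "$3" "$2" "$4"
    printf '\tesp/transport/%s-%s/use ;\n' "$1" "$2"
    # Inbound: the same selector with source and destination swapped.
    printf 'spdadd %s[%s] %s[%s] tcp -P in ipsec\n' "$2" "$4" "$1" "$3"
    printf '\tesp/transport/%s-%s/use ;\n' "$2" "$1"
}

echo "# policies for Host-A"
emit_esp_policies fec0::10 fec0::11 any 110
echo "# policies for Host-B"
emit_esp_policies fec0::11 fec0::10 110 any
```

The printed entries match the spdadd lines in the two configurations above;
the "add" entries for the SAs are identical on both hosts and are therefore
not generated here.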
|
||||
|
||||
|
||||
Tunnel mode between two security gateways
|
||||
|
||||
The security protocol is old AH in tunnel mode, i.e. as specified by RFC1826,
with keyed-md5 as the authentication algorithm and "this is the test" as its key.
|
||||
|
||||
                     ======= AH =======
                     |                |
 Network-A       Gateway-A        Gateway-B        Network-B
10.0.1.0/24 ---- 172.16.0.1 ----- 172.16.0.2 ---- 10.0.2.0/24
|
||||
|
||||
Configuration at Gateway-A:
|
||||
|
||||
# setkey -c <<EOF
|
||||
spdadd 10.0.1.0/24 10.0.2.0/24 any -P out ipsec
|
||||
ah/tunnel/172.16.0.1-172.16.0.2/require ;
|
||||
spdadd 10.0.2.0/24 10.0.1.0/24 any -P in ipsec
|
||||
ah/tunnel/172.16.0.2-172.16.0.1/require ;
|
||||
add 172.16.0.1 172.16.0.2 ah-old 0x10003 -m any
|
||||
-A keyed-md5 "this is the test" ;
|
||||
add 172.16.0.2 172.16.0.1 ah-old 0x10004 -m any
|
||||
-A keyed-md5 "this is the test" ;
EOF
|
||||
|
||||
If the port number field is omitted, as above, "[any]" is assumed. `-m'
specifies the mode of the SA to be used. "-m any" is a wildcard for the mode
of the security protocol, so you can use this SA for both tunnel and
transport mode.
|
||||
|
||||
and at Gateway-B:
|
||||
|
||||
# setkey -c <<EOF
|
||||
spdadd 10.0.2.0/24 10.0.1.0/24 any -P out ipsec
|
||||
ah/tunnel/172.16.0.2-172.16.0.1/require ;
|
||||
spdadd 10.0.1.0/24 10.0.2.0/24 any -P in ipsec
|
||||
ah/tunnel/172.16.0.1-172.16.0.2/require ;
|
||||
add 172.16.0.1 172.16.0.2 ah-old 0x10003 -m any
|
||||
-A keyed-md5 "this is the test" ;
|
||||
add 172.16.0.2 172.16.0.1 ah-old 0x10004 -m any
|
||||
-A keyed-md5 "this is the test" ;
EOF
|
||||
|
||||
|
||||
Making an SA bundle between two security gateways
|
||||
|
||||
AH transport mode and ESP tunnel mode are required between Gateway-A and
Gateway-B. In this case, ESP tunnel mode is applied first, and AH transport
mode is applied next.
|
||||
|
||||
                        ========== AH =========
                        |  ======= ESP =====  |
                        |  |               |  |
   Network-A          Gateway-A          Gateway-B          Network-B
fec0:0:0:1::/64 --- fec0:0:0:1::1 ---- fec0:0:0:2::1 --- fec0:0:0:2::/64
|
||||
|
||||
The encryption algorithm is 3des-cbc, and the authentication algorithm for
ESP is hmac-sha1. The authentication algorithm for AH is hmac-md5.
Configuration at Gateway-A:
|
||||
|
||||
# setkey -c <<EOF
|
||||
spdadd fec0:0:0:1::/64 fec0:0:0:2::/64 any -P out ipsec
|
||||
esp/tunnel/fec0:0:0:1::1-fec0:0:0:2::1/require
|
||||
ah/transport/fec0:0:0:1::1-fec0:0:0:2::1/require ;
|
||||
spdadd fec0:0:0:2::/64 fec0:0:0:1::/64 any -P in ipsec
|
||||
esp/tunnel/fec0:0:0:2::1-fec0:0:0:1::1/require
|
||||
ah/transport/fec0:0:0:2::1-fec0:0:0:1::1/require ;
|
||||
add fec0:0:0:1::1 fec0:0:0:2::1 esp 0x10001 -m tunnel
|
||||
-E 3des-cbc "kamekame12341234kame1234"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
add fec0:0:0:1::1 fec0:0:0:2::1 ah 0x10001 -m transport
|
||||
-A hmac-md5 "this is the test" ;
|
||||
add fec0:0:0:2::1 fec0:0:0:1::1 esp 0x10001 -m tunnel
|
||||
-E 3des-cbc "kamekame12341234kame1234"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
add fec0:0:0:2::1 fec0:0:0:1::1 ah 0x10001 -m transport
|
||||
-A hmac-md5 "this is the test" ;
EOF
|
||||
|
||||
|
||||
Making SAs with different endpoints
|
||||
|
||||
ESP tunnel mode is required between Host-A and Gateway-A. The encryption
algorithm is rc5-cbc, and the authentication algorithm for ESP is hmac-md5.
ESP transport mode is recommended between Host-A and Host-B. The encryption
algorithm is cast128-cbc, and the authentication algorithm for ESP is hmac-sha1.
|
||||
|
||||
    ================== ESP =================
    |  ======= ESP =======                 |
    |  |                 |                 |
   Host-A            Gateway-A           Host-B
fec0:0:0:1::1 ---- fec0:0:0:2::1 ---- fec0:0:0:2::2
|
||||
|
||||
Configuration at Host-A:
|
||||
|
||||
# setkey -c <<EOF
|
||||
spdadd fec0:0:0:1::1[any] fec0:0:0:2::2[80] tcp -P out ipsec
|
||||
esp/transport/fec0:0:0:1::1-fec0:0:0:2::2/use
|
||||
esp/tunnel/fec0:0:0:1::1-fec0:0:0:2::1/require ;
|
||||
spdadd fec0:0:0:2::2[80] fec0:0:0:1::1[any] tcp -P in ipsec
|
||||
esp/transport/fec0:0:0:2::2-fec0:0:0:1::1/use
|
||||
esp/tunnel/fec0:0:0:2::1-fec0:0:0:1::1/require ;
|
||||
add fec0:0:0:1::1 fec0:0:0:2::2 esp 0x10001
|
||||
-m transport
|
||||
-E cast128-cbc "12341234"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
add fec0:0:0:1::1 fec0:0:0:2::1 esp 0x10002
|
||||
-E rc5-cbc "kamekame"
|
||||
-A hmac-md5 "this is the test" ;
|
||||
add fec0:0:0:2::2 fec0:0:0:1::1 esp 0x10003
|
||||
-m transport
|
||||
-E cast128-cbc "12341234"
|
||||
-A hmac-sha1 "this is the test key" ;
|
||||
add fec0:0:0:2::1 fec0:0:0:1::1 esp 0x10004
|
||||
-E rc5-cbc "kamekame"
|
||||
-A hmac-md5 "this is the test" ;
EOF
|
||||
|
||||
<end of USAGE>
|