Commit Graph

58 Commits

Author SHA1 Message Date
bz
18b40a43ee MFp4 CH=183052 183053 183258:
In protosw we define pr_protocol as short, while on the wire
  it is an uint8_t.  That way we can have "internal" protocols
  like DIVERT, SEND or gaps for modules (PROTO_SPACER).
  Switch ipproto_{un,}register to accept a short protocol number(*)
  and do an upfront check for valid boundries. With this we
  also consistently report EPROTONOSUPPORT for out of bounds
  protocols, as we did for proto == 0.  This allows a caller
  to not error for this case, which is especially important
  if we want to automatically call these from domain handling.

  (*) the functions have been without any in-tree consumer
  since the initial introducation, so this is considered save.

  Implement ip6proto_{un,}register() similarly to their legacy IP
  counter parts to allow modules to hook up dynamically.

Reviewed by:	philip, will
MFC after:	1 week
2010-09-02 17:43:44 +00:00
bz
0a90ef1728 MFP4: @176978-176982, 176984, 176990-176994, 177441
"Whitspace" churn after the VIMAGE/VNET whirls.

Remove the need for some "init" functions within the network
stack, like pim6_init(), icmp_init() or significantly shorten
others like ip6_init() and nd6_init(), using static initialization
again where possible and formerly missed.

Move (most) variables back to the place they used to be before the
container structs and VIMAGE_GLOABLS (before r185088) and try to
reduce the diff to stable/7 and earlier as good as possible,
to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9.

This also removes some header file pollution for putatively
static global variables.

Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are
no longer needed.

Reviewed by:	jhb
Discussed with:	rwatson
Sponsored by:	The FreeBSD Foundation
Sponsored by:	CK Software GmbH
MFC after:	6 days
2010-04-29 11:52:42 +00:00
julian
79c1f884ef Virtualize the pfil hooks so that different jails may chose different
packet filters. ALso allows ipfw to be enabled on on ejail and disabled
on another. In 8.0 it's a global setting.

Sitting aroung in tree waiting to commit for: 2 months
MFC after:	2 months
2009-10-11 05:59:43 +00:00
rwatson
88f8de4d40 Remove unused VNET_SET() and related macros; only VNET_GET() is
ever actually used.  Rename VNET_GET() to VNET() to shorten
variable references.

Discussed with:	bz, julian
Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-16 21:13:04 +00:00
rwatson
57ca4583e7 Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
bz
a8839212d2 Make callers to in6_selectsrc() and in6_pcbladdr() pass in memory
to save the selected source address rather than returning an
unreferenced copy to a pointer that might long be gone by the
time we use the pointer for anything meaningful.

Asked for by:	rwatson
Reviewed by:	rwatson
2009-06-23 22:08:55 +00:00
zec
8b1f38241a Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state.  The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.

While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers.  Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.

Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels.  Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.

Bump __FreeBSD_version to 800097.
Reviewed by:	bz, julian
Approved by:	rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
bms
32a71137f0 Bite the bullet, and make the IPv6 SSM and MLDv2 mega-commit:
import from p4 bms_netdev.  Summary of changes:

 * Connect netinet6/in6_mcast.c to build.
   The legacy KAME KPIs are mostly preserved.
 * Eliminate now dead code from ip6_output.c.
   Don't do mbuf bingo, we are not going to do RFC 2292 style
   CMSG tricks for multicast options as they are not required
   by any current IPv6 normative reference.
 * Refactor transports (UDP, raw_ip6) to do own mcast filtering.
   SCTP, TCP unaffected by this change.
 * Add ip6_msource, in6_msource structs to in6_var.h.
 * Hookup mld_ifinfo state to in6_ifextra, allocate from
   domifattach path.
 * Eliminate IN6_LOOKUP_MULTI(), it is no longer referenced.
   Kernel consumers which need this should use in6m_lookup().
 * Refactor IPv6 socket group memberships to use a vector (like IPv4).
 * Update ifmcstat(8) for IPv6 SSM.
 * Add witness lock order for IN6_MULTI_LOCK.
 * Move IN6_MULTI_LOCK out of lower ip6_output()/ip6_input() paths.
 * Introduce IP6STAT_ADD/SUB/INC/DEC as per rwatson's IPv4 cleanup.
 * Update carp(4) for new IPv6 SSM KPIs.
 * Virtualize ip6_mrouter socket.
   Changes mostly localized to IPv6 MROUTING.
 * Don't do a local group lookup in MROUTING.
 * Kill unused KAME prototypes in6_purgemkludge(), in6_restoremkludge().
 * Preserve KAME DAD timer jitter behaviour in MLDv1 compatibility mode.
 * Bump __FreeBSD_version to 800084.
 * Update UPDATING.

NOTE WELL:
 * This code hasn't been tested against real MLDv2 queriers
   (yet), although the on-wire protocol has been verified in Wireshark.
 * There are a few unresolved issues in the socket layer APIs to
   do with scope ID propagation.
 * There is a LOR present in ip6_output()'s use of
   in6_setscope() which needs to be resolved. See comments in mld6.c.
   This is believed to be benign and can't be avoided for the moment
   without re-introducing an indirect netisr.

This work was mostly derived from the IGMPv3 implementation, and
has been sponsored by a third party.
2009-04-29 19:19:13 +00:00
bz
5d8f0a53a7 Remove the single global unlocked route cache ip6_forward_rt
from the inet6 stack along with statistics and make sure we
properly free the rt in all cases.

While the current situation is not better performance wise it
prevents panics seen more often these days.
After more inet6 and ipsec cleanup we should be able to improve
the situation again passing the rt to ip6_forward directly.

Leave the ip6_forward_rt entry in struct vinet6 but mark it
for removal.

PR:		kern/128247, kern/131038
MFC after:	25 days
Committed from:	Bugathon #6
Tested by:	Denis Ahrens <denis@h3q.com> (different initial version)
2009-02-01 21:11:08 +00:00
bz
ec7e619d54 Remove 4 entirely unsued ip6 variables.
Leave then in struct vinet6 to not break the ABI with kernel modules
but mark them for removal so we can do it in one batch when the time
is right.

MFC after:	1 month
2009-01-30 23:40:24 +00:00
qingli
ec826ad5c7 This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
   possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,

The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.

Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:

- Kip Macy revised the locking code completely, thus completing
  the last piece of the puzzle, Kip has also been conducting
  active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
  provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
  me maintaining that branch before the svn conversion
2008-12-15 06:10:57 +00:00
bz
98e7fe0e6a Second round of putting global variables, which were virtualized
but formerly missed under VIMAGE_GLOBAL.

Put the extern declarations of the  virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.

Sponsored by:	The FreeBSD Foundation
2008-12-13 19:13:03 +00:00
bz
83a32f8750 Put a global variables, which were virtualized but formerly
missed under VIMAGE_GLOBAL.

Start putting the extern declarations of the  virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.

While there garbage collect a few dead externs from ip6_var.h.

Sponsored by:	The FreeBSD Foundation
2008-12-11 16:26:38 +00:00
rwatson
0db6d4519c Add a MAC label, MAC Framework, and MAC policy entry points for IPv6
fragment reassembly queues.

This allows policies to label reassembly queues, perform access
control checks when matching fragments to a queue, update a queue
label when fragments are matched, and label the resulting
reassembled datagram.

Obtained from:	TrustedBSD Project
2008-10-26 22:45:18 +00:00
bz
eafee510a9 Fix a regression introduced in r179289 splitting up ip6_savecontrol()
into v4-only vs. v6-only inp_flags processing.
When ip6_savecontrol_v4() is called from ip6_savecontrol() we
were not passing back the **mp thus the information will be missing
in userland.
Istead of going with a *** as suggested in the PR we are returning
**mp now and passing in the v4only flag as a pointer argument.

PR:		kern/126349
Reviewed by:	rwatson, dwmalone
2008-08-16 06:39:18 +00:00
bz
13896f2e51 Change the parameters to in6_selectsrc():
- pass in the inp instead of both in6p_moptions and laddr.
 - pass in cred for upcoming prison checks.

Reviewed by:	rwatson
2008-07-08 18:41:36 +00:00
bz
f3ab94f7b4 Factor out the v4-only vs. the v6-only inp_flags processing in
ip6_savecontrol in preparation for udp_append() to no longer
need an WLOCK as we will no longer be modifying socket options.

Requested by:		rwatson
Reviewed by:		gnn
MFC after:		10 days
2008-05-24 15:20:48 +00:00
bz
1c376286e0 Replace the last susers calls in netinet6/ with privilege checks.
Introduce a new privilege allowing to set certain IP header options
(hop-by-hop, routing headers).

Leave a few comments to be addressed later.

Reviewed by:	rwatson (older version, before addressing his comments)
2008-01-24 08:25:59 +00:00
obrien
0d684d927b Clean up VCS Ids. 2007-12-10 16:03:40 +00:00
delphij
42fe5e7f83 Space cleanup
Approved by:	re (rwatson)
2007-07-05 16:29:40 +00:00
jinmei
eff13571bc cleanup about the reassembly structures and routine:
- removed unused structure members
  - fixed a minor bug that the ECN code point may not be restored correctly

Approved by:	ume (mentor)
MFC after:	1 week
2007-06-04 06:06:35 +00:00
bms
3e83ac6653 Make IPv6 multicast forwarding dynamically loadable from a GENERIC kernel.
It is built in the same module as IPv4 multicast forwarding, i.e. ip_mroute.ko,
if and only if IPv6 support is enabled for loadable modules.
Export IPv6 forwarding structs to userland netstat(1) via sysctl(9).
2007-02-24 11:38:47 +00:00
suz
f77be35fb4 sync with KAME (renamed a macro IPV6_DADOUTPUT to IPV6_UNSPECSRC)
Obtained from: KAME
2005-10-21 15:45:13 +00:00
ume
08a5c349d1 added a knob to enable path MTU discovery for multicast packets.
(by default, it is disabled)

Submitted by:	suz
Obtained from:	KAME
2005-08-13 19:55:06 +00:00
suz
62a02b864f supports stealth forwarding in IPv6, as well as in IPv4
PR: kern/54625
MFC after: 1 week
2005-08-10 09:13:35 +00:00
ume
da2cf62b28 scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from:	KAME
2005-07-25 12:31:43 +00:00
ume
5960d35ade always copy ip6_pktopt. remove needcopy and needfree
argument/structure member accordingly.

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-21 16:39:23 +00:00
ume
166d271598 be consistent on naming advanced API functions; use ip6_XXXpktopt(s).
Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-21 15:06:32 +00:00
thompsa
d7e928629d Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).

This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't need so the cost is avoided.

 IP_HDR_ALIGNED_P()
 IP6_HDR_ALIGNED_P()

Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.

PR:		ia64/81284
Obtained from:	NetBSD
Approved by:	re (dwhite), mlaier (mentor)
2005-07-02 23:13:31 +00:00
imp
2b54eeafae /* -> /*- for license, minor formatting changes, separate for KAME 2005-01-07 02:30:35 +00:00
andre
d243747d92 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
dwmalone
5df13d37b2 Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD
have already done this, so I have styled the patch on their work:

        1) introduce a ip_newid() static inline function that checks
        the sysctl and then decides if it should return a sequential
        or random IP ID.

        2) named the sysctl net.inet.ip.random_id

        3) IPv6 flow IDs and fragment IDs are now always random.
        Flow IDs and frag IDs are significantly less common in the
        IPv6 world (ie. rarely generated per-packet), so there should
        be smaller performance concerns.

The sysctl defaults to 0 (sequential IP IDs).

Reviewed by:	andre, silby, mlaier, ume
Based on:	NetBSD
MFC after:	2 months
2004-08-14 15:32:40 +00:00
imp
b49b7fe799 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
ume
41930d5085 source address selection part of RFC3484.
TODO: since there is scope issue to be solved, multicast and
link-local address are treated as special for workaround for
now.

Obtained from:	KAME
2003-11-04 20:22:33 +00:00
ume
36edae8e0d ip6_savecontrol() argument is redundant 2003-10-29 12:52:28 +00:00
ume
bde54b9152 hide m_tag, again.
Requested by:	sam
2003-10-29 12:49:12 +00:00
ume
67fa4b4d82 cleanup use of m_tag.
Obtained from:	KAME
2003-10-28 16:29:26 +00:00
ume
ecb479d77a re-add wrongly disappered IPV6_CHECKSUM stuff by introducing
ip6_raw_ctloutput().

Obtained from:	KAME
2003-10-26 18:17:01 +00:00
ume
881c4fa391 Switch Advanced Sockets API for IPv6 from RFC2292 to RFC3542
(aka RFC2292bis).  Though I believe this commit doesn't break
backward compatibility againt existing binaries, it breaks
backward compatibility of API.
Now, the applications which use Advanced Sockets API such as
telnet, ping6, mld6query and traceroute6 use RFC3542 API.

Obtained from:	KAME
2003-10-24 18:26:30 +00:00
ume
55ad7e749c - implement lock around IPv6 reassembly, to avoid panic due to
frag6_drain (mutex version will come later).
- limit number of fragments (not fragment queues) in kernel.

Obtained from:	KAME
2003-10-22 15:29:42 +00:00
ume
cb2c1545ab - fix typo in comments.
- style.
- NULL is not 0.
- some variables were renamed.
- nuke unused logic.
(there is no functional change.)

Obtained from:	KAME
2003-10-08 18:26:08 +00:00
ume
dff1ad31d5 randomize IPv6 flowlabel when RANDOM_IP_ID is defined.
Obtained from:	KAME
2003-10-01 21:24:28 +00:00
ume
f0203cacdc we don't need ip6_id when RANDOM_IP_ID is defined. 2003-10-01 18:23:27 +00:00
ume
4d9c2d35dd Don't compiled ip6_randomid() in if RANDOM_IP_ID is not defined. 2003-10-01 16:22:58 +00:00
ume
7a9738e262 randomize IPv6 fragment ID.
Obtained from:	KAME
2003-10-01 15:13:29 +00:00
sam
cd738e8574 o update PFIL_HOOKS support to current API used by netbsd
o revamp IPv4+IPv6+bridge usage to match API changes
o remove pfil_head instances from protosw entries (no longer used)
o add locking
o bump FreeBSD version for 3rd party modules

Heavy lifting by:	"Max Laier" <max@love2party.net>
Supported by:		FreeBSD Foundation
Obtained from:		NetBSD (bits of pfil.h and pfil.c)
2003-09-23 17:54:04 +00:00
jlemon
04e28d5a81 Update netisr handling; Each SWI now registers its queue, and all queue
drain routines are done by swi_net, which allows for better queue control
at some future point.  Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.

Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
2003-03-04 23:19:55 +00:00
schweikh
86f7487fb6 Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/
Add FreeBSD Id tag where missing.
2002-12-30 21:18:15 +00:00
sam
2a86be217a Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
ume
e96d7f2205 cleanup usage of ip6_mapped_addr_on and ip6_v6only. now,
ip6_mapped_addr_on is unified into ip6_v6only.

MFC after:	1 week
2002-07-25 17:40:45 +00:00