Commit Graph

895 Commits

Author SHA1 Message Date
Jamie Gritton
7c2f3cb964 Remove redundant calls of prison_local_ip4 in in_pcbbind_setup, and of
prison_local_ip6 in in6_pcbbind.

Approved by:	bz (mentor)
2009-02-05 14:25:53 +00:00
Jamie Gritton
b89e82dd87 Standardize the various prison_foo_ip[46] functions and prison_if to
return zero on success and an error code otherwise.  The possible errors
are EADDRNOTAVAIL if an address being checked for doesn't match the
prison, and EAFNOSUPPORT if the prison doesn't have any addresses in
that address family.  For most callers of these functions, use the
returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or
EINVAL.

Always include a jailed() check in these functions, where a non-jailed
cred always returns success (and makes no changes).  Remove the explicit
jailed() checks that preceded many of the function calls.

Approved by:	bz (mentor)
2009-02-05 14:06:09 +00:00
Bjoern A. Zeeb
5f16e341d4 When iterating through the list trying to find a router in
defrouter_select(), NULL the cached llentry after unlocking
as we are no longer interested in it and with the second
iteration would try to unlock it again resulting in
panic: Lock (rw) lle not locked @ ...

Reported by:	Mark Atkinson <m.atkinson@f5.com>
Tested by:	Mark Atkinson <m.atkinson@f5.com>
PR:		kern/128247 (in follow-up, unrelated to original report)
2009-02-04 10:35:27 +00:00
Randall Stewart
a99b67833a - Cleanup checksum code.
- Prepare for CRC offloading, add MIB counters (RS/MT).
- Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT).
- Bugfix: Handle close() with SO_LINGER correctly when notifications
          are generated during the close() call(MT).
- Bugfix: Generate DRY event when sender is dry during subscription.
          Only for 1-to-1 style sockets (RS/MT)
- Bugfix: Put vtags for the correct amount of time into time-wait (MT).
- Bugfix: Clear vtag entries correctly on expiration (MT).
- Bugfix: shutdown() indicates ENOTCONN when called for unconnected
          1-to-1 style sockets (MT).
- Bugfix: In sctp Auth code (PL).
- Add support for devices that support SCTP csum offload (igb).
- Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS)
Obtained from:	With help from Peter Lei and Michael Tuexen
2009-02-03 11:04:03 +00:00
Bjoern A. Zeeb
09f8c3ff36 Remove the single global unlocked route cache ip6_forward_rt
from the inet6 stack along with statistics and make sure we
properly free the rt in all cases.

While the current situation is not better performance wise it
prevents panics seen more often these days.
After more inet6 and ipsec cleanup we should be able to improve
the situation again passing the rt to ip6_forward directly.

Leave the ip6_forward_rt entry in struct vinet6 but mark it
for removal.

PR:		kern/128247, kern/131038
MFC after:	25 days
Committed from:	Bugathon #6
Tested by:	Denis Ahrens <denis@h3q.com> (different initial version)
2009-02-01 21:11:08 +00:00
Bjoern A. Zeeb
959e14c15e Remove unused local MACROs.
Submitted by:	Christoph Mallon christoph.mallon@gmx.de
MFC after:	2 weeks
2009-01-31 17:35:44 +00:00
Bjoern A. Zeeb
39f046dac2 Coalesce two consecutive #ifdef IPSEC blocks.
Move the skip_ipsec: label below the goto as we can never have
ipsecrt set if we get to that label so there is no need to check.

MFC after:	2 weeks
2009-01-31 12:24:53 +00:00
Bjoern A. Zeeb
e173d3df0c Remove dead code from #if 0:
we do not have an ipsrcchk_rt anywhere else.

MFC after:	2 weeks
2009-01-31 11:19:20 +00:00
Bjoern A. Zeeb
2e730bea0a Like with r185713 make sure to not leak a lock as rtalloc1(9) returns
a locked route. Thus we have to use RTFREE_LOCKED(9) to get it unlocked
and rtfree(9)d rather than just rtfree(9)d.

Since the PR was filed, new places with the same problem were added
with new code.  Also check that the rt is valid before freeing it
either way there.

PR:		kern/129793
Submitted by:	Dheeraj Reddy <dheeraj@ece.gatech.edu>
MFC after:	2 weeks
Committed from:	Bugathon #6
2009-01-31 10:48:02 +00:00
Bjoern A. Zeeb
351c4745f1 Remove 4 entirely unsued ip6 variables.
Leave then in struct vinet6 to not break the ABI with kernel modules
but mark them for removal so we can do it in one batch when the time
is right.

MFC after:	1 month
2009-01-30 23:40:24 +00:00
Bjoern A. Zeeb
1cecba0fcd For consistency with prison_{local,remote,check}_ipN rename
prison_getipN to prison_get_ipN.

Submitted by:	jamie (as part of a larger patch)
MFC after:	1 week
2009-01-25 10:11:58 +00:00
Sam Leffler
cbd1844537 remove too noisy DIAGNOSTIC code
Reviewed by:	qingli
2009-01-18 07:20:02 +00:00
Qing Li
14981d8057 Revive the RTF_LLINFO flag in route.h. The kernel code is guarded
by the new kernel option COMPAT_ROUTE_FLAGS for binary backward
compatibility. The RTF_LLDATA flag maps to the same value as RTF_LLINFO.
RTF_LLDATA is used by the arp and ndp utilities. The RTF_LLDATA flag is
always returned to the userland regardless whether the COMPAT_ROUTE_FLAGS
is defined.
2009-01-12 11:24:32 +00:00
Bjoern A. Zeeb
813dd6ae5e Restrict arp, ndp and theoretically the FIB listing (if not
read with libkvm) to the addresses of a prison, when inside a
jail. [1]
As the patch from the PR was pre-'new-arp', add checks to the
llt_dump handlers as well.

While touching RTM_GET in route_output(), consistently use
curthread credentials rather than the creds from the socket
there. [2]

PR:		kern/68189
Submitted by:	Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1]
Discussed with:	rwatson [2]
Reviewed by:	rwatson
MFC after:	4 weeks
2009-01-09 21:57:49 +00:00
Bjoern A. Zeeb
5ce0eb7f08 Make SIOCGIFADDR and related, as well as SIOCGIFADDR_IN6 and related
jail-aware. Up to now we returned the first address of the interface
for SIOCGIFADDR w/o an ifr_addr in the query. This caused problems for
programs querying for an address but running inside a jail, as the
address returned usually did not belong to the jail.
Like for v6, if there was an ifr_addr given on v4, you could probe
for more addresses on the interfaces that you were not allowed to see
from inside a jail. Return an error (EADDRNOTAVAIL) in that case
now unless the address is on the given interface and valid for the
jail.

PR:		kern/114325
Reviewed by:	rwatson
MFC after:	4 weeks
2009-01-09 13:06:56 +00:00
Randall Stewart
bbb0e3d9d5 Addresses Roberts comments on comments. Also adds
the KASSERT and checks suggested.

Reviewed by:	The udp tunneling was discussed on net@ under the
                thread entitled "Heads up -- Thinking about UDP and tunneling"
2009-01-06 13:27:56 +00:00
Randall Stewart
c7c7ea4b5a Add the ability of an alternate transport protocol
to easily tunnel over udp by providing a hook
function that will be called instead of appending
to the socket buffer.
2009-01-06 12:13:40 +00:00
Bjoern A. Zeeb
4b5c098fdf Switch the last protosw* structs to C99 initializers.
Reviewed by:	ed, julian, Christoph Mallon <christoph.mallon@gmx.de>
MFC after:	2 weeks
2009-01-05 20:29:01 +00:00
Robert Watson
5e48a30d2e Unlike with struct protosw, several instances of struct ip6protosw
did not use C99-style sparse structure initialization, so remove
NULL assignments for now-removed pr_usrreq function pointers.

Reported by:	Chris Ruiz <yr.retarded at gmail.com>
2009-01-04 21:53:42 +00:00
Robert Watson
cba318dc12 struct ip6protosw is a copy of struct protosw, so remove pr_usrreq there
to reflect removal from struct protosw.

Spotted by:	ed
2009-01-04 21:13:51 +00:00
Qing Li
dc49549713 Some modules such as SCTP supplies a valid route entry as an input argument
to ip_output(). The destionation is represented in a sockaddr{} object
that may contain other pieces of information, e.g., port number. This
same destination sockaddr{} object may be passed into L2 code, which
could be used to create a L2 entry. Since there exists a L2 table per
address family, the L2 lookup function can make address family specific
comparison instead of the generic bcmp() operation over the entire
sockaddr{} structure.

Note in the IPv6 case the sin6_scope_id is not compared because the
address is currently stored in the embedded form inside the kernel.
The in6_lltable_lookup() has to account for the scope-id if this
storage format were to change in the future.
2009-01-03 00:27:28 +00:00
Qing Li
8eca593c5a This checkin addresses a couple of issues:
1. The "route" command allows route insertion through the interface-direct
   option "-iface". During if_attach(), an sockaddr_dl{} entry is created
   for the interface and is part of the interface address list. This
   sockaddr_dl{} entry describes the interface in detail. The "route"
   command selects this entry as the "gateway" object when the "-iface"
   option is present. The "arp" and "ndp" commands also interact with the
   kernel through the routing socket when adding and removing static L2
   entries. The static L2 information is also provided through the
   "gateway" object with an AF_LINK family type, similar to what is
   provided by the "route" command. In order to differentiate between
   these two types of operations, a RTF_LLDATA flag is introduced. This
   flag is set by the "arp" and "ndp" commands when issuing the add and
   delete commands. This flag is also set in each L2 entry returned by the
   kernel. The "arp" and "ndp" command follows a convention where a RTM_GET
   is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills
   in the fields for a "rtm" object, which is reinjected into the kernel by
   a subsequent RTM_ADD/DELETE command. The entry returend from RTM_GET
   is a prefix route, so the RTF_LLDATA flag must be specified when issuing
   the RTM_ADD/DELETE messages.

2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the
   specification for retrieving L2 information. Also optimized the
   code logic.

Reviewed by:   julian
2008-12-26 19:45:24 +00:00
Kip Macy
ee6326a30b avoid lock recursion by deferring the link check until after LLE lock is dropped 2008-12-24 01:08:18 +00:00
Bjoern A. Zeeb
f5d35259fe Correct variable name in comment.
MFC after:	4 weeks
2008-12-22 12:54:52 +00:00
Qing Li
ebf1c74403 Similar to the INET case, do not destroy the nd6 entries for
interface addresses until those addresses are removed. I already
made the patch in INET but forgot to bring the code over for
INET6.
2008-12-22 07:11:15 +00:00
Bjoern A. Zeeb
099d0bd34b Only unlock the llentry if it is actually valid.
Reported by:	ed
2008-12-18 19:09:14 +00:00
Bjoern A. Zeeb
97590249ad Another step assimilating IPv[46] PCB code:
normalize IN6P_* compat flags usage to their equialent
INP_* counterpart.

Discussed with:	rwatson
Reviewed by:	rwatson
MFC after:	4 weeks
2008-12-17 13:00:18 +00:00
Bjoern A. Zeeb
dcdb4371ca Use inc_flags instead of the inc_isipv6 alias which so far
had been the only flag with random usage patterns.
Switch inc_flags to be used as a real bit field by using
INC_ISIPV6 with bitops to check for the 'isipv6' condition.

While here fix a place or two where in case of v4 inc_flags
were not properly initialized before.[1]

Found by:	rwatson during review [1]
Discussed with:	rwatson
Reviewed by:	rwatson
MFC after:	4 weeks
2008-12-17 12:52:34 +00:00
Qing Li
9928dafbb8 Remove the rt argument from nd6_storelladdr() because
rt is no longer accessed.
2008-12-17 10:27:34 +00:00
Qing Li
f16e1269b4 A couple of files were not meant to be committed. 2008-12-17 10:19:53 +00:00
Qing Li
bbd8aebaba in6_clsroute() was applied to prefix routes causing some
of them to expire. in6_clsroute() was only applied to
cloned routes that are no longer applicable after the
arp-v2 commit.
2008-12-17 10:03:49 +00:00
Kip Macy
a614678035 * Compare pointer with NULL
* Remove trailing whitespace (added in r186162)
* Reduce indentation by rephrasing test

Submitted by:	Christopher Mallon (christoph dot mallon at gmx dot de)
2008-12-16 23:56:24 +00:00
Kip Macy
fd14c50bbb - Simplify handling of the deferring of mbuf transmit until after lle lock drop
- add a couple of comments to clarify intent
2008-12-16 23:06:36 +00:00
Kip Macy
75bab8b81d check pointers against NULL 2008-12-16 06:01:08 +00:00
Kip Macy
aba53ef0a6 convert more pointer validation checks to checking against NULL 2008-12-16 03:12:44 +00:00
Kip Macy
d78be3a909 simplify locking in find_pfxlist_reachable_router 2008-12-16 03:05:18 +00:00
Kip Macy
23ee1bfa82 explicitly check return of lla_lookup against NULL 2008-12-16 02:47:22 +00:00
Kip Macy
15209fb6e8 advance tail pointer in nd6_output_lle and check lla_output return against NULL 2008-12-16 02:33:53 +00:00
Kip Macy
688d079b2d check return from lla_lookup against NULL not zero 2008-12-16 02:30:42 +00:00
Kip Macy
56c423b065 make sure redirect doesn't return without dropping the lock 2008-12-16 02:06:26 +00:00
Kip Macy
83904f7116 need to check that lle is not null before unlock if the break condition is not met
also fix the break condition to explicitly check against NULL
2008-12-16 02:05:11 +00:00
Kip Macy
6289115121 unlock the llentry after use in find_pfxlist_reachable_router 2008-12-16 01:58:30 +00:00
Qing Li
3d3728e9f8 Initialize the variable "router", and apply "static_route" flag
across the entire nd6_cache_lladdr() function.
2008-12-16 01:21:19 +00:00
Kip Macy
fbc2ca1bef unlock and destroy an llentry's lock before freeing
Found by: sam
2008-12-16 00:20:49 +00:00
Kip Macy
c0641cc03b unlock looked up llentrys in defrouter_select 2008-12-16 00:18:04 +00:00
Kip Macy
c2b3a02b38 fix two use after frees in nd6_cache_lladdr caused by last minute unlock shuffling 2008-12-16 00:16:51 +00:00
Bjoern A. Zeeb
fc384fa5d6 Another step assimilating IPv[46] PCB code - directly use
the inpcb names rather than the following IPv6 compat macros:
in6pcb,in6p_sp, in6p_ip6_nxt,in6p_flowinfo,in6p_vflag,
in6p_flags,in6p_socket,in6p_lport,in6p_fport,in6p_ppcb and
sotoin6pcb().

Apart from removing duplicate code in netipsec, this is a pure
whitespace, not a functional change.

Discussed with:	rwatson
Reviewed by:	rwatson (version before review requested changes)
MFC after:	4 weeks (set the timer and see then)
2008-12-15 21:50:54 +00:00
Qing Li
6e6b3f7cbc This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
   possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,

The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.

Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:

- Kip Macy revised the locking code completely, thus completing
  the last piece of the puzzle, Kip has also been conducting
  active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
  provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
  me maintaining that branch before the svn conversion
2008-12-15 06:10:57 +00:00
Kip Macy
41c6def2d1 in6_addroute is called through rnh_addadr which is always called with the radix node head lock held
exclusively. Pass RTF_RNH_LOCKED to rtalloc so that rtalloc1_fib will not try to re-acquire the lock.
2008-12-13 20:15:42 +00:00
Bjoern A. Zeeb
1b193af610 Second round of putting global variables, which were virtualized
but formerly missed under VIMAGE_GLOBAL.

Put the extern declarations of the  virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.

Sponsored by:	The FreeBSD Foundation
2008-12-13 19:13:03 +00:00
Kip Macy
7b5ba4e7f0 RTF_RNH_LOCKED needs to be passed in the flags arg not report,
apologies to thompsa
2008-12-12 02:07:45 +00:00
Andrew Thompson
84b5117529 Pass RTF_RNH_LOCKED to rtalloc1 sunce the node head is locked, this avoids a
recursive lock panic on inet6 detach.

Reviewed by:	kmacy
2008-12-12 01:46:59 +00:00
Bjoern A. Zeeb
86413abf5f Put a global variables, which were virtualized but formerly
missed under VIMAGE_GLOBAL.

Start putting the extern declarations of the  virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.

While there garbage collect a few dead externs from ip6_var.h.

Sponsored by:	The FreeBSD Foundation
2008-12-11 16:26:38 +00:00
Marko Zec
385195c062 Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out.  Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs.  This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps.  PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw.  Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at:	devsummit Strassburg
Reviewed by:	bz, julian
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-12-10 23:12:39 +00:00
Warner Losh
609ff41f16 Add missing include to sys/lock.h before sys/rwlock.h 2008-12-08 00:28:21 +00:00
Kip Macy
3120b9d428 - convert radix node head lock from mutex to rwlock
- make radix node head lock not recursive
 - fix LOR in rtexpunge
 - fix LOR in rtredirect

Reviewed by:	sam
2008-12-07 21:15:43 +00:00
Randall Stewart
830d754d52 Code from the hack-session known as the IETF (and a
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
   o don't pre-hash them when you issue one in a cookie.
   o Allow duplicates and use addresses and ports to
     discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
  is still experimental and needs more extensive testing with the
  Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
  with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
  if the adv-peer-ack point progressed. However we still make sure
  a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
  unreliable but not abandoned yet.

With the help of:	Michael Teuxen and Peter Lei :-)
MFC after:	 4 weeks
2008-12-06 13:19:54 +00:00
Bjoern A. Zeeb
4b79449e2f Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by:	brooks, gnn, des, zec, imp
Sponsored by:	The FreeBSD Foundation
2008-12-02 21:37:28 +00:00
Bjoern A. Zeeb
413628a7e3 MFp4:
Bring in updated jail support from bz_jail branch.

This enhances the current jail implementation to permit multiple
addresses per jail. In addtion to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..

SCTP support was updated and supports IPv6 in jails as well.

Cpuset support permits jails to be bound to specific processor
sets after creation.

Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.

DDB 'show jails' command was added to aid debugging.

Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.

Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.

Bump __FreeBSD_version for the afore mentioned and in kernel changes.

Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
  and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
  help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
  suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
  on cluster machines as well as all the testers and people
  who provided feedback the last months on freebsd-jail and
  other channels.
- My employer, CK Software GmbH, for the support so I could work on this.

Reviewed by:	(see above)
MFC after:	3 months (this is just so that I get the mail)
X-MFC Before:   7.2-RELEASE if possible
2008-11-29 14:32:14 +00:00
Marko Zec
f02493cbbd Unhide declarations of network stack virtualization structs from
underneath #ifdef VIMAGE blocks.

This change introduces some churn in #include ordering and nesting
throughout the network stack and drivers but is not expected to cause
any additional issues.

In the next step this will allow us to instantiate the virtualization
container structures and switch from using global variables to their
"containerized" counterparts.

Reviewed by:	bz, julian
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-11-28 23:30:51 +00:00
Bjoern A. Zeeb
6aee2fc550 Merge in6_pcbfree() into in_pcbfree() which after the previous
IPsec change in r185366 only differed in two additonal IPv6 lines.
Rather than splattering conditional code everywhere add the v6
check centrally at this single place.

Reviewed by:	rwatson (as part of a larger changset)
MFC after:	6 weeks (*)
(*) possibly need to leave a stub wrapper in 7 to keep the symbol.
2008-11-27 12:04:35 +00:00
Bjoern A. Zeeb
6974bd9e75 Unify ipsec[46]_delete_pcbpolicy in ipsec_delete_pcbpolicy.
Ignoring different names because of macros (in6pcb, in6p_sp) and
inp vs. in6p variable name both functions were entirely identical.

Reviewed by:	rwatson (as part of a larger changeset)
MFC after:	6 weeks (*)
(*) possibly need to leave a stub wrappers in 7 to keep the symbols.
2008-11-27 10:43:08 +00:00
Marko Zec
97021c2464 Merge more of currently non-functional (i.e. resolving to
whitespace) macros from p4/vimage branch.

Do a better job at enclosing all instantiations of globals
scheduled for virtualization in #ifdef VIMAGE_GLOBALS blocks.

De-virtualize and mark as const saorder_state_alive and
saorder_state_any arrays from ipsec code, given that they are never
updated at runtime, so virtualizing them would be pointless.

Reviewed by:  bz, julian
Approved by:  julian (mentor)
Obtained from:        //depot/projects/vimage-commit2/...
X-MFC after:  never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
2008-11-26 22:32:07 +00:00
Bjoern A. Zeeb
0206cdb846 Remove in6_pcbdetach() as it is exactly the same function
as in_pcbdetach() and we don't need the code twice.

Reviewed by:	rwatson
MFC after:	6 weeks (*)
(*) possibly need to leave a stub wrapper in 7 to keep the symbol.
2008-11-26 20:52:26 +00:00
Bjoern A. Zeeb
a7df09e8c9 Unify the v4 and v6 versions of pcbdetach and pcbfree as good
as possible so that they are easily diffable.

No functional changes.

Reviewed by:	rwatson
MFC after:	6 weeks
2008-11-26 12:54:31 +00:00
Bjoern A. Zeeb
b0fab0344f Plug a credential leak in case the inpcb is freed by
in6_pcbfree() instead of in_pcbfree(); missed in r183606.

Reviewed by:	rwatson
MFC after:	3 days (instantly for 7.1-RC?)
2008-11-26 12:24:18 +00:00
Marko Zec
44e33a0758 Change the initialization methodology for global variables scheduled
for virtualization.

Instead of initializing the affected global variables at instatiation,
assign initial values to them in initializer functions.  As a rule,
initialization at instatiation for such variables should never be
introduced again from now on.  Furthermore, enclose all instantiations
of such global variables in #ifdef VIMAGE_GLOBALS blocks.

Essentialy, this change should have zero functional impact.  In the next
phase of merging network stack virtualization infrastructure from
p4/vimage branch, the new initialization methology will allow us to
switch between using global variables and their counterparts residing in
virtualization containers with minimum code churn, and in the long run
allow us to intialize multiple instances of such container structures.

Discussed at:	devsummit Strassburg
Reviewed by:	bz, julian
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-11-19 09:39:34 +00:00
Robert Watson
4b908c8bb4 Add a MAC label, MAC Framework, and MAC policy entry points for IPv6
fragment reassembly queues.

This allows policies to label reassembly queues, perform access
control checks when matching fragments to a queue, update a queue
label when fragments are matched, and label the resulting
reassembled datagram.

Obtained from:	TrustedBSD Project
2008-10-26 22:45:18 +00:00
Dag-Erling Smørgrav
e11e3f187d Fix a number of style issues in the MALLOC / FREE commit. I've tried to
be careful not to fix anything that was already broken; the NFSv4 code is
particularly bad in this respect.
2008-10-23 20:26:15 +00:00
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
Bjoern A. Zeeb
dc3c09c89f Bring over the change switching from using sequential to random
ephemeral port allocation as implemented in netinet/in_pcb.c rev. 1.143
(initially from OpenBSD) and follow-up commits during the last four and
a half years including rev. 1.157, 1.162 and 1.199.
This now is relying on the same infrastructure as has been implemented
in in_pcb.c since rev. 1.199.

Reviewed by:	silby, rpaulo, mlaier
MFC after:	2 months
2008-10-20 18:43:59 +00:00
Bjoern A. Zeeb
6f4da20196 Check that the mbuf len is positive (like we do in the v4 case).
Read the other way round this means that even with the checks
the m_len turned negative in some cases which led to panics.
The reason to my understanding seems to be that the checks are wrong
(also for v4) ignoring possible padding when checking cmsg_len or
padding after data when adjusting the mbuf.
Doing proper cheks seems to break applications like named so
further investigation and regression tests are needed.

PR:		kern/119123
Tested by:	Ashish Shukla  wahjava gmail.com
MFC after:	3 days
2008-10-15 19:24:18 +00:00
Robert Watson
a6a6167314 When disconnecting a UDPv6 socket, acquire the socket lock around the
changing of the so_state field, as is done in UDPv4.  Remove XXX
locking comment.

MFC after:	3 days
2008-10-12 20:01:32 +00:00
Bjoern A. Zeeb
55fd3bafdb Style changes: compare pointer to NULL and move a }.
MFC after:	6 weeks
2008-10-04 17:07:58 +00:00
Bjoern A. Zeeb
86d02c5c63 Cache so_cred as inp_cred in the inpcb.
This means that inp_cred is always there, even after the socket
has gone away. It also means that it is constant for the lifetime
of the inp.
Both facts lead to simpler code and possibly less locking.

Suggested by:	rwatson
Reviewed by:	rwatson
MFC after:	6 weeks
X-MFC Note:	use a inp_pspare for inp_cred
2008-10-04 15:06:34 +00:00
Marko Zec
8b615593fc Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by:	julian, bz, brooks, zec
Reviewed by:	julian, bz, brooks, kris, rwatson, ...
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-10-02 15:37:58 +00:00
Colin Percival
29a6d781af Default to ignoring potentially evil IPv6 Neighbor Solicitation
messages.

Approved by:    so (cperciva)
Approved by:	re (kensmith)
Security:       FreeBSD-SA-08:10.nd6
Thanks to:      jinmei, bz
2008-10-02 00:32:59 +00:00
Robert Watson
4a0a13971e When invoking the udp_send() from udp6_send() due to use of a v6-mapped
IPv4 address, first drop the udbinfo and inpcb locks, which will otherwise
be recursed.  This leads to a potential minor race, but is preferable to a
deadlock when acquiring a read lock after a write lock on the inpcb.

MFC after:	3 days
Reported by:	Norbert Papke <fbsd-ml@scrapper.ca>, lioux
2008-09-22 06:44:03 +00:00
Bjoern A. Zeeb
9de45b2780 mld_timerresid() returns ms so instead of doing the maths in usec
and then dividing down to ms, do the maths in ms.

Obtained from:	NetBSD mld6.c rev. 1.47
MFC after:	2 months
2008-09-10 19:42:13 +00:00
Simon L. B. Nielsen
59ca51adba - Fix amd64 local privilege escalation. [08:07]
- Fix nmount(2) local privilege escalation. [08:08]
- Fix IPv6 remote kernel panics. [08:09]

Fix for [08:07] is merge of r181823.

Submitted by:	kib [08:07], csjp [08:08], bz [08:09]
Reviewed by:	peter [08:07], jhb [08:07]
Reviewed by:	jinmei [08:09], rwatson [08:09]
Approved by:	re (SA blanket)
Approved by:	so (simon)
Security:	FreeBSD-SA-08:07.amd64
Security:	FreeBSD-SA-08:08.nmount
Security:	FreeBSD-SA-08:09.icmp6
2008-09-03 19:09:47 +00:00
Bjoern A. Zeeb
bf0d5f8e16 Fix a bug, when a specially crafted ICMPV6 MLD packet could lead
to an integer divide by zero panic in the kernel, if the kernel was
run with hz<1000.
Neither i386, pc98, amd64 or sparc64 are affected in the currently
supported branches and default configuration.

Submitted by:	Miikka Saukko, Ossi Herrala and Jukka Taimisto from
		the CROSS project at Codenomicon Ltd. via CERT-FI.
Reviewed by:	bz, rwatson
Security:	CVE-2008-2464
MFC after:	8 hours
2008-09-03 08:13:58 +00:00
Robert Watson
7a0a0eecf0 In UDPv6, reduce scope of global udbinfo lock during append to last
matching socket by dropping it before udp6_append(), and remove
duplicate unlocks of udbinfo and inpcb in sysctl return path.

MFC after:	3 days
2008-08-31 13:16:45 +00:00
Julian Elischer
5e5d5c6f17 another missed V_ 2008-08-25 06:09:32 +00:00
Julian Elischer
5ed3800e41 Fix some of the formatting fixes.. It's amazing how some thing stand out
in a commit message.
2008-08-20 01:24:55 +00:00
Julian Elischer
ac957cd271 A bunch of formatting fixes brough to light by, or created by the Vimage commit
a few days ago.
2008-08-20 01:05:56 +00:00
Bjoern A. Zeeb
f125044552 As part of step 1.5 of the vimage framework resolve conflicts with
file local static globals which would be folded onto the same name
with the V_ macros.

Reviewed by:	kris, brooks, simon
2008-08-18 13:16:19 +00:00
Bjoern A. Zeeb
603724d3ab Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
Bjoern A. Zeeb
48d48eb980 Fix a regression introduced in r179289 splitting up ip6_savecontrol()
into v4-only vs. v6-only inp_flags processing.
When ip6_savecontrol_v4() is called from ip6_savecontrol() we
were not passing back the **mp thus the information will be missing
in userland.
Istead of going with a *** as suggested in the PR we are returning
**mp now and passing in the v4only flag as a pointer argument.

PR:		kern/126349
Reviewed by:	rwatson, dwmalone
2008-08-16 06:39:18 +00:00
Robert Watson
2209e8f159 Adopt the slightly weaker consistency locking approach used in IPv4 raw
sockets for IPv6 raw sockets: separately lock the inpcb for determining
the destination address for a connect()'d raw socket at the rip6_send()
layer, and then re-acquire the inpcb lock in the rip6_output() layer to
query other options on the socket.  Previously, the global raw IP socket
lock was used, which while correct and marginally more consistent, could
add significantly to global raw IP socket lock contention.

MFC after:	1 week
2008-07-30 09:26:27 +00:00
Robert Watson
ae89d5a389 When copying in and out current ICMPv6 filters on a raw IPv6 socket,
lock the inpcb and use a local stack variable to copy to/from userspace
so that sooptcopyin()/sooptcopyout() aren't called while holding an
rwlock.

While here, fix a bug in which a failed sooptcopyin() might lead to
partially consistent ICMPv6 filters on the socket by not ignoring the
error returned by sooptcopyin().

MFC after:	2 weeks
2008-07-29 19:37:16 +00:00
Robert Watson
2f1ff0cd80 Since we fail IPv6 raw socket allocation if inp->in6p_icmp6filt can't
be allocated, there's no need to conditionize use and freeing of it
later.

MFC after:	1 week
2008-07-29 18:09:46 +00:00
Robert Watson
cc29ac7d22 Marginally decomplicate set/getsockopt code in ip6_output.c by simply
using the passed arguments explicitly and unconditionally rather than
testing them and calling panic().  The result is the same but easier
to read.

MFC after:	3 days
2008-07-29 09:31:03 +00:00
Alexander Motin
6c5bbf5ce1 Move inpcb lock higher to protect some nonbinding fields reading.
It fixes nothing at this time, but decided to be more correct.
2008-07-28 19:32:18 +00:00
Alexander Motin
b11e21ae80 According to in_pcb.h protocol binding information has double locking.
It allows access it while list travercing holding only global pcbinfo lock.
2008-07-27 20:30:34 +00:00
Bjoern A. Zeeb
078b704233 Pass the ucred along into in{,6}_pcblookup_local for upcoming
prison checks.

Reviewed by:	rwatson
2008-07-10 13:31:11 +00:00
Bjoern A. Zeeb
cdcb11b92c For consistency take lport as u_short in in{,6}_pcblookup_local.
All callers either pass in an u_short or u_int16_t.

Reviewed by:	rwatson
2008-07-10 13:23:22 +00:00
Randall Stewart
fc14de76f4 1) Adds the rest of the VIMAGE change macros
2) Adds some __UserSpace__ on some of the common defines that
   the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
   need to a) trim off the data chunk headers, if present, and
   b) make sure the frag bit is communicated properly for the
   msgs coming off the stream queues... i.e. we see if some
   of the msg has been taken.

Obtained from:	jeli contributed the VIMAGE changes on this pass Thanks Julain!
2008-07-09 16:45:30 +00:00
Bjoern A. Zeeb
a55b8b2068 Document required locking in in6_sleectsrc() in case an inp is
passed in by adding an assert.

Requested by:	rwatson
Reviewed by:	rwatson
2008-07-09 16:33:21 +00:00
Bjoern A. Zeeb
f2f877d38c Change the parameters to in6_selectsrc():
- pass in the inp instead of both in6p_moptions and laddr.
 - pass in cred for upcoming prison checks.

Reviewed by:	rwatson
2008-07-08 18:41:36 +00:00
Robert Watson
963e491243 Use soreceive_dgram() and sosend_dgram() with UDPv6, as we do with UDPv4.
Tested by:	ps
MFC after:	3 months
2008-07-08 10:15:23 +00:00