Commit Graph

770 Commits

Author SHA1 Message Date
Warner Losh
1c0ee39e74 Marked these as packed correctly 2007-01-12 07:20:25 +00:00
Randall Stewart
139bc87fda a) macro-ization of all mbuf and random number
access plus timers. This makes the code
   more portable and able to change out the
   mbuf or timer system used more easily ;-)
b) removal of all use of pkt-hdr's until only
   the places we need them (before ip_output routines).
c) remove a bunch of code not needed due to <b> aka
   worrying about pkthdr's :-)
d) There was one last reorder problem it looks where
   if a restart occur's and we release and relock (at
   the point where we setup our alias vtag) we would
   end up possibly getting the wrong TSN in place. The
   code that fixed the TSN's just needed to be shifted
   around BEFORE the release of the lock.. also code that
   set the state (since this also could contribute).
Approved by:	gnn
2006-12-29 20:21:42 +00:00
Bjoern A. Zeeb
e521ae0c64 In ip6_sprintf print the addresses in a more common/readable
format eliminating leading zeros like in :0001 -> :1.

Reviewed by:	mlaier
2006-12-16 14:15:31 +00:00
Randall Stewart
a5d547add3 1) Fixes on a number of different collision case LOR's.
2) Fix all "magic numbers" to be constants.
3) A collision case that would generate two associations to
   the same peer due to a missing lock is fixed.
4) Added tracking of where timers are stopped.
Approved by:	gnn
2006-12-14 17:02:55 +00:00
Bjoern A. Zeeb
1d54aa3ba9 MFp4: 92972, 98913 + one more change
In ip6_sprintf no longer use and return one of eight static buffers
for printing/logging ipv6 addresses.
The caller now has to hand in a sufficiently large buffer as first
argument.
2006-12-12 12:17:58 +00:00
Ruslan Ermilov
f9a047a1b7 - In nd6_rtrequest(), when caching an rtentry, don't forget
to add a reference to it; otherwise, we could later access
  a freed memory.  This is believed to fix panics some users
  were observing when running route6d(8), and is similar to
  the fix in sys/netinet/if_ether.c,v 1.139 by glebius@.

PR:		kern/93910, kern/105437
Testing by:	Wojciech Puchar (still ongoing)

- Add rtentry locking to nd6_output() similar to rt_check().

MFC after:	4 days
2006-11-25 20:38:56 +00:00
Randall Stewart
03b0b02163 -Fixes first of all the getcred on IPv6 and V4. The
copy's were incorrect and so was the locking.
-A bug was also found that would create a race and
 panic when an abort arrived on a socket being read
 from.
-Also fix the reader to get MSG_TRUNC when a partial
 delivery is aborted.
-Also addresses a couple of coverity caught error path
 memory leaks and a couple of other valid complaints
Approved by:	gnn
2006-11-08 00:21:13 +00:00
Robert Watson
b96fbb37da Convert three new suser(9) calls introduced between when the priv(9)
patch was prepared and committed to priv(9) calls.  Add XXX comments
as, in each case, the semantics appear to differ from the TCP/UDP
versions of the calls with respect to jail, and because cr_canseecred()
is not used to validate the query.

Obtained from:	TrustedBSD Project
2006-11-06 14:54:06 +00:00
Robert Watson
acd3428b7d Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
Randall Stewart
50cec91936 Tons of fixes to get all the 64bit issues removed.
This also moves two 16 bit int's to become 32 bit
values so we do not have to use atomic_add_16.
Most of the changes are %p, casts and other various
nasty's that were in the orignal code base. With this
commit my machine will now do a build universe.. however
I as yet have not tested on a 64bit machine .. it may not work :-(
2006-11-05 13:25:18 +00:00
Randall Stewart
73932c69b6 Opps... in my fix up of all the $FreeBSD:$-> $FreeBSD$ I
inserted a few to the new files.. but I falied to
add the #include <sys/cdef.h>

Which causes a compile error.. sorry about that... got it
now :-)

Approved by:gnn
2006-11-03 17:21:53 +00:00
Randall Stewart
f8829a4a40 Ok, here it is, we finally add SCTP to current. Note that this
work is not just mine, but it is also the works of Peter Lei
and Michael Tuexen. They both are my two key other developers
working on the project.. and they need ata-boy's too:
****
peterlei@cisco.com
tuexen@fh-muenster.de
****
I did do a make sysent which updated the
syscall's and sysproto.. I hope that is correct... without
it you don't build since we have new syscalls for SCTP :-0

So go out and look at the NOTES, add
option SCTP (make sure inet and inet6 are present too)
and play with SCTP.

I will see about comitting some test tools I have after I
figure out where I should place them. I also have a
lib (libsctp.a) that adds some of the missing socketapi
functions that I need to put into lib's.. I will talk
to George about this :-)

There may still be some 64 bit issues in here, none of
us have a 64 bit processor to test with yet.. Michael
may have a MAC but thats another beast too..

If you have a mac and want to use SCTP contact Michael
he maintains a web site with a loadable module with
this code :-)

Reviewed by:	gnn
Approved by:	gnn
2006-11-03 15:23:16 +00:00
Robert Watson
aed5570872 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
Hajimu UMEMOTO
e7e51bc3e1 Make net.inet6.ip6.auto_linklocal tunable. Someone may want to
enable/disable auto_linklocal even in single user mode.

Discussed with:	re@, gnn@
MFC after:	3 days
2006-10-13 12:45:53 +00:00
Hajimu UMEMOTO
f5c04409eb Revert the default value of net.inet6.ip6.auto_linklocal to 1.
If ipv6_enable is not set to "YES", net.inet6.ip6.auto_linklocal
is turned to 0 at boot.

Discussed with:	re@, gnn@
MFC after:	3 days
2006-10-13 12:41:36 +00:00
John Hay
ae0ddac700 Hopefully the last tweak in trying to make it possible to add ipv6 direct
host routes without side effects.

Submitted by:	JINMEI Tatuya
MFC after:	4 days
2006-10-02 19:15:10 +00:00
George V. Neville-Neil
90ce6fa1c8 Turn off automatic link local address if ipv6_enable is not set to YES
in rc.conf

Reviewed by:    KAME core team, cperciva
MFC after:      3 days
2006-10-02 10:13:30 +00:00
John Hay
584b68e792 A better fix is to check if it is a host route.
Submitted by:	ume
MFC after:	5 days
2006-09-30 20:25:33 +00:00
John Hay
c482f11edb My previous commit broke "route add -inet6 <network_addr> -interface gif0".
Fix that by excluding point-to-point interfaces.

MFC after:	5 days
2006-09-30 14:08:57 +00:00
Bruce M Simpson
910e1364b6 Nits.
Submitted by:	ru
2006-09-29 16:16:41 +00:00
Bruce M Simpson
2d20d32344 Push removal of mrouted down to the rest of the tree. 2006-09-29 15:45:11 +00:00
SUZUKI Shinsuke
831c32014e fixed a bug that IPv6 packets arriving to stf are not accepted.
(a degrade introduced in in6.c Rev 1.61)

PR: kern/103415
Submitted by: JINMEI Tatuya
MFC after: 1 week
2006-09-22 01:42:22 +00:00
John Hay
f129892448 Make it possible to add an IPv6 host route to a host directly connected.
Use something like this:
route add -inet6 <dest_addr> <my_addr_on_that_interface> -interface -llinfo

This is usefull for wireless adhoc mesh networks.

MFC after:	5 days
2006-09-16 06:24:28 +00:00
John Hay
1fcae350ae All multicast listeners on a port should get one copy of the packet. This
was broken during the locking changes.
2006-09-07 18:44:54 +00:00
Andre Oppermann
233dcce118 First step of TSO (TCP segmentation offload) support in our network stack.
o add IFCAP_TSO[46] for drivers to announce this capability for IPv4 and IPv6
 o add CSUM_TSO flag to mbuf pkthdr csum_flags field
 o add tso_segsz field to mbuf pkthdr
 o enhance ip_output() packet length check to allow for large TSO packets
 o extend tcp_maxmtu[46]() with a flag pointer to pass interface capabilities
 o adjust all callers of tcp_maxmtu[46]() accordingly

Discussed on:	-current, -net
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-09-06 21:51:59 +00:00
John Hay
80a684e083 Use net.inet6.ip6.redirect / ip6_sendredirects as part of the decision
to generate icmp6 redirects. Now it is possible to switch redirects off.

MFC after:	1 week
2006-09-05 19:20:42 +00:00
Brooks Davis
43bc7a9c62 With exception of the if_name() macro, all definitions in net_osdep.h
were unused or already in if_var.h so add if_name() to if_var.h and
remove net_osdep.h along with all references to it.

Longer term we may want to kill off if_name() entierly since all modern
BSDs have if_xname variables rendering it unnecessicary.
2006-08-04 21:27:40 +00:00
Robert Watson
c9db0fad09 Align IPv6 socket locking with IPv4 locking: lock socket buffer explicitly
and use _locked variants to avoid extra lock and unlock operations.

Reviewed by:	gnn
MFC after:	1 week
2006-07-23 12:24:22 +00:00
George V. Neville-Neil
c6af35ee0e The KAME project ceased work on IPv6 and IPSec in March of 2006.
Remove the README file which warns against cosmetic or local only
changes.  FreeBSD committers should now feel free to work on the
IPv6 and IPSec code without fetters.  The KAME mailing lists still
exist and it is always a good idea to ask questions about this code
on the snap-users@kame.net mailing list.

Reviewed by:	rwatson, brooks
2006-07-22 02:32:32 +00:00
Robert Watson
a152f8a361 Change semantics of socket close and detach. Add a new protocol switch
function, pru_close, to notify protocols that the file descriptor or
other consumer of a socket is closing the socket.  pru_abort is now a
notification of close also, and no longer detaches.  pru_detach is no
longer used to notify of close, and will be called during socket
tear-down by sofree() when all references to a socket evaporate after
an earlier call to abort or close the socket.  This means detach is now
an unconditional teardown of a socket, whereas previously sockets could
persist after detach of the protocol retained a reference.

This faciliates sharing mutexes between layers of the network stack as
the mutex is required during the checking and removal of references at
the head of sofree().  With this change, pru_detach can now assume that
the mutex will no longer be required by the socket layer after
completion, whereas before this was not necessarily true.

Reviewed by:	gnn
2006-07-21 17:11:15 +00:00
Stephan Uphoff
d915b28015 Fix race conditions on enumerating pcb lists by moving the initialization
( and where appropriate the destruction) of the pcb mutex to the init/finit
functions of the pcb zones.
This allows locking of the pcb entries and race condition free comparison
of the generation count.
Rearrange locking a bit to avoid extra locking operation to update the generation
count in in_pcballoc(). (in_pcballoc now returns the pcb locked)

I am planning to convert pcb list handling from a type safe to a reference count
model soon. ( As this allows really freeing the PCBs)

Reviewed by:	rwatson@, mohans@
MFC after:	1 week
2006-07-18 22:34:27 +00:00
Oleg Bulyzhin
6372145725 Complete timebase (time_second -> time_uptime) conversion.
PR:		kern/94249
Reviewed by:	andre (few months ago)
Approved by:	glebius (mentor)
2006-07-05 23:37:21 +00:00
Yaroslav Tykhiy
4e6098c6a4 We needn't check "m" for NULL here because "off" should be within
the mbuf chain.  If we ever get a buggy caller, a bogus "off" should
be caught by the sanity check at the function entry.  Null "m" here
means a very unusual condition of a totally broken mbuf chain (wrong
m_pkthdr.len or whatever), so we can just page fault later.

Found by:	Coverity Prevent(tm)
CID:		825
2006-06-30 18:25:07 +00:00
Yaroslav Tykhiy
4b97d7affd There is a consensus that ifaddr.ifa_addr should never be NULL,
except in places dealing with ifaddr creation or destruction; and
in such special places incomplete ifaddrs should never be linked
to system-wide data structures.  Therefore we can eliminate all the
superfluous checks for "ifa->ifa_addr != NULL" and get ready
to the system crashing honestly instead of masking possible bugs.

Suggested by:	glebius, jhb, ru
2006-06-29 19:22:05 +00:00
Yaroslav Tykhiy
40e4360c10 Use queue(3) macros instead of accessing list/queue internals directly. 2006-06-29 16:56:07 +00:00
Bjoern A. Zeeb
421d8aa603 Use INPLOOKUP_WILDCARD instead of just 1 more consistently.
OKed by: rwatson (some weeks ago)
2006-06-29 10:49:49 +00:00
Pawel Jakub Dawidek
5279398812 - Use suser_cred(9) instead of directly comparing cr_uid.
- Compare pointer with NULL, instead of 0.

Reviewed by:	rwatson
2006-06-27 11:40:05 +00:00
Pawel Jakub Dawidek
835d4b8924 - Use suser_cred(9) instead of directly checking cr_uid.
- Change the order of conditions to first verify that we actually need
  to check for privileges and then eventually check them.

Reviewed by:	rwatson
2006-06-27 11:35:53 +00:00
Robert Watson
1e0acb6801 Use suser_cred() instead of a direct comparison of cr_uid with 0 in
rip6_output().

MFC after:	1 week
2006-06-25 13:54:59 +00:00
George V. Neville-Neil
a59af512d4 Fix spurious warnings from neighbor discovery when working with IPv6 over
point to point tunnels (gif).

PR:		93220
Submitted by:	Jinmei Tatuya
MFC after:	1 week
2006-06-08 00:31:17 +00:00
Seigo Tanimura
f8366b0334 Avoid spurious release of an rtentry. 2006-05-23 00:32:22 +00:00
Bjoern A. Zeeb
93e4f81d9f In IN6_IS_ADDR_V4MAPPED case instead of returning directly set error and
goto out so that locks will be dropped.

Reviewed by: rwatson, gnn
2006-05-20 13:26:08 +00:00
Max Laier
656faadcb8 Remove ip6fw. Since ipfw has full functional IPv6 support now and - in
contrast to ip6fw - is properly lockes, it is time to retire ip6fw.
2006-05-12 20:39:23 +00:00
Bjoern A. Zeeb
1b34a059cb Assert ip6_forward_rt protected by Giant adding GIANT_REQUIRED to
functions not yet asserting it but working on global ip6_forward_rt
route cache which is not locked and perhaps should go away in the
future though cache hit/miss ration wasn't bad.

It's #if 0ed in frag6 because the code working on ip6_forward_rt is.
2006-05-04 18:41:08 +00:00
Robert Watson
20e3d71cdd Break out socket access control and delivery logic from udp6_input()
into its own function, udp6_append().  This mirrors a similar structure
in udp_input() and udp_append(), and makes the whole thing a lot more
readable.

While here, add missing inpcb locking in UDP6 input path.

Reviewed by:	bz
MFC after:	3 months
2006-05-01 21:39:48 +00:00
Robert Watson
8deea4a8f3 Move lock assertions to top of in6_pcbladdr(): we still want them to run
even if we're going to return an argument-based error.

Assert pcbinfo lock in in6_pcblookup_local(), in6_pcblookup_hash(), since
they walk pcbinfo inpcb lists.

Assert inpcb and pcbinfo locks in in6_pcbsetport(), since
port reservations are changing.

MFC after:	3 months
2006-04-25 12:09:58 +00:00
Robert Watson
04f2073775 Modify in6_pcbpurgeif0() to accept a pcbinfo structure rather than a pcb
list head structure; this improves congruence to IPv4, and also allows
in6_pcbpurgeif0() to lock the pcbinfo.  Modify in6_pcbpurgeif0() to lock
the pcbinfo before iterating the pcb list, use queue(9)'s LIST_FOREACH()
for the iteration, and to lock individual inpcb's while manipulating
them.

MFC after:	3 months
2006-04-23 15:06:16 +00:00
Paul Saab
4f590175b7 Allow for nmbclusters and maxsockets to be increased via sysctl.
An eventhandler is used to update all the various zones that depend
on these values.
2006-04-21 09:25:40 +00:00
Robert Watson
086dafc15b Mirror IPv4 pcb locking into in6_setsockaddr() and in6_setpeeraddr():
acquire inpcb lock when reading inpcb port+address in order to prevent
races with other threads that may be changing them.

MFC after:	3 months
2006-04-15 05:24:23 +00:00
Robert Watson
8511b981f6 Assert the inpcb lock in udp6_output(), as we dereference various
fields.

MFC after:	3 months
2006-04-12 03:34:22 +00:00
Robert Watson
dec8026073 Add comment to udp6_input() that locking is missing from multicast
UDPv6 delivery.

Lock the inpcb of the UDP connection being delivered to before
processing IPSEC policy and other delivery activities.

MFC after:	3 months
2006-04-12 03:32:54 +00:00
Robert Watson
5383103aa0 Add udbinfo locking in udp6_input() to protect lookups of the inpcb
lists during UDPv6 receipt.

MFC after:	3 months
2006-04-12 03:23:56 +00:00
Robert Watson
ff7425ced0 Don't use spl around call to in_pcballoc() in IPv6 raw socket support;
all necessary synchronization appears present.

MFC after:	3 months
2006-04-12 03:07:22 +00:00
Robert Watson
41ba156433 Remove one remaining use of spl in the IPv6 fragmentation code, as
this code appears properly locked.

MFC after:	3 months
2006-04-12 03:06:20 +00:00
Robert Watson
e3beea90c7 Add missing locking to udp6_getcred(), remove spl use.
MFC after:	3 months
2006-04-12 03:03:47 +00:00
Robert Watson
4847772314 Remove spl use from IPv6 inpcb code.
In various inpcb methods for IPv6 sockets, don't check of so_pcb is NULL,
assert it isn't.

MFC after:	3 months
2006-04-12 02:52:14 +00:00
SUZUKI Shinsuke
8447156ce0 ip6_mrouter_done(): use if_allmulti(0) for disabling the multicast promiscuous mode
Obtained from: KAME
MFC after: 2 days
2006-04-10 14:33:22 +00:00
Robert Watson
c60afb3f55 Fix assertion description: !=, not ==.
Submitted by:	pjd
MFC after:	3 months
2006-04-09 16:33:41 +00:00
Robert Watson
14ba8add01 Update in_pcb-derived basic socket types following changes to
pru_abort(), pru_detach(), and in_pcbdetach():

- Universally support and enforce the invariant that so_pcb is
  never NULL, converting dozens of unnecessary NULL checks into
  assertions, and eliminating dozens of unnecessary error handling
  cases in protocol code.

- In some cases, eliminate unnecessary pcbinfo locking, as it is no
  longer required to ensure so_pcb != NULL.  For example, in protocol
  shutdown methods, and in raw IP send.

- Abort and detach protocol switch methods no longer return failures,
  nor attempt to free sockets, as the socket layer does this.

- Invoke in_pcbfree() after in_pcbdetach() in order to free the
  detached in_pcb structure for a socket.

MFC after:	3 months
2006-04-01 16:20:54 +00:00
Robert Watson
4c7c478d0f Break out in_pcbdetach() into two functions:
- in_pcbdetach(), which removes the link between an inpcb and its
  socket.

- in_pcbfree(), which frees a detached pcb.

Unlike the previous in_pcbdetach(), neither of these functions will
attempt to conditionally free the socket, as they are responsible only
for managing in_pcb memory.  Mirror these changes into in6_pcbdetach()
by breaking it into in6_pcbdetach() and in6_pcbfree().

While here, eliminate undesired checks for NULL inpcb pointers in
sockets, as we will now have as an invariant that sockets will always
have valid so_pcb pointers.

MFC after:	3 months
2006-04-01 16:04:42 +00:00
Robert Watson
bc725eafc7 Chance protocol switch method pru_detach() so that it returns void
rather than an error.  Detaches do not "fail", they other occur or
the protocol flags SS_PROTOREF to take ownership of the socket.

soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF.  so_pcb is now entirely owned and
managed by the protocol code.  Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.

Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.

In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.

netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit.  In their current state they may leak
memory or panic.

MFC after:	3 months
2006-04-01 15:42:02 +00:00
Robert Watson
ac45e92ff2 Change protocol switch pru_abort() API so that it returns void rather
than an int, as an error here is not meaningful.  Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.

This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they have not are not fully
updated by this commit.  This will be corrected shortly in followup
commits to these components.

MFC after:      3 months
2006-04-01 15:15:05 +00:00
David Malone
fe12457335 This comment on various IPPORT_ defines was copied from in.h and
probably never fully applied to IPv6. Over time it has become more
stale, so replace it with something more up to date.

Reviewed by:	ume
MFC after:	1 month
2006-03-28 12:51:22 +00:00
Robert Watson
85f1f481ab Remove manual assignment of m_pkthdr from one mbuf to another in
ipsec_copypkt(), as this is already handled by the call to M_MOVE_PKTHDR(),
which also knows how to correctly handle MAC m_tags.  This corrects a panic
when running with MAC and KAME IPSEC.

PR:		kern/94599
Submitted by:	zhouyi zhou <zhouyi04 at ios dot cn>
Reviewed by:	bz
MFC after:	3 days
2006-03-28 10:16:38 +00:00
SUZUKI Shinsuke
31d4137bf3 fixed a memory leak when net.inet6.icmp6.nd6_maxqueuelen is greater than 1
Obtained from: KAME
MFC after: 3 days
2006-03-24 16:20:12 +00:00
David Malone
fcd1001c63 Make net.inet.ip.portrange.reservedhigh and
net.inet.ip.portrange.reservedlow apply to IPv6 aswell as IPv4.

We could have made new sysctls for IPv6, but that potentially makes
things complicated for mapped addresses. This seems like the least
confusing option and least likely to cause obscure problems in the
future.

This change makes the mac_portacl module useful with IPv6 apps.

Reviewed by:	ume
MFC after:	1 month
2006-03-19 11:48:48 +00:00
SUZUKI Shinsuke
d3693a631e implements section 2.2 of RFC4191, regarding the reserved preference value (10)
Obtained from: KAME
MFC after: 1 day
2006-03-19 06:38:39 +00:00
SUZUKI Shinsuke
e381ac4daa updates net.inet6.ip6.kame_version as the proof of the latest KAME merge
Reviewed by: KAME
MFC after: 2 days
2006-03-19 02:11:42 +00:00
SUZUKI Shinsuke
2c112cdc6d fixed a bug that an MLD report is not advertised when group-specific MLD query is received.
PR:	kern/93526
Obtained from:	KAME
MFC after:	1 day
2006-03-04 09:17:11 +00:00
Hajimu UMEMOTO
430683286b avoided the use of purged address structure when an address became
invalid in nd6_timer().

PR:		kern/93170
Reported by:	kris
Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Confirmed by:	kris
Obtained from:	KAME
MFC after:	2 days
2006-02-12 15:37:08 +00:00
George V. Neville-Neil
f2b1bd14dc Fix for an inappropriate bzero of the ICMPv6 stats. The code was zero'ing the wrong structure member but setting the correct one.
Submitted by:	James dot Juran at baesystems dot com
Reviewed by:	gnn
MFC after:	1 week
2006-02-08 07:16:46 +00:00
Hajimu UMEMOTO
8c76311215 shut up strict-aliasing rules warning. 2006-02-05 09:52:40 +00:00
Hajimu UMEMOTO
92cb1c3210 make IPV6_V6ONLY socket option work for UDP as well.
PR:		ports/92620
Reported by:	Kurt Miller <kurt__at__intricatesoftware.com>
MFC after:	1 week
2006-02-02 11:46:05 +00:00
Christian S.J. Peron
604afec496 Somewhat re-factor the read/write locking mechanism associated with the packet
filtering mechanisms to use the new rwlock(9) locking API:

- Drop the variables stored in the phil_head structure which were specific to
  conditions and the home rolled read/write locking mechanism.
- Drop some includes which were used for condition variables
- Drop the inline functions, and convert them to macros. Also, move these
  macros into pfil.h
- Move pfil list locking macros intp phil.h as well
- Rename ph_busy_count to ph_nhooks. This variable will represent the number
  of IN/OUT hooks registered with the pfil head structure
- Define PFIL_HOOKED macro which evaluates to true if there are any
  hooks to be ran by pfil_run_hooks
- In the IP/IP6 stacks, change the ph_busy_count comparison to use the new
  PFIL_HOOKED macro.
- Drop optimization in pfil_run_hooks which checks to see if there are any
  hooks to be ran, and returns if not. This check is already performed by the
  IP stacks when they call:

        if (!PFIL_HOOKED(ph))
                goto skip_hooks;

- Drop in assertion which makes sure that the number of hooks never drops
  below 0 for good measure. This in theory should never happen, and if it
  does than there are problems somewhere
- Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep
- Drop variables which support home rolled read/write locking mechanism from
  the IPFW firewall chain structure.
- Swap out the read/write firewall chain lock internal to use the rwlock(9)
  API instead of our home rolled version
- Convert the inlined functions to macros

Reviewed by:	mlaier, andre, glebius
Thanks to:	jhb for the new locking API
2006-02-02 03:13:16 +00:00
Gleb Smirnoff
25af0bb50e Add some initial locking to gif(4). It doesn't covers the whole driver,
however IPv4-in-IPv4 tunnels are now stable on SMP. Details:

- Add per-softc mutex.
- Hold the mutex on output.

The main problem was the rtentry, placed in softc. It could be
freed by ip_output(). Meanwhile, another thread being in
in_gif_output() can read and write this rtentry.

Reported by:	many
Tested by:	Alexander Shiryaev <aixp mail.ru>
2006-01-30 08:39:09 +00:00
Hajimu UMEMOTO
411babc618 don't embed scope id before running packet filters.
Reported by:	YAMAMOTO Takashi <yamt__at__mwd.biglobe.ne.jp>
Obtained from:	NetBSD
MFC after:	1 week
2006-01-25 08:17:02 +00:00
Robert Watson
9f8a02f168 Convert in6_cksum() to ANSI C function declaration.
MFC after:	1 week
2006-01-22 01:17:57 +00:00
Robert Watson
fc4c825847 When storing the results of malloc() in a pointer to a pointer, check
the pointer to a pointer for NULL, not the pointer for NULL.

Noticed by:	Coverity Prevent analysis tool
MFC after:	3 days
2006-01-14 00:09:41 +00:00
Robert Watson
2ab392c630 In ipcomp6_input(), check 'md' not 'm' after a call to m_pulldown(): 'm'
may be a stale pointer at this point, and we're interested in whether or
not m_pulldown() failed.

Noticed by:	Coverity Prevent analysis tool
MFC after:	3 days
2006-01-13 23:53:23 +00:00
SUZUKI Shinsuke
02ff33e2d0 added a note about the assumption for m->m_pkthdr.rcvif
Obtained from: KAME
MFC After: 1 day
2006-01-09 09:08:43 +00:00
Andrew Thompson
73ff045c57 Add RFC 3378 EtherIP support. This change makes it possible to add gif
interfaces to bridges, which will then send and receive IP protocol 97 packets.
Packets are Ethernet frames with an EtherIP header prepended.

Obtained from:	NetBSD
MFC after:	2 weeks
2005-12-21 21:29:45 +00:00
SUZUKI Shinsuke
7014e0eb11 fixed a kernel crash at the initialization time of PIM-SM register interface
MFC after: 2 days
2005-12-09 04:42:19 +00:00
Hajimu UMEMOTO
4a3df7fe7b the response NS to a DAD NS was not sent correctly due to the
invalid destination address.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
MFC after:	1 day
2005-12-08 06:43:39 +00:00
SUZUKI Shinsuke
a829cf5765 fixed a kernel crash due to an improper removal of callout-timer
(ToDo: similar fix is necessary for other NDP-related callout-timers
 in netinet6/nd6*.c)

PR: kern/88725
MFC after: 1 month
2005-11-16 12:36:08 +00:00
Ruslan Ermilov
303989a2f3 Use sparse initializers for "struct domain" and "struct protosw",
so they are easier to follow for the human being.
2005-11-09 13:29:16 +00:00
SUZUKI Shinsuke
797df30d75 statically configured IPv6 address is properly added/deleted now
Obtained from: KAME
Reported in: freebsd-net@freebsd
MFC after: 1 day
2005-10-31 23:06:04 +00:00
SUZUKI Shinsuke
36dc24e61e fixed a compilation failure on amd64/sparc64/ia64
Submitted by: max
MFC after: 2 month
2005-10-22 05:07:16 +00:00
SUZUKI Shinsuke
200caaf0c0 nuked non-existing commands 2005-10-21 16:31:39 +00:00
SUZUKI Shinsuke
743eee666f sync with KAME regarding NDP
- introduced fine-grain-timer to manage ND-caches and IPv6 Multicast-Listeners
- supports Router-Preference <draft-ietf-ipv6-router-selection-07.txt>
- better prefix lifetime management
- more spec-comformant DAD advertisement
- updated RFC/internet-draft revisions

Obtained from: KAME
Reviewed by: ume, gnn
MFC after: 2 month
2005-10-21 16:23:01 +00:00
SUZUKI Shinsuke
9c8aab3e0b perform NUD on an IPv6-aware point-to-point interface
Obtained from: KAME
MFC after: 1 week
2005-10-21 15:59:00 +00:00
SUZUKI Shinsuke
4ecbe3316a sync with KAME (renamed a macro IPV6_DADOUTPUT to IPV6_UNSPECSRC)
Obtained from: KAME
2005-10-21 15:45:13 +00:00
SUZUKI Shinsuke
7aa5949375 sync with KAME (nuked unused code, use NULL to denote a NULL pointer)
Obtained from: KAME
Reviewed by: ume, gnn
2005-10-19 17:18:49 +00:00
SUZUKI Shinsuke
c1a049ac20 sync with KAME (removed a unnecesary non-standard macro)
Obtained from: KAME
Reviewd by: ume, gnn
2005-10-19 16:53:24 +00:00
SUZUKI Shinsuke
d28bde669a sync with KAME regarding the following clarification in RFC3542:
- disable IPv6 operation if DAD fails for some EUI-64 link-local addresses.
 - export get_hw_ifid() (and rename it) as a subroutine for this process.

Obtained from: KAME
Reviewd by: ume, gnn
MFC after: 2 week
2005-10-19 16:43:57 +00:00
SUZUKI Shinsuke
a22adbc68c sync with KAME (don't respond to NI_QTYPE_IPV4ADDR)
Obtained from: KAME
Reviewed by: ume, gnn
2005-10-19 16:27:33 +00:00
SUZUKI Shinsuke
5b27b04579 supported an ndp command suboption to disable IPv6 in the given interface
Obtained from: KAME
Reviewd by: ume, gnn
MFC after: 2 week
2005-10-19 16:20:18 +00:00
SUZUKI Shinsuke
b9204379a1 added an ioctl option in kernel so that ndp/rtadvd can change some NDP-related kernel variables based on their configurations (RFC2461 p.43 6.2.1 mandates this for IPv6 routers)
Obtained from: KAME
Reviewd by: ume, gnn
MFC after: 2 weeks
2005-10-19 15:05:42 +00:00
SUZUKI Shinsuke
2ce62dce17 sync with KAME in the following points:
- fixed typos
- improved some comment descriptions
- use NULL, instead of 0, to denote a NULL pointer
- avoid embedding a magic number in the code
- use nd6log() instead of log() to record NDP-specific logs
- nuked an unnecessay white space

Obtained from: KAME
MFC after:  1 day
2005-10-19 10:09:19 +00:00
SUZUKI Shinsuke
4350fcab1b Raw IPv6 checksum must use the protocol number of the last header, instead of the first next-header value.
Obtained from: KAME
MFC after: 1 day
2005-10-19 01:21:49 +00:00
SUZUKI Shinsuke
2d70ebe43d fixed a kernel crash when IPv6 PIM-SM routing is enabled and a PIM register message is received
Obtained from: KAME
MFC After: 3 days
2005-10-17 13:47:31 +00:00
SUZUKI Shinsuke
971b154cd3 added a missing unlock
Submitted by: JINMEI Tatuya
MFC After: 1 day
2005-10-15 08:49:49 +00:00
Hajimu UMEMOTO
9129d539e2 AES counter mode uses 8byte IV, not 16 bytes.
Obtained from:	NetBSD
2005-10-12 09:13:48 +00:00
Andre Oppermann
fe53256dc2 Use monotonic 'time_uptime' instead of 'time_second' as timebase
for rt->rt_rmx.rmx_expire.
2005-09-19 22:54:55 +00:00
SUZUKI Shinsuke
9689258fb5 plugged a possible memory leak
Obtained from: KAME
MFC after: 1 day
2005-09-16 01:42:50 +00:00
David E. O'Brien
7ba26d99d8 IPv6 was improperly defining its malloc type the same as IPv4 (M_IPMADDR,
M_IPMOPTS, M_MRTABLE).  Thus we had conflicting instantiations.
Create an IPv6-specific type to overcome this.
2005-09-07 10:11:49 +00:00
Andrew Thompson
59280079d3 Add support for multicast to the bridge and allow inet6 addresses to be
assigned to the interface.

IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link, the spec is unclear for the bridge case and
it may cause scope violation.

An address can be assigned in the usual way;
  ifconfig bridge0 inet6 xxxx:...

Tested by:	bmah
Reviewed by:	ume (netinet6)
Approved by:	mlaier (mentor)
MFC after:	1 week
2005-09-06 21:11:59 +00:00
Andre Oppermann
e0aec68255 Use the correct mbuf type for MGET(). 2005-08-30 16:35:27 +00:00
SUZUKI Shinsuke
2af9b91993 added a missing unlock (just do the same thing as in netinet/raw_ip.c)
Obtained from: KAME
MFC after: 3 days
2005-08-18 11:11:27 +00:00
Hajimu UMEMOTO
5d52565396 - fix race condition using sx lock.
- use TAILQ_FOREACH() for readability.

Suggested by:	jhb
2005-08-17 16:46:55 +00:00
Hajimu UMEMOTO
1c44678637 avoid exclusive sleep mutex. 2005-08-16 19:49:10 +00:00
Hajimu UMEMOTO
5af09736a8 added a knob to enable path MTU discovery for multicast packets.
(by default, it is disabled)

Submitted by:	suz
Obtained from:	KAME
2005-08-13 19:55:06 +00:00
Hajimu UMEMOTO
cd0fdcf7a7 - fix typo in comment.
- nuke unused code.

Submitted by:	suz
Obtained from:	KAME
2005-08-12 15:27:25 +00:00
Gleb Smirnoff
530f95fc08 o Make rt_check() function more strict:
- rt0 passed to rt_check() must not be NULL, assert this.
  - rt returned by rt_check() must be valid locked rtentry,
    if no error occured.
o Modify callers, so that they never pass NULL rt0
  to rt_check().

Reviewed by:	sam, ume (nd6.c)
2005-08-11 08:14:53 +00:00
Hajimu UMEMOTO
ae12c6579e create sysctl tree dynamically. it is required to share
net.inet6.ip6.fw with upcomming ipfw2 improvement for IPv6.

Requested by:	bz
2005-08-11 07:28:01 +00:00
Hajimu UMEMOTO
31c8e3fbec removed RFC1885-related code. it was obsoleted by RFC2463, and the
code was #ifdef'ed out for a long time.

Submitted by:	suz
Obtained from:	KAME
2005-08-10 17:30:10 +00:00
SUZUKI Shinsuke
f8a8f9ca5e supports stealth forwarding in IPv6, as well as in IPv4
PR: kern/54625
MFC after: 1 week
2005-08-10 09:13:35 +00:00
David E. O'Brien
c11ba30c9a Remove public declarations of variables that were forgotten when they were
made static.
2005-08-10 07:10:02 +00:00
David E. O'Brien
6ca6f60b07 Style nit. 2005-08-10 06:38:46 +00:00
SUZUKI Shinsuke
05b697ddcb fixed a kernel crash at the start-up time of an IPv6 multicast daemons o
(e.g. pim6dd, pim6sd)

MFC after: 3 days
2005-08-10 05:28:11 +00:00
Hajimu UMEMOTO
c66b5fea43 corrected the fourth argument to ni6_addrs(). 2005-08-09 12:24:11 +00:00
Robert Watson
13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
Gleb Smirnoff
9bd8ca3014 In preparation for fixing races in ARP (and probably in other
L2/L3 mappings) make rt_check() return a locked rtentry.
2005-08-09 08:39:56 +00:00
Gleb Smirnoff
401df2f296 - Use 'error' variable to store error value, instead of 'i'.
- Push 'i' into the only block where it is used.
- Remove redundant check for rt being NULL. If rt_check() hasn't
  returned an error, then rt is valid.

Reviewed by:	gnn
2005-08-09 08:37:28 +00:00
Robert Watson
bccb41014a Modify network protocol consumers of the ifnet multicast address lists
to lock if_addr_mtx.

Problem reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after:		1 week
2005-08-02 23:51:22 +00:00
Hajimu UMEMOTO
e770771a78 simplied the fix to FreeBSD-SA-04:06.ipv6. The previous one worried
too much even though we actually validate the parameters.  This code
also is more compatible with other *BSDs, which do copyin within
setsockopt().

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Reviewed by:	security-officer (nectar)
Obtained from:	KAME
2005-07-28 18:07:07 +00:00
Colin Percival
1fcc990954 Correct a buffer overflow which can occur when decompressing a
carefully crafted deflated data stream. [1]

Correct problems in the AES-XCBC-MAC IPsec authentication algorithm. [2]

Submitted by:	suz [2]
Security:	FreeBSD-SA-05:18.zlib [1], FreeBSD-SA-05:19.ipsec [2]
2005-07-27 08:41:17 +00:00
Hajimu UMEMOTO
d6bb0cb7eb nuke duplicate inclusion of scope6_var.h. 2005-07-26 11:46:15 +00:00
Hajimu UMEMOTO
336a1a7b37 oops, make it compilable. i need sleep. X-( 2005-07-25 17:28:39 +00:00
Hajimu UMEMOTO
a7734b4bfd restore locks which disappeared wrongly by my previous commit. 2005-07-25 17:05:37 +00:00
Hajimu UMEMOTO
a1f7e5f8ee scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from:	KAME
2005-07-25 12:31:43 +00:00
Hajimu UMEMOTO
885adbfa81 always copy ip6_pktopt. remove needcopy and needfree
argument/structure member accordingly.

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-21 16:39:23 +00:00
Hajimu UMEMOTO
e07db7aa57 simplified udp6_output() and rip6_output(): do not override
in6p_outputopts at the entrance of the functions.  this trick was
necessary when we passed an in6 pcb to in6_embedscope(), within which
the in6p_outputopts member was used, but we do not use this kind of
interface any more.

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-21 16:32:50 +00:00
Hajimu UMEMOTO
d5e3406d06 be consistent on naming advanced API functions; use ip6_XXXpktopt(s).
Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-21 15:06:32 +00:00
Hajimu UMEMOTO
8507acb169 NULL is not zero.
Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-21 14:57:53 +00:00
Hajimu UMEMOTO
9727df0c09 do not hardcode if_mtu values in here, except for IFT_{ARC,FDDI} -
they need special handling.  makes it possible to take advantage of 9k ether
frames.

Obtained from:	NetBSD
2005-07-20 20:02:28 +00:00
Hajimu UMEMOTO
18b35df8fe update comments:
- RFC2292bis -> RFC3542
  - typo fixes

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
2005-07-20 08:59:45 +00:00
Andrew Thompson
2fcb030ad5 Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).

This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't need so the cost is avoided.

 IP_HDR_ALIGNED_P()
 IP6_HDR_ALIGNED_P()

Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.

PR:		ia64/81284
Obtained from:	NetBSD
Approved by:	re (dwhite), mlaier (mentor)
2005-07-02 23:13:31 +00:00
Hajimu UMEMOTO
d098c2c166 fix IP(v4) over IPv6 tunneling most likely broken with ifnet changes.
Submitted by:	bz
Approved by:	re (dwhite)
2005-06-20 20:17:00 +00:00
Brooks Davis
be4889bb80 Fix IPv6 neighbor discovery by using IF_LLADDR to get the mac address
instead of a particularly ugly cast + pointer math hack.

Reported by:	kuriyama, kris
2005-06-12 00:45:24 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Ian Dowse
ba5da2a06f Use IFF_LOCKGIANT/IFF_UNLOCKGIANT around calls to the interface
if_ioctl routine. This should fix a number of code paths through
soo_ioctl() that could call into Giant-locked network drivers without
first acquiring Giant.
2005-06-02 00:04:08 +00:00
Robert Watson
8a2aa63d7e Lock udbinfo and inp before calling in6_pcbdetach() from udp6_abort().
MFC after:	1 week
2005-06-01 11:38:19 +00:00
George V. Neville-Neil
403cbcf59f Fixes for various nits found by the Coverity tool.
In particular 2 missed return values and an inappropriate bcopy from
a possibly NULL pointer.

Reviewed by:	jake
Approved by:	rwatson
MFC after:	1 week
2005-05-15 02:28:30 +00:00
Brooks Davis
8195404bed Add IPv6 support to IPFW and Dummynet.
Submitted by:	Mariano Tortoriello and Raffaele De Lorenzo (via luigi)
2005-04-18 18:35:05 +00:00
George V. Neville-Neil
c543ec4e34 Remove dead code which would never execute.
i.e. checking to see if a cluster was every less than 48 bytes,
    a rather unlikely case.

Check return value of m_dup_pkthdr() calls.

Found by: Coverity
Reviewed by: rwatson (mentor), Keiichi Shima (for Kame)
Approved by: rwatson (mentor)
2005-04-14 11:41:23 +00:00
Sam Leffler
8a9d54df38 check for malloc failure (also move malloc up to simplify error recovery)
Noticed by:	Coverity Prevent analysis tool
Reviewed by:	gnn
2005-03-29 01:26:27 +00:00
Gleb Smirnoff
d4d2297060 ifma_protospec is a pointer. Use NULL when assigning or compating it. 2005-03-20 14:31:45 +00:00
Sam Leffler
6c011e4dc3 correct bounds check
Noticed by:	Coverity Prevent analysis tool
2005-03-16 05:11:11 +00:00
Hajimu UMEMOTO
9f65b10b0f refer opencrypto/cast.h directly. 2005-03-11 12:37:07 +00:00
Hajimu UMEMOTO
d34fd3c7e0 reported from VANHULLEBUS Yvan [remote kernel crash may result]
Submitted by:	itojun
Obtained from:	KAME
MFC after:	1 day
2005-03-09 14:39:48 +00:00
SUZUKI Shinsuke
da57b1caf8 ignores ICMPv6 code field in case of ICMPv6 Packet-Too-Big (as specified in RFC2463 and draft-ietf-ipngwg-icmp-v3-06.txt)
Obtained from: KAME
MFC after: 1 day
2005-03-02 05:14:15 +00:00
Hajimu UMEMOTO
9c0fda722d icmp6_notify_error uses IP6_EXTHDR_CHECK, which in turn calls
m_pullup.  icmp6_notify_error continued to use the old pointer,
which after the m_pullup is not suitable as a packet header any
longer (see m_move_pkthdr).
and this is what causes the kernel panic in sbappendaddr later on.

PR:		kern/77934
Submitted by:	Gerd Rausch <gerd@juniper.net>
MFC after:	2 days
2005-02-27 18:57:10 +00:00
Hajimu UMEMOTO
bee48028f0 fix typo.
MFC after:	2 days
2005-02-27 18:23:29 +00:00
Hajimu UMEMOTO
283f9f8a3c initialized the last arg to ip6_process_hopopts(), because the recent
code requires it to be 0 when a jumbo payload option is contained.

PR:		kern/77934
Submitted by:	Gerd Rausch <gerd@juniper.net>
Obtained from:	KAME
MFC after:	2 days
2005-02-27 18:07:18 +00:00
Sam Leffler
ba1a42195c remove dead code
Noticed by:	Coverity Prevent analysis tool
2005-02-25 22:58:25 +00:00
Sam Leffler
7f560471fe eliminate dead code
Noticed by:	Coverity Prevent analysis tool
2005-02-23 22:53:04 +00:00
Gleb Smirnoff
a97719482d Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by:	mlaier
Obtained from:	OpenBSD (mickey, mcbride)
2005-02-22 13:04:05 +00:00
Robert Watson
da2ecc1aa6 Add missed merge of ripcbinfo extern. Given how widely used
ripcbinfo is, we should probably add it to an include file.

Spotted by:	mux
2005-02-09 01:12:43 +00:00
Robert Watson
62a2c81733 Lock raw IP socket pcb list and PCBs when processing input via
icmp6_rip6_input().

Reviewed by:	gnn
MFC after:	1 week
2005-02-08 22:16:26 +00:00
Robert Watson
8760934124 Remove a comment from the raw IPv6 output function regarding
M_TRYWAIT allocations: M_PREPEND() now uses M_DONTWAIT.

MFC after:	3 days
2005-02-06 21:43:55 +00:00
Hajimu UMEMOTO
5b8c5ac438 we don't need to make fake sockaddr_in6 to compare subject address.
MFC after:	1 week
2005-01-21 18:12:46 +00:00
Warner Losh
caf43b0208 /* -> /*- for license, minor formatting changes, separate for KAME 2005-01-07 02:30:35 +00:00
Gleb Smirnoff
5e5da86597 In certain cases ip_output() can free our route, so check
for its presence before RTFREE().

Noticed by:	ru
2004-12-10 07:51:14 +00:00
Gleb Smirnoff
34291a9efc style the last change 2004-12-09 09:52:58 +00:00
Gleb Smirnoff
39817106d4 MFinet4:
- Make route cacheing optional, configurable via IFF_LINK0 flag.
  - Turn it off by default.

Reminded by:	suz
2004-12-09 09:48:47 +00:00
George V. Neville-Neil
026e67b69b Reviewed by: SUZUKI Shinsuke <suz@kame.net>
Approved by:  Robert Watson <rwatson@freebsd.org>

Add locking to the IPv6 scoping code.

All spl() like calls have also been removed.

Cleaning up the handling of ifnet data will happen at a later date.
2004-11-29 03:10:35 +00:00
SUZUKI Shinsuke
3d54848fc2 support TCP-MD5(IPv4) in KAME-IPSEC, too.
MFC after: 3 week
2004-11-08 18:49:51 +00:00
Poul-Henning Kamp
756d52a195 Initialize struct pr_userreqs in new/sparse style and fill in common
default elements in net_init_domain().

This makes it possible to grep these structures and see any bogosities.
2004-11-08 14:44:54 +00:00
SUZUKI Shinsuke
b3fe9bc483 fixed a bug that incorrect IPsec request level may be returned for proto AH
Obtained from: KAME
2004-10-28 09:24:45 +00:00
Andre Oppermann
f45cd79a03 Be more careful to only index valid IP protocols and be more verbose with
comments.
2004-10-19 14:26:44 +00:00
Robert Watson
81158452be Push acquisition of the accept mutex out of sofree() into the caller
(sorele()/sotryfree()):

- This permits the caller to acquire the accept mutex before the socket
  mutex, avoiding sofree() having to drop the socket mutex and re-order,
  which could lead to races permitting more than one thread to enter
  sofree() after a socket is ready to be free'd.

- This also covers clearing of the so_pcb weak socket reference from
  the protocol to the socket, preventing races in clearing and
  evaluation of the reference such that sofree() might be called more
  than once on the same socket.

This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket.  The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets.  The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.

RELENG_5_3 candidate.

MFC after:	3 days
Reviewed by:	dwhite
Discussed with:	gnn, dwhite, green
Reported by:	Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by:	Vlad <marchenko at gmail dot com>
2004-10-18 22:19:43 +00:00
SUZUKI Shinsuke
6f9e3ebf47 fixed too delayed routing cache expiry. (tvtohz() converts a time interval to ticks, whereas hzto() converts an absolute time to ticks)
Obtained from: KAME
2004-10-06 03:32:26 +00:00
Brian Feldman
77b691e0ad Prevent reentrancy of the IPv6 routing code (leading to crash with
INVARIANTS on, who knows what with it off).
2004-10-03 00:49:33 +00:00
Doug White
763f534e3c Disable MTU feedback in IPv6 if the sender writes data that must be fragmented.
Discussed extensively with KAME.  The API author's intent isn't clear at this
point, so rather than remove the code entirely, #if 0 out and put a big
comment in for now. The IPV6_RECVPATHMTU sockopt is available if the
application wants to be notified of the path MTU to optimize packet sizes.

Thanks to JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp> for putting up
with my incessant badgering on this issue, and fenner for pointing out
the API issue and suggesting solutions.
2004-10-02 23:45:02 +00:00
Max Laier
d6a8d58875 Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.

Suggested by:		rwatson
A lot of work by:	csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by:		rwatson, csjp
Tested by:		-pf, -ipfw, LINT, csjp and myself
MFC after:		3 days

LOR IDs:		14 - 17 (not fixed yet)
2004-09-29 04:54:33 +00:00
Stefan Farfeleder
e7b80a8e24 Prefer C99's __func__ over GCC's __FUNCTION__. 2004-09-22 17:16:04 +00:00
Robert Watson
690be704f3 Call callout_init() on nd6_slowtimo_ch before setting it going; otherwise,
the flags field will be improperly initialized resulting in inconsistent
operation (sometimes with Giant, sometimes without, et al).

RELENG_5 candidate.
2004-09-05 17:27:54 +00:00
Robert Watson
0b7851fa03 Unlock rather than lock the ripcbinfo lock at the end of rip6_input().
RELENG_5 candidate.

Foot provided by:	Patrick Guelat <pg at imp dot ch>
2004-09-02 20:18:02 +00:00
Robert Watson
98f6a62499 Mark Netgraph TTY, KAME IPSEC, and IPX/SPX as requiring Giant for correct
operation using NET_NEEDS_GIANT().  This will result in a boot-time
restoration of Giant-enabled network operation, or run-time warning on
dynamic load (applicable only to the Netgraph component).  Additional
components will likely need to be marked with this in the future.
2004-08-28 15:24:53 +00:00
Andre Oppermann
3161f583ca Apply error and success logic consistently to the function netisr_queue() and
its users.

netisr_queue() now returns (0) on success and ERRNO on failure.  At the
moment ENXIO (netisr queue not functional) and ENOBUFS (netisr queue full)
are supported.

Previously it would return (1) on success but the return value of IF_HANDOFF()
was interpreted wrongly and (0) was actually returned on success.  Due to this
schednetisr() was never called to kick the scheduling of the isr.  However this
was masked by other normal packets coming through netisr_dispatch() causing the
dequeueing of waiting packets.

PR:		kern/70988
Found by:	MOROHOSHI Akihiko <moro@remus.dti.ne.jp>
MFC after:	3 days
2004-08-27 18:33:08 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
Robert Watson
c415679d71 Remove in6_prefix.[ch] and the contained router renumbering capability.
The prefix management code currently resides in nd6, leaving only the
unused router renumbering capability in the in6_prefix files.  Removing
it will make it easier for us to provide locking for the remainder of
IPv6 by reducing the number of objects requiring synchronized access.

This functionality has also been removed from NetBSD and OpenBSD.

Submitted by:	George Neville-Neil <gnn at neville-neil.com>
Discussed with/approved by:	suz, keiichi at kame.net, core at kame.net
2004-08-23 03:00:27 +00:00
Robert Watson
5a0192650e When notifying protocol components of an event on an in6pcb, use the
result of the notify() function to decide if we need to unlock the
in6pcb or not, rather than always unlocking.  Otherwise, we may unlock
and already unlocked in6pcb.

Reported by:	kuriyama, Gordon Bergling <gbergling at 0xfce3.net>
Tested by:	kuriyama, Gordon Bergling <gbergling at 0xfce3.net>
Discussed with:	mdodd
2004-08-21 17:38:48 +00:00
David Malone
1f44b0a1b5 Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD
have already done this, so I have styled the patch on their work:

        1) introduce a ip_newid() static inline function that checks
        the sysctl and then decides if it should return a sequential
        or random IP ID.

        2) named the sysctl net.inet.ip.random_id

        3) IPv6 flow IDs and fragment IDs are now always random.
        Flow IDs and frag IDs are significantly less common in the
        IPv6 world (ie. rarely generated per-packet), so there should
        be smaller performance concerns.

The sysctl defaults to 0 (sequential IP IDs).

Reviewed by:	andre, silby, mlaier, ume
Based on:	NetBSD
MFC after:	2 months
2004-08-14 15:32:40 +00:00
Robert Watson
8a0c4da871 When allocating the IPv6 header to stick in front of raw packet being
sent via a raw IPv6 socket, use M_DONTWAIT not M_TRYWAIT, as we're
holding the raw pcb mutex.

Reported, tested by:	kuriyama
2004-08-12 18:31:36 +00:00
Robert Watson
f31f65a708 Pass pcbinfo structures to in6_pcbnotify() rather than pcbhead
structures, allowing in6_pcbnotify() to lock the pcbinfo and each
inpcb that it notifies of ICMPv6 events.  This prevents inpcb
assertions from firing when IPv6 generates and delievers event
notifications for inpcbs.

Reported by:	kuriyama
Tested by:	kuriyama
2004-08-06 03:45:45 +00:00
Yaroslav Tykhiy
a4eb4405e3 Disallow a particular kind of port theft described by the following scenario:
Alice is too lazy to write a server application in PF-independent
	manner.  Therefore she knocks up the server using PF_INET6 only
	and allows the IPv6 socket to accept mapped IPv4 as well.  An evil
	hacker known on IRC as cheshire_cat has an account in the same
	system.  He starts a process listening on the same port as used
	by Alice's server, but in PF_INET.  As a consequence, cheshire_cat
	will distract all IPv4 traffic supposed to go to Alice's server.

Such sort of port theft was initially enabled by copying the code that
implemented the RFC 2553 semantics on IPv4/6 sockets (see inet6(4)) for
the implied case of the same owner for both connections.  After this
change, the above scenario will be impossible.  In the same setting,
the user who attempts to start his server last will get EADDRINUSE.

Of course, using IPv4 mapped to IPv6 leads to security complications
in the first place, but there is no reason to make it even more unsafe.

This change doesn't apply to KAME since it affects a FreeBSD-specific
part of the code.  It doesn't modify the out-of-box behaviour of the
TCP/IP stack either as long as mapping IPv4 to IPv6 is off by default.

MFC after:	1 month
2004-07-28 13:03:07 +00:00
Robert Watson
07385abd73 Commit a first pass at in6pcb and pcbinfo locking for IPv6,
synchronizing IPv6 protocol control blocks and lists.  These changes
are modeled on the inpcb locking for IPv4, submitted by Jennifer Yang,
and committed by Jeffrey Hsu.  With these locking changes, IPv6 use of
inpcbs is now substantially more MPSAFE, and permits IPv4 inpcb locking
assertions to be run in the presence of IPv6 compiled into the kernel.
2004-07-27 23:44:03 +00:00
Yaroslav Tykhiy
f66145c6bd Don't consider TCP connections beyond LISTEN state
(i.e. with the foreign address being not wildcard) when checking
for possible port theft since such connections cannot be stolen.

The port theft check is FreeBSD-specific and isn't in the KAME tree.

PR:		bin/65928 (in the audit trail)
Reviewed by:	-net, -hackers (silence)
Tested by:	Nick Leuta <skynick at mail.sc.ru>
MFC after:	1 month
2004-07-27 16:35:09 +00:00
Colin Percival
56f21b9d74 Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with:	rwatson, scottl
Requested by:	jhb
2004-07-26 07:24:04 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Max Laier
02b199f158 Link ALTQ to the build and break with ABI for struct ifnet. Please recompile
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.

__FreeBSD_version bump will follow.

Tested-by:	(i386)LINT
2004-06-13 17:29:10 +00:00
Robert Watson
359fdba7a7 Missed directory in previous commit; need to hold SOCK_LOCK(so)
before calling sotryfree().

-- Body of earlier bulk commit this belonged with --

  Log:
  Extend coverage of SOCK_LOCK(so) to include so_count, the socket
  reference count:

  - Assert SOCK_LOCK(so) macros that directly manipulate so_count:
    soref(), sorele().

  - Assert SOCK_LOCK(so) in macros/functions that rely on the state of
    so_count: sofree(), sotryfree().

  - Acquire SOCK_LOCK(so) before calling these functions or macros in
    various contexts in the stack, both at the socket and protocol
    layers.

  - In some cases, perform soisdisconnected() before sotryfree(), as
    this could result in frobbing of a non-present socket if
    sotryfree() actually frees the socket.

  - Note that sofree()/sotryfree() will release the socket lock even if
    they don't free the socket.

  Submitted by:   sam
  Sponsored by:   FreeBSD Foundation
  Obtained from:  BSD/OS
2004-06-12 20:59:48 +00:00
Hajimu UMEMOTO
3c751c1b6c do not check super user privilege in ip6_savecontrol. It is
meaningless and can even be harmful.

Obtained from:	KAME
MFC after:	3 days
2004-06-02 15:41:18 +00:00
Poul-Henning Kamp
5dba30f15a add missing #include <sys/module.h> 2004-05-30 20:27:19 +00:00
Bill Paul
6f8aee2268 Fix a bug which I discovered recently while doing IPv6 testing at
Wind River. In the IPv4 output path, one of the tests in ip_output()
checks how many slots are actually available in the interface output
queue before attempting to send a packet. If, for example, we need
to transmit a packet of 32K bytes over an interface with an MTU of
1500, we know it's going to take about 21 fragments to do it. If
there's less than 21 slots left in the output queue, there's no point
in transmitting anything at all: IP does not do retransmission, so
sending only some of the fragments would just be a waste of bandwidth.
(In an extreme case, if you're sending a heavy stream of fragmented
packets, you might find yourself sending nothing by the first fragment
of all your packets.) So if ip_output() notices there's not enough
room in the output queue to send the frame, it just dumps the packet
and returns ENOBUFS to the app.

It turns out ip6_output() lacks this code. Consequently, this caused
the netperf UDPIPV6_STREAM test to produce very poor results with large
write sizes. This commit adds code to check the remaining space in the
output queue and junk fragmented packets if they're too big to be
sent, just like with IPv4. (I can't imagine anyone's running an NFS
server using UDP over IPv6, but if they are, this will likely make them
a lot happier. :)
2004-05-14 03:57:17 +00:00
Luigi Rizzo
354c3d34d2 fix the change of interface in nd6_storelladdr for multicast
addresses too.

Reported by: Jun Kuriyama
2004-04-26 20:31:46 +00:00
Luigi Rizzo
cd46a114fc This commit does two things:
1. rt_check() cleanup:
    rt_check() is only necessary for some address families to gain access
    to the corresponding arp entry, so call it only in/near the *resolve()
    routines where it is actually used -- at the moment this is
    arpresolve(), nd6_storelladdr() (the call is embedded here),
    and atmresolve() (the call is just before atmresolve to reduce
    the number of changes).
    This change will make it a lot easier to decouple the arp table
    from the routing table.

    There is an extra call to rt_check() in if_iso88025subr.c to
    determine the routing info length. I have left it alone for
    the time being.

    The interface of arpresolve() and nd6_storelladdr() now changes slightly:
     + the 'rtentry' parameter (really a hint from the upper level layer)
       is now passed unchanged from *_output(), so it becomes the route
       to the final destination and not to the gateway.
     + the routines will return 0 if resolution is possible, non-zero
       otherwise.
     + arpresolve() returns EWOULDBLOCK in case the mbuf is being held
       waiting for an arp reply -- in this case the error code is masked
       in the caller so the upper layer protocol will not see a failure.

2. arpcom untangling
    Where possible, use 'struct ifnet' instead of 'struct arpcom' variables,
    and use the IFP2AC macro to access arpcom fields.
    This mostly affects the netatalk code.

=== Detailed changes: ===
net/if_arcsubr.c
   rt_check() cleanup, remove a useless variable

net/if_atmsubr.c
   rt_check() cleanup

net/if_ethersubr.c
   rt_check() cleanup, arpcom untangling

net/if_fddisubr.c
   rt_check() cleanup, arpcom untangling

net/if_iso88025subr.c
   rt_check() cleanup

netatalk/aarp.c
   arpcom untangling, remove a block of duplicated code

netatalk/at_extern.h
   arpcom untangling

netinet/if_ether.c
   rt_check() cleanup (change arpresolve)

netinet6/nd6.c
   rt_check() cleanup (change nd6_storelladdr)
2004-04-25 09:24:52 +00:00
Luigi Rizzo
60348b56fd ifp has the same value as rt->rti_ifp so remove the dependency
on the route entry to locate the necessary information.
2004-04-19 08:02:52 +00:00
Luigi Rizzo
3240408870 Remove a tail-recursive call in nd6_output.
This change is functionally identical to the original code, though
I have no idea if that was correct in the first place (see comment
in the commit).
2004-04-19 07:48:48 +00:00