Commit Graph

466 Commits

Author SHA1 Message Date
sam
513e3c0122 remove dead code
Noticed by:	Coverity Prevent analysis tool
2005-02-25 22:58:25 +00:00
sam
87ad220d1b eliminate dead code
Noticed by:	Coverity Prevent analysis tool
2005-02-23 22:53:04 +00:00
glebius
e1d22638d0 Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by:	mlaier
Obtained from:	OpenBSD (mickey, mcbride)
2005-02-22 13:04:05 +00:00
rwatson
e05244829c Add missed merge of ripcbinfo extern. Given how widely used
ripcbinfo is, we should probably add it to an include file.

Spotted by:	mux
2005-02-09 01:12:43 +00:00
rwatson
1dc0f1f888 Lock raw IP socket pcb list and PCBs when processing input via
icmp6_rip6_input().

Reviewed by:	gnn
MFC after:	1 week
2005-02-08 22:16:26 +00:00
rwatson
35d0929825 Remove a comment from the raw IPv6 output function regarding
M_TRYWAIT allocations: M_PREPEND() now uses M_DONTWAIT.

MFC after:	3 days
2005-02-06 21:43:55 +00:00
ume
e33b312668 we don't need to make fake sockaddr_in6 to compare subject address.
MFC after:	1 week
2005-01-21 18:12:46 +00:00
imp
2b54eeafae /* -> /*- for license, minor formatting changes, separate for KAME 2005-01-07 02:30:35 +00:00
glebius
16a5d8193f In certain cases ip_output() can free our route, so check
for its presence before RTFREE().

Noticed by:	ru
2004-12-10 07:51:14 +00:00
glebius
272be69b39 style the last change 2004-12-09 09:52:58 +00:00
glebius
9fdc5dd01a MFinet4:
- Make route cacheing optional, configurable via IFF_LINK0 flag.
  - Turn it off by default.

Reminded by:	suz
2004-12-09 09:48:47 +00:00
gnn
f0a7bdfddc Reviewed by: SUZUKI Shinsuke <suz@kame.net>
Approved by:  Robert Watson <rwatson@freebsd.org>

Add locking to the IPv6 scoping code.

All spl() like calls have also been removed.

Cleaning up the handling of ifnet data will happen at a later date.
2004-11-29 03:10:35 +00:00
suz
30108058ef support TCP-MD5(IPv4) in KAME-IPSEC, too.
MFC after: 3 week
2004-11-08 18:49:51 +00:00
phk
027fce30f5 Initialize struct pr_userreqs in new/sparse style and fill in common
default elements in net_init_domain().

This makes it possible to grep these structures and see any bogosities.
2004-11-08 14:44:54 +00:00
suz
0a0f72fa43 fixed a bug that incorrect IPsec request level may be returned for proto AH
Obtained from: KAME
2004-10-28 09:24:45 +00:00
andre
41dc9ace75 Be more careful to only index valid IP protocols and be more verbose with
comments.
2004-10-19 14:26:44 +00:00
rwatson
4b81ce6dd2 Push acquisition of the accept mutex out of sofree() into the caller
(sorele()/sotryfree()):

- This permits the caller to acquire the accept mutex before the socket
  mutex, avoiding sofree() having to drop the socket mutex and re-order,
  which could lead to races permitting more than one thread to enter
  sofree() after a socket is ready to be free'd.

- This also covers clearing of the so_pcb weak socket reference from
  the protocol to the socket, preventing races in clearing and
  evaluation of the reference such that sofree() might be called more
  than once on the same socket.

This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket.  The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets.  The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.

RELENG_5_3 candidate.

MFC after:	3 days
Reviewed by:	dwhite
Discussed with:	gnn, dwhite, green
Reported by:	Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by:	Vlad <marchenko at gmail dot com>
2004-10-18 22:19:43 +00:00
suz
7871c65cf0 fixed too delayed routing cache expiry. (tvtohz() converts a time interval to ticks, whereas hzto() converts an absolute time to ticks)
Obtained from: KAME
2004-10-06 03:32:26 +00:00
green
fb8e0f5c56 Prevent reentrancy of the IPv6 routing code (leading to crash with
INVARIANTS on, who knows what with it off).
2004-10-03 00:49:33 +00:00
dwhite
7ac950b117 Disable MTU feedback in IPv6 if the sender writes data that must be fragmented.
Discussed extensively with KAME.  The API author's intent isn't clear at this
point, so rather than remove the code entirely, #if 0 out and put a big
comment in for now. The IPV6_RECVPATHMTU sockopt is available if the
application wants to be notified of the path MTU to optimize packet sizes.

Thanks to JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp> for putting up
with my incessant badgering on this issue, and fenner for pointing out
the API issue and suggesting solutions.
2004-10-02 23:45:02 +00:00
mlaier
b65eae4c19 Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.

Suggested by:		rwatson
A lot of work by:	csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by:		rwatson, csjp
Tested by:		-pf, -ipfw, LINT, csjp and myself
MFC after:		3 days

LOR IDs:		14 - 17 (not fixed yet)
2004-09-29 04:54:33 +00:00
stefanf
3bd075200e Prefer C99's __func__ over GCC's __FUNCTION__. 2004-09-22 17:16:04 +00:00
rwatson
d8beb8f72b Call callout_init() on nd6_slowtimo_ch before setting it going; otherwise,
the flags field will be improperly initialized resulting in inconsistent
operation (sometimes with Giant, sometimes without, et al).

RELENG_5 candidate.
2004-09-05 17:27:54 +00:00
rwatson
f8da965824 Unlock rather than lock the ripcbinfo lock at the end of rip6_input().
RELENG_5 candidate.

Foot provided by:	Patrick Guelat <pg at imp dot ch>
2004-09-02 20:18:02 +00:00
rwatson
289252d030 Mark Netgraph TTY, KAME IPSEC, and IPX/SPX as requiring Giant for correct
operation using NET_NEEDS_GIANT().  This will result in a boot-time
restoration of Giant-enabled network operation, or run-time warning on
dynamic load (applicable only to the Netgraph component).  Additional
components will likely need to be marked with this in the future.
2004-08-28 15:24:53 +00:00
andre
2126402238 Apply error and success logic consistently to the function netisr_queue() and
its users.

netisr_queue() now returns (0) on success and ERRNO on failure.  At the
moment ENXIO (netisr queue not functional) and ENOBUFS (netisr queue full)
are supported.

Previously it would return (1) on success but the return value of IF_HANDOFF()
was interpreted wrongly and (0) was actually returned on success.  Due to this
schednetisr() was never called to kick the scheduling of the isr.  However this
was masked by other normal packets coming through netisr_dispatch() causing the
dequeueing of waiting packets.

PR:		kern/70988
Found by:	MOROHOSHI Akihiko <moro@remus.dti.ne.jp>
MFC after:	3 days
2004-08-27 18:33:08 +00:00
andre
d243747d92 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
rwatson
b1d9338b73 Remove in6_prefix.[ch] and the contained router renumbering capability.
The prefix management code currently resides in nd6, leaving only the
unused router renumbering capability in the in6_prefix files.  Removing
it will make it easier for us to provide locking for the remainder of
IPv6 by reducing the number of objects requiring synchronized access.

This functionality has also been removed from NetBSD and OpenBSD.

Submitted by:	George Neville-Neil <gnn at neville-neil.com>
Discussed with/approved by:	suz, keiichi at kame.net, core at kame.net
2004-08-23 03:00:27 +00:00
rwatson
56ad5a3d5e When notifying protocol components of an event on an in6pcb, use the
result of the notify() function to decide if we need to unlock the
in6pcb or not, rather than always unlocking.  Otherwise, we may unlock
and already unlocked in6pcb.

Reported by:	kuriyama, Gordon Bergling <gbergling at 0xfce3.net>
Tested by:	kuriyama, Gordon Bergling <gbergling at 0xfce3.net>
Discussed with:	mdodd
2004-08-21 17:38:48 +00:00
dwmalone
5df13d37b2 Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD
have already done this, so I have styled the patch on their work:

        1) introduce a ip_newid() static inline function that checks
        the sysctl and then decides if it should return a sequential
        or random IP ID.

        2) named the sysctl net.inet.ip.random_id

        3) IPv6 flow IDs and fragment IDs are now always random.
        Flow IDs and frag IDs are significantly less common in the
        IPv6 world (ie. rarely generated per-packet), so there should
        be smaller performance concerns.

The sysctl defaults to 0 (sequential IP IDs).

Reviewed by:	andre, silby, mlaier, ume
Based on:	NetBSD
MFC after:	2 months
2004-08-14 15:32:40 +00:00
rwatson
411d86faa7 When allocating the IPv6 header to stick in front of raw packet being
sent via a raw IPv6 socket, use M_DONTWAIT not M_TRYWAIT, as we're
holding the raw pcb mutex.

Reported, tested by:	kuriyama
2004-08-12 18:31:36 +00:00
rwatson
11a2e8ce3c Pass pcbinfo structures to in6_pcbnotify() rather than pcbhead
structures, allowing in6_pcbnotify() to lock the pcbinfo and each
inpcb that it notifies of ICMPv6 events.  This prevents inpcb
assertions from firing when IPv6 generates and delievers event
notifications for inpcbs.

Reported by:	kuriyama
Tested by:	kuriyama
2004-08-06 03:45:45 +00:00
yar
1d71ae12e0 Disallow a particular kind of port theft described by the following scenario:
Alice is too lazy to write a server application in PF-independent
	manner.  Therefore she knocks up the server using PF_INET6 only
	and allows the IPv6 socket to accept mapped IPv4 as well.  An evil
	hacker known on IRC as cheshire_cat has an account in the same
	system.  He starts a process listening on the same port as used
	by Alice's server, but in PF_INET.  As a consequence, cheshire_cat
	will distract all IPv4 traffic supposed to go to Alice's server.

Such sort of port theft was initially enabled by copying the code that
implemented the RFC 2553 semantics on IPv4/6 sockets (see inet6(4)) for
the implied case of the same owner for both connections.  After this
change, the above scenario will be impossible.  In the same setting,
the user who attempts to start his server last will get EADDRINUSE.

Of course, using IPv4 mapped to IPv6 leads to security complications
in the first place, but there is no reason to make it even more unsafe.

This change doesn't apply to KAME since it affects a FreeBSD-specific
part of the code.  It doesn't modify the out-of-box behaviour of the
TCP/IP stack either as long as mapping IPv4 to IPv6 is off by default.

MFC after:	1 month
2004-07-28 13:03:07 +00:00
rwatson
558dee61de Commit a first pass at in6pcb and pcbinfo locking for IPv6,
synchronizing IPv6 protocol control blocks and lists.  These changes
are modeled on the inpcb locking for IPv4, submitted by Jennifer Yang,
and committed by Jeffrey Hsu.  With these locking changes, IPv6 use of
inpcbs is now substantially more MPSAFE, and permits IPv4 inpcb locking
assertions to be run in the presence of IPv6 compiled into the kernel.
2004-07-27 23:44:03 +00:00
yar
8feaacec0c Don't consider TCP connections beyond LISTEN state
(i.e. with the foreign address being not wildcard) when checking
for possible port theft since such connections cannot be stolen.

The port theft check is FreeBSD-specific and isn't in the KAME tree.

PR:		bin/65928 (in the audit trail)
Reviewed by:	-net, -hackers (silence)
Tested by:	Nick Leuta <skynick at mail.sc.ru>
MFC after:	1 month
2004-07-27 16:35:09 +00:00
cperciva
d9fecc83c8 Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with:	rwatson, scottl
Requested by:	jhb
2004-07-26 07:24:04 +00:00
phk
5c95d686a1 Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
mlaier
977d97b004 Link ALTQ to the build and break with ABI for struct ifnet. Please recompile
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.

__FreeBSD_version bump will follow.

Tested-by:	(i386)LINT
2004-06-13 17:29:10 +00:00
rwatson
f66d897510 Missed directory in previous commit; need to hold SOCK_LOCK(so)
before calling sotryfree().

-- Body of earlier bulk commit this belonged with --

  Log:
  Extend coverage of SOCK_LOCK(so) to include so_count, the socket
  reference count:

  - Assert SOCK_LOCK(so) macros that directly manipulate so_count:
    soref(), sorele().

  - Assert SOCK_LOCK(so) in macros/functions that rely on the state of
    so_count: sofree(), sotryfree().

  - Acquire SOCK_LOCK(so) before calling these functions or macros in
    various contexts in the stack, both at the socket and protocol
    layers.

  - In some cases, perform soisdisconnected() before sotryfree(), as
    this could result in frobbing of a non-present socket if
    sotryfree() actually frees the socket.

  - Note that sofree()/sotryfree() will release the socket lock even if
    they don't free the socket.

  Submitted by:   sam
  Sponsored by:   FreeBSD Foundation
  Obtained from:  BSD/OS
2004-06-12 20:59:48 +00:00
ume
afde7385d2 do not check super user privilege in ip6_savecontrol. It is
meaningless and can even be harmful.

Obtained from:	KAME
MFC after:	3 days
2004-06-02 15:41:18 +00:00
phk
f43aa0c4bc add missing #include <sys/module.h> 2004-05-30 20:27:19 +00:00
wpaul
6a0eb38f30 Fix a bug which I discovered recently while doing IPv6 testing at
Wind River. In the IPv4 output path, one of the tests in ip_output()
checks how many slots are actually available in the interface output
queue before attempting to send a packet. If, for example, we need
to transmit a packet of 32K bytes over an interface with an MTU of
1500, we know it's going to take about 21 fragments to do it. If
there's less than 21 slots left in the output queue, there's no point
in transmitting anything at all: IP does not do retransmission, so
sending only some of the fragments would just be a waste of bandwidth.
(In an extreme case, if you're sending a heavy stream of fragmented
packets, you might find yourself sending nothing by the first fragment
of all your packets.) So if ip_output() notices there's not enough
room in the output queue to send the frame, it just dumps the packet
and returns ENOBUFS to the app.

It turns out ip6_output() lacks this code. Consequently, this caused
the netperf UDPIPV6_STREAM test to produce very poor results with large
write sizes. This commit adds code to check the remaining space in the
output queue and junk fragmented packets if they're too big to be
sent, just like with IPv4. (I can't imagine anyone's running an NFS
server using UDP over IPv6, but if they are, this will likely make them
a lot happier. :)
2004-05-14 03:57:17 +00:00
luigi
5a251e2af2 fix the change of interface in nd6_storelladdr for multicast
addresses too.

Reported by: Jun Kuriyama
2004-04-26 20:31:46 +00:00
luigi
59063f7a08 This commit does two things:
1. rt_check() cleanup:
    rt_check() is only necessary for some address families to gain access
    to the corresponding arp entry, so call it only in/near the *resolve()
    routines where it is actually used -- at the moment this is
    arpresolve(), nd6_storelladdr() (the call is embedded here),
    and atmresolve() (the call is just before atmresolve to reduce
    the number of changes).
    This change will make it a lot easier to decouple the arp table
    from the routing table.

    There is an extra call to rt_check() in if_iso88025subr.c to
    determine the routing info length. I have left it alone for
    the time being.

    The interface of arpresolve() and nd6_storelladdr() now changes slightly:
     + the 'rtentry' parameter (really a hint from the upper level layer)
       is now passed unchanged from *_output(), so it becomes the route
       to the final destination and not to the gateway.
     + the routines will return 0 if resolution is possible, non-zero
       otherwise.
     + arpresolve() returns EWOULDBLOCK in case the mbuf is being held
       waiting for an arp reply -- in this case the error code is masked
       in the caller so the upper layer protocol will not see a failure.

2. arpcom untangling
    Where possible, use 'struct ifnet' instead of 'struct arpcom' variables,
    and use the IFP2AC macro to access arpcom fields.
    This mostly affects the netatalk code.

=== Detailed changes: ===
net/if_arcsubr.c
   rt_check() cleanup, remove a useless variable

net/if_atmsubr.c
   rt_check() cleanup

net/if_ethersubr.c
   rt_check() cleanup, arpcom untangling

net/if_fddisubr.c
   rt_check() cleanup, arpcom untangling

net/if_iso88025subr.c
   rt_check() cleanup

netatalk/aarp.c
   arpcom untangling, remove a block of duplicated code

netatalk/at_extern.h
   arpcom untangling

netinet/if_ether.c
   rt_check() cleanup (change arpresolve)

netinet6/nd6.c
   rt_check() cleanup (change nd6_storelladdr)
2004-04-25 09:24:52 +00:00
luigi
610287ce03 ifp has the same value as rt->rti_ifp so remove the dependency
on the route entry to locate the necessary information.
2004-04-19 08:02:52 +00:00
luigi
f5a30e0153 Remove a tail-recursive call in nd6_output.
This change is functionally identical to the original code, though
I have no idea if that was correct in the first place (see comment
in the commit).
2004-04-19 07:48:48 +00:00
luigi
e131978e46 Replace Bcopy/Bzero with 'the real thing' as in the rest of the file. 2004-04-18 11:45:28 +00:00
imp
b49b7fe799 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
suz
d551783211 UDP checksum is mandatory in IPv6 (RFC2460 p.28)
Obtained from: KAME
2004-04-01 13:48:23 +00:00
pjd
49554d1bd8 Reduce 'td' argument to 'cred' (struct ucred) argument in those functions:
- in_pcbbind(),
	- in_pcbbind_setup(),
	- in_pcbconnect(),
	- in_pcbconnect_setup(),
	- in6_pcbbind(),
	- in6_pcbconnect(),
	- in6_pcbsetport().
"It should simplify/clarify things a great deal." --rwatson

Requested by:	rwatson
Reviewed by:	rwatson, ume
2004-03-27 21:05:46 +00:00