Commit Graph

118 Commits

Author SHA1 Message Date
Bruce Evans
6f5967c087 Initialize a local variable in 2 places just before it is used, not always
at the start of rtalloc1().  This backs out part of revs 1.83 and 1.85.

Profiling on an i386 showed that that for sending tiny packets using
bge, -current takes 7 bzero()s where RELENG_4 takes only 1, and that
bzero()ing is now the dominant overhead (10-12%, up from 1%, but
profiling overestimated this a bit).  This commit backs out 2 of the
6 extra bzero()s (1 in each of 2 calls per packet to rtalloc1()).  They
were the largest ones by byte count (48 bytes each) but perhaps not
by time (small misaligned ones might take longer).
2006-11-23 05:57:15 +00:00
Qing Li
1a41f91052 Assuming the interface has an address of x.x.x.195, a mask of
255.255.255.0, and a default route with gateway x.x.x.1. Now if
the address mask is changed to something more specific, e.g.,
255.255.255.128, then after the mask change the default gateway
is no longer reachable.

Since the default route is still present in the routing table,
when the output code tries to resolve the address of the default
gateway in function rt_check(), again, the default route will be
returned by rtalloc1(). Because the lock is currently held on the
rtentry structure, one more attempt to hold the lock will trigger
a crash due to "lock recursed on non-recursive mutex ..."

This is a general problem. The fix checks for the above condition
so that an existing route entry is not mistaken for a new cloned
route. Approriately, an ENETUNREACH error is returned back to the
caller

Approved by:	andre
2006-06-05 21:20:21 +00:00
Qing Li
e034e82c56 The current routing code allows insertion of indirect routes that have
gateways which are unreachable except through the default router. For
example, assuming there is a default route configured, and inserting
a route

	"route add 64.102.54.0/24 60.80.1.1"

is currently allowed even when 60.80.1.1 is only reachable through
the default route. However, an error is thrown when this route is
utilized, say,

	"ping 64.102.54.1"  will return an error

This type of route insertion should be disallowed becasue:

1) Let's say that somehow our code allowed this packet to flow to
   the default router, and the default router knows the next hop is
   60.80.1.1, then the question is why bother inserting this route in
   the 1st place, just simply use the default route.

2) Since we're not talking about source routing here, the default
   router could very well choose a different path than using 60.80.1.1
   for the next hop, again it defeats the purpose of adding this route.

Reviewed by:	ru, gnn, bz
Approved by:	andre
2006-05-16 19:11:11 +00:00
Bjoern A. Zeeb
ac4a76ebc9 In rtrequest and rtinit check for sa_len != 0 for the given
destination. These checks are needed so we do not install
a route looking like this:
(0)                192.0.2.200        UH       tun0 =>

When removing this route  the kernel will start to walk
the address space which looks like a hang on 64bit platforms
because it'll take ages while on 32bit you should see a panic
when kernel debugging options are turned on.

The problem is in rtrequest1:
	if (netmask) {
		rt_maskedcopy(dst, ndst, netmask);
	} else
		bcopy(dst, ndst, dst->sa_len);

In both cases the len might be 0 if the application forgot to
set it.  If so ndst will be all-zero  leading to above
mentioned strange routes.

This is an application error but we must not fail/hang/panic
because of this.

Looks ok:	gnn
No objections:	net@ (silence)
MFC after:	8 weeks
2006-05-04 18:33:37 +00:00
Ruslan Ermilov
4a0d6638b3 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
Gleb Smirnoff
2d7e9ead07 Several fixes to rt_setgate(), that fix problems with route changing:
- Rearrange code so that in a case of failure the affected
  route is not changed. Otherwise, a bogus rtentry will be
  left and later rt_check() can recurse on its lock. [1]
- Remove comment about protocol cloning.
- Fix two places where rtentry mutex was recursed on, because
  accessed via two different pointers, that were actually pointing
  to the same rtentry in some cases. [1]
- Return EADDRINUSE instead of bogus EDQUOT, in case when gateway
  uses the same route. [2]

Reported & tested by:	ps, Andrej Zverev <az inec.ru> [1]
PR:			kern/64090 [2]
2005-09-21 11:58:10 +00:00
Andre Oppermann
fe53256dc2 Use monotonic 'time_uptime' instead of 'time_second' as timebase
for rt->rt_rmx.rmx_expire.
2005-09-19 22:54:55 +00:00
Gleb Smirnoff
530f95fc08 o Make rt_check() function more strict:
- rt0 passed to rt_check() must not be NULL, assert this.
  - rt returned by rt_check() must be valid locked rtentry,
    if no error occured.
o Modify callers, so that they never pass NULL rt0
  to rt_check().

Reviewed by:	sam, ume (nd6.c)
2005-08-11 08:14:53 +00:00
Gleb Smirnoff
9bd8ca3014 In preparation for fixing races in ARP (and probably in other
L2/L3 mappings) make rt_check() return a locked rtentry.
2005-08-09 08:39:56 +00:00
Qing Li
16a2e0a6c8 Require gateways for routes to be of the same address family as the
route itself.

It fixes a bug where an IPv4 route for example has an IPv6 gateway
specified:

     route add 10.1.1.1 -inet6 fe80::1%fxp0

     Destination  Gateway       Flags  Refs  Use  Netif Expire
     10.1.1.1     fe80::1%fxp0  UGHS   0     0    fxp0

The fix rejects these illegal combinations:

     route: writing to routing socket: Invalid argument
     add host 10.1.1.1: gateway fe80::1%fxp0: Invalid argument

Reviewed by:	KAME jinmei@isl.rdc.toshiba.co.jp
Reviewed by:	andre (mentor)
Approved by:	re
MFC after:	5
2005-06-28 23:32:22 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Christian S.J. Peron
5090559b7f When a prison is given the ability to create raw sockets (when the
security.jail.allow_raw_sockets sysctl MIB is set to 1) where privileged
access to jails is given out, it is possible for prison root to manipulate
various network parameters which effect the host environment. This commit
plugs a number of security holes associated with the use of raw sockets
and prisons.

This commit makes the following changes:

- Add a comment to rtioctl warning developers that if they add
  any ioctl commands, they should use super-user checks where necessary,
  as it is possible for PRISON root to make it this far in execution.
- Add super-user checks for the execution of the SIOCGETVIFCNT
  and SIOCGETSGCNT IP multicast ioctl commands.
- Add a super-user check to rip_ctloutput(). If the calling cred
  is PRISON root, make sure the socket option name is IP_HDRINCL,
  otherwise deny the request.

Although this patch corrects a number of security problems associated
with raw sockets and prisons, the warning in jail(8) should still
apply, and by default we should keep the default value of
security.jail.allow_raw_sockets MIB to 0 (or disabled) until
we are certain that we have tracked down all the problems.

Looking forward, we will probably want to eliminate the
references to curthread.

This may be a MFC candidate for RELENG_5.

Reviewed by:	rwatson
Approved by:	bmilekic (mentor)
2004-08-21 17:38:57 +00:00
Andre Oppermann
2dc1d58164 Convert the routing table to use an UMA zone for rtentries. The zone is
called "rtentry".

This saves a considerable amount of kernel memory.  R_Zmalloc previously
used 256 byte blocks (plus kmalloc overhead) whereas UMA only needs 132
bytes.

Idea from:	OpenBSD
2004-08-11 17:26:56 +00:00
Alexander Kabaev
445e045b0d Avoid casts as lvalues. 2004-07-28 06:59:55 +00:00
Luigi Rizzo
490b9d88fa fix one typo and remove one wrong line 2004-04-25 01:39:00 +00:00
Luigi Rizzo
769270223c Correct and extend the description of the behaviour of rt_check(). 2004-04-24 23:34:56 +00:00
Luigi Rizzo
d6941ce931 Clearly comment the assumptions that allow us to cast a
'struct radix_node *' to a 'struct rtentry *' in this code,
and introduce a macro, RNTORT(), to do this type conversion.
2004-04-21 15:16:08 +00:00
Luigi Rizzo
85911824db Fix the initial check for NULL arguments in rtfree (previously
it checked for rt == NULL after dereferencing the pointer).
We never check for those events elsewhere, so probably these checks
might go away here as well.

Slightly simplify (and document) the logic for memory allocation
in rt_setgate().

The rest is mostly style changes -- replace 0 with NULL where appropriate,
remove the macro SA() that was only used once, remove some useless
debugging code in rt_fixchange, explain some odd-looking casts.
2004-04-20 07:04:47 +00:00
Luigi Rizzo
1838a6471f replace Bcopy with bcopy as in the rest of the file. 2004-04-18 11:46:29 +00:00
Luigi Rizzo
2eb5613fe6 make route_init() static 2004-04-17 15:10:20 +00:00
Luigi Rizzo
9b98ee2c4f Consistently use ifaddr_byindex() to access the link-level address
of an interface. No functional change.

On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist()
which, if confirmed, would deserve to be fixed and MFC'ed
2004-04-16 08:14:34 +00:00
Luigi Rizzo
e74642df71 route.h: introduce a macro, SA_SIZE(struct sockaddr *) which returns
the space occupied by a struct sockaddr when passed through a
routing socket.
Use it to replace the macro ROUNDUP(int), that does the same but
is redefined by every file which uses it, courtesy of
the School of Cut'n'Paste Programming(TM).

(partial) userland changes to follow.
2004-04-13 11:22:22 +00:00
Luigi Rizzo
5aca0b30d5 in rtinit(), remove one useless variable, and move a few others
within the block where they are used.
2004-04-12 20:24:30 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Sam Leffler
d4b2657f98 Remove extraneous unlock. This fixes a panic seen when manipulating static
entries in the ARP table.
2004-01-07 23:42:21 +00:00
Sam Leffler
e21afc60bf bandaid LOR in rt_setgate; a proper fix requires code refactoring 2003-12-07 21:44:14 +00:00
Sam Leffler
72b9c8c9fd workaround LOR in rt_setgate
Reviewed by:	andre
Approved by:	re (rwatson)
2003-11-25 19:52:12 +00:00
Andre Oppermann
26d02ca7ba Remove RTF_PRCLONING from routing table and adjust users of it
accordingly.  The define is left intact for ABI compatibility
with userland.

This is a pre-step for the introduction of tcp_hostcache.  The
network stack remains fully useable with this change.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 19:47:31 +00:00
Sam Leffler
7138d65c3f replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREF
macros that expand to include assertions when the system is built
with INVARIANTS

Supported by:	FreeBSD Foundation
2003-11-08 23:36:32 +00:00
Sam Leffler
9c63e9dbd7 Overhaul routing table entry cleanup by introducing a new rtexpunge
routine that takes a locked routing table reference and removes all
references to the entry in the various data structures. This
eliminates instances of recursive locking and also closes races
where the lock on the entry had to be dropped prior to calling
rtrequest(RTM_DELETE).  This also cleans up confusion where the
caller held a reference to an entry that might have been reclaimed
(and in some cases used that reference).

Supported by:	FreeBSD Foundation
2003-10-30 23:02:51 +00:00
Sam Leffler
319de71e19 avoid recursive lock panic by unlocking before calling rtrequest;
this is consistent with other places but will be replaced
shortly by a "proper fix"

Supported by:	FreeBSD Foundation
Pain felt by:	Jiri Mikulas
2003-10-29 23:01:37 +00:00
Sam Leffler
ea04521020 Correct handling of cloning loop avoidance: rtalloc1 may return a null
pointer in which case we should not do the unlock.

Supported by:	FreeBSD Foundatin
2003-10-16 16:17:17 +00:00
Sam Leffler
3299a156c7 fix braino: null the pointer who's memory we just free'd, not some other
pointers that are (potentially) used later
2003-10-11 04:48:35 +00:00
Sam Leffler
3e6a836eea insure local variable is initialized prior to use 2003-10-07 16:56:35 +00:00
Sam Leffler
4de5d90c8e fix typo that caused a panic when processing an ICMP redirect
Sponsored by:	FreeBSD Foundation
2003-10-05 19:05:53 +00:00
Sam Leffler
d1dd20be6e Locking for updates to routing table entries. Each rtentry gets a mutex
that covers updates to the contents.  Note this is separate from holding
a reference and/or locking the routing table itself.

Other/related changes:

o rtredirect loses the final parameter by which an rtentry reference
  may be returned; this was never used and added unwarranted complexity
  for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
  we assume the parent will remain as long as the clone; doing this avoids
  a circularity in locking during delete
o convert some timeouts to MPSAFE callouts

Notes:

1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
   applications cannot/do-no know about mutex's.  Doing this requires
   that the mutex be the last element in the structure.  A better solution
   is to introduce an externalized version of struct rtentry but this is
   a major task because of the intertwining of rtentry and other data
   structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
   work to eliminate many held references.  If not these will be resolved
   prior to release.
3. ATM changes are untested.

Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS (partly)
2003-10-04 03:44:50 +00:00
Sam Leffler
becc44d76c cleanups prior to adding locking (and in some cases to eliminate locking):
o move route_cb to be private to rtsock.c
o replace global static route_proto by locals
o eliminate global #define shorthands for info references
o remove some register decls
o ansi-fy function decls
o move items to be close in scope to their usage
o add rt_dispatch function for dispatching the actual message
o cleanup tangled logic for doing all-but-me msg send

Support by:	FreeBSD Foundation
2003-10-03 18:15:54 +00:00
Jeffrey Hsu
983985c11e No need to unlock if error detected before locking.
Submitted by:	harti
2003-04-13 06:21:02 +00:00
Matthew N. Dodd
7f760c4890 Reduce code duplication. This adds the function rt_check() to route.c.
Approved by:	 sam (in principle)
2003-03-02 21:34:37 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Ruslan Ermilov
94e013f0e6 I'm not sure what was the problem at the time of revision 1.37
when julian@ added it, but the commented out code had at least
one bug -- not freeing the allocated mbuf.

Anyway, this comment no longer applies as of revision 1.67, so
remove it.
2002-12-25 10:55:44 +00:00
Ruslan Ermilov
42e9e16d2b Revision 1.67 changes correspond to CSRG revision 8.3.1.1 changes. 2002-12-25 10:50:08 +00:00
Ruslan Ermilov
71eba91593 If the caller of rtrequest*(RTM_DELETE, ...) asked for a copy of
the entry being removed (ret_nrt != NULL), increment the entry's
rt_refcnt like we do it for RTM_ADD and RTM_RESOLVE, rather than
messing around with 1->0 transitions for rtfree() all over.
2002-12-25 10:21:02 +00:00
Jeffrey Hsu
956b0b653c SMP locking for radix nodes. 2002-12-24 03:03:39 +00:00
Ruslan Ermilov
36fea5de60 rn_walktree*() compute the next leaf before applying a function
to current leaves because function may vanish the current node.

If parent RTA_GENMASK route has a clone (a "cloning clone"), an
rn_walktree_from() starting from parent will cause another walk
starting from clone.  If a function is either rt_fixdelete() or
rt_fixchange(), this recursive walk may vanish the leaf that is
remembered by an outer walk (the "next leaf" above), panicing a
system when it resumes with an outer walk.

The following script paniced my single-user mode booted system:

: sysctl net.inet.ip.forwarding=1
: ipfw add 1 allow ip from any to any
: ifconfig lo0 127.1
: route add -net 10 -genmask 255.255.255.0 127.1
: telnet 10.1			# rt_fixchange() panic
: telnet 10.2
: telnet 10.1
: route delete -net 10		# rt_fixdelete() panic

For the time being, avoid these races by disallowing recursive
walks in rt_fixchange() and rt_fixdelete().

Also, make a slight optimization in the rtrequest(RTM_RESOLVE)
case: there is no reason to call rt_fixchange() in this case.

PR:		kern/37606
MFC after:	5 days
2002-12-23 13:12:41 +00:00
Jeffrey Hsu
19fc74fb60 Lock up ifaddr reference counts. 2002-12-18 11:46:59 +00:00
Luigi Rizzo
bbb4330b61 Massive cleanup of the ip_mroute code.
No functional changes, but:

  + the mrouting module now should behave the same as the compiled-in
    version (it did not before, some of the rsvp code was not loaded
    properly);
  + netinet/ip_mroute.c is now truly optional;
  + removed some redundant/unused code;
  + changed many instances of '0' to NULL and INADDR_ANY as appropriate;
  + removed several static variables to make the code more SMP-friendly;
  + fixed some minor bugs in the mrouting code (mostly, incorrect return
    values from functions).

This commit is also a prerequisite to the addition of support for PIM,
which i would like to put in before DP2 (it does not change any of
the existing APIs, anyways).

Note, in the process we found out that some device drivers fail to
properly handle changes in IFF_ALLMULTI, leading to interesting
behaviour when a multicast router is started. This bug is not
corrected by this commit, and will be fixed with a separate commit.

Detailed changes:
--------------------
netinet/ip_mroute.c     all the above.
conf/files              make ip_mroute.c optional
net/route.c             fix mrt_ioctl hook
netinet/ip_input.c      fix ip_mforward hook, move rsvp_input() here
                        together with other rsvp code, and a couple
                        of indentation fixes.
netinet/ip_output.c     fix ip_mforward and ip_mcast_src hooks
netinet/ip_var.h        rsvp function hooks
netinet/raw_ip.c        hooks for mrouting and rsvp functions, plus
                        interface cleanup.
netinet/ip_mroute.h     remove an unused and optional field from a struct

Most of the code is from Pavlin Radoslavov and the XORP project

Reviewed by: sam
MFC after: 1 week
2002-11-15 22:53:53 +00:00
Mike Silbersack
54e84abb59 Ensure that packet counts are always reset to 0 when
a route is cloned.  Previously, they took on the count
of their parent route (which was sometimes nonzero.)

Submitted by:	Andre Oppermann <oppermann@pipeline.ch>
MFC after:	5 days
2002-05-31 04:27:51 +00:00
Alfred Perlstein
929ddbbb89 Remove __P. 2002-03-19 21:54:18 +00:00