2021 Commits

Author SHA1 Message Date
andre
7e338f7bd0 Optimize IP fastforwarding some more:
o New function ip_findroute() to reduce code duplication for the
  route lookup cases. (luigi)

o Store ip_len in host byte order on the stack instead of using
  it via indirection from the mbuf.  This allows to defer the host
  byte conversion to a later point and makes a quicker fallback to
  normal ip_input() processing. (luigi)

o Check if route is dampned with RTF_REJECT flag and drop packet
  already here when ARP is unable to resolve destination address.
  An ICMP unreachable is sent to inform the sender.

o Check if interface output queue is full and drop packet already
  here.  No ICMP notification is sent because signalling source quench
  is depreciated.

o Check if media_state is down (used for ethernet type interfaces)
  and drop the packet already here.  An ICMP unreachable is sent to
  inform the sender.

o Do not account sent packets to the interface address counters.  They
  are only for packets with that 'ia' as source address.

o Update and clarify some comments.

Submitted by:	luigi (most of it)
2004-05-03 13:52:47 +00:00
darrenr
a50f040209 Rename m_claim_next_hop() to m_claim_next(), as suggested by Max Laier. 2004-05-02 15:10:17 +00:00
darrenr
4e8ce3156b oops, I forgot this file in a prior commit (change was still sitting here,
uncommitted):

Rename ip_claim_next_hop() to m_claim_next_hop(), give it an extra arg
(the type of tag to claim) and push it out of ip_var.h into mbuf.h
alongside all of the other macros that work ok mbuf's and tag's.
2004-05-02 15:07:37 +00:00
darrenr
8f62dbebe1 Rename ip_claim_next_hop() to m_claim_next_hop(), give it an extra arg
(the type of tag to claim) and push it out of ip_var.h into mbuf.h alongside
all of the other macros that work ok mbuf's and tag's.
2004-05-02 06:36:30 +00:00
bmilekic
6bbcc9da29 Give jail(8) the feature to allow raw sockets from within a
jail, which is less restrictive but allows for more flexible
jail usage (for those who are willing to make the sacrifice).
The default is off, but allowing raw sockets within jails can
now be accomplished by tuning security.jail.allow_raw_sockets
to 1.

Turning this on will allow you to use things like ping(8)
or traceroute(8) from within a jail.

The patch being committed is not identical to the patch
in the PR.  The committed version is more friendly to
APIs which pjd is working on, so it should integrate
into his work quite nicely.  This change has also been
presented and addressed on the freebsd-hackers mailing
list.

Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
PR: kern/65800
2004-04-26 19:46:52 +00:00
silby
051b00be73 Tighten up reset handling in order to make reset attacks as difficult as
possible while maintaining compatibility with the widest range of TCP stacks.

The algorithm is as follows:

---
For connections in the ESTABLISHED state, only resets with
sequence numbers exactly matching last_ack_sent will cause a reset,
all other segments will be silently dropped.

For connections in all other states, a reset anywhere in the window
will cause the connection to be reset.  All other segments will be
silently dropped.
---

The necessity of accepting all in-window resets was discovered
by jayanth and jlemon, both of whom have seen TCP stacks that
will respond to FIN-ACK packets with resets not meeting the
strict last_ack_sent check.

Idea by:        Darren Reed
Reviewed by:    truckman, jlemon, others(?)
2004-04-26 02:56:31 +00:00
luigi
53bd42643d Another small set of changes to reduce diffs with the new arp code. 2004-04-25 15:00:17 +00:00
luigi
93066eb95b remove a stale comment on the behaviour of arpresolve 2004-04-25 14:06:23 +00:00
luigi
131ad9c351 Start the arp timer at init time.
It runs so rarely that it makes no sense to wait until the first request.
2004-04-25 12:50:14 +00:00
luigi
59063f7a08 This commit does two things:
1. rt_check() cleanup:
    rt_check() is only necessary for some address families to gain access
    to the corresponding arp entry, so call it only in/near the *resolve()
    routines where it is actually used -- at the moment this is
    arpresolve(), nd6_storelladdr() (the call is embedded here),
    and atmresolve() (the call is just before atmresolve to reduce
    the number of changes).
    This change will make it a lot easier to decouple the arp table
    from the routing table.

    There is an extra call to rt_check() in if_iso88025subr.c to
    determine the routing info length. I have left it alone for
    the time being.

    The interface of arpresolve() and nd6_storelladdr() now changes slightly:
     + the 'rtentry' parameter (really a hint from the upper level layer)
       is now passed unchanged from *_output(), so it becomes the route
       to the final destination and not to the gateway.
     + the routines will return 0 if resolution is possible, non-zero
       otherwise.
     + arpresolve() returns EWOULDBLOCK in case the mbuf is being held
       waiting for an arp reply -- in this case the error code is masked
       in the caller so the upper layer protocol will not see a failure.

2. arpcom untangling
    Where possible, use 'struct ifnet' instead of 'struct arpcom' variables,
    and use the IFP2AC macro to access arpcom fields.
    This mostly affects the netatalk code.

=== Detailed changes: ===
net/if_arcsubr.c
   rt_check() cleanup, remove a useless variable

net/if_atmsubr.c
   rt_check() cleanup

net/if_ethersubr.c
   rt_check() cleanup, arpcom untangling

net/if_fddisubr.c
   rt_check() cleanup, arpcom untangling

net/if_iso88025subr.c
   rt_check() cleanup

netatalk/aarp.c
   arpcom untangling, remove a block of duplicated code

netatalk/at_extern.h
   arpcom untangling

netinet/if_ether.c
   rt_check() cleanup (change arpresolve)

netinet6/nd6.c
   rt_check() cleanup (change nd6_storelladdr)
2004-04-25 09:24:52 +00:00
silby
67d79f0cfb Wrap two long lines in the previous commit. 2004-04-23 23:29:49 +00:00
andre
7357a88fdb Correct an edge case in tcp_mss() where the cached path MTU
from tcp_hostcache would have overridden a (now) lower MTU of
an interface or route that changed since first PMTU discovery.
The bug would have caused TCP to redo the PMTU discovery when
not strictly necessary.

Make a comment about already pre-initialized default values
more clear.

Reviewed by:	sam
2004-04-23 22:44:59 +00:00
andre
d4f49f008f Add the option versrcreach to verify that a valid route to the
source address of a packet exists in the routing table.  The
default route is ignored because it would match everything and
render the check pointless.

This option is very useful for routers with a complete view of
the Internet (BGP) in the routing table to reject packets with
spoofed or unrouteable source addresses.

Example:

 ipfw add 1000 deny ip from any to any not versrcreach

also known in Cisco-speak as:

  ip verify unicast source reachable-via any

Reviewed by:	luigi
2004-04-23 14:28:38 +00:00
andre
e8723e5528 Fix a potential race when purging expired hostcache entries.
Spotted by:	luigi
2004-04-23 13:54:28 +00:00
silby
b4c0a798a2 Take out an unneeded variable I forgot to remove in the last commit,
and make two small whitespace fixes so that diffs vs rev 1.142 are minimal.
2004-04-22 08:34:55 +00:00
silby
760a7deec6 Simplify random port allocation, and add net.inet.ip.portrange.randomized,
which can be used to turn off randomized port allocation if so desired.

Requested by:	alfred
2004-04-22 08:32:14 +00:00
bms
8fb0962eb0 Fix a typo in a comment. 2004-04-20 19:04:24 +00:00
silby
f0d28bbf0c Switch from using sequential to random ephemeral port allocation,
implementation taken directly from OpenBSD.

I've resisted committing this for quite some time because of concern over
TIME_WAIT recycling breakage (sequential allocation ensures that there is a
long time before ports are recycled), but recent testing has shown me that
my fears were unwarranted.
2004-04-20 06:45:10 +00:00
silby
743d110741 Enhance our RFC1948 implementation to perform better in some pathlogical
TIME_WAIT recycling cases I was able to generate with http testing tools.

In short, as the old algorithm relied on ticks to create the time offset
component of an ISN, two connections with the exact same host, port pair
that were generated between timer ticks would have the exact same sequence
number.  As a result, the second connection would fail to pass the TIME_WAIT
check on the server side, and the SYN would never be acknowledged.

I've "fixed" this by adding random positive increments to the time component
between clock ticks so that ISNs will *always* be increasing, no matter how
quickly the port is recycled.

Except in such contrived benchmarking situations, this problem should never
come up in normal usage...  until networks get faster.

No MFC planned, 4.x is missing other optimizations that are needed to even
create the situation in which such quick port recycling will occur.
2004-04-20 06:33:39 +00:00
luigi
eb6b02962a Replace Bcopy with 'the real thing' as in the rest of the file. 2004-04-18 11:45:49 +00:00
luigi
94049d0810 In an effort to simplify the routing code, try to deprecate rtalloc()
in favour of rtalloc_ign(), which is what would end up being called
anyways.

There are 25 more instances of rtalloc() in net*/ and
about 10 instances of rtalloc_ign()
2004-04-14 01:13:14 +00:00
imp
b49b7fe799 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
ru
a6980b04fc Fixed a bug in previous revision: compute the payload checksum before
we convert ip_len into a network byte order; in_delayed_cksum() still
expects it in host byte order.

The symtom was the ``in_cksum_skip: out of data by %d'' complaints
from the kernel.

To add to the previous commit log.  These fixes make tcpdump(1) happy
by not complaining about UDP/TCP checksum being bad for looped back
IP multicast when multicast router is deactivated.

Reported by:	Vsevolod Lobko
2004-04-07 10:01:39 +00:00
bde
014c3b4511 Fixed misspelling of IPPORT_MAX as USHRT_MAX. Don't include <sys/limits.h>
to implement this mistake.

Fixed some nearby style bugs (initialization in declaration, misformatting
of this initialization, missing blank line after the declaration, and
comparision of the non-boolean result of the initialization with 0 using
"!".  In KNF, "!" is not even used to compare booleans with 0).
2004-04-06 10:59:11 +00:00
rwatson
e391576e63 Two missed in previous commit -- compare pointer with NULL rather than
using it as a boolean.
2004-04-05 00:52:05 +00:00
rwatson
bdbf43dbba Prefer NULL to 0 when checking pointer values as integers or booleans. 2004-04-05 00:49:07 +00:00
pjd
2e6142d4d9 Fix a panic possibility caused by returning without releasing locks.
It was fixed by moving problemetic checks, as well as checks that
doesn't need locking before locks are acquired.

Submitted by:		Ryan Sommers <ryans@gamersimpact.com>
In co-operation with:	cperciva, maxim, mlaier, sam
Tested by:		submitter (previous patch), me (current patch)
Reviewed by:		cperciva, mlaier (previous patch), sam (current patch)
Approved by:		sam
Dedicated to:		enough!
2004-04-04 20:14:55 +00:00
luigi
c54de1f76f + arpresolve(): remove an unused argument
+ struct ifnet: remove unused fields, move ipv6-related field close
  to each other, add a pointer to l3<->l2 translation tables (arp,nd6,
  etc.) for future use.

+ struct route: remove an unused field, move close to each
  other some fields that might likely go away in the future
2004-04-04 06:14:55 +00:00
deischen
a883ccc958 Unbreak natd.
Reported and submitted by:	Sean McNeil (sean at mcneil.com)
2004-04-02 17:57:57 +00:00
des
38842c29ce Raise WARNS level to 2. 2004-03-31 21:33:55 +00:00
des
2209468b0e Deal with aliasing warnings.
Reviewed by:	ru
Approved by:	silence on the lists
2004-03-31 21:32:58 +00:00
rwatson
8525af93ba Invert the logic of NET_LOCK_GIANT(), and remove the one reference to it.
Previously, Giant would be grabbed at entry to the IP local delivery code
when debug.mpsafenet was set to true, as that implied Giant wouldn't be
grabbed in the driver path.  Now, we will use this primitive to
conditionally grab Giant in the event the entire network stack isn't
running MPSAFE (debug.mpsafenet == 0).
2004-03-28 23:12:19 +00:00
pjd
c83007d58a Remove unused argument. 2004-03-28 15:48:00 +00:00
pjd
49554d1bd8 Reduce 'td' argument to 'cred' (struct ucred) argument in those functions:
- in_pcbbind(),
	- in_pcbbind_setup(),
	- in_pcbconnect(),
	- in_pcbconnect_setup(),
	- in6_pcbbind(),
	- in6_pcbconnect(),
	- in6_pcbsetport().
"It should simplify/clarify things a great deal." --rwatson

Requested by:	rwatson
Reviewed by:	rwatson, ume
2004-03-27 21:05:46 +00:00
pjd
02bc133779 Remove unused argument.
Reviewed by:	ume
2004-03-27 20:41:32 +00:00
ume
11f479f519 Validate IPv6 socket options more carefully to avoid a panic.
PR:		kern/61513
Reviewed by:	cperciva, nectar
2004-03-26 19:52:18 +00:00
pjd
89f5b6c374 Remove unused function.
It was used in FreeBSD 4.x, but now we're using cr_canseesocket().
2004-03-25 15:12:12 +00:00
ru
869b51c6d0 Untangle IP multicast routing interaction with delayed payload checksums.
Compute the payload checksum for a locally originated IP multicast where
God intended, in ip_mloopback(), rather than doing it in ip_output() and
only when multicast router is active.  This is more correct as we do not
fool ip_input() that the packet has the correct payload checksum when in
fact it does not (when multicast router is inactive).  This is also more
efficient if we don't join the multicast group we send to, thus allowing
the hardware to checksum the payload.
2004-03-25 08:46:27 +00:00
rwatson
aaf338640e Lock down global variables in if_gre:
- Add gre_mtx to protect global softc list.
- Hold gre_mtx over various list operations (insert, delete).
- Centralize if_gre interface teardown in gre_destroy(), and call this
  from modevent unload and gre_clone_destroy().
- Export gre_mtx to ip_gre.c, which walks the gre list to look up gre
  interfaces during encapsulation.  Add a wonking comment on how we need
  some sort of drain/reference count mechanism to keep gre references
  alive while in use and simultaneous destroy.

This commit does not lockdown softc data, which follows in a future
commit.
2004-03-22 16:04:43 +00:00
mdodd
97af430ce1 - Fix indentation lost by 'diff -b'.
- Un-wrap short line.
2004-03-21 18:51:26 +00:00
mdodd
3092080960 Remove interface type specific code from arprequest(), and in_arpinput().
The AF_ARP case in the (*if_output)() routine will handle the interface type
specific bits.

Obtained from:	NetBSD
2004-03-21 06:36:05 +00:00
des
3cb81148d8 Run through indent(1) so I can read the code without getting a headache.
The result isn't quite knf, but it's knfer than the original, and far
more consistent.
2004-03-16 21:30:41 +00:00
mdodd
90eb7f448e De-register. 2004-03-14 00:44:11 +00:00
rwatson
fea945fb73 Lock down IP-layer encapsulation library:
- Add encapmtx to protect ip_encap.c global variables (encapsulation
   list).
 - Unifdef #ifdef 0 pieces of encap_init() which was (and now really
   is) basically a no-op.
 - Lock encapmtx when walking encaptab, modifying it, comparing
   entries, etc.
 - Remove spl's.

Note that currently there's no facilite to make sure outstanding
use of encapsulation methods on a table entry have drained bfore
we allow a table entry to be removed.  As such, it's currently the
caller's responsibility to make sure that draining takes place.

Reviewed by:	mlaier
2004-03-10 02:48:50 +00:00
rwatson
7866f6e355 Scrub unused variable zeroin_addr. 2004-03-10 01:01:04 +00:00
hsu
01e90a548b To comply with the spec, do not copy the TOS from the outer IP
header to the inner IP header of the PIM Register if this is a PIM
Null-Register message.

Submitted by:	Pavlin Radoslavov <pavlin@icir.org>
2004-03-08 07:47:27 +00:00
hsu
ef9b350961 Include <sys/types.h> for autoconf/automake detection.
Submitted by:	Pavlin Radoslavov <pavlin@icir.org>
2004-03-08 07:45:32 +00:00
mlaier
206b5dcf52 Add some missing DUMMYNET_UNLOCK() in config_pipe().
Noticed by:	Simon Coggins
Approved by:	bms(mentor)
2004-03-03 01:33:22 +00:00
mlaier
a32329ae33 Two minor follow-ups on the MT_TAG removal:
ifp is now passed explicitly to ether_demux; no need to look it up again.
Make mtag a global var in ip_input.

Noticed by:	rwatson
Approved by:	bms(mentor)
2004-03-02 14:37:23 +00:00
rwatson
8da3c338df Rename NET_PICKUP_GIANT() to NET_LOCK_GIANT(), and NET_DROP_GIANT()
to NET_UNLOCK_GIANT().  While they are used in similar ways, the
semantics are quite different -- NET_LOCK_GIANT() and NET_UNLOCK_GIANT()
directly wrap mutex lock and unlock operations, whereas drop/pickup
special case the handling of Giant recursion.  Add a comment saying
as much.

Add NET_ASSERT_GIANT(), which conditionally asserts Giant based
on the value of debug_mpsafenet.
2004-03-01 22:37:01 +00:00