Commit Graph

1701 Commits

Author SHA1 Message Date
Dag-Erling Smørgrav
521f364b80 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
Jonathan Lemon
272c5dfe93 In timewait state, if the incoming segment is a pure in-sequence ack
that matches snd_max, then do not respond with an ack, just drop the
segment.  This fixes a problem where a simultaneous close results in
an ack loop between two time-wait states.

Test case supplied by: Tim Robbins <tjr@FreeBSD.ORG>
Sponsored by: DARPA, NAI Labs
2003-02-26 18:20:41 +00:00
Jonathan Lemon
ef6b48deb9 The TCP protocol lock may still be held if the reassembly queue dropped FIN.
Detect this case and drop the lock accordingly.

Sponsored by: DARPA, NAI Labs
2003-02-26 13:55:13 +00:00
Mike Silbersack
a75a485d62 Fix a condition so that ip reassembly queues are emptied immediately
when maxfragpackets is dropped to 0.

Noticed by:	bmah
2003-02-26 07:28:35 +00:00
Robert Watson
9327ee33bf When generating a TCP response to a connection, not only test if the
tcpcb is NULL, but also its connected inpcb, since we now allow
elements of a TCP connection to hang around after other state, such
as the socket, has been recycled.

Tested by:	dcs
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-02-25 14:08:41 +00:00
Maxim Konovalov
b36f5b3735 style(9): join lines. 2003-02-25 11:53:11 +00:00
Maxim Konovalov
99e8617d24 Ip reassembly queue structure has ipq_nfrags now. Count a number of
dropped ip fragments precisely.

Reviewed by:	silby
2003-02-25 11:49:01 +00:00
Jeffrey Hsu
edf02ff15d Hold the TCP protocol lock while modifying the connection hash table. 2003-02-25 01:32:03 +00:00
Mike Silbersack
af9c7d06d5 Fix a comment which didn't match the new cookie behavior.
Submitted by:	Scott Renfro <scott@renfro.org>
MFC after:	1 day
2003-02-24 03:15:48 +00:00
Jeffrey Hsu
11a20fb8b6 tcp_twstart() need to be called with the TCP protocol lock held to avoid
a race condition with the TCP timer routines.
2003-02-24 00:52:03 +00:00
Jeffrey Hsu
2fbef91887 Pass the right function to callout_reset() for a compressed
TIME-WAIT control block.
2003-02-24 00:48:12 +00:00
Mike Silbersack
a432399c56 Improve the security and performance of syncookies:
Security improvements:
- Increase the size of each syncookie secret from 32 to 128 bits
  in order to make brute force attacks on the secrets much more
  difficult.
- Always return the lowest order dword from the MD5 hash; this
  allows us to expose 2 more bits of the cookie and makes ACK
  floods which seek to guess the cookie value more difficult.

Performance improvements:
- Increase the lifetime of each syncookie from 4 seconds to 16
  seconds.  This increases the usefulness of syncookies during
  an attack.
- From Yahoo!: Reduce the number of calls to MD5Update; this
  results in a ~17% increase in cookie generation time here.

Reviewed by:	hsu, jayanth, jlemon, nectar
MFC After:	15 seconds
2003-02-23 19:04:23 +00:00
Jonathan Lemon
f243998be5 Yesterday just wasn't my day. Remove testing delta that crept into the diff.
Pointy hat provided by: sam
2003-02-23 15:40:36 +00:00
Sam Leffler
14dd6717f8 Add a new config option IPSEC_FILTERGIF to control whether or not
packets coming out of a GIF tunnel are re-processed by ipfw, et. al.
By default they are not reprocessed.  With the option they are.

This reverts 1.214.  Prior to that change packets were not re-processed.
After they were which caused problems because packets do not have
distinguishing characteristics (like a special network if) that allows
them to be filtered specially.

This is really a stopgap measure designed for immediate MFC so that
4.8 has consistent handling to what was in 4.7.

PR:		48159
Reviewed by:	Guido van Rooij <guido@gvr.org>
MFC after:	1 day
2003-02-23 00:47:06 +00:00
Jonathan Lemon
a14c749f04 Check to see if the TF_DELACK flag is set before returning from
tcp_input().  This unbreaks delack handling, while still preserving
correct T/TCP behavior

Tested by: maxim
Sponsored by: DARPA, NAI Labs
2003-02-22 21:54:57 +00:00
Mike Silbersack
375386e284 Add the ability to limit the number of IP fragments allowed per packet,
and enable it by default, with a limit of 16.

At the same time, tweak maxfragpackets downward so that in the worst
possible case, IP reassembly can use only 1/2 of all mbuf clusters.

MFC after: 	3 days
Reviewed by:	hsu
Liked by:	bmah
2003-02-22 06:41:47 +00:00
Poul-Henning Kamp
d25ecb917b - m = m_gethdr(M_NOWAIT, MT_HEADER);
+       m = m_gethdr(M_DONTWAIT, MT_HEADER);

'nuff said.
2003-02-21 23:17:12 +00:00
Crist J. Clark
b0d226932e The ancient and outdated concept of "privileged ports" in UNIX-type
OSes has probably caused more problems than it ever solved. Allow the
user to retire the old behavior by specifying their own privileged
range with,

  net.inet.ip.portrange.reservedhigh  default = IPPORT_RESERVED - 1
  net.inet.ip.portrange.reservedlo    default = 0

Now you can run that webserver without ever needing root at all. Or
just imagine, an ftpd that can really drop privileges, rather than
just set the euid, and still do PORT data transfers from 20/tcp.

Two edge cases to note,

  # sysctl net.inet.ip.portrange.reservedhigh=0

Opens all ports to everyone, and,

  # sysctl net.inet.ip.portrange.reservedhigh=65535

Locks all network activity to root only (which could actually have
been achieved before with ipfw(8), but is somewhat more
complicated).

For those who stick to the old religion that 0-1023 belong to root and
root alone, don't touch the knobs (or even lock them by raising
securelevel(8)), and nothing changes.
2003-02-21 05:28:27 +00:00
Jonathan Lemon
8608c4c1f9 Remove unused variables in the IPSEC case.
Submitted by:  Lars Eggert <larse@ISI.EDU>
2003-02-20 18:22:21 +00:00
Jonathan Lemon
ffae8c5a7e Unbreak non-IPV6 compilation.
Caught by: phk
Sponsored by: DARPA, NAI Labs
2003-02-19 23:43:04 +00:00
Jonathan Lemon
340c35de6a Add a TCP TIMEWAIT state which uses less space than a fullblown TCP
control block.  Allow the socket and tcpcb structures to be freed
earlier than inpcb.  Update code to understand an inp w/o a socket.

Reviewed by: hsu, silby, jayanth
Sponsored by: DARPA, NAI Labs
2003-02-19 22:32:43 +00:00
Jonathan Lemon
7990938421 Convert tcp_fillheaders(tp, ...) -> tcpip_fillheaders(inp, ...) so the
routine does not require a tcpcb to operate.  Since we no longer keep
template mbufs around, move pseudo checksum out of this routine, and
merge it with the length update.

Sponsored by: DARPA, NAI Labs
2003-02-19 22:18:06 +00:00
Jonathan Lemon
414462252a Correct comments. 2003-02-19 21:33:46 +00:00
Jonathan Lemon
3bfd6421c2 Clean up delayed acks and T/TCP interactions:
- delay acks for T/TCP regardless of delack setting
   - fix bug where a single pass through tcp_input might not delay acks
   - use callout_active() instead of callout_pending()

Sponsored by: DARPA, NAI Labs
2003-02-19 21:18:23 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Maxim Konovalov
b52d5ea3d2 o Fix ipfw uid rules: socheckuid() returns 0 when uid matches a socket
cr_uid.

Note: we do not have socheckuid() in RELENG_4, ip_fw2.c uses its
own macro for a similar purpose that is why ipfw2 in RELENG_4 processes
uid rules correctly. I will MFC the diff for code consistency.

Reported by:	Oleg Baranov <ol@csa.ru>
Reviewed by:	luigi
MFC after:	1 month
2003-02-17 13:39:57 +00:00
Jeffrey Hsu
4b40c56c28 Take advantage of pre-existing lock-free synchronization and type stable memory
to avoid acquiring SMP locks during expensive copyout process.
2003-02-15 02:37:57 +00:00
Jeffrey Hsu
85e8b24343 The protocol lock is always held in the dropafterack case, so we don't
need to check for it at runtime.
2003-02-13 22:14:22 +00:00
Jeffrey Hsu
3dc7ebf9ff in_pcbnotifyall() requires an exclusive protocol lock for notify functions
which modify the connection list, namely, tcp_notify().
2003-02-12 23:55:07 +00:00
Jeffrey Hsu
6d45d64a8f Properly document that syncache timer processing requires an
exclusive TCP protocol lock.
2003-02-12 00:42:12 +00:00
Seigo Tanimura
cd6c2a8874 s/IPSSEC/IPSEC/ 2003-02-11 10:51:56 +00:00
Jeffrey Hsu
24652ff6e1 Get cosmetic changes out of the way before I add routing table SMP locks. 2003-02-10 22:01:34 +00:00
Orion Hodson
022695f82a Avoid multiply for preemptive arp calculation since it hits every
ethernet packet sent.

Prompted by: Jeffrey Hsu <hsu@FreeBSD.org>
2003-02-08 15:05:15 +00:00
Orion Hodson
73224fb019 MFS 1.64.2.22: Re-enable non pre-emptive ARP requests.
Submitted by: "Diomidis Spinellis" <dds@aueb.gr>
PR:           kern/46116
2003-02-04 05:28:08 +00:00
Crist J. Clark
39eb27a4a9 Add the TCP flags to the log message whenever log_in_vain is 1, not
just when set to 2.

PR:		kern/43348
MFC after:	5 days
2003-02-02 22:06:56 +00:00
Mike Silbersack
ecf44c01f4 Move a comment and optimize the frag timeout code a slight bit.
Submitted by:	maxim
MFC with:	The previous two revisions
2003-02-01 05:59:51 +00:00
Sam Leffler
9359ad861e FAST_IPSEC bandaid: act like KAME and ignore ENOENT error codes from
ipsec4_process_packet; they happen when a packet is dropped because
an SA acquire is initiated

Submitted by:	Doug Ambrisko <ambrisko@verniernetworks.com>
2003-01-30 05:45:45 +00:00
Sam Leffler
28a34902c4 remove the restriction on build a kernel with FAST_IPSEC and INET6;
you still don't want to use the two together, but it's ok to have
them in the same kernel (the problem that initiated this bandaid
has long since been fixed)
2003-01-30 05:43:08 +00:00
Mike Silbersack
d4d5315c23 Fix a bug with syncookies; previously, the syncache's MSS size was not
initialized until after a syncookie was generated.  As a result,
all connections resulting from a returned cookie would end up using
a MSS of ~512 bytes.  Now larger packets will be used where possible.

MFC after:	5 days
2003-01-29 03:49:49 +00:00
Poul-Henning Kamp
4ee6e70ef3 Check bounds for index before dereferencing memory past end of array.
Found by:	FlexeLint
2003-01-28 22:44:12 +00:00
Jeffrey Hsu
93f798891a Avoid lock order reversal by expanding the scope of the
AF_INET radix tree lock to cover the ARP data structures.
2003-01-28 20:22:19 +00:00
Mike Silbersack
ac64c8668b A few fixes to rev 1.221
- Honor the previous behavior of maxfragpackets = 0 or -1
- Take a better stab at fragment statistics
- Move / correct a comment

Suggested by:	maxim@
MFC after:	7 days
2003-01-28 03:39:39 +00:00
Mike Silbersack
402062e80c Merge the best parts of maxfragpackets and maxnipq together. (Both
functions implemented approximately the same limits on fragment memory
usage, but in different fashions.)

End user visible changes:
- Fragment reassembly queues are freed in a FIFO manner when maxfragpackets
  has been reached, rather than all reassembly stopping.

MFC after: 	5 days
2003-01-26 01:44:05 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Maxim Konovalov
2adf7582da De-anonymity a couple of messages I missed in a previous sweep.
Move one of them under DEB macro.

Noticed by:	Wiktor Niesiobedzki <w@evip.pl>
2003-01-20 13:03:34 +00:00
Maxim Konovalov
8ec22a9363 If the first action is O_LOG adjust a pointer to the real one, unbreaks
skipto + log rules.

Reported by:	Wiktor Niesiobedzki <w@evip.pl>
MFC after:	1 week
2003-01-20 11:58:34 +00:00
Jeffrey Hsu
314e5a3daf Optimize away call to bzero() in the common case by directly checking
if a connection has any cached TAO information.
2003-01-18 19:03:26 +00:00
Jeffrey Hsu
f5c5746047 Fix long-standing bug predating FreeBSD where calling connect() twice
on a raw ip socket will crash the system with a null-dereference.
2003-01-18 01:10:55 +00:00
Jeffrey Hsu
c996428c32 SMP locking for ARP. 2003-01-17 07:59:35 +00:00
Matthew Dillon
fe41ca530c Introduce the ability to flag a sysctl for operation at secure level 2 or 3
in addition to secure level 1.  The mask supports up to a secure level of 8
but only add defines through CTLFLAG_SECURE3 for now.

As per the missif in the log entry for 1.11 of ip_fw2.c which added the
secure flag to the IPFW sysctl's in the first place, change the secure
level requirement from 1 to 3 now that we have support for it.

Reviewed by:	imp
With Design Suggestions by:	imp
2003-01-14 19:35:33 +00:00
Jeffrey Hsu
cb942153c8 Fix NewReno.
Reviewed by: Tom Henderson <thomas.r.henderson@boeing.com>
2003-01-13 11:01:20 +00:00
Thomas Moestl
a9a7a91220 Clear the target hardware address field when generating an ARP request.
Reviewed by:	nectar
MFC after:	1 week
2003-01-10 00:04:53 +00:00
Jeffrey Hsu
b21bf9a59b Validate inp before de-referencing it.
Submitted by:	pb
2003-01-05 07:56:24 +00:00
Jens Schweikhardt
9d5abbddbf Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.
2003-01-01 18:49:04 +00:00
Sam Leffler
9967cafc49 Correct mbuf packet header propagation. Previously, packet headers
were sometimes propagated using M_COPY_PKTHDR which actually did
something between a "move" and a  "copy" operation.  This is replaced
by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it
from the source mbuf) and m_dup_pkthdr which copies the packet
header contents including any m_tag chain.  This corrects numerous
problems whereby mbuf tags could be lost during packet manipulations.

These changes also introduce arguments to m_tag_copy and m_tag_copy_chain
to specify if the tag copy work should potentially block.  This
introduces an incompatibility with openbsd which we may want to revisit.

Note that move/dup of packet headers does not handle target mbufs
that have a cluster bound to them.  We may want to support this;
for now we watch for it with an assert.

Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG.

Supported by:	Vernier Networks
Reviewed by:	Robert Watson <rwatson@FreeBSD.org>
2002-12-30 20:22:40 +00:00
Matthew Dillon
07fd333df3 Remove the PAWS ack-on-ack debugging printf().
Note that the original RFC 1323 (PAWS) says in 4.2.1 that the out of
order / reverse-time-indexed packet should be acknowledged as specified
in RFC-793 page 69 then dropped.  The original PAWS code in FreeBSD (1994)
simply acknowledged the segment unconditionally, which is incorrect, and
was fixed in 1.183 (2002).  At the moment we do not do checks for SYN or FIN
in addition to (tlen != 0), which may or may not be correct, but the
worst that ought to happen should be a retry by the sender.
2002-12-30 19:31:04 +00:00
Sam Leffler
069f35d328 correct style bogons 2002-12-30 18:45:31 +00:00
Ian Dowse
ed1a13b18f Bridged packets are supplied to the firewall with their IP header
in network byte order, but icmp_error() expects the IP header to
be in host order and the code here did not perform the necessary
swapping for the bridged case. This bug causes an "icmp_error: bad
length" panic when certain length IP packets (e.g. ip_len == 0x100)
are rejected by the firewall with an ICMP response.

MFC after:	3 days
2002-12-27 17:43:25 +00:00
Jeffrey Hsu
abe239cfe2 Validate inp to prevent an use after free. 2002-12-24 21:00:31 +00:00
Maxim Konovalov
f4ef616f98 o De-anonymity dummynet(4) and ipfw(4) messages, prepend them
by 'dummynet: ' and 'ipfw: ' prefixes.

PR:		kern/41609
2002-12-24 13:45:24 +00:00
Jeffrey Hsu
956b0b653c SMP locking for radix nodes. 2002-12-24 03:03:39 +00:00
Pierre Beyssac
1ba7727b9e Remove forgotten INP_UNLOCK(inp) in my previous commit.
Reported by: hsu
2002-12-22 13:04:08 +00:00
Pierre Beyssac
87cd4001b5 In syncache_timer(), don't attempt to lock the inpcb structure
associated with the syncache entry: in case tcp_close() has been
called on the corresponding listening socket, the lock has been
destroyed as a side effect of in_pcbdetach(), causing a panic when
we attempt to lock on it.

Reviewed by:	hsu
2002-12-21 19:59:47 +00:00
Sam Leffler
00f21882a0 replace the special-purpose rate-limiting code with the general facility
just added; this tries to maintain the same behaviour vis a vis printing
the rate-limiting messages but need tweaking
2002-12-21 00:08:20 +00:00
Jeffrey Hsu
9a39fc9d73 Eliminate a goto.
Fix some line breaks.
2002-12-20 11:24:02 +00:00
Jeffrey Hsu
540e8b7e31 Unravel a nested conditional.
Remove an unneeded local variable.
2002-12-20 11:16:52 +00:00
Jeffrey Hsu
f320a1bfd2 Expand scope of TCP protocol lock to cover syncache data structures. 2002-12-20 00:24:19 +00:00
Bosko Milekic
86fea6be59 o Untangle the confusion with the malloc flags {M_WAITOK, M_NOWAIT} and
the mbuf allocator flags {M_TRYWAIT, M_DONTWAIT}.
o Fix a bpf_compat issue where malloc() was defined to just call
  bpf_alloc() and pass the 'canwait' flag(s) along.  It's been changed
  to call bpf_alloc() but pass the corresponding M_TRYWAIT or M_DONTWAIT
  flag (and only one of those two).

Submitted by: Hiten Pandya <hiten@unixdaemons.com> (hiten->commit_count++)
2002-12-19 22:58:27 +00:00
Jeffrey Hsu
19fc74fb60 Lock up ifaddr reference counts. 2002-12-18 11:46:59 +00:00
Poul-Henning Kamp
11aee0b4b0 Remove unused and incorrectly maintained variable "in_interfaces" 2002-12-17 19:30:04 +00:00
Matthew Dillon
967adce8df Fix syntax in last commit. 2002-12-17 00:24:48 +00:00
Maxim Konovalov
616fa7460c o Trim EOL whitespaces.
MFC after:	1 week
2002-12-15 10:24:36 +00:00
Maxim Konovalov
21ef23ab3f o s/if_name[16]/if_name[IFNAMSIZ]/
Reviewed by:	luigi
MFC after:	1 week
2002-12-15 10:23:02 +00:00
Maxim Konovalov
2713a5bebb o M_DONTWAIT is mbuf(9) flag: malloc(M_DONTWAIT) -> malloc(M_NOWAIT).
The bug does not affect anything because M_NOWAIT == M_DONTWAIT.

Reviewed by:	luigi
MFC after:	1 week
2002-12-15 10:21:30 +00:00
Maxim Konovalov
83b75b7621 o Fix byte order logging issue: sa.sin_port is already in host byte order.
PR:		kern/45964
Submitted by:	Sascha Blank <sblank@tiscali.de>
Reviewed by:	luigi
MFC after:	1 week
2002-12-15 09:44:02 +00:00
Matthew Dillon
d7ff8ef62a Change tcp.inflight_min from 1024 to a production default of 6144. Create
a sysctl for the stabilization value for the bandwidth delay product (inflight)
algorithm and document it.

MFC after:	3 days
2002-12-14 21:00:17 +00:00
Matthew Dillon
1ab4789dc2 Bruce forwarded this tidbit from an analysis Van Jacobson did on an
apparent ack-on-ack problem with FreeBSD.  Prof. Jacobson noticed a
case in our TCP stack which would acknowledge a received ack-only packet,
which is not legal in TCP.

Submitted by:	 Van Jacobson <van@packetdesign.com>,
		bmah@packetdesign.com (Bruce A. Mah)
MFC after:	7 days
2002-12-14 07:31:51 +00:00
Maxim Sobolev
16199bf2d3 MFS: recognize gre packets used in the WCCP protocol.
Approved by:	re
2002-12-07 14:22:05 +00:00
Luigi Rizzo
97850a5dd9 Move fw_one_pass from ip_fw2.c to ip_input.c so that neither
bridge.c nor if_ethersubr.c depend on IPFIREWALL.
Restore the use of fw_one_pass in if_ethersubr.c

ipfw.8 will be updated with a separate commit.

Approved by: re
2002-11-20 19:07:27 +00:00
Luigi Rizzo
032dcc7680 Back out some style changes. They are not urgent,
I will put them back in after 5.0 is out.

Requested by: sam
Approved by: re
2002-11-20 19:00:54 +00:00
Luigi Rizzo
b375c9ec2c Back out the ip_fragment() code -- it is not urgent to have it in now,
I will put it back in in a better form after 5.0 is out.

Requested by: sam, rwatson, luigi (on second thought)
Approved by: re
2002-11-20 18:56:25 +00:00
Mike Silbersack
df285b3d1d Add a sysctl to control the generation of source quench packets,
and set it to 0 by default.

Partially obtained from:	NetBSD
Suggested by:	David Gilbert
MFC after:	5 days
2002-11-19 17:06:06 +00:00
Luigi Rizzo
9b77fbf0a2 Fix function headers and remove 'register' variable declarations. 2002-11-17 17:04:19 +00:00
Luigi Rizzo
3e372e140c Move the ip_fragment code from ip_output() to a separate function,
so that it can be reused elsewhere (there is a number of places
where it can be useful). This also trims some 200 lines from
the body of ip_output(), which helps readability a bit.

(This change was discussed a few weeks ago on the mailing lists,
Julian agreed, silence from others. It is not a functional change,
so i expect it to be ok to commit it now but i am happy to back it
out if there are objections).

While at it, fix some function headers and replace m_copy() with
m_copypacket() where applicable.

MFC after: 1 week
2002-11-17 16:30:44 +00:00
Luigi Rizzo
20fab86349 Minor documentation changes and indentation fix.
Replace m_copy() with m_copypacket() where applicable.

While at it, fix some function headers and remove 'register' from
variable declarations.
2002-11-17 16:13:08 +00:00
Luigi Rizzo
4e8fe3210d Cleanup some of the comments, and reformat long lines.
Replace m_copy() with m_copypacket() where applicable.

Replace "if (a.s_addr ...)" with "if (a.s_addr != INADDR_ANY ...)"
to make it clear what the code means.

While at it, fix some function headers and remove 'register' from
variable declarations.

MFC after: 3 days
2002-11-17 16:02:17 +00:00
Luigi Rizzo
bbb4330b61 Massive cleanup of the ip_mroute code.
No functional changes, but:

  + the mrouting module now should behave the same as the compiled-in
    version (it did not before, some of the rsvp code was not loaded
    properly);
  + netinet/ip_mroute.c is now truly optional;
  + removed some redundant/unused code;
  + changed many instances of '0' to NULL and INADDR_ANY as appropriate;
  + removed several static variables to make the code more SMP-friendly;
  + fixed some minor bugs in the mrouting code (mostly, incorrect return
    values from functions).

This commit is also a prerequisite to the addition of support for PIM,
which i would like to put in before DP2 (it does not change any of
the existing APIs, anyways).

Note, in the process we found out that some device drivers fail to
properly handle changes in IFF_ALLMULTI, leading to interesting
behaviour when a multicast router is started. This bug is not
corrected by this commit, and will be fixed with a separate commit.

Detailed changes:
--------------------
netinet/ip_mroute.c     all the above.
conf/files              make ip_mroute.c optional
net/route.c             fix mrt_ioctl hook
netinet/ip_input.c      fix ip_mforward hook, move rsvp_input() here
                        together with other rsvp code, and a couple
                        of indentation fixes.
netinet/ip_output.c     fix ip_mforward and ip_mcast_src hooks
netinet/ip_var.h        rsvp function hooks
netinet/raw_ip.c        hooks for mrouting and rsvp functions, plus
                        interface cleanup.
netinet/ip_mroute.h     remove an unused and optional field from a struct

Most of the code is from Pavlin Radoslavov and the XORP project

Reviewed by: sam
MFC after: 1 week
2002-11-15 22:53:53 +00:00
Sam Leffler
eec3a0b17f track changes to not strip the Ethernet header from input packets
Reviewed by:	many
Approved by:	re
2002-11-14 23:46:04 +00:00
Sam Leffler
ccb2acfe1b track bpf changes
Reviewed by:	many
Approved by:	re
2002-11-14 23:45:13 +00:00
Maxim Konovalov
8ef1565d2b Due to a memory alignment sizeof(struct ipfw_flow_id) is bigger than
ipfw_flow_id structure actual size and bcmp(3) may fail to compare
them properly. Compare members of these structures instead.

PR:		kern/44078
Submitted by:	Oleg Bulyzhin <oleg@rinet.ru>
Reviewed by:	luigi
MFC after:	2 weeks
2002-11-13 11:31:44 +00:00
Jeffrey Hsu
e1e1b6e892 Turn off duplicate lock checking for inp locks because udp_input()
intentionally locks two inp records simultaneously.
2002-11-12 20:44:38 +00:00
Sam Leffler
6f0d017cf4 a better solution to building FAST_IPSEC w/o INET6
Submitted by:	Jeffrey Hsu <hsu@FreeBSD.org>
2002-11-10 17:17:32 +00:00
Alfred Perlstein
29f194457c Fix instances of macros with improperly parenthasized arguments.
Verified by: md5
2002-11-09 12:55:07 +00:00
Sam Leffler
9c0a8ace11 temporarily disallow FAST_IPSEC and INET6 to avoid potential panics;
will correct this before 5.0 release
2002-11-08 23:50:32 +00:00
Sam Leffler
e8539d32f0 FAST_IPSEC fixups:
o fix #ifdef typo
o must use "bounce functions" when dispatched from the protosw table

don't know how this stuff was missed in my testing; must've committed
the wrong bits

Pointy hat:	sam
Submitted by:	"Doug Ambrisko" <ambrisko@verniernetworks.com>
2002-11-08 23:37:50 +00:00
Sam Leffler
58fcadfc0f fixup FAST_IPSEC build w/o INET6 2002-11-08 23:33:59 +00:00
Sam Leffler
ab94ca3cec correct fast ipsec logic: compare destination ip address against the
contents of the SA, not the SP

Submitted by:	"Doug Ambrisko" <ambrisko@verniernetworks.com>
2002-11-08 23:11:02 +00:00
John Baldwin
2d4e26522d Cast a ptrdiff_t to an int to printf. 2002-11-08 14:52:26 +00:00
Jeff Roberson
1645d0903e - Consistently update snd_wl1, snd_wl2, and rcv_up in the header
prediction code.  Previously, 2GB worth of header predicted data
   could leave these variables too far out of sequence which would cause
   problems after receiving a packet that did not match the header
   prediction.

Submitted by:	Bill Baumann <bbaumann@isilon.com>
Sponsored by:	Isilon Systems, Inc.
Reviewed by:	hsu, pete@isilon.com, neal@isilon.com, aaronp@isilon.com
2002-10-31 23:24:13 +00:00
Jeffrey Hsu
30613f5610 Don't need to check if SO_OOBINLINE is defined.
Don't need to protect isipv6 conditional with INET6.
Fix leading indentation in 2 lines.
2002-10-30 08:32:19 +00:00
Bill Fenner
4d3ffc9841 Renumber IPPROTO_DIVERT out of the range of valid IP protocol numbers.
This allows socket() to return an error when the kernel is not built
with IPDIVERT, and doesn't prevent future applications from using the
"borrowed" IP protocol number.  The sysctl net.inet.raw.olddiverterror
controls whether opening a socket with the "borrowed" IP protocol
fails with an accompanying kernel printf; this code should last only a
couple of releases.

Approved by:	re
2002-10-29 16:46:13 +00:00
Maxim Konovalov
a98d88ad3e Lower a priority of "session drop" messages.
Requested by:	Eugene Grosbein <eugen@kuzbass.ru>
MFC after:	3 days
2002-10-29 08:53:14 +00:00
Maxime Henrion
d28e8b3a0d Oops, forgot to commit this file. This is part of the fix
for ipfw2 panics on sparc64.
2002-10-24 22:32:13 +00:00
Maxime Henrion
7c697970f4 Fix ipfw2 panics on 64-bit platforms.
Quoting luigi:

In order to make the userland code fully 64-bit clean it may
be necessary to commit other changes that may or may not cause
a minor change in the ABI.

Reviewed by:	luigi
2002-10-24 18:04:44 +00:00
Luigi Rizzo
18f13da2be src and dst address were erroneously swapped in SRC_SET and DST_SET
commands.  Use the correct one. Also affects ipfw2 in -stable.
2002-10-24 18:01:53 +00:00
Maxime Henrion
56e77afa59 Fix kernel build on sparc64 in the IPDIVERT case. 2002-10-24 09:58:50 +00:00
Ian Dowse
efac726eeb Unbreak the automatic remapping of an INADDR_ANY destination address
to the primary local IP address when doing a TCP connect(). The
tcp_connect() code was relying on in_pcbconnect (actually in_pcbladdr)
modifying the passed-in sockaddr, and I failed to notice this in
the recent change that added in_pcbconnect_setup(). As a result,
tcp_connect() was ending up using the unmodified sockaddr address
instead of the munged version.

There are two cases to handle: if in_pcbconnect_setup() succeeds,
then the PCB has already been updated with the correct destination
address as we pass it pointers to inp_faddr and inp_fport directly.
If in_pcbconnect_setup() fails due to an existing but dead connection,
then copy the destination address from the old connection.
2002-10-24 02:02:34 +00:00
Maxim Konovalov
ba3a9d459c Kill EOL spaces.
Approved by:	luigi
MFC after:	1 week
2002-10-23 10:07:55 +00:00
Maxim Konovalov
6b6874b20c Use syslog for messages about dropped sessions, do not flood a console.
Suggested by:	Eugene Grosbein <eugen@kuzbass.ru>
Approved by:	luigi
MFC after:	1 week
2002-10-23 10:05:19 +00:00
SUZUKI Shinsuke
2754d95d85 fixed a kernel crash by "ifconfig stf0 inet 1.2.3.4"
MFC after:	1 week
2002-10-22 22:50:38 +00:00
Ian Dowse
c557ae16ce Implement a new IP_SENDSRCADDR ancillary message type that permits
a server process bound to a wildcard UDP socket to select the IP
address from which outgoing packets are sent on a per-datagram
basis. When combined with IP_RECVDSTADDR, such a server process can
guarantee to reply to an incoming request using the same source IP
address as the destination IP address of the request, without having
to open one socket per server IP address.

Discussed on:	-net
Approved by:	re
2002-10-21 20:40:02 +00:00
Ian Dowse
90162a4e87 Remove the "temporary connection" hack in udp_output(). In order
to send datagrams from an unconnected socket, we used to first block
input, then connect the socket to the sendmsg/sendto destination,
send the datagram, and finally disconnect the socket and unblock
input.

We now use in_pcbconnect_setup() to check if a connect() would have
succeeded, but we never record the connection in the PCB (local
anonymous port allocation is still recorded, though). The result
from in_pcbconnect_setup() authorises the sending of the datagram
and selects the local address and port to use, so we just construct
the header and call ip_output().

Discussed on:	-net
Approved by:	re
2002-10-21 20:10:05 +00:00
Ian Dowse
5200e00e72 Replace in_pcbladdr() with a more generic inner subroutine for
in_pcbconnect() called in_pcbconnect_setup(). This version performs
all of the functions of in_pcbconnect() except for the final
committing of changes to the PCB. In the case of an EADDRINUSE error
it can also provide to the caller the PCB of the duplicate connection,
avoiding an extra in_pcblookup_hash() lookup in tcp_connect().

This change will allow the "temporary connect" hack in udp_output()
to be removed and is part of the preparation for adding the
IP_SENDSRCADDR control message.

Discussed on:	-net
Approved by:	re
2002-10-21 13:55:50 +00:00
Poul-Henning Kamp
53be11f680 Fix two instances of variant struct definitions in sys/netinet:
Remove the never completed _IP_VHL version, it has not caught on
anywhere and it would make us incompatible with other BSD netstacks
to retain this version.

Add a CTASSERT protecting sizeof(struct ip) == 20.

Don't let the size of struct ipq depend on the IPDIVERT option.

This is a functional no-op commit.

Approved by:	re
2002-10-20 22:52:07 +00:00
Robert Watson
c740509854 When a packet is multicast encapsulated, give labeled policies the
opportunity to preserve the label.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-20 21:59:00 +00:00
Ian Dowse
4b932371f4 Split out most of the logic from in_pcbbind() into a new function
called in_pcbbind_setup() that does everything except commit the
changes to the PCB. There should be no functional change here, but
in_pcbbind_setup() will be used by the soon-to-appear IP_SENDSRCADDR
control message implementation to check or allocate the source
address and port.

Discussed on:	-net
Approved by:	re
2002-10-20 21:44:31 +00:00
Maxime Henrion
d7f4d27a7a Several malloc() calls were passing the M_DONTWAIT flag
which is an mbuf allocation flag.  Use the correct
M_NOWAIT malloc() flag.  Fortunately, both were defined
to 1, so this commit is a no-op.
2002-10-19 11:31:50 +00:00
Hajimu UMEMOTO
b6e2845324 last arg of in6?_gif_output() is not used any more.
Obtained from:	KAME
MFC after:	3 weeks
2002-10-17 17:47:55 +00:00
Alfred Perlstein
dde2897f82 de-__P(). 2002-10-16 22:27:27 +00:00
Hajimu UMEMOTO
ab94625826 use encapcheck.
Obtained from:	KAME
MFC after:	3 weeks
2002-10-16 20:16:49 +00:00
Hajimu UMEMOTO
9426aedf7f - after gif_set_tunnel(), psrc/pdst may be null. set IFF_RUNNING accordingly.
- set IFF_UP on SIOCSIFADDR.  be consistent with others.
- set if_addrlen explicitly (just in case)
- multi destination mode is long gone.
- missing break statement
- add gif_set_tunnel(), so that we can set tunnel address from within the
  kernel at ease.
- encap_attach/detach dynamically on ioctls
- move encap_attach() to dedicated function in in*_gif.c

Obtained from:	KAME
MFC after:	3 weeks
2002-10-16 19:49:37 +00:00
Matthew Dillon
abac41a659 Fix oops in my last commit, I was calculating a new length but then not
using it.  (The code is already correct in -stable).

Found by: silby
2002-10-16 19:16:33 +00:00
Guido van Rooij
2f591ab8fe Get rid of checking for ip sec history. It is true that packets are not
supposed to be checked by the firewall rules twice. However, because the
various ipsec handlers never call ip_input(), this never happens anyway.

This fixes the situation where a gif tunnel is encrypted with IPsec. In
such a case, after IPsec processing, the unencrypted contents from the
GIF tunnel are fed back to the ipintrq and subsequently handeld by
ip_input(). Yet, since there still is IPSec history attached, the
packets coming out from the gif device are never fed into the filtering
code.
This fix was sent to Itojun, and he pointed towartds
    http://www.netbsd.org/Documentation/network/ipsec/#ipf-interaction.
This patch actually implements what is stated there (specifically:
Packet came from tunnel devices (gif(4) and ipip(4)) will still
go through ipf(4). You may need to identify these packets by
using interface name directive in ipf.conf(5).

Reviewed by:	rwatson
MFC after:	3 weeks
2002-10-16 09:01:48 +00:00
Sam Leffler
9b65723081 correct PCB locking in broadcast/multicast case that was exposed by change
to use udp_append

Reviewed by:	hsu
2002-10-16 02:33:28 +00:00
Sam Leffler
b9234fafa0 Tie new "Fast IPsec" code into the build. This involves the usual
configuration stuff as well as conditional code in the IPv4 and IPv6
areas.  Everything is conditional on FAST_IPSEC which is mutually
exclusive with IPSEC (KAME IPsec implmentation).

As noted previously, don't use FAST_IPSEC with INET6 at the moment.

Reviewed by:	KAME, rwatson
Approved by:	silence
Supported by:	Vernier Networks
2002-10-16 02:25:05 +00:00
Sam Leffler
5d84645305 Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
Sean Chittenden
927a76bb5e Increase the max dummynet hash size from 1024 to 65536. Default is still
1024.

Silence on:	-net, -ipfw 4weeks+
Reviewed by:	dd
Approved by:	knu (mentor)
MFC after:	3 weeks
2002-10-12 07:45:23 +00:00
Matthew Dillon
c8d50f2414 turn off debugging by default if bandwidth delay product limiting is
turned on (it is already off in -stable).
2002-10-10 21:41:30 +00:00
Matthew Dillon
28257b5ccc Update various comments mainly related to retransmit/FIN that I
documented while working on a previous bug.

Fix a PERSIST bug.  Properly account for a FIN sent during a PERSIST.

MFC after:	7 days
2002-10-10 19:21:50 +00:00
Maxim Konovalov
a5428e3a9a Fix IPOPT_TS processing: do not overwrite IP address by timestamp.
PR:		misc/42121
Submitted by:	Praveen Khurjekar <praveen@codito.com>
Reviewed by:	silence on -net
MFC after:	1 month
2002-10-10 12:03:36 +00:00
Maxim Sobolev
748bb23dcc Since bpf is no longer an optional component, remove associated ifdef's.
Submitted by:	don't quite remember - the name of the sender disappeared
		with the rest of my inbox. :(
2002-10-02 09:38:17 +00:00
Mike Barcroft
c0ec31f93e Include <sys/cdefs.h> so the visibility conditionals are available.
(This should have been included with the previous revision.)
2002-10-02 04:22:34 +00:00
Mike Barcroft
0cd4a9031e Use visibility conditionals. Only TCP_NODELAY ends up being defined
in the standards case.
2002-10-02 04:19:47 +00:00
Matthew Dillon
a84db8f49e Guido found another bug. There is a situation with
timestamped TCP packets where FreeBSD will send DATA+FIN and
A W2K box will ack just the DATA portion.  If this occurs
after FreeBSD has done a (NewReno) fast-retransmit and is
recovering it (dupacks > threshold) it triggers a case in
tcp_newreno_partial_ack() (tcp_newreno() in stable) where
tcp_output() is called with the expectation that the retransmit
timer will be reloaded.  But tcp_output() falls through and
returns without doing anything, causing the persist timer to be
loaded instead.  This causes the connection to hang until W2K gives up.
This occurs because in the case where only the FIN must be acked, the
'len' calculation in tcp_output() will be 0, a lot of checks will be
skipped, and the FIN check will also be skipped because it is designed
to handle FIN retransmits, not forced transmits from tcp_newreno().

The solution is to simply set TF_ACKNOW before calling tcp_output()
to absolute guarentee that it will run the send code and reset the
retransmit timer.  TF_ACKNOW is already used for this purpose in other
cases.

For some unknown reason this patch also seems to greatly reduce
the number of duplicate acks received when Guido runs his tests over
a lossy network.  It is quite possible that there are other
tcp_newreno{_partial_ack()} cases which were not generating the expected
output which this patch also fixes.

X-MFC after:	Will be MFC'd after the freeze is over
2002-09-30 18:55:45 +00:00
Poul-Henning Kamp
37c841831f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
Peter Wemm
224af215a6 Zap now-unused SHLIB_MINOR 2002-09-28 00:25:32 +00:00
Maxim Konovalov
cb7641e85b Slightly rearrange a code in rev. 1.164:
o Move len initialization closer to place of its first usage.
o Compare len with 0 to improve readability.
o Explicitly zero out phlen in ip_insertoptions() in failure case.

Suggested by:   jhb
Reviewed by:    jhb
MFC after:      2 weeks
2002-09-23 08:56:24 +00:00
Alfred Perlstein
ebc82cbbf0 s/__attribute__((__packed__))/__packed/g 2002-09-23 06:25:08 +00:00
Mike Silbersack
c1c36a2c68 Fix issue where shutdown(socket, SHUT_RD) was effectively
ignored for TCP sockets.

NetBSD PR:	18185
Submitted by:	Sean Boudreau <seanb@qnx.com>
MFC after:	3 days
2002-09-22 02:54:07 +00:00
Poul-Henning Kamp
a5554bf05b Use m_fixhdr() rather than roll our own. 2002-09-18 19:43:01 +00:00
Matthew Dillon
fa55172bc0 Guido reported an interesting bug where an FTP connection between a
Windows 2000 box and a FreeBSD box could stall.  The problem turned out
to be a timestamp reply bug in the W2K TCP stack.  FreeBSD sends a
timestamp with the SYN, W2K returns a timestamp of 0 in the SYN+ACK
causing FreeBSD to calculate an insane SRTT and RTT, resulting in
a maximal retransmit timeout (60 seconds).  If there is any packet
loss on the connection for the first six or so packets the retransmit
case may be hit (the window will still be too small for fast-retransmit),
causing a 60+ second pause.  The W2K box gives up and closes the
connection.

This commit works around the W2K bug.

15:04:59.374588 FREEBSD.20 > W2K.1036: S 1420807004:1420807004(0) win 65535 <mss 1460,nop,wscale 2,nop,nop,timestamp 188297344 0> (DF) [tos 0x8]
15:04:59.377558 W2K.1036 > FREEBSD.20: S 4134611565:4134611565(0) ack 1420807005 win 17520 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0> (DF)

Bug reported by: Guido van Rooij <guido@gvr.org>
2002-09-17 22:21:37 +00:00
Maxim Sobolev
563a9b6ecb Remove __RCSID().
Submitted by:	bde
2002-09-17 11:31:41 +00:00
Maxim Konovalov
1cf4349926 Explicitly clear M_FRAG flag on a mbuf with the last fragment to unbreak
ip fragments reassembling for loopback interface.

Discussed with:	bde, jlemon
Reviewed by:	silence on -net
MFC after:	2 weeks
2002-09-17 11:20:02 +00:00
Maxim Konovalov
e079ba8d93 In rare cases when there is no room for ip options ip_insertoptions()
can fail and corrupt a header length. Initialize len and check what
ip_insertoptions() returns.

Reviewed by:	archie, silence on -net
MFC after:	5 days
2002-09-17 11:13:04 +00:00
Jennifer Yang
4a03a8a8c7 Tempary fix for inet6. The final fix is to change in6_pcbnotify to take pcbinfo instead
of pcbhead. It is on the way.
2002-09-17 03:19:43 +00:00
Maxim Sobolev
2b82e3b367 Remove superfluous break. 2002-09-10 09:18:33 +00:00
Maxim Sobolev
565bb857d0 Since from now on encap_input() also catches IPPROTO_MOBILE and IPPROTO_GRE
packets in addition to IPPROTO_IPV4 and IPPROTO_IPV6, explicitly specify
IPPROTO_IPV4 or IPPROTO_IPV6 instead of -1 when calling encap_attach().

MFC after:	28 days
		(along with other if_gre changes)
2002-09-09 09:36:47 +00:00
Maxim Sobolev
c23d234cce Reduce namespace pollution by staticizing everything, which doesn't need to
be visible from outside of the module.
2002-09-06 18:16:03 +00:00
Maxim Sobolev
8e96e13e6a Add a new gre(4) driver, which could be used to create GRE (RFC1701)
and MOBILE (RFC2004) IP tunnels.

Obrained from:  NetBSD
2002-09-06 17:12:50 +00:00
Bruce Evans
40545cf5fc Fixed namespace pollution in uma changes:
- use `struct uma_zone *' instead of uma_zone_t, so that <sys/uma.h> isn't
  a prerequisite.
- don't include <sys/uma.h>.
Namespace pollution makes "opaque" types like uma_zone_t perfectly
non-opaque.  Such types should never be used (see style(9)).

Fixed subsequently grwon dependencies of this header on its own pollution:
- include <sys/_mutex.h> and its prerequisite <sys/_lock.h> instead of
  depending on namespace pollution 2 layers deep in <sys/uma.h>.
2002-09-05 19:48:52 +00:00