Commit Graph

230 Commits

Author SHA1 Message Date
Doug Rabson
6effc71332 Re-implement tcp and ip fragment reassembly to not store pointers in the
ip header which can't work on alpha since pointers are too big.

Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
1998-08-24 07:47:39 +00:00
Julian Elischer
f9e354df42 Support for IPFW based transparent forwarding.
Any packet that can be matched by a ipfw rule can be redirected
transparently to another port or machine. Redirection to another port
mostly makes sense with tcp, where a session can be set up
between a proxy and an unsuspecting client. Redirection to another machine
requires that the other machine also be expecting to receive the forwarded
packets, as their headers will not have been modified.

/sbin/ipfw must be recompiled!!!

Reviewed by:	Peter Wemm <peter@freebsd.org>
Submitted by: Chrisy Luke <chrisy@flix.net>
1998-07-06 03:20:19 +00:00
Peter Wemm
04a3fd1276 Let the sowwakeup macro decide when to call sowakeup rather than have
tcp "know" about it.  A pending upcall would be missed, eg: used by NFS.

Obtained from: NetBSD
1998-05-31 18:42:49 +00:00
Guido van Rooij
068373b683 Grumble...It seems I'm suffering from some mental disease. Do it correct now. 1998-05-18 17:11:24 +00:00
Guido van Rooij
0bce271a1f Add some parenthesis for clarity and fix a bug
Pointed out by: Garrett Wollmand
1998-05-18 17:07:58 +00:00
Guido van Rooij
11ad455083 Refuse accellerated opens on listening sockets that have not set
the TCP_NOPUSH socket option.
This disables TAO for those  services that do not know about T/TCP.

Reviewed by:	Garrett Wollman
Submitted by:	Peter Wemm
1998-05-04 17:59:52 +00:00
David Greenman
84e33c9eea At the request of Garrett, changed sysctl:
net.inet.tcp.delack_enabled -> net.inet.tcp.delayed_ack
1998-04-24 10:08:57 +00:00
Dag-Erling Smørgrav
dc73342347 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
Poul-Henning Kamp
8e5db87cdb Remove the last traces of TUBA.
Inspired by:	PR kern/3317
1998-04-06 06:52:47 +00:00
Bill Fenner
75daa6a53f Remove the check for SYN in SYN_RECEIVED state; it breaks simultaneous
connect.  This check was added as part of the defense against the "land"
attack, to prevent attacks which guess the ISS from going into ESTABLISHED.
The "src == dst" check will still prevent the single-homed case of the
"land" attack, and guessing ISS's should be hard anyway.

Submitted by:	David Borman <dab@bsdi.com>
1998-03-20 00:43:29 +00:00
David Greenman
f498eeeead Changes to support the addition of a new sysctl variable:
net.inet.tcp.delack_enabled
Which defaults to 1 and can be set to 0 to disable TCP delayed-ack
processing (i.e. all acks are immediate).
1998-02-26 05:25:39 +00:00
David Greenman
c3229e05a3 Improved connection establishment performance by doing local port lookups via
a hashed port list. In the new scheme, in_pcblookup() goes away and is
replaced by a new routine, in_pcblookup_local() for doing the local port
check. Note that this implementation is space inefficient in that the PCB
struct is now too large to fit into 128 bytes. I might deal with this in the
future by using the new zone allocator, but I wanted these changes to be
extensively tested in their current form first.

Also:
1) Fixed off-by-one errors in the port lookup loops in in_pcbbind().
2) Got rid of some unneeded rehashing. Adding a new routine, in_pcbinshash()
   to do the initialial hash insertion.
3) Renamed in_pcblookuphash() to in_pcblookup_hash() for easier readability.
4) Added a new routine, in_pcbremlists() to remove the PCB from the various
   hash lists.
5) Added/deleted comments where appropriate.
6) Removed unnecessary splnet() locking. In general, the PCB functions should
   be called at splnet()...there are unfortunately a few exceptions, however.
7) Reorganized a few structs for better cache line behavior.
8) Killed my TCP_ACK_HACK kludge. It may come back in a different form in
   the future, however.

These changes have been tested on wcarchive for more than a month. In tests
done here, connection establishment overhead is reduced by more than 50
times, thus getting rid of one of the major networking scalability problems.

Still to do: make tcp_fastimo/tcp_slowtimo scale well for systems with a
large number of connections. tcp_fastimo is easy; tcp_slowtimo is difficult.

WARNING: Anything that knows about inpcb and tcpcb structs will have to be
         recompiled; at the very least, this includes netstat(1).
1998-01-27 09:15:13 +00:00
Bill Fenner
764d8cef56 A more complete fix for the "land" attack, removing the "quick fix" from
rev 1.66.  This fix contains both belt and suspenders.

Belt: ignore packets where src == dst and srcport == dstport in TCPS_LISTEN.
 These packets can only legitimately occur when connecting a socket to itself,
 which doesn't go through TCPS_LISTEN (it goes CLOSED->SYN_SENT->SYN_RCVD->
 ESTABLISHED).  This prevents the "standard" "land" attack, although doesn't
 prevent the multi-homed variation.

Suspenders: send a RST in response to a SYN/ACK in SYN_RECEIVED state.
 The only packets we should get in SYN_RECEIVED are
 1. A retransmitted SYN, or
 2. An ack of our SYN/ACK.
 The "land" attack depends on us accepting our own SYN/ACK as an ACK;
 in SYN_RECEIVED state; this should prevent all "land" attacks.

We also move up the sequence number check for the ACK in SYN_RECEIVED.
 This neither helps nor hurts with respect to the "land" attack, but
 puts more of the validation checking in one spot.

PR:             kern/5103
1998-01-21 02:05:59 +00:00
Bruce Evans
592071e854 Don't use ANSI string concatenation to misformat a string. 1997-12-19 23:46:21 +00:00
Garrett Wollman
76d3eadb53 Add Matt Dillon's quick fix hack for the self-connect DoS.
PR:		5103
1997-11-20 20:04:49 +00:00
Poul-Henning Kamp
4a11ca4e29 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
Bruce Evans
55b211e3af Removed unused #includes. 1997-10-28 15:59:26 +00:00
David Greenman
4281faf253 Killed the SYN_RECEIVED addition from rev 1.52. It results in legitimate
RST's being ignored, keeping a connection around until it times out, and
thus has the opposite effect of what was intended (which is to make the
system more robust to DoS attacks).
1997-10-02 02:10:40 +00:00
Bill Fenner
026650e576 Don't consider a SYN/ACK with CC but no CCECHO a proper T/TCP
handshake.

Reviewed by:	Rich Stevens <rstevens@kohala.com>
1997-09-30 16:38:09 +00:00
Joerg Wunsch
0cc12cc57e Make TCPDEBUG a new-style option. 1997-09-16 18:36:06 +00:00
Garrett Wollman
57bf258e3d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
John Polstra
66e39adc7c Fix a bug (apparently very old) that can cause a TCP connection to
be dropped when it has an unusual traffic pattern.  For full details
as well as a test case that demonstrates the failure, see the
referenced PR.

Under certain circumstances involving the persist state, it is
possible for the receive side's tp->rcv_nxt to advance beyond its
tp->rcv_adv.  This causes (tp->rcv_adv - tp->rcv_nxt) to become
negative.  However, in the code affected by this fix, that difference
was interpreted as an unsigned number by max().  Since it was
negative, it was taken as a huge unsigned number.  The effect was
to cause the receiver to believe that its receive window had negative
size, thereby rejecting all received segments including ACKs.  As
the test case shows, this led to fruitless retransmissions and
eventually to a dropped connection.  Even connections using the
loopback interface could be dropped.  The fix substitutes the signed
imax() for the unsigned max() function.

PR:		closes kern/3998
Reviewed by:	davidg, fenner, wollman
1997-07-01 05:42:16 +00:00
Garrett Wollman
a29f300e80 The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call.  Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken.  I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
1997-04-27 20:01:29 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Bill Fenner
39172c9401 Re-enable the TCP SYN-attack protection code. I was the one who didn't
understand the socket state flag.

2.2 candidate.
1996-11-10 07:37:24 +00:00
Paul Traina
a51764a8bf Fix two bugs I accidently put into the syn code at the last minute
(yes I had tested the hell out of this).

I've also temporarily disabled the code so that it behaves as it previously
did (tail drop's the syns) pending discussion with fenner about some socket
state flags that I don't fully understand.

Submitted by:	fenner
1996-10-11 19:26:42 +00:00
David Greenman
6d6a026b47 Improved in_pcblookuphash() to support wildcarding, and changed relavent
callers of it to take advantage of this. This reduces new connection
request overhead in the face of a large number of PCBs in the system.
Thanks to David Filo <filo@yahoo.com> for suggesting this and providing
a sample implementation (which wasn't used, but showed that it could be
done).

Reviewed by:	wollman
1996-10-07 19:06:12 +00:00
Paul Traina
ebb0cbea75 Increase robustness of FreeBSD against high-rate connection attempt
denial of service attacks.

Reviewed by:	bde,wollman,olah
Inspired by:	vjs@sgi.com
1996-10-07 04:32:42 +00:00
Paul Traina
12eafeb016 I don't understand, I committed this fix (move a counter and fixed a typo)
this evening.

I think I'm going insane.
1996-09-21 06:39:20 +00:00
Andrey A. Chernov
96d719e6e0 Syntax error: so_incom -> so_incomp 1996-09-21 06:30:06 +00:00
Paul Traina
4195b4af58 If the incomplete listen queue for a given socket is full,
drop the oldest entry in the queue.

There was a fair bit of discussion as to whether or not the
proper action is to drop a random entry in the queue.  It's
my conclusion that a random drop is better than a head drop,
however profiling this section of code (done by John Capo)
shows that a head-drop results in a significant performance
increase.

There are scenarios where a random drop is more appropriate.
If I find one in reality, I'll add the random drop code under
a conditional.

Obtained from: discussions and code done by Vernon Schryver (vjs@sgi.com).
1996-09-20 21:25:18 +00:00
Paul Traina
7b40aa327d Make the misnamed tcp initial keepalive timer value (which is really the
time, in seconds, that state for non-established TCP sessions stays about)
a sysctl modifyable variable.

[part 1 of two commits, I just realized I can't play with the indices as
 I was typing this commit message.]
1996-09-13 23:51:44 +00:00
Paul Traina
7ff19458de Receipt of two SYN's are sufficient to set the t_timer[TCPT_KEEP]
to "keepidle".  this should not occur unless the connection has
been established via the 3-way handshake which requires an ACK

Submitted by:	jmb
Obtained from:	problem discussed in Stevens vol. 3
1996-09-13 18:47:03 +00:00
Bill Fenner
df5c0b8a7a Back out my stupid braino; I was thinking strlen and not sizeof. 1996-05-02 05:54:14 +00:00
Bill Fenner
af00f8007c Size temp var correctly; buf[4*sizeof "123"] is not long enough
to store "192.252.119.189\0".
1996-05-02 05:31:13 +00:00
Andrey A. Chernov
75cfc95fe2 inet_ntoa buffer was evaluated twice in log_in_vain, fix it.
Thanx to: jdp
1996-04-27 18:19:12 +00:00
Garrett Wollman
a2352fc148 Delete #ifdef notdef blocks containing old method of srtt calculation.
Requested by: davidg
1996-04-26 18:32:58 +00:00
Paul Traina
d78a37ad5a Logging UDP and TCP connection attempts should not be enabled by default.
It's trivial to create a denial of service attack on a box so enabled.

These messages, if enabled at all, must be rate-limited. (!)
1996-04-09 07:01:53 +00:00
Poul-Henning Kamp
816a3d836e Log TCP syn packets for ports we don't listen on.
Controlled by: sysctl net.inet.tcp.log_in_vain: 1

Log UDP syn packets for ports we don't listen on.
Controlled by: sysctl net.inet.udp.log_in_vain: 1

Suggested by:	Warren Toomey <wkt@cs.adfa.oz.au>
1996-04-04 10:46:44 +00:00
Garrett Wollman
9e2874b067 Slight modification of RTO floor calculation. 1996-03-25 20:13:21 +00:00
Garrett Wollman
233e8c18e8 A number of performance-reducing flaws fixed based on comments
from Larry Peterson &co. at Arizona:

- Header prediction for ACKs did not exclude Fast Retransmit/Recovery.
- srtt calculation tended to get ``stuck'' and could never decrease
  when below 8.  It still can't, but the scaling factors are adjusted
  so that this artifact does not cause as bad an effect on the RTO
  value as it used to.

The paper also points out the incr/8 error that has been long since fixed,
and the problems with ACKing frequency resulting from the use of options
which I suspect to be fixed already as well (as part of the T/TCP work).

Obtained from:	Brakmo & Peterson, ``Performance Problems in BSD4.4 TCP''
1996-03-22 18:09:21 +00:00
David Greenman
2ee45d7d28 Move or add #include <queue.h> in preparation for upcoming struct socket
changes.
1996-03-11 15:13:58 +00:00
Guido van Rooij
1347f5b8e5 Add a counter for the number of times the listen queue was overflowed to
the tcpstat structure. (netstat -s)
Reviewed by:	wollman
Obtained from: Steves, TCP/IP Ill. vol.3, page 189
1996-02-26 21:47:13 +00:00
David Greenman
f9d5a964af Fixed bug in Path MTU Discovery that caused the system to have to re-
discover the Path MTU for each connection if the connecting host didn't
offer an initial MSS.

Submitted by:	davidg & olah
1996-02-22 11:46:39 +00:00
Andras Olah
07e43e10f8 Fix a bug related to the interworking of T/TCP and window scaling:
when a connection enters the ESTBLS state using T/TCP, then window
scaling wasn't properly handled.  The fix is twofold.

1) When the 3WHS completes, make sure that we update our window
scaling state variables.

2) When setting the `virtual advertized window', then make sure
that we do not try to offer a window that is larger than the maximum
window without scaling (TCP_MAXWIN).

Reviewed by:	davidg
Reported by:	Jerry Chen <chen@Ipsilon.COM>
1996-01-31 08:22:24 +00:00
Poul-Henning Kamp
f708ef1b9e Another mega commit to staticize things. 1995-12-14 09:55:16 +00:00
Poul-Henning Kamp
0312fbe97d New style sysctl & staticize alot of stuff. 1995-11-14 20:34:56 +00:00
Poul-Henning Kamp
98163b98dc Start adding new style sysctl here too. 1995-11-09 20:23:09 +00:00
Andras Olah
845799c166 Cosmetic changes to processing of segments in the SYN_SENT state:
- remove a redundant condition;
- complete all validity checks on segment before calling
  soisconnected(so).

Reviewed by:	Richard Stevens, davidg, wollman
1995-11-03 22:31:54 +00:00
Garrett Wollman
91badc866f Routes can be asymmetric. Always offer to /accept/ an MSS of up to the
capacity of the link, even if the route's MTU indicates that we cannot
send that much in their direction.  (This might actually make it possible
to test Path MTU discovery in a useful variety of cases.)
1995-10-13 16:00:25 +00:00
Garrett Wollman
e79adb8ed6 Finish 4.4-Lite-2 merge: randomize TCP initial sequence numbers
to make ISS-guessing spoofing attacks harder.
1995-10-03 16:54:17 +00:00
Andras Olah
d3eede9d32 Remove a redundant `if' from tcp_reass().
Correct a typo in a comment (SEND_SYN -> NEEDSYN).

Reviewed by:	David Greenman
1995-07-31 10:24:22 +00:00
Garrett Wollman
dd22498271 tcp_input.c - keep track of how many times a route contained a cached rtt
or ssthresh that we were able to use

tcp_var.h - declare tcpstat entries for above; declare tcp_{send,recv}space

in_rmx.c - fill in the MTU and pipe sizes with the defaults TCP would have
	used anyway in the absence of values here
1995-07-10 15:39:16 +00:00
Garrett Wollman
fc97827135 Keep track of the number of samples through the srtt filter so that we
know better when to cache values in the route, rather than relying on a
heuristic involving sequence numbers that broke when tcp_sendspace
was increased to 16k.
1995-06-29 18:11:24 +00:00
Rodney W. Grimes
9b2e535452 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
David Greenman
6b067b0744 #ifdef'd my Nagel/ACK hack with "TCP_ACK_HACK", disabled by default. I'm
currently considering reducing the TCP fasttimo to 100ms to help improve
things, but this would be done as a seperate step at some point in the
future.
This was done because it was causing some sometimes serious performance
problems with T/TCP.
1995-05-11 01:41:06 +00:00
Andras Olah
40db8ef747 Fix a misspelled constant in tcp_input.c.
On Tue, 09 May 1995 04:35:27 PDT, Richard Stevens wrote:
> In tcp_dooptions() under the case TCPOPT_CC there is an assignment
>
>       to->to_flag |= TCPOPT_CC;
>
> that should be
>
>       to->to_flag |= TOF_CC;
>
> I haven't thought through the ramifications of what's been happening ...
>
>       Rich Stevens

Submitted by:	rstevens@noao.edu (Richard Stevens)
1995-05-09 12:32:06 +00:00
David Greenman
0d7b7d3ea7 Changed in_pcblookuphash() to not automatically call in_pcblookup() if
the lookup fails. Updated callers to deal with this. Call in_pcblookuphash
instead of in_pcblookup() in in_pcbconnect; this improves performance of
UDP output by about 17% in the standard case.
1995-05-03 07:16:53 +00:00
David Greenman
d79940da0a Further satisfy my paranoia by making sure that the ACKNOW is only
set when ti_len is non-zero.
1995-04-10 17:37:46 +00:00
David Greenman
afa70c96dd Fixed bug I introduced with my Nagel hack which caused tcp_input and
tcp_output to loop endlessly. This was freefall's problem during the past
day.
1995-04-10 17:16:10 +00:00
David Greenman
15bd2b4385 Implemented PCB hashing. Includes new functions in_pcbinshash, in_pcbrehash,
and in_pcblookuphash.
1995-04-09 01:29:31 +00:00
Andras Olah
755c1f07c8 Fix a bug in tcp_input reported by Rick Jones <raj@hpisrdq.cup.hp.com>.
If a goto findpcb occurred during the processing of a segment, the TCP and
IP headers were dropped twice from the mbuf which resulted in data acked
by TCP but not delivered to the user.
Reviewed by:	davidg
1995-04-05 10:32:14 +00:00
David Greenman
e612a582cc Re-apply my "breakage" to the Nagel congestion avoidence. This version
differs slightly in the logic from the previous version; packets are now
acked immediately if the sender set PUSH.
1995-03-27 07:12:24 +00:00
Bruce Evans
b5e8ce9f12 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'.  Fix all the bugs found.  There were no serious
ones.
1995-03-16 18:17:34 +00:00
Garrett Wollman
dac2030182 Avoid deadlock situation described by Stevens using his suggested replacement
code.

Obtained from: Stevens, vol. 2, pp. 959-960
1995-02-16 01:39:19 +00:00
Garrett Wollman
41f82abe5a Transaction TCP support now standard. Hack away! 1995-02-16 00:55:44 +00:00
Poul-Henning Kamp
c70f45100d YFfix. 1995-02-14 06:28:25 +00:00
Garrett Wollman
2f96f1f446 Get rid of some unneeded #ifdef TTCP lines. Also, get rid of some
bogus commons declared in header files.
1995-02-14 02:35:19 +00:00
Garrett Wollman
a0292f2375 Merge Transaction TCP, courtesy of Andras Olah <olah@cs.utwente.nl> and
Bob Braden <braden@isi.edu>.

NB: This has not had David's TCP ACK hack re-integrated.  It is not clear
what the correct solution to this problem is, if any.  If a better solution
doesn't pop up in response to this message, I'll put David's code back in
(or he's welcome to do so himself).
1995-02-09 23:13:27 +00:00
Garrett Wollman
10be56487a As suggested by Sally Floyd, don't add the ``small fraction of the window
size'' when doing congestion avoidance.

Submitted by:	Mark Andrews
1994-10-13 18:36:32 +00:00
Poul-Henning Kamp
623ae52e4e GCC cleanup.
Reviewed by:
Submitted by:
Obtained from:
1994-10-02 17:48:58 +00:00
David Greenman
610ee2f9b5 Made TCPDEBUG truely optional. Based on changes I made in FreeBSD 1.1.5.
Fixed somebody's idea of a joke - about the first half of the lines in
in_proto.c were spaced over by one space.
1994-09-15 10:36:56 +00:00
Garrett Wollman
6a7be6e8e0 Obey RFC 793, section 3.4:
Several examples of connection initiation follow.  Although these
  examples do not show connection synchronization using data-carrying
  segments, this is perfectly legitimate, so long as the receiving TCP
  doesn't deliver the data to the user until it is clear the data is
  valid (i.e., the data must be buffered at the receiver until the
  connection reaches the ESTABLISHED state).
1994-08-26 22:27:16 +00:00
Garrett Wollman
f23b4c91c4 Fix up some sloppy coding practices:
- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
  header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.
1994-08-18 22:36:09 +00:00
David Greenman
3c4dd3568f Added $Id$ 1994-08-02 07:55:43 +00:00
David Greenman
b164106fa7 Fixed bug with Nagel Congestion Avoidance where a tcp connection would
stall unnecessarily - always send an ACK when a packet len of < mss is
received.
1994-08-01 12:00:25 +00:00
David Greenman
d4d0967e5b Added missing ntohl()'s that are needed before calling IN_MULTICAST in
a couple of places.
Submitted by:	Johannes Helander
1994-05-26 09:51:33 +00:00
Rodney W. Grimes
26f9a76710 The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by:	Rodney W. Grimes
Submitted by:	John Dyson and David Greenman
1994-05-25 09:21:21 +00:00
Rodney W. Grimes
df8bae1de4 BSD 4.4 Lite Kernel Sources 1994-05-24 10:09:53 +00:00