Commit Graph

96 Commits

Author SHA1 Message Date
Brian Feldman
ecf723083f Implement RLIMIT_SBSIZE in the kernel. This is a per-uid sockbuf total
usage limit.
1999-10-09 20:42:17 +00:00
Dag-Erling Smørgrav
f861330504 Fix some more disordering, as well as the description string for the
net.inet.tcp.drop_synfin sysctl, which for some mysterious reason said
"Drop TCP packets with FIN+ACK set" (instead of "...with SYN+FIN set")
1999-09-14 16:14:05 +00:00
Dag-Erling Smørgrav
e46cd3d4d2 Add the net.inet.tcp.restrict_rst and net.inet.tcp.drop_synfin sysctl
variables, conditional on the TCP_RESTRICT_RST and TCP_DROP_SYNFIN kernel
options, respectively. See the comments in LINT for details.
1999-09-12 17:22:08 +00:00
Jonathan Lemon
9b8b58e033 Restructure TCP timeout handling:
- eliminate the fast/slow timeout lists for TCP and instead use a
    callout entry for each timer.
  - increase the TCP timer granularity to HZ
  - implement "bad retransmit" recovery, as presented in
    "On Estimating End-to-End Network Path Properties", by Allman and Paxson.

Submitted by:	jlemon, wollmann
1999-08-30 21:17:07 +00:00
David E. O'Brien
5a8c77a83c Remove extra indenting of `break' statements introducted in rev 1.89,
plus wrap some long lines from that revision.

While here, wrap some other long lines.
1999-08-29 21:59:03 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Geoff Rehmet
828b7f4069 Fix breakage if blackhole=1 and tiflags & TH_SYN, plus
style(9) fixes

Submitted by:	 Jonathon Lemon
1999-08-19 05:22:12 +00:00
Geoff Rehmet
2e4e1b4c31 Slight tweak to tcp.blackhole to add optional behaviour to
drop any segment arriving at a closed port.
tcp.blackhole=1 - only drop SYN without RST
tcp.blackhole=2 - drop everything without RST
tcp.blackhole=0 - always send RST - default behaviour

This confuses nmap -sF or -sX or -sN quite badly.
1999-08-18 15:40:05 +00:00
Geoff Rehmet
16f7f31f04 Add net.inet.tcp.blackhole and net.inet.udp.blackhole
sysctl knobs.

With these knobs on, refused connection attempts are dropped
without sending a RST, or Port unreachable in the UDP case.
In the TCP case, sending of RST is inhibited iff the incoming
segment was a SYN.

Docs and rc.conf settings to follow.
1999-08-17 12:17:53 +00:00
Jonathan M. Bresler
e9bd3a37e8 fix comment re: RST received in TIME_WAIT to match the code. 1999-07-18 14:42:48 +00:00
Peter Wemm
dfd5dee1b0 Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.
1999-05-06 18:13:11 +00:00
Bill Fumerola
3d177f465a Add sysctl descriptions to many SYSCTL_XXXs
PR:		kern/11197
Submitted by:	Adrian Chadd <adrian@FreeBSD.org>
Reviewed by:	billf(spelling/style/minor nits)
Looked at by:	bde(style)
1999-05-03 23:57:32 +00:00
Bill Fenner
51b7b33769 Use snd_nxt, not rcv_nxt, when calculating the ISS during TIME_WAIT.
This was missed in the 4.4-Lite2 merge.

Noticed by:	Mohan Parthasarathy <Mohan.Parthasarathy@eng.Sun.COM> and
		jayanth@loc201.tandem.com (vijayaraghavan_jayanth)
		on the tcp-impl mailing list.
1999-02-06 00:47:45 +00:00
Matthew Dillon
831a80b0d5 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 22:42:27 +00:00
Matthew Dillon
51508de112 Reviewed by: freebsd-current
Add ICMP_BANDLIM option and 'net.inet.icmp.icmplim' sysctl.  If option
    is specified in kernel config, icmplim defaults to 100 pps.  Setting it
    to 0 will disable the feature.  This feature limits ICMP error responses
    for packets sent to bad tcp or udp ports, which does a lot to help the
    machine handle network D.O.S. attacks.

    The kernel will report packet rates that exceed the limit at a rate of
    one kernel printf per second.  There is one issue in regards to the
    'tail end' of an attack... the kernel will not output the last report
    until some unrelated and valid icmp error packet is return at some
    point after the attack is over.  This is a minor reporting issue only.
1998-12-03 20:23:21 +00:00
Garrett Wollman
80ab7c0ed8 Fix RST validation.
PR:		7892
Submitted by:	Don.Lewis@tsc.tdk.com
1998-09-11 16:04:03 +00:00
Doug Rabson
6effc71332 Re-implement tcp and ip fragment reassembly to not store pointers in the
ip header which can't work on alpha since pointers are too big.

Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
1998-08-24 07:47:39 +00:00
Julian Elischer
f9e354df42 Support for IPFW based transparent forwarding.
Any packet that can be matched by a ipfw rule can be redirected
transparently to another port or machine. Redirection to another port
mostly makes sense with tcp, where a session can be set up
between a proxy and an unsuspecting client. Redirection to another machine
requires that the other machine also be expecting to receive the forwarded
packets, as their headers will not have been modified.

/sbin/ipfw must be recompiled!!!

Reviewed by:	Peter Wemm <peter@freebsd.org>
Submitted by: Chrisy Luke <chrisy@flix.net>
1998-07-06 03:20:19 +00:00
Peter Wemm
04a3fd1276 Let the sowwakeup macro decide when to call sowakeup rather than have
tcp "know" about it.  A pending upcall would be missed, eg: used by NFS.

Obtained from: NetBSD
1998-05-31 18:42:49 +00:00
Guido van Rooij
068373b683 Grumble...It seems I'm suffering from some mental disease. Do it correct now. 1998-05-18 17:11:24 +00:00
Guido van Rooij
0bce271a1f Add some parenthesis for clarity and fix a bug
Pointed out by: Garrett Wollmand
1998-05-18 17:07:58 +00:00
Guido van Rooij
11ad455083 Refuse accellerated opens on listening sockets that have not set
the TCP_NOPUSH socket option.
This disables TAO for those  services that do not know about T/TCP.

Reviewed by:	Garrett Wollman
Submitted by:	Peter Wemm
1998-05-04 17:59:52 +00:00
David Greenman
84e33c9eea At the request of Garrett, changed sysctl:
net.inet.tcp.delack_enabled -> net.inet.tcp.delayed_ack
1998-04-24 10:08:57 +00:00
Dag-Erling Smørgrav
dc73342347 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
Poul-Henning Kamp
8e5db87cdb Remove the last traces of TUBA.
Inspired by:	PR kern/3317
1998-04-06 06:52:47 +00:00
Bill Fenner
75daa6a53f Remove the check for SYN in SYN_RECEIVED state; it breaks simultaneous
connect.  This check was added as part of the defense against the "land"
attack, to prevent attacks which guess the ISS from going into ESTABLISHED.
The "src == dst" check will still prevent the single-homed case of the
"land" attack, and guessing ISS's should be hard anyway.

Submitted by:	David Borman <dab@bsdi.com>
1998-03-20 00:43:29 +00:00
David Greenman
f498eeeead Changes to support the addition of a new sysctl variable:
net.inet.tcp.delack_enabled
Which defaults to 1 and can be set to 0 to disable TCP delayed-ack
processing (i.e. all acks are immediate).
1998-02-26 05:25:39 +00:00
David Greenman
c3229e05a3 Improved connection establishment performance by doing local port lookups via
a hashed port list. In the new scheme, in_pcblookup() goes away and is
replaced by a new routine, in_pcblookup_local() for doing the local port
check. Note that this implementation is space inefficient in that the PCB
struct is now too large to fit into 128 bytes. I might deal with this in the
future by using the new zone allocator, but I wanted these changes to be
extensively tested in their current form first.

Also:
1) Fixed off-by-one errors in the port lookup loops in in_pcbbind().
2) Got rid of some unneeded rehashing. Adding a new routine, in_pcbinshash()
   to do the initialial hash insertion.
3) Renamed in_pcblookuphash() to in_pcblookup_hash() for easier readability.
4) Added a new routine, in_pcbremlists() to remove the PCB from the various
   hash lists.
5) Added/deleted comments where appropriate.
6) Removed unnecessary splnet() locking. In general, the PCB functions should
   be called at splnet()...there are unfortunately a few exceptions, however.
7) Reorganized a few structs for better cache line behavior.
8) Killed my TCP_ACK_HACK kludge. It may come back in a different form in
   the future, however.

These changes have been tested on wcarchive for more than a month. In tests
done here, connection establishment overhead is reduced by more than 50
times, thus getting rid of one of the major networking scalability problems.

Still to do: make tcp_fastimo/tcp_slowtimo scale well for systems with a
large number of connections. tcp_fastimo is easy; tcp_slowtimo is difficult.

WARNING: Anything that knows about inpcb and tcpcb structs will have to be
         recompiled; at the very least, this includes netstat(1).
1998-01-27 09:15:13 +00:00
Bill Fenner
764d8cef56 A more complete fix for the "land" attack, removing the "quick fix" from
rev 1.66.  This fix contains both belt and suspenders.

Belt: ignore packets where src == dst and srcport == dstport in TCPS_LISTEN.
 These packets can only legitimately occur when connecting a socket to itself,
 which doesn't go through TCPS_LISTEN (it goes CLOSED->SYN_SENT->SYN_RCVD->
 ESTABLISHED).  This prevents the "standard" "land" attack, although doesn't
 prevent the multi-homed variation.

Suspenders: send a RST in response to a SYN/ACK in SYN_RECEIVED state.
 The only packets we should get in SYN_RECEIVED are
 1. A retransmitted SYN, or
 2. An ack of our SYN/ACK.
 The "land" attack depends on us accepting our own SYN/ACK as an ACK;
 in SYN_RECEIVED state; this should prevent all "land" attacks.

We also move up the sequence number check for the ACK in SYN_RECEIVED.
 This neither helps nor hurts with respect to the "land" attack, but
 puts more of the validation checking in one spot.

PR:             kern/5103
1998-01-21 02:05:59 +00:00
Bruce Evans
592071e854 Don't use ANSI string concatenation to misformat a string. 1997-12-19 23:46:21 +00:00
Garrett Wollman
76d3eadb53 Add Matt Dillon's quick fix hack for the self-connect DoS.
PR:		5103
1997-11-20 20:04:49 +00:00
Poul-Henning Kamp
4a11ca4e29 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
Bruce Evans
55b211e3af Removed unused #includes. 1997-10-28 15:59:26 +00:00
David Greenman
4281faf253 Killed the SYN_RECEIVED addition from rev 1.52. It results in legitimate
RST's being ignored, keeping a connection around until it times out, and
thus has the opposite effect of what was intended (which is to make the
system more robust to DoS attacks).
1997-10-02 02:10:40 +00:00
Bill Fenner
026650e576 Don't consider a SYN/ACK with CC but no CCECHO a proper T/TCP
handshake.

Reviewed by:	Rich Stevens <rstevens@kohala.com>
1997-09-30 16:38:09 +00:00
Joerg Wunsch
0cc12cc57e Make TCPDEBUG a new-style option. 1997-09-16 18:36:06 +00:00
Garrett Wollman
57bf258e3d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
John Polstra
66e39adc7c Fix a bug (apparently very old) that can cause a TCP connection to
be dropped when it has an unusual traffic pattern.  For full details
as well as a test case that demonstrates the failure, see the
referenced PR.

Under certain circumstances involving the persist state, it is
possible for the receive side's tp->rcv_nxt to advance beyond its
tp->rcv_adv.  This causes (tp->rcv_adv - tp->rcv_nxt) to become
negative.  However, in the code affected by this fix, that difference
was interpreted as an unsigned number by max().  Since it was
negative, it was taken as a huge unsigned number.  The effect was
to cause the receiver to believe that its receive window had negative
size, thereby rejecting all received segments including ACKs.  As
the test case shows, this led to fruitless retransmissions and
eventually to a dropped connection.  Even connections using the
loopback interface could be dropped.  The fix substitutes the signed
imax() for the unsigned max() function.

PR:		closes kern/3998
Reviewed by:	davidg, fenner, wollman
1997-07-01 05:42:16 +00:00
Garrett Wollman
a29f300e80 The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call.  Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken.  I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
1997-04-27 20:01:29 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Bill Fenner
39172c9401 Re-enable the TCP SYN-attack protection code. I was the one who didn't
understand the socket state flag.

2.2 candidate.
1996-11-10 07:37:24 +00:00
Paul Traina
a51764a8bf Fix two bugs I accidently put into the syn code at the last minute
(yes I had tested the hell out of this).

I've also temporarily disabled the code so that it behaves as it previously
did (tail drop's the syns) pending discussion with fenner about some socket
state flags that I don't fully understand.

Submitted by:	fenner
1996-10-11 19:26:42 +00:00
David Greenman
6d6a026b47 Improved in_pcblookuphash() to support wildcarding, and changed relavent
callers of it to take advantage of this. This reduces new connection
request overhead in the face of a large number of PCBs in the system.
Thanks to David Filo <filo@yahoo.com> for suggesting this and providing
a sample implementation (which wasn't used, but showed that it could be
done).

Reviewed by:	wollman
1996-10-07 19:06:12 +00:00
Paul Traina
ebb0cbea75 Increase robustness of FreeBSD against high-rate connection attempt
denial of service attacks.

Reviewed by:	bde,wollman,olah
Inspired by:	vjs@sgi.com
1996-10-07 04:32:42 +00:00
Paul Traina
12eafeb016 I don't understand, I committed this fix (move a counter and fixed a typo)
this evening.

I think I'm going insane.
1996-09-21 06:39:20 +00:00
Andrey A. Chernov
96d719e6e0 Syntax error: so_incom -> so_incomp 1996-09-21 06:30:06 +00:00
Paul Traina
4195b4af58 If the incomplete listen queue for a given socket is full,
drop the oldest entry in the queue.

There was a fair bit of discussion as to whether or not the
proper action is to drop a random entry in the queue.  It's
my conclusion that a random drop is better than a head drop,
however profiling this section of code (done by John Capo)
shows that a head-drop results in a significant performance
increase.

There are scenarios where a random drop is more appropriate.
If I find one in reality, I'll add the random drop code under
a conditional.

Obtained from: discussions and code done by Vernon Schryver (vjs@sgi.com).
1996-09-20 21:25:18 +00:00
Paul Traina
7b40aa327d Make the misnamed tcp initial keepalive timer value (which is really the
time, in seconds, that state for non-established TCP sessions stays about)
a sysctl modifyable variable.

[part 1 of two commits, I just realized I can't play with the indices as
 I was typing this commit message.]
1996-09-13 23:51:44 +00:00
Paul Traina
7ff19458de Receipt of two SYN's are sufficient to set the t_timer[TCPT_KEEP]
to "keepidle".  this should not occur unless the connection has
been established via the 3-way handshake which requires an ACK

Submitted by:	jmb
Obtained from:	problem discussed in Stevens vol. 3
1996-09-13 18:47:03 +00:00