freebsd-skq

Author	SHA1	Message	Date
hsu	f13c72f301	Style bug: fix 4 space indentations that should have been tabs. Submitted by: jlemon	2002-06-24 16:47:02 +00:00
luigi	ebcf841898	Move two global variables to automatic variables within the only function where they are used (they are used with TCPDEBUG only).	2002-06-23 21:22:56 +00:00
luigi	5259888148	Remove (almost all) global variables that were used to hold packet forwarding state ("annotations") during ip processing. The code is considerably cleaner now. The variables removed by this change are: ip_divert_cookie used by divert sockets ip_fw_fwd_addr used for transparent ip redirection last_pkt used by dynamic pipes in dummynet Removal of the first two has been done by carrying the annotations into volatile structs prepended to the mbuf chains, and adding appropriate code to add/remove annotations in the routines which make use of them, i.e. ip_input(), ip_output(), tcp_input(), bdg_forward(), ether_demux(), ether_output_frame(), div_output(). On passing, remove a bug in divert handling of fragmented packet. Now it is the fragment at offset 0 which sets the divert status of the whole packet, whereas formerly it was the last incoming fragment to decide. Removal of last_pkt required a change in the interface of ip_fw_chk() and dummynet_io(). On passing, use the same mechanism for dummynet annotations and for divert/forward annotations. option IPFIREWALL_FORWARD is effectively useless, the code to implement it is very small and is now in by default to avoid the obfuscation of conditionally compiled code. NOTES: * there is at least one global variable left, sro_fwd, in ip_output(). I am not sure if/how this can be removed. * I have deliberately avoided gratuitous style changes in this commit to avoid cluttering the diffs. Minor style cleanup will likely be necessary * this commit only focused on the IP layer. I am sure there is a number of global variables used in the TCP and maybe UDP stack. * despite the number of files touched, there are absolutely no API's or data structures changed by this commit (except the interfaces of ip_fw_chk() and dummynet_io(), which are internal anyways), so an MFC is quite safe and unintrusive (and desirable, given the improved readability of the code). MFC after: 10 days	2002-06-22 11:51:02 +00:00
tanimura	cb3347e926	Remove so*_locked(), which were backed out by mistake.	2002-06-18 07:42:02 +00:00
hsu	cd25d4648f	Lock up inpcb. Submitted by: Jennifer Yang <yangjihui@yahoo.com>	2002-06-10 20:05:46 +00:00
tanimura	e6fa9b9e92	Back out my last commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
tanimura	92d8381dd5	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
alfred	798c53d495	Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.	2002-05-01 20:44:46 +00:00
tanimura	89ec521d91	Revert the change of #includes in sys/filedesc.h and sys/socketvar.h. Requested by: bde Since locking sigio_lock is usually followed by calling pgsigio(), move the declaration of sigio_lock and the definitions of SIGIO_*() to sys/signalvar.h. While I am here, sort include files alphabetically, where possible.	2002-04-30 01:54:54 +00:00
tanimura	dbb4756491	Add a global sx sigio_lock to protect the pointer to the sigio object of a socket. This avoids lock order reversal caused by locking a process in pgsigio(). sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now require sigio_lock to be locked. Provide sowwakeup_locked(), soisconnected_locked(), and so on in case where we have to modify a socket and wake up a process atomically.	2002-04-27 08:24:29 +00:00
suz	553226e8e1	just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD. (based on freebsd4-snap-20020128) Reviewed by: ume MFC after: 1 week	2002-04-19 04:46:24 +00:00
silby	c7389be7ba	Remove some ISN generation code which has been unused since the syncache went in. MFC after: 3 days	2002-04-10 22:12:01 +00:00
bde	867fc1ed1c	Fixed some style bugs in the removal of __P(()). Continuation lines were not outdented to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting.	2002-03-24 10:19:10 +00:00
alfred	357e37e023	Remove __P.	2002-03-19 21:25:46 +00:00
cjc	822f4e8381	Change the wording of the inline comments from the previous commit. Objection from: ru	2002-02-27 13:52:06 +00:00
cjc	8b28692f71	The TCP code did not do sufficient checks on whether incoming packets were destined for a broadcast IP address. All TCP packets with a broadcast destination must be ignored. The system only ignored packets that were _link-layer_ broadcasts or multicast. We need to check the IP address too since it is quite possible for a broadcast IP address to come in with a unicast link-layer address. Note that the check existed prior to CSRG revision 7.35, but was removed. This commit effectively backs out that nine-year-old change. PR: misc/35022	2002-02-25 08:29:21 +00:00
mike	bcee06d42c	o Move NTOHL() and associated macros into <sys/param.h>. These are deprecated in favor of the POSIX-defined lowercase variants. o Change all occurrences of NTOHL() and associated macros in the source tree to use the lowercase function variants. o Add missing license bits to sparc64's <machine/endian.h>. Approved by: jake o Clean up <machine/endian.h> files. o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>. o Remove prototypes for non-existent bswapXX() functions. o Include <machine/endian.h> in <arpa/inet.h> to define the POSIX-required ntohl() family of functions. o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>, and <sys/param.h>. o Prepend underscores to the ntohl() family to help deal with complexities associated with having MD (asm and inline) versions, and having to prevent exposure of these functions in other headers that happen to make use of endian-specific defines. o Create weak aliases to the canonical function name to help deal with third-party software forgetting to include an appropriate header. o Remove some now unneeded pollution from <sys/types.h>. o Add missing <arpa/inet.h> includes in userland. Tested on: alpha, i386 Reviewed by: bde, jake, tmm	2002-02-18 20:35:27 +00:00
rwatson	46f317e07b	o Spelling fix in comment: tcp_ouput -> tcp_output	2002-01-04 17:21:27 +00:00
jlemon	ec4b51f883	Fix up tabs in comments.	2001-12-13 04:02:09 +00:00
dillon	f97547e246	Fix a bug with transmitter restart after receiving a 0 window. The receiver was not sending an immediate ack with delayed acks turned on when the input buffer is drained, preventing the transmitter from restarting immediately. Propagate the TCP_NODELAY option to accept()ed sockets. (Helps tbench and is a good idea anyway). Some cleanup. Identify additional issues in comments. MFC after: 1 day	2001-12-02 08:49:29 +00:00
jlemon	a3c1c9fdb4	Introduce a syncache, which enables FreeBSD to withstand a SYN flood DoS in an improved fashion over the existing code. Reviewed by: silby (in a previous iteration) Sponsored by: DARPA, NAI Labs	2001-11-22 04:50:44 +00:00
jlemon	c41580e9ad	Move initialization of snd_recover into tcp_sendseqinit().	2001-11-21 18:45:51 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
julian	071f86f9f1	Patches from Keiichi SHIMA <keiichi@iij.ad.jp> to make ip use the standard protosw structure again. Obtained from: Well, KAME I guess.	2001-09-03 20:03:55 +00:00
jayanth	77d67fb568	when newreno is turned on, if dupacks = 1 or dupacks = 2 and new data is acknowledged, reset the dupacks to 0. The problem was spotted when a connection had its send buffer full because the congestion window was only 1 MSS and was not being incremented because dupacks was not reset to 0. Obtained from: Yahoo!	2001-08-29 23:54:13 +00:00
dd	6ea3a08d37	Correct a typo in a comment: FIN_WAIT2 -> FIN_WAIT_2 PR: 29970 Submitted by: Joseph Mallett <jmallett@xMach.org>	2001-08-23 22:34:29 +00:00
silby	58e247fcc4	Much delayed but now present: RFC 1948 style sequence numbers In order to ensure security and functionality, RFC 1948 style initial sequence number generation has been implemented. Barring any major cryptographic breakthroughs, this algorithm should be unbreakable. In addition, the problems with TIME_WAIT recycling which affect our currently used algorithm are not present. Reviewed by: jesper	2001-08-22 00:58:16 +00:00
silby	2be73222cb	Temporary feature: Runtime tuneable tcp initial sequence number generation scheme. Users may now select between the currently used OpenBSD algorithm and the older random positive increment method. While the OpenBSD algorithm is more secure, it also breaks TIME_WAIT handling; this is causing trouble for an increasing number of folks. To switch between generation schemes, one sets the sysctl net.inet.tcp.tcp_seq_genscheme. 0 = random positive increments, 1 = the OpenBSD algorithm. 1 is still the default. Once a secure _and_ compatible algorithm is implemented, this sysctl will be removed. Reviewed by: jlemon Tested by: numerous subscribers of -net	2001-07-08 02:20:47 +00:00
ru	f8e11dde26	Add netstat(1) knob to reset net.inet.{ip\|icmp\|tcp\|udp\|igmp}.stats. For example, ``netstat -s -p ip -z'' will show and reset IP stats. PR: bin/17338	2001-06-23 17:17:59 +00:00
silby	f41767543e	Eliminate the allocation of a tcp template structure for each connection. The information contained in a tcptemp can be reconstructed from a tcpcb when needed. Previously, tcp templates required the allocation of one mbuf per connection. On large systems, this change should free up a large number of mbufs. Reviewed by: bmilekic, jlemon, ru MFC after: 2 weeks	2001-06-23 03:21:46 +00:00
ume	832f8d2249	Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problem after the snap was out were fixed. There are many many changes since last KAME merge. TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT. Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks	2001-06-11 12:39:29 +00:00
jesper	9d59cfc3ee	Silby's take one on increasing FreeBSD's resistance to SYN floods: One way we can reduce the amount of traffic we send in response to a SYN flood is to eliminate the RST we send when removing a connection from the listen queue. Since we are being flooded, we can assume that the majority of connections in the queue are bogus. Our RST is unwanted by these hosts, just as our SYN-ACK was. Genuine connection attempts will result in hosts responding to our SYN-ACK with an ACK packet. We will automatically return a RST response to their ACK when it gets to us if the connection has been dropped, so the early RST doesn't serve the genuine class of connections much. In summary, we can reduce the number of packets we send by a factor of two without any loss in functionality by ensuring that RST packets are not sent when dropping a connection from the listen queue. Submitted by: Mike Silbersack <silby@silby.com> Reviewed by: jesper MFC after: 2 weeks	2001-06-06 19:41:51 +00:00
jesper	aa7ec52010	Inline TCP_REASS() in the single location where it's used, just as OpenBSD and NetBSD have done. No functional difference. MFC after: 2 weeks	2001-05-29 19:54:45 +00:00
jesper	02dca88184	properly delay acks in half-closed TCP connections PR: 24962 Submitted by: Tony Finch <dot@dotat.at> MFC after: 2 weeks	2001-05-29 19:51:45 +00:00
jesper	a1fab55459	Say goodbye to TCP_COMPAT_42 Reviewed by: wollman Requested by: wollman	2001-04-20 11:58:56 +00:00
kris	0c55f2e6da	Randomize the TCP initial sequence numbers more thoroughly. Obtained from: OpenBSD Reviewed by: jesper, peter, -developers	2001-04-17 18:08:01 +00:00
des	9dc769bc1b	Axe TCP_RESTRICT_RST. It was never a particularly good idea except for a few very specific scenarios, and now that we have had net.inet.tcp.blackhole for quite some time there is really no reason to use it any more. (last of three commits)	2001-03-19 22:09:00 +00:00
jlemon	3b8f8e9938	Do not delay a new ack if there already is a delayed ack pending on the connection, but send it immediately. Prior to this change, it was possible to delay a delayed-ack for multiple times, resulting in degraded TCP behavior in certain corner cases.	2001-02-25 15:17:24 +00:00
bmilekic	0f9088da56	Clean up RST ratelimiting. Previously, ratelimiting occurred before tests were performed to determine if the received packet should be reset. This created erroneous ratelimiting and false alarms in some cases. The code has now been reorganized so that the checks for validity come before the call to badport_bandlim. Additionally, a few changes in the symbolic names of the bandlim types have been made, as well as a clarification of exactly which type each RST case falls under. Submitted by: Mike Silbersack <silby@silby.com>	2001-02-11 07:39:51 +00:00
wollman	08d0e8d96f	Correct a comment.	2001-01-24 16:25:36 +00:00
bmilekic	e94f2430fb	Change the following: 1. ICMP ECHO and TSTAMP replies are now rate limited. 2. RSTs generated due to packets sent to open and unopen ports are now limited by separate counters. 3. Each rate limiting queue now has its own description, as follows: Limiting icmp unreach response from 439 to 200 packets per second Limiting closed port RST response from 283 to 200 packets per second Limiting open port RST response from 18724 to 200 packets per second Limiting icmp ping response from 211 to 200 packets per second Limiting icmp tstamp response from 394 to 200 packets per second Submitted by: Mike Silbersack <silby@silby.com>	2000-12-15 21:45:49 +00:00
dwmalone	dd75d1d73b	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
jlemon	88c9bb192d	tp->snd_recover is part of the New Reno recovery algorithm, and should only be checked if the system is currently performing New Reno style fast recovery. However, this value was being checked regardless of the NR state, with the end result being that the congestion window was never opened. Change the logic to check t_dupack instead; the only code path that allows it to be nonzero at this point is NewReno, so if it is nonzero, we are in fast recovery mode and should not touch the congestion window. Tested by: phk	2000-11-04 15:59:39 +00:00
jayanth	8c2fae5374	When a connection is being dropped due to a listen queue overflow, delete the cloned route that is associated with the connection. This does not exhaust the routing table memory when the system is under a SYN flood attack. The route entry is not deleted if there is any prior information cached in it. Reviewed by: Peter Wemm,asmodai	2000-07-21 23:26:37 +00:00
itojun	56bc1eab2d	be more cautious about tcp option length field. drop bogus ones earlier. not sure if there is a real threat or not, but it seems that there's possibility for overrun/underrun (like non-NOP option with optlen > cnt).	2000-07-09 13:01:59 +00:00
itojun	5f4e854de1	sync with kame tree as of july00. tons of bug fixes/improvements. API changes: - additional IPv6 ioctls - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)	2000-07-04 16:35:15 +00:00
dan	4e9d022872	sysctl'ize ICMP_BANDLIM and ICMP_BANDLIM_SUPPRESS_OUTPUT. Suggested by: des/nbm	2000-05-22 16:12:28 +00:00
jayanth	d854ffaa25	snd_cwnd was updated twice in the tcp_newreno function.	2000-05-18 21:21:42 +00:00
jayanth	e7034ee2a7	Sigh, fix a rookie patch merge error. Also-missed-by: peter	2000-05-17 06:55:00 +00:00
jayanth	ba14a43fa0	snd_una was being updated incorrectly, this resulted in the newreno code retransmitting data from the wrong offset. As a footnote, the newreno code was partially derived from NetBSD and Tom Henderson <tomh@cs.berkeley.edu>	2000-05-16 03:13:59 +00:00
jlemon	8a3c72bb35	Implement TCP NewReno, as documented in RFC 2582. This allows better recovery for multiple packet losses in a single window. The algorithm can be toggled via the sysctl net.inet.tcp.newreno, which defaults to "on". Submitted by: Jayanth Vijayaraghavan <jayanth@yahoo-inc.com>	2000-05-06 03:31:09 +00:00
sumikawa	52e19e3399	ND6_HINT() should not be called unless the connection status is ESTABLISHED. Obtained from: KAME Project	2000-04-17 20:27:02 +00:00
shin	09037f119d	Support per socket based IPv4 mapped IPv6 addr enable/disable control. Submitted by: ume	2000-04-01 22:35:47 +00:00
jlemon	0dcc5bc0d1	Add support for offloading IP/TCP/UDP checksums to NIC hardware which supports them.	2000-03-27 19:14:27 +00:00
shin	4897d4b884	IPv6 6to4 support. The biggest problem with IPv6 right now is getting an IPv6 address assignment. 6to4 solves that problem. A 6to4 addr is defined as below, 2002: 4byte v4 addr : 2byte SLA ID : 8byte interface ID The most important point of the address format is that an IPv4 addr is embedded in it. So any user who has an IPv4 addr can get an IPv6 address block with 2byte subnet space. Also, the IPv4 addr is used for semi-automatic IPv6 over IPv4 tunneling. With 6to4, getting an IPv6 addr becomes dramatically easier. The attached patch enables the 6to4 extension, and was confirmed to work between "Richard Seaman, Jr." <dick@tar.com> and me. Approved by: jkh Reviewed by: itojun	2000-03-11 11:17:24 +00:00
imp	9d11326d86	Mitigate the stream.c attacks o Drop all broadcast and multicast source addresses in tcp_input. o Enable ICMP_BANDLIM in GENERIC. o Change default to 200/s from 100/s. This will still stop the attack, but is conservative enough to do this close to code freeze. This is not the optimal patch for the problem, but is likely the least intrusive patch that can be made for this. Obtained from: Don Lewis and Matt Dillon. Reviewed by: freebsd-security	2000-01-28 06:13:09 +00:00
shin	2dec7cab29	Avoid m_len and m_pkthdr.len inconsistency when changing m_len for an mbuf whose M_PKTHDR is set. PR: related to kern/15175 Reviewed by: archie	2000-01-25 01:26:47 +00:00
shin	e7b807d1e3	Fixed the problem that IPsec connection hangs when bigger data is sent. -opt_ipsec.h was missing on some tcp files (sorry for basic mistake) -made buildable as above fix -also added some missing IPv4 mapped IPv6 addr consideration into ipsec4_getpolicybysock	2000-01-15 14:56:38 +00:00
shin	a42e26b36b	add a comment for some possible? IPv4 option processing.	2000-01-13 05:21:05 +00:00
shin	3bdc213839	tcp updates to support IPv6. also a small patch to sys/nfs/nfs_socket.c, as max_hdr size change. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project	2000-01-09 19:17:30 +00:00
shin	50ba589c66	IPSEC support in the kernel. pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project	1999-12-22 19:13:38 +00:00
jlemon	68756a0d9e	Use SEQ_* macros for comparing sequence space numbers. Reviewed by: truckman	1999-12-14 15:43:56 +00:00
jlemon	4e4e4d62e2	According to RFC 793, a reset should be honored if the sequence number is within the receive window. Follow this behavior, instead of only allowing resets at last_ack_sent. Pointed out by: jayanth@yahoo-inc.com	1999-12-11 04:05:52 +00:00
shin	70f0bdf681	udp IPv6 support, IPv6/IPv4 tunneling support in kernel, packet divert at kernel for IPv6/IPv4 translator daemon This includes queue related patch submitted by jburkhol@home.com. Submitted by: queue related patch from jburkhol@home.com Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project	1999-12-07 17:39:16 +00:00
green	f980526bf6	Implement RLIMIT_SBSIZE in the kernel. This is a per-uid sockbuf total usage limit.	1999-10-09 20:42:17 +00:00
des	b94ca10a55	Fix some more disordering, as well as the description string for the net.inet.tcp.drop_synfin sysctl, which for some mysterious reason said "Drop TCP packets with FIN+ACK set" (instead of "...with SYN+FIN set")	1999-09-14 16:14:05 +00:00
des	19e7731a48	Add the net.inet.tcp.restrict_rst and net.inet.tcp.drop_synfin sysctl variables, conditional on the TCP_RESTRICT_RST and TCP_DROP_SYNFIN kernel options, respectively. See the comments in LINT for details.	1999-09-12 17:22:08 +00:00
jlemon	628be0515e	Restructure TCP timeout handling: - eliminate the fast/slow timeout lists for TCP and instead use a callout entry for each timer. - increase the TCP timer granularity to HZ - implement "bad retransmit" recovery, as presented in "On Estimating End-to-End Network Path Properties", by Allman and Paxson. Submitted by: jlemon, wollman	1999-08-30 21:17:07 +00:00
obrien	1e6f13115f	Remove extra indenting of `break' statements introduced in rev 1.89, plus wrap some long lines from that revision. While here, wrap some other long lines.	1999-08-29 21:59:03 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
csgr	ad6f988e41	Fix breakage if blackhole=1 and tiflags & TH_SYN, plus style(9) fixes Submitted by: Jonathan Lemon	1999-08-19 05:22:12 +00:00
csgr	fc583887e7	Slight tweak to tcp.blackhole to add optional behaviour to drop any segment arriving at a closed port. tcp.blackhole=1 - only drop SYN without RST tcp.blackhole=2 - drop everything without RST tcp.blackhole=0 - always send RST - default behaviour This confuses nmap -sF or -sX or -sN quite badly.	1999-08-18 15:40:05 +00:00
csgr	83e27dbadf	Add net.inet.tcp.blackhole and net.inet.udp.blackhole sysctl knobs. With these knobs on, refused connection attempts are dropped without sending a RST, or Port unreachable in the UDP case. In the TCP case, sending of RST is inhibited iff the incoming segment was a SYN. Docs and rc.conf settings to follow.	1999-08-17 12:17:53 +00:00
jmb	a686f581bc	fix comment re: RST received in TIME_WAIT to match the code.	1999-07-18 14:42:48 +00:00
peter	73556bfee1	Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.	1999-05-06 18:13:11 +00:00
billf	dd35516544	Add sysctl descriptions to many SYSCTL_XXXs PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)	1999-05-03 23:57:32 +00:00
fenner	51a5faf6ae	Use snd_nxt, not rcv_nxt, when calculating the ISS during TIME_WAIT. This was missed in the 4.4-Lite2 merge. Noticed by: Mohan Parthasarathy <Mohan.Parthasarathy@eng.Sun.COM> and jayanth@loc201.tandem.com (vijayaraghavan_jayanth) on the tcp-impl mailing list.	1999-02-06 00:47:45 +00:00
dillon	dbf5cd2b57	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 22:42:27 +00:00
dillon	ed174536c8	Reviewed by: freebsd-current Add ICMP_BANDLIM option and 'net.inet.icmp.icmplim' sysctl. If option is specified in kernel config, icmplim defaults to 100 pps. Setting it to 0 will disable the feature. This feature limits ICMP error responses for packets sent to bad tcp or udp ports, which does a lot to help the machine handle network D.O.S. attacks. The kernel will report packet rates that exceed the limit at a rate of one kernel printf per second. There is one issue in regards to the 'tail end' of an attack... the kernel will not output the last report until some unrelated and valid icmp error packet is returned at some point after the attack is over. This is a minor reporting issue only.	1998-12-03 20:23:21 +00:00
wollman	bc0a684817	Fix RST validation. PR: 7892 Submitted by: Don.Lewis@tsc.tdk.com	1998-09-11 16:04:03 +00:00
dfr	b9492066e9	Re-implement tcp and ip fragment reassembly to not store pointers in the ip header which can't work on alpha since pointers are too big. Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>	1998-08-24 07:47:39 +00:00
julian	22a5d80812	Support for IPFW based transparent forwarding. Any packet that can be matched by a ipfw rule can be redirected transparently to another port or machine. Redirection to another port mostly makes sense with tcp, where a session can be set up between a proxy and an unsuspecting client. Redirection to another machine requires that the other machine also be expecting to receive the forwarded packets, as their headers will not have been modified. /sbin/ipfw must be recompiled!!! Reviewed by: Peter Wemm <peter@freebsd.org> Submitted by: Chrisy Luke <chrisy@flix.net>	1998-07-06 03:20:19 +00:00
peter	10677f7b5c	Let the sowwakeup macro decide when to call sowakeup rather than have tcp "know" about it. A pending upcall would be missed, eg: used by NFS. Obtained from: NetBSD	1998-05-31 18:42:49 +00:00
guido	8a46909d1a	Grumble...It seems I'm suffering from some mental disease. Do it correct now.	1998-05-18 17:11:24 +00:00
guido	636533efef	Add some parentheses for clarity and fix a bug Pointed out by: Garrett Wollman	1998-05-18 17:07:58 +00:00
guido	8d0d7f7ab4	Refuse accelerated opens on listening sockets that have not set the TCP_NOPUSH socket option. This disables TAO for those services that do not know about T/TCP. Reviewed by: Garrett Wollman Submitted by: Peter Wemm	1998-05-04 17:59:52 +00:00
dg	c0b0bc1742	At the request of Garrett, changed sysctl: net.inet.tcp.delack_enabled -> net.inet.tcp.delayed_ack	1998-04-24 10:08:57 +00:00
des	396b114475	Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.	1998-04-17 22:37:19 +00:00
phk	e9827cb58f	Remove the last traces of TUBA. Inspired by: PR kern/3317	1998-04-06 06:52:47 +00:00
fenner	132de55f7b	Remove the check for SYN in SYN_RECEIVED state; it breaks simultaneous connect. This check was added as part of the defense against the "land" attack, to prevent attacks which guess the ISS from going into ESTABLISHED. The "src == dst" check will still prevent the single-homed case of the "land" attack, and guessing ISS's should be hard anyway. Submitted by: David Borman <dab@bsdi.com>	1998-03-20 00:43:29 +00:00
dg	abb797303f	Changes to support the addition of a new sysctl variable: net.inet.tcp.delack_enabled Which defaults to 1 and can be set to 0 to disable TCP delayed-ack processing (i.e. all acks are immediate).	1998-02-26 05:25:39 +00:00
dg	7262ff6e58	Improved connection establishment performance by doing local port lookups via a hashed port list. In the new scheme, in_pcblookup() goes away and is replaced by a new routine, in_pcblookup_local() for doing the local port check. Note that this implementation is space inefficient in that the PCB struct is now too large to fit into 128 bytes. I might deal with this in the future by using the new zone allocator, but I wanted these changes to be extensively tested in their current form first. Also: 1) Fixed off-by-one errors in the port lookup loops in in_pcbbind(). 2) Got rid of some unneeded rehashing. Adding a new routine, in_pcbinshash() to do the initial hash insertion. 3) Renamed in_pcblookuphash() to in_pcblookup_hash() for easier readability. 4) Added a new routine, in_pcbremlists() to remove the PCB from the various hash lists. 5) Added/deleted comments where appropriate. 6) Removed unnecessary splnet() locking. In general, the PCB functions should be called at splnet()...there are unfortunately a few exceptions, however. 7) Reorganized a few structs for better cache line behavior. 8) Killed my TCP_ACK_HACK kludge. It may come back in a different form in the future, however. These changes have been tested on wcarchive for more than a month. In tests done here, connection establishment overhead is reduced by more than 50 times, thus getting rid of one of the major networking scalability problems. Still to do: make tcp_fasttimo/tcp_slowtimo scale well for systems with a large number of connections. tcp_fasttimo is easy; tcp_slowtimo is difficult. WARNING: Anything that knows about inpcb and tcpcb structs will have to be recompiled; at the very least, this includes netstat(1).	1998-01-27 09:15:13 +00:00
fenner	606a03ebe5	A more complete fix for the "land" attack, removing the "quick fix" from rev 1.66. This fix contains both belt and suspenders. Belt: ignore packets where src == dst and srcport == dstport in TCPS_LISTEN. These packets can only legitimately occur when connecting a socket to itself, which doesn't go through TCPS_LISTEN (it goes CLOSED->SYN_SENT->SYN_RCVD-> ESTABLISHED). This prevents the "standard" "land" attack, although doesn't prevent the multi-homed variation. Suspenders: send a RST in response to a SYN/ACK in SYN_RECEIVED state. The only packets we should get in SYN_RECEIVED are 1. A retransmitted SYN, or 2. An ack of our SYN/ACK. The "land" attack depends on us accepting our own SYN/ACK as an ACK; in SYN_RECEIVED state; this should prevent all "land" attacks. We also move up the sequence number check for the ACK in SYN_RECEIVED. This neither helps nor hurts with respect to the "land" attack, but puts more of the validation checking in one spot. PR: kern/5103	1998-01-21 02:05:59 +00:00
bde	75c4ef96e7	Don't use ANSI string concatenation to misformat a string.	1997-12-19 23:46:21 +00:00
wollman	390341dca5	Add Matt Dillon's quick fix hack for the self-connect DoS. PR: 5103	1997-11-20 20:04:49 +00:00
phk	4d26888936	Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused	1997-11-07 08:53:44 +00:00
bde	fb826377ff	Removed unused #includes.	1997-10-28 15:59:26 +00:00
dg	295181cc83	Killed the SYN_RECEIVED addition from rev 1.52. It results in legitimate RST's being ignored, keeping a connection around until it times out, and thus has the opposite effect of what was intended (which is to make the system more robust to DoS attacks).	1997-10-02 02:10:40 +00:00
fenner	e71cc90452	Don't consider a SYN/ACK with CC but no CCECHO a proper T/TCP handshake. Reviewed by: Rich Stevens <rstevens@kohala.com>	1997-09-30 16:38:09 +00:00
joerg	c65e27777e	Make TCPDEBUG a new-style option.	1997-09-16 18:36:06 +00:00
wollman	4542c1cf5d	Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to not malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.	1997-08-16 19:16:27 +00:00
jdp	3f044120cd	Fix a bug (apparently very old) that can cause a TCP connection to be dropped when it has an unusual traffic pattern. For full details as well as a test case that demonstrates the failure, see the referenced PR. Under certain circumstances involving the persist state, it is possible for the receive side's tp->rcv_nxt to advance beyond its tp->rcv_adv. This causes (tp->rcv_adv - tp->rcv_nxt) to become negative. However, in the code affected by this fix, that difference was interpreted as an unsigned number by max(). Since it was negative, it was taken as a huge unsigned number. The effect was to cause the receiver to believe that its receive window had negative size, thereby rejecting all received segments including ACKs. As the test case shows, this led to fruitless retransmissions and eventually to a dropped connection. Even connections using the loopback interface could be dropped. The fix substitutes the signed imax() for the unsigned max() function. PR: closes kern/3998 Reviewed by: davidg, fenner, wollman	1997-07-01 05:42:16 +00:00
wollman	6afbf203bd	The long-awaited mega-massive-network-code- cleanup. Part I. This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in. 2) Certain protocol entry points are modified to take a process structure, so they can easily tell whether or not it is possible to sleep, and also to access credentials. 3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed. 4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family. 5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem. As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.	1997-04-27 20:01:29 +00:00
peter	94b6d72794	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
jkh	808a36ef65	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
fenner	3227402145	Re-enable the TCP SYN-attack protection code. I was the one who didn't understand the socket state flag. 2.2 candidate.	1996-11-10 07:37:24 +00:00
pst	430faa57f9	Fix two bugs I accidently put into the syn code at the last minute (yes I had tested the hell out of this). I've also temporarily disabled the code so that it behaves as it previously did (tail drop's the syns) pending discussion with fenner about some socket state flags that I don't fully understand. Submitted by: fenner	1996-10-11 19:26:42 +00:00
dg	00503a161c	Improved in_pcblookuphash() to support wildcarding, and changed relevant callers of it to take advantage of this. This reduces new connection request overhead in the face of a large number of PCBs in the system. Thanks to David Filo <filo@yahoo.com> for suggesting this and providing a sample implementation (which wasn't used, but showed that it could be done). Reviewed by: wollman	1996-10-07 19:06:12 +00:00
pst	b51353f335	Increase robustness of FreeBSD against high-rate connection attempt denial of service attacks. Reviewed by: bde,wollman,olah Inspired by: vjs@sgi.com	1996-10-07 04:32:42 +00:00
pst	32aae00362	I don't understand, I committed this fix (move a counter and fixed a typo) this evening. I think I'm going insane.	1996-09-21 06:39:20 +00:00
ache	4b550cd4a8	Syntax error: so_incom -> so_incomp	1996-09-21 06:30:06 +00:00
pst	dd5375bf7c	If the incomplete listen queue for a given socket is full, drop the oldest entry in the queue. There was a fair bit of discussion as to whether or not the proper action is to drop a random entry in the queue. It's my conclusion that a random drop is better than a head drop, however profiling this section of code (done by John Capo) shows that a head-drop results in a significant performance increase. There are scenarios where a random drop is more appropriate. If I find one in reality, I'll add the random drop code under a conditional. Obtained from: discussions and code done by Vernon Schryver (vjs@sgi.com).	1996-09-20 21:25:18 +00:00
pst	460ca264a3	Make the misnamed tcp initial keepalive timer value (which is really the time, in seconds, that state for non-established TCP sessions stays about) a sysctl modifyable variable. [part 1 of two commits, I just realized I can't play with the indices as I was typing this commit message.]	1996-09-13 23:51:44 +00:00
pst	3499a65964	Receipt of two SYN's are sufficient to set the t_timer[TCPT_KEEP] to "keepidle". this should not occur unless the connection has been established via the 3-way handshake which requires an ACK Submitted by: jmb Obtained from: problem discussed in Stevens vol. 3	1996-09-13 18:47:03 +00:00
fenner	1284248a56	Back out my stupid braino; I was thinking strlen and not sizeof.	1996-05-02 05:54:14 +00:00
fenner	2163f66f9a	Size temp var correctly; buf[4*sizeof "123"] is not long enough to store "192.252.119.189\0".	1996-05-02 05:31:13 +00:00
ache	8a5de28c05	inet_ntoa buffer was evaluated twice in log_in_vain, fix it. Thanx to: jdp	1996-04-27 18:19:12 +00:00
wollman	c9ab94c878	Delete #ifdef notdef blocks containing old method of srtt calculation. Requested by: davidg	1996-04-26 18:32:58 +00:00
pst	67931eee29	Logging UDP and TCP connection attempts should not be enabled by default. It's trivial to create a denial of service attack on a box so enabled. These messages, if enabled at all, must be rate-limited. (!)	1996-04-09 07:01:53 +00:00
phk	1eff72b85f	Log TCP syn packets for ports we don't listen on. Controlled by: sysctl net.inet.tcp.log_in_vain: 1 Log UDP syn packets for ports we don't listen on. Controlled by: sysctl net.inet.udp.log_in_vain: 1 Suggested by: Warren Toomey <wkt@cs.adfa.oz.au>	1996-04-04 10:46:44 +00:00
wollman	444648d459	Slight modification of RTO floor calculation.	1996-03-25 20:13:21 +00:00
wollman	acfe4c4467	A number of performance-reducing flaws fixed based on comments from Larry Peterson &co. at Arizona: - Header prediction for ACKs did not exclude Fast Retransmit/Recovery. - srtt calculation tended to get ``stuck'' and could never decrease when below 8. It still can't, but the scaling factors are adjusted so that this artifact does not cause as bad an effect on the RTO value as it used to. The paper also points out the incr/8 error that has been long since fixed, and the problems with ACKing frequency resulting from the use of options which I suspect to be fixed already as well (as part of the T/TCP work). Obtained from: Brakmo & Peterson, ``Performance Problems in BSD4.4 TCP''	1996-03-22 18:09:21 +00:00
dg	3f0638f73b	Move or add #include <queue.h> in preparation for upcoming struct socket changes.	1996-03-11 15:13:58 +00:00
guido	89b4ca893f	Add a counter for the number of times the listen queue was overflowed to the tcpstat structure. (netstat -s) Reviewed by: wollman Obtained from: Steves, TCP/IP Ill. vol.3, page 189	1996-02-26 21:47:13 +00:00
dg	41aff73dfb	Fixed bug in Path MTU Discovery that caused the system to have to re- discover the Path MTU for each connection if the connecting host didn't offer an initial MSS. Submitted by: davidg & olah	1996-02-22 11:46:39 +00:00
olah	09077f7acf	Fix a bug related to the interworking of T/TCP and window scaling: when a connection enters the ESTBLS state using T/TCP, then window scaling wasn't properly handled. The fix is twofold. 1) When the 3WHS completes, make sure that we update our window scaling state variables. 2) When setting the `virtual advertized window', then make sure that we do not try to offer a window that is larger than the maximum window without scaling (TCP_MAXWIN). Reviewed by: davidg Reported by: Jerry Chen <chen@Ipsilon.COM>	1996-01-31 08:22:24 +00:00
phk	9cb413a93c	Another mega commit to staticize things.	1995-12-14 09:55:16 +00:00
phk	db2c71245d	New style sysctl & staticize alot of stuff.	1995-11-14 20:34:56 +00:00
phk	c1dbd5b377	Start adding new style sysctl here too.	1995-11-09 20:23:09 +00:00
olah	d4e1ca409e	Cosmetic changes to processing of segments in the SYN_SENT state: - remove a redundant condition; - complete all validity checks on segment before calling soisconnected(so). Reviewed by: Richard Stevens, davidg, wollman	1995-11-03 22:31:54 +00:00
wollman	7c65eebe94	Routes can be asymmetric. Always offer to /accept/ an MSS of up to the capacity of the link, even if the route's MTU indicates that we cannot send that much in their direction. (This might actually make it possible to test Path MTU discovery in a useful variety of cases.)	1995-10-13 16:00:25 +00:00
wollman	3fc43db861	Finish 4.4-Lite-2 merge: randomize TCP initial sequence numbers to make ISS-guessing spoofing attacks harder.	1995-10-03 16:54:17 +00:00
olah	fd35d46e41	Remove a redundant `if' from tcp_reass(). Correct a typo in a comment (SEND_SYN -> NEEDSYN). Reviewed by: David Greenman	1995-07-31 10:24:22 +00:00
wollman	ae6523c0e5	tcp_input.c - keep track of how many times a route contained a cached rtt or ssthresh that we were able to use tcp_var.h - declare tcpstat entries for above; declare tcp_{send,recv}space in_rmx.c - fill in the MTU and pipe sizes with the defaults TCP would have used anyway in the absence of values here	1995-07-10 15:39:16 +00:00
wollman	35b757bd67	Keep track of the number of samples through the srtt filter so that we know better when to cache values in the route, rather than relying on a heuristic involving sequence numbers that broke when tcp_sendspace was increased to 16k.	1995-06-29 18:11:24 +00:00
rgrimes	c86f0c7a71	Remove trailing whitespace.	1995-05-30 08:16:23 +00:00
dg	fc19afab24	#ifdef'd my Nagel/ACK hack with "TCP_ACK_HACK", disabled by default. I'm currently considering reducing the TCP fasttimo to 100ms to help improve things, but this would be done as a separate step at some point in the future. This was done because it was causing some sometimes serious performance problems with T/TCP.	1995-05-11 01:41:06 +00:00
olah	e994f8f005	Fix a misspelled constant in tcp_input.c. On Tue, 09 May 1995 04:35:27 PDT, Richard Stevens wrote: > In tcp_dooptions() under the case TCPOPT_CC there is an assignment > > to->to_flag |= TCPOPT_CC; > > that should be > > to->to_flag |= TOF_CC; > > I haven't thought through the ramifications of what's been happening ... > > Rich Stevens Submitted by: rstevens@noao.edu (Richard Stevens)	1995-05-09 12:32:06 +00:00
dg	b8a73effc2	Changed in_pcblookuphash() to not automatically call in_pcblookup() if the lookup fails. Updated callers to deal with this. Call in_pcblookuphash instead of in_pcblookup() in in_pcbconnect; this improves performance of UDP output by about 17% in the standard case.	1995-05-03 07:16:53 +00:00
dg	30e9776583	Further satisfy my paranoia by making sure that the ACKNOW is only set when ti_len is non-zero.	1995-04-10 17:37:46 +00:00
dg	95eb1b8365	Fixed bug I introduced with my Nagel hack which caused tcp_input and tcp_output to loop endlessly. This was freefall's problem during the past day.	1995-04-10 17:16:10 +00:00
dg	919fdebd0e	Implemented PCB hashing. Includes new functions in_pcbinshash, in_pcbrehash, and in_pcblookuphash.	1995-04-09 01:29:31 +00:00
olah	ccccd069f4	Fix a bug in tcp_input reported by Rick Jones <raj@hpisrdq.cup.hp.com>. If a goto findpcb occurred during the processing of a segment, the TCP and IP headers were dropped twice from the mbuf which resulted in data acked by TCP but not delivered to the user. Reviewed by: davidg	1995-04-05 10:32:14 +00:00
dg	b8b34df69c	Re-apply my "breakage" to the Nagel congestion avoidence. This version differs slightly in the logic from the previous version; packets are now acked immediately if the sender set PUSH.	1995-03-27 07:12:24 +00:00
bde	289f11acb4	Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.	1995-03-16 18:17:34 +00:00
wollman	fb4135032a	Avoid deadlock situation described by Stevens using his suggested replacement code. Obtained from: Stevens, vol. 2, pp. 959-960	1995-02-16 01:39:19 +00:00
wollman	0f1c96e359	Transaction TCP support now standard. Hack away!	1995-02-16 00:55:44 +00:00
phk	832a9eda23	YFfix.	1995-02-14 06:28:25 +00:00
wollman	58747a5507	Get rid of some unneeded #ifdef TTCP lines. Also, get rid of some bogus commons declared in header files.	1995-02-14 02:35:19 +00:00
wollman	72af2aa44a	Merge Transaction TCP, courtesy of Andras Olah <olah@cs.utwente.nl> and Bob Braden <braden@isi.edu>. NB: This has not had David's TCP ACK hack re-integrated. It is not clear what the correct solution to this problem is, if any. If a better solution doesn't pop up in response to this message, I'll put David's code back in (or he's welcome to do so himself).	1995-02-09 23:13:27 +00:00
wollman	d67ef41179	As suggested by Sally Floyd, don't add the ``small fraction of the window size'' when doing congestion avoidance. Submitted by: Mark Andrews	1994-10-13 18:36:32 +00:00
phk	f3c1ed2327	GCC cleanup. Reviewed by: Submitted by: Obtained from:	1994-10-02 17:48:58 +00:00
dg	309f1c3e76	Made TCPDEBUG truely optional. Based on changes I made in FreeBSD 1.1.5. Fixed somebody's idea of a joke - about the first half of the lines in in_proto.c were spaced over by one space.	1994-09-15 10:36:56 +00:00
wollman	9489a8e21b	Obey RFC 793, section 3.4: Several examples of connection initiation follow. Although these examples do not show connection synchronization using data-carrying segments, this is perfectly legitimate, so long as the receiving TCP doesn't deliver the data to the user until it is clear the data is valid (i.e., the data must be buffered at the receiver until the connection reaches the ESTABLISHED state).	1994-08-26 22:27:16 +00:00
wollman	f9fc827448	Fix up some sloppy coding practices: - Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above. NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.	1994-08-18 22:36:09 +00:00
dg	8d205697aa	Added $Id$	1994-08-02 07:55:43 +00:00
dg	f55740d974	Fixed bug with Nagel Congestion Avoidance where a tcp connection would stall unnecessarily - always send an ACK when a packet len of < mss is received.	1994-08-01 12:00:25 +00:00
dg	28839c3795	Added missing ntohl()'s that are needed before calling IN_MULTICAST in a couple of places. Submitted by: Johannes Helander	1994-05-26 09:51:33 +00:00
rgrimes	2469c867a1	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman	1994-05-25 09:21:21 +00:00
rgrimes	27464aaa8e	BSD 4.4 Lite Kernel Sources	1994-05-24 10:09:53 +00:00
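
Commit 3f044120cd above describes a signedness bug: the difference tp->rcv_adv - tp->rcv_nxt went negative, the unsigned max() read it as a huge value, and the fix was to use the signed imax(). The following is a minimal sketch of that effect, not the kernel code itself. All function names here (seq_diff, umax32, imax32, window_unsigned_max, window_signed_imax) are hypothetical stand-ins, and the 32-bit types are an assumption made for illustration.

```c
#include <stdint.h>

/* Signed distance between two 32-bit sequence numbers, computed without
 * relying on implementation-defined unsigned-to-signed conversion. */
static int32_t seq_diff(uint32_t a, uint32_t b)
{
	uint32_t d = a - b;	/* wraps modulo 2^32 */

	if (d <= (uint32_t)INT32_MAX)
		return (int32_t)d;
	return -(int32_t)(UINT32_MAX - d) - 1;
}

/* Stand-in for an unsigned max(): both arguments are treated as unsigned. */
static uint32_t umax32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Stand-in for a signed imax(). */
static int32_t imax32(int32_t a, int32_t b)
{
	return a > b ? a : b;
}

/* Pre-fix behaviour: a negative difference is converted to unsigned and
 * therefore compares as a very large number. */
static uint32_t window_unsigned_max(int32_t win, uint32_t rcv_adv, uint32_t rcv_nxt)
{
	return umax32((uint32_t)win, (uint32_t)seq_diff(rcv_adv, rcv_nxt));
}

/* Post-fix behaviour: the comparison stays signed, so the buffer space wins
 * whenever rcv_nxt has advanced past rcv_adv. */
static int32_t window_signed_imax(int32_t win, uint32_t rcv_adv, uint32_t rcv_nxt)
{
	return imax32(win, seq_diff(rcv_adv, rcv_nxt));
}
```

With 4096 bytes of buffer space, rcv_adv = 1000 and rcv_nxt = 1100, the unsigned variant yields 4294967196 (the bit pattern of -100), while the signed variant yields 4096. When rcv_adv is ahead of rcv_nxt the two variants agree.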

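Commits 2163f66f9a and 1284248a56 above turn on the difference between sizeof and strlen for a string literal: sizeof "123" counts the terminating NUL, so 4*sizeof "123" is 16 bytes, which is exactly enough for a maximum-length dotted quad plus its terminator. The sketch below only illustrates that arithmetic. DOTTED_QUAD_BUFLEN and dotted_quad_bytes_needed are hypothetical names, not identifiers from the source tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* sizeof "123" is 4 (three digits plus the NUL), so this is 16. */
enum { DOTTED_QUAD_BUFLEN = 4 * sizeof "123" };

/* Bytes needed to store a dotted-quad string, including its terminator. */
static size_t dotted_quad_bytes_needed(const char *addr)
{
	assert(addr != NULL);
	return strlen(addr) + 1;
}
```

The longest address from the commit message, "192.252.119.189", is 15 characters and needs 16 bytes, so it fits. Had the size been written as 4*strlen("123"), the result would have been 12, which is too small.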

360 Commits