freebsd-skq

Author	SHA1	Message	Date
Ruslan Ermilov	38c1bc358b	Avoid a NULL pointer dereference introduced in rev. 1.129. Problem noticed by: bde, gcc(1) Panic caught by: mjacob Patch tested by: mjacob	2001-07-23 16:50:01 +00:00
Ruslan Ermilov	f2c2962ee5	Backout non-functional changes from revision 1.128. Not objected to by: dcs	2001-07-19 07:10:30 +00:00
Daniel C. Sobral	3afefa3924	Skip the route checking in the case of multicast packets with known interfaces. Reviewed by: people at that channel Approved by: silence on -net	2001-07-17 18:47:48 +00:00
Ruslan Ermilov	9f81cc840b	Backout damage to the INADDR_TO_IFP() macro in revision 1.7. This macro was supposed to only match local IP addresses of interfaces, and all consumers of this macro assume this as well. (See IP_MULTICAST_IF and IP_ADD_MEMBERSHIP socket options in the ip(4) manpage.) This fixes a major security breach in IPFW-based firewalls where the `me' keyword would match the other end of a P2P link. PR: kern/28567	2001-07-17 10:30:21 +00:00
David E. O'Brien	81e561cdf2	Bump net.inet.tcp.sendspace to 32k and net.inet.tcp.recvspace to 65k. This should help us in naive benchmark "tests". It seems a wide number of people think 32k buffers would not cause major issues, and they are in fact in use by many other OS's at this time. The receive buffers can be bumped higher as buffers are hardly used and several research papers indicate that receive buffers rarely use much space at all. Submitted by: Leo Bicknell <bicknell@ufp.org> <20010713101107.B9559@ussenterprise.ufp.org> Agreed to in principle by: dillon (at the 32k level)	2001-07-13 18:38:04 +00:00
Ruslan Ermilov	a307d59838	mdoc(7) police: removed HISTORY info from the .Os call.	2001-07-10 13:41:46 +00:00
Mike Silbersack	2d610a5028	Temporary feature: Runtime tuneable tcp initial sequence number generation scheme. Users may now select between the currently used OpenBSD algorithm and the older random positive increment method. While the OpenBSD algorithm is more secure, it also breaks TIME_WAIT handling; this is causing trouble for an increasing number of folks. To switch between generation schemes, one sets the sysctl net.inet.tcp.tcp_seq_genscheme. 0 = random positive increments, 1 = the OpenBSD algorithm. 1 is still the default. Once a secure _and_ compatible algorithm is implemented, this sysctl will be removed. Reviewed by: jlemon Tested by: numerous subscribers of -net	2001-07-08 02:20:47 +00:00
Brooks Davis	53dab5fe7b	gif(4) and stf(4) modernization: - Remove gif dependencies from stf. - Make gif and stf into modules - Make gif cloneable. PR: kern/27983 Reviewed by: ru, ume Obtained from: NetBSD MFC after: 1 week	2001-07-02 21:02:09 +00:00
Crist J. Clark	92a99815a8	While in there fixing a fragment logging bug, fix it so we log fragments "right." Log fragment information tcpdump(8)-style, Jul 1 19:38:45 bubbles /boot/kernel/kernel: ipfw: 1000 Accept ICMP:8.0 192.168.64.60 192.168.64.20 in via ep0 (frag 53113:1480@0+) That is, instead of the old, ... Fragment = <offset/8> Do, ... (frag <IP ID>:<data len>@<offset>[+]) PR: kern/23446 Approved by: ru MFC after: 1 week	2001-07-02 15:50:31 +00:00
Ruslan Ermilov	8bf82a92d5	Backout CSRG revision 7.22 to this file (if in_losing notices an RTF_DYNAMIC route, it got freed twice). I am not sure what the actual problem was in 1992, but the current behavior is a memory leak if PCB holds a reference to a dynamically created/modified routing table entry. (rt_refcnt>0 and we don't call rtfree().) My test bed was: 1. Set net.inet.tcp.msl to a low value (for test purposes), e.g., 5 seconds, to speed up the transition of TCP connection to a "closed" state. 2. Add a network route which causes ICMP redirect from the gateway. 3. ping(8) host H that matches this route; this creates RTF_DYNAMIC RTF_HOST route to H. (I was forced to use ICMP to cause gateway to generate ICMP host redirect, because gateway in question is a 4.2-STABLE system vulnerable to a problem that was fixed later in ip_icmp.c,v 1.39.2.6, and TCP packets with DF bit set were triggering this bug.) 4. telnet(1) to H 5. Block access to H with ipfw(8) 6. Send something in telnet(1) session; this causes EPERM, followed by an in_losing() call in a few seconds. 7. Delete ipfw(8) rule blocking access to H, and wait for TCP connection moving to a CLOSED state; PCB is freed. 8. Delete host route to H. 9. Watch with netstat(1) that `rttrash' increased. 10. Repeat steps 3-9, and watch `rttrash' increases. PR: kern/25421 MFC after: 2 weeks	2001-06-29 12:07:29 +00:00
Ruslan Ermilov	3277d1c498	Fixed the brain-o in rev. 1.10: the logic check was reversed. Reported by: Bernd Fuerwitt <bf@fuerwitt.de>	2001-06-27 14:11:25 +00:00
Ruslan Ermilov	a447a5ae06	Bring in fix from NetBSD's revision 1.16: Pass the correct destination address for the route-to-gateway case. PR: kern/10607 MFC after: 2 weeks	2001-06-26 09:00:50 +00:00
David Malone	7ce87f1205	Allow getcred sysctl to work in jailed root processes. Processes can only do getcred calls for sockets which were created in the same jail. This should allow the ident to work in a reasonable way within jails. PR: 28107 Approved by: des, rwatson	2001-06-24 12:18:27 +00:00
Jonathan Lemon	f962cba5c3	Replace bzero() of struct ip with explicit zeroing of structure members, which is faster.	2001-06-23 17:44:27 +00:00
Ruslan Ermilov	c73d99b567	Add netstat(1) knob to reset net.inet.{ip|icmp|tcp|udp|igmp}.stats. For example, ``netstat -s -p ip -z'' will show and reset IP stats. PR: bin/17338	2001-06-23 17:17:59 +00:00
Mike Silbersack	08517d530e	Eliminate the allocation of a tcp template structure for each connection. The information contained in a tcptemp can be reconstructed from a tcpcb when needed. Previously, tcp templates required the allocation of one mbuf per connection. On large systems, this change should free up a large number of mbufs. Reviewed by: bmilekic, jlemon, ru MFC after: 2 weeks	2001-06-23 03:21:46 +00:00
Munechika SUMIKAWA	a96c00661a	- Renumber KAME local ICMP types and NDP option numbers because they are duplicated by newly defined types/options in RFC3121 - We have no backward compatibility issue. There are no apps in our distribution which use the above types/options. Obtained from: KAME MFC after: 2 weeks	2001-06-21 07:08:43 +00:00
Hajimu UMEMOTO	ff2428299f	made sure to use the correct sa_len for rtalloc(). sizeof(ro_dst) is not necessarily the correct one. this change would also fix the recent path MTU discovery problem for the destination of an incoming TCP connection. Submitted by: JINMEI Tatuya <jinmei@kame.net> Obtained from: KAME MFC after: 2 weeks	2001-06-20 12:32:48 +00:00
Jonathan Lemon	08aadfbb98	Do not perform arp send/resolve on an interface marked NOARP. PR: 25006 MFC after: 2 weeks	2001-06-15 21:00:32 +00:00
Peter Wemm	215db1379e	Fix a stack of KAME netinet6/in6.h warnings: 592: warning: `struct mbuf' declared inside parameter list 595: warning: `struct ifnet' declared inside parameter list	2001-06-15 00:37:27 +00:00
Hajimu UMEMOTO	3384154590	Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz, and some critical problems found after the snap was out were fixed. There are many, many changes since the last KAME merge. TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT. Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks	2001-06-11 12:39:29 +00:00
Jesper Skriver	96c2b04290	Make the default value of net.inet.ip.maxfragpackets and net.inet6.ip6.maxfragpackets dependent on nmbclusters, defaulting to nmbclusters / 4 Reviewed by: bde MFC after: 1 week	2001-06-10 11:04:10 +00:00
Peter Wemm	0978669829	"Fix" the previous initial attempt at fixing TUNABLE_INT(). This time around, use a common function for looking up and extracting the tunables from the kernel environment. This saves duplicating the same function over and over again. This way typically has an overhead of 8 bytes + the path string, versus about 26 bytes + the path string.	2001-06-08 05:24:21 +00:00
Jonathan Lemon	0a52f59c36	Move IPFilter into contrib.	2001-06-07 05:13:35 +00:00
Peter Wemm	4422746fdf	Back out part of my previous commit. This was a last minute change and I botched testing. This is a perfect example of how NOT to do this sort of thing. :-(	2001-06-07 03:17:26 +00:00
Peter Wemm	81930014ef	Make the TUNABLE_() macros look and behave more consistently like the SYSCTL_() macros. TUNABLE_INT_DECL() was an odd name because it didn't actually declare the int, which is what the name suggests it would do.	2001-06-06 22:17:08 +00:00
Jesper Skriver	65f28919b3	Silby's take one on increasing FreeBSD's resistance to SYN floods: One way we can reduce the amount of traffic we send in response to a SYN flood is to eliminate the RST we send when removing a connection from the listen queue. Since we are being flooded, we can assume that the majority of connections in the queue are bogus. Our RST is unwanted by these hosts, just as our SYN-ACK was. Genuine connection attempts will result in hosts responding to our SYN-ACK with an ACK packet. We will automatically return a RST response to their ACK when it gets to us if the connection has been dropped, so the early RST doesn't serve the genuine class of connections much. In summary, we can reduce the number of packets we send by a factor of two without any loss in functionality by ensuring that RST packets are not sent when dropping a connection from the listen queue. Submitted by: Mike Silbersack <silby@silby.com> Reviewed by: jesper MFC after: 2 weeks	2001-06-06 19:41:51 +00:00
Brian Somers	f987e1bd0f	Add BSD-style copyright headers Approved by: Charles Mott <cmott@scientech.com>	2001-06-04 15:09:51 +00:00
Brian Somers	888b1a7aa5	Change to a standard BSD-style copyright Approved by: Atsushi Murai <amurai@spec.co.jp>	2001-06-04 14:52:17 +00:00
Jesper Skriver	690a6055ff	Prevent denial of service using bogus fragmented IPv4 packets. An attacker sending a lot of bogus fragmented packets to the target (with different IPv4 identification field - ip_id), may be able to put the target machine into mbuf starvation state. By setting an upper limit on the number of reassembly queues we prevent this situation. This upper limit is controlled by the new sysctl net.inet.ip.maxfragpackets which defaults to 200, as in the IPv6 case. This should be sufficient for most systems, but you might want to increase it if you have lots of TCP sessions. I'm working on making the default value dependent on nmbclusters. If you want old behaviour (no upper limit) set this sysctl to a negative value. If you don't want to accept any fragments (not recommended) set the sysctl to 0 (zero). Obtained from: NetBSD MFC after: 1 week	2001-06-03 23:33:23 +00:00
Kris Kennaway	64dddc1872	Add ``options RANDOM_IP_ID'' which randomizes the ID field of IP packets. This closes a minor information leak which allows a remote observer to determine the rate at which the machine is generating packets, since the default behaviour is to increment a counter for each packet sent. Reviewed by: -net Obtained from: OpenBSD	2001-06-01 10:02:28 +00:00
David E. O'Brien	240ef84277	Back out jesper's 2001/05/31 14:58:11 PDT commit. It does not compile.	2001-06-01 09:51:14 +00:00
Jesper Skriver	2b1a209a17	Prevent denial of service using bogus fragmented IPv4 packets. An attacker sending a lot of bogus fragmented packets to the target (with different IPv4 identification field - ip_id), may be able to put the target machine into mbuf starvation state. By setting an upper limit on the number of reassembly queues we prevent this situation. This upper limit is controlled by the new sysctl net.inet.ip.maxfragpackets which defaults to NMBCLUSTERS/4 If you want old behaviour (no upper limit) set this sysctl to a negative value. If you don't want to accept any fragments (not recommended) set the sysctl to 0 (zero) Obtained from: NetBSD (partially) MFC after: 1 week	2001-05-31 21:57:29 +00:00
Jesper Skriver	7ceb778366	Disable rfc1323 and rfc1644 TCP extensions if we haven't got any response to our third SYN to work-around some broken terminal servers (most of which have hopefully been retired) that have bad VJ header compression code which trashes TCP segments containing unknown-to-them TCP options. PR: kern/1689 Submitted by: jesper Reviewed by: wollman MFC after: 2 weeks	2001-05-31 19:24:49 +00:00
Ruslan Ermilov	79ec1c507a	Add an integer field to keep protocol-specific flags with links. For FTP control connection, keep the CRLF end-of-line termination status in there. Fixed the bug when the first FTP command in a session was ignored. PR: 24048 MFC after: 1 week	2001-05-30 14:24:35 +00:00
Jesper Skriver	e4b6428171	Inline TCP_REASS() in the single location where it's used, just as OpenBSD and NetBSD have done. No functional difference. MFC after: 2 weeks	2001-05-29 19:54:45 +00:00
Jesper Skriver	853be1226e	properly delay acks in half-closed TCP connections PR: 24962 Submitted by: Tony Finch <dot@dotat.at> MFC after: 2 weeks	2001-05-29 19:51:45 +00:00
Ruslan Ermilov	9185426827	In in_ifadown(), differentiate between whether the interface goes down or interface address is deleted. Only delete static routes in the latter case. Reported by: Alexander Leidinger <Alexander@leidinger.net>	2001-05-11 14:37:34 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Jesper Skriver	d1745f454d	Say goodbye to TCP_COMPAT_42 Reviewed by: wollman Requested by: wollman	2001-04-20 11:58:56 +00:00
Kris Kennaway	f0a04f3f51	Randomize the TCP initial sequence numbers more thoroughly. Obtained from: OpenBSD Reviewed by: jesper, peter, -developers	2001-04-17 18:08:01 +00:00
Darren Reed	454a43c1f1	fix security hole created by fragment cache	2001-04-06 15:52:28 +00:00
Bill Fumerola	0901f62e11	pipe/queue are the only consumers of flow_id, so only set it in those cases	2001-04-06 06:52:25 +00:00
Jesper Skriver	b77d155dd3	MFC candidate. Change code from PRC_UNREACH_ADMIN_PROHIB to PRC_UNREACH_PORT for ICMP_UNREACH_PROTOCOL and ICMP_UNREACH_PORT And let TCP treat PRC_UNREACH_PORT like PRC_UNREACH_ADMIN_PROHIB This should fix the case where port unreachables for udp returned ENETRESET instead of ECONNREFUSED Problem found by: Bill Fenner <fenner@research.att.com> Reviewed by: jlemon	2001-03-28 14:13:19 +00:00
Ruslan Ermilov	4a558355e5	MAN[1-9] -> MAN.	2001-03-27 17:27:19 +00:00
Yaroslav Tykhiy	4cbc8ad1bb	Add a missing m_pullup() before a mtod() in in_arpinput(). PR: kern/22177 Reviewed by: wollman	2001-03-27 12:34:58 +00:00
Hidetoshi Shimokawa	110a013333	Replace dyn_fin_lifetime with dyn_ack_lifetime for half-closed state. Half-closed state could last long for some connections and fin_lifetime (default 20sec) is too short for that. OK'ed by: luigi	2001-03-27 05:28:30 +00:00
Poul-Henning Kamp	f83880518b	Send the remains (such as I have located) of "block major numbers" to the bit-bucket.	2001-03-26 12:41:29 +00:00
Brian Somers	71593f95e0	Make header files conform to style(9). Reviewed by (): bde () alias_local.h only got a cursory glance.	2001-03-25 12:05:10 +00:00
Brian Somers	adad9908fa	Remove an extraneous declaration.	2001-03-25 03:34:29 +00:00
Hajimu UMEMOTO	2da24fa6e9	IPv4 address is not unsigned int. This change introduces in_addr_t. PR: 9982 Adviced by: des Reviewed by: -alpha and -net (no objection) Obtained from: OpenBSD	2001-03-23 18:59:31 +00:00
Brian Somers	30fcf11451	Remove (non-protected) variable names from function prototypes.	2001-03-22 11:55:26 +00:00
Paul Richards	1789d85615	Only flush rules that have a rule number above that set by a new sysctl, net.inet.ip.fw.permanent_rules. This allows you to install rules that are persistent across flushes, which is very useful if you want a default set of rules that maintains your access to remote machines while you're reconfiguring the other rules. Reviewed by: Mark Murray <markm@FreeBSD.org>	2001-03-21 08:19:31 +00:00
Dag-Erling Smørgrav	c59319bf1a	Axe TCP_RESTRICT_RST. It was never a particularly good idea except for a few very specific scenarios, and now that we have had net.inet.tcp.blackhole for quite some time there is really no reason to use it any more. (last of three commits)	2001-03-19 22:09:00 +00:00
Ruslan Ermilov	1e3d5af041	Invalidate cached forwarding route (ipforward_rt) whenever a new route is added to the routing table, otherwise we may end up using the wrong route when forwarding. PR: kern/10778 Reviewed by: silence on -net	2001-03-19 09:16:16 +00:00
Ruslan Ermilov	4078ffb154	Make sure the cached forwarding route (ipforward_rt) is still up before using it. Not checking this may have caused the wrong IP address to be used when processing certain IP options (see example below). This also caused the wrong route to be passed to ip_output() when forwarding, but fortunately ip_output() is smart enough to detect this. This example demonstrates the wrong behavior of the Record Route option observed with this bug. Host ``freebsd'' is acting as the gateway for the ``sysv''. 1. On the gateway, we add the route to the destination. The new route will use the primary address of the loopback interface, 127.0.0.1: : freebsd# route add 10.0.0.66 -iface lo0 -reject : add host 10.0.0.66: gateway lo0 2. From the client, we ping the destination. We see the correct replies. Please note that this also causes the relevant route on the ``freebsd'' gateway to be cached in ipforward_rt variable: : sysv# ping -snv 10.0.0.66 : PING 10.0.0.66: 56 data bytes : ICMP Host Unreachable from gateway 192.168.0.115 : ICMP Host Unreachable from gateway 192.168.0.115 : ICMP Host Unreachable from gateway 192.168.0.115 : : ----10.0.0.66 PING Statistics---- : 3 packets transmitted, 0 packets received, 100% packet loss 3. On the gateway, we delete the route to the destination, thus making the destination reachable through the `default' route: : freebsd# route delete 10.0.0.66 : delete host 10.0.0.66 4. From the client, we ping destination again, now with the RR option turned on. The surprise here is the 127.0.0.1 in the first reply. This is caused by the bug in ip_rtaddr() not checking the cached route is still up before use. The debug code also shows that the wrong (down) route is further passed to ip_output().
The latter detects that the route is down, and replaces the bogus route with the valid one, so we see the correct replies (192.168.0.115) on further probes: : sysv# ping -snRv 10.0.0.66 : PING 10.0.0.66: 56 data bytes : 64 bytes from 10.0.0.66: icmp_seq=0. time=10. ms : IP options: <record route> 127.0.0.1, 10.0.0.65, 10.0.0.66, : 192.168.0.65, 192.168.0.115, 192.168.0.120, : 0.0.0.0(Current), 0.0.0.0, 0.0.0.0 : 64 bytes from 10.0.0.66: icmp_seq=1. time=0. ms : IP options: <record route> 192.168.0.115, 10.0.0.65, 10.0.0.66, : 192.168.0.65, 192.168.0.115, 192.168.0.120, : 0.0.0.0(Current), 0.0.0.0, 0.0.0.0 : 64 bytes from 10.0.0.66: icmp_seq=2. time=0. ms : IP options: <record route> 192.168.0.115, 10.0.0.65, 10.0.0.66, : 192.168.0.65, 192.168.0.115, 192.168.0.120, : 0.0.0.0(Current), 0.0.0.0, 0.0.0.0 : : ----10.0.0.66 PING Statistics---- : 3 packets transmitted, 3 packets received, 0% packet loss : round-trip (ms) min/avg/max = 0/3/10	2001-03-18 13:04:07 +00:00
Poul-Henning Kamp	462b86fe91	<sys/queue.h> makeover.	2001-03-16 20:00:53 +00:00
Poul-Henning Kamp	ccd6f42dc9	Fix a style(9) nit.	2001-03-16 19:36:23 +00:00
Ruslan Ermilov	089cdfad78	net/route.c: A route generated from an RTF_CLONING route had the RTF_WASCLONED flag set but did not have a reference to the parent route, as documented in the rtentry(9) manpage. This prevented such routes from being deleted when their parent route is deleted. Now, for example, if you delete an IP address from a network interface, all ARP entries that were cloned from this interface route are flushed. This also has an impact on netstat(1) output. Previously, dynamically created ARP cache entries (RTF_STATIC flag is unset) were displayed as part of the routing table display (-r). Now, they are only printed if the -a option is given. netinet/in.c, netinet/in_rmx.c: When address is removed from an interface, also delete all routes that point to this interface and address. Previously, for example, if you changed the address on an interface, outgoing IP datagrams might still use the old address. The only solution was to delete and re-add some routes. (The problem is easily observed with the route(8) command.) Note, that if the socket was already bound to the local address before this address is removed, new datagrams generated from this socket will still be sent from the old address. PR: kern/20785, kern/21914 Reviewed by: wollman (the idea)	2001-03-15 14:52:12 +00:00
Ruslan Ermilov	206a3274ef	RFC768 (UDP) requires that "if the computed checksum is zero, it is transmitted as all ones". This got broken after introduction of delayed checksums as follows. Some guys (including Jonathan) think that it is allowed to transmit all ones in place of a zero checksum for TCP the same way as for UDP. (The discussion still takes place on -net.) Thus, the 0 -> 0xffff checksum fixup was first moved from udp_output() (see udp_usrreq.c, 1.64 -> 1.65) to in_cksum_skip() (see sys/i386/i386/in_cksum.c, 1.17 -> 1.18, INVERT expression). Besides that I disagree that it is valid for TCP, there was no real problem until in_cksum.c,v 1.20, where the in_cksum() was made just a special version of in_cksum_skip(). The side effect was that now every incoming IP datagram failed to pass the checksum test (in_cksum() returned 0xffff when it should actually return zero). It was fixed next day in revision 1.21, by removing the INVERT expression. The latter also broke the 0 -> 0xffff fixup for UDP checksums. Before this change: : tcpdump: listening on lo0 : 127.0.0.1.33005 > 127.0.0.1.33006: udp 0 (ttl 64, id 1) : 4500 001c 0001 0000 4011 7cce 7f00 0001 : 7f00 0001 80ed 80ee 0008 0000 After this change: : tcpdump: listening on lo0 : 127.0.0.1.33005 > 127.0.0.1.33006: udp 0 (ttl 64, id 1) : 4500 001c 0001 0000 4011 7cce 7f00 0001 : 7f00 0001 80ed 80ee 0008 ffff	2001-03-13 17:07:06 +00:00
Ruslan Ermilov	fb9aaba000	Count and show incoming UDP datagrams with no checksum.	2001-03-13 13:26:06 +00:00
Poul-Henning Kamp	503d3c0277	Correctly cleanup in case of failure to bind a pcb. PR: 25751 Submitted by: <unicorn@Forest.Od.UA>	2001-03-12 21:53:23 +00:00
Jonathan Lemon	1db24ffb98	Unbreak LINT. Pointed out by: phk	2001-03-12 02:57:42 +00:00
Ian Dowse	5d936aa181	In ip_output(), initialise `ia' in the case where the packet has come from a dummynet pipe. Without this, the code which increments the per-ifaddr stats can dereference an uninitialised pointer. This should make dummynet usable again. Reported by: "Dmitry A. Yanko" <fm@astral.ntu-kpi.kiev.ua> Reviewed by: luigi, joe	2001-03-11 17:50:19 +00:00
Ruslan Ermilov	8ce3f3dd28	Make it possible to use IP_TTL and IP_TOS setsockopt(2) options on certain types of SOCK_RAW sockets. Also, use the ip.ttl MIB variable instead of MAXTTL constant as the default time-to-live value for outgoing IP packets all over the place, as we already do this for TCP and UDP. Reviewed by: wollman	2001-03-09 12:22:51 +00:00
Jonathan Lemon	c0647e0d07	Push the test for a disconnected socket when accept()ing down to the protocol layer. Not all protocols behave identically. This fixes the brokenness observed with unix-domain sockets (and postfix)	2001-03-09 08:16:40 +00:00
Jonathan Lemon	32676c2d1f	The TCP sequence number used for sending a RST with the ipfw reset rule is already in host byte order, so do not swap it again. Reviewed by: bfumerola	2001-03-09 08:13:08 +00:00
Ian Dowse	bfef7ed45c	It was possible for ip_forward() to supply to icmp_error() an IP header with ip_len in network byte order. For certain values of ip_len, this could cause icmp_error() to write beyond the end of an mbuf, causing mbuf free-list corruption. This problem was observed during generation of ICMP redirects. We now make quite sure that the copy of the IP header kept for icmp_error() is stored in a non-shared mbuf header so that it will not be modified by ip_output(). Also: - Calculate the correct number of bytes that need to be retained for icmp_error(), instead of assuming that 64 is enough (it's not). - In icmp_error(), use m_copydata instead of bcopy() to copy from the supplied mbuf chain, in case the first 8 bytes of IP payload are not stored directly after the IP header. - Sanity-check ip_len in icmp_error(), and panic if it is less than sizeof(struct ip). Incoming packets with bad ip_len values are discarded in ip_input(), so this should only be triggered by bugs in the code, not by bad packets. This patch results from code and suggestions from Ruslan, Bosko, Jonathan Lemon and Matt Dillon, with important testing by Mike Tancsa, who could reproduce this problem at will. Reported by: Mike Tancsa <mike@sentex.net> Reviewed by: ru, bmilekic, jlemon, dillon	2001-03-08 19:03:26 +00:00
Don Lewis	a8f1210095	Modify the comments to more closely resemble the English language.	2001-03-05 22:40:27 +00:00
Don Lewis	3f67c83439	Move the loopback net check closer to the beginning of ip_input() so that it doesn't block packets whose destination address has been translated to the loopback net by ipnat. Add warning comments about the ip_checkinterface feature.	2001-03-05 08:45:05 +00:00
Bosko Milekic	234ff7c46f	During a flood, we don't call rtfree(), but we remove the entry ourselves. However, if the RTF_DELCLONE and RTF_WASCLONED condition passes, but the ref count is > 1, we won't decrement the count at all. This could lead to route entries never being deleted. Here, we call rtfree() not only if the initial two conditions fail, but also if the ref count is > 1 (and we therefore don't immediately delete the route, but let rtfree() handle it). This is an urgent MFC candidate. Thanks go to Mike Silbersack for the fix, once again. :-) Submitted by: Mike Silbersack <silby@silby.com>	2001-03-04 21:28:40 +00:00
Don Lewis	e15ae1b226	Disable interface checking for packets subject to "ipfw fwd". Chris Johnson <cjohnson@palomine.net> tested this fix in -stable.	2001-03-04 03:22:36 +00:00
Don Lewis	823db0e9dd	Disable interface checking when IP forwarding is engaged so that packets addressed to the interface on the other side of the box follow their historical path. Explicitly block packets sent to the loopback network sent from the outside, which is consistent with the behavior of the forwarding path between interfaces as implemented in in_canforward(). Always check the arrival interface when matching the packet destination against the interface broadcast addresses. This bug allowed TCP connections to be made to the broadcast address of an interface on the far side of the system because the M_BCAST flag was not set because the packet was unicast to the interface on the near side. This was broken when the directed broadcast code was removed from revision 1.32. If the directed broadcast code was still present, the destination would not have been recognized as local until the packet was forwarded to the output interface and ether_output() looped a copy back to ip_input() with M_BCAST set and the receive interface set to the output interface. Optimize the order of the tests. Reviewed by: jlemon	2001-03-04 01:39:19 +00:00
Jonathan Lemon	b3e95d4ed0	Add a new sysctl net.inet.ip.check_interface, which will verify that an incoming packet arrives on an interface that has an address matching the packet's address. This is turned on by default.	2001-03-02 20:54:03 +00:00
Poul-Henning Kamp	970680fad8	Fix jails.	2001-02-28 09:38:48 +00:00
Jonathan Lemon	7538a9a0f8	When iterating over our list of interface addresses in order to determine if an arriving packet belongs to us, also check that the packet arrived through the correct interface. Skip this check if the packet was locally generated.	2001-02-27 19:43:14 +00:00
Bill Fumerola	2a6cb8804e	The TCP header-specific section suffered a little bit of bitrot recently: When we receive a fragmented TCP packet (other than the first) we can't extract header information (we don't have state to reference). In a rather inelegant fashion we just move on and assume a non-match. Recent additions to the TCP header-specific section of the code neglected to add the logic to the fragment code so in those cases the match was assumed to be positive and those parts of the rule (which should have resulted in a non-match/continue) were instead skipped (which means the processing of the rule continued even though it had already not matched). Fault can be spread out over Rich Steenbergen (tcpoptions) and myself (tcp{seq,ack,win}). rwatson sent me a patch that got me thinking about this whole situation (but what I'm committing / this description is mine so don't blame him).	2001-02-27 10:20:44 +00:00
Jonathan Lemon	7d42e30c2e	Use more aggressive retransmit timeouts for the initial SYN packet. As we currently drop the connection after 4 retransmits + 2 ICMP errors, this allows initial connection attempts to be dropped much faster.	2001-02-26 21:33:55 +00:00
Jonathan Lemon	c693a045de	Remove in_pcbnotify and use in_pcblookup_hash to find the cb directly. For TCP, verify that the sequence number in the ICMP packet falls within the tcp receive window before performing any actions indicated by the icmp packet. Clean up some layering violations (access to tcp internals from in_pcb)	2001-02-26 21:19:47 +00:00
Jeroen Ruigrok van der Werven	b9af273fe3	Remove struct full_tcpiphdr{}. This piece of code has not been referenced since it was put there in 1995. Also done a codebased search on popular networking libraries and third-party applications. This is an orphan. Reviewed by: jesper	2001-02-26 20:10:16 +00:00
Jeroen Ruigrok van der Werven	05f15c3dc3	Remove conditionals for vax support. People who care much about this are welcome to try 2.11BSD. :) Noticed by: luigi Reviewed by: jesper	2001-02-26 20:05:32 +00:00
Jesper Skriver	694a9ff95b	Remove tcp_drop_all_states, which is unneeded after jlemon removed it from tcp_subr.c in rev 1.92	2001-02-25 17:20:19 +00:00
Jonathan Lemon	d8c85a260f	Do not delay a new ack if there already is a delayed ack pending on the connection, but send it immediately. Prior to this change, it was possible to delay a delayed-ack multiple times, resulting in degraded TCP behavior in certain corner cases.	2001-02-25 15:17:24 +00:00
Jonathan Lemon	c484d1a38c	When converting soft error into a hard error, drop the connection. The error will be passed up to the user, who will close the connection, so it does not appear to make sense to leave the connection open. This also fixes a bug with kqueue, where the filter does not set EOF on the connection, because the connection is still open. Also remove calls to so{rw}wakeup, as we aren't doing anything with them at the moment anyway. Reviewed by: alfred, jesper	2001-02-23 21:07:06 +00:00
Jonathan Lemon	e4bb5b0572	Allow ICMP unreachables which map into PRC_UNREACH_ADMIN_PROHIB to reset TCP connections which are in the SYN_SENT state, if the sequence number in the echoed ICMP reply is correct. This behavior can be controlled by the sysctl net.inet.tcp.icmp_may_rst. Currently, only subtypes 2,3,10,11,12 are treated as such (port, protocol and administrative unreachables). Associate an error code with these resets which is reported to the user application: ENETRESET. Disallow resetting TCP sessions which are not in a SYN_SENT state. Reviewed by: jesper, -net	2001-02-23 20:51:46 +00:00
Jesper Skriver	d1c54148b7	Redo the security update done in rev 1.54 of src/sys/netinet/tcp_subr.c and 1.84 of src/sys/netinet/udp_usrreq.c The changes broken down: - remove 0 as a wildcard for addresses and port numbers in src/sys/netinet/in_pcb.c:in_pcbnotify() - add src/sys/netinet/in_pcb.c:in_pcbnotifyall() used to notify all sessions with the specific remote address. - change - src/sys/netinet/udp_usrreq.c:udp_ctlinput() - src/sys/netinet/tcp_subr.c:tcp_ctlinput() to use in_pcbnotifyall() to notify multiple sessions, instead of using in_pcbnotify() with 0 as src address and as port numbers. - remove check for src port == 0 in - src/sys/netinet/tcp_subr.c:tcp_ctlinput() - src/sys/netinet/udp_usrreq.c:udp_ctlinput() as they are no longer needed. - move handling of redirects and host dead from in_pcbnotify() to udp_ctlinput() and tcp_ctlinput(), so they will call in_pcbnotifyall() to notify all sessions with the specific remote address. Approved by: jlemon Inspired by: NetBSD	2001-02-22 21:23:45 +00:00
Jesper Skriver	43c77c8f5f	Backout change in 1.153, as it violates rfc1122 section 3.2.1.3. Requested by: jlemon,ru	2001-02-21 16:59:47 +00:00
Robert Watson	91421ba234	o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritance to be a function of credential inheritance. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use. Notes: o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure. Reviewed by: freebsd-arch Obtained from: TrustedBSD Project	2001-02-21 06:39:57 +00:00
Jesper Skriver	58e9b41722	Only call in_pcbnotify if the src port number != 0, as we treat 0 as a wildcard in src/sys/in_pcb.c:in_pcbnotify() It's sufficient to check for src\|local port, as we'll have no sessions with src\|local port == 0 Without this an attacker sending ICMP messages, where the attached IP header (+ 8 bytes) has the address and port numbers == 0, would have the ICMP message applied to all sessions. PR: kern/25195 Submitted by: originally by jesper, reimplemented by jlemon's advice Reviewed by: jlemon Approved by: jlemon	2001-02-20 23:25:04 +00:00
Jesper Skriver	2b18d82220	Send an ICMP unreachable instead of silently dropping the packet if we receive a packet not for us and forwarding is disabled. PR: kern/24512 Reviewed by: jlemon Approved by: jlemon	2001-02-20 21:31:47 +00:00
Jesper Skriver	c2221099a9	Remove unneeded loop increment in src/sys/netinet/in_pcb.c:in_pcbnotify Forgotten by phk, when committing fix in kern/23986 PR: kern/23986 Reviewed by: phk Approved by: phk	2001-02-20 21:11:29 +00:00
Brian Feldman	c0511d3b58	Switch to using a struct xucred instead of a struct ucred when not actually in the kernel. This structure is a different size than what is currently in -CURRENT, but should hopefully be the last time any application breakage is caused there. As soon as any major inconveniences are removed, the definition of the in-kernel struct ucred should be conditionalized upon defined(_KERNEL). This also changes struct export_args to remove dependency on the constantly-changing struct ucred, as well as limiting the bounds of the size fields to the correct size. This means: a) mountd and friends won't break all the time, b) mountd and friends won't crash the kernel all the time if they don't know what they're doing wrt actual struct export_args layout. Reviewed by: bde	2001-02-18 13:30:20 +00:00
Poul-Henning Kamp	90fcbbd635	Remove unneeded loop increment in src/sys/netinet/in_pcb.c:in_pcbnotify Add new PRC_UNREACH_ADMIN_PROHIB in sys/sys/protosw.h Remove condition on TCP in src/sys/netinet/ip_icmp.c:icmp_input In src/sys/netinet/ip_icmp.c:icmp_input set code = PRC_UNREACH_ADMIN_PROHIB or PRC_UNREACH_HOST for all unreachables except ICMP_UNREACH_NEEDFRAG Rename sysctl icmp_admin_prohib_like_rst to icmp_unreach_like_rst to reflect the fact that we also react on ICMP unreachables that are not administrative prohibited. Also update the comments to reflect this. In sys/netinet/tcp_subr.c:tcp_ctlinput add code to treat PRC_UNREACH_ADMIN_PROHIB and PRC_UNREACH_HOST different. PR: 23986 Submitted by: Jesper Skriver <jesper@skriver.dk>	2001-02-18 09:34:55 +00:00
Luigi Rizzo	c1b843c774	remove unused data structure definition, and corresponding macro into*()	2001-02-18 07:10:03 +00:00
Jonathan Lemon	7c45cb9bca	Clean up warning.	2001-02-15 22:32:06 +00:00
Jeroen Ruigrok van der Werven	e61c4bedda	Add definitions for IPPROTO numbers 55-57.	2001-02-14 13:51:20 +00:00
Poul-Henning Kamp	bb07ec8c84	Introduce a new feature in IPFW: Check if the source or destination address is configured on an interface. This is useful for routers with dynamic interfaces. It is now possible to say: 0100 allow tcp from any to any established 0200 skipto 1000 tcp from any to any 0300 allow ip from any to any 1000 allow tcp from 1.2.3.4 to me 22 1010 deny tcp from any to me 22 1020 allow tcp from any to any and not have to worry about the behaviour if dynamic interfaces configure new IP numbers later on. The check is semi expensive (traverses the interface address list) so it should be protected as in the above example if high performance is a requirement.	2001-02-13 14:12:37 +00:00
Bosko Milekic	a57815efd2	Clean up RST ratelimiting. Previously, ratelimiting occurred before tests were performed to determine if the received packet should be reset. This created erroneous ratelimiting and false alarms in some cases. The code has now been reorganized so that the checks for validity come before the call to badport_bandlim. Additionally, a few changes in the symbolic names of the bandlim types have been made, as well as a clarification of exactly which type each RST case falls under. Submitted by: Mike Silbersack <silby@silby.com>	2001-02-11 07:39:51 +00:00
Luigi Rizzo	7e1cd0d23d	Sync with the bridge/dummynet/ipfw code already tested in stable. In ip_fw.[ch] change a couple of variable and field names to avoid having types, variables and fields with the same name.	2001-02-10 00:10:18 +00:00
Jeroen Ruigrok van der Werven	1a6e52d0e9	Fix typo: seperate -> separate. Seperate does not exist in the english language.	2001-02-06 11:21:58 +00:00
Poul-Henning Kamp	6817526d14	Convert if_multiaddrs from LIST to TAILQ so that it can be traversed backwards in the three drivers which want to do that. Reviewed by: mikeh	2001-02-06 10:12:15 +00:00
Julian Elischer	41d2ba5e27	Fix bad patch from a few days ago. It broke some bridging.	2001-02-05 21:25:27 +00:00
Poul-Henning Kamp	37d4006626	Another round of the <sys/queue.h> FOREACH transmogriffer. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 16:08:18 +00:00
Darren Reed	185b71c73e	fix duplicate rcsid	2001-02-04 15:25:15 +00:00
Darren Reed	f590526d0a	fix conflicts	2001-02-04 14:26:56 +00:00
Poul-Henning Kamp	fc2ffbe604	Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 13:13:25 +00:00
Poul-Henning Kamp	ef9e85abba	Use <sys/queue.h> macro API.	2001-02-04 12:37:48 +00:00
Julian Elischer	c8f8e9c110	Make the code act the same in the case of BRIDGE being defined, but not turned on, and the case of it not being defined at all. i.e. Disabling bridging re-enables some of the checks it disables. Submitted by: "Rogier R. Mulhuijzen" <drwilco@drwilco.net>	2001-02-03 17:25:21 +00:00
Jonathan Lemon	007581c0d8	When turning off TCP_NOPUSH, call tcp_output to immediately flush out any data pending in the buffer. Submitted by: Tony Finch <dot@dotat.at>	2001-02-02 18:48:25 +00:00
Luigi Rizzo	507b4b5432	MFS: bridge/ipfw/dummynet fixes (bridge.c will be committed separately)	2001-02-02 00:18:00 +00:00
Brian Somers	435ff15c3b	Add a few ``const''s to silence some -Wwrite-strings warnings	2001-01-29 11:44:13 +00:00
Brian Somers	4834b77d04	Ignore leading whitespace in the string given to PacketAliasProxyRule().	2001-01-29 00:30:01 +00:00
Luigi Rizzo	f8acf87bb5	Make sure we do not follow an invalid pointer in ipfw_report when we get an incomplete packet or m_pullup fails.	2001-01-27 02:31:08 +00:00
Luigi Rizzo	26fb17bdd0	Minor cleanups after yesterday's patch. The code (bridging and dummynet) actually worked fine!	2001-01-26 19:43:54 +00:00
Luigi Rizzo	6258acf88f	Bring dummynet in line with the code that now works in -STABLE. It compiles, but I cannot test functionality yet.	2001-01-26 06:49:34 +00:00
Luigi Rizzo	7a726a2dd1	Pass up errors returned by dummynet. The same should be done with divert.	2001-01-25 02:06:38 +00:00
Garrett Wollman	a589a70ee1	Correct a comment.	2001-01-24 16:25:36 +00:00
Wes Peters	550b151850	When attempting to bind to an ephemeral port, if no such port is available, the error return should be EADDRNOTAVAIL rather than EAGAIN. PR: 14181 Submitted by: Dima Dorfman <dima@unixfreak.org> Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>	2001-01-23 07:27:56 +00:00
Luigi Rizzo	8b2cd62d7d	Change critical section protection for dummynet from splnet() to splimp() -- we need it because dummynet can be invoked by the bridging code at splimp(). This should cure the pipe "stalls" that several people have been reporting on -stable while using bridging+dummynet (the problem would not affect routers using dummynet).	2001-01-22 23:04:13 +00:00
Dag-Erling Smørgrav	a3ea6d41b9	First step towards an MP-safe zone allocator: - have zalloc() and zfree() always lock the vm_zone. - remove zalloci() and zfreei(), which are now redundant. Reviewed by: bmilekic, jasone	2001-01-21 22:23:11 +00:00
Luigi Rizzo	ec97c79e30	Document data structures and operation on dummynet so next time I or someone else browse through this code I do not have a hard time understanding what is going on.	2001-01-17 01:09:40 +00:00
Luigi Rizzo	5da48f88bd	Some dummynet patches that I forgot to commit last summer. One of them fixes a potential panic when bridging is used and you run out of mbufs (though i have no idea if the bug has ever hit anyone).	2001-01-16 23:49:49 +00:00
Bosko Milekic	987efc765e	Prototype inet_ntoa_r and thereby silence a warning from GCC. The function is prototyped immediately under inet_ntoa, which is also from libkern.	2001-01-12 07:47:53 +00:00
Robert Watson	46a27060af	o Minor style(9)ism to make consistent with -STABLE	2001-01-09 18:26:17 +00:00
Robert Watson	65450f2f77	o IPFW incorrectly handled filtering in the presence of previously reserved and now allocated TCP flags in incoming packets. This patch stops overloading those bits in the IP firewall rules, and moves colliding flags to a separate field, ipflg. The IPFW userland management tool, ipfw(8), is updated to reflect this change. New TCP flags related to ECN are now included in tcp.h for reference, although we don't currently implement TCP+ECN. o To use this fix without completely rebuilding, it is sufficient to copy ip_fw.h and tcp.h into your appropriate include directory, then rebuild the ipfw kernel module, and ipfw tool, and install both. Note that a mismatch between module and userland tool will result in incorrect installation of firewall rules that may have unexpected effects. This is an MFC candidate, following shakedown. This bug does not appear to affect ipfilter. Reviewed by: security-officer, billf Reported by: Aragon Gouveia <aragon@phat.za.net>	2001-01-09 03:10:30 +00:00
Alfred Perlstein	3269187d41	provide a sysctl 'net.link.ether.inet.log_arp_wrong_iface' to allow one to suppress logging when ARP replies arrive on the wrong interface: "/kernel: arp: 1.2.3.4 is on dc0 but got reply from 00:00:c5:79:d0:0c on dc1" the default is to log just to give notice about possibly incorrectly configured networks.	2001-01-06 00:45:08 +00:00
Alfred Perlstein	da289f07ee	Fix incorrect logic that wouldn't disconnect incoming connections that had been disconnected because they were not full. Submitted by: David Filo	2001-01-03 19:50:23 +00:00
Assar Westerlund	598ce68dbd	include tcp header files to get the prototype for tcp_seq_vs_sess	2000-12-27 03:02:29 +00:00
Poul-Henning Kamp	442fad6798	Update the "icmp_admin_prohib_like_rst" code to check the tcp-window and to be configurable with respect to acting only in SYN or in all TCP states. PR: 23665 Submitted by: Jesper Skriver <jesper@skriver.dk>	2000-12-24 10:57:21 +00:00
Bosko Milekic	2a0c503e7a	* Rename M_WAIT mbuf subsystem flag to M_TRYWAIT. This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation. M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl. M_WAIT is now deprecated but still defined for the next little while. * Fix a typo in a comment in mbuf.h * Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have become a big problem.	2000-12-21 21:44:31 +00:00
Bill Fumerola	16cd6db04f	Use getmicrotime() instead of microtime() when timestamping ICMP packets, the former is quicker and accurate enough for use here. Submitted by: Jason Slagle <raistlin@toledolink.com> (on IRC) Reviewed by: phk	2000-12-16 21:39:48 +00:00
Poul-Henning Kamp	b11d7a4a2f	We currently do not react to ICMP administratively prohibited messages sent by routers when they deny our traffic; this causes a timeout when trying to connect to TCP ports/services on a remote host which is blocked by routers or firewalls. rfc1122 (Requirements for Internet Hosts) section 3.2.2.1 actually requires that, for a TCP session, we treat such a message as if we had received a RST. quote begin. A Destination Unreachable message that is received MUST be reported to the transport layer. The transport layer SHOULD use the information appropriately; for example, see Sections 4.1.3.3, 4.2.3.9, and 4.2.4 below. A transport protocol that has its own mechanism for notifying the sender that a port is unreachable (e.g., TCP, which sends RST segments) MUST nevertheless accept an ICMP Port Unreachable for the same purpose. quote end. I've written a small extension that implements this; it also creates a sysctl "net.inet.tcp.icmp_admin_prohib_like_rst" to control if this new behaviour is activated. When it's activated (set to 1) we'll treat an ICMP administratively prohibited message (icmp type 3 code 9, 10 and 13) for a TCP session as if we received a TCP RST, but only if the TCP session is in SYN_SENT state. The reason for only reacting when in SYN_SENT state is that this will solve the problem, and at the same time minimize the risk of this being abused. I suggest that we enable this new behaviour by default, but it would be a change of current behaviour, so if people prefer to leave it disabled by default, at least for now, this would be ok for me; the attached diff actually has the sysctl set to 0 by default. PR: 23086 Submitted by: Jesper Skriver <jesper@skriver.dk>	2000-12-16 19:42:06 +00:00
Bosko Milekic	09f81a46a5	Change the following: 1. ICMP ECHO and TSTAMP replies are now rate limited. 2. RSTs generated due to packets sent to open and unopen ports are now limited by separate counters. 3. Each rate limiting queue now has its own description, as follows: Limiting icmp unreach response from 439 to 200 packets per second Limiting closed port RST response from 283 to 200 packets per second Limiting open port RST response from 18724 to 200 packets per second Limiting icmp ping response from 211 to 200 packets per second Limiting icmp tstamp response from 394 to 200 packets per second Submitted by: Mike Silbersack <silby@silby.com>	2000-12-15 21:45:49 +00:00
David Malone	7cc0979fd6	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
Poul-Henning Kamp	959b7375ed	Staticize some malloc M_ instances.	2000-12-08 20:09:00 +00:00
Jonathan Lemon	df5e198723	Lock down the network interface queues. The queue mutex must be obtained before adding/removing packets from the queue. Also, the if_obytes and if_omcasts fields should only be manipulated under protection of the mutex. IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on the queue. An IF_LOCK macro is provided, as well as the old (mutex-less) versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which needs them, but their use is discouraged. Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF, which takes care of locking/enqueue, and also statistics updating/start if necessary.	2000-11-25 07:35:38 +00:00
Jonathan Lemon	e82ac18e52	Revert the last commit to the callout interface, and add a flag to callout_init() indicating whether the callout is safe or not. Update the callers of callout_init() to reflect the new interface. Okayed by: Jake	2000-11-25 06:22:16 +00:00
Bosko Milekic	a352dd9a71	Fixup (hopefully) bridging + ipfw + dummynet together... * Some dummynet code incorrectly handled a malloc()-allocated pseudo-mbuf header structure, called "pkt," and could consequently pollute the mbuf free list if it was ever passed to m_freem(). The fix involved passing not pkt, but essentially pkt->m_next (which is a real mbuf) to the mbuf utility routines. * Also, for dummynet, in bdg_forward(), made the code copy the ethernet header back into the mbuf (prepended) because the dummynet code that follows expects it to be there but it is, unfortunately for dummynet, passed to bdg_forward as a separate argument. PRs: kern/19551 ; misc/21534 ; kern/23010 Submitted by: Thomas Moestl <tmoestl@gmx.net> Reviewed by: bmilekic Approved by: luigi	2000-11-23 22:25:03 +00:00
Ruslan Ermilov	1b7b85c4d6	mdoc(7) police: use the new feature of the An macro.	2000-11-22 08:47:35 +00:00
Bosko Milekic	0a1df235ba	While I'm here, get rid of (now useless) MCLISREFERENCED and use MEXT_IS_REF instead. Also, fix a small set of "avail." If we're setting `avail,' we shouldn't be re-checking whether m_flags is M_EXT, because we know that it is, as if it wasn't, we would have already returned several lines above. Reviewed by: jlemon	2000-11-11 23:05:59 +00:00
Ruslan Ermilov	203de3b494	Fixed the security breach I introduced in rev 1.145. Disallow getsockopt(IP_FW_ADD) if securelevel >= 3. PR: 22600	2000-11-07 09:20:32 +00:00
Jonathan Lemon	8735719e43	tp->snd_recover is part of the New Reno recovery algorithm, and should only be checked if the system is currently performing New Reno style fast recovery. However, this value was being checked regardless of the NR state, with the end result being that the congestion window was never opened. Change the logic to check t_dupack instead; the only code path that allows it to be nonzero at this point is NewReno, so if it is nonzero, we are in fast recovery mode and should not touch the congestion window. Tested by: phk	2000-11-04 15:59:39 +00:00
Ruslan Ermilov	1d02752206	Fixed the bug I have introduced in icmp_error() in revision 1.44. The amount of data we copy from the original IP datagram into the ICMP message was computed incorrectly for IP packets with payload less than 8 bytes.	2000-11-02 09:46:23 +00:00
Ruslan Ermilov	506f494939	Wrong checksum may have been computed for certain UDP packets. Reviewed by: jlemon	2000-11-01 16:56:33 +00:00
Ruslan Ermilov	60123168be	Wrong checksum used for certain reassembled IP packets before diverting.	2000-11-01 11:21:45 +00:00
Josef Karthauser	ffa37b3f9b	It's no longer true that "nobody uses ia beyond here"; it's now used to keep address based if_data statistics in. Submitted by: ru	2000-11-01 01:59:28 +00:00
Ruslan Ermilov	48cb400fb1	Do not waste time saving a copy of IP header if we are certainly not going to send an ICMP error message (net.inet.udp.blackhole=1).	2000-10-31 09:13:02 +00:00
Ruslan Ermilov	642cd09fb3	Added boolean argument to link searching functions, indicating whether they should create a link if lookup has failed or not.	2000-10-30 17:24:12 +00:00
Ruslan Ermilov	03453c5e87	A significant rewrite of PPTP aliasing code. PPTP links are no longer dropped by simple (and inappropriate in this case) "inactivity timeout" procedure, only when requested through the control connection. It is now possible to have multiple PPTP servers running behind NAT. Just redirect the incoming TCP traffic to port 1723, everything else is done transparently. Problems were reported and the fix was tested by: Michael Adler <Michael.Adler@compaq.com>, David Andersen <dga@lcs.mit.edu>	2000-10-30 12:39:41 +00:00
Poul-Henning Kamp	cf9fa8e725	Move suser() and suser_xxx() prototypes and a related #define from <sys/proc.h> to <sys/systm.h>. Correctly document the #includes needed in the manpage. Add one now needed #include of <sys/systm.h>. Remove the consequent 48 unused #includes of <sys/proc.h>.	2000-10-29 16:06:56 +00:00
Poul-Henning Kamp	53ce36d17a	Remove unneeded #include <sys/proc.h> lines.	2000-10-29 13:57:19 +00:00
Darren Reed	0c72e2855d	Fix conflicts created by import.	2000-10-29 07:53:05 +00:00
Josef Karthauser	fe93767490	Count per-address statistics for IP fragments. Requested by: ru Obtained from: BSD/OS	2000-10-29 01:05:09 +00:00
David E. O'Brien	d0eaa94443	Include sys/param.h for `__FreeBSD_version' rather than the non-existent osreldate.h. Submitted by: dougb	2000-10-27 12:53:31 +00:00
Poul-Henning Kamp	46aa3347cb	Convert all users of fldoff() to offsetof(). fldoff() is bad because it only takes a struct tag which makes it impossible to use unions, typedefs etc. Define __offsetof() in <machine/ansi.h> Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h> Remove myriad of local offsetof() definitions. Remove includes of <stddef.h> in kernel code. NB: Kernelcode should never include from /usr/include ! Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API. Deprecate <struct.h> with a warning. The warning turns into an error on 01-12-2000 and the file gets removed entirely on 01-01-2001. Partial reviews by: various. Significant brucifications by: bde	2000-10-27 11:45:49 +00:00
Ruslan Ermilov	3cebc3e4de	Fetch the protocol header (TCP, UDP, ICMP) only from the first fragment of IP datagram. This fixes the problem when firewall denied fragmented packets whose last fragment was less than minimum protocol header size. Found by: Harti Brandt <brandt@fokus.gmd.de> PR: kern/22309	2000-10-27 07:19:17 +00:00
Ruslan Ermilov	b6ea1aa58d	RFC 791 says that IP_RF bit should always be zero, but nothing in the code enforces this. So, do not check for and attempt a false reassembly if only IP_RF is set. Also, removed the dead code, since we no longer use dtom() on return from ip_reass().	2000-10-26 13:14:48 +00:00
Darren Reed	60b88d9681	fix conflicts from rcsids	2000-10-26 12:33:42 +00:00
Ruslan Ermilov	7e2df4520d	Wrong header length used for certain reassembled IP packets. This was first fixed in rev 1.82 but then broken in rev 1.125. PR: 6177	2000-10-26 12:18:13 +00:00
Luigi Rizzo	1f8ed85239	Close PR22152 and PR19511 -- correct the naming of a variable	2000-10-26 00:16:12 +00:00
Ruslan Ermilov	8829f4ee0b	We now keep the ip_id field in network byte order all the time, so there is no need to make the distinction between ip_output() and ip_input() cases. Reviewed by: silence on freebsd-net	2000-10-25 10:56:41 +00:00
Jun-ichiro itojun Hagino	d31944e6ec	be careful on mbuf overrun on ctlinput. short icmp6 packet may be able to panic the kernel. sync with kame.	2000-10-23 07:11:01 +00:00
Ruslan Ermilov	cc22c7a746	Save a few CPU cycles in IP fragmentation code.	2000-10-20 14:10:37 +00:00
Josef Karthauser	5da9f8fa97	Augment the 'ifaddr' structure with a 'struct if_data' to keep statistics on a per network address basis. Teach the IPv4 and IPv6 input/output routines to log packets/bytes against the network address connected to the flow. Teach netstat to display the per-address stats for IP protocols when 'netstat -i' is invoked, instead of displaying the per-interface stats.	2000-10-19 23:15:54 +00:00
Ruslan Ermilov	f136389613	A failure to allocate memory for auxiliary TCP data is now fatal. This fixes a null pointer dereference problem that is unlikely to happen in normal circumstances.	2000-10-19 10:44:44 +00:00
Ruslan Ermilov	0531ca1fd8	If we do not byte-swap the ip_id in the first place, don't do it in the second. NetBSD (from where I've taken this originally) needs to fix this too.	2000-10-18 11:36:09 +00:00
Ruslan Ermilov	487bdb3855	Backout my wrong attempt to fix the compilation warning in ip_input.c and instead reapply the revision 1.49 of mbuf.h, i.e. Fixed regression of the type of the `header' member of struct pkthdr from `void *' to caddr_t in rev.1.51. This mainly caused an annoying warning for compiling ip_input.c. Requested by: bde	2000-10-12 16:33:41 +00:00
Ruslan Ermilov	e6c89c1bd2	Fix the compilation warning.	2000-10-12 10:42:32 +00:00
Ruslan Ermilov	bc95ac80b2	Allow for IP_FW_ADD to be used in getsockopt(2) incarnation as well, in which case return the rule number back into userland. PR: bin/18351 Reviewed by: archie, luigi	2000-10-12 07:59:14 +00:00
Alfred Perlstein	abbfaeb87b	Remove headers not needed. Pointed out by: phk	2000-10-07 23:15:17 +00:00
Ruslan Ermilov	c0752e1657	As we now may check the TCP header window field, make sure we pullup enough into the mbuf data area. Solve this problem once and for all by pulling up the entire (standard) header for TCP and UDP, and four bytes of header for ICMP (enough for type, code and cksum fields).	2000-10-06 12:12:09 +00:00
Ruslan Ermilov	60f9125458	Added the missing ntohs() conversion when matching IP packet with the IP_FW_IF_IPID rule. (We have recently decided to keep the ip_id field in network byte order inside the kernel, see revision 1.140 of src/sys/netinet/ip_input.c). I did not like to have the conversion happen in userland, and I think that the similar conversions for fw_tcp(seq\|ack\|win) should be moved out of userland (src/sbin/ipfw/ipfw.c) into the kernel.	2000-10-03 12:18:11 +00:00
Jonathan Lemon	d17e895b5f	If TCPDEBUG is defined, we could dereference a tp which was freed.	2000-10-02 15:00:13 +00:00
Ruslan Ermilov	c7f95f5372	A bit of indentation reformatting.	2000-10-02 13:13:24 +00:00
Bill Fumerola	9ad30943aa	Add new fields for more granularity: IP: version, tos, ttl, len, id TCP: seq#, ack#, window size Reviewed by: silence on freebsd-{net,ipfw}	2000-10-02 03:33:31 +00:00
Bill Fumerola	98b829924f	Add new fields for more granularity: IP: version, tos, ttl, len, id TCP: seq#, ack#, window size Reviewed by: silence on freebsd-{net,ipfw}	2000-10-02 03:03:31 +00:00
Ruslan Ermilov	3ea420e391	Document that net.inet.ip.fw.one_pass only affects dummynet(4). Noticed by: Peter Jeremy<peter.jeremy@alcatel.com.au>	2000-09-29 08:39:06 +00:00
Kris Kennaway	be515d91ad	Use stronger random number generation for TCP_ISSINCR and tcp_iss. Reviewed by: peter, jlemon	2000-09-29 01:37:19 +00:00
Bosko Milekic	9d8c8a672c	Finally make do_tcpdrain sysctl live under correct parent, _net_inet_tcp, as opposed to _debug. Like before, default value remains 1.	2000-09-25 23:40:22 +00:00
Ruslan Ermilov	0122d6f195	Fixed the calculations with UDP header length field. The field is in network byte order and contains the size of the header. Reviewed by: brian	2000-09-21 06:52:59 +00:00
Kenjiro Cho	e645a1ca27	change the evaluation order of the rsvp socket in rsvp_input() in favor of the new-style per-vif socket. this does not affect the behavior of the ISI rsvpd but allows another rsvp implementation (e.g., KOM rsvp) to take advantage of the new style for particular sockets while using the old style for others. in the future, rsvp support should be replaced by more generic router-alert support. PR: kern/20984 Submitted by: Martin Karsten <Martin.Karsten@KOM.tu-darmstadt.de> Reviewed by: kjc	2000-09-17 13:50:12 +00:00
Poul-Henning Kamp	e4bdf25dc8	Properly jail UDP sockets. This is quite a bit more tricky than TCP. This fixes a !root userland panic, and some cases where the wrong interface was chosen for a jailed UDP socket. PR: 20167, 19839, 20946	2000-09-17 13:35:42 +00:00
Poul-Henning Kamp	24b261c720	Reverse last commit, a better fix has been found.	2000-09-17 13:34:18 +00:00
Poul-Henning Kamp	e2cabba9d7	Make sure UDP sockets are explicitly bind(2)'ed [sic] before we connect(2) them. PR: 20946 Isolated by: Aaron Gifford <agifford@infowest.com>	2000-09-17 11:34:33 +00:00
Jonathan Lemon	af1270f87f	It is possible for a TCP callout to be removed from the timing wheel, but have a network interrupt arrive and deactivate the timeout before the callout routine runs. Check for this case in the callout routine; it should only run if the callout is active and not on the wheel.	2000-09-16 00:53:53 +00:00
Ruslan Ermilov	4996f02545	Add -Wmissing-prototypes.	2000-09-15 15:37:16 +00:00
Jonathan Lemon	a8db1d93f1	m_cat() can free its second argument, so collect the checksum information from the fragment before calling m_cat().	2000-09-14 21:06:48 +00:00
Ruslan Ermilov	e30177e024	Follow BSD/OS and NetBSD, keep the ip_id field in network order all the time. Requested by: wollman	2000-09-14 14:42:04 +00:00
Bill Fumerola	95d0db2b40	Fix screwup in previous commit.	2000-09-12 02:38:05 +00:00
Archie Cobbs	6612c70eb1	Don't do snd_nxt rollback optimization (rev. 1.46) for SYN packets. It causes a panic when/if snd_una is incremented elsewhere (this is a conservative change, because originally no rollback occurred for any packets at all). Submitted by: Vivek Sadananda Pai <vivek@imimic.com>	2000-09-11 19:11:33 +00:00
Alfred Perlstein	b47ce7f5cb	Forgot to include sysctl.h Submitted by: des	2000-09-09 18:47:46 +00:00
Alfred Perlstein	34b94e8b82	Accept filter maintenance Update copyrights. Introduce a new sysctl node: net.inet.accf Although acceptfilters need refcounting to be properly (safely) unloaded as a temporary hack allow them to be unloaded if the sysctl net.inet.accf.unloadable is set, this is really for developers who want to work on their own filters. A near complete re-write of the accf_http filter: 1) Parse check if the request is HTTP/1.0 or HTTP/1.1 if not dump to the application. Because of the performance implications of this there is a sysctl 'net.inet.accf.http.parsehttpversion' that when set to non-zero parses the HTTP version. The default is to parse the version. 2) Check if a socket has filled and dump to the listener 3) optimize the way that mbuf boundaries are handled using some voodoo 4) even though you'd expect accept filters to only be used on TCP connections that don't use m_nextpkt I've fixed the accept filter for socket connections that use this. This rewrite of accf_http should allow someone to use them and maintain full HTTP compliance as long as net.inet.accf.http.parsehttpversion is set.	2000-09-06 18:49:13 +00:00
Bill Fumerola	4897e8320e	1. IP_FW_F_{UID,GID} are _not_ commands, they are extras. The sanity checking for them does not belong in the IP_FW_F_COMMAND switch, that mask doesn't even apply to them(!). 2. You cannot add a uid/gid rule to something that isn't TCP, UDP, or IP. XXX - this should be handled in ipfw(8) as well (for more diagnostic output), but this at least protects bogus rules from being added. Pointy hat: green	2000-09-06 03:10:42 +00:00
Ruslan Ermilov	76e6ebd64e	Match IPPROTO_ICMP with IP protocol field of the original IP datagram embedded into ICMP error message, not with protocol field of ICMP message itself (which is always IPPROTO_ICMP). Pointed by: Erik Salander <erik@whistle.com>	2000-09-01 16:38:53 +00:00
Ruslan Ermilov	04287599db	Fixed broken ICMP error generation, unified conversion of IP header fields between host and network byte order. The details: o icmp_error() now does not add IP header length. This fixes the problem when icmp_error() is called from ip_forward(). In this case the ip_len of the original IP datagram returned with ICMP error was wrong. o icmp_error() expects all three fields, ip_len, ip_id and ip_off in host byte order, so DTRT and convert these fields back to network byte order before sending a message. This fixes the problem described in PR 16240 and PR 20877 (ip_id field was returned in host byte order). o ip_ttl decrement operation in ip_forward() was moved down to make sure that it does not corrupt the copy of original IP datagram passed later to icmp_error(). o A copy of original IP datagram in ip_forward() was made a read-write, independent copy. This fixes the problem I first reported to Garrett Wollman and Bill Fenner and later put in audit trail of PR 16240: ip_output() (not always) converts fields of original datagram to network byte order, but because copy (mcopy) and its original (m) most likely share the same mbuf cluster, ip_output()'s manipulations on original also corrupted the copy. o ip_output() now expects all three fields, ip_len, ip_off and (what is significant) ip_id in host byte order. It was a headache for years that ip_id was handled differently. The only compatibility issue here is the raw IP socket interface with IP_HDRINCL socket option set and a non-zero ip_id field, but ip.4 manual page was unclear on whether in this case ip_id field should be in host or network byte order.	2000-09-01 12:33:03 +00:00
Ruslan Ermilov	816fa7febc	Changed the way we handle outgoing ICMP error messages -- do not alias `ip_src' unless it comes from the host an original datagram that triggered this error message was destined for. PR: 20712 Reviewed by: brian, Charles Mott <cmott@scientech.com>	2000-09-01 09:32:44 +00:00
Ruslan Ermilov	0ac308534e	Grab ADJUST_CHECKSUM() macro from alias_local.h.	2000-08-31 12:54:55 +00:00
Ruslan Ermilov	305d10699e	Create aliasing links for incoming ICMP echo/timestamp requests. This causes outgoing ICMP echo/timestamp replies to be de-aliased with the right source IP, not necessarily the primary aliasing IP.	2000-08-31 12:47:57 +00:00
Ruslan Ermilov	3e065e76ac	Fixed the bug that div_bind() always returned zero even if there was an error (broken in rev 1.9).	2000-08-30 14:43:02 +00:00
Ruslan Ermilov	2160daba07	Backout the hack in rev 1.71, I am working on a better patch that should cover almost all inconsistencies in ICMP error generation.	2000-08-30 08:28:06 +00:00

1301 Commits