freebsd-skq

Author	SHA1	Message	Date
mux	b29e3549b8	Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() to match the type of the variable they are exporting. Spotted by: Thomas Hurst <tom@hur.st> MFC after: 3 days	2005-12-14 22:27:48 +00:00
glebius	d5ab5191cf	Add a new feature for optimizing ipfw rulesets - substitution of the action argument with the value obtained from table lookup. The feature is now applicable only to "pipe", "queue", "divert", "tee", "netgraph" and "ngtee" rules. An example usage: ipfw pipe 1000 config bw 1000Kbyte/s ipfw pipe 4000 config bw 4000Kbyte/s ipfw table 1 add x.x.x.x 1000 ipfw table 1 add x.x.x.y 4000 ipfw pipe tablearg ip from table(1) to any In the example above the rule will throw different packets to different pipes. TODO: - Support "skipto" action, but without searching all rules. - Improve parser, so that it warns about bad rules. These are: - "tablearg" argument to action, but no "table" in the rule. All traffic will be blocked. - "tablearg" argument to action, but "table" searches for entry with a specific value. All traffic will be blocked. - "tablearg" argument to action, and two "table" looks - for src and for dst. The last lookup will match.	2005-12-13 12:16:03 +00:00
glebius	f5b3d1a6c4	When we drop packet due to no space in output interface output queue, also increase the ifp->if_snd.ifq_drops. PR: 72440 Submitted by: ikob	2005-12-06 11:16:11 +00:00
glebius	dc6ebf5904	Optimize parallel processing of ipfw(4) rulesets eliminating the locking of the radix lookup tables. Since several rnh_lookup() can run in parallel on the same table, we can piggyback on the shared locking provided by ipfw(4). However, the single entry cache in the ip_fw_table can't be used lockless, so it is removed. This pessimizes two cases: processing of bursts of similar packets and matching one packet against the same table several times during one ipfw_chk() lookup. To optimize the processing of similar packet bursts administrator should use stateful firewall. To optimize the second problem a solution will be provided soon. Details: o Since we piggyback on the ipfw(4) locking, and the latter is per-chain, the tables are moved from the global declaration to the struct ip_fw_chain. o The struct ip_fw_table is shrunk to one entry and thus vanished. o All table manipulating functions are extended to accept the struct ip_fw_chain * argument. o All table modifying functions use IPFW_WLOCK_ASSERT().	2005-12-06 10:45:49 +00:00
ru	522e9c2b7b	Fix -Wundef.	2005-12-04 02:12:43 +00:00
ume	aea275fca9	obey opt_inet6.h and opt_ipsec.h in kernel build directory. Requested by: hrs	2005-11-29 17:56:11 +00:00
glebius	7abe9e6ffe	Garbage-collect now unused struct _ipfw_insn_pipe and flush_pipe_ptrs(), thus removing a few XXXes. Document the ABI breakage in UPDATING.	2005-11-29 08:59:41 +00:00
glebius	1de2727068	First step in removing welding between ipfw(4) and dummynet. o Do not use ipfw_insn_pipe->pipe_ptr in locate_flowset(). The _ipfw_insn_pipe isn't touched by this commit to preserve ABI compatibility. o To optimize the lookup of the pipe/flowset in locate_flowset() introduce hashes for pipes and queues: - To preserve ABI compatibility utilize the place of global list pointer for SLIST_ENTRY. - Introduce locate_flowset(queue nr) and locate_pipe(pipe nr). o Rework all the dummynet code to deal with the hashes, not global lists. Also did some style(9) changes in the code blocks that were touched by this sweep: - Be conservative about flowset and pipe variable names on stack, use "fs" and "pipe" everywhere. - Cleanup whitespaces. - Sort variables. - Give variables more meaningful names. - Uppercase and dots in comments. - ENOMEM when malloc(9) failed.	2005-11-29 00:11:01 +00:00
ru	7aa2a06ebf	Fix prototype.	2005-11-24 14:17:35 +00:00
ps	6b40ef55c3	Fix for a bug that causes SACK scoreboard corruption when the limit on holes per connection is reached. Reported by: Patrik Roos Submitted by: Mohan Srinivasan Reviewed by: Raja Mukerji, Noritoshi Demizu	2005-11-21 19:22:10 +00:00
andre	7649e1342f	Remove 'ipprintfs' which were protected under DIAGNOSTIC. It doesn't have any knob to enable it from userland and could only be enabled by either setting it to 1 at compile time or through the kernel debugger. In the future it may be brought back as KTR tracing points. Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-19 17:04:52 +00:00
andre	5d67ac3cb1	Move MAX_IPOPTLEN and struct ipoption back into ip_var.h as userland programs depend on it. Pointed out by: le Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-19 14:01:32 +00:00
andre	a6a209f2cc	Consolidate all IP Options handling functions into ip_options.[ch] and include ip_options.h into all files making use of IP Options functions. From ip_input.c rev 1.306: ip_dooptions(struct mbuf m, int pass) save_rte(m, option, dst) ip_srcroute(m0) ip_stripoptions(m, mopt) From ip_output.c rev 1.249: ip_insertoptions(m, opt, phlen) ip_optcopy(ip, jp) ip_pcbopts(struct inpcb inp, int optname, struct mbuf *m) No functional changes in this commit. Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-18 20:12:40 +00:00
andre	60a2f6f1b5	Purge layer specific mbuf flags on layer crossings to avoid confusing upper or lower layers. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-18 16:23:26 +00:00
andre	ae7b5c1a18	Rework icmp_error() to deal with truncated IP packets from ip_forward() when doing extended quoting in error messages. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-18 14:48:42 +00:00
andre	91d9d9ad2a	In ip_forward() copy as much into the temporary error mbuf as we have free space in it. Allocate correct mbuf from the beginning. This allows icmp_error() to quote the entire TCP header in error messages. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-18 14:44:48 +00:00
glebius	be367893cd	MFOpenBSD 1.62: Prevent backup CARP hosts from replying to arp requests, fixes strangeness with some layer-3 switches. From Bill Marquette. Tested by: Kazuaki Oda <kaakun highway.ne.jp>	2005-11-17 12:56:40 +00:00
ru	815788042c	Unbreak for !INET6 case.	2005-11-14 12:50:23 +00:00
ru	f70f525b49	- Store pointer to the link-level address right in "struct ifnet" rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr. - Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead.	2005-11-11 16:04:59 +00:00
suz	199b78628d	fixed a bug that uRPF does not work properly for an IPv6 packet bound for the sending machine itself (this is a bug introduced due to a change in ip6_input.c:Rev.1.83) Pointed out by: Sean McNeil and J.R.Oldroyd MFC after: 3 days	2005-11-10 22:10:39 +00:00
ru	dcace5669d	Use sparse initializers for "struct domain" and "struct protosw", so they are easier to follow for the human being.	2005-11-09 13:29:16 +00:00
thompsa	48c0bcb5c2	Move the cloned interface list management in to if_clone. For some drivers the softc lists and associated mutex are now unused so these have been removed. Calling if_clone_detach() will now destroy all the cloned interfaces for the driver and in most cases is all that's needed to unload. Idea by: brooks Reviewed by: brooks	2005-11-08 20:08:34 +00:00
glebius	fbcfe1441a	Rework ARP retransmission algorithm so that ARP requests are retransmitted without suppression, while there is demand for such ARP entry. As before, retransmission is rate limited to one packet per second. Details: - Remove net.link.ether.inet.host_down_time - Do not set/clear RTF_REJECT flag on route, to avoid rt_check() returning error. We will generate error ourselves. - Return EWOULDBLOCK on first arp_maxtries failed requests, and return EHOSTDOWN/EHOSTUNREACH on further requests. - Retransmit ARP request always, independently from return code. Ratelimit to 1 pps.	2005-11-08 12:05:57 +00:00
andre	0df84f5a83	Retire MT_HEADER mbuf type and change its users to use MT_DATA. Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion. The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag. Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-02 13:46:32 +00:00
rwatson	be4f357149	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
rwatson	49831ed8da	Push the assignment of a new or updated so_qlimit from solisten() following the protocol pru_listen() call to solisten_proto(), so that it occurs under the socket lock acquisition that also sets SO_ACCEPTCONN. This requires passing the new backlog parameter to the protocol, which also allows the protocol to be aware of changes in queue limit should it wish to do something about the new queue limit. This continues a move towards the socket layer acting as a library for the protocol. Bump __FreeBSD_version due to a change in the in-kernel protocol interface. This change has been tested with IPv4 and UNIX domain sockets, but not other protocols.	2005-10-30 19:44:40 +00:00
glebius	ed194f4ef3	First fill in structure with valid values, and only then attach it to the global list. Reviewed by: rwatson	2005-10-28 20:29:42 +00:00
yar	54f0eb5f04	Since carp(4) interfaces presently are kinda fake yet possess IP addresses, mark them with LOOPBACK so that routing daemons take them easy for link-state routing protocols. Reviewed by: glebius	2005-10-26 05:57:35 +00:00
mlaier	25376bdec2	Fix build after in6_joingroup change. It remains unclear if DAD breaks CARP or not.	2005-10-22 14:54:02 +00:00
glebius	9ae88c3a40	In in_addprefix() compare not only route addresses, but their masks, too. This fixes problem when connected prefixes overlap. Obtained from: OpenBSD (rev. 1.40 by claudio); [ I came to this fix myself, and then found out that OpenBSD had already fixed it the same way.]	2005-10-22 14:50:27 +00:00
suz	c2b19f24a4	sync with KAME regarding NDP - introduced fine-grain-timer to manage ND-caches and IPv6 Multicast-Listeners - supports Router-Preference <draft-ietf-ipv6-router-selection-07.txt> - better prefix lifetime management - more spec-conformant DAD advertisement - updated RFC/internet-draft revisions Obtained from: KAME Reviewed by: ume, gnn MFC after: 2 month	2005-10-21 16:23:01 +00:00
rwatson	2043307751	Convert if (tp->t_state == TCPS_LISTEN) panic() into a KASSERT. MFC after: 2 weeks	2005-10-19 09:37:52 +00:00
thompsa	d6130a4703	Change the reference counting to count the number of cloned interfaces for each cloner. This ensures that ifc->ifc_units is not prematurely freed in if_clone_detach() before the clones are destroyed, resulting in memory modified after free. This could be triggered with if_vlan. Assert that all cloners have been destroyed when freeing the memory. Change all simple cloners to destroy their clones with ifc_simple_destroy() on module unload so the reference count is properly updated. This also cleans up the interface destroy routines and allows future optimisation. Discussed with: brooks, pjd, -current Reviewed by: brooks	2005-10-12 19:52:16 +00:00
maxim	98442a62dc	o INP_ONESBCAST is inpcb.inp_vflag flag not inp_flags. The confusion with IP_PORTRANGE_HIGH leads to the incorrect checksum calculation. PR: kern/87306 Submitted by: Rickard Lind Reviewed by: bms MFC after: 2 weeks	2005-10-12 18:13:25 +00:00
philip	806a613e9d	Unbreak the net.inet6.tcp6.getcred sysctl. This makes inetd/auth work again in IPv6 setups. Pointy hat to: ume/KAME	2005-10-12 09:24:18 +00:00
thompsa	fe5395b846	When bridging is enabled and an ARP request is received on a member interface, the arp code will search all local interfaces for a match. This triggers a kernel log if the bridge has been assigned an address. arp: ac:de:48:18:83:3d is using my IP address 192.168.0.142! bridge0: flags=8041<UP,RUNNING,MULTICAST> mtu 1500 inet 192.168.0.142 netmask 0xffffff00 ether ac:de:48:18:83:3d Silence this warning for 6.0 to stop unnecessary bug reports, the code will need to be reworked. Approved by: mlaier (mentor) MFC after: 3 days	2005-10-04 19:50:02 +00:00
andre	7990daf715	Correct brainfart in SO_BINTIME test. Pointed out by: nate Pointy hat to: andre	2005-10-04 18:19:21 +00:00
andre	71bc671ea0	Make SO_BINTIME timestamps available on raw_ip sockets. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-10-04 18:07:11 +00:00
rwatson	dc2be38288	Unlock Giant symmetrically with respect to lock acquire order as that's generally nicer. Spotted by: johan MFC after: 1 week	2005-10-03 11:34:29 +00:00
rwatson	2d9ef0269a	Acquire Giant conditionally in in_addmulti() and in_delmulti() based on whether the interface being accessed is IFF_NEEDSGIANT or not. This avoids lock order reversals when calling into the interface ioctl handler, which could potentially lead to deadlock. The long term solution is to eliminate non-MPSAFE network drivers. Discussed with: jhb MFC after: 1 week	2005-10-03 11:09:39 +00:00
maxim	4a3de87cfe	o Teach sysctl_drop() how to deal with the sockets in TIME_WAIT state. This is a special case because tcp_twstart() destroys a tcp control block via tcp_discardcb() so we cannot call tcp_drop(struct *tcpcb) on such connections. Use tcp_twclose() instead. MFC after: 5 days	2005-10-02 08:43:57 +00:00
mlaier	a42af632d8	Remove bridge(4) from the tree. if_bridge(4) is a full functional replacement and has additional features which make it superior. Discussed on: -arch Reviewed by: thompsa X-MFC-after: never (RELENG_6 as transition period)	2005-09-27 18:10:43 +00:00
andre	bedcd4ace8	Implement IP_DONTFRAG IP socket option enabling the Don't Fragment flag on IP packets. Currently this option is only respected on udp and raw ip sockets. On tcp sockets the DF flag is controlled by the path MTU discovery option. Sending a packet larger than the MTU size of the egress interface returns an EMSGSIZE error. Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005	2005-09-26 20:25:16 +00:00
andre	b1aa5bb640	Use monotonic 'time_uptime' instead of 'time_second' as timebase for rt->rt_rmx.rmx_expire.	2005-09-19 22:54:55 +00:00
andre	c4178ac83e	Use monotonic 'time_uptime' instead of 'time_second' as timebase for timeouts.	2005-09-19 22:31:45 +00:00
rwatson	64eedb0310	Take a first cut at cleaning up ifnet removal and multicast socket panics, which occur when stale ifnet pointers are left in struct moptions hung off of inpcbs: - Add in_ifdetach(), which matches in6_ifdetach(), and allows the protocol to perform early tear-down on the interface early in if_detach(). - Annotate that if_detach() needs careful consideration. - Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR -- this is not the place to detect interface removal! This also removes what is basically a nasty (and now unnecessary) hack. - Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP IPv4 sockets. It is now possible to run the msocket_ifnet_remove regression test using HEAD without panicking. MFC after: 3 days	2005-09-18 17:36:28 +00:00
andre	418282b822	Do not ignore all other TCP options (eg. timestamp, window scaling) when responding to TCP SYN packets with TCP_MD5 enabled and set. PR: kern/82963 Submitted by: <demizu at dd.iij4u.or.jp> MFC after: 3 days	2005-09-14 15:06:22 +00:00
bz	4d056a4077	Fix panic when kernel compiled without INET6 by rejecting IPv6 opcodes which are behind #if(n)def INET6 now. PR: kern/85826 MFC after: 3 days	2005-09-14 07:53:54 +00:00
andre	37d27a5a23	In tcp_ctlinput() do not swap ip->ip_len a second time. It has been done in icmp_input() already. This fixes the ICMP_UNREACH_NEEDFRAG case where no MTU was proposed in the ICMP reply. PR: kern/81813 Submitted by: Vitezslav Novy <vita at fio.cz> MFC after: 3 days	2005-09-10 07:43:29 +00:00
glebius	f609e1ae48	- Do not hold route entry lock, when calling arprequest(). One such call was introduced by me in 1.139, the other one was present before. - Do all manipulations with rtentry and la before dropping the lock. - Copy interface address from route into local variable before dropping the lock. Supply this copy as argument to arprequest() LORs fixed: http://sources.zabbadoz.net/freebsd/lor/003.html http://sources.zabbadoz.net/freebsd/lor/037.html http://sources.zabbadoz.net/freebsd/lor/061.html http://sources.zabbadoz.net/freebsd/lor/062.html http://sources.zabbadoz.net/freebsd/lor/064.html http://sources.zabbadoz.net/freebsd/lor/068.html http://sources.zabbadoz.net/freebsd/lor/071.html http://sources.zabbadoz.net/freebsd/lor/074.html http://sources.zabbadoz.net/freebsd/lor/077.html http://sources.zabbadoz.net/freebsd/lor/093.html http://sources.zabbadoz.net/freebsd/lor/135.html http://sources.zabbadoz.net/freebsd/lor/140.html http://sources.zabbadoz.net/freebsd/lor/142.html http://sources.zabbadoz.net/freebsd/lor/145.html http://sources.zabbadoz.net/freebsd/lor/152.html http://sources.zabbadoz.net/freebsd/lor/158.html	2005-09-09 10:06:27 +00:00
glebius	044310140d	When a carp(4) interface is being destroyed and is in a promiscuous mode, first interface is detached from parent and then bpfdetach() is called. If the interface was the last carp(4) interface attached to parent, then the mutex on parent is destroyed. When bpfdetach() calls if_setflags() we panic on destroyed mutex. To prevent the above scenario, clear pointer to parent, when we detach ourselves from parent.	2005-09-09 08:41:39 +00:00
sam	36e4de77c6	clear lock on error in O_LIMIT case of install_state Submitted by: Ted Unangst MFC after: 3 days	2005-09-04 17:33:40 +00:00
andre	02715a1de8	Use the correct mbuf type for MGET().	2005-08-30 16:35:27 +00:00
glebius	d7f01fc7c7	Add newline to debugging printf. PR: kern/85271 Submitted by: Simon Morgan	2005-08-26 15:27:18 +00:00
glebius	2bb913b9a9	- Refuse hashsize of 0, since it is invalid. - Use defined constant instead of 512.	2005-08-25 13:57:00 +00:00
glebius	fe6f4bd975	When we have a published ARP entry for some IP address, reply to ARP requests only on the network to which this IP address belongs. Before this change we replied on all interfaces. This could lead to an IP address conflict with host we are doing ARP proxy for. PR: kern/75634 Reviewed by: andre	2005-08-25 13:25:57 +00:00
ps	813fe6cce6	Remove a KASSERT in the sack path that fails because of an interaction between sack and a bug in the "bad retransmit recovery" logic. This is a workaround, the underlying bug will be fixed later. Submitted by: Mohan Srinivasan, Noritoshi Demizu	2005-08-24 02:48:45 +00:00
ps	44ee0af629	Fix up the comment for MAX_SACK_BLKS. Submitted by: Noritoshi Demizu	2005-08-24 02:47:16 +00:00
andre	bc08e8e9de	Remove unnecessary IPSEC includes. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-23 14:42:40 +00:00
andre	9b00236fed	o Fix a logic error when not doing mbuf cluster allocation. o Change an old panic() to a clean function exit. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-22 22:13:41 +00:00
andre	573a9535a8	Add socketoption IP_MINTTL. May be used to set the minimum acceptable TTL a packet must have when received on a socket. All packets with a lower TTL are silently dropped. Works on already connected/connecting and listening sockets for RAW/UDP/TCP. This option is only really useful when set to 255 preventing packets from outside the directly connected networks reaching local listeners on sockets. Allows userland implementation of 'The Generalized TTL Security Mechanism (GTSM)' according to RFC3682. Examples of such use include the Cisco IOS BGP implementation command "neighbor ttl-security". MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-22 16:13:08 +00:00
andre	ef76245e90	Always quote the entire TCP header when responding and allocate an mbuf cluster if needed. Fixes the TCP issues raised in I-D draft-gont-icmp-payload-00.txt. This aids in-the-wild debugging a lot and allows the receiver to do more elaborate checks on the validity of the response. MFC after: 2 weeks Sponsored by: TCP/IP Optimization Fundraise 2005	2005-08-22 14:12:18 +00:00
andre	1afbca0a45	Handle pure layer 2 broad- and multicasts properly and simplify related checks. PR: kern/85052 Submitted by: Dmitrij Tejblum <tejblum at yandex-team.ru> MFC after: 3 days	2005-08-22 12:06:26 +00:00
andre	685b6138c1	Commit correct version of the change and note the name of the new sysctl: net.inet.icmp.quotelen and defaults to 8 bytes. Pointy hat to: andre	2005-08-21 15:18:00 +00:00
andre	11e9c09778	Add a sysctl to change the length of the quotation of the original packet in an ICMP reply. The minimum of 8 bytes is internally enforced. The maximum quotation is the remaining space in the reply mbuf. This option is added in response to the issues raised in I-D draft-gont-icmp-payload-00.txt. MFC after: 2 weeks Sponsored by: TCP/IP Optimizations Fundraise 2005	2005-08-21 15:09:07 +00:00
andre	cd4fd9f0b7	Add an option to have ICMP replies to non-local packets generated with the IP address the packet came through in. This is useful for routers to show in traceroutes the actual path a packet has taken instead of the possibly different return path. The new sysctl is named net.inet.icmp.reply_from_interface and defaults to off. MFC after: 2 weeks	2005-08-21 12:29:39 +00:00
glebius	b2a592be59	In order to support CARP interfaces kernel was taught to handle more than one interface in one subnet. However, some userland apps rely on the belief that this configuration is impossible. Add a sysctl switch net.inet.ip.same_prefix_carp_only. If the switch is on, then kernel will refuse to add an additional interface to already connected subnet unless the interface is CARP. Default value is off. PR: bin/82306 In collaboration with: mlaier	2005-08-18 10:34:30 +00:00
bz	14ab5de8fe	Fix broken build of rev. 1.108 in case of no INET6 and IPFIREWALL compiled into kernel. Spotted and tested by: Michal Mertl <mime at traveller.cz>	2005-08-14 18:20:33 +00:00
bz	5434a58808	* Add dynamic sysctl for net.inet6.ip6.fw. * Correct handling of IPv6 Extension Headers. * Add unreach6 code. * Add logging for IPv6. Submitted by: sysctl handling derived from patch from ume needed for ip6fw Obtained from: is_icmp6_query and send_reject6 derived from similar functions of netinet6,ip6fw Reviewed by: ume, gnn; silence on ipfw@ Test setup provided by: CK Software GmbH MFC after: 6 days	2005-08-13 11:02:34 +00:00
rodrigc	a0f768a4dd	Add NATM_LOCK() and NATM_UNLOCK() in places where npcb_add() and npcb_free() are called, in order to eliminate witness panics. This was overlooked in removal of GIANT from ATM. Reviewed by: rwatson	2005-08-12 02:38:20 +00:00
glebius	c7fa71afc5	o Fix a race between three threads: output path, incoming ARP packet and route request adding/removing ARP entries. The root of the problem is that struct llinfo_arp was accessed without any locks. To close race we will use locking provided by rtentry, that references this llinfo_arp: - Make arplookup() return a locked rtentry. - In arpresolve() hold the lock provided by rt_check()/arplookup() until the end of function, covering all accesses to the rtentry itself and llinfo_arp it refers to. - In in_arpinput() do not drop lock provided by arplookup() during first part of the function. - Simplify logic in the first part of in_arpinput(), removing one level of indentation. - In the second part of in_arpinput() hold rtentry lock while copying address. o Fix a condition when route entry is destroyed, while another thread is contested on its lock: - When storing a pointer to rtentry in llinfo_arp list, always add a reference to this rtentry, to prevent rtentry being destroyed via RTM_DELETE request. - Remove this reference when removing entry from llinfo_arp list. o Further cleanup of arptimer(): - Inline arptfree() into arptimer(). - Use official queue(3) way to pass LIST. - Hold rtentry lock while reading its structure. - Do not check that sdl_family is AF_LINK, but assert this. Reviewed by: sam Stress test: http://www.holm.cc/stress/log/cons141.html Stress test: http://people.freebsd.org/~pho/stress/log/cons144.html	2005-08-11 08:25:48 +00:00
obrien	d5d343a0fd	Remove public declarations of variables that were forgotten when they were made static.	2005-08-10 07:10:02 +00:00
obrien	a76eb9e4d1	Match IPv6 and use a static struct pr_usrreqs nousrreqs.	2005-08-10 06:41:04 +00:00
rwatson	693479e03f	Add helper function ip_findmoptions(), which accepts an inpcb, and attempts to atomically return either an existing set of IP multicast options for the PCB, or a newly allocated set with default values. The inpcb is returned locked. This function may sleep. Call ip_moptions() to acquire a reference to a PCB's socket options, and perform the update of the options while holding the PCB lock. Release the lock before returning. Remove garbage collection of multicast options when values return to the default, as this complicates locking substantially. Most applications allocate a socket either to be multicast, or not, and don't tend to keep around sockets that have previously been used for multicast, then used for unicast. This closes a number of race conditions involving multiple threads or processes modifying the IP multicast state of a socket simultaneously. MFC after: 7 days	2005-08-09 17:19:21 +00:00
rwatson	5d770a09e8	Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field. Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so. Reviewed by: pjd, bz MFC after: 7 days	2005-08-09 10:20:02 +00:00
glebius	f2edd835a7	In preparation for fixing races in ARP (and probably in other L2/L3 mappings) make rt_check() return a locked rtentry.	2005-08-09 08:39:56 +00:00
rwatson	7504160c1e	Introduce in_multi_mtx, which will protect IPv4-layer multicast address lists, as well as accessor macros. For now, this is a recursive mutex due to code sequences where IPv4 multicast calls into IGMP calls into ip_output(), which then tests for a multicast forwarding case. For support macros in in_var.h to check multicast address lists, assert that in_multi_mtx is held. Acquire in_multi_mtx around iteration over the IPv4 multicast address lists, such as in ip_input() and ip_output(). Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses, as well as over the manipulation of ifnet multicast address lists in order to keep the two layers in sync. Lock down accesses to IPv4 multicast addresses in IGMP, or assert the lock when performing IGMP join/leave events. Eliminate spl's associated with IPv4 multicast addresses, portions of IGMP that weren't previously expunged by IGMP locking. Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded lock order in WITNESS, in that order. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 10 days	2005-08-03 19:29:47 +00:00
rwatson	efa13b9da9	Modify network protocol consumers of the ifnet multicast address lists to lock if_addr_mtx. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 1 week	2005-08-02 23:51:22 +00:00
ume	084dd86e04	recover the line which was wrongly disappeared during scope cleanup. tcpdrop(8) should work for IPv6, again.	2005-08-01 12:08:49 +00:00
bz	6860ef65eb	Add support for IPv6 over GRE [1]. PR kern/80340 includes the FreeBSD specific ip_newid() changes NetBSD does not have. Correct handling of non AF_INET packets passed to bpf [2]. PR: kern/80340[1], NetBSD PRs 29150[1], 30844[2] Obtained from: NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56 Submitted by: Gert Doering <gert at greenie.muc.de>[2] MFC after: 4 days	2005-08-01 08:14:21 +00:00
ume	b7e0fb45da	include scope6_var.h for in6_clearscope().	2005-07-26 00:19:58 +00:00
ume	817761fa0f	include netinet6/scope6_var.h.	2005-07-25 12:36:43 +00:00
ume	da2cf62b28	scope cleanup. with this change - most of the kernel code will not care about the actual encoding of scope zone IDs and won't touch "s6_addr16[1]" directly. - similarly, most of the kernel code will not care about link-local scoped addresses as a special case. - scope boundary check will be stricter. For example, the current BSD code allows a packet with src=::1 and dst=(some global IPv6 address) to be sent outside of the node, if the application do: s = socket(AF_INET6); bind(s, "::1"); sendto(s, some_global_IPv6_addr); This is clearly wrong, since ::1 is only meaningful within a single node, but the current implementation of the BSD kernel cannot reject this attempt. Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp> Obtained from: KAME	2005-07-25 12:31:43 +00:00
keramida	35eb3b8312	Misc spelling and/or English fixes in comments. Reviewed by: glebius, andre	2005-07-23 00:59:13 +00:00
ume	c8f443708b	move RFC3542 related definitions into ip6.h. Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Reviewed by: mlaier Obtained from: KAME	2005-07-20 10:30:52 +00:00
ume	ace81960a0	add missing RFC3542 definition. Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Obtained from: KAME	2005-07-20 09:17:41 +00:00
ume	b8e17ef763	update comments: - RFC2292bis -> RFC3542 - typo fixes Submitted by: Keiichi SHIMA <keiichi__at__iijlab.net> Obtained from: KAME	2005-07-20 08:59:45 +00:00
rwatson	d49765d3fd	Remove no-op spl references in in_pcb.c, since in_pcb locking has been basically complete for several years now. Update one spl comment to reference the locking strategy. MFC after: 3 days	2005-07-19 12:24:27 +00:00
rwatson	a7ad7c72d9	Remove no-op spl's and most comment references to spls, as TCP locking is believed to be basically done (modulo any remaining bugs). MFC after: 3 days	2005-07-19 12:21:26 +00:00
rwatson	7399e6761f	Remove spl() calls from ip_slowtimo(), as IP fragment queue locking was merged several years ago. Submitted by: gnn MFC after: 1 day	2005-07-19 12:14:22 +00:00
mlaier	a390027dcd	Export pfsyncstats via sysctl "net.inet.pfsync" in order to print them with netstat (separate commit). Requested by: glebius MFC after: 1 week	2005-07-14 22:22:51 +00:00
rwatson	5e71112509	Eliminate MAC entry point mac_create_mbuf_from_mbuf(), which is redundant with respect to existing mbuf copy label routines. Expose a new mac_copy_mbuf() routine at the top end of the Framework and use that; use the existing mpo_copy_mbuf_label() routine on the bottom end. Obtained from: TrustedBSD Project Sponsored by: SPARTA, SPAWAR Approved by: re (scottl)	2005-07-05 23:39:51 +00:00
ps	17e64b657e	Fix for a bug in newreno partial ack handling where if a large amount of data is partial acked, snd_cwnd underflows, causing a burst. Found, Submitted by: Noritoshi Demizu Approved by: re	2005-07-05 19:23:02 +00:00
mlaier	a830c4a47a	Remove ambiguity from hlen. IPv4 is now indicated by is_ipv4 and we need a proper hlen value for IPv6 to implement O_REJECT and O_LOG. Reviewed by: glebius, brooks, gnn Approved by: re (scottl)	2005-07-03 15:42:22 +00:00
thompsa	d7e928629d	Check the alignment of the IP header before passing the packet up to the packet filter. This would cause a panic on architectures that require strict alignment such as sparc64 (tier1) and ia64/ppc (tier2). This adds two new macros that check the alignment, these are compile time dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where alignment isn't needed so the cost is avoided. IP_HDR_ALIGNED_P() IP6_HDR_ALIGNED_P() Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment is checked for ipfw and dummynet too. PR: ia64/81284 Obtained from: NetBSD Approved by: re (dwhite), mlaier (mentor)	2005-07-02 23:13:31 +00:00
ps	5c9927b6f6	Fix for a bug in the change that defers sack option processing until after PAWS checks. The symptom of this is an inconsistency in the cached sack state, caused by the fact that the sack scoreboard was not being updated for an ACK handled in the header prediction path. Found by: Andrey Chernov. Submitted by: Noritoshi Demizu, Raja Mukerji. Approved by: re	2005-07-01 22:54:18 +00:00
ps	513734e2fa	Fix for a SACK crash caused by a bug in tcp_reass(). tcp_reass() does not clear tlen and frees the mbuf (leaving th pointing at freed memory), if the data segment is a complete duplicate. This change works around that bug. A fix for the tcp_reass() bug will appear later (that bug is benign for now, as neither th nor tlen is referenced in tcp_input() after the call to tcp_reass()). Found by: Pawel Jakub Dawidek. Submitted by: Raja Mukerji, Noritoshi Demizu. Approved by: re	2005-07-01 22:52:46 +00:00
glebius	c6a8611901	When doing ARP load balancing source IP is taken in network byte order, so residue of division for all hosts on net is the same, and thus only one VHID answers. Change source IP in host byte order. Reviewed by: mlaier Approved by: re (scottl)	2005-07-01 08:22:13 +00:00
simon	76d3f5f676	Fix ipfw packet matching errors with address tables. The ipfw tables lookup code caches the result of the last query. The kernel may process multiple packets concurrently, performing several concurrent table lookups. Due to insufficient locking, a cached result can become corrupted that could cause some addresses to be incorrectly matched against a lookup table. Submitted by: ru Reviewed by: csjp, mlaier Security: CAN-2005-2019 Security: FreeBSD-SA-05:13.ipfw Correct bzip2 permission race condition vulnerability. Obtained from: Steve Grubb via RedHat Security: CAN-2005-0953 Security: FreeBSD-SA-05:14.bzip2 Approved by: obrien Correct TCP connection stall denial of service vulnerability. A TCP packet with the SYN flag set is accepted for established connections, allowing an attacker to overwrite certain TCP options. Submitted by: Noritoshi Demizu Reviewed by: andre, Mohan Srinivasan Security: CAN-2005-2068 Security: FreeBSD-SA-05:15.tcp Approved by: re (security blanket), cperciva	2005-06-29 21:36:49 +00:00
ps	5dc6983c1d	- Postpone SACK option processing until after PAWS checks. SACK option processing is now done in the ACK processing case. - Merge tcp_sack_option() and tcp_del_sackholes() into a new function called tcp_sack_doack(). - Test (SEG.ACK < SND.MAX) before processing the ACK. Submitted by: Noritoshi Demizu Reviewed by: Mohan Srinivasan, Raja Mukerji Approved by: re	2005-06-27 22:27:42 +00:00
phk	4df35786e2	Libalias incorrectly applies proxy rules to the global divert socket: it should only look for existing translation entries, not create new ones (no matter how it got the idea). Approved by: re(scottl)	2005-06-27 22:21:42 +00:00
glebius	c6e57e046e	Disable checksum processing in LibAlias, when it works as a kernel module. LibAlias is not aware about checksum offloading, so the caller should provide checksum calculation. (The only current consumer is ng_nat(4)). When TCP packet internals has been changed and it requires checksum recalculation, a cookie is set in th_x2 field of TCP packet, to inform caller that it needs to recalculate checksum. This ugly hack would be removed when LibAlias is made more kernel friendly. Incremental checksum updates are left as is, since they don't conflict with offloading. Approved by: re (scottl)	2005-06-27 07:36:02 +00:00
dwmalone	f1f0123e88	Fix some long standing bugs in writing to the BPF device attached to a DLT_NULL interface. In particular: 1) Consistently use type u_int32_t for the header of a DLT_NULL device - it continues to represent the address family as always. 2) In the DLT_NULL case get bpf_movein to store the u_int32_t in a sockaddr rather than in the mbuf, to be consistent with all the DLT types. 3) Consequently fix a bug in bpf_movein/bpfwrite which only permitted packets up to 4 bytes less than the MTU to be written. 4) Fix all DLT_NULL devices to have the code required to allow writing to their bpf devices. 5) Move the code to allow writing to if_lo from if_simloop to looutput, because it only applies to DLT_NULL devices but was being applied to other devices that use if_simloop possibly incorrectly. PR: 82157 Submitted by: Matthew Luckie <mjl@luckie.org.nz> Approved by: re (scottl)	2005-06-26 18:11:11 +00:00
ups	8f1c0248bd	Fix a timer ticks wrap around bug for minmssoverload processing. Approved by: re (scottl,dwhite) MFC after: 4 weeks	2005-06-25 22:24:45 +00:00
imp	d1b7fc96b0	Add back missing copyright and license statement. This is identical to the statement in ip_mroute.h, as well as being the same as what OpenBSD has done with this file. It matches the copyright in NetBSD's 1.1 through 1.14 versions of the file as well, which they subsequently added back. It appears to have been lost in the 4.4-lite1 import for FreeBSD 2.0, but where and why I've not investigated further. OpenBSD had the same problem. NetBSD had a copyright notice until Multicast 3.5 was integrated verbatim back in 1995. This appears to be the version that made it into 4.4-lite1. Approved by: re (scottl) MFC after: 3 days	2005-06-23 18:42:58 +00:00
ps	1f7ded928d	Fix for a bug in tcp_sack_option() causing crashes. Submitted by: Noritoshi Demizu, Mohan Srinivasan. Approved by: re (scottl blanket SACK)	2005-06-23 00:18:54 +00:00
bz	e8078a9f93	Fix IP(v6) over IP tunneling most likely broken with ifnet changes. Reviewed by: gnn Approved by: re (dwhite), rwatson (mentor)	2005-06-20 08:39:30 +00:00
glebius	5b74333367	- Don't use legacy function in a non-legacy one. This gives us possibility to compile libalias without legacy support. - Use correct way to mark variable as unused. Approved by: re (dwhite)	2005-06-20 08:31:48 +00:00
mlaier	702ab5eeef	In verify_rev_path6(): - do not use static memory as we are under a shared lock only - properly rtfree routes allocated with rtalloc - rename to verify_path6() - implement the full functionality of the IPv4 version Also make O_ANTISPOOF work with IPv6. Reviewed by: gnn Approved by: re (blanket)	2005-06-16 14:55:58 +00:00
mlaier	02006202bd	Fix indentation in INET6 section in preparation of more serious work. Approved by: re (blanket ip6fw removal)	2005-06-16 13:20:36 +00:00
mlaier	f82a1d5a8e	When doing matching based on dst_ip/src_ip make sure we are really looking on an IPv4 packet as these variables are uninitialized if not. This used to allow arbitrary IPv6 packets depending on the value in the uninitialized variables. Some opcodes (most notably O_REJECT) do not support IPv6 at all right now. Reviewed by: brooks, glebius Security: IPFW might pass IPv6 packets depending on stack contents. Approved by: re (blanket)	2005-06-12 16:27:10 +00:00
brooks	567ba9b00a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
green	9f795bf703	Modify send_pkt() to return the generated packet and have the caller do the subsequent ip_output() in IPFW. In ipfw_tick(), the keep-alive packets must be generated from the data that resides under the stateful lock, but they must not be sent at that time, as this would cause a lock order reversal with the normal ordering (interface's lock, then locks belonging to the pfil hooks). In practice, this caused deadlocks when using IPFW and if_bridge(4) together to do stateful transparent filtering. MFC after: 1 week	2005-06-10 12:28:17 +00:00
thompsa	aa4ecbabe5	Add dummynet(4) support to if_bridge, this code is largely based on bridge.c. This is the final piece to match bridge.c in functionality, we can now be a drop-in replacement. Approved by: mlaier (mentor)	2005-06-10 01:25:22 +00:00
ps	63b2db1458	Fix a mis-merge. Remove a redundant call to tcp_sackhole_insert Submitted by: Mohan Srinivasan	2005-06-09 17:55:29 +00:00
ps	0c131b3254	Fix for a crash in tcp_sack_option() caused by hitting the limit on the number of sack holes. Reported by: Andrey Chernov Submitted by: Noritoshi Demizu Reviewed by: Raja Mukerji	2005-06-09 14:01:04 +00:00
ps	10d5de225b	Fix for a bug in the change that walks the scoreboard backwards from the tail (in tcp_sack_option()). The bug was caused by incorrect accounting of the retransmitted bytes in the sackhint. Reported by: Kris Kennaway. Submitted by: Noritoshi Demizu.	2005-06-06 19:46:53 +00:00
thompsa	6c5bdda300	Add hooks into the networking layer to support if_bridge. This changes struct ifnet so a buildworld is necessary. Approved by: mlaier (mentor) Obtained from: NetBSD	2005-06-05 03:13:13 +00:00
green	497a5998af	Better explain, then actually implement the IPFW ALTQ-rule first-match policy. It may be used to provide more detailed classification of traffic without actually having to decide its fate at the time of classification. MFC after: 1 week	2005-06-04 19:04:31 +00:00
ps	c6eb61a11e	Changes to tcp_sack_option() that - Walks the scoreboard backwards from the tail to reduce the number of comparisons for each sack option received. - Introduce functions to add/remove sack scoreboard elements, making the code more readable. Submitted by: Noritoshi Demizu Reviewed by: Raja Mukerji, Mohan Srinivasan	2005-06-04 08:03:28 +00:00
mlaier	f2254cf702	Add support for IPv4 only rules to IPFW2 now that it supports IPv6 as well. This is the last requirement before we can retire ip6fw. Reviewed by: dwhite, brooks(earlier version) Submitted by: dwhite (manpage) Silence from: -ipfw	2005-06-03 01:10:28 +00:00
iedowse	7de9a3957f	Use IFF_LOCKGIANT/IFF_UNLOCKGIANT around calls to the interface if_ioctl routine. This should fix a number of code paths through soo_ioctl() that could call into Giant-locked network drivers without first acquiring Giant.	2005-06-02 00:04:08 +00:00
rwatson	d363d282f8	When aborting tcp_attach() due to a problem allocating or attaching the tcpcb, lock the inpcb before calling in_pcbdetach() or in6_pcbdetach(), as they expect the inpcb to be passed locked. MFC after: 7 days	2005-06-01 12:14:56 +00:00
rwatson	fc80f5f380	Assert tcbinfo lock, inpcb lock in tcp_disconnect(). Assert tcbinfo lock, inpcb lock in tcp_usrclosed(). MFC after: 7 days	2005-06-01 12:08:15 +00:00
rwatson	36ab8094dd	Assert tcbinfo lock in tcp_drop() due to its call of tcp_close() Assert tcbinfo lock in tcp_close() due to its call to in{,6}_detach() Assert tcbinfo lock in tcp_drop_syn_sent() due to its call to tcp_drop() MFC after: 7 days	2005-06-01 12:06:07 +00:00
rwatson	07bd5e1d17	Assert that tcbinfo is locked in tcp_input() before calling into tcp_drop(). MFC after: 7 days	2005-06-01 12:03:18 +00:00
rwatson	fe0bf623ad	Assert the tcbinfo lock whenever tcp_close() is to be called by tcp_input(). MFC after: 7 days	2005-06-01 11:49:14 +00:00
rwatson	a5b21ce8f4	Assert tcbinfo lock in tcp_attach(), as it is required; the caller (tcp_usr_attach()) currently grabs it. MFC after: 7 days	2005-06-01 11:44:43 +00:00
rwatson	be143d8ea5	Commit correct version of previous commit (in_pcb.c:1.164). Use the local variables as currently named. MFC after: 7 days	2005-06-01 11:43:39 +00:00
rwatson	ad803f0089	Assert pcbinfo lock in in_pcbdisconnect() and in_pcbdetach(), as the global pcb lists are modified. MFC after: 7 days	2005-06-01 11:39:42 +00:00
rwatson	200ac8ea6b	Slight white space tweak. MFC after: 7 days	2005-06-01 11:38:35 +00:00
rwatson	18e2f22abb	De-spl UDP. MFC after: 3 days	2005-06-01 11:24:00 +00:00
tanimura	1142d02df8	Let OSPFv3 go through ipfw. Some more additional checks would be desirable, though.	2005-05-28 07:46:44 +00:00
ps	8512119a50	This is conform with the terminology in M.Mathis and J.Mahdavi, "Forward Acknowledgement: Refining TCP Congestion Control" SIGCOMM'96, August 1996. Submitted by: Noritoshi Demizu, Raja Mukerji	2005-05-25 17:55:27 +00:00
ps	d8a59510c5	Rewrite of tcp_sack_option(). Kentaro Kurahone (NetBSD) pointed out that if we sort the incoming SACK blocks, we can update the scoreboard in one pass of the scoreboard. The added overhead of sorting up to 4 sack blocks is much lower than traversing (potentially) large scoreboards multiple times. The code was updating the scoreboard with multiple passes over it (once for each sack option). The rewrite fixes that, reducing the complexity of the main loop from O(n^2) to O(n). Submitted by: Mohan Srinivasan, Noritoshi Demizu. Reviewed by: Raja Mukerji.	2005-05-23 19:22:48 +00:00
ps	8c89591e0a	Replace t_force with a t_flag (TF_FORCEDATA). Submitted by: Raja Mukerji. Reviewed by: Mohan, Silby, Andre Opperman.	2005-05-21 00:38:29 +00:00
ps	8ce9214165	Introduce routines to alloc/free sack holes. This cleans up the code considerably. Submitted by: Noritoshi Demizu. Reviewed by: Raja Mukerji, Mohan Srinivasan.	2005-05-16 19:26:46 +00:00
glebius	ed7b2b9937	- When carp interface is destroyed, and it affects global preemption suppression counter, decrease the latter. [1] - Add sysctl to monitor preemption suppression. PR: kern/80972 [1] Submitted by: Frank Volf [1] MFC after: 1 week	2005-05-15 01:44:26 +00:00
ps	861ed0532c	Fix for a bug where the "nexthole" sack hint is out of sync with the real next hole to retransmit from the scoreboard, caused by a bug which did not update the "nexthole" hint in one case in tcp_sack_option(). Reported by: Daniel Eriksson Submitted by: Mohan Srinivasan	2005-05-13 18:02:02 +00:00
glebius	407f5f009c	In div_output() explicitly set m->m_nextpkt to NULL. If divert socket is not userland, but ng_ksocket, then m->m_nextpkt may be non-NULL. In this case we would panic in sbappend.	2005-05-13 11:44:37 +00:00
ps	0ee2317201	When looking for the next hole to retransmit from the scoreboard, or to compute the total retransmitted bytes in this sack recovery episode, the scoreboard is traversed. While in sack recovery, this traversal occurs on every call to tcp_output(), every dupack and every partial ack. The scoreboard could potentially get quite large, making this traversal expensive. This change optimizes this by storing hints (for the next hole to retransmit and the total retransmitted bytes in this sack recovery episode) reducing the complexity to find these values from O(n) to constant time. The debug code that sanity checks the hints against the computed value will be removed eventually. Submitted by: Mohan Srinivasan, Noritoshi Demizu, Raja Mukerji.	2005-05-11 21:37:42 +00:00
cperciva	a199a4f74b	Fix two issues which were missed in FreeBSD-SA-05:08.kmem. Reported by: Uwe Doering	2005-05-07 00:41:36 +00:00
glebius	62a51b2dbf	Add a workaround for 64-bit archs: store unsigned long return value in temporary variable, check it and then cast to in_addr_t.	2005-05-06 13:01:31 +00:00
glebius	3eab4ba5cc	s/DEBUG/LIBALIAS_DEBUG/, since DEBUG is defined in LINT and not supported for kernel build.	2005-05-06 11:07:49 +00:00
cperciva	e513415af9	If we are going to 1. Copy a NULL-terminated string into a fixed-length buffer, and 2. copyout that buffer to userland, we really ought to 0. Zero the entire buffer first. Security: FreeBSD-SA-05:08.kmem	2005-05-06 02:50:00 +00:00
glebius	225ea2fb02	More bits for kernel version: - copy inet_aton() from libc - disable getservbyname() lookup and accept only numeric port	2005-05-05 22:00:32 +00:00
glebius	32b37f4983	Always include alias.h before alias_local.h	2005-05-05 21:55:17 +00:00
glebius	98fced80c8	When used in kernel define NO_FW_PUNCH, NO_LOGGING, NO_USE_SOCKETS.	2005-05-05 21:53:17 +00:00
glebius	fa23219cf2	Fix argument order for bcopy() in last commit. Noticed by: njl Pointy hat to: glebius	2005-05-05 21:40:49 +00:00
glebius	dfbc79ffbf	Use bcopy() instead of memmove().	2005-05-05 21:10:51 +00:00
glebius	e31c062e0b	Hide fflush(3) under ifdef DEBUG.	2005-05-05 21:07:34 +00:00
glebius	d1dba4a851	Things required to build libalias as kernel module: - kernel module declarations and handler. - macros to map malloc(3) calls to malloc(9) ones. - malloc(9) declarations. - call finishoff() from module handler MOD_UNLOAD case instead of atexit(3). - use panic(9) instead of abort(3) - take time from time_second instead of gettimeofday(2) - define INADDR_NONE	2005-05-05 21:05:38 +00:00
glebius	81aed9a0a8	Add NO_USE_SOCKETS knob, which cuts off functionality socket binding.	2005-05-05 20:25:12 +00:00
glebius	497e6391d0	Add NO_LOGGING knob, which cuts off functionality of debug logging to a file.	2005-05-05 20:22:09 +00:00
glebius	89e40708a1	Play with includes so that libalias can be compiled both as userland library and kernel module.	2005-05-05 19:27:32 +00:00
andre	757e9ec267	If we don't get a suggested MTU during path MTU discovery look up the packet size of the packet that generated the response, step down the MTU by one step through ip_next_mtu() and try again. Suggested by: dwmalone	2005-05-04 13:48:44 +00:00
glebius	a0f1ce3be1	Cleanup IPFW2 ifdefs.	2005-05-04 13:24:37 +00:00
glebius	6d827336e9	Makefile is not needed here.	2005-05-04 13:24:12 +00:00
andre	e4576466d8	Add another step of 1280 (gif(4) tunnels) to ip_next_mtu().	2005-05-04 13:23:54 +00:00
glebius	cf5c01798e	IPFW version 2 is the only option in HEAD and RELENG_5. Thus, cleanup unnecessary now ifdefs.	2005-05-04 13:12:52 +00:00
andre	e6d10efc2a	Pass icmp_error() the MTU argument directly instead of an interface pointer. This simplifies a couple of uses and removes some XXX workarounds.	2005-05-04 13:09:19 +00:00
rwatson	88ecd82ce9	Remove now unused inirw variable from previous use of COMMON_END(). Reported by: csjp	2005-05-01 14:01:38 +00:00
grehan	baa9d30def	Fix typo in last commit. Approved by: rwatson	2005-05-01 13:06:05 +00:00
rwatson	6db3b00e40	Slide unlocking of the tcbinfo lock earlier in tcp_usr_send(), as it's needed only for implicit connect cases. Under load, especially on SMP, this can greatly reduce contention on the tcbinfo lock. NB: Ambiguities about the state of so_pcb need to be resolved so that all use of the tcbinfo lock in non-implicit connection cases can be eliminated. Submitted by: Kazuaki Oda <kaakun at highway dot ne dot jp>	2005-05-01 11:11:38 +00:00
brooks	4d3f396bf6	Introduce a struct icmphdr which contains the type, code, and cksum fields of an ICMP packet. Use this to allow ipfw to pullup only these values since it does not use the rest of the packet and it was failing on ICMP packets because they were not long enough. struct icmp should probably be modified to use these at some point, but that will break a fair bit of code so it can wait for another day. On the off chance that adding this struct breaks something in ports, bump __FreeBSD_version. Reported by: Randy Bush <randy at psg dot com> Tested by: Randy Bush <randy at psg dot com>	2005-04-26 18:10:21 +00:00
ps	2b9aae1b7a	Remove some code that snuck in by accident. Submitted by: Mohan Srinivasan	2005-04-21 20:29:40 +00:00
ps	38541e0962	Fix for interaction problems between TCP SACK and TCP Signature. If TCP Signatures are enabled, the maximum allowed sack blocks aren't going to fit. The fix is to compute how many sack blocks fit and tack these on last. Also on SYNs, defer padding until after the SACK PERMITTED option has been added. Found by: Mohan Srinivasan. Submitted by: Mohan Srinivasan, Noritoshi Demizu. Reviewed by: Raja Mukerji.	2005-04-21 20:26:07 +00:00
ps	ebae7f1bff	Undo rev 1.71 as it is the wrong change.	2005-04-21 20:24:43 +00:00
ps	a61b20503e	- Make the sack scoreboard logic use the TAILQ macros. This improves code readability and facilitates some anticipated optimizations in tcp_sack_option(). - Remove tcp_print_holes() and TCP_SACK_DEBUG. Submitted by: Raja Mukerji. Reviewed by: Mohan Srinivasan, Noritoshi Demizu.	2005-04-21 20:11:01 +00:00
ps	3c771e02c3	Fix for 2 bugs related to TCP Signatures : - If the peer sends the Signature option in the SYN, use of Timestamps and Window Scaling were disabled (even if the peer supports them). - The sender must not disable signatures if the option is absent in the received SYN. (See comment in syncache_add()). Found, Submitted by: Noritoshi Demizu <demizu at dd dot ij4u dot or dot jp>. Reviewed by: Mohan Srinivasan <mohans at yahoo-inc dot com>.	2005-04-21 20:09:09 +00:00
andre	4f85376072	Move Path MTU discovery ICMP processing from icmp_input() to tcp_ctlinput() and subject it to active tcpcb and sequence number checking. Previously any ICMP unreachable/needfrag message would cause an update to the TCP hostcache. Now only ICMP PMTU messages belonging to an active TCP session with the correct src/dst/port and sequence number will update the hostcache and complete the path MTU discovery process. Note that we don't entirely implement the recommended counter measures of Section 7.2 of the paper. However we close down the possible degradation vector from trivially easy to really complex and resource intensive. In addition we have limited the smallest acceptable MTU with net.inet.tcp.minmss sysctl for some time already, further reducing the effect of any degradation due to an attack. Security: draft-gont-tcpm-icmp-attacks-03.txt Section 7.2 MFC after: 3 days	2005-04-21 14:29:34 +00:00
andre	34a84accee	Ignore ICMP Source Quench messages for TCP sessions. Source Quench is ineffective, deprecated and can be abused to degrade the performance of active TCP sessions if spoofed. Replace a bogus call to tcp_quench() in tcp_output() with the direct equivalent tcpcb variable assignment. Security: draft-gont-tcpm-icmp-attacks-03.txt Section 7.1 MFC after: 3 days	2005-04-21 12:37:12 +00:00
glebius	63fda197fc	Remove anti-LOR bandaid, it is not needed now. Sponsored by: Rambler	2005-04-20 09:32:05 +00:00
phk	bbad4f5164	Make DUMMYNET compile without INET6	2005-04-19 10:12:21 +00:00
phk	84c2edfcd6	typo	2005-04-19 10:04:38 +00:00
phk	fe23bbf226	Make IPFIREWALL compile without INET6	2005-04-19 09:56:14 +00:00
brooks	f3ecaa630b	Add IPv6 support to IPFW and Dummynet. Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)	2005-04-18 18:35:05 +00:00
ps	aaaa8a951f	Rewrite of tcp_update_sack_list() to make it simpler and more readable than our original OpenBSD derived version. Submitted by: Noritoshi Demizu Reviewed by: Mohan Srinivasan, Raja Mukerji	2005-04-18 18:10:56 +00:00
brooks	def26873da	Centralized finding the protocol header in IP packets in preparation for IPv6 support. The header in IPv6 is more complex than in IPv4 so we want to handle skipping over it in one location. Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)	2005-04-15 00:47:44 +00:00
ps	2bf5500850	Fix for a TCP SACK bug where more than (win/2) bytes could have been in flight in SACK recovery. Found by: Noritoshi Demizu Submitted by: Mohan Srinivasan <mohans at yahoo-inc dot com> Noritoshi Demizu <demizu at dd dot ij4u dot or dot jp> Raja Mukerji <raja at moselle dot com>	2005-04-14 20:09:52 +00:00
ps	ebd6b162f8	- Tighten up the Timestamp checks to prevent a spoofed segment from setting ts_recent to an arbitrary value, stopping further communication between the two hosts. - If the Echoed Timestamp is greater than the current time, fall back to the non RFC 1323 RTT calculation. Submitted by: Raja Mukerji (raja at moselle dot com) Reviewed by: Noritoshi Demizu, Mohan Srinivasan	2005-04-10 05:24:59 +00:00
ps	f6d14de55c	- If the reassembly queue limit was reached or if we couldn't allocate a reassembly queue state structure, don't update (receiver) sack report. - Similarly, if tcp_drain() is called, freeing up all items on the reassembly queue, clean the sack report. Found, Submitted by: Noritoshi Demizu <demizu at dd dot iij4u dot or dot jp> Reviewed by: Mohan Srinivasan (mohans at yahoo-inc dot com), Raja Mukerji (raja at moselle dot com).	2005-04-10 05:21:29 +00:00
ps	ae82590cf6	When the rightmost SACK block expands, rcv_lastsack should be updated. (Fix for kern/78226). Submitted by : Noritoshi Demizu <demizu at dd dot iij4u dot or dot jp> Reviewed by : Mohan Srinivasan (mohans at yahoo-inc dot com), Raja Mukerji (raja at moselle dot com).	2005-04-10 05:20:10 +00:00
ps	cb57694d7f	Remove some unused sack fields. Submitted by : Noritoshi Demizu, Mohan Srinivasan.	2005-04-10 05:19:22 +00:00
maxim	1d7537ad41	o Nano optimize ip_reass() code path for the first fragment: do not try to reassemble the packet from the fragments queue with the only fragment, finish with the first fragment as soon as we create a queue. Spotted by: Vijay Singh o Drop the fragment if maxfragsperpacket == 0, no chances we will be able to reassemble the packet in future. Reviewed by: silby	2005-04-08 10:25:13 +00:00
maxim	58adac10e7	o Tweak the comment a bit.	2005-04-08 08:43:21 +00:00
maxim	a31bda3d3c	o Disable random port allocation when ip.portrange.first == ip.portrange.last and there is the only port for that because: a) it is not wise; b) it leads to a panic in the random ip port allocation code. In general we need to disable ip port allocation randomization if the last - first delta is ridiculously small. PR: kern/79342 Spotted by: Anjali Kulkarni Glanced at by: silby MFC after: 2 weeks	2005-04-08 08:42:10 +00:00
glebius	d14eecdbbc	When a packet has been reinjected into ipfw(4) after dummynet(4) processing we have a non-NULL args.rule. If the same packet later is subject to "tee" rule, its original is sent again into ipfw_chk() and it reenters at the same rule. This leads to infinite loop and frozen router. Assign args.rule to NULL, any time we are going to send packet back to ipfw_chk() after a tee rule. This is a temporary workaround, which we will leave for RELENG_5. In HEAD we are going to make divert(4) save next rule the same way as dummynet(4) does. PR: kern/79546 Submitted by: Oleg Bulyzhin Reviewed by: maxim, andre MFC after: 3 days	2005-04-06 14:00:33 +00:00
brooks	ae23407fab	Use ACTION_PTR(r) instead of (r->cmd + r->act_ofs). Reviewed by: md5	2005-04-06 00:26:08 +00:00
brooks	104d52c29f	Make dummynet_flush() match its prototype.	2005-04-05 23:38:16 +00:00
phk	063486ce84	natd core dumps when -reverse switch is used because of a bug in libalias. In /usr/src/lib/libalias/alias.c, the functions LibAliasIn and LibAliasOutTry call the legacy PacketAliasIn/PacketAliasOut instead of LibAliasIn/LibAliasOut when the PKT_ALIAS_REVERSE option is set. In this case, the context variable "la" gets lost because the legacy compatibility routines expect "la" to be global. This was obviously an oversight when rewriting the PacketAlias* functions to the LibAlias* functions. The fix (as shown in the patch below) is to remove the legacy subroutine calls and replace with the new ones using the "la" struct as the first arg. Submitted by: Gil Kloepfer <fgil@kloepfer.org> Confirmed by: <nicolai@catpipe.net> PR: 76839 MFC after: 3 days	2005-04-05 13:04:35 +00:00
glebius	20adbdefb7	When several carp interfaces are attached to Ethernet interface, carp_carpdev_state_locked() is called every time carp interface is attached. The first call backs up flags of the first interface, and the second call backs up them again, erasing correct values. To solve this, a carp_sc_state_locked() function is introduced. It is called when interface is attached to parent, instead of calling carp_carpdev_state_locked. carp_carpdev_state_locked() calls carp_sc_state_locked() for each sc in chain. Reported by: Yuriy N. Shkandybin, sem	2005-03-30 11:44:43 +00:00
glebius	31c59e5e8a	- Don't free mbuf, passed to interface output method if the latter returns error. In this case mbuf has already been freed. [1] - Remove redundant declaration. PR: kern/78893 [1] Submitted by: Liang Yi [1] Reviewed by: sam MFC after: 1 day	2005-03-29 13:43:09 +00:00
sam	0f999925e8	eliminate extraneous null ptr checks Noticed by: Coverity Prevent analysis tool	2005-03-29 01:10:46 +00:00
sam	d2c3d0b002	deal with malloc failures Noticed by: Coverity Prevent analysis tool Together with: mdodd	2005-03-26 22:20:22 +00:00
maxim	56ed6f8b75	o Document net.inet.ip.portrange.random* sysctls. o Correct a comment about random port allocation threshold implementation. Reviewed by: silby, ru MFC after: 3 days	2005-03-23 09:26:38 +00:00
glebius	d2dabbb47c	ifma_protospec is a pointer. Use NULL when assigning or comparing it.	2005-03-20 14:31:45 +00:00
glebius	3adfed4ac4	Remove a workaround from previous revision. It proved to be incorrect. Add two other workarounds for carp(4) interfaces: - do not add connected route when address is assigned to carp(4) interface - do not add connected route when other interface goes down Embrace workarounds with #ifdef DEV_CARP	2005-03-20 10:27:17 +00:00
glebius	285d285103	If vhid exists return more informative EEXIST instead of EINVAL. While here remove redundant brackets.	2005-03-18 13:41:38 +00:00
glebius	e90a54cc90	Fix a potential crash that could occur when CARP_LOG is being used. Obtained from: OpenBSD (pat)	2005-03-18 13:18:34 +00:00

2560 Commits