freebsd-dev

Author	SHA1	Message	Date
Pawel Jakub Dawidek	a7f3feff1b	Allocate memory when dumping pipes with M_WAITOK flag. On a system with a huge number of pipes, M_NOWAIT fails almost always, because of memory fragmentation. My fix is different than the patch proposed by Pawel Malachowski, because in FreeBSD 5.x we cannot sleep while holding dummynet mutex (in 4.x there is no such lock). My fix is also ugly, but there is no easy way to prepare nice and clean fix. PR: kern/46557 Submitted by: Eugene Grosbein <eugen@grosbein.pp.ru> Reviewed by: mlaier	2004-08-25 09:31:30 +00:00
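The constraint described in a7f3feff1b (no sleeping allocation while the dummynet mutex is held) is commonly handled by sizing under the lock, allocating with the lock dropped, then re-checking. The userland sketch below illustrates that general pattern only; it is not the committed fix. `struct pipe_table` and `snapshot_pipes()` are hypothetical names, and a pthread mutex with `malloc()` stands in for the kernel mutex and `malloc(9)` with M_WAITOK.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared table standing in for the dummynet pipe list. */
struct pipe_table {
    pthread_mutex_t lock;
    size_t count;   /* number of valid entries in items */
    int *items;
};

/*
 * Copy the table into a freshly allocated buffer without ever allocating
 * while the lock is held. Returns NULL on allocation failure; otherwise
 * the caller owns the buffer and *out_count holds the number of entries.
 */
int *snapshot_pipes(struct pipe_table *t, size_t *out_count)
{
    int *buf = NULL;
    size_t want;

    pthread_mutex_lock(&t->lock);
    for (;;) {
        want = t->count;                 /* size needed right now */
        pthread_mutex_unlock(&t->lock);  /* drop lock before allocating */

        free(buf);
        buf = malloc(want ? want * sizeof(*buf) : 1);
        if (buf == NULL)
            return NULL;                 /* lock is not held here */

        pthread_mutex_lock(&t->lock);
        if (t->count <= want)            /* still fits: copy under the lock */
            break;
        /* Table grew while unlocked: retry with the new size. */
    }
    if (t->count > 0)
        memcpy(buf, t->items, t->count * sizeof(*buf));
    *out_count = t->count;
    pthread_mutex_unlock(&t->lock);
    return buf;
}
```

The retry loop is the awkward part: the table may grow between the unlock and the relock, so the size has to be checked again before copying.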
Andre Oppermann	9b932e9e04	Convert ipfw to use PFIL_HOOKS. This change is transparent to userland and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed, only how ipfw is invoked is different. However there are many changes to how ipfw and its add-ons are handled: In general ipfw is now called through the PFIL_HOOKS and most associated magic, that was in ip_input() or ip_output() previously, is now done in ipfw_check_[in|out]() in the ipfw PFIL handler. IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked if it is fragmented, if yes, ip_reass() gets in for reassembly. If not, or all fragments arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output(). ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check if the new destination is a local IP address is made and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is directly sent to the 'ours' section for further processing. Destination changes on the input path are only tagged and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr where it is going to be picked up by ip_input() again and then directly sent to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Then it jumps back at the route lookup again and skips the firewall check because it has been marked with M_SKIP_FIREWALL. 
ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it. DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection. BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS. More detailed changes to the code: conf/files Add netinet/ip_fw_pfil.c. conf/options Add IPFIREWALL_FORWARD option. modules/ipfw/Makefile Add ip_fw_pfil.c. net/bridge.c Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers and packets would get a double ipfw when run through PFIL_HOOKS as well. netinet/ip_divert.c Removed divert_clone() function. It is no longer used. netinet/ip_dummynet.[ch] Neither the route 'ro' nor the destination 'dst' need to be stored while in dummynet transit. Structure members and associated macros are removed. netinet/ip_fastfwd.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. netinet/ip_fw.h Removed 'ro' and 'dst' from struct ip_fw_args. netinet/ip_fw2.c (Re)moved some global variables and the module handling. netinet/ip_fw_pfil.c New file containing the ipfw PFIL handlers and module initialization. netinet/ip_input.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. ip_forward() does not longer require the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set. netinet/ip_output.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. netinet/ip_var.h Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.) 
netinet/raw_ip.c Directly check if ipfw and dummynet control pointers are active. netinet/tcp_input.c Rework the 'ipfw forward' to local code to work with the new way of forward tags. netinet/tcp_sack.c Remove include 'opt_ipfw.h' which is not needed here. sys/mbuf.h Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed. Approved by: re (scottl)	2004-08-17 22:05:54 +00:00
Poul-Henning Kamp	3e019deaed	Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events. A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".	2004-07-15 08:26:07 +00:00
Robert Watson	d67ec3dd48	When asserting non-Giant locks in the network stack, also assert Giant if debug.mpsafenet=0, as any points that require synchronization in the SMPng world also required it in the Giant-world: - inpcb locks (including IPv6) - inpcbinfo locks (including IPv6) - dummynet subsystem lock - ipfw2 subsystem lock	2004-06-24 02:01:48 +00:00
Max Laier	b81dae751b	Add some missing DUMMYNET_UNLOCK() in config_pipe(). Noticed by: Simon Coggins Approved by: bms(mentor)	2004-03-03 01:33:22 +00:00
Max Laier	ac9d7e2618	Re-remove MT_TAGs. The problems with dummynet have been fixed now. Tested by: -current, bms(mentor), me Approved by: bms(mentor), sam	2004-02-25 19:55:29 +00:00
Max Laier	36e8826ffb	Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is not working properly with the patch in place. Approved by: bms(mentor)	2004-02-18 00:04:52 +00:00
Max Laier	1094bdca51	This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing them mostly with packet tags (one case is handled by using an mbuf flag since the linkage between "caller" and "callee" is direct and there's no need to incur the overhead of a packet tag). This is (mostly) work from: sam Silence from: -arch Approved by: bms(mentor), sam, rwatson	2004-02-13 19:14:16 +00:00
Maxim Konovalov	fad1d65260	o Fix a comment: softticks lives in sys/kern/kern_timeout.c. PR: kern/60613 Submitted by: Gleb Smirnoff MFC after: 3 days	2003-12-27 14:08:53 +00:00
Maksim Yevmenkin	a6a66f5c4c	Do not panic when flushing dummynet firewall rules Reviewed by: andre Approved by: re (scottl)	2003-12-06 09:01:25 +00:00
Sam Leffler	6a3ca7514d	Use MPSAFE callouts only when debug.mpsafenet is 1. Both timer routines potentially transmit packets that may enter KAME IPsec w/o Giant if the callouts are marked MPSAFE. Reviewed by: ume Approved by: re (rwatson)	2003-11-23 18:13:41 +00:00
Sam Leffler	7138d65c3f	replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREF macros that expand to include assertions when the system is built with INVARIANTS Supported by: FreeBSD Foundation	2003-11-08 23:36:32 +00:00
Brooks Davis	9bf40ede4a	Replace the if_name and if_unit members of struct ifnet with new members if_xname, if_dname, and if_dunit. if_xname is the name of the interface and if_dname/unit are the driver name and instance. This change paves the way for interface renaming and enhanced pseudo device creation and configuration semantics. Approved By: re (in principle) Reviewed By: njl, imp Tested On: i386, amd64, sparc64 Obtained From: NetBSD (if_xname)	2003-10-31 18:32:15 +00:00
Sam Leffler	3520e9d61d	o add locking to protect routing table refcnt manipulations o add some more debugging help for figuring out why folks are getting complaints about releasing routing table entries with a zero refcnt o fix comment that talked about spl's o remove duplicate define of DUMMYNET_DEBUG Supported by: FreeBSD Foundation	2003-10-29 19:03:58 +00:00
Sam Leffler	b15694110f	Drop dummynet lock when calling back into the network stack to deliver packets. This eliminates a LOR with Giant that caused outbound pipes to fail. Supported by: FreeBSD Foundation	2003-10-16 16:21:25 +00:00
Sam Leffler	91176902bc	Minor fixups + add locking. o change time to MPSAFE callout o make debug printfs conditional on DUMMYNET_DEBUG and runtime controllable by net.inet.ip.dummynet.debug o make boot-time printf dependent on bootverbose Sponsored by: FreeBSD Foundation	2003-09-17 00:54:04 +00:00
Maxim Konovalov	e1bd2f381a	o Fix a typo in previous commit.	2003-07-31 10:24:36 +00:00
Maxim Konovalov	853af3f3f0	o Do not overwrite saved interrupt priority level by alloc_hash(), use a separate variable. o Restore interrupt priority level before return (no-op in HEAD). Spotted by: Don Bowman <don@sandvine.com> MFC after: 5 days	2003-07-25 09:59:16 +00:00
Luigi Rizzo	5d3b4c2480	Remove a stale comment, fix indentation.	2003-06-28 14:23:22 +00:00
Luigi Rizzo	ec4270c021	Fix typo in a (commented out) debugging string. Spotted by: diff	2003-06-23 21:38:21 +00:00
Bernd Walter	330462a315	Change handling to support strong alignment architectures such as alpha and sparc64. PR: alpha/50658 Submitted by: rizzo Tested on: alpha	2003-06-04 01:17:37 +00:00
Maxim Konovalov	7778283b40	Fix indentation.	2003-03-27 15:00:10 +00:00
Maxim Konovalov	be1e4c5162	o Protect set_fs_param() by splimp(9). Quote from kern/37573: There is an obvious race in netinet/ip_dummynet.c:config_pipe(). Interrupts are not blocked when changing the params of an existing pipe. The specific crash observed: ... -> config_pipe -> set_fs_parms -> config_red malloc a new w_q_lookup table but take an interrupt before initializing it, interrupt handler does: ... -> dummynet_io -> red_drops red_drops dereferences the uninitialized (zeroed) w_q_lookup table. o Flush accumulated credits for idle pipes. o Flush accumulated credits when changing pipe characteristics. o Change dn_flow_queue.numbytes type to unsigned long. Overflowing dn_flow_queue->numbytes in ready_event() makes numbytes negative, and the SET_TICKS() macro returns a very big value. heap_insert() then overflows dn_key and inserts a queue into the ready heap with a sched_time that points to the past. That leads to an "infinite" loop. PR: kern/33234, kern/37573, misc/42459, kern/43133, kern/44045, kern/48099 Submitted by: Mike Hibler <mike@cs.utah.edu> (kern/37573) MFC after: 6 weeks	2003-03-27 14:56:36 +00:00
Maxim Konovalov	2adf7582da	De-anonymity a couple of messages I missed in a previous sweep. Move one of them under DEB macro. Noticed by: Wiktor Niesiobedzki <w@evip.pl>	2003-01-20 13:03:34 +00:00
Jens Schweikhardt	9d5abbddbf	Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.	2003-01-01 18:49:04 +00:00
Maxim Konovalov	f4ef616f98	o De-anonymity dummynet(4) and ipfw(4) messages, prepend them by 'dummynet: ' and 'ipfw: ' prefixes. PR: kern/41609	2002-12-24 13:45:24 +00:00
Maxim Konovalov	616fa7460c	o Trim EOL whitespaces. MFC after: 1 week	2002-12-15 10:24:36 +00:00
Maxim Konovalov	2713a5bebb	o M_DONTWAIT is mbuf(9) flag: malloc(M_DONTWAIT) -> malloc(M_NOWAIT). The bug does not affect anything because M_NOWAIT == M_DONTWAIT. Reviewed by: luigi MFC after: 1 week	2002-12-15 10:21:30 +00:00
Sam Leffler	eec3a0b17f	track changes to not strip the Ethernet header from input packets Reviewed by: many Approved by: re	2002-11-14 23:46:04 +00:00
Maxim Konovalov	8ef1565d2b	Due to a memory alignment sizeof(struct ipfw_flow_id) is bigger than ipfw_flow_id structure actual size and bcmp(3) may fail to compare them properly. Compare members of these structures instead. PR: kern/44078 Submitted by: Oleg Bulyzhin <oleg@rinet.ru> Reviewed by: luigi MFC after: 2 weeks	2002-11-13 11:31:44 +00:00
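The problem described in 8ef1565d2b can be reproduced with any struct whose size includes padding: a byte-wise compare also compares the padding bytes, which hold arbitrary values. The sketch below uses a hypothetical `struct flow_id` (not the real `struct ipfw_flow_id` layout) and a hypothetical `flow_id_equal()` helper that compares members instead.

```c
#include <stdint.h>

/*
 * Hypothetical flow identifier. On common ABIs sizeof() is 16 while the
 * members occupy only 13 bytes, so 3 trailing bytes are padding.
 */
struct flow_id {
    uint32_t dst_ip;
    uint32_t src_ip;
    uint16_t dst_port;
    uint16_t src_port;
    uint8_t  proto;
};

/* Compare member by member so padding bytes never affect the result. */
int flow_id_equal(const struct flow_id *a, const struct flow_id *b)
{
    return a->dst_ip == b->dst_ip &&
           a->src_ip == b->src_ip &&
           a->dst_port == b->dst_port &&
           a->src_port == b->src_port &&
           a->proto == b->proto;
}
```

Two structs filled from differently initialized memory compare equal here even when a `bcmp()` over `sizeof(struct flow_id)` bytes would report a difference.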
Sam Leffler	5d84645305	Replace aux mbufs with packet tags: o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month	2002-10-16 01:54:46 +00:00
Sean Chittenden	927a76bb5e	Increase the max dummynet hash size from 1024 to 65536. Default is still 1024. Silence on: -net, -ipfw 4weeks+ Reviewed by: dd Approved by: knu (mentor) MFC after: 3 weeks	2002-10-12 07:45:23 +00:00
Luigi Rizzo	4bbf3b8b3a	Kernel support for a dummynet option: When a pipe or queue has the "noerror" attribute, do not report drops to the caller (ip_output() and friends). (2 lines to implement it, 2 lines to document it.) This will let you simulate losses on the sender side as if they happened in the middle of the network, i.e. with no explicit feedback to the sender. manpage and ipfw2.c changes to follow shortly, together with other ipfw2 changes. Requested by: silby MFC after: 3 days	2002-08-15 16:53:43 +00:00
Luigi Rizzo	318aa87b59	Fix a panic when doing "ipfw add pipe 1 log ..." Also synchronize ip_dummynet.c with the version in RELENG_4 to ease MFC's.	2002-07-17 07:21:42 +00:00
Luigi Rizzo	c7ea683135	Change one variable to make it easier to switch between ipfw and ipfw2	2002-07-09 06:53:38 +00:00
Luigi Rizzo	5185195169	Forgot to update one field name in one of the latest commits.	2002-07-08 22:37:55 +00:00
Luigi Rizzo	9758b77ff1	The new ipfw code. This code makes use of variable-size kernel representation of rules (exactly the same concept of BPF instructions, as used in the BSDI's firewall), which makes firewall operation a lot faster, and the code more readable and easier to extend and debug. The interface with the rest of the system is unchanged, as witnessed by this commit. The only extra kernel files that I am touching are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In userland I only had to touch those programs which manipulate the internal representation of firewall rules. The code is almost entirely new (and I believe I have written the vast majority of those sections which were taken from the former ip_fw.c), so rather than modifying the old ip_fw.c I decided to create a new file, sys/netinet/ip_fw2.c . Same for the user interface, which is in sbin/ipfw/ipfw2.c (it still compiles to /sbin/ipfw). The old files are still there, and will be removed in due time. I have not renamed the header file because it would have required a one-line change to a number of kernel files. In terms of user interface, the new "ipfw" is supposed to accept the old syntax for ipfw rules (and produce the same output with "ipfw show"). Only a couple of the old options (out of some 30 of them) have not been implemented, but they will be soon. On the other hand, the new code has some very powerful extensions. First, you can put "or" connectives between match fields (and soon also between options), and write things like ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any This should make rulesets slightly more compact (and lines longer!), by condensing 2 or more of the old rules into single ones. 
Also, as an example of how easy the rules can be extended, I have implemented an 'address set' match pattern, where you can specify an IP address in a format like this: 10.20.30.0/26{18,44,33,22,9} which will match the set of hosts listed in braces belonging to the subnet 10.20.30.0/26 . The match is done using a bitmap, so it is essentially a constant time operation requiring a handful of CPU instructions (and a very small amount of memory -- for a full /24 subnet, the instruction only consumes 40 bytes). Again, in this commit I have focused on functionality and tried to minimize changes to the other parts of the system. Some performance improvement can be achieved with minor changes to the interface of ip_fw_chk_t. This will be done later when this code is settled. The code is meant to compile unmodified on RELENG_4 (once the PACKET_TAG_* changes have been merged), for this reason you will see #ifdef __FreeBSD_version in a couple of places. This should minimize errors when (hopefully soon) it will be time to do the MFC.	2002-06-27 23:02:18 +00:00
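The 'address set' match in 9758b77ff1 is described as a bitmap lookup. A minimal sketch of that idea follows, assuming a hypothetical `struct addr_set` with one bit per host in the subnet; this is not the actual ipfw2 instruction format, and `addr_set_add()` and `addr_set_match()` are illustrative names. Addresses are in host byte order.

```c
#include <stdint.h>

/*
 * Hypothetical address set: a subnet base, the number of addresses in the
 * subnet (at most 256 here), and one bit per host. With 8 bitmap words the
 * struct is 40 bytes, enough for a full /24.
 */
struct addr_set {
    uint32_t base;       /* network address, host byte order */
    uint32_t size;       /* addresses in the subnet, e.g. 64 for a /26 */
    uint32_t bitmap[8];  /* bit n set => base + n is in the set */
};

/* Add host number `host` (offset from base) to the set. */
void addr_set_add(struct addr_set *s, uint32_t host)
{
    if (host < s->size)
        s->bitmap[host / 32] |= UINT32_C(1) << (host % 32);
}

/* Constant-time membership test: one subtraction, one bounds check, one bit test. */
int addr_set_match(const struct addr_set *s, uint32_t addr)
{
    uint32_t off = addr - s->base;  /* wraps to a large value if addr < base */

    if (off >= s->size)
        return 0;
    return (int)((s->bitmap[off / 32] >> (off % 32)) & 1u);
}
```

For 10.20.30.0/26{18,44,33,22,9}, the base would be 0x0A141E00, the size 64, and bits 18, 44, 33, 22 and 9 would be set with `addr_set_add()`.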
Luigi Rizzo	2b25acc158	Remove (almost all) global variables that were used to hold packet forwarding state ("annotations") during ip processing. The code is considerably cleaner now. The variables removed by this change are: ip_divert_cookie used by divert sockets ip_fw_fwd_addr used for transparent ip redirection last_pkt used by dynamic pipes in dummynet Removal of the first two has been done by carrying the annotations into volatile structs prepended to the mbuf chains, and adding appropriate code to add/remove annotations in the routines which make use of them, i.e. ip_input(), ip_output(), tcp_input(), bdg_forward(), ether_demux(), ether_output_frame(), div_output(). On passing, remove a bug in divert handling of fragmented packets. Now it is the fragment at offset 0 which sets the divert status of the whole packet, whereas formerly it was the last incoming fragment to decide. Removal of last_pkt required a change in the interface of ip_fw_chk() and dummynet_io(). On passing, use the same mechanism for dummynet annotations and for divert/forward annotations. option IPFIREWALL_FORWARD is effectively useless, the code to implement it is very small and is now in by default to avoid the obfuscation of conditionally compiled code. NOTES: * there is at least one global variable left, sro_fwd, in ip_output(). I am not sure if/how this can be removed. * I have deliberately avoided gratuitous style changes in this commit to avoid cluttering the diffs. Minor style cleanup will likely be necessary * this commit only focused on the IP layer. I am sure there is a number of global variables used in the TCP and maybe UDP stack. * despite the number of files touched, there are absolutely no API's or data structures changed by this commit (except the interfaces of ip_fw_chk() and dummynet_io(), which are internal anyways), so an MFC is quite safe and unintrusive (and desirable, given the improved readability of the code). MFC after: 10 days	2002-06-22 11:51:02 +00:00
Luigi Rizzo	4b9840932d	Add ipfw hooks to ether_demux() and ether_output_frame(). Ipfw processing of frames at layer 2 can be enabled by the sysctl variable net.link.ether.ipfw=1 Consider this feature experimental, because right now, the firewall is invoked in the places indicated below, and controlled by the sysctl variables listed on the right. As a consequence, a packet can be filtered from 1 to 4 times depending on the path it follows, which might make a ruleset a bit hard to follow. I will add an ipfw option to tell if we want a given rule to apply to ether_demux() and ether_output_frame(), but we have run out of flags in the struct ip_fw so I need to think a bit on how to implement this.
                    to upper layers
            |                        |
            +----------->-----------+
            ^                        V
        [ip_input]              [ip_output]           net.inet.ip.fw.enable=1
            |                        |
            ^                        V
        [ether_demux]      [ether_output_frame]       net.link.ether.ipfw=1
            |                        |
            +->- [bdg_forward]-->---+                 net.link.ether.bridge_ipfw=1
            ^                        V
            |                        |
                    to devices	2002-05-13 10:37:19 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
Luigi Rizzo	7b109fa404	MFS: sync the ipfw/dummynet/bridge code with the one recently merged into stable (mostly , but not only, formatting and comments changes).	2001-11-04 22:56:25 +00:00
Paul Saab	db69a05dce	Make it so dummynet and bridge can be loaded as modules. Submitted by: billf	2001-10-05 05:45:27 +00:00
Luigi Rizzo	830cc17841	Two main changes here: + implement "limit" rules, which permit to limit the number of sessions between certain host pairs (according to masks). These are a special type of stateful rules, which might be of interest in some cases. See the ipfw manpage for details. + merge the list pointers and ipfw rule descriptors in the kernel, so the code is smaller, faster and more readable. This patch basically consists in replacing "foo->rule->bar" with "rule->bar" all over the place. I have been willing to do this for ages! MFC after: 1 week	2001-09-27 23:44:27 +00:00
Brooks Davis	49c024e373	Include sys/proc.h for the definition of securelevel_ge(). Submitted by: LINT	2001-09-26 21:53:20 +00:00
Robert Watson	785f9ffca3	o Modify IPFW and DUMMYNET administrative setsockopt() calls to use securelevel_gt() to check the securelevel, rather than direct access to the securelevel variable. Obtained from: TrustedBSD Project	2001-09-26 19:58:29 +00:00
Luigi Rizzo	7e1cd0d23d	Sync with the bridge/dummynet/ipfw code already tested in stable. In ip_fw.[ch] change a couple of variable and field names to avoid having types, variables and fields with the same name.	2001-02-10 00:10:18 +00:00
Poul-Henning Kamp	fc2ffbe604	Mechanical change to use <sys/queue.h> macro API instead of fondling implementation details. Created with: sed(1) Reviewed by: md5(1)	2001-02-04 13:13:25 +00:00
Luigi Rizzo	507b4b5432	MFS: bridge/ipfw/dummynet fixes (bridge.c will be committed separately)	2001-02-02 00:18:00 +00:00
Luigi Rizzo	26fb17bdd0	Minor cleanups after yesterday's patch. The code (bridging and dummynet) actually worked fine!	2001-01-26 19:43:54 +00:00
Luigi Rizzo	6258acf88f	Bring dummynet in line with the code that now works in -STABLE. It compiles, but I cannot test functionality yet.	2001-01-26 06:49:34 +00:00

84 Commits