freebsd-skq

Author	SHA1	Message	Date
melifaro	a5144524e2	Fix 'may be used uninitialized' warning not caught by clang.	2015-04-27 10:01:22 +00:00
melifaro	d498dce901	Use free_nat_instance() for nat instance deletion. Sponsored by: Yandex LLC	2015-04-27 09:16:22 +00:00
melifaro	9f3d7ccd07	Make rule table kernel-index rewriting support any kind of objects. Currently we have tables identified by their names in userland with internal kernel-assigned indices. This works the following way: When userland wishes to communicate with kernel to add or change rule(s), it makes indexed sorted array of table names (internally ipfw_obj_ntlv entries), and refers to indices in that array in rule manipulation. Prior to committing new rule to the ruleset kernel a) finds all referenced tables, bumps their refcounts and changes values inside the opcodes to be real kernel indices b) auto-creates all referenced but not existing tables and then does a) for them. Kernel does almost the same when exporting rules to userland: prepares array of used tables in all rules in range, and prepends it before the actual ruleset retaining actual in-kernel indexes for that. There is also special translation layer for legacy clients which is able to provide 'real' indices for table names (basically doing atoi()). While it is arguable that every subsystem really needs names instead of numbers, there are several things that should be noted: 1) every non-singleton subsystem needs to store its runtime state somewhere inside ipfw chain (and be able to get it fast) 2) we can't assume object numbers provided by humans will be dense. Existing nat implementation (O(n) access and LIST inside chain) is a good example. Hence the following: * Convert table-centric rewrite code to be more generic, callback-based * Move most of the code from ip_fw_table.c to ip_fw_sockopt.c * Provide abstract API to permit subsystems to convert their objects between userland string identifier and in-kernel index. (See struct opcode_obj_rewrite for more details.) * Create another per-chain index (in next commit) shared among all subsystems * Convert current NAT44 implementation to use new API, O(1) lookups, shared index and names instead of numbers (in next commit). Sponsored by: Yandex LLC	2015-04-27 08:29:39 +00:00
glebius	5a54b2974a	Fix memory leak. PR: 199670 Reviewed by: ae	2015-04-27 05:44:09 +00:00
glebius	a29f5e7ca8	Move ALTQ from contrib to net/altq. The ALTQ code is for many years discontinued by its initial authors. In FreeBSD the code was already slightly edited during the pf(4) SMP project. It is about to be edited more in the projects/ifnet. Moving out of contrib also allows to remove several hacks to the make glue. Reviewed by: net@	2015-04-16 20:22:40 +00:00
kp	859bfca800	pf: Fix forwarding detection If the direction is not PF_OUT we can never be forwarding. Some input packets have rcvif != ifp (looped back packets), which lead us to ip6_forward() inbound packets, causing panics. Equally, we need to ensure that packets were really received and not locally generated before trying to ip6_forward() them. Differential Revision: https://reviews.freebsd.org/D2286 Approved by: gnn(mentor)	2015-04-14 19:07:37 +00:00
gnn	be303b042b	I can find no reason to allow packets with both SYN and FIN bits set past this point in the code. The packet should be dropped and not massaged as it is here. Differential Revision: https://reviews.freebsd.org/D2266 Submitted by: eri Sponsored by: Rubicon Communications (Netgate)	2015-04-14 14:43:42 +00:00
kp	e192a810c5	pf: Skip firewall for refragmented ip6 packets In cases where we scrub (fragment reassemble) on both input and output we risk ending up in infinite loops when forwarding packets. Fragmented packets come in and get collected until we can defragment. At that point the defragmented packet is handed back to the ip stack (at the pfil point in ip6_input()). Normal processing continues. Eventually we figure out that the packet has to be forwarded and we end up at the pfil hook in ip6_forward(). After doing the inspection on the defragmented packet we see that the packet has been defragmented and because we're forwarding we have to refragment it. In pf_refragment6() we split the packet up again and then ip6_forward() the individual fragments. Those fragments hit the pfil hook on the way out, so they're collected until we can reconstruct the full packet, at which point we're right back where we left off and things continue until we run out of stack. Break that loop by marking the fragments generated by pf_refragment6() as M_SKIP_FIREWALL. There's no point in processing those packets in the firewall anyway. We've already filtered on the full packet. Differential Revision: https://reviews.freebsd.org/D2197 Reviewed by: glebius, gnn Approved by: gnn (mentor)	2015-04-06 19:05:00 +00:00
glebius	7c22152af0	o Use new function ip_fillid() in all places throughout the kernel, where we want to create a new IP datagram. o Add support for RFC6864, which allows to set IP ID for atomic IP datagrams to any value, to improve performance. The behaviour is controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by default. o In case if we generate IP ID, use counter(9) to improve performance. o Gather all code related to IP ID into ip_id.c. Differential Revision: https://reviews.freebsd.org/D2177 Reviewed by: adrian, cy, rpaulo Tested by: Emeric POUPON <emeric.poupon stormshield.eu> Sponsored by: Netflix Sponsored by: Nginx, Inc. Relnotes: yes	2015-04-01 22:26:39 +00:00
kp	67c45e2f58	pf: Deal with runt packets On Ethernet packets have a minimal length, so very short packets get padding appended to them. This padding is not stripped off in ip6_input() (due to support for IPv6 Jumbograms, RFC2675). That means PF needs to be careful when reassembling fragmented packets to not include the padding in the reassembled packet. While here also remove the 'Magic from ip_input.' bits. Splitting up and re-joining an mbuf chain here doesn't make any sense. Differential Revision: https://reviews.freebsd.org/D2189 Approved by: gnn (mentor)	2015-04-01 12:16:56 +00:00
kp	86dedea3cb	Preserve IPv6 fragment IDs across reassembly and refragmentation When forwarding fragmented IPv6 packets and filtering with PF we reassemble and refragment. That means we generate new fragment headers and a new fragment ID. We already save the fragment IDs so we can do the reassembly so it's straightforward to apply the incoming fragment ID on the refragmented packets. Differential Revision: https://reviews.freebsd.org/D2188 Approved by: gnn (mentor)	2015-04-01 12:15:01 +00:00
ae	ad5d4cffda	The offset variable has had all bits cleared except IP6F_OFF_MASK. Use the ip6f_mf variable instead of checking its bits.	2015-03-31 14:41:29 +00:00
pluknet	1dcc5ccab3	Static'ize pf_fillup_fragment body to match its declaration. Missed in 278925.	2015-03-26 13:31:04 +00:00
glebius	d0d9f03f17	Always lock the hash row of a source node when updating its 'states' counter. PR: 182401 Sponsored by: Nginx, Inc.	2015-03-17 12:19:28 +00:00
ae	8ee4f19c05	Fix `ipfw fwd tablearg'. Use dedicated field nh4 in struct table_value to obtain IPv4 next hop address in tablearg case. Add `fwd tablearg' support for IPv6. ipfw(8) uses INADDR_ANY as next hop address in O_FORWARD_IP opcode for specifying tablearg case. For IPv6 we still use this opcode, but when packet identified as IPv6 packet, we obtain next hop address from dedicated field nh6 in struct table_value. Replace hopstore field in struct ip_fw_args with anonymous union and add hopstore6 field. Use this field to copy tablearg value for IPv6. Replace spare1 field in struct table_value with zoneid. Use it to keep scope zone id for link-local IPv6 addresses. Since spare1 was used internally, replace spare0 array with two variables spare0 and spare1. Use getaddrinfo(3)/getnameinfo(3) functions for parsing and formatting IPv6 addresses in table_value. Use zoneid field in struct table_value to store sin6_scope_id value. Since the kernel still uses embedded scope zone id to represent link-local addresses, convert next_hop6 address into this form before return from pfil processing. This also fixes in6_localip() check for link-local addresses. Differential Revision: https://reviews.freebsd.org/D2015 Obtained from: Yandex LLC Sponsored by: Yandex LLC	2015-03-13 09:03:25 +00:00
ae	cc29b99b5c	Reset mbuf pointer to NULL in fastroute case to indicate that mbuf was consumed by filter. This fixes several panics due to accessing the mbuf after free. Submitted by: Kristof Provost MFC after: 1 week	2015-03-12 08:57:24 +00:00
glebius	f9f2edcf7b	Even more fixes to !INET and !INET6 kernels. In collaboration with: pluknet	2015-02-17 22:33:22 +00:00
glebius	534401756a	- Improve INET/INET6 scope. - style(9) declarations. - Make couple of local functions static.	2015-02-16 23:50:53 +00:00
glebius	16f1b2f354	Toss declarations to fix regular build and NO_INET6 build.	2015-02-16 21:52:28 +00:00
glebius	15b1e688ce	In the forwarding case refragment the reassembled packets with the same size as they arrived in. This allows the sender to determine the optimal fragment size by Path MTU Discovery. Roughly based on the OpenBSD work by Alexander Bluhm. Submitted by: Kristof Provost Differential Revision: D1767	2015-02-16 07:01:02 +00:00
glebius	9faacbf76a	Update the pf fragment handling code to closer match recent OpenBSD. That partially fixes IPv6 fragment handling. Thanks to Kristof for working on that. Submitted by: Kristof Provost Tested by: peter Differential Revision: D1765	2015-02-16 03:38:27 +00:00
melifaro	b45491786e	Fix IP_FW_NAT44_LIST_NAT size calculation. Found by: lev Sponsored by: Yandex LLC	2015-02-05 14:54:53 +00:00
melifaro	0f5a4f0517	* Make sure table algorithm destroy hook is always called without locks * Explicitly lock freeing interface references in ta_destroy_ifidx * Change ipfw_iface_unref() to require UH lock * Add forgotten ipfw_iface_unref() to destroy_ifidx_locked() PR: kern/197276 Submitted by: lev Sponsored by: Yandex LLC	2015-02-05 13:49:04 +00:00
glebius	12e7b30255	Back out r276841, r276756, r276747, r276746. The change in r276747 is very very questionable, since it makes vimages more dependent on each other. But the reason for the backout is that it screwed up shutting down the pf purge threads, and now kernel immediately panics on pf module unload. Although module unloading isn't an advertised feature of pf, it is very important for development process. I'd like to not backout r276746, since in general it is good. But since it has introduced numerous build breakages, that later were addressed in r276841, r276756, r276747, I need to back it out as well. Better replay it in clean fashion from scratch.	2015-01-22 01:23:16 +00:00
melifaro	9a4e5966c8	Use ipfw runtime lock only when real modification is required.	2015-01-16 10:49:27 +00:00
rodrigc	400655a4d3	Do not initialize pfi_unlnkdkifs_mtx and pf_frag_mtx. They are already initialized by MTX_SYSINIT. Submitted by: Nikos Vassiliadis <nvass@gmx.com>	2015-01-08 17:49:07 +00:00
rodrigc	89bede2eff	Reapply previous patch to fix build. PR: 194515	2015-01-06 16:47:02 +00:00
rodrigc	b15d5b05bd	Instead of creating a purge thread for every vnet, create a single purge thread and clean up all vnets from this thread. PR: 194515 Differential Revision: D1315 Submitted by: Nikos Vassiliadis <nvass@gmx.com>	2015-01-06 09:03:03 +00:00
rodrigc	58319f89ed	Merge: r258322 from projects/pf branch Split functions that initialize various pf parts into their vimage parts and global parts. Since global parts appeared to be only mutex initializations, just abandon them and use MTX_SYSINIT() instead. Kill my incorrect VNET_FOREACH() iterator and instead use correct approach with VNET_SYSINIT(). PR: 194515 Differential Revision: D1309 Submitted by: glebius, Nikos Vassiliadis <nvass@gmx.com> Reviewed by: trociny, zec, gnn	2015-01-06 08:39:06 +00:00
eri	0d0f5282c7	pf(4) needs to have a correct checksum during its processing. Calculate checksums for the IPv6 path when needed before delving into pf(4) code as required. PR: 172648, 179392 Reviewed by: glebius@ Approved by: gnn@ Obtained from: pfSense MFC after: 1 week Sponsored by: Netgate	2014-11-19 13:31:08 +00:00
melifaro	5edf5a79dc	Finish r274315: remove union 'u' from struct pf_send_entry. Suggested by: kib	2014-11-09 17:01:54 +00:00
melifaro	6b3c0c962e	Remove unused 'struct route' fields.	2014-11-09 16:15:28 +00:00
glebius	99f4ec50e8	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.	2014-11-07 09:39:05 +00:00
melifaro	6690acb29a	Remove unused variable. Found by: Coverity CID: 1245739	2014-11-04 10:25:52 +00:00
melifaro	e80c20a708	Bump default dynamic limit to 16k entries. Print better log message when limit is hit. PR: 193300 Submitted by: me at nileshgr.com	2014-10-24 13:57:15 +00:00
melifaro	ac030d8a9b	Rename log2 to tal_log2. Submitted by: luigi	2014-10-22 21:20:37 +00:00
luigi	f871c30cce	remove/fix old code for building ipfw and dummynet in userspace	2014-10-22 05:21:36 +00:00
hselasky	49c137f7be	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
melifaro	72cb3fba5f	Use copyout() directly instead of updating various fields before/after each sooptcopyout() call. Found by: luigi Sponsored by: Yandex LLC	2014-10-20 11:21:07 +00:00
melifaro	4c24d0f039	Perform more checks on the number of tables supplied by user.	2014-10-19 11:15:19 +00:00
des	325f19fddb	Add a complete implementation of MurmurHash3. Tweak both implementations so they match the established idiom. Document them in hash(9). MFC after: 1 month MFC with: r272906	2014-10-18 22:15:11 +00:00
melifaro	bd97071f9a	Use IPFW_RULE_CNTR_SIZE macro instead of non-relevant ip_fw_cntr structure. Found by: luigi	2014-10-18 17:23:41 +00:00
melifaro	bb36191801	Fix matching default rule on clear/show commands. Found by: Oleg Ginzburg	2014-10-13 13:49:28 +00:00
melifaro	f57c9803f8	Fix KASSERT typo.	2014-10-11 15:04:50 +00:00
melifaro	9743357b21	Remove redundant if_notifier declaration.	2014-10-10 20:37:06 +00:00
gnn	23f601a6ca	Change the PF hash from Jenkins to Murmur3. In forwarding tests this showed a conservative 3% increase in PPS. Differential Revision: https://reviews.freebsd.org/D461 Submitted by: des Reviewed by: emaste MFC after: 1 month	2014-10-10 19:26:26 +00:00
melifaro	7c4a53a2d6	Fix KASSERT argument type.	2014-10-10 18:57:12 +00:00
melifaro	846b63a583	Fix NOINET6 build for ipfw.	2014-10-10 18:31:35 +00:00
melifaro	4b5577b783	Partially fix build on !amd64 Pointed by: bz	2014-10-10 17:24:56 +00:00
melifaro	0f998c67ea	Merge projects/ipfw to HEAD. Main user-visible changes are related to tables: * Tables are now identified by names, not numbers. There can be up to 65k tables with up to 63-byte long names. * Tables are now set-aware (default off), so you can switch/move them atomically with rules. * More functionality is supported (swap, lock, limits, user-level lookup, batched add/del) by generic table code. * New table types are added (flow) so you can match multiple packet fields at once. * Ability to add different type of lookup algorithms for particular table type has been added. * New table algorithms are added (cidr:hash, iface:array, number:array and flow:hash) to make certain types of lookup more effective. * Table values are now capable of holding multiple data fields for different tablearg users Performance changes: * Main ipfw lock was converted to rmlock * Rule counters were separated from rule itself and made per-cpu. * Radix table entries fit into 128 bytes * struct ip_fw is now more compact so more rules will fit into 64 bytes * interface tables use array of existing ifindexes for faster match ABI changes: All functionality supported by old ipfw(8) remains functional. Old & new binaries can work together with the following restrictions: * Tables named other than ^\d+$ are shown as table(65535) in ruleset in old binaries Internal changes: Changing table ids to numbers resulted in format modification for most sockopt codes. Old sopt format was compact, but very hard to extend (no versioning, inability to add more opcodes), so * All relevant opcodes were converted to TLV-based versioned IP_FW3-based codes. * The remaining opcodes were also converted to be able to eliminate all older opcodes at once * All IP_FW3 handlers use special API instead of calling sooptcopy* directly to ease adding another communication methods * struct ip_fw is now different for kernel and userland * tablearg value has been changed to 0 to ease future extensions * table "values" are now indexes in special value array which holds extended data for given index * Batched add/delete has been added to tables code * Most changes have been done to permit batched rule addition. * interface tracking API has been added (started on demand) to permit effective interface tables operations * O(1) skipto cache, currently turned off by default at compile-time (eats 512K). * Several steps have been made towards making libipfw: * most of new functions were separated into "parse/prepare/show and actually-do-stuff" pieces (already merged). * there are separate functions for parsing text string into "struct ip_fw" and printing "struct ip_fw" to supplied buffer (already merged). * Probably some more less significant/forgotten features MFC after: 1 month Sponsored by: Yandex LLC	2014-10-09 19:32:35 +00:00
melifaro	74b507651f	Bump ipfw module version.	2014-10-09 16:12:01 +00:00
melifaro	d23efba7dd	Sync to HEAD@r272825.	2014-10-09 15:35:28 +00:00
melifaro	cab1d703b6	Fix core on table destroy introduced by table values code. Rename @ti array copy to 'ti_copy'.	2014-10-09 14:33:20 +00:00
melifaro	66120268e5	* Wire large user buffer before processing GET request. * Fix incorrect size calculation for IP_FW_XGET request.	2014-10-09 12:37:53 +00:00
melifaro	13cd9545ed	Add IP_FW_DUMP_SOPTCODES sopt to be able to determine which opcodes are currently available in kernel.	2014-10-08 11:12:14 +00:00
melifaro	c2f4f6c308	Fix possible crash when old value pointer is not updated after array resize.	2014-10-07 18:22:05 +00:00
melifaro	330fbac9e8	Notify table algo about runtime data change on table flush.	2014-10-07 16:46:11 +00:00
melifaro	7203f96dc1	* Fix crash in interface tracker due to using old "linked" field. * Ensure we're flushing entries without any locks held. * Free memory in (rare) case when interface tracker fails to register ifp. * Add KASSERT on table values refcounts.	2014-10-07 10:54:53 +00:00
melifaro	c5e00288f3	Improve r272609 (O_TCPOPTS). MFC after: 3 days	2014-10-06 12:29:06 +00:00
melifaro	bbf0fe2f55	Sync to HEAD@r272609.	2014-10-06 11:29:50 +00:00
melifaro	1a9bf52407	Fix O_TCPOPTS processing. Obtained from: luigi	2014-10-06 11:15:11 +00:00
melifaro	1e90e104a0	Fix build with gcc.	2014-10-04 13:57:14 +00:00
melifaro	f063418bd7	Please GCC by specifying proper cast.	2014-10-04 13:46:10 +00:00
melifaro	ea0abe8630	Bump max rule size to 512 opcodes.	2014-10-04 12:46:26 +00:00
melifaro	e8d559896c	Sync to HEAD@r272516.	2014-10-04 12:42:37 +00:00
melifaro	08c555cee7	Add "ipfw_ctl3" FEATURE to indicate presence of new ipfw interface.	2014-10-04 12:10:32 +00:00
melifaro	e2a6d82545	Switch ipfw to use rmlock for runtime locking.	2014-10-04 11:40:35 +00:00
melifaro	41c6784b49	Bump max rule size to 512 opcodes.	2014-10-04 10:15:49 +00:00
melifaro	c0f26d5a55	Make linear_skipto turned off by default.	2014-10-03 15:54:51 +00:00
melifaro	d8b683d70f	Remove lock init from radix.c. Radix has never managed its locking itself. The only consumer using radix with embedded rwlock is system routing table. Move per-AF lock inits there.	2014-10-01 14:39:06 +00:00
glebius	713d87864c	Use rn_detachhead() instead of direct free(9) for radix tables. Sponsored by: Nginx, Inc.	2014-10-01 13:35:41 +00:00
sbruno	22da1e9569	Fix NULL pointer deref in ipfw when using dummynet at layer 2. Drop packet if pkg->ifp is NULL, which is the case here. ref. https://github.com/HardenedBSD/hardenedBSD commit 4eef3881c64f6e3aa38eebbeaf27a947a5d47dd7 PR 193861 -- DUMMYNET LAYER2: kernel panic in this case a kernel panic occurs. Hence, when we do not get an interface, we just drop the packet in question. PR: 193681 Submitted by: David Carlier <david.carlier@hardenedbsd.org> Obtained from: Hardened BSD MFC after: 2 weeks Relnotes: yes	2014-09-25 02:26:05 +00:00
melifaro	a95acb50bd	Add pre-alpha version of DXR lookup module. It does build but (currently) does not work. This change is not intended to be merged along with other ipfw changes.	2014-09-21 18:15:09 +00:00
glebius	16745af543	Mechanically convert to if_inc_counter().	2014-09-19 09:19:29 +00:00
glebius	72f04611ec	Remove ifq_drops from struct ifqueue. Now queue drops are accounted in struct ifnet if_oqdrops. Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops is simply removed from them. There were no API to read this statistic. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-19 09:01:19 +00:00
glebius	759eeea220	- Provide a sleepable lock to protect against ioctl() vs ioctl() races. - Use the new lock to protect against simultaneous DIOCSTART and/or DIOCSTOP ioctls. Reported & tested by: jmallett Sponsored by: Nginx, Inc.	2014-09-12 08:39:15 +00:00
melifaro	f7e6823045	Make ipfw_nat module use IP_FW3 codes. Kernel changes: * Split kernel/userland nat structures eliminating IPFW_INTERNAL hack. * Add IP_FW_NAT44_* codes resembling old ones. * Assume that instances can be named (no kernel support currently). * Use both UH+WLOCK locks for all configuration changes. * Provide full ABI support for old sockopts. Userland changes: * Use IP_FW_NAT44_* codes for nat operations. * Remove undocumented ability to show ranges of nat "log" entries.	2014-09-07 18:30:29 +00:00
melifaro	595fec1055	Change copyrights to the proper one.	2014-09-05 14:19:02 +00:00
melifaro	21fa37c8e5	Sync to HEAD@r271160.	2014-09-05 13:52:39 +00:00
melifaro	03b9e62107	* Use modular opcode handling inside ipfw_ctl3() instead of static switch. * Provide hints for subsystem initializers if they are called for the first/last time. * Convert every IP_FW3 opcode user to use new sopt API.	2014-09-05 11:11:15 +00:00
melifaro	d8fb572c36	Be consistent and use same arguments for ctl3 opcodes. Move legacy IP_FW_TABLE_XGETSIZE handling to separate function.	2014-09-03 21:57:06 +00:00
glebius	2e01608625	Clean up unused CSUM_FRAGMENT. Sponsored by: Nginx, Inc.	2014-09-03 08:30:18 +00:00
melifaro	9677452b6e	* Fix crash due to forgotten value refcounting in ipfw_link_table_values() * Fix argument order in rollback_toperation_state() * Make flush_table() use operation state API to ease checks.	2014-09-02 20:46:18 +00:00
melifaro	416d664184	Add more comments on newly-added functions. Add back opstate handler function.	2014-09-02 14:27:12 +00:00
glebius	0cbf499e97	Explicitly free packet on PF_DROP, otherwise a "quick" rule with "route-to" may still forward it. PR: 177808 Submitted by: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de> Sponsored by: InnoGames GmbH	2014-09-01 13:00:45 +00:00
melifaro	a1eca3cc0c	Add support for multi-field values inside ipfw tables. This is the last major change in given branch. Kernel changes: * Use 64-bytes structures to hold multi-value variables. * Use shared array to hold values from all tables (assume each table algo is capable of holding 32-byte variables). * Add some placeholders to support per-table value arrays in future. * Use simple eventhandler-style API to ease the process of adding new table items. Currently table addition may require multiple UH drops/ acquires which is quite tricky due to atomic table modification/swap support, shared array resize, etc. Deal with it by calling special notifier capable of rolling back state before actually performing swap/resize operations. Original operation then restarts itself after acquiring UH lock. * Bump all objhash users default values to at least 64 * Fix custom hashing inside objhash. Userland changes: * Add support for dumping shared value array via "vlist" internal cmd. * Some small print/fill_flags fixes to support u32 values. * valtype is now bitmask of <skipto|pipe|fib|nat|dscp|tag|divert|netgraph|limit|ipv4|ipv6>. New values can hold distinct values for each of these types. * Provide special "legacy" type which assumes all values are the same. * More helpers/docs following.. Some examples: 3:41 [1] zfscurr0# ipfw table mimimi create valtype skipto,limit,ipv4,ipv6 3:41 [1] zfscurr0# ipfw table mimimi info +++ table(mimimi), set(0) +++ kindex: 2, type: addr references: 0, valtype: skipto,limit,ipv4,ipv6 algorithm: addr:radix items: 0, size: 296 3:42 [1] zfscurr0# ipfw table mimimi add 10.0.0.5 3000,10,10.0.0.1,2a02:978:2::1 added: 10.0.0.5/32 3000,10,10.0.0.1,2a02:978:2::1 3:42 [1] zfscurr0# ipfw table mimimi list +++ table(mimimi), set(0) +++ 10.0.0.5/32 3000,0,10.0.0.1,2a02:978:2::1	2014-08-31 23:51:09 +00:00
melifaro	631be4d79a	* Make objhash api a bit more abstract by providing ability to specify own hash/compare functions. * Add requirement for table algorithms to copy "value" field in @add callback instead of "prepare_add". * Document existing requirement for table algorithms to store value of deleted record to @tei.	2014-08-30 17:18:11 +00:00
melifaro	06eb65b248	Whitespace/style changes merged from projects/ipfw.	2014-08-23 17:57:06 +00:00
melifaro	cf94663e69	Sync to HEAD@r270409.	2014-08-23 14:58:31 +00:00
melifaro	2e65f120c8	Simplify table reference/create chain.	2014-08-23 12:41:39 +00:00
melifaro	3498dca96e	* Use OP_ADD/OP_DEL macro instead of plain integers. * ipfw_foreach_table_tentry() to permit listing arbitrary ipfw table using standard format.	2014-08-23 11:27:49 +00:00
glebius	4242d9acba	Do not lookup source node twice when pf_map_addr() is used. PR: 184003 Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-15 14:16:08 +00:00
glebius	9227a25906	pf_map_addr() can fail and in this case we should drop the packet, otherwise bad consequences including a routing loop can occur. Move pf_set_rt_ifp() earlier in state creation sequence and inline it, cutting some extra code. PR: 183997 Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-15 14:02:24 +00:00
melifaro	b921074dbb	Make room for multi-type values in struct tentry.	2014-08-15 12:58:32 +00:00
glebius	45bdeab3db	Fix synproxy with IPv6. pf_test6() was missing a check for M_SKIP_FIREWALL. PR: 127920 Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-15 04:35:34 +00:00
kevlo	dd40fa7e62	Change pr_output's prototype to avoid the need for explicit casts. This is a follow up to r269699. Phabric: D564 Reviewed by: jhb	2014-08-15 02:43:02 +00:00
melifaro	6f8397b648	Replace "cidr" table type with "addr" type. Suggested by: luigi	2014-08-14 21:43:20 +00:00
melifaro	7c57f4c90d	* Add cidr:kfib algo type just for fun. It binds kernel fib of given number to a table. Example: # ipfw table fib2 create algo "cidr:kfib fib=2" # ipfw table fib2 info +++ table(fib2), set(0) +++ kindex: 2, type: cidr, locked valtype: number, references: 0 algorithm: cidr:kfib fib=2 items: 11, size: 288 # ipfw table fib2 list +++ table(fib2), set(0) +++ 10.0.0.0/24 0 127.0.0.1/32 0 ::/96 0 ::1/128 0 ::ffff:0.0.0.0/96 0 2a02:978:2::/112 0 fe80::/10 0 fe80:1::/64 0 fe80:2::/64 0 fe80:3::/64 0 ff02::/16 0 # ipfw table fib2 lookup 10.0.0.5 10.0.0.0/24 0 # ipfw table fib2 lookup 2a02:978:2::11 2a02:978:2::/112 0 # ipfw table fib2 detail +++ table(fib2), set(0) +++ kindex: 2, type: cidr, locked valtype: number, references: 0 algorithm: cidr:kfib fib=2 items: 11, size: 288 IPv4 algorithm radix info items: 0 itemsize: 200 IPv6 algorithm radix info items: 0 itemsize: 200	2014-08-14 20:17:23 +00:00
glebius	7d0b571895	- Count global pf(4) statistics in counter(9). - Do not count global number of states and of src_nodes, use uma_zone_get_cur() to obtain values. - Struct pf_status becomes merely an ioctl API structure, and moves to netpfil/pf/pf.h with its constants. - V_pf_status is now of type struct pf_kstatus. Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-14 18:57:46 +00:00
melifaro	9b0fd0e183	* Document internal commands. * Do not require/set default table type if algo name is specified. * Add TA_FLAG_READONLY option for algorithms.	2014-08-14 17:31:04 +00:00
melifaro	a5e98ab07d	Clean up kernel interaction in ip_fw_iface.c Suggested by: ae	2014-08-14 13:24:59 +00:00
melifaro	ac476df0ec	Fix crash in case of iflist request on non-initialized tracker.	2014-08-14 08:42:16 +00:00
melifaro	ef7f079c1d	* Fix displaying dynamic rules for large rulesets. * Clean up some comments.	2014-08-14 08:21:22 +00:00
melifaro	9d56937f2a	Fix assertion.	2014-08-13 16:53:12 +00:00
melifaro	03e33c1ac5	Sync to HEAD@r269943.	2014-08-13 16:20:41 +00:00
melifaro	21ceaa3a9f	* Pass proper table set numbers from userland side. * Ignore them, but honor V_fw_tables_sets value on kernel side.	2014-08-13 12:04:45 +00:00
melifaro	2bb7ccb159	* Add jump_linear() function utilizing calculated skipto cache. * Update description for jump_fast() * Make jump_fast() users use JUMP() macro which is resolved to jump_fast() by default.	2014-08-13 09:34:33 +00:00
melifaro	1c05300c17	* Clarify ipfw_swap_table operations * Ensure <add|del>_table_entry handle ta change properly.	2014-08-12 17:03:13 +00:00
melifaro	37a5b4aafb	* Rename ipfw_[un]bind_table_rule to ipfw_[un]ref_rule_tables * Update their descriptions.	2014-08-12 16:08:13 +00:00
melifaro	20eb17aed6	Change tablearg value to be 0 (try #2 ). Most of the tablearg-supported opcodes does not accept 0 as valid value: O_TAG, O_TAGGED, O_PIPE, O_QUEUE, O_DIVERT, O_TEE, O_SKIPTO, O_CALLRET, O_NETGRAPH, O_NGTEE, O_NAT treats 0 as invalid input. The rest are O_SETDSCP and O_SETFIB. 'Fix' them by adding high-order bit (0x8000) set for non-tablearg values. Do translation in kernel for old clients (import_rule0 / export_rule0), teach current ipfw(8) binary to add/remove given bit. This change does not affect handling SETDSCP values, but limit O_SETFIB values to 32767 instead of 65k. Since currently we have either old (16) or new (2^32) max fibs, this should not be a big deal: we're definitely OK for former and have to add another opcode to deal with latter, regardless of tablearg value.	2014-08-12 15:51:48 +00:00
melifaro	ac4e64f311	Do not use index 0 for tables.	2014-08-12 14:19:45 +00:00
melifaro	7f14a3576e	* Rename has_space to need_modify to be consistent with 0 as return values. * document all callbacks supported by algorithms code.	2014-08-12 14:09:15 +00:00
melifaro	324833519e	No functional changes, do better functions grouping.	2014-08-12 10:22:46 +00:00
melifaro	8c5ec3a86c	Simplify table auto-creation for old userland users.	2014-08-12 09:48:54 +00:00
melifaro	d633efff82	Simplify add/del_table_entry() by making their common pieces common functions.	2014-08-11 22:38:13 +00:00
melifaro	9266cc6d8f	Update functions descriptions.	2014-08-11 20:00:51 +00:00
melifaro	25473f8f4a	* Add the ability to lock/unlock given table from changes. Example: # ipfw table si lock # ipfw table si info +++ table(si), set(0) +++ kindex: 0, type: cidr, locked valtype: number, references: 0 algorithm: cidr:radix items: 0, size: 288 # ipfw table si add 4.5.6.7 ignored: 4.5.6.7/32 0 ipfw: Adding record failed: table is locked # ipfw table si unlock # ipfw table si add 4.5.6.7 added: 4.5.6.7/32 0 # ipfw table si lock # ipfw table si delete 4.5.6.7 ignored: 4.5.6.7/32 0 ipfw: Deleting record failed: table is locked # ipfw table si unlock # ipfw table si delete 4.5.6.7 deleted: 4.5.6.7/32 0	2014-08-11 18:09:37 +00:00
melifaro	377bb9d131	* Add support for batched add/delete for ipfw tables * Add support for atomic batch add (all or none). * Fix panic on deleting non-existing entry in radix algo. Examples: # si is empty # ipfw table si add 1.1.1.1/32 1111 2.2.2.2/32 2222 added: 1.1.1.1/32 1111 added: 2.2.2.2/32 2222 # ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 4444 exists: 2.2.2.2/32 2200 added: 4.4.4.4/32 4444 ipfw: Adding record failed: record already exists ^^^^^ Returns error but keeps inserted items # ipfw table si list +++ table(si), set(0) +++ 1.1.1.1/32 1111 2.2.2.2/32 2222 4.4.4.4/32 4444 # ipfw table si atomic add 3.3.3.3/32 3333 4.4.4.4/32 4400 5.5.5.5/32 5555 added(reverted): 3.3.3.3/32 3333 exists: 4.4.4.4/32 4400 ignored: 5.5.5.5/32 5555 ipfw: Adding record failed: record already exists ^^^^^ Returns error and reverts added records # ipfw table si list +++ table(si), set(0) +++ 1.1.1.1/32 1111 2.2.2.2/32 2222 4.4.4.4/32 4444	2014-08-11 17:34:25 +00:00
melifaro	5b47ece0e9	* Use two 32-bit fields inside rule instead of two pointers to save skipto state. * Introduce ipfw_reap_add() to unify unlinking rules/adding them to reap queue * Unbreak FreeBSD7 export format.	2014-08-09 09:11:26 +00:00
melifaro	57d917cb99	Kernel changes: * Fix buffer calculation for table dumps * Fix IPv6 radix entries addition broken in r269371. Userland changes: * Fix bug in retrieving static ruleset * Fix several bugs in retrieving table list	2014-08-08 21:09:22 +00:00
melifaro	deeb40d882	Partially revert previous commit: "0" value is perfectly valid for O_SETFIB and O_SETDSCP, so tablearg remains 65535 for now.	2014-08-08 15:33:26 +00:00
melifaro	bc102dcade	* Switch tablearg value from 65535 to 0. * Use u16 table kidx instead of integer one for iface opcode. * Provide compatibility layer for old clients.	2014-08-08 14:23:20 +00:00
melifaro	2a5da00f23	* Add IP_FW_TABLE_XMODIFY opcode * Since there seems to be lack of consensus on strict value typing, remove non-default value types. Use userland-only "value format type" to print values. Kernel changes: * Add IP_FW_XMODIFY to permit table run-time modifications. Currently we support changing limit and value format type. Userland changes: * Support IP_FW_XMODIFY opcode. * Support specifying value format type (ftype) in table create/modify req * Fine-print value type/value format type.	2014-08-08 09:27:49 +00:00
melifaro	3ad34df447	Remove IP_FW_TABLES_XGETSIZE opcode. It is superseded by IP_FW_TABLES_XLIST.	2014-08-08 06:36:26 +00:00
kevlo	7727a3c215	Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb	2014-08-08 01:57:15 +00:00
melifaro	c2c120701d	Since all of base IP_FW opcodes have been converted to IP_FW3, switch default sopt handler to ipfw_ctl3. Add some comments for ipfw_get_sopt* api.	2014-08-07 22:08:43 +00:00
melifaro	61bb76b813	Kernel changes: * Implement proper checks for switching between global and set-aware tables * Split IP_FW_DEL mess into the following opcodes: * IP_FW_XDEL (del rules matching pattern) * IP_FW_XMOVE (move rules matching pattern to another set) * IP_FW_SET_SWAP (swap between 2 sets) * IP_FW_SET_MOVE (move one set to another one) * IP_FW_SET_ENABLE (enable/disable sets) * Add IP_FW_XZERO / IP_FW_XRESETLOG to finish IP_FW3 migration. * Use unified ipfw_range_tlv as range description for all of the above. * Check dynamic states IFF there was non-zero number of deleted dyn rules, * Del relevant dynamic states with single traversal instead of per-rule one. Userland changes: * Switch ipfw(8) to use new opcodes.	2014-08-07 21:37:31 +00:00
melifaro	42eca8abfb	Implement atomic ipfw table swap. Kernel changes: * Add opcode IP_FW_TABLE_XSWAP * Add support for swapping 2 tables with the same type/ftype/vtype. * Make skipto cache init after ipfw locks init. Userland changes: * Add "table X swap Y" command.	2014-08-03 21:37:12 +00:00
melifaro	c7e5ac0567	Implement O(1) skipto using indexed array. This adds 512K (2 * sizeof(u32) * 65k) bytes to the memory footprint. This feature is optional and may be turned on at any time (however it starts immediately in this commit. This will be changed.)	2014-08-03 15:49:03 +00:00
melifaro	6e882e1221	Show algorithm-specific data in "table info" output.	2014-08-03 12:19:45 +00:00
melifaro	688e206691	Be consistent on cidr:radix function naming: use algo name instead of "cidr".	2014-08-03 09:53:34 +00:00
melifaro	4cdc519f54	Remove unneeded headers.	2014-08-03 09:48:54 +00:00
melifaro	7bb611530d	Whitespace changes.	2014-08-03 09:40:50 +00:00
melifaro	d27a1eeff2	* Move all algo-specific structures to the top of algo definition. * Be consistent on naming variables in different algos. * Use exponential array grow in iface:array and number:array.	2014-08-03 09:04:36 +00:00
melifaro	bfd5bf65d9	Store entry value back in @tei on entry update/deletion as another step to batched atomic updates.	2014-08-03 08:32:54 +00:00
melifaro	a1876c68a2	* Fix case when returning more than 4096 bytes of data * Use different approach to ensure algo has enough space to store N elements: - explicitly ask algo (under UH_WLOCK) before/after insertion. This (along with existing reallocation callbacks) really guarantees us that it is safe to insert N elements at once while holding UH_WLOCK+WLOCK. - remove old aflags/flags approach	2014-08-02 17:18:47 +00:00
melifaro	178311d9d4	* Permit limiting number of items in table. Kernel changes: * Add TEI_FLAGS_DONTADD entry flag to indicate that insert is not possible * Support given flag in all algorithms * Add "limit" field to ipfw_xtable_info * Add actual limiting code into add_table_entry() Userland changes: * Add "limit" option as "create" table sub-option. Limit modification is currently impossible. * Print human-readable errors in table entry addition/deletion code.	2014-08-01 15:17:46 +00:00
melifaro	6d7452f13b	Do not perform memset() on ta_buf in algo callbacks: it is already zeroed by base code.	2014-08-01 08:39:47 +00:00
melifaro	f9c6e04aff	Simplify radix operations: use unified tei_to_sockaddr_ent() to generate keys for add/delete calls.	2014-08-01 08:28:18 +00:00
melifaro	4dc5f97e56	* Use TA_FLAG_DEFAULT for default algorithm selection instead of exporting algorithm structures directly. * Pass needed state buffer size in algo structures as preparation for tables add/del requests batching.	2014-08-01 07:35:17 +00:00
melifaro	58e70e361d	* Add new "flow" table type to support N=1..5-tuple lookups * Add "flow:hash" algorithm Kernel changes: * Add O_IP_FLOW_LOOKUP opcode to support "flow" lookups * Add IPFW_TABLE_FLOW table type * Add "struct tflow_entry" as storage for 6-tuple flows * Add "flow:hash" algorithm. Basically it is auto-growing chained hash table. Additionally, we store mask of fields we need to compare in each instance. * Increase ipfw_obj_tentry size by adding struct tflow_entry * Add per-algorithm stat (ipfw_ta_tinfo) to ipfw_xtable_info * Increase algoname length: 32 -> 64 (algo options passed there as string) * Assume every table type can be customized by flags, use u8 to store "tflags" field. * Simplify ipfw_find_table_entry() by providing @tentry directly to algo callback. * Fix bug in cidr:chash resize procedure. Userland changes: * add "flow table(NAME)" syntax to support n-tuple checking tables. * make fill_flags() separate function to ease working with _s_x arrays * change "table info" output to reflect longer "type" fields Syntax: ipfw table fl2 create type flow:[src-ip][,proto][,src-port][,dst-ip][dst-port] [algo flow:hash] Examples: 0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash 0:02 [2] zfscurr0# ipfw table fl2 info +++ table(fl2), set(0) +++ kindex: 0, type: flow:src-ip,proto,dst-port valtype: number, references: 0 algorithm: flow:hash items: 0, size: 280 0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000 0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000 0:02 [2] zfscurr0# ipfw table fl2 list +++ table(fl2), set(0) +++ 2a02:6b8::333,6,443 45000 10.0.0.92,6,80 22000 0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)' 00200 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2) 0:03 [2] zfscurr0# ipfw show 00200 0 0 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2) 65535 617 59416 allow ip from any to any 0:03 [2] zfscurr0# telnet -s 10.0.0.92 78.46.89.105 80 Trying 78.46.89.105... .. 0:04 [2] zfscurr0# ipfw show 00200 5 272 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2) 65535 682 66733 allow ip from any to any	2014-07-31 20:08:19 +00:00
melifaro	4419c812fe	* Add number:array algorithm lookup method. Kernel changes: * s/IPFW_TABLE_U32/IPFW_TABLE_NUMBER/ * Force "lookup <port|uid|gid|jid>" to be IPFW_TABLE_NUMBER * Support "lookup" method for number tables * Add number:array algorithm (i32 as key, auto-growing). Userland changes: * Support named tables in "lookup <tag> Table" * Fix handling of "table(NAME,val)" case * Support printing "number" table data.	2014-07-30 14:52:26 +00:00
melifaro	2ca9167fd0	* Add "lookup" method for cidr:hash algorithm type. * Add auto-grow ability to cidr:hash type. * Fix some bugs / simplify implementation for cidr:hash.	2014-07-30 12:39:49 +00:00
melifaro	23cdd03b9c	Fix "flush" cmd for algorithms with non-default parameters.	2014-07-30 09:17:40 +00:00
melifaro	389a854346	* Introduce ipfw_ctl3() handler and move all IP_FW3 opcodes there. The long-term goal is to switch remaining opcodes to IP_FW3 versions and use ipfw_ctl3() as default handler simplifying ipfw(4) interaction with external world.	2014-07-29 23:06:06 +00:00
melifaro	bf787a59a7	* Dump available table algorithms via "ipfw talist" cmd. Kernel changes: * Add type/refcount fields to table algo instances. * Add IP_FW_TABLES_ALIST opcode to export available algorithms to userland. Userland changes: * Fix cores on empty input inside "ipfw table" handler. * Add "ipfw talist" cmd to print available kernel algorithms. * Change "table info" output to reflect long algorithm config lines.	2014-07-29 22:44:26 +00:00
melifaro	7e2cb6d901	* Copy ta structures to stable storage to ease future extension. * Remove algo .lookup field since table lookup function is set by algo code.	2014-07-29 21:38:06 +00:00
melifaro	ce5a8379b8	* Add new ipfw cidr algorithm: hash table. Algorithm works with both IPv4 and IPv6 prefixes, /32 and /128 ranges are assumed by default. It works the following way: input IP address is masked to specified mask, hashed and searched inside hash bucket. Current implementation does not support "lookup" method and hash auto-resize. This will be changed soon. some examples: ipfw table mi_test2 create type cidr algo cidr:hash ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64" ipfw table mi_test2 info +++ table(mi_test2), set(0) +++ type: cidr, kindex: 7 valtype: number, references: 0 algorithm: cidr:hash items: 0, size: 220 ipfw table mi_test info +++ table(mi_test), set(0) +++ type: cidr, kindex: 6 valtype: number, references: 0 algorithm: cidr:hash masks=/30,/64 items: 0, size: 220 ipfw table mi_test add 10.0.0.5/30 ipfw table mi_test add 10.0.0.8/30 ipfw table mi_test add 2a02:6b8:b010::1/64 25 ipfw table mi_test list +++ table(mi_test), set(0) +++ 10.0.0.4/30 0 10.0.0.8/30 0 2a02:6b8:b010::/64 25	2014-07-29 19:49:38 +00:00
melifaro	286880219b	* Change algorithm names to "type:algo" (e.g. "iface:array", "cidr:radix") format. * Pass number of items changed in add/del hooks to permit adding/deleting multiple values at once.	2014-07-29 08:00:13 +00:00
melifaro	fa3f38a6a0	* Add generic ipfw interface tracking API * Rewrite interface tables to use interface indexes Kernel changes: * Add generic interface tracking API: - ipfw_iface_ref (must call unlocked, performs lazy init if needed, allocates state & bumps ref) - ipfw_iface_add_ntfy(UH_WLOCK+WLOCK, links consumer & runs its callback to update ifindex) - ipfw_iface_del_ntfy(UH_WLOCK+WLOCK, unlinks consumer) - ipfw_iface_unref(unlocked, drops reference) Additionally, consumer callbacks are called on interface withdrawal/departure. * Rewrite interface tables to use iface tracking API. Currently tables are implemented the following way: runtime data is stored as sorted array of {ifidx, val} for existing interfaces full data is stored inside namedobj instance (chained hashed table). * Add IP_FW_XIFLIST opcode to dump status of tracked interfaces * Pass @chain ptr to most non-locked algorithm callbacks: (prepare_add, prepare_del, flush_entry ..). This may be needed for better interaction of given algorithm and other ipfw subsystems * Add optional "change_ti" algorithm handler to permit updating of cached table_info pointer (happens in case of table_max resize) * Fix small bug in ipfw_list_tables() * Add badd (insert into sorted array) and bdel (remove from sorted array) funcs Userland changes: * Add "iflist" cmd to print status of currently tracked interface * Add stringnum_cmp for better interface/table names sorting	2014-07-28 19:01:25 +00:00
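
Note on commit 20eb17aed6 above: the message describes reserving 0 as the tablearg marker and setting the high-order bit (0x8000) on literal values for opcodes where 0 is itself valid (O_SETDSCP, O_SETFIB). A minimal sketch of that encoding, assuming a 16-bit argument field, might look like the following. All names here (SKETCH_*, sketch_*) are hypothetical and chosen for illustration; they are not the identifiers used in the ipfw sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from ipfw. */
#define SKETCH_TABLEARG   0x0000u  /* "take the value from the table lookup" */
#define SKETCH_VALUE_BIT  0x8000u  /* marks a literal (non-tablearg) value */
#define SKETCH_VALUE_MAX  0x7fffu  /* largest literal that still fits: 32767 */

/* Encode a literal so that a literal 0 stays distinguishable from tablearg. */
static uint16_t
sketch_encode_literal(uint16_t value)
{
	assert(value <= SKETCH_VALUE_MAX);
	return (uint16_t)(value | SKETCH_VALUE_BIT);
}

/* An argument of exactly 0 means "use the table lookup result". */
static int
sketch_is_tablearg(uint16_t arg)
{
	return arg == SKETCH_TABLEARG;
}

/* Strip the marker bit to recover the literal value. */
static uint16_t
sketch_decode_literal(uint16_t arg)
{
	return (uint16_t)(arg & SKETCH_VALUE_MAX);
}
```

Under this scheme a literal 0 is stored as 0x8000, so it can never be confused with tablearg, at the cost of one bit of range. That matches the 32767 limit the commit message mentions for O_SETFIB.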


404 Commits