freebsd-skq

Author	SHA1	Message	Date
glebius	2e01608625	Clean up unused CSUM_FRAGMENT. Sponsored by: Nginx, Inc.	2014-09-03 08:30:18 +00:00
glebius	0cbf499e97	Explicitly free packet on PF_DROP, otherwise a "quick" rule with "route-to" may still forward it. PR: 177808 Submitted by: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de> Sponsored by: InnoGames GmbH	2014-09-01 13:00:45 +00:00
melifaro	06eb65b248	Whitespace/style changes merged from projects/ipfw.	2014-08-23 17:57:06 +00:00
glebius	4242d9acba	Do not lookup source node twice when pf_map_addr() is used. PR: 184003 Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-15 14:16:08 +00:00
glebius	9227a25906	pf_map_addr() can fail and in this case we should drop the packet, otherwise bad consequences including a routing loop can occur. Move pf_set_rt_ifp() earlier in state creation sequence and inline it, cutting some extra code. PR: 183997 Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-15 14:02:24 +00:00
glebius	45bdeab3db	Fix synproxy with IPv6. pf_test6() was missing a check for M_SKIP_FIREWALL. PR: 127920 Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-15 04:35:34 +00:00
kevlo	dd40fa7e62	Change pr_output's prototype to avoid the need for explicit casts. This is a follow up to r269699. Phabric: D564 Reviewed by: jhb	2014-08-15 02:43:02 +00:00
glebius	7d0b571895	- Count global pf(4) statistics in counter(9). - Do not count global number of states and of src_nodes, use uma_zone_get_cur() to obtain values. - Struct pf_status becomes merely an ioctl API structure, and moves to netpfil/pf/pf.h with its constants. - V_pf_status is now of type struct pf_kstatus. Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-14 18:57:46 +00:00
kevlo	7727a3c215	Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb	2014-08-08 01:57:15 +00:00
glebius	98615618b9	On machines with strict alignment copy pfsync_state_key from packet on stack to avoid unaligned access. PR: 187381 Submitted by: Lytochkin Boris <lytboris gmail.com>	2014-07-10 12:41:58 +00:00
hselasky	35b126e324	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
gjb	fc21f40567	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
hselasky	bd1ed65f0f	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
hiren	a877260646	DNOLD_IS_ECN introduced by r266941 is not required. DNOLD_* flags are for compat with old binaries. Suggested by: luigi	2014-06-01 20:19:17 +00:00
hiren	cc47b6d947	ECN marking implementation for dummynet. Changes include both DCTCP and RFC 3168 ECN marking methodology. DCTCP draft: http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00 Submitted by: Midori Kato (aoimidori27@gmail.com) Worked with: Lars Eggert (lars@netapp.com) Reviewed by: luigi, hiren	2014-06-01 07:28:24 +00:00
jhb	91a569ad69	Fix pf(4) to build with MAXCPU set to 256. MAXCPU is actually a count, not a maximum ID value (so it is a cap on mp_ncpus, not mp_maxid).	2014-05-29 19:17:10 +00:00
ae	26693dfcb9	Since ipfw nat configures all options in one step, we should set all bits in the mask when calling LibAliasSetMode() to properly clear unneeded options. PR: 189655 MFC after: 1 week Sponsored by: Yandex LLC	2014-05-18 14:25:19 +00:00
melifaro	f4783a05e9	Fix wrong formatting of 0.0.0.0/X table records in ipfw(8). Add `flags` u16 field to the hole in ipfw_table_xentry structure. Kernel has been guessing address family for supplied record based on xent length size. Userland, however, has been getting fixed-size ipfw_table_xentry structures guessing address family by checking address by IN6_IS_ADDR_V4COMPAT(). Fix this behavior by providing specific IPFW_TCF_INET flag for IPv4 records. PR: bin/189471 Submitted by: Dennis Yusupoff <dyr@smartspb.net> MFC after: 2 weeks	2014-05-17 13:45:03 +00:00
glebius	9412c23d6c	o In pf_normalize_ip() we don't need mtag in !(PFRULE_FRAGCROP|PFRULE_FRAGDROP) case. o In the (PFRULE_FRAGCROP|PFRULE_FRAGDROP) case we should allocate mtag if we don't find any. Tested by: Ian FREISLICH <ianf cloudseed.co.za>	2014-05-17 12:30:27 +00:00
trociny	bd951d3fdb	Define startup order the same way as it is in dummynet.	2014-04-26 08:05:16 +00:00
glebius	597bcfe53d	The current API for adding rules with pool addresses is the following: - DIOCADDADDR adds addresses and puts them into V_pf_pabuf - DIOCADDRULE takes all addresses from V_pf_pabuf and links them into rule. The ugly part is that if address is a table, then it is initialized in DIOCADDRULE, because we need ruleset, and DIOCADDADDR doesn't supply ruleset. But if address is a dynaddr, we need address family, and address family could be different for different addresses in one rule, so dynaddr is initialized in DIOCADDADDR. This leads to the entangled state of addresses on V_pf_pabuf. Some are initialized, and some not. That's why running pf_empty_pool(&V_pf_pabuf) can lead to a panic on a NULL table address. Since proper fix requires API/ABI change, for now simply plug the panic in pf_empty_pool(). Reported by: danger	2014-04-25 11:36:11 +00:00
mm	532d55ab5f	Backport from projects/pf r263908: De-virtualize UMA zone pf_mtag_z and move to global initialization part. The m_tag struct does not know about vnet context and the pf_mtag_free() callback is called unaware of current vnet. This causes a panic. MFC after: 1 week	2014-04-20 09:17:48 +00:00
ae	d70382e43a	Set oif only for outgoing packets. PR: 188543 MFC after: 1 week Sponsored by: Yandex LLC	2014-04-16 14:37:11 +00:00
glebius	97ee1da70b	Backout r257223,r257224,r257225,r257246,r257710. The changes caused some regressions in ICMP handling, and right now me and Baptiste are out of time on analyzing them. PR: 188253	2014-04-16 09:25:20 +00:00
brueffer	f19c513644	Free resources in error cases; re-indent a curly brace while here. CID: 1199366 Found with: Coverity Prevent(tm) MFC after: 1 week	2014-04-13 21:13:33 +00:00
mm	257cccbfaa	Merge from projects/pf r264198: Execute pf_overload_task() in vnet context. Fixes a vnet kernel panic. Reviewed by: trociny MFC after: 1 week	2014-04-07 07:06:13 +00:00
mm	c4f653f608	Merge from projects/pf r251993 (glebius@): De-vnet hash sizes and hash masks. Submitted by: Nikos Vassiliadis <nvass gmx.com> Reviewed by: trociny MFC after: 1 month	2014-03-25 06:55:53 +00:00
glebius	0825c0b36c	Fix breakage in ipfw+VIMAGE after r261590. PR: kern/187665 Sponsored by: Nginx, Inc.	2014-03-21 17:07:18 +00:00
glebius	8a3e4bbebb	- Remove rt_metrics_lite and simply put its members into rtentry. - Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This removes another cache trashing ++ from packet forwarding path. - Create zini/fini methods for the rtentry UMA zone. Via initialize mutex and counter in them. - Fix reporting of rmx_pksent to routing socket. - Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode. The change is mostly targeted for stable/10 merge. For head, rt_pksent is expected to just disappear. Discussed with: melifaro Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-03-05 01:17:47 +00:00
glebius	c23c087e5b	Instead of playing games with casts simply add 3 more members to the structure pf_rule, that are used when the structure is passed via ioctl(). PR: 187074	2014-03-05 00:40:03 +00:00
mm	d2d19c03ef	Revert r262196 I am going to split this into two individual patches and test it with the projects/pf branch that may get merged later.	2014-02-19 17:06:04 +00:00
mm	73f4b67f38	De-virtualize pf_mtag_z [1] Process V_pf_overloadqueue in vnet context [2] This fixes two VIMAGE kernel panics and allows to simultaneously run host-pf and vnet jails. pf inside jails remains broken. PR: kern/182964 Submitted by: glebius@FreeBSD.org [2], myself [1] Tested by: rodrigc@FreeBSD.org, myself MFC after: 2 weeks	2014-02-18 22:17:12 +00:00
gnn	1804af5050	Summary: Two quick edits to the implementation notes as they're no longer stored in netinet but in netpfil.	2014-02-15 18:36:31 +00:00
dim	4e20d58892	Under sys/netpfil/ipfw, surround two IPv6-specific static functions with #ifdef INET6, since they are unused when INET6 is disabled. MFC after: 3 days	2014-02-15 12:25:01 +00:00
glebius	1ea1d562a3	Once pf became not covered by a single mutex, many counters in it became race prone. Some just gather statistics, but some are later used in different calculations. A real problem was the race provoked underflow of the states_cur counter on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this value is used in pf_state_expires() and any state created by this rule is immediately expired. Thus, make fields states_cur, states_tot and src_nodes of struct pf_rule be counter(9)s. Thanks to Dennis for providing me shell access to problematic box and his help with reproducing, debugging and investigating the problem. Thanks to: Dennis Yusupoff <dyr smartspb.net> Also reported by: dumbbell, pgj, Rambler Sponsored by: Nginx, Inc.	2014-02-14 10:05:21 +00:00
melifaro	c32089edca	Reorder struct ip_fw_chain: * move rarely-used fields down * move uh_lock to different cacheline * remove some unused fields Sponsored by: Yandex LLC	2014-01-24 09:13:30 +00:00
glebius	3f1e8f48cd	Remove NULL pointer dereference. CID: 1009118	2014-01-22 15:58:43 +00:00
glebius	4d8db193db	Fix resource leak and simplify code for DIOCCHANGEADDR. CID: 1007035	2014-01-22 15:44:38 +00:00
melifaro	104ab6ec12	Revert r260548. We really should not use IPFW_WLOCK() here but this requires some more playing with IPFW_UH_WLOCK(). Leave till later.	2014-01-11 18:27:34 +00:00
melifaro	9f930faa0d	We don't need chain write lock since we're not modifying its contents. LibAliasSetAddress() uses its own mutex to serialize changes. While here, convert ifp->if_xname access to if_name() function. MFC after: 2 weeks Sponsored by: Yandex LLC	2014-01-11 16:50:41 +00:00
glebius	353906d3d2	When pf_get_translation() fails, it should leave *sn pointer pristine, otherwise we will panic in pf_test_rule(). PR: 182557	2014-01-06 19:05:04 +00:00
melifaro	c491eeb2f3	Use rnh_matchaddr instead of rnh_lookup for longest-prefix match. rnh_lookup is effectively the same as rnh_matchaddr if called with empty network mask. MFC after: 2 weeks	2014-01-03 23:11:26 +00:00
dim	320e3d9bba	Fix incorrect header guard define in sys/netpfil/pf/pf.h, which snuck in in r257186. Found by clang 3.4.	2013-12-22 19:47:22 +00:00
glebius	964c4daeba	Fix fallout from r258479: in pf_free_src_node() the node must already be unlinked. Reported by: Konstantin Kukushkin <dark rambler-co.ru> Sponsored by: Nginx, Inc.	2013-12-22 12:10:36 +00:00
melifaro	ce16a97371	Add net.inet.ip.fw.dyn_keep_states sysctl which re-links dynamic states to default rule instead of flushing on rule deletion. This can be useful while performing ruleset reload (think about `atomic` reload via changing sets). Currently it is turned off by default. MFC after: 2 weeks Sponsored by: Yandex LLC	2013-12-18 20:17:05 +00:00
melifaro	031fdfe55b	Simplify O_NAT opcode handling. MFC after: 2 weeks Sponsored by: Yandex LLC	2013-11-28 15:28:51 +00:00
melifaro	c9cfc8e322	Check ipfw table numbers in both user and kernel space before rule addition. Found by: Saychik Pavel <umka@localka.net> MFC after: 2 weeks Sponsored by: Yandex LLC	2013-11-28 10:28:28 +00:00
rodrigc	ad77255ba1	In sys/netpfil/ipfw/ip_fw_nat.c:vnet_ipfw_nat_uninit() we call "IPFW_WLOCK(chain);". This lock gets deleted in sys/netpfil/ipfw/ip_fw2.c:vnet_ipfw_uninit(). Therefore, vnet_ipfw_nat_uninit() must be called before vnet_ipfw_uninit(), but this doesn't always happen, because the VNET_SYSINIT order is the same for both functions. In sys/net/netpfil/ipfw/ip_fw2.c and sys/net/netpfil/ipfw/ip_fw_nat.c, IPFW_SI_SUB_FIREWALL == IPFW_NAT_SI_SUB_FIREWALL == SI_SUB_PROTO_IFATTACHDOMAIN and IPFW_MODULE_ORDER == IPFW_NAT_MODULE_ORDER Consequently, if VIMAGE is enabled, and jails are created and destroyed, the system sometimes crashes, because we are trying to use a deleted lock. To reproduce the problem: (1) Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS, INVARIANTS. (2) Run this command in a loop: jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo (see http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html ) Fix the problem by increasing the value of IPFW_NAT_SI_SUB_FIREWALL, so that vnet_ipfw_nat_uninit() runs after vnet_ipfw_uninit().	2013-11-25 20:20:34 +00:00
glebius	f889028338	The DIOCKILLSRCNODES operation was implemented with O(m*n) complexity, where "m" is number of source nodes and "n" is number of states. Thus, on a heavily loaded router its processing consumed a lot of CPU time. Reimplement it with O(m+n) complexity. We first scan through source nodes and disconnect matching ones, putting them on the freelist and marking with a cookie value in their expire field. Then we scan through the states, detecting references to source nodes with a cookie, and disconnect them as well. Then the freelist is passed to pf_free_src_nodes(). In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de> PR: kern/176763 Sponsored by: InnoGames GmbH Sponsored by: Nginx, Inc.	2013-11-22 19:22:26 +00:00
glebius	c884926273	To support upcoming changes change internal API for source node handling: - Removed pf_remove_src_node(). - Introduce pf_unlink_src_node() and pf_unlink_src_node_locked(). These functions do not proceed with freeing of a node, just disconnect it from storage. - New function pf_free_src_nodes() works on a list of previously disconnected nodes and frees them. - Utilize new API in pf_purge_expired_src_nodes(). In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de> Sponsored by: InnoGames GmbH Sponsored by: Nginx, Inc.	2013-11-22 19:16:34 +00:00


145 Commits