freebsd-nq

Author	SHA1	Message	Date
Konstantin Belousov	584b675ed6	Hide the boottime and boottimebin globals, provide the getboottime(9) and getboottimebin(9) KPI. Change consumers of boottime to use the KPI. The variables were renamed to avoid shadowing issues with local variables of the same name. The issue is that boottime* should be adjusted from tc_windup(), which requires them to be members of the timehands structure. As a preparation, this commit only introduces the interface. Some uses of boottime were found doubtful, e.g. NLM uses boottime to identify the system boot instance. Arguably the identity should not change on the leap second adjustment, but the commit is about the timekeeping code and the consumers were kept bug-to-bug compatible. Tested by: pho (as part of the bigger patch) Reviewed by: jhb (same) Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 month X-Differential revision: https://reviews.freebsd.org/D7302	2016-07-27 11:08:59 +00:00
Andrey V. Elsukov	ed22e564b8	Add named dynamic states support to ipfw(4). The keep-state, limit and check-state now will have additional argument flowname. This flowname will be assigned to dynamic rule by keep-state or limit opcode. And then can be matched by check-state opcode or O_PROBE_STATE internal opcode. To reduce possible breakage and to maximize compatibility with old rulesets default flowname introduced. It will be assigned to the rules when user has omitted state name in keep-state and check-state opcodes. Also if name is ambiguous (can be evaluated as rule opcode) it will be replaced to default. Reviewed by: julian Obtained from: Yandex LLC MFC after: 1 month Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6674	2016-07-19 04:56:59 +00:00
Andrey V. Elsukov	b867e84e95	Add ipfw_nptv6 module that implements Network Prefix Translation for IPv6 as defined in RFC 6296. The module works together with ipfw(4) and implemented as its external action module. When it is loaded, it registers as eaction and can be used in rules. The usage pattern is similar to ipfw_nat(4). All matched by rule traffic goes to the NPT module. Reviewed by: hrs Obtained from: Yandex LLC MFC after: 1 month Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6420	2016-07-18 19:46:31 +00:00
Don Lewis	98e82c02e5	Fix problems in the FQ-PIE AQM cleanup code that could leak memory or cause a crash. Because dummynet calls pie_cleanup() while holding a mutex, pie_cleanup() is not able to use callout_drain() to make sure that all callouts are finished before it returns, and callout_stop() is not sufficient to make that guarantee. After pie_cleanup() returns, dummynet will free a structure that any remaining callouts will want to access. Fix these problems by allocating a separate structure to contain the data used by the callouts. In pie_cleanup(), call callout_reset_sbt() to replace the normal callout with a cleanup callout that does the cleanup work for each sub-queue. The instance of the cleanup callout that destroys the last flow will also free the extra allocated block of memory. Protect the reference count manipulation in the cleanup callout with DN_BH_WLOCK() to be consistent with all of the other usage of the reference count where this lock is held by the dummynet code. Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au> MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D7174	2016-07-12 17:32:40 +00:00
Kristof Provost	aa7cac58c6	pf: Map hook returns onto the correct error values pf returns PF_PASS, PF_DROP, ... in the netpfil hooks, but the hook callers expect to get E<foo> error codes. Map the returns values. A pass is 0 (everything is OK), anything else means pf ate the packet, so return EACCES, which tells the stack not to emit an ICMP error message. PR: 207598	2016-07-09 12:17:01 +00:00
Don Lewis	12be18c7d5	Fix a race condition between the main thread in aqm_pie_cleanup() and the callout thread that can cause a kernel panic. Always do the final cleanup in the callout thread by passing a separate callout function for that task to callout_reset_sbt(). Protect the ref_count decrement in the callout with DN_BH_WLOCK(). All other ref_count manipulation is protected with this lock. There is still a tiny window between ref_count reaching zero and the end of the callout function where it is unsafe to unload the module. Fixing this would require the use of callout_drain(), but this can't be done because dummynet holds a mutex and callout_drain() might sleep. Remove the callout_pending(), callout_active(), and callout_deactivate() calls from calculate_drop_prob(). They are not needed because this callout uses callout_init_mtx(). Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au> Approved by: re (gjb) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D6928	2016-07-05 00:53:01 +00:00
Bjoern A. Zeeb	9ac51e7911	In case of the global eventhandler make sure the current VNET is still operational before doing any work; otherwise we might run into, e.g., destroyed locks. PR: 210724 Reported by: olevole olevole.ru Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Obtained from: projects/vnet Approved by: re (gjb)	2016-06-30 19:32:45 +00:00
Bjoern A. Zeeb	31fe4e62fa	Move the ipfw_log_bpf() calls from global module initialisation to per-VNET initialisation and virtualise the interface cloning to allow a dedicated ipfw log interface per VNET. Approved by: re (gjb) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2016-06-30 01:33:14 +00:00
Bjoern A. Zeeb	a8fc1b786d	The void isn't void. Unbreak sparc64 and powerpc builds. Approved by: re (gjb) Sponsored by: The FreeBSD Foundation MFC after: 12 days	2016-06-24 11:53:12 +00:00
Bjoern A. Zeeb	66c00e9efb	Properly virtualize pfsync for bringup after pf is initialized and teardown of VNETs once pf(4) has been shut down. Properly split resources into VNET_SYS(UN)INITs and one time module loading. While here cover the INET parts in the uninit callpath with proper #ifdefs. Approved by: re (gjb) Obtained from: projects/vnet MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2016-06-23 22:31:44 +00:00
Bjoern A. Zeeb	7d7751a071	Make sure pflog is attached after pf is initialized so we can borrow pf's lock, and also make sure pflog goes after pf is gone in order to avoid callouts in VNETs to an already freed instance. Reported by: Ivan Klymenko, Johan Hendriks on current@ today Obtained from: projects/vnet Sponsored by: The FreeBSD Foundation MFC after: 13 days Approved by: re (gjb)	2016-06-23 22:31:10 +00:00
Bjoern A. Zeeb	a8e8c57443	PFSTATE_NOSYNC goes onto state_flags, not sync_state; this prevents: panic: pfsync_delete_state: unexpected sync state 8 Reviewed by: kp Approved by: re (gjb) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6942	2016-06-23 21:42:43 +00:00
Bjoern A. Zeeb	a0429b5459	Update pf(4) and pflog(4) to survive basic VNET testing, which includes proper virtualisation, teardown, avoiding use-after-free, race conditions, no longer creating a thread per VNET (which could easily be a couple of thousand threads), gracefully ignoring global events (e.g., eventhandlers) on teardown, clearing various globally cached pointers and checking them before use. Reviewed by: kp Approved by: re (gjb) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6924	2016-06-23 21:34:38 +00:00
Bjoern A. Zeeb	8147948e19	Import a fix for an old security issue (CVE-2010-3830) in pf which was not relevant to FreeBSD as only root could open /dev/pf by default. With VIMAGE this will no longer be the case. As pf(4) starts to be supported with VNETs 3rd party users may open /dev/pf inside the virtual jail instance; thus we need to address this issue after all. While OpenBSD largely rewrote code parts for the fix [1], and it's unclear what Apple [3] did, import the minimal fix from NetBSD [2]. [1] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_ioctl.c.diff?r1=1.235&r2=1.236 [2] http://mail-index.netbsd.org/source-changes/2011/01/19/msg017518.html [3] https://support.apple.com/en-gb/HT202154 Obtained from: http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dist/pf/net/pf_ioctl.c.diff?r1=1.42&r2=1.43&only_with_tag=MAIN MFC After: 2 weeks Approved by: re (gjb) Sponsored by: The FreeBSD Foundation Security: CVE-2010-3830	2016-06-23 05:41:46 +00:00
Bjoern A. Zeeb	89856f7e2d	Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather large and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along with (or before) the higher layer protocol when it is shut down. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enhance helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747	2016-06-21 13:48:49 +00:00
Kristof Provost	3e248e0fb4	pf: Filter on and set vlan PCP values Adopt the OpenBSD syntax for setting and filtering on VLAN PCP values. This introduces two new keywords: 'set prio' to set the PCP value, and 'prio' to filter on it. Reviewed by: allanjude, araujo Approved by: re (gjb) Obtained from: OpenBSD (mostly) Differential Revision: https://reviews.freebsd.org/D6786	2016-06-17 18:21:55 +00:00
Alexander V. Chernikov	37aefa2ad1	Fix 4-byte overflow in ipv6_writemask. This bug could cause some IPv6 table prefix delete requests to fail. Obtained from: Yandex LLC	2016-06-05 10:33:53 +00:00
Don Lewis	d673654796	Replace constant expressions that contain multiplications by fractional floating point values with integer divides. This will eliminate any chance that the compiler will generate code to evaluate the expression using floating point at runtime. Suggested by: bde Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au> MFC after: 8 days (with r300779 and r300949)	2016-06-01 20:04:24 +00:00
Don Lewis	fe4b5f6659	Cast some expressions that multiply a long long constant by a floating point constant to int64_t. This avoids the runtime conversion of the other operand in a set of comparisons from int64_t to floating point and doing the comparisons in floating point. Suggested by: lidl Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au> MFC after: 2 weeks (with r300779)	2016-05-29 07:23:56 +00:00
Don Lewis	248c72bfb8	Correct a typo in a comment. MFC after: 2 weeks (with r300779)	2016-05-26 22:03:28 +00:00
Don Lewis	4e59799e1b	Modify BOUND_VAR() macro to wrap all of its arguments in () and tweak its expression to work on powerpc and sparc64 (gcc compatibility). Correct a typo in a nearby comment. MFC after: 2 weeks (with r300779)	2016-05-26 21:44:52 +00:00
Don Lewis	91336b403a	Import Dummynet AQM version 0.2.1 (CoDel, FQ-CoDel, PIE and FQ-PIE). Centre for Advanced Internet Architectures Implementing AQM in FreeBSD * Overview <http://caia.swin.edu.au/freebsd/aqm/index.html> * Articles, Papers and Presentations <http://caia.swin.edu.au/freebsd/aqm/papers.html> * Patches and Tools <http://caia.swin.edu.au/freebsd/aqm/downloads.html> Overview Recent years have seen a resurgence of interest in better managing the depth of bottleneck queues in routers, switches and other places that get congested. Solutions include transport protocol enhancements at the end-hosts (such as delay-based or hybrid congestion control schemes) and active queue management (AQM) schemes applied within bottleneck queues. The notion of AQM has been around since at least the late 1990s (e.g. RFC 2309). In recent years the proliferation of oversized buffers in all sorts of network devices (aka bufferbloat) has stimulated keen community interest in four new AQM schemes -- CoDel, FQ-CoDel, PIE and FQ-PIE. The IETF AQM working group is looking to document these schemes, and independent implementations are a corner-stone of the IETF's process for confirming the clarity of publicly available protocol descriptions. While significant development work on all three schemes has occurred in the Linux kernel, there is very little in FreeBSD. Project Goals This project began in late 2015, and aims to design and implement functionally-correct versions of CoDel, FQ-CoDel, PIE and FQ-PIE in FreeBSD (with code BSD-licensed as much as practical). We have chosen to do this as extensions to FreeBSD's ipfw/dummynet firewall and traffic shaper. Implementation of these AQM schemes in FreeBSD will: * Demonstrate whether the publicly available documentation is sufficient to enable independent, functionally equivalent implementations * Provide a broader suite of AQM options for sections of the networking community that rely on FreeBSD platforms Program Members: * Rasool Al Saadi (developer) * Grenville Armitage (project lead) Acknowledgements: This project has been made possible in part by a gift from the Comcast Innovation Fund. Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au> X-No objection: core MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6388	2016-05-26 21:40:13 +00:00
Kristof Provost	b599e8dc59	pf: Fix more ICMP mistranslation In the default case fix the substitution of the destination address. PR: 201519 Submitted by: Max <maximos@als.nnov.ru> MFC after: 1 week	2016-05-23 13:59:48 +00:00
Kristof Provost	c0c82715b8	pf: Fix ICMP translation Fix ICMP source address rewriting in rdr scenarios. PR: 201519 Submitted by: Max <maximos@als.nnov.ru> MFC after: 1 week	2016-05-23 12:41:29 +00:00
Kristof Provost	d9f4fce5a7	pf: Fix fragment timeout We were inconsistent about the use of time_second vs. time_uptime. Always use time_uptime so the value can be meaningfully compared. Submitted by: "Max" <maximos@als.nnov.ru> MFC after: 4 days	2016-05-20 15:41:05 +00:00
Andrey V. Elsukov	d16f495cad	Fix the regression introduced in r300143. When we are creating new dynamic state use MATCH_FORWARD direction to correctly initialize protocol's state.	2016-05-20 15:00:12 +00:00
Andrey V. Elsukov	96e84c57e1	Move protocol state handling code from lookup_dyn_rule_locked() function into dyn_update_proto_state(). This allows eliminating the second state lookup in the ipfw_install_state(). Also remove MATCH_* macros, they are defined in ip_fw_private.h as enum. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-05-18 12:53:21 +00:00
Andrey V. Elsukov	2685841b38	Make named objects set-aware. Now it is possible to create named objects with the same name in different sets. Add optional manage_sets() callback to objects rewriting framework. It is intended to implement handler for moving and swapping named object's sets. Add ipfw_obj_manage_sets() function that implements generic sets handler. Use new callback to implement sets support for lookup tables. External actions objects are global and they don't support sets. Modify eaction_findbyname() to reflect this. ipfw(8) now may fail to move rules or sets, because some named objects in target set may have conflicting names. Note that ipfw_obj_ntlv type was changed, but since lookup tables actually didn't support sets, this change is harmless. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-05-17 07:47:23 +00:00
Andrey V. Elsukov	9f2e5ed3cc	Fix memory leak possible in error case. Use free_rule() instead of free(), it will also release memory allocated for rule counters. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-05-11 10:04:32 +00:00
Andrey V. Elsukov	b309f085e0	Change the type of objhash_cb_t callback function to be able to return an error code. Use it to interrupt the loop in ipfw_objhash_foreach(). Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-05-06 03:18:51 +00:00
Andrey V. Elsukov	2df1a11ffa	Rename find_name_tlv_type() to ipfw_find_name_tlv_type() and make it global. Use it in ip_fw_table.c instead of find_name_tlv() to reduce duplicated code. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-05-05 20:15:46 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Andrey V. Elsukov	9a5be809ab	Make create_object callback optional and return EOPNOTSUPP when it isn't defined. Remove eaction_create_compat() and use designated initializers to initialize eaction_opcodes structure. Obtained from: Yandex LLC	2016-04-27 15:28:25 +00:00
Pedro F. Giffuni	7a6ab8f19e	netpfil: for pointers replace 0 with NULL. These are mostly cosmetic, no functional change. Found with devel/coccinelle. Reviewed by: ae	2016-04-15 12:24:01 +00:00
Andrey V. Elsukov	2acdf79f53	Add External Actions KPI to ipfw(9). It allows implementing loadable kernel modules with new actions and without needing to modify kernel headers and ipfw(8). The module registers its action handler and keyword string, that will be used as action name. Using generic syntax user can add rules with this action. Also ipfw(8) can be easily modified to extend basic syntax for external actions, that become a part of the base system. Sample modules will be coming soon. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-04-14 22:51:23 +00:00
Andrey V. Elsukov	4bd916567e	Change the type of 'etlv' field in struct named_object to uint16_t. It should match with the type field in struct ipfw_obj_tlv. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-04-14 21:52:31 +00:00
Andrey V. Elsukov	f8e26ca319	Adjust some comments and make ref_opcode_object() static.	2016-04-14 21:45:18 +00:00
Andrey V. Elsukov	b2df1f7ea1	o Teach opcode rewriting framework to handle several rewriters for the same opcode. o Reduce number of times classifier callback is called. It is redundant to call it just after find_op_rw(), since the latter does call it already and can have all results. o Do immediately opcode rewrite in the ref_opcode_object(). This eliminates additional classifier lookup later on bulk update. For unresolved opcodes the behavior is still the same, we save information from classifier callback in the obj_idx array, then perform automatic objects creation, then perform rewriting for opcodes using indices from created objects. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-04-14 21:31:16 +00:00
Andrey V. Elsukov	f976a4edc0	Move several functions related to opcode rewriting framework from ip_fw_table.c into ip_fw_sockopt.c and make them static. Obtained from: Yandex LLC	2016-04-14 20:49:27 +00:00
Pedro F. Giffuni	74b8d63dcc	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
Kristof Provost	0d8c93313e	pf: Improve forwarding detection When we guess the nature of the outbound packet (output vs. forwarding) we need to take bridges into account. When bridging the input interface does not match the output interface, but we're not forwarding. Similarly, it's possible for the interface to actually be the bridge interface itself (and not a member interface). PR: 202351 MFC after: 2 weeks	2016-03-16 06:42:15 +00:00
Andrey V. Elsukov	657592fd65	Use correct size for malloc. Obtained from: Yandex LLC MFC after: 1 week	2016-03-03 13:07:59 +00:00
John Baldwin	cbc4d2db75	Remove taskqueue_enqueue_fast(). taskqueue_enqueue() was changed to support both fast and non-fast taskqueues 10 years ago in r154167. It has been a compat shim ever since. It's time for the compat shim to go. Submitted by: Howard Su <howard0su@gmail.com> Reviewed by: sephe Differential Revision: https://reviews.freebsd.org/D5131	2016-03-01 17:47:32 +00:00
Kristof Provost	14b5e85b18	pf: Fix possible out-of-bounds write In the DIOCRSETADDRS ioctl() handler we allocate a table for struct pfr_addrs, which is processed in pfr_set_addrs(). At the user's request we also provide feedback on the deleted addresses, by storing them after the new list ('bcopy(&ad, addr + size + i, sizeof(ad));' in pfr_set_addrs()). This means we write outside the bounds of the buffer we've just allocated. We need to look at pfrio_size2 instead (i.e. the size the user reserved for our feedback). That'd allow a malicious user to specify a smaller pfrio_size2 than pfrio_size though, in which case we'd still read outside of the allocated buffer. Instead we allocate the largest of the two values. Reported By: Paul J Murphy <paul@inetstat.net> PR: 207463 MFC after: 5 days Differential Revision: https://reviews.freebsd.org/D5426	2016-02-25 07:33:59 +00:00
Andrey V. Elsukov	23a6c7330c	Fix bug in filling and handling ipfw's O_DSCP opcode. Due to integer overflow CS4 token was handled as BE. PR: 207459 MFC after: 1 week	2016-02-24 13:16:03 +00:00
Kristof Provost	c90369f880	in pf_print_state_parts, do not use skw->proto to print the protocol but our local copy proto that we very carefully set beforehand. skw being NULL is perfectly valid there. Obtained from: OpenBSD (henning)	2016-02-20 12:53:53 +00:00
Gleb Smirnoff	cd82d21b2e	Fix obvious typo, that led to incorrect sorting. Found by: PVS-Studio	2016-02-18 19:05:30 +00:00
Gleb Smirnoff	8ec07310fa	These files were getting sys/malloc.h and vm/uma.h with header pollution via sys/mbuf.h	2016-02-01 17:41:21 +00:00
Luigi Rizzo	1cdc5f0b87	cleanup and document in some detail the internals of the testing code for dummynet schedulers	2016-01-27 02:22:31 +00:00
Luigi Rizzo	ff8d60ab4d	the _Static_assert was not supposed to be in the commit.	2016-01-27 02:14:08 +00:00
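
The return-value mapping described in commit aa7cac58c6 ("pf: Map hook returns onto the correct error values") can be sketched roughly as follows. This is an illustration, not the code from that commit: the enum and the helper name `pf_verdict_to_errno` are hypothetical stand-ins, and only the rule itself (a pass maps to 0, anything else to EACCES) comes from the commit message.

```c
#include <errno.h>

/*
 * Illustrative verdicts. The names PF_PASS and PF_DROP appear in the
 * commit message; the numeric values here are an assumption.
 */
enum pf_verdict {
	PF_PASS = 0,
	PF_DROP = 1
};

/*
 * Hypothetical helper: translate a pf verdict into the errno-style value
 * a hook caller expects. 0 means the packet may continue; EACCES means
 * pf consumed it and no ICMP error should be emitted.
 */
static int
pf_verdict_to_errno(int verdict)
{
	return (verdict == PF_PASS ? 0 : EACCES);
}
```

Collapsing every non-pass verdict to a single error keeps the caller's contract simple: it only needs to distinguish "continue" from "packet is gone".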

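The sizing rule from commit 14b5e85b18 ("pf: Fix possible out-of-bounds write") is to size the shared buffer by the larger of the two user-supplied counts. A minimal userland sketch of that idea, assuming a simplified entry type, might look like the block below. `struct addr_entry`, `table_entries` and `alloc_addr_table` are hypothetical names for illustration; the kernel code uses its own types and allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pfr_addr; the real layout differs. */
struct addr_entry {
	unsigned char bytes[16];
};

/*
 * Number of entries the shared buffer must hold: the larger of the
 * input count (size) and the feedback count (size2).
 */
static size_t
table_entries(size_t size, size_t size2)
{
	return (size > size2 ? size : size2);
}

/*
 * Allocate a zeroed table large enough for both uses. Returns NULL for
 * an empty request, on multiplication overflow, or if calloc() fails.
 */
static struct addr_entry *
alloc_addr_table(size_t size, size_t size2)
{
	size_t count = table_entries(size, size2);

	if (count == 0 || count > SIZE_MAX / sizeof(struct addr_entry))
		return (NULL);
	return (calloc(count, sizeof(struct addr_entry)));
}
```

Sizing by the maximum means neither the read of the input list nor the write of the feedback list can run past the end of the allocation, whichever count the caller made smaller.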
430 Commits
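
Commit 23a6c7330c reports that, due to an integer overflow, the CS4 token in the O_DSCP opcode was handled as BE. The log does not show the faulty expression, so the following is only a sketch of the general pitfall, assuming the 64 possible DSCP codepoints are kept as a bitmap split across two 32-bit words. CS4 is codepoint 32 and BE is codepoint 0, so an index computation that lets 32 fall into the low word can alias the two. `struct dscp_set` and its helpers are hypothetical names.

```c
#include <stdint.h>

#define DSCP_BE		0u	/* best effort, codepoint 000000 */
#define DSCP_CS4	32u	/* class selector 4, codepoint 100000 */

/* Assumed layout: 64 codepoints stored as two 32-bit bitmap words. */
struct dscp_set {
	uint32_t word[2];
};

/* Mark a codepoint as present; codes 32..63 land in word[1]. */
static void
dscp_set_add(struct dscp_set *s, unsigned int code)
{
	code &= 63u;	/* DSCP is a 6-bit field */
	s->word[code / 32] |= UINT32_C(1) << (code % 32);
}

/* Return nonzero if the codepoint is present in the set. */
static int
dscp_set_has(const struct dscp_set *s, unsigned int code)
{
	code &= 63u;
	return ((s->word[code / 32] >> (code % 32)) & 1u);
}
```

Deriving both the word index and the bit position from the same code value keeps the shift count below 32, so the boundary value cannot wrap onto bit 0 of the low word.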