Commit Graph

416 Commits

Author SHA1 Message Date
kp
c19daeaf73 pf: Map hook returns onto the correct error values
pf returns PF_PASS, PF_DROP, ... in the netpfil hooks, but the hook callers
expect to get E<foo> error codes.
Map the returns values. A pass is 0 (everything is OK), anything else means
pf ate the packet, so return EACCES, which tells the stack not to emit an ICMP
error message.

PR:	207598
2016-07-09 12:17:01 +00:00
truckman
9e6e43be95 Fix a race condition between the main thread in aqm_pie_cleanup() and the
callout thread that can cause a kernel panic.  Always do the final cleanup
in the callout thread by passing a separate callout function for that task
to callout_reset_sbt().

Protect the ref_count decrement in the callout with DN_BH_WLOCK().  All
other ref_count manipulation is protected with this lock.

There is still a tiny window between ref_count reaching zero and the end
of the callout function where it is unsafe to unload the module.  Fixing
this would require the use of callout_drain(), but this can't be done
because dummynet holds a mutex and callout_drain() might sleep.

Remove the callout_pending(), callout_active(), and callout_deactivate()
calls from calculate_drop_prob().  They are not needed because this callout
uses callout_init_mtx().

Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
Approved by:	re (gjb)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D6928
2016-07-05 00:53:01 +00:00
bz
2af4ea8fc2 In case of the global eventhandler make sure the current VNET
is still operational before doing any work;  otherwise we might
run into, e.g., destroyed locks.

PR:		210724
Reported by:	olevole olevole.ru
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Obtained from:	projects/vnet
Approved by:	re (gjb)
2016-06-30 19:32:45 +00:00
bz
c79242bce1 Move the ipfw_log_bpf() calls from global module initialisation to
per-VNET initialisation and virtualise the interface cloning to
allow a dedicated ipfw log interface per VNET.

Approved by:		re (gjb)
MFC after:		2 weeks
Sponsored by:		The FreeBSD Foundation
2016-06-30 01:33:14 +00:00
bz
57ab70285a The void isn't void.
Unbreak sparc64 and powerpc builds.

Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
2016-06-24 11:53:12 +00:00
bz
f3a7d6a3f1 Proerply virtualize pfsync for bringup after pf is initialized and
teardown of VNETs once pf(4) has been shut down.
Properly split resources into VNET_SYS(UN)INITs and one time module
loading.
While here cover the INET parts in the uninit callpath with proper
#ifdefs.

Approved by:	re (gjb)
Obtained from:  projects/vnet
MFC after:      2 weeks
Sponsored by:   The FreeBSD Foundation
2016-06-23 22:31:44 +00:00
bz
05bdc5790f Make sure pflog is attached after pf is initializaed so we can
borrow pf's lock, and also make sure pflog goes after pf is gone
in order to avoid callouts in VNETs to an already freed instance.

Reported by:    Ivan Klymenko, Johan Hendriks  on current@ today
Obtained from:  projects/vnet
Sponsored by:   The FreeBSD Foundation
MFC after:      13 days
Approved by:	re (gjb)
2016-06-23 22:31:10 +00:00
bz
76dd3320a8 PFSTATE_NOSYNC goes onto state_flags, not sync_state;
this prevents: panic: pfsync_delete_state: unexpected sync state 8

Reviewed by:		kp
Approved by:		re (gjb)
MFC after:		2 weeks
Sponsored by:		The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D6942
2016-06-23 21:42:43 +00:00
bz
876cb9e018 Update pf(4) and pflog(4) to survive basic VNET testing, which includes
proper virtualisation, teardown, avoiding use-after-free, race conditions,
no longer creating a thread per VNET (which could easily be a couple of
thousand threads), gracefully ignoring global events (e.g., eventhandlers)
on teardown, clearing various globally cached pointers and checking
them before use.

Reviewed by:		kp
Approved by:		re (gjb)
Sponsored by:		The FreeBSD Foundation
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D6924
2016-06-23 21:34:38 +00:00
bz
db5b889e7a Import a fix for and old security issue (CVE-2010-3830) in pf which
was not relevant to FreeBSD as only root could open /dev/pf by default.
With VIMAGE this is will longer be the case.  As pf(4) starts to
be supported with VNETs 3rd party users may open /dev/pf inside the
virtual jail instance; thus we need to address this issue after all.
While OpenBSD largely rewrote code parts for the fix [1], and it's
unclear what Apple [3] did, import the minimal fix from NetBSD [2].

[1] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_ioctl.c.diff?r1=1.235&r2=1.236
[2] http://mail-index.netbsd.org/source-changes/2011/01/19/msg017518.html
[3] https://support.apple.com/en-gb/HT202154

Obtained from:		http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dist/pf/net/pf_ioctl.c.diff?r1=1.42&r2=1.43&only_with_tag=MAIN
MFC After:		2 weeks
Approved by:		re (gjb)
Sponsored by:		The FreeBSD Foundation
Security:		CVE-2010-3830
2016-06-23 05:41:46 +00:00
bz
7a1c0b1ad1 Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by:		re (hrs)
Obtained from:		projects/vnet
Reviewed by:		gnn, jhb
Sponsored by:		The FreeBSD Foundation
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D6747
2016-06-21 13:48:49 +00:00
kp
b06d3a64e7 pf: Filter on and set vlan PCP values
Adopt the OpenBSD syntax for setting and filtering on VLAN PCP values. This
introduces two new keywords: 'set prio' to set the PCP value, and 'prio' to
filter on it.

Reviewed by:    allanjude, araujo
Approved by:	re (gjb)
Obtained from:  OpenBSD (mostly)
Differential Revision:  https://reviews.freebsd.org/D6786
2016-06-17 18:21:55 +00:00
melifaro
8f8138fdd3 Fix 4-byte overflow in ipv6_writemask.
This bug could cause some IPv6 table prefix delete requests to fail.

Obtained from:	Yandex LLC
2016-06-05 10:33:53 +00:00
truckman
fc1de16102 Replace constant expressions that contain multiplications by
fractional floating point values with integer divides.  This will
eliminate any chance that the compiler will generate code to evaluate
the expression using floating point at runtime.

Suggested by:	bde
Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	8 days (with r300779 and r300949)
2016-06-01 20:04:24 +00:00
truckman
ba1b6b7d8c Cast some expressions that multiply a long long constant by a
floating point constant to int64_t.  This avoids the runtime
conversion of the the other operand in a set of comparisons from
int64_t to floating point and doing the comparisions in floating
point.

Suggested by:	lidl
Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	2 weeks (with r300779)
2016-05-29 07:23:56 +00:00
truckman
a44a63ca5b Correct a typo in a comment.
MFC after:	2 weeks (with r300779)
2016-05-26 22:03:28 +00:00
truckman
4c4391fff9 Modify BOUND_VAR() macro to wrap all of its arguments in () and tweak
its expression to work on powerpc and sparc64 (gcc compatibility).

Correct a typo in a nearby comment.

MFC after:	2 weeks (with r300779)
2016-05-26 21:44:52 +00:00
truckman
2a78edb668 Import Dummynet AQM version 0.2.1 (CoDel, FQ-CoDel, PIE and FQ-PIE).
Centre for Advanced Internet Architectures

Implementing AQM in FreeBSD

* Overview <http://caia.swin.edu.au/freebsd/aqm/index.html>

* Articles, Papers and Presentations
  <http://caia.swin.edu.au/freebsd/aqm/papers.html>

* Patches and Tools <http://caia.swin.edu.au/freebsd/aqm/downloads.html>

Overview

Recent years have seen a resurgence of interest in better managing
the depth of bottleneck queues in routers, switches and other places
that get congested. Solutions include transport protocol enhancements
at the end-hosts (such as delay-based or hybrid congestion control
schemes) and active queue management (AQM) schemes applied within
bottleneck queues.

The notion of AQM has been around since at least the late 1990s
(e.g. RFC 2309). In recent years the proliferation of oversized
buffers in all sorts of network devices (aka bufferbloat) has
stimulated keen community interest in four new AQM schemes -- CoDel,
FQ-CoDel, PIE and FQ-PIE.

The IETF AQM working group is looking to document these schemes,
and independent implementations are a corner-stone of the IETF's
process for confirming the clarity of publicly available protocol
descriptions. While significant development work on all three schemes
has occured in the Linux kernel, there is very little in FreeBSD.

Project Goals

This project began in late 2015, and aims to design and implement
functionally-correct versions of CoDel, FQ-CoDel, PIE and FQ_PIE
in FreeBSD (with code BSD-licensed as much as practical). We have
chosen to do this as extensions to FreeBSD's ipfw/dummynet firewall
and traffic shaper. Implementation of these AQM schemes in FreeBSD
will:
* Demonstrate whether the publicly available documentation is
  sufficient to enable independent, functionally equivalent implementations

* Provide a broader suite of AQM options for sections the networking
  community that rely on FreeBSD platforms

Program Members:

* Rasool Al Saadi (developer)

* Grenville Armitage (project lead)

Acknowledgements:

This project has been made possible in part by a gift from the
Comcast Innovation Fund.

Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
X-No objection:	core
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D6388
2016-05-26 21:40:13 +00:00
kp
7ddccc27cd pf: Fix more ICMP mistranslation
In the default case fix the substitution of the destination address.

PR:		201519
Submitted by:	Max <maximos@als.nnov.ru>
MFC after:	1 week
2016-05-23 13:59:48 +00:00
kp
e155a36ec0 pf: Fix ICMP translation
Fix ICMP source address rewriting in rdr scenarios.

PR:		201519
Submitted by:	Max <maximos@als.nnov.ru>
MFC after:	1 week
2016-05-23 12:41:29 +00:00
kp
e52f10d755 pf: Fix fragment timeout
We were inconsistent about the use of time_second vs. time_uptime.
Always use time_uptime so the value can be meaningfully compared.

Submitted by:	"Max" <maximos@als.nnov.ru>
MFC after:	4 days
2016-05-20 15:41:05 +00:00
ae
a1e52f7bdc Fix the regression introduced in r300143.
When we are creating new dynamic state use MATCH_FORWARD direction to
correctly initialize protocol's state.
2016-05-20 15:00:12 +00:00
ae
9d4244757d Move protocol state handling code from lookup_dyn_rule_locked() function
into dyn_update_proto_state(). This allows eliminate the second state
lookup in the ipfw_install_state().
Also remove MATCH_* macros, they are defined in ip_fw_private.h as enum.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-05-18 12:53:21 +00:00
ae
f79f8e9de8 Make named objects set-aware. Now it is possible to create named
objects with the same name in different sets.

Add optional manage_sets() callback to objects rewriting framework.
It is intended to implement handler for moving and swapping named
object's sets. Add ipfw_obj_manage_sets() function that implements
generic sets handler. Use new callback to implement sets support for
lookup tables.
External actions objects are global and they don't support sets.
Modify eaction_findbyname() to reflect this.
ipfw(8) now may fail to move rules or sets, because some named objects
in target set may have conflicting names.
Note that ipfw_obj_ntlv type was changed, but since lookup tables
actually didn't support sets, this change is harmless.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-05-17 07:47:23 +00:00
ae
ae12c4ca15 Fix memory leak possible in error case.
Use free_rule() instead of free(), it will also release memory allocated
for rule counters.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-05-11 10:04:32 +00:00
ae
eb4486dee8 Change the type of objhash_cb_t callback function to be able return an
error code. Use it to interrupt the loop in ipfw_objhash_foreach().

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-05-06 03:18:51 +00:00
ae
59075b2688 Rename find_name_tlv_type() to ipfw_find_name_tlv_type() and make it
global. Use it in ip_fw_table.c instead of find_name_tlv() to reduce
duplicated code.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-05-05 20:15:46 +00:00
pfg
d9c9113377 sys/net*: minor spelling fixes.
No functional change.
2016-05-03 18:05:43 +00:00
ae
30f8ed2140 Make create_object callback optional and return EOPNOTSUPP when it isn't
defined. Remove eaction_create_compat() and use designated initializers to
initialize eaction_opcodes structure.

Obtained from:	Yandex LLC
2016-04-27 15:28:25 +00:00
pfg
b756da1446 netpfil: for pointers replace 0 with NULL.
These are mostly cosmetical, no functional change.

Found with devel/coccinelle.

Reviewed by:	ae
2016-04-15 12:24:01 +00:00
ae
4d9b1f8309 Add External Actions KPI to ipfw(9).
It allows implementing loadable kernel modules with new actions and
without needing to modify kernel headers and ipfw(8). The module
registers its action handler and keyword string, that will be used
as action name. Using generic syntax user can add rules with this
action. Also ipfw(8) can be easily modified to extend basic syntax
for external actions, that become a part base system.
Sample modules will coming soon.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-04-14 22:51:23 +00:00
ae
c1f7aad42e Change the type of 'etlv' field in struct named_object to uint16_t.
It should match with the type field in struct ipfw_obj_tlv.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-04-14 21:52:31 +00:00
ae
3b29c6c182 Adjust some comments and make ref_opcode_object() static. 2016-04-14 21:45:18 +00:00
ae
9b7eaeb151 o Teach opcode rewriting framework handle several rewriters for
the same opcode.

o Reduce number of times classifier callback is called. It is
  redundant to call it just after find_op_rw(), since the last
  does call it already and can have all results.

o Do immediately opcode rewrite in the ref_opcode_object().
  This eliminates additional classifier lookup later on bulk update.
  For unresolved opcodes the behavior still the same, we save information
  from classifier callback in the obj_idx array, then perform automatic
  objects creation, then perform rewriting for opcodes using indeces
  from created objects.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-04-14 21:31:16 +00:00
ae
c259f15148 Move several functions related to opcode rewriting framework from
ip_fw_table.c into ip_fw_sockopt.c and make them static.

Obtained from:	Yandex LLC
2016-04-14 20:49:27 +00:00
pfg
b63211eed5 Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
kp
f0149ff4ac pf: Improve forwarding detection
When we guess the nature of the outbound packet (output vs. forwarding) we need
to take bridges into account. When bridging the input interface does not match
the output interface, but we're not forwarding. Similarly, it's possible for the
interface to actually be the bridge interface itself (and not a member interface).

PR:		202351
MFC after:	2 weeks
2016-03-16 06:42:15 +00:00
ae
9b6c24e7cf Use correct size for malloc.
Obtained from:	Yandex LLC
MFC after:	1 week
2016-03-03 13:07:59 +00:00
jhb
15b2caff0f Remove taskqueue_enqueue_fast().
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167.  It has been a compat shim ever
since.  It's time for the compat shim to go.

Submitted by:	Howard Su <howard0su@gmail.com>
Reviewed by:	sephe
Differential Revision:	https://reviews.freebsd.org/D5131
2016-03-01 17:47:32 +00:00
kp
462a1089a3 pf: Fix possible out-of-bounds write
In the DIOCRSETADDRS ioctl() handler we allocate a table for struct pfr_addrs,
which is processed in pfr_set_addrs(). At the users request we also provide
feedback on the deleted addresses, by storing them after the new list
('bcopy(&ad, addr + size + i, sizeof(ad));' in pfr_set_addrs()).

This means we write outside the bounds of the buffer we've just allocated.
We need to look at pfrio_size2 instead (i.e. the size the user reserved for our
feedback). That'd allow a malicious user to specify a smaller pfrio_size2 than
pfrio_size though, in which case we'd still read outside of the allocated
buffer. Instead we allocate the largest of the two values.

Reported By:	Paul J Murphy <paul@inetstat.net>
PR:		207463
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D5426
2016-02-25 07:33:59 +00:00
ae
fbff7925a1 Fix bug in filling and handling ipfw's O_DSCP opcode.
Due to integer overflow CS4 token was handled as BE.

PR:		207459
MFC after:	1 week
2016-02-24 13:16:03 +00:00
kp
e846a24ec7 in pf_print_state_parts, do not use skw->proto to print the protocol but our
local copy proto that we very carefully set beforehands. skw being NULL is
perfectly valid there.

Obtained from:	OpenBSD (henning)
2016-02-20 12:53:53 +00:00
glebius
f75e66a4bd Fix obvious typo, that lead to incorrect sorting.
Found by:	PVS-Studio
2016-02-18 19:05:30 +00:00
glebius
306a6faf84 These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
luigi
d2ca0a0782 cleanup and document in some detail the internals of the testing code
for dummynet schedulers
2016-01-27 02:22:31 +00:00
luigi
94957acd2f the _Static_assert was not supposed to be in the commit. 2016-01-27 02:14:08 +00:00
luigi
7860e04138 bugfix: the scheduler template (dn_schk) for the round robin scheduler
is followed by another structure (rr_schk) whose size must be set
in the schk_datalen field of the descriptor.
Not allocating the memory may cause other memory to be overwritten
(though dn_schk is 192 bytes and rr_schk only 12 so we may be lucky
and end up in the padding after the dn_schk).

This is a merge candidate for stable and 10.3

MFC after:	3 days
2016-01-27 02:08:30 +00:00
luigi
156e03004c fix various warnings to compile the test code with -Wextra 2016-01-26 23:37:07 +00:00
luigi
5e7533603b fix various warnings (signed/unsigned, printf types, unused arguments) 2016-01-26 23:36:18 +00:00
luigi
17af12dfee prevent warnings for signed/unsigned comparisons and unused arguments.
Add checks for parameters overflowing 32 bit.
2016-01-26 22:46:58 +00:00