Commit Graph

387 Commits

Author SHA1 Message Date
pfg
b756da1446 netpfil: for pointers replace 0 with NULL.
These are mostly cosmetical, no functional change.

Found with devel/coccinelle.

Reviewed by:	ae
2016-04-15 12:24:01 +00:00
ae
4d9b1f8309 Add External Actions KPI to ipfw(9).
It allows implementing loadable kernel modules with new actions and
without needing to modify kernel headers and ipfw(8). The module
registers its action handler and keyword string, that will be used
as action name. Using generic syntax user can add rules with this
action. Also ipfw(8) can be easily modified to extend basic syntax
for external actions, that become a part base system.
Sample modules will coming soon.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-04-14 22:51:23 +00:00
ae
c1f7aad42e Change the type of 'etlv' field in struct named_object to uint16_t.
It should match with the type field in struct ipfw_obj_tlv.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-04-14 21:52:31 +00:00
ae
3b29c6c182 Adjust some comments and make ref_opcode_object() static. 2016-04-14 21:45:18 +00:00
ae
9b7eaeb151 o Teach opcode rewriting framework handle several rewriters for
the same opcode.

o Reduce number of times classifier callback is called. It is
  redundant to call it just after find_op_rw(), since the last
  does call it already and can have all results.

o Do immediately opcode rewrite in the ref_opcode_object().
  This eliminates additional classifier lookup later on bulk update.
  For unresolved opcodes the behavior still the same, we save information
  from classifier callback in the obj_idx array, then perform automatic
  objects creation, then perform rewriting for opcodes using indeces
  from created objects.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-04-14 21:31:16 +00:00
ae
c259f15148 Move several functions related to opcode rewriting framework from
ip_fw_table.c into ip_fw_sockopt.c and make them static.

Obtained from:	Yandex LLC
2016-04-14 20:49:27 +00:00
pfg
b63211eed5 Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
kp
f0149ff4ac pf: Improve forwarding detection
When we guess the nature of the outbound packet (output vs. forwarding) we need
to take bridges into account. When bridging the input interface does not match
the output interface, but we're not forwarding. Similarly, it's possible for the
interface to actually be the bridge interface itself (and not a member interface).

PR:		202351
MFC after:	2 weeks
2016-03-16 06:42:15 +00:00
ae
9b6c24e7cf Use correct size for malloc.
Obtained from:	Yandex LLC
MFC after:	1 week
2016-03-03 13:07:59 +00:00
jhb
15b2caff0f Remove taskqueue_enqueue_fast().
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167.  It has been a compat shim ever
since.  It's time for the compat shim to go.

Submitted by:	Howard Su <howard0su@gmail.com>
Reviewed by:	sephe
Differential Revision:	https://reviews.freebsd.org/D5131
2016-03-01 17:47:32 +00:00
kp
462a1089a3 pf: Fix possible out-of-bounds write
In the DIOCRSETADDRS ioctl() handler we allocate a table for struct pfr_addrs,
which is processed in pfr_set_addrs(). At the users request we also provide
feedback on the deleted addresses, by storing them after the new list
('bcopy(&ad, addr + size + i, sizeof(ad));' in pfr_set_addrs()).

This means we write outside the bounds of the buffer we've just allocated.
We need to look at pfrio_size2 instead (i.e. the size the user reserved for our
feedback). That'd allow a malicious user to specify a smaller pfrio_size2 than
pfrio_size though, in which case we'd still read outside of the allocated
buffer. Instead we allocate the largest of the two values.

Reported By:	Paul J Murphy <paul@inetstat.net>
PR:		207463
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D5426
2016-02-25 07:33:59 +00:00
ae
fbff7925a1 Fix bug in filling and handling ipfw's O_DSCP opcode.
Due to integer overflow CS4 token was handled as BE.

PR:		207459
MFC after:	1 week
2016-02-24 13:16:03 +00:00
kp
e846a24ec7 in pf_print_state_parts, do not use skw->proto to print the protocol but our
local copy proto that we very carefully set beforehands. skw being NULL is
perfectly valid there.

Obtained from:	OpenBSD (henning)
2016-02-20 12:53:53 +00:00
glebius
f75e66a4bd Fix obvious typo, that lead to incorrect sorting.
Found by:	PVS-Studio
2016-02-18 19:05:30 +00:00
glebius
306a6faf84 These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
luigi
d2ca0a0782 cleanup and document in some detail the internals of the testing code
for dummynet schedulers
2016-01-27 02:22:31 +00:00
luigi
94957acd2f the _Static_assert was not supposed to be in the commit. 2016-01-27 02:14:08 +00:00
luigi
7860e04138 bugfix: the scheduler template (dn_schk) for the round robin scheduler
is followed by another structure (rr_schk) whose size must be set
in the schk_datalen field of the descriptor.
Not allocating the memory may cause other memory to be overwritten
(though dn_schk is 192 bytes and rr_schk only 12 so we may be lucky
and end up in the padding after the dn_schk).

This is a merge candidate for stable and 10.3

MFC after:	3 days
2016-01-27 02:08:30 +00:00
luigi
156e03004c fix various warnings to compile the test code with -Wextra 2016-01-26 23:37:07 +00:00
luigi
5e7533603b fix various warnings (signed/unsigned, printf types, unused arguments) 2016-01-26 23:36:18 +00:00
luigi
17af12dfee prevent warnings for signed/unsigned comparisons and unused arguments.
Add checks for parameters overflowing 32 bit.
2016-01-26 22:46:58 +00:00
luigi
3043ee5984 prevent warning for unused argument 2016-01-26 22:45:45 +00:00
luigi
c2c3ad006a avoid warnings for signed/unsigned comparison and unused arguments 2016-01-26 22:45:05 +00:00
luigi
4b163c84bb Revert one chunk from commit 285362, which introduced an off-by-one error
in computing a shift index. The error was due to the use of mixed
fls() / __fls() functions in another implementation of qfq.
To avoid that the problem occurs again, properly document which
incarnation of the function we need.
Note that the bug only affects QFQ in FreeBSD head from last july, as
the patch was not merged to other versions.
2016-01-26 04:48:24 +00:00
melifaro
23582454c7 MFP r287070,r287073: split radix implementation and route table structure.
There are number of radix consumers in kernel land (pf,ipfw,nfs,route)
  with different requirements. In fact, first 3 don't have _any_ requirements
  and first 2 does not use radix locking. On the other hand, routing
  structure do have these requirements (rnh_gen, multipath, custom
  to-be-added control plane functions, different locking).
Additionally, radix should not known anything about its consumers internals.

So, radix code now uses tiny 'struct radix_head' structure along with
  internal 'struct radix_mask_head' instead of 'struct radix_node_head'.
  Existing consumers still uses the same 'struct radix_node_head' with
  slight modifications: they need to pass pointer to (embedded)
  'struct radix_head' to all radix callbacks.

Routing code now uses new 'struct rib_head' with different locking macro:
  RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing
  information base).

New net/route_var.h header was added to hold routing subsystem internal
  data. 'struct rib_head' was placed there. 'struct rtentry' will also
  be moved there soon.
2016-01-25 06:33:15 +00:00
melifaro
acae2332ed Fix panic on table/table entry delete. The panic could have happened
if more than 64 distinct values had been used.

Table value code uses internal objhash API which requires unique key
  for each object. For value code, pointer to the actual value data
  is used. The actual problem arises from the fact that 'actual' e.g.
  runtime data is stored in array and that array is auto-growing. There is
  special hook (update_tvalue() function) which is used to update the pointers
  after the change. For some reason, object 'key' was not updated.
  Fix this by adding update code to the update_tvalue().

Sponsored by:	Yandex LLC
2016-01-21 18:20:40 +00:00
melifaro
3f95da300d Initialize error value ta_lookup_kfib() by default to please compiler. 2016-01-10 08:37:00 +00:00
bz
6e757e4210 Initialize error after r293626 in case neither INET nor INET6 is
compiled into the kernel.  Ideally lots more code would just not
be called (or compiled in) in that case but that requires a lot
more surgery.  For now try to make IP-less kernels compile again.
2016-01-10 08:14:25 +00:00
melifaro
3572a9608b Make ipfw addr:kfib lookup algo use new routing KPI. 2016-01-10 06:43:43 +00:00
melifaro
155575feb3 Use already pre-calculated number of entries instead of tc->count. 2016-01-10 00:28:44 +00:00
melifaro
14cf7637d1 Remove sys/eventhandler.h from net/route.h
Reviewed by:	ae
2016-01-09 09:34:39 +00:00
melifaro
76a48b4688 Convert pf(4) to the new routing API.
Differential Revision:	https://reviews.freebsd.org/D4763
2016-01-07 10:20:03 +00:00
hselasky
918ba30df9 Properly drain callouts in the IPFW subsystem to avoid use after free
panics when unloading the dummynet and IPFW modules:

- The callout drain function can sleep and should not be called having
a non-sleepable lock locked. Remove locks around "ipfw_dyn_uninit(0)".

- Add a new "dn_gone" variable to prevent asynchronous restart of
dummynet callouts when unloading the dummynet kernel module.

- Call "dn_reschedule()" locked so that "dn_gone" can be set and
checked atomically with regard to starting a new callout.

Reviewed by:	hiren
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D3855
2015-12-15 09:02:05 +00:00
melifaro
ca13483a3c Merge helper fib* functions used for basic lookups.
Vast majority of rtalloc(9) users require only basic info from
route table (e.g. "does the rtentry interface match with the interface
  I have?". "what is the MTU?", "Give me the IPv4 source address to use",
  etc..).
Instead of hand-rolling lookups, checking if rtentry is up, valid,
  dealing with IPv6 mtu, finding "address" ifp (almost never done right),
  provide easy-to-use API hiding all the complexity and returning the
  needed info into small on-stack structure.

This change also helps hiding route subsystem internals (locking, direct
  rtentry accesses).
Additionaly, using this API improves lookup performance since rtentry is not
  locked.
(This is safe, since all the rtentry changes happens under both radix WLOCK
  and rtentry WLOCK).

Sponsored by:	Yandex LLC
2015-12-08 10:50:03 +00:00
ae
35a3cc0379 Add destroy_object callback to object rewriting framework.
It is called when last reference to named object is going to be released
and allows to do additional cleanup for implementation of named objects.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-23 22:06:55 +00:00
bdrewery
39392ad0ea Fix dynamic IPv6 rules showing junk for non-specified address masks.
For example:
  00002      0         0 (19s) PARENT 1 tcp 10.10.0.5 0 <-> 0.0.0.0 0
  00002      4       412 (1s) LIMIT tcp 10.10.0.5 25848 <-> 10.10.0.7 22
  00002     10       777 (1s) LIMIT tcp 2001:894:5a24:653::503:1 52023 <-> 2001:894:5a24:653:ca0a:a9ff:fe04:3978 22
  00002      0         0 (17s) PARENT 1 tcp 2001:894:5a24:653::503:1 0 <-> 80f3:70d:23fe:ffff:1005:: 0

Fix this by zeroing the unused address, as is done for IPv4:
  00002     0         0 (18s) PARENT 1 tcp 10.10.0.5 0 <-> 0.0.0.0 0
  00002    36     14952 (1s) LIMIT tcp 10.10.0.5 25848 <-> 10.10.0.7 22
  00002     0         0 (0s) PARENT 1 tcp 2001:894:5a24:653::503:1 0 <-> :: 0
  00002     4       345 (274s) LIMIT tcp 2001:894:5a24:653::503:1 34131 <-> 2001:470:1f11:262:ca0a:a9ff:fe04:3978 22

MFC after:	2 weeks
2015-11-17 20:42:08 +00:00
melifaro
2bf2184989 Bring back the ability of passing cached route via nd6_output_ifp(). 2015-11-15 16:02:22 +00:00
rrs
dc494194a2 This fixes several places where callout_stops return is examined. The
new return codes of -1 were mistakenly being considered "true". Callout_stop
now returns -1 to indicate the callout had either already completed or
was not running and 0 to indicate it could not be stopped.  Also update
the manual page to make it more consistent no non-zero in the callout_stop
or callout_reset descriptions.

MFC after:	1 Month with associated callout change.
2015-11-13 22:51:35 +00:00
melifaro
d94cce972e Print proper setfib values in ipfw log.
Submitted by:	Denis Schneider <v1ne2go at gmail>
2015-11-08 13:44:21 +00:00
melifaro
2eab7c29ca Fix setfib target.
Problem was introduced in r272840 when converting tablearg value to 0.

Submitted by:	Denis Schneider <v1ne2go at gmail>
2015-11-08 12:24:19 +00:00
kp
56bf96006a pf: Fix broken rule skip calculation
r289932 accidentally broke the rule skip calculation. The address family
argument to PF_ANEQ() is now important, and because it was set to 0 the macro
always evaluated to false.
This resulted in incorrect skip values, which in turn broke the rule
evaluations.
2015-11-07 23:51:42 +00:00
ae
90239f5468 Remove now obsolete KASSERT.
Actually, object classify callbacks can skip some opcodes, that could
be rewritten. We will deteremine real numbed of rewritten opcodes a bit
later in this function.

Reported by:	David H. Wolfskill <david at catwhisker dot org>
2015-11-03 22:23:09 +00:00
ae
52522b4db0 Eliminate any conditional increments of object_opcodes in the
check_ipfw_rule_body() function. This function is intended to just
determine that rule has some opcodes that can be rewrited. Then the
ref_rule_objects() function will determine real number of rewritten
opcodes using classify callback.

Reviewed by:	melifaro
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-03 10:34:26 +00:00
ae
f4da06a164 Add ipfw_check_object_name_generic() function to do basic checks for an
object name correctness. Each type of object can do more strict checking
in own implementation. Do such checks for tables in check_table_name().

Reviewed by:	melifaro
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-03 10:29:46 +00:00
ae
750b62ddbe Implement ipfw internal olist command to list named objects.
Reviewed by:	melifaro
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-03 10:21:53 +00:00
kp
0c3b0b48ba pf: Fix IPv6 checksums with route-to.
When using route-to (or reply-to) pf sends the packet directly to the output
interface. If that interface doesn't support checksum offloading the checksum
has to be calculated in software.
That was already done in the IPv4 case, but not for the IPv6 case. As a result
we'd emit packets with pseudo-header checksums (i.e. incorrect checksums).

This issue was exposed by the changes in r289316 when pf stopped performing full
checksum calculations for all packets.

Submitted by:	Luoqi Chen
MFC after:	1 week
2015-10-29 20:45:53 +00:00
melifaro
e98b0226ff Eliminate last rtalloc_ign() caller.
Differential Revision:	https://reviews.freebsd.org/D3927
2015-10-27 21:25:40 +00:00
kp
40bca2754d pf: Fix TSO issues
In certain configurations (mostly but not exclusively as a VM on Xen) pf
produced packets with an invalid TCP checksum.

The problem was that pf could only handle packets with a full checksum. The
FreeBSD IP stack produces TCP packets with a pseudo-header checksum (only
addresses, length and protocol).
Certain network interfaces expect to see the pseudo-header checksum, so they
end up producing packets with invalid checksums.

To fix this stop calculating the full checksum and teach pf to only update TCP
checksums if TSO is disabled or the change affects the pseudo-header checksum.

PR:		154428, 193579, 198868
Reviewed by:	sbruno
MFC after:	1 week
Relnotes:	yes
Sponsored by:	RootBSD
Differential Revision:	https://reviews.freebsd.org/D3779
2015-10-14 16:21:41 +00:00
melifaro
d7ce93106b Bump number of prefixes in O_IP_<SRC|DST> from 15 to 31 (max possible).
PR:		203459
Submitted by:	groos at xiplink.com
MFC after:	2 weeks
2015-10-03 05:42:25 +00:00
melifaro
493325342d Simplify the way of attaching IPv6 link-layer header.
Problem description:
How do we currently perform layer 2 resolution and header imposition:

For IPv4 we have the following chain:
  ip_output() -> (ether|atm|whatever)_output() -> arpresolve()

Lookup is done in proper place (link-layer output routine) and it is possible
  to provide cached lle data.

For IPv6 situation is more complex:
  ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() ->
    nd6_storelladdr()

We have ip6_ouput() which calls nd6_output() instead of link output routine.
nd6_output() does the following:
  * checks if lle exists, creates it if needed (similar to arpresolve())
  * performes lle state transitions (similar to arpresolve())
  * calls nd6_output_ifp() which pushes packets to link output routine along
    with running SeND/MAC hooks regardless of lle state
    (e.g. works as run-hooks placeholder).

After that, iface output routine like ether_output() calls nd6_storelladdr()
  which performs lle lookup once again.

As a result, we perform lookup twice for each outgoing packet for most types
  of interfaces. We also need to maintain runtime-checked table of 'nd6-free'
  interfaces (see nd6_need_cache()).

Fix this behavior by eliminating first ND lookup. To be more specific:
  * make all nd6_output() consumers use nd6_output_ifp() instead
  * rename nd6_output[_slow]() to nd6_resolve_[slow]()
  * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics,
    e.g. copy L2 address to buffer instead of pushing packet towards lower
    layers
  * Make all nd6_storelladdr() users use nd6_resolve()
  * eliminate nd6_storelladdr()

The resulting callchain is the following:
  ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve()

Error handling:
Currently sending packet to non-existing la results in ip6_<output|forward>
  -> nd6_output() -> nd6_output _lle() which returns 0.
In new scenario packet is propagated to <ether|whatever>_output() ->
  nd6_resolve() which will return EWOULDBLOCK, and that result
  will be converted to 0.

(And EWOULDBLOCK is actually used by IB/TOE code).

Sponsored by:		Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D1469
2015-09-16 14:26:28 +00:00