Commit Graph

509 Commits

Author SHA1 Message Date
Don Lewis
4001fcbe0a Fix Dummynet AQM packet marking function ecn_mark() and fq_codel /
fq_pie schedulers packet classification functions in layer2 (bridge mode).

Dummynet AQM packet marking function ecn_mark() and fq_codel/fq_pie
schedulers packet classification functions (fq_codel_classify_flow()
and fq_pie_classify_flow()) assume mbuf is pointing at L3 (IP)
packet. However, this assumption is incorrect if ipfw/dummynet is
used to manage layer2 traffic (bridge mode) since mbuf will point
at L2 frame.  This patch solves this problem by identifying the
source of the frame/packet (L2 or L3) and adding ETHER_HDR_LEN
offset when converting an mbuf pointer to ip pointer if the traffic
is from layer2.  More specifically, in dummynet packet tagging
function, tag_mbuf(), iphdr_off is set to ETHER_HDR_LEN if the
traffic is from layer2 and set to zero otherwise. Whenever an access
to IP header is required, mtodo(m, dn_tag_get(m)->iphdr_off) is
used instead of mtod(m, struct ip *) to correctly convert mbuf
pointer to ip pointer in both L2 and L3 traffic.

Submitted by:	lstewart
MFC after:	2 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D12506
2017-10-26 10:11:35 +00:00
Andrey V. Elsukov
5c70ebfa57 Add IPv6 support for O_TCPDATALEN opcode.
PR:		222746
MFC after:	1 week
2017-10-24 08:39:05 +00:00
Andrey V. Elsukov
ff0a137952 Fix regression in handling O_FORWARD_IP opcode after r279948.
To properly handle 'fwd tablearg,port' opcode, copy sin_port value from
sockaddr_in structure stored in the opcode into corresponding hopstore
field.

PR:		222953
MFC after:	1 week
2017-10-13 11:11:53 +00:00
Michael Tuexen
945906384d Fix a bug which avoided that rules for matching port numbers for SCTP
packets where actually matched.
While there, make clean in the man-page that SCTP port numbers are
supported in rules.

MFC after:	1 month
2017-10-02 18:25:30 +00:00
Andrey V. Elsukov
5df8171da3 Use in_localip() function instead of unlocked access to addresses hash
to determine that an address is our local.

PR:		220078
MFC after:	1 week
2017-09-20 22:35:28 +00:00
Andrey V. Elsukov
369bc48dc5 Do not acquire IPFW_WLOCK when a named object is created and destroyed.
Acquiring of IPFW_WLOCK is requried for cases when we are going to
change some data that can be accessed during processing of packets flow.
When we create new named object, there are not yet any rules, that
references it, thus holding IPFW_UH_WLOCK is enough to safely update
needed structures. When we destroy an object, we do this only when its
reference counter becomes zero. And it is safe to not acquire IPFW_WLOCK,
because noone references it. The another case is when we failed to finish
some action and thus we are doing rollback and destroying an object, in
this case it is still not referenced by rules and no need to acquire
IPFW_WLOCK.

This also fixes panic with INVARIANTS due to recursive IPFW_WLOCK acquiring.

MFC after:	1 week
Sponsored by:	Yandex LLC
2017-09-20 22:00:06 +00:00
Kristof Provost
7f3ad01804 pf_get_sport(): Prevent possible endless loop when searching for an unused nat port
This is an import of Alexander Bluhm's OpenBSD commit r1.60,
the first chunk had to be modified because on OpenBSD the
'cut' declaration is located elsewhere.

Upstream report by Jingmin Zhou:
https://marc.info/?l=openbsd-pf&m=150020133510896&w=2

OpenBSD commit message:
 Use a 32 bit variable to detect integer overflow when searching for
 an unused nat port.  Prevents a possible endless loop if high port
 is 65535 or low port is 0.
 report and analysis Jingmin Zhou; OK sashan@ visa@
Quoted from: https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_lb.c

PR:		221201
Submitted by:	Fabian Keil <fk@fabiankeil.de>
Obtained from:  OpenBSD via ElectroBSD
MFC after:	1 week
2017-08-08 21:09:26 +00:00
Luiz Otavio O Souza
9ffd0f54a7 Fix a couple of typos in a comment.
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-07-21 03:04:55 +00:00
Philip Paeps
b0e1660d53 Fix GRE over IPv6 tunnels with IPFW
Previously, GRE packets in IPv6 tunnels would be dropped by IPFW (unless
net.inet6.ip6.fw.deny_unknown_exthdrs was unset).

PR:		220640
Submitted by:	Kun Xie <kxie@xiplink.com>
MFC after:	1 week
2017-07-13 09:01:22 +00:00
Kristof Provost
b7ae43552b pf: Fix vnet purging
pf_purge_thread() breaks up the work of iterating all states (in
pf_purge_expired_states()) and tracks progress in the idx variable.

If multiple vnets exist this results in pf_purge_thread() only calling
pf_purge_expired_states() for part of the states (the first part of the
first vnet, second part of the second vnet and so on).
Combined with the mark-and-sweep approach to cleaning up old rules (in
V_pf_unlinked_rules) that resulted in pf freeing rules that were still
referenced by states. This in turn caused panics when pf_state_expires()
encounters that state and attempts to access the rule.

We need to track the progress per vnet, not globally, so idx is moved
into a per-vnet V_pf_purge_idx.

PR:		219251
Sponsored by:	Hackathon Essen 2017
2017-07-09 17:56:39 +00:00
Andrey V. Elsukov
785c0d4d97 Fix IPv6 extension header parsing. The length field doesn't include the
first 8 octets.

Obtained from:	Yandex LLC
MFC after:	3 days
2017-06-29 19:06:43 +00:00
Don Lewis
d196c9ee16 Fix the queue delay estimation in PIE/FQ-PIE when the timestamp
(TS) method is used.  When packet timestamp is used, the "current_qdelay"
keeps storing the last queue delay value calculated in the dequeue
function.  Therefore, when a burst of packets arrives followed by
a pause, the "current_qdelay" will store a high value caused by the
burst and stick to that value during the pause because the queue
delay measurement is done inside the dequeue function.  This causes
the drop probability calculation function to calculate high drop
probability value instead of zero and prevents the burst allowance
mechanism from working properly.  Fix this problem by resetting
"current_qdelay" inside the drop probability calculation function
when the queue length is zero and TS option is used.

Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	1 week
2017-05-19 08:38:03 +00:00
Don Lewis
36fb8be630 The result of right shifting a negative signed value is implementation
defined.  On machines without arithmetic shift instructions, zero bits
may be shifted in from the left, giving a large positive result instead
of the desired divide-by power-of-2.  Fix this by operating on the
absolute value and compensating for the possible negation later.

Reverse the order of the underflow/overflow tests and the exponential
decay calculation to avoid the possibility of an erroneous overflow
detection if p is a sufficiently small non-negative value.  Also
check for negative values of prob before doing the exponential decay
to avoid another instance of of right shifting a negative value.

Tested by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	1 week
2017-05-19 01:23:06 +00:00
Kristof Provost
468cefa22e pf: Fix vnet initialisation
When running the vnet init code (pf_load_vnet()) we used to iterate over
all vnets, marking them as unhooked.
This is incorrect and leads to panics if pf is unloaded, as the unload
code does not unregister the pfil hooks (because the vnet is marked as
unhooked).

There's no need or reason to touch other vnets during initialisation.
Their pf_load_vnet() function will be triggered, which handles all
required initialisation.

Reviewed by:	zec, gnn
Differential Revision:	https://reviews.freebsd.org/D10592
2017-05-07 14:33:58 +00:00
Kristof Provost
64c79ee733 pf: Fix panic on unload
vnet_pf_uninit() is called through vnet_deregister_sysuninit() and
linker_file_unload() when the pf module is unloaded. This is executed
after pf_unload() so we end up trying to take locks which have been
destroyed already.

Move pf_unload() to a separate SYSUNINIT() to ensure it's called after
all the vnet_pf_uninit() calls.

Differential Revision:	https://reviews.freebsd.org/D10025
2017-05-03 20:56:54 +00:00
Marko Zec
1e9e374199 Fix VNET leakages in PF by V_irtualizing pfr_ktables and friends.
Apparently this resolves a PF-triggered panic when destroying VNET jails.

Submitted by:	Peter Blok <peter.blok@bsd4all.org>
Reviewed by:	kp
2017-04-25 08:34:39 +00:00
Marko Zec
3a36ee404f Since curvnet is already properly set on entry to event handlers,
there's no need to override it, particularly not unconditionally with
vnet0.

Submitted by:	Peter Blok <peter.blok@bsd4all.org>
Reviewed by:	kp
2017-04-25 08:30:28 +00:00
Kristof Provost
00eab743ab pf: Fix possible incorrect IPv6 fragmentation
When forwarding pf tracks the size of the largest fragment in a fragmented
packet, and refragments based on this size.
It failed to ensure that this size was a multiple of 8 (as is required for all
but the last fragment), so it could end up generating incorrect fragments.

For example, if we received an 8 byte and 12 byte fragment pf would emit a first
fragment with 12 bytes of payload and the final fragment would claim to be at
offset 8 (not 12).

We now assert that the fragment size is a multiple of 8 in ip6_fragment(), so
other users won't make the same mistake.

Reported by:	Antonios Atlasis <aatlasis at secfu net>
MFC after:	3 days
2017-04-20 09:05:53 +00:00
Kristof Provost
4e261006a1 pf: Also clear limit counters
The "pfctl -F info" command didn't clear the limit counters ( as shown in the
"pfctl -vsi" output).

Submitted by:	Max <maximos@als.nnov.ru>
2017-04-18 20:07:21 +00:00
Andrey V. Elsukov
da62ffd9cd Avoid undefined behavior.
The 'pktid' variable is modified while being used twice between
sequence points, probably due to htonl() is macro.

Reported by:	PVS-Studio
MFC after:	1 week
2017-04-14 11:58:41 +00:00
Andrey V. Elsukov
ba3e1361b0 Use address of specific union member instead of whole union address to
fix PVS-Studio warnings.

MFC after:	1 week
2017-04-14 11:41:09 +00:00
Andrey V. Elsukov
1ca7c3b815 The rule field in the ipfw_dyn_rule structure is used as storage
to pass rule number and rule set to userland. In r272840 the kernel
internal rule representation was changed and the rulenum field of
struct ip_fw_rule got the type uint32_t, but userlevel representation
still have the type uint16_t. To not overflow the size of pointer
on the systems with 32-bit pointer size use separate variable to
copy rulenum and set.

Reported by:	PVS-Studio
MFC after:	1 week
2017-04-14 11:19:09 +00:00
Gleb Smirnoff
9f5efe718f Fix potential NULL deref.
Found by:	PVS Studio
2017-04-14 01:56:15 +00:00
Maxim Konovalov
f91eb6adad o Redundant assignments removed.
Found by:	PVS-Stdio, V519
Reviewed by:	ae
2017-04-13 18:13:10 +00:00
Conrad Meyer
bcd8d3b805 dummynet: Use strlcpy to appease static checkers
Some dummynet modules used strcpy() to copy from a larger buffer
(dn_aqm->name) to a smaller buffer (dn_extra_parms->name).  It happens that
the lengths of the strings in the dn_aqm buffers were always hardcoded to be
smaller than the dn_extra_parms buffer ("CODEL", "PIE").

Use strlcpy() instead, to appease static checkers.  No functional change.

Reported by:	Coverity
CIDs:		1356163, 1356165
Sponsored by:	Dell EMC Isilon
2017-04-13 17:47:44 +00:00
Andrey V. Elsukov
88d950a650 Remove "IPFW static rules" rmlock.
Make PFIL's lock global and use it for this purpose.
This reduces the number of locks needed to acquire for each packet.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
No objection from: #network
Differential Revision:	https://reviews.freebsd.org/D10154
2017-04-03 13:35:04 +00:00
Andrey V. Elsukov
aac74aeac7 Add ipfw_pmod kernel module.
The module is designed for modification of a packets of any protocols.
For now it implements only TCP MSS modification. It adds the external
action handler for "tcp-setmss" action.

A rule with tcp-setmss action does additional check for protocol and
TCP flags. If SYN flag is present, it parses TCP options and modifies
MSS option if its value is greater than configured value in the rule.
Then it adjustes TCP checksum if needed. After handling the search
continues with the next rule.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Yandex LLC
No objection from: #network
Differential Revision:	https://reviews.freebsd.org/D10150
2017-04-03 03:07:48 +00:00
Andrey V. Elsukov
11c56650f0 Add O_EXTERNAL_DATA opcode support.
This opcode can be used to attach some data to external action opcode.
And unlike to O_EXTERNAL_INSTANCE opcode, this opcode does not require
creating of named instance to pass configuration arguments to external
action handler. The data is coming just next to O_EXTERNAL_ACTION opcode.

The userlevel part currenly supports formatting for opcode with ipfw_insn
size, by default it expects u16 numeric value in the arg1.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2017-04-03 02:44:40 +00:00
Andrey V. Elsukov
399ad57874 Add the log formatting for an external action opcode.
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2017-04-03 02:26:30 +00:00
Kristof Provost
3601d25181 pf: Fix leak of pf_state_keys
If we hit the state limit we returned from pf_create_state() without cleaning
up.

PR:		217997
Submitted by:	Max <maximos@als.nnov.ru>
MFC after:	1 week
2017-04-01 12:22:34 +00:00
Andrey V. Elsukov
788e62864f Reset the cached state of last lookup in the dynamic states when an
external action is completed, but the rule search is continued.

External action handler can change the content of @args argument,
that is used for dynamic state lookup. Enforce the new lookup to be able
install new state, when the search is continued.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-03-31 09:26:08 +00:00
Kristof Provost
2f8fb3a868 pf: Fix possible shutdown race
Prevent possible races in the pf_unload() / pf_purge_thread() shutdown
code. Lock the pf_purge_thread() with the new pf_end_lock to prevent
these races.

Use a shared/exclusive lock, as we need to also acquire another sx lock
(VNET_LIST_RLOCK). It's fine for both pf_purge_thread() and pf_unload()
to sleep,

Pointed out by: eri, glebius, jhb
Differential Revision:	https://reviews.freebsd.org/D10026
2017-03-22 21:18:18 +00:00
Kristof Provost
08ef4ddb0f pf: Fix rule evaluation after inet6 route-to
In pf_route6() we re-run the ruleset with PF_FWD if the packet goes out
of a different interface. pf_test6() needs to know that the packet was
forwarded (in case it needs to refragment so it knows whether to call
ip6_output() or ip6_forward()).

This lead pf_test6() to try to evaluate rules against the PF_FWD
direction, which isn't supported, so it needs to treat PF_FWD as PF_OUT.
Once fwdir is set correctly the correct output/forward function will be
called.

PR:		217883
Submitted by:	Kajetan Staszkiewicz
MFC after:	1 week
Sponsored by:	InnoGames GmbH
2017-03-19 03:06:09 +00:00
Don Lewis
46c8aadb6f Change several constants used by the PIE algorithm from unsigned to signed.
- PIE_MAX_PROB is compared to variable of int64_t and the type promotion
   rules can cause the value of that variable to be treated as unsigned.
   If the value is actually negative, then the result of the comparsion
   is incorrect, causing the algorithm to perform poorly in some
   situations.  Changing the constant to be signed cause the comparision
   to work correctly.

 - PIE_SCALE is also compared to signed values.  Fortunately they are
   also compared to zero and negative values are discarded so this is
   more of a cosmetic fix.

 - PIE_DQ_THRESHOLD is only compared to unsigned values, but it is small
   enough that the automatic promotion to unsigned is harmless.

Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	1 week
2017-03-18 23:00:13 +00:00
Kristof Provost
5c172e7059 pf: Fix memory leak on vnet shutdown or unload
Rules are unlinked in shutdown_pf(), so we must call
pf_unload_vnet_purge(), which frees unlinked rules, after that, not
before.

Reviewed by:	eri, bz
Differential Revision:	https://reviews.freebsd.org/D10040
2017-03-18 01:37:20 +00:00
Andrey V. Elsukov
3667f39ea3 Use memset with structure size. 2017-03-14 07:57:33 +00:00
Conrad Meyer
49b6a5d60a nat64lsn: Use memset() with structure, not pointer, size
PR:		217738
Submitted by:	Svyatoslav <razmyslov at viva64.com>
Sponsored by:	Viva64 (PVS-Studio)
2017-03-13 17:53:46 +00:00
Kristof Provost
2a57d24bd1 pf: Fix incorrect rw_sleep() in pf_unload()
When we unload we don't hold the pf_rules_lock, so we cannot call rw_sleep()
with it, because it would release a lock we do not hold. There's no need for the
lock either, so we can just tsleep().

While here also make the same change in pf_purge_thread(), because it explicitly
takes the lock before rw_sleep() and then immediately releases it afterwards.
2017-03-12 05:42:57 +00:00
Kristof Provost
f618201314 pf: Do not lose the VNET lock when ending the purge thread
When the pf_purge_thread() exits it must make sure to release the
VNET_LIST_RLOCK it still holds.
kproc_exit() does not return.
2017-03-12 05:00:04 +00:00
Maxim Konovalov
f621c2cd39 o Typo in the comment fixed.
PR:		217617
Submitted by:	lutz
2017-03-09 09:54:23 +00:00
Kristof Provost
98a9874f7b pf: Fix a crash in low-memory situations
If the call to pf_state_key_clone() in pf_get_translation() fails (i.e. there's
no more memory for it) it frees skp. This is wrong, because skp is a
pf_state_key **, so we need to free *skp, as is done later in the function.
Getting it wrong means we try to free a stack variable of the calling
pf_test_rule() function, and we panic.
2017-03-06 23:41:23 +00:00
Andrey V. Elsukov
53de37f8ca Fix the build. Use new ipfw_lookup_table() in the nat64 too.
Reported by:	cy
MFC after:	2 weeks
2017-03-06 00:41:59 +00:00
Andrey V. Elsukov
54e5669d8c Add IPv6 support to O_IP_DST_LOOKUP opcode.
o check the size of O_IP_SRC_LOOKUP opcode, it can not exceed the size of
  ipfw_insn_u32;
o rename ipfw_lookup_table_extended() function into ipfw_lookup_table() and
  remove old ipfw_lookup_table();
o use args->f_id.flow_id6 that is in host byte order to get DSCP value;
o add SCTP ports support to 'lookup src/dst-port' opcode;
o add IPv6 support to 'lookup src/dst-ip' opcode.

PR:		217292
Reviewed by:	melifaro
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9873
2017-03-05 23:48:24 +00:00
Andrey V. Elsukov
c750a56914 Reject invalid object types that can not be used with specific opcodes.
When we doing reference counting of named objects in the new rule,
for existing objects check that opcode references to correct object,
otherwise return EINVAL.

PR:		217391
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-03-05 22:19:43 +00:00
Andrey V. Elsukov
43b294a4db Fix matching table entry value. Use real table value instead of its index
in valuestate array.

When opcode has size equal to ipfw_insn_u32, this means that it should
additionally match value specified in d[0] with table entry value.
ipfw_table_lookup() returns table value index, use TARG_VAL() macro to
convert it to its value. The actual 32-bit value stored in the tag field
of table_value structure, where all unspecified u32 values are kept.

PR:		217262
Reviewed by:	melifaro
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-03-03 20:22:42 +00:00
Andrey V. Elsukov
576429f04b Fix NPTv6 rule counters when one_pass is not enabled.
Consider the rule matching when both @done and @retval values
returned from ipfw_run_eaction() are zero. And modify ipfw_nptv6()
to return IP_FW_DENY and @done=0 when addresses do not match.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2017-03-01 20:00:19 +00:00
Pedro F. Giffuni
e099b90b80 sys: Replace zero with NULL for pointers.
Found with:	devel/coccinelle
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D9694
2017-02-22 02:35:59 +00:00
Eric van Gyzen
8144690af4 Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.

Suggested by:	glebius, emaste
Reviewed by:	gnn
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9625
2017-02-16 20:47:41 +00:00
Eric van Gyzen
643faabe0d pf: use inet_ntoa_r() instead of inet_ntoa(); maybe fix IPv6 OS fingerprinting
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.

This code had an INET6 conditional before this commit, but opt_inet6.h
was not included, so INET6 was never defined.  Apparently, pf's OS
fingerprinting hasn't worked with IPv6 for quite some time.
This commit might fix it, but I didn't test that.

Reviewed by:	gnn, kp
MFC after:	2 weeks
Relnotes:	yes (if I/someone can test pf OS fingerprinting with IPv6)
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9625
2017-02-16 20:44:44 +00:00
Enji Cooper
bc64f428ad Fix typos in comments (returing -> returning)
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-02-07 00:09:48 +00:00