Extend the existing set-tos keyword to also be able to set traffic class
on IPv6 traffic.
Add tests for this as well.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D31564
This allows the maximum value of 4294967295 (~4Gb/s) instead of previous
value of 2147483647 (~2Gb/s).
Reviewed by: np, scottl
Obtained from: pfSense
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31582
The intent is to line up various enums so that branching in the lines of:
idx = (dir == PF_IN ? PF_SK_WIRE : PF_SK_STACK);
is avoided.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Jump cache is used to reduce the cost of rule lookup for O_SKIPTO and
O_CALLRETURN actions. It uses rules chain id to check correctness of
cached value. But due to the possible race, there is the chance that
one thread can read invalid value. In some cases this can lead to out
of bounds access and panic.
Use thread fence operations to constrain the reordering of accesses.
Also rename jump_fast and jump_linear functions to jump_cached and
jump_lookup_pos respectively.
Submitted by: Arseny Smalyuk
Reviewed by: melifaro
Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D31484
Similar to what we did earlier for DIOCGETSTATESV2 we only allocate
enough memory for a handful of states and copy those out, bit by bit,
rather than allocating memory for all states in one go.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
While nvlists are very useful in maximising flexibility for future
extensions their performance is simply unacceptably bad for the
getstates feature, where we can easily want to export a million states
or more.
The DIOCGETSTATESNV call has been MFCd, but has not hit a release on any
branch, so we can still remove it everywhere.
Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31099
Some pf ioctl handlers use strlcpy() to copy strings when converting
from user structures to their in-kernel representations. strlcpy()
ensures that the destination will be nul-terminated, but it assumes that
the source is nul-terminated. In particular, it returns the full length
of the source string, so if the source is not nul-terminated, strlcpy()
will keep scanning until it finds a nul byte, and it may encounter an
unmapped page first. Add a helper to validate user strings before
copying.
There are also places where we look up a ruleset using a user-provided
anchor string. In some ioctl handlers we were already nul-terminating
the string, avoiding the same problem, but in other places we were not.
Fix those by nul-terminating as well. Aside from being consistent,
anchors have a maximum length of MAXPATHLEN - 1 so calling strnlen()
might not be so desirable.
Reported by: syzbot+35a1549b4663e9483dd1@syzkaller.appspotmail.com
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31169
A number of pf ioctls populate an array of structures and copy it out.
They have the following structures:
- caller specifies the size of its output buffer
- ioctl handler allocates a kernel buffer of the same size
- ioctl handler populates the buffer, possibly leaving some items
initialized if the caller provided more space than needed
- ioctl handler copies the entire buffer out to userland
Thus, if more space was provided than is required, we end up copying out
uninitialized kernel memory. Simply zero the buffer at allocation time
to prevent this.
Reported by: KMSAN
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31313
The introduction of synproxy support changed the size of struct
pf_status, which in turn broke the userspace ABI.
Revert the relevant change. More work is needed on the synproxy code to
keep and expose the counters, but in the mean time this restores the
ABI.
PR: 257469
MFC after: 3 days
Sponsored by: Modirum MDPay
These two fuctions were identical, so move them into the common
vlan_set_pcp() function, exposed in the if_vlan_var.h header.
Reviewed by: donner
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31275
Numerous counters got migrated from straight uint64_t to the counter(9)
API. Unfortunately the implementation comes with a significiant
performance hit on some platforms and cannot be easily fixed.
Work around the problem by implementing a pf-specific variant.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
This shaves calculation which in particular helps on arm.
Note using the & hack instead would still be more work.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Kernel side implementation to allow switching between on and off modes,
and allow this configuration to be retrieved.
MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31139
Import OpenBSD's syncookie support for pf. This feature help pf resist
TCP SYN floods by only creating states once the remote host completes
the TCP handshake rather than when the initial SYN packet is received.
This is accomplished by using the initial sequence numbers to encode a
cookie (hence the name) in the SYN+ACK response and verifying this on
receipt of the client ACK.
Reviewed by: kbowling
Obtained from: OpenBSD
MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31138
Similar to the REPLY_TO shortcut (6d786845cf) we also can't shortcut
ROUTE_TO. If we do we will fail to apply transformations or update the
state, which can lead to premature termination of the connections.
PR: 257106
MFC after: 3 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31177
struct mbuf *replyto is not actually used (and only rarely provided).
The same applies to struct ifnet *ifp.
No functional change.
Reviewed by: mjg
MFC after: 1 week
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D31136
Support the 'match' keyword.
Note that support is limited to adding queuing information, so without
ALTQ support in the kernel setting match rules is pointless.
For the avoidance of doubt: this is NOT full support for the match
keyword as found in OpenBSD's pf. That could potentially be built on top
of this, but this commit is NOT that.
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31115
Rather than allocating however much memory userspace asks for we only
allocate enough for a handful of states, and copy to userspace for each
completed row.
We start out with enough space for 16 states (per row), but grow that as
required. In most configurations we expect at most a handful of states
per row (more than that would have other negative effects on packet
processing performance).
Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31111
Add a new version of the DIOCGETSTATES call, which extends the struct to
include the original interface information.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31097
Only print this warning when boot verbose is enabled.
This can get pretty annoying (and useless) in some systems.
Reviewed by: kp
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Happily this wasn't a real bug, because pf_killstates() never fails, but
we should check the return value anyway, in case it does ever start
returning errors.
Reported by: clang --analyze
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
pidx is never NULL, and is used unconditionally later on in the
function.
Add an assertion, as documentation for the requirement to provide an idx
pointer.
Reported by: clang --analyze
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Indicate that this is a kernel-only structure, and make it easier to
distinguish from others used to communicate with userspace.
Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31096
Instead serialize against these operations with a dedicated lock.
Prior to the change, When pushing 17 mln pps of traffic, calling
DIOCRGETTSTATS in a loop would restrict throughput to about 7 mln. With
the change there is no slowdown.
Reviewed by: kp (previous version)
Sponsored by: Rubicon Communications, LLC ("Netgate")
Creating tables and zeroing their counters induces excessive IPIs (14
per table), which in turns kills single- and multi-threaded performance.
Work around the problem by extending per-CPU counters with a general
counter populated on "zeroing" requests -- it stores the currently found
sum. Then requests to report the current value are the sum of per-CPU
counters subtracted by the saved value.
Sample timings when loading a config with 100k tables on a 104-way box:
stock:
pfctl -f tables100000.conf 0.39s user 69.37s system 99% cpu 1:09.76 total
pfctl -f tables100000.conf 0.40s user 68.14s system 99% cpu 1:08.54 total
patched:
pfctl -f tables100000.conf 0.35s user 6.41s system 99% cpu 6.771 total
pfctl -f tables100000.conf 0.48s user 6.47s system 99% cpu 6.949 total
Reviewed by: kp (previous version)
Sponsored by: Rubicon Communications, LLC ("Netgate")
This call is particularly slow due to the large amount of data it
returns. Remove all fields pfctl does not use. There is no functional
impact to pfctl, but it somewhat speeds up the call.
It might affect other (i.e. non-FreeBSD) code that uses the new
interface, but this call is very new, so there's unlikely to be any. No
releases contained the previous version, so we choose to live with the
ABI modification.
Reviewed by: donner
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30944
The sysctl nodes which use V_dn_cfg must be marked as CTLFLAG_VNET so
that we use the correct per-vnet offset
PR: 256819
Reviewed by: donner
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30974
stats are not shared and consequently per-CPU counters only waste
memory.
No slowdown was measured when passing over 20M pps.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")