819 Commits

Author SHA1 Message Date
Mateusz Guzik
f649cff587 pf: padalign global locks found in pf.c
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-05 09:56:54 +00:00
Mateusz Guzik
dc1ab04e4c pf: allow table stats clearing and reading with ruleset rlock
Instead serialize against these operations with a dedicated lock.

Prior to the change, When pushing 17 mln pps of traffic, calling
DIOCRGETTSTATS in a loop would restrict throughput to about 7 mln.  With
the change there is no slowdown.

Reviewed by:	kp (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-05 10:42:01 +02:00
Mateusz Guzik
f92c21a28c pf: depessimize table handling
Creating tables and zeroing their counters induces excessive IPIs (14
per table), which in turns kills single- and multi-threaded performance.

Work around the problem by extending per-CPU counters with a general
counter populated on "zeroing" requests -- it stores the currently found
sum. Then requests to report the current value are the sum of per-CPU
counters subtracted by the saved value.

Sample timings when loading a config with 100k tables on a 104-way box:

stock:

pfctl -f tables100000.conf  0.39s user 69.37s system 99% cpu 1:09.76 total
pfctl -f tables100000.conf  0.40s user 68.14s system 99% cpu 1:08.54 total

patched:

pfctl -f tables100000.conf  0.35s user 6.41s system 99% cpu 6.771 total
pfctl -f tables100000.conf  0.48s user 6.47s system 99% cpu 6.949 total

Reviewed by:	kp (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-05 10:42:01 +02:00
Kristof Provost
a19ff8ce9b pf: getstates: avoid taking the hashrow lock if the row is empty
Reviewed by:	mjg
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30946
2021-07-02 14:47:54 +02:00
Kristof Provost
34285eefdd pf: Reduce the data returned in DIOCGETSTATESNV
This call is particularly slow due to the large amount of data it
returns. Remove all fields pfctl does not use. There is no functional
impact to pfctl, but it somewhat speeds up the call.

It might affect other (i.e. non-FreeBSD) code that uses the new
interface, but this call is very new, so there's unlikely to be any. No
releases contained the previous version, so we choose to live with the
ABI modification.

Reviewed by:	donner
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30944
2021-07-02 14:47:23 +02:00
Mateusz Guzik
48d5b86364 pf: make DIOCGETSTATESNV iterations killable
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-02 08:30:22 +00:00
Kristof Provost
8f76eebce4 dummynet: fix sysctls
The sysctl nodes which use V_dn_cfg must be marked as CTLFLAG_VNET so
that we use the correct per-vnet offset

PR:		256819
Reviewed by:	donner
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30974
2021-07-01 21:34:08 +02:00
Mateusz Guzik
d26ef5c7ac pf: make sure the dtrace probe has safe access to state
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-29 07:24:53 +00:00
Mateusz Guzik
55cc305dfc pf: revert: Use counter(9) for pf_state byte/packet tracking
stats are not shared and consequently per-CPU counters only waste
memory.

No slowdown was measured when passing over 20M pps.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-29 07:24:53 +00:00
Mateusz Guzik
803dfe3da0 pf: deduplicate V_pf_state_z handling with pfsync
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-29 07:24:53 +00:00
Mateusz Guzik
7f025db57c pf: fix error-case leaks in pf_create_state
The hand-rolled clean up failed to free counters.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-29 07:24:52 +00:00
Mateusz Guzik
ccb17a2104 pf: factor out state allocation into pf_alloc_state
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-28 15:49:20 +00:00
Mateusz Guzik
d09388d013 pf: add pf_release_staten and use it in pf_unlink_state
Saves one atomic op.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-28 15:49:20 +00:00
Mark Johnston
bc6a2267ff ipfw: Update the pfil mbuf pointer in ipfw_check_frame()
ipfw_chk() might call m_pullup() and thus can change the mbuf chain
head.  In this case, the new chain head has to be returned to the pfil
hook caller, otherwise the pfil hook caller is left with a dangling
pointer.

Note that this affects only the link-layer hooks installed when the
net.link.ether.ipfw sysctl is set to 1.

PR:		256439, 254015, 255069, 255104
Fixes:		f355cb3e6
Reviewed by:	ae
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30764
2021-06-16 09:46:56 -04:00
Kristof Provost
d38630f619 pf: store L4 headers in pf_pdesc
Rather than pointers to the headers store full copies. This brings us
slightly closer to what OpenBSD does, and also makes more sense than
storing pointers to stack variable copies of the headers.

Reviewed by:	donner, scottl
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30719
2021-06-14 14:22:06 +02:00
Kristof Provost
8b5f4e692b pf: don't hold a lock during copyout()
copyout() can trigger page faults, so it may potentially sleep.

Reported by:	avg
MFC after:	3 days
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-14 14:17:31 +02:00
Kristof Provost
ea21980a3f pf: use M_WAITOK where possible
In the ioctl path use M_WAITOK allocations whereever possible. These are
less sensitive to memory pressure, and ioctl requests have no hard
deadlines.

Reviewed by:	donner
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30702
2021-06-10 12:20:10 +02:00
Kristof Provost
1b2dbe37fa dummynet: free(NULL, M_DUMMYNET); is safe
There's no need to check pointers for NULL before free()ing them.

No functional change.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30382
2021-06-03 09:02:53 +02:00
Kristof Provost
51d73df18e dummynet: Fix schedlist and aqmlist locking
These are global (i.e. shared across vnets) structures, so we need
global lock to protect them.  However, we look up entries in these lists
(find_aqm_type(), find_sched_type()) and return them. We must ensure
that the returned structures cannot go away while we are using them.

Resolve this by using NET_EPOCH(). The structures can be safely accessed
under it, and we postpone their cleanup until we're sure they're no
longer used.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30381
2021-06-03 09:02:49 +02:00
Tom Jones
fe3bcfbda3 VNETify dummynet
This moves dn_cfg and other parameters into per VNET variables.

The taskqueue and control state remains global.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D29274
2021-06-03 09:01:56 +02:00
Mark Johnston
60a38abb89 pf: Avoid leaking pad bytes in struct pfr_astats when copying out
There is padding between pfr_astats.pfras_a and pfras_packets that was
not getting initialized.

Reported by:	KMSAN
Reviewed by:	kp, imp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30585
2021-06-01 19:37:50 -04:00
Kristof Provost
0f86492b09 pf: Fix more ioctl memory leaks
We must also remember to free nvlists added to a parent nvlist with
nvlist_append_nvlist_array().

More importantly, when nvlist_pack() allocates memory for us it does so
in the M_NVLIST zone, so we must free it with free(.., M_NVLIST). Using
free(.., M_TEMP) as we did silently failed to free the memory.

MFC after:	3 days
Reported by:	kib@
Tested by:	kib@
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30595
2021-06-01 22:41:20 +02:00
Kristof Provost
ec7b47fc81 pf: Move provider declaration to pf.h
This simplifies life a bit, by not requiring us to repease the
declaration for every file where we want static probe points.

It also makes the gcc6 build happy.
2021-06-01 09:02:05 +02:00
Kristof Provost
7c4342890b pf: Convenience function for optional (numeric) arguments
Add _opt() variants for the uint* functions. These functions set the
provided default value if the nvlist doesn't contain the relevant value.
This is helpful for optional values (e.g. when the API is extended to
add new fields).

While here simplify the header by also using macros to create the
prototypes for the macro-generated function implementations.

Reviewed by:	scottl
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30510
2021-05-31 14:19:17 +02:00
Kristof Provost
3032c35388 pf: Move nvlist conversion functions to pf_nv
Separate the conversion functions (between kernel structs and nvlists)
to pf_nv. This reduces the size of pf_ioctl.c, which is already quite
large and complex, a good bit. It also keeps all the fairly
straightforward conversion code together.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30359
2021-05-26 13:18:47 +02:00
Kristof Provost
4483fb4773 pf: fix ioctl() memory leak
When we create an nvlist and insert it into another nvlist we must
remember to destroy it. The nvlist_add_nvlist() function makes a copy,
just like nvlist_add_string() makes a copy of the string. If we don't
we're leaking memory on every (nvlist-based) ioctl() call.

While here remove two redundant 'break' statements.

PR:		255971
MFC after:	3 days
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-05-24 15:56:24 +02:00
Kristof Provost
b62489cc92 pf: Support killing floating states by interface
Floating states get assigned to interface 'all' (V_pfi_all), so when we
try to flush all states for an interface states originally created
through this interface are not flushed. Only if-bound states can be
flushed in this way.

Given that we track the original interface we can check if the state's
interface is 'all', and if so compare to the orig_if instead.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30246
2021-05-20 12:49:27 +02:00
Kristof Provost
d0fdf2b28f pf: Track the original kif for floating states
Track (and display) the interface that created a state, even if it's a
floating state (and thus uses virtual interface 'all').

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30245
2021-05-20 12:49:27 +02:00
Kristof Provost
0592a4c83d pf: Add DIOCGETSTATESNV
Add DIOCGETSTATESNV, an nvlist-based alternative to DIOCGETSTATES.

MFC after:      1 week
Sponsored by:   Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30243
2021-05-20 12:49:27 +02:00
Kristof Provost
1732afaa0d pf: Add DIOCGETSTATENV
Add DIOCGETSTATENV, an nvlist-based alternative to DIOCGETSTATE.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30242
2021-05-20 12:49:26 +02:00
Mark Johnston
c4a6258d70 dummynet: Fix mbuf tag allocation failure handling
PR:		255875, 255878, 255879, 255880
Reviewed by:	donner, kp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30318
2021-05-18 15:25:16 -04:00
Kristof Provost
02c44f40f9 dummynet: Remove unused code
We never set 'busy' and never dequeue from the pending mq. Remove this
code.

Reviewed by:	ae
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30313
2021-05-17 15:03:55 +02:00
Kristof Provost
d69cc04014 pf: Set the pfik_group for userspace
Userspace relies on this pointer to work out if the kif is a group or
not. It can't use it for anything else, because it's a pointer to a
kernel address. Substitute 0xfeedc0de for 'true', so that we don't leak
kernel memory addresses to userspace.

PR:		255852
Reviewed by:	donner
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30284
2021-05-17 13:48:06 +02:00
Kristof Provost
93abcf17e6 pf: Support killing 'matching' states
Optionally also kill states that match (i.e. are the NATed state or
opposite direction state entry for) the state we're killing.

See also https://redmine.pfsense.org/issues/8555

Submitted by:	Steven Brown
Reviewed by:	bcr (man page)
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30092
2021-05-07 22:13:31 +02:00
Kristof Provost
abbcba9cf5 pf: Allow states to by killed per 'gateway'
This allows us to kill states created from a rule with route-to/reply-to
set.  This is particularly useful in multi-wan setups, where one of the
WAN links goes down.

Submitted by:	Steven Brown
Obtained from:	https://github.com/pfsense/FreeBSD-src/pull/11/
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30058
2021-05-07 22:13:31 +02:00
Kristof Provost
e989530a09 pf: Introduce DIOCKILLSTATESNV
Introduce an nvlist based alternative to DIOCKILLSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30054
2021-05-07 22:13:30 +02:00
Kristof Provost
7606a45dcc pf: Introduce DIOCCLRSTATESNV
Introduce an nvlist variant of DIOCCLRSTATES.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30052
2021-05-07 22:13:30 +02:00
Kristof Provost
055c55abef pf: Fix IP checksum on reassembly
If we reassemble a packet we modify the IP header (to set the length and
remove the fragment offset information), but we failed to update the
checksum. On certain setups (mostly where we did not re-fragment again
afterwards) this could lead to us sending out packets with incorrect
checksums.

PR:		255432
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30026
2021-04-30 08:19:46 +02:00
Kristof Provost
eaabed8ac4 pf: Trivial typo fix
PV -> PF

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-04-29 15:25:07 +02:00
Kristof Provost
6b146f3b9b pf: Error tracing SDTs
Add additional DTrace static trace points to facilitate debugging
failing pf ioctl calls.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-04-28 17:19:10 +02:00
Kristof Provost
402dfb0a8d pf: Fix parsing of long table names
When parsing the nvlist for a struct pf_addr_wrap we unconditionally
tried to parse "ifname". This broke for PF_ADDR_TABLE when the table
name was longer than IFNAMSIZ. PF_TABLE_NAME_SIZE is longer than
IFNAMSIZ, so this is a valid configuration.

Only parse (or return) ifname or tblname for the corresponding
pf_addr_wrap type.

This manifested as a failure to set rules such as these, where the pfctl
optimiser generated an automatic table:

	pass in proto tcp to 192.168.0.1 port ssh
	pass in proto tcp to 192.168.0.2 port ssh
	pass in proto tcp to 192.168.0.3 port ssh
	pass in proto tcp to 192.168.0.4 port ssh
	pass in proto tcp to 192.168.0.5 port ssh
	pass in proto tcp to 192.168.0.6 port ssh
	pass in proto tcp to 192.168.0.7 port ssh

Reported by:	Florian Smeets
Tested by:	Florian Smeets
Reviewed by:	donner
X-MFC-With:	5c11c5a3655842a176124ef2334fcdf830422c8a
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29962
2021-04-26 18:08:15 +02:00
Kristof Provost
5f5bf88949 pfsync: Expose PFSYNCF_OK flag to userspace
Add 'syncok' field to ifconfig's pfsync interface output. This allows
userspace to figure out when pfsync has completed the initial bulk
import.

Reviewed by:	donner
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29948
2021-04-26 14:31:17 +02:00
Kristof Provost
6fcc8e042a pf: Allow multiple labels to be set on a rule
Allow up to 5 labels to be set on each rule.
This offers more flexibility in using labels. For example, it replaces
the customer 'schedule' keyword used by pfSense to terminate states
according to a schedule.

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29936
2021-04-26 14:14:21 +02:00
Mark Johnston
652908599b Add required checks for unmapped mbufs in ipdivert and ipfw
Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
are mapped.  Use it in ipfw_nat, which operates on a chain returned by
m_megapullup().

PR:		255164
Reviewed by:	ae, gallatin
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29838
2021-04-21 15:47:05 -04:00
Kristof Provost
586aab9e0a pf: Refactor state killing
Extract the state killing code from pfioctl() and rephrase the filtering
conditions for readability.

No functional change intended.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29795
2021-04-20 09:30:23 +02:00
Kristof Provost
42ec75f83a pf: Optionally attempt to preserve rule counter values across ruleset updates
Usually rule counters are reset to zero on every update of the ruleset.
With keepcounters set pf will attempt to find matching rules between old
and new rulesets and preserve the rule counters.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29780
2021-04-19 14:31:47 +02:00
Kristof Provost
8bb0f1b87b pf: Remove PFRULE_REFS from userspace
PFRULE_REFS should never be used by userspace, so hide it behind #ifdef
_KERNEL.

MFC after:	never
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29779
2021-04-19 14:31:47 +02:00
Kristof Provost
4f1f67e888 pf: PFRULE_REFS should not be user-visible
Split the PFRULE_REFS flag from the rule_flag field. PFRULE_REFS is a
kernel-internal flag and should not be exposed to or read from
userspace.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29778
2021-04-19 14:31:47 +02:00
Andrey V. Elsukov
9bacbf1ae2 ipfw: do not use sleepable malloc in callout context.
Use M_NOWAIT flag when hash growing is called from callout.

PR:             255041
Reviewed by:	kevans
MFC after:	10 days
Differential Revision: https://reviews.freebsd.org/D29772
2021-04-16 10:22:44 +03:00
Kurosawa Takahiro
2aa21096c7 pf: Implement the NAT source port selection of MAP-E Customer Edge
MAP-E (RFC 7597) requires special care for selecting source ports
in NAT operation on the Customer Edge because a part of bits of the port
numbers are used by the Border Relay to distinguish another side of the
IPv4-over-IPv6 tunnel.

PR:		254577
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D29468
2021-04-13 10:53:18 +02:00