bz e15f804c7b Update packet filter (pf) code to OpenBSD 4.5.
You need to update userland (world and ports) tools
to be in sync with the kernel.

Submitted by:	mlaier
Submitted by:	eri
2011-06-28 11:57:25 +00:00

.\" $FreeBSD$
.\" $OpenBSD: pf.conf.5,v 1.406 2009/01/31 19:37:12 sobrado Exp $
.\"
.\" Copyright (c) 2002, Daniel Hartmeier
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" - Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" - Redistributions in binary form must reproduce the above
.\" copyright notice, this list of conditions and the following
.\" disclaimer in the documentation and/or other materials provided
.\" with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
.\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
.\" ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd January 31, 2009
.Dt PF.CONF 5
.Os
.Sh NAME
.Nm pf.conf
.Nd packet filter configuration file
.Sh DESCRIPTION
The
.Xr pf 4
packet filter modifies, drops or passes packets according to rules or
definitions specified in
.Nm pf.conf .
.Sh STATEMENT ORDER
There are seven types of statements in
.Nm pf.conf :
.Bl -tag -width xxxx
.It Cm Macros
User-defined variables may be defined and used later, simplifying
the configuration file.
Macros must be defined before they are referenced in
.Nm pf.conf .
.It Cm Tables
Tables provide a mechanism for increasing the performance and flexibility of
rules with large numbers of source or destination addresses.
.It Cm Options
Options tune the behaviour of the packet filtering engine.
.It Cm Traffic Normalization Li (e.g. Em scrub )
Traffic normalization protects internal machines against inconsistencies
in Internet protocols and implementations.
.It Cm Queueing
Queueing provides rule-based bandwidth control.
.It Cm Translation Li (Various forms of NAT)
Translation rules specify how addresses are to be mapped or redirected to
other addresses.
.It Cm Packet Filtering
Packet filtering provides rule-based blocking or passing of packets.
.El
.Pp
With the exception of
.Cm macros
and
.Cm tables ,
the types of statements should be grouped and appear in
.Nm pf.conf
in the order shown above, as this matches the operation of the underlying
packet filtering engine.
By default
.Xr pfctl 8
enforces this order (see
.Ar set require-order
below).
.Pp
Comments can be put anywhere in the file using a hash mark
.Pq Sq # ,
and extend to the end of the current line.
.Pp
Additional configuration files can be included with the
.Ic include
keyword, for example:
.Bd -literal -offset indent
include "/etc/pf/sub.filter.conf"
.Ed
.Sh MACROS
Macros can be defined that will later be expanded in context.
Macro names must start with a letter, and may contain letters, digits
and underscores.
Macro names may not be reserved words (for example
.Ar pass ,
.Ar in ,
.Ar out ) .
Macros are not expanded inside quotes.
.Pp
For example,
.Bd -literal -offset indent
ext_if = \&"kue0\&"
all_ifs = \&"{\&" $ext_if lo0 \&"}\&"
pass out on $ext_if from any to any
pass in on $ext_if proto tcp from any to any port 25
.Ed
.Sh TABLES
Tables are named structures which can hold a collection of addresses and
networks.
Lookups against tables in
.Xr pf 4
are relatively fast, making a single rule with tables much more efficient,
in terms of
processor usage and memory consumption, than a large number of rules which
differ only in IP address (either created explicitly or automatically by rule
expansion).
.Pp
Tables can be used as the source or destination of filter rules,
.Ar scrub
rules
or
translation rules such as
.Ar nat
or
.Ar rdr
(see below for details on the various rule types).
Tables can also be used for the redirect address of
.Ar nat
and
.Ar rdr
rules and in the routing options of filter rules, but only for
.Ar round-robin
pools.
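.Pp
As a minimal sketch of a table used as a redirection pool (the table name,
the addresses and the
.Ar $ext_if
macro are assumptions made for illustration):
.Bd -literal -offset indent
# hypothetical pool of web servers
table \*(Ltwebservers\*(Gt { 10.0.0.10, 10.0.0.11 }
rdr on $ext_if proto tcp from any to any port 80 \e
    -\*(Gt \*(Ltwebservers\*(Gt round-robin
.Ed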
.Pp
Tables can be defined with any of the following
.Xr pfctl 8
mechanisms.
As with macros, reserved words may not be used as table names.
.Bl -tag -width "manually"
.It Ar manually
Persistent tables can be manually created with the
.Ar add
or
.Ar replace
option of
.Xr pfctl 8 ,
before or after the ruleset has been loaded.
.It Pa pf.conf
Table definitions can be placed directly in this file, and loaded at the
same time as other rules are loaded, atomically.
Table definitions inside
.Nm pf.conf
use the
.Ar table
statement, and are especially useful to define non-persistent tables.
The contents of a pre-existing table defined without a list of addresses
to initialize it is not altered when
.Nm pf.conf
is loaded.
A table initialized with the empty list,
.Li { } ,
will be cleared on load.
.El
.Pp
Tables may be defined with the following attributes:
.Bl -tag -width persist
.It Ar persist
The
.Ar persist
flag forces the kernel to keep the table even when no rules refer to it.
If the flag is not set, the kernel will automatically remove the table
when the last rule referring to it is flushed.
.It Ar const
The
.Ar const
flag prevents the user from altering the contents of the table once it
has been created.
Without that flag,
.Xr pfctl 8
can be used to add or remove addresses from the table at any time, even
when running with
.Xr securelevel 7
= 2.
.It Ar counters
The
.Ar counters
flag enables per-address packet and byte counters which can be displayed with
.Xr pfctl 8 .
.El
.Pp
For example,
.Bd -literal -offset indent
table \*(Ltprivate\*(Gt const { 10/8, 172.16/12, 192.168/16 }
table \*(Ltbadhosts\*(Gt persist
block on fxp0 from { \*(Ltprivate\*(Gt, \*(Ltbadhosts\*(Gt } to any
.Ed
.Pp
creates a table called private, to hold RFC 1918 private network
blocks, and a table called badhosts, which is initially empty.
A filter rule is set up to block all traffic coming from addresses listed in
either table.
The private table cannot have its contents changed and the badhosts table
will exist even when no active filter rules reference it.
Addresses may later be added to the badhosts table, so that traffic from
these hosts can be blocked by using
.Bd -literal -offset indent
# pfctl -t badhosts -Tadd 204.92.77.111
.Ed
.Pp
A table can also be initialized with an address list specified in one or more
external files, using the following syntax:
.Bd -literal -offset indent
table \*(Ltspam\*(Gt persist file \&"/etc/spammers\&" file \&"/etc/openrelays\&"
block on fxp0 from \*(Ltspam\*(Gt to any
.Ed
.Pp
The files
.Pa /etc/spammers
and
.Pa /etc/openrelays
list IP addresses, one per line.
Any lines beginning with a # are treated as comments and ignored.
In addition to being specified by IP address, hosts may also be
specified by their hostname.
When the resolver is called to add a hostname to a table,
.Em all
resulting IPv4 and IPv6 addresses are placed into the table.
IP addresses can also be entered in a table by specifying a valid interface
name, a valid interface group or the
.Em self
keyword, in which case all addresses assigned to the interface(s) will be
added to the table.
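.Pp
For instance, a table holding every address of the local host could be
sketched as follows (the table name and the interface
.Ar fxp0
are illustrative assumptions):
.Bd -literal -offset indent
# all addresses assigned to this host's interfaces
table \*(Ltfirewall\*(Gt const { self }
block in on fxp0 from \*(Ltfirewall\*(Gt to any
.Ed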
.Sh OPTIONS
.Xr pf 4
may be tuned for various situations using the
.Ar set
command.
.Bl -tag -width xxxx
.It Ar set timeout
.Pp
.Bl -tag -width "src.track" -compact
.It Ar interval
Interval between purging expired states and fragments.
.It Ar frag
Seconds before an unassembled fragment is expired.
.It Ar src.track
Length of time to retain a source tracking entry after the last state
expires.
.El
.Pp
When a packet matches a stateful connection, the seconds to live for the
connection will be updated to that of the
.Ar proto.modifier
which corresponds to the connection state.
Each packet which matches this state will reset the TTL.
Tuning these values may improve the performance of the
firewall at the risk of dropping valid idle connections.
.Pp
.Bl -tag -width xxxx -compact
.It Ar tcp.first
The state after the first packet.
.It Ar tcp.opening
The state before the destination host ever sends a packet.
.It Ar tcp.established
The fully established state.
.It Ar tcp.closing
The state after the first FIN has been sent.
.It Ar tcp.finwait
The state after both FINs have been exchanged and the connection is closed.
Some hosts (notably web servers on Solaris) send TCP packets even after closing
the connection.
Increasing
.Ar tcp.finwait
(and possibly
.Ar tcp.closing )
can prevent blocking of such packets.
.It Ar tcp.closed
The state after one endpoint sends an RST.
.El
.Pp
ICMP and UDP are handled in a fashion similar to TCP, but with a much more
limited set of states:
.Pp
.Bl -tag -width xxxx -compact
.It Ar udp.first
The state after the first packet.
.It Ar udp.single
The state if the source host sends more than one packet but the destination
host has never sent one back.
.It Ar udp.multiple
The state if both hosts have sent packets.
.It Ar icmp.first
The state after the first packet.
.It Ar icmp.error
The state after an ICMP error came back in response to an ICMP packet.
.El
.Pp
Other protocols are handled similarly to UDP:
.Pp
.Bl -tag -width xxxx -compact
.It Ar other.first
.It Ar other.single
.It Ar other.multiple
.El
.Pp
Timeout values can be reduced adaptively as the number of state table
entries grows.
.Pp
.Bl -tag -width xxxx -compact
.It Ar adaptive.start
When the number of state entries exceeds this value, adaptive scaling
begins.
All timeout values are scaled linearly with factor
(adaptive.end - number of states) / (adaptive.end - adaptive.start).
.It Ar adaptive.end
When reaching this number of state entries, all timeout values become
zero, effectively purging all state entries immediately.
This value is used to define the scale factor; it should not actually
be reached (set a lower state limit, see below).
.El
.Pp
Adaptive timeouts are enabled by default, with an adaptive.start value
equal to 60% of the state limit, and an adaptive.end value equal to
120% of the state limit.
They can be disabled by setting both adaptive.start and adaptive.end to 0.
.Pp
The adaptive timeout values can be defined both globally and for each rule.
When used on a per-rule basis, the values relate to the number of
states created by the rule, otherwise to the total number of
states.
.Pp
For example:
.Bd -literal -offset indent
set timeout tcp.first 120
set timeout tcp.established 86400
set timeout { adaptive.start 6000, adaptive.end 12000 }
set limit states 10000
.Ed
.Pp
With 9000 state table entries, the timeout values are scaled to 50%
(tcp.first 60, tcp.established 43200).
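.Pp
A per-rule variant might look like the following sketch, in which the
port and the numbers are chosen purely for illustration:
.Bd -literal -offset indent
# adaptive scaling based only on the states created by this rule
pass in proto tcp from any to any port 80 \e
    keep state (adaptive.start 500, adaptive.end 1000)
.Ed
.Pp
With 750 states created by this rule, the factor is
(1000 - 750) / (1000 - 500), so the timeouts of those states are scaled
to 50%.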
.Pp
.It Ar set loginterface
Enable collection of packet and byte count statistics for the given
interface or interface group.
These statistics can be viewed using
.Bd -literal -offset indent
# pfctl -s info
.Ed
.Pp
In this example
.Xr pf 4
collects statistics on the interface named dc0:
.Bd -literal -offset indent
set loginterface dc0
.Ed
.Pp
One can disable the loginterface using:
.Bd -literal -offset indent
set loginterface none
.Ed
.Pp
.It Ar set limit
Sets hard limits on the memory pools used by the packet filter.
See
.Xr zone 9
for an explanation of memory pools.
.Pp
For example,
.Bd -literal -offset indent
set limit states 20000
.Ed
.Pp
sets the maximum number of entries in the memory pool used by state table
entries (generated by
.Ar pass
rules which do not specify
.Ar no state )
to 20000.
Using
.Bd -literal -offset indent
set limit frags 20000
.Ed
.Pp
sets the maximum number of entries in the memory pool used for fragment
reassembly (generated by
.Ar scrub
rules) to 20000.
Using
.Bd -literal -offset indent
set limit src-nodes 2000
.Ed
.Pp
sets the maximum number of entries in the memory pool used for tracking
source IP addresses (generated by the
.Ar sticky-address
and
.Ar src.track
options) to 2000.
Using
.Bd -literal -offset indent
set limit tables 1000
set limit table-entries 100000
.Ed
.Pp
sets limits on the memory pools used by tables.
The first limits the number of tables that can exist to 1000.
The second limits the overall number of addresses that can be stored
in tables to 100000.
.Pp
Various limits can be combined on a single line:
.Bd -literal -offset indent
set limit { states 20000, frags 20000, src-nodes 2000 }
.Ed
.Pp
.It Ar set ruleset-optimization
.Bl -tag -width xxxxxxxx -compact
.It Ar none
Disable the ruleset optimizer.
.It Ar basic
Enable basic ruleset optimization.
This is the default behaviour.
Basic ruleset optimization does four things to improve the
performance of ruleset evaluations:
.Pp
.Bl -enum -compact
.It
remove duplicate rules
.It
remove rules that are a subset of another rule
.It
combine multiple rules into a table when advantageous
.It
re-order the rules to improve evaluation performance
.El
.Pp
.It Ar profile
Uses the currently loaded ruleset as a feedback profile to tailor the
ordering of quick rules to actual network traffic.
.El
.Pp
It is important to note that the ruleset optimizer will modify the ruleset
to improve performance.
A side effect of the ruleset modification is that per-rule accounting
statistics will have different meanings than before.
If per-rule accounting matters, for example for billing,
either the ruleset optimizer should not be used or a label field should
be added to all of the accounting rules to act as optimization barriers.
.Pp
Optimization can also be set as a command-line argument to
.Xr pfctl 8 ,
overriding the settings in
.Nm .
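.Pp
As a sketch, the optimizer level is selected with one of the keywords
above, and an accounting rule can carry a label so that it acts as an
optimization barrier (the label text and the
.Ar $ext_if
macro are hypothetical):
.Bd -literal -offset indent
# in the options part of the file
set ruleset-optimization basic
# in the filtering part; hypothetical accounting rule with a label
pass in on $ext_if proto tcp from any to any port 80 label \&"www-acct\&"
.Ed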
.It Ar set optimization
Optimize state timeouts for one of the following network environments:
.Pp
.Bl -tag -width xxxx -compact
.It Ar normal
A normal network environment.
Suitable for almost all networks.
.It Ar high-latency
A high-latency environment (such as a satellite connection).
.It Ar satellite
Alias for
.Ar high-latency .
.It Ar aggressive
Aggressively expire connections.
This can greatly reduce the memory usage of the firewall at the cost of
dropping idle connections early.
.It Ar conservative
Extremely conservative settings.
Avoid dropping legitimate connections at the
expense of greater memory utilization (possibly much greater on a busy
network) and slightly increased processor utilization.
.El
.Pp
For example:
.Bd -literal -offset indent
set optimization aggressive
.Ed
.Pp
.It Ar set block-policy
The
.Ar block-policy
option sets the default behaviour for the packet
.Ar block
action:
.Pp
.Bl -tag -width xxxxxxxx -compact
.It Ar drop
Packet is silently dropped.
.It Ar return
A TCP RST is returned for blocked TCP packets,
an ICMP UNREACHABLE is returned for blocked UDP packets,
and all other packets are silently dropped.
.El
.Pp
For example:
.Bd -literal -offset indent
set block-policy return
.Ed
.It Ar set state-policy
The
.Ar state-policy
option sets the default behaviour for states:
.Pp
.Bl -tag -width group-bound -compact
.It Ar if-bound
States are bound to an interface.
.It Ar floating
States can match packets on any interface (the default).
.El
.Pp
For example:
.Bd -literal -offset indent
set state-policy if-bound
.Ed
.It Ar set state-defaults
The
.Ar state-defaults
option sets the state options for states created from rules
without an explicit
.Ar keep state .
For example:
.Bd -literal -offset indent
set state-defaults pflow, no-sync
.Ed
.It Ar set hostid
The 32-bit
.Ar hostid
identifies this firewall's state table entries to other firewalls
in a
.Xr pfsync 4
failover cluster.
By default the hostid is set to a pseudo-random value, however it may be
desirable to manually configure it, for example to more easily identify the
source of state table entries.
.Bd -literal -offset indent
set hostid 1
.Ed
.Pp
The hostid may be specified in either decimal or hexadecimal.
.It Ar set require-order
By default
.Xr pfctl 8
enforces an ordering of the statement types in the ruleset to:
.Em options ,
.Em normalization ,
.Em queueing ,
.Em translation ,
.Em filtering .
Setting this option to
.Ar no
disables this enforcement.
There may be non-trivial and non-obvious implications to an out of
order ruleset.
Consider carefully before disabling the order enforcement.
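.Pp
To disable the enforcement, a line such as the following would be used:
.Pp
.Dl set require-order no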
.It Ar set fingerprints
Load fingerprints of known operating systems from the given filename.
By default fingerprints of known operating systems are automatically
loaded from
.Xr pf.os 5
in
.Pa /etc
but can be overridden via this option.
Setting this option may leave a small period of time where the fingerprints
referenced by the currently active ruleset are inconsistent until the new
ruleset finishes loading.
.Pp
For example:
.Pp
.Dl set fingerprints \&"/etc/pf.os.devel\&"
.Pp
.It Ar set skip on Aq Ar ifspec
List interfaces for which packets should not be filtered.
Packets passing in or out on such interfaces are passed as if pf were
disabled, i.e. pf does not process them in any way.
This can be useful on loopback and other virtual interfaces, when
packet filtering is not desired and can have unexpected effects.
For example:
.Pp
.Dl set skip on lo0
.Pp
.It Ar set debug
Set the debug
.Ar level
to one of the following:
.Pp
.Bl -tag -width xxxxxxxxxxxx -compact
.It Ar none
Don't generate debug messages.
.It Ar urgent
Generate debug messages only for serious errors.
.It Ar misc
Generate debug messages for various errors.
.It Ar loud
Generate debug messages for common conditions.
.El
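.Pp
For example, to report only serious errors:
.Pp
.Dl set debug urgent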
.El
.Sh TRAFFIC NORMALIZATION
Traffic normalization is used to sanitize packet content in such
a way that there are no ambiguities in packet interpretation on
the receiving side.
The normalizer does IP fragment reassembly to prevent attacks
that confuse intrusion detection systems by sending overlapping
IP fragments.
Packet normalization is invoked with the
.Ar scrub
directive.
.Pp
.Ar scrub
has the following options:
.Bl -tag -width xxxx
.It Ar no-df
Clears the
.Ar dont-fragment
bit from a matching IP packet.
Some operating systems are known to generate fragmented packets with the
.Ar dont-fragment
bit set.
This is particularly true with NFS.
.Ar Scrub
will drop such fragmented
.Ar dont-fragment
packets unless
.Ar no-df
is specified.
.Pp
Unfortunately some operating systems also generate their
.Ar dont-fragment
packets with a zero IP identification field.
Clearing the
.Ar dont-fragment
bit on packets with a zero IP ID may cause deleterious results if an
upstream router later fragments the packet.
Using the
.Ar random-id
modifier (see below) is recommended in combination with the
.Ar no-df
modifier to ensure unique IP identifiers.
.It Ar min-ttl Aq Ar number
Enforces a minimum TTL for matching IP packets.
.It Ar max-mss Aq Ar number
Enforces a maximum MSS for matching TCP packets.
.It Xo Ar set-tos Aq Ar string
.No \*(Ba Aq Ar number
.Xc
Enforces a
.Em TOS
for matching IP packets.
.Em TOS
may be
given as one of
.Ar lowdelay ,
.Ar throughput ,
.Ar reliability ,
or as either hex or decimal.
.It Ar random-id
Replaces the IP identification field with random values to compensate
for predictable values generated by many hosts.
This option only applies to packets that are not fragmented
after the optional fragment reassembly.
.It Ar fragment reassemble
Using
.Ar scrub
rules, fragments can be reassembled by normalization.
In this case, fragments are buffered until they form a complete
packet, and only the completed packet is passed on to the filter.
The advantage is that filter rules have to deal only with complete
packets, and can ignore fragments.
The drawback of buffering fragments is the additional memory cost;
however, full reassembly is the only method that currently works
with NAT.
This is the default behavior of a
.Ar scrub
rule if no fragmentation modifier is supplied.
.It Ar fragment crop
The default fragment reassembly method is expensive, hence the option
to crop is provided.
In this case,
.Xr pf 4
will track the fragments and cache a small range descriptor.
Duplicate fragments are dropped and overlaps are cropped.
Thus data will only occur once on the wire with ambiguities resolving to
the first occurrence.
Unlike the
.Ar fragment reassemble
modifier, fragments are not buffered, they are passed as soon as they
are received.
The
.Ar fragment crop
reassembly mechanism does not yet work with NAT.
.Pp
.It Ar fragment drop-ovl
This option is similar to the
.Ar fragment crop
modifier except that all overlapping or duplicate fragments will be
dropped, and all further corresponding fragments will be
dropped as well.
.It Ar reassemble tcp
Statefully normalizes TCP connections.
.Ar scrub reassemble tcp
rules may not have the direction (in/out) specified.
.Ar reassemble tcp
performs the following normalizations:
.Pp
.Bl -tag -width timeout -compact
.It ttl
Neither side of the connection is allowed to reduce their IP TTL.
An attacker may send a packet such that it reaches the firewall, affects
the firewall state, and expires before reaching the destination host.
.Ar reassemble tcp
will raise the TTL of all packets back up to the highest value seen on
the connection.
.It timestamp modulation
Modern TCP stacks will send a timestamp on every TCP packet and echo
the other endpoint's timestamp back to them.
Many operating systems will merely start the timestamp at zero when
first booted, and increment it several times a second.
The uptime of the host can be deduced by reading the timestamp and multiplying
by a constant.
Also observing several different timestamps can be used to count hosts
behind a NAT device.
And spoofing TCP packets into a connection requires knowing or guessing
valid timestamps.
Timestamps merely need to be monotonically increasing and not derived from a
guessable base time.
.Ar reassemble tcp
will cause
.Ar scrub
to modulate the TCP timestamps with a random number.
.It extended PAWS checks
There is a problem with TCP on long fat pipes, in that a packet might get
delayed for longer than it takes the connection to wrap its 32-bit sequence
space.
In such an occurrence, the old packet would be indistinguishable from a
new packet and would be accepted as such.
The solution to this is called PAWS: Protection Against Wrapped Sequence
numbers.
PAWS guards against this by ensuring that the timestamp on each packet does
not go backwards.
.Ar reassemble tcp
also makes sure the timestamp on the packet does not go forward more
than the RFC allows.
By doing this,
.Xr pf 4
artificially extends the security of TCP sequence numbers by 10 to 18
bits when the host uses appropriately randomized timestamps, since a
blind attacker would have to guess the timestamp as well.
.El
.El
.Pp
For example,
.Bd -literal -offset indent
scrub in on $ext_if all fragment reassemble
.Ed
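.Pp
Several options can be combined in one rule.
The following sketch assumes the
.Ar $ext_if
macro from the earlier examples; the MSS value is only illustrative:
.Bd -literal -offset indent
# clear DF, randomize IP IDs and clamp the MSS on inbound packets
scrub in on $ext_if all no-df random-id max-mss 1440
# stateful TCP normalization; no direction may be given
scrub on $ext_if all reassemble tcp
.Ed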
.Pp
The
.Ar no
option prefixed to a scrub rule causes matching packets to remain unscrubbed,
much in the same way as
.Ar block quick
works in the packet filter (see below).
This mechanism should be used when it is necessary to exclude specific packets
from broader scrub rules.
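.Pp
A sketch of such an exclusion, using a made-up host address and the
.Ar $ext_if
macro, with the
.Ar no scrub
rule placed ahead of the broader rule:
.Bd -literal -offset indent
# 192.0.2.10 is a placeholder address
no scrub in on $ext_if from 192.0.2.10 to any
scrub in on $ext_if all
.Ed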
.Sh QUEUEING
The ALTQ system is currently not available in the GENERIC kernel nor as
loadable modules.
To use the queueing options described in this section, a custom-built
kernel is required.
Please refer to
.Xr altq 4
to learn about the related kernel options.
.Pp
Packets can be assigned to queues for the purpose of bandwidth
control.
At least two declarations are required to configure queues, and later
any packet filtering rule can reference the defined queues by name.
During the filtering component of
.Nm pf.conf ,
the last referenced
.Ar queue
name is where any packets from
.Ar pass
rules will be queued, while for
.Ar block
rules it specifies where any resulting ICMP or TCP RST
packets should be queued.
The
.Ar scheduler
defines the algorithm used to decide which packets get delayed, dropped, or
sent out immediately.
There are three
.Ar schedulers
currently supported.
.Bl -tag -width xxxx
.It Ar cbq
Class Based Queueing.
.Ar Queues
attached to an interface build a tree, thus each
.Ar queue
can have further child
.Ar queues .
Each queue can have a
.Ar priority
and a
.Ar bandwidth
assigned.
.Ar Priority
mainly controls the time packets take to get sent out, while
.Ar bandwidth
primarily affects throughput.
.Ar cbq
achieves both partitioning and sharing of link bandwidth
by hierarchically structured classes.
Each class has its own
.Ar queue
and is assigned its share of
.Ar bandwidth .
A child class can borrow bandwidth from its parent class
as long as excess bandwidth is available
(see the option
.Ar borrow ,
below).
.It Ar priq
Priority Queueing.
.Ar Queues
are attached directly to the interface in a flat list; thus
.Ar queues
cannot have further child
.Ar queues .
Each
.Ar queue
has a unique
.Ar priority
assigned, ranging from 0 to 15.
Packets in the
.Ar queue
with the highest
.Ar priority
are processed first.
.It Ar hfsc
Hierarchical Fair Service Curve.
.Ar Queues
attached to an interface build a tree, thus each
.Ar queue
can have further child
.Ar queues .
Each queue can have a
.Ar priority
and a
.Ar bandwidth
assigned.
.Ar Priority
mainly controls the time packets take to get sent out, while
.Ar bandwidth
primarily affects throughput.
.Ar hfsc
supports both link-sharing and guaranteed real-time services.
It employs a service curve based QoS model,
and its unique feature is an ability to decouple
.Ar delay
and
.Ar bandwidth
allocation.
.El
.Pp
The interfaces on which queueing should be activated are declared using
the
.Ar altq on
declaration.
.Ar altq on
has the following keywords:
.Bl -tag -width xxxx
.It Aq Ar interface
Queueing is enabled on the named interface.
.It Aq Ar scheduler
Specifies which queueing scheduler to use.
Currently supported values
are
.Ar cbq
for Class Based Queueing,
.Ar priq
for Priority Queueing and
.Ar hfsc
for the Hierarchical Fair Service Curve scheduler.
.It Ar bandwidth Aq Ar bw
The maximum bitrate for all queues on an
interface may be specified using the
.Ar bandwidth
keyword.
The value can be specified as an absolute value or as a
percentage of the interface bandwidth.
When using an absolute value, the suffixes
.Ar b ,
.Ar Kb ,
.Ar Mb ,
and
.Ar Gb
are used to represent bits, kilobits, megabits, and
gigabits per second, respectively.
The value must not exceed the interface bandwidth.
If
.Ar bandwidth
is not specified, the interface bandwidth is used
(but take note that some interfaces do not know their bandwidth,
or can adapt their bandwidth rates).
.It Ar qlimit Aq Ar limit
The maximum number of packets held in the queue.
The default is 50.
.It Ar tbrsize Aq Ar size
Adjusts the size, in bytes, of the token bucket regulator.
If not specified, heuristics based on the
interface bandwidth are used to determine the size.
.It Ar queue Aq Ar list
Defines a list of subqueues to create on an interface.
.El
.Pp
In the following example, the interface dc0
should queue up to 5Mbps in four second-level queues using
Class Based Queueing.
Those four queues will be shown in a later example.
.Bd -literal -offset indent
altq on dc0 cbq bandwidth 5Mb queue { std, http, mail, ssh }
.Ed
.Pp
Once interfaces are activated for queueing using the
.Ar altq
directive, a sequence of
.Ar queue
directives may be defined.
The name associated with a
.Ar queue
must match a queue defined in the
.Ar altq
directive (e.g. mail), or, except for the
.Ar priq
.Ar scheduler ,
in a parent
.Ar queue
declaration.
The following keywords can be used:
.Bl -tag -width xxxx
.It Ar on Aq Ar interface
Specifies the interface the queue operates on.
If not given, it operates on all matching interfaces.
.It Ar bandwidth Aq Ar bw
Specifies the maximum bitrate to be processed by the queue.
This value must not exceed the value of the parent
.Ar queue
and can be specified as an absolute value or a percentage of the parent
queue's bandwidth.
If not specified, defaults to 100% of the parent queue's bandwidth.
The
.Ar priq
scheduler does not support bandwidth specification.
.It Ar priority Aq Ar level
A priority level can be set for each queue.
For
.Ar cbq
and
.Ar hfsc ,
the range is 0 to 7 and for
.Ar priq ,
the range is 0 to 15.
The default for all is 1.
.Ar Priq
queues with a higher priority are always served first.
.Ar Cbq
and
.Ar Hfsc
queues with a higher priority are preferred in the case of overload.
.It Ar qlimit Aq Ar limit
The maximum number of packets held in the queue.
The default is 50.
.El
.Pp
The
.Ar scheduler
can get additional parameters with
.Xo Aq Ar scheduler
.Pf ( Aq Ar parameters ) .
.Xc
Parameters are as follows:
.Bl -tag -width Fl
.It Ar default
Packets not matched by another queue are assigned to this one.
Exactly one default queue is required.
.It Ar red
Enable RED (Random Early Detection) on this queue.
RED drops packets with a probability proportional to the average
queue length.
.It Ar rio
Enables RIO on this queue.
RIO is RED with IN/OUT, thus running
RED twice would achieve roughly the same effect as RIO.
RIO is currently not supported in the GENERIC kernel.
.It Ar ecn
Enables ECN (Explicit Congestion Notification) on this queue.
ECN implies RED.
.El
.Pp
The
.Ar cbq
.Ar scheduler
supports an additional option:
.Bl -tag -width Fl
.It Ar borrow
The queue can borrow bandwidth from the parent.
.El
.Pp
The
.Ar hfsc
.Ar scheduler
supports some additional options:
.Bl -tag -width Fl
.It Ar realtime Aq Ar sc
The minimum required bandwidth for the queue.
.It Ar upperlimit Aq Ar sc
The maximum allowed bandwidth for the queue.
.It Ar linkshare Aq Ar sc
The bandwidth share of a backlogged queue.
.El
.Pp
.Aq Ar sc
is an abbreviation for
.Ar service curve .
.Pp
The format for service curve specifications is
.Ar ( m1 , d , m2 ) .
.Ar m2
controls the bandwidth assigned to the queue.
.Ar m1
and
.Ar d
are optional and can be used to control the initial bandwidth assignment.
For the first
.Ar d
milliseconds the queue gets the bandwidth given as
.Ar m1 ,
afterwards the value given in
.Ar m2 .
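.Pp
As an illustrative sketch (the interface name, the queue names and all
figures are assumptions), a queue whose real-time curve is 2Mb for the
first 50 milliseconds and 1Mb afterwards could be declared as:
.Bd -literal -offset indent
altq on em0 hfsc bandwidth 10Mb queue { bulk, voice }
queue bulk bandwidth 80% hfsc(default)
# m1 = 2Mb, d = 50 milliseconds, m2 = 1Mb
queue voice bandwidth 20% hfsc(realtime (2Mb, 50, 1Mb))
.Ed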
.Pp
Furthermore, with
.Ar cbq
and
.Ar hfsc ,
child queues can be specified as in an
.Ar altq
declaration, thus building a tree of queues using a part of
their parent's bandwidth.
.Pp
Packets can be assigned to queues based on filter rules by using the
.Ar queue
keyword.
Normally only one
.Ar queue
is specified; when a second one is specified it will instead be used for
packets which have a
.Em TOS
of
.Em lowdelay
and for TCP ACKs with no data payload.
.Pp
To continue the previous example, the examples below would specify the
four referenced
queues, plus a few child queues.
Interactive
.Xr ssh 1
sessions get priority over bulk transfers like
.Xr scp 1
and
.Xr sftp 1 .
The queues may then be referenced by filtering rules (see
.Sx PACKET FILTERING
below).
.Bd -literal
queue std bandwidth 10% cbq(default)
queue http bandwidth 60% priority 2 cbq(borrow red) \e
      { employees, developers }
queue  developers bandwidth 75% cbq(borrow)
queue  employees bandwidth 15%
queue mail bandwidth 10% priority 0 cbq(borrow ecn)
queue ssh bandwidth 20% cbq(borrow) { ssh_interactive, ssh_bulk }
queue  ssh_interactive bandwidth 50% priority 7 cbq(borrow)
queue  ssh_bulk bandwidth 50% priority 0 cbq(borrow)
block return out on dc0 inet all queue std
pass out on dc0 inet proto tcp from $developerhosts to any port 80 \e
      queue developers
pass out on dc0 inet proto tcp from $employeehosts to any port 80 \e
      queue employees
pass out on dc0 inet proto tcp from any to any port 22 \e
      queue(ssh_bulk, ssh_interactive)
pass out on dc0 inet proto tcp from any to any port 25 \e
      queue mail
.Ed
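.Pp
By way of contrast, a flat
.Ar priq
setup has no child queues and no per-queue bandwidth.
A minimal sketch, with a hypothetical interface and hypothetical queue
names:
.Bd -literal
# em1 and the queue names are placeholders
altq on em1 priq bandwidth 5Mb queue { std, bulk, interactive }
queue std priority 1 priq(default)
queue bulk priority 0
queue interactive priority 7
pass out on em1 inet proto tcp from any to any port 22 \e
      queue(bulk, interactive)
.Ed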
.Sh TRANSLATION
Translation rules modify either the source or destination address of the
packets associated with a stateful connection.
A stateful connection is automatically created to track packets matching
such a rule as long as they are not blocked by the filtering section of
.Nm pf.conf .
The translation engine modifies the specified address and/or port in the
packet, recalculates IP, TCP and UDP checksums as necessary, and passes it to
the packet filter for evaluation.
.Pp
Since translation occurs before filtering, the filter
engine will see packets as they look after any
addresses and ports have been translated.
Filter rules will therefore have to filter based on the translated
address and port number.
Packets that match a translation rule are only automatically passed if
the
.Ar pass
modifier is given, otherwise they are
still subject to
.Ar block
and
.Ar pass
rules.
.Pp
The state entry created permits
.Xr pf 4
to keep track of the original address for traffic associated with that state
and correctly direct return traffic for that connection.
.Pp
Various types of translation are possible with pf:
.Bl -tag -width xxxx
.It Ar binat
A
.Ar binat
rule specifies a bidirectional mapping between an external IP netblock
and an internal IP netblock.
.It Ar nat
A
.Ar nat
rule specifies that IP addresses are to be changed as the packet
traverses the given interface.
This technique allows one or more IP addresses
on the translating host to support network traffic for a larger range of
machines on an "inside" network.
Although in theory any IP address can be used on the inside, it is strongly
recommended that one of the address ranges defined by RFC 1918 be used.
These netblocks are:
.Bd -literal
10.0.0.0 - 10.255.255.255 (all of net 10, i.e., 10/8)
172.16.0.0 - 172.31.255.255 (i.e., 172.16/12)
192.168.0.0 - 192.168.255.255 (i.e., 192.168/16)
.Ed
.It Ar rdr
The packet is redirected to another destination and possibly a
different port.
.Ar rdr
rules can optionally specify port ranges instead of single ports.
rdr ... port 2000:2999 -\*(Gt ... port 4000
redirects ports 2000 to 2999 (inclusive) to port 4000.
rdr ... port 2000:2999 -\*(Gt ... port 4000:*
redirects port 2000 to 4000, 2001 to 4001, ..., 2999 to 4999.
.El
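.Pp
To make the two
.Ar rdr
port range forms above concrete, here is a minimal sketch.
The interface macro
.Ar $ext_if
and the address 10.1.2.150 are placeholders chosen for illustration.
Only one of the two rules would normally be loaded, since the first matching
translation rule wins:
.Bd -literal -offset indent
# ports 2000-2999 on the external interface all go to port 4000
rdr on $ext_if proto tcp from any to any port 2000:2999 \e
      -\*(Gt 10.1.2.150 port 4000

# port 2000 goes to 4000, 2001 to 4001, ..., 2999 to 4999
rdr on $ext_if proto tcp from any to any port 2000:2999 \e
      -\*(Gt 10.1.2.150 port 4000:*
.Ed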
.Pp
In addition to modifying the address, some translation rules may modify
source or destination ports for
.Xr tcp 4
or
.Xr udp 4
connections; implicitly in the case of
.Ar nat
rules and explicitly in the case of
.Ar rdr
rules.
Port numbers are never translated with a
.Ar binat
rule.
.Pp
Evaluation order of the translation rules is dependent on the type
of the translation rules and of the direction of a packet.
.Ar binat
rules are always evaluated first.
Then either the
.Ar rdr
rules are evaluated on an inbound packet or the
.Ar nat
rules on an outbound packet.
Rules of the same type are evaluated in the same order in which they
appear in the ruleset.
The first matching rule decides what action is taken.
.Pp
The
.Ar no
option prefixed to a translation rule causes packets to remain untranslated,
much in the same way as
.Ar drop quick
works in the packet filter (see below).
If no rule matches the packet it is passed to the filter engine unmodified.
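.Pp
A hedged illustration of the
.Ar no
prefix follows; the interface macro and the addresses are assumptions made up
for the example.
Because the first matching translation rule wins, the
.Ar no nat
rule must precede the general
.Ar nat
rule:
.Bd -literal -offset indent
no nat on $ext_if from 192.168.1.208 to any
nat on $ext_if from 192.168.1.0/24 to any -\*(Gt ($ext_if)
.Ed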
.Pp
Translation rules apply only to packets that pass through
the specified interface, and if no interface is specified,
translation is applied to packets on all interfaces.
For instance, redirecting port 80 on an external interface to an internal
web server will only work for connections originating from the outside.
Connections to the address of the external interface from local hosts will
not be redirected, since such packets do not actually pass through the
external interface.
Redirections cannot reflect packets back through the interface they arrive
on; they can only be redirected to hosts connected to different interfaces
or to the firewall itself.
.Pp
Note that redirecting external incoming connections to the loopback
address, as in
.Bd -literal -offset indent
rdr on ne3 inet proto tcp to port smtp -\*(Gt 127.0.0.1 port spamd
.Ed
.Pp
will effectively allow an external host to connect to daemons
bound solely to the loopback address, circumventing the traditional
blocking of such connections on a real interface.
Unless this effect is desired, any of the local non-loopback addresses
should be used as redirection target instead, which allows external
connections only to daemons bound to this address or not bound to
any address.
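.Pp
As a sketch of the safer alternative, the same redirection can target a
local non-loopback address instead.
The address 10.0.0.1 is a hypothetical address configured on the firewall:
.Bd -literal -offset indent
rdr on ne3 inet proto tcp to port smtp -\*(Gt 10.0.0.1 port spamd
.Ed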
.Pp
See
.Sx TRANSLATION EXAMPLES
below.
.Sh PACKET FILTERING
.Xr pf 4
has the ability to
.Ar block
and
.Ar pass
packets based on attributes of their layer 3 (see
.Xr ip 4
and
.Xr ip6 4 )
and layer 4 (see
.Xr icmp 4 ,
.Xr icmp6 4 ,
.Xr tcp 4 ,
.Xr udp 4 )
headers.
In addition, packets may also be
assigned to queues for the purpose of bandwidth control.
.Pp
For each packet processed by the packet filter, the filter rules are
evaluated in sequential order, from first to last.
The last matching rule decides what action is taken.
If no rule matches the packet, the default action is to pass
the packet.
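.Pp
The last-match behaviour can be sketched with two rules (the port is
chosen only for illustration).
An incoming TCP packet to port 22 matches both rules and is passed, because
the
.Ar pass
rule is the last one to match; any other incoming packet matches only the
.Ar block
rule and is blocked:
.Bd -literal -offset indent
block in all
pass in proto tcp from any to any port 22
.Ed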
.Pp
The following actions can be used in the filter:
.Bl -tag -width xxxx
.It Ar block
The packet is blocked.
There are a number of ways in which a
.Ar block
rule can behave when blocking a packet.
The default behaviour is to
.Ar drop
packets silently, however this can be overridden or made
explicit either globally, by setting the
.Ar block-policy
option, or on a per-rule basis with one of the following options:
.Pp
.Bl -tag -width xxxx -compact
.It Ar drop
The packet is silently dropped.
.It Ar return-rst
This applies only to
.Xr tcp 4
packets, and issues a TCP RST which closes the
connection.
.It Ar return-icmp
.It Ar return-icmp6
This causes ICMP messages to be returned for packets which match the rule.
By default this is an ICMP UNREACHABLE message, however this
can be overridden by specifying a message as a code or number.
.It Ar return
This causes a TCP RST to be returned for
.Xr tcp 4
packets and an ICMP UNREACHABLE for UDP and other packets.
.El
.Pp
Options returning ICMP packets currently have no effect if
.Xr pf 4
operates on a
.Xr if_bridge 4 ,
as the code to support this feature has not yet been implemented.
.Pp
The simplest mechanism to block everything by default and only pass
packets that match explicit rules is to specify a first filter rule of:
.Bd -literal -offset indent
block all
.Ed
.It Ar pass
The packet is passed;
state is created unless the
.Ar no state
option is specified.
.El
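.Pp
The per-rule
.Ar block
options described above might be used as follows.
This is a sketch only; the ports and the choice of ICMP code are
illustrative assumptions:
.Bd -literal -offset indent
# drop silently
block drop in all
# answer ident probes with a TCP RST
block return-rst in proto tcp from any to any port 113
# answer UDP with an ICMP port unreachable message
block return-icmp(port-unr) in proto udp from any to any
# RST for TCP, ICMP UNREACHABLE for everything else
block return out all
.Ed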
.Pp
By default
.Xr pf 4
filters packets statefully; the first time a packet matches a
.Ar pass
rule, a state entry is created; for subsequent packets the filter checks
whether the packet matches any state.
If it does, the packet is passed without evaluation of any rules.
After the connection is closed or times out, the state entry is automatically
removed.
.Pp
This has several advantages.
For TCP connections, comparing a packet to a state involves checking
its sequence numbers, as well as TCP timestamps if a
.Ar scrub reassemble tcp
rule applies to the connection.
If these values are outside the narrow windows of expected
values, the packet is dropped.
This prevents spoofing attacks, such as when an attacker sends packets with
a fake source address/port but does not know the connection's sequence
numbers.
Similarly,
.Xr pf 4
knows how to match ICMP replies to states.
For example,
.Bd -literal -offset indent
pass out inet proto icmp all icmp-type echoreq
.Ed
.Pp
allows echo requests (such as those created by
.Xr ping 8 )
out statefully, and matches incoming echo replies correctly to states.
.Pp
Also, looking up states is usually faster than evaluating rules.
If there are 50 rules, all of them are evaluated sequentially in O(n).
Even with 50000 states, only 16 comparisons are needed to match a
state, since states are stored in a binary search tree that allows
searches in O(log2 n).
.Pp
Furthermore, correct handling of ICMP error messages is critical to
many protocols, particularly TCP.
.Xr pf 4
matches ICMP error messages to the correct connection, checks them against
connection parameters, and passes them if appropriate.
For example, if an ICMP source quench message referring to a stateful TCP
connection arrives, it will be matched to the state and get passed.
.Pp
Finally, state tracking is required for
.Ar nat , binat No and Ar rdr
rules, in order to track address and port translations and reverse the
translation on returning packets.
.Pp
.Xr pf 4
will also create state for other protocols which are effectively stateless by
nature.
UDP packets are matched to states using only host addresses and ports,
and other protocols are matched to states using only the host addresses.
.Pp
If stateless filtering of individual packets is desired,
the
.Ar no state
keyword can be used to specify that state will not be created
if this is the last matching rule.
A number of parameters can also be set to affect how
.Xr pf 4
handles state tracking.
See
.Sx STATEFUL TRACKING OPTIONS
below for further details.
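.Pp
A minimal sketch of stateless filtering follows.
Since no state is created, replies are not matched automatically and need
a rule of their own.
The macros
.Ar $int_if
and
.Ar $ns
(a name server address on the firewall itself) are hypothetical:
.Bd -literal -offset indent
pass in  on $int_if proto udp from any to $ns port domain no state
pass out on $int_if proto udp from $ns port domain to any no state
.Ed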
.Sh PARAMETERS
The rule parameters specify the packets to which a rule applies.
A packet always comes in on, or goes out through, one interface.
Most parameters are optional.
If a parameter is specified, the rule only applies to packets with
matching attributes.
Certain parameters can be expressed as lists, in which case
.Xr pfctl 8
generates all needed rule combinations.
.Bl -tag -width xxxx
.It Ar in No or Ar out
This rule applies to incoming or outgoing packets.
If neither
.Ar in
nor
.Ar out
is specified, the rule will match packets in both directions.
.It Ar log
In addition to the action specified, a log message is generated.
Only the packet that establishes the state is logged,
unless the
.Ar no state
option is specified.
The logged packets are sent to a
.Xr pflog 4
interface, by default
.Ar pflog0 .
This interface is monitored by the
.Xr pflogd 8
logging daemon, which dumps the logged packets to the file
.Pa /var/log/pflog
in
.Xr pcap 3
binary format.
.It Ar log (all)
Used to force logging of all packets for a connection.
This is not necessary when
.Ar no state
is explicitly specified.
As with
.Ar log ,
packets are logged to
.Xr pflog 4 .
.It Ar log (user)
Logs the
.Ux
user ID of the user that owns the socket and the PID of the process that
has the socket open where the packet is sourced from or destined to
(depending on which socket is local).
This is in addition to the normal information logged.
.Pp
Due to the problems described in the BUGS section only the first packet
logged via
.Ar log (all, user)
will have the user credentials logged when using stateful matching.
.It Ar log (to Aq Ar interface )
Send logs to the specified
.Xr pflog 4
interface instead of
.Ar pflog0 .
.It Ar quick
If a packet matches a rule which has the
.Ar quick
option set, this rule
is considered the last matching rule, and evaluation of subsequent rules
is skipped.
.It Ar on Aq Ar interface
This rule applies only to packets coming in on, or going out through, this
particular interface or interface group.
For more information on interface groups,
see the
.Ic group
keyword in
.Xr ifconfig 8 .
.It Aq Ar af
This rule applies only to packets of this address family.
Supported values are
.Ar inet
and
.Ar inet6 .
.It Ar proto Aq Ar protocol
This rule applies only to packets of this protocol.
Common protocols are
.Xr icmp 4 ,
.Xr icmp6 4 ,
.Xr tcp 4 ,
and
.Xr udp 4 .
For a list of all the protocol name to number mappings used by
.Xr pfctl 8 ,
see the file
.Pa /etc/protocols .
.It Xo
.Ar from Aq Ar source
.Ar port Aq Ar source
.Ar os Aq Ar source
.Ar to Aq Ar dest
.Ar port Aq Ar dest
.Xc
This rule applies only to packets with the specified source and destination
addresses and ports.
.Pp
Addresses can be specified in CIDR notation (matching netblocks), as
symbolic host names, interface names or interface group names, or as any
of the following keywords:
.Pp
.Bl -tag -width xxxxxxxxxxxxxx -compact
.It Ar any
Any address.
.It Ar route Aq Ar label
Any address whose associated route has label
.Aq Ar label .
See
.Xr route 4
and
.Xr route 8 .
.It Ar no-route
Any address which is not currently routable.
.It Ar urpf-failed
Any source address that fails a unicast reverse path forwarding (URPF)
check, i.e. packets coming in on an interface other than that which holds
the route back to the packet's source address.
.It Aq Ar table
Any address that matches the given table.
.El
.Pp
Ranges of addresses are specified by using the
.Sq -
operator.
For instance:
.Dq 10.1.1.10 - 10.1.1.12
means all addresses from 10.1.1.10 to 10.1.1.12,
hence addresses 10.1.1.10, 10.1.1.11, and 10.1.1.12.
.Pp
Interface names and interface group names can have modifiers appended:
.Pp
.Bl -tag -width xxxxxxxxxxxx -compact
.It Ar :network
Translates to the network(s) attached to the interface.
.It Ar :broadcast
Translates to the interface's broadcast address(es).
.It Ar :peer
Translates to the point-to-point interface's peer address(es).
.It Ar :0
Do not include interface aliases.
.El
.Pp
Host names may also have the
.Ar :0
option appended to restrict the name resolution to the first of each
v4 and v6 address found.
.Pp
Host name resolution and interface to address translation are done at
ruleset load-time.
When the address of an interface (or host name) changes (under DHCP or PPP,
for instance), the ruleset must be reloaded for the change to be reflected
in the kernel.
Surrounding the interface name (and optional modifiers) in parentheses
changes this behaviour.
When the interface name is surrounded by parentheses, the rule is
automatically updated whenever the interface changes its address.
The ruleset does not need to be reloaded.
This is especially useful with
.Ar nat .
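.Pp
The modifiers and the parenthesized form might be combined as in the
following sketch, where
.Ar $ext_if
and
.Ar $int_if
are hypothetical interface macros.
The two
.Ar nat
rules are alternatives, not meant to be loaded together:
.Bd -literal -offset indent
# translated address follows the current address of $ext_if
nat on $ext_if from $int_if:network to any -\*(Gt ($ext_if)
# alternatively, ignoring interface aliases
nat on $ext_if from $int_if:network to any -\*(Gt ($ext_if:0)
# resolved once, at ruleset load-time
block in on $ext_if from any to $ext_if:broadcast
.Ed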
.Pp
Ports can be specified either by number or by name.
For example, port 80 can be specified as
.Em www .
For a list of all port name to number mappings used by
.Xr pfctl 8 ,
see the file
.Pa /etc/services .
.Pp
Ports and ranges of ports are specified by using these operators:
.Bd -literal -offset indent
= (equal)
!= (unequal)
\*(Lt (less than)
\*(Le (less than or equal)
\*(Gt (greater than)
\*(Ge (greater than or equal)
: (range including boundaries)
\*(Gt\*(Lt (range excluding boundaries)
\*(Lt\*(Gt (except range)
.Ed
.Pp
.Sq \*(Gt\*(Lt ,
.Sq \*(Lt\*(Gt
and
.Sq \&:
are binary operators (they take two arguments).
For instance:
.Bl -tag -width Fl
.It Ar port 2000:2004
means
.Sq all ports \*(Ge 2000 and \*(Le 2004 ,
hence ports 2000, 2001, 2002, 2003 and 2004.
.It Ar port 2000 \*(Gt\*(Lt 2004
means
.Sq all ports \*(Gt 2000 and \*(Lt 2004 ,
hence ports 2001, 2002 and 2003.
.It Ar port 2000 \*(Lt\*(Gt 2004
means
.Sq all ports \*(Lt 2000 or \*(Gt 2004 ,
hence ports 1-1999 and 2005-65535.
.El
.Pp
The operating system of the source host can be specified in the case of TCP
rules with the
.Ar os
modifier.
See the
.Sx OPERATING SYSTEM FINGERPRINTING
section for more information.
.Pp
The host, port and OS specifications are optional, as in the following examples:
.Bd -literal -offset indent
pass in all
pass in from any to any
pass in proto tcp from any port \*(Le 1024 to any
pass in proto tcp from any to any port 25
pass in proto tcp from 10.0.0.0/8 port \*(Gt 1024 \e
      to ! 10.1.2.3 port != ssh
pass in proto tcp from any os "OpenBSD"
pass in proto tcp from route "DTAG"
.Ed
.It Ar all
This is equivalent to "from any to any".
.It Ar group Aq Ar group
Similar to
.Ar user ,
this rule only applies to packets of sockets owned by the specified group.
.It Ar user Aq Ar user
This rule only applies to packets of sockets owned by the specified user.
For outgoing connections initiated from the firewall, this is the user
that opened the connection.
For incoming connections to the firewall itself, this is the user that
listens on the destination port.
For forwarded connections, where the firewall is not a connection endpoint,
the user and group are
.Em unknown .
.Pp
All packets, both outgoing and incoming, of one connection are associated
with the same user and group.
Only TCP and UDP packets can be associated with users; for other protocols
these parameters are ignored.
.Pp
User and group refer to the effective (as opposed to the real) IDs, in
case the socket is created by a setuid/setgid process.
User and group IDs are stored when a socket is created;
when a process creates a listening socket as root (for instance, by
binding to a privileged port) and subsequently changes to another
user ID (to drop privileges), the credentials will remain root.
.Pp
User and group IDs can be specified as either numbers or names.
The syntax is similar to the one for ports.
The value
.Em unknown
matches packets of forwarded connections.
.Em unknown
can only be used with the operators
.Cm =
and
.Cm != .
Other constructs like
.Cm user \*(Ge unknown
are invalid.
Forwarded packets with unknown user and group ID match only rules
that explicitly compare against
.Em unknown
with the operators
.Cm =
or
.Cm != .
For instance
.Cm user \*(Ge 0
does not match forwarded packets.
The following example allows only selected users to open outgoing
connections:
.Bd -literal -offset indent
block out proto { tcp, udp } all
pass out proto { tcp, udp } all user { \*(Lt 1000, dhartmei }
.Ed
.It Xo Ar flags Aq Ar a
.Pf / Ns Aq Ar b
.No \*(Ba / Ns Aq Ar b
.No \*(Ba any
.Xc
This rule only applies to TCP packets that have the flags
.Aq Ar a
set out of set
.Aq Ar b .
Flags not specified in
.Aq Ar b
are ignored.
For stateful connections, the default is
.Ar flags S/SA .
To indicate that flags should not be checked at all, specify
.Ar flags any .
The flags are: (F)IN, (S)YN, (R)ST, (P)USH, (A)CK, (U)RG, (E)CE, and C(W)R.
.Bl -tag -width Fl
.It Ar flags S/S
Flag SYN is set.
The other flags are ignored.
.It Ar flags S/SA
This is the default setting for stateful connections.
Out of SYN and ACK, SYN must be set and ACK must be unset.
SYN, SYN+PSH and SYN+RST match, but SYN+ACK, ACK and ACK+RST do not.
This is more restrictive than the previous example.
.It Ar flags /SFRA
If the first set is not specified, it defaults to none.
All of SYN, FIN, RST and ACK must be unset.
.El
.Pp
Because
.Ar flags S/SA
is applied by default (unless
.Ar no state
is specified), only the initial SYN packet of a TCP handshake will create
a state for a TCP connection.
It is possible to be less restrictive, and allow state creation from
intermediate
.Pq non-SYN
packets, by specifying
.Ar flags any .
This will cause
.Xr pf 4
to synchronize to existing connections, for instance
if one flushes the state table.
However, states created from such intermediate packets may be missing
connection details such as the TCP window scaling factor.
States which modify the packet flow, such as those affected by
.Ar nat , binat No or Ar rdr
rules,
.Ar modulate No or Ar synproxy state
options, or scrubbed with
.Ar reassemble tcp
will also not be recoverable from intermediate packets.
Such connections will stall and time out.
.It Xo Ar icmp-type Aq Ar type
.Ar code Aq Ar code
.Xc
.It Xo Ar icmp6-type Aq Ar type
.Ar code Aq Ar code
.Xc
This rule only applies to ICMP or ICMPv6 packets with the specified type
and code.
Text names for ICMP types and codes are listed in
.Xr icmp 4
and
.Xr icmp6 4 .
This parameter is only valid for rules that cover protocols ICMP or
ICMP6.
The protocol and the ICMP type indicator
.Po
.Ar icmp-type
or
.Ar icmp6-type
.Pc
must match.
.It Xo Ar tos Aq Ar string
.No \*(Ba Aq Ar number
.Xc
This rule applies to packets with the specified
.Em TOS
bits set.
.Em TOS
may be
given as one of
.Ar lowdelay ,
.Ar throughput ,
.Ar reliability ,
or as either hex or decimal.
.Pp
For example, the following rules are identical:
.Bd -literal -offset indent
pass all tos lowdelay
pass all tos 0x10
pass all tos 16
.Ed
.It Ar allow-opts
By default, IPv4 packets with IP options or IPv6 packets with routing
extension headers are blocked.
When
.Ar allow-opts
is specified for a
.Ar pass
rule, packets that pass the filter based on that rule (last matching)
do so even if they contain IP options or routing extension headers.
For packets that match state, the rule that initially created the
state is used.
The implicit
.Ar pass
rule that is used when a packet does not match any rules does not
allow IP options.
.It Ar label Aq Ar string
Adds a label (name) to the rule, which can be used to identify the rule.
For instance,
pfctl -s labels
shows per-rule statistics for rules that have labels.
.Pp
The following macros can be used in labels:
.Pp
.Bl -tag -width $srcaddr -compact -offset indent
.It Ar $if
The interface.
.It Ar $srcaddr
The source IP address.
.It Ar $dstaddr
The destination IP address.
.It Ar $srcport
The source port specification.
.It Ar $dstport
The destination port specification.
.It Ar $proto
The protocol name.
.It Ar $nr
The rule number.
.El
.Pp
For example:
.Bd -literal -offset indent
ips = \&"{ 1.2.3.4, 1.2.3.5 }\&"
pass in proto tcp from any to $ips \e
      port \*(Gt 1023 label \&"$dstaddr:$dstport\&"
.Ed
.Pp
expands to
.Bd -literal -offset indent
pass in inet proto tcp from any to 1.2.3.4 \e
      port \*(Gt 1023 label \&"1.2.3.4:\*(Gt1023\&"
pass in inet proto tcp from any to 1.2.3.5 \e
      port \*(Gt 1023 label \&"1.2.3.5:\*(Gt1023\&"
.Ed
.Pp
The macro expansion for the
.Ar label
directive occurs only at configuration file parse time, not during runtime.
.It Xo Ar queue Aq Ar queue
.No \*(Ba ( Aq Ar queue ,
.Aq Ar queue )
.Xc
Packets matching this rule will be assigned to the specified queue.
If two queues are given, packets which have a
.Em TOS
of
.Em lowdelay
and TCP ACKs with no data payload will be assigned to the second one.
See
.Sx QUEUEING
for setup details.
.Pp
For example:
.Bd -literal -offset indent
pass in proto tcp to port 25 queue mail
pass in proto tcp to port 22 queue(ssh_bulk, ssh_prio)
.Ed
.It Ar tag Aq Ar string
Packets matching this rule will be tagged with the
specified string.
The tag acts as an internal marker that can be used to
identify these packets later on.
This can be used, for example, to provide trust between
interfaces and to determine if packets have been
processed by translation rules.
Tags are
.Qq sticky ,
meaning that the packet will be tagged even if the rule
is not the last matching rule.
Further matching rules can replace the tag with a
new one but will not remove a previously applied tag.
A packet is only ever assigned one tag at a time.
Packet tagging can be done during
.Ar nat ,
.Ar rdr ,
or
.Ar binat
rules in addition to filter rules.
Tags take the same macros as labels (see above).
.It Ar tagged Aq Ar string
Used with filter, translation or scrub rules
to specify that packets must already
be tagged with the given tag in order to match the rule.
Inverse tag matching can also be done
by specifying the
.Cm !\&
operator before the
.Ar tagged
keyword.
.It Ar rtable Aq Ar number
Used to select an alternate routing table for the routing lookup.
Only effective before the route lookup has happened, i.e. when filtering inbound.
.It Xo Ar divert-to Aq Ar host
.Ar port Aq Ar port
.Xc
Used to redirect packets to a local socket bound to
.Ar host
and
.Ar port .
The packets will not be modified, so
.Xr getsockname 2
on the socket will return the original destination address of the packet.
.It Ar divert-reply
Used to receive replies for sockets that are bound to addresses
which are not local to the machine.
See
.Xr setsockopt 2
for information on how to bind these sockets.
.It Ar probability Aq Ar number
A probability attribute can be attached to a rule, with a value set between
0 and 1, bounds not included.
In that case, the rule will be honoured using the given probability value
only.
For example, the following rule will drop 20% of incoming ICMP packets:
.Bd -literal -offset indent
block in proto icmp probability 20%
.Ed
.El
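.Pp
As an illustration of the
.Ar tag
and
.Ar tagged
parameters described above, the following sketch marks packets arriving
on an internal interface and later lets only marked packets leave.
The tag name INTNET and the interface macros are assumptions:
.Bd -literal -offset indent
pass in on $int_if from any to any tag INTNET
pass out on $ext_if from any to any tagged INTNET
block out on $ext_if from any to any ! tagged INTNET
.Ed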
.Sh ROUTING
If a packet matches a rule with a route option set, the packet filter will
route the packet according to the type of route option.
When such a rule creates state, the route option is also applied to all
packets matching the same connection.
.Bl -tag -width xxxx
.It Ar fastroute
The
.Ar fastroute
option does a normal route lookup to find the next hop for the packet.
.It Ar route-to
The
.Ar route-to
option routes the packet to the specified interface with an optional address
for the next hop.
When a
.Ar route-to
rule creates state, only packets that pass in the same direction as the
filter rule specifies will be routed in this way.
Packets passing in the opposite direction (replies) are not affected
and are routed normally.
.It Ar reply-to
The
.Ar reply-to
option is similar to
.Ar route-to ,
but routes packets that pass in the opposite direction (replies) to the
specified interface.
Opposite direction is only defined in the context of a state entry, and
.Ar reply-to
is useful only in rules that create state.
It can be used on systems with multiple external connections to
route all outgoing packets of a connection through the interface
the incoming connection arrived through (symmetric routing enforcement).
.It Ar dup-to
The
.Ar dup-to
option creates a duplicate of the packet and routes it like
.Ar route-to .
The original packet gets routed as it normally would.
.El
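.Pp
The route options might be written as in this sketch.
All macros (interfaces, gateways and hosts) are hypothetical placeholders:
.Bd -literal -offset indent
# send traffic from a second LAN out through a second uplink
pass in on $int_if route-to ($ext_if2 $ext_gw2) from $lan2 to any
# answer connections that arrived on $ext_if2 through the same link
pass in on $ext_if2 reply-to ($ext_if2 $ext_gw2) \e
      proto tcp from any to any port www
# copy web traffic to a monitoring host
pass in on $ext_if dup-to ($mon_if $mon_host) \e
      proto tcp from any to any port www
.Ed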
.Sh POOL OPTIONS
For
.Ar nat
and
.Ar rdr
rules (as well as for the
.Ar route-to ,
.Ar reply-to
and
.Ar dup-to
rule options) for which there is a single redirection address which has a
subnet mask smaller than 32 for IPv4 or 128 for IPv6 (more than one IP
address), a variety of different methods for assigning this address can be
used:
.Bl -tag -width xxxx
.It Ar bitmask
The
.Ar bitmask
option applies the network portion of the redirection address to the address
to be modified (source with
.Ar nat ,
destination with
.Ar rdr ) .
.It Ar random
The
.Ar random
option selects an address at random within the defined block of addresses.
.It Ar source-hash
The
.Ar source-hash
option uses a hash of the source address to determine the redirection address,
ensuring that the redirection address is always the same for a given source.
An optional key can be specified after this keyword either in hex or as a
string; by default
.Xr pfctl 8
randomly generates a key for source-hash every time the
ruleset is reloaded.
.It Ar round-robin
The
.Ar round-robin
option loops through the redirection address(es).
.Pp
When more than one redirection address is specified,
.Ar round-robin
is the only permitted pool type.
.It Ar static-port
With
.Ar nat
rules, the
.Ar static-port
option prevents
.Xr pf 4
from modifying the source port on TCP and UDP packets.
.El
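.Pp
A few hedged examples of pool options follow; the addresses and the
.Ar $ext_if
and
.Ar $int_net
macros are placeholders.
Each rule is an independent illustration and the rules are not meant to be
loaded together:
.Bd -literal -offset indent
# rotate over three web servers
rdr on $ext_if proto tcp from any to any port 80 \e
      -\*(Gt { 10.1.2.155, 10.1.2.160, 10.1.2.161 } round-robin
# pick an address from the block based on a hash of the source
nat on $ext_if inet from any to any -\*(Gt 192.0.2.16/28 source-hash
# keep the original source port
nat on $ext_if inet proto udp from $int_net to any \e
      -\*(Gt ($ext_if) static-port
.Ed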
.Pp
Additionally, the
.Ar sticky-address
option can be specified to help ensure that multiple connections from the
same source are mapped to the same redirection address.
This option can be used with the
.Ar random
and
.Ar round-robin
pool options.
Note that by default these associations are destroyed as soon as there are
no longer any states which refer to them; to make the mappings last
beyond the lifetime of the states, increase the global source tracking
timeout with
.Ar set timeout src.track .
See
.Sx STATEFUL TRACKING OPTIONS
for more ways to control the source tracking.
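.Pp
For example, the following sketch keeps each client on the same web server
for five minutes after its last state expires.
The addresses, the
.Ar $ext_if
macro and the 300 second value are illustrative assumptions:
.Bd -literal -offset indent
set timeout src.track 300
rdr on $ext_if proto tcp from any to any port 80 \e
      -\*(Gt { 10.1.2.155, 10.1.2.160 } round-robin sticky-address
.Ed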
.Sh STATE MODULATION
Much of the security derived from TCP is attributable to how well the
initial sequence numbers (ISNs) are chosen.
Some popular stack implementations choose
.Em very
poor ISNs and thus are normally susceptible to ISN prediction exploits.
By applying a
.Ar modulate state
rule to a TCP connection,
.Xr pf 4
will create a high quality random sequence number for each connection
endpoint.
.Pp
The
.Ar modulate state
directive implicitly keeps state on the rule and is
only applicable to TCP connections.
.Pp
For instance:
.Bd -literal -offset indent
block all
pass out proto tcp from any to any modulate state
pass in proto tcp from any to any port 25 flags S/SFRA modulate state
.Ed
.Pp
Note that modulated connections will not recover when the state table
is lost (firewall reboot, flushing the state table, etc.).
.Xr pf 4
will not be able to infer a connection again after the state table flushes
the connection's modulator.
When the state is lost, the connection may be left dangling until the
respective endpoints time out the connection.
It is possible on a fast local network for the endpoints to start an ACK
storm while trying to resynchronize after the loss of the modulator.
The default
.Ar flags
settings (or a more strict equivalent) should be used on
.Ar modulate state
rules to prevent ACK storms.
.Pp
Note that alternative methods are available
to prevent loss of the state table
and allow for firewall failover.
See
.Xr carp 4
and
.Xr pfsync 4
for further information.
.Sh SYN PROXY
By default,
.Xr pf 4
passes packets that are part of a
.Xr tcp 4
handshake between the endpoints.
The
.Ar synproxy state
option can be used to cause
.Xr pf 4
itself to complete the handshake with the active endpoint, perform a handshake
with the passive endpoint, and then forward packets between the endpoints.
.Pp
No packets are sent to the passive endpoint before the active endpoint has
completed the handshake, hence so-called SYN floods with spoofed source
addresses will not reach the passive endpoint, as the sender can't complete the
handshake.
.Pp
The proxy is transparent to both endpoints; they each see a single
connection from/to the other endpoint.
.Xr pf 4
chooses random initial sequence numbers for both handshakes.
Once the handshakes are completed, the sequence number modulators
(see previous section) are used to translate further packets of the
connection.
.Ar synproxy state
includes
.Ar modulate state .
.Pp
Rules with
.Ar synproxy
will not work if
.Xr pf 4
operates on a
.Xr if_bridge 4 .
.Pp
Example:
.Bd -literal -offset indent
pass in proto tcp from any to any port www synproxy state
.Ed
.Sh STATEFUL TRACKING OPTIONS
A number of options related to stateful tracking can be applied on a
per-rule basis.
.Ar keep state ,
.Ar modulate state
and
.Ar synproxy state
support these options, and
.Ar keep state
must be specified explicitly to apply options to a rule.
.Pp
.Bl -tag -width xxxx -compact
.It Ar max Aq Ar number
Limits the number of concurrent states the rule may create.
When this limit is reached, further packets that would create
state will not match this rule until existing states time out.
.It Ar no-sync
Prevent state changes for states created by this rule from appearing on the
.Xr pfsync 4
interface.
.It Xo Aq Ar timeout
.Aq Ar seconds
.Xc
Changes the timeout values used for states created by this rule.
For a list of all valid timeout names, see
.Sx OPTIONS
above.
.It Ar sloppy
Uses a sloppy TCP connection tracker that does not check sequence
numbers at all, which makes insertion and ICMP teardown attacks much
easier.
This is intended to be used in situations where one does not see all
packets of a connection, e.g. in asymmetric routing situations.
Cannot be used with modulate or synproxy state.
.It Ar pflow
States created by this rule are exported on the
.Xr pflow 4
interface.
.El
.Pp
Multiple options can be specified, separated by commas:
.Bd -literal -offset indent
pass in proto tcp from any to any \e
      port www keep state \e
      (max 100, source-track rule, max-src-nodes 75, \e
      max-src-states 3, tcp.established 60, tcp.closing 5)
.Ed
.Pp
When the
.Ar source-track
keyword is specified, the number of states per source IP is tracked.
.Pp
.Bl -tag -width xxxx -compact
.It Ar source-track rule
The maximum number of states created by this rule is limited by the rule's
.Ar max-src-nodes
and
.Ar max-src-states
options.
Only state entries created by this particular rule count toward the rule's
limits.
.It Ar source-track global
The number of states created by all rules that use this option is limited.
Each rule can specify different
.Ar max-src-nodes
and
.Ar max-src-states
options, however state entries created by any participating rule count towards
each individual rule's limits.
.El
.Pp
The following limits can be set:
.Pp
.Bl -tag -width xxxx -compact
.It Ar max-src-nodes Aq Ar number
Limits the maximum number of source addresses which can simultaneously
have state table entries.
.It Ar max-src-states Aq Ar number
Limits the maximum number of simultaneous state entries that a single
source address can create with this rule.
.El
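.Pp
A minimal sketch combining source tracking with both limits follows.
The macros
.Li $ext_if
and
.Li $webserver
and the numbers are placeholders chosen for illustration.
.Bd -literal -offset indent
pass in on $ext_if proto tcp from any to $webserver port www \e
    keep state (source-track rule, max-src-nodes 200, \e
    max-src-states 10)
.Ed
.Pp
Under these assumed values, at most 200 distinct source addresses may hold
states created by this rule at the same time, and each of them may hold at
most 10.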
.Pp
For stateful TCP connections, limits on established connections (connections
which have completed the TCP 3-way handshake) can also be enforced
per source IP.
.Pp
.Bl -tag -width xxxx -compact
.It Ar max-src-conn Aq Ar number
Limits the maximum number of simultaneous TCP connections which have
completed the 3-way handshake that a single host can make.
.It Xo Ar max-src-conn-rate Aq Ar number
.No / Aq Ar seconds
.Xc
Limit the rate of new connections over a time interval.
The connection rate is an approximation calculated as a moving average.
.El
.Pp
Because the 3-way handshake ensures that the source address is not being
spoofed, more aggressive action can be taken based on these limits.
With the
.Ar overload Aq Ar table
state option, source IP addresses which hit either of the limits on
established connections will be added to the named table.
This table can be used in the ruleset to block further activity from
the offending host, redirect it to a tarpit process, or restrict its
bandwidth.
.Pp
The optional
.Ar flush
keyword kills all states created by the matching rule which originate
from the host which exceeds these limits.
The
.Ar global
modifier to the flush command kills all states originating from the
offending host, regardless of which rule created the state.
.Pp
For example, the following rules will protect the webserver against
hosts making more than 100 connections in 10 seconds.
Any host which connects faster than this rate will have its address added
to the
.Aq bad_hosts
table and have all states originating from it flushed.
Any new packets arriving from this host will be dropped unconditionally
by the block rule.
.Bd -literal -offset indent
block quick from \*(Ltbad_hosts\*(Gt
pass in on $ext_if proto tcp to $webserver port www keep state \e
(max-src-conn-rate 100/10, overload \*(Ltbad_hosts\*(Gt flush global)
.Ed
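.Pp
Addresses added by
.Ar overload
stay in the table until they are removed.
A possible housekeeping sketch follows; the
.Ar persist
flag and the one-day retention are assumptions made for illustration.
The table declaration belongs in
.Nm pf.conf :
.Bd -literal -offset indent
table \*(Ltbad_hosts\*(Gt persist
.Ed
.Pp
The table can then be inspected and aged from the command line with
.Xr pfctl 8 :
.Bd -literal -offset indent
# pfctl -t bad_hosts -T show
# pfctl -t bad_hosts -T expire 86400
.Ed
.Pp
The second command deletes entries that were added, or last had their
statistics cleared, more than 86400 seconds ago.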
.Sh OPERATING SYSTEM FINGERPRINTING
Passive OS Fingerprinting is a mechanism to inspect nuances of a TCP
connection's initial SYN packet and guess at the host's operating system.
Unfortunately, these nuances are easily spoofed by an attacker, so the
fingerprint is not useful in making security decisions.
However, the fingerprint is typically accurate enough to base policy
decisions on.
.Pp
The fingerprints may be specified by operating system class, by
version, or by subtype/patchlevel.
The class of an operating system is typically the vendor or genre
and would be
.Ox
for the
.Xr pf 4
firewall itself.
The version of the oldest available
.Ox
release on the main FTP site
would be 2.6 and the fingerprint would be written
.Pp
.Dl \&"OpenBSD 2.6\&"
.Pp
The subtype of an operating system is typically used to describe the
patchlevel if that patch led to changes in the TCP stack behavior.
In the case of
.Ox ,
the only subtype is for a fingerprint that was
normalized by the
.Ar no-df
scrub option and would be specified as
.Pp
.Dl \&"OpenBSD 3.3 no-df\&"
.Pp
Fingerprints for most popular operating systems are provided by
.Xr pf.os 5 .
Once
.Xr pf 4
is running, a complete list of known operating system fingerprints may
be listed by running:
.Pp
.Dl # pfctl -so
.Pp
Filter rules can enforce policy at any level of operating system specification
assuming a fingerprint is present.
Policy could limit traffic to approved operating systems or even ban traffic
from hosts that aren't at the latest service pack.
.Pp
The
.Ar unknown
class can also be used as the fingerprint which will match packets for
which no operating system fingerprint is known.
.Pp
Examples:
.Bd -literal -offset indent
pass out proto tcp from any os OpenBSD
block out proto tcp from any os Doors
block out proto tcp from any os "Doors PT"
block out proto tcp from any os "Doors PT SP3"
block out from any os "unknown"
pass on lo0 proto tcp from any os "OpenBSD 3.3 lo0"
.Ed
.Pp
Operating system fingerprinting is limited to the TCP SYN packet.
This means that it will not work on other protocols and will not match
a currently established connection.
.Pp
Caveat: operating system fingerprints are occasionally wrong.
There are three problems: an attacker can trivially craft his packets to
appear as any operating system he chooses;
an operating system patch could change the stack behavior and no fingerprints
will match it until the database is updated;
and multiple operating systems may have the same fingerprint.
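.Pp
Given these caveats, a fingerprint is better suited to shaping policy than
to granting or denying trust.
As a sketch, the rule below does not block hosts fingerprinted as
Windows 95 or Windows 98 but restricts each of them to one simultaneous
SMTP state; the macro
.Li $ext_if
and the limit of 1 are assumptions for illustration.
.Bd -literal -offset indent
pass in on $ext_if proto tcp from any os { "Windows 95", "Windows 98" } \e
    to any port smtp \e
    keep state (source-track rule, max-src-states 1)
.Ed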
.Sh BLOCKING SPOOFED TRAFFIC
"Spoofing" is the faking of IP addresses, typically for malicious
purposes.
The
.Ar antispoof
directive expands to a set of filter rules which will block all
traffic with a source IP from the network(s) directly connected
to the specified interface(s) from entering the system through
any other interface.
.Pp
For example, the line
.Bd -literal -offset indent
antispoof for lo0
.Ed
.Pp
expands to
.Bd -literal -offset indent
block drop in on ! lo0 inet from 127.0.0.0/8 to any
block drop in on ! lo0 inet6 from ::1 to any
.Ed
.Pp
For non-loopback interfaces, there are additional rules to block incoming
packets with a source IP address identical to the interface's IP(s).
For example, assuming the interface wi0 had an IP address of 10.0.0.1 and a
netmask of 255.255.255.0,
the line
.Bd -literal -offset indent
antispoof for wi0 inet
.Ed
.Pp
expands to
.Bd -literal -offset indent
block drop in on ! wi0 inet from 10.0.0.0/24 to any
block drop in inet from 10.0.0.1 to any
.Ed
.Pp
Caveat: Rules created by the
.Ar antispoof
directive interfere with packets sent over loopback interfaces
to local addresses.
One should pass these explicitly.
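.Pp
A minimal sketch of one way to do so, assuming that no filtering at all is
wanted on the loopback interface, is to skip lo0 entirely:
.Bd -literal -offset indent
set skip on lo0
antispoof quick for wi0 inet
.Ed
.Pp
Alternatively, an explicit
.Ar pass quick
rule for lo0 placed before the
.Ar antispoof
line has a similar effect while keeping loopback traffic subject to the
rest of the ruleset.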
.Sh FRAGMENT HANDLING
The size of IP datagrams (packets) can be significantly larger than the
maximum transmission unit (MTU) of the network.
In cases when it is necessary or more efficient to send such large packets,
the large packet will be fragmented into many smaller packets that will each
fit onto the wire.
Unfortunately for a firewalling device, only the first logical fragment will
contain the necessary header information for the subprotocol that allows
.Xr pf 4
to filter on things such as TCP ports or to perform NAT.
.Pp
Besides the use of
.Ar scrub
rules as described in
.Sx TRAFFIC NORMALIZATION
above, there are three options for handling fragments in the packet filter.
.Pp
One alternative is to filter individual fragments with filter rules.
If no
.Ar scrub
rule applies to a fragment, it is passed to the filter.
Filter rules with matching IP header parameters decide whether the
fragment is passed or blocked, in the same way as complete packets
are filtered.
Without reassembly, fragments can only be filtered based on IP header
fields (source/destination address, protocol), since subprotocol header
fields are not available (TCP/UDP port numbers, ICMP code/type).
The
.Ar fragment
option can be used to restrict filter rules to apply only to
fragments, but not complete packets.
Filter rules without the
.Ar fragment
option still apply to fragments, if they only specify IP header fields.
For instance, the rule
.Bd -literal -offset indent
pass in proto tcp from any to any port 80
.Ed
.Pp
never applies to a fragment, even if the fragment is part of a TCP
packet with destination port 80, because without reassembly this information
is not available for each fragment.
This also means that fragments cannot create new or match existing
state table entries, which makes stateful filtering and address
translation (NAT, redirection) for fragments impossible.
.Pp
It's also possible to reassemble only certain fragments by specifying
source or destination addresses or protocols as parameters in
.Ar scrub
rules.
.Pp
In most cases, the benefits of reassembly outweigh the additional
memory cost, and it's recommended to use
.Ar scrub
rules to reassemble
all fragments via the
.Ar fragment reassemble
modifier.
.Pp
The memory allocated for fragment caching can be limited using
.Xr pfctl 8 .
Once this limit is reached, fragments that would have to be cached
are dropped until other entries time out.
The timeout value can also be adjusted.
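.Pp
In
.Nm pf.conf ,
these settings might look like the sketch below.
The values 10000 and 20 are arbitrary placeholders rather than tuning
advice; the options appear before the
.Ar scrub
rule to respect statement order.
.Bd -literal -offset indent
# cap the number of entries in the fragment reassembly pool
set limit frags 10000
# seconds before an unassembled fragment expires
set timeout frag 20
# reassemble all incoming fragments before filtering
scrub in all fragment reassemble
.Ed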
.Pp
Currently, only IPv4 fragments are supported and IPv6 fragments
are blocked unconditionally.
.Sh ANCHORS
Besides the main ruleset,
.Xr pfctl 8
can load rulesets into
.Ar anchor
attachment points.
An
.Ar anchor
is a container that can hold rules, address tables, and other anchors.
.Pp
An
.Ar anchor
has a name which specifies the path where
.Xr pfctl 8
can be used to access the anchor to perform operations on it, such as
attaching child anchors to it or loading rules into it.
Anchors may be nested, with components separated by
.Sq /
characters, similar to how file system hierarchies are laid out.
The main ruleset is actually the default anchor, so filter and
translation rules, for example, may also be contained in any anchor.
.Pp
An anchor can reference another
.Ar anchor
attachment point
using the following kinds
of rules:
.Bl -tag -width xxxx
.It Ar nat-anchor Aq Ar name
Evaluates the
.Ar nat
rules in the specified
.Ar anchor .
.It Ar rdr-anchor Aq Ar name
Evaluates the
.Ar rdr
rules in the specified
.Ar anchor .
.It Ar binat-anchor Aq Ar name
Evaluates the
.Ar binat
rules in the specified
.Ar anchor .
.It Ar anchor Aq Ar name
Evaluates the filter rules in the specified
.Ar anchor .
.It Xo Ar load anchor
.Aq Ar name
.Ar from Aq Ar file
.Xc
Loads the rules from the specified file into the
anchor
.Ar name .
.El
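.Pp
As an illustration of the translation and filter anchor rules, a ruleset
that delegates rules to a helper daemon might contain the lines below.
The anchor name
.Li ftp-proxy
follows the convention described in
.Xr ftp-proxy 8 ;
treat the exact name as an assumption and check that manual page.
The first two lines belong with the translation rules and the third with
the filter rules.
.Bd -literal -offset indent
nat-anchor "ftp-proxy/*"
rdr-anchor "ftp-proxy/*"
anchor "ftp-proxy/*"
.Ed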
.Pp
When evaluation of the main ruleset reaches an
.Ar anchor
rule,
.Xr pf 4
will proceed to evaluate all rules specified in that anchor.
.Pp
Matching filter and translation rules marked with the
.Ar quick
option are final and abort the evaluation of the rules in other
anchors and the main ruleset.
If the
.Ar anchor
itself is marked with the
.Ar quick
option,
ruleset evaluation will terminate when the anchor is exited if the packet is
matched by any rule within the anchor.
.Pp
.Ar anchor
rules are evaluated relative to the anchor in which they are contained.
For example, all
.Ar anchor
rules specified in the main ruleset will reference anchor
attachment points underneath the main ruleset, and
.Ar anchor
rules specified in a file loaded from a
.Ar load anchor
rule will be attached under that anchor point.
.Pp
Rules may be contained in
.Ar anchor
attachment points which do not contain any rules when the main ruleset
is loaded, and later such anchors can be manipulated through
.Xr pfctl 8
without reloading the main ruleset or other anchors.
For example,
.Bd -literal -offset indent
ext_if = \&"kue0\&"
block on $ext_if all
anchor spam
pass out on $ext_if all
pass in on $ext_if proto tcp from any \e
to $ext_if port smtp
.Ed
.Pp
blocks all packets on the external interface by default, then evaluates
all rules in the
.Ar anchor
named "spam", and finally passes all outgoing connections and
incoming connections to port 25.
.Bd -literal -offset indent
# echo \&"block in quick from 1.2.3.4 to any\&" \&| \e
pfctl -a spam -f -
.Ed
.Pp
This loads a single rule into the
.Ar anchor ,
which blocks all packets from a specific address.
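.Pp
The contents of the anchor can be checked, and later discarded, without
touching the main ruleset.
A brief sketch using standard
.Xr pfctl 8
options:
.Bd -literal -offset indent
# pfctl -a spam -s rules
# pfctl -a spam -F rules
.Ed
.Pp
The first command lists the filter rules currently loaded in the anchor;
the second flushes them.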
.Pp
The anchor can also be populated by adding a
.Ar load anchor
rule after the
.Ar anchor
rule:
.Bd -literal -offset indent
anchor spam
load anchor spam from "/etc/pf-spam.conf"
.Ed
.Pp
When
.Xr pfctl 8
loads
.Nm pf.conf ,
it will also load all the rules from the file
.Pa /etc/pf-spam.conf
into the anchor.
.Pp
Optionally,
.Ar anchor
rules can specify packet filtering parameters using the same syntax as
filter rules.
When parameters are used, the
.Ar anchor
rule is only evaluated for matching packets.
This allows conditional evaluation of anchors, like:
.Bd -literal -offset indent
block on $ext_if all
anchor spam proto tcp from any to any port smtp
pass out on $ext_if all
pass in on $ext_if proto tcp from any to $ext_if port smtp
.Ed
.Pp
The rules inside
.Ar anchor
spam are only evaluated for
.Ar tcp
packets with destination port 25.
Hence,
.Bd -literal -offset indent
# echo \&"block in quick from 1.2.3.4 to any\&" \&| \e
pfctl -a spam -f -
.Ed
.Pp
will only block connections from 1.2.3.4 to port 25.
.Pp
Anchors may end with the asterisk
.Pq Sq *
character, which signifies that all anchors attached at that point
should be evaluated in the alphabetical ordering of their anchor name.
For example,
.Bd -literal -offset indent
anchor "spam/*"
.Ed
.Pp
will evaluate each rule in each anchor attached to the
.Li spam
anchor.
Note that it will only evaluate anchors that are directly attached to the
.Li spam
anchor, and will not descend to evaluate anchors recursively.
.Pp
Since anchors are evaluated relative to the anchor in which they are
contained, there is a mechanism for accessing the parent and ancestor
anchors of a given anchor.
Similar to file system path name resolution, if the sequence
.Dq ..
appears as an anchor path component, the parent anchor of the current
anchor in the path evaluation at that point will become the new current
anchor.
As an example, consider the following:
.Bd -literal -offset indent
# echo ' anchor "spam/allowed" ' | pfctl -f -
# echo -e ' anchor "../banned" \en pass' | \e
pfctl -a spam/allowed -f -
.Ed
.Pp
Evaluation of the main ruleset will lead into the
.Li spam/allowed
anchor, which will evaluate the rules in the
.Li spam/banned
anchor, if any, before finally evaluating the
.Ar pass
rule.
.Pp
Filter rule
.Ar anchors
can also be loaded inline in the ruleset within a brace-delimited ('{' '}')
block.
Brace-delimited blocks may contain rules or other brace-delimited blocks.
When anchors are loaded this way the anchor name becomes optional.
.Bd -literal -offset indent
anchor "external" on egress {
    block
    anchor out {
        pass proto tcp from any to port { 25, 80, 443 }
    }
    pass in proto tcp to any port 22
}
.Ed
.Pp
Since the parser specification for anchor names is a string, any
reference to an anchor name containing
.Sq /
characters will require double quote
.Pq Sq \&"
characters around the anchor name.
.Sh TRANSLATION EXAMPLES
This example maps incoming requests on port 80 to port 8080, on
which a daemon is running (because, for example, it is not run as root,
and therefore lacks permission to bind to port 80).
.Bd -literal
# use a macro for the interface name, so it can be changed easily
ext_if = \&"ne3\&"
# map daemon on 8080 to appear to be on 80
rdr on $ext_if proto tcp from any to any port 80 -\*(Gt 127.0.0.1 port 8080
.Ed
.Pp
If the
.Ar pass
modifier is given, packets matching the translation rule are passed without
inspecting the filter rules:
.Bd -literal
rdr pass on $ext_if proto tcp from any to any port 80 -\*(Gt 127.0.0.1 \e
port 8080
.Ed
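.Pp
Without the
.Ar pass
modifier, and in a ruleset that blocks by default, a filter rule still has
to admit the redirected packets.
Because translation occurs before filtering, that rule has to match the
translated destination.
A sketch, reusing the macro and addresses from the examples above:
.Bd -literal
rdr on $ext_if proto tcp from any to any port 80 -\*(Gt 127.0.0.1 port 8080
pass in on $ext_if proto tcp from any to 127.0.0.1 port 8080
.Ed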
.Pp
In the example below, vlan12 is configured as 192.168.168.1;
the machine translates all packets coming from 192.168.168.0/24 to 204.92.77.111
when they are going out any interface except vlan12.
This has the net effect of making traffic from the 192.168.168.0/24
network appear as though it is the Internet routable address
204.92.77.111 to nodes behind any interface on the router except
for the nodes on vlan12.
(Thus, 192.168.168.1 can talk to the 192.168.168.0/24 nodes.)
.Bd -literal
nat on ! vlan12 from 192.168.168.0/24 to any -\*(Gt 204.92.77.111
.Ed
.Pp
In the example below, the machine sits between a fake internal 144.19.74.*
network, and a routable external IP of 204.92.77.100.
The
.Ar no nat
rule excludes protocol AH from being translated.
.Bd -literal
# NO NAT
no nat on $ext_if proto ah from 144.19.74.0/24 to any
nat on $ext_if from 144.19.74.0/24 to any -\*(Gt 204.92.77.100
.Ed
.Pp
In the example below, packets bound for one specific server, as well as those
generated by the sysadmins are not proxied; all other connections are.
.Bd -literal
# NO RDR
no rdr on $int_if proto { tcp, udp } from any to $server port 80
no rdr on $int_if proto { tcp, udp } from $sysadmins to any port 80
rdr on $int_if proto { tcp, udp } from any to any port 80 -\*(Gt 127.0.0.1 \e
port 80
.Ed
.Pp
This longer example uses both a NAT and a redirection.
The external interface has the address 157.161.48.183.
On localhost, we are running
.Xr ftp-proxy 8 ,
waiting for FTP sessions to be redirected to it.
The three mandatory anchors for
.Xr ftp-proxy 8
are omitted from this example; see the
.Xr ftp-proxy 8
manpage.
.Bd -literal
# NAT
# Translate outgoing packets' source addresses (any protocol).
# In this case, any address but the gateway's external address is mapped.
nat on $ext_if inet from ! ($ext_if) to any -\*(Gt ($ext_if)
# NAT PROXYING
# Map outgoing packets' source port to an assigned proxy port instead of
# an arbitrary port.
# In this case, proxy outgoing isakmp with port 500 on the gateway.
nat on $ext_if inet proto udp from any port = isakmp to any -\*(Gt ($ext_if) \e
port 500
# BINAT
# Translate outgoing packets' source address (any protocol).
# Translate incoming packets' destination address to an internal machine
# (bidirectional).
binat on $ext_if from 10.1.2.150 to any -\*(Gt $ext_if
# RDR
# Translate incoming packets' destination addresses.
# As an example, redirect a TCP and UDP port to an internal machine.
rdr on $ext_if inet proto tcp from any to ($ext_if) port 8080 \e
-\*(Gt 10.1.2.151 port 22
rdr on $ext_if inet proto udp from any to ($ext_if) port 8080 \e
-\*(Gt 10.1.2.151 port 53
# RDR
# Translate outgoing ftp control connections to send them to localhost
# for proxying with ftp-proxy(8) running on port 8021.
rdr on $int_if proto tcp from any to any port 21 -\*(Gt 127.0.0.1 port 8021
.Ed
.Pp
In this example, a NAT gateway is set up to translate internal addresses
using a pool of public addresses (192.0.2.16/28) and to redirect
incoming web server connections to a group of web servers on the internal
network.
.Bd -literal
# NAT LOAD BALANCE
# Translate outgoing packets' source addresses using an address pool.
# A given source address is always translated to the same pool address by
# using the source-hash keyword.
nat on $ext_if inet from any to any -\*(Gt 192.0.2.16/28 source-hash
# RDR ROUND ROBIN
# Translate incoming web server connections to a group of web servers on
# the internal network.
rdr on $ext_if proto tcp from any to any port 80 \e
-\*(Gt { 10.1.2.155, 10.1.2.160, 10.1.2.161 } round-robin
.Ed
.Sh FILTER EXAMPLES
.Bd -literal
# The external interface is kue0
# (157.161.48.183, the only routable address)
# and the private network is 10.0.0.0/8, for which we are doing NAT.
# use a macro for the interface name, so it can be changed easily
ext_if = \&"kue0\&"
# normalize all incoming traffic
scrub in on $ext_if all fragment reassemble
# block and log everything by default
block return log on $ext_if all
# block anything coming from a source we have no route back to
block in from no-route to any
# block packets whose ingress interface does not match the one in
# the route back to their source address
block in from urpf-failed to any
# block and log outgoing packets that do not have our address as source,
# they are either spoofed or something is misconfigured (NAT disabled,
# for instance), we want to be nice and not send out garbage.
block out log quick on $ext_if from ! 157.161.48.183 to any
# silently drop broadcasts (cable modem noise)
block in quick on $ext_if from any to 255.255.255.255
# block and log incoming packets from reserved address space and invalid
# addresses, they are either spoofed or misconfigured, we cannot reply to
# them anyway (hence, no return-rst).
block in log quick on $ext_if from { 10.0.0.0/8, 172.16.0.0/12, \e
192.168.0.0/16, 255.255.255.255/32 } to any
# ICMP
# pass out/in certain ICMP queries and keep state (ping)
# state matching is done on host addresses and ICMP id (not type/code),
# so replies (like 0/0 for 8/0) will match queries
# ICMP error messages (which always refer to a TCP/UDP packet) are
# handled by the TCP/UDP states
pass on $ext_if inet proto icmp all icmp-type 8 code 0
# UDP
# pass out all UDP connections and keep state
pass out on $ext_if proto udp all
# pass in certain UDP connections and keep state (DNS)
pass in on $ext_if proto udp from any to any port domain
# TCP
# pass out all TCP connections and modulate state
pass out on $ext_if proto tcp all modulate state
# pass in certain TCP connections and keep state (SSH, SMTP, DNS, IDENT)
pass in on $ext_if proto tcp from any to any port { ssh, smtp, domain, \e
auth }
# Do not allow Windows 9x SMTP connections since they typically come from
# a viral worm. Alternatively, we could limit these OSes to 1 connection each.
block in on $ext_if proto tcp from any os {"Windows 95", "Windows 98"} \e
to any port smtp
# IPv6
# pass in/out all IPv6 traffic: note that we have to enable this in two
# different ways, on both our physical interface and our tunnel
pass quick on gif0 inet6
pass quick on $ext_if proto ipv6
# Packet Tagging
# three interfaces: $int_if, $ext_if, and $wifi_if (wireless). NAT is
# being done on $ext_if for all outgoing packets. tag packets in on
# $int_if and pass those tagged packets out on $ext_if. all other
# outgoing packets (i.e., packets from the wireless network) are only
# permitted to access port 80.
pass in on $int_if from any to any tag INTNET
pass in on $wifi_if from any to any
block out on $ext_if from any to any
pass out quick on $ext_if tagged INTNET
pass out on $ext_if proto tcp from any to any port 80
# tag incoming packets as they are redirected to spamd(8). use the tag
# to pass those packets through the packet filter.
rdr on $ext_if inet proto tcp from \*(Ltspammers\*(Gt to port smtp \e
tag SPAMD -\*(Gt 127.0.0.1 port spamd
block in on $ext_if
pass in on $ext_if inet proto tcp tagged SPAMD
.Ed
.Sh GRAMMAR
Syntax for
.Nm
in BNF:
.Bd -literal
line = ( option | pf-rule | nat-rule | binat-rule | rdr-rule |
antispoof-rule | altq-rule | queue-rule | trans-anchors |
anchor-rule | anchor-close | load-anchor | table-rule |
include )
option = "set" ( [ "timeout" ( timeout | "{" timeout-list "}" ) ] |
[ "ruleset-optimization" [ "none" | "basic" | "profile" ]] |
[ "optimization" [ "default" | "normal" |
"high-latency" | "satellite" |
"aggressive" | "conservative" ] ] |
[ "limit" ( limit-item | "{" limit-list "}" ) ] |
[ "loginterface" ( interface-name | "none" ) ] |
[ "block-policy" ( "drop" | "return" ) ] |
[ "state-policy" ( "if-bound" | "floating" ) ] |
[ "state-defaults" state-opts ] |
[ "require-order" ( "yes" | "no" ) ] |
[ "fingerprints" filename ] |
[ "skip on" ifspec ] |
[ "debug" ( "none" | "urgent" | "misc" | "loud" ) ] )
pf-rule = action [ ( "in" | "out" ) ]
[ "log" [ "(" logopts ")"] ] [ "quick" ]
[ "on" ifspec ] [ "fastroute" | route ] [ af ] [ protospec ]
hosts [ filteropt-list ]
logopts = logopt [ "," logopts ]
logopt = "all" | "user" | "to" interface-name
filteropt-list = filteropt-list filteropt | filteropt
filteropt = user | group | flags | icmp-type | icmp6-type | "tos" tos |
( "no" | "keep" | "modulate" | "synproxy" ) "state"
[ "(" state-opts ")" ] |
"fragment" | "no-df" | "min-ttl" number | "set-tos" tos |
"max-mss" number | "random-id" | "reassemble tcp" |
fragmentation | "allow-opts" |
"label" string | "tag" string | [ ! ] "tagged" string |
"queue" ( string | "(" string [ [ "," ] string ] ")" ) |
"rtable" number | "probability" number"%"
nat-rule = [ "no" ] "nat" [ "pass" [ "log" [ "(" logopts ")" ] ] ]
[ "on" ifspec ] [ af ]
[ protospec ] hosts [ "tag" string ] [ "tagged" string ]
[ "-\*(Gt" ( redirhost | "{" redirhost-list "}" )
[ portspec ] [ pooltype ] [ "static-port" ] ]
binat-rule = [ "no" ] "binat" [ "pass" [ "log" [ "(" logopts ")" ] ] ]
[ "on" interface-name ] [ af ]
[ "proto" ( proto-name | proto-number ) ]
"from" address [ "/" mask-bits ] "to" ipspec
[ "tag" string ] [ "tagged" string ]
[ "-\*(Gt" address [ "/" mask-bits ] ]
rdr-rule = [ "no" ] "rdr" [ "pass" [ "log" [ "(" logopts ")" ] ] ]
[ "on" ifspec ] [ af ]
[ protospec ] hosts [ "tag" string ] [ "tagged" string ]
[ "-\*(Gt" ( redirhost | "{" redirhost-list "}" )
[ portspec ] [ pooltype ] ]
antispoof-rule = "antispoof" [ "log" ] [ "quick" ]
"for" ifspec [ af ] [ "label" string ]
table-rule = "table" "\*(Lt" string "\*(Gt" [ tableopts-list ]
tableopts-list = tableopts-list tableopts | tableopts
tableopts = "persist" | "const" | "counters" | "file" string |
"{" [ tableaddr-list ] "}"
tableaddr-list = tableaddr-list [ "," ] tableaddr-spec | tableaddr-spec
tableaddr-spec = [ "!" ] tableaddr [ "/" mask-bits ]
tableaddr = hostname | ifspec | "self" |
ipv4-dotted-quad | ipv6-coloned-hex
altq-rule = "altq on" interface-name queueopts-list
"queue" subqueue
queue-rule = "queue" string [ "on" interface-name ] queueopts-list
subqueue
anchor-rule = "anchor" [ string ] [ ( "in" | "out" ) ] [ "on" ifspec ]
[ af ] [ protospec ] [ hosts ] [ filteropt-list ] [ "{" ]
anchor-close = "}"
trans-anchors = ( "nat-anchor" | "rdr-anchor" | "binat-anchor" ) string
[ "on" ifspec ] [ af ] [ protospec ] [ hosts ]
load-anchor = "load anchor" string "from" filename
queueopts-list = queueopts-list queueopts | queueopts
queueopts = [ "bandwidth" bandwidth-spec ] |
[ "qlimit" number ] | [ "tbrsize" number ] |
[ "priority" number ] | [ schedulers ]
schedulers = ( cbq-def | priq-def | hfsc-def )
bandwidth-spec = number ( "b" | "Kb" | "Mb" | "Gb" | "%" )
action = "pass" | "block" [ return ] | [ "no" ] "scrub"
return = "drop" | "return" | "return-rst" [ "( ttl" number ")" ] |
"return-icmp" [ "(" icmpcode [ [ "," ] icmp6code ] ")" ] |
"return-icmp6" [ "(" icmp6code ")" ]
icmpcode = ( icmp-code-name | icmp-code-number )
icmp6code = ( icmp6-code-name | icmp6-code-number )
ifspec = ( [ "!" ] ( interface-name | interface-group ) ) |
"{" interface-list "}"
interface-list = [ "!" ] ( interface-name | interface-group )
[ [ "," ] interface-list ]
route = ( "route-to" | "reply-to" | "dup-to" )
( routehost | "{" routehost-list "}" )
[ pooltype ]
af = "inet" | "inet6"
protospec = "proto" ( proto-name | proto-number |
"{" proto-list "}" )
proto-list = ( proto-name | proto-number ) [ [ "," ] proto-list ]
hosts = "all" |
"from" ( "any" | "no-route" | "urpf-failed" | "self" | host |
"{" host-list "}" | "route" string ) [ port ] [ os ]
"to" ( "any" | "no-route" | "self" | host |
"{" host-list "}" | "route" string ) [ port ]
ipspec = "any" | host | "{" host-list "}"
host = [ "!" ] ( address [ "/" mask-bits ] | "\*(Lt" string "\*(Gt" )
redirhost = address [ "/" mask-bits ]
routehost = "(" interface-name [ address [ "/" mask-bits ] ] ")"
address = ( interface-name | interface-group |
"(" ( interface-name | interface-group ) ")" |
hostname | ipv4-dotted-quad | ipv6-coloned-hex )
host-list = host [ [ "," ] host-list ]
redirhost-list = redirhost [ [ "," ] redirhost-list ]
routehost-list = routehost [ [ "," ] routehost-list ]
port = "port" ( unary-op | binary-op | "{" op-list "}" )
portspec = "port" ( number | name ) [ ":" ( "*" | number | name ) ]
os = "os" ( os-name | "{" os-list "}" )
user = "user" ( unary-op | binary-op | "{" op-list "}" )
group = "group" ( unary-op | binary-op | "{" op-list "}" )
unary-op = [ "=" | "!=" | "\*(Lt" | "\*(Le" | "\*(Gt" | "\*(Ge" ]
( name | number )
binary-op = number ( "\*(Lt\*(Gt" | "\*(Gt\*(Lt" | ":" ) number
op-list = ( unary-op | binary-op ) [ [ "," ] op-list ]
os-name = operating-system-name
os-list = os-name [ [ "," ] os-list ]
flags = "flags" ( [ flag-set ] "/" flag-set | "any" )
flag-set = [ "F" ] [ "S" ] [ "R" ] [ "P" ] [ "A" ] [ "U" ] [ "E" ]
[ "W" ]
icmp-type = "icmp-type" ( icmp-type-code | "{" icmp-list "}" )
icmp6-type = "icmp6-type" ( icmp-type-code | "{" icmp-list "}" )
icmp-type-code = ( icmp-type-name | icmp-type-number )
[ "code" ( icmp-code-name | icmp-code-number ) ]
icmp-list = icmp-type-code [ [ "," ] icmp-list ]
tos = ( "lowdelay" | "throughput" | "reliability" |
[ "0x" ] number )
state-opts = state-opt [ [ "," ] state-opts ]
state-opt = ( "max" number | "no-sync" | timeout | "sloppy" | "pflow" |
"source-track" [ ( "rule" | "global" ) ] |
"max-src-nodes" number | "max-src-states" number |
"max-src-conn" number |
"max-src-conn-rate" number "/" number |
"overload" "\*(Lt" string "\*(Gt" [ "flush" [ "global" ] ] |
"if-bound" | "floating" )
fragmentation = [ "fragment reassemble" | "fragment crop" |
"fragment drop-ovl" ]
timeout-list = timeout [ [ "," ] timeout-list ]
timeout = ( "tcp.first" | "tcp.opening" | "tcp.established" |
"tcp.closing" | "tcp.finwait" | "tcp.closed" |
"udp.first" | "udp.single" | "udp.multiple" |
"icmp.first" | "icmp.error" |
"other.first" | "other.single" | "other.multiple" |
"frag" | "interval" | "src.track" |
"adaptive.start" | "adaptive.end" ) number
limit-list = limit-item [ [ "," ] limit-list ]
limit-item = ( "states" | "frags" | "src-nodes" ) number
pooltype = ( "bitmask" | "random" |
"source-hash" [ ( hex-key | string-key ) ] |
"round-robin" ) [ sticky-address ]
subqueue = string | "{" queue-list "}"
queue-list = string [ [ "," ] string ]
cbq-def = "cbq" [ "(" cbq-opt [ [ "," ] cbq-opt ] ")" ]
priq-def = "priq" [ "(" priq-opt [ [ "," ] priq-opt ] ")" ]
hfsc-def = "hfsc" [ "(" hfsc-opt [ [ "," ] hfsc-opt ] ")" ]
cbq-opt = ( "default" | "borrow" | "red" | "ecn" | "rio" )
priq-opt = ( "default" | "red" | "ecn" | "rio" )
hfsc-opt = ( "default" | "red" | "ecn" | "rio" |
linkshare-sc | realtime-sc | upperlimit-sc )
linkshare-sc = "linkshare" sc-spec
realtime-sc = "realtime" sc-spec
upperlimit-sc = "upperlimit" sc-spec
sc-spec = ( bandwidth-spec |
"(" bandwidth-spec number bandwidth-spec ")" )
include = "include" filename
.Ed
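.Pp
As a reading aid, the sketch below maps one filter rule onto the
.Li pf-rule
production.
The interface name kue0 and the rule itself are illustrative; macros are
assumed to have been expanded already.
.Bd -literal
pass in quick on kue0 inet proto tcp \e
    from any to any port 22 keep state (max 100)

action                 pass
( "in" | "out" )       in
"quick"                quick
"on" ifspec            on kue0
af                     inet
protospec              proto tcp
hosts                  from any to any port 22
filteropt-list         keep state (max 100)
.Ed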
.Sh FILES
.Bl -tag -width "/etc/protocols" -compact
.It Pa /etc/hosts
Host name database.
.It Pa /etc/pf.conf
Default location of the ruleset file.
.It Pa /etc/pf.os
Default location of OS fingerprints.
.It Pa /etc/protocols
Protocol name database.
.It Pa /etc/services
Service name database.
.El
.Sh BUGS
Due to a lock order reversal (LOR) with the socket layer, the use of the
.Ar group
and
.Ar user
filter parameters in conjunction with a Giant-free netstack
can result in a deadlock.
A workaround is available under the
.Va debug.pfugidhack
sysctl which is automatically enabled when a
.Ar user
/
.Ar group
rule is added or
.Ar log (user)
is specified.
.Pp
Route labels are not supported by the
.Fx
.Xr route 4
system.
Rules with a route label do not match any traffic.
.Sh SEE ALSO
.Xr altq 4 ,
.Xr carp 4 ,
.Xr icmp 4 ,
.Xr icmp6 4 ,
.Xr ip 4 ,
.Xr ip6 4 ,
.Xr pf 4 ,
.Xr pflow 4 ,
.Xr pfsync 4 ,
.Xr route 4 ,
.Xr tcp 4 ,
.Xr udp 4 ,
.Xr hosts 5 ,
.Xr pf.os 5 ,
.Xr protocols 5 ,
.Xr services 5 ,
.Xr ftp-proxy 8 ,
.Xr pfctl 8 ,
.Xr pflogd 8 ,
.Xr route 8
.Sh HISTORY
The
.Nm
file format first appeared in
.Ox 3.0 .