program name, and ignore that entry. ipfw2.c code instead skips
this entry and starts with options at offset 0, relying on a more
tolerant implementation of the library.
This change fixes the issue by always passing a program name
in the first entry to getopt. The motivation for this change
is to remove a potential compatibility issue should we use
a different getopt() implementation in the future.
No functional changes.
Submitted by: Marta Carbone (parts)
MFC after: 4 weeks
There are still several signed/unsigned warnings left, which
require a bit more study for a proper fix.
This file has grown beyond reasonable limits.
We really need to split it into separate components (ipv4, ipv6,
dummynet, nat, table, userland-kernel communication ...) so we can
make mainteinance easier.
MFC after: 1 weeks
This particular implementation is designed to be fully backwards compatible
and to be MFC-able to 7.x (and 6.x)
Currently the only protocol that can make use of the multiple tables is IPv4
Similar functionality exists in OpenBSD and Linux.
From my notes:
-----
One thing where FreeBSD has been falling behind, and which by chance I
have some time to work on is "policy based routing", which allows
different
packet streams to be routed by more than just the destination address.
Constraints:
------------
I want to make some form of this available in the 6.x tree
(and by extension 7.x) , but FreeBSD in general needs it so I might as
well do it in -current and back port the portions I need.
One of the ways that this can be done is to have the ability to
instantiate multiple kernel routing tables (which I will now
refer to as "Forwarding Information Bases" or "FIBs" for political
correctness reasons). Which FIB a particular packet uses to make
the next hop decision can be decided by a number of mechanisms.
The policies these mechanisms implement are the "Policies" referred
to in "Policy based routing".
One of the constraints I have if I try to back port this work to
6.x is that it must be implemented as a EXTENSION to the existing
ABIs in 6.x so that third party applications do not need to be
recompiled in timespan of the branch.
This first version will not have some of the bells and whistles that
will come with later versions. It will, for example, be limited to 16
tables in the first commit.
Implementation method, Compatible version. (part 1)
-------------------------------
For this reason I have implemented a "sufficient subset" of a
multiple routing table solution in Perforce, and back-ported it
to 6.x. (also in Perforce though not always caught up with what I
have done in -current/P4). The subset allows a number of FIBs
to be defined at compile time (8 is sufficient for my purposes in 6.x)
and implements the changes needed to allow IPV4 to use them. I have not
done the changes for ipv6 simply because I do not need it, and I do not
have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.
Other protocol families are left untouched and should there be
users with proprietary protocol families, they should continue to work
and be oblivious to the existence of the extra FIBs.
To understand how this is done, one must know that the current FIB
code starts everything off with a single dimensional array of
pointers to FIB head structures (One per protocol family), each of
which in turn points to the trie of routes available to that family.
The basic change in the ABI compatible version of the change is to
extent that array to be a 2 dimensional array, so that
instead of protocol family X looking at rt_tables[X] for the
table it needs, it looks at rt_tables[Y][X] when for all
protocol families except ipv4 Y is always 0.
Code that is unaware of the change always just sees the first row
of the table, which of course looks just like the one dimensional
array that existed before.
The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
are all maintained, but refer only to the first row of the array,
so that existing callers in proprietary protocols can continue to
do the "right thing".
Some new entry points are added, for the exclusive use of ipv4 code
called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
which have an extra argument which refers the code to the correct row.
In addition, there are some new entry points (currently called
rtalloc_fib() and friends) that check the Address family being
looked up and call either rtalloc() (and friends) if the protocol
is not IPv4 forcing the action to row 0 or to the appropriate row
if it IS IPv4 (and that info is available). These are for calling
from code that is not specific to any particular protocol. The way
these are implemented would change in the non ABI preserving code
to be added later.
One feature of the first version of the code is that for ipv4,
the interface routes show up automatically on all the FIBs, so
that no matter what FIB you select you always have the basic
direct attached hosts available to you. (rtinit() does this
automatically).
You CAN delete an interface route from one FIB should you want
to but by default it's there. ARP information is also available
in each FIB. It's assumed that the same machine would have the
same MAC address, regardless of which FIB you are using to get
to it.
This brings us as to how the correct FIB is selected for an outgoing
IPV4 packet.
Firstly, all packets have a FIB associated with them. if nothing
has been done to change it, it will be FIB 0. The FIB is changed
in the following ways.
Packets fall into one of a number of classes.
1/ locally generated packets, coming from a socket/PCB.
Such packets select a FIB from a number associated with the
socket/PCB. This in turn is inherited from the process,
but can be changed by a socket option. The process in turn
inherits it on fork. I have written a utility call setfib
that acts a bit like nice..
setfib -3 ping target.example.com # will use fib 3 for ping.
It is an obvious extension to make it a property of a jail
but I have not done so. It can be achieved by combining the setfib and
jail commands.
2/ packets received on an interface for forwarding.
By default these packets would use table 0,
(or possibly a number settable in a sysctl(not yet)).
but prior to routing the firewall can inspect them (see below).
(possibly in the future you may be able to associate a FIB
with packets received on an interface.. An ifconfig arg, but not yet.)
3/ packets inspected by a packet classifier, which can arbitrarily
associate a fib with it on a packet by packet basis.
A fib assigned to a packet by a packet classifier
(such as ipfw) would over-ride a fib associated by
a more default source. (such as cases 1 or 2).
4/ a tcp listen socket associated with a fib will generate
accept sockets that are associated with that same fib.
5/ Packets generated in response to some other packet (e.g. reset
or icmp packets). These should use the FIB associated with the
packet being reponded to.
6/ Packets generated during encapsulation.
gif, tun and other tunnel interfaces will encapsulate using the FIB
that was in effect withthe proces that set up the tunnel.
thus setfib 1 ifconfig gif0 [tunnel instructions]
will set the fib for the tunnel to use to be fib 1.
Routing messages would be associated with their
process, and thus select one FIB or another.
messages from the kernel would be associated with the fib they
refer to and would only be received by a routing socket associated
with that fib. (not yet implemented)
In addition Netstat has been edited to be able to cope with the
fact that the array is now 2 dimensional. (It looks in system
memory using libkvm (!)). Old versions of netstat see only the first FIB.
In addition two sysctls are added to give:
a) the number of FIBs compiled in (active)
b) the default FIB of the calling process.
Early testing experience:
-------------------------
Basically our (IronPort's) appliance does this functionality already
using ipfw fwd but that method has some drawbacks.
For example,
It can't fully simulate a routing table because it can't influence the
socket's choice of local address when a connect() is done.
Testing during the generating of these changes has been
remarkably smooth so far. Multiple tables have co-existed
with no notable side effects, and packets have been routes
accordingly.
ipfw has grown 2 new keywords:
setfib N ip from anay to any
count ip from any to any fib N
In pf there seems to be a requirement to be able to give symbolic names to the
fibs but I do not have that capacity. I am not sure if it is required.
SCTP has interestingly enough built in support for this, called VRFs
in Cisco parlance. it will be interesting to see how that handles it
when it suddenly actually does something.
Where to next:
--------------------
After committing the ABI compatible version and MFCing it, I'd
like to proceed in a forward direction in -current. this will
result in some roto-tilling in the routing code.
Firstly: the current code's idea of having a separate tree per
protocol family, all of the same format, and pointed to by the
1 dimensional array is a bit silly. Especially when one considers that
there is code that makes assumptions about every protocol having the
same internal structures there. Some protocols don't WANT that
sort of structure. (for example the whole idea of a netmask is foreign
to appletalk). This needs to be made opaque to the external code.
My suggested first change is to add routing method pointers to the
'domain' structure, along with information pointing the data.
instead of having an array of pointers to uniform structures,
there would be an array pointing to the 'domain' structures
for each protocol address domain (protocol family),
and the methods this reached would be called. The methods would have
an argument that gives FIB number, but the protocol would be free
to ignore it.
When the ABI can be changed it raises the possibilty of the
addition of a fib entry into the "struct route". Currently,
the structure contains the sockaddr of the desination, and the resulting
fib entry. To make this work fully, one could add a fib number
so that given an address and a fib, one can find the third element, the
fib entry.
Interaction with the ARP layer/ LL layer would need to be
revisited as well. Qing Li has been working on this already.
This work was sponsored by Ironport Systems/Cisco
Reviewed by: several including rwatson, bz and mlair (parts each)
Obtained from: Ironport systems/Cisco
the limit in bytes) hard coded into both the kernel and userland.
Make both these limits a sysctl, so it is easy to change the limit.
If the userland part of ipfw finds that the sysctls don't exist,
it will just fall back to the traditional limits.
(100 packets is quite a small limit these days. If you want to test
TCP at 100Mbps, 100 packets can only accommodate a DBP of 12ms.)
Note these sysctls in the man page and warn against increasing them
without thinking first.
MFC after: 3 weeks
table 'values' as IP addresses, use an explicit argument (-i).
This is a 'POLA' issue. This is a low risk change and should be MFC'd
to RELENG_6 and RELENG 7. it might be put as an errata item for 6.3.
(not sure about 6.2).
Fix suggested by: Eugene Grosbein
PR: 120720
MFC After: 3 days
exposing them to all consumers of ip_fw.h. These structures are
used in both ipfw(8) and ipfw(4), but not part of the user<->kernel
interface for other applications to use, rather, shared
implementation.
MFC after: 3 days
Reported by: Paul Vixie <paul at vix dot com>
- refer to the dummynet(4) man page only once, later use rather
the .Nm macro.
- use .Va macro when refering to the sysctl variables
- grammar and markup fixes
Reviewed by: keramida, trhodes, ru (roughly)
MFC-after: 1 week
If it is set to zero value (default) dummynet module will try to emulate
real link as close as possible (bandwidth & latency): packet will not leave
pipe faster than it should be on real link with given bandwidth.
(This is original behaviour of dummynet which was altered in previous commit)
If it is set to non-zero value only bandwidth is enforced: packet's latency
can be lower comparing to real link with given bandwidth.
- Document recently introduced dummynet(4) sysctl variables.
Requested by: luigi, julian
MFC after: 3 month
$ ipfw -n add 1 allow layer2 not mac-type ip
00001 allow ip from any to any layer2 not not mac-type 0x0800
PR: bin/115372
Submitted by: Andrey V. Elsukov
Approved by: re (hrs)
MFC after: 3 weeks
pack a set number correctly.
Submitted by: oleg
o Plug a memory leak.
Submitted by: oleg and Andrey V. Elsukov
Approved by: re (kensmith)
MFC after: 1 week
Also rename the related functions in a similar way.
There are no functional changes.
For a packet coming in with IPsec tunnel mode, the default is
to only call into the firewall with the "outer" IP header and
payload.
With this option turned on, in addition to the "outer" parts,
the "inner" IP header and payload are passed to the
firewall too when going through ip_input() the second time.
The option was never only related to a gif(4) tunnel within
an IPsec tunnel and thus the name was very misleading.
Discussed at: BSDCan 2007
Best new name suggested by: rwatson
Reviewed by: rwatson
Approved by: re (bmah)
- to show a specific set: ipfw set 3 show
- to delete rules from the set: ipfw set 9 delete 100 200 300
- to flush the set: ipfw set 4 flush
- to reset rules counters in the set: ipfw set 1 zero
PR: kern/113388
Submitted by: Andrey V. Elsukov
Approved by: re (kensmith)
MFC after: 6 weeks
Before:
$ ipfw -n add 100 count icmp from any to any mac-type 0x01
00100 count icmp 0x0001
$ ipfw -n add 100 count icmp from any to any mac any any
00100 count icmp MAC any any any
After:
$ ipfw -n add 100 count icmp from any to any mac-type 0x01
00100 count icmp from any to any mac-type 0x0001
$ ipfw -n add 100 count icmp from any to any mac any any
00100 count icmp from any to any MAC any any
PR: bin/112244
Submitted by: Andrey V. Elsukov
MFC after: 1 month
With the second (and last) part of my previous Summer of Code work, we get:
-ipfw's in kernel nat
-redirect_* and LSNAT support
General information about nat syntax and some examples are available
in the ipfw (8) man page. The redirect and LSNAT syntax are identical
to natd, so please refer to natd (8) man page.
To enable in kernel nat in rc.conf, two options were added:
o firewall_nat_enable: equivalent to natd_enable
o firewall_nat_interface: equivalent to natd_interface
Remember to set net.inet.ip.fw.one_pass to 0, if you want the packet
to continue being checked by the firewall ruleset after being
(de)aliased.
NOTA BENE: due to some problems with libalias architecture, in kernel
nat won't work with TSO enabled nic, thus you have to disable TSO via
ifconfig (ifconfig foo0 -tso).
Approved by: glebius (mentor)
address, to avoid confusing the users that a full address is
always required.
Submitted by: Josh Paetzel <josh@tcbug.org> (through freebsd-doc)
MFC after: 3 days
otherwise this command
ipfw add allow ipv6-icmp from any to 2002::1 icmp6types 1,2,128,129
turns into icmp6types 1,2,32,33,34,...94,95,128,129
PR: 102422 (part 1)
Submitted by: Andrey V. Elsukov <bu7cher at yandex.ru>
MFC after: 5 days
having trouble with the "me6" keyword. Also, we were using inet_pton on
the wrong variable in one place.
Reviewed by: mlaier (previous version of patch)
Obtained from: Sascha Blank (inet_pton change)
MFC after: 1 week
for example:
fwd tablearg ip from any to table(1)
where table 1 has entries of the form:
1.1.1.0/24 10.2.3.4
208.23.2.0/24 router2
This allows trivial implementation of a secondary routing table implemented
in the firewall layer.
I expect more work (under discussion with Glebius) to follow this to clean
up some of the messy parts of ipfw related to tables.
Reviewed by: Glebius
MFC after: 1 month
- 'tag' & 'untag' action parameters.
- 'tagged' & 'limit' rule options.
Rule examples:
pipe 1 tag tablearg ip from table(1) to any
allow ip from any to table(2) tagged tablearg
allow tcp from table(3) to any 25 setup limit src-addr tablearg
sbin/ipfw/ipfw2.c:
1) new macros
GET_UINT_ARG - support of 'tablearg' keyword, argument range checking.
PRINT_UINT_ARG - support of 'tablearg' keyword.
2) strtoport(): do not silently truncate/accept invalid port list expressions
like: '1,2-abc' or '1,2-3-4' or '1,2-3x4'. style(9) cleanup.
Approved by: glebius (mentor)
MFC after: 1 month
Since tags are kept while packet resides in kernelspace, it's possible to
use other kernel facilities (like netgraph nodes) for altering those tags.
Submitted by: Andrey Elsukov <bu7cher at yandex dot ru>
Submitted by: Vadim Goncharov <vadimnuclight at tpu dot ru>
Approved by: glebius (mentor)
Idea from: OpenBSD PF
MFC after: 1 month
doesn't exist or add one that is already present, if the -q flag
is set. Useful for "ipfw -q /dev/stdin" when the command above is
invoked from something like python or TCL to feed commands
down the throat of ipfw.
MFC in: 1 week
action argument with the value obtained from table lookup. The feature
is now applicable only to "pipe", "queue", "divert", "tee", "netgraph"
and "ngtee" rules.
An example usage:
ipfw pipe 1000 config bw 1000Kbyte/s
ipfw pipe 4000 config bw 4000Kbyte/s
ipfw table 1 add x.x.x.x 1000
ipfw table 1 add x.x.x.y 4000
ipfw pipe tablearg ip from table(1) to any
In the example above the rule will throw different packets to different pipes.
TODO:
- Support "skipto" action, but without searching all rules.
- Improve parser, so that it warns about bad rules. These are:
- "tablearg" argument to action, but no "table" in the rule. All
traffic will be blocked.
- "tablearg" argument to action, but "table" searches for entry with
a specific value. All traffic will be blocked.
- "tablearg" argument to action, and two "table" looks - for src and
for dst. The last lookup will match.
IPv6 support was committed:
- Stop treating `ip' and `ipv6' as special in `proto' option as they
conflict with /etc/protocols.
- Disuse `ipv4' in `proto' option as it is corresponding to `ipv6'.
- When protocol is specified as numeric, treat it as it is even it is
41 (ipv6).
- Allow zero for protocol as it is valid number of `ip'.
Still, we cannot specify an IPv6 over an IPv4 tunnel like before such
as:
pass ipv6 from any to any
But, now, you can specify it like:
pass ip4 from any to any proto ipv6
PR: kern/89472
Reported by: Ga l Roualland <gael.roualland__at__dial.oleane.com>
MFC after: 1 week
that debug.mpsafenet be set to 0. It is still possible for dead locks to
occur while these filtering options are used due to the layering violation
inherent in their implementation.
Discussed: -current, rwatson, glebius
* Correct handling of IPv6 Extension Headers.
* Add unreach6 code.
* Add logging for IPv6.
Submitted by: sysctl handling derived from patch from ume needed for ip6fw
Obtained from: is_icmp6_query and send_reject6 derived from similar
functions of netinet6,ip6fw
Reviewed by: ume, gnn; silence on ipfw@
Test setup provided by: CK Software GmbH
MFC after: 6 days
policy. It may be used to provide more detailed classification of
traffic without actually having to decide its fate at the time of
classification.
MFC after: 1 week
This is the last requirement before we can retire ip6fw.
Reviewed by: dwhite, brooks(earlier version)
Submitted by: dwhite (manpage)
Silence from: -ipfw
with the kernel compile time option:
options IPFIREWALL_FORWARD_EXTENDED
This option has to be specified in addition to IPFIRWALL_FORWARD.
With this option even packets targeted for an IP address local
to the host can be redirected. All restrictions to ensure proper
behaviour for locally generated packets are turned off. Firewall
rules have to be carefully crafted to make sure that things like
PMTU discovery do not break.
Document the two kernel options.
PR: kern/71910
PR: kern/73129
MFC after: 1 week
This commit replaces those with two new functions that simplify the code
and produce warnings that the syntax is deprecated. A small number of
sensible abbreviations may be explicitly added based on user feedback.
There were previously three types of strncmp use in ipfw:
- Most commonly, strncmp(av, "string", sizeof(av)) was used to allow av
to match string or any shortened form of it. I have replaced this
with a new function _substrcmp(av, "string") which returns 0 if av
is a substring of "string", but emits a warning if av is not exactly
"string".
- The next type was two instances of strncmp(av, "by", 2) which allowed
the abbreviation of bytes to "by", "byt", etc. Unfortunately, it
also supported "bykHUygh&*g&*7*ui". I added a second new function
_substrcmp2(av, "by", "bytes") which acts like the strncmp did, but
complains if the user doesn't spell out the word "bytes".
- There is also one correct use of strncmp to match "table(" which might
have another token after it without a space.
Since I changed all the lines anyway, I also fixed the treatment of
strncmp's return as a boolean in many cases. I also modified a few
strcmp cases as well to be fully consistent.
reversals+system lock ups if they are using ucred based rules
while running with debug.mpsafenet=1.
I am working on merging a shared locking mechanism into ipfw which
should take care of this problem, but it still requires a bit more
testing and review.
and sent to the DIVERT socket while the original packet continues with the
next rule. Unlike a normally diverted packet no IP reassembly attemts are
made on tee'd packets and they are passed upwards totally unmodified.
Note: This will not be MFC'd to 4.x because of major infrastucture changes.
PR: kern/64240 (and many others collapsed into that one)
contain O_UID, O_GID and O_JAIL opcodes, the F_NOT or F_OR logical
operator bits get clobbered. Making it impossible to use the ``NOT'' or
``OR'' operators with uid, gid and jail based constraints.
The ipfw_insn instruction template contains a ``len'' element which
stores two pieces of information, the size of the instruction
(in 32-bit words) in the low 6 bits of "len" with the 2 remaining
bits to implement OR and NOT.
The current code clobbers the OR and NOT bits by initializing the
``len'' element to the size, rather than OR'ing the bits. This change
fixes this by changing the initialization of cmd->len to an OR operation
for the O_UID, O_GID and O_JAIL opcodes.
This may be a MFC candidate for RELENG_5.
Reviewed by: andre
Approved by: luigi
PR: kern/63961 (partially)
keyword but without 'logamount' limit the amount of their log messages
by net.inet.ip.fw.verbose_limit sysctl value.
RELENG_5 candidate.
PR: kern/46080
Submitted by: Dan Pelleg
MFC after: 1 week
Since the only thing truly unique about a prison is it's ID, I figured
this would be the most granular way of handling this.
This commit makes the following changes:
- Adds tokenizing and parsing for the ``jail'' command line option
to the ipfw(8) userspace utility.
- Append the ipfw opcode list with O_JAIL.
- While Iam here, add a comment informing others that if they
want to add additional opcodes, they should append them to the end
of the list to avoid ABI breakage.
- Add ``fw_prid'' to the ipfw ucred cache structure.
- When initializing ucred cache, if the process is jailed,
set fw_prid to the prison ID, otherwise set it to -1.
- Update man page to reflect these changes.
This change was a strong motivator behind the ucred caching
mechanism in ipfw.
A sample usage of this new functionality could be:
ipfw add count ip from any to any jail 2
It should be noted that because ucred based constraints
are only implemented for TCP and UDP packets, the same
applies for jail associations.
Conceptual head nod by: pjd
Reviewed by: rwatson
Approved by: bmilekic (mentor)
For incoming packets, the packet's source address is checked if it
belongs to a directly connected network. If the network is directly
connected, then the interface the packet came on in is compared to
the interface the network is connected to. When incoming interface
and directly connected interface are not the same, the packet does
not match.
Usage example:
ipfw add deny ip from any to any not antispoof in
Manpage education by: ru
RTF_BLACKHOLE as well.
To quote the submitter:
The uRPF loose-check implementation by the industry vendors, at least on Cisco
and possibly Juniper, will fail the check if the route of the source address
is pointed to Null0 (on Juniper, discard or reject route). What this means is,
even if uRPF Loose-check finds the route, if the route is pointed to blackhole,
uRPF loose-check must fail. This allows people to utilize uRPF loose-check mode
as a pseudo-packet-firewall without using any manual filtering configuration --
one can simply inject a IGP or BGP prefix with next-hop set to a static route
that directs to null/discard facility. This results in uRPF Loose-check failing
on all packets with source addresses that are within the range of the nullroute.
Submitted by: James Jun <james@towardex.com>
o Add sanity checking to the firewall delete operation
which tells the user that a firewall rule
specification is required.
The previous behaviour was to exit without reporting any
errors to the user.
Approved by: bmilekic (mentor)
mac ipfw rules. The exact same sanity check is performed as
the first operation of add_mac(), so there is no sense
in doing it twice.
Approved by: bmilekic (mentor)
PR: bin/55981
source address of a packet exists in the routing table. The
default route is ignored because it would match everything and
render the check pointless.
This option is very useful for routers with a complete view of
the Internet (BGP) in the routing table to reject packets with
spoofed or unrouteable source addresses.
Example:
ipfw add 1000 deny ip from any to any not versrcreach
also known in Cisco-speak as:
ip verify unicast source reachable-via any
Reviewed by: luigi
rule, thus omitting the entire body.
This makes the output a lot more readable for complex rulesets
(provided, of course, you have annotated your ruleset appropriately!)
MFC after: 3 days
code is compiled in to support the O_IPSEC operator. Previously no
support was included and ipsec rules were always matching. Note that
we do not return an error when an ipsec rule is added and the kernel
does not have IPsec support compiled in; this is done intentionally
but we may want to revisit this (document this in the man page).
PR: 58899
Submitted by: Bjoern A. Zeeb
Approved by: re (rwatson)
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
of do_cmd() broke things, because this function assumes that a socklen_t
is large enough to hold a pointer.
A real solution to this problem would be a rewrite of do_cmd() to
treat the optlen parameter consistently and not use it to carry
a pointer or integer dependent on the context.
Allow set 31 to be used for rules other than 65535.
Set 31 is still special because rules belonging to it are not deleted
by the "ipfw flush" command, but must be deleted explicitly with
"ipfw delete set 31" or by individual rule numbers.
This implement a flexible form of "persistent rules" which you might
want to have available even after an "ipfw flush".
Note that this change does not violate POLA, because you could not
use set 31 in a ruleset before this change.
Suggested by: Paul Richards
introduced in the latest commits).
Also:
* update the 'ipfw -h' output;
* allow rules of the form "100 add allow ..." i.e. with the index first.
(requested by Paul Richards). This was an undocumented ipfw1 behaviour,
and it is left undocumented.
and minor code cleanups.
* make the code compile with WARNS=5 (at least on i386), mostly
by adding 'const' specifier and replacing "void *" with "char *"
in places where pointer arithmetic was used.
This also spotted a few places where invalid tests (e.g. uint < 0)
were used.
* support ranges in "list" and "show" commands. Now you can say
ipfw show 100-1000 4000-8000
which is very convenient when you have large rulesets.
* implement comments in ipfw commands. These are implemented in the
kernel as O_NOP commands (which always match) whose body contains
the comment string. In userland, a comment is a C++-style comment:
ipfw add allow ip from me to any // i can talk to everybody
The choice of '//' versus '#' is somewhat arbitrary, but because
the preprocessor/readfile part of ipfw used to strip away '#',
I did not want to change this behaviour.
If a rule only contains a comment
ipfw add 1000 // this rule is just a comment
then it is stored as a 'count' rule (this is also to remind
the user that scanning through a rule is expensive).
* improve handling of flags (still to be completed).
ipfw_main() was written thinking of 'one rule per ipfw invocation',
and so flags are set and never cleared. With readfile/preprocessor
support, this changes and certain flags should be reset on each
line. For the time being, only fix handling of '-a' which
differentiates the "list" and "show" commands.
* rework the preprocessor support -- ipfw_main() already had most
of the parsing code, so i have moved in there the only missing
bit (stripping away '#' and comments) and removed the parsing
from ipfw_readfile().
Also, add some more options (such as -c, -N, -S) to the readfile
section.
MFC after: 3 days
spaces and comma-separated lists of arguments;
* reword the description of address specifications, to include
previous and current changes for address sets and lists;
* document the new '-n' flag.
* update the section on differences between ipfw1 and ipfw2
(this is becoming boring!)
MFC after: 3 days
* Make the addr-set size optional (defaults to /24)
You can now write 1.2.3.0/24{56-80} or 1.2.3.0{56-80}
Also make the parser more strict.
* Support a new format for the list of addresses:
1.2.3.4,5.6.7.8/30,9.10.11.12/22,12.12.12.13, ...
which exploits the new capabilities of O_IP_SRC_MASK/O_IP_DST_MASK
* Allow spaces after commas to make lists of addresses more readable.
1.2.3.4, 5.6.7.8/30, 9.10.11.12/22, 12.12.12.13, ...
* ipfw will now accept full commands as a single argument and strip
extra leading/trailing whitespace as below:
ipfw "-q add allow ip from 1.2.3.4 to 5.6.7.8, 9.10.11.23 "
This should help in moving the body of ipfw into a library
that user programs can invoke.
* Cleanup some comments and data structures.
* Do not print rule counters for dynamic rules with ipfw -d list
(PR 51182)
* Improve 'ipfw -h' output (PR 46785)
* Add a '-n' flag to test the syntax of commands without actually
calling [gs]etsockopt() (PR 44238)
* Support the '-n' flag also with the preprocessors;
Manpage commit to follow.
MFC after: 3 days
Should work with both regular and fast ipsec (mutually exclusive).
See manpage for more details.
Submitted by: Ari Suutari (ari.suutari@syncrontech.com)
Revised by: sam
MFC after: 1 week
1.2.3.4/24{5,6,7,10-20,60-90}
for set of ip addresses.
Previously you needed to specify every address in the range, which
was unconvenient and lead to very long lines.
Internally the set is still stored in the same way, just the
input and output routines are modified.
Manpage update still missing.
Perhaps a similar preprocessing step would be useful for port ranges.
MFC after: 3 days
"ipid" options. This feature has been requested by several users.
On passing, fix some minor bugs in the parser. This change is fully
backward compatible so if you have an old /sbin/ipfw and a new
kernel you are not in trouble (but you need to update /sbin/ipfw
if you want to use the new features).
Document the changes in the manpage.
Now you can write things like
ipfw add skipto 1000 iplen 0-500
which some people were asking to give preferential treatment to
short packets.
The 'MFC after' is just set as a reminder, because I still need
to merge the Alpha/Sparc64 fixes for ipfw2 (which unfortunately
change the size of certain kernel structures; not that it matters
a lot since ipfw2 is entirely optional and not the default...)
PR: bin/48015
MFC after: 1 week
comes in on is the same interface that we would route out of to get to
the packet's source address. Essentially automates an anti-spoofing
check using the information in the routing table.
Experimental. The usage and rule format for the feature may still be
subject to change.
width of fields for packets and bytes counters.
PR: bin/47196
Reviewed by: -audit
Not objected by: luigi, des
o Use %llu instead of deprecated %qu convert specification for ipfw
packets and bytes counters.
Noted by: des
MFC after: 1 month
default-to-deny firewall. Simply turning off IPFW via a preexisting
sysctl does the job. To make it more apparent (since nobody picked up
on this in a week's worth of flames), the boolean sysctl's have been
integrated into the /sbin/ipfw command set in an obvious and straightforward
manner. For example, you can now do 'ipfw disable firewall' or
'ipfw enable firewall'. This is far easier to remember then the
net.inet.ip.fw.enable sysctl.
Reviewed by: imp
MFC after: 3 days
after -p except for the last (the ruleset file to process) to the
preprocessor for interpretation. This allows command-line options besides
-U and -D to be passed to cpp(1) and m4(1) as well as making it easier to
use other preprocessors.
Sponsored By: NTT Multimedia Communications Labs
MFC after: 1 week
prob 0.5 pipe NN ....
due to the generation of an invalid ipfw instruction sequence.
No ABI change, but you need to upgrade /sbin/ipfw to generate the
correct code.
Approved by: re
to net.inet.ip.fw.one_pass.
Add to notes to explain the exact behaviour of "prob xxx" and "log"
options.
Virtually approved by: re (mentioned in rev.1.19 of ip_fw2.c)
Quoting luigi:
In order to make the userland code fully 64-bit clean it may
be necessary to commit other changes that may or may not cause
a minor change in the ABI.
Reviewed by: luigi
has always done.
Technically, this is the wrong format, but it reduces the diffs in
-stable. Someday, when we get rid of ipfw1, I will put the port number
in the proper format both in kernel and userland.
MFC after: 3 days
(with re@ permission)
following Julian's good suggestion: since you can specify any match
pattern as an option, rules now have the following format:
[<proto> from <src> to <dst>] [options]
i.e. the first part is now entirely optional (and left there just
for compatibility with ipfw1 rulesets).
Add a "-c" flag to show/list rules in the compact form
(i.e. without the "ip from any to any" part) when possible.
The default is to include it so that scripts processing ipfw's
canonical output will still work.
Note that as part of this cleanup (and to remove ambiguity), MAC
fields now can only be specified in the options part.
Update the manpage to reflect the syntax.
Clarify the behaviour when a match is attempted on fields which
are not present in the packet, e.g. port numbers on non TCP/UDP
packets, and the "not" operator is specified. E.g.
ipfw add allow not src-port 80
will match also ICMP packets because they do not have port numbers, so
"src-port 80" will fail and "not src-port 80" will succeed. For such
cases it is advised to insert further options to prevent undesired results
(e.g. in the case above, "ipfw add allow proto tcp not src-port 80").
We definitely need to rewrite the parser using lex and yacc!
render the syntax less ambiguous.
Now rules can be in one of these two forms
<action> <protocol> from <src> to <dst> [options]
<action> MAC dst-mac src-mac mac-type [options]
however you can now specify MAC and IP header fields as options e.g.
ipfw add allow all from any to any mac-type arp
ipfw add allow all from any to any { dst-ip me or src-ip me }
which makes complex expressions a lot easier to write and parse.
The "all from any to any" part is there just for backward compatibility.
Manpage updated accordingly.
Implement the M_SKIP_FIREWALL bit in m_flags to avoid loops
for firewall-generated packets (the constant has to go in sys/mbuf.h).
Better comments on keepalive generation, and enforce dyn_rst_lifetime
and dyn_fin_lifetime to be less than dyn_keepalive_period.
Enforce limits (up to 64k) on the number of dynamic buckets, and
retry allocation with smaller sizes.
Raise default number of dynamic rules to 4096.
Improved handling of set of rules -- now you can atomically
enable/disable multiple sets, move rules from one set to another,
and swap sets.
sbin/ipfw/ipfw2.c:
userland support for "noerror" pipe attribute.
userland support for sets of rules.
minor improvements on rule parsing and printing.
sbin/ipfw/ipfw.8:
more documentation on ipfw2 extensions, differences from ipfw1
(so we can use the same manpage for both), stateful rules,
and some additional examples.
Feedback and more examples needed here.
with ipfw2 extensions and give examples of use of the new features.
This is just a preliminary commit, where i simply added the basic
syntax for the extensions, and clean up the page (e.g. by listing
things in alphabetical rather than random order).
I would appreciate feedback and possible corrections/extensions
by interested parties.
Still missing are a more detailed description of stateful rules
(with keepalives), interaction with of stateful rules and natd (don't do
that!), examples of use with the recently introduced rule sets.
There is an issue related to the MFC: RELENG_4 still has ipfw as a
default, and ipfw2 is optional. We have two options here: MFC this
page as ipfw(8) adding a large number of "SORRY NOT IN IPFW" notes,
or create a new ipfw2(8) manpage just for -stable users. I am all
for the first approach, but of course am listening to your comments.
The bugfix (ipfw2.c) makes the handling of port numbers with
a dash in the name, e.g. ftp-data, consistent with old ipfw:
use \\ before the - to consider it as part of the name and not
a range separator.
The new feature (all this description will go in the manpage):
each rule now belongs to one of 32 different sets, which can
be optionally specified in the following form:
ipfw add 100 set 23 allow ip from any to any
If "set N" is not specified, the rule belongs to set 0.
Individual sets can be disabled, enabled, and deleted with the commands:
ipfw disable set N
ipfw enable set N
ipfw delete set N
Enabling/disabling of a set is atomic. Rules belonging to a disabled
set are skipped during packet matching, and they are not listed
unless you use the '-S' flag in the show/list commands.
Note that dynamic rules, once created, are always active until
they expire or their parent rule is deleted.
Set 31 is reserved for the default rule and cannot be disabled.
All sets are enabled by default. The enable/disable status of the sets
can be shown with the command
ipfw show sets
Hopefully, this feature will make life easier to those who want to
have atomic ruleset addition/deletion/tests. Examples:
To add a set of rules atomically:
ipfw disable set 18
ipfw add ... set 18 ... # repeat as needed
ipfw enable set 18
To delete a set of rules atomically
ipfw disable set 18
ipfw delete set 18
ipfw enable set 18
To test a ruleset and disable it and regain control if something
goes wrong:
ipfw disable set 18
ipfw add ... set 18 ... # repeat as needed
ipfw enable set 18 ; echo "done "; sleep 30 && ipfw disable set 18
here if everything goes well, you press control-C before
the "sleep" terminates, and your ruleset will be left
active. Otherwise, e.g. if you cannot access your box,
the ruleset will be disabled after the sleep terminates.
I think there is only one more thing that one might want, namely
a command to assign all rules in set X to set Y, so one can
test a ruleset using the above mechanisms, and once it is
considered acceptable, make it part of an existing ruleset.
+ the header file contains two different opcodes (O_IPOPTS and O_IPOPT)
for what is the same thing, and sure enough i used one in the kernel
and the other one in userland. Be consistent!
+ "keep-state" and "limit" must be the last match pattern in a rule,
so no matter how you enter them move them to the end of the rule.
* accept "icmptype" as an alias for "icmptypes";
* remove an extra whitespace after "log" rules;
* print correctly the "limit" masks;
* correct a typo in parsing dummynet arguments (this caused a coredump);
* do not allow specifying both "check-state" and "limit", they are
(and have always been) mutually exclusive;
* remove an extra print of the rule before installing it;
* make stdout buffered -- otherwise, if you log its output with syslog,
you will see one entry for each printf(). Rather unpleasant.
fatal on alphas.
Fixed setting of WARNS. WARNS should never be set unconditionally, since
this breaks testing of different WARNS values by setting it at a higher
level (e.g., on the command line).
now it should support all the instructions of the old ipfw.
Fix some bugs in the user interface, /sbin/ipfw.
Please check this code against your rulesets, so i can fix the
remaining bugs (if any, i think they will be mostly in /sbin/ipfw).
Once we have done a bit of testing, this code is ready to be MFC'ed,
together with a bunch of other changes (glue to ipfw, and also the
removal of some global variables) which have been in -current for
a couple of weeks now.
MFC after: 7 days
This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.
The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).
The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c . Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw). The old files are still there, and will be removed
in due time.
I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.
In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.
On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like
ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any
This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.
Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:
10.20.30.0/26{18,44,33,22,9}
which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).
Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.
The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.
fields as discussed in the commit to ip_fw.c:1.186
On top of this, a ton of non functional changes to clean up the code,
write functions to replace sections of code that were replicated
multiple times (e.g. the printing or matching of flags and options),
splitting long sections of inlined code into separate functions,
and the like.
I have tested the code quite a bit, but some typos (using one variable
in place of another) might have escaped.
The "embedded manpage" is a bit inconsistent, but i am leaving fixing
it for later. The current format makes no sense, it is over 40 lines
long and practically unreadable. We can either split it into sections
( ipfw -h options , ipfw -h pipe , ipfw -h queue ...)
or remove it altogether and refer to the manpage.
+ setting a bandwidth too large for a pipe (above 2Gbit/s) could
cause the internal representation (which is int) to wrap to a
negative number, causing an infinite loop in the kernel;
+ (see PR bin/35628): when configuring RED parameters for a queue,
the values are not passed to the kernel resulting in panics at
runtime (part of the problem here is also that the kernel does
not check for valid parameters being passed, but this will be
fixed in a separate commit).
These are both critical fixes which need to be merged into 4.6-RELEASE.
MFC after: 1 day
more on how ipfw(8) deals with tiny fragments. While we're at it, add
a quick log message to even let people know we dropped a packet. (Note
that the second FINE POINT is somewhat redundant given the first, but
since the code is there, leave the docs for it.)
MFC after: 1 day
time_to_xxx() and xxx_to_time() functions. e.g. _time_to_xxx()
instead of time_to_xxx(), to make it more obvious that these are
stopgap functions & placemarkers and not meant to create a defacto
standard. They will eventually be replaced when a real standard
comes out of committee.
reinserted by a userland process, will lose a number of packet
attributes, including their source interface. This may affect
the behavior of later rules, and while not strictly a BUG, may
cause unexpected behavior if not clearly documented. A similar
note for natd(8) might be desirable.
ipfirewall(4) to the IMPLEMENTATION NOTES section because it
considers kernel internals and may confuse newbies if placed
at the very beginning of the manpage (where it used to be previously.)
Not objected by: luigi
Fair Queueing) and RED (Random Early Detection) to both give the reader
a hint what they are and to make it easier to find out more information
about them.
addresses (and the macros that ipfw(4) use to lookup data for the 'me'
keyword have been converted) remove a comment about using 'me' being a
"computationally expensive" operation.
while I'm here, change two instances of "IP number" to "IP address"