Commit Graph

98 Commits

Author SHA1 Message Date
Sam Leffler
6a3ca7514d Use MPSAFE callouts only when debug.mpsafenet is 1. Both timer routines
potentially transmit packets that may enter KAME IPsec w/o Giant if the
callouts are marked MPSAFE.

Reviewed by:	ume
Approved by:	re (rwatson)
2003-11-23 18:13:41 +00:00
Andre Oppermann
97d8d152c2 Introduce tcp_hostcache and remove the tcp specific metrics from
the routing table.  Move all usage and references in the tcp stack
from the routing table metrics to the tcp hostcache.

It caches measured parameters of past tcp sessions to provide better
initial start values for following connections from or to the same
source or destination.  Depending on the network parameters to/from
the remote host this can lead to significant speedups for new tcp
connections after the first one because they inherit and shortcut
the learning curve.

tcp_hostcache is designed for multiple concurrent access in SMP
environments with high contention and is hash indexed by remote
ip address.

It removes significant locking requirements from the tcp stack with
regard to the routing table.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 20:07:39 +00:00
Andre Oppermann
26d02ca7ba Remove RTF_PRCLONING from routing table and adjust users of it
accordingly.  The define is left intact for ABI compatibility
with userland.

This is a pre-step for the introduction of tcp_hostcache.  The
network stack remains fully useable with this change.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 19:47:31 +00:00
Maxim Konovalov
dbf7b38125 Fix an arguments order in check_uidgid() call.
PR:		kern/59314
Submitted by:	Andrey V. Shytov
Approved by:	re (rwatson, jhb)
2003-11-20 10:28:33 +00:00
Andre Oppermann
02c1c7070e Remove the global one-level rtcache variable and associated
complex locking and rework ip_rtaddr() to do its own rtlookup.
Adopt all its callers to this and make ip_output() callable
with NULL rt pointer.

Reviewed by:	sam (mentor)
2003-11-14 21:48:57 +00:00
Sam Leffler
8f1ee3683d Move uid/gid checking logic out of line and lock inpcb usage. This
has a LOR between IPFW inpcb locks but I'm committing it now as the
lesser of two evils (the other being unlocked use of in_pcblookup).

Supported by:	FreeBSD Foundation
2003-11-07 23:26:57 +00:00
Hajimu UMEMOTO
aef3a65eb7 use ipsec_getnhist() instead of obsoleted ipsec_gethist().
Submitted by:	"Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
Reviewed by:	Ari Suutari <ari@suutari.iki.fi> (ipfw@)
2003-11-07 20:25:47 +00:00
Brooks Davis
9bf40ede4a Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By:	re (in principle)
Reviewed By:	njl, imp
Tested On:	i386, amd64, sparc64
Obtained From:	NetBSD (if_xname)
2003-10-31 18:32:15 +00:00
Kirk McKusick
b03587f06a Malloc buckets of size 128 have been having their 64-byte offset
trashed after being freed. This has caused several panics including
kern/42277 related to soft updates. Jim Kuhn tracked the problem
down to ipfw limit rule processing.  In the expiry of dynamic rules,
it is possible for an O_LIMIT_PARENT rule to be removed when it still
has live children.  When the children eventually do expire, a pointer
to the (long gone) parent is dereferenced and a count decremented.
Since this memory can, and is, allocated for other purposes (in the
case of kern/42277 an inodedep structure), chaos ensues. The offset
in question in inodedep is the offset of the 16 bit count field in
the ipfw2 ipfw_dyn_rule.

Submitted by:	Jim Kuhn <jkuhn@sandvine.com>
Reviewed by:	"Evgueni V. Gavrilov" <aquatique@rusunix.org>
Reviewed by:	Ben Pfountz <netprince@vt.edu>
MFC after:	1 week
2003-10-16 02:00:12 +00:00
Sam Leffler
598345da4b Bandaid locking change: mark static rule mutex recursive so re-entry when
sending an ICMP packet doesn't cause a panic.  A better solution is needed;
possibly defering the transmit to a dedicated thread.

Observed by:	"Aaron Wohl" <freebsd@soith.com>
2003-09-17 22:06:47 +00:00
Sam Leffler
293941a556 Add locking.
o change timeout to MPSAFE callout
o restructure rule deletion to deal with locking requirements
o replace static buffer used for ipfw control operations with malloc'd storage

Sponsored by:	FreeBSD Foundation
2003-09-17 00:56:50 +00:00
Luigi Rizzo
4805529cf8 Allow set 31 to be used for rules other than 65535.
Set 31 is still special because rules belonging to it are not deleted
by the "ipfw flush" command, but must be deleted explicitly with
"ipfw delete set 31" or by individual rule numbers.

This implement a flexible form of "persistent rules" which you might
want to have available even after an "ipfw flush".
Note that this change does not violate POLA, because you could not
use set 31 in a ruleset before this change.

sbin/ipfw changes to allow manipulation of set 31 will follow shortly.

Suggested by: Paul Richards
2003-07-15 23:07:34 +00:00
Luigi Rizzo
72e02d4dac Implement comments embedded into ipfw2 instructions.
Since we already had 'O_NOP' instructions which always match, all
I needed to do is allow the NOP command to have arbitrary length
(i.e. move its label in a different part of the switch() which
validates instructions).

The kernel must know nothing about comments, everything else is
done in userland (which will be described in the upcoming ipfw2.c
commit).
2003-07-12 05:54:17 +00:00
Luigi Rizzo
7a1dfbc0d3 Merge the handlers of O_IP_SRC_MASK and O_IP_DST_MASK opcodes, and
support matching a list of addr/mask pairs so one can write
more efficient rulesets which were not possible before e.g.

    add 100 skipto 1000 not src-ip 10.0.0.0/8,127.0.0.1/8,192.168.0.0/16

The change is fully backward compatible.
ipfw2 and manpage commit to follow.

MFC after: 3 days
2003-07-08 07:44:42 +00:00
Luigi Rizzo
c3e5b9f154 Implement the 'ipsec' option to match packets coming out of an ipsec tunnel.
Should work with both regular and fast ipsec (mutually exclusive).
See manpage for more details.

Submitted by: Ari Suutari (ari.suutari@syncrontech.com)
Revised by: sam
MFC after: 1 week
2003-07-04 21:42:32 +00:00
Luigi Rizzo
b5f3c4cff3 whitespace fix 2003-06-28 14:16:53 +00:00
Luigi Rizzo
67ab48d1ae Remove whitespace at end of line. 2003-06-23 21:18:56 +00:00
Luigi Rizzo
44c884e134 Add support for multiple values and ranges for the "iplen", "ipttl",
"ipid" options. This feature has been requested by several users.
On passing, fix some minor bugs in the parser.  This change is fully
backward compatible so if you have an old /sbin/ipfw and a new
kernel you are not in trouble (but you need to update /sbin/ipfw
if you want to use the new features).

Document the changes in the manpage.

Now you can write things like

	ipfw add skipto 1000 iplen 0-500

which some people were asking to give preferential treatment to
short packets.

The 'MFC after' is just set as a reminder, because I still need
to merge the Alpha/Sparc64 fixes for ipfw2 (which unfortunately
change the size of certain kernel structures; not that it matters
a lot since ipfw2 is entirely optional and not the default...)

PR: bin/48015

MFC after: 1 week
2003-06-22 17:33:19 +00:00
Bernd Walter
330462a315 Change handling to support strong alignment architectures such as alpha and
sparc64.

PR:		alpha/50658
Submitted by:	rizzo
Tested on:	alpha
2003-06-04 01:17:37 +00:00
Kelly Yancey
ed7ea0e1ab Account for packets processed at layer-2 (i.e. net.link.ether.ipfw=1).
MFC after:	2 weeks
2003-06-02 23:54:09 +00:00
Crist J. Clark
010dabb047 Add a 'verrevpath' option that verifies the interface that a packet
comes in on is the same interface that we would route out of to get to
the packet's source address. Essentially automates an anti-spoofing
check using the information in the routing table.

Experimental. The usage and rule format for the feature may still be
subject to change.
2003-03-15 01:13:00 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Maxim Konovalov
b52d5ea3d2 o Fix ipfw uid rules: socheckuid() returns 0 when uid matches a socket
cr_uid.

Note: we do not have socheckuid() in RELENG_4, ip_fw2.c uses its
own macro for a similar purpose that is why ipfw2 in RELENG_4 processes
uid rules correctly. I will MFC the diff for code consistency.

Reported by:	Oleg Baranov <ol@csa.ru>
Reviewed by:	luigi
MFC after:	1 month
2003-02-17 13:39:57 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Maxim Konovalov
8ec22a9363 If the first action is O_LOG adjust a pointer to the real one, unbreaks
skipto + log rules.

Reported by:	Wiktor Niesiobedzki <w@evip.pl>
MFC after:	1 week
2003-01-20 11:58:34 +00:00
Matthew Dillon
fe41ca530c Introduce the ability to flag a sysctl for operation at secure level 2 or 3
in addition to secure level 1.  The mask supports up to a secure level of 8
but only add defines through CTLFLAG_SECURE3 for now.

As per the missif in the log entry for 1.11 of ip_fw2.c which added the
secure flag to the IPFW sysctl's in the first place, change the secure
level requirement from 1 to 3 now that we have support for it.

Reviewed by:	imp
With Design Suggestions by:	imp
2003-01-14 19:35:33 +00:00
Ian Dowse
ed1a13b18f Bridged packets are supplied to the firewall with their IP header
in network byte order, but icmp_error() expects the IP header to
be in host order and the code here did not perform the necessary
swapping for the bridged case. This bug causes an "icmp_error: bad
length" panic when certain length IP packets (e.g. ip_len == 0x100)
are rejected by the firewall with an ICMP response.

MFC after:	3 days
2002-12-27 17:43:25 +00:00
Maxim Konovalov
f4ef616f98 o De-anonymity dummynet(4) and ipfw(4) messages, prepend them
by 'dummynet: ' and 'ipfw: ' prefixes.

PR:		kern/41609
2002-12-24 13:45:24 +00:00
Maxim Konovalov
83b75b7621 o Fix byte order logging issue: sa.sin_port is already in host byte order.
PR:		kern/45964
Submitted by:	Sascha Blank <sblank@tiscali.de>
Reviewed by:	luigi
MFC after:	1 week
2002-12-15 09:44:02 +00:00
Luigi Rizzo
97850a5dd9 Move fw_one_pass from ip_fw2.c to ip_input.c so that neither
bridge.c nor if_ethersubr.c depend on IPFIREWALL.
Restore the use of fw_one_pass in if_ethersubr.c

ipfw.8 will be updated with a separate commit.

Approved by: re
2002-11-20 19:07:27 +00:00
Maxim Konovalov
a98d88ad3e Lower a priority of "session drop" messages.
Requested by:	Eugene Grosbein <eugen@kuzbass.ru>
MFC after:	3 days
2002-10-29 08:53:14 +00:00
Maxime Henrion
7c697970f4 Fix ipfw2 panics on 64-bit platforms.
Quoting luigi:

In order to make the userland code fully 64-bit clean it may
be necessary to commit other changes that may or may not cause
a minor change in the ABI.

Reviewed by:	luigi
2002-10-24 18:04:44 +00:00
Luigi Rizzo
18f13da2be src and dst address were erroneously swapped in SRC_SET and DST_SET
commands.  Use the correct one. Also affects ipfw2 in -stable.
2002-10-24 18:01:53 +00:00
Maxim Konovalov
ba3a9d459c Kill EOL spaces.
Approved by:	luigi
MFC after:	1 week
2002-10-23 10:07:55 +00:00
Maxim Konovalov
6b6874b20c Use syslog for messages about dropped sessions, do not flood a console.
Suggested by:	Eugene Grosbein <eugen@kuzbass.ru>
Approved by:	luigi
MFC after:	1 week
2002-10-23 10:05:19 +00:00
Maxime Henrion
d7f4d27a7a Several malloc() calls were passing the M_DONTWAIT flag
which is an mbuf allocation flag.  Use the correct
M_NOWAIT malloc() flag.  Fortunately, both were defined
to 1, so this commit is a no-op.
2002-10-19 11:31:50 +00:00
Sam Leffler
5d84645305 Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
Crist J. Clark
784d7650f7 Lock the sysctl(8) knobs that turn ip{,6}fw(8) firewalling and
firewall logging on and off when at elevated securelevel(8). It would
be nice to be able to only lock these at securelevel >= 3, like rules
are, but there is no such functionality at present. I don't see reason
to be adding features to securelevel(8) with MAC being merged into 5.0.

PR:		kern/39396
Reviewed by:	luigi
MFC after:	1 week
2002-08-25 03:50:29 +00:00
Luigi Rizzo
306fe283a1 Raise limit for port lists to 30 entries/ranges.
Remove a duplicate "logging" message, and identify the firewall
as ipfw2 in the boot message.
2002-08-19 04:45:01 +00:00
Luigi Rizzo
99e5e64504 sys/netinet/ip_fw2.c:
Implement the M_SKIP_FIREWALL bit in m_flags to avoid loops
    for firewall-generated packets (the constant has to go in sys/mbuf.h).

    Better comments on keepalive generation, and enforce dyn_rst_lifetime
    and dyn_fin_lifetime to be less than dyn_keepalive_period.

    Enforce limits (up to 64k) on the number of dynamic buckets, and
    retry allocation with smaller sizes.

    Raise default number of dynamic rules to 4096.

    Improved handling of set of rules -- now you can atomically
    enable/disable multiple sets, move rules from one set to another,
    and swap sets.

sbin/ipfw/ipfw2.c:

    userland support for "noerror" pipe attribute.

    userland support for sets of rules.

    minor improvements on rule parsing and printing.

sbin/ipfw/ipfw.8:

    more documentation on ipfw2 extensions, differences from ipfw1
    (so we can use the same manpage for both), stateful rules,
    and some additional examples.
    Feedback and more examples needed here.
2002-08-16 10:31:47 +00:00
Poul-Henning Kamp
ae89fdaba7 remove spurious printf 2002-08-13 19:13:23 +00:00
Luigi Rizzo
43405724ec One bugfix and one new feature.
The bugfix (ipfw2.c) makes the handling of port numbers with
a dash in the name, e.g. ftp-data, consistent with old ipfw:
use \\ before the - to consider it as part of the name and not
a range separator.

The new feature (all this description will go in the manpage):

each rule now belongs to one of 32 different sets, which can
be optionally specified in the following form:

	ipfw add 100 set 23 allow ip from any to any

If "set N" is not specified, the rule belongs to set 0.

Individual sets can be disabled, enabled, and deleted with the commands:

	ipfw disable set N
	ipfw enable set N
	ipfw delete set N

Enabling/disabling of a set is atomic. Rules belonging to a disabled
set are skipped during packet matching, and they are not listed
unless you use the '-S' flag in the show/list commands.
Note that dynamic rules, once created, are always active until
they expire or their parent rule is deleted.
Set 31 is reserved for the default rule and cannot be disabled.

All sets are enabled by default. The enable/disable status of the sets
can be shown with the command

	ipfw show sets

Hopefully, this feature will make life easier to those who want to
have atomic ruleset addition/deletion/tests. Examples:

To add a set of rules atomically:

	ipfw disable set 18
	ipfw add ... set 18 ...		# repeat as needed
	ipfw enable set 18

To delete a set of rules atomically

	ipfw disable set 18
	ipfw delete set 18
	ipfw enable set 18

To test a ruleset and disable it and regain control if something
goes wrong:

	ipfw disable set 18
	ipfw add ... set 18 ...         # repeat as needed
	ipfw enable set 18 ; echo "done "; sleep 30 && ipfw disable set 18

    here if everything goes well, you press control-C before
    the "sleep" terminates, and your ruleset will be left
    active. Otherwise, e.g. if you cannot access your box,
    the ruleset will be disabled after the sleep terminates.

I think there is only one more thing that one might want, namely
a command to assign all rules in set X to set Y, so one can
test a ruleset using the above mechanisms, and once it is
considered acceptable, make it part of an existing ruleset.
2002-08-10 04:37:32 +00:00
Luigi Rizzo
be1826c354 Only log things net.inet.ip.fw.verbose is set 2002-07-24 02:41:19 +00:00
Luigi Rizzo
a8c102a2ec Implement keepalives for dynamic rules, so they will not expire
just because you leave your session idle.

Also, put in a fix for 64-bit architectures (to be revised).

In detail:

ip_fw.h

  * Reorder fields in struct ip_fw to avoid alignment problems on
    64-bit machines. This only masks the problem, I am still not
    sure whether I am doing something wrong in the code or there
    is a problem elsewhere (e.g. different aligmnent of structures
    between userland and kernel because of pragmas etc.)

  * added fields in dyn_rule to store ack numbers, so we can
    generate keepalives when the dynamic rule is about to expire

ip_fw2.c

  * use a local function, send_pkt(), to generate TCP RST for Reset rules;

  * save about 250 bytes by cleaning up the various snprintf()
    in ipfw_log() ...

  * ... and use twice as many bytes to implement keepalives
    (this seems to be working, but i have not tested it extensively).

Keepalives are generated once every 5 seconds for the last 20 seconds
of the lifetime of a dynamic rule for an established TCP flow.  The
packets are sent to both sides, so if at least one of the endpoints
is responding, the timeout is refreshed and the rule will not expire.

You can disable this feature with

        sysctl net.inet.ip.fw.dyn_keepalive=0

(the default is 1, to have them enabled).

MFC after: 1 day

(just kidding... I will supply an updated version of ipfw2 for
RELENG_4 tomorrow).
2002-07-14 23:47:18 +00:00
Luigi Rizzo
d63b346ab1 No functional changes, but:
Following Darren's suggestion, make Dijkstra happy and rewrite the
ipfw_chk() main loop removing a lot of goto's and using instead a
variable to store match status.

Add a lot of comments to explain what instructions are supposed to
do and how -- this should ease auditing of the code and make people
more confident with it.

In terms of code size: the entire file takes about 12700 bytes of text,
about 3K of which are for the main function, ipfw_chk(), and 2K (ouch!)
for ipfw_log().
2002-07-08 22:46:01 +00:00
Luigi Rizzo
5e43aef891 Implement the last 2-3 missing instructions for ipfw,
now it should support all the instructions of the old ipfw.

Fix some bugs in the user interface, /sbin/ipfw.

Please check this code against your rulesets, so i can fix the
remaining bugs (if any, i think they will be mostly in /sbin/ipfw).

Once we have done a bit of testing, this code is ready to be MFC'ed,
together with a bunch of other changes (glue to ipfw, and also the
removal of some global variables) which have been in -current for
a couple of weeks now.

MFC after: 7 days
2002-07-05 22:43:06 +00:00
Doug Rabson
24f8fd9fd1 Fix warning.
Reviewed by: luigi
2002-06-28 08:36:26 +00:00
Luigi Rizzo
9758b77ff1 The new ipfw code.
This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.

The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).

The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c .  Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw).  The old files are still there, and will be removed
in due time.

I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.

In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.

On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like

ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any

This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.

Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:

        10.20.30.0/26{18,44,33,22,9}

which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).

Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.

The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.
2002-06-27 23:02:18 +00:00