Commit Graph

114 Commits

Author SHA1 Message Date
Julian Elischer
8b07e49a00 Add code to allow the system to handle multiple routing tables.
This particular implementation is designed to be fully backwards compatible
and to be MFC-able to 7.x (and 6.x)

Currently the only protocol that can make use of the multiple tables is IPv4
Similar functionality exists in OpenBSD and Linux.

From my notes:

-----

  One thing where FreeBSD has been falling behind, and which by chance I
  have some time to work on is "policy based routing", which allows
  different
  packet streams to be routed by more than just the destination address.

  Constraints:
  ------------

  I want to make some form of this available in the 6.x tree
  (and by extension 7.x) , but FreeBSD in general needs it so I might as
  well do it in -current and back port the portions I need.

  One of the ways that this can be done is to have the ability to
  instantiate multiple kernel routing tables (which I will now
  refer to as "Forwarding Information Bases" or "FIBs" for political
  correctness reasons). Which FIB a particular packet uses to make
  the next hop decision can be decided by a number of mechanisms.
  The policies these mechanisms implement are the "Policies" referred
  to in "Policy based routing".

  One of the constraints I have if I try to back port this work to
  6.x is that it must be implemented as a EXTENSION to the existing
  ABIs in 6.x so that third party applications do not need to be
  recompiled in timespan of the branch.

  This first version will not have some of the bells and whistles that
  will come with later versions. It will, for example, be limited to 16
  tables in the first commit.
  Implementation method, Compatible version. (part 1)
  -------------------------------
  For this reason I have implemented a "sufficient subset" of a
  multiple routing table solution in Perforce, and back-ported it
  to 6.x. (also in Perforce though not  always caught up with what I
  have done in -current/P4). The subset allows a number of FIBs
  to be defined at compile time (8 is sufficient for my purposes in 6.x)
  and implements the changes needed to allow IPV4 to use them. I have not
  done the changes for ipv6 simply because I do not need it, and I do not
  have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.

  Other protocol families are left untouched and should there be
  users with proprietary protocol families, they should continue to work
  and be oblivious to the existence of the extra FIBs.

  To understand how this is done, one must know that the current FIB
  code starts everything off with a single dimensional array of
  pointers to FIB head structures (One per protocol family), each of
  which in turn points to the trie of routes available to that family.

  The basic change in the ABI compatible version of the change is to
  extent that array to be a 2 dimensional array, so that
  instead of protocol family X looking at rt_tables[X] for the
  table it needs, it looks at rt_tables[Y][X] when for all
  protocol families except ipv4 Y is always 0.
  Code that is unaware of the change always just sees the first row
  of the table, which of course looks just like the one dimensional
  array that existed before.

  The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
  are all maintained, but refer only to the first row of the array,
  so that existing callers in proprietary protocols can continue to
  do the "right thing".
  Some new entry points are added, for the exclusive use of ipv4 code
  called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
  which have an extra argument which refers the code to the correct row.

  In addition, there are some new entry points (currently called
  rtalloc_fib() and friends) that check the Address family being
  looked up and call either rtalloc() (and friends) if the protocol
  is not IPv4 forcing the action to row 0 or to the appropriate row
  if it IS IPv4 (and that info is available). These are for calling
  from code that is not specific to any particular protocol. The way
  these are implemented would change in the non ABI preserving code
  to be added later.

  One feature of the first version of the code is that for ipv4,
  the interface routes show up automatically on all the FIBs, so
  that no matter what FIB you select you always have the basic
  direct attached hosts available to you. (rtinit() does this
  automatically).

  You CAN delete an interface route from one FIB should you want
  to but by default it's there. ARP information is also available
  in each FIB. It's assumed that the same machine would have the
  same MAC address, regardless of which FIB you are using to get
  to it.

  This brings us as to how the correct FIB is selected for an outgoing
  IPV4 packet.

  Firstly, all packets have a FIB associated with them. if nothing
  has been done to change it, it will be FIB 0. The FIB is changed
  in the following ways.

  Packets fall into one of a number of classes.

  1/ locally generated packets, coming from a socket/PCB.
     Such packets select a FIB from a number associated with the
     socket/PCB. This in turn is inherited from the process,
     but can be changed by a socket option. The process in turn
     inherits it on fork. I have written a utility call setfib
     that acts a bit like nice..

         setfib -3 ping target.example.com # will use fib 3 for ping.

     It is an obvious extension to make it a property of a jail
     but I have not done so. It can be achieved by combining the setfib and
     jail commands.

  2/ packets received on an interface for forwarding.
     By default these packets would use table 0,
     (or possibly a number settable in a sysctl(not yet)).
     but prior to routing the firewall can inspect them (see below).
     (possibly in the future you may be able to associate a FIB
     with packets received on an interface..  An ifconfig arg, but not yet.)

  3/ packets inspected by a packet classifier, which can arbitrarily
     associate a fib with it on a packet by packet basis.
     A fib assigned to a packet by a packet classifier
     (such as ipfw) would over-ride a fib associated by
     a more default source. (such as cases 1 or 2).

  4/ a tcp listen socket associated with a fib will generate
     accept sockets that are associated with that same fib.

  5/ Packets generated in response to some other packet (e.g. reset
     or icmp packets). These should use the FIB associated with the
     packet being reponded to.

  6/ Packets generated during encapsulation.
     gif, tun and other tunnel interfaces will encapsulate using the FIB
     that was in effect withthe proces that set up the tunnel.
     thus setfib 1 ifconfig gif0 [tunnel instructions]
     will set the fib for the tunnel to use to be fib 1.

  Routing messages would be associated with their
  process, and thus select one FIB or another.
  messages from the kernel would be associated with the fib they
  refer to and would only be received by a routing socket associated
  with that fib. (not yet implemented)

  In addition Netstat has been edited to be able to cope with the
  fact that the array is now 2 dimensional. (It looks in system
  memory using libkvm (!)). Old versions of netstat see only the first FIB.

  In addition two sysctls are added to give:
  a) the number of FIBs compiled in (active)
  b) the default FIB of the calling process.

  Early testing experience:
  -------------------------

  Basically our (IronPort's) appliance does this functionality already
  using ipfw fwd but that method has some drawbacks.

  For example,
  It can't fully simulate a routing table because it can't influence the
  socket's choice of local address when a connect() is done.

  Testing during the generating of these changes has been
  remarkably smooth so far. Multiple tables have co-existed
  with no notable side effects, and packets have been routes
  accordingly.

  ipfw has grown 2 new keywords:

  setfib N ip from anay to any
  count ip from any to any fib N

  In pf there seems to be a requirement to be able to give symbolic names to the
  fibs but I do not have that capacity. I am not sure if it is required.

  SCTP has interestingly enough built in support for this, called VRFs
  in Cisco parlance. it will be interesting to see how that handles it
  when it suddenly actually does something.

  Where to next:
  --------------------

  After committing the ABI compatible version and MFCing it, I'd
  like to proceed in a forward direction in -current. this will
  result in some roto-tilling in the routing code.

  Firstly: the current code's idea of having a separate tree per
  protocol family, all of the same format, and pointed to by the
  1 dimensional array is a bit silly. Especially when one considers that
  there is code that makes assumptions about every protocol having the
  same internal structures there. Some protocols don't WANT that
  sort of structure. (for example the whole idea of a netmask is foreign
  to appletalk). This needs to be made opaque to the external code.

  My suggested first change is to add routing method pointers to the
  'domain' structure, along with information pointing the data.
  instead of having an array of pointers to uniform structures,
  there would be an array pointing to the 'domain' structures
  for each protocol address domain (protocol family),
  and the methods this reached would be called. The methods would have
  an argument that gives FIB number, but the protocol would be free
  to ignore it.

  When the ABI can be changed it raises the possibilty of the
  addition of a fib entry into the "struct route". Currently,
  the structure contains the sockaddr of the desination, and the resulting
  fib entry. To make this work fully, one could add a fib number
  so that given an address and a fib, one can find the third element, the
  fib entry.

  Interaction with the ARP layer/ LL layer would need to be
  revisited as well. Qing Li has been working on this already.

  This work was sponsored by Ironport Systems/Cisco

Reviewed by:    several including rwatson, bz and mlair (parts each)
Obtained from:  Ironport systems/Cisco
2008-05-09 23:03:00 +00:00
Robert Watson
bcf5b9fa38 Fix a comment typo.
MFC after:	3 days
2008-04-29 21:21:15 +00:00
Paolo Pisati
531c890b8a Move ipfw's nat code into its own kld: ipfw_nat. 2008-02-29 22:27:19 +00:00
Robert Watson
bb5081a7eb Hide ipfw internal data structures behind IPFW_INTERNAL rather than
exposing them to all consumers of ip_fw.h.  These structures are
used in both ipfw(8) and ipfw(4), but not part of the user<->kernel
interface for other applications to use, rather, shared
implementation.

MFC after:	3 days
Reported by:	Paul Vixie <paul at vix dot com>
2008-01-25 14:38:27 +00:00
Bjoern A. Zeeb
7a92401aea Add support for filtering on Routing Header Type 0 and
Mobile IPv6 Routing Header Type 2 in addition to filter
on the non-differentiated presence of any Routing Header.

MFC after:	3 weeks
2007-05-04 11:15:41 +00:00
Paolo Pisati
ff2f6fe80f Summer of Code 2005: improve libalias - part 2 of 2
With the second (and last) part of my previous Summer of Code work, we get:

-ipfw's in kernel nat

-redirect_* and LSNAT support

General information about nat syntax and some examples are available
in the ipfw (8) man page. The redirect and LSNAT syntax are identical
to natd, so please refer to natd (8) man page.

To enable in kernel nat in rc.conf, two options were added:

o firewall_nat_enable: equivalent to natd_enable

o firewall_nat_interface: equivalent to natd_interface

Remember to set net.inet.ip.fw.one_pass to 0, if you want the packet
to continue being checked by the firewall ruleset after being
(de)aliased.

NOTA BENE: due to some problems with libalias architecture, in kernel
nat won't work with TSO enabled nic, thus you have to disable TSO via
ifconfig (ifconfig foo0 -tso).

Approved by: glebius (mentor)
2006-12-29 21:59:17 +00:00
Julian Elischer
afad78e259 comply with style police
Submitted by:	ru
MFC after:	1 month
2006-08-18 22:36:05 +00:00
Julian Elischer
c487be961a Allow ipfw to forward to a destination that is specified by a table.
for example:
  fwd tablearg ip from any to table(1)
where table 1 has entries of the form:
1.1.1.0/24 10.2.3.4
208.23.2.0/24 router2

This allows trivial implementation of a secondary routing table implemented
in the firewall layer.

I expect more work (under discussion with Glebius) to follow this to clean
up some of the messy parts of ipfw related to tables.

Reviewed by:	Glebius
MFC after:	1 month
2006-08-17 22:49:50 +00:00
Oleg Bulyzhin
6a7d5cb645 Implement internal (i.e. inside kernel) packet tagging using mbuf_tags(9).
Since tags are kept while packet resides in kernelspace, it's possible to
use other kernel facilities (like netgraph nodes) for altering those tags.

Submitted by:	Andrey Elsukov <bu7cher at yandex dot ru>
Submitted by:	Vadim Goncharov <vadimnuclight at tpu dot ru>
Approved by:	glebius (mentor)
Idea from:	OpenBSD PF
MFC after:	1 month
2006-05-24 13:09:55 +00:00
Max Laier
e93187482d Reintroduce net.inet6.ip6.fw.enable sysctl to dis/enable the ipv6 processing
seperately.  Also use pfil hook/unhook instead of keeping the check
functions in pfil just to return there based on the sysctl.  While here fix
some whitespace on a nearby SYSCTL_ macro.
2006-05-12 04:41:27 +00:00
Ruslan Ermilov
ea9dce1461 When sending a packet from dummynet, indicate that we're forwarding
it so that ip_id etc. don't get overwritten.  This fixes forwarding
of fragmented IP packets through a dummynet pipe -- fragments came
out with modified and different(!) ip_id's, making it impossible to
reassemble a datagram at the receiver side.

Submitted by:	Alexander Karptsov (reworked by me)
MFC after:	3 days
2006-02-14 06:36:39 +00:00
Gleb Smirnoff
40b1ae9e00 Add a new feature for optimizining ipfw rulesets - substitution of the
action argument with the value obtained from table lookup. The feature
is now applicable only to "pipe", "queue", "divert", "tee", "netgraph"
and "ngtee" rules.

An example usage:

  ipfw pipe 1000 config bw 1000Kbyte/s
  ipfw pipe 4000 config bw 4000Kbyte/s
  ipfw table 1 add x.x.x.x 1000
  ipfw table 1 add x.x.x.y 4000
  ipfw pipe tablearg ip from table(1) to any

In the example above the rule will throw different packets to different pipes.

TODO:
  - Support "skipto" action, but without searching all rules.
  - Improve parser, so that it warns about bad rules. These are:
    - "tablearg" argument to action, but no "table" in the rule. All
      traffic will be blocked.
    - "tablearg" argument to action, but "table" searches for entry with
      a specific value. All traffic will be blocked.
    - "tablearg" argument to action, and two "table" looks - for src and
      for dst. The last lookup will match.
2005-12-13 12:16:03 +00:00
Gleb Smirnoff
b090e4ce1f Garbage-collect now unused struct _ipfw_insn_pipe and flush_pipe_ptrs(),
thus removing a few XXXes.
  Document the ABI breakage in UPDATING.
2005-11-29 08:59:41 +00:00
Bjoern A. Zeeb
9066356ba1 * Add dynamic sysctl for net.inet6.ip6.fw.
* Correct handling of IPv6 Extension Headers.
* Add unreach6 code.
* Add logging for IPv6.

Submitted by:	sysctl handling derived from patch from ume needed for ip6fw
Obtained from:	is_icmp6_query and send_reject6 derived from similar
		functions of netinet6,ip6fw
Reviewed by:	ume, gnn; silence on ipfw@
Test setup provided by: CK Software GmbH
MFC after:	6 days
2005-08-13 11:02:34 +00:00
Max Laier
57cd6d263b Add support for IPv4 only rules to IPFW2 now that it supports IPv6 as well.
This is the last requirement before we can retire ip6fw.

Reviewed by:	dwhite, brooks(earlier version)
Submitted by:	dwhite (manpage)
Silence from:	-ipfw
2005-06-03 01:10:28 +00:00
Gleb Smirnoff
a1429ad928 IPFW version 2 is the only option in HEAD and RELENG_5.
Thus, cleanup unnecessary now ifdefs.
2005-05-04 13:12:52 +00:00
Brooks Davis
8195404bed Add IPv6 support to IPFW and Dummynet.
Submitted by:	Mariano Tortoriello and Raffaele De Lorenzo (via luigi)
2005-04-18 18:35:05 +00:00
Gleb Smirnoff
670742a102 Add a ng_ipfw node, implementing a quick and simple interface between
ipfw(4) and netgraph(4) facilities.

Reviewed by:	andre, brooks, julian
2005-02-05 12:06:33 +00:00
Gleb Smirnoff
6c69a7c30b o Clean up interface between ip_fw_chk() and its callers:
- ip_fw_chk() returns action as function return value. Field retval is
  removed from args structure. Action is not flag any more. It is one
  of integer constants.
- Any action-specific cookies are returned either in new "cookie" field
  in args structure (dummynet, future netgraph glue), or in mbuf tag
  attached to packet (divert, tee, some future action).

o Convert parsing of return value from ip_fw_chk() in ipfw_check_{in,out}()
  to a switch structure, so that the functions are more readable, and a future
  actions can be added with less modifications.

Approved by:	andre
MFC after:	2 months
2005-01-14 09:00:46 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Brian Feldman
c99ee9e042 Add support to IPFW for matching by TCP data length. 2004-10-03 00:47:15 +00:00
Brian Feldman
6daf7ebd28 Add support to IPFW for classification based on "diverted" status
(that is, input via a divert socket).
2004-10-03 00:26:35 +00:00
Brian Feldman
974dfe3084 Add to IPFW the ability to do ALTQ classification/tagging. 2004-10-03 00:17:46 +00:00
Max Laier
d6a8d58875 Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.

Suggested by:		rwatson
A lot of work by:	csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by:		rwatson, csjp
Tested by:		-pf, -ipfw, LINT, csjp and myself
MFC after:		3 days

LOR IDs:		14 - 17 (not fixed yet)
2004-09-29 04:54:33 +00:00
Andre Oppermann
e4c97eff8e Bring back the sysctl 'net.inet.ip.fw.enable' to unbreak the startup scripts
and to be able to disable ipfw if it was compiled directly into the kernel.
2004-08-19 17:38:47 +00:00
Andre Oppermann
9b932e9e04 Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI.  The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

 In general ipfw is now called through the PFIL_HOOKS and most associated
 magic, that was in ip_input() or ip_output() previously, is now done in
 ipfw_check_[in|out]() in the ipfw PFIL handler.

 IPDIVERT is entirely handled within the ipfw PFIL handlers.  A packet to
 be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
 reassembly.  If not, or all fragments arrived and the packet is complete,
 divert_packet is called directly.  For 'tee' no reassembly attempt is made
 and a copy of the packet is sent to the divert socket unmodified.  The
 original packet continues its way through ip_input/output().

 ipfw 'forward' is done via m_tag's.  The ipfw PFIL handlers tag the packet
 with the new destination sockaddr_in.  A check if the new destination is a
 local IP address is made and the m_flags are set appropriately.  ip_input()
 and ip_output() have some more work to do here.  For ip_input() the m_flags
 are checked and a packet for us is directly sent to the 'ours' section for
 further processing.  Destination changes on the input path are only tagged
 and the 'srcrt' flag to ip_forward() is set to disable destination checks
 and ICMP replies at this stage.  The tag is going to be handled on output.
 ip_output() again checks for m_flags and the 'ours' tag.  If found, the
 packet will be dropped back to the IP netisr where it is going to be picked
 up by ip_input() again and the directly sent to the 'ours' section.  When
 only the destination changes, the route's 'dst' is overwritten with the
 new destination from the forward m_tag.  Then it jumps back at the route
 lookup again and skips the firewall check because it has been marked with
 M_SKIP_FIREWALL.  ipfw 'forward' has to be compiled into the kernel with
 'option IPFIREWALL_FORWARD' to enable it.

 DUMMYNET is entirely handled within the ipfw PFIL handlers.  A packet for
 a dummynet pipe or queue is directly sent to dummynet_io().  Dummynet will
 then inject it back into ip_input/ip_output() after it has served its time.
 Dummynet packets are tagged and will continue from the next rule when they
 hit the ipfw PFIL handlers again after re-injection.

 BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
 they did before.  Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

 conf/files
	Add netinet/ip_fw_pfil.c.

 conf/options
	Add IPFIREWALL_FORWARD option.

 modules/ipfw/Makefile
	Add ip_fw_pfil.c.

 net/bridge.c
	Disable PFIL_HOOKS if ipfw for bridging is active.  Bridging ipfw
	is still directly invoked to handle layer2 headers and packets would
	get a double ipfw when run through PFIL_HOOKS as well.

 netinet/ip_divert.c
	Removed divert_clone() function.  It is no longer used.

 netinet/ip_dummynet.[ch]
	Neither the route 'ro' nor the destination 'dst' need to be stored
	while in dummynet transit.  Structure members and associated macros
	are removed.

 netinet/ip_fastfwd.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_fw.h
	Removed 'ro' and 'dst' from struct ip_fw_args.

 netinet/ip_fw2.c
	(Re)moved some global variables and the module handling.

 netinet/ip_fw_pfil.c
	New file containing the ipfw PFIL handlers and module initialization.

 netinet/ip_input.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.  ip_forward() does not longer require
	the 'next_hop' struct sockaddr_in argument.  Disable early checks
	if 'srcrt' is set.

 netinet/ip_output.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_var.h
	Add ip_reass() as general function.  (Used from ipfw PFIL handlers
	for IPDIVERT.)

 netinet/raw_ip.c
	Directly check if ipfw and dummynet control pointers are active.

 netinet/tcp_input.c
	Rework the 'ipfw forward' to local code to work with the new way of
	forward tags.

 netinet/tcp_sack.c
	Remove include 'opt_ipfw.h' which is not needed here.

 sys/mbuf.h
	Remove m_claim_next() macro which was exclusively for ipfw 'forward'
	and is no longer needed.

Approved by:	re (scottl)
2004-08-17 22:05:54 +00:00
David E. O'Brien
5af87d0ea1 Put the 'antispoof' opcode in the proper place in the opcode list such
that it doesn't break the ipfw2 ABI.
2004-08-16 12:05:19 +00:00
Christian S.J. Peron
31c88a3043 Add the ability to associate ipfw rules with a specific prison ID.
Since the only thing truly unique about a prison is it's ID, I figured
this would be the most granular way of handling this.

This commit makes the following changes:

- Adds tokenizing and parsing for the ``jail'' command line option
  to the ipfw(8) userspace utility.
- Append the ipfw opcode list with O_JAIL.
- While Iam here, add a comment informing others that if they
  want to add additional opcodes, they should append them to the end
  of the list to avoid ABI breakage.
- Add ``fw_prid'' to the ipfw ucred cache structure.
- When initializing ucred cache, if the process is jailed,
  set fw_prid to the prison ID, otherwise set it to -1.
- Update man page to reflect these changes.

This change was a strong motivator behind the ucred caching
mechanism in ipfw.

A sample usage of this new functionality could be:

    ipfw add count ip from any to any jail 2

It should be noted that because ucred based constraints
are only implemented for TCP and UDP packets, the same
applies for jail associations.

Conceptual head nod by:	pjd
Reviewed by:	rwatson
Approved by:	bmilekic (mentor)
2004-08-12 22:06:55 +00:00
Andre Oppermann
5f9541ecbd New ipfw option "antispoof":
For incoming packets, the packet's source address is checked if it
 belongs to a directly connected network.  If the network is directly
 connected, then the interface the packet came on in is compared to
 the interface the network is connected to.  When incoming interface
 and directly connected interface are not the same, the packet does
 not match.

Usage example:

 ipfw add deny ip from any to any not antispoof in

Manpage education by:	ru
2004-08-09 16:12:10 +00:00
Ruslan Ermilov
cd8b5ae0ae Introduce a new feature to IPFW2: lookup tables. These are useful
for handling large sparse address sets.  Initial implementation by
Vsevolod Lobko <seva@ip.net.ua>, refined by me.

MFC after:	1 week
2004-06-09 20:10:38 +00:00
Andre Oppermann
22b5770b99 Add the option versrcreach to verify that a valid route to the
source address of a packet exists in the routing table.  The
default route is ignored because it would match everything and
render the check pointless.

This option is very useful for routers with a complete view of
the Internet (BGP) in the routing table to reject packets with
spoofed or unrouteable source addresses.

Example:

 ipfw add 1000 deny ip from any to any not versrcreach

also known in Cisco-speak as:

  ip verify unicast source reachable-via any

Reviewed by:	luigi
2004-04-23 14:28:38 +00:00
Max Laier
ac9d7e2618 Re-remove MT_TAGs. The problems with dummynet have been fixed now.
Tested by: -current, bms(mentor), me
Approved by: bms(mentor), sam
2004-02-25 19:55:29 +00:00
Max Laier
36e8826ffb Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is
not working properly with the patch in place.

Approved by: bms(mentor)
2004-02-18 00:04:52 +00:00
Max Laier
1094bdca51 This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing
them mostly with packet tags (one case is handled by using an mbuf flag
since the linkage between "caller" and "callee" is direct and there's no
need to incur the overhead of a packet tag).

This is (mostly) work from: sam

Silence from: -arch
Approved by: bms(mentor), sam, rwatson
2004-02-13 19:14:16 +00:00
Brooks Davis
9bf40ede4a Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By:	re (in principle)
Reviewed By:	njl, imp
Tested On:	i386, amd64, sparc64
Obtained From:	NetBSD (if_xname)
2003-10-31 18:32:15 +00:00
Luigi Rizzo
4805529cf8 Allow set 31 to be used for rules other than 65535.
Set 31 is still special because rules belonging to it are not deleted
by the "ipfw flush" command, but must be deleted explicitly with
"ipfw delete set 31" or by individual rule numbers.

This implement a flexible form of "persistent rules" which you might
want to have available even after an "ipfw flush".
Note that this change does not violate POLA, because you could not
use set 31 in a ruleset before this change.

sbin/ipfw changes to allow manipulation of set 31 will follow shortly.

Suggested by: Paul Richards
2003-07-15 23:07:34 +00:00
Luigi Rizzo
f030c1518d Correct some comments, add opcode O_IPSEC to match packets
coming out of an ipsec tunnel.
2003-07-04 21:39:51 +00:00
Bernd Walter
330462a315 Change handling to support strong alignment architectures such as alpha and
sparc64.

PR:		alpha/50658
Submitted by:	rizzo
Tested on:	alpha
2003-06-04 01:17:37 +00:00
Crist J. Clark
010dabb047 Add a 'verrevpath' option that verifies the interface that a packet
comes in on is the same interface that we would route out of to get to
the packet's source address. Essentially automates an anti-spoofing
check using the information in the routing table.

Experimental. The usage and rule format for the feature may still be
subject to change.
2003-03-15 01:13:00 +00:00
Maxime Henrion
d28e8b3a0d Oops, forgot to commit this file. This is part of the fix
for ipfw2 panics on sparc64.
2002-10-24 22:32:13 +00:00
Luigi Rizzo
43405724ec One bugfix and one new feature.
The bugfix (ipfw2.c) makes the handling of port numbers with
a dash in the name, e.g. ftp-data, consistent with old ipfw:
use \\ before the - to consider it as part of the name and not
a range separator.

The new feature (all this description will go in the manpage):

each rule now belongs to one of 32 different sets, which can
be optionally specified in the following form:

	ipfw add 100 set 23 allow ip from any to any

If "set N" is not specified, the rule belongs to set 0.

Individual sets can be disabled, enabled, and deleted with the commands:

	ipfw disable set N
	ipfw enable set N
	ipfw delete set N

Enabling/disabling of a set is atomic. Rules belonging to a disabled
set are skipped during packet matching, and they are not listed
unless you use the '-S' flag in the show/list commands.
Note that dynamic rules, once created, are always active until
they expire or their parent rule is deleted.
Set 31 is reserved for the default rule and cannot be disabled.

All sets are enabled by default. The enable/disable status of the sets
can be shown with the command

	ipfw show sets

Hopefully, this feature will make life easier to those who want to
have atomic ruleset addition/deletion/tests. Examples:

To add a set of rules atomically:

	ipfw disable set 18
	ipfw add ... set 18 ...		# repeat as needed
	ipfw enable set 18

To delete a set of rules atomically

	ipfw disable set 18
	ipfw delete set 18
	ipfw enable set 18

To test a ruleset and disable it and regain control if something
goes wrong:

	ipfw disable set 18
	ipfw add ... set 18 ...         # repeat as needed
	ipfw enable set 18 ; echo "done "; sleep 30 && ipfw disable set 18

    here if everything goes well, you press control-C before
    the "sleep" terminates, and your ruleset will be left
    active. Otherwise, e.g. if you cannot access your box,
    the ruleset will be disabled after the sleep terminates.

I think there is only one more thing that one might want, namely
a command to assign all rules in set X to set Y, so one can
test a ruleset using the above mechanisms, and once it is
considered acceptable, make it part of an existing ruleset.
2002-08-10 04:37:32 +00:00
Luigi Rizzo
318aa87b59 Fix a panic when doing "ipfw add pipe 1 log ..."
Also synchronize ip_dummynet.c with the version in RELENG_4 to
ease MFC's.
2002-07-17 07:21:42 +00:00
Luigi Rizzo
a8c102a2ec Implement keepalives for dynamic rules, so they will not expire
just because you leave your session idle.

Also, put in a fix for 64-bit architectures (to be revised).

In detail:

ip_fw.h

  * Reorder fields in struct ip_fw to avoid alignment problems on
    64-bit machines. This only masks the problem, I am still not
    sure whether I am doing something wrong in the code or there
    is a problem elsewhere (e.g. different aligmnent of structures
    between userland and kernel because of pragmas etc.)

  * added fields in dyn_rule to store ack numbers, so we can
    generate keepalives when the dynamic rule is about to expire

ip_fw2.c

  * use a local function, send_pkt(), to generate TCP RST for Reset rules;

  * save about 250 bytes by cleaning up the various snprintf()
    in ipfw_log() ...

  * ... and use twice as many bytes to implement keepalives
    (this seems to be working, but i have not tested it extensively).

Keepalives are generated once every 5 seconds for the last 20 seconds
of the lifetime of a dynamic rule for an established TCP flow.  The
packets are sent to both sides, so if at least one of the endpoints
is responding, the timeout is refreshed and the rule will not expire.

You can disable this feature with

        sysctl net.inet.ip.fw.dyn_keepalive=0

(the default is 1, to have them enabled).

MFC after: 1 day

(just kidding... I will supply an updated version of ipfw2 for
RELENG_4 tomorrow).
2002-07-14 23:47:18 +00:00
Luigi Rizzo
7d4d3e9051 Remove one unused command name. 2002-07-08 22:39:19 +00:00
Luigi Rizzo
5e43aef891 Implement the last 2-3 missing instructions for ipfw,
now it should support all the instructions of the old ipfw.

Fix some bugs in the user interface, /sbin/ipfw.

Please check this code against your rulesets, so i can fix the
remaining bugs (if any, i think they will be mostly in /sbin/ipfw).

Once we have done a bit of testing, this code is ready to be MFC'ed,
together with a bunch of other changes (glue to ipfw, and also the
removal of some global variables) which have been in -current for
a couple of weeks now.

MFC after: 7 days
2002-07-05 22:43:06 +00:00
Luigi Rizzo
9758b77ff1 The new ipfw code.
This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.

The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).

The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c .  Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw).  The old files are still there, and will be removed
in due time.

I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.

In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.

On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like

ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any

This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.

Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:

        10.20.30.0/26{18,44,33,22,9}

which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).

Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.

The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.
2002-06-27 23:02:18 +00:00
Luigi Rizzo
2b25acc158 Remove (almost all) global variables that were used to hold
packet forwarding state ("annotations") during ip processing.
The code is considerably cleaner now.

The variables removed by this change are:

        ip_divert_cookie        used by divert sockets
        ip_fw_fwd_addr          used for transparent ip redirection
        last_pkt                used by dynamic pipes in dummynet

Removal of the first two has been done by carrying the annotations
into volatile structs prepended to the mbuf chains, and adding
appropriate code to add/remove annotations in the routines which
make use of them, i.e. ip_input(), ip_output(), tcp_input(),
bdg_forward(), ether_demux(), ether_output_frame(), div_output().

On passing, remove a bug in divert handling of fragmented packet.
Now it is the fragment at offset 0 which sets the divert status of
the whole packet, whereas formerly it was the last incoming fragment
to decide.

Removal of last_pkt required a change in the interface of ip_fw_chk()
and dummynet_io(). On passing, use the same mechanism for dummynet
annotations and for divert/forward annotations.

option IPFIREWALL_FORWARD is effectively useless, the code to
implement it is very small and is now in by default to avoid the
obfuscation of conditionally compiled code.

NOTES:
 * there is at least one global variable left, sro_fwd, in ip_output().
   I am not sure if/how this can be removed.

 * I have deliberately avoided gratuitous style changes in this commit
   to avoid cluttering the diffs. Minor stule cleanup will likely be
   necessary

 * this commit only focused on the IP layer. I am sure there is a
   number of global variables used in the TCP and maybe UDP stack.

 * despite the number of files touched, there are absolutely no API's
   or data structures changed by this commit (except the interfaces of
   ip_fw_chk() and dummynet_io(), which are internal anyways), so
   an MFC is quite safe and unintrusive (and desirable, given the
   improved readability of the code).

MFC after: 10 days
2002-06-22 11:51:02 +00:00
Luigi Rizzo
2f8707ca5d Remove custom definitions (IP_FW_TCPF_SYN etc.) of TCP header flags
which are the same as the original ones (TH_SYN etc.)
2002-05-13 10:21:13 +00:00
Luigi Rizzo
d60315bef5 Cleanup the interface to ip_fw_chk, two of the input arguments
were totally useless and have been removed.

ip_input.c, ip_output.c:
    Properly initialize the "ip" pointer in case the firewall does an
    m_pullup() on the packet.

    Remove some debugging code forgotten long ago.

ip_fw.[ch], bridge.c:
    Prepare the grounds for matching MAC header fields in bridged packets,
    so we can have 'etherfw' functionality without a lot of kernel and
    userland bloat.
2002-05-09 10:34:57 +00:00
Alfred Perlstein
4d77a549fe Remove __P. 2002-03-19 21:25:46 +00:00