Commit Graph

28 Commits

Author SHA1 Message Date
Alexander V. Chernikov
57a1cf954e * Use TA_FLAG_DEFAULT for default algorithm selection instead of
exporting algorithm structures directly.

* Pass needed state buffer size in algo structures as preparation
  for tables add/del requests batching.
2014-08-01 07:35:17 +00:00
Alexander V. Chernikov
914bffb6ab * Add new "flow" table type to support N=1..5-tuple lookups
* Add "flow:hash" algorithm

Kernel changes:
* Add O_IP_FLOW_LOOKUP opcode to support "flow" lookups
* Add IPFW_TABLE_FLOW table type
* Add "struct tflow_entry" as strage for 6-tuple flows
* Add "flow:hash" algorithm. Basically it is auto-growing chained hash table.
  Additionally, we store mask of fields we need to compare in each instance/

* Increase ipfw_obj_tentry size by adding struct tflow_entry
* Add per-algorithm stat (ifpw_ta_tinfo) to ipfw_xtable_info
* Increase algoname length: 32 -> 64 (algo options passed there as string)
* Assume every table type can be customized by flags, use u8 to store "tflags" field.
* Simplify ipfw_find_table_entry() by providing @tentry directly to algo callback.
* Fix bug in cidr:chash resize procedure.

Userland changes:
* add "flow table(NAME)" syntax to support n-tuple checking tables.
* make fill_flags() separate function to ease working with _s_x arrays
* change "table info" output to reflect longer "type" fields

Syntax:
ipfw table fl2 create type flow:[src-ip][,proto][,src-port][,dst-ip][dst-port] [algo flow:hash]

Examples:

0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash
0:02 [2] zfscurr0# ipfw table fl2 info
+++ table(fl2), set(0) +++
 kindex: 0, type: flow:src-ip,proto,dst-port
 valtype: number, references: 0
 algorithm: flow:hash
 items: 0, size: 280
0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
0:02 [2] zfscurr0# ipfw table fl2 list
+++ table(fl2), set(0) +++
2a02:6b8::333,6,443 45000
10.0.0.92,6,80 22000
0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'
00200 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
0:03 [2] zfscurr0# ipfw show
00200   0     0 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 617 59416 allow ip from any to any
0:03 [2] zfscurr0# telnet -s 10.0.0.92 78.46.89.105 80
Trying 78.46.89.105...
..
0:04 [2] zfscurr0# ipfw show
00200   5   272 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 682 66733 allow ip from any to any
2014-07-31 20:08:19 +00:00
Alexander V. Chernikov
b23d5de9b6 * Add number:array algorithm lookup method.
Kernel changes:
* s/IPFW_TABLE_U32/IPFW_TABLE_NUMBER/
* Force "lookup <port|uid|gid|jid>" to be IPFW_TABLE_NUMBER
* Support "lookup" method for number tables
* Add number:array algorihm (i32 as key, auto-growing).

Userland changes:
* Support named tables in "lookup <tag> Table"
* Fix handling of "table(NAME,val)" case
* Support printing "number" table data.
2014-07-30 14:52:26 +00:00
Alexander V. Chernikov
daabb523cc Fix "flush" cmd for algorithms wih non-default parameters. 2014-07-30 09:17:40 +00:00
Alexander V. Chernikov
9d099b4f38 * Dump available table algorithms via "ipfw talist" cmd.
Kernel changes:
* Add type/refcount fields to table algo instances.
* Add IP_FW_TABLES_ALIST opcode to export available algorihms to userland.

Userland changes:
* Fix cores on empty input inside "ipfw table" handler.
* Add "ipfw talist" cmd to print availabled kernel algorithms.
* Change "table info" output to reflect long algorithm config lines.
2014-07-29 22:44:26 +00:00
Alexander V. Chernikov
0b565ac0e6 * Copy ta structures to stable storage to ease future extension.
* Remove algo .lookup field since table lookup function is set by algo code.
2014-07-29 21:38:06 +00:00
Alexander V. Chernikov
74b941f042 * Add new ipfw cidr algorihm: hash table.
Algorithm works with both IPv4 and IPv6 prefixes, /32 and /128
ranges are assumed by default.
It works the following way: input IP address is masked to specified
mask, hashed and searched inside hash bucket.

Current implementation does not support "lookup" method and hash auto-resize.
This will be changed soon.

some examples:

ipfw table mi_test2 create type cidr algo cidr:hash
ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"

ipfw table mi_test2 info
+++ table(mi_test2), set(0) +++
 type: cidr, kindex: 7
 valtype: number, references: 0
 algorithm: cidr:hash
 items: 0, size: 220

ipfw table mi_test info
+++ table(mi_test), set(0) +++
 type: cidr, kindex: 6
 valtype: number, references: 0
 algorithm: cidr:hash masks=/30,/64
 items: 0, size: 220

ipfw table mi_test add 10.0.0.5/30
ipfw table mi_test add 10.0.0.8/30
ipfw table mi_test add 2a02:6b8:b010::1/64 25

ipfw table mi_test list
+++ table(mi_test), set(0) +++
10.0.0.4/30 0
10.0.0.8/30 0
2a02:6b8:b010::/64 25
2014-07-29 19:49:38 +00:00
Alexander V. Chernikov
adea620132 * Change algorthm names to "type:algo" (e.g. "iface:array", "cidr:radix") format.
* Pass number of items changed in add/del hooks to permit adding/deleting
  multiple values at once.
2014-07-29 08:00:13 +00:00
Alexander V. Chernikov
68394ec88e * Add generic ipfw interface tracking API
* Rewrite interface tables to use interface indexes

Kernel changes:
* Add generic interface tracking API:
 - ipfw_iface_ref (must call unlocked, performs lazy init if needed, allocates
  state & bumps ref)
 - ipfw_iface_add_ntfy(UH_WLOCK+WLOCK, links comsumer & runs its callback to
  update ifindex)
 - ipfw_iface_del_ntfy(UH_WLOCK+WLOCK, unlinks consumer)
 - ipfw_iface_unref(unlocked, drops reference)
Additionally, consumer callbacks are called in interface withdrawal/departure.

* Rewrite interface tables to use iface tracking API. Currently tables are
  implemented the following way:
  runtime data is stored as sorted array of {ifidx, val} for existing interfaces
  full data is stored inside namedobj instance (chained hashed table).

* Add IP_FW_XIFLIST opcode to dump status of tracked interfaces

* Pass @chain ptr to most non-locked algorithm callbacks:
  (prepare_add, prepare_del, flush_entry ..). This may be needed for better
  interaction of given algorithm an other ipfw subsystems

* Add optional "change_ti" algorithm handler to permit updating of
  cached table_info pointer (happens in case of table_max resize)

* Fix small bug in ipfw_list_tables()
* Add badd (insert into sorted array) and bdel (remove from sorted array) funcs

Userland changes:
* Add "iflist" cmd to print status of currently tracked interface
* Add stringnum_cmp for better interface/table names sorting
2014-07-28 19:01:25 +00:00
Alexander V. Chernikov
db785d3199 * Require explicit table creation before use on kernel side.
* Add resize callbacks for upcoming table-based algorithms.

Kernel changes:
* s/ipfw_modify_table/ipfw_manage_table_ent/
* Simplify add_table_entry(): make table creation a separate piece of code.
  Do not perform creation if not in "compat" mode.
* Add ability to perform modification of algorithm state (like table resize).
  The following callbacks were added:
 - prepare_mod (allocate new state, without locks)
 - fill_mod (UH_WLOCK, copy old state to new one)
 - modify (UH_WLOCK + WLOCK, switch state)
 - flush_mod (no locks, flushes allocated data)
 Given callbacks are called if table modification has been requested by add or
   delete callbacks. Additional u64 tc->'flags' field was added to pass these
   requests.
* Change add/del table ent format: permit adding/removing multiple entries
   at once (only 1 supported at the moment).

Userland changes:
* Auto-create tables with warning
2014-07-26 13:37:25 +00:00
Alexander V. Chernikov
7e767c791f * Use different rule structures in kernel/userland.
* Switch kernel to use per-cpu counters for rules.
* Keep ABI/API.

Kernel changes:
* Each rules is now exported as TLV with optional extenable
  counter block (ip_fW_bcounter for base one) and
  ip_fw_rule for rule&cmd data.
* Counters needs to be explicitly requested by IPFW_CFG_GET_COUNTERS flag.
* Separate counters from rules in kernel and clean up ip_fw a bit.
* Pack each rule in IPFW_TLV_RULE_ENT tlv to ease parsing.
* Introduce versioning in container TLV (may be needed in future).
* Fix ipfw_cfg_lheader broken u64 alignment.

Userland changes:
* Use set_mask from cfg header when requesting config
* Fix incorrect read accouting in ipfw_show_config()
* Use IPFW_RULE_NOOPT flag instead of playing with _pad
* Fix "ipfw -d list": do not print counters for dynamic states
* Some small fixes
2014-07-08 23:11:15 +00:00
Alexander V. Chernikov
81d3153d61 * Add "lookup" table functionality to permit userland entry lookups.
* Bump table dump format preserving old ABI.

Kernel size:
* Add IP_FW_TABLE_XFIND to handle "lookup" request from userland.
* Add ta_find_tentry() algorithm callbacks/handlers to support lookups.
* Fully switch to ipfw_obj_tentry for various table dumps:
  algorithms are now required to support the latest (ipfw_obj_tentry) entry
    dump format, the rest is handled by generic dump code.
  IP_FW_TABLE_XLIST opcode version bumped (0 -> 1).
* Eliminate legacy ta_dump_entry algo handler:
  dump_table_entry() converts data from current to legacy format.

Userland side:
* Add "lookup" table parameter.
* Change the way table type is guessed: call table_get_info() first,
  and check value for IPv4/IPv6 type IFF table does not exist.
* Fix table_get_list(): do more tries if supplied buffer is not enough.
* Sparate table_show_entry() from table_show_list().
2014-07-06 18:16:04 +00:00
Alexander V. Chernikov
1832a7b303 * Issue warning while requesting ruleset with new tables via legacy binary.
Convert each unresolved table as table 65535 (which cannot be used normally).
* Perform s/^ipfw_// for add_table_entry, del_table_entry and flush_table since
  these are internal functions exported to keep legacy interface.
* Remove macro TABLE_SET. Operations with tables can be done in any set, the only
  thing net.inet.ip.fw.tables_sets affects is the set in which tables are looked
  up while binding them to the rule.
2014-07-04 07:02:11 +00:00
Alexander V. Chernikov
ac35ff1784 Fully switch to named tables:
Kernel changes:
* Introduce ipfw_obj_tentry table entry structure to force u64 alignment.
* Support "update-on-existing-key" "add" bahavior (TEI_FLAGS_UPDATED).
* Use "subtype" field to distingush between IPv4 and IPv6 table records
  instead of previous hack.
* Add value type (vtype) field for kernel tables. Current types are
  number,ip and dscp
* Fix sets mask retrieval for old binaries
* Fix crash while using interface tables

Userland changes:
* Switch ipfw_table_handler() to use named-only tables.
* Add "table NAME create [type {cidr|iface|u32} [valtype {number|ip|dscp}] ..."
* Switch ipfw_table_handler to match_token()-based parser.
* Switch ipfw_sets_handler to use new ipfw_get_config() for mask  retrieval.
* Allow ipfw set X table ... syntax to permit using per-set table namespaces.
2014-07-03 22:25:59 +00:00
Alexander V. Chernikov
6c2997ffec * Add new IP_FW_XADD opcode which permits to
a) specify table ids as names
  b) add multiple rules at once.
Partially convert current code for atomic addition of multiple rules.
2014-06-29 22:35:47 +00:00
Alexander V. Chernikov
563b5ab132 Suppord showing named tables in ipfw(8) rule listing.
Kernel changes:
* change base TLV header to be u64 (so size can be u32).
* Introduce ipfw_obj_ctlv generc container TLV.
* Add IP_FW_XGET opcode which is now used for atomic configuration
  retrieval. One can specify needed configuration pieces to retrieve
  via flags field. Currently supported are
  IPFW_CFG_GET_STATIC (static rules) and
  IPFW_CFG_GET_STATES (dynamic states).
  Other configuration pieces (tables, pipes, etc..) support is planned.

Userland changes:
* Switch ipfw(8) to use new IP_FW_XGET for rule listing.
* Split rule listing code get and show pieces.
* Make several steps forward towards libipfw:
  permit printing states and rules(paritally) to supplied buffer.
  do not die on malloc/kernel failure inside given printing functions.
  stop assuming cmdline_opts is global symbol.
2014-06-28 23:20:24 +00:00
Alexander V. Chernikov
2d99a3497d Use different approach for filling large datasets to userspace:
Instead of trying to allocate bing contiguous chunk of memory,
use intermediate-sized (page size) buffer as sliding window
reducing number of sooptcopyout() calls to perform.

This reduces dump functions complexity and provides additional
layer of abstraction.

User-visible api consists of 2 functions:
ipfw_get_sopt_space() - gets contigious amount of storage (or NULL)
and
ipfw_get_sopt_header() - the same, but zeroes the rest of the buffer.
2014-06-27 10:07:00 +00:00
Alexander V. Chernikov
9490a62716 * Add IP_FW_TABLE_XCREATE / IP_FW_TABLE_XMODIFY opcodes.
* Add 'algoname' string to ipfw_xtable_info permitting to specify lookup
algoritm with parameters.
* Rework part of ipfw_rewrite_table_uidx()

Sponsored by:	Yandex LLC
2014-06-16 13:05:07 +00:00
Alexander V. Chernikov
9c3c43aa77 Remove unused ipfw_dump_xtable(). 2014-06-15 13:43:44 +00:00
Alexander V. Chernikov
d3a4f9249c Simplify opcode handling.
* Use one u16 from op3 header to implement opcode versioning.
* IP_FW_TABLE_XLIST has now 2 handlers, for ver.0 (old) and ver.1 (current).
* Every getsockopt request is now handled in ip_fw_table.c
* Rename new opcodes:
IP_FW_OBJ_DEL -> IP_FW_TABLE_XDESTROY
IP_FW_OBJ_LISTSIZE -> IP_FW_TABLES_XGETSIZE
IP_FW_OBJ_LIST -> IP_FW_TABLES_XLIST
IP_FW_OBJ_INFO -> IP_FW_TABLE_XINFO
IP_FW_OBJ_INFO -> IP_FW_TABLE_XFLUSH

* Add some docs about using given opcodes.
* Group some legacy opcode/handlers.
2014-06-15 13:40:27 +00:00
Alexander V. Chernikov
f1220db8d7 Move further to eliminate next pieces of number-assuming code inside tables.
Kernel changes:
* Add IP_FW_OBJ_FLUSH opcode (flush table based on its name/set)
* Add IP_FW_OBJ_DUMP opcode (dumps table data based on its names/set)
* Add IP_FW_OBJ_LISTSIZE / IP_FW_OBJ_LIST opcodes (get list of kernel tables)

Userland changes:
* move tables code to separate tables.c file
* get rid of tables_max
* switch "all"/list handling to new opcodes
2014-06-14 22:47:25 +00:00
Alexander V. Chernikov
ea761a5dc4 Move most of external table structures/functions to separate ip_fw_table.h 2014-06-14 11:13:02 +00:00
Alexander V. Chernikov
9f7d47b025 Add API to ease adding new algorithms/new tabletypes to ipfw.
Kernel-side changelog:
* Split general tables code and algorithm-specific table data.
  Current algorithms (IPv4/IPv6 radix and interface tables radix) moved to
  new ip_fw_table_algo.c file.
  Tables code now supports any algorithm implementing the following callbacks:
+struct table_algo {
+       char            name[64];
+       int             idx;
+       ta_init         *init;
+       ta_destroy      *destroy;
+       table_lookup_t  *lookup;
+       ta_prepare_add  *prepare_add;
+       ta_prepare_del  *prepare_del;
+       ta_add          *add;
+       ta_del          *del;
+       ta_flush_entry  *flush_entry;
+       ta_foreach      *foreach;
+       ta_dump_entry   *dump_entry;
+       ta_dump_xentry  *dump_xentry;
+};

* Change ->state, ->xstate, ->tabletype fields of ip_fw_chain to
   ->tablestate pointer (array of 32 bytes structures necessary for
   runtime lookups (can be probably shrinked to 16 bytes later):

   +struct table_info {
   +       table_lookup_t  *lookup;        /* Lookup function */
   +       void            *state;         /* Lookup radix/other structure */
   +       void            *xstate;        /* eXtended state */
   +       u_long          data;           /* Hints for given func */
   +};

* Add count method for namedobj instance to ease size calculations
* Bump ip_fw3 buffer in ipfw_clt 128->256 bytes.
* Improve bitmask resizing on tables_max change.
* Remove table numbers checking from most places.
* Fix wrong nesting in ipfw_rewrite_table_uidx().

* Add IP_FW_OBJ_LIST opcode (list all objects of given type, currently
    implemented for IPFW_OBJTYPE_TABLE).
* Add IP_FW_OBJ_LISTSIZE (get buffer size to hold IP_FW_OBJ_LIST data,
    currenly implemented for IPFW_OBJTYPE_TABLE).
* Add IP_FW_OBJ_INFO (requests info for one object of given type).

Some name changes:
s/ipfw_xtable_tlv/ipfw_obj_tlv/ (no table specifics)
s/ipfw_xtable_ntlv/ipfw_obj_ntlv/ (no table specifics)

Userland changes:
* Add do_set3() cmd to ipfw2 to ease dealing with op3-embeded opcodes.
* Add/improve support for destroy/info cmds.
2014-06-14 10:58:39 +00:00
Alexander V. Chernikov
b074b7bbce Make ipfw tables use names as used-level identifier internally:
* Add namedobject set-aware api capable of searching/allocation objects by their name/idx.
* Switch tables code to use string ids for configuration tasks.
* Change locking model: most configuration changes are protected with UH lock, runtime-visible are protected with both locks.
* Reduce number of arguments passed to ipfw_table_add/del by using separate structure.
* Add internal V_fw_tables_sets tunable (set to 0) to prepare for set-aware tables (requires opcodes/client support)
* Implement typed table referencing (and tables are implicitly allocated with all state like radix ptrs on reference)
* Add "destroy" ipfw(8) using new IP_FW_DELOBJ opcode

Namedobj more detailed:
* Blackbox api providing methods to add/del/search/enumerate objects
* Statically-sized hashes for names/indexes
* Per-set bitmask to indicate free indexes
* Separate methods for index alloc/delete/resize

Basically, there should not be any user-visible changes except the following:
* reducing table_max is not supported
* flush & add change table type won't work if table is referenced

Sponsored by:	Yandex LLC
2014-06-12 09:59:11 +00:00
Alexander V. Chernikov
c3015737f3 Fix wrong formatting of 0.0.0.0/X table records in ipfw(8).
Add `flags` u16 field to the hole in ipfw_table_xentry structure.
Kernel has been guessing address family for supplied record based
on xent length size.
Userland, however, has been getting fixed-size ipfw_table_xentry structures
guessing address family by checking address by IN6_IS_ADDR_V4COMPAT().

Fix this behavior by providing specific IPFW_TCF_INET flag for IPv4 records.

PR:		bin/189471
Submitted by:	Dennis Yusupoff <dyr@smartspb.net>
MFC after:	2 weeks
2014-05-17 13:45:03 +00:00
Dimitry Andric
a3043eeef2 Under sys/netpfil/ipfw, surround two IPv6-specific static functions with
#ifdef INET6, since they are unused when INET6 is disabled.

MFC after:	3 days
2014-02-15 12:25:01 +00:00
Alexander V. Chernikov
d28d2aa46d Use rnh_matchaddr instead of rnh_lookup for longest-prefix match.
rnh_lookup is effectively the same as rnh_matchaddr if called with
empy network mask.

MFC after:	2 weeks
2014-01-03 23:11:26 +00:00
Gleb Smirnoff
3b3a8eb937 o Create directory sys/netpfil, where all packet filters should
reside, and move there ipfw(4) and pf(4).

o Move most modified parts of pf out of contrib.

Actual movements:

sys/contrib/pf/net/*.c		-> sys/netpfil/pf/
sys/contrib/pf/net/*.h		-> sys/net/
contrib/pf/pfctl/*.c		-> sbin/pfctl
contrib/pf/pfctl/*.h		-> sbin/pfctl
contrib/pf/pfctl/pfctl.8	-> sbin/pfctl
contrib/pf/pfctl/*.4		-> share/man/man4
contrib/pf/pfctl/*.5		-> share/man/man5

sys/netinet/ipfw		-> sys/netpfil/ipfw

The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.

Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.

The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.

Discussed with:		bz, luigi
2012-09-14 11:51:49 +00:00