Commit Graph

67 Commits

Author SHA1 Message Date
Alexander V. Chernikov
d4e1b51578 Fix build with gcc. 2014-10-04 13:57:14 +00:00
Alexander V. Chernikov
ccba94b8fc Switch ipfw to use rmlock for runtime locking. 2014-10-04 11:40:35 +00:00
Alexander V. Chernikov
1a33e79969 Change copyrights to the proper one. 2014-09-05 14:19:02 +00:00
Alexander V. Chernikov
6b988f3a27 * Use modular opcode handling inside ipfw_ctl3() instead of static switch.
* Provide hints for subsystem initializers if they are called for
  the first/last time.
* Convert every IP_FW3 opcode user to use new sopt API.
2014-09-05 11:11:15 +00:00
Alexander V. Chernikov
e822d9364e Be consistent and use same arguments for ctl3 opcodes.
Move legacy IP_FW_TABLE_XGETSIZE handling to separate function.
2014-09-03 21:57:06 +00:00
Alexander V. Chernikov
fb4b37a357 * Fix crash due to forgotten value refcouting in ipfw_link_table_values()
* Fix argument order in rollback_toperation_state()
* Make flush_table() use operation state API to ease checks.
2014-09-02 20:46:18 +00:00
Alexander V. Chernikov
71af39bf34 Add more comments on newly-added functions.
Add back opstate handler function.
2014-09-02 14:27:12 +00:00
Alexander V. Chernikov
0cba2b2802 Add support for multi-field values inside ipfw tables.
This is the last major change in given branch.

Kernel changes:
* Use 64-bytes structures to hold multi-value variables.
* Use shared array to hold values from all tables (assume
  each table algo is capable of holding 32-byte variables).
* Add some placeholders to support per-table value arrays in future.
* Use simple eventhandler-style API to ease the process of adding new
  table items. Currently table addition may required multiple UH drops/
  acquires which is quite tricky due to atomic table modificatio/swap
  support, shared array resize, etc. Deal with it by calling special
  notifier capable of rolling back state before actually performing
  swap/resize operations. Original operation then restarts itself after
  acquiring UH lock.
* Bump all objhash users default values to at least 64
* Fix custom hashing inside objhash.

Userland changes:
* Add support for dumping shared value array via "vlist" internal cmd.
* Some small print/fill_flags dixes to support u32 values.
* valtype is now bitmask of
  <skipto|pipe|fib|nat|dscp|tag|divert|netgraph|limit|ipv4|ipv6>.
  New values can hold distinct values for each of this types.
* Provide special "legacy" type which assumes all values are the same.
* More helpers/docs following..

Some examples:

3:41 [1] zfscurr0# ipfw table mimimi create valtype skipto,limit,ipv4,ipv6
3:41 [1] zfscurr0# ipfw table mimimi info
+++ table(mimimi), set(0) +++
 kindex: 2, type: addr
 references: 0, valtype: skipto,limit,ipv4,ipv6
 algorithm: addr:radix
 items: 0, size: 296
3:42 [1] zfscurr0# ipfw table mimimi add 10.0.0.5 3000,10,10.0.0.1,2a02:978:2::1
added: 10.0.0.5/32 3000,10,10.0.0.1,2a02:978:2::1
3:42 [1] zfscurr0# ipfw table mimimi list
+++ table(mimimi), set(0) +++
10.0.0.5/32 3000,0,10.0.0.1,2a02:978:2::1
2014-08-31 23:51:09 +00:00
Alexander V. Chernikov
867708f7eb Simplify table reference/create chain. 2014-08-23 12:41:39 +00:00
Alexander V. Chernikov
4dff4ae028 * Use OP_ADD/OP_DEL macro instead of plain integers.
* ipfw_foreach_table_tentry() to permit listing
  arbitrary ipfw table using standart format.
2014-08-23 11:27:49 +00:00
Alexander V. Chernikov
4bbd15771b Make room for multi-type values in struct tentry. 2014-08-15 12:58:32 +00:00
Alexander V. Chernikov
c21034b744 Replace "cidr" table type with "addr" type.
Suggested by:	luigi
2014-08-14 21:43:20 +00:00
Alexander V. Chernikov
d3b00c08bc * Add cidr:kfib algo type just for fun. It binds kernel fib
of given number to a table.

Example:
# ipfw table fib2 create algo "cidr:kfib fib=2"
# ipfw table fib2 info
+++ table(fib2), set(0) +++
 kindex: 2, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:kfib fib=2
 items: 11, size: 288
# ipfw table fib2 list
+++ table(fib2), set(0) +++
10.0.0.0/24 0
127.0.0.1/32 0
::/96 0
::1/128 0
::ffff:0.0.0.0/96 0
2a02:978:2::/112 0
fe80::/10 0
fe80:1::/64 0
fe80:2::/64 0
fe80:3::/64 0
ff02::/16 0
# ipfw table fib2 lookup 10.0.0.5
10.0.0.0/24 0
# ipfw table fib2 lookup 2a02:978:2::11
2a02:978:2::/112 0
# ipfw table fib2 detail
+++ table(fib2), set(0) +++
 kindex: 2, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:kfib fib=2
 items: 11, size: 288
 IPv4 algorithm radix info
  items: 0 itemsize: 200
 IPv6 algorithm radix info
  items: 0 itemsize: 200
2014-08-14 20:17:23 +00:00
Alexander V. Chernikov
fd0869d547 * Document internal commands.
* Do not require/set default table type if algo name is specified.
* Add TA_FLAG_READONLY option for algorithms.
2014-08-14 17:31:04 +00:00
Alexander V. Chernikov
18ad419788 * Fix displaying dynamic rules for large rulesets.
* Clean up some comments.
2014-08-14 08:21:22 +00:00
Alexander V. Chernikov
fddbbf75c8 Fix assertion. 2014-08-13 16:53:12 +00:00
Alexander V. Chernikov
40e5f498de * Pass proper table set numbers from userland side.
* Ignore them, but honor V_fw_tables_sets value on kernel side.
2014-08-13 12:04:45 +00:00
Alexander V. Chernikov
c8d5d3088b * Clarify ipfw_swap_table operations
* Ensure <add|del>_table_entry handle ta change properly.
2014-08-12 17:03:13 +00:00
Alexander V. Chernikov
e5eec6dd21 * Rename ipfw_[un]bind_table_rule to ipfw_[un]ref_rule_tables
* Update their descriptions.
2014-08-12 16:08:13 +00:00
Alexander V. Chernikov
1940fa7727 Change tablearg value to be 0 (try #2).
Most of the tablearg-supported opcodes does not accept 0 as valid value:
 O_TAG, O_TAGGED, O_PIPE, O_QUEUE, O_DIVERT, O_TEE, O_SKIPTO, O_CALLRET,
 O_NETGRAPH, O_NGTEE, O_NAT treats 0 as invalid input.

The rest are O_SETDSCP and O_SETFIB.
'Fix' them by adding high-order bit (0x8000) set for non-tablearg values.
Do translation in kernel for old clients (import_rule0 / export_rule0),
teach current ipfw(8) binary to add/remove given bit.

This change does not affect handling SETDSCP values, but limit
O_SETFIB values to 32767 instead of 65k. Since currently we have either
old (16) or new (2^32) max fibs, this should not be a big deal:
we're definitely OK for former and have to add another opcode to deal
with latter, regardless of tablearg value.
2014-08-12 15:51:48 +00:00
Alexander V. Chernikov
301290bc6d * Rename has_space to need_modify to be consistent with 0 as return values.
* document all callbacks supported by algorithms code.
2014-08-12 14:09:15 +00:00
Alexander V. Chernikov
f99fbf96c4 No functional changes, do better functions grouping. 2014-08-12 10:22:46 +00:00
Alexander V. Chernikov
0468c5bae9 Simplify table auto-creation for old userland users. 2014-08-12 09:48:54 +00:00
Alexander V. Chernikov
1bc0d45749 Simplify add/del_table_entry() by making their common pieces
common functions.
2014-08-11 22:38:13 +00:00
Alexander V. Chernikov
35e1bbd089 Update functions descriptions. 2014-08-11 20:00:51 +00:00
Alexander V. Chernikov
4f43138ade * Add the abilify to lock/unlock given table from changes.
Example:

# ipfw table si lock
# ipfw table si info
+++ table(si), set(0) +++
 kindex: 0, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:radix
 items: 0, size: 288
# ipfw table si add 4.5.6.7
ignored: 4.5.6.7/32 0
ipfw: Adding record failed: table is locked
# ipfw table si unlock
# ipfw table si add 4.5.6.7
added: 4.5.6.7/32 0
# ipfw table si lock
# ipfw table si delete 4.5.6.7
ignored: 4.5.6.7/32 0
ipfw: Deleting record failed: table is locked
# ipfw table si unlock
# ipfw table si delete 4.5.6.7
deleted: 4.5.6.7/32 0
2014-08-11 18:09:37 +00:00
Alexander V. Chernikov
3a845e1076 * Add support for batched add/delete for ipfw tables
* Add support for atomic batches add (all or none).
* Fix panic on deleting non-existing entry in radix algo.

Examples:

# si is empty
# ipfw table si add 1.1.1.1/32 1111 2.2.2.2/32 2222
added: 1.1.1.1/32 1111
added: 2.2.2.2/32 2222
# ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 4444
exists: 2.2.2.2/32 2200
added: 4.4.4.4/32 4444
ipfw: Adding record failed: record already exists
^^^^^ Returns error but keeps inserted items
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444
# ipfw table si atomic add 3.3.3.3/32 3333 4.4.4.4/32 4400 5.5.5.5/32 5555
added(reverted): 3.3.3.3/32 3333
exists: 4.4.4.4/32 4400
ignored: 5.5.5.5/32 5555
ipfw: Adding record failed: record already exists
^^^^^ Returns error and reverts added records
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444
2014-08-11 17:34:25 +00:00
Alexander V. Chernikov
030b184f10 * Use 2 32-bits field inside rule instead of 2 pointer to save skipto state.
* Introduce ipfw_reap_add() to unify unlinking rules/adding it to reap queue
* Unbreak FreeBSD7 export format.
2014-08-09 09:11:26 +00:00
Alexander V. Chernikov
720ee730c6 Kernel changes:
* Fix buffer calculation for table dumps
* Fix IPv6 radix entiries addition broken in r269371.

Userland changes:
* Fix bug in retrieving statric ruleset
* Fix several bugs in retrieving table list
2014-08-08 21:09:22 +00:00
Alexander V. Chernikov
8bd1921248 Partially revert previous commit:
"0" value is perfectly valid for O_SETFIB and O_SETDSCP,
  so tablearg remains to be 655535 for now.
2014-08-08 15:33:26 +00:00
Alexander V. Chernikov
2c452b20dd * Switch tablearg value from 65535 to 0.
* Use u16 table kidx instead of integer on for iface opcode.
* Provide compability layer for old clients.
2014-08-08 14:23:20 +00:00
Alexander V. Chernikov
adf3b2b9d8 * Add IP_FW_TABLE_XMODIFY opcode
* Since there seems to be lack of consensus on strict value typing,
  remove non-default value types. Use userland-only "value format type"
  to print values.

Kernel changes:
* Add IP_FW_XMODIFY to permit table run-time modifications.
  Currently we support changing limit and value format type.

Userland changes:
* Support IP_FW_XMODIFY opcode.
* Support specifying value format type (ftype) in tablble create/modify req
* Fine-print value type/value format type.
2014-08-08 09:27:49 +00:00
Alexander V. Chernikov
28ea4fa355 Remove IP_FW_TABLES_XGETSIZE opcode.
It is superseded by IP_FW_TABLES_XLIST.
2014-08-08 06:36:26 +00:00
Alexander V. Chernikov
a73d728d31 Kernel changes:
* Implement proper checks for switching between global and set-aware tables
* Split IP_FW_DEL mess into the following opcodes:
  * IP_FW_XDEL (del rules matching pattern)
  * IP_FW_XMOVE (move rules matching pattern to another set)
  * IP_FW_SET_SWAP (swap between 2 sets)
  * IP_FW_SET_MOVE (move one set to another one)
  * IP_FW_SET_ENABLE (enable/disable sets)
* Add IP_FW_XZERO / IP_FW_XRESETLOG to finish IP_FW3 migration.
* Use unified ipfw_range_tlv as range description for all of the above.
* Check dynamic states IFF there was non-zero number of deleted dyn rules,
* Del relevant dynamic states with singe traversal instead of per-rule one.

Userland changes:
* Switch ipfw(8) to use new opcodes.
2014-08-07 21:37:31 +00:00
Alexander V. Chernikov
46d5200874 Implement atomic ipfw table swap.
Kernel changes:
* Add opcode IP_FW_TABLE_XSWAP
* Add support for swapping 2 tables with the same type/ftype/vtype.
* Make skipto cache init after ipfw locks init.

Userland changes:
* Add "table X swap Y" command.
2014-08-03 21:37:12 +00:00
Alexander V. Chernikov
5f379342d2 Show algorithm-specific data in "table info" output. 2014-08-03 12:19:45 +00:00
Alexander V. Chernikov
d20facb2f3 Remove unneded headers. 2014-08-03 09:48:54 +00:00
Alexander V. Chernikov
b6ee846e04 * Fix case when returning more that 4096 bytes of data
* Use different approach to ensure algo has enough space to store N elements:
  - explicitly ask algo (under UH_WLOCK) before/after insertion.  This (along
    with existing reallocation callbacks) really guarantees us that it is safe
    to insert N elements at once while holding UH_WLOCK+WLOCK.
  - remove old aflags/flags approach
2014-08-02 17:18:47 +00:00
Alexander V. Chernikov
4c0c07a552 * Permit limiting number of items in table.
Kernel changes:
* Add TEI_FLAGS_DONTADD entry flag to indicate that insert is not possible
* Support given flag in all algorithms
* Add "limit" field to ipfw_xtable_info
* Add actual limiting code into add_table_entry()

Userland changes:
* Add "limit" option as "create" table sub-option. Limit modification
  is currently impossible.
* Print human-readable errors in table enry addition/deletion code.
2014-08-01 15:17:46 +00:00
Alexander V. Chernikov
57a1cf954e * Use TA_FLAG_DEFAULT for default algorithm selection instead of
exporting algorithm structures directly.

* Pass needed state buffer size in algo structures as preparation
  for tables add/del requests batching.
2014-08-01 07:35:17 +00:00
Alexander V. Chernikov
914bffb6ab * Add new "flow" table type to support N=1..5-tuple lookups
* Add "flow:hash" algorithm

Kernel changes:
* Add O_IP_FLOW_LOOKUP opcode to support "flow" lookups
* Add IPFW_TABLE_FLOW table type
* Add "struct tflow_entry" as strage for 6-tuple flows
* Add "flow:hash" algorithm. Basically it is auto-growing chained hash table.
  Additionally, we store mask of fields we need to compare in each instance/

* Increase ipfw_obj_tentry size by adding struct tflow_entry
* Add per-algorithm stat (ifpw_ta_tinfo) to ipfw_xtable_info
* Increase algoname length: 32 -> 64 (algo options passed there as string)
* Assume every table type can be customized by flags, use u8 to store "tflags" field.
* Simplify ipfw_find_table_entry() by providing @tentry directly to algo callback.
* Fix bug in cidr:chash resize procedure.

Userland changes:
* add "flow table(NAME)" syntax to support n-tuple checking tables.
* make fill_flags() separate function to ease working with _s_x arrays
* change "table info" output to reflect longer "type" fields

Syntax:
ipfw table fl2 create type flow:[src-ip][,proto][,src-port][,dst-ip][dst-port] [algo flow:hash]

Examples:

0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash
0:02 [2] zfscurr0# ipfw table fl2 info
+++ table(fl2), set(0) +++
 kindex: 0, type: flow:src-ip,proto,dst-port
 valtype: number, references: 0
 algorithm: flow:hash
 items: 0, size: 280
0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
0:02 [2] zfscurr0# ipfw table fl2 list
+++ table(fl2), set(0) +++
2a02:6b8::333,6,443 45000
10.0.0.92,6,80 22000
0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'
00200 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
0:03 [2] zfscurr0# ipfw show
00200   0     0 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 617 59416 allow ip from any to any
0:03 [2] zfscurr0# telnet -s 10.0.0.92 78.46.89.105 80
Trying 78.46.89.105...
..
0:04 [2] zfscurr0# ipfw show
00200   5   272 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 682 66733 allow ip from any to any
2014-07-31 20:08:19 +00:00
Alexander V. Chernikov
b23d5de9b6 * Add number:array algorithm lookup method.
Kernel changes:
* s/IPFW_TABLE_U32/IPFW_TABLE_NUMBER/
* Force "lookup <port|uid|gid|jid>" to be IPFW_TABLE_NUMBER
* Support "lookup" method for number tables
* Add number:array algorihm (i32 as key, auto-growing).

Userland changes:
* Support named tables in "lookup <tag> Table"
* Fix handling of "table(NAME,val)" case
* Support printing "number" table data.
2014-07-30 14:52:26 +00:00
Alexander V. Chernikov
daabb523cc Fix "flush" cmd for algorithms wih non-default parameters. 2014-07-30 09:17:40 +00:00
Alexander V. Chernikov
9d099b4f38 * Dump available table algorithms via "ipfw talist" cmd.
Kernel changes:
* Add type/refcount fields to table algo instances.
* Add IP_FW_TABLES_ALIST opcode to export available algorihms to userland.

Userland changes:
* Fix cores on empty input inside "ipfw table" handler.
* Add "ipfw talist" cmd to print availabled kernel algorithms.
* Change "table info" output to reflect long algorithm config lines.
2014-07-29 22:44:26 +00:00
Alexander V. Chernikov
0b565ac0e6 * Copy ta structures to stable storage to ease future extension.
* Remove algo .lookup field since table lookup function is set by algo code.
2014-07-29 21:38:06 +00:00
Alexander V. Chernikov
74b941f042 * Add new ipfw cidr algorihm: hash table.
Algorithm works with both IPv4 and IPv6 prefixes, /32 and /128
ranges are assumed by default.
It works the following way: input IP address is masked to specified
mask, hashed and searched inside hash bucket.

Current implementation does not support "lookup" method and hash auto-resize.
This will be changed soon.

some examples:

ipfw table mi_test2 create type cidr algo cidr:hash
ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"

ipfw table mi_test2 info
+++ table(mi_test2), set(0) +++
 type: cidr, kindex: 7
 valtype: number, references: 0
 algorithm: cidr:hash
 items: 0, size: 220

ipfw table mi_test info
+++ table(mi_test), set(0) +++
 type: cidr, kindex: 6
 valtype: number, references: 0
 algorithm: cidr:hash masks=/30,/64
 items: 0, size: 220

ipfw table mi_test add 10.0.0.5/30
ipfw table mi_test add 10.0.0.8/30
ipfw table mi_test add 2a02:6b8:b010::1/64 25

ipfw table mi_test list
+++ table(mi_test), set(0) +++
10.0.0.4/30 0
10.0.0.8/30 0
2a02:6b8:b010::/64 25
2014-07-29 19:49:38 +00:00
Alexander V. Chernikov
adea620132 * Change algorthm names to "type:algo" (e.g. "iface:array", "cidr:radix") format.
* Pass number of items changed in add/del hooks to permit adding/deleting
  multiple values at once.
2014-07-29 08:00:13 +00:00
Alexander V. Chernikov
68394ec88e * Add generic ipfw interface tracking API
* Rewrite interface tables to use interface indexes

Kernel changes:
* Add generic interface tracking API:
 - ipfw_iface_ref (must call unlocked, performs lazy init if needed, allocates
  state & bumps ref)
 - ipfw_iface_add_ntfy(UH_WLOCK+WLOCK, links comsumer & runs its callback to
  update ifindex)
 - ipfw_iface_del_ntfy(UH_WLOCK+WLOCK, unlinks consumer)
 - ipfw_iface_unref(unlocked, drops reference)
Additionally, consumer callbacks are called in interface withdrawal/departure.

* Rewrite interface tables to use iface tracking API. Currently tables are
  implemented the following way:
  runtime data is stored as sorted array of {ifidx, val} for existing interfaces
  full data is stored inside namedobj instance (chained hashed table).

* Add IP_FW_XIFLIST opcode to dump status of tracked interfaces

* Pass @chain ptr to most non-locked algorithm callbacks:
  (prepare_add, prepare_del, flush_entry ..). This may be needed for better
  interaction of given algorithm an other ipfw subsystems

* Add optional "change_ti" algorithm handler to permit updating of
  cached table_info pointer (happens in case of table_max resize)

* Fix small bug in ipfw_list_tables()
* Add badd (insert into sorted array) and bdel (remove from sorted array) funcs

Userland changes:
* Add "iflist" cmd to print status of currently tracked interface
* Add stringnum_cmp for better interface/table names sorting
2014-07-28 19:01:25 +00:00
Alexander V. Chernikov
db785d3199 * Require explicit table creation before use on kernel side.
* Add resize callbacks for upcoming table-based algorithms.

Kernel changes:
* s/ipfw_modify_table/ipfw_manage_table_ent/
* Simplify add_table_entry(): make table creation a separate piece of code.
  Do not perform creation if not in "compat" mode.
* Add ability to perform modification of algorithm state (like table resize).
  The following callbacks were added:
 - prepare_mod (allocate new state, without locks)
 - fill_mod (UH_WLOCK, copy old state to new one)
 - modify (UH_WLOCK + WLOCK, switch state)
 - flush_mod (no locks, flushes allocated data)
 Given callbacks are called if table modification has been requested by add or
   delete callbacks. Additional u64 tc->'flags' field was added to pass these
   requests.
* Change add/del table ent format: permit adding/removing multiple entries
   at once (only 1 supported at the moment).

Userland changes:
* Auto-create tables with warning
2014-07-26 13:37:25 +00:00
Alexander V. Chernikov
7e767c791f * Use different rule structures in kernel/userland.
* Switch kernel to use per-cpu counters for rules.
* Keep ABI/API.

Kernel changes:
* Each rules is now exported as TLV with optional extenable
  counter block (ip_fW_bcounter for base one) and
  ip_fw_rule for rule&cmd data.
* Counters needs to be explicitly requested by IPFW_CFG_GET_COUNTERS flag.
* Separate counters from rules in kernel and clean up ip_fw a bit.
* Pack each rule in IPFW_TLV_RULE_ENT tlv to ease parsing.
* Introduce versioning in container TLV (may be needed in future).
* Fix ipfw_cfg_lheader broken u64 alignment.

Userland changes:
* Use set_mask from cfg header when requesting config
* Fix incorrect read accouting in ipfw_show_config()
* Use IPFW_RULE_NOOPT flag instead of playing with _pad
* Fix "ipfw -d list": do not print counters for dynamic states
* Some small fixes
2014-07-08 23:11:15 +00:00