Commit Graph

230 Commits

Author SHA1 Message Date
Luigi Rizzo
10d72ffc7d fix various warnings to compile the test code with -Wextra 2016-01-26 23:37:07 +00:00
Luigi Rizzo
fa57c83c70 fix various warnings (signed/unsigned, printf types, unused arguments) 2016-01-26 23:36:18 +00:00
Luigi Rizzo
f6a5c66400 prevent warnings for signed/unsigned comparisons and unused arguments.
Add checks for parameters overflowing 32 bit.
2016-01-26 22:46:58 +00:00
Luigi Rizzo
e72cd9a70d prevent warning for unused argument 2016-01-26 22:45:45 +00:00
Luigi Rizzo
4d85bfeb07 avoid warnings for signed/unsigned comparison and unused arguments 2016-01-26 22:45:05 +00:00
Luigi Rizzo
f51b072d4c Revert one chunk from commit 285362, which introduced an off-by-one error
in computing a shift index. The error was due to the use of mixed
fls() / __fls() functions in another implementation of qfq.
To avoid that the problem occurs again, properly document which
incarnation of the function we need.
Note that the bug only affects QFQ in FreeBSD head from last july, as
the patch was not merged to other versions.
2016-01-26 04:48:24 +00:00
Alexander V. Chernikov
61eee0e202 MFP r287070,r287073: split radix implementation and route table structure.
There are number of radix consumers in kernel land (pf,ipfw,nfs,route)
  with different requirements. In fact, first 3 don't have _any_ requirements
  and first 2 does not use radix locking. On the other hand, routing
  structure do have these requirements (rnh_gen, multipath, custom
  to-be-added control plane functions, different locking).
Additionally, radix should not known anything about its consumers internals.

So, radix code now uses tiny 'struct radix_head' structure along with
  internal 'struct radix_mask_head' instead of 'struct radix_node_head'.
  Existing consumers still uses the same 'struct radix_node_head' with
  slight modifications: they need to pass pointer to (embedded)
  'struct radix_head' to all radix callbacks.

Routing code now uses new 'struct rib_head' with different locking macro:
  RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing
  information base).

New net/route_var.h header was added to hold routing subsystem internal
  data. 'struct rib_head' was placed there. 'struct rtentry' will also
  be moved there soon.
2016-01-25 06:33:15 +00:00
Alexander V. Chernikov
fa7c058bf8 Fix panic on table/table entry delete. The panic could have happened
if more than 64 distinct values had been used.

Table value code uses internal objhash API which requires unique key
  for each object. For value code, pointer to the actual value data
  is used. The actual problem arises from the fact that 'actual' e.g.
  runtime data is stored in array and that array is auto-growing. There is
  special hook (update_tvalue() function) which is used to update the pointers
  after the change. For some reason, object 'key' was not updated.
  Fix this by adding update code to the update_tvalue().

Sponsored by:	Yandex LLC
2016-01-21 18:20:40 +00:00
Alexander V. Chernikov
89fc126add Initialize error value ta_lookup_kfib() by default to please compiler. 2016-01-10 08:37:00 +00:00
Bjoern A. Zeeb
60c274aaf8 Initialize error after r293626 in case neither INET nor INET6 is
compiled into the kernel.  Ideally lots more code would just not
be called (or compiled in) in that case but that requires a lot
more surgery.  For now try to make IP-less kernels compile again.
2016-01-10 08:14:25 +00:00
Alexander V. Chernikov
004d3e30a7 Make ipfw addr:kfib lookup algo use new routing KPI. 2016-01-10 06:43:43 +00:00
Alexander V. Chernikov
3673828490 Use already pre-calculated number of entries instead of tc->count. 2016-01-10 00:28:44 +00:00
Hans Petter Selasky
c8cfbc066f Properly drain callouts in the IPFW subsystem to avoid use after free
panics when unloading the dummynet and IPFW modules:

- The callout drain function can sleep and should not be called having
a non-sleepable lock locked. Remove locks around "ipfw_dyn_uninit(0)".

- Add a new "dn_gone" variable to prevent asynchronous restart of
dummynet callouts when unloading the dummynet kernel module.

- Call "dn_reschedule()" locked so that "dn_gone" can be set and
checked atomically with regard to starting a new callout.

Reviewed by:	hiren
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D3855
2015-12-15 09:02:05 +00:00
Alexander V. Chernikov
65ff3638df Merge helper fib* functions used for basic lookups.
Vast majority of rtalloc(9) users require only basic info from
route table (e.g. "does the rtentry interface match with the interface
  I have?". "what is the MTU?", "Give me the IPv4 source address to use",
  etc..).
Instead of hand-rolling lookups, checking if rtentry is up, valid,
  dealing with IPv6 mtu, finding "address" ifp (almost never done right),
  provide easy-to-use API hiding all the complexity and returning the
  needed info into small on-stack structure.

This change also helps hiding route subsystem internals (locking, direct
  rtentry accesses).
Additionaly, using this API improves lookup performance since rtentry is not
  locked.
(This is safe, since all the rtentry changes happens under both radix WLOCK
  and rtentry WLOCK).

Sponsored by:	Yandex LLC
2015-12-08 10:50:03 +00:00
Andrey V. Elsukov
1cf09efe5d Add destroy_object callback to object rewriting framework.
It is called when last reference to named object is going to be released
and allows to do additional cleanup for implementation of named objects.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-23 22:06:55 +00:00
Bryan Drewery
7143303723 Fix dynamic IPv6 rules showing junk for non-specified address masks.
For example:
  00002      0         0 (19s) PARENT 1 tcp 10.10.0.5 0 <-> 0.0.0.0 0
  00002      4       412 (1s) LIMIT tcp 10.10.0.5 25848 <-> 10.10.0.7 22
  00002     10       777 (1s) LIMIT tcp 2001:894:5a24:653::503:1 52023 <-> 2001:894:5a24:653:ca0a:a9ff:fe04:3978 22
  00002      0         0 (17s) PARENT 1 tcp 2001:894:5a24:653::503:1 0 <-> 80f3:70d:23fe:ffff:1005:: 0

Fix this by zeroing the unused address, as is done for IPv4:
  00002     0         0 (18s) PARENT 1 tcp 10.10.0.5 0 <-> 0.0.0.0 0
  00002    36     14952 (1s) LIMIT tcp 10.10.0.5 25848 <-> 10.10.0.7 22
  00002     0         0 (0s) PARENT 1 tcp 2001:894:5a24:653::503:1 0 <-> :: 0
  00002     4       345 (274s) LIMIT tcp 2001:894:5a24:653::503:1 34131 <-> 2001:470:1f11:262:ca0a:a9ff:fe04:3978 22

MFC after:	2 weeks
2015-11-17 20:42:08 +00:00
Alexander V. Chernikov
91e93daf9c Print proper setfib values in ipfw log.
Submitted by:	Denis Schneider <v1ne2go at gmail>
2015-11-08 13:44:21 +00:00
Alexander V. Chernikov
b554a27822 Fix setfib target.
Problem was introduced in r272840 when converting tablearg value to 0.

Submitted by:	Denis Schneider <v1ne2go at gmail>
2015-11-08 12:24:19 +00:00
Andrey V. Elsukov
ee09cb0bfb Remove now obsolete KASSERT.
Actually, object classify callbacks can skip some opcodes, that could
be rewritten. We will deteremine real numbed of rewritten opcodes a bit
later in this function.

Reported by:	David H. Wolfskill <david at catwhisker dot org>
2015-11-03 22:23:09 +00:00
Andrey V. Elsukov
748c9559ee Eliminate any conditional increments of object_opcodes in the
check_ipfw_rule_body() function. This function is intended to just
determine that rule has some opcodes that can be rewrited. Then the
ref_rule_objects() function will determine real number of rewritten
opcodes using classify callback.

Reviewed by:	melifaro
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-03 10:34:26 +00:00
Andrey V. Elsukov
f81431cca1 Add ipfw_check_object_name_generic() function to do basic checks for an
object name correctness. Each type of object can do more strict checking
in own implementation. Do such checks for tables in check_table_name().

Reviewed by:	melifaro
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-03 10:29:46 +00:00
Andrey V. Elsukov
5dc5a0e0aa Implement ipfw internal olist command to list named objects.
Reviewed by:	melifaro
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-11-03 10:21:53 +00:00
Alexander V. Chernikov
c6fb65b1df Bump number of prefixes in O_IP_<SRC|DST> from 15 to 31 (max possible).
PR:		203459
Submitted by:	groos at xiplink.com
MFC after:	2 weeks
2015-10-03 05:42:25 +00:00
Alexander V. Chernikov
3535eac433 Fix packets/bytes accounting on i386.
Spotted by:	julian
2015-08-27 07:53:58 +00:00
Andrey V. Elsukov
b13653baf9 Reduce overhead of ipfw's me6 opcode.
Skip checks for IPv6 multicast addresses.
Use in6_localip() for global unicast.
And for IPv6 link-local addresses do search in the IPv6 addresses list.
Since LLA are stored in the kernel internal form, use
IN6_ARE_MASKED_ADDR_EQUAL() macro with lla_mask for addresses comparison.
lla_mask has zero bits in the second word, where we keep sin6_scope_id.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-07-29 10:53:42 +00:00
Andrey V. Elsukov
af9aa0a837 Add helper functions for IP checksum adjusting. Use these functions in
dummynet code and for setdscp. This fixes wrong checksums in some cases.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2015-07-20 07:26:31 +00:00
Luigi Rizzo
4af7aed7c6 assorted algorithmic fixes from Paolo Valente (one of my qfq coauthors):
- use 1ULL to avoid shift truncations
- recompute the sum of weight dynamically to provide better fairness
- fix an erroneous constant in the computation of the slot
- preserve timestamp correctness when the old timestamp is stale.
2015-07-10 19:24:36 +00:00
Luigi Rizzo
e38e277fc4 one more warning suppression when compiling the test code in userspace. 2015-07-10 19:18:49 +00:00
Luigi Rizzo
e25716b7cc add code to compute fairness indexes;
cleanups to remove compile warnings.
2015-07-10 18:10:40 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Luigi Rizzo
62f42cf8ee use proper types to represent function pointers 2015-05-19 16:51:30 +00:00
Luigi Rizzo
352bc63d72 remove a redundant ; at the end of a function
MFC after:	1 week
2015-05-19 15:29:00 +00:00
Luigi Rizzo
bebf3c825f remove an extra ; after MODULE_DEPEND
(would otherwise generate a warning with more verbose compiler flags)

MFC after:	1 week
2015-05-19 14:49:31 +00:00
Luigi Rizzo
8ff71b031e bugfix (only affecting the "lookup" option in the userspace version of ipfw):
the conditional block should not include the 'else' otherwise
the code does a 'break;' without completing the check
2015-05-13 11:53:25 +00:00
Alexander V. Chernikov
e09c1944a3 Remove ptei->value check from ipfw_link_table_values():
even if there was non-zero number of restarts, we would unref/clear
  all value references and start ipfw_link_table_values() once again
  with (mostly) cleared "tei" buffer.
 Additionally, ptei->ptv stores only to-be-added values, not existing ones.
 This is a forgotten piece of previous value refconting implementation,
  and now it is simply incorrect.
2015-05-12 20:42:42 +00:00
Alexander V. Chernikov
b45fa3fad6 Fix panic when prepare_batch_buffer() returns error. 2015-05-06 07:53:43 +00:00
Alexander V. Chernikov
caf993912e Fix KASSERT introduced in r282155.
Found by:	dhw
2015-04-30 21:51:12 +00:00
Alexander V. Chernikov
e948489558 Fix panic introduced by r282070.
Arm friendly KASSERT() to ease debug of similar crashes.

Submitted by:	Olivier Cochard-Labbé
2015-04-28 17:05:55 +00:00
Alexander V. Chernikov
a1bddc75b4 Fix 'may be used uninitialized' warning not caught by clang. 2015-04-27 10:01:22 +00:00
Alexander V. Chernikov
1a458088ff Use free_nat_instance() for nat instance deletion.
Sponsored by:	Yandex LLC
2015-04-27 09:16:22 +00:00
Alexander V. Chernikov
74b22066b0 Make rule table kernel-index rewriting support any kind of objects.
Currently we have tables identified by their names in userland
with internal kernel-assigned indices. This works the following way:

When userland wishes to communicate with kernel to add or change rule(s),
it makes indexed sorted array of table names
(internally ipfw_obj_ntlv entries), and refer to indices in that
array in rule manipulation.
Prior to committing new rule to the ruleset kernel
a) finds all referenced tables, bump their refcounts and change
 values inside the opcodes to be real kernel indices
b) auto-creates all referenced but not existing tables and then
 do a) for them.

Kernel does almost the same when exporting rules to userland:
 prepares array of used tables in all rules in range, and
 prepends it before the actual ruleset retaining actual in-kernel
 indexes for that.

There is also special translation layer for legacy clients which is
able to provide 'real' indices for table names (basically doing atoi()).

While it is arguable that every subsystem really needs names instead of
numbers, there are several things that should be noted:

1) every non-singleton subsystem needs to store its runtime state
somewhere inside ipfw chain (and be able to get it fast)
2) we can't assume object numbers provided by humans will be dense.

Existing nat implementation (O(n) access and LIST inside chain) is a
good example.

Hence the following:
* Convert table-centric rewrite code to be more generic, callback-based
* Move most of the code from ip_fw_table.c to ip_fw_sockopt.c
* Provide abstract API to permit subsystems convert their objects
  between userland string identifier and in-kernel index.
  (See struct opcode_obj_rewrite) for more details
* Create another per-chain index (in next commit) shared among all subsystems
* Convert current NAT44 implementation to use new API, O(1) lookups,
 shared index and names instead of numbers (in next commit).

Sponsored by:	Yandex LLC
2015-04-27 08:29:39 +00:00
Gleb Smirnoff
fdf6290ea9 Fix memory leak.
PR:		199670
Reviewed by:	ae
2015-04-27 05:44:09 +00:00
Andrey V. Elsukov
bf55a0034d The offset variable has been cleared all bits except IP6F_OFF_MASK.
Use ip6f_mf variable instead of checking its bits.
2015-03-31 14:41:29 +00:00
Andrey V. Elsukov
2530ed9e70 Fix `ipfw fwd tablearg'. Use dedicated field nh4 in struct table_value
to obtain IPv4 next hop address in tablearg case.

Add `fwd tablearg' support for IPv6. ipfw(8) uses INADDR_ANY as next hop
address in O_FORWARD_IP opcode for specifying tablearg case. For IPv6 we
still use this opcode, but when packet identified as IPv6 packet, we
obtain next hop address from dedicated field nh6 in struct table_value.

Replace hopstore field in struct ip_fw_args with anonymous union and add
hopstore6 field. Use this field to copy tablearg value for IPv6.

Replace spare1 field in struct table_value with zoneid. Use it to keep
scope zone id for link-local IPv6 addresses. Since spare1 was used
internally, replace spare0 array with two variables spare0 and spare1.

Use getaddrinfo(3)/getnameinfo(3) functions for parsing and formatting
IPv6 addresses in table_value. Use zoneid field in struct table_value
to store sin6_scope_id value.

Since the kernel still uses embedded scope zone id to represent
link-local addresses, convert next_hop6 address into this form before
return from pfil processing. This also fixes in6_localip() check
for link-local addresses.

Differential Revision:	https://reviews.freebsd.org/D2015
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2015-03-13 09:03:25 +00:00
Alexander V. Chernikov
9f925e8a92 Fix IP_FW_NAT44_LIST_NAT size calculation.
Found by:	lev
Sponsored by:	Yandex LLC
2015-02-05 14:54:53 +00:00
Alexander V. Chernikov
0caab00959 * Make sure table algorithm destroy hook is always called without locks
* Explicitly lock freeing interface references in ta_destroy_ifidx
* Change ipfw_iface_unref() to require UH lock
* Add forgotten ipfw_iface_unref() to destroy_ifidx_locked()

PR:		kern/197276
Submitted by:	lev
Sponsored by:	Yandex LLC
2015-02-05 13:49:04 +00:00
Alexander V. Chernikov
0b47e42b49 Use ipfw runtime lock only when real modification is required. 2015-01-16 10:49:27 +00:00
Alexander V. Chernikov
a458ad86ee Remove unused 'struct route' fields. 2014-11-09 16:15:28 +00:00
Gleb Smirnoff
6df8a71067 Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.
Sponsored by:	Nginx, Inc.
2014-11-07 09:39:05 +00:00
Alexander V. Chernikov
038263c36a Remove unused variable.
Found by:	Coverity
CID:		1245739
2014-11-04 10:25:52 +00:00