Commit Graph

416 Commits

Author SHA1 Message Date
Andrey V. Elsukov
7664b71b62 Fix the bug introduced in r342908, that causes problems with dynamic
handling for protocols without ports numbers.

Since port numbers were uninitialized for protocols like ICMP/ICMPv6,
ipfw_chk() used some non-zero values to create dynamic states, and due
this it failed to match replies with created states.

Reported by:	Oliver Hartmann, Boris Lytochkin
Obtained from:	Yandex LLC
X-MFC after:	r342908
2019-01-29 11:18:41 +00:00
Andrey V. Elsukov
48266154de Relax requirement to packet size of CARP protocol and remove version check.
CARP shares protocol number 112 with VRRP (RFC 5798). And the size of
VRRP packet may be smaller than CARP. ipfw_chk() does m_pullup() to at
least sizeof(struct carp_header) and can fail when packet is VRRP. This
leads to packet drop and message about failed pullup attempt.
Also, RFC 5798 defines version 3 of VRRP protocol, this version number
also unsupported by CARP and such check leads to packet drop.

carp_input() does its own checks for protocol version and packet size,
so we can remove these checks to be able pass VRRP packets.

PR:		234207
MFC after:	1 week
2019-01-11 01:54:15 +00:00
Andrey V. Elsukov
3b1522c229 Fix the build with INVARIANTS.
MFC after:	1 month
2019-01-10 02:01:20 +00:00
Andrey V. Elsukov
1cdf23bc03 Reduce the size of struct ip_fw_args from 240 to 128 bytes on amd64.
And refactor the code to avoid unneeded initialization to reduce overhead
of per-packet processing.

ipfw(4) can be invoked by pfil(9) framework for each packet several times.
Each call uses on-stack variable of type struct ip_fw_args to keep the
state of ipfw(4) processing. Currently this variable has 240 bytes size
on amd64.  Each time ipfw(4) does bzero() on it, and then it initializes
some fields.

glebius@ has reported that they at Netflix discovered, that initialization
of this variable produces significant overhead on packet processing.
After patching I managed to increase performance of packet processing on
simple routing with ipfw(4) firewalling to about 11% from 9.8Mpps up to
11Mpps (Xeon E5-2660 v4@ + Mellanox 100G card).

Introduced new field flags, it is used to keep track of what fields was
initialized. Some fields were moved into the anonymous union, to reduce
the size. They all are mutually exclusive. dummypar field was unused, and
therefore it is removed.  The hopstore6 field type was changed from
sockaddr_in6 to a bit smaller struct ip_fw_nh6. And now the size of struct
ip_fw_args is 128 bytes.

ipfw_chk() was modified to properly handle ip_fw_args.flags instead of
rely on checking for NULL pointers.

Reviewed by:	gallatin
Obtained from:	Yandex LLC
MFC after:	1 month
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D18690
2019-01-10 01:47:57 +00:00
Andrey V. Elsukov
d66f9c86fa Add ability to request listing and deleting only for dynamic states.
This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but
after rules reloading some state must be deleted. Added new flag '-D'
for such purpose.

Retire '-e' flag, since there can not be expired states in the meaning
that this flag historically had.

Also add "verbose" mode for listing of dynamic states, it can be enabled
with '-v' flag and adds additional information to states list. This can
be useful for debugging.

Obtained from:	Yandex LLC
MFC after:	2 months
Sponsored by:	Yandex LLC
2018-12-04 16:12:43 +00:00
Andrey V. Elsukov
cefe3d67e2 Reimplement how net.inet.ip.fw.dyn_keep_states works.
Turning on of this feature allows to keep dynamic states when parent
rule is deleted. But it works only when the default rule is
"allow from any to any".

Now when rule with dynamic opcode is going to be deleted, and
net.inet.ip.fw.dyn_keep_states is enabled, existing states will reference
named objects corresponding to this rule, and also reference the rule.
And when ipfw_dyn_lookup_state() will find state for deleted parent rule,
it will return the pointer to the deleted rule, that is still valid.
This implementation doesn't support O_LIMIT_PARENT rules.

The refcnt field was added to struct ip_fw to keep reference, also
next pointer added to be able iterate rules and not damage the content
when deleted rules are chained.

Named objects are referenced only when states are going to be deleted to
be able reuse kidx of named objects when new parent rules will be
installed.

ipfw_dyn_get_count() function was modified and now it also looks into
dynamic states and constructs maps of existing named objects. This is
needed to correctly export orphaned states into userland.

ipfw_free_rule() was changed to be global, since now dynamic state can
free rule, when it is expired and references counters becomes 1.

External actions subsystem also modified, since external actions can be
deregisterd and instances can be destroyed. In these cases deleted rules,
that are referenced by orphaned states, must be modified to prevent access
to freed memory. ipfw_dyn_reset_eaction(), ipfw_reset_eaction_instance()
functions added for these purposes.

Obtained from:	Yandex LLC
MFC after:	2 months
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D17532
2018-12-04 16:01:25 +00:00
Andrey V. Elsukov
0df76496a6 Add assertion to check that named object has correct type.
Obtained from:	Yandex LLC
MFC after:	1 week
2018-12-04 15:12:28 +00:00
Andrey V. Elsukov
2636ba4d03 Do not limit the mbuf queue length for keepalive packets.
It was unlimited before overhaul, and one user reported that this limit
can be reached easily.

PR:		233562
MFC after:	1 week
2018-11-27 16:51:01 +00:00
Andrey V. Elsukov
b2b5660688 Add ability to use dynamic external prefix in ipfw_nptv6 module.
Now an interface name can be specified for nptv6 instance instead of
ext_prefix. The module will track if_addr_ext events and when suitable
IPv6 address will be added to specified interface, it will be configured
as external prefix. When address disappears instance becomes unusable,
i.e. it doesn't match any packets.

Reviewed by:	0mp (manpages)
Tested by:	Dries Michiels <driesm dot michiels gmail com>
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D17765
2018-11-12 11:20:59 +00:00
Eugene Grosbein
5310c19174 ipfw: implement ngtee/netgraph actions for layer-2 frames.
Kernel part of ipfw does not support and ignores rules other than
"pass", "deny" and dummynet-related for layer-2 (ethernet frames).
Others are processed as "pass".

Make it support ngtee/netgraph rules just like they are supported
for IP packets. For example, this allows us to mirror some frames
selectively to another interface for delivery to remote network analyzer
over RSPAN vlan. Assuming ng_ipfw(4) netgraph node has a hook named "900"
attached to "lower" hook of vlan900's ng_ether(4) node, that would be
as simple as:

ipfw add ngtee 900 ip from any to 8.8.8.8 layer2 out xmit igb0

PR:		213452
MFC after:	1 month
Tested-by:	Fyodor Ustinov <ufm@ufm.su>
2018-10-27 07:32:26 +00:00
Andrey V. Elsukov
ab108c4b07 Do not decrement RST life time if keep_alive is not turned on.
This allows use differen values configured by user for sysctl variable
net.inet.ip.fw.dyn_rst_lifetime.

Obtained from:	Yandex LLC
MFC after:	3 weeks
Sponsored by:	Yandex LLC
2018-10-21 16:44:57 +00:00
Andrey V. Elsukov
2ffadd56f5 Call inet_ntop() only when its result is needed.
Obtained from:	Yandex LLC
MFC after:	3 weeks
Sponsored by:	Yandex LLC
2018-10-21 16:37:53 +00:00
Andrey V. Elsukov
aa2715612c Retire IPFIREWALL_NAT64_DIRECT_OUTPUT kernel option. And add ability
to switch the output method in run-time. Also document some sysctl
variables that can by changed for NAT64 module.

NAT64 had compile time option IPFIREWALL_NAT64_DIRECT_OUTPUT to use
if_output directly from nat64 module. By default is used netisr based
output method. Now both methods can be used, but they require different
handling by rules.

Obtained from:	Yandex LLC
MFC after:	3 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D16647
2018-10-21 16:29:12 +00:00
Andrey V. Elsukov
986368d85d Add extra parentheses to fix "versrcreach" opcode, (oif != NULL) should
not be used as condition for ternary operator.

Submitted by:	Tatsuki Makino <tatsuki_makino at hotmail dot com>
Approved by:	re (kib)
MFC after:	1 week
2018-10-15 10:25:34 +00:00
Luiz Otavio O Souza
a0376d4d29 Fix a typo in comment.
MFC after:	3 days
X-MFC with:	r321316
Sponsored by:	Rubicon Communications, LLC (Netgate)
2018-08-15 16:36:29 +00:00
Andrey V. Elsukov
5c4aca8218 Use host byte order when comparing mss values.
This fixes tcp-setmss action on little endian machines.

PR:		225536
Submitted by:	John Zielinski
2018-08-08 17:32:02 +00:00
Andrew Turner
5f901c92a8 Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by:	bz
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16147
2018-07-24 16:35:52 +00:00
Andrey V. Elsukov
0a2c13d333 Use correct size when we are allocating array for skipto index.
Also, there is no need to use M_ZERO for idxmap_back. It will be
re-filled just after allocation in update_skipto_cache().

PR:		229665
MFC after:	1 week
2018-07-12 11:38:18 +00:00
Andrey V. Elsukov
f7c4fdee1a Add "record-state", "set-limit" and "defer-action" rule options to ipfw.
"record-state" is similar to "keep-state", but it doesn't produce implicit
O_PROBE_STATE opcode in a rule. "set-limit" is like "limit", but it has the
same feature as "record-state", it is single opcode without implicit
O_PROBE_STATE opcode. "defer-action" is targeted to be used with dynamic
states. When rule with this opcode is matched, the rule's action will
not be executed, instead dynamic state will be created. And when this
state will be matched by "check-state", then rule action will be executed.
This allows create a more complicated rulesets.

Submitted by:	lev
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D1776
2018-07-09 11:35:18 +00:00
Andrew Turner
2bf9501287 Create a new macro for static DPCPU data.
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation to fix this create two macros depending on if the data is
global or static.

Reviewed by:	bz, emaste, markj
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D16140
2018-07-05 17:13:37 +00:00
Andrey V. Elsukov
20efcfc602 Switch RIB and RADIX_NODE_HEAD lock from rwlock(9) to rmlock(9).
Using of rwlock with multiqueue NICs for IP forwarding on high pps
produces high lock contention and inefficient. Rmlock fits better for
such workloads.

Reviewed by:	melifaro, olivier
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D15789
2018-06-16 08:26:23 +00:00
Mateusz Guzik
4e180881ae uma: implement provisional api for per-cpu zones
Per-cpu zone allocations are very rarely done compared to regular zones.
The intent is to avoid pessimizing the latter case with per-cpu specific
code.

In particular contrary to the claim in r334824, M_ZERO is sometimes being
used for such zones. But the zeroing method is completely different and
braching on it in the fast path for regular zones is a waste of time.
2018-06-08 21:40:03 +00:00
Andrey V. Elsukov
67ad3c0bf9 Restore the ability to keep states after parent rule deletion.
This feature is disabled by default and was removed when dynamic states
implementation changed to be lockless. Now it is reimplemented with small
differences - when dyn_keep_states sysctl variable is enabled,
dyn_match_ipv[46]_state() function doesn't match child states of deleted
rule. And thus they are keept alive until expired. ipfw_dyn_lookup_state()
function does check that state was not orphaned, and if so, it returns
pointer to default_rule and its position in the rules map. The main visible
difference is that orphaned states still have the same rule number that
they have before parent rule deleted, because now a state has many fields
related to rule and changing them all atomically to point to default_rule
seems hard enough.

Reported by:	<lantw44 at gmail.com>
MFC after:	2 days
2018-05-22 13:28:05 +00:00
Andrey V. Elsukov
4bb8a5b0c9 Remove check for matching the rulenum, ruleid and rule pointer from
dyn_lookup_ipv[46]_state_locked(). These checks are remnants of not
ready to be committed code, and they are there by accident.
Due to the race these checks can lead to creating of duplicate states
when concurrent threads in the same time will try to add state for two
packets of the same flow, but in reverse directions and matched by
different parent rules.

Reported by:	lev
MFC after:	3 days
2018-05-21 16:19:00 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Andrey V. Elsukov
782360dec3 Bring in some last changes in NAT64 implementation:
o Modify ipfw(8) to be able set any prefix6 not just Well-Known,
  and also show configured prefix6;
o relocate some definitions and macros into proper place;
o convert nat64_debug and nat64_allow_private variables to be
  VNET-compatible;
o add struct nat64_config that keeps generic configuration needed
  to NAT64 code;
o add nat64_check_prefix6() function to check validness of specified
  by user IPv6 prefix according to RFC6052;
o use nat64_check_private_ip4() and nat64_embed_ip4() functions
  instead of nat64_get_ip4() and nat64_set_ip4() macros. This allows
  to use any configured IPv6 prefixes that are allowed by RFC6052;
o introduce NAT64_WKPFX flag, that is set when IPv6 prefix is
  Well-Known IPv6 prefix. It is used to reduce overhead to check this;
o modify nat64lsn_cfg and nat64stl_cfg structures to use nat64_config
  structure. And respectivelly modify the rest of code;
o remove now unused ro argument from nat64_output() function;
o remove __FreeBSD_version ifdef, NAT64 was not merged to older versions;
o add commented -DIPFIREWALL_NAT64_DIRECT_OUTPUT flag to module's Makefile
  as example.

Obtained from:	Yandex LLC
MFC after:	1 month
Sponsored by:	Yandex LLC
2018-05-09 11:59:24 +00:00
Andrey V. Elsukov
5f69d0a4ff To avoid possible deadlock do not acquire JQUEUE_LOCK before callout_drain.
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2018-04-13 10:03:30 +00:00
Andrey V. Elsukov
2d8fcffb99 Fix integer types mismatch for flags field in nat64stl_cfg structure.
Also preserve internal flags on NAT64STL reconfiguration.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2018-04-12 21:29:40 +00:00
Andrey V. Elsukov
eed302572a Use cfg->nomatch_verdict as return value from NAT64LSN handler when
given mbuf is considered as not matched.

If mbuf was consumed or freed during handling, we must return
IP_FW_DENY, since ipfw's pfil handler ipfw_check_packet() expects
IP_FW_DENY when mbuf pointer is NULL. This fixes KASSERT panics
when NAT64 is used with INVARIANTS. Also remove unused nomatch_final
field from struct nat64lsn_cfg.

Reported by:	Justin Holcomb <justin at justinholcomb dot me>
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2018-04-12 21:13:30 +00:00
Andrey V. Elsukov
c570565f12 Migrate NAT64 to FIB KPI.
Obtained from:	Yandex LLC
MFC after:	1 week
2018-04-12 21:05:20 +00:00
Oleg Bulyzhin
3995ad1768 Fix ipfw table creation when net.inet.ip.fw.tables_sets = 0 and non zero set
specified on table creation. This fixes following:

# sysctl net.inet.ip.fw.tables_sets
net.inet.ip.fw.tables_sets: 0
# ipfw table all info
# ipfw set 1 table 1 create type addr
# ipfw set 1 table 1 create type addr
# ipfw add 10 set 1 count ip from table\(1\) to any
00010 count ip from table(1) to any
# ipfw add 10 set 1 count ip from table\(1\) to any
00010 count ip from table(1) to any
# ipfw table all info
--- table(1), set(1) ---
 kindex: 4, type: addr
 references: 1, valtype: legacy
 algorithm: addr:radix
 items: 0, size: 296
--- table(1), set(1) ---
 kindex: 3, type: addr
 references: 1, valtype: legacy
 algorithm: addr:radix
 items: 0, size: 296
--- table(1), set(1) ---
 kindex: 2, type: addr
 references: 0, valtype: legacy
 algorithm: addr:radix
 items: 0, size: 296
--- table(1), set(1) ---
 kindex: 1, type: addr
 references: 0, valtype: legacy
 algorithm: addr:radix
 items: 0, size: 296
#

MFC after:	1 week
2018-04-11 11:12:20 +00:00
Andrey V. Elsukov
12c080e613 Do not try to reassemble IPv6 fragments in "reass" rule.
ip_reass() expects IPv4 packet and will just corrupt any IPv6 packets
that it gets. Until proper IPv6 fragments handling function will be
implemented, pass IPv6 packets to next rule.

PR:		170604
MFC after:	1 week
2018-03-12 09:40:46 +00:00
Andrey V. Elsukov
99493f5a4a Remove duplicate #include <netinet/ip_var.h>. 2018-02-07 19:12:05 +00:00
Andrey V. Elsukov
b99a682320 Rework ipfw dynamic states implementation to be lockless on fast path.
o added struct ipfw_dyn_info that keeps all needed for ipfw_chk and
  for dynamic states implementation information;
o added DYN_LOOKUP_NEEDED() macro that can be used to determine the
  need of new lookup of dynamic states;
o ipfw_dyn_rule now becomes obsolete. Currently it used to pass
  information from kernel to userland only.
o IPv4 and IPv6 states now described by different structures
  dyn_ipv4_state and dyn_ipv6_state;
o IPv6 scope zones support is added;
o ipfw(4) now depends from Concurrency Kit;
o states are linked with "entry" field using CK_SLIST. This allows
  lockless lookup and protected by mutex modifications.
o the "expired" SLIST field is used for states expiring.
o struct dyn_data is used to keep generic information for both IPv4
  and IPv6;
o struct dyn_parent is used to keep O_LIMIT_PARENT information;
o IPv4 and IPv6 states are stored in different hash tables;
o O_LIMIT_PARENT states now are kept separately from O_LIMIT and
  O_KEEP_STATE states;
o per-cpu dyn_hp pointers are used to implement hazard pointers and they
  prevent freeing states that are locklessly used by lookup threads;
o mutexes to protect modification of lists in hash tables now kept in
  separate arrays. 65535 limit to maximum number of hash buckets now
  removed.
o Separate lookup and install functions added for IPv4 and IPv6 states
  and for parent states.
o By default now is used Jenkinks hash function.

Obtained from:	Yandex LLC
MFC after:	42 days
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D12685
2018-02-07 18:59:54 +00:00
Andrey V. Elsukov
14a6bab1da When IPv6 packet is handled by O_REJECT opcode, convert ICMP code
specified in the arg1 into ICMPv6 destination unreachable code according
to RFC7915.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2018-01-24 12:40:28 +00:00
Pedro F. Giffuni
d821d36419 Unsign some values related to allocation.
When allocating memory through malloc(9), we always expect the amount of
memory requested to be unsigned as a negative value would either stand for
an error or an overflow.
Unsign some values, found when considering the use of mallocarray(9), to
avoid unnecessary casting. Also consider that indexes should be of
at least the same size/type as the upper limit they pretend to index.

MFC after:	3 weeks
2018-01-22 02:08:10 +00:00
Andrey V. Elsukov
d38344208e Add UDPLite support to ipfw(4).
Now it is possible to use UDPLite's port numbers in rules,
create dynamic states for UDPLite packets and see "UDPLite" for matched
packets in log.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2018-01-19 12:50:03 +00:00
Jeff Roberson
3f289c3fcf Implement 'domainset', a cpuset based NUMA policy mechanism. This allows
userspace to control NUMA policy administratively and programmatically.

Implement domainset based iterators in the page layer.

Remove the now legacy numa_* syscalls.

Cleanup some header polution created by having seq.h in proc.h.

Reviewed by:	markj, kib
Discussed with:	alc
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13403
2018-01-12 22:48:23 +00:00
Pedro F. Giffuni
454529cd0b netpfil/ipfw: Make some use of mallocarray(9).
Reviewed by:	kp, ae
Differential Revision: https://reviews.freebsd.org/D13834
2018-01-11 15:29:29 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Michael Tuexen
665c8a2ee5 Add to ipfw support for sending an SCTP packet containing an ABORT chunk.
This is similar to the TCP case. where a TCP RST segment can be sent.

There is one limitation: When sending an ABORT in response to an incoming
packet, it should be tested if there is no ABORT chunk in the received
packet. Currently, it is only checked if the first chunk is an ABORT
chunk to avoid parsing the whole packet, which could result in a DOS attack.

Thanks to Timo Voelker for helping me to test this patch.
Reviewed by: bcr@ (man page part), ae@ (generic, non-SCTP part)
Differential Revision:	https://reviews.freebsd.org/D13239
2017-11-26 18:19:01 +00:00
Andrey V. Elsukov
1719df1bb4 Modify ipfw's dynamic states KPI.
Hide the locking logic used in the dynamic states implementation from
generic code. Rename ipfw_install_state() and ipfw_lookup_dyn_rule()
function to have similar names: ipfw_dyn_install_state() and
ipfw_dyn_lookup_state(). Move dynamic rule counters updating to the
ipfw_dyn_lookup_state() function. Now this function return NULL when
there is no state and pointer to the parent rule when state is found.
Thus now there is no need to return pointer to dynamic rule, and no need
to hold bucket lock for this state. Remove ipfw_dyn_unlock() function.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D11657
2017-11-23 08:02:02 +00:00
Andrey V. Elsukov
9d15540022 Check that address family of state matches address family of packet.
If it is not matched avoid comparing other state fields.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-23 07:05:25 +00:00
Andrey V. Elsukov
30df59d581 Move ipfw_send_pkt() from ip_fw_dynamic.c into ip_fw2.c.
It is not specific for dynamic states function and called also from
generic code.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-23 06:04:57 +00:00
Andrey V. Elsukov
288bf455bb Rework rule ranges matching. Use comparison rule id with UINT32_MAX to
match all rules with the same rule number.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-23 05:55:53 +00:00
Andrey V. Elsukov
7143bb7626 Add ipfw_add_protected_rule() function that creates rule with 65535
number in the reserved set 31. Use this function to create default rule.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-22 05:49:21 +00:00
Andrey V. Elsukov
66f84fabb3 Add comment for accidentally committed unrelated change in r325960.
Do not invoke IPv4 NAT handler for non IPv4 packets. Libalias expects
a packet is IPv4. And in case when it is IPv6, it just translates them
as IPv4. This leads to corruption and in some cases to panics.
In particular a panic can happen when value of ip6_plen modified to
something that leads to IP fragmentation, but actual packet length does
not match the IP length.

Packets that are not IPv4 will be dropped by NAT rule.

Reported by:	Viktor Dukhovni <freebsd at dukhovni dot org>
MFC after:	1 week
2017-11-17 23:25:06 +00:00
Andrey V. Elsukov
e11f0a0c4c Unconditionally enable support for O_IPSEC opcode.
IPsec support can be loaded as kernel module, thus do not depend from
kernel option IPSEC and always build O_IPSEC opcode implementation as
enabled.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-17 22:40:02 +00:00
Don Lewis
4001fcbe0a Fix Dummynet AQM packet marking function ecn_mark() and fq_codel /
fq_pie schedulers packet classification functions in layer2 (bridge mode).

Dummynet AQM packet marking function ecn_mark() and fq_codel/fq_pie
schedulers packet classification functions (fq_codel_classify_flow()
and fq_pie_classify_flow()) assume mbuf is pointing at L3 (IP)
packet. However, this assumption is incorrect if ipfw/dummynet is
used to manage layer2 traffic (bridge mode) since mbuf will point
at L2 frame.  This patch solves this problem by identifying the
source of the frame/packet (L2 or L3) and adding ETHER_HDR_LEN
offset when converting an mbuf pointer to ip pointer if the traffic
is from layer2.  More specifically, in dummynet packet tagging
function, tag_mbuf(), iphdr_off is set to ETHER_HDR_LEN if the
traffic is from layer2 and set to zero otherwise. Whenever an access
to IP header is required, mtodo(m, dn_tag_get(m)->iphdr_off) is
used instead of mtod(m, struct ip *) to correctly convert mbuf
pointer to ip pointer in both L2 and L3 traffic.

Submitted by:	lstewart
MFC after:	2 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D12506
2017-10-26 10:11:35 +00:00
Andrey V. Elsukov
5c70ebfa57 Add IPv6 support for O_TCPDATALEN opcode.
PR:		222746
MFC after:	1 week
2017-10-24 08:39:05 +00:00
Andrey V. Elsukov
ff0a137952 Fix regression in handling O_FORWARD_IP opcode after r279948.
To properly handle 'fwd tablearg,port' opcode, copy sin_port value from
sockaddr_in structure stored in the opcode into corresponding hopstore
field.

PR:		222953
MFC after:	1 week
2017-10-13 11:11:53 +00:00
Michael Tuexen
945906384d Fix a bug which avoided that rules for matching port numbers for SCTP
packets where actually matched.
While there, make clean in the man-page that SCTP port numbers are
supported in rules.

MFC after:	1 month
2017-10-02 18:25:30 +00:00
Andrey V. Elsukov
5df8171da3 Use in_localip() function instead of unlocked access to addresses hash
to determine that an address is our local.

PR:		220078
MFC after:	1 week
2017-09-20 22:35:28 +00:00
Andrey V. Elsukov
369bc48dc5 Do not acquire IPFW_WLOCK when a named object is created and destroyed.
Acquiring of IPFW_WLOCK is requried for cases when we are going to
change some data that can be accessed during processing of packets flow.
When we create new named object, there are not yet any rules, that
references it, thus holding IPFW_UH_WLOCK is enough to safely update
needed structures. When we destroy an object, we do this only when its
reference counter becomes zero. And it is safe to not acquire IPFW_WLOCK,
because noone references it. The another case is when we failed to finish
some action and thus we are doing rollback and destroying an object, in
this case it is still not referenced by rules and no need to acquire
IPFW_WLOCK.

This also fixes panic with INVARIANTS due to recursive IPFW_WLOCK acquiring.

MFC after:	1 week
Sponsored by:	Yandex LLC
2017-09-20 22:00:06 +00:00
Luiz Otavio O Souza
9ffd0f54a7 Fix a couple of typos in a comment.
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-07-21 03:04:55 +00:00
Philip Paeps
b0e1660d53 Fix GRE over IPv6 tunnels with IPFW
Previously, GRE packets in IPv6 tunnels would be dropped by IPFW (unless
net.inet6.ip6.fw.deny_unknown_exthdrs was unset).

PR:		220640
Submitted by:	Kun Xie <kxie@xiplink.com>
MFC after:	1 week
2017-07-13 09:01:22 +00:00
Andrey V. Elsukov
785c0d4d97 Fix IPv6 extension header parsing. The length field doesn't include the
first 8 octets.

Obtained from:	Yandex LLC
MFC after:	3 days
2017-06-29 19:06:43 +00:00
Don Lewis
d196c9ee16 Fix the queue delay estimation in PIE/FQ-PIE when the timestamp
(TS) method is used.  When packet timestamp is used, the "current_qdelay"
keeps storing the last queue delay value calculated in the dequeue
function.  Therefore, when a burst of packets arrives followed by
a pause, the "current_qdelay" will store a high value caused by the
burst and stick to that value during the pause because the queue
delay measurement is done inside the dequeue function.  This causes
the drop probability calculation function to calculate high drop
probability value instead of zero and prevents the burst allowance
mechanism from working properly.  Fix this problem by resetting
"current_qdelay" inside the drop probability calculation function
when the queue length is zero and TS option is used.

Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	1 week
2017-05-19 08:38:03 +00:00
Don Lewis
36fb8be630 The result of right shifting a negative signed value is implementation
defined.  On machines without arithmetic shift instructions, zero bits
may be shifted in from the left, giving a large positive result instead
of the desired divide-by power-of-2.  Fix this by operating on the
absolute value and compensating for the possible negation later.

Reverse the order of the underflow/overflow tests and the exponential
decay calculation to avoid the possibility of an erroneous overflow
detection if p is a sufficiently small non-negative value.  Also
check for negative values of prob before doing the exponential decay
to avoid another instance of of right shifting a negative value.

Tested by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	1 week
2017-05-19 01:23:06 +00:00
Andrey V. Elsukov
da62ffd9cd Avoid undefined behavior.
The 'pktid' variable is modified while being used twice between
sequence points, probably due to htonl() is macro.

Reported by:	PVS-Studio
MFC after:	1 week
2017-04-14 11:58:41 +00:00
Andrey V. Elsukov
ba3e1361b0 Use address of specific union member instead of whole union address to
fix PVS-Studio warnings.

MFC after:	1 week
2017-04-14 11:41:09 +00:00
Andrey V. Elsukov
1ca7c3b815 The rule field in the ipfw_dyn_rule structure is used as storage
to pass rule number and rule set to userland. In r272840 the kernel
internal rule representation was changed and the rulenum field of
struct ip_fw_rule got the type uint32_t, but userlevel representation
still have the type uint16_t. To not overflow the size of pointer
on the systems with 32-bit pointer size use separate variable to
copy rulenum and set.

Reported by:	PVS-Studio
MFC after:	1 week
2017-04-14 11:19:09 +00:00
Maxim Konovalov
f91eb6adad o Redundant assignments removed.
Found by:	PVS-Stdio, V519
Reviewed by:	ae
2017-04-13 18:13:10 +00:00
Conrad Meyer
bcd8d3b805 dummynet: Use strlcpy to appease static checkers
Some dummynet modules used strcpy() to copy from a larger buffer
(dn_aqm->name) to a smaller buffer (dn_extra_parms->name).  It happens that
the lengths of the strings in the dn_aqm buffers were always hardcoded to be
smaller than the dn_extra_parms buffer ("CODEL", "PIE").

Use strlcpy() instead, to appease static checkers.  No functional change.

Reported by:	Coverity
CIDs:		1356163, 1356165
Sponsored by:	Dell EMC Isilon
2017-04-13 17:47:44 +00:00
Andrey V. Elsukov
88d950a650 Remove "IPFW static rules" rmlock.
Make PFIL's lock global and use it for this purpose.
This reduces the number of locks needed to acquire for each packet.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
No objection from: #network
Differential Revision:	https://reviews.freebsd.org/D10154
2017-04-03 13:35:04 +00:00
Andrey V. Elsukov
aac74aeac7 Add ipfw_pmod kernel module.
The module is designed for modification of a packets of any protocols.
For now it implements only TCP MSS modification. It adds the external
action handler for "tcp-setmss" action.

A rule with tcp-setmss action does additional check for protocol and
TCP flags. If SYN flag is present, it parses TCP options and modifies
MSS option if its value is greater than configured value in the rule.
Then it adjustes TCP checksum if needed. After handling the search
continues with the next rule.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Yandex LLC
No objection from: #network
Differential Revision:	https://reviews.freebsd.org/D10150
2017-04-03 03:07:48 +00:00
Andrey V. Elsukov
11c56650f0 Add O_EXTERNAL_DATA opcode support.
This opcode can be used to attach some data to external action opcode.
And unlike to O_EXTERNAL_INSTANCE opcode, this opcode does not require
creating of named instance to pass configuration arguments to external
action handler. The data is coming just next to O_EXTERNAL_ACTION opcode.

The userlevel part currenly supports formatting for opcode with ipfw_insn
size, by default it expects u16 numeric value in the arg1.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2017-04-03 02:44:40 +00:00
Andrey V. Elsukov
399ad57874 Add the log formatting for an external action opcode.
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2017-04-03 02:26:30 +00:00
Andrey V. Elsukov
788e62864f Reset the cached state of last lookup in the dynamic states when an
external action is completed, but the rule search is continued.

External action handler can change the content of @args argument,
that is used for dynamic state lookup. Enforce the new lookup to be able
install new state, when the search is continued.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-03-31 09:26:08 +00:00
Don Lewis
46c8aadb6f Change several constants used by the PIE algorithm from unsigned to signed.
- PIE_MAX_PROB is compared to variable of int64_t and the type promotion
   rules can cause the value of that variable to be treated as unsigned.
   If the value is actually negative, then the result of the comparsion
   is incorrect, causing the algorithm to perform poorly in some
   situations.  Changing the constant to be signed cause the comparision
   to work correctly.

 - PIE_SCALE is also compared to signed values.  Fortunately they are
   also compared to zero and negative values are discarded so this is
   more of a cosmetic fix.

 - PIE_DQ_THRESHOLD is only compared to unsigned values, but it is small
   enough that the automatic promotion to unsigned is harmless.

Submitted by:	Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after:	1 week
2017-03-18 23:00:13 +00:00
Andrey V. Elsukov
3667f39ea3 Use memset with structure size. 2017-03-14 07:57:33 +00:00
Conrad Meyer
49b6a5d60a nat64lsn: Use memset() with structure, not pointer, size
PR:		217738
Submitted by:	Svyatoslav <razmyslov at viva64.com>
Sponsored by:	Viva64 (PVS-Studio)
2017-03-13 17:53:46 +00:00
Maxim Konovalov
f621c2cd39 o Typo in the comment fixed.
PR:		217617
Submitted by:	lutz
2017-03-09 09:54:23 +00:00
Andrey V. Elsukov
53de37f8ca Fix the build. Use new ipfw_lookup_table() in the nat64 too.
Reported by:	cy
MFC after:	2 weeks
2017-03-06 00:41:59 +00:00
Andrey V. Elsukov
54e5669d8c Add IPv6 support to O_IP_DST_LOOKUP opcode.
o check the size of O_IP_SRC_LOOKUP opcode, it can not exceed the size of
  ipfw_insn_u32;
o rename ipfw_lookup_table_extended() function into ipfw_lookup_table() and
  remove old ipfw_lookup_table();
o use args->f_id.flow_id6 that is in host byte order to get DSCP value;
o add SCTP ports support to 'lookup src/dst-port' opcode;
o add IPv6 support to 'lookup src/dst-ip' opcode.

PR:		217292
Reviewed by:	melifaro
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9873
2017-03-05 23:48:24 +00:00
Andrey V. Elsukov
c750a56914 Reject invalid object types that can not be used with specific opcodes.
When we doing reference counting of named objects in the new rule,
for existing objects check that opcode references to correct object,
otherwise return EINVAL.

PR:		217391
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-03-05 22:19:43 +00:00
Andrey V. Elsukov
43b294a4db Fix matching table entry value. Use real table value instead of its index
in valuestate array.

When opcode has size equal to ipfw_insn_u32, this means that it should
additionally match value specified in d[0] with table entry value.
ipfw_table_lookup() returns table value index, use TARG_VAL() macro to
convert it to its value. The actual 32-bit value stored in the tag field
of table_value structure, where all unspecified u32 values are kept.

PR:		217262
Reviewed by:	melifaro
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-03-03 20:22:42 +00:00
Andrey V. Elsukov
576429f04b Fix NPTv6 rule counters when one_pass is not enabled.
Consider the rule matching when both @done and @retval values
returned from ipfw_run_eaction() are zero. And modify ipfw_nptv6()
to return IP_FW_DENY and @done=0 when addresses do not match.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2017-03-01 20:00:19 +00:00
Pedro F. Giffuni
e099b90b80 sys: Replace zero with NULL for pointers.
Found with:	devel/coccinelle
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D9694
2017-02-22 02:35:59 +00:00
Eric van Gyzen
8144690af4 Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.

Suggested by:	glebius, emaste
Reviewed by:	gnn
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9625
2017-02-16 20:47:41 +00:00
Enji Cooper
bc64f428ad Fix typos in comments (returing -> returning)
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-02-07 00:09:48 +00:00
Andrey V. Elsukov
ce3a6cf06a Initialize IPFW static rules rmlock with RM_RECURSE flag.
This lock was replaced from rwlock in r272840. But unlike rwlock, rmlock
doesn't allow recursion on rm_rlock(), so at this time fix this with
RM_RECURSE flag. Later we need to change ipfw to avoid such recursions.

PR:		216171
Reported by:	Eugene Grosbein
MFC after:	1 week
2017-01-17 10:50:28 +00:00
Marius Strobl
0ac43d9728 In dummynet(4), random chunks of memory are casted to struct dn_*,
potentially leading to fatal unaligned accesses on architectures with
strict alignment requirements. This change fixes dummynet(4) as far
as accesses to 64-bit members of struct dn_* are concerned, tripping
up on sparc64 with accesses to 32-bit members happening to be correctly
aligned there. In other words, this only fixes the tip of the iceberg;
larger parts of dummynet(4) still need to be rewritten in order to
properly work on all of !x86.
In principle, considering the amount of code in dummynet(4) that needs
this erroneous pattern corrected, an acceptable workaround would be to
declare all struct dn_* packed, forcing compilers to do byte-accesses
as a side-effect. However, given that the structs in question aren't
laid out well either, this would break ABI/KBI.
While at it, replace all existing bcopy(9) calls with memcpy(9) for
performance reasons, as there is no need to check for overlap in these
cases.

PR:		189219
MFC after:	5 days
2017-01-09 20:51:51 +00:00
Andrey V. Elsukov
02784f106e Convert result of hash_packet6() into host byte order.
For IPv4 similar function uses addresses and ports in host byte order,
but for IPv6 it used network byte order. This led to very bad hash
distribution for IPv6 flows. Now the result looks similar to IPv4.

Reported by:	olivier
MFC after:	1 week
Sponsored by:	Yandex LLC
2016-12-06 23:52:56 +00:00
Andrey V. Elsukov
c5f2dbb625 Fix ICMPv6 Time Exceeded error message translation.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-11-26 10:04:05 +00:00
Luiz Otavio O Souza
e40145851b Remove the mbuf tag after use (for reinjected packets).
Fixes the packet processing in dummynet l2 rules.

Obtained from:	pfSense
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (Netgate)
2016-11-03 00:26:58 +00:00
Luiz Otavio O Souza
3e80a649fb Stop abusing from struct ifnet presence to determine the packet direction
for dummynet, use the correct argument for that, remove the false coment
about the presence of struct ifnet.

Fixes the input match of dummynet l2 rules.

Obtained from:	pfSense
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (Netgate)
2016-11-01 18:42:44 +00:00
Andrey V. Elsukov
308f2c6d56 Fix ipfw table lookup handler to return entry value, but not its index.
Submitted by:	loos
MFC after:	1 week
2016-10-19 11:51:17 +00:00
Andrey V. Elsukov
0d9cbb874c Move opcode rewriter init and destroy handlers into non-VENT code.
PR:		212576,212649,212077
Submitted by:	John Zielinski
MFC after:	1 week
2016-09-18 17:35:17 +00:00
Andrey V. Elsukov
70c1466dad Fix swap tables between sets when this functional is enabled.
We have 6 opcode rewriters for table opcodes. When `set swap' command
invoked, it is called for each rewriter, so at the end we get the same
result, because opcode rewriter uses ETLV type to match opcode. And all
tables opcodes have the same ETLV type. To solve this problem, use
separate sets handler for one opcode rewriter. Use it to handle TEST_ALL,
SWAP_ALL and MOVE_ALL commands.

PR:		212630
MFC after:	1 week
2016-09-13 18:16:15 +00:00
Bjoern A. Zeeb
db68f7839f Try to fix gcc compilation errors (which are right).
nat64_getlasthdr() returns an int, which can be -1 in case of error,
storing the result in an uint8_t and then comparing to < 0 is not
helpful.  Do what is done in the rest of the code and make proto an
int here as well.
2016-08-18 10:26:15 +00:00
Oleg Bulyzhin
e7560c836f Fix command: ipfw set (enable|disable) N (where N > 4).
enable_sets() expects set bitmasks, not set numbers.

MFC after:	3 days
2016-08-15 13:06:29 +00:00
Andrey V. Elsukov
ecd3637584 Use %ju to print unsigned 64-bit value.
Reported by:	kib
2016-08-13 22:14:16 +00:00
Andrey V. Elsukov
57fb3b7a78 Add stats reset command implementation to NPTv6 module
to be able reset statistics counters.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-08-13 16:45:14 +00:00
Andrey V. Elsukov
c402a01b03 Replace __noinline with special debug macro NAT64NOINLINE. 2016-08-13 16:26:15 +00:00
Andrey V. Elsukov
d8caf56e9e Add ipfw_nat64 module that implements stateless and stateful NAT64.
The module works together with ipfw(4) and implemented as its external
action module.

Stateless NAT64 registers external action with name nat64stl. This
keyword should be used to create NAT64 instance and to address this
instance in rules. Stateless NAT64 uses two lookup tables with mapped
IPv4->IPv6 and IPv6->IPv4 addresses to perform translation.

A configuration of instance should looks like this:
 1. Create lookup tables:
 # ipfw table T46 create type addr valtype ipv6
 # ipfw table T64 create type addr valtype ipv4
 2. Fill T46 and T64 tables.
 3. Add rule to allow neighbor solicitation and advertisement:
 # ipfw add allow icmp6 from any to any icmp6types 135,136
 4. Create NAT64 instance:
 # ipfw nat64stl NAT create table4 T46 table6 T64
 5. Add rules that matches the traffic:
 # ipfw add nat64stl NAT ip from any to table(T46)
 # ipfw add nat64stl NAT ip from table(T64) to 64:ff9b::/96
 6. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
    via NAT64 host.

Stateful NAT64 registers external action with name nat64lsn. The only
one option required to create nat64lsn instance - prefix4. It defines
the pool of IPv4 addresses used for translation.

A configuration of instance should looks like this:
 1. Add rule to allow neighbor solicitation and advertisement:
 # ipfw add allow icmp6 from any to any icmp6types 135,136
 2. Create NAT64 instance:
 # ipfw nat64lsn NAT create prefix4 A.B.C.D/28
 3. Add rules that matches the traffic:
 # ipfw add nat64lsn NAT ip from any to A.B.C.D/28
 # ipfw add nat64lsn NAT ip6 from any to 64:ff9b::/96
 4. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
    via NAT64 host.

Obtained from:	Yandex LLC
Relnotes:	yes
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D6434
2016-08-13 16:09:49 +00:00
Andrey V. Elsukov
6951cecf71 Add three helper function to manage tables from external modules.
ipfw_objhash_lookup_table_kidx does lookup kernel index of table;
ipfw_ref_table/ipfw_unref_table takes and releases reference to table.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-08-13 15:48:56 +00:00
Andrey V. Elsukov
56132dcc0d Move logging via BPF support into separate file.
* make interface cloner VNET-aware;
* simplify cloner code and use if_clone_simple();
* migrate LOGIF_LOCK() to rmlock;
* add ipfw_bpf_mtap2() function to pass mbuf to BPF;
* introduce new additional ipfwlog0 pseudo interface. It differs from
  ipfw0 by DLT type used in bpfattach. This interface is intended to
  used by ipfw modules to dump packets with additional info attached.
  Currently pflog format is used. ipfw_bpf_mtap2() function uses second
  argument to determine which interface use for dumping. If dlen is equal
  to ETHER_HDR_LEN it uses old ipfw0 interface, if dlen is equal to
  PFLOG_HDRLEN - ipfwlog0 will be used.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2016-08-13 15:41:04 +00:00
Andrey V. Elsukov
d6eb9b0249 Restore "nat global" support.
Now zero value of arg1 used to specify "tablearg", use the old "tablearg"
value for "nat global". Introduce new macro IP_FW_NAT44_GLOBAL to replace
hardcoded magic number to specify "nat global". Also replace 65535 magic
number with corresponding macro. Fix typo in comments.

PR:		211256
Tested by:	Victor Chernov
MFC after:	3 days
2016-08-11 10:10:10 +00:00
Konstantin Belousov
584b675ed6 Hide the boottime and bootimebin globals, provide the getboottime(9)
and getboottimebin(9) KPI. Change consumers of boottime to use the
KPI.  The variables were renamed to avoid shadowing issues with local
variables of the same name.

Issue is that boottime* should be adjusted from tc_windup(), which
requires them to be members of the timehands structure.  As a
preparation, this commit only introduces the interface.

Some uses of boottime were found doubtful, e.g. NLM uses boottime to
identify the system boot instance.  Arguably the identity should not
change on the leap second adjustment, but the commit is about the
timekeeping code and the consumers were kept bug-to-bug compatible.

Tested by:	pho (as part of the bigger patch)
Reviewed by:	jhb (same)
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:08:59 +00:00