Commit Graph

445 Commits

Author SHA1 Message Date
Mark Johnston
bc6a2267ff ipfw: Update the pfil mbuf pointer in ipfw_check_frame()
ipfw_chk() might call m_pullup() and thus can change the mbuf chain
head.  In this case, the new chain head has to be returned to the pfil
hook caller, otherwise the pfil hook caller is left with a dangling
pointer.

Note that this affects only the link-layer hooks installed when the
net.link.ether.ipfw sysctl is set to 1.

PR:		256439, 254015, 255069, 255104
Fixes:		f355cb3e6
Reviewed by:	ae
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30764
2021-06-16 09:46:56 -04:00
Kristof Provost
1b2dbe37fa dummynet: free(NULL, M_DUMMYNET); is safe
There's no need to check pointers for NULL before free()ing them.

No functional change.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30382
2021-06-03 09:02:53 +02:00
Kristof Provost
51d73df18e dummynet: Fix schedlist and aqmlist locking
These are global (i.e. shared across vnets) structures, so we need
global lock to protect them.  However, we look up entries in these lists
(find_aqm_type(), find_sched_type()) and return them. We must ensure
that the returned structures cannot go away while we are using them.

Resolve this by using NET_EPOCH(). The structures can be safely accessed
under it, and we postpone their cleanup until we're sure they're no
longer used.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30381
2021-06-03 09:02:49 +02:00
Tom Jones
fe3bcfbda3 VNETify dummynet
This moves dn_cfg and other parameters into per VNET variables.

The taskqueue and control state remains global.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D29274
2021-06-03 09:01:56 +02:00
Mark Johnston
c4a6258d70 dummynet: Fix mbuf tag allocation failure handling
PR:		255875, 255878, 255879, 255880
Reviewed by:	donner, kp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30318
2021-05-18 15:25:16 -04:00
Kristof Provost
02c44f40f9 dummynet: Remove unused code
We never set 'busy' and never dequeue from the pending mq. Remove this
code.

Reviewed by:	ae
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30313
2021-05-17 15:03:55 +02:00
Mark Johnston
652908599b Add required checks for unmapped mbufs in ipdivert and ipfw
Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
are mapped.  Use it in ipfw_nat, which operates on a chain returned by
m_megapullup().

PR:		255164
Reviewed by:	ae, gallatin
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29838
2021-04-21 15:47:05 -04:00
Andrey V. Elsukov
9bacbf1ae2 ipfw: do not use sleepable malloc in callout context.
Use M_NOWAIT flag when hash growing is called from callout.

PR:             255041
Reviewed by:	kevans
MFC after:	10 days
Differential Revision: https://reviews.freebsd.org/D29772
2021-04-16 10:22:44 +03:00
Kristof Provost
320bed3c00 dummynet: Move timekeeping information into dn_cfg
Just like with the packet counters move the timekeeping information into
dn_cfg. This reduces the global name space use for dummynet and will
make subsequent work to add vnet support and re-use in pf easier.

Reviewed by:	donner
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Different Revision:	https://reviews.freebsd.org/D29246
2021-03-23 18:01:58 +01:00
Kristof Provost
cd5671efc0 dummynet: Move packet counters into dn_cfg
Move the packets counters into the dn_cfg struct. This reduces the
global name space use for dummynet and will make future work for things
like vnet support and re-use in pf easier.

Reviewed by:	donner
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29245
2021-03-23 18:01:58 +01:00
Gordon Bergling
183502d162 Fix a few typos in comments
- trough -> through

MFC after:	1 week
2021-03-13 16:37:28 +01:00
Andrey V. Elsukov
a9f7eba959 ipfw: add IPv6 support for sockarg opcode.
MFC after:	1 week
Sponsored by:	Yandex LLC
2021-03-02 12:45:59 +03:00
Neel Chauhan
a08cdb6cfb Allow setting alias port ranges in libalias and ipfw. This will allow a system
to be a true RFC 6598 NAT444 setup, where each network segment (e.g. user,
subnet) can have their own dedicated port aliasing ranges.

Reviewed by:		donner, kp
Approved by:		0mp (mentor), donner, kp
Differential Revision:	https://reviews.freebsd.org/D23450
2021-02-02 13:24:17 -08:00
Alexander V. Chernikov
3ad80c6531 Fix LINT-NOINET6 build after r368571.
Reported by:	mjg
2020-12-14 22:54:32 +00:00
Alexander V. Chernikov
2616eaa3d9 Fix NOINET6 build broken by r368571. 2020-12-12 01:05:31 +00:00
Alexander V. Chernikov
4451d8939c ipfw kfib algo: Use rt accessors instead of accessing rib/rtentry directly.
This removes assumptions on prefix storage and rtentry layout
 from an external code.

Differential Revision:	https://reviews.freebsd.org/D27450
2020-12-11 23:57:30 +00:00
Andrey V. Elsukov
7ec2f6bce5 Add dtrace SDT probe ipfw:::rule-matched.
It helps to reduce complexity with debugging of large ipfw rulesets.
Also define several constants and translators, that can by used by
dtrace scripts with this probe.

Reviewed by:	gnn
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D26879
2020-10-21 15:01:33 +00:00
Andrey V. Elsukov
f909db0b19 Add IPv4 fragments reassembling to NAT64LSN.
NAT64LSN requires the presence of upper level protocol header
in a IPv4 datagram to find corresponding state to make translation.
Now it will be handled automatically by nat64lsn instance.

Reviewed by:	melifaro
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D26758
2020-10-13 18:57:42 +00:00
Mateusz Guzik
662c13053f net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
Ed Maste
5e79303ba1 ipfw: style(9) fixes
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Reviewed by:	emaste, glebius
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D26126
2020-08-20 16:56:13 +00:00
Gleb Smirnoff
825398f946 ipfw: make the "frag" keyword accept additional options "mf",
"df", "rf" and "offset".  This allows to match on specific
bits of ip_off field.

For compatibility reasons lack of keyword means "offset".

Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D26021
2020-08-11 15:46:22 +00:00
Andrey V. Elsukov
aaef76e1fd Handle delayed checksums if needed in NAT64.
Upper level protocols defer checksums calculation in hope we have
checksums offloading in a network card. CSUM_DELAY_DATA flag is used
to determine that checksum calculation was deferred. And IP output
routine checks for this flag before pass mbuf to lower layer. Forwarded
packets have not this flag.

NAT64 uses checksums adjustment when it translates IP headers.
In most cases NAT64 is used for forwarded packets, but in case when it
handles locally originated packets we need to finish checksum calculation
that was deferred to correctly adjust it.

Add check for presence of CSUM_DELAY_DATA flag and finish checksum
calculation before adjustment.

Reported and tested by:	Evgeniy Khramtsov <evgeniy at khramtsov org>
MFC after:	1 week
2020-08-05 09:16:35 +00:00
Tom Jones
b2776a1809 Don't print VNET pointer when initializing dummynet
When dummynet initializes it prints a debug message with the current VNET
pointer unnecessarily revealing kernel memory layout. This appears to be left
over from when the first pieces of vimage support were added.

PR:		238658
Submitted by:	huangfq.daxian@gmail.com
Reviewed by:	markj, bz, gnn, kp, melifaro
Approved by:	jtl (co-mentor), bz (co-mentor)
Event:		July 2020 Bugathon
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D25619
2020-07-13 13:35:36 +00:00
Alexander V. Chernikov
6ad7446c6f Complete conversions from fib<4|6>_lookup_nh_<basic|ext> to fib<4|6>_lookup().
fib[46]_lookup_nh_ represents pre-epoch generation of fib api, providing less guarantees
 over pointer validness and requiring on-stack data copying.

With no callers remaining, remove fib[46]_lookup_nh_ functions.

Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Differential Revision:	https://reviews.freebsd.org/D25445
2020-07-02 21:04:08 +00:00
Mark Johnston
1388cfe1b5 ipfw(4): make O_IPVER/ipversion match IPv4 or 6, not just IPv4.
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Reviewed by:	Lutz Donnerhacke
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25227
2020-06-24 15:46:33 +00:00
Eugene Grosbein
47cb0632e8 ipfw: unbreak matching with big table type flow.
Test case:

# n=32769
# ipfw -q table 1 create type flow:proto,dst-ip,dst-port
# jot -w 'table 1 add tcp,127.0.0.1,' $n 1 | ipfw -q /dev/stdin
# ipfw -q add 5 unreach filter-prohib flow 'table(1)'

The rule 5 matches nothing without the fix if n>=32769.

With the fix, it works:
# telnet localhost 10001
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Permission denied
telnet: Unable to connect to remote host

MFC after:	2 weeks
Discussed with: ae, melifaro
2020-06-04 14:15:39 +00:00
Andrey V. Elsukov
e43ae8dcb5 Fix O_IP_FLOW_LOOKUP opcode handling.
Do not check table value matching when table lookup has failed.

Reported by:	Sergey Lobanov
MFC after:	1 week
2020-05-29 10:37:42 +00:00
Ed Maste
db462d948f ipfw: whitespace fix in SCTP_ABORT_ASSOCIATION case statement comment
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Reviewed by:	rgrimes, tuexen
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24602
2020-05-03 03:44:16 +00:00
Alexander V. Chernikov
e7d8af4f65 Move route_temporal.c and route_var.h to net/route.
Nexthop objects implementation, defined in r359823,
 introduced sys/net/route directory intended to hold all
 routing-related code. Move recently-introduced route_temporal.c and
 private route_var.h header there.

Differential Revision:	https://reviews.freebsd.org/D24597
2020-04-28 19:14:09 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Hans Petter Selasky
fbb890056e Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

This patch extends r357772.

Differential Revision:	https://reviews.freebsd.org/D23742
Reviewed by:	glebius@
Sponsored by:	Mellanox Technologies
2020-02-18 19:53:36 +00:00
Hans Petter Selasky
b4426a7175 Add missing EPOCH(9) wrapper in ipfw(8).
Backtrace:
panic()
ip_output()
dyn_tick()
softclock_call_cc()
softclock()
ithread_loop()

Differential Revision:	https://reviews.freebsd.org/D23599
Reviewed by:	glebius@ and ae@
Found by:	mmacy@
Reported by:	jmd@
Sponsored by:	Mellanox Technologies
2020-02-11 18:16:29 +00:00
Gleb Smirnoff
2a4bd982d0 Introduce NET_EPOCH_CALL() macro and use it everywhere where we free
data based on the network epoch.   The macro reverses the argument
order of epoch_call(9) - first function, then its argument. NFC
2020-01-15 06:05:20 +00:00
Alexander V. Chernikov
57ddf39678 ipfw: Don't rollback state in alloc_table_vidx() if atomicity is not required.
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D22662
2019-12-19 10:22:16 +00:00
Alexander V. Chernikov
00b45f58e8 Revert r355908 to commit it with a proper message. 2019-12-19 10:20:38 +00:00
Alexander V. Chernikov
880266635d svn-commit.tmp 2019-12-19 09:19:27 +00:00
Andrey V. Elsukov
1ae74c1359 Make TCP options parsing stricter.
Rework tcpopts_parse() to be more strict. Use const pointer. Add length
checks for specific TCP options. The main purpose of the change is
avoiding of possible out of mbuf's data access.

Reported by:	Maxime Villard
Reviewed by:	melifaro, emaste
MFC after:	1 week
2019-12-13 11:47:58 +00:00
Andrey V. Elsukov
2873980947 Follow RFC 4443 p2.2 and always use own addresses for reflected ICMPv6
datagrams.

Previously destination address from original datagram was used. That
looked confusing, especially in the traceroute6 output.
Also honor IPSTEALTH kernel option and do TTL/HLIM decrementing only
when stealth mode is disabled.

Reported by:	Marco van Tol <marco at tols org>
Reviewed by:	melifaro
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D22631
2019-12-12 13:28:46 +00:00
Andrey V. Elsukov
ca0ac0a6c1 Avoid access to stale ip pointer and call UPDATE_POINTERS() after
PULLUP_LEN_LOCKED().

PULLUP_LEN_LOCKED() could update mbuf and thus we need to update related
pointers that can be used in next opcodes.

Reported by:	Maxime Villard <max at m00nbsd net>
MFC after:	1 week
2019-12-10 10:35:32 +00:00
Gleb Smirnoff
16a72f53e2 Use epoch(9) directly instead of obsoleted KPI. 2019-10-14 16:37:41 +00:00
Gleb Smirnoff
961c033ef1 ipfw(4) rule matching always happens in network epoch. 2019-10-14 16:37:00 +00:00
Gleb Smirnoff
b8a6e03fac Widen NET_EPOCH coverage.
When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by:	gallatin, hselasky, cy, adrian, kristof
Differential Revision:	https://reviews.freebsd.org/D19111
2019-10-07 22:40:05 +00:00
Gleb Smirnoff
f8b45306c6 Drivers may pass runt packets to filter. This is okay.
Reviewed by:	gallatin
2019-09-13 22:36:04 +00:00
Andrey V. Elsukov
773a7e2224 Fix rule truncation on external action module unloading.
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-08-15 13:44:33 +00:00
Andrey V. Elsukov
e758846c09 dd ipfw_get_action() function to get the pointer to action opcode.
ACTION_PTR() returns pointer to the start of rule action section,
but rule can keep several rule modifiers like O_LOG, O_TAG and O_ALTQ,
and only then real action opcode is stored.

ipfw_get_action() function inspects the rule action section, skips
all modifiers and returns action opcode.

Use this function in ipfw_reset_eaction() and flush_nat_ptrs().

MFC after:	1 week
Sponsored by:	Yandex LLC
2019-07-29 15:09:12 +00:00
Andrey V. Elsukov
d4e6a52959 Avoid possible lock leaking.
After r343619 ipfw uses own locking for packets flow. PULLUP_LEN() macro
is used in ipfw_chk() to make m_pullup(). When m_pullup() fails, it just
returns via `goto pullup_failed`. There are two places where PULLUP_LEN()
is called with IPFW_PF_RLOCK() held.

Add PULLUP_LEN_LOCKED() macro to use in these places to be able release
the lock, when m_pullup() fails.

Sponsored by:	Yandex LLC
2019-07-29 12:55:48 +00:00
Andrey V. Elsukov
455d2ecb71 Eliminate rmlock from ipfw's BPF code.
After r343631 pfil hooks are invoked in net_epoch_preempt section,
this allows to avoid extra locking. Add NET_EPOCH_ASSER() assertion
to each ipfw_bpf_*tap*() call to require to be called from inside
epoch section.

Use NET_EPOCH_WAIT() in ipfw_clone_destroy() to wait until it becomes
safe to free() ifnet. And use on-stack ifnet pointer in each
ipfw_bpf_*tap*() call to avoid NULL pointer dereference in case when
V_*log_if global variable will become NULL during ipfw_bpf_*tap*() call.

Sponsored by:	Yandex LLC
2019-07-23 12:52:36 +00:00
Andrey V. Elsukov
2dab0de635 Do not modify cmd pointer if it is already last opcode in the rule.
MFC after:	1 week
2019-07-12 09:59:21 +00:00
Andrey V. Elsukov
4ee2f4c180 Correctly truncate the rule in case when it has several action opcodes.
It is possible, that opcode at the ACTION_PTR() location is not real
action, but action modificator like "log", "tag" etc. In this case we
need to check for each opcode in the loop to find O_EXTERNAL_ACTION.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-07-12 09:48:42 +00:00
Andrey V. Elsukov
019c8c9330 Follow the RFC 3128 and drop short TCP fragments with offset = 1.
Reported by:	emaste
MFC after:	1 week
2019-06-25 11:40:37 +00:00