freebsd-skq

Author	SHA1	Message	Date
Andrey V. Elsukov	f909db0b19	Add IPv4 fragments reassembling to NAT64LSN. NAT64LSN requires the presence of upper level protocol header in a IPv4 datagram to find corresponding state to make translation. Now it will be handled automatically by nat64lsn instance. Reviewed by: melifaro Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D26758	2020-10-13 18:57:42 +00:00
Kristof Provost	52b83a0618	pf: do not remove kifs that are referenced by rules Even if a kif doesn't have an ifp or if_group pointer we still can't delete it if it's referenced by a rule. In other words: we must check rulerefs as well. While we're here also teach pfi_kif_unref() not to remove kifs with flags. Reported-by: syzbot+b31d1d7e12c5d4d42f28@syzkaller.appspotmail.com MFC after: 2 weeks	2020-10-13 11:04:00 +00:00
Kristof Provost	c9449e4fb8	pf: create a kif for flags If userspace tries to set flags (e.g. 'set skip on <ifspec>') and <ifspec> doesn't exist we should create a kif so that we apply the flags when the <ifspec> does turn up. Otherwise we'd end up in surprising situations where the rules say the interface should be skipped, but it's not until the rules get re-applied. Reviewed by: Lutz Donnerhacke <lutz_donnerhacke.de> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26742	2020-10-12 12:39:37 +00:00
Mateusz Guzik	662c13053f	net: clean up empty lines in .c and .h files	2020-09-01 21:19:14 +00:00
Ed Maste	5e79303ba1	ipfw: style(9) fixes Submitted by: Neel Chauhan <neel AT neelc DOT org> Reviewed by: emaste, glebius MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D26126	2020-08-20 16:56:13 +00:00
Gleb Smirnoff	825398f946	ipfw: make the "frag" keyword accept additional options "mf", "df", "rf" and "offset". This allows to match on specific bits of ip_off field. For compatibility reasons lack of keyword means "offset". Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D26021	2020-08-11 15:46:22 +00:00
Andrey V. Elsukov	aaef76e1fd	Handle delayed checksums if needed in NAT64. Upper level protocols defer checksums calculation in hope we have checksums offloading in a network card. CSUM_DELAY_DATA flag is used to determine that checksum calculation was deferred. And IP output routine checks for this flag before pass mbuf to lower layer. Forwarded packets have not this flag. NAT64 uses checksums adjustment when it translates IP headers. In most cases NAT64 is used for forwarded packets, but in case when it handles locally originated packets we need to finish checksum calculation that was deferred to correctly adjust it. Add check for presence of CSUM_DELAY_DATA flag and finish checksum calculation before adjustment. Reported and tested by: Evgeniy Khramtsov <evgeniy at khramtsov org> MFC after: 1 week	2020-08-05 09:16:35 +00:00
Tom Jones	b2776a1809	Don't print VNET pointer when initializing dummynet When dummynet initializes it prints a debug message with the current VNET pointer unnecessarily revealing kernel memory layout. This appears to be left over from when the first pieces of vimage support were added. PR: 238658 Submitted by: huangfq.daxian@gmail.com Reviewed by: markj, bz, gnn, kp, melifaro Approved by: jtl (co-mentor), bz (co-mentor) Event: July 2020 Bugathon MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D25619	2020-07-13 13:35:36 +00:00
Alexander V. Chernikov	6ad7446c6f	Complete conversions from fib<4\|6>_lookup_nh_<basic\|ext> to fib<4\|6>_lookup(). fib[46]_lookup_nh_ represents pre-epoch generation of fib api, providing less guarantees over pointer validness and requiring on-stack data copying. With no callers remaining, remove fib[46]_lookup_nh_ functions. Submitted by: Neel Chauhan <neel AT neelc DOT org> Differential Revision: https://reviews.freebsd.org/D25445	2020-07-02 21:04:08 +00:00
Mark Johnston	1388cfe1b5	ipfw(4): make O_IPVER/ipversion match IPv4 or 6, not just IPv4. Submitted by: Neel Chauhan <neel AT neelc DOT org> Reviewed by: Lutz Donnerhacke MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25227	2020-06-24 15:46:33 +00:00
Mark Johnston	95033af923	Add the SCTP_SUPPORT kernel option. This is in preparation for enabling a loadable SCTP stack. Analogous to IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured in order to support a loadable SCTP implementation. Discussed with: tuexen MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2020-06-18 19:32:34 +00:00
Eugene Grosbein	47cb0632e8	ipfw: unbreak matching with big table type flow. Test case: # n=32769 # ipfw -q table 1 create type flow:proto,dst-ip,dst-port # jot -w 'table 1 add tcp,127.0.0.1,' $n 1 \| ipfw -q /dev/stdin # ipfw -q add 5 unreach filter-prohib flow 'table(1)' The rule 5 matches nothing without the fix if n>=32769. With the fix, it works: # telnet localhost 10001 Trying 127.0.0.1... telnet: connect to address 127.0.0.1: Permission denied telnet: Unable to connect to remote host MFC after: 2 weeks Discussed with: ae, melifaro	2020-06-04 14:15:39 +00:00
Andrey V. Elsukov	e43ae8dcb5	Fix O_IP_FLOW_LOOKUP opcode handling. Do not check table value matching when table lookup has failed. Reported by: Sergey Lobanov MFC after: 1 week	2020-05-29 10:37:42 +00:00
Mark Johnston	c1be839971	pf: Add a new zone for per-table entry counters. Right now we optionally allocate 8 counters per table entry, so in addition to memory consumed by counters, we require 8 pointers worth of space in each entry even when counters are not allocated (the default). Instead, define a UMA zone that returns contiguous per-CPU counter arrays for use in table entries. On amd64 this reduces sizeof(struct pfr_kentry) from 216 to 160. The smaller size also results in better slab efficiency, so memory usage for large tables is reduced by about 28%. Reviewed by: kp MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D24843	2020-05-16 00:28:12 +00:00
Mark Johnston	21121f9bbe	pf: Don't allocate per-table entry counters unless required. pf by default does not do per-table address accounting unless the "counters" keyword is specified in the corresponding pf.conf table definition. Yet, we always allocate 12 per-CPU counters per table. For large tables this carries a lot of overhead, so only allocate counters when they will actually be used. A further enhancement might be to use a dedicated UMA zone to allocate counter arrays for table entries, since close to half of the structure size comes from counter pointers. A related issue is the cost of zeroing counters, since counter_u64_zero() calls smp_rendezvous() on some architectures. Reported by: loos, Jim Pingle <jimp@netgate.com> Reviewed by: kp MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC (Netgate) Differential Revision: https://reviews.freebsd.org/D24803	2020-05-11 18:47:38 +00:00
Kristof Provost	1ef06ed8de	pf: Improve DIOCADDRULE validation We expect the addrwrap.p.dyn value to be set to NULL (and assert such), but do not verify it on input. Reported-by: syzbot+936a89182e7d8f927de1@syzkaller.appspotmail.com Reviewed by: melifaro (previous version) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24538	2020-05-03 16:09:35 +00:00
Ed Maste	db462d948f	ipfw: whitespace fix in SCTP_ABORT_ASSOCIATION case statement comment Submitted by: Neel Chauhan <neel AT neelc DOT org> Reviewed by: rgrimes, tuexen MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24602	2020-05-03 03:44:16 +00:00
Alexander V. Chernikov	e7d8af4f65	Move route_temporal.c and route_var.h to net/route. Nexthop objects implementation, defined in r359823, introduced sys/net/route directory intended to hold all routing-related code. Move recently-introduced route_temporal.c and private route_var.h header there. Differential Revision: https://reviews.freebsd.org/D24597	2020-04-28 19:14:09 +00:00
Kristof Provost	df03977dd8	pf: Virtualise pf_frag_mtx The pf_frag_mtx mutex protects the fragments queue. The fragments queue is virtualised already (i.e. per-vnet) so it makes no sense to block jail A from accessing its fragments queue while jail B is accessing its own fragments queue. Virtualise the lock for improved concurrency. Differential Revision: https://reviews.freebsd.org/D24504	2020-04-26 16:30:00 +00:00
Kristof Provost	a7c8533634	pf: Improve input validation If we pass an anchor name which doesn't exist pfr_table_count() returns -1, which leads to an overflow in mallocarray() and thus a panic. Explicitly check that pfr_table_count() does not return an error. Reported-by: syzbot+bd09d55d897d63d5f4f4@syzkaller.appspotmail.com Reviewed by: melifaro MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24539	2020-04-26 16:16:39 +00:00
Kristof Provost	98582ce381	pf: Improve ioctl() input validation Both DIOCCHANGEADDR and DIOCADDADDR take a struct pf_pooladdr from userspace. They failed to validate the dyn pointer contained in its struct pf_addr_wrap member structure. This triggered assertion failures under fuzz testing in pfi_dynaddr_setup(). Happily the dyn variable was overruled there, but we should verify that it's set to NULL anyway. Reported-by: syzbot+93e93150bc29f9b4b85f@syzkaller.appspotmail.com Reviewed by: emaste MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24431	2020-04-19 16:10:20 +00:00
Kristof Provost	95324dc3f4	pf: Do not allow negative ps_len in DIOCGETSTATES Userspace may pass a negative ps_len value to us, which causes an assertion failure in malloc(). Treat negative values as zero, i.e. return the required size. Reported-by: syzbot+53370d9d0358ee2a059a@syzkaller.appspotmail.com Reviewed by: lutz at donnerhacke.de MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24447	2020-04-17 14:35:11 +00:00
Alexander V. Chernikov	643ce94878	Convert pf rtable checks to the new routing KPI. Switch uRPF to use specific fib(9)-provided uRPF. Switch MSS calculation to the latest fib(9) kpi. Reviewed by: kp Differential Revision: https://reviews.freebsd.org/D24386	2020-04-15 13:00:48 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Pawel Biernacki	10b49b2302	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (6 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. Mark all nodes in pf, pfsync and carp as MPSAFE. Reviewed by: kp Approved by: kib (mentor, blanket) Differential Revision: https://reviews.freebsd.org/D23634	2020-02-21 16:23:00 +00:00
Hans Petter Selasky	fbb890056e	Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. This patch extends r357772. Differential Revision: https://reviews.freebsd.org/D23742 Reviewed by: glebius@ Sponsored by: Mellanox Technologies	2020-02-18 19:53:36 +00:00
Hans Petter Selasky	b4426a7175	Add missing EPOCH(9) wrapper in ipfw(8). Backtrace: panic() ip_output() dyn_tick() softclock_call_cc() softclock() ithread_loop() Differential Revision: https://reviews.freebsd.org/D23599 Reviewed by: glebius@ and ae@ Found by: mmacy@ Reported by: jmd@ Sponsored by: Mellanox Technologies	2020-02-11 18:16:29 +00:00
Kristof Provost	e3e03bc159	pf: Apply kif flags to new group members If we have a 'set skip on <ifgroup>' rule this flag it set on the group kif, but must also be set on all members. pfctl does this when the rules are set, but if groups are added afterwards we must also apply the flags to the new member. If not, new group members will not be skipped until the rules are reloaded. Reported by: dvl@ Reviewed by: glebius@ Differential Revision: https://reviews.freebsd.org/D23254	2020-01-23 22:13:41 +00:00
Kristof Provost	ef1bd1e517	pfsync: Ensure we enter network epoch before calling ip_output As of r356974 calls to ip_output() require us to be in the network epoch. That wasn't the case for the calls done from pfsyncintr() and pfsync_defer_tmo().	2020-01-22 21:01:19 +00:00
Gleb Smirnoff	2a4bd982d0	Introduce NET_EPOCH_CALL() macro and use it everywhere where we free data based on the network epoch. The macro reverses the argument order of epoch_call(9) - first function, then its argument. NFC	2020-01-15 06:05:20 +00:00
Alexander V. Chernikov	57ddf39678	ipfw: Don't rollback state in alloc_table_vidx() if atomicity is not required. Submitted by: Neel Chauhan <neel AT neelc DOT org> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22662	2019-12-19 10:22:16 +00:00
Alexander V. Chernikov	00b45f58e8	Revert r355908 to commit it with a proper message.	2019-12-19 10:20:38 +00:00
Alexander V. Chernikov	880266635d	svn-commit.tmp	2019-12-19 09:19:27 +00:00
Kristof Provost	cca2ea64e9	pf: Make request_maxcount runtime adjustable There's no reason for this to be a tunable. It's perfectly safe to change this at runtime. Reviewed by: Lutz Donnerhacke Differential Revision: https://reviews.freebsd.org/D22737	2019-12-14 02:06:07 +00:00
Andrey V. Elsukov	1ae74c1359	Make TCP options parsing stricter. Rework tcpopts_parse() to be more strict. Use const pointer. Add length checks for specific TCP options. The main purpose of the change is avoiding of possible out of mbuf's data access. Reported by: Maxime Villard Reviewed by: melifaro, emaste MFC after: 1 week	2019-12-13 11:47:58 +00:00
Andrey V. Elsukov	2873980947	Follow RFC 4443 p2.2 and always use own addresses for reflected ICMPv6 datagrams. Previously destination address from original datagram was used. That looked confusing, especially in the traceroute6 output. Also honor IPSTEALTH kernel option and do TTL/HLIM decrementing only when stealth mode is disabled. Reported by: Marco van Tol <marco at tols org> Reviewed by: melifaro MFC after: 2 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D22631	2019-12-12 13:28:46 +00:00
Andrey V. Elsukov	ca0ac0a6c1	Avoid access to stale ip pointer and call UPDATE_POINTERS() after PULLUP_LEN_LOCKED(). PULLUP_LEN_LOCKED() could update mbuf and thus we need to update related pointers that can be used in next opcodes. Reported by: Maxime Villard <max at m00nbsd net> MFC after: 1 week	2019-12-10 10:35:32 +00:00
Kristof Provost	492f3a312a	pf: Add endline to all DPFPRINTF() DPFPRINTF() doesn't automatically add an endline, so be consistent and always add it.	2019-11-24 13:53:36 +00:00
Kristof Provost	a0d571cbef	pf: Must be in NET_EPOCH to call icmp_error icmp_reflect(), called through icmp_error() requires us to be in NET_EPOCH. Failure to hold it leads to the following panic (with INVARIANTS): panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_icmp.c:742 cpuid = 2 time = 1571233273 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e0977920 vpanic() at vpanic+0x17e/frame 0xfffffe00e0977980 panic() at panic+0x43/frame 0xfffffe00e09779e0 icmp_reflect() at icmp_reflect+0x625/frame 0xfffffe00e0977aa0 icmp_error() at icmp_error+0x720/frame 0xfffffe00e0977b10 pf_intr() at pf_intr+0xd5/frame 0xfffffe00e0977b50 ithread_loop() at ithread_loop+0x1c6/frame 0xfffffe00e0977bb0 fork_exit() at fork_exit+0x80/frame 0xfffffe00e0977bf0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e0977bf0 Note that we now enter NET_EPOCH twice if we enter ip_output() from pf_intr(), but ip_output() will soon be converted to a function that requires epoch, so entering NET_EPOCH directly from pf_intr() makes more sense. Discussed with: glebius@	2019-10-18 03:36:26 +00:00
Gleb Smirnoff	16a72f53e2	Use epoch(9) directly instead of obsoleted KPI.	2019-10-14 16:37:41 +00:00
Gleb Smirnoff	961c033ef1	ipfw(4) rule matching always happens in network epoch.	2019-10-14 16:37:00 +00:00
Mark Johnston	bff630d1dc	Fix the build after r353458. MFC with: r353458 Sponsored by: The FreeBSD Foundation	2019-10-13 00:08:17 +00:00
Mark Johnston	6cc9ab8610	Add a missing include of opt_sctp.h. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-12 23:01:16 +00:00
Gleb Smirnoff	b8a6e03fac	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
Gleb Smirnoff	f8b45306c6	Drivers may pass runt packets to filter. This is okay. Reviewed by: gallatin	2019-09-13 22:36:04 +00:00
Andrey V. Elsukov	773a7e2224	Fix rule truncation on external action module unloading. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2019-08-15 13:44:33 +00:00
Ed Maste	c54ee572e5	pf: zero (another) output buffer in pfioctl Avoid potential structure padding leak. r350294 identified a leak via static analysis; although there's no report of a leak with the DIOCGETSRCNODES ioctl it's a good practice to zero the memory. Suggested by: kp MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-07-31 16:58:09 +00:00
Andrey V. Elsukov	e758846c09	dd ipfw_get_action() function to get the pointer to action opcode. ACTION_PTR() returns pointer to the start of rule action section, but rule can keep several rule modifiers like O_LOG, O_TAG and O_ALTQ, and only then real action opcode is stored. ipfw_get_action() function inspects the rule action section, skips all modifiers and returns action opcode. Use this function in ipfw_reset_eaction() and flush_nat_ptrs(). MFC after: 1 week Sponsored by: Yandex LLC	2019-07-29 15:09:12 +00:00
Kristof Provost	f287767d4f	pf: Remove partial RFC2675 support Remove our (very partial) support for RFC2675 Jumbograms. They're not used, not actually supported and not a good idea. Reviewed by: thj@ Differential Revision: https://reviews.freebsd.org/D21086	2019-07-29 13:21:31 +00:00
Andrey V. Elsukov	d4e6a52959	Avoid possible lock leaking. After r343619 ipfw uses own locking for packets flow. PULLUP_LEN() macro is used in ipfw_chk() to make m_pullup(). When m_pullup() fails, it just returns via `goto pullup_failed`. There are two places where PULLUP_LEN() is called with IPFW_PF_RLOCK() held. Add PULLUP_LEN_LOCKED() macro to use in these places to be able release the lock, when m_pullup() fails. Sponsored by: Yandex LLC	2019-07-29 12:55:48 +00:00

1 2 3 4 5 ...

716 Commits