freebsd-skq

Author	SHA1	Message	Date
thj	c2faa57f9e	Don't print VNET pointer when initializing dummynet When dummynet initializes it prints a debug message with the current VNET pointer unnecessarily revealing kernel memory layout. This appears to be left over from when the first pieces of vimage support were added. PR: 238658 Submitted by: huangfq.daxian@gmail.com Reviewed by: markj, bz, gnn, kp, melifaro Approved by: jtl (co-mentor), bz (co-mentor) Event: July 2020 Bugathon MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D25619	2020-07-13 13:35:36 +00:00
melifaro	bb8dbff4ac	Complete conversions from fib<4|6>_lookup_nh_<basic|ext> to fib<4|6>_lookup(). fib[46]_lookup_nh_ represents pre-epoch generation of fib api, providing less guarantees over pointer validness and requiring on-stack data copying. With no callers remaining, remove fib[46]_lookup_nh_ functions. Submitted by: Neel Chauhan <neel AT neelc DOT org> Differential Revision: https://reviews.freebsd.org/D25445	2020-07-02 21:04:08 +00:00
markj	a43e851194	ipfw(4): make O_IPVER/ipversion match IPv4 or 6, not just IPv4. Submitted by: Neel Chauhan <neel AT neelc DOT org> Reviewed by: Lutz Donnerhacke MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25227	2020-06-24 15:46:33 +00:00
markj	864f5d9ff1	Add the SCTP_SUPPORT kernel option. This is in preparation for enabling a loadable SCTP stack. Analogous to IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured in order to support a loadable SCTP implementation. Discussed with: tuexen MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2020-06-18 19:32:34 +00:00
eugen	ad5077cb33	ipfw: unbreak matching with big table type flow. Test case: # n=32769 # ipfw -q table 1 create type flow:proto,dst-ip,dst-port # jot -w 'table 1 add tcp,127.0.0.1,' $n 1 | ipfw -q /dev/stdin # ipfw -q add 5 unreach filter-prohib flow 'table(1)' The rule 5 matches nothing without the fix if n>=32769. With the fix, it works: # telnet localhost 10001 Trying 127.0.0.1... telnet: connect to address 127.0.0.1: Permission denied telnet: Unable to connect to remote host MFC after: 2 weeks Discussed with: ae, melifaro	2020-06-04 14:15:39 +00:00
ae	d7e9bff26b	Fix O_IP_FLOW_LOOKUP opcode handling. Do not check table value matching when table lookup has failed. Reported by: Sergey Lobanov MFC after: 1 week	2020-05-29 10:37:42 +00:00
markj	68806c4044	pf: Add a new zone for per-table entry counters. Right now we optionally allocate 8 counters per table entry, so in addition to memory consumed by counters, we require 8 pointers worth of space in each entry even when counters are not allocated (the default). Instead, define a UMA zone that returns contiguous per-CPU counter arrays for use in table entries. On amd64 this reduces sizeof(struct pfr_kentry) from 216 to 160. The smaller size also results in better slab efficiency, so memory usage for large tables is reduced by about 28%. Reviewed by: kp MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D24843	2020-05-16 00:28:12 +00:00
markj	5ffbc764fb	pf: Don't allocate per-table entry counters unless required. pf by default does not do per-table address accounting unless the "counters" keyword is specified in the corresponding pf.conf table definition. Yet, we always allocate 12 per-CPU counters per table. For large tables this carries a lot of overhead, so only allocate counters when they will actually be used. A further enhancement might be to use a dedicated UMA zone to allocate counter arrays for table entries, since close to half of the structure size comes from counter pointers. A related issue is the cost of zeroing counters, since counter_u64_zero() calls smp_rendezvous() on some architectures. Reported by: loos, Jim Pingle <jimp@netgate.com> Reviewed by: kp MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC (Netgate) Differential Revision: https://reviews.freebsd.org/D24803	2020-05-11 18:47:38 +00:00
kp	0300ecad73	pf: Improve DIOCADDRULE validation We expect the addrwrap.p.dyn value to be set to NULL (and assert such), but do not verify it on input. Reported-by: syzbot+936a89182e7d8f927de1@syzkaller.appspotmail.com Reviewed by: melifaro (previous version) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24538	2020-05-03 16:09:35 +00:00
emaste	8aa4dbecf5	ipfw: whitespace fix in SCTP_ABORT_ASSOCIATION case statement comment Submitted by: Neel Chauhan <neel AT neelc DOT org> Reviewed by: rgrimes, tuexen MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24602	2020-05-03 03:44:16 +00:00
melifaro	7c1be85b66	Move route_temporal.c and route_var.h to net/route. Nexthop objects implementation, defined in r359823, introduced sys/net/route directory intended to hold all routing-related code. Move recently-introduced route_temporal.c and private route_var.h header there. Differential Revision: https://reviews.freebsd.org/D24597	2020-04-28 19:14:09 +00:00
kp	827ff56c93	pf: Virtualise pf_frag_mtx The pf_frag_mtx mutex protects the fragments queue. The fragments queue is virtualised already (i.e. per-vnet) so it makes no sense to block jail A from accessing its fragments queue while jail B is accessing its own fragments queue. Virtualise the lock for improved concurrency. Differential Revision: https://reviews.freebsd.org/D24504	2020-04-26 16:30:00 +00:00
kp	4cc4938336	pf: Improve input validation If we pass an anchor name which doesn't exist pfr_table_count() returns -1, which leads to an overflow in mallocarray() and thus a panic. Explicitly check that pfr_table_count() does not return an error. Reported-by: syzbot+bd09d55d897d63d5f4f4@syzkaller.appspotmail.com Reviewed by: melifaro MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24539	2020-04-26 16:16:39 +00:00
kp	96abf553f8	pf: Improve ioctl() input validation Both DIOCCHANGEADDR and DIOCADDADDR take a struct pf_pooladdr from userspace. They failed to validate the dyn pointer contained in its struct pf_addr_wrap member structure. This triggered assertion failures under fuzz testing in pfi_dynaddr_setup(). Happily the dyn variable was overruled there, but we should verify that it's set to NULL anyway. Reported-by: syzbot+93e93150bc29f9b4b85f@syzkaller.appspotmail.com Reviewed by: emaste MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24431	2020-04-19 16:10:20 +00:00
kp	9319f3ce02	pf: Do not allow negative ps_len in DIOCGETSTATES Userspace may pass a negative ps_len value to us, which causes an assertion failure in malloc(). Treat negative values as zero, i.e. return the required size. Reported-by: syzbot+53370d9d0358ee2a059a@syzkaller.appspotmail.com Reviewed by: lutz at donnerhacke.de MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24447	2020-04-17 14:35:11 +00:00
melifaro	7b9732d3ab	Convert pf rtable checks to the new routing KPI. Switch uRPF to use specific fib(9)-provided uRPF. Switch MSS calculation to the latest fib(9) kpi. Reviewed by: kp Differential Revision: https://reviews.freebsd.org/D24386	2020-04-15 13:00:48 +00:00
kaktus	ad355b0a9d	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
kaktus	5c176e02f4	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (6 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. Mark all nodes in pf, pfsync and carp as MPSAFE. Reviewed by: kp Approved by: kib (mentor, blanket) Differential Revision: https://reviews.freebsd.org/D23634	2020-02-21 16:23:00 +00:00
hselasky	e0a1cd8bc2	Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. This patch extends r357772. Differential Revision: https://reviews.freebsd.org/D23742 Reviewed by: glebius@ Sponsored by: Mellanox Technologies	2020-02-18 19:53:36 +00:00
hselasky	845016f4a2	Add missing EPOCH(9) wrapper in ipfw(8). Backtrace: panic() ip_output() dyn_tick() softclock_call_cc() softclock() ithread_loop() Differential Revision: https://reviews.freebsd.org/D23599 Reviewed by: glebius@ and ae@ Found by: mmacy@ Reported by: jmd@ Sponsored by: Mellanox Technologies	2020-02-11 18:16:29 +00:00
kp	95e3476bfd	pf: Apply kif flags to new group members If we have a 'set skip on <ifgroup>' rule this flag is set on the group kif, but must also be set on all members. pfctl does this when the rules are set, but if groups are added afterwards we must also apply the flags to the new member. If not, new group members will not be skipped until the rules are reloaded. Reported by: dvl@ Reviewed by: glebius@ Differential Revision: https://reviews.freebsd.org/D23254	2020-01-23 22:13:41 +00:00
kp	02995f6a7e	pfsync: Ensure we enter network epoch before calling ip_output As of r356974 calls to ip_output() require us to be in the network epoch. That wasn't the case for the calls done from pfsyncintr() and pfsync_defer_tmo().	2020-01-22 21:01:19 +00:00
glebius	5057bc60d0	Introduce NET_EPOCH_CALL() macro and use it everywhere where we free data based on the network epoch. The macro reverses the argument order of epoch_call(9) - first function, then its argument. NFC	2020-01-15 06:05:20 +00:00
melifaro	7cd596605e	ipfw: Don't rollback state in alloc_table_vidx() if atomicity is not required. Submitted by: Neel Chauhan <neel AT neelc DOT org> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22662	2019-12-19 10:22:16 +00:00
melifaro	4344de7fc6	Revert r355908 to commit it with a proper message.	2019-12-19 10:20:38 +00:00
melifaro	52b84efbe2	svn-commit.tmp	2019-12-19 09:19:27 +00:00
kp	f063394b44	pf: Make request_maxcount runtime adjustable There's no reason for this to be a tunable. It's perfectly safe to change this at runtime. Reviewed by: Lutz Donnerhacke Differential Revision: https://reviews.freebsd.org/D22737	2019-12-14 02:06:07 +00:00
ae	5d1408ea62	Make TCP options parsing stricter. Rework tcpopts_parse() to be more strict. Use const pointer. Add length checks for specific TCP options. The main purpose of the change is avoiding of possible out of mbuf's data access. Reported by: Maxime Villard Reviewed by: melifaro, emaste MFC after: 1 week	2019-12-13 11:47:58 +00:00
ae	fb953e763f	Follow RFC 4443 p2.2 and always use own addresses for reflected ICMPv6 datagrams. Previously destination address from original datagram was used. That looked confusing, especially in the traceroute6 output. Also honor IPSTEALTH kernel option and do TTL/HLIM decrementing only when stealth mode is disabled. Reported by: Marco van Tol <marco at tols org> Reviewed by: melifaro MFC after: 2 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D22631	2019-12-12 13:28:46 +00:00
ae	6dfdda7982	Avoid access to stale ip pointer and call UPDATE_POINTERS() after PULLUP_LEN_LOCKED(). PULLUP_LEN_LOCKED() could update mbuf and thus we need to update related pointers that can be used in next opcodes. Reported by: Maxime Villard <max at m00nbsd net> MFC after: 1 week	2019-12-10 10:35:32 +00:00
kp	48915ceec6	pf: Add endline to all DPFPRINTF() DPFPRINTF() doesn't automatically add an endline, so be consistent and always add it.	2019-11-24 13:53:36 +00:00
kp	9cc4228e50	pf: Must be in NET_EPOCH to call icmp_error icmp_reflect(), called through icmp_error() requires us to be in NET_EPOCH. Failure to hold it leads to the following panic (with INVARIANTS): panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_icmp.c:742 cpuid = 2 time = 1571233273 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e0977920 vpanic() at vpanic+0x17e/frame 0xfffffe00e0977980 panic() at panic+0x43/frame 0xfffffe00e09779e0 icmp_reflect() at icmp_reflect+0x625/frame 0xfffffe00e0977aa0 icmp_error() at icmp_error+0x720/frame 0xfffffe00e0977b10 pf_intr() at pf_intr+0xd5/frame 0xfffffe00e0977b50 ithread_loop() at ithread_loop+0x1c6/frame 0xfffffe00e0977bb0 fork_exit() at fork_exit+0x80/frame 0xfffffe00e0977bf0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e0977bf0 Note that we now enter NET_EPOCH twice if we enter ip_output() from pf_intr(), but ip_output() will soon be converted to a function that requires epoch, so entering NET_EPOCH directly from pf_intr() makes more sense. Discussed with: glebius@	2019-10-18 03:36:26 +00:00
glebius	5d9a41a2e5	Use epoch(9) directly instead of obsoleted KPI.	2019-10-14 16:37:41 +00:00
glebius	962bb78b05	ipfw(4) rule matching always happens in network epoch.	2019-10-14 16:37:00 +00:00
markj	fdce34ac5e	Fix the build after r353458. MFC with: r353458 Sponsored by: The FreeBSD Foundation	2019-10-13 00:08:17 +00:00
markj	0ca8957b6d	Add a missing include of opt_sctp.h. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-12 23:01:16 +00:00
glebius	337378e04f	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
glebius	3d08ed450e	Drivers may pass runt packets to filter. This is okay. Reviewed by: gallatin	2019-09-13 22:36:04 +00:00
ae	8915b51753	Fix rule truncation on external action module unloading. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2019-08-15 13:44:33 +00:00
emaste	2232ec44d9	pf: zero (another) output buffer in pfioctl Avoid potential structure padding leak. r350294 identified a leak via static analysis; although there's no report of a leak with the DIOCGETSRCNODES ioctl it's a good practice to zero the memory. Suggested by: kp MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-07-31 16:58:09 +00:00
ae	fe1518789b	Add ipfw_get_action() function to get the pointer to action opcode. ACTION_PTR() returns pointer to the start of rule action section, but rule can keep several rule modifiers like O_LOG, O_TAG and O_ALTQ, and only then real action opcode is stored. ipfw_get_action() function inspects the rule action section, skips all modifiers and returns action opcode. Use this function in ipfw_reset_eaction() and flush_nat_ptrs(). MFC after: 1 week Sponsored by: Yandex LLC	2019-07-29 15:09:12 +00:00
kp	65ea01a683	pf: Remove partial RFC2675 support Remove our (very partial) support for RFC2675 Jumbograms. They're not used, not actually supported and not a good idea. Reviewed by: thj@ Differential Revision: https://reviews.freebsd.org/D21086	2019-07-29 13:21:31 +00:00
ae	aaf6f1c4ac	Avoid possible lock leaking. After r343619 ipfw uses own locking for packets flow. PULLUP_LEN() macro is used in ipfw_chk() to make m_pullup(). When m_pullup() fails, it just returns via `goto pullup_failed`. There are two places where PULLUP_LEN() is called with IPFW_PF_RLOCK() held. Add PULLUP_LEN_LOCKED() macro to use in these places to be able to release the lock when m_pullup() fails. Sponsored by: Yandex LLC	2019-07-29 12:55:48 +00:00
emaste	e6cea1ca6d	pf: zero output buffer in pfioctl Avoid potential structure padding leak. Reported by: Vlad Tsyrklevich <vlad@tsyrklevich.net> Reviewed by: kp MFC after: 3 days Security: Potential kernel memory disclosure Sponsored by: The FreeBSD Foundation	2019-07-24 16:51:14 +00:00
ae	5163ded6d4	Eliminate rmlock from ipfw's BPF code. After r343631 pfil hooks are invoked in net_epoch_preempt section, this allows to avoid extra locking. Add NET_EPOCH_ASSERT() assertion to each ipfw_bpf_tap() call to require to be called from inside epoch section. Use NET_EPOCH_WAIT() in ipfw_clone_destroy() to wait until it becomes safe to free() ifnet. And use on-stack ifnet pointer in each ipfw_bpf_tap() call to avoid NULL pointer dereference in case when V_log_if global variable will become NULL during ipfw_bpf_tap*() call. Sponsored by: Yandex LLC	2019-07-23 12:52:36 +00:00
ae	571b48c044	Do not modify cmd pointer if it is already last opcode in the rule. MFC after: 1 week	2019-07-12 09:59:21 +00:00
ae	15268fa047	Correctly truncate the rule in case when it has several action opcodes. It is possible, that opcode at the ACTION_PTR() location is not real action, but action modificator like "log", "tag" etc. In this case we need to check for each opcode in the loop to find O_EXTERNAL_ACTION. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2019-07-12 09:48:42 +00:00
hselasky	1a5fd513af	Convert all IPv4 and IPv6 multicast memberships into using a STAILQ instead of a linear array. The multicast memberships for the inpcb structure are protected by a non-sleepable lock, INP_WLOCK(), which needs to be dropped when calling the underlying possibly sleeping if_ioctl() method. When using a linear array to keep track of multicast memberships, the computed memory location of the multicast filter may suddenly change, due to concurrent insertion or removal of elements in the linear array. This in turn leads to various invalid memory access issues and kernel panics. To avoid this problem, put all multicast memberships on a STAILQ based list. Then the memory location of the IPv4 and IPv6 multicast filters become fixed during their lifetime and use after free and memory leak issues are easier to track, for example by: vmstat -m | grep multi All list manipulation has been factored into inline functions including some macros, to easily allow for a future hash-list implementation, if needed. This patch has been tested by pho@. Differential Revision: https://reviews.freebsd.org/D20080 Reviewed by: markj@ MFC after: 1 week Sponsored by: Mellanox Technologies	2019-06-25 11:54:41 +00:00
ae	e59b219e5f	Follow the RFC 3128 and drop short TCP fragments with offset = 1. Reported by: emaste MFC after: 1 week	2019-06-25 11:40:37 +00:00
ae	a864a749ac	Mark default rule with IPFW_RULE_NOOPT flag, so it can be shown in compact form. MFC after: 1 week	2019-06-25 09:11:22 +00:00

699 Commits