freebsd-skq

Author	SHA1	Message	Date
glebius	64967afc7c	In r343631 error code for a packet blocked by a firewall was changed from EACCES to EPERM. This change was not intentional, so fix that. Return EACCESS if a firewall forbids sending. Noticed by: ae	2020-01-01 17:32:20 +00:00
melifaro	2f6a2eada6	Remove useless code from in6_rmx.c The code in questions walks IPv6 tree every 60 seconds and looks into the routes with non-zero expiration time (typically, redirected routes). For each such route it sets RTF_PROBEMTU flag at the expiration time. No other part of the kernel checks for RTF_PROBEMTU flag. RTF_PROBEMTU was defined 21 years ago, 30 Jun 1999, as RTF_PROTO1. RTF_PROTO1 is a de-facto standard indication of a route installed by a routing daemon for a last decade. Reviewed by: bz, ae MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22865	2019-12-18 22:10:56 +00:00
hselasky	f2fb2c44a8	Leave multicast group before reaping and committing state for both IPv4 and IPv6. This fixes a regression issue after r349369. When trying to exit a multicast group before closing the socket, a multicast leave packet should be sent. Differential Revision: https://reviews.freebsd.org/D22848 PR: 242677 Reviewed by: bz (network) Tested by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2019-12-18 12:06:34 +00:00
bz	17696b2f7c	Update comment. Update the comment related to SIIT and v4mapped addresses being rejected by us when coming from the wire given we have supported IPv6-only kernels for a few years now. See also draft-itojun-v6ops-v4mapped-harmful. Suggested by: melifaro MFC after: 2 weeks	2019-12-06 16:53:42 +00:00
bz	b785fe800f	ip6_input: remove redundant v4mapped check In ip6_input() we apply the same v4mapped address check twice. The only case which skipps the first one is M_FASTFWD_OURS which should have passed the check on the firstinput pass and passed the firewall. Remove the 2nd redundant check. Reviewed by: kp, melifaro MFC after: 2 weeks Sponsored by: Netflix (originally) Differential Revision: https://reviews.freebsd.org/D22462	2019-12-06 16:42:58 +00:00
kp	7d0572ff8f	Remove useless NULL check Coverity points out that we've already dereferenced m by the time we check, so there's no reason to keep the check. Moreover, it's safe to pass NULL to m_freem() anyway. CID: 1019092	2019-12-05 16:50:54 +00:00
bz	06619c3586	Make icmp6_reflect() static. icmp6_reflect() is not used anywhere outside icmp6.c, no reason to export it. Sponsored by: Netflix	2019-12-03 14:46:38 +00:00
hselasky	63bc20993b	Use refcount from "in_joingroup_locked()" when joining multicast groups. Do not acquire additional references. This makes the IPv4 IGMP code in line with the IPv6 MLD code. Background: The IPv4 multicast code puts an extra reference on the in_multi struct when joining groups. This becomes visible when using daemons like igmpproxy from ports, that multicast entries do not disappear from the output of ifmcstat(8) when multicast streams are disconnected. This fixes a regression issue after r349762. While at it factor the ip_mfilter_insert() and ip6_mfilter_insert() calls to avoid repeated "is_new" check. Differential Revision: https://reviews.freebsd.org/D22595 Tested by: Guido van Rooij <guido@gvr.org> Reviewed by: rgrimes (network) MFC after: 1 week Sponsored by: Mellanox Technologies	2019-12-03 08:46:59 +00:00
tuexen	a972ee3626	Update the hostcache also for PTB messages received for SCTP/IPv6. The corresponding code for SCTP/IPv4 was introduced in https://svnweb.freebsd.org/base?view=revision&revision=317597 Submitted by: Julius Flohr MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D22605	2019-12-01 16:14:44 +00:00
bz	c14bf147f4	Fix m_pullup() problem after removing PULLDOWN_TESTs and KAME EXT_*macros. r354748-354750 replaced the KAME macros with m_pulldown() calls. Contrary to the rest of the network stack m_len checks before m_pulldown() were not put in placed (see r354748). Put these m_len checks in place for now (to go along with the style of the network stack since the initial commits). These are not put in for performance but to avoid an error scenario (even though it also will help performance at the moment as it avoid allocating an extra mbuf; not because of the unconditional function call). The observed error case went like this: (1) an mbuf with M_EXT arrives and we call m_pullup() unconditionally on it. (2) m_pullup() will call m_get() unless the requested length is larger than MHLEN (in which case it'll m_freem() the perfectly fine mbuf) and migrate the requested length of data and pkthdr into the new mbuf. (3) If m_get() succeeds, a further m_pullup() call going over MHLEN will fail. This was observed with failing auto-configuration as an RA packet of 200 bytes exceeded MHLEN and the m_pullup() called from nd6_ra_input() dropped the mbuf. (Re-)adding the m_len checks before m_pullup() calls avoids this problems with mbufs using external storage for now. MFC after: 3 weeks Sponsored by: Netflix	2019-12-01 00:22:04 +00:00
rlibby	6751c1f63b	in6_joingroup_locked: need if_addr_lock around in6m_disconnect_locked It looks like the call that requires the lock was introduced in r337866. Reviewed by: hselasky Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D20739	2019-11-25 22:25:10 +00:00
bz	fb644c472d	in6: move include Move the include for sysctl.h out of the middle of the file to the includes at the beginning. This is will make it easier to add new sysctls. No functional changes. MFC after: 3 weeks Sponsored by: Netflix	2019-11-19 21:14:15 +00:00
bz	79837b3c5e	nd6: sysctl Move the SYSCTL_DECL to the top of the file. Move the sysctl function before SYSCTL_PROC so that we don't need an extra function declaration in the middle of the file. No functional changes. MFC after: 3 weeks Sponsored by: Netflix	2019-11-19 21:08:18 +00:00
bz	f59b34ae14	nd6: make nd6_timer_ch static nd6_timer_ch is only used in file local context. There is no need to export it, so make it static. MFC after: 3 weeks Sponsored by: Netflix	2019-11-19 20:54:17 +00:00
bz	155a28899c	nd6_rtr: re-sort functions Resort functions within file in a way that they depend on each other as that makes it easier to rework various things. Also allows us to remove file local function declarations. No functional changes. MFC after: 3 weeks Sponsored by: Netflix	2019-11-19 20:34:33 +00:00
bz	3dc60fd5da	mld: fix epoch assertion in6ifa_ifpforlinklocal() asserts the net epoch. The test case from r354832 revealed code paths where we call into the function without having acquired the net epoch first and consequently we hit the assert. This happens in certain MLD states during VNET shutdown and most people normaly not notice this. For correctness acquire the net epoch around calls to mld_v1_transmit_report() in all cases to avoid the assertion firing. MFC after: 2 weeks Sponsored by: Netflix	2019-11-19 14:53:13 +00:00
bz	2e7436367e	icmpv6: Fix mbuf change in mld After r354748 mld_input() can change the mbuf. The new pointer is never returned to icmp6_input() and when passed to icmp6_rip6_input() the mbuf may no longer valid leading to a panic. Pass a pointer to the mbuf to mld_input() so we can return an updated version in the non-error case. Add a test sending an MLD packet case which will trigger this bug. Pointyhat to: bz Reported by: gallatin, thj MFC After: 2 weeks X-MFC with: r354748 Sponsored by: Netflix	2019-11-18 21:59:47 +00:00
bz	da088e11c9	nd6: retire defrouter_select(), use _fib() variant. Burn bridges and replace the last two calls of defrouter_select() with defrouter_select_fib(). That allows us to retire defrouter_select() and make it more clear in the calling code that it applies to all FIBs. Sponsored by: Netflix	2019-11-16 00:17:35 +00:00
bz	5569a9ae6a	nd6_rtr: Pull in the TAILQ_HEAD() as it is not needed outside nd6_rtr.c. Rename the TAILQ_HEAD() struct and the nd_defrouter variable from "nd_" to "nd6_" as they are not part of the RFC 3542 API which uses "ND_". Ideally I'd like to also rename the struct nd_defrouter {} to "nd6_*" but given that is used externally there is more work to do. No functional changes. MFC after: 3 weeks Sponsored by: Netflix	2019-11-16 00:02:36 +00:00
bz	1feeff48a5	netinet*: replace IP6_EXTHDR_GET() In a few places we have IP6_EXTHDR_GET() left in upper layer protocols. The IP6_EXTHDR_GET() macro might perform an m_pulldown() in case the data fragment is not contiguous. Convert these last remaining instances into m_pullup()s instead. In CARP, for example, we will a few lines later call m_pullup() anyway, the IPsec code coming from OpenBSD would otherwise have done the m_pullup() and are copying the data a bit later anyway, so pulling it in seems no better or worse. Note: this leaves very few m_pulldown() cases behind in the tree and we might want to consider removing them as well to make mbuf management easier again on a path to variable size mbufs, especially given m_pulldown() still has an issue not re-checking M_WRITEABLE(). Reviewed by: gallatin MFC after: 8 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22335	2019-11-15 21:44:17 +00:00
bz	667b9bd5d8	netinet6: Remove PULLDOWN_TESTs. Remove the KAME introduced PULLDOWN_TESTs which did not even have a compile-time option in sys/conf to turn them on for a custom kernel build. They made the code a lot harder to read or more complicated in a few cases. Convert the IP6_EXTHDR_CHECK() calls into FreeBSD looking code. Rather than throwing the packet away if it would not fit the KAME mbuf expectations, convert the macros to m_pullup() calls. Do not do any extra manual conditional checks upfront as to whether the m_len would suffice (), simply let m_pullup() do its work (incl. an early check). Remove extra m_pullup() calls where earlier in the function or the only caller has already done the pullup. Discussed with: rwatson () Reviewed by: ae MFC after: 8 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22334	2019-11-15 21:40:40 +00:00
bz	3db97aba48	nd6: simplify code We are taking the same actions in both cases of the branch inside the block. Simplify that code as the extra branch is not needed. MFC after: 3 weeks Sponsored by: Netflix	2019-11-15 13:45:38 +00:00
bz	119955aa89	nd6: remove unused structs and defines Remove a collections of unused structs and #defines to make it easier to understand what is actually in use. Sponsored by: Netflix	2019-11-13 14:28:07 +00:00
bz	c65d3b689b	nd6: make nd6_alloc() file static nd6_alloc() is a function used only locally. Make it static and no longer export it. Keeps the KPI smaller. Sponsored by: Netflix	2019-11-13 13:53:17 +00:00
bz	4a111320e1	nd6 defrouter: consolidate nd_defrouter manipulations in nd6_rtr.c Move the nd_defrouter along with the sysctl handler from nd6.c to nd6_rtr.c and make the variable file static. Provide (temporary) new accessor functions for code manipulating nd_defrouter from nd6.c, and stop exporting functions no longer needed outside nd6_rtr.c. This also shuffles a few functions around in nd6_rtr.c without functional changes. Given all nd_defrouter logic is now in one place we can tidy up the code, locking and, and other open items. MFC after: 3 weeks X-MFC: keep exporting the functions Sponsored by: Netflix	2019-11-13 12:05:48 +00:00
bz	4f4dd10c54	netinet: update mp to pass the proper value back In ip6_[direct_]input() we are looping over the extension headers to deal with the next header. We pass a pointer to an mbuf pointer to the handling functions. In certain cases the mbuf can be updated there and we need to pass the new one back. That missing in dest6_input() and route6_input(). In tcp6_input() we should also update it before we call tcp_input(). In addition to that mark the mbuf NULL all the times when we return that we are done with handling the packet and no next header should be checked (IPPROTO_DONE). This will eventually allow us to assert proper behaviour and catch the above kind of errors more easily, expecting *mp to always be set. This change is extracted from a larger patch and not an exhaustive change across the entire stack yet. PR: 240135 Reported by: prabhakar.lakhera gmail.com MFC after: 3 weeks Sponsored by: Netflix	2019-11-12 15:46:28 +00:00
glebius	2d2333004c	It is unclear why in6_pcblookup_local() would require write access to the PCB hash. The function doesn't modify the hash. It always asserted write lock historically, but with epoch conversion this fails in some special cases. Reviewed by: rwatson, bz Reported-by: syzbot+0b0488ca537e20cb2429@syzkaller.appspotmail.com	2019-11-11 06:28:25 +00:00
bz	13480d3e0b	frag6: properly handle atomic fragments according to RFCs. RFC 8200 says: "If the fragment is a whole datagram (that is, both the Fragment Offset field and the M flag are zero), then it does not need any further reassembly and should be processed as a fully reassembled packet (i.e., updating Next Header, adjust Payload Length, removing the Fragment header, etc.). .." That means we should remove the fragment header and make all the adjustments rather than just skipping over the fragment header. The difference should be noticeable in that a properly handled atomic fragment triggering an ICMPv6 message at an upper layer (e.g. dest unreach, unreachable port) will not include the fragment header. Update the test cases to also test for an unfragmentable part. That is needed so that the next header is properly updated (not just lengths). MFC after: 3 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22155	2019-11-08 14:36:44 +00:00
glebius	f83a4563fd	Now with epoch synchronized PCB lookup tables we can greatly simplify locking in udp_output() and udp6_output(). First, we select if we need read or write lock in PCB itself, we take the lock and enter network epoch. Then, we proceed for the rest of the function. In case if we need to modify PCB hash, we would take write lock on it for a short piece of code. We could exit the epoch before allocating an mbuf, but with this patch we are keeping it all the way into ip_output()/ip6_output(). Today this creates an epoch recursion, since ip_output() enters epoch itself. However, once all protocols are reviewed, ip_output() and ip6_output() would require epoch instead of entering it. Note: I'm not 100% sure that in udp6_output() the epoch is required. We don't do PCB hash lookup for a bound socket. And all branches of in6_select_src() don't require epoch, at least they lack assertions. Today inet6 address list is protected by rmlock, although it is CKLIST. AFAIU, the future plan is to protect it by network epoch. That would require epoch in in6_select_src(). Anyway, in future ip6_output() would require epoch, udp6_output() would need to enter it.	2019-11-07 21:01:36 +00:00
glebius	76a6e088e6	Since r353292 on input path we are always in network epoch, when we lookup PCBs. Thus, do not enter epoch recursively in in_pcblookup_hash() and in6_pcblookup_hash(). Same applies to tcp_ctlinput() and tcp6_ctlinput(). This leaves several sysctl(9) handlers that return PCB credentials unprotected. Add epoch enter/exit to all of them. Differential Revision: https://reviews.freebsd.org/D22197	2019-11-07 20:49:56 +00:00
glebius	dc2b2b2ef7	Remove unnecessary recursive epoch enter via INP_INFO_RLOCK macro in icmp6_rip6_input(). It shall always run in the network epoch.	2019-11-07 20:43:12 +00:00
glebius	61ee0e91c4	Remove unnecessary recursive epoch enter via INP_INFO_RLOCK macro in raw input functions for IPv4 and IPv6. They shall always run in the network epoch.	2019-11-07 20:40:44 +00:00
glebius	1dd56a524e	Remove unnecessary recursive epoch enter via INP_INFO_RLOCK macro in udp6_input(). It shall always run in the network epoch.	2019-11-07 20:38:53 +00:00
bz	be19ea6cb3	netinet*: variable cleanup In preparation for another change factor out various variable cleanups. These mainly include: (1) do not assign values to variables during declaration: this makes the code more readable and does allow for better grouping of variable declarations, (2) do not assign values to variables before need; e.g., if a variable is only used in the 2nd half of a function and we have multiple return paths before that, then do not set it before it is needed, and (3) try to avoid assigning the same value multiple times. MFC after: 3 weeks Sponsored by: Netflix	2019-11-07 18:29:51 +00:00
glebius	2105b345fe	Widen network epoch coverage in nd6_prefix_onlink() as in6ifa_ifpforlinklocal() requires the epoch. Reported by: bz Reviewed by: bz	2019-11-07 17:00:20 +00:00
glebius	aceb3d6047	In nd6_timer() enter the network epoch earlier. The defrouter_del() may call into leaf functions that require epoch. Since the function is already run in non-sleepable context, it should be safe to cover it whole with epoch. Reported by: syzcaller	2019-11-04 17:35:37 +00:00
bz	d7908b9933	Properly set VNET when nuking recvif from fragment queues. In theory the eventhandler invoke should be in the same VNET as the the current interface. We however cannot guarantee that for all cases in the future. So before checking if the fragmentation handling for this VNET is active, switch the VNET to the VNET of the interface to always get the one we want. Reviewed by: hselasky MFC after: 3 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22153	2019-10-25 18:54:06 +00:00
bz	3f9b3c4810	frag6: do not leak counter in error cases When allocating the IPv6 fragement packet queue entry we do checks against counters and if we pass we increment one of the counters to claim the spot. Right after that we have two cases (malloc and MAC) which can both fail in which case we free the entry but never released our claim on the counter. In theory this can lead to not accepting new fragments after a long time, especially if it would be MAC "refusing" them. Rather than immediately subtracting the value in the error case, only increment it after these two cases so we can no longer leak it. MFC after: 3 weeks Sponsored by: Netflix	2019-10-25 16:29:09 +00:00
bz	9589e8e651	frag6: prevent overwriting initial fragoff=0 packet meta-data. When we receive the packet with the first fragmented part (fragoff=0) we remember the length of the unfragmentable part and the next header (and should probably also remember ECN) as meta-data on the reassembly queue. Someone replying this packet so far could change these 2 (3) values. While changing the next header seems more severe, for a full size fragmented UDP packet, for example, adding an extension header to the unfragmentable part would go unnoticed (as the framented part would be considered an exact duplicate) but make reassembly fail. So do not allow updating the meta-data after we have seen the first fragmented part anymore. The frag6_20 test case is added which failed before triggering an ICMPv6 "param prob" due to the check for each queued fragment for a max-size violation if a fragoff=0 packet was received. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 22:07:45 +00:00
bz	2230476674	frag6: handling of overlapping fragments to conform to RFC 8200 While the comment was updated in r350746, the code was not. RFC8200 says that unless fragment overlaps are exact (same fragment twice) not only the current fragment but the entire reassembly queue for this packet must be silently discarded, which we now do if fragment offset and fragment length do not match. Obtained from: jtl MFC after: 3 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D16850	2019-10-24 20:22:52 +00:00
tuexen	56626fc5ad	Ensure that the flags indicating IPv4/IPv6 are not changed by failing bind() calls. This would lead to inconsistent state resulting in a panic. A fix for stable/11 was committed in https://svnweb.freebsd.org/base?view=revision&revision=338986 An accelerated MFC is planned as discussed with emaste@. Reported by: syzbot+2609a378d89264ff5a42@syzkaller.appspotmail.com Obtained from: jtl@ MFC after: 1 day Sponsored by: Netflix, Inc.	2019-10-24 20:05:10 +00:00
bz	a41c96bda5	frag6: export another counter read-only by sysctl Similar to the system global counter also export the per-VNET counter "frag6_nfragpackets" detailing the current number of fragment packets in this VNET's reassembly queues. The read-only counter is helpful for in-VNET statistical monitoring and for test-cases. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 20:00:37 +00:00
bz	6b64e9286e	frag6: fix counter leak in error case and optimise code In case the first fragmented part (off=0) arrives we check for the maximum packet size for each fragmented part we already queued with the addition of the unfragmentable part from the first one. For one we do not have to enter the loop at all if this is the first fragmented part to arrive, and we can skip the check. Should we encounter an error case we send an ICMPv6 message for any fragment exceeding the maximum length limit. While dequeueing the original packet and freeing it, statistics were not updated and leaked both the reassembly queue count for the fragment and the global fragment count. Found by code inspection and confirmed by tightening test cases checking more statistical and system counters. While here properly wrap a line. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 19:57:18 +00:00
bz	d4a9dd6f75	frag6.c: do not leak packet queue entry in error case When we are checking for the maximum reassembled packet size of the fragmentable part and run into the error case (packet too big), we are leaking the packet queue enntry if this was a first fragment to arrive. Properly cleanup, removing the queue entry from the bucket, decrementing counters, and freeing the memory. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 19:47:32 +00:00
bz	27aa98f3aa	frag6: leave a note about upper layer header checks TBD Per sepcification the upper layer header needs to be within the first fragment. The check was not done so far and there is an open review for related work, so just leave a note as to where to put it. Move the extraction of frag offset up to this as it is needed to determine whether this is a first fragment or not. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 12:16:15 +00:00
bz	ac29b94465	frag6: check global limits before hash and lock Check whether we are accepting more fragments (based on global limits) before doing expensive operations of calculating the hash and taking the bucket lock. This slightly increases a "race" between check time and incrementing counters (which is already there) possibly allowing a few more fragments than the maximum limits. However, when under attack, we rather save this CPU time for other packets/work. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 11:58:24 +00:00
bz	6d29bbbab8	frag6: small improvements Rather than walking the mbuf chain manually use m_last() which doing exactly that for us. Defer initializing srcifp for longer as there are multiple exit paths out of the function which do not need it set. Initialize before taking the lock though. Rename the mtx lock to match the type better. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 08:15:40 +00:00
bz	8194a6554c	frag6: remove IP6_REASS_MBUF macro The IP6_REASS_MBUF() macro did some pointer gynmastics to end up with the same type as it gets in [(cast *)&]. Spelling it out instead saves all this and makes the code more readable and less obfuscated directly using the structure field. MFC after: 3 weeks Sponsored by: Netflix	2019-10-24 07:53:10 +00:00
bz	06d37eb5ee	frag6: add "big picture" Add some ASCII relation of how the bits plug together. The terminology difference of "fragmented packets" and "fragment packets" is subtle. While here clear up more whitespace and comments. No functional change. MFC after: 3 weeks Sponsored by: Netflix	2019-10-23 23:10:12 +00:00
bz	0c95efba5c	frag6: replace KAME hand-rolled queues with queue(9) TAILQs Remove the KAME custom circular queue for fragments and fragmented packets and replace them with a standard TAILQ. This make the code a lot more understandable and maintainable and removes further hand-rolled code from the the tree using a standard interface instead. Hide the still public structures under #ifdef _KERNEL as there is no use for them in user space. The naming is a bit confusing now as struct ip6q and the ip6q[] buckets array are not the same anymore; sadly struct ip6q is also used by the MAC framework and we cannot rename it. Submitted by: jtl (initally) MFC after: 3 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D16847 (jtl's original)	2019-10-23 23:01:18 +00:00

1 2 3 4 5 ...

1927 Commits