code paths. The two are not consistent with each other, and the one in the
syn cache code does not conform to the relevant specifications (page 69 of
RFC 793 and Section 4.2 of RFC 5961).
This patch fixes this:
* The sequence number checks are fixed as specified on
page 69 of RFC 793.
* The sysctl variable net.inet.tcp.insecure_rst is now honoured
and the behaviour is as specified in Section 4.2 of RFC 5961.
A sketch of the resulting checks is given below.
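This is only an illustration with made-up names, not the actual syncache or
tcp_input code (real code uses the modular SEQ_* comparisons):

    #include <stdint.h>

    enum rst_action { RST_DROP, RST_CHALLENGE_ACK, RST_ACCEPT };

    static enum rst_action
    check_rst(uint32_t seq, uint32_t rcv_nxt, uint32_t rcv_wnd,
        int insecure_rst)
    {
        if (seq == rcv_nxt)
            return (RST_ACCEPT);        /* exactly the expected sequence number */
        if ((seq - rcv_nxt) >= rcv_wnd) /* unsigned diff == modular compare */
            return (RST_DROP);          /* outside the window: ignore */
        return (insecure_rst ?
            RST_ACCEPT :                /* RFC 793, page 69 behaviour */
            RST_CHALLENGE_ACK);         /* RFC 5961, Section 4.2 */
    }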
Approved by: re (gjb@)
Reviewed by:	bz@, glebius@, rrs@
Differential Revision: https://reviews.freebsd.org/D17595
Sponsored by: Netflix, Inc.
to this change, the code sometimes used a temporary stack variable to hold
details of a TCP segment. r338102 stopped using the variable to hold
segments, but did not actually remove the variable.
Because the variable is no longer used, we can safely remove it.
Approved by: re (gjb)
an inp marked FREED after the epoch(9) changes.
Check once we hold the lock and skip the inp if it is the case.
Contrary to IPv6, the locking of the inp is outside the multicast
section, and hence a single check seems to suffice.
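A minimal sketch of the pattern inside the loop over the pcbs; the flag and
lock spellings are my assumption, not quoted from the change:

    INP_WLOCK(inp);
    if (inp->inp_flags2 & INP_FREED) {  /* freed while we waited for the lock */
        INP_WUNLOCK(inp);
        continue;                       /* skip this inp */
    }
    /* ... the inp is safe to use here ... */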
PR: 232192
Reviewed by: mmacy, markj
Approved by: re (kib)
Differential Revision: https://reviews.freebsd.org/D17540
but leaving the variable assignment outside the block, where it is no longer
used. Move both the variable and the assignment one block further in.
This should result in no functional changes. It will however make upcoming
changes slightly easier to apply.
Reviewed by: markj, jtl, tuexen
Approved by: re (kib)
Differential Revision: https://reviews.freebsd.org/D17525
epoch section without exiting that epoch section. This is bad for two
reasons: the epoch section never exits, and the stack-allocated epoch
tracker is left on the epoch list.
Fix the epoch leak by making sure we exit epoch sections before returning.
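A sketch of the corrected pattern, using the current NET_EPOCH_* macro
spellings and a hypothetical helper, so not the exact code touched here:

    struct epoch_tracker et;
    int error;

    NET_EPOCH_ENTER(et);
    error = handle_packet(m);   /* hypothetical helper */
    if (error != 0) {
        NET_EPOCH_EXIT(et);     /* exit on the early return path too, or the */
        return (error);         /* stack tracker stays linked on the epoch list */
    }
    NET_EPOCH_EXIT(et);
    return (0);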
Reviewed by: ae, gallatin, mmacy
Approved by: re (gjb, kib)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D17450
locally generated SCTP packets sent over IPv4. This makes
the behaviour consistent with IPv6.
Reviewed by: ae@, bz@, jtl@
Approved by: re (kib@)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D17406
When computing the number of bytes to checksum, make sure to convert the
UDP length to host byte order when the entire header is not contained in
the first mbuf.
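Roughly, as a sketch rather than the in-tree diff: when the header has to
be copied out of the mbuf chain, the length field is still in network byte
order and must be converted before it is used as a byte count:

    struct udphdr uh;
    int len;

    /* Header not contained in the first mbuf: copy it out piecewise. */
    m_copydata(m, off, sizeof(uh), (caddr_t)&uh);
    len = ntohs(uh.uh_ulen);    /* bytes to checksum, in host byte order */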
Reviewed by: jtl, tuexen, ae
Approved by: re (gjb), jtl (mentor)
Differential Revision: https://reviews.freebsd.org/D17357
With the new route cache feature, udp_notify() will modify the inp when it
needs to invalidate the route cache. Ensure that we hold a write lock on
the inp before calling the function, so that multiple threads don't race
while trying to invalidate the cache (which previously led to a page
fault).
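A sketch of the resulting calling convention (the error value and the
surrounding code are illustrative):

    INP_WLOCK(inp);             /* write lock: udp_notify() may replace or */
    udp_notify(inp, error);     /* free the cached route                   */
    INP_WUNLOCK(inp);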
Differential Revision: https://reviews.freebsd.org/D17246
Reviewed by: sbruno, bz, karels
Sponsored by: Dell EMC Isilon
Approved by: re (gjb)
This removes two duplicate assignments of the flags field and adds one
assignment which was missing.
Thanks to Felix Weinrank for reporting the issue, which he found by
fuzz testing the userland stack.
Approved by: re (kib@)
MFC after: 1 week
INP_INFO_UNLOCK_ASSERT() in TCP-related code. For encapsulated traffic
it is possible that the code is running in a net_epoch_preempt section,
and INP_INFO_UNLOCK_ASSERT() is too strict an assertion for that case.
PR: 231428
Reviewed by: mmacy, tuexen
Approved by: re (kib)
Differential Revision: https://reviews.freebsd.org/D17335
sctp_process_cmsgs_for_init() and sctp_findassociation_cmsgs()
similar to sctp_find_cmsg() to improve consistency and avoid
the signed/unsigned issues in sctp_process_cmsgs_for_init()
and sctp_findassociation_cmsgs().
Thanks to andrew@ for reporting the problem he found using
syzkaller.
Approved by: re (kib@)
MFC after: 1 week
sending UDP-encapsulated SCTP packets.
This is consistent with the receive side, where the corresponding UDP
stats counter (udps_ipackets) is incremented when such packets arrive.
Thanks to Peter Lei for making me aware of this inconsistency.
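The output-side counter is not named above, so the following one-liner is
an assumption about which statistic is meant:

    UDPSTAT_INC(udps_opackets); /* assumed counter; mirrors udps_ipackets
                                   being bumped on the receive path */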
Approved by: re (kib@)
MFC after: 1 week
syncache_respond(). There is no functional change. The
parameter became unused in r313330, but wasn't removed.
Approved by: re (kib@)
MFC after: 1 month
Sponsored by: Netflix, Inc.
It is currently unused and reserved for future use, to keep the KBI/KPI
stable. Also add several spare pointers to be able to extend the
structure later if needed.
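Purely illustrative (not the structure touched by this commit): spare
members reserve room so the structure can grow later without changing its
size or layout:

    struct example_pcb {
        void        *ep_data;           /* fields in use today */
        uint32_t     ep_flags;
        /* Reserved so future fields do not break the KBI/KPI. */
        void        *ep_spare_ptr[4];
        uint64_t     ep_spare64[2];
    };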
Approved by: re (gjb)
* Fix a bug where the SYN handling during established state was
applied to a front state.
* Move a check for retransmission after the timer handling.
This was suppressing timer based retransmissions.
* Fix an off-by-one-byte error in the sequence number of retransmissions.
* Apply fixes corresponding to
https://svnweb.freebsd.org/changeset/base/336934
Reviewed by: rrs@
Approved by: re (kib@)
MFC after: 1 month
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D16912
Lookups are protected by an epoch section, so the LB group linkage must
be a CK_LIST rather than a plain LIST. Furthermore, we were not
deferring LB group frees, so in_pcbremlbgrouphash() could race with
readers and cause a use-after-free.
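A simplified reader-side sketch with made-up names (the real structures
live in the pcb code); the point is that epoch-protected traversal needs
CK_LIST linkage, and removals must defer the free past concurrent readers:

    struct lbgroup {
        CK_LIST_ENTRY(lbgroup)  lb_link;    /* safe to walk under epoch */
        uint16_t                lb_port;
    };
    CK_LIST_HEAD(lbgrouphead, lbgroup);

    static struct lbgroup *
    lbgroup_lookup(struct lbgrouphead *head, uint16_t port)
    {
        struct lbgroup *grp;

        /* Caller is inside an epoch section. */
        CK_LIST_FOREACH(grp, head, lb_link) {
            if (grp->lb_port == port)
                return (grp);
        }
        return (NULL);
    }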
Reviewed by: sbruno, Johannes Lundberg <johalun0@gmail.com>
Tested by: gallatin
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17031
Reviewed by: bz, Johannes Lundberg <johalun0@gmail.com>
Approved by: re (kib)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17065
for the rt and lle cache were added in r191129 (2009).
To the best of my knowledge they have never been used, and route caching
has converted the inp_rt field from that commit to inp_route,
rendering this field and these flags obsolete.
Convert the pointer into a spare pointer, so the size of the structure
does not change (and we gain a spare pointer), and mark the
two fields as unused.
Reviewed by: markj, karels
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17062
adding the missing include files and changing the type of cpuid, which
would otherwise cause a false comparison with NETISR_CPUID_NONE.
Reviewed by: rrs
Approved by: re (marius)
Differential Revision: https://reviews.freebsd.org/D16891
Otherwise the "depends_on provider" guard in sctp.d does not work as
intended.
Reported by: mjg
Reviewed by: tuexen
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17057
No functional change intended.
Reviewed by: bz, Johannes Lundberg <johalun0@gmail.com>
Approved by: re (rgrimes)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17030
fast forwarding path, as it already works for IPv6 and for both of them
on the old slow path.
PR: 231143
Reviewed by: ae
Approved by: re (gjb)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D17039
use sizeof() or explicit #defines instead. No functional change.
This was suggested by jmg@.
MFC after: 1 month
XMFC with: r338053
Sponsored by: Netflix, Inc.
SCTP. They are based on what is specified in the Solaris DTrace manual
for Solaris 11.4.
Reviewed by: 0mp, dteske, markj
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D16839
socket resulted in sending fragmented IPv6 packets.
This is fixed by reducing the MSS to the appropriate value. In addition,
if the socket option is set before the handshake happens, announce this
MSS to the peer. This is not strictly required, but done since TCP
is conservative.
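The option is not named above, so the user-space sketch below assumes it
is IPV6_USE_MIN_MTU; with it set, segments must fit into the IPv6 minimum
MTU of 1280 bytes, so the stack reduces the MSS instead of letting the IP
layer fragment the packets:

    #include <netinet/in.h>
    #include <sys/socket.h>

    /* 's' is a connected or to-be-connected AF_INET6 TCP socket. */
    int on = 1;
    (void)setsockopt(s, IPPROTO_IPV6, IPV6_USE_MIN_MTU, &on, sizeof(on));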
PR: 173444
Reviewed by: bz@, rrs@
MFC after: 1 month
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D16796
This was broken for IPv6 listening sockets which are not IPV6_ONLY,
when the accepted TCP connection was using IPv4.
Reviewed by: bz@, rrs@
MFC after: 1 month
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D16792
This is not a functional change but a preparation for the upcoming
DTrace support. It is necessary to change the state in one
logical operation, even if it involves clearing the sub state
SHUTDOWN_PENDING.
MFC after: 1 month
reassembly inbound tcp segments. The old algorithm just blindly
dropped segments in without coalescing. This meant that every
segment could take up more and more room on the linked list
of segments. That list is now subject to a tighter limit (100
segments), which in a high BDP situation will make us a
lot less efficient, since we drop the segments we receive beyond
the first 100 entries. What this restructure does is make the
reassembly buffer coalesce segments, putting an emphasis on the two
common cases (which avoid walking the list of segments), i.e.
where we add to the back of the queue of segments and where we
add to the front. The reassembly buffer also supports
a couple of debug options (black box logging as well as counters
for code coverage). These are compiled out by default but can
be enabled by uncommenting the defines.
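A simplified user-space sketch of the two fast paths (not the kernel code;
real TCP code uses the modular SEQ_* comparisons, and the caller frees the
entry when it was coalesced):

    #include <sys/queue.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct seg {
        TAILQ_ENTRY(seg) link;
        uint32_t start;             /* first byte held */
        uint32_t end;               /* one past the last byte held */
    };
    TAILQ_HEAD(segq, seg);

    /* Returns true if the segment was placed without walking the queue. */
    static bool
    reass_fastpath(struct segq *q, struct seg *n)
    {
        struct seg *last = TAILQ_LAST(q, segq);
        struct seg *first = TAILQ_FIRST(q);

        if (last != NULL && n->start >= last->end) {
            if (n->start == last->end)
                last->end = n->end;         /* coalesce at the back */
            else
                TAILQ_INSERT_TAIL(q, n, link);
            return (true);
        }
        if (first != NULL && n->end <= first->start) {
            if (n->end == first->start)
                first->start = n->start;    /* coalesce at the front */
            else
                TAILQ_INSERT_HEAD(q, n, link);
            return (true);
        }
        return (false);                     /* rare case: walk the queue */
    }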
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D16626
The TCP client side, or the TCP server side when not using SYN-cookies,
used the uptime as the TCP timestamp value. This patch uses an offset
in all cases, which is the result of a keyed hash function taking
the source and destination addresses and port numbers into account.
The keyed hash function is the same as used for the initial TSN.
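A toy sketch of the idea; the FNV-style mix below is just a stand-in for
the real keyed hash and is not cryptographically adequate:

    #include <stddef.h>
    #include <stdint.h>

    /* Stand-in for the real keyed hash; illustration only. */
    static uint32_t
    keyed_hash(const void *buf, size_t len, uint32_t key)
    {
        const uint8_t *p = buf;
        uint32_t h = key ^ 2166136261u;

        while (len-- > 0)
            h = (h ^ *p++) * 16777619u;
        return (h);
    }

    /* Per-connection timestamp offset derived from the 4-tuple and a key
     * generated at boot; the value on the wire is then ticks + offset
     * rather than the raw uptime. */
    static uint32_t
    tcp_ts_offset(uint32_t laddr, uint16_t lport, uint32_t faddr,
        uint16_t fport, uint32_t key)
    {
        struct {
            uint32_t laddr, faddr;
            uint16_t lport, fport;
        } tuple = { laddr, faddr, lport, fport };

        return (keyed_hash(&tuple, sizeof(tuple), key));
    }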
Reviewed by: rrs@
MFC after: 1 month
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D16636
toe_l2_resolve to fill in the complete vtag and not just the vid.
Reviewed by: kib@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D16752
This is actually several different bugs:
- The code is not designed to handle inpcb deletion after interface deletion
- add reference for inpcb membership
- The multicast address has to be removed from interface lists when the refcount
goes to zero OR when the interface goes away
- decouple list disconnect from refcount (v6 only for now)
- ifmultiaddr can exist past being on interface lists
- add flag for tracking whether or not it's enqueued
- deferring freeing moptions makes the inpcb cleanup code simpler but opens the
  door wider still to races
- call inp_gcmoptions synchronously after dropping the inpcb lock
Fundamentally multicast needs a rewrite - but keep applying band-aids for now.
Tested by: kp
Reported by: novel, kp, lwhsu
In particular, try to ensure that no bucket will have a reassembly
queue larger than approximately 100 items. This limits the cost to
find the correct reassembly queue when processing an incoming
fragment.
Due to the low limits on each bucket's length, increase the size of
the hash table from 64 to 1024.
Reviewed by: jhb
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
There is a hashing algorithm which should distribute IPv4 reassembly
queues across the available buckets in a relatively even way. However,
if there is a flaw in the hashing algorithm which allows a large number
of IPv4 fragment reassembly queues to end up in a single bucket, a per-
bucket limit could help mitigate the performance impact of this flaw.
Implement such a limit, with a default of twice the maximum number of
reassembly queues divided by the number of buckets. Recalculate the
limit any time the maximum number of reassembly queues changes.
However, allow the user to override the value using a sysctl
(net.inet.ip.maxfragbucketsize).
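In other words, as a sketch with illustrative names, recomputed whenever
the maximum changes:

    static int maxfragbucketsize;   /* overridable via the sysctl above */

    static void
    update_bucket_limit(int maxfragpackets, int nbuckets)
    {
        /* Default: twice the max number of queues, spread over buckets. */
        maxfragbucketsize = 2 * maxfragpackets / nbuckets;
        if (maxfragbucketsize < 1)  /* a floor of one is an assumption */
            maxfragbucketsize = 1;
    }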
Reviewed by: jhb
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
The IP reassembly fragment limit is based on the number of mbuf clusters,
which are a global resource. However, the limit is currently applied
on a per-VNET basis. Given enough VNETs (or given sufficient customization
of enough VNETs), it is possible that the sum of all the VNET limits
will exceed the number of mbuf clusters available in the system.
Given the fact that the fragment limit is intended (at least in part) to
regulate access to a global resource, the fragment limit should
be applied on a global basis.
VNET-specific limits can be adjusted by modifying the
net.inet.ip.maxfragpackets and net.inet.ip.maxfragsperpacket
sysctls.
To disable fragment reassembly globally, set net.inet.ip.maxfrags to 0.
To disable fragment reassembly for a particular VNET, set
net.inet.ip.maxfragpackets to 0.
Reviewed by: jhb
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
Currently, IPv4 fragments are hashed into buckets based on a 32-bit
key which is calculated by (src_ip ^ ip_id) and combined with a random
seed. However, because an attacker can control the values of src_ip
and ip_id, it is possible to construct an attack which causes very
deep chains to form in a given bucket.
To ensure more uniform distribution (and lower predictability for
an attacker), calculate the hash based on a key which includes all
the fields we use to identify a reassembly queue (dst_ip, src_ip,
ip_id, and the ip protocol) as well as a random seed.
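A sketch of such a computation; jenkins_hash32() from sys/hash.h is
assumed to be available here, and the bucket count is assumed to be a
power of two:

    #include <sys/types.h>
    #include <sys/hash.h>

    static uint32_t
    frag_bucket(uint32_t dst, uint32_t src, uint16_t ip_id, uint8_t proto,
        uint32_t seed, uint32_t nbuckets)
    {
        uint32_t key[3];

        key[0] = dst;
        key[1] = src;
        key[2] = ((uint32_t)ip_id << 16) | proto;   /* id and protocol */
        return (jenkins_hash32(key, 3, seed) & (nbuckets - 1));
    }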
Reviewed by: jhb
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923