freebsd-skq

Author	SHA1	Message	Date
Richard Scheffenegger	b878ec024b	tcp: Use jenkins_hash32() in hostcache As other parts of the base tcp stack (eg. tcp fastopen) already use jenkins_hash32, and the properties appear reasonably good, switching to use that. Reviewed By: tuexen, #transport, ae MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29515	2021-04-08 20:29:19 +02:00
Gleb Smirnoff	373ffc62c1	tcp_hostcache.c: remove unneeded includes. Reviewed by: rscheff	2021-04-08 10:58:44 -07:00
Gleb Smirnoff	29acb54393	tcp_hostcache: add bool argument for tcp_hc_lookup() to tell are we looking to only read from the result, or to update it as well. For now doesn't affect locking, but allows to push stats and expire update into single place. Reviewed by: rscheff	2021-04-08 10:58:44 -07:00
Gleb Smirnoff	489bde5753	tcp_hostcache: hide rmx_hits/rmx_updates under ifdef. They have little value unless you do some profiling investigations, but they are performance bottleneck. Reviewed by: rscheff	2021-04-08 10:58:44 -07:00
Gleb Smirnoff	2cca4c0ee0	Remove tcp_hostcache.h. Everything is private. Reviewed by: rscheff	2021-04-08 10:58:44 -07:00
Richard Scheffenegger	90cca08e91	tcp: Prepare PRR to work with NewReno LossRecovery Add proper PRR vnet declarations for consistency. Also add pointer to tcpopt struct to tcp_do_prr_ack, in preparation for it to deal with non-SACK window reduction (after loss). No functional change. MFC after: 2 weeks Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29440	2021-04-08 19:16:31 +02:00
Richard Scheffenegger	9f2eeb0262	[tcp] Fix ECN on finalizing sessions. A subtle oversight would subtly change new data packets sent after a shutdown() or close() call, while the send buffer is still draining. MFC after: 3 days Reviewed By: #transport, tuexen Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29616	2021-04-08 15:26:09 +02:00
Mark Johnston	274579831b	capsicum: Limit socket operations in capability mode Capsicum did not prevent certain privileged networking operations, specifically creation of raw sockets and network configuration ioctls. However, these facilities can be used to circumvent some of the restrictions that capability mode is supposed to enforce. Add capability mode checks to disallow network configuration ioctls and creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET internet sockets. Reviewed by: oshogbo Discussed with: emaste Reported by: manu Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29423	2021-04-07 14:32:56 -04:00
Richard Scheffenegger	a04906f027	fix typo in `38ea2bd069`	2021-04-02 20:34:33 +02:00
Richard Scheffenegger	38ea2bd069	Use sbuf_drain unconditionally After making sbuf_drain safe for external use, there is no need to protect the call. MFC after: 2 weeks Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29545	2021-04-02 20:27:46 +02:00
Richard Scheffenegger	9aef4e7c2b	tcp: Shouldn't drain empty sbuf MFC after: 2 weeks Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29524	2021-04-01 17:18:38 +02:00
Richard Scheffenegger	02f26e98c7	tcp: Add hash histogram output and validate bucket length accounting Provide a histogram output to check, if the hashsize or bucketlimit could be optimized. Also add some basic sanity checks around the accounting of the hash utilization. MFC after: 2 weeks Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29506	2021-04-01 14:44:14 +02:00
Richard Scheffenegger	529a2a0f27	tcp: For hostcache performance, use atomics instead of counters As accessing the tcp hostcache happens frequently on some classes of servers, it was recommended to use atomic_add/subtract rather than (per-CPU distributed) counters, which have to be summed up at high cost to cache efficiency. PR: 254333 MFC after: 2 weeks Sponsored by: NetApp, Inc. Reviewed By: #transport, tuexen, jtl Differential Revision: https://reviews.freebsd.org/D29522	2021-04-01 10:03:30 +02:00
Richard Scheffenegger	95e56d31e3	tcp: Make hostcache.cache_count MPSAFE by using a counter_u64_t Addressing the underlying root cause for cache_count to show unexpectedly high values, by protecting all arithmetic on that global variable by using counter(9). PR: 254333 Reviewed By: tuexen, #transport MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29510	2021-03-31 20:24:13 +02:00
Richard Scheffenegger	869880463c	tcp: drain tcp_hostcache_list in between per-bucket locks Explicitly drain the sbuf after completing each hash bucket to minimize the work performed while holding the hash bucket lock. PR: 254333 MFC after: 2 weeks Reviewed By: tuexen, jhb, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29483	2021-03-31 19:24:21 +02:00
Andrey V. Elsukov	c80a4b76ce	ipdivert: check that PCB is still valid after taking INPCB_RLOCK. We are inspecting PCBs of divert sockets under NET_EPOCH section, but PCB could be already detached and we should check INP_FREED flag when we took INP_RLOCK. PR: 254478 MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D29420	2021-03-30 12:31:09 +03:00
Richard Scheffenegger	cb0dd7e122	tcp: reduce memory footprint when listing tcp hostcache In tcp_hostcache_list, the sbuf used would need a large (~2MB) blocking allocation of memory (M_WAITOK), when listing a full hostcache. This may stall the requestor for an indeterminate time. A further optimization is to return the expected userspace buffersize right away, rather than preparing the output of each current entry of the hostcache, provided by: @tuexen. This makes use of the ready-made functions of sbuf to work with sysctl, and repeatedly drain the much smaller buffer. PR: 254333 MFC after: 2 weeks Reviewed By: #transport, tuexen Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29471	2021-03-28 23:50:23 +02:00
Richard Scheffenegger	b9f803b7d4	tcp: Use PRR for ECN congestion recovery MFC after: 2 weeks Reviewed By: #transport, rrs Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28972	2021-03-26 02:06:15 +01:00
Richard Scheffenegger	eb3a59a831	tcp: Refactor PRR code No functional change intended. MFC after: 2 weeks Reviewed By: #transport, rrs Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29411	2021-03-26 00:01:34 +01:00
Richard Scheffenegger	0533fab89e	tcp: Perform simple fast retransmit when SACK Blocks are missing on SACK session MFC after: 2 weeks Reviewed By: #transport, rrs Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28634	2021-03-25 23:23:48 +01:00
Michael Tuexen	d995cc7e54	sctp: fix handling of RTO.initial of 1 ms MFC after: 3 days Reported by: syzbot+5eb0e009147050056ce9@syzkaller.appspotmail.com	2021-03-22 16:44:18 +01:00
Michael Tuexen	40f41ece76	tcp: improve handling of SYN segments in SYN-SENT state Ensure that the stack does not generate a DSACK block for user data received on a SYN segment in SYN-SENT state. Reviewed by: rscheff MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D29376 Sponsored by: Netflix, Inc.	2021-03-22 15:58:49 +01:00
Richard Scheffenegger	e9f029831f	fix panic when rescue retransmission and FIN overlap PR: 254244 PR: 254309 Reviewed By: #transport, hselasky, tuexen MFC after: 3 days Sponsored By: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29315	2021-03-17 17:12:04 +01:00
Gordon Bergling	5666643a95	Fix some common typos in comments - occured -> occurred - normaly -> normally - controling -> controlling - fileds -> fields - insterted -> inserted - outputing -> outputting MFC after: 1 week	2021-03-13 18:26:15 +01:00
Gordon Bergling	183502d162	Fix a few typos in comments - trough -> through MFC after: 1 week	2021-03-13 16:37:28 +01:00
John Baldwin	5a50eb6585	Don't pass RFPROC to kproc_create(), it is redundant. Reviewed by: tuexen, kib MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D29206	2021-03-12 09:48:10 -08:00
Alexander V. Chernikov	b1d63265ac	Flush remaining routes from the routing table during VNET shutdown. Summary: This fixes rtentry leak for the cloned interfaces created inside the VNET. PR: 253998 Reported by: rashey at superbox.pl MFC after: 3 days Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown). Thus, any route table operations are too late to schedule. As the intent of the vnet teardown procedures to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`. It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish. Test Plan: ``` set_skip:set_skip_group_lo -> passed [0.053s] tail -n 200 /var/log/messages \| grep rtentry ``` Reviewers: #network, kp, bz Reviewed By: kp Subscribers: imp, ae Differential Revision: https://reviews.freebsd.org/D29116	2021-03-10 21:10:14 +00:00
Richard Scheffenegger	e53138694a	tcp: Add prr_out in preparation for PRR/nonSACK and LRD Reviewed By: #transport, kbowling MFC after: 3 days Sponsored By: Netapp, Inc. Differential Revision: https://reviews.freebsd.org/D29058	2021-03-06 00:38:22 +01:00
Richard Scheffenegger	9a13d9dcee	tcp: remove a superfluous local var in tcp_sack_partialack() No functional change. Reviewed By: #transport, tuexen MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29088	2021-03-05 18:20:23 +01:00
Richard Scheffenegger	4a8f3aad37	tcp: remove incorrect reset of SACK variable in PRR Reviewed By: #transport, rrs, tuexen PR: 253848 MFC after: 3 days Sponsored By: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29083	2021-03-05 17:45:54 +01:00
Michael Tuexen	705d06b289	rack: unbreak TCP fast open for the client side Allow sending user data on the SYN segment. Reviewed by: rrs MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D29082 Sponsored by: Netflix, Inc.	2021-03-05 16:03:03 +01:00
Kristof Provost	bb4a7d94b9	net: Introduce IPV6_DSCP(), IPV6_ECN() and IPV6_TRAFFIC_CLASS() macros Introduce convenience macros to retrieve the DSCP, ECN or traffic class bits from an IPv6 header. Use them where appropriate. Reviewed by: ae (previous version), rscheff, tuexen, rgrimes MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D29056	2021-03-04 20:56:48 +01:00
Michael Tuexen	99adf23006	RACK: fix an issue triggered by using the CDG CC module Obtained from: rrs@ MFC after: 3 days PR: 238741 Sponsored by: Netflix, Inc.	2021-03-02 12:32:16 +01:00
Richard Scheffenegger	0b0f8b359d	calculate prr_out correctly when pipe < ssthresh Reviewed By: #transport, tuexen MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28998	2021-03-01 16:26:05 +01:00
Richard Scheffenegger	e9071000c9	Improve PRR initial transmission timing Reviewed By: tuexen, #transport MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28953	2021-02-28 15:46:54 +01:00
Michael Tuexen	70e95f0b69	sctp: avoid integer overflow when starting the HB timer MFC after: 3 days Reported by: syzbot+14b9d7c3c64208fae62f@syzkaller.appspotmail.com	2021-02-27 23:27:30 +01:00
Richard Scheffenegger	9e83a6a556	Include new data sent in PRR calculation Reviewed By: #transport, kbowling MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28941	2021-02-26 22:31:58 +01:00
Richard Scheffenegger	2593f858d7	A TCP server has to take into consideration, if TCP_NOOPT is preventing the negotiation of TCP features. This affects most TCP options but adherence to RFC7323 with the timestamp option will prevent a session from getting established. PR: 253576 Reviewed By: tuexen, #transport MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28652	2021-02-25 19:12:20 +01:00
Richard Scheffenegger	31d7a27c6e	PRR: Avoid accounting left-edge twice in partial ACK. Reviewed By: #transport, kbowling MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28819	2021-02-25 18:37:47 +01:00
Richard Scheffenegger	48396dc779	Address two incorrect calculations and enhance readability of PRR code - address second instance of cwnd potentially becoming zero - fix subtle bug due to implicit int to uint typecast in max() - fix bug due to typo in hand-coded CEILING() function by using howmany() macro - use int instead of long, and add a missing long typecast - replace if conditionals with easier to read imax/imin (as in pseudocode) Reviewed By: #transport, kbowling MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28813	2021-02-25 18:32:04 +01:00
Kristof Provost	f3245be349	net: remove legacy in_addmulti() Despite the comment to the contrary neither pf nor carp use in_addmulti(). Nothing does, so get rid of it. Carp stopped using it in `08b68b0e4c` (2011). It's unclear when pf stopped using it, but before `d6d3f01e0a` (2012). Reviewed by: bz@, melifaro@ Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D28918	2021-02-25 10:13:52 +01:00
Kristof Provost	c139b3c19b	arp/nd: Cope with late calls to iflladdr_event When tearing down vnet jails we can move an if_bridge out (as part of the normal vnet_if_return()). This can, when it's clearing out its list of member interfaces, change its link layer address. That sends an iflladdr_event, but at that point we've already freed the AF_INET/AF_INET6 if_afdata pointers. In other words: when the iflladdr_event callbacks fire we can't assume that ifp->if_afdata[AF_INET] will be set. Reviewed by: donner@, melifaro@ MFC after: 1 week Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D28860	2021-02-23 13:54:07 +01:00
Hans Petter Selasky	9febbc4541	Fix for natd(8) sending wrong sequence number after TCP retransmission, terminating a TCP connection. If a TCP packet must be retransmitted and the data length has changed in the retransmitted packet, due to the internal workings of TCP, typically when ACK packets are lost, then there is a 30% chance that the logic in GetDeltaSeqOut() will find the correct length, which is the last length received. This can be explained as follows: If a "227 Entering Passive Mode" packet must be retransmitted and the length changes from 51 to 50 bytes, for example, then we have three cases for the list scan in GetDeltaSeqOut(), depending on how many prior packets were received modulo N_LINK_TCP_DATA=3: case 1: index 0: original packet 51 index 1: retransmitted packet 50 index 2: not relevant case 2: index 0: not relevant index 1: original packet 51 index 2: retransmitted packet 50 case 3: index 0: retransmitted packet 50 index 1: not relevant index 2: original packet 51 This patch simply changes the searching order for TCP packets, always starting at the last received packet instead of any received packet, in GetDeltaAckIn() and GetDeltaSeqOut(). Else no functional changes. Discussed with: rscheff@ Submitted by: Andreas Longwitz <longwitz@incore.de> PR: 230755 MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking	2021-02-22 17:13:58 +01:00
Michael Tuexen	b963ce4588	sctp: improve computation of an alternate net Especially handle the case where the net passed in is about to be deleted and therefore not in the list of nets anymore. MFC after: 3 days Reported by: syzbot+9756917a7c8381adf5e8@syzkaller.appspotmail.com	2021-02-21 17:13:06 +01:00
Michael Tuexen	5ac839029d	sctp: clear a pointer to a net which will be removed MFC after: 3 days	2021-02-21 13:06:05 +01:00
Richard Scheffenegger	a8e431e153	PRR: use accurate rfc6675_pipe when enabled Reviewed By: #transport, tuexen MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28816	2021-02-20 20:11:48 +01:00
Richard Scheffenegger	853fd7a2e3	Ensure cwnd doesn't shrink to zero with PRR Under some circumstances, PRR may end up with a fully collapsed cwnd when finalizing the loss recovery. Reviewed By: #transport, kbowling Reported by: Liang Tian MFC after: 1 week Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28780	2021-02-19 13:55:32 +01:00
Kyle Evans	4c0bef07be	kern: net: remove TCP_LINGERTIME TCP_LINGERTIME can be traced back to BSD 4.4 Lite and perhaps beyond, in exactly the same form that it appears here modulo slightly different context. It used to be the case that there was a single pr_usrreq method with requests dispatched to it; these exact two lines appeared in tcp_usrreq's PRU_ATTACH handling. The only purpose of this that I can find is to cause surprising behavior on accepted connections. Newly-created sockets will never hit these paths as one cannot set SO_LINGER prior to socket(2). If SO_LINGER is set on a listening socket and inherited, one would expect the timeout to be inherited rather than changed arbitrarily like this -- noting that SO_LINGER is nonsense on a listening socket beyond inheritance, since they cannot be 'connected' by definition. Neither Illumos nor Linux reset the timer like this based on testing and inspection of Illumos, and testing of Linux. Reviewed by: rscheff, tuexen Differential Revision: https://reviews.freebsd.org/D28265	2021-02-18 22:36:01 -06:00
Randall Stewart	e13e4fa6c4	fix Navdeep's LINT_NOINET error.	2021-02-18 07:29:12 -05:00
Randall Stewart	0a4f851074	Fix another pesky missing #ifdef TCPHPTS	2021-02-18 01:27:30 -05:00
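
Note on 9febbc4541 (natd search order): the three cases listed in that commit message can be reproduced with a small stand-alone model. This is only a sketch under simplifying assumptions, not the libalias code: the `ring`, `rec`, `lookup_from_zero`, `lookup_newest_first` and `demo_case` names are hypothetical, matching is reduced to an exact sequence-number comparison, and only the ring size of 3 is taken from the commit message (N_LINK_TCP_DATA=3).

```c
#include <assert.h>
#include <string.h>

#define RING_SIZE 3	/* mirrors N_LINK_TCP_DATA=3 from the commit message */

/* Hypothetical, simplified record: one remembered packet. */
struct rec {
	int active;
	unsigned long seq;
	int len;
};

struct ring {
	struct rec slot[RING_SIZE];
	int next;	/* slot that the next packet will overwrite */
};

void
ring_add(struct ring *r, unsigned long seq, int len)
{
	r->slot[r->next].active = 1;
	r->slot[r->next].seq = seq;
	r->slot[r->next].len = len;
	r->next = (r->next + 1) % RING_SIZE;
}

/* Old behaviour in this model: scan from slot 0, first match wins. */
int
lookup_from_zero(const struct ring *r, unsigned long seq)
{
	for (int i = 0; i < RING_SIZE; i++)
		if (r->slot[i].active && r->slot[i].seq == seq)
			return (r->slot[i].len);
	return (-1);
}

/* New behaviour in this model: start at the most recently written slot. */
int
lookup_newest_first(const struct ring *r, unsigned long seq)
{
	int i = r->next;

	for (int j = 0; j < RING_SIZE; j++) {
		if (i == 0)
			i = RING_SIZE;
		i--;
		if (r->slot[i].active && r->slot[i].seq == seq)
			return (r->slot[i].len);
	}
	return (-1);
}

/*
 * Record `prior` unrelated packets, then the original (51 bytes) and the
 * retransmission (50 bytes) with the same sequence number, and look it up.
 */
int
demo_case(int prior, int newest_first)
{
	struct ring r;

	memset(&r, 0, sizeof(r));
	for (int k = 0; k < prior; k++)
		ring_add(&r, 100UL + (unsigned long)k, 10);
	ring_add(&r, 1000UL, 51);
	ring_add(&r, 1000UL, 50);
	return (newest_first ? lookup_newest_first(&r, 1000UL) :
	    lookup_from_zero(&r, 1000UL));
}

/* Walks the three cases; the newest-first scan finds 50 in each of them. */
void
check_three_cases(void)
{
	for (int prior = 0; prior < RING_SIZE; prior++)
		assert(demo_case(prior, 1) == 50);
}
```

In this model `demo_case(0, 0)` and `demo_case(1, 0)` return the stale length, and only `demo_case(2, 0)` returns the retransmitted one, which corresponds to cases 1 to 3 in the commit message. Calling `check_three_cases()` from a `main` exercises the newest-first scan.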

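Note on 48396dc779 (PRR arithmetic): two of the items in that commit message, the implicit int-to-unsigned conversion in `max()` and the hand-coded ceiling, can be illustrated in isolation. The sketch below uses local stand-ins with a `_demo` suffix rather than the kernel's own `max()`, `imax()` and `howmany()`. It assumes, as in FreeBSD's headers to the best of my knowledge, that `max()` takes unsigned arguments, `imax()` takes `int`, and `howmany(x, y)` expands to `((x) + ((y) - 1)) / (y)`. The function names and sample values are hypothetical and are not taken from the PRR code.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for howmany(): integer ceiling of x / y for non-negative x, y > 0. */
#define HOWMANY_DEMO(x, y)	(((x) + ((y) - 1)) / (y))

/* Stand-in for an unsigned max(): both arguments are converted to unsigned. */
unsigned int
umax_demo(unsigned int a, unsigned int b)
{
	return (a > b ? a : b);
}

/* Stand-in for imax(): signed comparison. */
int
imax_demo(int a, int b)
{
	return (a > b ? a : b);
}

/* Clamp a possibly negative adjustment at zero using the signed helper. */
int
clamp_signed(int delta)
{
	return (imax_demo(delta, 0));
}

/*
 * The same clamp through the unsigned helper: a negative delta wraps to a
 * huge positive value before the comparison, so the clamp has no effect.
 */
unsigned int
clamp_unsigned(int delta)
{
	return (umax_demo((unsigned int)delta, 0U));
}

void
prr_arith_selfcheck(void)
{
	assert(clamp_signed(-1) == 0);
	assert(clamp_unsigned(-1) == UINT_MAX);
	assert(HOWMANY_DEMO(10, 4) == 3);	/* 10/4 rounds up to 3 */
	assert(HOWMANY_DEMO(8, 4) == 2);	/* exact division is unchanged */
}
```

The point of the sketch is only that a "never below zero" clamp needs the signed variant; with the unsigned one, a negative intermediate result silently becomes a very large window.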

6858 Commits