freebsd-dev

Author	SHA1	Message	Date
Andrey V. Elsukov	fcf596178b	Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352	2017-02-06 08:49:57 +00:00
Hiren Panchasara	6134aabe38	Add a knob to change default behavior of inheriting listen socket's tcp stack regardless of what the default stack for the system is set to. With current/default behavior, after changing the default tcp stack, the application needs to be restarted to pick up that change. Setting this new knob net.inet.tcp.functions_inherit_listen_socket_stack to '0' would change that behavior and make any new connection use the newly selected default tcp stack. Reviewed by: rrs MFC after: 2 weeks Sponsored by: Limelight Networks	2017-01-27 23:10:46 +00:00
Gleb Smirnoff	030b9c2f69	Remove assigned only variable.	2016-12-21 22:47:10 +00:00
Hiren Panchasara	2806b2933b	For RTT calculations mid-session, we explicitly ignore ACKs with tsecr of 0 as many borken middle-boxes tend to do that. But during 3whs, in syncache_expand(), we don't do that which causes us to send a RST to such a client. Relax this constraint by only using tsecr to compare against timestamp that we sent when it is not 0. As a result, we'd now accept the final ACK of 3whs with tsecr of 0. Reviewed by: jtl, gnn Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D8552	2016-11-21 20:53:11 +00:00
Julien Charbon	f1ee30ccd6	Remove an extraneous call to soisconnected() in syncache_socket(), introduced with r261242. The useful and expected soisconnected() call is done in tcp_do_segment(). Has been found as part of unrelated PR:212920 investigation. Improve slightly (~2%) the maximum number of TCP accept per second. Tested by: kevin.bowling_kev009.com, jch Approved by: gnn, hiren MFC after: 1 week Sponsored by: Verisign, Inc Differential Revision: https://reviews.freebsd.org/D8072	2016-10-26 15:19:18 +00:00
Patrick Kelsey	09c305eb65	Fix cases where the TFO pending counter would leak references, and eventually, memory. Also renamed some tfo labels and added/reworked comments for clarity. Based on an initial patch from jtl. PR: 213424 Reviewed by: jtl MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D8235	2016-10-15 01:41:28 +00:00
Jonathan T. Looney	68bd7ed102	The TFO server-side code contains some changes that are not conditioned on the TCP_RFC7413 kernel option. This change removes those few instructions from the packet processing path. While not strictly necessary, for the sake of consistency, I applied the new IS_FASTOPEN macro to all places in the packet processing path that used the (t_flags & TF_FASTOPEN) check. Reviewed by: hiren Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D8219	2016-10-12 19:06:50 +00:00
Julien Charbon	c1b19923a3	Fix an issue with accept_filter introduced with r261242: As a side effect of r261242 when using accept_filter the first call to soisconnected() is done earlier in tcp_input() instead of tcp_do_segment() context. Restore the expected behaviour. Note: This call to soisconnected() seems to be extraneous in all cases (with or without accept_filter). Will be addressed in a separate commit. PR: 212920 Reported by: Alexey Tested by: Alexey, jch Sponsored by: Verisign, Inc. MFC after: 1 week	2016-09-29 11:18:48 +00:00
Randall Stewart	587d67c008	Here we update the modular tcp to be able to switch to an alternate TCP stack in other then the closed state (pre-listen/connect). The idea is that if that is supported by the alternate stack, it is asked if its ok to switch. If it approves the "handoff" then we allow the switch to happen. Also the fini() function now gets a flag to tell if you are switching away or the tcb is destroyed. The init() call into the alternate stack is moved to the end so the tcb is more fully formed before the init transpires. Sponsored by: Netflix Inc. Differential Revision: D6790	2016-08-16 15:11:46 +00:00
Sepherosa Ziehau	e6ec45f869	tcp/syncache: Add comment for syncache_respond Suggested by: hiren, hps Reviewed by: sbruno Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D6148	2016-05-10 04:59:04 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Sepherosa Ziehau	9340a8d5b9	tcp/syncache: Set flowid and hash type properly for SYN\|ACK So the underlying drivers can use it to select the sending queue properly for SYN\|ACK instead of rolling their own hash. Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D6120	2016-04-29 07:23:08 +00:00
Pedro F. Giffuni	63b6b7a74a	Indentation issues. Contract some lines leftover from r298310. Mea culpa.	2016-04-20 16:19:44 +00:00
Pedro F. Giffuni	02abd40029	kernel: use our nitems() macro when it is available through param.h. No functional change, only trivial cases are done in this sweep, Discussed in: freebsd-current	2016-04-19 23:48:27 +00:00
Bjoern A. Zeeb	dc95d65555	Mfp: r296309 While there is no dependency interaction, stopping the timer before freeing the rest of the resources seems more natural and avoids it being scheduled an extra time when it is no longer needed. Reviewed by: gnn, emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D5733	2016-04-09 10:51:07 +00:00
Gleb Smirnoff	bf840a1707	Redo r294869. The array of counters for TCP states doesn't belong to struct tcpstat, because the structure can be zeroed out by netstat(1) -z, and of course running connection counts shouldn't be touched. Place running connection counts into separate array, and provide separate read-only sysctl oid for it.	2016-03-15 00:15:10 +00:00
Gleb Smirnoff	75dd79d937	Grab a snap amount of TCP connections in syncache from tcpstat.	2016-01-27 00:48:05 +00:00
Gleb Smirnoff	57a78e3bae	Augment struct tcpstat with tcps_states[], which is used for book-keeping the amount of TCP connections by state. Provides a cheap way to get connection count without traversing the whole pcb list. Sponsored by: Netflix	2016-01-27 00:45:46 +00:00
Patrick Kelsey	281a0fd4f9	Implementation of server-side TCP Fast Open (TFO) [RFC7413]. TFO is disabled by default in the kernel build. See the top comment in sys/netinet/tcp_fastopen.c for implementation particulars. Reviewed by: gnn, jch, stas MFC after: 3 days Sponsored by: Verisign, Inc. Differential Revision: https://reviews.freebsd.org/D4350	2015-12-24 19:09:48 +00:00
Randall Stewart	55bceb1e2b	First cut of the modularization of our TCP stack. Still to do is to clean up the timer handling using the async-drain. Other optimizations may be coming to go with this. Whats here will allow differnet tcp implementations (one included). Reviewed by: jtl, hiren, transports Sponsored by: Netflix Inc. Differential Revision: D4055	2015-12-16 00:56:45 +00:00
Gleb Smirnoff	388909a12a	Use Jenkins hash for TCP syncache. o Unlike xor, in Jenkins hash every bit of input affects virtually every bit of output, thus salting the hash actually works. With xor salting only provides a false sense of security, since if hash(x) collides with hash(y), then of course, hash(x) ^ salt would also collide with hash(y) ^ salt. [1] o Jenkins provides much better distribution than xor, very close to ideal. TCP connection setup/teardown benchmark has shown a 10% increase with default hash size, and with bigger hashes that still provide possibility for collisions. With enormous hash size, when dataset is by an order of magnitude smaller than hash size, the benchmark has shown 4% decrease in performance decrease, which is expected and acceptable. Noticed by: Jeffrey Knockel <jeffk cs.unm.edu> [1] Benchmarks by: jch Reviewed by: jch, pkelsey, delphij Security: strengthens protection against hash collision DoS Sponsored by: Nginx, Inc.	2015-09-05 10:15:19 +00:00
Julien Charbon	ff9b006d61	Decompose TCP INP_INFO lock to increase short-lived TCP connections scalability: - The existing TCP INP_INFO lock continues to protect the global inpcb list stability during full list traversal (e.g. tcp_pcblist()). - A new INP_LIST lock protects inpcb list actual modifications (inp allocation and free) and inpcb global counters. It allows to use TCP INP_INFO_RLOCK lock in critical paths (e.g. tcp_input()) and INP_INFO_WLOCK only in occasional operations that walk all connections. PR: 183659 Differential Revision: https://reviews.freebsd.org/D2599 Reviewed by: jhb, adrian Tested by: adrian, nitroboost-gmail.com Sponsored by: Verisign, Inc.	2015-08-03 12:13:54 +00:00
Hiren Panchasara	ec446b1375	Make syncookie_mac() use 'tcp_seq irs' in computing hash. This fixes what seems like a simple oversight when the function was added in r253210. Reported by: Daniel Borkmann <dborkman@redhat.com> Florian Westphal <fw@strlen.de> Differential Revision: https://reviews.freebsd.org/D1628 Reviewed by: gnn MFC after: 1 month Sponsored by: Limelight Networks	2015-01-30 17:29:07 +00:00
Hans Petter Selasky	c25290420e	Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but does no longer have any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file. This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows. "M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before. Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped. MFC after: 1 month Sponsored by: Mellanox Technologies	2014-12-01 11:45:24 +00:00
Gleb Smirnoff	6df8a71067	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.	2014-11-07 09:39:05 +00:00
Andrey V. Elsukov	7e4217558c	Fix typo.	2014-10-31 11:40:49 +00:00
Alexander V. Chernikov	29c47f18da	* Split tcp_signature_compute() into 2 pieces: - tcp_get_sav() - SADB key lookup - tcp_signature_do_compute() - actual computation * Fix TCP signature case for listening socket: do not assume EVERY connection coming to socket with TCP_SIGNATURE set to be md5 signed regardless of SADB key existance for particular address. This fixes the case for routing software having _some_ BGP sessions secured by md5. * Simplify TCP_SIGNATURE handling in tcp_input() MFC after: 2 weeks	2014-09-27 07:04:12 +00:00
John Baldwin	a7c7f2a7e2	In tcp_input(), don't acquire the pcbinfo global write lock for SYN packets targeting a listening socket. Permit to reduce TCP input processing starvation in context of high SYN load (e.g. short-lived TCP connections or SYN flood). Submitted by: Julien Charbon <jcharbon@verisign.com> Reviewed by: adrian, hiren, jhb, Mike Bentkofsky	2014-09-04 19:09:08 +00:00
Bjoern A. Zeeb	ad494fa898	syncache_lookup() is a file local function. Make it static and take it out of the public KPI; seems it was never used elsewhere. MFC after: 2 weeks	2014-05-24 15:03:36 +00:00
Adrian Chadd	2f71993288	Ensure that the flowid hashtype is assigned to the inp if the flowid is also assigned.	2014-05-18 22:34:06 +00:00
Gleb Smirnoff	92f8975ff4	Utilize SYSCTL_UMA_CUR() to export usage of syncache and tcp reassembly zones. Sponsored by: Nginx, Inc.	2014-02-07 14:31:51 +00:00
George V. Neville-Neil	6f3caa6d81	Decrease lock contention within the TCP accept case by removing the INP_INFO lock from tcp_usr_accept. As the PR/patch states this was following the advice already in the code. See the PR below for a full disucssion of this change and its measured effects. PR: 183659 Submitted by: Julian Charbon Reviewed by: jhb	2014-01-28 20:28:32 +00:00
Adrian Chadd	9db69902c6	If the flowid is available for the mbuf that finalised the creation of a syncache connection, copy it into the inp_flowid field. Without this, an incoming TCP connection won't have an inp_flowid marked until some data comes in, and this means that things like the per-CPU TCP timer option will choose a different CPU for the timer work. (It also means that if one grabbed the flowid via an ioctl from userland, it won't be available until some data has been received.) Sponsored by: Netflix, Inc.	2014-01-18 23:48:20 +00:00
Gleb Smirnoff	76039bc84f	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 17:58:36 +00:00
Mark Johnston	57f6086735	Implement the ip, tcp, and udp DTrace providers. The probe definitions use dynamic translation so that their arguments match the definitions for these providers in Solaris and illumos. Thus, existing scripts for these providers should work unmodified on FreeBSD. Tested by: gnn, hiren MFC after: 1 month	2013-08-25 21:54:41 +00:00
Andre Oppermann	ccd040ab18	Free the non-fatal "timestamp missing" debug string manually as it is not covered by the catch-all free for the error cases. Found by: Coverity	2013-07-16 16:37:08 +00:00
Andre Oppermann	81d392a09d	Improve SYN cookies by encoding the MSS, WSCALE (window scaling) and SACK information into the ISN (initial sequence number) without the additional use of timestamp bits and switching to the very fast and cryptographically strong SipHash-2-4 MAC hash algorithm to protect the SYN cookie against forgeries. The purpose of SYN cookies is to encode all necessary session state in the 32 bits of our initial sequence number to avoid storing any information locally in memory. This is especially important when under heavy spoofed SYN attacks where we would either run out of memory or the syncache would fill with bogus connection attempts swamping out legitimate connections. The original SYN cookies method only stored an indexed MSS values in the cookie. This isn't sufficient anymore and breaks down in the presence of WSCALE information which is only exchanged during SYN and SYN-ACK. If we can't keep track of it then we may severely underestimate the available send or receive window. This is compounded with large windows whose size information on the TCP segment header is even lower numerically. A number of years back SYN cookies were extended to store the additional state in the TCP timestamp fields, if available on a connection. While timestamps are common among the BSD, Linux and other *nix systems Windows never enabled them by default and thus are not present for the vast majority of clients seen on the Internet. The common parameters used on TCP sessions have changed quite a bit since SYN cookies very invented some 17 years ago. Today we have a lot more bandwidth available making the use window scaling almost mandatory. Also SACK has become standard making recovering from packet loss much more efficient. This change moves all necessary information into the ISS removing the need for timestamps. Both the MSS (16 bits) and send WSCALE (4 bits) are stored in 3 bit indexed form together with a single bit for SACK. While this is significantly less than the original range, it is sufficient to encode all common values with minimal rounding. The MSS depends on the MTU of the path and with the dominance of ethernet the main value seen is around 1460 bytes. Encapsulations for DSL lines and some other overheads reduce it by a few more bytes for many connections seen. Rounding down to the next lower value in some cases isn't a problem as we send only slightly more packets for the same amount of data. The send WSCALE index is bit more tricky as rounding down under-estimates the available send space available towards the remote host, however a small number values dominate and are carefully selected again. The receive WSCALE isn't encoded at all but recalculated based on the local receive socket buffer size when a valid SYN cookie returns. A listen socket buffer size is unlikely to change while active. The index values for MSS and WSCALE are selected for minimal rounding errors based on large traffic surveys. These values have to be periodically validated against newer traffic surveys adjusting the arrays tcp_sc_msstab[] and tcp_sc_wstab[] if necessary. In addition the hash MAC to protect the SYN cookies is changed from MD5 to SipHash-2-4, a much faster and cryptographically secure algorithm. Reviewed by: dwmalone Tested by: Fabian Keil <fk@fabiankeil.de>	2013-07-11 15:29:25 +00:00
Andre Oppermann	07dacf031e	Extend debug logging of TCP timestamp related specification violations. Update related comments and style.	2013-07-10 12:06:01 +00:00
Andre Oppermann	cda3447bb0	uma_zone_set_max() directly returns the rounded effective zone limit. Use the return value directly instead of doing a second uma_zone_set_max() step. MFC after: 1 week	2013-02-01 14:21:09 +00:00
Navdeep Parhar	464dfeb43f	Add TCP_OFFLOAD hook in syncache_respond for IPv6 too, just like the one that exists for IPv4. Reviewed by: bz@	2013-01-25 22:16:35 +00:00
Gleb Smirnoff	e8db9937f3	Fix sysctl_handle_int() usage. Either arg1 or arg2 should be supplied, and arg2 doesn't pass size of arg1.	2012-12-25 13:55:21 +00:00
Gleb Smirnoff	eb1b1807af	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually	2012-12-05 08:04:20 +00:00
Andre Oppermann	024fd5b6bb	For retransmits of SYN\|ACK from the syncache use the slightly more aggressive special tcp_syn_backoff[] retransmit schedule instead of the normal tcp_backoff[] schedule for established connections. MFC after: 2 weeks	2012-10-28 19:02:07 +00:00
Andre Oppermann	f3a10d7954	Change the syncache count reporting the current number of entries from an unprotected u_int that reports garbage on SMP to a function based sysctl obtaining the current value from UMA. Also read back the actual cache_limit after page size rounding by UMA. PR: kern/165879 MFC after: 2 weeks	2012-10-28 18:07:34 +00:00
Andre Oppermann	cf8f04f4c0	When SYN or SYN/ACK had to be retransmitted RFC5681 requires us to reduce the initial CWND to one segment. This reduction got lost some time ago due to a change in initialization ordering. Additionally in tcp_timer_rexmt() avoid entering fast recovery when we're still in TCPS_SYN_SENT state. MFC after: 2 weeks	2012-10-28 17:25:08 +00:00
Gleb Smirnoff	8f134647ca	Switch the entire IPv4 stack to keep the IP packet header in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet. After this change a packet processed by the stack isn't modified at all[2] except for TTL. After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack. [1] One exception still remains. The raw sockets convert host byte order before pass a packet to an application. Probably this would remain for ages for compatibility. [2] The ip_input() still subtructs header len from ip->ip_len, but this is planned to be fixed soon. Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>	2012-10-22 21:09:03 +00:00
Navdeep Parhar	09fe63205c	- Updated TOE support in the kernel. - Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs. These are available as t3_tom and t4_tom modules that augment cxgb(4) and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as usual with or without these extra features. - iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the works and will follow soon. Build-tested with make universe. 30s overview ============ What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the capabilities of an interface: # ifconfig -m \| grep TOE Enable/disable TCP offload on an interface (just like any other ifnet capability): # ifconfig cxgbe0 toe # ifconfig cxgbe0 -toe Which connections are offloaded? Look for toe4 and/or toe6 in the output of netstat and sockstat: # netstat -np tcp \| grep toe # sockstat -46c \| grep toe Reviewed by: bz, gnn Sponsored by: Chelsio communications. MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)	2012-06-19 07:34:13 +00:00
Bjoern A. Zeeb	356ab07e2d	It turns out that too many drivers are not only parsing the L2/3/4 headers for TSO but also for generic checksum offloading. Ideally we would only have one common function shared amongst all drivers, and perhaps when updating them for IPv6 we should introduce that. Eventually we should provide the meta information along with mbufs to avoid (re-)parsing entirely. To not break IPv6 (checksums and offload) and to be able to MFC the changes without risking to hurt 3rd party drivers, duplicate the v4 framework, as other OSes have done as well. Introduce interface capability flags for TX/RX checksum offload with IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6 flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6 fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and add an alias for CSUM_DATA_VALID_IPV6. This pretty much brings IPv6 handling in line with IPv4. TSO is still handled in a different way and not via if_hwassist. Update ifconfig to allow (un)setting of the new capability flags. Update loopback to announce the new capabilities and if_hwassist flags. Individual driver updates will have to follow, as will SCTP. Reported by: gallatin, dim, .. Reviewed by: gallatin (glanced at?) MFC after: 3 days X-MFC with: r235961,235959,235958	2012-05-28 09:30:13 +00:00
Bjoern A. Zeeb	45747ba53c	MFp4 bz_ipv6_fast: Add code to handle pre-checked TCP checksums as indicated by mbuf flags to save the entire computation for validation if not needed. In the IPv6 TCP output path only compute the pseudo-header checksum, set the checksum offset in the mbuf field along the appropriate flag as done in IPv4. In tcp_respond() just initialize the IPv6 payload length to 0 as ip6_output() will properly set it. Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-25 02:23:26 +00:00
Bjoern A. Zeeb	d8951c8a2f	Fix PAWS (Protect Against Wrapped Sequence numbers) in cases when hz >> 1000 and thus getting outside the timestamp clock frequenceny of 1ms < x < 1s per tick as mandated by RFC1323, leading to connection resets on idle connections. Always use a granularity of 1ms using getmicrouptime() making all but relevant callouts independent of hz. Use getmicrouptime(), not getmicrotime() as the latter may make a jump possibly breaking TCP nfsroot mounts having our timestamps move forward for more than 24.8 days in a second without having been idle for that long. PR: kern/61404 Reviewed by: jhb, mav, rrs Discussed with: silby, lstewart Sponsored by: Sandvine Incorporated (originally in 2011) MFC after: 6 weeks	2012-02-15 16:09:56 +00:00

1 2 3 4 5

242 Commits