freebsd-nq

Author	SHA1	Message	Date
Andrey V. Elsukov	f367798498	Take extra reference to security policy before calling crypto_dispatch(). Currently we perform crypto requests for IPSEC synchronous for most of crypto providers (software, aesni) and only VIA padlock calls crypto callback asynchronous. In synchronous mode it is possible, that security policy will be removed during the processing crypto request. And crypto callback will release the last reference to SP. Then upon return into ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already freed request. To prevent this we will take extra reference to SP. PR: 201876 Sponsored by: Yandex LLC	2015-09-30 08:16:33 +00:00
Gleb Smirnoff	794ac42374	When processing ICMP need frag message, ignore the suggested MTU unless it is smaller than the current one for this connection. This is behavior specified by RFC 1191, and this is how original BSD stack behaved, but this was unintentionally regressed in r182851. Reported & tested by: Richard Russo <russor whatsapp.com> Differential Revision: D3567 Sponsored by: Nginx, Inc.	2015-09-30 03:37:37 +00:00
Alexander V. Chernikov	1558cb2448	Eliminate nd6_nud_hint() and its TCP bindings. Initially function was introduced in r53541 (KAME initial commit) to "provide hints from upper layer protocols that indicate a connection is making "forward progress"" (quote from RFC 2461 7.3.1 Reachability Confirmation). However, it was converted to do nothing (e.g. just return) in r122922 (tcp_hostcache implementation) back in 2003. Some defines were moved to tcp_var.h in r169541. Then, it was broken (for non-corner cases) by r186119 (L2<>L3 split) in 2008 (NULL ifp in nd6_lookup). So, right now this code is broken and has no "real" base users. Differential Revision: https://reviews.freebsd.org/D3699	2015-09-27 05:29:34 +00:00
Alexander V. Chernikov	4a336ef40c	rtsock requests for deleting interface address lles started to return EPERM instead of old "ignore-and-return 0" in r287789. This broke arp -da / ndp -cn behavior (they exit on rtsock command failure). Fix this by translating LLE_IFADDR to RTM_PINNED flag, passing it to userland and making arp/ndp ignore these entries in batched delete. MFC after: 2 weeks	2015-09-27 04:54:29 +00:00
Alexander V. Chernikov	8e5aadb617	Replace toe_nd6_resolve() with nd6_resolve(). Reviewed by: np	2015-09-22 19:05:44 +00:00
Alexander V. Chernikov	aa5f023eaf	Unify nd6 state switching by using newly-created nd6_llinfo_setstate() function. The change is mostly mechanical with the following exception: Last piece of nd6_resolve_slow() was refactored: ND6_LLINFO_PERMANENT condition was removed as always-true, explicit ND6_LLINFO_NOSTATE -> ND6_LLINFO_INCOMPLETE state transition was removed as duplicate. Reviewed by: ae Sponsored by: Yandex LLC	2015-09-21 11:19:53 +00:00
Gleb Smirnoff	399fbd0ec0	Use proper byteswap macro. This isn't a functional change.	2015-09-17 17:27:49 +00:00
Gleb Smirnoff	db642c8e6e	In tcp_ctlinput() separate the (ip == NULL) block from the rest of the function to reduce so many levels of indentation. Style the lines that got now indentation reduced. No functional change. Checked with: md5	2015-09-16 21:42:33 +00:00
Alexander V. Chernikov	59c180c35c	Unify loopback route switching: * prepare gateway before insertion * use RTM_CHANGE instead of explicit find/change route * Remove fib argument from ifa_switch_loopback_route added in r264887: if old ifp fib differes from new one, that the caller is doing something wrong * Make ifa_*_loopback_route call single ifa_maintain_loopback_route().	2015-09-16 06:23:15 +00:00
Brad Davis	e5fe11011a	Remove redundant 'man page' Reviewed by: allanjude	2015-09-15 21:16:45 +00:00
Hiren Panchasara	550e9d4235	Remove unnecessary tcp state transition call. Differential Revision: D3451 Reviewed by: markj MFC after: 2 weeks Sponsored by: Limelight Networks	2015-09-15 20:04:30 +00:00
Alexander V. Chernikov	eec33ea052	* Improve logging invalid arp messages * Remove redundant check in ip_arpinput Suggested by: glebius MFC after: 2 weeks	2015-09-15 08:50:44 +00:00
Alexander V. Chernikov	d3cdb71655	* Require explicitl lle unlink prior to calling llentry_delete(). This one slightly decreases time of holding afdata wlock. * While here, make nd6_free() return void. No one has used its return value since r186119.	2015-09-15 06:48:19 +00:00
Alexander V. Chernikov	3e7a2321e3	* Do more fine-grained locking: call eventhandlers/free_entry without holding afdata wlock * convert per-af delete_address callback to global lltable_delete_entry() and more low-level "delete this lle" per-af callback * fix some bugs/inconsistencies in IPv4/IPv6 ifscrub procedures Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D3573	2015-09-14 16:48:19 +00:00
Alexander V. Chernikov	deb6bda6e3	* Improve error checking for arp messages. * Clean stale headers from if_ether.c. Reported by: rozhuk.im at gmail.com Reviewed by: ae MFC after: 2 weeks	2015-09-14 10:28:47 +00:00
Hans Petter Selasky	d76d40126e	Update TSO limits to include all headers. To make driver programming easier the TSO limits are changed to reflect the values used in the BUSDMA tag a network adapter driver is using. The TCP/IP network stack will subtract space for all linklevel and protocol level headers and ensure that the full mbuf chain passed to the network adapter fits within the given limits. Implementation notes: If a network adapter driver needs to fixup the first mbuf in order to support VLAN tag insertion, the size of the VLAN tag should be subtracted from the TSO limit. Else not. Network adapters which typically inline the complete header mbuf could technically transmit one more segment. This patch does not implement a mechanism to recover the last segment for data transmission. It is believed when sufficiently large mbuf clusters are used, the segment limit will not be reached and recovering the last segment will not have any effect. The current TSO algorithm tries to send MTU-sized packets, where the MTU typically is 1500 bytes, which gives 1448 bytes of TCP data payload per packet for IPv4. That means if the TSO length limitiation is set to 65536 bytes, there will be a data payload remainder of (65536 - 1500) mod 1448 bytes which is equal to 324 bytes. Trying to recover total TSO length due to inlining mbuf header data will not have any effect, because adding or removing the ETH/IP/TCP headers to or from 324 bytes will not cause more or less TCP payload to be TSO'ed. Existing network adapter limits will be updated separately. Differential Revision: https://reviews.freebsd.org/D3458 Reviewed by: rmacklem MFC after: 2 weeks	2015-09-14 08:36:22 +00:00
George V. Neville-Neil	5d06879adb	dd DTrace probe points, translators and a corresponding script to provide the TCPDEBUG functionality with pure DTrace. Reviewed by: rwatson MFC after: 2 weeks Sponsored by: Limelight Networks Differential Revision: D3530	2015-09-13 15:50:55 +00:00
Michael Tuexen	30811e70d9	Fix compilation issue introduced in r287717. Thanks to bz@ for making me aware of it. MFC after: 1 week	2015-09-12 21:23:24 +00:00
Michael Tuexen	6802b0904f	Address a compile warning. MFC after: 1 week	2015-09-12 18:00:06 +00:00
Michael Tuexen	86eda749af	Cleanup the handling of error causes for ERROR chunks. This fixes an inconsistency of the padding handling. The final padding is now considered to be a chunk padding. MFC after: 1 week	2015-09-12 17:08:51 +00:00
Michael Tuexen	e629b9fc56	Ensure that ERROR chunks are always padded by implementing this in the routine, which queues an ERROR chunk, instead on relyinh on the callers to do so. Since one caller missed this, this actially fixes a bug. MFC after: 1 week	2015-09-11 13:54:33 +00:00
Michael Tuexen	0941640f34	RFC 4960 requires that packets containing an INIT chunk bundled with another chunk are silently discarded. Do so, instead of sending an ABORT. MFC after: 1 week	2015-09-07 14:00:38 +00:00
Allan Jude	32d321fa4a	missed file that should have been included in r287528 PR: 184110 Submitted by: Marie Helene Kvello-Aune <marieheleneka@gmail.com> Approved by: wblock (mentor)	2015-09-07 02:00:05 +00:00
Adrian Chadd	499baf0aa7	Replace rss_m2cpuid with rss_soft_m2cpuid_v4 for ip_direct_nh.nh_m2cpuid, because the RSS hash may need to be recalculated. Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn> Differential Revision: https://reviews.freebsd.org/D3564	2015-09-06 20:20:48 +00:00
Alexander V. Chernikov	26deb8826c	Do not pass lle to nd6_ns_output(). Use newly-added nd6_llinfo_get_holdsrc() to extract desired IPv6 source from holdchain and pass it to the nd6_ns_output().	2015-09-05 14:14:03 +00:00
Gleb Smirnoff	388909a12a	Use Jenkins hash for TCP syncache. o Unlike xor, in Jenkins hash every bit of input affects virtually every bit of output, thus salting the hash actually works. With xor salting only provides a false sense of security, since if hash(x) collides with hash(y), then of course, hash(x) ^ salt would also collide with hash(y) ^ salt. [1] o Jenkins provides much better distribution than xor, very close to ideal. TCP connection setup/teardown benchmark has shown a 10% increase with default hash size, and with bigger hashes that still provide possibility for collisions. With enormous hash size, when dataset is by an order of magnitude smaller than hash size, the benchmark has shown 4% decrease in performance decrease, which is expected and acceptable. Noticed by: Jeffrey Knockel <jeffk cs.unm.edu> [1] Benchmarks by: jch Reviewed by: jch, pkelsey, delphij Security: strengthens protection against hash collision DoS Sponsored by: Nginx, Inc.	2015-09-05 10:15:19 +00:00
Gleb Smirnoff	24067db8ca	Make tcp_mtudisc() static and void. No functional changes. Sponsored by: Nginx, Inc.	2015-09-04 12:02:12 +00:00
Michael Tuexen	6fb9db98b3	Don't leak memory in an error case. MFC after: 1 week	2015-09-04 09:24:07 +00:00
Michael Tuexen	59713bbf27	Add a NULL pointer check to silence the clang code analyzer. MFC after: 1 week	2015-09-04 09:22:16 +00:00
Michael Tuexen	aa1cfca969	Fix a bug where two SHUTDOWN_ACK chunks were sent if a SHUTDOWN chunk was received acking all outstanding data.	2015-09-03 22:15:56 +00:00
Julien Charbon	d6de19ac2f	Put r284245 back in place: If at first this fix was seen as a temporary workaround for a callout(9) issue, it turns out it is instead the right way to use callout in mpsafe mode without using callout_drain(). r284245 commit message: Fix a callout race condition introduced in TCP timers callouts with r281599. In TCP timer context, it is not enough to check callout_stop() return value to decide if a callout is still running or not, previous callout_reset() return values have also to be checked. Differential Revision: https://reviews.freebsd.org/D2763	2015-08-30 13:44:39 +00:00
Michael Tuexen	2e2d67945a	Use 5 times RTO.Max as the default for the shutdown guard timer as required by RFC 4960. The sysctl variable can be used to overwrite this. Discussed with: rrs MFC after: 1 week	2015-08-29 17:26:29 +00:00
Michael Tuexen	e92c2a8d6a	Fix the exporting of SCTP association states to userland. Without this, associations in SHUTDOWN-PENDING were never reported correctly. MFC after: 3 weeks	2015-08-29 09:14:32 +00:00
Adrian Chadd	2527ccad2d	Rename rss_soft_m2cpuid() -> rss_soft_m2cpuid_v4() in preparation for an IPv6 version to show up. Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn> Differential Revision: https://reviews.freebsd.org/D3504	2015-08-29 06:58:30 +00:00
Adrian Chadd	e5562eb934	Replace the printf()s with optional rate limited debugging for RSS. Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn> Differential Revision: https://reviews.freebsd.org/D3471	2015-08-28 05:58:16 +00:00
Bjoern A. Zeeb	a86e5c96af	get_inpcbinfo() and get_pcblist() are UDP local functions and do not do what one would expect by name. Prefix them with "udp_" to at least obviously limit the scope. This is a non-functional change. Reviewed by: gnn, rwatson MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3505	2015-08-27 15:27:41 +00:00
Julien Charbon	bcf9b91395	Revert r284245: "Fix a callout race condition introduced in TCP timers callouts with r281599." r281599 fixed a TCP timer race condition, but due a callout(9) bug it also introduced another race condition workaround-ed with r284245. The callout(9) bug being fixed with r286880, we can now revert the workaround (r284245). Differential Revision: https://reviews.freebsd.org/D2079 (Initial change) Differential Revision: https://reviews.freebsd.org/D2763 (Workaround) Differential Revision: https://reviews.freebsd.org/D3078 (Fix) Sponsored by: Verisign, Inc. MFC after: 2 weeks	2015-08-24 09:30:27 +00:00
Alexander V. Chernikov	5a2555160f	* Split allocation and table linking for lle's. Before that, the logic besides lle_create() was the following: return existing if found, create if not. This behaviour was error-prone since we had to deal with 'sudden' static<>dynamic lle changes. This commit fixes bunch of different issues like: - refcount leak when lle is converted to static. Simple check case: console 1: while true; do for i in `arp -an\|awk '$4~/incomp/{print$2}'\|tr -d '()'`; do arp -s $i 00:22:44:66:88:00 ; arp -d $i; done; done console 2: ping -f any-dead-host-in-L2 console 3: # watch for memory consumption: vmstat -m \| awk '$1~/lltable/{print$2}' - possible problems in arptimer() / nd6_timer() when dropping/reacquiring lock. New logic explicitly handles use-or-create cases in every lla_create user. Basically, most of the changes are purely mechanical. However, we explicitly avoid using existing lle's for interface/static LLE records. * While here, call lle_event handlers on all real table lle change. * Create lltable_free_entry() calling existing per-lltable lle_free_t callback for entry deletion	2015-08-20 12:05:17 +00:00
Alexander V. Chernikov	a4141c63c5	Check value return from lle_create() for NULL. This bug sneaked unnoticed in r286722. Reported by: adrian	2015-08-19 21:08:42 +00:00
Julien Charbon	31a7749d4b	Make clear that TIME_WAIT timeout expiration is managed solely by tcp_tw_2msl_scan(). Sponsored by: Verisign, Inc.	2015-08-18 08:27:26 +00:00
Alexander V. Chernikov	0c4210f984	Fix panic when handling non-inet arp message introduced in r286825. Submitted by: delphij	2015-08-18 06:16:19 +00:00
Alexander V. Chernikov	512e30ef9f	Split arpresolve() into fast/slow path. This change isolates the most common case (e.g. successful lookup) from more complicates scenarios. It also (tries to) make code more simple by avoiding retry: cycle. The actual goal is to prepare code to the upcoming change that will allow LL address retrieval without acquiring LLE lock at all. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D3383	2015-08-16 12:23:58 +00:00
Michael Tuexen	faadc1b492	Allow the path MTU to grow up to the outgoing interface MTU. MFC after: 3 days	2015-08-14 14:26:13 +00:00
Alexander V. Chernikov	f3bfa7d1cf	Move lle update code from from gigantic ip_arpinput() to separate bunch of functions. The goal is to isolate actual lle updates to permit more fine-grained locking. Do all lle link-level update under AFDATA wlock. Sponsored by: Yandex LLC	2015-08-13 13:38:09 +00:00
Hiren Panchasara	ad389a8c3b	Remove unused TCPTV_SRTTDFLT. We initialize srtt with TCPTV_SRTTBASE when we don't have any rtt estimate. Differential Revision: D3334 Sponsored by: Limelight Networks	2015-08-12 16:08:37 +00:00
Alexander V. Chernikov	0447c1367a	Use single 'lle_timer' callout in lltable instead of two different names of the same timer.	2015-08-11 12:38:54 +00:00
Alexander V. Chernikov	314294de5c	Store addresses instead of sockaddrs inside llentry. This permits us having all (not fully true yet) all the info needed in lookup process in first 64 bytes of 'struct llentry'. struct llentry layout: BEFORE: [rwlock .. state .. state .. MAC ] (lle+1) [sockaddr_in[6]] AFTER [ in[6]_addr MAC .. state .. rwlock ] Currently, address part of struct llentry has only 16 bytes for the key. However, lltable does not restrict any custom lltable consumers with long keys use the previous approach (store key at (lle+1)). Sponsored by: Yandex LLC	2015-08-11 09:26:11 +00:00
Alexander V. Chernikov	41cb42a633	MFP r276712. * Split lltable_init() into lltable_allocate_htbl() (alloc hash table with default callbacks) and lltable_link() ( links any lltable to the list). * Switch from LLTBL_HASHTBL_SIZE to per-lltable hash size field. * Move lltable setup to separate functions in in[6]_domifattach.	2015-08-11 05:51:00 +00:00
Alexander V. Chernikov	2caee4be35	Rename rt_foreach_fib() to rt_foreach_fib_walk(). Suggested by: julian	2015-08-10 20:50:31 +00:00
Alexander V. Chernikov	11cdad9873	Partially merge r274887,r275334,r275577,r275578,r275586 to minimize differences between projects/routing and HEAD. This commit tries to keep code logic the same while changing underlying code to use unified callbacks. * Add llt_foreach_entry method to traverse all entries in given llt * Add llt_dump_entry method to export particular lle entry in sysctl/rtsock format (code is not indented properly to minimize diff). Will be fixed in the next commits. * Add llt_link_entry/llt_unlink_entry methods to link/unlink particular lle. * Add llt_fill_sa_entry method to export address in the lle to sockaddr format. * Add llt_hash method to use in generic hash table support code. * Add llt_free_entry method which is used in llt_prefix_free code. * Prepare for fine-grained locking by separating lle unlink and deletion in lltable_free() and lltable_prefix_free(). * Provide lltable_get<ifp\|af>() functions to reduce direct 'struct lltable' access by external callers. * Remove @llt agrument from lle_free() lle callback since it was unused. * Temporarily add L3_CADDR() macro for 'const' sockaddr typecasting. * Switch to per-af hashing code. * Rename LLE_FREE_LOCKED() callback from in[6]_lltable_free() to in_[6]lltable_destroy() to avoid clashing with llt_free_entry() method. Update description from these functions. * Use unified lltable_free_entry() function instead of per-af one. Reviewed by: ae	2015-08-10 12:03:59 +00:00

1 2 3 4 5 ...

5305 Commits