freebsd-skq

Author	SHA1	Message	Date
Gleb Smirnoff	f25a3d10b3	Now that IP reassembly is no longer under single lock, book-keeping amount of allocations in V_nipq is racy. To fix that, we would simply stop doing book-keeping ourselves, and rely on UMA doing that. There could be a slight overcommit due to caches, but that isn't a big deal. o V_nipq and V_maxnipq go away. o net.inet.ip.fragpackets is now just SYSCTL_UMA_CUR() o net.inet.ip.maxfragpackets could have been just SYSCTL_UMA_MAX(), but historically it has special semantics about values of 0 and -1, so provide sysctl_maxfragpackets() to handle these special cases. o If zone limit lowers either due to net.inet.ip.maxfragpackets or due to kern.ipc.nmbclusters, then new function ipq_drain_tomax() goes over buckets and frees the oldest packets until we are in the limit. The code that (incorrectly) did that in ip_slowtimo() is removed. o ip_reass() doesn't check any limits and calls uma_zalloc(M_NOWAIT). If it fails, a new function ipq_reuse() is called. This function will find the oldest packet in the currently locked bucket, and if there is none, it will search in other buckets until success. Sponsored by: Nginx, Inc.	2015-04-09 22:13:27 +00:00
Gleb Smirnoff	f5746f593c	In the ip_reass() do packet examination and adjusting before acquiring locks and doing lookups. Sponsored by: Nginx, Inc.	2015-04-09 21:32:32 +00:00
Gleb Smirnoff	e3c2c63476	Make ip reassembly queue mutexes per-vnet, putting them into the structure that they protect. Sponsored by: Nginx, Inc.	2015-04-09 21:17:07 +00:00
Gleb Smirnoff	71c70e138d	Use TAILQ_FOREACH_SAFE() instead of implementing it ourselves. Sponsored by: Nginx, Inc.	2015-04-09 09:00:32 +00:00
Gleb Smirnoff	1c0b48c79a	If V_maxnipq is set to zero, drain the queue here and now, instead of relying on timeouts. Sponsored by: Nginx, Inc.	2015-04-09 08:56:23 +00:00
Gleb Smirnoff	55c28800ad	o Since we always update either fragdrop or fragtimeout stat counter when we free a fragment, provide two inline functions that do that for us: ipq_drop() and ipq_timeout(). o Rename ip_free_f() to ipq_free() to match the name scheme of IP reassembly. o Remove assertion from ipq_free(), since it requires extra argument to be passed, but locking scheme is simple enough and function is static. Sponsored by: Nginx, Inc.	2015-04-09 08:52:02 +00:00
Gleb Smirnoff	3de5805b02	Rename ip_drain_locked() to ip_drain_vnet(), since the function differs from ip_drain() not in locking, but in the scope of its work. Sponsored by: Nginx, Inc.	2015-04-09 08:37:16 +00:00
Adrian Chadd	f59e59d5c3	Move the IPv4 reassembly queue locking from a single lock to be per-bucket (global). This significantly improves performance on multi-core servers where there is any kind of IPv4 reassembly going on. glebius@ would like to see the locking moved to be attached to the reassembly bucket, which would make it per-bucket + per-VNET, instead of being global. I decided to keep it global for now as it's the minimal useful change; if people agree / wish to migrate it to be per-bucket / per-VNET then please do feel free to do so. I won't complain. Thanks to Norse Corp for giving me access to much larger servers to test this at across the 4 core boxes I have at home. Differential Revision: https://reviews.freebsd.org/D2095 Reviewed by: glebius (initial comments incorporated into this patch) MFC after: 2 weeks Sponsored by: Norse Corp, Inc (hardware)	2015-04-07 23:09:34 +00:00
Xin LI	edc76c95db	Improve patch for SA-15:04.igmp to solve a potential buffer overflow. Reported by: bde Submitted by: oshogbo	2015-04-07 20:20:03 +00:00
Gleb Smirnoff	93d4534cdc	Add sleepable lock to protect at least against two parallel SIOCSVHs. Sponsored by: Nginx, Inc.	2015-04-06 15:31:19 +00:00
Hans Petter Selasky	c4c4346f5f	Extend fixes made in r278103 and r38754 by copying the complete packet header and not only partial flags and fields. Firewalls can attach classification tags to the outgoing mbufs which should be copied to all the new fragments. Else only the first fragment will be let through by the firewall. This can easily be tested by sending a large ping packet through a firewall. It was also discovered that VLAN related flags and fields should be copied for packets traversing through VLANs. This is all handled by "m_dup_pkthdr()". Regarding the MAC policy check in ip_fragment(), the tag provided by the originating mbuf is copied instead of using the default one provided by m_gethdr(). Tested by: Karim Fodil-Lemelin <fodillemlinkarim at gmail.com> MFC after: 2 weeks Sponsored by: Mellanox Technologies PR: 7802	2015-04-02 15:47:37 +00:00
Julien Charbon	033749179f	Provide better debugging information in tcp_timer_activate() and tcp_timer_active() Differential Revision: https://reviews.freebsd.org/D2179 Suggested by: bz Reviewed by: jhb Approved by: jhb	2015-04-02 14:43:07 +00:00
Gleb Smirnoff	7a742e3744	Provide a comment explaining issues with the counter(9) trick, so that people won't copy and paste it blindly. Prodded by: ian Sponsored by: Nginx, Inc.	2015-04-02 14:22:59 +00:00
Bjoern A. Zeeb	1d549750c9	Try to unbreak the build after r280971 by providing the missing #include header for SYSINIT.	2015-04-02 00:30:53 +00:00
Gleb Smirnoff	6d947416cc	o Use new function ip_fillid() in all places throughout the kernel, where we want to create a new IP datagram. o Add support for RFC6864, which allows to set IP ID for atomic IP datagrams to any value, to improve performance. The behaviour is controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by default. o In case if we generate IP ID, use counter(9) to improve performance. o Gather all code related to IP ID into ip_id.c. Differential Revision: https://reviews.freebsd.org/D2177 Reviewed by: adrian, cy, rpaulo Tested by: Emeric POUPON <emeric.poupon stormshield.eu> Sponsored by: Netflix Sponsored by: Nginx, Inc. Relnotes: yes	2015-04-01 22:26:39 +00:00
Julien Charbon	18832f1fd1	Use appropriate timeout_t* instead of void* in tcp_timer_activate() Suggested by: imp Differential Revision: https://reviews.freebsd.org/D2154 Reviewed by: imp, jhb Approved by: jhb	2015-03-31 10:17:13 +00:00
Gleb Smirnoff	513635bfaa	VNETalize random IP ID engine. Sponsored by: Nginx, Inc.	2015-03-28 16:59:57 +00:00
Gleb Smirnoff	1f08c9479f	Initialize random IP ID engine via SYSINIT() instead of doing that on first packet. This allow to use M_WAITOK and cut down some error handling. Sponsored by: Nginx, Inc.	2015-03-28 16:06:46 +00:00
Fabien Thomas	d612b95e23	On multi CPU systems, we may emit successive packets with the same id. Fix the race by using an atomic operation. Differential Revision: https://reviews.freebsd.org/D2141 Obtained from: emeric.poupon@stormshield.eu MFC after: 1 week Sponsored by: Stormshield	2015-03-27 13:26:59 +00:00
Michael Tuexen	d59909c3e2	Improve the selection of the destination address of SACK chunks. This fixes https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=196755 and is joint work with rrs@. MFC after: 1 week	2015-03-26 22:05:31 +00:00
Michael Tuexen	a756ffc931	Make sure that we don't free an SCTP shared key too early. Thanks to Pouyan Sepehrdad from Qualcomm Product Security Initiative for reporting the issue. MFC after: 3 days	2015-03-25 22:45:54 +00:00
Michael Tuexen	d9bdc5200a	Use the reference count of the right SCTP inp. Joint work with rrs@ MFC after: 3 days	2015-03-25 21:41:20 +00:00
Michael Tuexen	0426123f75	Fix two bugs which resulted in a screwed up end point list: * Use a safe way to walk through a list while manipulating it. * Have the appropriate locks in place. Joint work with rrs@ MFC after: 3 days	2015-03-24 21:12:45 +00:00
Lawrence Stewart	efca16682d	The addition of flowid and flowtype in r280233 and r280237 respectively forgot to extend the IPv6 packet node format string, which causes a build failure when SIFTR is compiled with IPv6 support. Reported by: Lars Eggert	2015-03-24 15:08:43 +00:00
Michael Tuexen	8427b3fd4f	Fix the bug in the handling of fragmented abandoned SCTP user messages reported in https://code.google.com/p/sctp-refimpl/issues/detail?id=11 Thanks to Lally Singh for reporting it. MFC after: 3 days	2015-03-24 15:05:36 +00:00
Michael Tuexen	7fd5b4365a	Fix an accounting bug related to the per stream chunk counter. While there, don't refer to a net artificially. MFC after: 3 days	2015-03-24 14:51:46 +00:00
Michael Tuexen	ca0f81984a	When an ICMP message is received and the MTU shrinks, only mark outstanding chunks for retransmissions. MFC after: 3 days	2015-03-23 23:34:21 +00:00
Michael Tuexen	d5ec585697	Remove a useless assignment. MFC after: 1 week	2015-03-23 15:12:02 +00:00
Hiren Panchasara	d0a8b2a5ae	Add connection flow type to siftr(4). Suggested by: adrian Sponsored by: Limelight Networks	2015-03-19 00:23:16 +00:00
Hiren Panchasara	a025fd1487	Add connection flowid to siftr(4). Reviewed by: lstewart MFC after: 1 week Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D2089	2015-03-18 23:24:25 +00:00
Adrian Chadd	3b27278218	Correctly const-ify things. Found by: clang 3.6	2015-03-18 04:40:36 +00:00
Ian Lepore	dcdeb95f09	Go back to using sbuf_new() with a preallocated large buffer, to avoid triggering an sbuf auto-drain copyout while holding a lock. Pointed out by: jhb Pointy hat: ian	2015-03-14 23:57:33 +00:00
Ian Lepore	751ccc429d	Use sbuf_new_for_sysctl() instead of plain sbuf_new() to ensure sysctl string returned to userland is NUL-terminated. PR: 195668	2015-03-14 18:11:24 +00:00
Andrey V. Elsukov	2530ed9e70	Fix `ipfw fwd tablearg'. Use dedicated field nh4 in struct table_value to obtain IPv4 next hop address in tablearg case. Add `fwd tablearg' support for IPv6. ipfw(8) uses INADDR_ANY as next hop address in O_FORWARD_IP opcode for specifying tablearg case. For IPv6 we still use this opcode, but when packet identified as IPv6 packet, we obtain next hop address from dedicated field nh6 in struct table_value. Replace hopstore field in struct ip_fw_args with anonymous union and add hopstore6 field. Use this field to copy tablearg value for IPv6. Replace spare1 field in struct table_value with zoneid. Use it to keep scope zone id for link-local IPv6 addresses. Since spare1 was used internally, replace spare0 array with two variables spare0 and spare1. Use getaddrinfo(3)/getnameinfo(3) functions for parsing and formatting IPv6 addresses in table_value. Use zoneid field in struct table_value to store sin6_scope_id value. Since the kernel still uses embedded scope zone id to represent link-local addresses, convert next_hop6 address into this form before return from pfil processing. This also fixes in6_localip() check for link-local addresses. Differential Revision: https://reviews.freebsd.org/D2015 Obtained from: Yandex LLC Sponsored by: Yandex LLC	2015-03-13 09:03:25 +00:00
Michael Tuexen	5ba11c4c2e	Update a comment to get it aligned with the code change. Reported by: brueffer@	2015-03-11 15:40:29 +00:00
Michael Tuexen	975c975bf0	It seems that sb_acc is a better replacement for sb_cc than sb_ccc. At least it unbreaks the use of select() for SCTP sockets. MFC after: 3 days	2015-03-11 15:21:39 +00:00
Michael Tuexen	b3bf169ac7	Fix the adaptation of the path state when thresholds are changed using the SCTP_PEER_ADDR_THLDS socket option. MFC after: 3 days	2015-03-11 14:25:23 +00:00
Michael Tuexen	3cb3567d7e	Keep track on the socket lock state. This fixes a bug showing up on Mac OS X. MFC after: 3 days	2015-03-10 22:38:10 +00:00
Michael Tuexen	2bb7e77385	Unlock the stcb when using setsockopt() for the SCTP_PEER_ADDR_THLDS option. MFC after: 3 days	2015-03-10 21:05:17 +00:00
Michael Tuexen	59b6d5be4e	Add a SCTP socket option to limit the cwnd for each path. MFC after: 1 month	2015-03-10 19:49:25 +00:00
Michael Tuexen	8f290ed51b	Fix a typo. MFC after: 1 week	2015-03-10 09:16:31 +00:00
Julien Charbon	eb96dc3336	In TCP, connect() can return incorrect error code EINVAL instead of EADDRINUSE or ECONNREFUSED PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=196035 Differential Revision: https://reviews.freebsd.org/D1982 Reported by: Mark Nunberg <mnunberg@haskalah.org> Submitted by: Harrison Grundy <harrison.grundy@astrodoggroup.com> Reviewed by: adrian, jch, glebius, gnn Approved by: jhb MFC after: 2 weeks	2015-03-09 20:29:16 +00:00
Andrey V. Elsukov	1b5aa92cff	lla_lookup() can directly call llentry_free() for static entries and the last one requires to hold afdata's wlock. PR: 197096 MFC after: 1 week	2015-03-07 18:33:08 +00:00
Hiroki Sato	11d8451df3	Implement Enhanced DAD algorithm for IPv6 described in draft-ietf-6man-enhanced-dad-13. This basically adds a random nonce option (RFC 3971) to NS messages for DAD probe to detect a looped back packet. This looped back packet prevented DAD on some pseudo-interfaces which aggregates multiple L2 links such as lagg(4). The length of the nonce is set to 6 bytes. This algorithm can be disabled by setting net.inet6.ip6.dad_enhanced sysctl to 0 in a per-vnet basis. Reported by: hiren Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D1835	2015-03-02 17:30:26 +00:00
Hans Petter Selasky	9c0f6aa762	Fix a special case in ip_fragment() to produce a more sensible chain of packets. When the data payload length excluding any headers, of an outgoing IPv4 packet exceeds PAGE_SIZE bytes, a special case in ip_fragment() can kick in to optimise the outgoing payload(s). The code which was added in r98849 as part of zero copy socket support assumes that the beginning of any MTU sized payload is aligned to where a MBUF's "m_data" pointer points. This is not always the case and can sometimes cause large IPv4 packets, as part of ping replies, to be split more than needed. Instead of iterating the MBUFs to figure out how much data is in the current chain, use the value already in the "m_pkthdr.len" field of the first MBUF in the chain. Reviewed by: ken @ Differential Revision: https://reviews.freebsd.org/D1893 MFC after: 2 weeks Sponsored by: Mellanox Technologies	2015-02-25 13:58:43 +00:00
Xin LI	cfa498d88e	Fix integer overflow in IGMP protocol. Security: FreeBSD-SA-15:04.igmp Security: CVE-2015-1414 Found by: Mateusz Kocielski, Logicaltrust Analyzed by: Marek Kroemeke, Mateusz Kocielski (shm@NetBSD.org) and 22733db72ab3ed94b5f8a1ffcde850251fe6f466 Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org> Reviewed by: bms	2015-02-25 05:42:59 +00:00
Zbigniew Bodek	8018ac153f	Change struct attribute to avoid aligned operations mismatch Previous __alignment(4) allowed compiler to assume that operations are performed on aligned region. On ARM processor, this led to alignment fault as shown below: trapframe: 0xda9e5b10 FSR=00000001, FAR=a67b680e, spsr=60000113 r0 =00000000, r1 =00000068, r2 =0000007c, r3 =00000000 r4 =a67b6826, r5 =a67b680e, r6 =00000014, r7 =00000068 r8 =00000068, r9 =da9e5bd0, r10=00000011, r11=da9e5c10 r12=da9e5be0, ssp=da9e5b60, slr=a054f164, pc =a054f2cc <...> udp_input+0x264: ldmia r5, {r0-r3, r6} udp_input+0x268: stmia r12, {r0-r3, r6} This was due to instructions which do not support unaligned access, whereas for __alignment(2) compiler replaced ldmia/stmia with some logically equivalent memcpy operations. In fact, the assumption that 'struct ip' is always 4-byte aligned is definitely false, as we have no impact on data alignment of packet stream received. Another possible solution would be to explicitly perform memcpy() on objects of 'struct ip' type, which, however, would suffer from performance drop, and be merely a problem hiding. Please, note that this has nothing to do with ARM32_DISABLE_ALIGNMENT_FAULTS option, but is related strictly to compiler behaviour. Submitted by: Wojciech Macek <wma@semihalf.com> Reviewed by: glebius, ian Obtained from: Semihalf	2015-02-24 12:57:03 +00:00
Gleb Smirnoff	26d50672d6	The last userland piece of in_var.h is now 'struct in_aliasreq'. Move it to the top of the file, and ifdef _KERNEL the rest.	2015-02-19 23:59:27 +00:00
Gleb Smirnoff	e072c794ad	Now that all users of _WANT_IFADDR are fixed, remove this crutch and hide ifaddr, in_ifaddr and in6_ifaddr under _KERNEL. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-02-19 23:16:10 +00:00
Gleb Smirnoff	0d159406b6	- Rename 'struct igmp_ifinfo' into 'struct igmp_ifsoftc', since it really represents a context. - Preserve name 'struct igmp_ifinfo' for a new structure, that will be stable API between userland and kernel. - Make sysctl_igmp_ifinfo() return the new 'struct igmp_ifinfo', instead of old one, which had a bunch of internal kernel structures in it. - Move all above declarations from in_var.h to igmp_var.h, since they are private to IGMP code. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-02-19 22:35:23 +00:00
Konstantin Belousov	0c6dcac369	Fix build with KTR after r278978.	2015-02-19 15:41:23 +00:00
Gleb Smirnoff	fd1b2a7c57	Widen _KERNEL ifdef to hide more kernel network stack structures from userland.	2015-02-19 06:24:27 +00:00
Gleb Smirnoff	058e08bea9	Use new struct mbufq instead of struct ifqueue to manage packet queues in IPv4 multicast code. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-02-19 01:21:02 +00:00
Randall Stewart	2575fbb827	This fixes a bug in the way that the LLE timers for nd6 and arp were being used. They basically would pass in the mutex to the callout_init. Because they used this method to the callout system, it was possible to "stop" the callout. When flushing the table and you stopped the running callout, the callout_stop code would return 1 indicating that it was going to stop the callout (that was about to run on the callout_wheel blocked by the function calling the stop). Now when 1 was returned, it would lower the reference count one extra time for the stopped timer, then a few lines later delete the memory. Of course the callout_wheel was stuck in the lock code and would then crash since it was accessing freed memory. By using callout_init(c, 1) we always get a 0 back and the reference counting bug does not rear its head. We do have to make a few adjustments to the callouts themselves though to make sure it does the proper thing if rescheduled as well as gets the lock. Commented upon by hiren and sbruno See Phabricator D1777 for more details. Reviewed by: adrian, jhb and bz Sponsored by: Netflix Inc.	2015-02-09 19:28:11 +00:00
Hans Petter Selasky	609752f04f	The flowid and hashtype should be copied from the originating packet when fragmenting IP packets to preserve the order of the packets in a stream. Else the resulting fragments can be sent out of order when the hardware supports multiple transmit rings. Reviewed by: glebius @ MFC after: 1 week Sponsored by: Mellanox Technologies	2015-02-02 17:32:50 +00:00
Hiren Panchasara	ec446b1375	Make syncookie_mac() use 'tcp_seq irs' in computing hash. This fixes what seems like a simple oversight when the function was added in r253210. Reported by: Daniel Borkmann <dborkman@redhat.com> Florian Westphal <fw@strlen.de> Differential Revision: https://reviews.freebsd.org/D1628 Reviewed by: gnn MFC after: 1 month Sponsored by: Limelight Networks	2015-01-30 17:29:07 +00:00
Michael Tuexen	aec9ef9745	Whitespace change.	2015-01-27 21:30:24 +00:00
Xin LI	6a58f0e913	Fix SCTP stream reset vulnerability. We would like to acknowledge Gerasimos Dimitriadis who reported the issue and Michael Tuexen who analyzed and provided the fix. Security: FreeBSD-SA-15:03.sctp Security: CVE-2014-8613 Submitted by: tuexen	2015-01-27 19:35:38 +00:00
Xin LI	38f2a43815	Fix SCTP SCTP_SS_VALUE kernel memory corruption and disclosure vulnerability. We would like to acknowledge Clement LECIGNE from Google Security Team and Francisco Falcon from Core Security Technologies who discovered the issue independently and reported to the FreeBSD Security Team. Security: FreeBSD-SA-15:02.kmem Security: CVE-2014-8612 Submitted by: tuexen	2015-01-27 19:35:36 +00:00
John Baldwin	002d455873	Use an sbuf to generate the output of the net.inet.tcp.hostcache.list sysctl to avoid a possible buffer overflow if the cache grows while the text is being generated. PR: 172675 MFC after: 2 weeks	2015-01-25 19:45:44 +00:00
Will Andrews	bb269f3ae4	Log hardware interface up/down as "hardware" rather than just "hw". Suggested by: glebius MFC after: 1 week MFC with: 277530	2015-01-23 14:30:24 +00:00
Will Andrews	369a670857	When a CARP state change is caused by an ifconfig request, log it accordingly. Suggested by: glebius MFC after: 1 week MFC with: 277530	2015-01-23 14:28:12 +00:00
Will Andrews	d01641e2c1	Improve CARP logging so that all state transitions are logged. sys/netinet/ip_carp.c: Add a "reason" string parameter to carp_set_state() and carp_master_down_locked() allowing more specific logging information to be passed into these apis. Refactor existing state transition logging into a single log call in carp_set_state(). Update all calls to carp_set_state() and carp_master_down_locked() to pass an appropriate reason string. For state transitions that were previously logged, the output should be unchanged. Submitted by: gibbs (original), asomers (updated) MFC after: 1 week Sponsored by: Spectra Logic MFSpectraBSD: 1039697 on 2014/02/11 (original) 1049992 on 2014/03/21 (updated)	2015-01-22 17:09:54 +00:00
Michael Tuexen	bcbf8c2105	Remove comparisons which are not necessary. Reported by: Coverity CID: 1237826, 1237844, 1237847 MFC after: 1 week	2015-01-20 19:08:55 +00:00
Michael Tuexen	2010054d91	Code cleanup. Reported by: Coverity CID: 749578 MFC after: 1 week	2015-01-19 11:52:08 +00:00
Michael Tuexen	e1600e5058	Fix a bug which only shows up when an mbuf allocation failed. Therefore chances are low that we hit this. Reported by: Coverity CID: 1018886 MFC after: 1 week	2015-01-18 22:00:39 +00:00
Michael Tuexen	d6165c1fca	Remove an unnecessary check. Reported by: Coverity CID: 749576 MFC after: 1 week	2015-01-18 21:16:22 +00:00
Michael Tuexen	3ff78fbbd9	Add protection code to free memory in case of processing an address which is neither IPv4 or IPv6. Reported by: Coverity CID: 749311 MFC after: 1 week	2015-01-18 20:53:20 +00:00
Michael Tuexen	61330de4b0	Remove an unused variable. Reported by: Coverity CID: 750999 MFC after: 1 week	2015-01-18 20:20:27 +00:00
Adrian Chadd	b2bdc62a95	Refactor / restructure the RSS code into generic, IPv4 and IPv6 specific bits. The motivation here is to eventually teach netisr and potentially other networking subsystems a bit more about how RSS work queues / buckets are configured so things have a hope of auto-configuring in the future. * net/rss_config.[ch] takes care of the generic bits for doing configuration, hash function selection, etc; * toeplitz.[ch] is now in net/ rather than netinet/; * (and would be in libkern if it didn't directly include RSS_KEYSIZE; that's a later thing to fix up.) * netinet/in_rss.[ch] now just contains the IPv4 specific methods; * and netinet/in6_rss.[ch] now just contains the IPv6 specific methods. This should have no functional impact on anyone currently using the RSS support. Differential Revision: D1383 Reviewed by: gnn, jfv (intel driver bits)	2015-01-18 18:06:40 +00:00
Gleb Smirnoff	fc2517100b	Do not go one layer down to check ifqueue length. First, not all drivers use ifqueue at all. Second, there is no point in this lockless check. Either positive or negative result of the check could be incorrect after a tick. Reviewed by: tuexen Sponsored by: Nginx, Inc.	2015-01-12 18:06:22 +00:00
Gleb Smirnoff	fc7ea8b690	Remove incorrect layering violating code that: a) assumed that ifqueue length is measured in bytes, instead of packets b) assumed that any interface has working ifqueue c) incremented global counter instead of ifi_oqdrops Sponsored by: Nginx, Inc.	2015-01-12 09:41:12 +00:00
Hiren Panchasara	64807b300f	DCTCP (Data Center TCP) implementation. DCTCP congestion control algorithm aims to maximise throughput and minimise latency in data center networks by utilising the proportion of Explicit Congestion Notification (ECN) marked packets received from capable hardware as a congestion signal. Highlights: Implemented as a mod_cc(4) module. ECN (Explicit congestion notification) processing is done differently from RFC3168. Takes one-sided DCTCP into consideration where only one of the sides is using DCTCP and other is using standard ECN. IETF draft: http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00 Thesis report by Midori Kato: https://eggert.org/students/kato-thesis.pdf Submitted by: Midori Kato <katoon@sfc.wide.ad.jp> and Lars Eggert <lars@netapp.com> with help and modifications from hiren Differential Revision: https://reviews.freebsd.org/D604 Reviewed by: gnn	2015-01-12 08:33:04 +00:00
Michael Tuexen	d89abe19b0	Remove dead code. Reported by: Coverity CID: 748664 MFC after: 1 week	2015-01-12 07:55:16 +00:00
Michael Tuexen	df26ea6839	Remove dead code. Reported by: Coverity CID: 1018052 MFC after: 1 week	2015-01-12 07:39:52 +00:00
Michael Tuexen	f104b614a0	Remove dead code. Reported by: Coverity CID: 1018053 MFC after: 1 week	2015-01-12 07:29:35 +00:00
Michael Tuexen	f0dc2113ca	Remove dead code. Reported by: Coverity CID: 748663 MFC after: 1 week	2015-01-11 22:49:20 +00:00
Michael Tuexen	448e859674	Remove dead code. Reported by: Coverity CID: 748660, 748661 MFC after: 1 week	2015-01-11 22:23:39 +00:00
Michael Tuexen	e88f89a393	Remove dead code. Reported by: Coverity CID: 748665 MFC after: 1 week	2015-01-11 21:55:30 +00:00
Michael Tuexen	d3cfd43074	Remove dead code. Reported by: Coverity CID: 748666 MFC after: 1 week	2015-01-11 21:44:56 +00:00
Michael Tuexen	4be807c4d6	Minimize the usage of SCTP_BUF_IS_EXTENDED. This should help Robert...	2015-01-10 20:49:57 +00:00
Michael Tuexen	296d0b9495	Retire SCTP_BUF_EXTEND_SIZE. This patch was suggested by Robert Watson.	2015-01-10 13:56:26 +00:00
Alexander V. Chernikov	d63e657c04	* Deal with ARCNET L2 multicast mapping for IPv6 the same way as in IPv4: handle it in arc_output() instead of nd6_storelladdr(). * Remove IFT_ARCNET check from arpresolve() since arc_output() does not use arpresolve() to handle broadcast/multicast. This check was there since r84931. It looks like it was not used since r89099 (initial import of Arcnet support where multicast is handled separately). * Remove IFT_IEEE1394 case from nd6_storelladdr() since firewire_output() calls nd6_storelladdr() for unicast addresses only. * Remove IFT_ARCNET case from nd6_storelladdr() since arc_output() now handles multicast by itself. As a result, we have the following pattern: all non-ethernet-style media have their own multicast map handling inside their appropriate routines. On the other hand, arpresolve() (and nd6_storelladdr()) which meant to be 'generic' ones de-facto handles ethernet-only multicast maps. MFC after: 3 weeks	2015-01-09 12:56:51 +00:00
Robert Watson	e1165035a6	Use M_WRITABLE() and M_LEADINGSPACE() rather than checking M_EXT and doing hand-crafted length calculations in the IP options code. Reviewed by: bz Sponsored by: EMC / Isilon Storage Division	2015-01-06 14:32:28 +00:00
Luiz Otavio O Souza	57c5139c46	Remove the check that prevent carp(4) advskew to be set to '0'. CARP devices are created with advskew set to '0' and once you set it to any other value in the valid range (0..254) you can't set it back to zero. The code in question is also used to prevent that zeroed values overwrite the CARP defaults when a new CARP device is created. Since advskew already defaults to '0' for newly created devices and the new value is guaranteed to be within the valid range, it is safe to overwrite it here. PR: 194672 Reported by: cmb@pfsense.org In collaboration with: garga Tested by: garga MFC after: 2 weeks	2015-01-06 13:07:13 +00:00
Robert Watson	ed6a66ca6c	To ease changes to underlying mbuf structure and the mbuf allocator, reduce the knowledge of mbuf layout, and in particular constants such as M_EXT, MLEN, MHLEN, and so on, in mbuf consumers by unifying various alignment utility functions (M_ALIGN(), MH_ALIGN(), MEXT_ALIGN() in a single M_ALIGN() macro, implemented by a now-inlined m_align() function: - Move m_align() from uipc_mbuf.c to mbuf.h; mark as __inline. - Reimplement M_ALIGN(), MH_ALIGN(), and MEXT_ALIGN() using m_align(). - Update consumers around the tree to simply use M_ALIGN(). This change eliminates a number of cases where mbuf consumers must be aware of whether or not mbufs returned by the allocator use external storage, but also assumptions about the size of the returned mbuf. This will make it easier to introduce changes in how we use external storage, as well as features such as variable-size mbufs. Differential Revision: https://reviews.freebsd.org/D1436 Reviewed by: glebius, trasz, gnn, bz Sponsored by: EMC / Isilon Storage Division	2015-01-05 09:58:32 +00:00
Adrian Chadd	492ccbe14d	Migrate the RSS IPv6 hash code to use pointers to the v6 addresses rather than passing them in by value. The eventual aim is to do incremental hash construction rather than all of the memcpy()'ing into a contiguous buffer for the hash function, which does show up as taking quite a bit of CPU during profiling. Tested: * a variety of laptops/desktop setups I have, with v6 connectivity Differential Revision: D1404 Reviewed by: bz, rpaulo	2014-12-31 22:52:43 +00:00
Andrey V. Elsukov	f188f14d43	Extern declarations in C files loses compile-time checking that the functions' calls match their definitions. Move them to header files. Reviewed by: jilles (previous version)	2014-12-25 21:32:37 +00:00
Andrey V. Elsukov	132c449079	Remove in_gif.h and in6_gif.h files. They only contain function declarations used by gif(4). Instead declare these functions in C files. Also make some variables static.	2014-12-23 16:17:37 +00:00
Michael Tuexen	f3ba71bee4	Don't check twice that inp is not NULL. Reported by: Coverity CID: 748671 MFC after: 3 days	2014-12-21 13:58:53 +00:00
Warner Losh	61f26cae7d	Where appropriate, use the modern terms for the one true time base (UTC) rather than the archaic (GMT) in comments. Except where the comments are making fun of people doing this (and pedants who insist on the new terms).	2014-12-21 05:07:11 +00:00
Michael Tuexen	b03b5d729a	Fix and harmonize the validation of PR-SCTP policies. Reported by: Coverity CID: 1232044 MFC after: 3 days	2014-12-20 21:17:28 +00:00
Michael Tuexen	ca10a8d944	Cleanup the code. Reported by: Coverity CID: 1232003	2014-12-20 13:47:38 +00:00
Michael Tuexen	142a4d9e86	Add a missing break. Reported by: Coverity CID: 1232014 MFC after: 3 days	2014-12-17 20:34:38 +00:00
Andrey V. Elsukov	44eb8bbe7b	Do not count security policy violation twice. ipsec*_in_reject() do this by their own. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 19:20:13 +00:00
Andrey V. Elsukov	0332a55f0f	Use ipsec4_in_reject() to simplify ip_ipsec_fwd() and ip_ipsec_input(). ipsec4_in_reject() does the same things, also it counts policy violation errors. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 18:55:54 +00:00
Andrey V. Elsukov	0275b2e369	Remove flag/flags argument from the following functions: ipsec_getpolicybyaddr() ipsec4_checkpolicy() ip_ipsec_output() ip6_ipsec_output() The only flag used here was IP_FORWARDING. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 18:35:34 +00:00
Andrey V. Elsukov	619764beab	Remove flags and tunalready arguments from ipsec4_process_packet() and make its prototype similar to ipsec6_process_packet. The flags argument isn't used here, tunalready is always zero. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 17:34:49 +00:00
Andrey V. Elsukov	8922ddbe40	Move ip_ipsec_fwd() from ip_input() into ip_forward(). Remove check for presence PACKET_TAG_IPSEC_IN_DONE mbuf tag from ip_ipsec_fwd(). PACKET_TAG_IPSEC_IN_DONE tag means that packet is already handled by IPSEC code. This means that before IPSEC processing it was destined to our address and security policy was checked in the ip_ipsec_input(). After IPSEC processing packet has new IP addresses and destination address isn't our own. So, anyway we can't check security policy from the mbuf tag, because it corresponds to different addresses. We should check security policy that corresponds to packet attributes in both cases - when it has a mbuf tag and when it has not. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 16:53:29 +00:00
Andrey V. Elsukov	e58320f127	Remove PACKET_TAG_IPSEC_IN_DONE mbuf tag lookup and usage of its security policy. The changed block of code in ip*_ipsec_input() is called when packet has ESP/AH header. Presence of PACKET_TAG_IPSEC_IN_DONE mbuf tag in the same time means that packet was already handled by IPSEC and reinjected in the netisr, and it has another ESP/AH headers (encrypted twice?). Since it was already processed by IPSEC code, the AH/ESP headers was already stripped (and probably outer IP header was stripped too) and security policy from the tdb_ident was applied to those headers. It is incorrect to apply this security policy to current headers. Also make ip_ipsec_input() prototype similar to ip6_ipsec_input(). Obtained from: Yandex LLC Sponsored by: Yandex LLC	2014-12-11 14:58:55 +00:00
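
Commit f25a3d10b3 above describes ipq_reuse() as finding the oldest packet in the currently locked bucket and, if there is none, searching the other buckets until one is found. The following is a minimal user-space sketch of that search order only. It is not the kernel code: the types `frag_queue` and `bucket`, the constants `NBUCKETS` and `BUCKET_CAP`, the `age` field, and the function names `oldest_in()` and `reuse_oldest()` are all hypothetical, and the per-bucket locking that the real function has to do is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS   4   /* hypothetical hash-table size */
#define BUCKET_CAP 8   /* hypothetical per-bucket capacity */

/* Stand-in for one reassembly queue; not the kernel's struct ipq. */
struct frag_queue {
    unsigned id;      /* label used only to tell entries apart */
    unsigned age;     /* larger value means older entry */
    int      in_use;  /* non-zero when the slot holds a queue */
};

struct bucket {
    struct frag_queue q[BUCKET_CAP];
};

/* Return the oldest in-use entry of one bucket, or NULL if it is empty. */
static struct frag_queue *
oldest_in(struct bucket *b)
{
    struct frag_queue *victim = NULL;

    for (size_t i = 0; i < BUCKET_CAP; i++) {
        struct frag_queue *fq = &b->q[i];

        if (fq->in_use && (victim == NULL || fq->age > victim->age))
            victim = fq;
    }
    return victim;
}

/*
 * Pick an entry to recycle: try bucket `cur` first, then the remaining
 * buckets in wrap-around order. Returns NULL only if every bucket is empty.
 */
struct frag_queue *
reuse_oldest(struct bucket *tbl, size_t n, size_t cur)
{
    assert(n > 0 && cur < n);

    for (size_t i = 0; i < n; i++) {
        struct frag_queue *victim = oldest_in(&tbl[(cur + i) % n]);

        if (victim != NULL)
            return victim;
    }
    return NULL;
}
```

Trying the current bucket first means the common case needs no other bucket to be touched; the wider scan only runs when the current bucket has nothing to give up.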

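Commit 6d947416cc above adds RFC 6864 support, under which the IP ID of an atomic datagram may be set to any value. RFC 6864 defines a datagram as atomic when DF is set, MF is clear, and the fragment offset is zero. The sketch below shows that test and a simplified ID choice. It makes several assumptions that are not in the commit message: the offset field is already in host byte order, atomic datagrams get the fixed ID 0, and a plain counter stands in for the counter(9) and random ID paths. The names `is_atomic_datagram()`, `fill_id()`, and the `FRAG_*` constants are hypothetical; they are not the kernel's ip_fillid().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout of the IPv4 flags/fragment-offset field (host byte order). */
#define FRAG_DF      0x4000u  /* don't fragment */
#define FRAG_MF      0x2000u  /* more fragments */
#define FRAG_OFFMASK 0x1fffu  /* fragment offset */

/* RFC 6864: atomic means DF set, MF clear, and fragment offset zero. */
static bool
is_atomic_datagram(uint16_t off)
{
    return (off & FRAG_DF) != 0 && (off & (FRAG_MF | FRAG_OFFMASK)) == 0;
}

/*
 * Choose an IP ID. With the knob enabled, atomic datagrams skip the
 * counter and get 0 (an assumed value); all others take the next counter
 * value, which wraps at 16 bits.
 */
uint16_t
fill_id(uint16_t off, bool rfc6864_enabled, uint16_t *next_id)
{
    assert(next_id != NULL);

    if (rfc6864_enabled && is_atomic_datagram(off))
        return 0;
    return (*next_id)++;
}
```

The performance gain mentioned in the commit comes from the first branch: an atomic datagram never touches the shared ID state.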

5167 Commits