freebsd-skq

Author	SHA1	Message	Date
Sean Bruno	51352d9d81	Don't hold the RM lock during lagg_proto_addport() to avoid an LOR. Submitted by: Kevin Bowling <kevin.bowling@kev009.com> Reviewed by: mav MFC after: 1 week Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D11711	2017-07-25 14:41:50 +00:00
Alexander Motin	41cf0d54a2	Call VLAN_CAPABILITIES() when LAGG capabilities change. This makes VLAN on top of LAGG to expose proper capabilities if they are changed after creation. MFC after: 1 week	2017-05-26 22:22:48 +00:00
Alexander Motin	8403ab7919	Improve applying unified capabilities to the lagg ports. Some NICs have some capabilities dependent, so that disabling one require disabling some other (TXCSUM/RXCSUM on em). This code tries to reach the consensus more insistently. PR: 219453 MFC after: 1 week	2017-05-26 20:15:33 +00:00
Alexander Motin	e3d90506c4	Remove some code, dead from the day one.	2017-05-25 23:19:09 +00:00
Alexander Motin	bbfc32a6b5	Relax r317696 locking to not drain taskqueue under the lock. MFC after: 11 days	2017-05-05 16:51:53 +00:00
Alexander Motin	e83177fba8	Fix r317696 build without debug. MFC after: 2 weeks	2017-05-03 02:54:11 +00:00
Alexander Motin	2f86d4b001	Introduce sleepable locks into if_lagg. Before this change if_lagg was using nonsleepable rmlocks to protect its internal state. This patch introduces another sx lock to protect code paths that require sleeping, while still uses old rmlock to protect hot nonsleepable data paths. This change allows to remove taskqueue decoupling used before to change interface addresses without holding the lock. Instead it uses sx lock to protect direct if_ioctl() calls. As another bonus, the new code synchronizes enabled capabilities of member interfaces, and allows to control them with ifconfig laggX, that was impossible before. This part should fix interoperation with if_bridge, that may need to disable some capabilities, such as TXCSUM or LRO, to allow bridging with noncapable interfaces. MFC after: 2 weeks Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D10514	2017-05-02 19:09:11 +00:00
Alexander Motin	1e04441a9d	Remove unneeded conditions. MFC after: 2 weeks	2017-04-22 08:38:49 +00:00
Alexander Motin	b98b5ae8ec	Add interface reference counting to if_lagg. Using plain ifunit() looks like request for troubles. MFC after: 2 weeks	2017-04-21 13:45:01 +00:00
Luiz Otavio O Souza	13157b2baf	Do not update the lagg link layer address when destroying a lagg clone. This would enqueue an event to send the gratuitous arp on a dying lagg interface without any physical ports attached to it. Apart from that, the taskqueue_drain() on lagg_clone_destroy() runs too late, when the ifp data structure is already freed. Fix that too. Obtained from: pfSense MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC (Netgate)	2017-01-30 03:04:33 +00:00
Hans Petter Selasky	f3e7afe2d7	Implement kernel support for hardware rate limited sockets. - Add RATELIMIT kernel configuration keyword which must be set to enable the new functionality. - Add support for hardware driven, Receive Side Scaling, RSS aware, rate limited sendqueues and expose the functionality through the already established SO_MAX_PACING_RATE setsockopt(). The API support rates in the range from 1 to 4Gbytes/s which are suitable for regular TCP and UDP streams. The setsockopt(2) manual page has been updated. - Add rate limit function callback API to "struct ifnet" which supports the following operations: if_snd_tag_alloc(), if_snd_tag_modify(), if_snd_tag_query() and if_snd_tag_free(). - Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT flag, which tells if a network driver supports rate limiting or not. - This patch also adds support for rate limiting through VLAN and LAGG intermediate network devices. - How rate limiting works: 1) The userspace application calls setsockopt() after accepting or making a new connection to set the rate which is then stored in the socket structure in the kernel. Later on when packets are transmitted a check is made in the transmit path for rate changes. A rate change implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the destination network interface, which then sets up a custom sendqueue with the given rate limitation parameter. A "struct m_snd_tag" pointer is returned which serves as a "snd_tag" hint in the m_pkthdr for the subsequently transmitted mbufs. 2) When the network driver sees the "m->m_pkthdr.snd_tag" different from NULL, it will move the packets into a designated rate limited sendqueue given by the snd_tag pointer. It is up to the individual drivers how the rate limited traffic will be rate limited. 3) Route changes are detected by the NIC drivers in the ifp->if_transmit() routine when the ifnet pointer in the incoming snd_tag mismatches the one of the network interface. The network adapter frees the mbuf and returns EAGAIN which causes the ip_output() to release and clear the send tag. Upon next ip_output() a new "snd_tag" will be tried allocated. 4) When the PCB is detached the custom sendqueue will be released by a non-blocking ifp->if_snd_tag_free() call to the currently bound network interface. Reviewed by: wblock (manpages), adrian, gallatin, scottl (network) Differential Revision: https://reviews.freebsd.org/D3687 Sponsored by: Mellanox Technologies MFC after: 3 months	2017-01-18 13:31:17 +00:00
Alan Somers	8a73c85db3	Remove stray debugging code from r310180 Reported by: rstone Pointy hat to: asomers MFC after: 3 weeks X-MFC-with: 310180 Sponsored by: Spectra Logic Corp	2016-12-20 15:45:53 +00:00
Alan Somers	d9fa2d67eb	Fix panic during lagg destruction with simultaneous status check If you run "ifconfig lagg0 destroy" and "ifconfig lagg0" at the same time a page fault may result. The first process will destroy ifp->if_lagg in lagg_clone_destroy (called by if_clone_destroy). Then the second process will observe that ifp->if_lagg is NULL at the top of lagg_port_ioctl and goto fallback: where it will promptly dereference ifp->if_lagg anyway. The solution is to repeat the NULL check for ifp->if_lagg MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D8512	2016-12-16 22:39:30 +00:00
Bjoern A. Zeeb	89856f7e2d	Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enahnce helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747	2016-06-21 13:48:49 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Ravi Pokala	729a4cff7e	Revert accidental submit of WIP as part of r297609 Pointyhat to: rpokala	2016-04-06 04:58:20 +00:00
Ravi Pokala	06152bf0e1	Storage Controller Interface driver - typo in unimplemented macro in scic_sds_controller_registers.h s/contoller/controller/ PR: 207336 Submitted by: Tony Narlock <tony @ git-pull.com>	2016-04-06 04:50:28 +00:00
Marcelo Araujo	d931334bd4	Fix regression introduced on 272446r. lagg(4) supports the protocol none, where it disables any traffic without disabling the lagg(4) interface itself. PR: 206921 Submitted by: Pushkar Kothavade <pushkarbk@gmail.com> Reviewed by: rpokala Approved by: bapt (mentor) MFC after: 3 weeks Sponsored by: gandi.net Differential Revision: https://reviews.freebsd.org/D5076	2016-02-19 06:35:53 +00:00
Marcelo Araujo	d62edc5eb5	Add an IOCTL rr_limit to let users fine tuning the number of packets to be sent using roundrobin protocol and set a better granularity and distribution among the interfaces. Tuning the number of packages sent by interface can increase throughput and reduce unordered packets as well as reduce SACK. Example of usage: # ifconfig bge0 up # ifconfig bge1 up # ifconfig lagg0 create # ifconfig lagg0 laggproto roundrobin laggport bge0 laggport bge1 \ 192.168.1.1 netmask 255.255.255.0 # ifconfig lagg0 rr_limit 500 Reviewed by: thompsa, glebius, adrian (old patch) Approved by: bapt (mentor) Relnotes: Yes Differential Revision: https://reviews.freebsd.org/D540	2016-01-23 04:18:44 +00:00
Steven Hartland	d6e82913c1	Revert r292275 & r292379 glebius has concerns about these changes so reverting those can be discussed and addressed. Sponsored by: Multiplay	2015-12-17 14:41:30 +00:00
Steven Hartland	52e53e2de0	Fix lagg failover due to missing notifications When using lagg failover mode neither Gratuitous ARP (IPv4) or Unsolicited Neighbour Advertisements (IPv6) are sent to notify other nodes that the address may have moved. This results is slow failover, dropped packets and network outages for the lagg interface when the primary link goes down. We now use the new if_link_state_change_cond with the force param set to allow lagg to force through link state changes and hence fire a ifnet_link_event which are now monitored by rip and nd6. Upon receiving these events each protocol trigger the relevant notifications: * inet4 => Gratuitous ARP * inet6 => Unsolicited Neighbour Announce This also fixes the carp IPv6 NA's that stopped working after r251584 which added the ipv6_route__llma route. The new behavour can be controlled using the sysctls: * net.link.ether.inet.arp_on_link * net.inet6.icmp6.nd6_on_link Also removed unused param from lagg_port_state and added descriptions for the sysctls while here. PR: 156226 MFC after: 1 month Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D4111	2015-12-15 16:02:11 +00:00
Alexander V. Chernikov	8ad43f2d0a	Move iflladdr_event eventhandler invocation to if_setlladdr. Suggested by: glebius	2015-11-14 13:34:03 +00:00
Alexander V. Chernikov	bb3d23fd35	Fix lladdr change propagation for on vlans on top of it. Fix lladdr update when setting mac address manually. Fix lladdr_event for slave ports addition. MFC after: 4 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D4004	2015-11-01 19:59:04 +00:00
Hiroki Sato	b7a581eaa6	Fix a panic when destroying a lagg interface. Differential Revision: https://reviews.freebsd.org/D3883	2015-10-16 01:16:01 +00:00
Hiroki Sato	023d10cbc7	Fix a bug that caused reinitialization failure of MAC addresses on the lagg interface when removing the primary port. PR: 201916 Differential Revision: https://reviews.freebsd.org/D3301	2015-10-07 06:32:34 +00:00
Marcelo Araujo	973532fc7d	Remove per complete the fec aggregation protocol. The remove began with revision r271733. NOTE: This patch must never be merge to 10-Stable Reviewed by: glebius Approved by: bapt (mentor) Relnotes: Yes Sponsored by: EuroBSDCon Sweden. Differential Revision: D3786	2015-10-04 08:00:29 +00:00
Hiren Panchasara	0e02b43a07	Make LAG LACP fast timeout tunable through IOCTL. Differential Revision: D3300 Submitted by: LN Sundararajan <lakshmi.n at msystechnologies> Reviewed by: wblock, smh, gnn, hiren, rpokala at panasas MFC after: 2 weeks Sponsored by: Panasas	2015-08-12 20:21:04 +00:00
Andrey V. Elsukov	a4b65afcab	Fix a possible mbuf leak on interface departure. Reported by: Alexandre Martins	2015-03-26 23:40:22 +00:00
Hans Petter Selasky	b7ba031ff7	Factor out mbuf hashing code from LAGG driver so that other network drivers can use it. This avoids some code duplication. Add missing default case to all switch statements while at it. Also move the hashing of the IPv6 flow field to layer 4 because the IPv6 flow field is constant on a per L4 connection basis and not on a per L3 network. Differential Revision: https://reviews.freebsd.org/D1987 Sponsored by: Mellanox Technologies MFC after: 1 month	2015-03-11 16:02:24 +00:00
Will Andrews	0e5f55bb95	Improve the distribution of LAGG port traffic. I edited the original change to retain the use of arc4random() as a seed for the hashing as a very basic defense against intentional lagg port selection. The author's original commit message (edited slightly): sys/net/ieee8023ad_lacp.c sys/net/if_lagg.c In lagg_hashmbuf, use the FNV hash instead of the old hash32_buf. The hash32 family of functions operate one octet at a time, and when run on a string s of length n, their output is equivalent to : ----- i=n-1 \ n \ (n-i-1) 32 ( seed^ + / 33^ * s[i] ) % 2^ / ----- i=0 The problem is that the last five bytes of input don't get multiplied by sufficiently many powers of 33 to rollover 2^32. That means that changing the last few bytes (but obviously not the very last) of input will always change the value of the hash by a multiple of 33. In the case of lagg_hashmbuf() with ipv4 input, the last four bytes are the TCP or UDP port numbers. Since the output of lagg_hashmbuf is always taken modulo the port count, and 3 is a common port count for a lagg, that's bad. It means that the UDP or TCP source port will never affect which lagg member is selected on a 3-port lagg. At 10Gbps, I was not able to measure any difference in CPU consumption between the old and new hash. Submitted by: asomers (original commit) Reviewed by: emaste, glebius MFC after: 1 week Sponsored by: Spectra Logic MFSpectraBSD: 1001723 on 2013/08/28 (original) 1114258 on 2015/01/22 (edit)	2015-01-23 00:06:35 +00:00
Andrey V. Elsukov	504289ea5a	Fix condition and really sort ports. Also add comment describing the intent of this code. Reported by: sbruno MFC after: 1 week Sponsored by: Yandex LLC	2015-01-17 11:32:09 +00:00
Hans Petter Selasky	c25290420e	Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but does no longer have any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file. This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows. "M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before. Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped. MFC after: 1 month Sponsored by: Mellanox Technologies	2014-12-01 11:45:24 +00:00
Hiroki Sato	bf6d3f0c7c	- Fix lladdr configuration which could prevent LACP mode from working. - Fix LORs when a laggport interface has an IPv6 LLA. PR: 194321	2014-10-17 09:08:44 +00:00
Hiroki Sato	6d47816791	- Move L2 addr configuration for the primary port to a taskqueue. This fixes LOR of softc rmlock in iflladdr_event handlers. - Call if_delmulti_ifma() after LACP_UNLOCK(). This fixes another LOR. - Fix a panic in lacp_transit_expire(). - Fix a panic in lagg_input() upon shutting down a port.	2014-10-05 02:34:21 +00:00
Hiroki Sato	9732189ca9	Separate option handling from SIOC[SG]LAGG to SIOC[SG]LAGGOPTS for backward compatibility with old ifconfig(8).	2014-10-02 20:01:13 +00:00
Hiroki Sato	939a050ad9	Virtualize lagg(4) cloner. This change fixes a panic when tearing down if_lagg(4) interfaces which were cloned in a vnet jail. Sysctl nodes which are dynamically generated for each cloned interface (net.link.lagg.N.*) have been removed, and use_flowid and flowid_shift ifconfig(8) parameters have been added instead. Flags and per-interface statistics counters are displayed in "ifconfig -v". CR: D842	2014-10-01 21:37:32 +00:00
Gleb Smirnoff	dee826cec0	Fix off by one in lagg_port_destroy(). Reported by: "Max N. Boyarov" <zotrix bsd.by>	2014-10-01 11:23:54 +00:00
Gleb Smirnoff	112f50ffb2	Finally, convert counters in struct ifnet to counter(9). Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-28 08:57:07 +00:00
Gleb Smirnoff	2357543753	Convert to if_inc_counter() last remnantes of bare access to struct ifnet counters.	2014-09-28 07:43:38 +00:00
Alexander V. Chernikov	7d6cc45c9b	Use underlying ports counters to get lagg statistics instead of per-packet accounting. This introduce user-visible changes like aggregating error counters. Reviewed by: asomers (prev.version), glebius CR: D781 MFC after: 2 weeks Sponsored by: Yandex LLC	2014-09-27 13:57:48 +00:00
Gleb Smirnoff	eade13f9d2	Remove macros that hide access to struct ifnet fields.	2014-09-26 13:02:29 +00:00
Gleb Smirnoff	38738d739a	Make all lagg protocol methods live in lagg_protos, not in softc. All interfaces of a same protocol, use the same methods. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 12:54:24 +00:00
Andrey V. Elsukov	30e5de489d	Keep list of lagg ports sorted by if_index. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2014-09-26 12:42:06 +00:00
Gleb Smirnoff	6900d0d328	- Whitespace. - Remove caddr_t.	2014-09-26 12:35:58 +00:00
Gleb Smirnoff	16ca790ead	- Provide lagg_proto_attach(), lagg_proto_detach(). - Make detach a protocol method in lagg_protos. - Simplify code to lookup protocols. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 11:01:04 +00:00
Gleb Smirnoff	09c7577ef3	- When reconfiguring protocol on a lagg, first set it to LAGG_PROTO_NONE, then drop lock, run the attach routines, and then set it to specific proto. This removes tons of WITNESS warnings. - Make lagg protocol attach handlers not failing and allocate memory with M_WAITOK. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 08:42:32 +00:00
Gleb Smirnoff	b1bbc5b3d1	Make lagg protocols detach methods returning void. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 07:12:40 +00:00
Hans Petter Selasky	9fd573c39d	Improve transmit sending offload, TSO, algorithm in general. The current TSO limitation feature only takes the total number of bytes in an mbuf chain into account and does not limit by the number of mbufs in a chain. Some kinds of hardware is limited by two factors. One is the fragment length and the second is the fragment count. Both of these limits need to be taken into account when doing TSO. Else some kinds of hardware might have to drop completely valid mbuf chains because they cannot loaded into the given hardware's DMA engine. The new way of doing TSO limitation has been made backwards compatible as input from other FreeBSD developers and will use defaults for values not set. Reviewed by: adrian, rmacklem Sponsored by: Mellanox Technologies MFC after: 1 week	2014-09-22 08:27:27 +00:00
Marcelo Araujo	99cdd96163	Add laggproto broadcast, it allows sends frames to all ports of the lagg(4) group and receives frames on any port of the lagg(4). Phabric: D549 Reviewed by: glebius, thompsa Approved by: glebius Obtained from: OpenBSD Sponsored by: QNAP Systems Inc.	2014-09-18 02:12:48 +00:00
Hans Petter Selasky	72f3100047	Revert r271504. A new patch to solve this issue will be made. Suggested by: adrian @	2014-09-13 20:52:01 +00:00

1 2 3

134 Commits