numam-dpdk

Author	SHA1	Message	Date
Michael Baum	dec50e58f7	net/mlx5: remove declaration duplications The following four functions are implemented in the mlx5_ethdev.c file: - mlx5_dev_infos_get - mlx5_fw_version_get - mlx5_dev_set_mtu - mlx5_hairpin_cap_get In the mlx5.h file they are declared twice: first under mlx5.c and again under mlx5_ethdev.c. This patch removes the redundant declarations. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-02-21 11:36:41 +01:00
Michael Baum	6be4c57add	net/mlx5: fix errno update in shared context creation The mlx5_alloc_shared_dev_ctx() function has a local variable named "err" which holds the errno value in case of failure. When a function called by this function fails, this variable is updated with its return value (which should be a positive errno value). However, some functions don't update the errno value themselves or return a negative errno value. If one of them fails, the "err" variable contains a negative value, which causes an assertion failure. This patch updates all functions used by mlx5_alloc_shared_dev_ctx() to set rte_errno, and takes this value instead of the "err" value. Fixes: `5dfa003db5` ("common/mlx5: fix post doorbell barrier") Fixes: `5d55a494f4` ("net/mlx5: split multi-thread flow handling per OS") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-02-21 11:36:40 +01:00
Michael Baum	ce12974cce	net/mlx5: fix ASO CT object release The ASO connection tracking structure is initialized once per shared device context. Its release takes place in the close function, which is called for each ethdev individually. That is, when there is more than one ethdev under the same shared device context, the structure is destroyed when one of them is closed. If another ethdev uses it later, this may cause a crash. In addition, the structure is created in the spawn function. If the creation of one of the objects following it fails, the structure is supposed to be destroyed, but this does not happen. This patch moves its release to the shared device context free function and thus solves both problems. Fixes: `0af8a2298a` ("net/mlx5: release connection tracking management") Fixes: `ee9e5fad03` ("net/mlx5: initialize connection tracking management") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-02-21 11:36:39 +01:00
Michael Baum	ad9d0c6395	net/mlx5: fix ineffective metadata argument adjustment In "dv_xmeta_en" devarg there is an option of dv_xmeta_en=3 which engages tunnel offload mode. In E-Switch configuration, that mode implicitly activates dv_xmeta_en=1. The update according to E-switch support is done immediately after the first parsing of the devargs, but there is another adjustment later. This patch moves the adjustment after the second parsing. Fixes: `4ec6360de3` ("net/mlx5: implement tunnel offload") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-02-21 11:36:38 +01:00
Michael Baum	dcbaafdc8f	net/mlx5: fix sibling device config check The MLX5 net driver supports "probe again". When probing again, it creates a new ethdev under an existing infiniband device context. Sibling devices sharing an infiniband device context should have compatible configurations, so some of the devargs given in the probe again, mainly those relevant to the shared device context, are passed to the mlx5_dev_check_sibling_config function, which makes sure that they are compatible with those of the siblings. However, the arguments are adjusted according to the capabilities of the device, and the function compares the probe-again arguments before adjustment with the sibling arguments after adjustment. A user who passes the same values to all siblings may fail this comparison if a requested value is not supported by the device and was adjusted. This patch moves the call to the mlx5_dev_check_sibling_config function after the relevant adjustments. Fixes: `92d5dd4834` ("net/mlx5: check sibling device configurations mismatch") Fixes: `2d241515eb` ("net/mlx5: add devarg for extensive metadata support") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-02-21 11:36:32 +01:00
Ferruh Yigit	a41f593f1b	ethdev: introduce generic dummy packet burst function Multiple PMDs have dummy/noop Rx/Tx packet burst functions. Since these dummy functions are very simple, introduce a common function in ethdev and update the drivers to use it instead of each driver having its own. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2022-02-11 21:17:34 +01:00
Dariusz Sosnowski	864678e420	net/mlx5: fix inline length for multi-segment TSO This patch removes a redundant assert in mlx5_tx_packet_multi_tso(). That assert assured that the amount of bytes requested to be inlined is greater than or equal to the minimum amount of bytes required to be inlined. This requirement is either derived from the NIC inlining mode or configured through devargs. When using TSO this requirement can be disregarded, because on all NICs it is satisfied by TSO inlining requirements, since TSO requires L2, L3, and L4 headers to be inlined. Fixes: `18a1c20044` ("net/mlx5: implement Tx burst template") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-02-10 09:44:34 +01:00
Alexander Kozyrev	eb11edd9db	net/mlx5: fix meter capabilities reporting Meter capabilities reporting is not up to date. Mellanox NICs support RFC2698 and RFC4115 as well as RFC2697. Add these marker operations to the capabilities list. Fixes: `6bc327b94f` ("net/mlx5: fill meter capabilities using DevX") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-02-10 09:44:33 +01:00
Alexander Kozyrev	21fdeab422	net/mlx5: fix committed bucket size Committed Bucket Size (CBS) calculation tries to fit into the 8-bit wide mantissa field by setting 256 as a maximum value for it. To compensate for this increase in the mantissa value, the exponent value has to be reduced by 8. However, this gives a negative exponent value for CBS less than 128, and a negative exponent value is not supported by the NIC. Adjust the CBS calculation only for values bigger than 128 to allow both small and big bucket sizes. Fixes: `3bd26b23ce` ("net/mlx5: support meter profile operations") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-02-10 09:44:32 +01:00
Viacheslav Ovsiienko	7cad3dc312	net/mlx5: remove unused metadata shift parameter Due to updated modify field action immediate value buffer pattern [1] the implicit shift for the metadata is not needed anymore and should be removed. [1] commit `40c8fb1fd3` ("net/mlx5: update modify field action") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-02-10 09:44:31 +01:00
Viacheslav Ovsiienko	ced4900cde	net/mlx5: fix metadata endianness in modify field action As the immediate source parameter of the modify field action, the metadata should follow the CPU endianness (according to the SET_META action structure format). The mlx5 PMD wrongly handled the immediate parameter metadata buffer as big-endian, resulting in a metadata set action with incorrect endianness. Fixes: `40c8fb1fd3` ("net/mlx5: update modify field action") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-02-10 09:44:28 +01:00
Stephen Hemminger	06c047b680	remove unnecessary null checks Functions like free, rte_free, and rte_mempool_free already handle NULL pointer so the checks here are not necessary. Remove redundant NULL pointer checks before free functions found by nullfree.cocci Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2022-02-12 12:07:48 +01:00
Elena Agostini	8e83ba285a	net/mlx5: add C++ include guard to public header The support for linking rte_pmd_mlx5.h functions with C++ applications was missing. Signed-off-by: Elena Agostini <eagostini@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-02-03 09:19:26 +01:00
Xiaoyu Min	87b26522f7	net/mlx5: reject jump to root table Currently the root table is not supported as a destination. A jump action that is finally translated to the underlying root table in rdma-core should be rejected. Fixes: `f78f747f41` ("net/mlx5: allow jump to group lower than current") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-01-26 17:41:12 +01:00
Raja Zidane	082becbf1f	net/mlx5: fix mark enabling for Rx To optimize the datapath, the mlx5 PMD checked for a mark action on flow creation and flagged the possible destination Rx queues (through queue/RSS actions), then enabled the mark action logic only for the flagged Rx queues. The mark action didn't work if no queue/RSS action was in the same flow, even when the user used multi-group logic to manage the flows. So, if the mark action was performed in group X and the packet was moved to group Y > X, then when the packet was forwarded to Rx queues, SW did not get the mark ID in the mbuf. Flag the Rx datapath to report the mark action for any queue when the driver detects the first mark action after the dev_start operation. Fixes: `8e61555657` ("net/mlx5: fix shared RSS and mark actions combination") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-01-26 17:41:11 +01:00
Alexander Kozyrev	728b6447e7	net/mlx5: fix MPRQ WQE size assertion Preparation of the stride size and the number of strides for Multi-Packet RQ was updated recently to accommodate the hardware limitation about minimum WQE size. The wrong assertion was introduced to ensure this limitation is met. Assert that the configured WQE size is not less than the minimum supported size. Fixes: `34776af600` ("net/mlx5: fix MPRQ stride devargs adjustment") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-01-18 09:30:25 +01:00
Alexander Kozyrev	9701034faa	net/mlx5: fix maximum packet headers size for TSO The maximum packet headers size for TSO is calculated as a sum of Ethernet, VLAN, IPv6 and TCP headers (plus inner headers). The rationale behind choosing IPv6 and TCP is that their headers are bigger than IPv4 and UDP respectively, giving us the maximum possible headers size. But this is not true for L3 headers. The IPv4 header size (20 bytes) is smaller than the IPv6 header size (40 bytes) only in the default case. There can be up to 10 additional 32-bit words of Options when IHL > 5. This means that the maximum size of the IPv4 header is 60 bytes. Choosing the wrong maximum packet headers size makes it impossible to transmit multi-segment TSO packets with IPv4 Options present: the PMD checks that it is possible to inline all the packet headers, and the packet headers size exceeds the expected maximum size. The maximum packet headers size was set to 192 bytes before, but its value was reduced during the Tx path refactoring. Restore the proper maximum packet headers size for TSO. Fixes: `50724e1bba` ("net/mlx5: update Tx definitions") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-01-16 13:06:39 +01:00
Dmitry Kozlyuk	637582afcc	net/mlx5: relax headroom assertion A debug assertion in Single-Packet Receive Queue (SPRQ) mode required all Rx mbufs to have a 128 byte headroom, based on the assumption that rte_pktmbuf_init() sets it. However, rte_pktmbuf_init() may set a smaller headroom if the dataroom is insufficient, e.g. this is a natural case for split buffer segments. The headroom can also be larger. Only check the headroom size when vectored Rx routines are used because they rely on it. Relax the assertion to require sufficient headroom size, not an exact one. Fixes: `a0a45e8af7` ("net/mlx5: configure Rx queue for buffer split") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-01-09 13:06:38 +01:00
Dmitry Kozlyuk	be8cda4932	net/mlx5: fix GCC uninitialized variable warning When building with -Db_sanitize=thread, GCC gives a warning: drivers/net/mlx5/mlx5_flow_meter.c: In function ‘mlx5_flow_meter_create’: drivers/net/mlx5/mlx5_flow_meter.c:1170:33: warning: ‘legacy_fm’ may be used uninitialized in this function [-Wmaybe-uninitialized] This is a false-positive: legacy_fm is initialized and used if and only if priv->sh->meter_aso_en is false. Work around this by initializing legacy_fm to NULL. Add an assertion before legacy_fm use in case the logic changes. Fixes: `4443201863` ("net/mlx5: support meter creation with policy") Cc: stable@dpdk.org Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-01-06 10:09:00 +01:00
Tal Shnaiderman	e50fe91ae3	net/mlx5: support imissed counter on Windows Add support for the imissed counter using the DevX API on Windows. imissed is queried by creating a queue counter for the port, attaching it to all created RQs and querying the "out_of_buffer" field. If the counter cannot be created, imissed will always report 0. Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-01-06 10:07:59 +01:00
Gregory Etelson	985b479267	net/mlx5: fix GRE protocol type translation for Verbs When application creates several flows to match on GRE tunnel without explicitly specifying GRE protocol type value in flow rules, PMD will translate that to zero mask. RDMA-CORE cannot distinguish between different inner flow types and produces identical matchers for each zero mask. The patch extracts inner header type from flow rule and forces it in GRE protocol type, if application did not specify any. Fixes: `84c406e745` ("net/mlx5: add flow translate function") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-01-06 10:07:49 +01:00
Gregory Etelson	f3f1f576f4	net/mlx5: fix RSS expansion with explicit next protocol The PMD RSS expansion scheme by default compiles flow rules for all flow item types that may branch out from a stub supplied by application. For example, ETH can lead to VLAN, IPv4 or IPv6. IPv4 can lead to UDP, TCP, IPv4 or IPv6. If application explicitly specified next protocol type, expansion must use that option only and not create flows with other protocol types. The PMD ignored explicit next protocol values in GRE and VXLAN-GPE. The patch updates RSS expansion for GRE and VXLAN-GPE with explicit next protocol settings. Fixes: `c7870bfe09` ("ethdev: move RSS expansion code to mlx5 driver") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-01-06 10:07:41 +01:00
Lior Margalit	0c7606b75b	net/mlx5: fix assertion on flags set in packet mbuf Fix the assertion on the flags set in pkt->ol_flags for vectorized MPRQ. With vectorized MPRQ the CQs are processed before copying the MPRQ bufs, so the valid assertion is that the expected flag is set, not that pkt->ol_flags equals this flag alone. Fixes: `0f20acbf5e` ("net/mlx5: implement vectorized MPRQ burst") Cc: stable@dpdk.org Signed-off-by: Lior Margalit <lmargalit@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-01-06 10:07:30 +01:00
Michael Baum	147f6fb42b	net/mlx5: fix memory socket selection in ASO management When creating ASO objects (WQE, CQE and MR), the socket number is given as a parameter. The selection was wrongly hardcoded to socket 0, even if the user didn't configure memory for this socket. This patch changes the selection to the default socket (SOCKET_ID_ANY). Fixes: `f935ed4b64` ("net/mlx5: support flow hit action for aging") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-12-22 09:56:01 +01:00
Michael Baum	34776af600	net/mlx5: fix MPRQ stride devargs adjustment In Multi-Packet RQ creation, the user can choose the number of strides and their size in bytes, using a specific devarg for each of these parameters. These two parameters determine the size of the WQE, which is their product. If the user selects values that are not in the supported range, the PMD changes them to default values. However, apart from the range limitation on each parameter individually, there is also a minimum value for their product. When the user selects values whose product is lower than the minimum, no adjustment is made and the creation of the WQE fails. This patch adds an adjustment in these cases as well: when the user selects values whose product is lower than the minimum, they are replaced with the default values. Fixes: `ecb160456a` ("net/mlx5: add device parameter for MPRQ stride size") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-12-05 12:22:09 +01:00
Michael Baum	0947ed380f	net/mlx5: improve stride parameter names In the striding RQ management there are two important parameters, the size of the single stride in bytes and the number of strides. Both the data-path structure and config structure keep the log of the above parameters. However, in their names there is no mention that the value is a log which may be misleading as if the fields represent the values themselves. This patch updates their names describing the values more accurately. Fixes: `ecb160456a` ("net/mlx5: add device parameter for MPRQ stride size") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-12-05 12:22:09 +01:00
Viacheslav Ovsiienko	252b5ae036	net/mlx5: fix modify field MAC address offset The MAC address fields are 48 bits wide and are processed by the mlx5 PMD as two words. A bug was introduced in the offset handling, causing malfunction of the MODIFY_FIELD action with MAC address fields as source or destination and with a non-zero field offset. Fixes: `40c8fb1fd3` ("net/mlx5: update modify field action") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-12-05 12:24:13 +01:00
Josh Soref	7be78d0279	fix spelling in comments and strings The tool comes from https://github.com/jsoref Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2022-01-11 12:16:53 +01:00
Michael Baum	8648fa2f46	net/mlx5: fix devargs validation for multi-class probing The mlx5_args function reads the devargs and checks if they are valid for this driver and if not it returns an error. This was normal behavior as long as all the devargs come to this driver, but since it is possible to run several drivers together, the function may return an error for another driver's devarg even though it is completely valid. In addition the function does not allow the user to know which of the devargs is incorrect, but returns an error without printing the unknown devarg. This patch eliminates the error return in the case of an unknown devarg, and prints a warning for each such devarg specifically. Fixes: `7b4f1e6bd3` ("common/mlx5: introduce common library") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-26 13:36:16 +01:00
Sean Morrissey	b53d106d34	remove repeated 'the' in the code Remove the use of double "the" as it does not make sense. Cc: stable@dpdk.org Signed-off-by: Sean Morrissey <sean.morrissey@intel.com> Signed-off-by: Conor Fogarty <conor.fogarty@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-26 11:28:34 +01:00
Viacheslav Ovsiienko	572c9d4bda	net/mlx5: fix shared Rx queue segment configuration match While joining a shared Rx queue to an existing queue group, the queue configuration is checked to be the same as specified at the creation of the first queue in the group: all shared queues should be created with identical configurations. During Rx queue creation the buffer split segment configuration can be altered: zero segment sizes are substituted with the actual ones inherited from the pools, the number of segments can be extended to cover the maximal packet length, etc. This means the actual queue segment configuration cannot be used directly to match the configuration provided in the queue setup call. To resolve the issue, store the original parameters in the shared queue structure and perform the check against these stored ones. Fixes: `09c2555303` ("net/mlx5: support shared Rx queue") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-24 17:25:37 +01:00
Alexander Kozyrev	94421842de	net/mlx5: fix GENEVE and VXLAN-GPE flow item matching GENEVE and VXLAN-GPE item matching is done similarly to GRE matching. Users can skip the specification of the protocol type and expect that this type is deduced from the inner header type automatically. But the inner header type may be left unspecified in order to match all the protocol types. In this case, the PMD should not specify the protocol type. Check if we have the inner header type before setting the protocol type. Fixes: `690391dd0e` ("net/mlx5: fix GENEVE protocol type translation") Fixes: `861fa3796f` ("net/mlx5: fix VXLAN-GPE next protocol translation") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-24 17:25:36 +01:00
Alexander Kozyrev	9e61533df2	net/mlx5: fix GRE flow item matching GRE protocol type is implicitly set in the matching translation in case an application doesn't specify any type explicitly in a flow rule. It is extracted from the inner header type, but this type may be absent. In this case, GRE item matching is broken. Check if we have the inner header type before setting it to allow matching on all GRE packets. Fixes: `be26e81bfc` ("net/mlx5: fix GRE protocol type translation") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-24 17:25:36 +01:00
Dmitry Kozlyuk	ec9b812b6c	net/mlx5: fix Rx queue reference count for indirect RSS mlx5_ind_table_obj_modify() was changing the reference counters of neither the new set of RxQs nor the old set of RxQs. On the other hand, creation of the RSS incremented the RxQ refcnt. If an RxQ was present in both the initial and the modified set, its reference counter was incremented one extra time compared to the queues that were only present in the new set. This prevented releasing said RxQ resources on port stop: flow indirect_action 0 create action_id 1 \ action rss queues 0 1 end / end flow indirect_action 0 update 1 \ action rss queues 2 3 end / end quit ... mlx5_net: mlx5.c:1622: mlx5_dev_close(): port 0 some Rx queue objects still remain mlx5_net: mlx5.c:1626: mlx5_dev_close(): port 0 some Rx queues still remain Increment reference counters for the new set of RxQs and decrement them for the old set of RxQs when needed. Remove explicit referencing of RxQ from mlx5_ind_table_obj_attach() because it reuses mlx5_ind_table_obj_modify() code doing this. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com>	2021-11-24 17:25:35 +01:00
Dmitry Kozlyuk	c65d684497	net/mlx5: fix indirect RSS creation when port is stopped mlx5_ind_table_obj_setup() was incrementing RxQ reference counters even when the port was stopped, which prevented RxQ release and triggered an internal assertion. Only increment reference counter when the port is started. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com>	2021-11-24 17:25:33 +01:00
Dmitry Kozlyuk	0ece5de3c4	net/mlx5: fix crash on close after failed start If mlx5_rxq_start() failed and rxq_ctrl was not initialized, mlx5_rxq_obj_verify() would segfault in an attempt to dereference it. Add a check that rxq_ctrl is not NULL before accessing its members. Fixes: `09c2555303` ("net/mlx5: support shared Rx queue") Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-23 21:39:05 +01:00
Dariusz Sosnowski	8fbce96fbe	net/mlx5: fix reference count on detached indirect action This patch fixes segfault which was triggered when port, with indirect actions created, was closed. Segfault was occurring only when RTE_LIBRTE_MLX5_DEBUG was defined. It was caused by redundant decrement of RX queues refcount: - refcount was decremented when port was stopped and indirect actions were detached from RX queues (port stop), - refcount was decremented when indirect actions objects were destroyed (port close or destroying of indirect action). This patch fixes behavior. Dereferencing Rx queues is done if and only if indirect action is explicitly destroyed by the user or detached on port stop. Dereferencing Rx queues on action destroy operation depends on an argument to the wrapper of indirect action destroy operation, introduced in this patch. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-23 17:57:19 +01:00
Dariusz Sosnowski	fa4883456d	net/mlx5: fix multi-segment packet wraparound This patch fixes the assertion failure triggered when the user configured minimum inline length requirements and the application transmitted multi segment packets. Failure was triggered when space left in TX queue was not enough to cover this requirement. This patch limits the length of data to be copied to the available space in TX queue. Fixes: `cacb44a099` ("net/mlx5: add no-inline Tx flag") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-23 17:57:13 +01:00
Bing Zhao	db3ec06e7a	net/mlx5: fix RSS validation for meter hierarchy In a meter hierarchy, all the meters are marked with having RSS if the final meter's termination action is RSS. When validating a flow rule with meter hierarchy, the RSS action should not be fetched from the current meter if it is not the final one. The fate action union is next meter ID instead of the pointer to the RSS action. By using the final meter in the hierarchy, the flow rule validation will succeed without any crash caused by the invalid RSS action pointer access. Fixes: `1ce19ab1f4` ("net/mlx5: fix RSS validation with meter policy") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Reviewed-by: Li Zhang <lizh@nvidia.com> Reviewed-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-23 17:57:13 +01:00
Jiawei Wang	693c7d4b1e	net/mlx5: fix flow mark with sampling and metering If there are both a sample action and a meter action in the same flow, the mlx5 PMD performs several levels of splitting. For example, the sampling feature splits the original flow into a prefix subflow with the sample action and a suffix subflow with the rest of the actions. Then, the metering feature splits the sampling suffix subflow into its own meter subflows. If a mark action was added before the sample and meter actions, the flow mark flag was kept in the sample subflows but reset when handling the metering split, causing the flow mark value to be lost. This patch keeps the flow mark flag of the previous subflow, so that the following meter subflows handle the flow mark correctly. Fixes: `9ade91dfe8` ("net/mlx5: fix group value of sample suffix flow") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-23 14:24:17 +01:00
Dmitry Kozlyuk	08ac03580e	common/mlx5: fix mempool registration Mempool registration was not correctly processing mempools with the RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF flag set ("pinned mempools" for short), because it is not known at registration time whether the mempool is a pktmbuf one, and its elements may not yet be initialized to analyze them. Attempts had been made to recognize such pools, but there was no robust solution: only the owner of a mempool (the application or a device) knows its type. This patch extends the common/mlx5 registration code to accept a hint that the mempool is a pinned one and uses this capability from the net/mlx5 driver. 1. Remove all code assuming pktmbuf pool type or trying to recognize the type of a pool. 2. Register pinned mempools used for Rx and their external memory on port start. Populate the MR cache with all their MRs. 3. Change Tx slow path logic as follows: 3.1. Search the mempool database for a memory region (MR) by the mbuf pool and its buffer address. 3.2. If no MR for the address is found for the mempool, and the mempool contains only pinned external buffers, perform the registration of the mempool and its external pinned memory. 3.3. Fall back to using page-based MRs in other cases (for example, a buffer with externally attached memory, but not from a pinned mempool). Fixes: `690b2a88c2` ("common/mlx5: add mempool registration facilities") Fixes: `fec28ca0e3` ("net/mlx5: support mempool registration") Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-21 15:38:07 +01:00
Jiawei Wang	144d222305	net/mlx5: fix mismatch metadata flow with meter action The mlx5 PMD introduced the table id attribute to allow multiple flow tables on the same table level for flow metering, so there can be multiple flow table objects with the same table level but different table ids. If the extended metadata mode is enabled, all flows containing destination Queue/RSS actions are split into two subflows: the prefix one jumps to the MLX5_FLOW_MREG_CP_TABLE_GROUP flow table to copy MARK action data, and the suffix one performs the destination Queue/RSS action. The table_id for the jump in the metadata split prefix flow is always 0. If the flow itself was the metering split suffix subflow, the table id was set to 1 in the flow split structure and the metadata split suffix subflow was created in the table with the wrong table id, causing the metadata suffix flow mismatch. This patch resets the table id to 0 while creating the metadata suffix flows. Fixes: `51ec04dc7b` ("net/mlx5: connect meter policy to created flows") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-21 15:38:06 +01:00
Jiawei Wang	16f4aa57ca	net/mlx5: fix metadata and meter split shared tag In the metadata flow split, the PMD created the prefix subflow with the Queue or RSS action removed and appended the set tag and copy table jump actions. If the flow being split for metadata was the meter prefix subflow, the driver was supposed to share the same meter split tag action for the metadata split flow. The check for a preceding meter split tag action was wrong, causing the metadata split set tag action to be appended, so that the meter suffix subflow was missed due to a tag value mismatch. This patch adds a check before copying into the extended action list to make sure the correct shared tag is used. Fixes: `8d72fa6689` ("net/mlx5: share tag between meter and metadata") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-21 15:38:02 +01:00
Viacheslav Ovsiienko	a750169c4f	net/mlx5: fix modify field destination bit offset If the modify field action requests the field copy/set transaction from other field, the destination field hardware bit offset was assigned incorrectly with non-zero byte offset, causing wrong translations for the fields with sizes larger than 32 bits. Fixes: `40c8fb1fd3` ("net/mlx5: update modify field action") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-17 15:51:40 +01:00
Dariusz Sosnowski	421177ccd7	net/mlx5: fix MPLS tunnel outer layer overwrite mlx5 PMD incorrectly overwrote outer layer fields in MPLS tunnel rte_flow patterns using defaults for MPLS tunnels. This included overwriting UDP destination port in MPLSoUDP and GRE protocol field in MPLSoGRE. This patch fixes this behavior. If application provides the values in flow pattern items preceding the MPLS flow item the provided values will be used, otherwise the defaults will be applied. Fixes: `d1abe664dd` ("net/mlx5: add MPLS to Direct Verbs flow engine") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-17 11:48:18 +01:00
Dariusz Sosnowski	7775172c04	net/mlx5: fix partial inline of fine grain packets Assuming a user tried to send multi-segment packets, with RTE_PMD_MLX5_FINE_GRANULARITY_INLINE flag set, using a device with minimum inlining requirements (such as ConnectX-4 Lx or when user specified them explicitly), sending such packets caused segfault. Segfault was caused by failed invariants in mlx5_tx_packet_multi_inline function. This patch introduces a logic for multi-segment packets, with RTE_PMD_MLX5_FINE_GRANULARITY_INLINE flag set, to omit mbuf scanning for filling inline buffer and inline only minimal amount of data required. Fixes: `ec837ad0fc` ("net/mlx5: fix multi-segment inline for the first segments") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-17 11:48:18 +01:00
Dmitry Kozlyuk	e4c402afc1	common/mlx5: fix MPRQ mempool registration Mempool registration code had a wrong assumption that it is always dealing with packet mempools and always called rte_pktmbuf_priv_flags(), which returned a random value for different types of mempools. In particular, it could consider MPRQ mempools as having externally pinned buffers, which is wrong. Packet mempools cannot be reliably recognized, but it is sufficient to check that the mempool is not a packet one, so it cannot have externally pinned buffers. Compare mempool private data size to that of packet mempools to check. Fixes: `690b2a88c2` ("common/mlx5: add mempool registration facilities") Fixes: `fec28ca0e3` ("net/mlx5: support mempool registration") Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 16:45:21 +01:00
Chengfeng Ye	1e580ed4b0	net/mlx5: fix mutex unlock in Tx packet pacing cleanup The lock sh->txpp.mutex was not correctly released on one path of cleanup function return, potentially causing the deadlock. Fixes: `d133f4cdb7` ("net/mlx5: create clock queue for packet pacing") Cc: stable@dpdk.org Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 17:55:17 +01:00
Dmitry Kozlyuk	5c078fce58	net/mlx5: fix keeping indirect RSS non-isolated mode When a port starts in non-isolated mode, an internal indirect RSS is created that includes all configured queues and a flow rule is created that references this indirect RSS. If before switching to non-isolated mode an indirect RSS was created that includes the same set of queues, it would be reused at this point. However, because the port had been stopped (or not yet started), the TIR for this indirect RSS had been destroyed (or not yet created). The flow rule could not be created and the port start failed. Creation of TIRs is moved before configuring non-isolated mode flows, but it is not enough because of the following issue. Commit `0cedf34da7` ("net/mlx5: move Rx queue reference count") changed mlx5_rxq_get() not to increment RxQ control structure reference count, mlx5_rxq_ref() was introduced for this purpose. mlx5_ind_table_obj_attach() was not updated to use the new function, so when the port was stopped, the control structure reference count of an RxQ used in RSS reached zero and the structure was destroyed. Use mlx5_rxq_ref() to keep RxQ control structure needed for indirect RSS persistence across port restart. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Fixes: `0cedf34da7` ("net/mlx5: move Rx queue reference count") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 14:05:35 +01:00
Bing Zhao	1ce19ab1f4	net/mlx5: fix RSS validation with meter policy The RSS can be one of the fate actions when creating a meter with policy. In the previous implementation, the RSS validation was missed when creating a flow rule with such meter due to the fact that a policy meter was created first and then used in the rule. In the stage of meter creation, no rte_flow_item* information was provided. An unnecessary RSS expansion might be called since the validation was missed and would cause an unexpected error of the rule creation. Even though the rule should be rejected from the very beginning, it would cause confusion. There might be some other errors when the validation was missed. Adding the RSS validation inside the meter action validation will prevent the code from continuing when there is a conflict between the items, other actions and the policy meter RSS action. Fixes: `4443201863` ("net/mlx5: support meter creation with policy") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Reviewed-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-16 10:23:31 +01:00
Gregory Etelson	be26e81bfc	net/mlx5: fix GRE protocol type translation When application creates several flows to match on GRE tunnel without explicitly specifying GRE protocol type value in flow rules, PMD will translate that to zero mask. RDMA-CORE cannot distinguish between different inner flow types and produces identical matchers for each zero mask. The patch extracts inner header type from flow rule and forces it in GRE protocol type, if application did not specify any protocol type value. Fixes: `fc2c498ccb` ("net/mlx5: add Direct Verbs translate items") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 10:22:56 +01:00
Gregory Etelson	690391dd0e	net/mlx5: fix GENEVE protocol type translation When application creates several flows to match on GENEVE tunnel without explicitly specifying GENEVE protocol type value in flow rules, PMD will translate that to zero mask. RDMA-CORE cannot distinguish between different inner flow types and produces identical matchers for each zero mask. The patch extracts inner header type from flow rule and forces it in GENEVE protocol type, if application did not specify any protocol type value. Fixes: `e59a5dbcfd` ("net/mlx5: add flow match on GENEVE item") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 10:22:55 +01:00
Gregory Etelson	a21d616b99	net/mlx5: fix RSS expansion scheme for GRE header RFC-2784 allows any valid Ethernet type in GRE protocol type field. Add Ethernet to GRE RSS expansion. Fixes: `f4b901a46a` ("net/mlx5: add flow GRE item") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 10:22:54 +01:00
Gregory Etelson	9f151fd8df	net/mlx5: add Ethernet header to GENEVE RSS expansion RFC-8926 allows inner Ethernet header after GENEVE tunnel. Current GENEVE RSS expansion created IPv4 and IPv6 paths only. The patch adds Ethernet to RSS expansion scheme. Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 10:22:53 +01:00
Gregory Etelson	861fa3796f	net/mlx5: fix VXLAN-GPE next protocol translation VXLAN-GPE extends VXLAN protocol and provides the next protocol field specifying the first inner header type. The application can assign some explicit value to VXLAN-GPE::next_protocol field or set it to the default one. In the latter case, the rdma-core library cannot recognize the matcher built by PMD correctly, and it results in hardware configuration missing inner headers match. The patch forces VXLAN-GPE::next_protocol assignment if the application did not explicitly assign it a non-default value. Fixes: `90456726eb` ("net/mlx5: fix VXLAN-GPE item translation") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-16 10:22:51 +01:00
Michael Baum	71304b5c7b	common/mlx5: fix redundant field in MR control structure Inside the MR control structure there is a pointer to the common device. This pointer enables access to the global cache as well as hardware objects that may be required in case a new MR needs to be created. The purpose of adding this pointer into the MR control structure was to avoid its transfer as a parameter to all the functions of searching MR in the caches. However, adding it to this structure increased the size of the Rx and Tx data-path structures, and all the fields that followed it were slightly moved away, which caused a reduction in performance. This patch removes the pointer from the structure. It can be accessed through the "dev_gen_ptr" existing field using the "container_of" operator. Fixes: `334ed198ab` ("common/mlx5: remove redundant parameter in MR search") Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-17 10:42:20 +01:00
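The "container_of" technique named in commit 71304b5c7b can be sketched as follows. This is a minimal illustration, not the driver's code: the structures `common_dev` and `mr_ctrl`, the macro `CONTAINER_OF` and the helper `mr_ctrl_to_dev()` are hypothetical stand-ins, and the real mlx5 structure layout is assumed rather than reproduced. Only the field name `dev_gen_ptr` comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the usual container_of helper (hypothetical name). */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical device object that embeds a generation counter. */
struct common_dev {
	int id;
	unsigned int dev_gen;
};

/*
 * Hypothetical per-queue MR control structure. It keeps only a pointer
 * to the generation counter; no device pointer is stored here.
 */
struct mr_ctrl {
	unsigned int *dev_gen_ptr;
	unsigned int cur_gen;
};

/* Recover the enclosing device from the address of its embedded field. */
static struct common_dev *mr_ctrl_to_dev(const struct mr_ctrl *ctrl)
{
	assert(ctrl != NULL && ctrl->dev_gen_ptr != NULL);
	return CONTAINER_OF(ctrl->dev_gen_ptr, struct common_dev, dev_gen);
}
```

Because the counter lives inside the device object, subtracting the field offset from the stored pointer yields the device address, so the hot data-path structure does not grow by an extra pointer.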
Bing Zhao	ce78c51833	net/mlx5: fix delay drop bit set overflow The attribute to record the global control of hairpin queues' delay drop was defined as a bit-field with one bit, and the intention was to reduce the memory overhead. Meanwhile, the macro was defined as an enumerated value 0x2. No matter what value was input via devarg, the lowest bit was always zero and the higher bits would be ignored. For hairpin queues, the delay drop attribute couldn't be enabled. With the commit, the double logical negation is used to fix this. Fixes: `febcac7b46` ("net/mlx5: support Rx queue delay drop") Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-14 16:26:49 +01:00
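The truncation described in commit ce78c51833 can be reproduced in a few lines. The sketch below is an illustration under assumptions: the macro names, the `queue_cfg` structure and both setter functions are hypothetical, and only the bit meanings (bit 0 for standard Rx queues, bit 1 for hairpin Rx queues) and the one-bit field width are taken from the commit messages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask values: bit 0 = standard Rx, bit 1 = hairpin Rx. */
#define DELAY_DROP_STANDARD 0x1u
#define DELAY_DROP_HAIRPIN  0x2u

/* Hypothetical configuration with one-bit attributes. */
struct queue_cfg {
	unsigned int std_delay_drop:1;
	unsigned int hairpin_delay_drop:1;
};

/* Buggy variant: 0x2 does not fit in one bit and is truncated to 0. */
static void cfg_set_buggy(struct queue_cfg *cfg, unsigned int devarg)
{
	assert(cfg != NULL);
	cfg->std_delay_drop = devarg & DELAY_DROP_STANDARD;
	cfg->hairpin_delay_drop = devarg & DELAY_DROP_HAIRPIN;
}

/* Fixed variant: double negation maps any non-zero value to exactly 1. */
static void cfg_set_fixed(struct queue_cfg *cfg, unsigned int devarg)
{
	assert(cfg != NULL);
	cfg->std_delay_drop = !!(devarg & DELAY_DROP_STANDARD);
	cfg->hairpin_delay_drop = !!(devarg & DELAY_DROP_HAIRPIN);
}
```

With a devarg value of 0x3, the buggy setter leaves the hairpin attribute at 0 while the fixed setter stores 1. Storing into an unsigned one-bit field keeps only the lowest bit of the value, which is why the masked result 0x2 was lost.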
Gregory Etelson	60bc280518	net/mlx5: fix flex item macro collision Flex item macro definition values duplicated existing connection tracking values. The patch provides new values for flex item macros. Fixes: `a23e9b6e3e` ("net/mlx5: handle flex item in flows") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-14 09:25:13 +01:00
Gregory Etelson	c06b773809	net/mlx5: fix integrity conversion scheme RTE flow integrity API provides top-level packet validations. RTE integrity bits are not always translated one-to-one to hardware integrity bits. For example RTE l3_ok and l4_ok integrity bits require 2 hardware integrity bits each. The patch fixes RTE l3_ok and l4_ok bits translation to match ConnectX-6 hardware. Fixes: `79f8952783` ("net/mlx5: support integrity flow item") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-14 09:24:23 +01:00
Viacheslav Ovsiienko	11cfe349b3	net/mlx5: fix Tx scheduling check There was a redundant check for the enabled E-Switch, this resulted in device probing failure if the Tx scheduling was requested and E-Switch was enabled. Fixes: `f17e4b4ffe` ("net/mlx5: add Tx scheduling check on queue creation") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-14 09:18:28 +01:00
Viacheslav Ovsiienko	e53d873dc0	net/mlx5: fix modify field action conversion The routine converting RTE flow modify field action into field driver's presentation did not specify the field mask correctly and this resulted into wrong conversion for the actions with shifted fields. Fixes: `40c8fb1fd3` ("net/mlx5: update modify field action") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-10 15:44:45 +01:00
Bing Zhao	6b5b3005cb	net/mlx5: fix RETA update without stopping device The global redirection table is used to create the default flow rules for the ingress traffic with the lowest priority. It is also used to create the default RSS rule in the destination table when there is a tunnel offload. To update the RETA in-flight, there is no restriction in the ethdev API. In the previous implementation of mlx5, a port restart was needed to make the new configuration take effect. The restart is heavy, e.g., all the queues will be released and reallocated, users' rules will be flushed. Since the restart is internal, there is a risk to crash the application when some change in the ethdev is introduced but no workaround is done in mlx5 PMD. The users' rules, including the default miss rule for tunnel offload, should not be impacted by the RETA update. It is improper to flush all rules when updating RETA. With this patch, only the default rules will be flushed and re-created with the new table configuration. Fixes: `3f2fe392bd` ("net/mlx5: fix crash during RETA update") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-10 15:44:44 +01:00
Jiawei Wang	a9b6ea45be	net/mlx5: fix tag ID conflict with sample action For the flows containing sample action, the tag action was added implicitly to store the unique flow index into metadata register in the split prefix subflow, and then match on this index in the split suffix subflow. The metadata register for flow index of sample split subflows was also used to store application metadata TAG 0 item, this might cause TAG 0 corruption in the flows with sample actions. This patch uses the same metadata register C index as used for ASO action since it's reserved and not used directly by the application, and adds the checking in validation to make sure not to conflict with ASO CT in the same flow. Fixes: `b4c0ddbfcc` ("net/mlx5: split sample flow into two sub-flows") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-10 15:44:43 +01:00
Gregory Etelson	aaa6a7ec0f	net/mlx5: fix tunnel offload validation Tunnel offload API allows the application to restore packet to its original form if the chain of flows is missed after DECAP action. MLX5 PMD provides tunnel offload support only if DV API was enabled. The patch verifies DV availability before processing with tunnel offload tasks. Fixes: `4ec6360de3` ("net/mlx5: implement tunnel offload") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-10 15:44:43 +01:00
Dmitry Kozlyuk	7297d2cdec	common/mlx5: fix external memory pool registration Registration of packet mempools with RTE_PKTMBUF_POOL_PINNED_EXT_MEM was performed incorrectly: after population of such mempool chunks only contain memory for rte_mbuf structures, while pointers to actual external memory are not yet filled. MR LKeys could not be obtained for external memory addresses of such mempools. Rx datapath assumes all used mempools are registered and does not fallback to dynamic MR creation in such case, so no packets could be received. Skip registration of extmem pools on population because it is useless. If used for Rx, they are registered at port start. During registration, recognize such pools, inspect their mbufs and recover the pages they reside in. While MRs for these pages may already be created by rte_dev_dma_map(), they are not reused to avoid synchronization on Rx datapath in case these MRs are changed in the database. Fixes: `690b2a88c2` ("common/mlx5: add mempool registration facilities") Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-10 15:44:42 +01:00
Rongwei Liu	0888c011d5	net/mlx5: fix meter policy validation When a user specifies meter policy like "g_actions queue / end y_actions queue / r_action drop / end", validation logic missed to set meter policy mode and it took a random value from the stack. Define ALL policy modes for the mentioned cases. Fixes: `4b7bf3ffb4` ("net/mlx5: support yellow in meter policy validation") Cc: stable@dpdk.org Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Bing Zhao <bingz@nvidia.com>	2021-11-10 15:44:42 +01:00
Bing Zhao	0ad28e873c	net/mlx5: fix RSS consistency check of meter policy After yellow color actions in the metering policy were supported, the RSS could be used for both green and yellow colors and only the queues attribute could be different. When specifying the attributes of a RSS, some fields can be ignored and some default values will be used in PMD. For example, there is a default RSS key in the PMD and it will be used to create the TIR if nothing is provided by the application. The default value cases were missed in the current implementation and it would cause some false positives or crashes. The comparison function should be adjusted to take all cases into consideration when RSS is used for both green and yellow colors. Fixes: `4b7bf3ffb4` ("net/mlx5: support yellow in meter policy validation") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-10 15:44:39 +01:00
Michael Baum	8451e165b8	net/mlx5: workaround MR creation for flow counter Due to kernel driver / FW issues in direct MKEY creation using the DevX API, this patch replaces the counter MR creation to use wrapped mkey API. Fixes: `5382d28c21` ("net/mlx5: accelerate DV flow counter transactions") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Signed-off-by: Matan Azrad <matan@nvidia.com>	2021-11-10 15:50:44 +01:00
Dmitry Kozlyuk	077be91dd7	net/mlx5: fix split buffer Rx Routine to lookup LKey on Rx was assuming that the mbuf address always belongs to a single mempool: the one associated with an RxQ or the MPRQ mempool. This assumption is false for split buffers case. A wrong LKey was looked up, resulting in completion errors. Modify lookup routines to lookup LKey in the mbuf->pool for non-MPRQ cases both on Rx datapath and on queue initialization. Fixes: `fec28ca0e3` ("net/mlx5: support mempool registration") Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-08 13:56:29 +01:00
Michael Baum	5dfa003db5	common/mlx5: fix post doorbell barrier The rdma-core library can map doorbell register in two ways, depending on the environment variable "MLX5_SHUT_UP_BF": - as regular cached memory, the variable is either missing or set to zero. This type of mapping may cause the significant doorbell register writing latency and requires an explicit memory write barrier to mitigate this issue and prevent write combining. - as non-cached memory, the variable is present and set to not "0" value. This type of mapping may cause performance impact under heavy loading conditions but the explicit write memory barrier is not required and it may improve core performance. The UAR creation function maps a doorbell in one of the above ways according to the system. In run time, it always adds an explicit memory barrier after writing to. In cases where the doorbell was mapped as non-cached memory, the explicit memory barrier is unnecessary and may impair performance. The commit [1] solved this problem for a Tx queue. In run time, it checks the mapping type and provides the memory barrier after writing to a Tx doorbell register if it is needed. The mapping type is extracted directly from the uar_mmap_offset field in the queue properties. This patch shares this code between the drivers and extends the above solution for each of them. [1] commit `8409a28573` ("net/mlx5: control transmit doorbell register mapping") Fixes: `f8c97babc9` ("compress/mlx5: add data-path functions") Fixes: `8e196c08ab` ("crypto/mlx5: support enqueue/dequeue operations") Fixes: `4d4e245ad6` ("regex/mlx5: support enqueue") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-07 16:21:03 +01:00
Michael Baum	b6e9c33c82	net/mlx5: remove duplicated reference of Tx doorbell The Tx doorbell has different virtual addresses per process. The secondary process takes the UAR physical page ID of the primary and mmap it to its own virtual address. The primary doorbell references were saved in two shared memory locations: the TxQ structure and a dedicated doorbell array. Remove the doorbell reference from the TxQ structure and move the primary processes to take the UAR information from the primary doorbell array. Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-07 16:21:03 +01:00
Michael Baum	204891763c	common/mlx5: make multi-process MR management port-agnostic In the multi-process mechanism, there are things that the secondary process does not perform itself but asks the primary process to perform for it. There is a special API for communication between the processes that receives parameters necessary for the specific action required as well as a special structure called mp_id that contains the port number of the processes through which the initial process finds the relevant ETH device for the processes. One of the operations performed through this mechanism is the creation of a memory region, where the secondary process sends the virtual address as a parameter and the mp_id structure with the port number inside it. However, once the memory area management is shared between the drivers and either port number or ETH device is no longer relevant to them, it seems unnecessary to continue communicating between the processes through the mp_id variable. In this patch we will remove the use of the above structure for all MR management, and add to the specific parameter of operations a pointer to the common device that contains everything needed to create/register MR. Fixes: `9f1d636f3e` ("common/mlx5: share MR management") Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-07 14:12:08 +01:00
Michael Baum	334ed198ab	common/mlx5: remove redundant parameter in MR search Memory region management has recently been shared between drivers, including the search for caches in the data plane. The initial search in the local linear cache of the queue, usually yields a result and one should not continue searching in the next level caches. The function that searches in the local cache gets the pointer to a device as a parameter, that is not necessary for its operation but for subsequent searches (which, as mentioned, usually do not happen). Transferring the device to a function and maintaining it, takes some time and causes some impact on performance. Add the pointer to the device as a field of the mr_ctrl structure. The field will be updated during control path and will be used only when needed in the search. Fixes: `fc59a1ec55` ("common/mlx5: share MR mempool registration") Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-07 14:11:16 +01:00
Bing Zhao	e848218741	net/mlx5: check delay drop settings in kernel driver The delay drop is the common feature managed on per device basis and the kernel driver is responsible one for the initialization and rearming. By default, the timeout value is set to activate the delay drop when the driver is loaded. A private flag "dropless_rq" is used to control the rearming. Only when it is on, the rearming will be handled once received a timeout event. Or else, the delay drop will be deactivated after the first timeout occurs and all the Rx queues won't have this feature. The PMD is trying to query this flag and warn the application when some queues are created with delay drop but the flag is off. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-05 17:04:53 +01:00
Bing Zhao	febcac7b46	net/mlx5: support Rx queue delay drop For the Ethernet RQs, if all receiving descriptors are exhausted, the packets being received will be dropped. This behavior prevents slow or malicious software entities at the host from affecting the network. While for hairpin cases, even if there is no software involved during the packet forwarding from Rx to Tx side, some hiccup in the hardware or back pressure from Tx side may still cause the descriptors to be exhausted. In certain scenarios it may be preferred to configure the device to avoid such packet drops, assuming the posting of descriptors will resume shortly. To support this, a new devarg "delay_drop" is introduced. By default, the delay drop is enabled for hairpin Rx queues and disabled for standard Rx queues. This value is used as a bit mask: - bit 0: enablement of standard Rx queue - bit 1: enablement of hairpin Rx queue And this attribute will be applied to all Rx queues of a device. The "rq_delay_drop" capability in the HCA_CAP is checked before creating any queue. If the hardware capabilities do not support this delay drop, all the Rx queues will still be created without this attribute, and the devarg setting will be ignored even if it is specified explicitly. A warning log is used to notify the application when this occurs. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-05 17:04:53 +01:00
Viacheslav Ovsiienko	25ed2ebff1	net/mlx5: support shared Rx queue port data path When receiving a packet, mlx5 PMD saves the mbuf port number from RxQ data. To support shared RxQ, save the port number into the RQ context as user index. A received packet resolves its port number from the CQE user index, which is derived from the RQ context. Legacy Verbs API doesn't support RQ user index setting, so the port number is still read from the RxQ. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:51 +01:00
Xueming Li	09c2555303	net/mlx5: support shared Rx queue This patch introduces shared RxQ. All shared Rx queues with same group and queue ID share the same rxq_ctrl. Rxq_ctrl and rxq_data are shared, all queues from different member port share same WQ and CQ, essentially one Rx WQ, mbufs are filled into this singleton WQ. Shared rxq_data is set into device Rx queues of all member ports as RxQ object, used for receiving packets. Polling queue of any member ports returns packets of any member, mbuf->port is used to identify source port. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:50 +01:00
Xueming Li	5cf0707fc7	net/mlx5: remove Rx queue data list from device Rx queue data list(priv->rxqs) can be replaced by Rx queue list(priv->rxq_privs), removes it and replaces with universal wrapper API. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:49 +01:00
Xueming Li	5ceb3a02b0	net/mlx5: move Rx queue DevX resource To support shared RX queue, moves DevX RQ which is per queue resource to Rx queue private data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:48 +01:00
Xueming Li	5db77fef78	net/mlx5: remove port info from shareable Rx queue To prepare for shared Rx queue, removes port info from shareable Rx queue control. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:47 +01:00
Xueming Li	44126bd9d0	net/mlx5: move Rx queue hairpin info to private data Hairpin info of Rx queue can't be shared, moves to private queue data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:47 +01:00
Xueming Li	0cedf34da7	net/mlx5: move Rx queue reference count Rx queue reference count is counter of RQ, used to count reference to RQ object. To prepare for shared Rx queue, this patch moves it from rxq_ctrl to Rx queue private data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:46 +01:00
Xueming Li	4cda06c3c3	net/mlx5: split Rx queue into shareable and private To prepare shared Rx queue, splits RxQ data into shareable and private. Struct mlx5_rxq_priv is per queue data. Struct mlx5_rxq_ctrl is shared queue resources and data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:45 +01:00
Xueming Li	53232e3b05	net/mlx5: clean Rx queue code This patch removes unused Rx queue code. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:45 +01:00
Xueming Li	fdb67b84a5	net/mlx5: fix Rx queue memory allocation return value If error happened during Rx queue mbuf allocation, boolean value returned. From description, return value should be error number. This patch returns negative error number. Fixes: `0f20acbf5e` ("net/mlx5: implement vectorized MPRQ burst") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:44 +01:00
Xueming Li	056c87d07d	common/mlx5: support receive memory pool The hardware Receive Memory Pool (RMP) object holds the destination for incoming packets/messages that are routed to the RMP through RQs. RMP enables sharing of memory across multiple Receive Queues. Multiple Receive Queues can be attached to the same RMP and consume memory from that shared pool. When using RMPs, completions are reported to the CQ pointed to by the RQ, and the user index that was set at RQ creation time is carried to the completion entry. This patch enables RMP based RQ, RMP is created when mlx5_devx_rq.rmp is set. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:43 +01:00
Xueming Li	68fa62924d	net/mlx5: fix Altivec Rx This patch fixes stale field reference. Fixes: `a18ac61133` ("net/mlx5: add metadata support to Rx datapath") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:41 +01:00
Gregory Etelson	a23e9b6e3e	net/mlx5: handle flex item in flows Provide flex item recognition, validation and translation in flow patterns. Track the flex item referencing. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:41 +01:00
Viacheslav Ovsiienko	6dac7d7ff2	net/mlx5: translate flex item pattern into matcher The matcher is a steering engine entity that represents the flow pattern to hardware to match. In order to provide match on the flex item pattern the appropriate matcher fields should be configured with values and masks accordingly. The flex item related matcher fields are an array of eight 32-bit fields to match with data captured by sample registers of configured flex parser. One packet field, presented in item pattern can be split between several sample registers, and multiple fields can be combined together into single sample register to optimize hardware resources usage (number of sample registers is limited), depending on field modes, widths and offsets. Actual mapping is complicated and controlled by special translation data, built by PMD on flex item creation. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:40 +01:00
Viacheslav Ovsiienko	b293e8e49d	net/mlx5: translate flex item configuration RTE Flow flex item configuration should be translated into actual hardware settings: - translate header length and next protocol field samplings - translate data field sampling, the similar fields with the same mode and matching related parameters are relocated and grouped to be covered with minimal amount of hardware sampling registers (each register can cover arbitrary neighbour 32 bits (aligned to byte boundary) in the packet and we can combine the fields with smaller lengths or segments of bigger fields) - input and output links translation - preparing data for parsing flex item pattern on flow creation Signed-off-by: Gregory Etelson <getelson@nvidia.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:39 +01:00
Gregory Etelson	9086ac093a	net/mlx5: add flex parser DevX object management The DevX flex parsers can be shared between representors within the same IB context. We should put the flex parser objects into the shared list and engage the standard mlx5_list_xxx API to manage ones. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:38 +01:00
Viacheslav Ovsiienko	db25cadc08	net/mlx5: add flex item operations This patch is a preparation step for implementing the flex item feature in the driver and it provides: - external entry point routines for flex item creation/deletion - flex item objects management over the ports. The flex item object keeps information about the item created over the port - a reference counter to track whether the item is in use by some active flows and the pointer to the underlying shared DevX object, providing all the data needed to translate the flow flex pattern into matcher fields according to the hardware configuration. Not many flex items are expected to be created on the port, so the design is optimized for flow insertion rate rather than memory savings. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:38 +01:00
Viacheslav Ovsiienko	575740d10e	net/mlx5: update eCPRI flex parser structures To handle eCPRI protocol in the flows the mlx5 PMD engages flex parser hardware feature. While we were implementing eCPRI support we anticipated the flex parser usage extension, and all related variables were named accordingly, containing flex syllabus. Now we are preparing to introduce more common approach of flex item, in order to avoid naming conflicts and improve the code readability the eCPRI infrastructure related variables are renamed as preparation step. Later, once we have the new flex item implemented, we could consider to refactor the eCPRI protocol support to move on common flex item basis. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:37 +01:00
Dmitry Kozlyuk	ec4e11d41d	net/mlx5: preserve indirect actions on restart MLX5 PMD uses reference counting to manage RX queue resources. After port stop shared RSS actions kept references to RX queues, preventing resource release. As a result, internal PMD mempool for such queues had been exhausted after a number of port restarts. Diagnostic message from rte_eth_dev_start(): Rx queue allocation failed: Cannot allocate memory Dereference RX queues used by indirect actions on port stop (detach) and restore references on port start (attach) in order to allow RX queue resource release, but keep indirect RSS across the port restart. Replace queue IDs in HW by drop queue ID on detach and restore actual queue IDs on attach. When the port is stopped, create indirect RSS in the detached state. As a result, MLX5 PMD is able to keep all its indirect actions across port restart. Advertise this capability. Fixes: `4b61b8774b` ("ethdev: introduce indirect flow action") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-02 18:59:17 +01:00
Dmitry Kozlyuk	bc5bee028e	net/mlx5: create drop queue using DevX Drop queue creation and destruction were not implemented for DevX flow engine and Verbs engine methods were used as a workaround. Implement these methods for DevX so that there is a valid queue ID that can be used regardless of queue configuration via API. Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-02 18:59:17 +01:00
Dmitry Kozlyuk	c5042f93a4	net/mlx5: discover max flow priority using DevX Maximum available flow priority was discovered using Verbs API regardless of the selected flow engine. This required some Verbs objects to be initialized in order to use DevX engine. Make priority discovery an engine method and implement it for DevX using its API. Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-02 18:59:17 +01:00
Lior Margalit	a451287102	net/mlx5: fix RSS expansion with EtherType The RSS expansion algorithm is using a graph to find the possible expansion paths. A graph node with the 'explicit' flag will be skipped, if it is not found in the flow pattern. The current implementation misses a check for the explicit flag when expanding the pattern according to ETH item with EtherType. For example: testpmd> flow create 0 ingress pattern eth / ipv6 / udp / vxlan / eth type is 2048 / end actions rss level 2 types udp end / end The "eth type is 2048" item in the pattern may be expanded to "ETH IPv4". The ETH node in the expansion graph is followed by VLAN node marked as explicit. The fix is to skip the VLAN node and continue the expansion with its next nodes, IPv4 and IPv6. The expansion paths for the above example will be: ETH IPV6 UDP VXLAN ETH END ETH IPV6 UDP VXLAN ETH IPV4 UDP END Fixes: `69d268b4ff` ("net/mlx5: fix RSS expansion for explicit graph node") Cc: stable@dpdk.org Signed-off-by: Lior Margalit <lmargalit@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-01 14:53:37 +01:00
Jiawei Wang	7797b0fe64	net/mlx5: fix meter action pool protection Flow creation with the ASO meter action can run on multiple threads. Meter pools manage the meter object resources; if there is no room in the current meter pool, the pool is resized to the new pool size and the old one is freed. There is a race condition when one thread resizes the meter pool and frees the old pool resource while another thread queries a meter object by index in the old pool: the returned value is invalid. This patch adds a read-write lock to protect the pool resource during resize and query. Fixes: `a5835d530f` ("net/mlx5: optimize Rx queue match") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-01 14:53:36 +01:00
Jiawei Wang	7cf2d15a39	net/mlx5: fix age action pool protection Flow creation with the age action can run on multiple threads. Age pools manage the age resources; if there is no room in the current pool, the age pool is resized to the new pool size and the old one is freed. There is a race condition when one thread resizes the age pool and frees the old pool resource while another thread queries the age action value in the old pool: the queried value is invalid. This patch uses a read-write lock to protect the pool resource during resize and query. Fixes: `a5835d530f` ("net/mlx5: optimize Rx queue match") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-01 14:53:35 +01:00
David Marchand	7f49dafe05	net/mlx5: do not close stdin on error If for any reason, a socket could not be opened, mlx5_pmd_socket_init() could close the 0 fd (which is valid, and has a fair chance to be stdin), since server_socket == 0 from the variable being in .bss. Fixes: `e6cdc54cc0` ("net/mlx5: add socket server for external tools") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>	2021-11-01 08:51:48 +01:00
Alexander Kozyrev	a5a0a43bc6	net/mlx5: allow meta modifications in legacy mode The MODIFY_FIELD RTE action rejects copy to/from metadata in case of the legacy mode extensive flow metadata support. It is not consistent with SET_META action that has no such restriction imposed. Registers A or B are used for META in legacy mode. Allow meta modifications in legacy mode as well. On other hand, SET_META rejects actions in case register C is not available even though it is not needed in legacy mode. Skip this check for legacy mode and allow setting META. Fixes: `edf325d421` ("net/mlx5: check extended metadata for meta modification") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-31 13:31:13 +01:00
Alexander Kozyrev	f4f8f5aee3	net/mlx5: fix Tx meta width for modify field flow rule Register C is used for the metadata within NIC Rx domain. And its width can vary from 0 to 32 bits depending on its kernel usage. But it is not the case within NIC Tx domain, register A is always 32 bits there. Fix metadata width detection for the modify_field flow API within NIC Tx domain. Fixes: `6d5735c1cb` ("net/mlx5: fix meta register conversion for extensive mode") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-31 13:31:12 +01:00
Maxime Coquelin	5aeb7fab59	net/mlx5: fix RSS RETA update This patch fixes RETA updating for entries above 64. Without that, these entries are never updated as calculated mask value will always be 0. Fixes: `634efbc2c8` ("mlx5: support RETA query and update") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-29 11:23:10 +02:00
Gregory Etelson	23b0a8b298	net/mlx5: fix integrity item validation and translation Integrity item validation and translation must verify that integrity item bits match L3 and L4 items in flow rule pattern. For cases when integrity item was positioned before L3 header, such verification must be split into two stages. The first stage detects integrity flow item and makes initializations for the second stage. The second stage is activated after PMD completes processing of all flow items in rule pattern. PMD accumulates information about flow items in flow pattern. When all pattern flow items were processed, PMD can apply that data to complete integrity item validation and translation. Fixes: `79f8952783` ("net/mlx5: support integrity flow item") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-28 10:14:39 +02:00
Gregory Etelson	06741117ec	net/mlx5: fix integrity match on inner and outer headers MLX5 PMD can match on integrity bits for inner and outer headers in a single flow. That means a single flow rule can reference both inner and outer integrity bits. That is implemented by adding 2 flow integrity items to a rule - one item for outer integrity bits and other for inner integrity bits. Integrity item `level` parameter specifies what part is being targeted. Current PMD treated integrity items for outer and inner headers as the same. The patch separates PMD verifications for inner and outer integrity items. Fixes: `79f8952783` ("net/mlx5: support integrity flow item") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-28 10:14:38 +02:00
Haifei Luo	a7ac7fae49	net/mlx5: enhance flow dump Multiple rules could use the same encap_decap/modify_hdr/counter action. The flow dump data could be duplicated. To avoid redundancy, flow dump value is based on the actions' pointer instead of previous rules' pointer. For counter, the data is stored in cmng of priv->sh. For encap_decap/modify_hdr, the data stored in encaps_decaps/modify_cmds. Traverse the fields and get action's pointer and information. Formats are same for information in the dump except "id" stands for actions' pointer: Counter: rec_type,id,hits,bytes Modify_hdr: rec_type,id,actions_number,actions Encap_decap: rec_type,id,buf Signed-off-by: Haifei Luo <haifeil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-28 10:14:21 +02:00
Jiawei Wang	3c4338a421	net/mlx5: optimize device spawn time with representors During the device spawn process, mlx5 PMD queried the available flow priorities by calling mlx5_flow_discover_priorities, queried if the DR drop action was supported on the root table by calling the mlx5_flow_discover_dr_action_support routine, and queried the availability of metadata register C by calling mlx5_flow_discover_mreg_c. These functions created test flows to get the supported fields, and destroyed the test flows at the end. The test flows in the first two functions were created on the root table. If the device was spawned with multiple representors, these test flows were created and destroyed on each representor as well. The above operations took a significant amount of init time during the device spawn. This patch optimizes the device discover functions: if a device with multiple representors (VF/SF) is being spawned, the priority, drop action and metadata register support checks are done only once and the check results are shared by all representors. Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-27 14:04:39 +02:00
Rongwei Liu	7299ab6822	net/mlx5: support socket direct mode bonding In socket direct mode, it's possible to bind any two (maybe four in future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces no longer need to have the same PCIe domain/bus/device ID; the kernel driver uses "system_image_guid" to identify whether devices can be bound together. Sysfs "phys_switch_id" is used to get the "system_image_guid" of each network interface. OFED 5.4+ is required to support "phys_switch_id". Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-26 13:24:20 +02:00
Harman Kalra	d61138d4f0	drivers: remove direct access to interrupt handle Removing direct access to interrupt handle structure fields, rather use respective get set APIs for the same. Making changes to all the drivers access the interrupt handle fields. Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Raslan Darawsheh <rasland@nvidia.com>	2021-10-25 21:20:12 +02:00
Olivier Matz	daa02b5cdd	mbuf: add namespace to offload flags Fix the mbuf offload flags namespace by adding an RTE_ prefix to the name. The old flags remain usable, but a deprecation warning is issued at compilation. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>	2021-10-24 13:37:43 +02:00
Olivier Matz	5b63493241	mbuf: mark old VLAN offload flags as deprecated The flags PKT_TX_VLAN_PKT and PKT_TX_QINQ_PKT are marked as deprecated since commit `380a7aab1a` ("mbuf: rename deprecated VLAN flags") (2017). But they were not using the RTE_DEPRECATED macro, because it did not exist at this time. Add it, and replace usage of these flags. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-10-24 13:30:40 +02:00
Ferruh Yigit	295968d174	ethdev: add namespace Add 'RTE_ETH' namespace to all enums & macros in a backward compatible way. The macros for backward compatibility can be removed in next LTS. Also updated some struct names to have 'rte_eth' prefix. All internal components switched to using new names. Syntax fixed on lines that this patch touches. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Wisam Jaddo <wisamm@nvidia.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>	2021-10-22 18:15:38 +02:00
Rongwei Liu	a89f6433aa	net/mlx5: set Tx queue affinity in round-robin Previously, we set txq affinity to 0 and let firmware perform round-robin when bonding. Firmware uses a global counter to assign txq affinity to different physical ports according to the remainder after division. There are three disadvantages: 1. The global counter is shared between kernel and DPDK. 2. After restarting the PMD or port, the previous counter value is reused, so the new affinity is unpredictable. 3. There is no way to get what affinity is set by firmware. In this update, we create several TISs, up to the number of bonding ports, and bind each TIS to one PF port. Each port starts picking TISs using its port index. An upper layer application can quickly calculate each txq's affinity without querying. At the DPDK layer, when creating txqs with 2 bonding ports, the affinity is set like: port 0: 1-->2-->1-->2 port 1: 2-->1-->2-->1 port 2: 1-->2-->1-->2 Note: Only applicable to the DevX API. This affinity is subject to HW hash. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 12:37:00 +02:00
Dmitry Kozlyuk	ea823b2c51	net/mlx5: close tools socket with last device MLX5 PMD exposes a socket for external tools to dump port state. Socket events are listened using an interrupt source of EXT type. The socket was closed and the interrupt callback was unregistered at program exit, which is incorrect because DPDK could be already shut down at this point. Move actions performed at program exit to the moment the last MLX5 port is closed. The socket will be opened again if later a new MLX5 device is plugged in and probed. Also fix comments that were decisively talking about secondary processes instead of external tools. Fixes: `e6cdc54cc0` ("net/mlx5: add socket server for external tools") Cc: stable@dpdk.org Reported-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-10-21 10:31:53 +02:00
Dmitry Kozlyuk	9ec1ceab76	net/mlx5: fix Rx queue resource cleanup mlx5_rxq_start() allocates rxq_ctrl->obj and frees it on failure, but did not set it to NULL. Later mlx5_rxq_release() could not recognize this object is already freed and attempted to release its resources, resulting in a crash: Configuring Port 0 (socket 0) mlx5_common: Failed to create RQ using DevX mlx5_common: Can't create DevX RQ object. mlx5_net: Port 0 Rx queue 0 RQ creation failure. Segmentation fault Set rxq_ctrl->obj to NULL after it is freed to skip resource release. Fixes: `1260a87b28` ("net/mlx5: share Rx control code") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 09:31:17 +02:00
Bing Zhao	273b09376c	net/mlx5: fix meter yellow policy with RSS action The RSS configuration in a policy action container was a pointer inside a union, and the pointer area could be used by another fate action. In the current implementation, the RSS of the green color took priority over that of the yellow color. There was a high possibility that the pointer was treated as the RSS, resulting in an erroneous flow expansion when only the yellow color had the RSS action. The fate action type should also be checked to avoid the misjudgment. Fixes: `b38a12272b` ("net/mlx5: split meter color policy handling") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 09:31:15 +02:00
Xueming Li	614966c2fa	net/mlx5: check DevX to support more Verbs ports Verbs API doesn't support device port numbers larger than 255 by design. To support more VF or SubFunction port representors, force the DevX API check when the maximum number of Verbs device link ports is larger than 255. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:14 +02:00
Xueming Li	686d05b60d	net/mlx5: enable DevX Tx queue creation Verbs API does not support Infiniband device port numbers larger than 255 by design. To support more representors on a single Infiniband device, the DevX API should be engaged. While creating a Send Queue (SQ) object with Verbs API, the PMD assigned the IB device port attribute and the kernel created the default miss flows in the FDB domain, to redirect egress traffic from the queue being created to the representor's appropriate peer (wire, HPF, VF or SF). With DevX API there is no IB-device port attribute (it is merely a kernel one, DevX operates in PRM terms) and the PMD must create default miss flows in FDB explicitly. The PMD did not provide this, and using DevX API for E-Switch configurations was disabled. The default miss FDB flow matches the E-Switch manager vport (to make sure the source is some representor) and SQn (Send Queue number - device internal queue index). The root flow table is managed by kernel/firmware and does not support the vport redirect action, so we have to split the default miss flow into two: - a flow with lowest priority in the root table that matches the E-Switch manager vport ID and jumps to group 1. - a flow in group 1 that matches the E-Switch manager vport ID and SQn and forwards the packet to the peer vport Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:13 +02:00
Xueming Li	ebe9afedc7	net/mlx5: fix internal root table flow priority When creating internal transfer flow on root table with lowest priority, the flow was created with max UINT32_MAX priority. It is wrong since the flow is created in kernel and max priority supported is 16. This patch fixes this by adding internal flow check. Fixes: `5f8ae44dd4` ("net/mlx5: enlarge maximal flow priority") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:12 +02:00
Xueming Li	d9020f2577	net/mlx5: support flow item of normal Tx queue Extends txq flow pattern to support both hairpin and regular txq. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:11 +02:00
Xueming Li	a564038699	net/mlx5: support E-Switch manager egress traffic match For egress packet on representor, the vport ID in transport domain is E-Switch manager vport ID since representor shares resources of E-Switch manager. E-Switch manager vport ID and Tx queue internal device index are used to match representor egress packet. This patch adds flow item port ID match on E-Switch manager. E-Switch manager vport ID is 0xfffe on BlueField, 0 otherwise. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:10 +02:00
Xueming Li	1d47e9335e	net/mlx5: improve Verbs flow priority discovery To detect the number of Verbs flow priorities, the PMD tries to create Verbs flows with different priorities. However, Verbs is not designed to support ports larger than 255. When DevX is supported by the kernel driver, 16 Verbs priorities must be supported, so there is no need to create Verbs flows. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:09 +02:00
Xueming Li	3fd2961efa	net/mlx5: use Netlink when IB port greater than 255 IB spec doesn't allow 255 ports on a single HCA; a port number of 256 was cast to the u8 value 0, which is invalid for ibv_query_port(). This patch invokes the Netlink API to query port state when the port number is greater than 255. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:08 +02:00
Michael Baum	fc59a1ec55	common/mlx5: share MR mempool registration Expand the use of mempool registration to MR management for other drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:58:00 +02:00
Michael Baum	a5d06c9006	common/mlx5: support device DMA map and unmap Since MR management has moved to the common area, there is no longer a need for separate DMA map and unmap functions in each driver. This patch shares those functions. Most drivers support these operations for the first time. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:58:00 +02:00
Michael Baum	9f1d636f3e	common/mlx5: share MR management Add global shared MR cache as a field of common device structure. Move MR management to use this global cache for all drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:57:58 +02:00
Michael Baum	5fbc75ace1	common/mlx5: add global MR cache create function Add function for global shared MR cache structure initialization. This function includes: - btree initialization. - set callbacks for reg and dereg MR. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:57:24 +02:00
Michael Baum	85c7005e84	common/mlx5: add MR control initialization Add function for MR control structure initialization. This function includes: - btree initialization. - dev_gen_ptr initialization. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:54:32 +02:00
Michael Baum	05fa53d6a0	net/mlx5: remove redundancy in MR file This patch removes two redundant things from the MR file: 1. mr_find_contig_memsegs_data structure, which was moved to the common file before. 2. External memory mechanism - mlx5_tx_update_ext_mp function. Since commit [1], which added support for DMA map and unmap, external memory must be configured by the user using the rte_mem_map function and there is no need to handle this in the PMD. [1] commit `989e999d93` ("net/mlx5: support PCI device DMA map and unmap") Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:46 +02:00
Michael Baum	fe46b20c96	common/mlx5: share HCA capabilities handle Add HCA attributes structure as a field of the device config structure. It is queried in common probing, which also updates the timestamp format fields. Each driver uses the HCA attributes from the common device config structure, instead of querying them for itself. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:46 +02:00
Michael Baum	e35ccf243b	common/mlx5: share protection domain object Create shared Protection Domain in common area and add it and its PDN as fields of common device structure. Use this Protection Domain in all drivers and remove the PD and PDN fields from their private structure. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:46 +02:00
Michael Baum	ca1418ce39	common/mlx5: share device context object Create shared context device in common area and add it as a field of common device. Use this context device in all drivers and remove the ctx field from their private structure. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:44 +02:00
Michael Baum	5bc38358b5	net/mlx5: remove redundant flag in device config The device configuration structure has a flag named devx, and the SH structure has a flag with the same name and the same meaning. Remove the flag from the configuration structure and move all the usages to the SH flag. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:36 +02:00
Michael Baum	887183effa	common/mlx5: move basic probing functions to common Move open IBV/DevX device function to common. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:32 +02:00
Michael Baum	5021ce2085	net/mlx5: rearrange probing functions for Windows Rearrange device detection code. Rearrange configuration structures filling. Remove unneeded variables. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:39:24 +02:00
Michael Baum	8520992403	common/mlx5: share memory related devargs Add device configure structure and function to parse user device arguments into it. Move parsing and management of relevant device arguments to common. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:39:04 +02:00
Michael Baum	a77bedf255	common/mlx5: share common definitions Create MACRO definitions file in the common driver as preparation for MR and basic probe sharing. Move relevant definitions from the net driver to the above file. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:39:01 +02:00
Michael Baum	7af08c8f1a	common/mlx5: share basic probing with internal drivers Create common probing structure that includes, for now, basic probing information detected by the common driver and share it with all the internal drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:38:46 +02:00
Michael Baum	620be7f27b	net/mlx5: register memory event callback in Windows In device initialization, the driver registers for hugepage free events. When a hugepage is released, this callback frees all its related MRs. In Windows initialization, this callback is not registered, which may cause the use of invalid memory. This patch adds memory event callback registration in Windows initialization. Fixes: `980826dc6f` ("net/mlx5: probe on Windows") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:37:23 +02:00
Lior Margalit	0c3fa68396	net/mlx5: fix RSS expansion for L2/L3 VXLAN The RSS expansion algorithm is using a graph to find the possible expansion paths. The current implementation does not differentiate between standard (L2) VXLAN and L3 VXLAN. As result the flow is expanded with all possible paths. For example: testpmd> flow create... / vxlan / end actions rss level 2 / end It is currently expanded to the following paths: ETH IPV4 UDP VXLAN END ETH IPV4 UDP VXLAN ETH IPV4 END ETH IPV4 UDP VXLAN ETH IPV6 END ETH IPV4 UDP VXLAN IPV4 END ETH IPV4 UDP VXLAN IPV6 END The fix is to adjust the expansion according to the outer UDP destination port. In case flow pattern defines a match on the standard udp port, 4789, or does not define a match on the destination port, which also implies setting the standard one, the expansion for the above example will be: ETH IPV4 UDP VXLAN END ETH IPV4 UDP VXLAN ETH IPV4 END ETH IPV4 UDP VXLAN ETH IPV6 END Otherwise, the expansion will be: ETH IPV4 UDP VXLAN END ETH IPV4 UDP VXLAN IPV4 END ETH IPV4 UDP VXLAN IPV6 END Fixes: `f4f06e3615` ("net/mlx5: add flow VXLAN item") Cc: stable@dpdk.org Signed-off-by: Lior Margalit <lmargalit@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-18 09:12:42 +02:00
Eli Britstein	292be511d2	net/mlx5: support more tunnel types Accept RTE_FLOW_ITEM_TYPE_GRE, RTE_FLOW_ITEM_TYPE_NVGRE and RTE_FLOW_ITEM_TYPE_GENEVE as valid tunnel types. Fixes: `4ec6360de3` ("net/mlx5: implement tunnel offload") Cc: stable@dpdk.org Signed-off-by: Eli Britstein <elibr@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-19 23:51:10 +02:00
Ferruh Yigit	b563c14212	ethdev: remove jumbo offload flag Removing 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag. Instead of drivers announcing this capability, the application can deduce the capability by checking the reported 'dev_info.max_mtu' or 'dev_info.max_rx_pktlen'. And instead of the application setting this flag explicitly to enable jumbo frames, this can be deduced by the driver by comparing the requested 'mtu' to 'RTE_ETHER_MTU'. Removing this additional configuration for simplification. Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Acked-by: Michal Krawczyk <mk@semihalf.com>	2021-10-18 19:20:21 +02:00
Ferruh Yigit	1bb4a528c4	ethdev: fix max Rx packet length There is confusion about setting the max Rx packet length; this patch aims to clarify it. 'rte_eth_dev_configure()' API accepts max Rx packet size via 'uint32_t max_rx_pkt_len' field of the config struct 'struct rte_eth_conf'. Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and the result is stored into '(struct rte_eth_dev)->data->mtu'. These two APIs are related but they work in a disconnected way: they store the set values in different variables, which makes it hard to figure out which one to use, and having two different methods for related functionality is confusing for users. Other issues causing confusion are: * maximum transmission unit (MTU) is the payload of the Ethernet frame, and 'max_rx_pkt_len' is the size of the Ethernet frame. The difference is the Ethernet frame overhead, and this overhead may differ from device to device based on what the device supports, like VLAN and QinQ. * 'max_rx_pkt_len' is only valid when the application requested jumbo frame, which adds additional confusion, and some APIs and PMDs already disregard this documented behavior. * For the jumbo frame enabled case, 'max_rx_pkt_len' is a mandatory field, which adds configuration complexity for the application. As a solution, both APIs get MTU as a parameter, and both save the result in the same variable '(struct rte_eth_dev)->data->mtu'. For this, 'max_rx_pkt_len' is updated to 'mtu', and it is always valid independent of jumbo frame. For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is the user request; it should be used only within the configure function and the result should be stored to '(struct rte_eth_dev)->data->mtu'. After that point both application and PMD use MTU from this variable. When the application doesn't provide an MTU during 'rte_eth_dev_configure()', the default 'RTE_ETHER_MTU' value is used. Additional clarification done on scattered Rx configuration, in relation to MTU and Rx buffer size. 
MTU is used to configure the device for physical Rx/Tx size limitation; the Rx buffer is where Rx packets are stored, and many PMDs use the mbuf data buffer size as the Rx buffer size. PMDs compare MTU against Rx buffer size to decide whether to enable scattered Rx. If scattered Rx is not supported by the device, an MTU bigger than the Rx buffer size should fail. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Acked-by: Huisong Li <lihuisong@huawei.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2021-10-18 19:20:20 +02:00
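The relationship between MTU, frame length, jumbo detection and scattered Rx described in commits b563c14212 and 1bb4a528c4 can be illustrated with a small sketch. All function and macro names here are hypothetical and do not come from ethdev or any PMD; the constants are the standard Ethernet header (14 bytes), CRC (4 bytes) and VLAN tag (4 bytes) sizes, and 1500 is the value of 'RTE_ETHER_MTU'. Comparing the derived frame length, rather than the bare MTU, against the Rx buffer size is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHER_HDR_LEN      14u   /* destination MAC + source MAC + EtherType */
#define ETHER_CRC_LEN      4u    /* frame check sequence */
#define VLAN_TAG_LEN       4u    /* one 802.1Q tag */
#define DEFAULT_ETHER_MTU  1500u /* same value as RTE_ETHER_MTU */

/* Frame length = MTU (payload) + per-device overhead (0, 1 or 2 VLAN tags). */
static uint32_t
frame_len_from_mtu(uint32_t mtu, unsigned int vlan_tags)
{
	assert(vlan_tags <= 2); /* untagged, VLAN, or QinQ */
	return mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + vlan_tags * VLAN_TAG_LEN;
}

/* Jumbo frames are implied by the requested MTU, with no separate flag. */
static bool
is_jumbo(uint32_t mtu)
{
	return mtu > DEFAULT_ETHER_MTU;
}

/*
 * Decide whether scattered Rx is needed. Returns 0 on success and stores
 * the decision in *scattered; returns -1 when the frame does not fit in
 * one Rx buffer and the device cannot scatter.
 */
static int
rx_scatter_decide(uint32_t mtu, unsigned int vlan_tags, uint32_t rx_buf_size,
		  bool scatter_supported, bool *scattered)
{
	uint32_t frame_len = frame_len_from_mtu(mtu, vlan_tags);

	if (frame_len <= rx_buf_size) {
		*scattered = false;
		return 0;
	}
	if (!scatter_supported)
		return -1;
	*scattered = true;
	return 0;
}
```

For instance, an MTU of 1500 on an untagged link corresponds to a 1518-byte frame, which fits a 2048-byte buffer without scattering, while an MTU of 9000 with the same buffer either enables scattered Rx or fails, depending on device support.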
Li Zhang	771253ea8f	net/mlx5: fix domains selection for meter policy Fate actions are different per domain. When all the domains, ingress, egress and FDB (transfer), can support all the policy actions, i.e. [SET_TAG], the policy prepares resources for all the domains and failure happens if one of the domains misses its fate action in the policy action list. Remove the domains missing their fate action from the meter policy preparation. Now, the policy will prepare a domain only when the domain supports all the actions and when one of the domain fate actions is on the list. Fixes: `afb4aa4f12` ("net/mlx5: support meter policy operations") Cc: stable@dpdk.org Signed-off-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-14 10:48:33 +02:00
Viacheslav Ovsiienko	40c8fb1fd3	net/mlx5: update modify field action Update immediate value/pointer source operand support for modify field RTE Flow action: - source operand data can be presented by byte buffer (instead of former uint64_t) or by pointer - no host byte ordering is assumed anymore for immediate data buffer (not uint64_t anymore) - no immediate value offset is expected (the source subfield is located at the same offset as in destination) Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-14 14:34:31 +02:00
Andrew Rybchenko	d35dd287a2	net/mlx5: support represented port flow action Semantics of the existing support for action PORT_ID suggests that support for equal action REPRESENTED_PORT be implemented. Helper functions keep port_id suffix since action MLX5_FLOW_ACTION_PORT_ID is still used internally. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Konstantin Ananyev	8d7d4fcdca	ethdev: change input parameters for Rx queue count Currently majority of fast-path ethdev ops take pointers to internal queue data structures as an input parameter. While eth_rx_queue_count() takes a pointer to rte_eth_dev and queue index. For future work to hide rte_eth_devices[] and friends it would be plausible to unify parameters list of all fast-path ethdev ops. This patch changes eth_rx_queue_count() to accept pointer to internal queue data as input parameter. While this change is transparent to user, it still counts as an ABI change, as eth_rx_queue_count_t is used by ethdev public inline function rte_eth_rx_queue_count(). Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Feifei Wang <feifei.wang2@arm.com>	2021-10-13 22:14:58 +02:00
Tal Shnaiderman	c8834a3663	net/mlx5: support keeping CRC on Windows Support of the keep-CRC offloading by checking the relevant FW capability (scatter_fcs) for NIC support. Supported offload: DEV_RX_OFFLOAD_KEEP_CRC Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com>	2021-10-12 15:29:39 +02:00
Tal Shnaiderman	6061cc4148	net/mlx5: support VLAN stripping offload on Windows Support of the VLAN stripping offloading by checking the relevant FW capability (vlan_cap) for NIC support. Supported offload: DEV_RX_OFFLOAD_VLAN_STRIP Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com>	2021-10-12 15:29:38 +02:00
Tal Shnaiderman	738da9a867	net/mlx5: support TSO offload on Windows Support of the TSO offloading by checking the relevant FW capability for NIC support. Supported offloads: DEV_TX_OFFLOAD_TCP_TSO DEV_TX_OFFLOAD_VXLAN_TNL_TSO DEV_TX_OFFLOAD_GRE_TNL_TSO DEV_TX_OFFLOAD_GENEVE_TNL_TSO Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com>	2021-10-12 15:29:37 +02:00

2402 Commits