The PMD was not reporting the supported RSS capabilities.
Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
It is suggested to use PCI BDF to identify a port for port addition
in OVS-DPDK. While mlx5 has its own naming style: name it by ib dev
name. This breaks the typical OVS DPDK use case and brings more puzzle
to the end users.
To fix it, this patch changes it to use PCI BDF as the name, too.
Also, a postfix " port %u" is added, just in case their might be more
than 1 port associated with a PCI device.
Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
When no memory is available on the same numa node than the device, the
initialization of the device fails. However, the use case where the
cores and memory are on a different socket than the device is valid,
even if not optimal.
To fix this issue, this commit introduces an infrastructure to select
the socket on which to allocate the verbs objects based on the ethdev
configuration and the object type, rather than the PCI numa node.
Fixes: 1e3a39f72d5d ("net/mlx5: allocate verbs object into shared memory")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
On error, mlx5_dev_start() does not return a negative value
as it is supposed to do. The consequence is that the application
(ex: testpmd) does not notice that the port is not started
and begins the rxtx on an uninitialized port, which crashes.
Fixes: e1016cb73383 ("net/mlx5: fix Rx interrupts management")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
All multi code should not be handled in exit part of the code but in the
mainline of the function.
Fixes: 0a40a1363a4d ("net/mlx5: fix flow type for allmulti rules")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Polling a new packet is basically sensing the generation bit in a
completion entry. For some processors not having strongly-ordered memory
model, there has to be a memory barrier between reading the generation bit
and other fields of the entry in order to guarantee data is not stale.
Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.
More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This patch reverts:
commit 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Although granularity of chunks in a mempool is a cacheline, addresses are
extended to align to page boundary for performance reason in device when
registering a MR (Memory Region). This could make some regions overlap,
then can cause Tx completion error due to incorrect LKEY search. If the
error occurs, the Tx queue will get stuck. It is because buffer address is
compared against aligned addresses for Memory Region. Saving original
addresses of mempool for comparison doesn't create any overlap.
Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
Fixes: 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Even though link of a port gets down, device still can receive traffic.
That is the reason why mlx5_set_link_up/down() switches rx/tx_pkt_burst().
However, if link gets down by an external command (e.g. ifconfig), it isn't
effective. It is better to change burst functions when link status change
is detected.
Fixes: 62072098b54e ("mlx5: support setting link up or down")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
There are parameters that are not copy from
spec to verbs structure in the vtc_label
Fixes: 43e9d9794cde ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org
Signed-off-by: Shachar Beiser <shacharbe@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
PKT_RX_FDIR_ID should be set only if flow_tag is neither non-zero nor
MLX5_FLOW_MARK_DEFAULT.
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Adding the size of ibv_flow_spec_action_drop is missing. This can cause
overflow when converting rte_flow to ibv_flow and thus corrupt memory.
Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
If mlx5_dev_link_status_handler() is executed while canceling the alarm,
deadlock can happen because rte_eal_alarm_cancel() waits for all callbackes
to finish execution and both calls are protected by priv->lock.
Fixes: 198a3c339a8f ("mlx5: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The pointer to the user parameter of the callback registration is
automatically pass to the callback function.
There is no point to allow changing this user parameter by a caller.
That's why this parameter is always set to NULL by PMDs and set only
in ethdev layer before calling the callback function.
The history is that the user parameter was initially used
by the callback implementation to pass some information
between the application and the driver:
c1ceaf3ad056 ("ethdev: add an argument to internal callback function")
Then a new parameter has been added to leave the user parameter
to its standard usage of context given at registration:
d6af1a13d7a1 ("ethdev: add return values to callback process API")
The NULL parameter in the internal callback processing function
is now removed. It makes clear that the callback parameter is user
managed and opaque from a DPDK point of view.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Counter_set is a counter used for flows when its support is available.
Renaming it to flow counter.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Move device configuration and features capabilities to its own structure.
This structure is filled by mlx5_pci_probe(), outside of this function
it should be treated as *read only*.
This configuration struct will be used for the Tx/Rx queue setup to
select the Tx/Rx queue parameters based on the user configuration and
device capabilities.
In addition it will be used by the burst selection function to decide
on the best pkt burst to be used.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Change the function prototype to return the function pointer of the
selected Tx/Rx burst function instead of assigning it directly to the
device context.
Such change will enable to use those select functions to query the burst
function that will be selected according to the device configuration.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Bit-fields in mlx5_rxq_data can be changed on the fly by a control plane -
e.g. rxq->mark. However, vectorized Rx uses a bit-field to mark pending
errors. Even if one bit is written, consequence is to write the whole
integer and this can cause a synchronization issue - two entities write to
a same block without locking. As the pending_err bit is entirely internal
use for the datapath, this can be replaced with a local variable.
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
MLX5 NIC does not support all hash fields, this patch limit by refusing
impossible RSS combination to avoid errors.
Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Checking HW checksum offload flag for outer IP is missing. If flag is set
for only outer IP, this can't be set properly.
Fixes: f5fde5205101 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The trigger for PMD to use the application RSS configuration should be
based on the validity of rss_key and not the rss_key_len. Otherwise
segmentation fault can occur if application provide valid RSS key length
but without any RSS key.
Fixes: 29c1d8bb3e79 ("net/mlx5: handle a single RSS hash key for all protocols")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Remove multi-segment packet handling from mlx5_tx_burst_empw() as there's
fallback to regular Tx for such packets.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_tx_burst_empw() falls back to legacy Tx descriptor for multi-segmented
packets without taking advantage of inlining. In many cases, the 1st
segment can be inlined and this could make device fetch only one segment
instead of two. This helps saving PCIe bandwidth when transmitting out
multi-segmented packets with still using the Enhanced Multi-Packet Send for
other packets.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_tx_burst() doesn't inline data from the 2nd segment. If there's still
enough room in the descriptor after inlining the 1st segment, further
inlining from the 2nd segment would be beneficial to save PCIe bandwidth.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
If there're more MR(Memroy Region)'s than the size of per-queue cache, the
cache can be overflowed and corrupt the following data structure in
mlx5_txq_data.
Fixes: 6e78005a9b30 ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Although granularity of chunks in a mempool is a cacheline, addresses are
extended to align to page boundary for performance reason in device when
registering a MR (Memory Region). This could make some regions overlap,
then can cause Tx completion error due to incorrect LKEY search. If the
error occurs, the Tx queue will get stuck. To avoid it, end address of a
packet segment is used in LKEY search.
Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
Cc: stable@dpdk.org
Reported-by: Hanoch Haim <hhaim@cisco.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Ethdev layer has an API to configure vlan setting on the flight, i.e.
when the port state is start.
Calling such API when the port is stopped may cause segmentation fault
as the related Verbs contexts has not been created yet.
Fixes: 09cb5b581762 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Chnaged ibv_flow_attr type for allmulti rule to IBV_FLOW_ATTR_MC_DEFAULT
instead of IBV_FLOW_ATTR_NORMAL, in case allmulti was enabled.
Fixes: 272733b5 ("net/mlx5: use flow to enable unicast traffic")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Tx checksum offloads are correctly handled in a single Tx burst function
whereas the capability is always set.
This causes VXLAN packet with checksum offloads request to be ignored when
the (E)MPS Tx functions are selected.
Fixes: f5fde5205101 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Extend the PMD extended statistics with more counters on the physical
port.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
A non max_inline 0 means an inline is requested, there is no need to
duplicate this information.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Most of the variable in mlx5_tx_burst() are defined too soon.
This commit moves them their uses C block of code.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
naddr variable was introduced in
commit 9a7fa9f76d9e ("net/mlx5: use vector types to speed up processing")
to avoid compilation errors on 32bits compilation, as x86_32 is no more
supported by rdma-core nor by MLNX_OFED, this variable becomes useless and
can be safely removed.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Since the secondary process has its own devops, function which cannot be
called by the secondary don't need anymore to verify which process is
calling it.
Fixes: 87ec44ce1651 ("net/mlx5: add operations for secondary process")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_get_priv() is barely use across the driver. To avoid mixing access,
this function is definitely removed.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This drop queue can be handled efficiently by using the drop flag in the
context.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Simplify the computation for the needed size:
- exact size for the structure header,
- exact size for a number of 64-bit counters.
Fixes: a4193ae3bc4f ("net/mlx5: support extended statistics")
Cc: stable@dpdk.org
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Mark action flag was not set on the flow, causing the rx burst function
after port start to ignore the flow mark.
Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
total_length is only visible when SOFT_COUNTERS are enabled
Fixes: 3f13f8c23a7c ("net/mlx5: support hardware TSO")
Cc: stable@dpdk.org
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The specification are wrongly computed causing the rule to not be match
correctly for deletion.
Fixes: 4c3e9bcdd52e ("net/mlx5: support flow director")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The constant ETHTOOL_LINK_MODE_1000baseT_Full_BIT and the others like
that in mlx5_link_update_unlocked_gs must be bit masks but unfortunately
they are bit numbers. This commit fixes the issue.
Fixes: 188408719888 ("net/mlx5: fix support for newer link speeds")
Cc: stable@dpdk.org
Signed-off-by: Edward Makarov <makarov@kraftway.ru>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
When deleting/replacing a flow director rule the first rule of the list
was always the one replaced instead of the corresponding one.
Fixes: 4c3e9bcdd52e ("net/mlx5: support flow director")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>