Function mlx5_rx_intr_disable() calls mlx5_rxq_ibv_get() and performs
some actions on the returned rxq_ibv.
It doesn't release the rxq_ibv when all is completed with success.
This patch adds call to mlx5_rxq_ibv_release() where it's missing.
Fixes: 09cb5b581762 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Currently, the name of MPRQ mempool is set by
snprintf(name, sizeof(name), "%s-mprq", dev->device->name);
For port representor, the name is duplicate of its master and failed to
create such a mempool having the same name. Port ID is used in the name
instead.
Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Recent patch [1] added, at the end of mlx5_dev_configure(), a call to
mlx5_proc_priv_init(), initializing process_private data of eth_dev.
This call is not reached if PMD is started with zero Rx queues.
In this case mlx5_dev_configure() returns earlier due to the check:
if (rxqs_n == priv->rxqs_n)
return 0;
In such a scenario, later references to uninitialized process_private
data will result in segmentation fault.
For example see in function txq_uar_init().
This patch changes the check logic. The following code is executed
if (rxqs_n != priv->rxqs_n), and skipped otherwise.
Function mlx5_proc_priv_init() is always invoked, to ensure
process_private data is initialized.
[1] http://patches.dpdk.org/patch/52629/
Fixes: 120dc4a7dcd3 ("net/mlx5: remove device register remap")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The RDMA-CORE Direct Rules API was changed in latest upstream code
This commit update the API accordingly.
Fixes: 4f84a19779ca ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
BlueField SmartNIC has 0xa2d2 as PCI device ID on both ARM and x86 host. On
ARM side, Tx inlining need not be used as PCI bandwidth is not bottleneck.
Vectorized Tx can still be used up to 16 queues. For other archs
(e.g., x86), keep using the default value.
Fixes: 09d8b41699bb ("net/mlx5: make vectorized Tx threshold configurable")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
If Tx packet inlining is enabled, rdma-core library should allocate large
Tx WQ enough to support it. It is better for PMD to calculate the size of
WQ based on the parameters and return error with appropriate message if it
exceeds the device capability.
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When creating a flow rule without the port_id pattern item, always the
PF was selected.
This commit fixes this issue, if no port_id pattern item is available
then we use the port that the flow was created on as source port.
Fixes: 822fb3195348 ("net/mlx5: add port id item to Direct Verbs")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
ibv_destroy_flow_action() refers to QP. QP must not be freed until
corresponding action is destroyed.
Fixes: 3eb004431072 ("net/mlx5: fix release of jump to queue action")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Mellanox mlx5 PMD implements the list of devices to process the memory
free events to reflect the actual memory state to Memory Regions.
Because this list contains the devices and devices may share the
same context the callback routine may be called multiple times
with the same parameter, that is not optimal. This patch modifies
the list to contain the device contexts instead of device objects
and shared context is included in the list only once.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The multiport Infiniband device support was introduced [1].
All active ports, belonging to the same Infiniband device use the single
shared Infiniband context of that device and share the resources:
- QPs are created within shared context
- Verbs flows are also created with specifying port index
- DV/DR resources
- Protection Domain
- Event Handlers
This patchset adds support for Memory Regions sharing between
ports, created on the base of multiport Infiniband device.
The datapath of mlx5 uses the layered cache subsystem for
allocating/releasing Memory Regions, only the lowest layer L3
is subject to share due to performance issues.
[1] http://patches.dpdk.org/cover/51800/
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
There are some physical link settings can be queried from
Ethernet devices: link status, link speed, speed capabilities,
duplex mode, etc. These setting do not make a lot of sense for
representors due to missing physical link. The new kernel drivers
dropped query for link settings for representors causing the
ioctl call to fail. This patch adds some kind of emulation
of link settings to PMD - representors inherit the link parameters
from the master device. The actual link status (up/down)
is retrieved from the representor device.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When creating the modify action using Direct Rules, we need to
add flags to mark, if the action will be done on root table or on
private table.
Fixes: 4f84a19779ca ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
If there is the support of DevX is exposed by rdma-core but
DevX is not supported by or disabled for the specific interface
the mlx5_devx_cmd_query_hca_attr() routine returns an error
preventing the device from successful probing. The routine
should be invoked only in case of enabled DevX.
Fixes: e2b4925ef7c1 ("net/mlx5: support Direct Rules E-Switch")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
In mlx5_rxq.c, in some comments, text includes "Tx" instead of "Rx".
In mlx5_txq.c, in some comments, text includes "Rx" instead of "Tx".
This patch fixes these typos.
Fixes: faf2667fe8d5 ("net/mlx5: separate DPDK from verbs Tx queue objects")
Fixes: a1366b1a2be3 ("net/mlx5: add reference counter on DPDK Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
All the library calls must be called via the glue layer.
Fixes: b2177648b8de ("net/mlx5: add Direct Rules flow data alloc/free routines")
Fixes: 79e35d0d5979 ("net/mlx5: share Direct Rules/Verbs flow related structures")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Looking for an ethdev port is better (and more efficient)
with an ethdev API than an EAL one.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The flow group should be initialized.
For example selecting if the encapsulation is for root or private tables
is based on the flow->group value.
Fixes: 4f84a19779ca ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This commit adds support for drop action when creating E-Switch flow
using DV.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Actions like encap/decap, modify header require setting the flow table
type. Until now we supported only Nic RX and Nic TX, this commits adds
the support for FDB table type for those actions.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
In current implementation the DV steering supported only NIC steering.
This commit adds the transfer attribute in order to create a matcher
on the FDB tables.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This commit checks the for DR E-Switch support.
The support is based on both Device and Kernel.
This commit also enables the user to manually disable this this feature.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The meson build was missing the define for Direct Rules.
Fixes: 4f84a19779ca ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Modify the translate vport function to match other translate items
naming conventions.
Fixes: 0fe3f18f78d8 ("net/mlx5: add source vport match to the ingress rules")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
According to RTE flow the action order should be the order that the
actions were given.
In the case of modify actions the position of the action was always
last.
This commit solves this issue by saving the position of the first modify
action, and then adds to this position the pointer to the modify action.
Fixes: 4bb14c83df95 ("net/mlx5: support modify header using Direct Verbs")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Currently the allocation of the jump to QP is done in flow apply,
this results in memory leak.
This patch fixes this issue by moving the allocation and release of the
jump to QP action to the responsibility of the hrxq.
Fixes: cbb66daa3c85 ("net/mlx5: prepare Direct Verbs for Direct Rule")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
On BlueField platform we have the new entity - PF representor.
This one represents the PCI PF attached to external host on the
side of ARM. The traffic sent by the external host to the NIC
via PF will be seem by ARM on this PF representor.
This patch refactors port recognizing capability on the base of
physical port name. We have two groups of name formats. Legacy
name formats are supported by kernels before ver 5.0 (being
more precise - before the patch [1]) or before Mellanox OFED 4.6,
and new naming formats added by the patch [1].
Legacy naming formats are supported:
- missing physical port name (no sysfs/netlink key) at all,
master is assumed
- decimal digits (for example "12"), representor is assumed,
the value is the index of attached VF
New naming formats are supported:
- "p" followed by decimal digits, for example "p2", master
is assumed
- "pf" followed by PF index concatenated with "vf" followed by
VF index, for example "pf0vf1", representor is assumed.
If index of VF is "-1" it is a special case of host PF
representor, this representor must be indexed in devargs
as 65535, for example representor=[0-3,65535] will
allow representors for VF0, VF1, VF2, VF3 and for host PF.
Note: do not specify representor=[0-65535], it causes devargs
processing error, because number of ports (rte_eth_dev) is
limited.
Applications should distinguish representors and master devices
exclusively by device flag RTE_ETH_DEV_REPRESENTOR and do not
rely on switch port_id (mlx5 PMD deduces ones from representor_id)
values returned by dev_infos_get() API.
[1] https://www.spinics.net/lists/netdev/msg547007.html
Linux-tree: c12ecc23 (Or Gerlitz 2018-04-25 17:32 +0300)
"net/mlx5e: Move to use common phys port names for vport representors"
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
mlx5 driver has a global list of Memory Regions created by
device, and there is a ml5_mr_release() routine which makes
a memory cleanup at device closing. The head of device MR list
was fetched outside the rwlock protected section. Also some
noticed typos are fixed.
Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Some port iterations are manually checking against RTE_ETH_DEV_UNUSED
instead of using the iterators based on rte_eth_find_next().
A new macro RTE_ETH_FOREACH_VALID_DEV() is introduced, but kept private
because there should be no need of iterating over all devices in the
API. The public iterators have additional filters for ownership, parent
device or sibling ports.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
As stated in the deprecation notice from December 2016,
"the legacy filter API, including rte_eth_dev_filter_supported(),
rte_eth_dev_filter_ctrl() as well as filter types MACVLAN, ETHERTYPE,
FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR, HASH and L2_TUNNEL, is superseded
by the generic flow API (rte_flow)".
After a long wait of more than two years, the legacy filter API
is marked as deprecated, while still tested with testpmd and
the tep_termination example.
The next step will be to announce a deadline for complete removal.
As preparation of the removal of rte_eth_ctrl.h,
RTE_ETH_FLOW_*, RTE_TUNNEL_TYPE_* and RTE_ETH_HASH_FUNCTION_* definitions
are moved to rte_ethdev.h and rte_flow.h.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The RSS validation function was missing the verifcation that
if RSS is requested on inner packet, the flow must have tunnel data.
Fixes: 23c1d42c7138 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
If MLNX_OFED is installed, there's no .pc file installed for libraries and
dependency() can't find libraries by pkg-config. By adding fallback of
using cc.find_library(), libraries are properly located.
Fixes: e30b4e566f47 ("build: improve dependency handling")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Luca Boccassi <bluca@debian.org>
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private
The actual UAR table follows the data structure and the table is used
for both Tx and Rx.
For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX5_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.
For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Queue index is redundantly stored for both Rx and Tx structures.
E.g. txq_ctrl->idx and txq->stats.idx. Both are consolidated to single
storage - rxq->idx and txq->idx.
Also, rxq and txq are moved to the beginning of its control structure
(rxq_ctrl and txq_ctrl) for cacheline alignment.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
In case of cross compilation on aarch64 we must add include for
stdlib in order to use the free function.
Fixes: cbb66daa3c85 ("net/mlx5: prepare Direct Verbs for Direct Rule")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
At the mlx5 device closing the shared IB context was destroyed
before cleanup routines completion. As it was found on some
setups (Netlink fails with old kernel drivers and we have to use
sysfs to retrieve interface index, this requires IB device name,
which is stored in shared context) the mlx5_nl_mac_addr_flush()
requires IB device name, and if shared context is removed it
causes the segmentation fault.
Fixes: 17e19bc4dde7 ("net/mlx5: add IB shared context alloc/free functions")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Retrieving network interface index via Netlink fails in
case of old ib_core kernel driver installed - mlx5_nl_ifindex()
routine fails due to RDMA_NLDEV_ATTR_NDEV_INDEX attribute is not
supported by the old driver.
The patch allowing to retrieve the network interface index and
name via Netlink [1]. So, the problem depends on ib_core module
version - 4.16 supports getting ifindex via Netlink, 4.15 does not.
This error was ignored in previous versions of MLX5 PMD probing
routine. For single device ifindex was retrieved via sysfs
and link control was not lost, so problem just was not noticed.
In order to support MLX5 PMD functioning over old kernel driver
this patch adds ifindex retrieving via sysfs into probing routine.
It is worth to note this method works for master/standalone
device only.
[1] https://www.spinics.net/lists/linux-rdma/msg62948.html
Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270)
Fixes: ad74bc619504 ("net/mlx5: support multiport IB device during probing")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
We are going to share the Direct Rules and Direct Verbs flow
device data structures between master and representors in the
E-Switch configurations over multiport IB device.
The code of initializing and destroying these data is
moved to dedicated routines, this is just a preparation
step for actual data sharing.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When using Direct Rules we can add actions to jump between tables.
This is extra useful since rule insertion rate is much higher on other
tables compared to table zero.
If no group is selected the rule is added to group 0.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Adds calls to the Direct Rules API inside the glue functions.
Due to difference in parameters between the Direct Rules and Direct
Verbs some of the glue functions API was updated.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This is the first patch of a series that is designed to enable the
Direct Rules API.
The main difference between Direct Verbs and Direct Rules from API
perspective is that in Direct Rules each action has it's own create
function and the object itself is of type void.
In this patch I'm adding functions to generate actions that currently
are done without create action, and I'm changing the action type to be
void *, so in next patches only the glue functions will need to change.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The API that was defined in OFED 4.5 was replaced both in OFED 4.6 and
in upstream.
This commit updates the API to match the upstream one.
Fixes: f5bf91de738a ("net/mlx5: support flow counters using devx")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Iterating over siblings was done with RTE_ETH_FOREACH_DEV()
which skips the owned ports.
The new iterators RTE_ETH_FOREACH_DEV_SIBLING()
and RTE_ETH_FOREACH_DEV_OF() are more appropriate and more correct.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.
If enabled, mlx5_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smaller
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.
To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>