mlx5 PMDs use the mlx5_dev_to_pci_addr() routine to convert
Infiniband device name to the Bus-Device-Function location
on the PCI bus. The routine returned success even in case of
not found identification string. On caller side it likely
caused the wrong match with the BDF of previous device
resulting in wrong representor and master recognitions.
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Previous patch wrongly used rdma-core defined values, when preparing
attributes for creating DevX CQ object.
This patch adds the correct value definition and uses them instead.
Fixes: 08d1838f64 ("net/mlx5: implement CQ for Rx using DevX API")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Since the 20.08 release deprecated rte_cio_*mb APIs because these APIs
provide the same functionality as rte_io_*mb APIs on all platforms, so
remove them and use rte_io_*mb instead.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Before this commit system call memalign was used for aligned
allocations, however memalign is deprecated.
Based on (1) - POSIX requires that memory aligned allocations can be
freed using free. Some systems provide no way to reclaim memory
allocated with memalign (because one can only pass to free a pointer
gotten from malloc, while, memalign would call malloc and then align the
obtained value).
Another issue is that 64/32 bits architectures use a minimal alignment
size. So any requested alignment below the minimal system size can be
simplified by calling malloc.
The glibc implementation allows memory obtained from posix_memalign to
be reclaimed with free. This commit replaces system call memalign with
system call posix_memalign. It also calls malloc in case the requested
alignment is below the minimal system size.
(1) https://linux.die.net/man/3/memalign
Fixes: d38e3d5266 ("common/mlx5: add memory management functions")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Several DV-based structs of type 'struct mlx5dv_devx_XXX' are replaced
with 'void *' to enable compilation under non-Linux operating systems.
New getter functions were added to retrieve the specific fields that
were previously accessed directly.
Replaced structs:
'struct mlx5dv_pp *'
'struct mlx5dv_devx_event_channel *'
'struct mlx5dv_devx_umem *'
'struct mlx5dv_devx_uar *'
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Enumerated variable REG_NONE (defined in mlx5_prm.h) is in conflict with
Windows definition (winnt.h): #define REG_NONE ( 0ul ) // No value type
To enable mlx5 PMD Windows compilation - rename REG_NONE as REG_NON.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
strsep() is a non-standardized API (by C or POSIX) and thus it is
non-portable between different operating systems. Replace it with
strtok_r() which is standardized by the C standard, and hence also by
POSIX.
The replacement occurs in the code that extracts individual PCI class
names (e.g. class=net:vdpa:foo:bar).
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Start a new release cycle with empty release notes.
The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reserved field should be 0x60 instead of 0x40.
Will fail FW check otherwise.
Fixes: 9428310ae1 ("regex/mlx5: add engine status check")
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
mlx5_nl_mac_addr_flush should flush all allocated MAC
addresses.
The MAC addresses array size should be of size
MLX5_MAX_MAC_ADDRESSES, but currently we return without
flushing the addresses if size is MLX5_MAX_MAC_ADDRESSES.
This was fixed by not allowing an array larger than
MLX5_MAX_MAC_ADDRESSES.
Fixes: e9a8ac59b6 ("common/mlx5: fix MAC addresses assert")
Cc: stable@dpdk.org
Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
To detect the timestamp mode configured on the NIC the mlx5
PMD uses the firmware command ACCESS_REGISTER_USER. This
command is relatively new and might be not supported by
older firmware versions and was rejected, causing annoying
messages in kernel log.
This patch adds the attribute flag check whether firmware
supports the command and avoid the call if it does not.
Fixes: bb7ef9a962 ("common/mlx5: add register access DevX routine")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Currently, the counter query requires the counter ID should start
with 4 aligned. In none-batch mode, the counter pool might have the
chance to get the counter ID not 4 aligned. In this case, the counter
should be skipped, or the query will be failed.
Skip the counter with ID not 4 aligned as the first counter in the
none-batch count pool to avoid invalid counter query. Once having
new min_dcs ID in the poll less than the skipped counters, the
skipped counters will be returned to the pool free list to use.
Fixes: 5382d28c21 ("net/mlx5: accelerate DV flow counter transactions")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
When Rx/Tx queue was being created with DevX the allocated
doorbell record size was only uint64_t. That was definitely
less than size of CPU cacheline and it might have happened the
doorbell records attached to different queues handled by
different cores were allocated within same cacheline. It might
have caused the contention on doorbell record writing.
This patch extends the allocated memory size for doorbell
record to cacheline size.
Fixes: 21cae8580f ("net/mlx5: allocate door-bells via DevX")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Now that mlx5_pci PMD checks for enabled classes and performs
probe(), remove() of associated classes, individual class driver
does not need to check if other driver is enabled.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Add generic mlx5 PCI PMD layer as part of existing common_mlx5
module. This enables multiple classes (net, regex, vdpa) PMDs
to be supported at same time.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
mlx5 PCI Device supports multiple classes of devices such as net, vdpa,
and/or regex.
To support these multiple classes, change mlx5_class to a
bitmap values so that if users asks to enable multiple of them, all
supported classes can be parsed.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
mlx5_common is shared library between mlx5 net, VDPA and regex PMD.
It is better to use common initialization helper instead of using
RTE_PRIORITY_CLASS priority.
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Following two errors are reported when compiled with
gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5).
drivers/common/mlx5/linux/mlx5_glue.h:188:2:
error: function declaration isn't a prototype [-Werror=strict-prototypes]
drivers/common/mlx5/linux/mlx5_glue.h:188:2:
error: function declaration isn't a prototype [-Werror=strict-prototypes]
Fix them by adding void data type in empty argument list.
Fixes: 34fa7c0268 ("net/mlx5: add drop action to Direct Verbs E-Switch")
Fixes: 400d985eb5 ("net/mlx5: add VLAN push/pop DR commands to glue")
Cc: stable@dpdk.org
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The mlx5 PMD did not support queue_start and queue_stop eth_dev API
routines, queue could not be suspended and resumed during device
operation.
There is the use case when this feature is crucial for applications:
- there is the secondary process handling the queue
- secondary process crashed/aborted
- some mbufs were allocated or used by secondary application
- some mbufs were allocated by Rx queues to receive packets
- some mbufs were placed to send queue
- queue goes to undefined state
In this case there was no reliable way to recovery queue handling
by restarted secondary process but reset queue to initial state
freeing all involved resources, including buffers involved in queue
operations, reset the mbuf pools, and then reinitialize queue
to working state:
- reset mbuf pool, allocate all mbuf to initialize pool into
safe state after the crush and allow safe mbuf free calls
- stop queue, free all potentially involved mbufs
- reset mbuf pool again
- start queue, reallocate mbufs needed
This patch introduces the queue start/stop feature with some
limitations:
- hairpin queues are not supported
- it is application responsibility to synchronize start/stop
with datapath routines, rx/tx_burst must be suspended during
the queue_start/queue_stop calls
- it is application responsibility to track queue usage and
provide coordinated queue_start/queue_stop calls from
secondary and primary processes.
- Rx queues with vectorized Rx routine and engaged CQE
compression are not supported by this patch currently
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Several source files include Verbs header files as in (1). These source
files will not compile under non-Linux operating systems. This commit
removes this inclusion in two cases:
Case 1: There is no usage of ibv_* or mlx5dv_* symbols in the source
file so the inclusion in (1) can be safely removed.
Case 2: Verbs symbols are used. Please note the inclusion in (1) already
appears in file linux/mlx5_glue.h (which represents the interface
to the rdma-core library). Therefore, replace (1) in the source file
with (2). Under non-Linux operating systems - file mlx5_glue.h will not
include (1).
(1)
#include <infiniband/verbs.h>
#include <infiniband/mlx5dv.h>
(2)
#include <mlx5_glue.h>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The following Linux calls are replaced by their matching rte APIs.
mmap ==> rte_mem_map()
munmap == >rte_mem_unmap()
sysconf(_SC_PAGESIZE) ==> rte_mem_page_size()
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
As scatter FCS might be not supported for decapsulated tunnel
packets in some NIC HW, a new capability bit which indicates
if scatter FCS works with decap is added.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit allocates the data path object page and B-tree table memory
from unified malloc function with explicit flag MLX5_MEM_RTE.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This commit allocates the control path objects memory from the unified
malloc function.
These objects are all used during the instances initialize, it will not
affect the data path.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Add the internal mlx5 memory management functions:
mlx5_malloc_mem_select();
mlx5_memory_stat_dump();
mlx5_rellaocate();
mlx5_malloc();
mlx5_free();
User will be allowed to manage memory from system or from rte memory
with the unified functions.
In this case, for the system with limited memory which can not reserve
lots of rte hugepage memory in advanced, will allocate the memory from
system for some of not so important control path objects based on the
sys_mem_en configuration.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In order to use dynamic flex parser to parse protocols that is not
supported natively, two steps are needed.
Firstly, creating the parse graph node. There are three parts of the
flex parser: node, arc and sample. Node is the whole structure of a
flex parser, when creating, the length of the protocol should be
specified. Then the input arc(s) is(are) mandatory, it will tell the
HW when to use this parser to parse the packet. For a single parser
node, up to 8 input arcs could be supported and it gives SW ability
to support this protocol over multiple layers. The output arc is
optional and also up to 8 arcs could be supported. If the protocol
is the last header of the stack, then output arc should be NULL. Or
else it should be specified. The protocol type in the arc is used to
indicate the parser pointing to or from this flex parser node. For
output arc, the next header type field offset and size should be set
in the node structure, then the HW could get the proper type of the
next header and decide which parser to point to.
Note: the parsers have two types now, native parser and flex parser.
The arc between two flex parsers are not supported in this stage.
Secondly, querying the sample IDs. If the protocol header parsed
with flex parser needs to used in flow rule offloading, the DW
samples are needed when creating the parse graph node. The offset
of bytes starting from the header needs to be set. After creating
the node successfully, a general object handle will be returned.
This object could be queried with Devx command to get the sample
IDs.
When creating a flow, sample IDs could be used to sample a DW from
the parsed header - 4 continuous bytes starting from the offset. The
flow entry could specify some mask to use part of this DW for
matching. Up to 8 samples could be supported for a single parse
graph node. The offset should not exceed the header length.
The HW resources have some limitation, low layer driver error should
be checked once there is a failure of creating parse graph node.
Signed-off-by: Netanel Gonen <netanelg@mellanox.com>
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The structures and other definitions will be used for the dynamic
flex parser creation via Devx command interface. These structures
will be used as some some intermediate variables and input
parameters for the parser creation API.
It is better to keep all members consistent with the PRM definition
even though some of them will not be used.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
In the translation stage, the eCPRI item should be translated into
the format that lower layer driver could use. All the fields that
need to match must be in network byte order after translation, as
well as the mask. Since the header in the item belongs to the network
layers stack, and the input parameter of the header is considered to
be in big-endian format already.
Base on the definition in the PRM, the DW samples will be used for
matching in the FTE/STE. Now, the type field and only the PC ID, RTC
ID, and DLY MSR ID of the payload will be supported. The masks should
be 00 ff 00 00 ff ff(00) 00 00 in the network order. Two DWs are
needed to support such matching. The mask fields could be zeros to
support some wildcard rules. But it makes no sense to support the
rule matching only on the payload but without matching type field.
The DW samples should be stored after the flex parser creation for
eCPRI. There is no need to query the sample IDs each time when
creating a flow rule with eCPRI item. It will not introduce
insertion rate degradation significantly.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The DevX routine to read/write NIC registers via DevX API is added.
This is the preparation step to check timestamp modes and units
and gather the extended statistics.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch prepares the common part of the mlx5 PMDs to
support packet send scheduling on mbuf timestamps:
- the DevX routine to query the packet pacing HCA capabilities
- packet pacing Send Queue attributes support
- the hardware related definitions
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In case the ibverbs glue is a separate library to dlopen,
the PMD library must allocate a glue structure to be filled by dlopen.
The glue management was in mlx5_common.c and moved to mlx5_common_os.c,
but the variable allocation was not removed from the original file.
The consequence was a link failure, if ibverbs dlopen option is enabled,
because of the redefinition of the variable (with GCC 10):
multiple definition of 'mlx5_glue'
The original definition is removed to keep only the one moved
in the Linux sub-directory.
Fixes: 79aa430721 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
This commit introduce the RegEx poll mode drivers class, and
adds Mellanox RegEx PMD.
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
This patch makes the Infiniband device physical port name
recognition more strict. Currently mlx5 PMD might recognize
the names like "pf0sf0" erroneously as "pf0" and the wrong
device type (host PF representor) is reported.
The names like "pf0sf0" belong to PCI subfunctions which
is currently not supported by mlx5 PMD and this false
recognition must be eliminated.
Fixes: 420bbdae89 ("net/mlx5: fix host physical function representor naming")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This adds the ConnectX-6 Lx device id to the list of supported
Mellanox devices that run the MLX5 PMD.
The device is still in development stage.
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
Some compilers raise an error when declaring a variable
in the middle of a function. This is a C99 allowance.
Even if DPDK switches globally to C99 or C11 standard,
the coding rules are for declarations at the beginning
of a block:
http://doc.dpdk.org/guides/contributing/coding_style.html#local-variables
This coding style is enforced by adding a check of
the common patterns like "for (int i;"
The occurrences of the checked pattern are fixed:
'for *(\(char\|u\?int\|unsigned\|s\?size_t\)'
In the file dpaa2_sparser.c, the fix is to remove the unused macros.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
allow_experimental_apis flag has no effect for in-tree compilation.
See https://git.dpdk.org/dpdk/commit/?id=acec04c4b2f5
Fixes: 72f7566056 ("common/mlx5: move glue files under Linux directory")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
The mlx5_dev_to_pci_addr function defines a variable called ret inside a
loop and uses it.
During the loop, the function assigns a value within the variable and
breaks from the loop, so that this assigning has done nothing and is
actually unnecessary.
Remove the unnecessary assigning.
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Flow tag action is supported only when the driver has DR or DV support.
The tag allocation is adjusted to the modes DV or DR.
In case both DR and DV are not supported in the system, the driver
handles static code for error report.
This error code, wrongly, was compiled when DV is supported while in
this case it cannot be accessed at all.
Ignore the aforementioned static error code in case of DV by
preprocessor commands rearrangement.
Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Add dv_create_action_default_miss wrapper function
for the action added to the rdma-core
MLX5DV_FLOW_ACTION_DEFAULT_MISS.
When a packet matches MLX5DV_FLOW_ACTION_DEFAULT_MISS
action it is steered to the default miss of the verbs
steering domain.
Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The new kernel adds the names like "pf0" for Host PCI physical
function representor on Bluefield SmartNIC hosts. This patch
provides correct HPF representor recognition over the kernel
versions 5.7 and laters.
The following port naming formats are supported:
- missing physical port name (no sysfs/netlink key) at all,
master is assumed
- decimal digits (for example "12"), representor is
assumed, the value is the index of attached VF
- "p" followed by decimal digits, for example "p2", master
is assumed
- "pf" followed by PF index, for example "pf0", Host PF
representor is assumed on SmartNIC systems.
- "pf" followed by PF index concatenated with "vf" followed by
VF index, for example "pf0vf1", representor is assumed.
If index of VF is "-1" it is a special case of Host PF
representor, this representor must be indexed in devargs
as 65535, for example representor=[0-3,65535] will
allow representors for VF0, VF1, VF2, VF3 and for host PF.
Fixes: 79aa430721 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The creation of DBR can be used by a number of different
Mellanox PMDs. for example RegEx / Net / VDPA.
This commits moves the DBR creation and release functions to common
folder.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Getter functions such as: 'mlx5_os_get_ctx_device_name',
'mlx5_os_get_ctx_device_path', 'mlx5_os_get_dev_device_name',
'mlx5_os_get_umem_id' are implemented under net directory. To enable
additional devices (e.g. regex, vdpa) to access these getter functions
they are moved under common directory.
As part of this commit string sizes DEV_SYSFS_NAME_MAX and
DEV_SYSFS_PATH_MAX are increased by 1 to make sure that the destination
string size in strncpy() function is bigger than the source string size.
This update will avoid GCC version 8 error -Werror=stringop-truncation.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Some configuration of the mlx5 port are done by the kernel net device
associated to the IB device represents the PCI device.
The DPDK mlx5 driver uses Linux system calls, for example ioctl, in
order to configure per port configurations requested by the DPDK user.
One of the basic knowledges required to access the correct kernel net
device is its name.
Move function to get interface name from IB device path to the common
library.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>