The new kernel adds the names like "pf0" for Host PCI physical
function representor on Bluefield SmartNIC hosts. This patch
provides correct HPF representor recognition over the kernel
versions 5.7 and laters.
The following port naming formats are supported:
- missing physical port name (no sysfs/netlink key) at all,
master is assumed
- decimal digits (for example "12"), representor is
assumed, the value is the index of attached VF
- "p" followed by decimal digits, for example "p2", master
is assumed
- "pf" followed by PF index, for example "pf0", Host PF
representor is assumed on SmartNIC systems.
- "pf" followed by PF index concatenated with "vf" followed by
VF index, for example "pf0vf1", representor is assumed.
If index of VF is "-1" it is a special case of Host PF
representor, this representor must be indexed in devargs
as 65535, for example representor=[0-3,65535] will
allow representors for VF0, VF1, VF2, VF3 and for host PF.
Fixes: 79aa430721 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The creation of DBR can be used by a number of different
Mellanox PMDs. for example RegEx / Net / VDPA.
This commits moves the DBR creation and release functions to common
folder.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Getter functions such as: 'mlx5_os_get_ctx_device_name',
'mlx5_os_get_ctx_device_path', 'mlx5_os_get_dev_device_name',
'mlx5_os_get_umem_id' are implemented under net directory. To enable
additional devices (e.g. regex, vdpa) to access these getter functions
they are moved under common directory.
As part of this commit string sizes DEV_SYSFS_NAME_MAX and
DEV_SYSFS_PATH_MAX are increased by 1 to make sure that the destination
string size in strncpy() function is bigger than the source string size.
This update will avoid GCC version 8 error -Werror=stringop-truncation.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, when allocate a new counter, it needs loop the whole
container pool list to get a free counter.
In the case with millions of counters allocated, and all the pools
are empty, allocate the new counter will still need to loop the
whole container pool list first, then allocate a new pool to get a
free counter. It wastes the cycles during the pool list traversal.
Add a global free counter list in the container helps to get the free
counters more efficiently.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
For single counter, when allocate a new counter, it needs to find the pool
it belongs in order to do the query together.
Once there are millions of counters allocated, the pool array in the
counter container will become very large. In this case, the pool search
from the pool array will become extremely slow.
Save the minimum and maximum counter ID to have a quick check of current
counter ID range. And start searching the pool from the last pool in the
container will mostly get the needed pool since counter ID increases
sequentially.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, to check if any shared counter with same ID existing, it will
have to loop the counter pools to search for the counter. Even add the
counter to the list will also not so helpful while there are thousands
of shared counters in the list.
Change Three-Level table to look up the counter index saved in the
relevant table entry will be more efficient.
This patch introduces the Three-level table to save the ID relevant
counter index in the table. Then the next while the same ID comes, just
check the table entry of this ID will get the counter index directly.
No search will be needed.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
For the case which data is linked with sequence increased index, the
array table will be more efficient than hash table once need to search
one data entry in large numbers of entries. Since the traditional hash
tables has fixed table size, when huge numbers of data saved to the hash
table, it also comes lots of hash conflict.
But simple array table also has fixed size, allocates all the needed
memory at once will waste lots of memory. For the case don't know the
exactly number of entries will be impossible to allocate the array.
Then the multiple level table helps to balance the two disadvantages.
Allocate a global high level table with sub table entries at first,
the global table contains the sub table entries, and the sub table will
be allocated only once the corresponding index entry need to be saved.
e.g. for up to 32-bits index, three level table with 10-10-12 splitting,
with sequence increased index, the memory grows with every 4K entries.
The currently implementation introduces 10-10-12 32-bits splitting
Three-Level table to help the cases which have millions of entries to
save. The index entries can be addressed directly by the index, no
search will be needed.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Some configuration of the mlx5 port are done by the kernel net device
associated to the IB device represents the PCI device.
The DPDK mlx5 driver uses Linux system calls, for example ioctl, in
order to configure per port configurations requested by the DPDK user.
One of the basic knowledges required to access the correct kernel net
device is its name.
Move function to get interface name from IB device path to the common
library.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Create a set of verbs callbacks in 'struct mlx5_verbs_ops'
and add MR operations to it (file net/mlx5/linux/mlx5_verbs.c).
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Prior to this commit MR operations were verbs based and hard coded under
common/mlx5/linux directory. This commit enables upper layers (e.g.
net/mlx5) to determine which MR operations to use. For example the net
layer could set devx based MR operations in non-Linux environments. The
reg_mr and dereg_mr callbacks are added to the global per-device MR
cache 'struct mlx5_mr_share_cache'.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The TCP checksum includes IPV4 pseudo-header checksum and L3
payload checksum which include TCP header and TCP payload.
When mlx5 LRO is enabled, HW will calculate the TCP payload
checksum, PMD need complete the IPV4 pseudo-header checksum
and the TCP header checksum.
The mlx5_lro_update_tcp_hdr function completes the TCP header
checksum, but this function using lower 4 bits of data-offset
field in TCP header to get the whole TCP header length, this
will cause TCP header checksum wrong calculation.
Update the code using higher 4 bits of data-offset field
instead of lower 4 bits.
Fixes: e4c2a16eb1 ("net/mlx5: handle LRO packets in Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The number of descriptors to configure in a Rx/Tx queue is passed to
the mlx5_tx/rx_queue_pre_setup() function by value. That means any
adjustments of this variable are local and cannot affect the actual
value that is used to allocate mbufs in the mlx5_txq/rxq_new()
functions. Pass the number as a reference to actually update it.
Fixes: 6218063b39 ("net/mlx5: refactor Rx data path")
Fixes: 1d88ba1719 ("net/mlx5: refactor Tx data path")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The Legacy MPW (multi-packet write) should not be engaged implicitly.
We should exclude this function from a Tx burst routine selection
process unless it is requested specifically by setting the txq_mpw_en
devarg. Exclude this function from the selection process the same way
it is done for the Enhanced MPW in the mlx5_select_tx_function()
routine.
Fixes: eb8121ab9d ("net/mlx5: introduce Tx burst routine template")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
mlx5 statistics are calculated by several methods:
1. In software when packets go through datapath.
2. Calling ioctl with ETHTOOL command (Linux specific).
3. Reading counters from SYSFS device path (Linux specific).
The Linux related functions are moved to file linux/mlx5_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
There are three types of eth_dev_ops: primary, secondary and isolate.
Their function calls assignments are moved from common file
mlx5.c to the Linux specific file linux/mlx5_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
File mlx5_ethdev.c is partially moved to linux/mlx5_ethdev_os.c for
functions which are Linux specific. Functions which are Linux agnostics
remain in mlx5_ethdev.c file.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
mlx5_socket.c file is using APIs which are Linux specifics. Therefore
move it (including mlx5_socket.h) from net/mlx5 directory to
net/mlx5/linux directory. This commit also updates the Makefile and
the meson files.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
In DV enabled MLX5 PMD build mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW].size
was initiated for DV structure. If RTE initialization encountered MLX5
PCI function with disabled DV support
mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW].size was reduced to match legacy
verbs flow size. Since mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW] is a
global variable that change reflected on DV enabled MLX5 PCI functions
too.
Running flow with invalid ipool size crashes PMD.
The patch adjusts ipool flow size for each active PCI function.
Fixes: b88341ca35 ("net/mlx5: convert flow dev handle to indexed")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Recent patch added definition of mask MLX5_GTP_FLAGS_MASK, just
above function flow_dv_validate_item_gtp(), where it is used.
Patch was applied together with other patches which modified the same
file, so the mask was located further away from the function it is
used in.
This patch moves the mask definition to the proper location.
Fixes: 563ac307a4 ("net/mlx5: support match on GTP flags")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
1. Replace 'struct ibv_device *' with 'void *' in 'struct
mlx5_dev_spawn_data'. Define a getter function to retrieve the
device name.
2. Rename ibv_dev and ibv_port as phys_dev and phys_port
respectively.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
File drivers/net/linux/mlx5_os.h is added. It includes specific
Linux definitions such as PCI driver flags, link state changes
interrupts, link removal interrupts, etc.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Refactor PCI probing related code. Move Linux specific functions (as
well as verbs and dv related code) from mlx5.c file to linux/mlx5_os.c
file.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
umem field is used in several structs. Its type 'struct mlx5dv_devx_umem
*' is changed to 'void *'. This change will allow non-Linux OS
compilations.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Define 'struct mlx5_dev_attr' which is ibv and dv independent. It
contains attribute that were originally contained in 'struct
ibv_device_attr_ex' and 'struct mlx5dv_context dv_attr'. Add a new API
mlx5_os_get_dev_attr() which fills in the new defined struct.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Replace 'struct ibv_pd *' with 'void *' in struct mlx5_ctx_shared and
all function calls in mlx5 PMD.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
'ctx' type (field in 'struct mlx5_ctx_shared') is changed from 'struct
ibv_context *' to 'void *'. 'ctx' members which are verbs dependent
(e.g. device_name) will be accessed through getter functions which are
added to a new file under Linux directory: linux/mlx5_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Maximum burst size of Vectorized Rx burst routine is set to
MLX5_VPMD_RX_MAX_BURST(64). This limits the performance of any
application that would like to gather more than 64 packets from
the single Rx burst for batch processing (i.e. VPP).
The situation gets worse with a mix of zipped and unzipped CQEs.
They are processed separately and the Rx burst function returns
small number of packets every call.
Repeat the cycle of gathering packets from the vectorized Rx routine
until a requested number of packets are collected or there are no
more CQEs left to process.
Fixes: 6cb559d67b ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, when flow destroyed, some memory resources may still be kept
as cached to help next time create flow more efficiently.
Some system may need the resources to be more flexible with flow create
and destroy. After peak time, with millions of flows destroyed, the
system would prefer the resources to be reclaimed completely, no cache
is needed. Then the resources can be allocated and used by other
components. The system is not so sensitive about the flow insertion
rate, but more care about the resources.
Both DPDK mlx5 PMD driver and the low level component rdma-core have
provided the flow resources to be configured cached or not, but there is
no APIs or parameters exposed to user to configure the flow resources
cache mode. In this case, introduce a new PMD devarg to let user
configure the flow resources cache mode will be helpful.
This commit is to add a new "reclaim_mem_mode" to help user configure if
the destroyed flows' cache resources should be kept or not.
Their will be three mode can be chosen:
1. 0(none). It means the flow resources will be cached as usual. The
resources will be cached, helpful with flow insertion rate.
2. 1(light). It will only enable the DPDK PMD level resources reclaim.
3. 2(aggressive). Both DPDK PMD level and rdma-core low level will be
configured as reclaimed mode.
With these three mode, user can configure the resources cache mode with
different levels.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The glue file mlx5_glue.c is based on Linux specifics APIs.
Move it (including file mlx5_glue.h) to common/mlx5/linux directory.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, the DevX counter query works asynchronously with Devx
interrupt handler return the query result. When port closes, the
interrupt handler will be uninstalled and the Devx comp obj will
also be destroyed. Meanwhile the query is still not cancelled.
In this case, counter query may use the invalid Devx comp which
has been destroyed, and query failure with invalid FD will be
reported.
Adjust the shared interrupt install and uninstall timing to make
the counter asynchronous query stop before interrupt uninstall.
Fixes: f15db67df0 ("net/mlx5: accelerate DV flow counter query")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
When secondary process starts, it will allocate its own process private
data, and also does remap to UAR register of the Tx queue. Once the
secondary process exits, these resources should be released accordingly.
And the shared resources owned by primary should not be touched.
Currently, once one port in the secondary process spawn failed, all the
other spawned ports will also be released during process exits. However,
the mlx5_dev_close() function does not add the cases for secondary
process, it means call the mlx5_dev_close() function directly in
secondary process releases the resources it should not touch.
Add the case for secondary process release to its own resources in
mlx5_dev_close() function to help it quits gracefully.
Fixes: 942d13e6e7 ("net/mlx5: fix sharing context destroy order")
Fixes: 3a8207423a ("net/mlx5: close all ports on remove")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The mlx5_flow_validate_item_mpls function checks MPLS item validation.
It first checks if the device supports MPLS, it is done using the ifdef
condition that if it fails to skip to endif and return the appropriate
error.
When MPLS is supported, the preprocessor will copy the body of the
function ending with return 0 followed by the lines that report MPLS
support.
In fact, these lines are unreachable because before them the function
returns 0 and in any case they are unnecessary.
Replace the endif by else and move endif to the end of the
function.
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The mlx5_txq_obj_new function defines a pointer named txq_data and
assign value into it. After assigning, the code writer is sure that the
variable does not point to NULL and even express it using assertion.
During the function, the function does dereferencing to the pointer
several times and at no point change its value. However, at the end of
the function at the error label when it wants to free one of the fields
of the structure that txq_data points to, it checks again whether
txq_data is invalid.
This check is unnecessary since it knows for sure that txq_data is
valid.
Remove the aforementioned needless check.
Fixes: 6449068818 ("net/mlx5: add free on completion queue")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The mlx5_pmd_socket_handle function calls the accept function that
returns the socket descriptor into the conn_sock variable. The socket
descriptor value can be 0 (according to accept API) or positive and so
immediately after calling the function it checks whether conn_sock < 0.
Later in the function when other things fail it jumps to the error label
and release previously allocated resources (such as socket or file).
During the resource release, it checks whether the variable conn_sock
containing the socket descriptor is positive and if it is, it releases
it. However, in this check it misses the case where conn_sock == 0, in
this case the socket will not be released and there will be a Resource
leak.
Extend the close condition for 0 value too.
Fixes: e6cdc54cc0 ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In the mlx5_pmd_socket_handle function it calls the recvmsg function
which returns the number of bytes read. The function assigns this return
value into a ret variable defined at the beginning of the function.
Similarly in the mlx5_pmd_socket_init function the it calls the socket
function which returns a file descriptor for the new socket. The
function also assigns this return value into a ret variable defined at
the beginning of the function.
In both functions they initialize the variable when defining it,
however, in both cases they do not use any ret variable before assigning
the return value from the function, so the initialization is
unnecessary.
Clean the aforementioned unnecessary initializations.
Fixes: e6cdc54cc0 ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The mlx5_rxq_obj_hairpin_new function defines a pointer named tmpl and
allocates memory for it using the rte_zmalloc_socket function.
Later, this function allocates memory to a variable inside tmpl using
the mlx5_devx_cmd_create_rq function.
In both cases, if the allocation fails, the code jumps to the error
label and frees allocated resources. However, in the first jump there
are still no resources to free and the jump only for the line return
NULL is unnecessary. Even worse, when it jumps to error label with
invalid tmpl it actually does dereference to a null pointer.
In contrast, the second jump needs to free the tmpl variable but the
function instead of freeing, tries to free the variable that it just
failed to allocate.
In addition, for another error, the function returns NULL without
freeing the tmpl variable before, causing a memory leak.
Delete the error label and replace each jump with local return NULL and
free tmpl variable if needed.
Fixes: e79c9be915 ("net/mlx5: support Rx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The mlx5_txq_obj_hairpin_new function defines a pointer named tmpl and
allocates memory for it using the rte_zmalloc_socket function.
Later, this function allocates memory to a variable inside tmpl using
the mlx5_devx_cmd_create_sq function.
In both cases, if the allocation fails, the code jumps to the error
label and frees allocated resources. However, in the first jump there
are still no resources to free and the jump only for the line return
NULL is unnecessary. Even worse, when it jumps to error label with
invalid tmpl it actually does dereference to a null pointer.
In contrast, the second jump needs to free the tmpl variable but the
function instead of freeing, tries to free the variable that it just
failed to allocate, and another variable that has never been allocated.
In addition, for another error, the function returns NULL without
freeing the tmpl variable before, causing a memory leak.
Delete the error label and replace each jump with local return NULL and
free tmpl variable if needed.
Fixes: ae18a1ae96 ("net/mlx5: support Tx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Recent patch exposed a minor performance issue,
so it is reverted.
Fixes: d237d22fbe ("net/mlx5: prefer DevX API to create Rx objects")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Removed the typing error in doc/guides/eventdevs/index.rst,
drivers/net/mlx5/mlx5.c and in lib/librte_vhost/rte_vhost.h
Bugzilla ID: 477
Fixes: 0857b94211 ("doc: add event device and software eventdev")
Fixes: 039253166a ("vhost: add device op when notification to guest is sent")
Fixes: ad74bc6195 ("net/mlx5: support multiport IB device during probing")
Cc: stable@dpdk.org
Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
After memory optimization, the organization of some resources are
changed from pointer based LIST to the index based ILIST. A lot of
code parts are touched due to such change.
Some static code checking and analysis tool will complain and raise
a false warning on the uninitialized value using. E.g. in the port
action registering function, the stack variable will be used as the
right value with some uninitialized field to initialize variable
allocated from heap. But indeed, it is not an error because all the
fields set with the uninitialized value will be overwritten in the
following code part and the macros. All the fields will be used as
the left value explicitly.
It makes no sense to clear the stack variable to 0 in this case,
and the extra memset will introduce some cycles overhead. It just
needs to ignore the false warning from the tool, if any.
Fixes: f3faf9ea11 ("net/mlx5: convert port id action to indexed")
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Reviewed-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
After memory optimization, some action object handles are changed to
index to save the overhead. Assertion in debug mode will be helpful
for trouble shooting.
In the current implementation, only one port action is supported in
switchdev mode for one device flow. In debug mode, an assertion will
be used to check the if the port action is none, and it should
locate before the port action resource registration but not after
it. The action index in the handle should be 0 before registration.
Or else it will always cause a failure because the port action is
registered and the index is not 0.
Fixes: f3faf9ea11 ("net/mlx5: convert port id action to indexed")
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Reviewed-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Currently, the shared counter search uses the wrong nested index which
is used by the pool index. The incorrect nested index using causes the
search go to incorrect counter pool is not existed.
Add the counter index to fix the incorrect nested use case.
Fixes: 4001d7ad26 ("net/mlx5: change Direct Verbs counter to indexed")
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The doorbell record is organized with page and bitmap. When some new
doorbell needs to be associated with a queue, the bit will be set
in the bitmap to indicate the corresponding doorbell occupied. A
counter is used to record the number of doorbell occupied to speed
up the searching.
If the number reaches the maximal value of a pre-defined number of a
page, a new page will be allocated. If not, then the bitmap will be
checked to find a free one.
The LSHIFT and OR (AND NOT) operations are used to update the bitmap
of a page. But 1 will be treated as a signed integer when compiling.
When the shift number is 31, the shifted value will be considered as
negative. Then a wrong extension will be done when setting it to a
64-bits variable. All the upper 32-bits will be set to 1 by such
extension.
Then a wrong offset value will be calculated because of this. The
next 64 bits will be also treated as the bitmap and get corrupted
through the bit set operation.
The immediate value 1 needs to be used as 64 bits width explicitly.
Fixes: 21cae8580f ("net/mlx5: allocate door-bells via DevX")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
When create the Verbs flows with counter, randomly SEGSEV will also
comes. The reason is that the counter pool memory is not allocated
sufficiently and initialized correctly in Verbs case.
As the mlx5_flow_counter array member is moved out of the counter pool
struct, the counter pool memory layout currently contain implicitly
with mlx5_flow_counter, mlx5_age_param(if the pool is an age pool),
mlx5_flow_counter_ext(if the pool is a none batch pool). When allocate
the pool memory, the pool size should be calculated based on the pool
type accordingly.
Currently, for Verbs counter pool, both mlx5_flow_counter and
mlx5_flow_counter_ext need to be taken into account in the pool size.
And the pool type should also be initialized as CNT_POOL_TYPE_EXT.
This patch add the missing size and type for the Verbs counter pool.
Fixes: 8d93c830e4 ("net/mlx5: modify ext-counter memory allocation")
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Previous patch added support of VLAN item without VLAN ID value,
i.e. using wildcard VLAN item, to match VLAN with any VLAN ID.
The implication on VLAN actions was not taken into consideration.
VLAN actions (e.g. push vlan) use the VLAN ID value in the VLAN item,
and expect it to be valid.
This patch updates function flow_dev_get_vlan_info_from_items() to
check the VLAN item contents before trying to use it.
Fixes: 92818d839e ("net/mlx5: fix match on empty VLAN item in DV mode")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>