Windows Devx interface name is the same as device name with
different size then IF_NAMESIZE. To support it MLX5_NAMESIZE
is defined with IF_NAMESIZE value for Linux and MLX5_FS_NAME_MAX
value for Windows.
Fixes: e9c0b96e35 ("net/mlx5: move Linux ifname function")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
There are three types of eth_dev_ops: primary, secondary and isolate
represented in three callback tables per OS. In this commit the OS
specific eth dev tables are unified into shared tables in file mlx5.c.
Starting from this commit all operating systems must implement the same
eth dev APIs. In case an OS does not support an API - it can return in
its implementation an error ENOTSUP.
Fixes: 042f5c94fd ("net/mlx5: refactor device operations for Linux")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Wrap glue call dr_create_flow_action_default_miss() with an OS API. This
commit is a follow up on [1].
[1]
commit d4d85aa6f1 ("common/mlx5: add default miss action")
commit b293fbf967 ("net/mlx5: add OS specific flow actions operations")
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
mlx5_flow_adjust_priority() is used to adjust priorities according to
priorities levels. It is Verbs based and it is called from shared code
(mlx5_flow_dv.c). Therefore, wrap it in an OS API.
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
ibdev_name and ibdev_path sizes are defined in Windows DevX
differently from the sizes used in Linux with
IBV_SYSFS_NAME_MAX and IBV_SYSFS_PATH_MAX.
Added MLX5_FS_NAME_MAX and MLX5_FS_NAME_PATH in mlx5_os.h for both OSs.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently MR operations are Verbs based. This commit updates MR
operations prototypes such that DevX MR operations callbacks can be used
as well. Rename 'struct mlx5_verbs_ops' as 'struct mlx5_mr_ops' and
move it to shared file mlx5.h.
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, the hash list saves the hash key in the hash entry. And the
key is mostly used to get the bucket index only.
Save the entire 64 bits key to the entry will not be a good option if
the key is only used to get the bucket index. Since 64 bits costs more
memory for the entry, mostly the signature data in the key only uses
32 bits. And in the unregister function, the key in the entry causes
extra bucket index calculation.
This commit saves the bucket index to the entry instead of the hash key.
For the hash list like table, tag and mreg_copy which save the signature
data in the key, the signature data is moved to the resource data struct
itself.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The rdma-core library uses callbacks to allocate and free memory
from DPDK. The memory allocation callback used the complicated
and incorrect way to get the NUMA socket ID from the context.
The context was wrong that might result in wrong socket ID
and allocating memory from wrong node.
The callbacks are assigned once as Infinibande device context
is created allowing early access to shared DPDK memory for all
Verbs internal objects need that.
Fixes: 36dabcea78 ("net/mlx5: use anonymous Direct Verbs allocator argument")
Fixes: 2eb4d0107a ("net/mlx5: refactor PCI probing on Linux")
Fixes: 17e19bc4dd ("net/mlx5: add IB shared context alloc/free functions")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
mlx5 PMD refuses to update link state if link speed is defined but
status is down or if link speed is undefined but status is up, even if
the ioctl() succeeded.
This prevents application to detect link up/down event, especially when
the link speed is not correctly detected.
Commit [1] allowed returning unknown link speed, so now PMD allows
the return of unknown link speed in the above case.
Due to some old kernel driver bug, link speed wasn't detected properly.
[1] http://git.dpdk.org/dpdk/commit/?id=810b17d116f03
Signed-off-by: Benoît Ganne <bganne@cisco.com>
Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Existing code uses the previous API offered by rdma-core in order
to create ASO Flow Hit action.
A general API is now formally released, to create ASO action of any
type. This patch moves the MLX5 PMD code to use the formal API.
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Existing code uses the hard-coded value REG_C_5 as input for function
mlx5dv_dr_action_create_flow_hit().
This patch updates function mlx5_flow_get_reg_id() to return the
selected REG_C value for ASO Flow Hit operation.
The returned value is used, after reducing offset REG_C_0, as input
for function mlx5dv_dr_action_create_flow_hit().
Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Starting ConnectX-6 Dx, the VF device ID is generic
and not per chip.
https://pci-ids.ucw.cz/v2.2/pci.ids
101e ConnectX Family mlx5Gen Virtual Function
This means that all will have the same VF device ID.
Fixes: 5fc66630be ("net/mlx5: add ConnectX6-DX device ID")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The sample and mirror action objects are maintained on the list
shared between the ports belonging to the same multiport Infiniband
device(between representors).
The actions in the NIC steering domains might contain the references
to the sub-flow action objects created over the given port. The action
deletion might happen in the context of the different port and on the
deletion of referenced objects the incorrect port might be specified.
To avoid this we should save the port on what the sub-flow actions
were created and then use this saved port for sub-flow action release.
This commit saves the create device in the sample and mirror actions
struct to avoid using the incorrect port device in releasing.
Fixes: 1978414169 ("net/mlx5: make sample and mirror action thread safe")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Reviewed-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When the representor device was set to PF1 in bonding mode, iterating
device iterator that looking for representors by bonding device failed
to match PF0 pci address with PF1 address. So detaching PF bonding
device only detached all representors on PF0.
This patch registers all representors of PF1 with PF0 as PCI device.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Bonding adjustment is done only when DEVX_PORT is supported in the
rdma-core.
Some bonding condition was done even when DEVX_PORT is not supported.
Remove it.
Fixes: 2eb4d0107a ("net/mlx5: refactor PCI probing on Linux")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
mlx_steering_dump_parser.py tool failed to dump flow due to socket file
name changed.
Change socket file name back to make it consistent.
Fixes: e4b7b8d082 ("common/mlx5: fix PCI driver name")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The Tx queue stop API doesn't call the PMD callback when the state of
the queue is stopped.
The drivers should update the state to be stopped when the queue stop
callback is done successfully or when the port is stopped.
The drivers should update the state to be started when the queue start
callback is done successfully or when the port is started.
The driver wrongly didn't update the state as started when the port
start callback was done which kept the state as stopped.
Following call to a queue stop API was not completed by ethdev layer
because the state is already stopped.
Move the state update from the Tx queue setup to the port start
callback.
Fixes: 161d103b23 ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The current DevX implementation of the relaxed ordering feature is
enabling relaxed ordering usage only if both relaxed ordering read AND
write are supported. In that case both relaxed ordering read and write
are activated.
This commit will optimize the usage of relaxed ordering by enabling it
when the read OR write features are supported. Each relaxed ordering
type will be activated according to its own capability bit.
This will align the DevX flow with the verbs implementation of
ibv_reg_mr when using the flag IBV_ACCESS_RELAXED_ORDERING
Fixes: 53ac93f71a ("net/mlx5: create relaxed ordering memory regions")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
A new ASO (Advanced Steering Operation) feature was added in the last
mlx5 adapters to support flow hit detection.
Using this new steering action, the driver can detect flow traffic hit
and to reset this indication any time.
The ASO age action cannot support flows in table 0.
Add support for flow aging action in rte_flow using this new feature.
The counter aging mode will be taken only when the ASO feature is not
supported for the user flow groups.
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch removes shared data
lock since all resources should support concurrent protection.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit uses spinlock to protect the shared action list in multiple
thread.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit uses cache list to make sample and mirror action thread
safe.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch converts push VLAN
action cache list to thread safe cache list.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch convert port id
action cache list to thread safe cache list.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit applies the cache linked list to Rx queue to make it thread
safe.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch updates flow header
reformat action list to use thread safe hash list with write-most mode.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch updates metadata copy
flow list to use thread safe hash list.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch updates flow modify
action list to use thread safe hash list with write-most mode.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow insertion, this patch updates flow tag list
to use thread safe hash list with write-most mode.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
All table, tag, header modify, header reformat are supported only on DV
mode. For the OFED version doesn't support these, create the related
redundant DV resources waste the memory.
Add the code section in the HAVE_IBV_FLOW_DV_SUPPORT macro to avoid the
redundant resources allocation.
Fixes: 2eb4d0107a ("net/mlx5: refactor PCI probing on Linux")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
In order to support hash list concurrent access, adding next:
1. List level read/write lock.
2. Entry reference counter.
3. Entry create/match/remove callback.
4. Remove insert/lookup/remove function which are not thread safe.
5. Add register/unregister function to support entry reuse.
For better performance, lookup function uses read lock to
allow concurrent lookup from different thread, all other hash list
modification functions uses write lock which blocks concurrent
modification and lookups from other thread.
The exact objects change will be applied in the next patches.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit creates the global drop action for flows instead of
maintain it in flow insertion time. The uniqueu global drop action
makes it thread safe.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit creates the global default miss action instead of maintain
it in flow insertion time. This makes the action to be thread safe.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit protects the VLAN VM workaround area using a spinlock
in multiple-thread flow insertion to make it thread safe.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow operations, this patch introduces list lock
for the rte_flow list manages all the rte_flow handlers.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The ID generation API used an integer pool to save released ID, To
support multiple flow, it has to be enhanced to be thread safe.
Indexed pool could be used to generate unique ID by setting size of pool
entry to zero. Since bitmap is used, an extra benefits is saving memory
to about one bit per entry. Further more indexed pool could be thread
safe by enabling lock.
This patch leverages indexed pool to generate ID, removes
unused ID generating API.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
As part of multi-thread flow support, this patch moves flow intermediate
data to thread specific, makes them a flow workspace. The workspace is
allocated per thread, destroyed along with thread life-cycle.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The rte_atomic API is deprecated and needs to be replaced with
C11 atomic builtins. Use the relaxed ordering for RxQ/TxQ refcounts.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In real-life business, one device could be attached and detached
dynamically. The hairpin configuration of this port to/from all the
other ports should be enabled and disabled accordingly.
The RTE ethdev lib and PMD should provide this ability to get the
peer ports list in case that the application doesn't save it. It is
recommended that the size of the array to save the port IDs is as
large as the "RTE_MAX_ETHPORTS" to have the maximal capacity.
The order of the peer port IDs may be different from that during
hairpin queues set in the initialization stage. The peer port ID
could be the same as the current device port ID when the hairpin
peer ports contain itself - the single port hairpin.
The application should check the ports' status and decide if the
peer port should be bound / unbound when starting / stopping the
current device.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In order to support hairpin between two ports, mlx5 PMD needs to
implement the functions and provide them as the function pointers.
The bind and unbind functions are executed per port pairs. All the
hairpin queues between the two ports should have the same attributes
during queues setup. Different configurations among queue pairs from
the same ports are not supported. It is allowed that two ports only
have one direction hairpin.
In order to set up the connection between two queues, peer Rx queue
HW information must be fetched via the internal RTE API and the queue
information could be used to modify the SQ object. Then the RQ object
will be modified with the Tx queue HW information. The reverse
operation is not supported right now.
When disconnecting the queues pair, SQ and RQ object should be reset
without any peer HW information. The unbinding operation will try to
disconnect all Tx queues from the port from the Rx queues of the peer
port.
Tx explicit mode attribute will be saved and used when creating a
hairpin flow.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Tunnel Offload API provides hardware independent, unified model
to offload tunneled traffic. Key model elements are:
- apply matches to both outer and inner packet headers
during entire offload procedure;
- restore outer header of partially offloaded packet;
- model is implemented as a set of helper functions.
Implementation details:
* tunnel_offload PMD parameter must be set to 1 to enable the feature.
* application cannot use MARK and META flow actions with tunnel.
* offload JUMP action is restricted to steering tunnel rule only.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When probe a representor, tag cache hash table and modification cache
hash table allocated memory upon each port, overwrote previous existing
cache in shared context data.
This patch moves reference check of shared data prior to hash table
allocation to avoid such issue.
Fixes: 6801116688 ("net/mlx5: fix multiple flow table hash list")
Fixes: 1ef4cdef26 ("net/mlx5: fix flow tag hash list conversion")
Cc: stable@dpdk.org
Acked-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Counter extend memory was allocated for non-batch counter to save the
extra DevX object. Currently, for non-batch counter which does not
support aging, entry in the generic counter struct is used only when
counter is free in free list, and bytes in the struct is used only when
counter is allocated in using.
In this case, the DevX object can be saved to the generic counter struct
union with entry memory when counter is allocated and union with bytes
when counter is free.
And pool type is also not needed as non-fallback mode only has generic
counter and aging counter, just a bit to indicate the pool is aged or
not will be enough.
This eliminates the counter extend info struct saves the memory.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, counter operations are not thread safe as the counter
pools' array resize is not protected.
This commit protects the container pools' array resize using a spinlock.
The original counter pool statistic memory allocate is moved to the
host thread in order to minimize the critical section. Since that pool
statistic memory is required only in query time. The container pools'
array should be resized by the user threads, the new pool may be used
by other rte_flow APIs before the host thread resize is done, if the
pool is not saved to the pools' array, the specified counter memory will
not be found as the pool is not saved to the counter management pool
array. The pool raw statistic memory will be filled in host thread.
The shared counters will be protected in other commit.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
A flow counter which was allocated by a batch API couldn't be assigned
to a flow in the root table (group 0) in old rdma-core version.
Hence, a root table flow counter required PMD mechanism to manage
counters which were allocated singly.
Currently, the batch counters have already been supported in root table
includes a new rdma-core version with MLX5_FLOW_ACTION_COUNTER_OFFSET
enum and with a kernel driver includes
MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX_OFFSET enum.
When the PMD uses rdma-core API to assign a batch counter to a root
table flow using invalid counter offset, it should get an error only
if the batch counter assignment for root table is supported.
Using this trial in the initialization time can help to detect the
support.
Using the above trial, if the support is valid, remove the management of
single counter container in the fast counter mechanism. Otherwise, move
the counter mechanism to fallback mode.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
One of the conditions to create Tx queue object by DevX is to be sure
that the DPDK mlx5 driver is not going to be the E-Switch manager of
the device. The issue is with the default FDB flows managed by the
kernel driver, which are not created by the kernel when the Tx queues
are created by DevX.
The current decision is to create the Tx queues by Verbs when E-Switch
is enabled while the current behavior uses an opposite condition to
create them by DevX.
Create the Tx queues by Verbs when E-Switch is enabled.
Fixes: 86d259cec8 ("net/mlx5: separate Tx queue object creations")
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>