This commit protects the VLAN VM workaround area using a spinlock
in multiple-thread flow insertion to make it thread safe.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
To support multi-thread flow operations, this patch introduces list lock
for the rte_flow list manages all the rte_flow handlers.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The ID generation API used an integer pool to save released ID, To
support multiple flow, it has to be enhanced to be thread safe.
Indexed pool could be used to generate unique ID by setting size of pool
entry to zero. Since bitmap is used, an extra benefits is saving memory
to about one bit per entry. Further more indexed pool could be thread
safe by enabling lock.
This patch leverages indexed pool to generate ID, removes
unused ID generating API.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
As part of multi-thread flow support, this patch moves flow intermediate
data to thread specific, makes them a flow workspace. The workspace is
allocated per thread, destroyed along with thread life-cycle.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The rte_atomic API is deprecated and needs to be replaced with
C11 atomic builtins. Use the relaxed ordering for RxQ/TxQ refcounts.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In real-life business, one device could be attached and detached
dynamically. The hairpin configuration of this port to/from all the
other ports should be enabled and disabled accordingly.
The RTE ethdev lib and PMD should provide this ability to get the
peer ports list in case that the application doesn't save it. It is
recommended that the size of the array to save the port IDs is as
large as the "RTE_MAX_ETHPORTS" to have the maximal capacity.
The order of the peer port IDs may be different from that during
hairpin queues set in the initialization stage. The peer port ID
could be the same as the current device port ID when the hairpin
peer ports contain itself - the single port hairpin.
The application should check the ports' status and decide if the
peer port should be bound / unbound when starting / stopping the
current device.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In order to support hairpin between two ports, mlx5 PMD needs to
implement the functions and provide them as the function pointers.
The bind and unbind functions are executed per port pairs. All the
hairpin queues between the two ports should have the same attributes
during queues setup. Different configurations among queue pairs from
the same ports are not supported. It is allowed that two ports only
have one direction hairpin.
In order to set up the connection between two queues, peer Rx queue
HW information must be fetched via the internal RTE API and the queue
information could be used to modify the SQ object. Then the RQ object
will be modified with the Tx queue HW information. The reverse
operation is not supported right now.
When disconnecting the queues pair, SQ and RQ object should be reset
without any peer HW information. The unbinding operation will try to
disconnect all Tx queues from the port from the Rx queues of the peer
port.
Tx explicit mode attribute will be saved and used when creating a
hairpin flow.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Tunnel Offload API provides hardware independent, unified model
to offload tunneled traffic. Key model elements are:
- apply matches to both outer and inner packet headers
during entire offload procedure;
- restore outer header of partially offloaded packet;
- model is implemented as a set of helper functions.
Implementation details:
* tunnel_offload PMD parameter must be set to 1 to enable the feature.
* application cannot use MARK and META flow actions with tunnel.
* offload JUMP action is restricted to steering tunnel rule only.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When probe a representor, tag cache hash table and modification cache
hash table allocated memory upon each port, overwrote previous existing
cache in shared context data.
This patch moves reference check of shared data prior to hash table
allocation to avoid such issue.
Fixes: 6801116688 ("net/mlx5: fix multiple flow table hash list")
Fixes: 1ef4cdef26 ("net/mlx5: fix flow tag hash list conversion")
Cc: stable@dpdk.org
Acked-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Counter extend memory was allocated for non-batch counter to save the
extra DevX object. Currently, for non-batch counter which does not
support aging, entry in the generic counter struct is used only when
counter is free in free list, and bytes in the struct is used only when
counter is allocated in using.
In this case, the DevX object can be saved to the generic counter struct
union with entry memory when counter is allocated and union with bytes
when counter is free.
And pool type is also not needed as non-fallback mode only has generic
counter and aging counter, just a bit to indicate the pool is aged or
not will be enough.
This eliminates the counter extend info struct saves the memory.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, counter operations are not thread safe as the counter
pools' array resize is not protected.
This commit protects the container pools' array resize using a spinlock.
The original counter pool statistic memory allocate is moved to the
host thread in order to minimize the critical section. Since that pool
statistic memory is required only in query time. The container pools'
array should be resized by the user threads, the new pool may be used
by other rte_flow APIs before the host thread resize is done, if the
pool is not saved to the pools' array, the specified counter memory will
not be found as the pool is not saved to the counter management pool
array. The pool raw statistic memory will be filled in host thread.
The shared counters will be protected in other commit.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
A flow counter which was allocated by a batch API couldn't be assigned
to a flow in the root table (group 0) in old rdma-core version.
Hence, a root table flow counter required PMD mechanism to manage
counters which were allocated singly.
Currently, the batch counters have already been supported in root table
includes a new rdma-core version with MLX5_FLOW_ACTION_COUNTER_OFFSET
enum and with a kernel driver includes
MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX_OFFSET enum.
When the PMD uses rdma-core API to assign a batch counter to a root
table flow using invalid counter offset, it should get an error only
if the batch counter assignment for root table is supported.
Using this trial in the initialization time can help to detect the
support.
Using the above trial, if the support is valid, remove the management of
single counter container in the fast counter mechanism. Otherwise, move
the counter mechanism to fallback mode.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
One of the conditions to create Tx queue object by DevX is to be sure
that the DPDK mlx5 driver is not going to be the E-Switch manager of
the device. The issue is with the default FDB flows managed by the
kernel driver, which are not created by the kernel when the Tx queues
are created by DevX.
The current decision is to create the Tx queues by Verbs when E-Switch
is enabled while the current behavior uses an opposite condition to
create them by DevX.
Create the Tx queues by Verbs when E-Switch is enabled.
Fixes: 86d259cec8 ("net/mlx5: separate Tx queue object creations")
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Queue stats are stored in 'struct rte_eth_stats' as array and array size
is defined by 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag.
As a result of technical board discussion, decided to remove the queue
statistics from 'struct rte_eth_stats' in the long term.
Instead PMDs should represent the queue statistics via xstats, this
gives more flexibility on the number of the queues supported.
Currently queue stats in the xstats are filled by ethdev layer, using
some basic stats, when queue stats removed from basic stats the
responsibility to fill the relevant xstats will be pushed to the PMDs.
During the switch period, temporary 'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS'
device flag is created. Initially all PMDs using xstats set this flag.
The PMDs implemented queue stats in the xstats should clear the flag.
When all PMDs switch to the xstats for the queue stats, queue stats
related fields from 'struct rte_eth_stats' will be removed, as well as
'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' flag.
Later 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag also can be
removed.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Add sample action validate function.
Sample Flow is supported in NIC-RX and FDB domains. For the NIC-RX
the Sample Flow action list must include the destination queue action.
Only NIC-RX domain supports the optional actions list. FDB doesn't
support any optional actions, the sampled packets is always forwarded
to the E-Switch manager port.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Once the separation between Verbs and DevX is done using function
pointers, the type field of the Rx queue object structure becomes
redundant and no more code is used.
Remove the unnecessary field from the structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate Rx state modification to the Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Once the separation between Verbs and DevX is done using function
pointers, the type field of the Tx queue object structure becomes
redundant and no more code is used.
Remove the unnecessary field from the structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Use new modify_qp functions for Tx object creation in DevX and Verbs
modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate Tx object modification to the Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
1. Rename function to mention the internal resources.
2. Reduce the number of function arguments.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Move Tx object similar resources allocations and debug logs from DevX
and Verbs modules to a shared location.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
As an arrangement to Windows OS support, the Verbs operations should be
separated to another file.
By this way, the build can easily cut the unsupported Verbs APIs from
the compilation process.
Define operation structure and DevX module in addition to the existing
Linux Verbs module.
Separate Tx object creation into the Verbs/DevX modules and update the
operation structure according to the OS support and the user
configuration.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.
The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
- trigger event callback
- reset state and few pointers
- free all generic port resources
The private port resources must be released in the .dev_close callback.
The .remove callback should:
- call .dev_close callback
- call rte_eth_dev_release_port()
- free multi-port device shared resources
Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.
* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.
* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.
* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.
* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.
* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
To manage encap decap header format actions mlx5 PMD used the single
linked list and lookup and insertion operations took too long times if
there were millions of objects and this impacted the flow
insertion/deletion rate.
In order to optimize the performance the hashed list is engaged. The
list implementation is updated to support non-unique keys with few
collisions.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In case of bonding, device ifindex was detected as the PF ifindex, so
any operation using ifindex applied to PF instead of the bond device.
These operations includes MTU get/set, up/down and mac address
manipulation, etc.
This patch detects bond interface ifindex and name for PF that join a
bond interface, uses it by default for netdev operations.
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
There are 2 creators for Rx objects, DevX and Verbs.
There are supported DR versions when a DevX destination TIR flow action
creation cannot be supported, using this versions the TIR object should
be created by Verbs, what forces all the Rx objects to be created by
Verbs.
The selection of the Rx objects creator, wrongly, didn't take into
account the destination TIR action support what caused a failure in the
Rx flows creation.
Select Verbs creator when destination TIR action creation is not
supported by the DR version.
Fixes: 6deb19e1b2 ("net/mlx5: separate Rx queue object creations")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
'_rte_eth_dev_callback_process()' & '_rte_eth_dev_reset()' internal APIs
has unconventional underscore ('_') prefix.
Although this is not documented most probably this is to mark them as
internal. Since we have '__rte_internal' flag to mark this, removing '_'
from API names.
For '_rte_eth_dev_reset()', there is already a public API named
'rte_eth_dev_reset()', so renaming '_rte_eth_dev_reset()' to
'rte_eth_dev_internal_reset'.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
This patch is a preparation to hide the 'struct eth_dev_ops' from
applications by moving some device operations from 'struct eth_dev_ops'
to 'struct rte_eth_dev'.
Mentioned ethdev APIs are in the data path and implemented as inline
because of performance reasons.
Exposing 'struct eth_dev_ops' to applications is bad because it is a
contract between ethdev and PMDs, not really needs to be known by
applications, also changes in the struct causing ABI breakages which
shouldn't.
To be able to both keep APIs inline and hide the 'struct eth_dev_ops',
moving device operations used in ethdev inline APIs to 'struct
rte_eth_dev' to the same level with Rx/Tx burst functions.
The list of dev_ops moved:
eth_rx_queue_count_t rx_queue_count;
eth_rx_descriptor_done_t rx_descriptor_done;
eth_rx_descriptor_status_t rx_descriptor_status;
eth_tx_descriptor_status_t tx_descriptor_status;
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Move Rx queue drop action similar resources allocations from Verbs
module to a shared location.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate Rx queue drop creation into both Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Move Rx hash queue object similar resources allocations from DevX and
Verbs modules to a shared location.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Move Rx indirection table object similar resources allocations from DevX
and Verbs modules to a shared location.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Once the separation between Verbs and DevX is done using function
pointers, the type field of the indirection table structure becomes
redundant and no more code is used.
Remove the unnecessary field from the structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate Rx hash queue creation into both Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate Rx indirection table object creation into both Verbs and DevX
modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Use new modify_wq functions for Rx object creation in DevX and Verbs
modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate Rx object modification to the Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Rearrangement of WQ and CQ creation for Verbs Rx queue:
1. Rename the allocation function.
2. Reduce the number of arguments that the creation functions receive.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Move Rx object similar resources allocations and debug logs from DevX
and Verbs modules to a shared location.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Separate interrupt event handler into both Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
As an arrangement to Windows OS support, the Verbs operations should be
separated to another file.
By this way, the build can easily cut the unsupported Verbs APIs from
the compilation process.
Define operation structure and DevX module in addition to the existing
linux Verbs module.
Separate Rx object creation into the Verbs/DevX modules and update the
operation structure according to the OS support and the user
configuration.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
File mlx5_vlan.c contains Netlink APIs (Linux dependent) as part of VM
workaround implementation. Move this implementation to file
linux/mlx5_vlan_os.c. To remove Netlink dependency in header files
change pointer of type 'struct mlx5_nl_vlan_vmwa_context *' to 'void *'.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
When updating a queue vlan stripping offload - either the WQ is modified
in Verbs or the RQ is modified in DevX. Add a vlan stripping modify
callback to 'struct mlx5_obj_ops' and assign it with the specific Verbs
and DevX implementations: 'rxq_obj_modify_wq_vlan_strip' and
'rxq_obj_modify_rq_vlan_strip' respectively.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This commit adds Linux implementation of routine mlx5_os_mac_addr_flush
as wrapper to Netlink API to avoid direct calls under non-Linux
operating systems.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
mlx5_get_ifname() prototype includes 'IF_NAMESIZE' definition from Linux
file net/if.h. Since this API is only used under Linux and to enable
compilation under non-Linux OS - move this prototype from shared file
mlx5.h to file linux/mlx5_os.h.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
To manage header modify actions mlx5 PMD used the single linked list and
lookup and insertion operations took too long times if there were
millions of objects and this impacted the flow insertion/deletion rate.
In order to optimize the performance the hashed list is engaged. The
list implementation is updated to support non-unique keys with few
collisions.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
To detect the timestamp mode configured on the NIC the mlx5
PMD uses the firmware command ACCESS_REGISTER_USER. This
command is relatively new and might be not supported by
older firmware versions and was rejected, causing annoying
messages in kernel log.
This patch adds the attribute flag check whether firmware
supports the command and avoid the call if it does not.
Fixes: bb7ef9a962 ("common/mlx5: add register access DevX routine")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The device configuration struct is not small enough to be used as
function argument by value.
Call spawn function with device configuration by reference.
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>