Commit Graph

130 Commits

Author SHA1 Message Date
Ferruh Yigit
f30e69b41f ethdev: add device flag to bypass auto-filled queue xstats
Queue stats are stored in 'struct rte_eth_stats' as array and array size
is defined by 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag.

As a result of technical board discussion, decided to remove the queue
statistics from 'struct rte_eth_stats' in the long term.

Instead PMDs should represent the queue statistics via xstats, this
gives more flexibility on the number of the queues supported.

Currently queue stats in the xstats are filled by ethdev layer, using
some basic stats, when queue stats removed from basic stats the
responsibility to fill the relevant xstats will be pushed to the PMDs.

During the switch period, temporary 'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS'
device flag is created. Initially all PMDs using xstats set this flag.
The PMDs implemented queue stats in the xstats should clear the flag.

When all PMDs switch to the xstats for the queue stats, queue stats
related fields from 'struct rte_eth_stats' will be removed, as well as
'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' flag.
Later 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag also can be
removed.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-10-16 23:27:15 +02:00
Jiawei Wang
96b1f0273c net/mlx5: validate sample action
Add sample action validate function.

Sample Flow is supported in NIC-RX and FDB domains. For the NIC-RX
the Sample Flow action list must include the destination queue action.

Only NIC-RX domain supports the optional actions list. FDB doesn't
support any optional actions, the sampled packets is always forwarded
to the E-Switch manager port.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-10-16 19:48:18 +02:00
Michael Baum
e96242efa4 net/mlx5: remove Rx queue object type field
Once the separation between Verbs and DevX is done using function
pointers, the type field of the Rx queue object structure becomes
redundant and no more code is used.
Remove the unnecessary field from the structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
4c6d80f1c5 net/mlx5: separate Rx queue state modification
Separate Rx state modification to the Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
354cc08a2d net/mlx5: remove Tx queue object type field
Once the separation between Verbs and DevX is done using function
pointers, the type field of the Tx queue object structure becomes
redundant and no more code is used.
Remove the unnecessary field from the structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
a9c7930662 net/mlx5: share Tx queue object modification
Use new modify_qp functions for Tx object creation in DevX and Verbs
modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
5d9f3c3f48 net/mlx5: separate Tx queue object modification
Separate Tx object modification to the Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
e8390b3de0 net/mlx5: rearrange QP creation in Verbs module
1. Rename function to mention the internal resources.
2. Reduce the number of function arguments.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
f49f44839d net/mlx5: share Tx control code
Move Tx object similar resources allocations and debug logs from DevX
and Verbs modules to a shared location.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Michael Baum
86d259cec8 net/mlx5: separate Tx queue object creations
As an arrangement to Windows OS support, the Verbs operations should be
separated to another file.
By this way, the build can easily cut the unsupported Verbs APIs from
the compilation process.

Define operation structure and DevX module in addition to the existing
Linux Verbs module.
Separate Tx object creation into the Verbs/DevX modules and update the
operation structure according to the OS support and the user
configuration.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-10-09 13:17:42 +02:00
Thomas Monjalon
fbd1913561 ethdev: remove old close behaviour
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.

The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
	- trigger event callback
	- reset state and few pointers
	- free all generic port resources

The private port resources must be released in the .dev_close callback.

The .remove callback should:
	- call .dev_close callback
	- call rte_eth_dev_release_port()
	- free multi-port device shared resources

Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
	ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.

* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.

* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.

* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.

* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.

* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:14 +02:00
Suanming Mou
bf615b077d net/mlx5: manage header reformat actions with hashed list
To manage encap decap header format actions mlx5 PMD used the single
linked list and lookup and insertion operations took too long times if
there were millions of objects and this impacted the flow
insertion/deletion rate.

In order to optimize the performance the hashed list is engaged. The
list implementation is updated to support non-unique keys with few
collisions.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-09-30 19:19:09 +02:00
Xueming Li
c21e5facf7 net/mlx5: use bond index for netdev operations
In case of bonding, device ifindex was detected as the PF ifindex, so
any operation using ifindex applied to PF instead of the bond device.
These operations includes MTU get/set, up/down and mac address
manipulation, etc.

This patch detects bond interface ifindex and name for PF that join a
bond interface, uses it by default for netdev operations.

Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-09-30 19:19:09 +02:00
Michael Baum
7aa9892f79 net/mlx5: fix Rx objects creator selection
There are 2 creators for Rx objects, DevX and Verbs.
There are supported DR versions when a DevX destination TIR flow action
creation cannot be supported, using this versions the TIR object should
be created by Verbs, what forces all the Rx objects to be created by
Verbs.

The selection of the Rx objects creator, wrongly, didn't take into
account the destination TIR action support what caused a failure in the
Rx flows creation.

Select Verbs creator when destination TIR action creation is not
supported by the DR version.

Fixes: 6deb19e1b2 ("net/mlx5: separate Rx queue object creations")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:11 +02:00
Ferruh Yigit
5723fbed4f ethdev: remove underscore prefix from internal API
'_rte_eth_dev_callback_process()' & '_rte_eth_dev_reset()' internal APIs
has unconventional underscore ('_') prefix.
Although this is not documented most probably this is to mark them as
internal. Since we have '__rte_internal' flag to mark this, removing '_'
from API names.

For '_rte_eth_dev_reset()', there is already a public API named
'rte_eth_dev_reset()', so renaming '_rte_eth_dev_reset()' to
'rte_eth_dev_internal_reset'.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
cbfc6111b5 ethdev: move inline device operations
This patch is a preparation to hide the 'struct eth_dev_ops' from
applications by moving some device operations from 'struct eth_dev_ops'
to 'struct rte_eth_dev'.

Mentioned ethdev APIs are in the data path and implemented as inline
because of performance reasons.

Exposing 'struct eth_dev_ops' to applications is bad because it is a
contract between ethdev and PMDs, not really needs to be known by
applications, also changes in the struct causing ABI breakages which
shouldn't.

To be able to both keep APIs inline and hide the 'struct eth_dev_ops',
moving device operations used in ethdev inline APIs to 'struct
rte_eth_dev' to the same level with Rx/Tx burst functions.

The list of dev_ops moved:
eth_rx_queue_count_t       rx_queue_count;
eth_rx_descriptor_done_t   rx_descriptor_done;
eth_rx_descriptor_status_t rx_descriptor_status;
eth_tx_descriptor_status_t tx_descriptor_status;

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
2020-09-18 18:55:08 +02:00
Michael Baum
0c762e81da net/mlx5: share Rx queue drop action code
Move Rx queue drop action similar resources allocations from Verbs
module to a shared location.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
5eaf882e94 net/mlx5: separate Rx queue drop
Separate Rx queue drop creation into both Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
5a959cbfa6 net/mlx5: share Rx hash queue code
Move Rx hash queue object similar resources allocations from DevX and
Verbs modules to a shared location.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
25ae7f1a5d net/mlx5: share Rx queue indirection table code
Move Rx indirection table object similar resources allocations from DevX
and Verbs modules to a shared location.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
66b96fa6a6 net/mlx5: remove indirection table type field
Once the separation between Verbs and DevX is done using function
pointers, the type field of the indirection table structure becomes
redundant and no more code is used.
Remove the unnecessary field from the structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
85552726d3 net/mlx5: separate Rx hash queue creation
Separate Rx hash queue creation into both Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
87e2db37ef net/mlx5: separate Rx indirection table object creation
Separate Rx indirection table object creation into both Verbs and DevX
modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
fa2c85cc9c net/mlx5: share Rx queue object modification
Use new modify_wq functions for Rx object creation in DevX and Verbs
modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
c279f187ee net/mlx5: separate Rx queue object modification
Separate Rx object modification to the Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
675911d033 net/mlx5: rearrange creation of WQ and CQ object
Rearrangement of WQ and CQ creation for Verbs Rx queue:
1. Rename the allocation function.
2. Reduce the number of arguments that the creation functions receive.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
1260a87b28 net/mlx5: share Rx control code
Move Rx object similar resources allocations and debug logs from DevX
and Verbs modules to a shared location.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
322870799e net/mlx5: separate Rx interrupt handling
Separate interrupt event handler into both Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Michael Baum
6deb19e1b2 net/mlx5: separate Rx queue object creations
As an arrangement to Windows OS support, the Verbs operations should be
separated to another file.
By this way, the build can easily cut the unsupported Verbs APIs from
the compilation process.

Define operation structure and DevX module in addition to the existing
linux Verbs module.
Separate Rx object creation into the Verbs/DevX modules and update the
operation structure according to the OS support and the user
configuration.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-09-18 18:55:08 +02:00
Ophir Munk
7af10d29a4 net/mlx5/linux: refactor VLAN
File mlx5_vlan.c contains Netlink APIs (Linux dependent) as part of VM
workaround implementation. Move this implementation to file
linux/mlx5_vlan_os.c.  To remove Netlink dependency in header files
change pointer of type 'struct mlx5_nl_vlan_vmwa_context *' to 'void *'.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-09-18 18:55:06 +02:00
Ophir Munk
8bb2410ea3 net/mlx5: separate VLAN strip modification
When updating a queue vlan stripping offload - either the WQ is modified
in Verbs or the RQ is modified in DevX.  Add a vlan stripping modify
callback to 'struct mlx5_obj_ops' and assign it with the specific Verbs
and DevX implementations: 'rxq_obj_modify_wq_vlan_strip' and
'rxq_obj_modify_rq_vlan_strip' respectively.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-09-18 18:55:06 +02:00
Ophir Munk
f00f6562e1 net/mlx5: remove netlink dependency in shared code
This commit adds Linux implementation of routine mlx5_os_mac_addr_flush
as wrapper to Netlink API to avoid direct calls under non-Linux
operating systems.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-09-18 18:55:06 +02:00
Ophir Munk
e9c0b96e35 net/mlx5: move Linux ifname function
mlx5_get_ifname() prototype includes 'IF_NAMESIZE' definition from Linux
file net/if.h. Since this API is only used under Linux and to enable
compilation under non-Linux OS - move this prototype from shared file
mlx5.h to file linux/mlx5_os.h.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-09-18 18:55:06 +02:00
Suanming Mou
3fe889617b net/mlx5: manage modify actions with hashed list
To manage header modify actions mlx5 PMD used the single linked list and
lookup and insertion operations took too long times if there were
millions of objects and this impacted the flow insertion/deletion rate.

In order to optimize the performance the hashed list is engaged. The
list implementation is updated to support non-unique keys with few
collisions.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-09-18 18:55:06 +02:00
Viacheslav Ovsiienko
972a1bf812 common/mlx5: fix user mode register access command
To detect the timestamp mode configured on the NIC the mlx5
PMD uses the firmware command ACCESS_REGISTER_USER. This
command is relatively new and might be not supported by
older firmware versions and was rejected, causing annoying
messages in kernel log.

This patch adds the attribute flag check whether firmware
supports the command and avoid the call if it does not.

Fixes: bb7ef9a962 ("common/mlx5: add register access DevX routine")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-07-30 00:41:24 +02:00
Michael Baum
d462a83c65 net/mlx5: optimize stack memory in probe
The device configuration struct is not small enough to be used as
function argument by value.

Call spawn function with device configuration by reference.

Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-30 00:41:23 +02:00
Michael Baum
63d1db710f net/mlx5: fix unlimited parsing of switch info
In mlx5_sysfs_switch_info function, the driver gets switch information
associated with network interface.

The driver writes the port name into buffer and translates it.
However, when it writes the name, it does not limit writing to the
buffer size.

Limit writing to the size of the buffer.

Fixes: 1256805dd5 ("net/mlx5: move Linux-specific functions")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-30 00:41:23 +02:00
Parav Pandit
f6d099d7da common/mlx5: remove class check from class drivers
Now that mlx5_pci PMD checks for enabled classes and performs
probe(), remove() of associated classes, individual class driver
does not need to check if other driver is enabled.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-28 19:01:30 +02:00
Parav Pandit
392bf9084d common/mlx5: register class drivers through common layer
Migrate mlx5 net, vdpa and regex PMD to start using mlx5 common class
driver.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-28 19:01:30 +02:00
Shy Shyman
038e7fc085 net/mlx5: fix HW counters path in switchdev mode
When debugging performance of a DPDK application the user may
need to view the different statistics of DPDK (for example out_of_buffer)
This can be enabled by using testpmd command 'show port xstats
<port_id>' for example.

The current implementation assumes legacy mode in which the counters
are at <ibdev_path>/<port_id>/hw_counters/<file_name>.
In switchdev mode the counters file is located right after the device
name, hence resides at <ibdev_path>/hw_counters.

The fix tries to open the path in the second location after a failure
to open the file from the first location.

Fixes: 9c0a9eed37 ("net/mlx5: switch to the names in the shared IB context")
Cc: stable@dpdk.org

Signed-off-by: Shy Shyman <shys@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-07-21 15:46:30 +02:00
Viacheslav Ovsiienko
161d103b23 net/mlx5: add queue start and stop
The mlx5 PMD did not support queue_start and queue_stop eth_dev API
routines, queue could not be suspended and resumed during device
operation.

There is the use case when this feature is crucial for applications:

- there is the secondary process handling the queue
- secondary process crashed/aborted
- some mbufs were allocated or used by secondary application
- some mbufs were allocated by Rx queues to receive packets
- some mbufs were placed to send queue
- queue goes to undefined state

In this case there was no reliable way to recovery queue handling
by restarted secondary process but reset queue to initial state
freeing all involved resources, including buffers involved in queue
operations, reset the mbuf pools, and then reinitialize queue
to working state:

- reset mbuf pool, allocate all mbuf to initialize pool into
  safe state after the crush and allow safe mbuf free calls
- stop queue, free all potentially involved mbufs
- reset mbuf pool again
- start queue, reallocate mbufs needed

This patch introduces the queue start/stop feature with some
limitations:

- hairpin queues are not supported
- it is application responsibility to synchronize start/stop
  with datapath routines, rx/tx_burst must be suspended during
  the queue_start/queue_stop calls
- it is application responsibility to track queue usage and
  provide coordinated queue_start/queue_stop calls from
  secondary and primary processes.
- Rx queues with vectorized Rx routine and engaged CQE
  compression are not supported by this patch currently

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-07-21 15:46:30 +02:00
Dekel Peled
08d1838f64 net/mlx5: implement CQ for Rx using DevX API
This patch continues the work to use DevX API for different objects
creation and management.
On Rx control path, the RQ, RQT, and TIR objects can already be
created using DevX API.
This patch adds the support to create CQ for RxQ using DevX API.
The corresponding event channel is also created and utilized using
DevX API.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
9d60f54569 common/mlx5: remove inclusion of Verbs header files
Several source files include Verbs header files as in (1). These source
files will not compile under non-Linux operating systems. This commit
removes this inclusion in two cases:

Case 1: There is no usage of ibv_* or mlx5dv_* symbols in the source
file so the inclusion in (1) can be safely removed.

Case 2: Verbs symbols are used. Please note the inclusion in (1) already
appears in file linux/mlx5_glue.h (which represents the interface
to the rdma-core library). Therefore, replace (1) in the source file
with (2).  Under non-Linux operating systems - file mlx5_glue.h will not
include (1).

(1)
 #include <infiniband/verbs.h>
 #include <infiniband/mlx5dv.h>

(2)
 #include <mlx5_glue.h>

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
2e86c4e5c7 net/mlx5: refactor multi-process communication
1. The shared data communication between the primary and the secondary
processes is implemented using Linux API. Move the Linux API code under
linux directory (file linux/mlx5_os.c).

2. File net/mlx5/mlx5_mp.c handles requests to the primary and secondary
processes (e.g. start_rxtx, stop_rxtx). It is Linux based so it is moved
under linux (new file linux/mlx5_mp_os.c).

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
ef9ee13f6e net/mlx5: cleanup header file
The cleanup refers to header file mlx5.h.
1. Remove unused prototypes.
2. Move prototypes under their correct title.
3. Change functions to static and remove their prototye from the header
file.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
98c4b12afa net/mlx5: eliminate dependency on Linux in shared header
This commit eliminates Linux dependencies in shared file mlx5.h.

1. All functions using 'struct ifreq' are moved to file
linux/mlx5_ethdev_os.c such that this struct can be removed from mlx5.h.
2. Function mlx5_set_flags() that uses Linux flags (e.g. IFF_UP) is
changed to static and its prototype is removed from mlx5.h.
3. Remove redundant member verbs_action from 'struct mlx5_priv'.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
4d18abd130 net/mlx5: wrap Linux promiscuous and multicast functions
This commit adds Linux implementation of routines mlx5_os_set_promisc()
and mlx5_os_set_promisc(). The routines call netlink APIs.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
ab27cdd93a net/mlx5: refactor Linux MAC operations
Move OS specific MAC operations add, remove, modify VF into file
linux/mlx5_os.c.
Remove unused function mlx5_get_mac().

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Ophir Munk
2aba9fc725 net/mlx5: replace Linux specific calls
The following Linux calls are replaced by their matching rte APIs.

mmap ==> rte_mem_map()
munmap == >rte_mem_unmap()
sysconf(_SC_PAGESIZE) ==> rte_mem_page_size()

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Suanming Mou
50f95b23c9 net/mlx5: add option to configure FCS or decapsulation
There are some limitations on some NICs (at least on ConnectX-6 Dx
and BlueField 2) with supporting FCS (frame checksum) scattering for
the tunnel decapsulated packets.

For the case only one of the features can be supported in the same time,
and the new devarg "decap_en" is introduced to provide the choice to the
users.

If FCS scattering feature is not supposed to be engaged by application,
this new devarg should be specified as "decap_en=0", forcing the FCS
feature enable and rejecting tunnel decap actions in the rte_flow engine.
If FCS scatter is not needed and application supposes to use tunnel
decapsulation in rte_flow, the devarg can be omitted or set to non-zero
value (this is default settings).

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-07-21 15:46:30 +02:00
Suanming Mou
2175c4dc62 net/mlx5: convert configuration objects to unified malloc
This commit allocates the miscellaneous configuration objects from the
unified malloc function.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:46:30 +02:00
Suanming Mou
5522da6b20 net/mlx5: add option to allocate memory from system
Currently, for MLX5 PMD, once millions of flows created, the memory
consumption of the flows are also very huge. For the system with limited
memory, it means the system need to reserve most of the memory as huge
page memory to serve the flows in advance. And other normal applications
will have no chance to use this reserved memory any more. While most of
the time, the system will not have lots of flows, the  reserved huge
page memory becomes a bit waste of memory at most of the time.

By the new sys_mem_en devarg, once set it to be true, it allows the PMD
allocate the memory from system by default with the new add mlx5 memory
management functions. Only once the MLX5_MEM_RTE flag is set, the memory
will be allocate from rte, otherwise, it allocates memory from system.

So in this case, the system with limited memory no need to reserve most
of the memory for hugepage. Only some needed memory for datapath objects
will be enough to allocated with explicitly flag. Other memory will be
allocated from system. For system with enough memory, no need to care
about the devarg, the memory will always be from rte hugepage.

One restriction is that for DPDK application with multiple PCI devices,
if the sys_mem_en devargs are different between the devices, the
sys_mem_en only gets the value from the first device devargs, and print
out a message to warn that.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:44:36 +02:00
Viacheslav Ovsiienko
a2854c4de1 net/mlx5: convert Rx timestamps in real-time format
The ConnectX-6DX supports the timestamps in various formats,
the new realtime format is introduced - the upper 32-bit word
of timestamp contains the UTC seconds and the lower 32-bit word
contains the nanoseconds. This patch detects what format is
configured in the NIC and performs the conversion accordingly.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:44:36 +02:00
Viacheslav Ovsiienko
b94d93ca73 net/mlx5: support reading device clock
If send schedule feature is engaged there is the Clock Queue
created, that reports reliable the current device clock counter
value. The device clock counter can be read directly from the
Clock Queue CQE.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:44:36 +02:00
Viacheslav Ovsiienko
d133f4cdb7 net/mlx5: create clock queue for packet pacing
This patch creates the special completion queue providing
reference completions to schedule packet send from
other transmitting queues.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:44:36 +02:00
Viacheslav Ovsiienko
24feb04596 net/mlx5: fix UAR lock sharing for multiport devices
The master and representors might be created over the multiport
Infiniband devices and the UAR resource allocated for sibling
ports might belong to the same underlying Infiniband device.
Hardware requires the write access to the UAR must be performed
as atomic 64-bit write, on 32-bit systems this is two sequential
writes, protected by lock. Due to possibility to share the same
UAR between sibling devices the locks must be moved to shared
context.

Fixes: f048f3d479 ("net/mlx5: switch to the shared IB device context")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:44:36 +02:00
Viacheslav Ovsiienko
8f848f32fc net/mlx5: introduce send scheduling devargs
This patch introduces the new devargs:

tx_pp - enables accurate packet send scheduling on mbuf timestamps
  in the PMD. On the device start if "rte_dynflag_timestamp"
  dynamic flag is registered and this devarg non-zero value is
  specified, the driver initializes all necessary internal
  infrastructure to provide packet scheduling. The parameter
  value specifies scheduling granularity in nanoseconds.

tx_skew - the parameter adjusts the send packet scheduling on
  timestamps and represents the average delay between beginning
  of the transmitting descriptor processing by the hardware and
  appearance of actual packet data on the wire. The value should
  be provided in nanoseconds and is valid only if tx_pp parameter
  is specified. The default value is zero.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-21 15:44:36 +02:00
Dekel Peled
b293fbf967 net/mlx5: add OS specific flow actions operations
This patch introduces the OS specific functions, for flow actions
create and destroy operations.

In existing implementation, the functions to create flow actions
return a pointer to the created action object.

The new OS specific functions to create flow actions return 0 on
success, and (-1) on failure.
On success, a pointer to the created action object is returned
using an additional parameter.
On failure errno is set.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-07 23:38:26 +02:00
Dekel Peled
e57b858710 net/mlx5: add OS specific flow create and destroy
This patch introduces the OS specific functions, for flow create
and flow destroy operations.

In existing implementation, the functions to create objects
(flow/table/matcher) return a pointer to the created object.
The functions to destroy objects return 0 on success and errno on
failure.

The new OS specific functions to create objects return 0 on success,
and (-1) on failure.
On success, a pointer to the created object is returned using an
additional parameter.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-07 23:38:26 +02:00
Dekel Peled
e4ed8de39b net/mlx5: add OS specific flow type selection
In current implementation the flow type (DV/Verbs) is selected
using dedicated function flow_get_drv_type().

This patch adds OS specific function mlx5_flow_os_get_type(), to
allow OS specific flow type selection.
The new function is called by flow_get_drv_type(), and if it returns a
valid value (DV/Verbs) no more logic is required.
Otherwise the existing logic is executed.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-07 23:38:26 +02:00
Dekel Peled
17ad3af9f4 net/mlx5: add OS specific flow related utilities
This patch introduces the first OS specific utility functions,
for use by flow engine in different OS implementation.

The first utility functions are:
bool mlx5_flow_os_item_supported(item)
bool mlx5_flow_os_action_supported(action)

They are implemented to check OS specific support for different
item types and action types.

New header file is added:
drivers/net/mlx5/linux/mlx5_flow_os.h

This file contains the utility functions mentioned above for Linux OS.
At this stage they are implemented as static inline, for efficiency,
and always return true.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-07 23:38:26 +02:00
Michael Baum
36dabcea78 net/mlx5: use anonymous Direct Verbs allocator argument
The mlx5_dev_spawn function defines an struct mlx5dv_ctx_allocators type
variable several hundred rows after it starts, with the only use it
being passed as a parameter to the mlx5_glue->dv_set_context_attr
function.
However, according to DPDK Coding Style Guidelines, variables should be
declared at the start of a block of code rather than in the middle.
Therefore, to improve the Coding Style, the variable is passed directly
to the function without declaring it before.

Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:30 +02:00
Viacheslav Ovsiienko
420bbdae89 net/mlx5: fix host physical function representor naming
The new kernel adds the names like "pf0" for Host PCI physical
function representor on Bluefield SmartNIC hosts. This patch
provides correct HPF representor recognition over the kernel
versions 5.7 and laters.

The following port naming formats are supported:

  - missing physical port name (no sysfs/netlink key) at all,
    master is assumed

  - decimal digits (for example "12"), representor is
    assumed, the value is the index of attached VF

  - "p" followed by decimal digits, for example "p2", master
    is assumed

  - "pf" followed by PF index, for example "pf0", Host PF
     representor is assumed on SmartNIC systems.

  - "pf" followed by PF index concatenated with "vf" followed by
     VF index, for example "pf0vf1", representor is assumed.
     If index of VF is "-1" it is a special case of Host PF
     representor, this representor must be indexed in devargs
     as 65535, for example representor=[0-3,65535] will
     allow representors for VF0, VF1, VF2, VF3 and for host PF.

Fixes: 79aa430721 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:30 +02:00
Ophir Munk
391b8bcc81 common/mlx5: move some getter functions from net driver
Getter functions such as: 'mlx5_os_get_ctx_device_name',
'mlx5_os_get_ctx_device_path', 'mlx5_os_get_dev_device_name',
'mlx5_os_get_umem_id' are implemented under net directory. To enable
additional devices (e.g. regex, vdpa) to access these getter functions
they are moved under common directory.

As part of this commit string sizes DEV_SYSFS_NAME_MAX and
DEV_SYSFS_PATH_MAX are increased by 1 to make sure that the destination
string size in strncpy() function is bigger than the source string size.
This update will avoid GCC version 8 error -Werror=stringop-truncation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:30 +02:00
David Marchand
63783b0172 net/mlx5: remove redundant newline from logs
The DRV_LOG macro already appends a newline.

Fixes: 46287eacc1 ("net/mlx5: introduce hash list")
Fixes: 860897d289 ("net/mlx5: reorganize flow tables with hash list")
Fixes: e484e40323 ("net/mlx5: optimize tag traversal with hash list")
Fixes: 6801116688 ("net/mlx5: fix multiple flow table hash list")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
aec086c9f1 common/mlx5: share kernel interface name getter
Some configuration of the mlx5 port are done by the kernel net device
associated to the IB device represents the PCI device.

The DPDK mlx5 driver uses Linux system calls, for example ioctl, in
order to configure per port configurations requested by the DPDK user.

One of the basic knowledges required to access the correct kernel net
device is its name.

Move function to get interface name from IB device path to the common
library.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Ophir Munk
4f96d91396 net/mlx5/linux: add memory region callbacks to Verbs
Create a set of verbs callbacks in 'struct mlx5_verbs_ops'
and add MR operations to it (file net/mlx5/linux/mlx5_verbs.c).

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-17 16:32:01 +02:00
Ophir Munk
d5ed8aa944 net/mlx5: add memory region callbacks in per-device cache
Prior to this commit MR operations were verbs based and hard coded under
common/mlx5/linux directory. This commit enables upper layers (e.g.
net/mlx5) to determine which MR operations to use. For example the net
layer could set devx based MR operations in non-Linux environments. The
reg_mr and dereg_mr callbacks are added to the global per-device MR
cache 'struct mlx5_mr_share_cache'.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-17 16:32:01 +02:00
Ophir Munk
73bf9235e9 net/mlx5: refactor statistics
mlx5 statistics are calculated by several methods:
1. In software when packets go through datapath.
2. Calling ioctl with ETHTOOL command (Linux specific).
3. Reading counters from SYSFS device path (Linux specific).

The Linux related functions are moved to file linux/mlx5_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
042f5c94fd net/mlx5: refactor device operations for Linux
There are three types of eth_dev_ops: primary, secondary and isolate.
Their function calls assignments are moved from common file
mlx5.c to the Linux specific file linux/mlx5_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
1256805dd5 net/mlx5: move Linux-specific functions
File mlx5_ethdev.c is partially moved to linux/mlx5_ethdev_os.c for
functions which are Linux specific. Functions which are Linux agnostics
remain in mlx5_ethdev.c file.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
f484ffa1b1 net/mlx5: move socket files in Linux directory
mlx5_socket.c file is using APIs which are Linux specifics.  Therefore
move it (including mlx5_socket.h) from net/mlx5 directory to
net/mlx5/linux directory. This commit also updates the Makefile and
the meson files.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
9138989036 net/mlx5: rename ib in names
Renames in this commit:
mlx5_ibv_list -> mlx5_dev_ctx_list
mlx5_alloc_shared_ibctx -> mlx5_alloc_shared_dev_ctx
mlx5_free_shared_ibctx -> mlx5_free_shared_dev_ctx
mlx5_ibv_shared_port -> mlx5_dev_shared_port
ibv_port -> dev_port

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
21b7c452a6 net/mlx5: remove completion object dependency on DV
Replace 'struct mlx5dv_devx_cmd_comp *' with 'void *' in 'struct
mlx5_dev_ctx_shared'.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
834a9019ec net/mlx5: remove Verbs dependency in spawn struct
1. Replace 'struct ibv_device *' with 'void *' in 'struct
mlx5_dev_spawn_data'. Define a getter function to retrieve the
device name.
2. Rename ibv_dev and ibv_port as phys_dev and phys_port
respectively.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
10f3581dfd net/mlx5: add Linux-specific header file
File drivers/net/linux/mlx5_os.h is added. It includes specific
Linux definitions such as PCI driver flags, link state changes
interrupts, link removal interrupts, etc.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
2eb4d0107a net/mlx5: refactor PCI probing on Linux
Refactor PCI probing related code. Move Linux specific functions (as
well as verbs and dv related code) from mlx5.c file to linux/mlx5_os.c
file.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
c7f6ba0e53 net/mlx5: remove umem field dependency on Direct Verbs
umem field is used in several structs. Its type 'struct mlx5dv_devx_umem
*' is changed to 'void *'. This change will allow non-Linux OS
compilations.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
e85f623e13 net/mlx5: remove attributes dependency on Verbs
Define 'struct mlx5_dev_attr' which is ibv and dv independent. It
contains attribute that were originally contained in 'struct
ibv_device_attr_ex' and 'struct mlx5dv_context dv_attr'. Add a new API
mlx5_os_get_dev_attr() which fills in the new defined struct.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00
Ophir Munk
f44b09f9e3 net/mlx5: add Linux-specific file with getter functions
'ctx' type (field in 'struct mlx5_ctx_shared') is changed from 'struct
ibv_context *' to 'void *'.  'ctx' members which are verbs dependent
(e.g. device_name) will be accessed through getter functions which are
added to a new file under Linux directory: linux/mlx5_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-16 19:21:07 +02:00