Commit Graph

1099 Commits

Author SHA1 Message Date
Viacheslav Ovsiienko
fa2e14d492 net/mlx5: cache associated network device index
The associated device index is retrieved via a Netlink request to
the underlying Infiniband device driver. This network device index
is permanent throughout the lifetime of the device. We do not
spawn rte_eth_dev ports without an associated network device, and
if the network device is being unbound we get the remove notification
message and the rte_eth_dev port is also detached. So, we may store
the ifindex in the mlx5_device_spawn() routine at rte_eth_dev port
creation and initialization time and use the cached value afterwards
instead of doing an actual Netlink request.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
cb9cb61e54 net/mlx5: report max number of mbuf segments
This patch fills the tx_desc_lim.nb_seg_max and
tx_desc_lim.nb_mtu_seg_max fields of the rte_eth_dev_info
structure to report the maximal number of packet
segments. The requested inline data configuration is
taken into account in a conservative way.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
18a1c20044 net/mlx5: implement Tx burst template
This patch adds the implementation of tx_burst routine template.
The template supports all Tx offloads and multiple optimized
tx_burst routines can be generated by compiler from this one.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
eb8121ab9d net/mlx5: introduce Tx burst routine template
Mellanox NICs support a wide set of Tx offloads. The supported
offloads are reported by the mlx5 PMD in the rte_eth_dev_info
tx_offload_capa field.
An application may choose any combination of supported offloads
and configure the device appropriately. Some Tx offloads may not
be requested by the application, or even all of them may be omitted.
Most Tx offloads require dedicated code branches in the tx_burst
routine. If a Tx offload is not requested, the tx_burst routine
code may be significantly simplified and consume fewer CPU cycles.

For example, if the application does not engage the TSO offload, this
code can be omitted; if multi-segment packets are not expected, tx_burst
may assume single-mbuf packets only, etc.

Currently, the mlx5 PMD implements multiple tx_burst subroutines
for the most common combinations of requested Tx offloads, and each
has its own dedicated implementation. Such code is not easy to update,
support and develop, because every change has to be applied at
multiple points. Also, many frequently requested offload combinations
are not supported yet. That leads to selecting a tx_burst routine that
does not match completely, which harms performance.

This patch introduces a new approach for the tx_burst code. It is proposed
to develop a unified template for the tx_burst routine, which supports
all the Tx offloads and takes a compile-time parameter
describing the set of supported offloads. Based
on this template, the compiler is able to generate multiple tx_burst
routines highly optimized for the statically specified set of
Tx offloads.
Next, at runtime, at Tx queue configuration, the best matching optimized
implementation of tx_burst is chosen.

This patch intentionally omits the internal implementation of the template
and just introduces the template itself to illustrate the approach of
multiple specially tuned tx_burst routines.
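
A minimal sketch of the template idea, not the actual mlx5 code. The
offload bits, struct pkt and all function names here are hypothetical
and exist only for illustration. The offload mask is a compile-time
constant in every generated routine, so the compiler can drop the
branches that are not needed:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload bits, for illustration only. */
#define TXOFF_MULTI (1u << 0) /* multi-segment packets */
#define TXOFF_TSO   (1u << 1) /* TCP segmentation offload */
#define TXOFF_CSUM  (1u << 2) /* checksum offload */

/* Hypothetical, heavily reduced packet descriptor. */
struct pkt {
	uint16_t nb_segs;
	uint16_t tso_segsz;
};

/*
 * Unified template. 'olx' is constant in each caller below, so branches
 * for offloads that are not enabled can be removed by the compiler.
 * The return value counts abstract "work units" instead of doing real Tx.
 */
static inline unsigned int
tx_burst_tmpl(const struct pkt *pkts, unsigned int n, const unsigned int olx)
{
	unsigned int work = 0;

	for (unsigned int i = 0; i < n; i++) {
		work += 1; /* basic single-segment send */
		if ((olx & TXOFF_MULTI) && pkts[i].nb_segs > 1)
			work += pkts[i].nb_segs - 1u;
		if ((olx & TXOFF_TSO) && pkts[i].tso_segsz != 0)
			work += 1;
	}
	return work;
}

/* Generate one specialized routine per statically chosen offload set. */
#define TX_BURST_DECL(name, olx) \
static unsigned int name(const struct pkt *p, unsigned int n) \
{ return tx_burst_tmpl(p, n, (olx)); }

TX_BURST_DECL(tx_burst_none, 0)
TX_BURST_DECL(tx_burst_multi, TXOFF_MULTI)
TX_BURST_DECL(tx_burst_full, TXOFF_MULTI | TXOFF_TSO | TXOFF_CSUM)

typedef unsigned int (*tx_burst_t)(const struct pkt *, unsigned int);

/* Ordered from the smallest offload set to the largest. */
static const struct {
	unsigned int olx;
	tx_burst_t fn;
} tx_tab[] = {
	{ 0, tx_burst_none },
	{ TXOFF_MULTI, tx_burst_multi },
	{ TXOFF_MULTI | TXOFF_TSO | TXOFF_CSUM, tx_burst_full },
};

/* Runtime step: pick the first routine covering the requested offloads. */
static tx_burst_t
tx_burst_select(unsigned int requested)
{
	for (size_t i = 0; i < sizeof(tx_tab) / sizeof(tx_tab[0]); i++)
		if ((tx_tab[i].olx & requested) == requested)
			return tx_tab[i].fn;
	return NULL;
}
```

In this sketch a queue configured with no offloads gets tx_burst_none,
while a queue requesting only TSO falls back to tx_burst_full because
no tighter match exists in the table.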

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
38b4b397a5 net/mlx5: add Tx configuration and setup
This patch updates the Tx datapath control and configuration
structures and code for managing Tx datapath settings.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
ed4470c70f net/mlx5: extend NIC attributes query via DevX
This patch extends the NIC attributes query via DevX.
The appropriate interface structures are borrowed from
kernel driver headers and DevX calls are added to
mlx5_devx_cmd_query_hca_attr() routine.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
50724e1bba net/mlx5: update Tx definitions
This patch updates Tx datapath definitions, mostly hardware related.
The Tx descriptor structures are redefined with the required fields, and
size definitions are renamed to reflect their meanings more
appropriately. This is a preparation step before introducing
the new Tx datapath implementation.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
505f1fe426 net/mlx5: add Tx devargs
This patch introduces new mlx5 PMD devarg options:

- txq_inline_min - specifies minimal amount of data to be inlined into
  WQE during Tx operations. NICs may require this minimal data amount
  to operate correctly. The exact value may depend on NIC operation
  mode, requested offloads, etc.

- txq_inline_max - specifies the maximal packet length to be completely
  inlined into WQE Ethernet Segment for ordinary SEND method. If the packet
  is larger than the specified value, the packet data won't be copied by the
  driver at all, and the data buffer is addressed with a pointer. If the
  packet length is less than or equal to it, all packet data will be copied
  into WQE.

- txq_inline_mpw - specifies the maximal packet length to be completely
  inlined into WQE for Enhanced MPW method.
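
The txq_inline_max rule above can be sketched as follows. This is
only an illustration of the described threshold; the enum and the
function name are hypothetical and are not taken from the driver:

```c
#include <stdint.h>

/* Hypothetical result of the inline decision for ordinary SEND. */
enum tx_data_mode {
	TX_DATA_INLINE,  /* whole packet is copied into the WQE */
	TX_DATA_POINTER, /* nothing is copied, buffer is addressed by pointer */
};

/* Apply the txq_inline_max threshold to a packet length. */
static enum tx_data_mode
txq_data_mode(uint32_t pkt_len, uint32_t txq_inline_max)
{
	return pkt_len <= txq_inline_max ? TX_DATA_INLINE : TX_DATA_POINTER;
}
```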

Driver documentation is also updated.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
a6bd4911ad net/mlx5: remove Tx implementation
This patch removes the existing Tx datapath code
as a preparation step before introducing the new
implementation. The following entities are being
removed:

- deprecated devargs support
- tx_burst() routines
- related PRM definitions
- SQ configuration code
- Tx routine selection code
- incompatible Tx completion code

The following devargs are deprecated and either ignored or converted:
- "txq_inline" is going to be converted to "txq_inline_max"
  for compatibility
- "tx_vec_en"
- "txqs_max_vec"
- "txq_mpw_hdr_dseg_en"
- "txq_max_inline_len" is going to be converted
  to "txq_inline_mpw" for compatibility

The deprecated devarg keys are recognized by PMD
and ignored/converted to the new ones in order not
to block device probing.
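
A minimal sketch of such a key translation, assuming a simple lookup
table. The table layout and devarg_translate() are hypothetical; only
the key names come from the list above:

```c
#include <stddef.h>
#include <string.h>

/* Deprecated key and its replacement; NULL means "recognized but ignored". */
struct devarg_compat {
	const char *old_key;
	const char *new_key;
};

static const struct devarg_compat compat_tab[] = {
	{ "txq_inline", "txq_inline_max" },
	{ "txq_max_inline_len", "txq_inline_mpw" },
	{ "tx_vec_en", NULL },
	{ "txqs_max_vec", NULL },
	{ "txq_mpw_hdr_dseg_en", NULL },
};

/*
 * Store the key to process in *out and return 1, or return 0 when the
 * key is deprecated and must be silently ignored. Keys that are not in
 * the table are passed through unchanged.
 */
static int
devarg_translate(const char *key, const char **out)
{
	for (size_t i = 0; i < sizeof(compat_tab) / sizeof(compat_tab[0]); i++) {
		if (strcmp(key, compat_tab[i].old_key) == 0) {
			*out = compat_tab[i].new_key;
			return *out != NULL;
		}
	}
	*out = key;
	return 1;
}
```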

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
42280dd91b net/mlx5: fix typos in comments
Some spelling mistakes were found in comments.
This patch fixes them.

Fixes: d10b09db0a ("net/mlx5: fix allocation when no memory on device NUMA node")
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Fixes: f6d9ab4e76 ("net/mlx5: check Tx queue size overflow")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
20367e45f2 net/mlx5: fix flow item flags bitmap
In functions flow_dv_translate() and flow_dv_validate(), the flow
items are scanned and each item is marked in item_flags bitmap.
The code handling some of the items was ported from another project,
where items are marked in a slightly different manner.

This patch fixes the setting of items in bitmap, adapting it to the
required manner.

Fixes: d53aa89aea ("net/mlx5: support matching on ICMP/ICMP6")
Fixes: 5865955ad994 ("net/mlx5: match GRE key and present bits")
Fixes: 2e4c987aad ("net/mlx5: validate Direct Rule E-Switch")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
31538ef62c net/mlx5: allow basic counter management fallback
If the asynchronous DevX commands are not supported by RDMA core,
fall back to basic counter management.

In this mode the PMD counter cache is redundant and the host thread doesn't
update it. Hence, each counter operation goes to the FW and the
acceleration is reduced.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Matan Azrad
f15db67df0 net/mlx5: accelerate DV flow counter query
All the DV counters are cached in the PMD memory and are contained in
pools, which are contained in containers according to the counter
allocation type - batch or single.

Currently, the flow counter query is done synchronously in pool
resolution, which means that on a user request a FW command is triggered
to read all the counters in the pool.

A new DevX feature that asynchronously reads a batch of flow counters
makes it possible to accelerate the user query operation.

Using the DPDK host thread, the PMD periodically triggers asynchronous
query in pool resolution for all the counter pools and an interrupt is
triggered by the FW when the values are updated.
In the interrupt handler the pool counter values raw data is replaced
using a double buffer algorithm (very fast).
In the user query, the PMD just returns the last query values from the
PMD cache - no system-calls and FW commands are triggered from the user
control thread on query operation!

More synchronization is added with the host thread:
        Container resize uses double buffer algorithm.
        Pools growing in container uses atomic operation.
        Pool query buffer replace uses a spinlock.
        Pool minimum devX counter ID uses atomic operation.
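
A minimal sketch of the double buffer idea for the pool raw data,
using C11 atomics. The structure and function names are hypothetical.
The sketch assumes a single writer and omits the spinlock mentioned
above, so it only shows how a reader keeps seeing a complete buffer
while the other one is being filled:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define POOL_CNT 4 /* counters per pool, kept small for the example */

/* Hypothetical pool with two raw data buffers. */
struct query_pool {
	uint64_t raw[2][POOL_CNT];
	atomic_uint active; /* index of the buffer readers use */
};

static void
pool_init(struct query_pool *p)
{
	memset(p->raw, 0, sizeof(p->raw));
	atomic_init(&p->active, 0);
}

/* Writer side: fill the idle buffer, then publish it with one store. */
static void
pool_update(struct query_pool *p, const uint64_t *fresh)
{
	unsigned int idle =
		1u - atomic_load_explicit(&p->active, memory_order_relaxed);

	for (unsigned int i = 0; i < POOL_CNT; i++)
		p->raw[idle][i] = fresh[i];
	atomic_store_explicit(&p->active, idle, memory_order_release);
}

/* Reader side: return the last published value, no FW command needed. */
static uint64_t
pool_query(struct query_pool *p, unsigned int idx)
{
	unsigned int cur =
		atomic_load_explicit(&p->active, memory_order_acquire);

	return p->raw[cur][idx];
}
```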

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Matan Azrad
ebbac312e4 net/mlx5: resize a full counter container
When the counter container has no more space to store more counter
pools, try to resize the container to allow more pools to be created.

So, the only limit on the maximum number of counters is the memory.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Matan Azrad
5382d28c21 net/mlx5: accelerate DV flow counter transactions
The DevX interface exposes a new feature to the PMD that can allocate a
batch of counters by one FW command. It can improve the flow
transaction rate (with count action).

Add a new counter pools mechanism to manage HW counters in the PMD.
So, for each flow created with a counter, the PMD will try to find a free
counter in the PMD pools container, and only if there is no free
counter will it allocate a new batch of DevX counters.

Currently we cannot support batch counters for a group 0 flow, so
create two container types, one which allocates counters one by
one and one which allocates X counters by the batch feature.

The allocated counter objects are never released back to the HW,
assuming the maximum number of flows will be close to the actual
number of flows.
Later, this can be updated and a dynamic release mechanism can be added.

The counters are contained in pools, each pool with 512 counters.
The pools are contained in counter containers according to the
allocation resolution type - single or batch.
The cache memory of the counters statistics is saved as raw data per
pool.
All the raw data memory is allocated for all the container in one
memory allocation and is managed by counter_stats_mem_mng structure
which registers all the raw memory to the HW.
Each pool points to one raw data structure.

The query operation is in pool resolution which updates all the pool
counter raw data by one operation.
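
A minimal sketch of the pool lookup described above, with 512 counters
per pool. All names are hypothetical, calloc() stands in for the DevX
batch allocation, and the raw statistics memory is left out:

```c
#include <stdint.h>
#include <stdlib.h>

#define COUNTERS_PER_POOL 512

/* Hypothetical pool: a stack of free counter offsets. */
struct cnt_pool {
	struct cnt_pool *next;
	uint32_t base_id; /* ID of the first counter in this pool */
	uint16_t nb_free;
	uint16_t free_list[COUNTERS_PER_POOL];
};

struct cnt_container {
	struct cnt_pool *pools;
	uint32_t nb_pools;
};

/* Stand-in for allocating a new batch of counters by one FW command. */
static struct cnt_pool *
pool_create(struct cnt_container *c)
{
	struct cnt_pool *p = calloc(1, sizeof(*p));

	if (p == NULL)
		return NULL;
	p->base_id = c->nb_pools * COUNTERS_PER_POOL;
	for (unsigned int i = 0; i < COUNTERS_PER_POOL; i++)
		p->free_list[i] = (uint16_t)(COUNTERS_PER_POOL - 1 - i);
	p->nb_free = COUNTERS_PER_POOL;
	p->next = c->pools;
	c->pools = p;
	c->nb_pools++;
	return p;
}

/* Return a counter ID, or -1 if a new pool cannot be allocated. */
static int64_t
counter_alloc(struct cnt_container *c)
{
	struct cnt_pool *p;

	for (p = c->pools; p != NULL; p = p->next)
		if (p->nb_free != 0)
			break;
	if (p == NULL) {
		/* No free counter in any pool: allocate a new batch. */
		p = pool_create(c);
		if (p == NULL)
			return -1;
	}
	return (int64_t)p->base_id + p->free_list[--p->nb_free];
}
```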

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Xiaoyu Min
5e33bebdd8 net/mlx5: support IP-in-IP tunnel
Enabled IP-in-IP tunnel type support on DV/DR flow engine.
This includes the following combination:
 - IPv4 over IPv4
 - IPv4 over IPv6
 - IPv6 over IPv4
 - IPv6 over IPv6

MLX5 NICs support IP-in-IP tunnels via the FLEX Parser, so
make sure the FW is using FLEX Parser profile 0.

  mlxconfig -d <mst device> -y set FLEX_PARSER_PROFILE_ENABLE=0

The example testpmd commands would be:

- Match on IPv4 over IPv4 packets and do inner RSS:

  testpmd> flow create 0 ingress pattern eth / ipv4 proto is 0x04 /
           ipv4 / udp / end actions rss level 2 queues 0 1 2 3 end / end

- Match on IPv6 over IPv4 packets and do inner RSS:

  testpmd> flow create 0 ingress pattern eth / ipv4 proto is 0x29 /
           ipv6 / udp / end actions rss level 2 queues 0 1 2 3 end / end
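
The protocol values used in the commands above are the IANA numbers
for IPv4 encapsulation (4, 0x04) and IPv6 encapsulation (41, 0x29).
A small sketch of how the inner header type follows from the outer
protocol field; the enum and the function name are hypothetical:

```c
#include <stdint.h>

enum inner_l3 {
	INNER_NONE, /* not an IP-in-IP packet */
	INNER_IPV4,
	INNER_IPV6,
};

/* Map the outer IPv4 protocol / IPv6 next header to the inner L3 type. */
static enum inner_l3
ipip_inner_type(uint8_t outer_proto)
{
	switch (outer_proto) {
	case 4: /* IPv4 encapsulation */
		return INNER_IPV4;
	case 41: /* IPv6 encapsulation */
		return INNER_IPV6;
	default:
		return INNER_NONE;
	}
}
```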

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:34 +02:00
Xiaoyu Min
a7a0365565 net/mlx5: match GRE key and present bits
Support matching on the present bits (C,K,S)
as well as the optional key field.

If the rte_flow_item_gre_key is specified in pattern,
it will set K present match automatically.
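
A minimal sketch of the present bits match. The bit positions follow
the GRE header layout (C, K and S in the first 16-bit word, shown here
in host byte order); the struct and function names are hypothetical:

```c
#include <stdint.h>

/* Present bits in the first 16-bit word of the GRE header. */
#define GRE_C_BIT 0x8000u /* checksum present */
#define GRE_K_BIT 0x2000u /* key present */
#define GRE_S_BIT 0x1000u /* sequence number present */

/* Hypothetical spec/mask pair for the first GRE word. */
struct gre_match {
	uint16_t spec;
	uint16_t mask;
};

/* A key item in the pattern implies a match on K = 1. */
static void
gre_match_require_key(struct gre_match *m)
{
	m->spec |= GRE_K_BIT;
	m->mask |= GRE_K_BIT;
}

/* Check the first GRE word of a packet against the spec/mask pair. */
static int
gre_match_hit(const struct gre_match *m, uint16_t c_rsvd0_ver)
{
	return (c_rsvd0_ver & m->mask) == (m->spec & m->mask);
}
```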

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:34 +02:00
Xiaoyu Min
9f8dee4bcb net/mlx5: support match GRE protocol on DR engine
The DR engine supports matching on the GRE protocol field without MPLS
support, so bypass the MPLS check when DR is enabled.

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:34 +02:00
David Marchand
b76fafb174 eal: fix IOVA mode selection as VA for PCI drivers
The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA which
was intended to mean "driver only supports VA" but had been understood
as "driver supports both PA and VA" by most net drivers and used to let
DPDK processes run as non-root (which do not have access to physical
addresses on recent kernels).

The check on physical addresses actually closed the gap for those
drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this
flag can retain its intended meaning.
Document explicitly its meaning.

We can check that a driver requirement wrt IOVA mode is fulfilled
before trying to probe a device.

Finally, document the heuristic used to select the IOVA mode and hope
that we won't break it again.

Fixes: 703458e19c ("bus/pci: consider only usable devices for IOVA mode")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-07-22 17:45:52 +02:00
Xiaoyu Min
d53aa89aea net/mlx5: support matching on ICMP/ICMP6
On DV/DR flow engine, MLX5 can match on ICMP/ICMP6's code and type field
via FLEX Parser, which can be enabled by config FW using FLEX Parser
profile 2:

mlxconfig -d <mst device> -y set FLEX_PARSER_PROFILE_ENABLE=2

The testpmd commands could be:

  testpmd> flow create 0 ingress pattern eth / ipv4 /
           icmp type is 8 code is 0 / end
	   actions rss queues 0 1 end / end

  testpmd> flow create 0 ingress pattern  eth / ipv6 /
           icmp6 type is 128 code is 0 / end
	   actions rss queues 0 1 end / end

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-08 21:26:52 +02:00
Eli Britstein
bf1d7d9a03 net/mlx5: zero out UDP checksum in encapsulation
Mellanox NICs do not support UDP checksum hardware tx offload over IPv6.
This limitation becomes critical for UDP based tunnels like VXLAN.
Although UDP checksum validity is required by IPv6, there is an option
in Linux to allow accepting a zero UDP checksum (see udp6zerocsumrx in
the iproute2 package).

This patch zeroes out the UDP checksum field for encapsulation headers
in raw encap action.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-08 21:26:52 +02:00
Thomas Monjalon
3dee3f1b52 net/mlx: allow build only on Linux
Currently mlx4/mlx5 support only Linux.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2019-07-08 21:26:52 +02:00
Asaf Penso
3f22e3153f net/mlx5: check memory allocation in flow creation
The rte_calloc function returns a non-null pointer on
success and a null pointer on failure.

The return value should be checked and the function flow
should take that into consideration.

This patch adds a check for rte_calloc return value in function
flow_list_create.
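
A minimal sketch of the check, with the standard calloc() standing in
for rte_calloc. The struct flow layout and both functions are
hypothetical and much simpler than flow_list_create:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced flow object. */
struct flow {
	size_t nb_items;
	int *items;
};

/* Return 0 on success, -EINVAL or -ENOMEM on failure; *out stays NULL then. */
static int
flow_create(struct flow **out, size_t nb_items)
{
	struct flow *f;

	*out = NULL;
	if (nb_items == 0)
		return -EINVAL;
	f = calloc(1, sizeof(*f));
	if (f == NULL)
		return -ENOMEM;
	f->items = calloc(nb_items, sizeof(*f->items));
	if (f->items == NULL) {
		/* Undo the first allocation before reporting the failure. */
		free(f);
		return -ENOMEM;
	}
	f->nb_items = nb_items;
	*out = f;
	return 0;
}

static void
flow_destroy(struct flow *f)
{
	if (f == NULL)
		return;
	free(f->items);
	free(f);
}
```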

Fixes: 84c406e745 ("net/mlx5: add flow translate function")
Cc: stable@dpdk.org

Signed-off-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-05 01:52:02 +02:00
Asaf Penso
cb1d2cce95 net/mlx5: fix condition for link update fallback
mlx5_link_update uses the newer ethtool command
ETHTOOL_GLINKSETTINGS to determine interface capabilities but falls
back to the older (deprecated) ETHTOOL_GSET command if the new
method fails for any reason.
The older method only supports reporting of capabilities up to 40G.

However, mlx5_link_update_unlocked_gs can return a failure for a
number of reasons (including the link being down).
Using the older method in cases of transient failure of the method
can result in reporting of reduced capabilities to the application.

The older method (mlx5_link_update_unlocked_gset) should only be
invoked if the newer method returns EOPNOTSUPP.
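
A minimal sketch of the corrected fallback condition. The query
callbacks below are hypothetical stand-ins for the two ethtool
methods, and -EAGAIN as the "link down" error is an assumption made
only for this example:

```c
#include <errno.h>

/* Query callback: returns 0 and fills *speed, or a negative errno. */
typedef int (*link_query_t)(unsigned int *speed);

/* Fall back to the older method only when the newer one is unsupported. */
static int
link_update(link_query_t query_gs, link_query_t query_gset,
	    unsigned int *speed)
{
	int ret = query_gs(speed);

	if (ret == -EOPNOTSUPP)
		ret = query_gset(speed);
	return ret;
}

/* Stand-ins used to exercise the logic. */
static int
gs_unsupported(unsigned int *speed)
{
	(void)speed;
	return -EOPNOTSUPP;
}

static int
gs_transient_failure(unsigned int *speed)
{
	(void)speed;
	return -EAGAIN; /* assumed error code, e.g. link is down */
}

static int
gset_ok(unsigned int *speed)
{
	*speed = 40000; /* older method reports at most 40G */
	return 0;
}
```

With these stand-ins, a transient failure of the newer method is
returned to the caller as is, and the older method is never consulted.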

Fixes: 7d2e32f76c ("net/mlx5: fix ethtool link setting call order")
Cc: stable@dpdk.org

Reported-by: Srinivas Narayan <srinivas.narayan@att.com>
Signed-off-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-05 01:52:02 +02:00
Moti Haimovsky
5291c601bb net/mlx5: remove TCF support
This commit removes the support of configuring the device E-switch
using TCF since it is now possible to configure it via DR (direct
verbs rules), which also removes the PMD dependency on libmnl.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-05 01:52:02 +02:00
Moti Haimovsky
8f0fa0c0ad net/mlx5: fix crash on null operation
mlx5 implements mlx5_flow_null_drv_ops to be used when a specific
flow type/driver is not available or invalid.
These routines return an error without modifying the rte_flow_error
parameter passed to them, which causes testpmd, for example, to crash.
This commit addresses the issue by modifying the rte_flow_error
parameter in these routines.
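
A minimal sketch of the fix pattern. struct flow_error is a
hypothetical, reduced stand-in for rte_flow_error, and
flow_null_validate() is not the driver function:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for rte_flow_error. */
struct flow_error {
	int type;
	const char *message;
};

#define FLOW_ERROR_TYPE_UNSPECIFIED 1 /* hypothetical value */

/*
 * Null operation: always fails, but fills the error structure first so
 * that a caller printing error->message does not read garbage.
 */
static int
flow_null_validate(struct flow_error *error)
{
	if (error != NULL) {
		error->type = FLOW_ERROR_TYPE_UNSPECIFIED;
		error->message = "flow driver is not available";
	}
	return -ENOTSUP;
}
```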

Fixes: 0c76d1c9a1 ("net/mlx5: add abstraction for multiple flow drivers")
Fixes: 684dafe795 ("net/mlx5: add flow query abstraction interface")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-05 01:52:02 +02:00
Dekel Peled
585b99fb4a net/mlx5: modify TCP header using Direct Verbs
This patch implements additional actions of packet header
modifications.

Add actions:
- INC_TCP_SEQ - Increase sequence number in the outermost TCP header.
- DEC_TCP_SEQ - Decrease sequence number in the outermost TCP header.
- INC_TCP_ACK - Increase acknowledgment number in the outermost TCP
		header.
- DEC_TCP_ACK - Decrease acknowledgment number in the outermost TCP
		header.

Original work by Xiaoyu Min.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2019-07-05 01:52:01 +02:00
Anatoly Burakov
028669bc9f eal: hide shared memory config
Now that everything that has ever accessed the shared memory
config is doing so through the public API's, we can make it
internal. Since we're removing quite a few headers from
rte_eal_memconfig.h, we need to add them back in places
where this header is used.

This bumps the ABI, so also change all build files and
update documentation.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
2019-07-06 10:32:34 +02:00
Anatoly Burakov
76f80881ef mem: add API to lock/unlock memory hotplug
Currently, the memory hotplug is locked automatically by all
memory-related _walk() functions, but sometimes locking the
memory subsystem outside of them is needed. There is no
public API to do that, so it creates a dependency on shared
memory config to be public. Fix this by introducing a new
API to lock/unlock the memory hotplug subsystem.

Create a new common file for all things mem config, and a
new API namespace rte_mcfg_*, and search-and-replace all
usages of the locks with the new API.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
2019-07-05 22:12:40 +02:00
Bruce Richardson
530588f3cd drivers: add reasons for components being disabled
For each driver where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2019-07-02 23:21:11 +02:00
Ali Alnubani
843e720557 net/mlx5: fix 32-bit build
This is to fix the error:
  ```
  drivers/net/mlx5/mlx5_defs.h:14:26:
  error: format '%lx' expects argument of type 'long unsigned int',
  but argument 5 has type 'off_t {aka long long int}' [-Werror=format=]
  drivers/net/mlx5/mlx5_txq.c:569:48: note: format string is defined here
     DRV_LOG(DEBUG, "port %u: uar_mmap_offset 0x%lx"
                                                ~~^
                                                %llx
  ```
Which reproduces with gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0.
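
One portable way to print an off_t value is to cast it to a
fixed-width type and use the matching format macro. This is a sketch
of that general technique and not necessarily the change made by this
patch; format_uar_offset() is a hypothetical helper:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Format the message into buf; the return value is that of snprintf(). */
static int
format_uar_offset(char *buf, size_t len, unsigned int port, off_t offset)
{
	/* off_t may be long or long long, so cast to a known width. */
	return snprintf(buf, len, "port %u: uar_mmap_offset 0x%" PRIx64,
			port, (uint64_t)offset);
}
```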

Fixes: 6bf10ab69b ("net/mlx5: support 32-bit systems")
Cc: stable@dpdk.org

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-06-28 20:31:49 +02:00
Viacheslav Ovsiienko
e28111ac98 net/mlx5: fix master device Netlink socket sharing
Patch [1] uses the master device Netlink socket
to retrieve master device link settings. This is not thread safe
because this resource may be in use by another call to the master
device itself. Using the same Netlink socket concurrently from
multiple threads causes Netlink requests to malfunction and
must be eliminated. This patch replaces the master Netlink socket
with the socket from the representor device.

[1] http://patches.dpdk.org/patch/53120/

Fixes: 0333b2f584 ("net/mlx5: inherit master link settings for representors")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
35c090eac1 net/mlx5: recover secondary process Tx errors
The SQ error recovery mechanism in the PMD invokes Verbs
functions to modify the SQ states in order to reset the SQ and to
reactivate it.

These Verbs functions are not allowed to be invoked from a secondary
process, hence the PMD skips the recovery when the error is captured
by secondary process queues.

Using the DPDK IPC mechanism, the secondary process can request Verbs
queue state modifications to be done synchronously by the primary
process.

Add support for secondary process Tx errors recovery.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
2d77cb615b net/mlx5: recover secondary process Rx errors
The RQ error recovery mechanism in the PMD invokes Verbs functions to
modify the RQ states in order to reset the RQ and to reactivate it.

These Verbs functions are not allowed to be invoked from a secondary
process, hence the PMD skips the recovery when the error is captured by
secondary process queues.

Using the DPDK IPC mechanism, the secondary process can request Verbs
queue state modifications to be done synchronously by the primary
process.

Add support for secondary process Rx errors recovery.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
957e45fb7b net/mlx5: handle Tx completion with error
When WQEs are posted to the HW to send packets, the PMD may get a
completion report with error from the HW, aka an error CQE, which is
associated with a bad WQE.

The error reason may be a bad address, wrong lkey, bad sizes, etc.
that can be wrongly configured by the PMD or by the user.

Checking for all the possible mistakes to prevent error CQEs doesn't make
sense due to the performance impact and huge complexity.

An error CQE changes the SQ state to the error state, which causes all
the subsequently posted WQEs to be completed with a CQE flush error
forever.

Currently, the PMD doesn't handle Tx error CQEs and may even crash
when one of them appears.

Extend the Tx data-path to detect these error CQEs, to report them by
the statistics error counters, and to recover the SQ by moving the state
to ready again and adjusting the management variables appropriately.

Sometimes the error CQE root cause is very hard to debug and may even
be related to corner cases which are not easily reproducible. Hence,
a dump file with debug information will be created for the first few
error CQEs; their number can be configured by the PMD probe
parameters.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
88c0733535 net/mlx5: extend Rx completion with error handling
When WQEs are posted to the HW to receive packets, the PMD may receive
a completion report with error from the HW, aka an error CQE, which is
associated with a bad WQE.

The error reason may be a bad address, wrong lkey, small buffer size,
etc. that can be wrongly configured by the PMD or by the user.

Checking for all the possible mistakes to prevent error CQEs doesn't make
sense due to the performance impact. Moreover, some error CQEs can be
triggered by packets coming from the wire, over which the DPDK
application has no control.

Most of the error CQE types change the RQ state to the error state, which
causes all the subsequently received packets to be dropped by the HW and
to be completed with a CQE flush error forever.

The current solution detects these error CQEs and even reports the
errors to the user by the statistics error counters, but without
recovery. So if the RQ enters the error state, it never moves to the
ready state again and all the subsequent packets will be dropped.

Extend the error CQE handling with recovery by moving the state to
ready again, and rearranging all the RQ WQEs and the management
variables appropriately.

Sometimes the error CQE root cause is very hard to debug and may even
be related to corner cases which are not easily reproducible.
Hence, a dump file with debug information will be created for the first
few error CQEs; their number can be configured by the PMD probe
parameters.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
6bb506cc2c net/mlx5: separate Rx queue initialization
Move the RQ WQEs initialization code to a separate function, in
preparation for CQE error recovery, so that the code can be reused.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
6cfb153006 net/mlx5: mitigate Rx doorbell memory barrier
The RQ WQEs must be written to memory before the HW gets the RQ
doorbell, hence a memory barrier should be triggered after the WQE
writes and before the doorbell write.

The current code uses the rte_wmb barrier, which ensures that all memory
stores are done, while it is enough to use the rte_cio_wmb barrier for
local memory stores because the WQEs are in local memory.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
15b0ea0053 net/mlx5: fix device arguments error detection
When bad device arguments are added to the DPDK command line, the PMD
ignores all the command line arguments specified by the user and uses
the default values instead.

This behavior doesn't make sense because the user intention is to force
some device parameters and expects to get an error in case of
problematic issues with the arguments.

Stop probing and report an error in case of problematic command line
arguments.

Fixes: e72dd09b61 ("net/mlx5: add support for configuration through kvargs")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
066cfecdc9 net/mlx5: add log file procedure for debug data
Add a global function in the PMD which dumps debug information to
specific file.

The data can be printed in hexadecimal format or as regular string.

The number of debug files per PMD entity should be limited by a new PMD
probe parameter called max_dump_files_num.

The files will be created in the /var/log directory or in the current
directory.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Matan Azrad
1c191691a6 net/mlx5: remove Rx queues indexes correlation
There is a full correlation between the CQE indexes and the WQE indexes
in the vectorized Rx queues management.

When the RQ is put into the reset state, the correlation may break
because the HW starts the RQ polling from index 0 while the CQ polling
continues regularly.

In preparation for CQE error handling, where the RQ can be reset,
the correlation dependence should be removed from all the Rx queue
index management.

Remove the aforementioned dependence from the vectorized Rx burst
functions.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14 00:01:06 +09:00
Stephen Hemminger
d812a3c21e net/mlx5: remove unnecessary cast
The device private pointer (dev_private) is of type void *
therefore no cast is necessary in C.

Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-06-13 23:54:29 +09:00
Viacheslav Ovsiienko
5897ac1393 net/mlx5: fix event handler uninstall
When a device is being closed and tries to unregister the interrupt
callback, there is a chance the handler is still active (called in the
context of the eal_intr_thread_main thread). If so,
rte_intr_callback_unregister returns -EAGAIN and keeps the handler
registered, causing a crash when the underlying resource has gone away.

This race condition may happen if event handling in the application takes
a long time. We should check the return code of the unregistering routine
and try again to unregister the handler. Diagnostic messages are
shown once a second while trying to unregister.
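
A minimal sketch of the retry loop. The callback type, the fake
handler and the retry limit are hypothetical; the real unregister
call, the delay between attempts and the diagnostic messages are only
marked by comments:

```c
#include <errno.h>

/* Unregister callback: returns 0 on success or a negative errno. */
typedef int (*unregister_fn)(void *ctx);

/* Retry while the handler is reported as still active (-EAGAIN). */
static int
unregister_with_retry(unregister_fn unreg, void *ctx, unsigned int max_tries)
{
	int ret = -EAGAIN;

	for (unsigned int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		ret = unreg(ctx);
		/* Real code would wait and log a message before retrying. */
	}
	return ret;
}

/* Fake handler that stays busy for a given number of attempts. */
struct fake_handler {
	unsigned int busy_calls;
};

static int
fake_unregister(void *ctx)
{
	struct fake_handler *h = ctx;

	if (h->busy_calls > 0) {
		h->busy_calls--;
		return -EAGAIN;
	}
	return 0;
}
```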

Fixes: 028b2a28c3 ("net/mlx5: update event handler for multiport IB devices")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-06-06 20:21:20 +09:00
Dekel Peled
0d51a5ec26 net/mlx5: fix order of items in NEON scatter
Previous patch added handling of metadata for multi-segment packet.
Function txq_scatter_v in file mlx5_rxtx_vec_neon.h was updated
incorrectly, items were inserted into WQE in wrong order.

This patch fixes the issue, inserting items into WQE correctly.

Fixes: 7f4019d370 ("net/mlx5: fix Tx metadata for multi-segment packet")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-06 20:21:20 +09:00
Tom Barbette
e571ad5541 net/mlx5: support reading clock
Implements support for read_clock for the mlx5 driver. mlx5 supports
hardware timestamp offload, setting the packet timestamp field to the
device clock. rte_eth_read_clock allows reading the device's current
clock value and therefore comparing values on the same time base.

See rxtx_callbacks for an example.

Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-06 20:21:20 +09:00
David Marchand
81d03bff26 net/mlx5: fix RxQ errors stat
Transmit errors must not be reported in q_errors[] which is for
reception.

Fixes: 87011737b7 ("mlx5: add software counters")
Fixes: 9f9a48eb29 ("net/mlx5: fix Tx stats error counter definition")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-06 20:21:20 +09:00
Anatoly Burakov
c7ef989970 ipc: handle unsupported IPC in sync request
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-06-05 11:28:05 +02:00
Anatoly Burakov
edf73dd330 ipc: handle unsupported IPC in action register
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.

For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.

For secondary processes, IPC will be crucial, so leave all
of the error handling as is.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-06-05 11:27:36 +02:00
Yongseok Koh
69c06d0e35 net/mlx: support IOVA VA mode
Set RTE_PCI_DRV_IOVA_AS_VA to driver's drv_flags as device's IOMMU takes
virtual address.

Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-04 00:34:54 +02:00
David Marchand
0c9da7555d net: replace IPv4/v6 constants with uppercase name
Since we change these macros, we might as well avoid triggering complaints
from checkpatch because of mixed case.

old=RTE_IPv4
new=RTE_IPV4
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"

old=RTE_ETHER_TYPE_IPv4
new=RTE_ETHER_TYPE_IPV4
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"

old=RTE_ETHER_TYPE_IPv6
new=RTE_ETHER_TYPE_IPV6
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
2019-06-03 16:54:54 +02:00
Dekel Peled
9e44476448 net/mlx5: fix memory free on queue create error
In function mlx5_rxq_ibv_new(), pointer *tmpl allocation is attempted
at the start, but not validated or freed in case of error.
In function mlx5_txq_ibv_new(), pointer *txq_ibv allocation is
attempted at the start, but not freed in case of error.

This patch adds pointers initialization, validation and freeing.
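
The initialize, validate and free pattern mentioned above might look like the
sketch below. The structure, the function names and the fail_ring switch are
hypothetical and only serve to exercise the error path.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical queue object owning one secondary allocation. */
struct demo_queue {
	void *ring;
};

/*
 * Allocate a queue object; on any failure, free what was allocated so far.
 * fail_ring forces the second allocation to fail, for demonstration only.
 */
static struct demo_queue *
demo_queue_new(size_t ring_size, int fail_ring)
{
	struct demo_queue *q = NULL; /* initialized so the error path is safe */

	q = calloc(1, sizeof(*q));
	if (q == NULL) {
		errno = ENOMEM;
		goto error;
	}
	q->ring = fail_ring ? NULL : malloc(ring_size);
	if (q->ring == NULL) {
		errno = ENOMEM;
		goto error;
	}
	return q;
error:
	free(q); /* free(NULL) is a no-op */
	return NULL;
}

/* Release a queue created by demo_queue_new(); NULL is accepted. */
static void
demo_queue_free(struct demo_queue *q)
{
	if (q == NULL)
		return;
	free(q->ring);
	free(q);
}
```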

Fixes: 09cb5b5817 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Fixes: faf2667fe8 ("net/mlx5: separate DPDK from verbs Tx queue objects")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-05-27 10:21:04 +02:00
Dekel Peled
73f008334f net/mlx5: remove redundant size calculation macro
Patch [1] added, among other definitions, the macro MLX5_ST_SZ_DB.
Patch [2] added later the macro MLX5_ST_SZ_BYTES, which is exactly
the same macro with a different name.
Each of these macros was used in very few places.

This patch removes the definition of MLX5_ST_SZ_DB, and replaces it
with MLX5_ST_SZ_BYTES wherever it was used.
Macro MLX5_ST_SZ_BYTES was preferred since it is the same macro
name used in kernel code, see [3].

[1] http://patches.dpdk.org/patch/45254/
[2] http://patches.dpdk.org/patch/49403/
[3] https://lists.openwall.net/netdev/2014/10/02/75
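
For readers unfamiliar with this kind of macro, here is a sketch of how a
size-in-bytes macro over a bit-layout structure typically works. It assumes
the convention in which each array element of the layout structure stands for
one bit; the DEMO_ names and the structure are hypothetical.

```c
/* Hypothetical bit-layout struct: each array element stands for one bit. */
struct demo_ifc_hdr_bits {
	unsigned char opcode[0x10];
	unsigned char reserved[0x10];
	unsigned char length[0x20];
};

/* Size of the described structure in bytes: one element per bit, 8 bits per byte. */
#define DEMO_ST_SZ_BYTES(typ) (sizeof(struct demo_ifc_##typ##_bits) / 8)
```

With the layout above, DEMO_ST_SZ_BYTES(hdr) evaluates to 64 / 8 = 8.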

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-27 10:20:55 +02:00
Jack Min
fe3b65c202 net/mlx5: support match on TCP flags
Add support to match all TCP control bits (flags)
except "NS (ECN-nonce)" via Direct Verbs (DV) or Direct Rule (DR)
engine.

Signed-off-by: Jack Min <jackmin@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-05-27 10:20:48 +02:00
Dekel Peled
50254b6d2a net/mlx5: move locally used functions to static
Multiple functions were declared in header file mlx5_rxtx.h,
implemented in mlx5_rxq.c, and called only in mlx5_rxq.c.

This patch moves all these functions declarations into mlx5_rxq.c,
as static functions.
Some function implementations were moved higher in the file to
precede their calls.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-24 14:56:36 +02:00
Dekel Peled
cf9f7115f0 net/mlx5: fix description of return value
Return value of function mlx5_rxq_releasable() was not described
correctly in function description.

This patch updates the description to correctly describe the optional
return values.

Fixes: a6d83b6a92 ("net/mlx5: standardize on negative errno values")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-24 14:56:36 +02:00
Dekel Peled
3a3d398280 net/mlx5: fix missing validation of null pointer
Function mlx5_rxq_ibv_release() is called in several places.
Before each call except one, the input parameter is validated to make
sure it is not null.

This patch adds the validation where it is missing.
It also changes a priv_ prefix, left in a comment, to mlx5_ prefix.

Fixes: af4f09f282 ("net/mlx5: prefix all functions with mlx5")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-24 14:56:36 +02:00
Dekel Peled
a65c2d6637 net/mlx5: remove unused functions
Functions implemented but never called:
mlx5_rxq_ibv_releasable()
mlx5_rxq_cleanup()
mlx5_txq_ibv_releasable()

Function declared but not implemented:
rxq_alloc_mprq_buf()

This patch removes these functions from code and header file.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-24 14:56:36 +02:00
Olivier Matz
e73e3547ce net: add rte prefix to UDP structure
Add 'rte_' prefix to structures:
- rename struct udp_hdr as struct rte_udp_hdr.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:46 +02:00
Olivier Matz
f41b5156fe net: add rte prefix to TCP structure
Add 'rte_' prefix to structures:
- rename struct tcp_hdr as struct rte_tcp_hdr.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:46 +02:00
Olivier Matz
24ac604ef7 net: add rte prefix to IP defines
Add 'RTE_' prefix to defines:
- rename IPv4( as RTE_IPv4(.
- rename IPV4_MAX_PKT_LEN as RTE_IPV4_MAX_PKT_LEN.
- rename IPV4_HDR_IHL_MASK as RTE_IPV4_HDR_IHL_MASK.
- rename IPV4_IHL_MULTIPLIER as RTE_IPV4_IHL_MULTIPLIER.
- rename IPV4_HDR_DF_SHIFT as RTE_IPV4_HDR_DF_SHIFT.
- rename IPV4_HDR_MF_SHIFT as RTE_IPV4_HDR_MF_SHIFT.
- rename IPV4_HDR_FO_SHIFT as RTE_IPV4_HDR_FO_SHIFT.
- rename IPV4_HDR_DF_FLAG as RTE_IPV4_HDR_DF_FLAG.
- rename IPV4_HDR_MF_FLAG as RTE_IPV4_HDR_MF_FLAG.
- rename IPV4_HDR_OFFSET_MASK as RTE_IPV4_HDR_OFFSET_MASK.
- rename IPV4_HDR_OFFSET_UNITS as RTE_IPV4_HDR_OFFSET_UNITS.
- rename IPV4_ANY as RTE_IPV4_ANY.
- rename IPV4_LOOPBACK as RTE_IPV4_LOOPBACK.
- rename IPV4_BROADCAST as RTE_IPV4_BROADCAST.
- rename IPV4_ALLHOSTS_GROUP as RTE_IPV4_ALLHOSTS_GROUP.
- rename IPV4_ALLRTRS_GROUP as RTE_IPV4_ALLRTRS_GROUP.
- rename IPV4_MAX_LOCAL_GROUP as RTE_IPV4_MAX_LOCAL_GROUP.
- rename IPV4_MIN_MCAST as RTE_IPV4_MIN_MCAST.
- rename IPV4_MAX_MCAST as RTE_IPV4_MAX_MCAST.
- rename IS_IPV4_MCAST as RTE_IS_IPV4_MCAST.
- rename IPV6_HDR_FL_SHIFT as RTE_IPV6_HDR_FL_SHIFT.
- rename IPV6_HDR_TC_SHIFT as RTE_IPV6_HDR_TC_SHIFT.
- rename IPV6_HDR_FL_MASK as RTE_IPV6_HDR_FL_MASK.
- rename IPV6_HDR_TC_MASK as RTE_IPV6_HDR_TC_MASK.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:46 +02:00
Olivier Matz
a7c528e5d7 net: add rte prefix to IP structure
Add 'rte_' prefix to structures:
- rename struct ipv4_hdr as struct rte_ipv4_hdr.
- rename struct ipv6_hdr as struct rte_ipv6_hdr.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:46 +02:00
Olivier Matz
be11774d45 net: add rte prefix to GRE structure
Add 'rte_' prefix to structures:
- rename struct gre_hdr as struct rte_gre_hdr.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Olivier Matz
35b2d13fd6 net: add rte prefix to ether defines
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.

Do not update the command line library to avoid adding a dependency to
librte_net.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Olivier Matz
538da7a1ca net: add rte prefix to ether functions
Add 'rte_' prefix to functions:
- rename is_same_ether_addr() as rte_is_same_ether_addr().
- rename is_zero_ether_addr() as rte_is_zero_ether_addr().
- rename is_unicast_ether_addr() as rte_is_unicast_ether_addr().
- rename is_multicast_ether_addr() as rte_is_multicast_ether_addr().
- rename is_broadcast_ether_addr() as rte_is_broadcast_ether_addr().
- rename is_universal_ether_addr() as rte_is_universal_ether_addr().
- rename is_local_admin_ether_addr() as rte_is_local_admin_ether_addr().
- rename is_valid_assigned_ether_addr() as rte_is_valid_assigned_ether_addr().
- rename eth_random_addr() as rte_eth_random_addr().
- rename ether_addr_copy() as rte_ether_addr_copy().
- rename ether_format_addr() as rte_ether_format_addr().

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Olivier Matz
6d13ea8e8e net: add rte prefix to ether structures
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.

Do not update the command line library to avoid adding a dependency to
librte_net.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Viacheslav Ovsiienko
40d9f906f4 net/mlx5: fix device removal handler for multiport
The IBV_EVENT_DEVICE_FATAL event is generated by the driver once for
the entire multiport Infiniband device, not for each existing port.
The port index is zero, which causes the device removal event to be
dropped. We should invoke the removal event processing routine
for each port for which we have installed a handler.

Fixes: 028b2a28c3 ("net/mlx5: update event handler for multiport IB devices")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-12 13:45:57 +02:00
Ori Kam
617ecc9000 net/mlx5: fix errno for unsupported Direct Rules
When the Direct Rules API is not supported, errno is not set.
The function then fails with errno equal to zero, so a function
that failed is treated as one that worked correctly.

This commit fixes this issue by setting the errno to ENOTSUP and
returning this error when error value should be returned.

Since RDMA-CORE returns positive errno values, we also return
positive error values.

Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-08 17:29:31 +02:00
Dekel Peled
10cfa5f2d4 net/mlx5: fix release of Rx queue object
Function mlx5_rx_intr_disable() calls mlx5_rxq_ibv_get() and performs
some actions on the returned rxq_ibv.
It doesn't release the rxq_ibv when all is completed with success.

This patch adds call to mlx5_rxq_ibv_release() where it's missing.

Fixes: 09cb5b5817 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-08 17:29:31 +02:00
Yongseok Koh
4594487b78 net/mlx5: fix Multi-Packet RQ mempool name
Currently, the name of MPRQ mempool is set by
	snprintf(name, sizeof(name), "%s-mprq", dev->device->name);
For a port representor, the name duplicates that of its master, so
creating a second mempool with the same name fails. The port ID is used
in the name instead.
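
A name keyed on the port ID could be built as in the sketch below. The
"port-%u-mprq" format string and the helper name are assumptions made for
illustration; the message does not state the exact format.

```c
#include <stdio.h>

/*
 * Build a per-port mempool name. The format string is an assumption.
 * Returns 0 on success, -1 if the name was truncated or formatting failed.
 */
static int
demo_mprq_pool_name(char *name, size_t len, unsigned int port_id)
{
	int n = snprintf(name, len, "port-%u-mprq", port_id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because every port, including a representor, has its own port ID, two ports
backed by the same device no longer produce the same name.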

Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-08 17:29:31 +02:00
Dekel Peled
09ba4c58e5 net/mlx5: fix init with zero Rx queue
Recent patch [1] added, at the end of mlx5_dev_configure(), a call to
mlx5_proc_priv_init(), initializing process_private data of eth_dev.
This call is not reached if PMD is started with zero Rx queues.
In this case mlx5_dev_configure() returns earlier due to the check:
	if (rxqs_n == priv->rxqs_n)
		return 0;
In such a scenario, later references to uninitialized process_private
data will result in segmentation fault.
For example see in function txq_uar_init().

This patch changes the check logic. The following code is executed
if (rxqs_n != priv->rxqs_n), and skipped otherwise.
Function mlx5_proc_priv_init() is always invoked, to ensure
process_private data is initialized.
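
The changed control flow can be sketched as below. The structure and its
fields are hypothetical placeholders for the real private data; only the
shape of the check follows the description.

```c
/* Hypothetical stand-in for the PMD private data. */
struct demo_cfg_priv {
	unsigned int rxqs_n;       /* currently configured Rx queue count */
	unsigned int reta_updates; /* counts queue-dependent reconfigurations */
	int proc_priv_ready;       /* set once process-private data is set up */
};

/* Configure the device for rxqs_n Rx queues. Always returns 0 here. */
static int
demo_dev_configure(struct demo_cfg_priv *priv, unsigned int rxqs_n)
{
	/* Queue-count dependent work runs only when the count changed. */
	if (rxqs_n != priv->rxqs_n) {
		priv->rxqs_n = rxqs_n;
		priv->reta_updates++;
	}
	/* Always reached, even with zero Rx queues (no early return). */
	priv->proc_priv_ready = 1;
	return 0;
}
```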

[1] http://patches.dpdk.org/patch/52629/

Fixes: 120dc4a7dc ("net/mlx5: remove device register remap")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-08 17:29:31 +02:00
Ori Kam
d1e64fbf63 net/mlx5: fix Direct Rules API
The RDMA-CORE Direct Rules API was changed in the latest upstream code.

This commit updates the API usage accordingly.

Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Yongseok Koh
9cc42c5893 net/mlx5: fix max number of queues for NEON Tx
BlueField SmartNIC has 0xa2d2 as PCI device ID on both ARM and x86 hosts. On
the ARM side, Tx inlining need not be used as PCI bandwidth is not a bottleneck.
Vectorized Tx can still be used up to 16 queues. For other archs
(e.g., x86), keep using the default value.

Fixes: 09d8b41699 ("net/mlx5: make vectorized Tx threshold configurable")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Yongseok Koh
f6d9ab4e76 net/mlx5: check Tx queue size overflow
If Tx packet inlining is enabled, the rdma-core library should allocate a
Tx WQ large enough to support it. It is better for the PMD to calculate the
size of the WQ based on the parameters and return an error with an
appropriate message if it exceeds the device capability.

Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Ori Kam
ce777b147b net/mlx5: fix E-Switch flow without port item
When creating a flow rule without the port_id pattern item, the PF was
always selected.

This commit fixes the issue: if no port_id pattern item is available,
the port that the flow was created on is used as the source port.

Fixes: 822fb31953 ("net/mlx5: add port id item to Direct Verbs")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-03 18:45:23 +02:00
Yongseok Koh
0b083fe41c net/mlx5: fix flow action destroy
ibv_destroy_flow_action() refers to QP. QP must not be freed until
corresponding action is destroyed.

Fixes: 3eb0044310 ("net/mlx5: fix release of jump to queue action")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Viacheslav Ovsiienko
ccb3815346 net/mlx5: update memory event callback for shared context
Mellanox mlx5 PMD implements the list of devices to process the memory
free events to reflect the actual memory state to Memory Regions.
Because this list contains the devices and devices may share the
same context the callback routine may be called multiple times
with the same parameter, that is not optimal. This patch modifies
the list to contain the device contexts instead of device objects
and shared context is included in the list only once.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Viacheslav Ovsiienko
ab3cffcfc2 net/mlx5: share Memory Regions for multiport device
The multiport Infiniband device support was introduced in [1].
All active ports, belonging to the same Infiniband device use the single
shared Infiniband context of that device and share the resources:
  - QPs are created within shared context
  - Verbs flows are also created with specifying port index
  - DV/DR resources
  - Protection Domain
  - Event Handlers

This patchset adds support for sharing Memory Regions between
ports created on the base of a multiport Infiniband device.
The datapath of mlx5 uses a layered cache subsystem for
allocating/releasing Memory Regions; only the lowest layer, L3,
is shared, for performance reasons.

[1] http://patches.dpdk.org/cover/51800/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-03 18:45:23 +02:00
Viacheslav Ovsiienko
0333b2f584 net/mlx5: inherit master link settings for representors
Some physical link settings can be queried from
Ethernet devices: link status, link speed, speed capabilities,
duplex mode, etc. These settings do not make much sense for
representors because they have no physical link. The new kernel drivers
dropped the link settings query for representors, causing the
ioctl call to fail. This patch adds a form of link settings
emulation to the PMD: representors inherit the link parameters
from the master device. The actual link status (up/down)
is retrieved from the representor device.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Ori Kam
79e7ba1f7d net/mlx5: fix modification action flags
When creating the modify action using Direct Rules, we need to
add flags that mark whether the action will be done on the root table
or on a private table.

Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Viacheslav Ovsiienko
f4a9349da8 net/mlx5: fix probing if DevX disabled
If DevX support is exposed by rdma-core but
DevX is not supported by or is disabled for the specific interface,
the mlx5_devx_cmd_query_hca_attr() routine returns an error,
preventing the device from probing successfully. The routine
should be invoked only when DevX is enabled.

Fixes: e2b4925ef7 ("net/mlx5: support Direct Rules E-Switch")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:22 +02:00
Dekel Peled
ebd4df0ffc net/mlx5: fix comments mixing Rx and Tx
In mlx5_rxq.c, in some comments, text includes "Tx" instead of "Rx".
In mlx5_txq.c, in some comments, text includes "Rx" instead of "Tx".

This patch fixes these typos.

Fixes: faf2667fe8 ("net/mlx5: separate DPDK from verbs Tx queue objects")
Fixes: a1366b1a2b ("net/mlx5: add reference counter on DPDK Rx queues")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:22 +02:00
Yongseok Koh
fc0aebe187 net/mlx5: fix Direct Rules build
All the library calls must be called via the glue layer.

Fixes: b2177648b8 ("net/mlx5: add Direct Rules flow data alloc/free routines")
Fixes: 79e35d0d59 ("net/mlx5: share Direct Rules/Verbs flow related structures")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:22 +02:00
Thomas Monjalon
2ed187ba07 net/mlx5: replace a port lookup with ethdev API
Looking for an ethdev port is better (and more efficient)
with an ethdev API than an EAL one.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:22 +02:00
Ori Kam
af4a86d446 net/mlx5: fix initialization for flow group
The flow group should be initialized.
For example selecting if the encapsulation is for root or private tables
is based on the flow->group value.

Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-05-03 18:45:22 +02:00
Ori Kam
34fa7c0268 net/mlx5: add drop action to Direct Verbs E-Switch
This commit adds support for drop action when creating E-Switch flow
using DV.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
a567a5b683 net/mlx5: add Forward Database table type
Actions like encap/decap and modify header require setting the flow table
type. Until now only NIC Rx and NIC Tx were supported; this commit adds
support for the FDB table type for those actions.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
c269b51778 net/mlx5: add E-Switch port id action to Direct Verbs
This commit adds matching on source port, using the DV API.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
c14995c557 net/mlx5: add transfer attribute to matcher
In current implementation the DV steering supported only NIC steering.
This commit adds the transfer attribute in order to create a matcher
on the FDB tables.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
822fb31953 net/mlx5: add port id item to Direct Verbs
Adds the port id item to the DV steering code.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
2e4c987aad net/mlx5: validate Direct Rule E-Switch
Add validation logic for E-Switch using Direct Rules.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
e2b4925ef7 net/mlx5: support Direct Rules E-Switch
This commit checks for DR E-Switch support.
The support depends on both the device and the kernel.
This commit also enables the user to manually disable this feature.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
0eff907118 net/mlx5: fix meson build for Direct Rules
The meson build was missing the define for Direct Rules.

Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
2bc811766b net/mlx5: fix translate vport function name
Modify the translate vport function to match other translate items
naming conventions.

Fixes: 0fe3f18f78 ("net/mlx5: add source vport match to the ingress rules")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
5a5e41f9b8 net/mlx5: fix modify header action position
According to RTE flow the action order should be the order that the
actions were given.
In the case of modify actions the position of the action was always
last.

This commit solves this issue by saving the position of the first modify
action, and then adds to this position the pointer to the modify action.

Fixes: 4bb14c83df ("net/mlx5: support modify header using Direct Verbs")
Cc: stable@dpdk.org

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Ori Kam
3eb0044310 net/mlx5: fix release of jump to queue action
Currently the allocation of the jump to QP action is done in flow apply,
which results in a memory leak.

This patch fixes this issue by moving the allocation and release of the
jump to QP action to the responsibility of the hrxq.

Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")
Cc: stable@dpdk.org

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Viacheslav Ovsiienko
30a86157f6 net/mlx5: support PF representor
On the BlueField platform there is a new entity: the PF representor.
It represents the PCI PF attached to the external host on the
ARM side. The traffic sent by the external host to the NIC
via the PF will be seen by ARM on this PF representor.

This patch refactors port recognizing capability on the base of
physical port name. We have two groups of name formats. Legacy
name formats are supported by kernels before ver 5.0 (being
more precise - before the patch [1]) or before Mellanox OFED 4.6,
and new naming formats added by the patch [1].

Legacy naming formats are supported:

  - missing physical port name (no sysfs/netlink key) at all,
    master is assumed

  - decimal digits (for example "12"), representor is assumed,
    the value is the index of attached VF

New naming formats are supported:

  - "p" followed by decimal digits, for example "p2", master
    is assumed

  - "pf" followed by PF index concatenated with "vf" followed by
    VF index, for example "pf0vf1", representor is assumed.
    If index of VF is "-1" it is a special case  of host PF
    representor, this representor must be indexed in devargs
    as 65535, for example representor=[0-3,65535] will
    allow representors for VF0, VF1, VF2, VF3 and for host PF.

    Note: do not specify representor=[0-65535], it causes devargs
    processing error, because number of ports (rte_eth_dev) is
    limited.
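
The name formats listed above can be recognized with a small parser such as
the sketch below. It is an illustration under assumptions, not the PMD code:
the type and function names are hypothetical, and sscanf() is used loosely
(for example, %d tolerates leading whitespace).

```c
#include <stdio.h>

enum demo_port_type {
	DEMO_PORT_UNKNOWN,
	DEMO_PORT_MASTER,
	DEMO_PORT_REPRESENTOR,
};

struct demo_port_info {
	enum demo_port_type type;
	int pf;    /* PF index, or -1 when the name does not carry one */
	int index; /* VF index, "p" index, or 65535 for the host PF */
};

/* Classify a physical port name according to the legacy and new formats. */
static struct demo_port_info
demo_parse_phys_port_name(const char *name)
{
	struct demo_port_info info = { DEMO_PORT_UNKNOWN, -1, -1 };
	int pf, vf, n = 0;

	/* Legacy: no name at all, master is assumed. */
	if (name == NULL || name[0] == '\0') {
		info.type = DEMO_PORT_MASTER;
		return info;
	}
	/* New: "pf<PF>vf<VF>", representor; VF -1 is the host PF. */
	if (sscanf(name, "pf%dvf%d%n", &pf, &vf, &n) == 2 && name[n] == '\0') {
		info.type = DEMO_PORT_REPRESENTOR;
		info.pf = pf;
		info.index = (vf == -1) ? 65535 : vf;
		return info;
	}
	/* New: "p<N>", master. */
	n = 0;
	if (sscanf(name, "p%d%n", &pf, &n) == 1 && name[n] == '\0') {
		info.type = DEMO_PORT_MASTER;
		info.index = pf;
		return info;
	}
	/* Legacy: decimal digits only, representor of the given VF. */
	n = 0;
	if (sscanf(name, "%d%n", &vf, &n) == 1 && name[n] == '\0' && vf >= 0) {
		info.type = DEMO_PORT_REPRESENTOR;
		info.index = vf;
		return info;
	}
	return info;
}
```

The %n conversion records how many characters were consumed, so a name with
trailing characters after a valid prefix is rejected instead of being
partially matched.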

Applications should distinguish representors and master devices
exclusively by the device flag RTE_ETH_DEV_REPRESENTOR and should not
rely on the switch port_id values (the mlx5 PMD deduces them from
representor_id) returned by the dev_infos_get() API.

[1] https://www.spinics.net/lists/netdev/msg547007.html
    Linux-tree: c12ecc23 (Or Gerlitz 2018-04-25 17:32 +0300)
    "net/mlx5e: Move to use common phys port names for vport representors"

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-19 14:51:55 +02:00
Viacheslav Ovsiienko
f8753eae84 net/mlx5: fix memory region cleanup
The mlx5 driver has a global list of Memory Regions created by the
device, and there is an mlx5_mr_release() routine which performs
memory cleanup at device closing. The head of the device MR list
was fetched outside the rwlock protected section. Some typos
noticed along the way are also fixed.

Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:55 +02:00
Thomas Monjalon
5294b80029 ethdev: avoid explicit check of valid port state
Some port iterations are manually checking against RTE_ETH_DEV_UNUSED
instead of using the iterators based on rte_eth_find_next().

A new macro RTE_ETH_FOREACH_VALID_DEV() is introduced, but kept private
because there should be no need of iterating over all devices in the
API. The public iterators have additional filters for ownership, parent
device or sibling ports.
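
An iterator macro built on a find-next helper generally has the shape shown
in the sketch below. The port table, the DEMO_ names and the helper are
hypothetical; the real macro and rte_eth_find_next() operate on the ethdev
device array.

```c
#define DEMO_MAX_PORTS 4

enum demo_state { DEMO_UNUSED = 0, DEMO_ATTACHED };

/* Hypothetical port table; all entries start as DEMO_UNUSED. */
static enum demo_state demo_ports[DEMO_MAX_PORTS];

/* Return the first valid port at or after port_id, or DEMO_MAX_PORTS. */
static unsigned int
demo_find_next(unsigned int port_id)
{
	while (port_id < DEMO_MAX_PORTS && demo_ports[port_id] == DEMO_UNUSED)
		port_id++;
	return port_id;
}

/* Iterate over valid ports only; callers never test the state themselves. */
#define DEMO_FOREACH_VALID_DEV(p) \
	for ((p) = demo_find_next(0); \
	     (p) < DEMO_MAX_PORTS; \
	     (p) = demo_find_next((p) + 1))

/* Example user of the iterator: count the valid ports. */
static unsigned int
demo_count_valid(void)
{
	unsigned int p, n = 0;

	DEMO_FOREACH_VALID_DEV(p)
		n++;
	return n;
}
```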

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2019-04-19 14:51:55 +02:00
Thomas Monjalon
b7ed955a20 ethdev: deprecate legacy filter API
As stated in the deprecation notice from December 2016,
"the legacy filter API, including rte_eth_dev_filter_supported(),
rte_eth_dev_filter_ctrl() as well as filter types MACVLAN, ETHERTYPE,
FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR, HASH and L2_TUNNEL, is superseded
by the generic flow API (rte_flow)".

After a long wait of more than two years, the legacy filter API
is marked as deprecated, while still tested with testpmd and
the tep_termination example.

The next step will be to announce a deadline for complete removal.
As preparation of the removal of rte_eth_ctrl.h,
RTE_ETH_FLOW_*, RTE_TUNNEL_TYPE_* and RTE_ETH_HASH_FUNCTION_* definitions
are moved to rte_ethdev.h and rte_flow.h.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-04-19 14:51:54 +02:00
Ori Kam
1183f12f14 net/mlx5: fix RSS validation function
The RSS validation function was missing the verification that
if RSS is requested on inner packet, the flow must have tunnel data.

Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19 14:51:54 +02:00
Yongseok Koh
d588f12ffb net/mlx: fix library search in meson build
If MLNX_OFED is installed, there's no .pc file installed for libraries and
dependency() can't find libraries by pkg-config. By adding fallback of
using cc.find_library(), libraries are properly located.

Fixes: e30b4e566f ("build: improve dependency handling")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2019-04-18 18:22:42 +02:00
Yongseok Koh
120dc4a7dc net/mlx5: remove device register remap
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private

The actual UAR table follows the data structure and the table is used
for both Tx and Rx.

For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX5_TX_BFREG(txq) is defined to get a register for the txq. Each process
accesses its own private data to acquire the register from the UAR table.

For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
d5c900d1dd net/mlx5: remove redundant queue index
Queue index is redundantly stored for both Rx and Tx structures.
E.g. txq_ctrl->idx and txq->stats.idx. Both are consolidated to single
storage - rxq->idx and txq->idx.

Also, rxq and txq are moved to the beginning of their control structures
(rxq_ctrl and txq_ctrl) for cacheline alignment.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
227684feb8 net/mlx5: fix recursive inclusion of header file
mlx5.h includes mlx5_rxtx.h and mlx5_rxtx.h includes mlx5.h recursively.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Ori Kam
bbda883ca0 net/mlx5: fix build on Arm
In case of cross compilation on aarch64 we must add include for
stdlib in order to use the free function.

Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
942d13e6e7 net/mlx5: fix sharing context destroy order
At the mlx5 device closing the shared IB context was destroyed
before cleanup routines completion. As it was found on some
setups (Netlink fails with old kernel drivers and we have to use
sysfs to retrieve interface index, this requires IB device name,
which is stored in shared context) the mlx5_nl_mac_addr_flush()
requires IB device name, and if shared context is removed it
causes the segmentation fault.

Fixes: 17e19bc4dd ("net/mlx5: add IB shared context alloc/free functions")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
9c2bbd0488 net/mlx5: fix device probing for old kernel drivers
Retrieving the network interface index via Netlink fails if an
old ib_core kernel driver is installed: the mlx5_nl_ifindex()
routine fails because the RDMA_NLDEV_ATTR_NDEV_INDEX attribute is
not supported by the old driver.

The kernel patch [1] allows retrieving the network interface index and
name via Netlink. So, the problem depends on the ib_core module
version: 4.16 supports getting ifindex via Netlink, 4.15 does not.

This error was ignored in previous versions of MLX5 PMD probing
routine. For single device ifindex was retrieved via sysfs
and link control was not lost, so problem just was not noticed.
In order to support MLX5 PMD functioning over old kernel driver
this patch adds ifindex retrieving via sysfs into probing routine.
It is worth noting that this method works for master/standalone
devices only.

[1] https://www.spinics.net/lists/linux-rdma/msg62948.html
    Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270)

Fixes: ad74bc6195 ("net/mlx5: support multiport IB device during probing")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
ae4eb7dc79 net/mlx5: fix typos in comments
Fixes: 299d7dc28c ("net/mlx5: add representor recognition on Linux 5.x")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
79e35d0d59 net/mlx5: share Direct Rules/Verbs flow related structures
Direct Rules/Verbs related structures are moved to
the shared context:
  - rx/tx namespaces, shared by master and representors
  - rx/tx flow tables
  - matchers
  - encap/decap action resources
  - flow tags (MARK actions)
  - modify action resources
  - jump tables

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko
b2177648b8 net/mlx5: add Direct Rules flow data alloc/free routines
We are going to share the Direct Rules and Direct Verbs flow
device data structures between master and representors in the
E-Switch configurations over multiport IB device.

The code of initializing and destroying these data is
moved to dedicated routines, this is just a preparation
step for actual data sharing.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
684b9a1b1f net/mlx5: support jump action
When using Direct Rules we can add actions to jump between tables.
This is extra useful since rule insertion rate is much higher on other
tables compared to table zero.

If no group is selected the rule is added to group 0.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
4f84a19779 net/mlx5: add Direct Rules API
Adds calls to the Direct Rules API inside the glue functions.
Due to difference in parameters between the Direct Rules and Direct
Verbs some of the glue functions API was updated.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
cbb66daa3c net/mlx5: prepare Direct Verbs for Direct Rule
This is the first patch of a series that is designed to enable the
Direct Rules API.

The main difference between Direct Verbs and Direct Rules from API
perspective is that in Direct Rules each action has its own create
function and the object itself is of type void.

In this patch I'm adding functions to generate actions that currently
are done without create action, and I'm changing the action type to be
void *, so in next patches only the glue functions will need to change.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
f54aeb3ec0 net/mlx5: fix flow counters using devx
The API that was defined in OFED 4.5 was replaced both in OFED 4.6 and
in upstream.

This commit updates the API to match the upstream one.

Fixes: f5bf91de73 ("net/mlx5: support flow counters using devx")
Cc: stable@dpdk.org

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Thomas Monjalon
d874a4eed5 net/mlx5: use port sibling iterators
Iterating over siblings was done with RTE_ETH_FOREACH_DEV()
which skips the owned ports.
The new iterators RTE_ETH_FOREACH_DEV_SIBLING()
and RTE_ETH_FOREACH_DEV_OF() are more appropriate and more correct.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
c18cf501a7 net/mlx5: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
dceb502942 net/mlx5: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx5_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smaller
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
207fe7ac72 net/mlx5: fix external memory registration
Secondary process is not allowed to register MR due to a restriction of
library and kernel driver.

Fixes: 7e43a32ee0 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
3d1f3c7c83 net/mlx: remove debug messages on datapath
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
2aac5b5d11 net/mlx5: sync stop/start with secondary process
Rx/Tx burst function pointers are stored in the rte_eth_dev structure,
which is local to a process. Even though primary process replaces the
function pointers, secondary will not run the new ones. With rte_mp
APIs, primary can easily broadcast a request to stop/start the datapath
of secondary processes.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
7be600c8d8 net/mlx5: rework PMD global data init
There is a growing need for a PMD global data structure. It should be
initialized once per process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and makes sure all the
init functions are called once per process. Currently, such global
data and its initialization functions are scattered. Rather than
'extern'-ing such variables and calling such functions one by one making
sure it is called only once by checking the validity of such variables, it
will be better to have a global storage to hold such data and a
consolidated function having all the initializations. The existing shared
memory gets more extensively used for this purpose. As there could be
multiple secondary processes, a static storage (local to process) is also
added.
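
A minimal sketch of the once-per-process idea, assuming probing is
single-threaded. The names pmd_local_data and pmd_init_once() are
hypothetical and only illustrate the pattern; the real PMD also keeps
data in shared memory, which is not modelled here.

```c
#include <stdbool.h>

/* Hypothetical process-local storage; one static instance per process. */
struct pmd_local_data {
	bool init_done;			/* set after the first successful init */
	unsigned int init_calls;	/* how many times init really ran */
};

struct pmd_local_data pmd_local;

/*
 * Consolidated init entry point. It may be called for every probed
 * device, but the body runs only on the first call in this process.
 */
int
pmd_init_once(void)
{
	if (pmd_local.init_done)
		return 0;
	/* All per-process initializations would be grouped here. */
	pmd_local.init_calls++;
	pmd_local.init_done = true;
	return 0;
}
```

A multi-threaded caller would need a lock or an atomic flag around the
check, which this sketch leaves out.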

As the reserved virtual address for UAR remap is a PMD global resource,
this doesn't need to be stored in the device priv structure, but in the
PMD global data.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
9a8ab29b84 net/mlx5: replace IPC socket with EAL API
Socket API is used for IPC in order for secondary process to acquire
Verb command file descriptor. The FD is used to remap UAR address.
The multi-process APIs (rte_mp) in EAL are newly introduced.
mlx5_socket.c is replaced with mlx5_mp.c, which uses the new APIs.

As it is PMD global infrastructure, only one IPC channel is established.
All the IPC message types may have port_id in the message if there is
need to reference a specific device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
3ebe658059 net/mlx5: fix memory event on secondary process
As the memory event is propagated to secondary processes, the event is
processed redundantly. This should be processed once because the data
structure used for MR and the event is global across the processes.

Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Dekel Peled
de90612f40 net/mlx5: fix errno typos in comments
Correct typing mistake in several locations:
ernno ==> errno

Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
9c55c6bd86 net/mlx5: revert mbuf address calculation for x86
When replenishing mbufs on Rx, buffer address (mbuf->buf_addr) should be
loaded. Non-x86 processors (mostly RISC such as ARM and Power) are more
vulnerable to load stall. For x86, reducing the number of instructions
seems to matter most.

For x86, this is simply a load but for other architectures, it is
calculated from the address of mbuf structure by rte_mbuf_buf_addr()
without having to load the first cacheline of the mbuf.
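
The two variants can be sketched as follows. This is an illustration
with a hypothetical, simplified header (struct pkt_hdr) and assumes the
data buffer sits right after the header and a private area of known
size; it is not the rte_mbuf layout itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified buffer header. */
struct pkt_hdr {
	void *buf_addr;			/* stored pointer to the data buffer */
	uint64_t other_fields[3];	/* placeholder for remaining metadata */
};

/* Load variant: a single memory read from the header. */
void *
buf_addr_load(struct pkt_hdr *h)
{
	return h->buf_addr;
}

/*
 * Compute variant: derive the address from the header address and the
 * private area size, without reading any header field.
 */
void *
buf_addr_calc(struct pkt_hdr *h, size_t priv_size)
{
	return (char *)h + sizeof(*h) + priv_size;
}
```

Both functions return the same address when the assumed layout holds;
which one is cheaper depends on the architecture, as described above.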

Fixes: 12d468a62b ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko
0fe3f18f78 net/mlx5: add source vport match to the ingress rules
For E-Switch configurations over multiport Infiniband devices
we should add source vport match to correctly distribute
traffic between representors.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
028b2a28c3 net/mlx5: update event handler for multiport IB devices
This patch modifies asynchronous event handler to support multiport
Infiniband devices. Handler queries the event parameters, including
event source port index, and invokes the handler for specific
devices with appropriate port_id.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
53e5a82fd1 net/mlx5: update install/uninstall event handlers
We are implementing the support for multiport Infiniband device
with representors attached to these multiple ports. Asynchronous
device event notifications (link status change, removal event, etc.)
should be shared between ports. We are going to implement shared
event handler and this patch introduces appropriate device
structure changes and updated event handler install and uninstall
routines.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
1e14090e31 net/mlx5: provide IB port for the object being created
The code is updated to provide IB port index for the Verbs
objects being created - QPs and Verbs Flows.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
f048f3d479 net/mlx5: switch to the shared IB device context
The code is updated to use the shared IB device context and
device handles. The IB device context is shared between
representors created over the single multiport IB device. All
Verbs and DevX objects will be created within this shared context.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
d485cdca01 net/mlx5: switch to the shared context IB attributes
The code is updated to use the shared IB device attributes,
located in the shared IB context. It saves some memory if
there are representors created over the single Infiniband
device with multiple ports.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
1b782252cb net/mlx5: switch to the shared protection domain
The PMD code is updated to use the Protection Domain from the
shared IB device context. The Domain is shared between
all devices belonging to the same multiport Infiniband device.
If the IB device has only one port, the PD is not shared, because
there is only one Ethernet device created over the IB one.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
9c0a9eed37 net/mlx5: switch to the names in the shared IB context
The IB device names are moved from device private data
to the shared context, code involving the names is updated.
The IB port index treatment is added where it is relevant.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
17e19bc4dd net/mlx5: add IB shared context alloc/free functions
The Mellanox NICs support SR-IOV and have E-Switch feature.
When SR-IOV is set up in switchdev mode and E-Switch is enabled
we have so called VF representors in the system. All representors
belonging to the same E-Switch are created on the basis of the
single PCI function and with current implementation each representor
has its own dedicated Infiniband device and operates within its
own Infiniband context. It is proposed to provide representors
as ports of the single Infiniband device and operate on the
shared Infiniband context saving various resources. This patch
introduces appropriate structures.

Also the functions to allocate and free shared IB context for
multiport are added. The IB device context, Protection Domain,
device attributes, Infiniband names are going to be relocated
to the shared structure from the device private one.
mlx5_dev_spawn() is updated to support shared context.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
ad74bc6195 net/mlx5: support multiport IB device during probing
mlx5_pci_probe() routine is refactored to probe the ports
of found Infiniband devices. All active ports (with attached
network interface), belonging to the same Infiniband device
will use the single shared Infiniband context of that device.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
bbfad6427b net/mlx5: add getting IB ports number for multiport IB
The routine mlx5_nl_portnum() is added to get
the number of ports of a multiport Infiniband device.
It is assumed the Uplink/VF representors are attached
on these ports.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
e505508a38 net/mlx5: modify get ifindex routine for multiport IB
There is the routine mlx5_nl_ifindex() returning the
network interface index associated with Infiniband device.
We are going to support multiport IB devices, now function
takes the IB port as argument and returns ifindex associated
with tuple <IB device, IB port>

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
299d7dc28c net/mlx5: add representor recognition on Linux 5.x
The master device and VF representors were distinguished by
presence of port name, master device did not have one. The new Linux
kernels starting from 5.0 provide the port name for master device
and the implemented representor recognizing method does not work.
The new recognition method is based on querying the number of VFs
created on the base of the device.

The IFLA_NUM_VF attribute is returned by kernel if IFLA_EXT_MASK
attribute is specified in the Netlink request message.

Also the presence check of device symlink in device sysfs folder
is added to distinguish representors with sysfs based method.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Ali Alnubani
e3bcaf3a0f net/mlx5: add missing return value check
This patch fixes the build failure with message:
  drivers/net/mlx5/mlx5_ethdev.c: In function ‘mlx5_sysfs_switch_info’:
  drivers/net/mlx5/mlx5_ethdev.c:1381:3:
    error: ignoring return value of ‘fscanf’, declared with attribute
           warn_unused_result [-Werror=unused-result]
  fscanf(file, "%s", port_name);
    ^

Which reproduces on Ubuntu 16.04 LTS with
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609.
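
For illustration, a checked read could look like the sketch below. The
helper name read_port_name() and the buffer size are assumptions made
for this example and are not taken from mlx5_ethdev.c.

```c
#include <stdio.h>

#define PORT_NAME_LEN 16	/* hypothetical buffer size */

/*
 * Read one whitespace-delimited token into name.
 * Returns 0 on success and -1 if nothing could be converted.
 */
int
read_port_name(FILE *file, char name[PORT_NAME_LEN])
{
	/*
	 * fscanf() returns the number of converted items, so exactly 1
	 * means the name was read. The field width keeps the token,
	 * plus its terminating NUL, inside the buffer.
	 */
	if (fscanf(file, "%15s", name) != 1)
		return -1;
	return 0;
}
```

Using the return value both silences the warn_unused_result diagnostic
and avoids working on an uninitialized buffer when the file is empty.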

Fixes: b2f3a38101 ("net/mlx5: support new representor naming format")

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Dekel Peled <dekelp@mellanox.com>
2019-03-29 17:25:32 +01:00
Shahaf Shuler
989e999d93 net/mlx5: support PCI device DMA map and unmap
The implementation reuses the external memory registration work done by
commit[1].

Note about representors:

The current representor design will not work
with those map and unmap functions. The reason is that for representors
we have multiple IB devices share the same PCI function, so mapping will
happen only on one of the representors and not all of them.

While it is possible to implement such support, the IB representor
design is going to be changed during DPDK19.05. The new design will have
a single IB device for all representors, hence sharing of a single
memory region between all representors will be possible.

[1]
commit 7e43a32ee0
("net/mlx5: support externally allocated static memory")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-30 16:48:58 +01:00
Shahaf Shuler
0f132546a8 net/mlx5: refactor external memory registration
Move the memory region creation to a separate function to
prepare the ground for the reuse of it on the PCI driver map and unmap
functions.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-30 16:48:57 +01:00
Dekel Peled
b2f3a38101 net/mlx5: support new representor naming format
Kernel update [1] introduces a new format of representor names.
This patch implements RFC [2], updating MLX5 PMD to support the new
format, while maintaining support of the existing format.

[1] https://github.com/torvalds/linux/commit/c12ecc2
[2] http://mails.dpdk.org/archives/dev/2019-March/125676.html

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-20 18:15:42 +01:00
Shahaf Shuler
57c0e2494c net/mlx5: fix packet inline on Tx queue wraparound
Inlining a packet to a WQE that crosses the WQ wraparound, i.e. the WQE
starts on the end of the ring and ends on the beginning, is not
supported and blocked by the data path logic.

However, in case of TSO, an extra inline header is required before
inlining. This inline header is not taken into account when checking if
there is enough room left for the required inline size.
In some corner cases where
(ring_tailroom - inline header) < inline size < ring_tailroom ,
this can lead to the WQE being written outside of the ring buffer.

Fix it by always assuming the worst case, in which inlining the packet
requires the inline header.
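
The corner case can be illustrated with the sketch below. The function
names and the INLINE_HDR_SIZE value are hypothetical, chosen only to
show the difference between the two checks; the real data path differs.

```c
#include <stdbool.h>
#include <stddef.h>

#define INLINE_HDR_SIZE 4u	/* hypothetical size of the extra header */

/* Check that ignores the extra inline header (the problematic form). */
bool
inline_fits_naive(size_t ring_tailroom, size_t inline_size)
{
	return inline_size < ring_tailroom;
}

/* Worst-case check: always reserve room for the inline header. */
bool
inline_fits(size_t ring_tailroom, size_t inline_size)
{
	return inline_size + INLINE_HDR_SIZE <= ring_tailroom;
}
```

With a tailroom of 64 bytes and an inline size of 62 bytes the naive
check accepts the packet although header plus data need 66 bytes, while
the worst-case check rejects it.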

Fixes: 3f13f8c23a ("net/mlx5: support hardware TSO")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-03-20 18:15:42 +01:00
Dekel Peled
fd350d3c9a net/mlx5: fix sync when handling Tx completions
Function mlx5_tx_complete() reads completion entry information
from Tx queue.
For some processors not having strongly-ordered memory model,
there has to be a memory barrier between reading the entry index
and the entry fields, in order to guarantee data is valid.
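
The ordering requirement can be sketched with C11 atomics. This is not
the PMD code: struct completion and both functions are hypothetical,
and the acquire fence stands in for the read barrier used by the
driver.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion entry shared by a producer and a consumer. */
struct completion {
	uint32_t payload;	/* entry field, written first */
	atomic_uint valid;	/* ownership flag, written last */
};

/* Producer: fill the field, then publish it with release ordering. */
void
completion_publish(struct completion *c, uint32_t payload)
{
	c->payload = payload;
	atomic_store_explicit(&c->valid, 1, memory_order_release);
}

/* Consumer: check the flag, then read the field after a barrier. */
bool
completion_poll(struct completion *c, uint32_t *payload)
{
	if (atomic_load_explicit(&c->valid, memory_order_relaxed) == 0)
		return false;
	/*
	 * Without this fence a weakly-ordered CPU may read payload
	 * before it observes the flag, returning stale data.
	 */
	atomic_thread_fence(memory_order_acquire);
	*payload = c->payload;
	return true;
}
```

On strongly-ordered processors the fence typically costs nothing beyond
a compiler barrier, which is why the problem shows up only on some
architectures.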

Fixes: 54d3fe948d ("net/mlx5: poll completion queue once per a call")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Dekel Peled
38f0a160b5 net/mlx5: fix hex dump of error completion
struct mlx5_cqe is defined in MLX5 PMD code (mlx5_prm.h).
It includes 64 bytes padding in case of (RTE_CACHE_LINE_SIZE == 128).

struct mlx5_err_cqe is defined in kernel, and doesn't include padding.

When running in debug mode, in case an error CQE is detected
it is printed using rte_hexdump().

The size of data to print should be sizeof(*cqe) instead of
sizeof(*err_cqe), to handle the case of (RTE_CACHE_LINE_SIZE == 128),
and print the full data in any case.
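
The size difference can be shown with two hypothetical structures that
mimic only the padding behaviour described above. The field names and
the CACHE_LINE_SIZE setting are assumptions for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 128	/* hypothetical build-time setting */

/* Stand-in for the kernel-defined error CQE: 64 bytes, no padding. */
struct err_cqe_sketch {
	uint8_t raw[64];
};

/* Stand-in for the PMD CQE: padded up to a 128-byte cache line. */
struct cqe_sketch {
#if CACHE_LINE_SIZE == 128
	uint8_t pad[64];
#endif
	uint8_t raw[64];
};

/* Length to hand to a hex dump: the full PMD entry, padding included. */
size_t
cqe_dump_len(const struct cqe_sketch *cqe)
{
	return sizeof(*cqe);
}
```

Dumping sizeof(struct err_cqe_sketch) bytes from the start of the PMD
entry would cover only the padding in this layout and miss the CQE
data.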

Fixes: c771499209 ("net/mlx5: extend debug logs verbosity")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Thomas Monjalon
09c9c4d23d net/mlx5: call generic strlcpy
The call to strlcpy uses either libc, libbsd or internal rte_strlcpy.
No need to call the DPDK flavor explicitly.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Viacheslav Ovsiienko
4fb27c1dfe net/mlx5: fix flow priorities probing error path
The mlx5 PMD probes the Verbs flow priorities supported with
ibv_create_flow() function. If rdma-core or kernel fails for
some reason, the returned error causes the drop queue not to be
destroyed, and the pd stays locked by the resource that was not freed.

Also the mlx5_flow_discover_priorities() returned negative value
as error, and this code was reported "as is", without sign
changing (eventually causing assert(err > 0)).
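
A small sketch of the sign convention issue, using hypothetical
functions (discover_priorities_sketch() and probe_sketch()) rather than
the PMD routines: the helper reports failure as a negative errno value,
and the caller flips the sign before its error path asserts on it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: returns the priority count or a negative errno. */
int
discover_priorities_sketch(int supported)
{
	if (supported <= 0)
		return -EINVAL;
	return supported;
}

/*
 * Hypothetical probe step: returns 0 on success or a positive errno.
 * Resources taken for probing would be released on both paths.
 */
int
probe_sketch(int supported, int *priorities)
{
	int err = discover_priorities_sketch(supported);

	if (err < 0) {
		err = -err;	/* convert to the positive convention */
		assert(err > 0);
		return err;
	}
	*priorities = err;
	return 0;
}
```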

Fixes: 2815702bae ("net/mlx5: replace verbs priorities by flow")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Thomas Monjalon
dbeba4cf18 net/mlx: prefix private structure
The private structure stored in rte_eth_dev->data->dev_private
was named "struct priv".
In order to ease code browsing, the structure is renamed
"struct mlx[45]_priv".

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-03-01 18:17:35 +01:00
Luca Boccassi
e30b4e566f build: improve dependency handling
Whenever possible (if the library ships a pkg-config file) use meson's
dependency() function to look for it, as it will automatically add it
to the Requires.private list if needed, to allow for static builds to
succeed for reverse dependencies of DPDK. Otherwise the recursive
dependencies are not parsed, and users doing static builds have to
resolve them manually by themselves.
When using this API avoid additional checks that are superfluous and
take extra time, and avoid adding the linker flag manually which causes
it to be duplicated.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
2019-02-27 12:13:54 +01:00
Thomas Monjalon
714bf46ebb net/mlx: support firmware version query
The API function rte_eth_dev_fw_version_get() is querying drivers
via the operation callback fw_version_get().
The implementation of this operation is added for mlx4 and mlx5.
Both functions are copying the same ibverbs field fw_ver
which is retrieved when calling ibv_query_device[_ex]()
during the port probing.

It is tested with command "drvinfo" of examples/ethtool/.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-02-13 12:55:38 +01:00