Commit Graph

1028 Commits

Author SHA1 Message Date
Dekel Peled
ea81c1b816 net/mlx5: fix NVGRE matching
NVGRE has a GRE header with c_rsvd0_ver value 0x2000 and protocol
value 0x6558.
These should be matched when item_nvgre is provided.

This patch adds validation function of NVGRE item.
It also updates the translate function of NVGRE item, to add the
required values, if they were not specified.

Original work by Xiaoyu Min <jackmin@mellanox.com>

Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
ee39fe82ea net/mlx5: adjust maximum LRO message size
LRO message is contained in the MPRQ strides.
While the LRO message size cannot be bigger than 65280 according to the
PRM, the strides which contain it may be bigger than the maximum buffer
size allowed in dpdk mbuf - 0xFFFF.

Adjust the maximum LRO message size to avoid buffer length overflow.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
a496e09317 net/mlx5: zero LRO mbuf headroom
LRO packet may consume all the stride memory, hence the PMD cannot
guaranty head-room for the LRO mbuf.

The issue is lack in HW support to write the packet in offset from the
stride start.

A new striding RQ feature may be added in CX6 DX to allow head-room and
tail-room for the LRO strides.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
e4c2a16eb1 net/mlx5: handle LRO packets in Rx queue
When LRO offload is configured in Rx queue, the HW may coalesce TCP
packets from same TCP connection into single packet.

In this case the SW should fix the relevant packet headers because the
HW doesn't update them according to the new created packet
characteristics.

Add update header code to the mprq Rx burst function to support LRO
feature.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
8b8f7994f1 net/mlx5: update LRO fields in completion entry
Update the CQE structure to include LRO fields.

Some reserved values were changed, hence also data-path code used the
reserved values were updated accordingly.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
3a22f3877c net/mlx5: replace external mbuf shared memory
As an arrangement to the LRO support when a packet can consume all the
stride memory, the external mbuf shared information cannot be anymore
in the end of the stride, because the HW may write the packet data to
all the stride memory.

Move the shared information memory from the stride to the control
memory of the external mbuf.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
940f0a1d07 net/mlx5: support LRO with single RxQ object
Implement LRO support using a single RQ object per DPDK RxQ.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
dc9ceff73c net/mlx5: create advanced RxQ via DevX
Function mlx5_rxq_obj_new(), previously called mlx5_rxq_ibv_new(),
supports creating Rx queue objects using verbs.
This patch expands the relevant functions, to support creating
verbs or DevX Rx queue objects:
Function mlx5_rxq_obj_new() updated to create RQ object using DevX.
Function mlx5_ind_table_obj_new() updated to create RQT object using DevX.
Function mlx5_hrxq_new() updated to create TIR object using DevX.
New utility functions added to perform specific operations:
mlx5_devx_rq_new(),  mlx5_devx_wq_attr_fill(),
mlx5_devx_create_rq_attr_fill().

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
abc81aafde net/mlx5: add function for Rx verbs work queue
Verbs WQ for RxQ is created inside function mlx5_rxq_obj_new().
This patch moves the creation of verbs WQ to dedicated function
mlx5_ibv_wq_new().

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
8bbad00a91 net/mlx5: add function for Rx verbs completion queue
Verbs CQ for RxQ is created inside function mlx5_rxq_obj_new().
This patch moves the creation of CQ to dedicated function
mlx5_ibv_cq_new().

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
b9d86122bf net/mlx5: store protection domain number on create
Function mlx5_alloc_shared_ibctx() allocates Protection Domain using
verbs API, as part of shared IB device context.
This patch adds reading and storing of pdn value from the created PD
object, using DV API.
The pdn value is required when creating WQ using DevX API.

This patch also updates function flow_dv_create_counter_stat_mem_mng()
which uses the pdn value as well.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
84537d3c35 net/mlx5: update queue state modify for DevX
Function mlx5_queue_state_modify_primary() was implemented to handle
state change for queues created using Verbs API.

This patch update function mlx5_queue_state_modify_primary() to
support state change of RQ object created using DevX API.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
23820a7989 net/mlx5: rename hash RxQ verbs to general
Prepare for introducing use of DevX TIR object.
Hash Rx queue is currently created using verbs QP only.
The next patches will add the option to create it with a TIR object
using DevX.
This patch renames hrxq_ibv to hrxq wherever relevant, and adds
the DevX items to relevant structs.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
15c80a126d net/mlx5: rename verbs indirection table to obj
Prepare for introducing of DevX RQT object.
Rx indirection table object is currently created using verbs only.
The next patches will add the option to create an RQT object using
DevX.
This patch renames ind_table_ibv to ind_table_obj wherever relevant,
and adds the DevX items to relevant structs.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
93403560ba net/mlx5: rename RxQ verbs to general RxQ object
Prepare for introducing of DevX RxQ object.
RxQ object is currently created using verbs only.
The next patches will add the option to create RxQ object using DevX.
This patch renames rxq_ibv to rxq_obj wherever relevant, and adds the
DevX items to relevant structs.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
21cae8580f net/mlx5: allocate door-bells via DevX
When using DevX API, memory for door-bell records should be allocated
by PMD and registered using DevX API.

This patch implements the utility functions to support it:
- Add struct mlx5_devx_dbr_page, containing door-bells page data.
- Add list of struct mlx5_devx_dbr_page door-bell pages to device
  private data.
- Implement function mlx5_alloc_dbr_page() to allocate page for
  door-bell records, and register it using DevX API.
- Implement function mlx5_get_dbr(). to acquire a door-bell record
  from the door-bells page, allocating a new page if needed.
- Implement function mlx5_release_dbr() to release a door-bell
  record that is no longer needed, freeing the containing page if
  it becomes empty.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
718d166e55 net/mlx5: create advanced RxQ table via DevX
Implement function mlx5_devx_cmd_create_rqt() to create RQT
object using DevX API.
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
c3aea272ee net/mlx5: create advanced Rx object via DevX
Implement function mlx5_devx_cmd_create_tir() to create TIR
object using DevX API..
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
5bf204fef1 net/mlx5: modify advanced RxQ object via DevX
Implement function mlx5_devx_cmd_modify_rq() to modify RQ.
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
86fc67fc93 net/mlx5: create advanced RxQ object via DevX
Implement function mlx5_devx_cmd_create_rq() to create RQ object using
DevX API.
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
8791ff42ef net/mlx5: update Tx queue create for LRO
Update function mlx5_txq_ibv_new(), query and store the TIS
transport domain value.
It is required later on Rx side when creating matching TIR.
Add field in mlx5 data structure to store Transport Domain ID.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
09f78f2144 net/mlx5: support Tx interface query via DevX
Implement function mlx5_devx_cmd_qp_tis_td_query(), to query
QP TIS Transport Domain value.

Add related structs in mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
175f1c21d0 net/mlx5: check conditions to enable LRO
Use DevX API to read device LRO capabilities.
Check if LRO is supported and can be enabled.
Check if MPRQ is supported and can be used.
Enable MPRQ for LRO use if not enabled by user.
Added note for mlx5_mprq_enabled(), to emphasize that LRO
enables MPRQ.
Disable CQE compression and CRC stripping if LRO is enabled.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
3075bd23e3 net/mlx5: add glue for create action via DevX
Add compile option HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR, and matching
dest_tir flag in device configuration structure.
Add glue function pointer dv_create_flow_action_dest_devx_tir, and
function mlx5_glue_dv_create_flow_action_dest_devx_tir(),
to invoke API mlx5dv_dr_action_create_dest_devx_tir();

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
62d6f70f30 net/mlx5: add glue for queue query via DevX
Add function mlx5_glue_devx_qp_query().
Add glue function pointer devx_qp_query to run it.
Glue version updated to 19.08.0.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
d9102b3312 net/mlx5: query LRO capabilities via DevX
Update function mlx5_devx_cmd_query_hca_attr() to query HCA
capabilities related to LRO.

Add relevant structs in drivers/net/mlx5/mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
21bb6c7e62 net/mlx5: introduce LRO
Add command-line argument to set LRO session timeout.
Add LRO settings struct in PMD configuration struct.
Add support of LRO offload in port configuration.
Add macros and function to check if LRO is supported and enabled.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
970eb58c47 net/mlx5: remove redundant item from union
A variable of type struct ibv_cq_ex is declared in 2 unions, but
isn't used.
This patch removes the 2 redundant declarations.

Fixes: 6218063b39 ("net/mlx5: refactor Rx data path")
Fixes: 1d88ba1719 ("net/mlx5: refactor Tx data path")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
ff45f462b8 net/mlx5: revert Netlink socket sharing
This reverts commit e28111ac98.
The netlink requests are replaced by ifindex caching and
not needed anymore.

Fixes: e28111ac98 ("net/mlx5: fix master device Netlink socket sharing")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
fa2e14d492 net/mlx5: cache associated network device index
The associated device index is retrieved via Netlink request to
underlying Infiniband device driver. This network device index
is permanent throughout the lifetime of device. We do not
spawn the rte_eth_dev ports without associated network device, and
if network device is being unbound we get the remove notification
message and rte_eth_dev port is also detached. So, we may store
the ifindex in mlx5_device_spawn() routine at rte_eth_dev port
creation and initialization time and use the cached value further
instead of doing actual Netlink request.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
cb9cb61e54 net/mlx5: report max number of mbuf segments
This patch fills the tx_desc_lim.nb_seg_max and
tx_desc_lim.nb_mtu_seg_max fields of rte_eth_dev_info
structure to report thee maximal number of packet
segments, requested inline data configuration is
taken into account in conservative way.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
18a1c20044 net/mlx5: implement Tx burst template
This patch adds the implementation of tx_burst routine template.
The template supports all Tx offloads and multiple optimized
tx_burst routines can be generated by compiler from this one.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
eb8121ab9d net/mlx5: introduce Tx burst routine template
Mellanox NICs support the wide set of Tx offloads. The supported
offloads are reported by the mlx5 PMD in rte_eth_dev_info
tx_offload_capa field.
An application may choose any combination of supported offloads
and configure the device appropriately. Some of Tx offloads may be
not requested by application, or ever all of them may be omitted.
Most of the Tx offloads require some code branches in tx_burst routine
to support ones. If Tx offload is not requested the tx_burst routine
code may be significantly simplified and consume less CPU cycles.

For example, if application does not engage TSO offload this code
can be omitted, if multi-segment packet is not supposed the tx_burst
may assume single mbuf packets only, etc.

Currently, the mlx5 PMD implements multiple tx_burst subroutines
for most common combinations of requested Tx offloads, each branch
has its own dedicated implementation. It is not very easy to update,
support and develop such kind of code - multiple branches impose
the multiple points to process. Also many of frequently requested
offload combinations are not supported yet. That leads to selecting of
not completely matching tx_burst routine and harms the performance.

This patch introduces the new approach for tx_burst code. It is proposed
to develop the unified template for tx_burst routine, which supports
all the Tx offloads and takes the compile time defined parameter
describing the supposed set of supported offloads. On the base
of this template, the compiler is able to generate multiple tx_burst
routines highly optimized for the statically specified set of
Tx offloads.
Next, in runtime, at Tx queue configuration the best matching optimized
implementation of tx_burst is chosen.

This patch intentionally omits the template internal implementation,
but just introduces the template itself to emboss the approach of
the multiple specially tuned tx_burst routines.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
38b4b397a5 net/mlx5: add Tx configuration and setup
This patch updates the Tx datapath control and configuration
structures and code for managing Tx datapath settings.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
ed4470c70f net/mlx5: extend NIC attributes query via DevX
This patch extends the NIC attributes query via DevX.
The appropriate interface structures are borrowed from
kernel driver headers and DevX calls are added to
mlx5_devx_cmd_query_hca_attr() routine.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
50724e1bba net/mlx5: update Tx definitions
This patch updates Tx datapath definitions, mostly hardware related.
The Tx descriptor structures are redefined with required fields,
size definitions are renamed to reflect the meanings in more
appropriate way. This is a preparation step before introducing
the new Tx datapath implementation.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
505f1fe426 net/mlx5: add Tx devargs
This patch introduces new mlx5 PMD devarg options:

- txq_inline_min - specifies minimal amount of data to be inlined into
  WQE during Tx operations. NICs may require this minimal data amount
  to operate correctly. The exact value may depend on NIC operation
  mode, requested offloads, etc.

- txq_inline_max - specifies the maximal packet length to be completely
  inlined into WQE Ethernet Segment for ordinary SEND method. If packet
  is larger the specified value, the packet data won't be copied by the
  driver at all, data buffer is addressed with a pointer. If packet
  length is less or equal all packet data will be copied into WQE.

- txq_inline_mpw - specifies the maximal packet length to be completely
  inlined into WQE for Enhanced MPW method.

Driver documentation is also updated.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko
a6bd4911ad net/mlx5: remove Tx implementation
This patch removes the existing Tx datapath code
as preparation step before introducing the new
implementation. The following entities are being
removed:

- deprecated devargs support
- tx_burst() routines
- related PRM definitions
- SQ configuration code
- Tx routine selection code
- incompatible Tx completion code

The following devargs are deprecated and ignored:
- "txq_inline" is going to be converted to "txq_inline_max"
  for compatibility issue
- "tx_vec_en"
- "txqs_max_vec"
- "txq_mpw_hdr_dseg_en"
- "txq_max_inline_len" is going to be converted
  to "txq_inline_mpw" for compatibility issue

The deprecated devarg keys are recognized by PMD
and ignored/converted to the new ones in order not
to block device probing.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
42280dd91b net/mlx5: fix typos in comments
Some spelling mistakes were found in comments.
This patch fixes them.

Fixes: d10b09db0a ("net/mlx5: fix allocation when no memory on device NUMA node")
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Fixes: f6d9ab4e76 ("net/mlx5: check Tx queue size overflow")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Dekel Peled
20367e45f2 net/mlx5: fix flow item flags bitmap
In functions flow_dv_translate() and  flow_dv_validate(), the flow
items are scanned and each item is marked in item_flags bitmap.
The code handling some of the items was ported from another project,
where items are marked in a slightly different manner.

This patch fixes the setting of items in bitmap, adapting it to the
required manner.

Fixes: d53aa89aea ("net/mlx5: support matching on ICMP/ICMP6")
Fixes: 5865955ad994 ("net/mlx5: match GRE key and present bits")
Fixes: 2e4c987aad ("net/mlx5: validate Direct Rule E-Switch")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2019-07-23 14:31:36 +02:00
Matan Azrad
31538ef62c net/mlx5: allow basic counter management fallback
In case the asynchronous devx commands are not supported in RDMA core
fallback to use a basic counter management.

Here, the PMD counters cashe is redundant and the host thread doesn't
update it. hence, each counter operation will go to the FW and the
acceleration reduces.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Matan Azrad
f15db67df0 net/mlx5: accelerate DV flow counter query
All the DV counters are cashed in the PMD memory and are contained in
pools which are contained in containers according to the counters
allocation type - batch or single.

Currently, the flow counter query is done synchronously in pool
resolution means that on the user request a FW command is triggered to
read all the counters in the pool.

A new feature of devX to asynchronously read batch of flow counters
allows to accelerate the user query operation.

Using the DPDK host thread, the PMD periodically triggers asynchronous
query in pool resolution for all the counter pools and an interrupt is
triggered by the FW when the values are updated.
In the interrupt handler the pool counter values raw data is replaced
using a double buffer algorithm (very fast).
In the user query, the PMD just returns the last query values from the
PMD cache - no system-calls and FW commands are triggered from the user
control thread on query operation!

More synchronization is added with the host thread:
        Container resize uses double buffer algorithm.
        Pools growing in container uses atomic operation.
        Pool query buffer replace uses a spinlock.
        Pool minimum devX counter ID uses atomic operation.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Matan Azrad
ebbac312e4 net/mlx5: resize a full counter container
When the counter countainer has no more space to store more counter
pools try to resize the container to allow more pools to be created.

So, the only limitation for the maximum counter number is the memory.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Matan Azrad
5382d28c21 net/mlx5: accelerate DV flow counter transactions
The DevX interface exposes a new feature to the PMD that can allocate a
batch of counters by one FW command. It can improve the flow
transaction rate (with count action).

Add a new counter pools mechanism to manage HW counters in the PMD.
So, for each flow with counter creation the PMD will try to find a free
counter in the PMD pools container and only if there is no a free
counter, it will allocate a new DevX batch counters.

Currently we cannot support batch counter for a group 0 flow, so
create a 2 container types, one which allocates counters one by
one and one which allocates X counters by the batch feature.

The allocated counters objects are never released back to the HW
assuming the flows maximum number will be close to the actual value of
the flows number.
Later, it can be updated, and dynamic release mechanism can be added.

The counters are contained in pools, each pool with 512 counters.
The pools are contained in counter containers according to the
allocation resolution type - single or batch.
The cache memory of the counters statistics is saved as raw data per
pool.
All the raw data memory is allocated for all the container in one
memory allocation and is managed by counter_stats_mem_mng structure
which registers all the raw memory to the HW.
Each pool points to one raw data structure.

The query operation is in pool resolution which updates all the pool
counter raw data by one operation.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23 14:31:35 +02:00
Xiaoyu Min
5e33bebdd8 net/mlx5: support IP-in-IP tunnel
Enabled IP-in-IP tunnel type support on DV/DR flow engine.
This includes the following combination:
 - IPv4 over IPv4
 - IPv4 over IPv6
 - IPv6 over IPv4
 - IPv6 over IPv6

MLX5 NIC supports IP-in-IP tunnel via FLEX Parser so
need to make sure fw using FLEX Paser profile 0.

  mlxconfig -d <mst device> -y set FLEX_PARSER_PROFILE_ENABLE=0

The example testpmd commands would be:

- Match on IPv4 over IPv4 packets and do inner RSS:

  testpmd> flow create 0 ingress pattern eth / ipv4 proto is 0x04 /
           ipv4 / udp / end actions rss level 2 queues 0 1 2 3 end / end

- Match on IPv6 over IPv4 packets and do inner RSS:

  testpmd> flow create 0 ingress pattern eth / ipv4 proto is 0x29 /
           ipv6 / udp / end actions rss level 2 queues 0 1 2 3 end / end

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:34 +02:00
Xiaoyu Min
a7a0365565 net/mlx5: match GRE key and present bits
Support matching on the present bits (C,K,S)
as well as the optional key field.

If the rte_flow_item_gre_key is specified in pattern,
it will set K present match automatically.

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:34 +02:00
Xiaoyu Min
9f8dee4bcb net/mlx5: support match GRE protocol on DR engine
DR engine support matching on GRE protocol field without MPLS supports.
So bypassing the MPLS check when DR is enabled.

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:34 +02:00
David Marchand
b76fafb174 eal: fix IOVA mode selection as VA for PCI drivers
The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA which
was intended to mean "driver only supports VA" but had been understood
as "driver supports both PA and VA" by most net drivers and used to let
dpdk processes to run as non root (which do not have access to physical
addresses on recent kernels).

The check on physical addresses actually closed the gap for those
drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this
flag can retain its intended meaning.
Document explicitly its meaning.

We can check that a driver requirement wrt to IOVA mode is fulfilled
before trying to probe a device.

Finally, document the heuristic used to select the IOVA mode and hope
that we won't break it again.

Fixes: 703458e19c ("bus/pci: consider only usable devices for IOVA mode")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-07-22 17:45:52 +02:00
Xiaoyu Min
d53aa89aea net/mlx5: support matching on ICMP/ICMP6
On DV/DR flow engine, MLX5 can match on ICMP/ICMP6's code and type field
via FLEX Parser, which can be enabled by config FW using FLEX Parser
profile 2:

mlxconfig -d <mst device> -y set FLEX_PARSER_PROFILE_ENABLE=2

The testpmd commands could be:

  testpmd> flow create 0 ingress pattern eth / ipv4 /
           icmp type is 8 code is 0 / end
	   actions rss queues 0 1 end / end

  testpmd> flow create 0 ingress pattern  eth / ipv6 /
           icmp6 type is 128 code is 0 / end
	   actions rss queues 0 1 end / end

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-08 21:26:52 +02:00
Eli Britstein
bf1d7d9a03 net/mlx5: zero out UDP checksum in encapsulation
Mellanox NICs do not support UDP checksum hardware tx offload over IPv6.
This limitation becomes critical for UDP based tunnels like VXLAN.
Beside the UDP checksum validity is required by IPv6 there is an option
in Linux to allow accepting UDP zero sum (see udp6zerocsumrx in iproute2
package).

This patch zeroes out the UDP checksum field for encapsulation headers
in raw encap action.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-08 21:26:52 +02:00