Commit Graph

16047 Commits

Author SHA1 Message Date
Ting Xu
0fff4ae4a9 net/ice: fix L3 RSS with IPv6 fragment
Since the header type of IPv6 fragment is wrong, the L3 dst/src RSS hash
fields cannot work properly. This patch changed the header type from any
to outer.

Fixes: f1ea76eb63 ("net/ice: support RSS hash for IP fragment")
Cc: stable@dpdk.org

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-22 01:55:48 +02:00
Ting Xu
775a25a372 net/ice: clear QoS bandwidth on DCF close
When closing DCF, the bandwidth limit configured for VFs by DCF is not
cleared correctly. The configuration will still take effect when DCF starts
again, if VFs are not re-allocated. This patch cleared VFs bandwidth limit
when DCF closes, and DCF needs to re-configure bandwidth for VFs when it
starts next time.

Fixes: 3a6bfc37ea ("net/ice: support QoS config VF bandwidth in DCF")

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-22 01:55:44 +02:00
Liang Ma
da7a5c1406 net/mlx5: export PMD-specific API file
The file rte_pmd_mlx5.h should be exported by Meson.

Fixes: efa79e68c8 ("net/mlx5: support fine grain dynamic flag")
Fixes: 23f627e0ed ("net/mlx5: add flow sync API")
Cc: stable@dpdk.org

Signed-off-by: Liang Ma <liangma@bytedance.com>
2021-07-22 17:23:26 +02:00
Lior Margalit
4e5ba38d56 net/mlx5: reject inner ethernet matching in GTP
The user is able to create a flow rule pattern with ETH after GTP
although it is not supported by the flex-parser configuration.

Failed the rule validation in such case with proper error message.

Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org

Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 17:13:09 +02:00
Lior Margalit
3e455a97dc net/mlx5: fix RSS expansion for GTP
The flow did not expand correctly when it included a GTP item.

Added GTP node to the expansion graph as possible next node
after IPv4/IPv6 UDP node.

Fixes: 592f05b29a ("net/mlx5: add RSS flow action")
Cc: stable@dpdk.org

Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 16:55:19 +02:00
Xueming Li
92d16c83a7 net/mlx5: fix SF representor probing in isolate mode
Representor failed to probe in isolated mode due to callback of
retrieving representor info missing. This patch adds it back.

Fixes: cb95feefdd ("net/mlx5: support sub-function representor")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 16:53:26 +02:00
Viacheslav Ovsiienko
9f430dd751 net/mlx5: fix RoCE LAG bond device probing
The RoCE LAG bond device requires neither E-Switch nor SR-IOV
configurations. It means the RoCE LAG bond device might be
presented as a single port Infiniband device.

The mlx5 PMD wrongly recognized standalone RoCE LAG bond device
as E-Switch configuration, this triggered the calls of E-Switch
ports related API and the latter failed (over the new OFED kernel
driver, starting since 5.4.1), causing the overall device probe
failure.

If there is a single port Infiniband bond device found the
E-Switch related flags must be cleared indicating standalone
configuration.

Also, it is not true anymore the bond device can exist
over E-Switch configurations only (as it was claimed for VF LAG
bond devices). The related checks are not relevant anymore
and removed.

Fixes: 790164ce1d ("net/mlx5: check kernel support for VF LAG bonding")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 16:43:49 +02:00
Alexander Kozyrev
39f0df9b6d net/mlx5: reject copy to mark via modify action
The Mark action is a two-stage process in the Mellanox driver.
First, a hardware register is filled with the required value,
then this value is registered in the software resource table.

The MODIFY_FIELD action can instruct a Mellanox NIC to copy
some value from an arbitrary packet header field into the
hardware register, associated with the Mark item. But there
is no way NIC can modify the software resource table as well.

Due to these driver limitations the copying of arbitrary value
to the MARK can not be supported and should be rejected in the
MODIFY_FIELD action.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 16:26:41 +02:00
Alexander Kozyrev
6d5735c1cb net/mlx5: fix meta register conversion for extensive mode
Register C is used in the extensive metadata mode number 1 and its
width can vary from 0 to 32 bits depending on the kernel usage of it.

There are several issues associated with this mode (dv_xmeta_en=1):
1. The metadata setting assumes that the width is always 16 bits,
which is the most common case in this mode. Use the proper mask.
2. The same is true for the modify_field Flow API. 16-bits width
is hardcoded for dv_xmeta_en=1. Switch to the register C mask width.
3. Metadata is stored in the most significant bits in CQE in this
mode because the registers copy code was not updated during the
metadata conversion to the big-endian format. Update this code to
avoid shifting the metadata in the datapath.

Fixes: b57e414b48 ("net/mlx5: convert meta register to big-endian")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 16:24:56 +02:00
Suanming Mou
89a4bcb1fc net/mlx5: fix indexed pools allocation on Windows
Currently, the flow indexed pools are allocated per port,
the allocation was missing in Windows code.

Allocate indexed pool for the Windows case too.

Fixes: b4edeaf3ef ("net/mlx5: replace flow list with indexed pool")

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Odi Assli <odia@nvidia.com>
2021-07-22 16:16:29 +02:00
Dmitry Kozlyuk
b7c8ea62d0 net/mlx5: fix indirect action modify rollback
mlx5_ind_table_obj_modify() first references queues from the new list,
then applies the new list to HW. In case of apply failure the function
dereferenced queues from the old list, while it should be the new list.

Fixes: fa7ad49e96 ("net/mlx5: fix shared RSS action update")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 15:55:48 +02:00
Dmitry Kozlyuk
94e257ec8c net/mlx5: fix Rx/Tx queue checks
When device configuration was interrupted by a signal,
mlx5_rxq/txq_release() could access yet unitinialized array
and crash the application. Add checks whether queue array
is initialized.

Fixes: a1366b1a2b ("net/mlx5: add reference counter on DPDK Rx queues")
Fixes: 6e78005a9b ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 15:49:33 +02:00
Dong Zhou
96f85ec489 net/mlx5: check VLAN push/pop support
For ConnectX-6 in FDB domain, pop and push VLAN
on both ingress and egress directions are supported.

For ConnectX-6 in NIC domain, and ConnectX-5 in both FWD and NIC domain,
pop VLAN is only supported on ingress direction,
push VLAN is only supported on egress direction.

Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 15:40:01 +02:00
Michael Baum
34c84ebbbc regex/mlx5: fix redundancy in device removal
In the removal function, PMD releases all driver resources and
cancels the regexdev registry.

However, regexdev registration is accidentally canceled twice.
Remove one of them.

Fixes: b34d816363 ("regex/mlx5: support rules import")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2021-07-22 15:19:37 +02:00
Michael Baum
a1fcde8c80 regex/mlx5: fix leak on device removal
In the removal function, PMD releases all driver resources allocated
in the probe function.

The MR btree memory is allocated in the probe function, but it is not
freed in remove function what caused a memory leak.

Release it.

Fixes: cda883bbb6 ("regex/mlx5: add dynamic memory registration to datapath")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2021-07-22 15:19:31 +02:00
Michael Baum
29ca3215f3 regex/mlx5: fix memory region unregistration
The issue can cause illegal physical address access while a huge-page A
is released and huge-page B is allocated on the same virtual address.
The old MR can be matched using the virtual address of huge-page B but
the HW will access the physical address of huge-page A which is no more
part of the DPDK process.

Register a driver callback for memory event in order to free out all the
MRs of memory that is going to be freed from the DPDK process.

Fixes: cda883bbb6 ("regex/mlx5: add dynamic memory registration to datapath")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2021-07-22 15:19:30 +02:00
Michael Baum
2fec07edd4 net/mlx5: fix overflow in mempool argument
The mlx5_mprq_alloc_mp function makes shifting to the numeric constant
1, for sending it as a parameter to rte_mempool_create function.

The rte_mempool_create function expects to get void pointer (uintptr_t,
might be 64-bit) and instead gets a 32-bit variable, because the
numeric constant size is a 32-bit.
In case the shift is greater than 32 the variable might lose its value
even though the function might get 64-bit argument.

Change the size of the numeric constant 1 to uintptr_t.

Fixes: 3a22f3877c ("net/mlx5: replace external mbuf shared memory")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 14:48:47 +02:00
Michael Baum
c6b552e4c0 vdpa/mlx5: fix overflow in queue attribute
The mlx5_vdpa_event_qp_create function makes shifting to the numeric
constant 1, then multiplies it by another constant and finally assigns
it into a uint64_t variable.

The numeric constant type is an int with a 32-bit sign. if after
shifting , its MSB (bit of sign) will change, the uint64 variable will
get into it a different value than what the function intended it to get.

Set the numeric constant 1 to be uint64_t in the first place.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 14:48:07 +02:00
Michael Baum
c87bc83a33 compress/mlx5: fix overflow in queue size
The mlx5_compress_qp_setup function makes shifting to the numeric
constant 1, then sends it as a parameter to rte_calloc function.

The rte_calloc function expects to get size_t (might be 64 bit) and
instead gets a 32-bit variable, because the numeric constant size is a
32-bit.
In case the shift is greater than 32 bit and it 64-system, the variable
will lose its value even though the function can get 64-bit argument.

Change the size of the numeric constant 1 to size_t.

Fixes: 8619fcd516 ("compress/mlx5: support queue pair operations")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 14:47:32 +02:00
Michael Baum
423719a367 regex/mlx5: fix size of setup constants
The constant representing the size of the metadata is defined as an
unsigned int variable with 32-bit.
Similarly the constant representing the maximal output is also defined
as an unsigned int variable with 32-bit.

There is potentially overflowing expression when those constants are
evaluated using 32-bit arithmetic, and then used in a context that
expects an expression of type size_t that might be 64-bit.

Change the size of the above constants to size_t.

Fixes: 30d604bb15 ("regex/mlx5: fix type of setup constants")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 14:47:10 +02:00
Bing Zhao
33a7493c8d net/mlx5: support meter for trTCM profiles
The support of RFC2698 and RFC4115 are added in mlx5 PMD. Only the
ASO metering supports these two profiles.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:29:01 +02:00
Bing Zhao
4d648fad90 net/mlx5: check consistency of meter policy and profile
In the previous implementation, only green color policy was
supported in mlx5 PMD. Since yellow color policy is supported now,
the consistency of meter policy and profile should be checked.
  1. If the profile supports yellow but the policy doesn't, an error
     should be returned when creating the meter. Or else, there is
     no explicit steering action for the packets marked with yellow.
  2. If the policy supports yellow but the profile doesn't, it will
     be considered as a valid case. Even if no packet will be
     handled with the yellow steering action, it is just like that
     only the green policy presents.

Usually the green color is supported by default, but when it is
disabled intentionally with setting the CBS to a small value like
zero in the profile, the similar checking on green policy and
profile should also be done.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:28:57 +02:00
Bing Zhao
4b7bf3ffb4 net/mlx5: support yellow in meter policy validation
In the previous implementation, the policy for yellow color was not
supported. The action validation for yellow was skipped.

Since the yellow color policy needs to be supported, the validation
should also be done for the yellow color. In the meanwhile, due to
the fact that color policies of one meter should be used for the
same flow(s), the domains supported of both colors should be the
same. If both of the colors have RSS as the termination actions,
except the queues, all other parameters of RSS should be the same.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:28:54 +02:00
Bing Zhao
b38a12272b net/mlx5: split meter color policy handling
If the fate action is either RSS or Queue of a meter policy, the
action will only be created in the flow splitting stage. With queue
as the fate action, only one sub-policy is needed. And RSS will
have more than one sub-policies if there is an expansion.

Since the RSS parameters are the same for both green and yellow
colors except the queues, the expansion result will be unique.
Even if only one color has the RSS action, the checking and possible
expansion will be done then. For each sub-policy, the action rules
need to be created separately on its own policy table.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:28:50 +02:00
Bing Zhao
fa31a5ed01 net/mlx5: support yellow meter policy rules
When creating a meter policy, both / either of the action rules for
green and yellow colors may be provided. After validation, usually
the actions are created before the meter is using by a flow rule.

If there is action specified for the yellow color, the action rules
should be created together with green color in the same time. The
action of green / yellow color can be empty, then the default
behavior is the jump action of the rule, just the same as that of
the default policy.

If the fate action of either one color is queue / RSS, all the
actions rules will be created on the flow splitting stage instead of
the policy adding stage.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:28:47 +02:00
Bing Zhao
9b5463df5d net/mlx5: enable meter bucket overflow for yellow color
To support the meter policy for yellow action, the prerequisite is
that the hardware needs to support the EBS, as defined in the
RFC2697.
  https://datatracker.ietf.org/doc/html/rfc2697
Then some of the packets can be marked as yellow if the tokens of C
bucket is not enough but enough in E bucket. The color could be used
for the further steering of the packets.

In the current implementation EBS and overflow were ignored when
creating a meter profile. With this commit, if EBS is set by the
application, the generation of yellow color will be enabled in the
hardware for flow rules steering of packets.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:28:43 +02:00
Bing Zhao
363db9b00f net/mlx5: handle yellow case in default meter policy
In order to support the yellow color for the default meter policy,
the default policy action for yellow should be created together
with the green policy.

The default policy action for yellow action is the same as that for
green. In the same table, the same matcher will be reused for yellow
and the destination group will be the same.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-22 13:28:40 +02:00
Xueming Li
d3c521265e common/mlx5: remove legacy PCI driver
Clean up legacy PCI bus driver since all mlx5 PMDs are moved
to the new bus-agnostic driver interface.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:16:47 +02:00
Xueming Li
37d3bde48b crypto/mlx5: migrate to bus-agnostic common interface
To support auxiliary bus, upgrade the driver to use mlx5 common driver
structure.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
82242186c3 compress/mlx5: migrate to bus-agnostic common interface
To support auxiliary bus, upgrade the driver to use mlx5 common driver
structure.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Thomas Monjalon
cf8a8a8f48 vdpa/mlx5: support Sub-Function
RoCE disabling requirement is based on PCI address.
In order to support Sub-Function, a conversion is needed
in the case of an auxiliary device.

SF device can be probed with such devargs string:
  auxiliary:mlx5_core.sf.<id>,class=vdpa

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Thomas Monjalon
d599bf8209 vdpa/mlx5: migrate to bus-agnostic common interface
Replace PCI-specific handling with bus-agnostic structures.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Thomas Monjalon
bb060bb545 vdpa/mlx5: define driver name as macro
Use a macro for the PMD driver name.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
0564ddead3 regex/mlx5: migrate to bus-agnostic common interface
To support auxiliary bus, upgrades driver to use mlx5 common driver
structure.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
cdfdb82d0b net/mlx5: check maximum Verbs port number
Verbs API doesn't support device port number larger than 255 by design.
Add check and fail probing with proper error log.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
919488fbfa net/mlx5: support Sub-Function
Introduce SF support.
Similar to VF, SF on auxiliary bus is a portion of hardware PF,
no representor or bonding parameters for SF.

Devargs to support SF:
-a auxiliary:mlx5_core.sf.8,dv_flow_en=1

New global syntax to support SF:
-a bus=auxiliary,name=mlx5_core.sf.8/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
a7f34989e9 net/mlx5: migrate to bus-agnostic common interface
To support SubFunction based on auxiliary bus, common driver supports
new bus-agnostic driver.

This patch migrates net driver to new common driver.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
56bb3c84e9 net/mlx5: reduce PCI dependency
To support more bus types, remove PCI dependency where possible.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Thomas Monjalon
4d567938be common/mlx5: get PCI device address from any bus
A function is exported to allow retrieving the PCI address
of the parent PCI device of a Sub-Function in auxiliary bus sysfs.
The function mlx5_dev_to_pci_str() is accepting both PCI and auxiliary
devices. In case of a PCI device, it is simply using the device name.

The function mlx5_dev_to_pci_addr(), which is based on sysfs path
and do not use any device object, is renamed to mlx5_get_pci_addr()
for clarity purpose.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
777b72a933 common/mlx5: support auxiliary bus
Add auxiliary bus support for Sub-Function.

As a limitation of current driver, NUMA node of device is detected
from PCI bus of device sysfs symbol link.
It will be removed once NUMA node file will be available in sysfs.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Thomas Monjalon
67350881e0 common/mlx5: move description of PCI sysfs functions
The Linux-specific functions mlx5_get_pci_addr() and
mlx5_get_ifname_sysfs() are better described in the .h file.

The requirement for using mlx5_get_pci_addr() is made explicit:
the node /device must exist in the provided sysfs path.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Xueming Li
ad435d3204 common/mlx5: add bus-agnostic layer
To support auxiliary bus, introduces common device driver and callbacks,
supposed to replace mlx5 common PCI bus driver.

Mlx5 class drivers, i.e. eth, vDPA, regex and compress normally consumes
single Verbs device context to probe a device. The Verbs device comes
from PCI address if the device is PCI bus device, from Auxiliary sysfs
if the device is auxiliary bus device. Currently only PCI bus is
supported.

Common device driver is a middle layer between mlx5 class drivers and
bus, resolve and abstract bus info to Verbs device for class drivers.
Both PCI bus driver and Auxiliary bus driver can utilize the common
driver layer to cast bus operations to mlx5 class drivers.

Legacy mlx5 common PCI bus driver still being used by mlx5 eth, vDPA,
regex and compress PMD, will be removed once all PMD drivers
migrate to new common driver.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:11 +02:00
Xueming Li
a99f2f9054 common/mlx5: rename ethernet device class
To align with EAL class driver, rename internal class name
from "net" to "eth"

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-21 22:59:03 +02:00
Ivan Ilchenko
4e8169eb0d net/virtio: fix Rx scatter offload
Report Rx scatter offload capability depending on VIRTIO_NET_F_MRG_RXBUF.

If Rx scatter is not requested, ensure that provided Rx buffers on
each Rx queue are big enough to fit Rx packets up to configured MTU.

Fixes: ce17eddefc ("ethdev: introduce Rx queue offloads API")
Cc: stable@dpdk.org

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Cheng Jiang
2d91b28730 net/virtio: fix refill order in packed ring datapath
The front-end should refill the descriptor with the mbuf indicated by
the buff_id rather then the index of used descriptor. Back-end may
return buffers out of order if async copy mode is enabled.

When initializing rxq, refill the descriptors in order as buff_id is
not available at that time.

Fixes: a76290c8f1 ("net/virtio: implement Rx path for packed queues")
Cc: stable@dpdk.org

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Michael Shamis
e0f89d5e1f crypto/mvsam: support IPsec offload
This patch provides the support for IPsec protocol
offload to the hardware.
Following security operations are added:
- session_create
- session_destroy
- capabilities_get

Signed-off-by: Michael Shamis <michaelsh@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Tested-by: Liron Himi <lironh@marvell.com>
2021-07-21 15:08:52 +02:00
Suanming Mou
9dfc2d6fda crypto/mlx5: support statistics operations
This commit adds mlx5 crypto statistic get and reset operations.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Suanming Mou
8e196c08ab crypto/mlx5: support enqueue/dequeue operations
The crypto operations are done with the WQE set which contains
one UMR WQE and one rdma write WQE. Most segments of the WQE
set are initialized properly during queue setup, only limited
segments are initialized according to the crypto detail in the
datapath process.

This commit adds the enqueue and dequeue operations and updates
the WQE set segments accordingly.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Suanming Mou
c2a42d19d9 crypto/mlx5: add WQE set initialization
Currently, HW handles the WQEs much faster than the software,
Using the constant WQE set layout can initialize most of the WQE
segments in advanced, and software only needs to configure very
limited segments in datapath. This accelerates the software WQE
organize in datapath.

This commit initializes the fixed WQE set segments.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Suanming Mou
a1978aa23b crypto/mlx5: add maximum segments configuration
The mlx5 HW crypto operations are done by attaching crypto property
to a memory region. Once done, every access to the memory via the
crypto-enabled memory region will result with in-line encryption or
decryption of the data.

As a result, the design choice is to provide two types of WQEs. One
is UMR WQE which sets the crypto property and the other is rdma write
WQE which sends DMA command to copy data from local MR to remote MR.

The size of the WQEs will be defined by a new devarg called
max_segs_num.

This devarg also defines the maximum segments in mbuf chain that will be
supported for crypto operations.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Suanming Mou
e8db4413cb crypto/mlx5: add keytag configuration
A keytag is a piece of data encrypted together with a DEK.

When a DEK is referenced by an MKEY.bsf through its index, the keytag is
also supplied in the BSF as plaintext. The HW will decrypt the DEK (and
the attached keytag) and will fail the operation if the keytags don't
match.

This commit adds the configuration of the keytag with devargs.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Shiri Kuzin
debb27ea34 crypto/mlx5: create login object using DevX
To work with crypto engines that are marked with wrapped_import_method,
a login session is required.
A crypto login object needs to be created using DevX.

The crypto login object contains:
	- The credential pointer.
	- The import_KEK pointer to be used for all secured information
	  communicated in crypto commands (key fields), including the
	  provided credential in this command.
	- The credential secret, wrapped by the import_KEK indicated in
	  this command. Size includes 8 bytes IV for wrapping.

Added devargs for the required login values:
	- wcs_file - path to the file containing the credential.
	- import_kek_id - the import KEK pointer.
	- credential_id - the credential pointer.

Create the login DevX object in pci_probe function and destroy it in
pci_remove.
Destroying the crypto login object means logout.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Shiri Kuzin
247ad1305a crypto/mlx5: add memory region management
Mellanox user space drivers don't deal with physical addresses as part
of a memory protection mechanism.
The device translates the given virtual address to a physical address
using the given memory key as an address space identifier.
That's why any mbuf virtual address is moved directly to the HW
descriptor(WQE).

The mapping between the virtual address to the physical address is saved
in MR configured by the kernel to the HW.

Each MR has a key that should also be moved to the WQE by the SW.

When the SW sees an unmapped address, it extends the address range and
creates a MR using a system call.

Add memory region cache management:
	- 2 level cache per queue-pair - no locks.
	- 1 shared cache between all the queues using a lock.

Using this way, the MR key search per data-path address is optimized.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:27:00 +02:00
Shiri Kuzin
1004be3c03 crypto/mlx5: support session operations
Sessions are used in symmetric transformations in order to prepare
objects and data for packet processing stage.

A mlx5 session includes iv_offset, pointer to mlx5_crypto_dek struct,
bsf_size, bsf_p_type, block size index, encryption_order and encryption
standard.

Implement the next session operations:
        mlx5_crypto_sym_session_get_size- returns the size of the mlx5
	session struct.
	mlx5_crypto_sym_session_configure- prepares the DEK hash-list
	and saves all the session data.
	mlx5_crypto_sym_session_clear - destroys the DEK hash-list.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 22:26:43 +02:00
Shiri Kuzin
6152534e21 crypto/mlx5: support queue pairs operations
The HW queue pairs are a pair of send queue and receive queue of
independent work queues packed together in one object for the purpose
of transferring data between nodes of a network.

Completion Queue is a FIFO queue of completed work requests.

In crypto driver we use one QP in loopback in order to encrypt and
decrypt data locally without sending it to the wire.
In the configured QP we only use the SQ to perform the encryption and
decryption operations.

Added implementation for the QP setup function which creates the CQ,
creates the QP and changes its state to RTS (ready to send).

Added implementation for the release QP function to release all the QP
resources.

Added the ops structure that contains any operation which is supported
by the cryptodev.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 21:51:33 +02:00
Shiri Kuzin
90646d6c6e crypto/mlx5: support basic operations
The basic dev control operations are configure, close, start, stop and
get info.

Extended the existing support of configure and close:
	-mlx5_crypto_dev_configure- function used to configure device.
	-mlx5_crypto_dev_close-  function used to close a configured
	 device.
	-mlx5_crypto_dev_stop- function used to stop device.
	-mlx5_crypto_dev_start- function used to start device.
	-mlx5_crypto_dev_infos_get- function used to get info.

Added config struct to user private data with the fields socket id,
number of queue pairs and feature flags to be disabled.
Add the dev_start function that is used to start a configured device.
Add the dev_stop function that is used to stop a configured device.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 21:50:25 +02:00
Shiri Kuzin
586add6ef4 crypto/mlx5: add DEK object management
A DEK (Data encryption Key) is an mlx5 HW object which represents
the cipher algorithm key.
The DEKs are used during data encryption/decryption operations.

In symmetric algorithms like AES-XTS, we use the same DEK for both
encryption and decryption.

Use the mlx5 hash-list tool to manage the DEK objects in the PMD.

Provide the compare, create and destroy functions to manage DEKs in
hash-list and introduce an internal API to setup and unset the DEK
management and to prepare and destroy specific DEK object.

The DEK hash-list will be created in dev_configure routine and
destroyed in dev_close routine.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 21:48:32 +02:00
Shiri Kuzin
a7c86884f1 crypto/mlx5: introduce Mellanox crypto driver
Add a new PMD for Mellanox devices- crypto PMD.

The crypto PMD will be supported starting Nvidia ConnectX6 and
BlueField2.

The crypto PMD will add the support of encryption and decryption using
the AES-XTS symmetric algorithm.

The crypto PMD requires rdma-core and uses mlx5 DevX.

This patch adds the PCI probing, basic functions, build files and
log utility.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 21:45:58 +02:00
Srujana Challa
5c7704712d common/cnxk: support UDP encapsulation
Adds support for UDP encapsulation in crypto_cn10k
PMD.

Signed-off-by: Srujana Challa <schalla@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Anoob Joseph
760eedf38d crypto/cnxk: reset feature flags on reconfigure
Feature flag in dev would be updated during config.
On reconfigure, the field need to be set again to
original value.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Ruifeng Wang
ffb81dce5a compress/isal: support Arm platform
Isal compress PMD has build failures on Arm platform.

As dependent library ISA-L is supported on Arm platform,
support of the PMD is expanded to Arm architecture.
Fixed build failure caused by architecture specific code,
and made the PMD multi architecture compatible.

Bugzilla ID: 755
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-07-20 10:32:05 +02:00
Michael Baum
8c09010614 compress/mlx5: fix memory region unregistration
The issue can cause illegal physical address access while a huge-page A
is released and huge-page B is allocated on the same virtual address.
The old MR can be matched using the virtual address of huge-page B but
the HW will access the physical address of huge-page A which is no more
part of the DPDK process.

Register a driver callback for memory event in order to free out all the
MRs of memory that is going to be freed from the dpdk process.

Fixes: f8c97babc9 ("compress/mlx5: add data-path functions")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-20 10:32:05 +02:00
Anoob Joseph
db06451baf common/cpt: allocate auth key dynamically
Reduce session private data size by allocating
auth_key dynamically as required. Added auth_key_iova
to eliminate any impact on fastpath.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Anoob Joseph
252947b950 common/cnxk: allocate auth key dynamically
Reduce session private data size by allocating
auth_key dynamically as required.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Tejasree Kondoj
48c56b3294 crypto/octeontx2: fix lookaside IPsec IV pointer
In case of AES-GCM/CCM, nonce/salt comes along
with IV, hence can be copied in a single memcpy.
This patch fixes the IV copy in lookaside IPsec
outbound instruction.

Fixes: fab634eb87 ("crypto/octeontx2: support security session data path")
Cc: stable@dpdk.org

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Anoob Joseph
ee9b17ea58 net/octeontx2: clear SA valid during session destroy
SA table entry would be reserved for inline inbound operations. Clear
valid bit of the SA so that CPT would treat SA entry as invalid. Also,
move setting of valid bit to the end in case of session_create() to
eliminate possibility of hardware seeing partial data.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Anoob Joseph
87e1160c2c net/octeontx2: add lock for inline IPsec tables
Add locking for IPsec table updates.

Fixed error handling to clear SA entry if the SA
population functions encounters any error.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Anoob Joseph
40beec4bf4 crypto/octeontx2: fix IPsec session member overlap
The member 'dir' should not overlap with 'ip'. Usage of union for all
members would mean dir would get corrupt.

Fixes: e91b4f45ff ("net/octeontx2: support anti-replay for security session")
Cc: stable@dpdk.org

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Ankur Dwivedi
52008104e9 crypto/cnxk: update instruction queue in start/stop
The instruction queue is enabled in dev start and
is disabled in dev stop.

Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Ankur Dwivedi
3bf8783955 common/cnxk: move instruction queue enable to ROC
The code for enabling instruction queue is moved to ROC API.

Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Fan Zhang
328d690d2f crypto/qat: update raw data path
This commit updates the QAT raw data-path API to support the
changes made to device and sessions. The QAT RAW data-path API
now works on Generation 1-3 devices and is disabled on GEN4.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
960ff4d665 common/qat: add service discovery
This commit adds service discovery to generation four
of Intel QuickAssist Technology devices.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
c546d6e3d4 common/qat: reset ring pairs before setting GEN4
This commit resets ring pairs of particular vf before
setting PMD.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
b17d16fb47 common/qat: add PF to VF communication
Add communication between physical device and virtual function
in Intel QucikAssist Technology PMD.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
e4beb311d2 crypto/qat: support GMAC in GEN4 legacy mode
Add AES-GMAC algorithm in legacy mode to generation 4 devices.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
c2c1ccaec2 crypto/qat: add Chacha-Poly in UCS-SPC mode
This commit adds Chacha20-Poly1305 aglorithm that works
in UCS (Unified crypto slice) SPC(Single-Pass) mode.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
3e7a5a124d crypto/qat: add AES-GCM in UCS-SPC mode
This commit adds AES-GCM algorithm that works
in UCS (Unified crypto slice) SPC(Single-Pass) mode.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
6618d3b5ca crypto/qat: rework init common header
Rework init common header function for request
descriptor so it can be called only once.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
6599d09314 crypto/qat: support legacy GCM and CCM
Add AES-GCM, AES-CCM algorithms in legacy mode.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
bfe16f145d crypto/qat: rename content descriptor functions
Content descriptor functions are incorrectly named,
having them with proper name will improve readability and
facilitate further work.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
d05492913e crypto/qat: support GEN4 unified cipher slice
This commit adds unified cipher slice(UCS) to Intel QuickAssist
Technology PMD and enables AES-CTR algorithm.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
976da46344 crypto/qat: enable GEN4 legacy algorithms
This commit enables algorithms labeled as 'legacy'
on QAT generation 4 devices.
Following algorithms were enabled:
* AES-CBC
* AES-CMAC
* AES-XCBC MAC
* NULL (auth, cipher)
* SHA1-HMAC
* SHA2-HMAC (224, 256, 384, 512)

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
8f393c4ffd common/qat: support GEN4 devices
This commit adds support for fourth generation (GEN4) of
Intel QuickAssist (QAT) Technology devices.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Arek Kusztal
7b976dd079 common/qat: rework queue pair per service
Different generations of Intel QuickAssist Technology devices may
differ in approach to allocate queues. Queue pair number function
therefore needs to be more generic.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-20 10:32:05 +02:00
Konstantin Ananyev
a03e4b62a7 raw/ioat: fix termination descriptor for batch
When batch_size == 1, idxd has to add a dummy termination descriptor
to satisfy HW requirements.
Right now it uses NOP descriptor with FENCE flag.
This is excessive and fencing can slowdown things quite significantly.
The patch removes FENCE flag from termination dummy descriptor.
That helps to improve performance for no-burst scenarios.

Fixes: 245efe544d ("raw/ioat: report status of completed jobs")

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-07-20 15:28:43 +02:00
Kevin Laatz
9cf9ac48b1 raw/ioat: fix config script queue size calculation
The queue size calculation is currently based on "max_tokens" rather than
"max_work_queues_size". This is resulting in the queue size being
incorrectly configured when using the script to configure devices bound to
the IDXD kernel driver.
This patch fixes this miscalculation so devices are configured with
appropriate queue size.

Fixes: 01863b9d23 ("raw/ioat: include example configuration script")
Cc: stable@dpdk.org

Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-07-20 15:28:43 +02:00
Igor Romanov
5cb4746205 net/sfc: support count action in flow query
The query reports the number of hits for a counter associated
with a flow rule.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
96fd2bd69b net/sfc: support flow action count in transfer rules
For now, a rule may have only one dedicated counter, shared counters
are not supported.

HW delivers (or "streams") counter readings using special packets.
The driver creates a dedicated Rx queue to receive such packets
and requests that HW start "streaming" the readings to it.

The counter queue is polled periodically, and the first available
service core is used for that. Hence, the user has to specify at least
one service core for counters to work. Such a core is shared by all
MAE-capable devices managed by sfc driver.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Andrew Rybchenko
248239f874 common/sfc_efx/base: add packetiser packet format definition
Packetiser composes packets with MAE counters update.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-07-20 12:20:31 +02:00
Igor Romanov
3bcd60fe2e common/sfc_efx/base: add max MAE counters to limits
The information about the maximum number of MAE counters is
crucial to the counter support in the driver.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
a9a238e9f5 net/sfc: add Rx datapath method to get pushed buffers count
The information about the number of pushed Rx buffers is required
for counter Rx queue to know when to give credits to counter
stream.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
238306cf9a common/sfc_efx/base: support counter in action set
User will be able to associate counter with MAE action set to
collect counter packets and bytes for a specific action set.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
c0a77efb9c common/sfc_efx/base: add counter stream MCDI wrappers
The MCDIs will be used to control counter Rx queue packet flow.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
bbc42f3411 common/sfc_efx/base: add counter creation MCDI wrappers
User will be able to create and free MAE counters. Support for
associating counters with action set will be added in upcoming
patches.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
983ce116c2 net/sfc: reserve internal Rx queue for counters
MAE delivers counters data as special packets via dedicated Rx queue.
Reserve an RxQ so that it does not interfere with ethdev Rx queues.
A routine will be added later to handle these packets.

There is no point to reserve the queue if no service cores are
available and counters cannot be used.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Andrew Rybchenko
7c041f971b net/sfc: add NUMA-aware registry of service logical cores
The driver requires service cores for housekeeping. Share these
cores for many adapters and various purposes to avoid extra CPU
overhead.

Since housekeeping services will talk to NIC, it should be possible
to choose logical core on matching NUMA node.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
b8cf5ba549 net/sfc: support initialising different Rx queue types
Add extra EFX flags to RxQ info initialization API to support
choosing different RxQ types and make the API public to use
it in for counter queues.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-07-20 12:20:31 +02:00
Igor Romanov
29b133bb15 net/sfc: add abstractions for the management EVQ identity
Add a function returning management event queue software index.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-07-20 12:20:31 +02:00
Igor Romanov
c414c567c7 common/sfc_efx/base: add user mark RxQ flag
Add a flag to request support for user mark field on an RxQ.
The field is required to retrieve generation count value from
counter RxQ.

Implement it only for Riverhead and EF10 ESSB since they support
the field in the Rx prefix.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
aa3e21f006 common/sfc_efx/base: add ingress m-port RxQ flag
Add a flag to request support for ingress m-port on an RxQ.
Implement it only for Riverhead, other families will return an error
if the flag is set.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
db980d266f net/sfc: prepare for internal Tx queue
Make software index of a Tx queue and ethdev index separate.
When an ethdev TxQ is accessed in ethdev callbacks, an explicit ethdev
queue index is used.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Andrew Rybchenko
704512214d net/sfc: explicitly control IRQ used for Rx queues
Interrupts support has assumptions on interrupt numbers used
for LSC and Rx queues. The first interrupt is used for LSC,
subsequent interrupts are used for Rx queues.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-07-20 12:20:31 +02:00
Andrew Rybchenko
aa6dc1017c common/sfc_efx/base: support custom EvQ to IRQ mapping
Custom mapping is actually supported for EF10 and EF100 families only.

A driver (e.g. DPDK PMD) may require to customize mapping of EvQ
to interrupts if, for example, extra EvQ are used for house-keeping
in polling or wake up (via another EvQ) mode.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-07-20 12:20:31 +02:00
Andrew Rybchenko
3dee345ab3 common/sfc_efx/base: separate target EvQ and IRQ config
Target EvQ and IRQ number are specified in the same location
in MCDI request. The value is treated as IRQ number if the
event queue is interrupting (corresponding flag is set) and
as target event queue otherwise.

However it is better to separate it on helper API level to
make it more clear.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-07-20 12:20:31 +02:00
Andrew Rybchenko
396541fe43 net/sfc: do not enable interrupts on internal Rx queues
rxq_intr flag requests support for interrupt mode for ethdev Rx queues.
There is no internal Rx queues yet.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Igor Romanov
09cafbddbb net/sfc: prepare for internal Rx queue
Make software index of an Rx queue and ethdev index separate.
When an ethdev RxQ is accessed in ethdev callbacks, an explicit ethdev
queue index is used.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
2021-07-20 12:20:31 +02:00
Pavan Nikhilesh
761a321acf event/cnxk: support vectorized Tx event fast path
Add Tx event vector fastpath, integrate event vector Tx routine
into Tx burst.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-16 14:16:50 +02:00
Pavan Nikhilesh
7fbbc981d5 event/cnxk: support vectorized Rx event fast path
Add Rx event vector fastpath to convert HW defined metadata into
rte_mbuf and rte_event_vector.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-16 14:16:45 +02:00
Pavan Nikhilesh
072a281873 event/cnxk: support vectorized Rx adapter
Add event vector support for cnxk event Rx adapter, add control path
APIs to get vector limits and ability to configure event vectorization
on a given Rx queue.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-16 14:16:44 +02:00
Pavan Nikhilesh
313e884a22 event/cnxk: support Tx adapter fast path
Add support for event eth Tx adapter fastpath operations.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-16 14:16:38 +02:00
Pavan Nikhilesh
097835ecdf event/cnxk: support Tx adapter
Add support for event eth Tx adapter.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2021-07-16 14:16:37 +02:00
Pavan Nikhilesh
aa4311c654 event/cnxk: support Rx adapter fast path
Add support for event eth Rx adapter fastpath operations.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-16 14:16:28 +02:00
Pavan Nikhilesh
cb4bfd6e7b event/cnxk: support Rx adapter
Add support for event eth Rx adapter.
Resize cn10k workslot fastpath structure to fit in 64B cacheline size.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-16 14:16:26 +02:00
Ting Xu
f5dd3ea456 net/iavf: fix bandwidth unit in TM capability query
In IAVF node TM capability querying, the unit of bandwidth is Kbps,
which is not correct according to TM specification. Change the unit to
Byte per second. Refine some unclear comments as well.

Fixes: 44d0a720a5 ("net/iavf: query QoS capabilities and set queue TC mapping")

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:19:29 +02:00
Alvin Zhang
57e383e57f net/ice/base: support MPLS ethertype switch filter
Add MPLS training packet and offsets.
Add check to identify MPLS ethertype filters.

For example:
testpmd> flow create 0 ingress pattern eth dst is 00:11:22:33:44:55 \
         type is 0x8847 / end actions queue index 2 / end

This flow will result in all the matched ingress packets be
forwarded to queue 2.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Simei Su
a02d9d9bae net/ice: fix ESP flow director with SPI as input set
FDIR can't work when SPI as inputset for both ESP over IP and ESP
over UDP flow. This patch fixes this issue by adding the corresponding
input set for ESP over IP and ESP over UDP when parsing input set. Also,
it adds input set bit for NAT_T_ESP to distinguish ESP over IP and ESP
over UDP.

Fixes: 70feafc1a3 ("net/ice: support ESP/NATT flow director to match outer IP")

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Wenjun Wu
0a37b22875 net/ice/base: revert change of first profile mask
Segmentation fault mentioned in below commit is related to
other root cause under investigation.
This reverts patch below since it may have potential
risk and side effect if the first profile mask is set to 0.

Fixes: 148fdf2d35 ("net/ice/base: fix first profile mask")
Cc: stable@dpdk.org

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Joyce Kong
8649e23566 net/i40e: replace SMP barrier with thread fence in Rx
Simply replace the SMP barrier with atomic thread fence for
i40e hw ring scan, if there is no synchronization point.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Wenjun Wu
9e29a278bc net/iavf: support default RSS for IP fragment
This patch adds default RSS support for IPv4 and IPv6 fragment packet.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Lingyu Liu
36c2b46fed net/iavf: support RSS for GTPoGRE
Support AVF RSS for inner most header of GTPoGRE packet. It supports
RSS based on inner most IP src + dst address and TCP/UDP src + dst
port.

Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Lingyu Liu
71d3c57eae net/iavf: support flow director for GTPoGRE
Support AVF FDIR for inner header of GTPoGRE tunnel packet.
Only patterns without inner most L3,L4 header support outer L3 src/dst
and TEID,QFI FDIR.

+------------------------------------+-------------------------------+
|                Pattern             |            Input Set          |
+------------------------------------+-------------------------------+
|eth/ipv4/gre/ipv4/gtpu/(eh/)ipv4    |inner: src/dst ip              |
|eth/ipv4/gre/ipv4/gtpu/(eh/)ipv4/udp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv4/gtpu/(eh/)ipv4/tcp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv4/gtpu/(eh/)ipv6    |inner: src/dst ip              |
|eth/ipv4/gre/ipv4/gtpu/(eh/)ipv6/udp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv4/gtpu/(eh/)ipv6/tcp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv6/gtpu/(eh/)ipv4    |inner: src/dst ip              |
|eth/ipv4/gre/ipv6/gtpu/(eh/)ipv4/udp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv6/gtpu/(eh/)ipv4/tcp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv6/gtpu/(eh/)ipv6    |inner: src/dst ip              |
|eth/ipv4/gre/ipv6/gtpu/(eh/)ipv6/udp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv6/gtpu/(eh/)ipv6/tcp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv4/gtpu/(eh/)ipv4    |inner: src/dst ip              |
|eth/ipv6/gre/ipv4/gtpu/(eh/)ipv4/udp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv4/gtpu/(eh/)ipv4/tcp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv4/gtpu/(eh/)ipv6    |inner: src/dst ip              |
|eth/ipv6/gre/ipv4/gtpu/(eh/)ipv6/udp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv4/gtpu/(eh/)ipv6/tcp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv6/gtpu/(eh/)ipv4    |inner: src/dst ip              |
|eth/ipv6/gre/ipv6/gtpu/(eh/)ipv4/udp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv6/gtpu/(eh/)ipv4/tcp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv6/gtpu/(eh/)ipv6    |inner: src/dst ip              |
|eth/ipv6/gre/ipv6/gtpu/(eh/)ipv6/udp|inner: src/dst ip, src/dst port|
|eth/ipv6/gre/ipv6/gtpu/(eh/)ipv6/tcp|inner: src/dst ip, src/dst port|
|eth/ipv4/gre/ipv4/gtpu(/eh)         |outer: src/dst ip, teid(,qfi)  |
|eth/ipv4/gre/ipv6/gtpu(/eh)         |outer: src/dst ip, teid(,qfi)  |
|eth/ipv6/gre/ipv4/gtpu(/eh)         |outer: src/dst ip, teid(,qfi)  |
|eth/ipv6/gre/ipv6/gtpu(/eh)         |outer: src/dst ip, teid(,qfi)  |
+------------------------------------+-------------------------------+

Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Lingyu Liu
b3025311cd net/iavf: support flow pattern for GTPoGRE
Add GTPoGRE pattern support for AVF FDIR and RSS.

Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-16 10:11:30 +02:00
Ajit Khaparde
7962fd44c8 net/bnxt: update CFA resource types
Update cfa_resource_types.h to add a new entry for compatibility with FW.

Signed-off-by: Shuanglin Wang <shuanglin.wang@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-07-16 05:44:49 +02:00
Kalesh AP
84fd852caa net/bnxt: clear cached statistics
As part of the workaround put in the commit "219842b9990c",
driver caches the last read stats values from the hardware.
But this is not cleared during the clear stats operation. This
results in showing up stale stats values while reading the stats
after the clear operation.

Fixes: 219842b999 ("net/bnxt: workaround spurious zero stats in Thor")
Cc: stable@dpdk.org

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
2021-07-15 02:31:32 +02:00
Somnath Kotur
95de0faf12 net/bnxt: handle pause storm event
FW has been modified to send a new async event when it detects
a pause storm. Register for this new event and log it upon receipt.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-07-14 20:29:05 +02:00
Somnath Kotur
1125b16bf6 net/bnxt: refactor async event handling
Store the async event completion data1 and data2 in separate variables
at the start of the function before the switch case for the different
events so they can be used by any of the event handlers.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-07-14 20:29:05 +02:00
Kalesh AP
8fd709a10b net/bnxt: inform firmware about host MTU
This enables device firmware to respond appropriately to BMC queries
about the driver's configured MTU.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
2021-07-14 20:29:05 +02:00
Kalesh AP
89e6a0c0da net/bnxt: update HSI structure
- HWRM version updated to 1.10.2.44
- Added corresponding driver changes for the Admin MTU field name change.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
2021-07-14 20:29:05 +02:00
Weifeng Li
8117f5f61a net/bnxt: fix nested lock during bonding
Bnxt PMD registers LSC callback (bond_ethdev_lsc_event_callback) when
working at bond mode. This callback will dead lock when LSC
interrupt triggered.

lsc interrupt ->
bnxt_handle_async_event ->
bnxt_link_update_op ->
bond_ethdev_lsc_event_callback (lsc_lock) ->
bnxt_link_update_op ->
bond_ethdev_lsc_event_callback (lsc_lock dead lock)

Fixes: c2faa1d196 ("net/bnxt: add support for LSC interrupt event")
Cc: stable@dpdk.org

Signed-off-by: Weifeng Li <liweifeng96@126.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-07-13 06:19:11 +02:00
Lance Richardson
5ed30db87f net/bnxt: fix missing barriers in completion handling
Ensure that Rx/Tx/Async completion entry fields are accessed
only after the completion's valid flag has been loaded and
verified. This is needed for correct operation on systems that
use relaxed memory consistency models.

Fixes: 2eb53b134a ("net/bnxt: add initial Rx code")
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-07-12 20:38:12 +02:00
Satheesh Paul
1c3b657a6a net/cnxk: support raw flow pattern
Add support for rte_flow_item_raw to parse custom L2 and L3
protocols.

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-13 12:19:22 +02:00
Satheesh Paul
612ce5cf7d common/cnxk: support custom L2/L3 protocols parsing
Add roc API for parsing custom L2 and L3 protocols.

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
2021-07-13 12:17:51 +02:00
Satha Rao
e7bbbcb26f net/cnxk: update link status when device stopped
Set link status to down and don't fetch link status from kernel
when device in stopped state.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-13 11:29:49 +02:00
Satha Rao
d9dda782ac net/octeontx2: fix TM node statistics query
Until hierarchy committed TM hardware resources are not allocated
for node.
This patch check for status of HW resources before reading statistics.

Fixes: 1e25d57fae ("net/octeontx2: add TM stats and shaper profile")
Cc: stable@dpdk.org

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-13 11:29:11 +02:00
Satha Rao
12e491a6b6 net/octeontx2: handle link status when device stopped
Set link status to down and don't fetch link status from kernel
when device in stopped state.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-13 11:29:10 +02:00
Satheesh Paul
d81cea5280 net/cnxk: fix default MCAM allocation size
Preallocation of MCAM entries is not valid anymore since the
AF side MCAM allocation scheme has changed. This patch disables
preallocation by changing the default MCAM preallocation size
from 8 to 1.

Fixes: 168c59cfe4 ("net/octeontx2: add flow MCAM utility functions")
Cc: stable@dpdk.org

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-12 14:44:20 +02:00
Anoob Joseph
ec8f303c65 net/octeontx2: support non-ethernet L2 header
In the inline inound path, a custom header would be present at L3 which
has sequence number & SPI. L2 need to be adjusted such that the eventual
packet would have L3 after L2. Remove assumption of L2 type in this
handling.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-12 14:04:42 +02:00
Meir Levi
71c5085bfb net/mvpp2: fix not supported VLAN operations status
vlan_strip and vlan_extend features need to return "unsupported"
error value.

Fixes: ff0b8b10dc ("net/mvpp2: support VLAN offload")
Cc: stable@dpdk.org

Signed-off-by: Meir Levi <mlevi4@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
2021-07-12 11:07:08 +02:00
Dana Vardi
e622c1a88e net/mvpp2: fix configured state dependency
Need to set configure flag to allow create and commit mrvl tm
hierarchy tree. tm configuration depends on parameters that are
being set in port configure stage, e.g. nb_tx_queues.
This also aligned with the tm api description.

Fixes: 429c394417 ("net/mvpp2: support traffic manager")
Cc: stable@dpdk.org

Signed-off-by: Dana Vardi <danat@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
2021-07-12 10:31:21 +02:00
Dana Vardi
8fa07a68a6 net/mvpp2: fix port speed overflow
ethtool_cmd_speed return uint32 and after the arithmetic
operation in mrvl_get_max_rate func the result is out of range.

Fixes: 429c394417 ("net/mvpp2: support traffic manager")
Cc: stable@dpdk.org

Signed-off-by: Dana Vardi <danat@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
2021-07-12 09:59:52 +02:00
Sarosh Arif
6e695b0cda net/mlx5: fix typo in vectorized Rx comments
Change "returing" to "returning".

Fixes: 2e542da709 ("net/mlx5: add Altivec Rx")
Fixes: 570acdb1da ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Fixes: 3c2ddbd413 ("net/mlx5: separate shareable vector functions")
Cc: stable@dpdk.org

Signed-off-by: Sarosh Arif <sarosh.arif@emumba.com>
2021-07-15 16:32:09 +02:00
Alexander Kozyrev
acc8747953 net/mlx5: fix threshold for mbuf replenishment in MPRQ
The replenishment scheme for the vectorized MPRQ Rx burst aims
to improve the cache locality by allocating new mbufs only when
there are almost no mbufs left: one burst gap between allocated
and consumed indexes.

This gap is not big enough to accommodate a corner case when we
have a very aggressive CQE compression with multiple regular CQEs
at the beginning and 64 zipped CQEs at the end.

Need to keep in mind this case and extend the replenishment
threshold by MLX5_VPMD_RX_MAX_BURST (64) to avoid mbuf overflow.

Fixes: 5fc2e5c27d ("net/mlx5: fix mbuf overflow in vectorized MPRQ")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-15 16:22:27 +02:00
Xiaoyu Min
0ed93c1344 net/mlx5: fix missing RSS expansion of IPv6 frag
IPV6_FRAG_EXT item is missed for RSS expansion which causes wrongly
expanded flows:
flow create 0 ingress pattern eth / ipv6 / udp dst is 250 / vxlan-gpe /
ipv6 / ipv6_frag_ext / end actions rss level 2 types ip end / end

Different from other items, IPV6_FRAG_EXT hasn't next field because HW
only support to do hash of UDP/TCP for non-fragment.

This MLX5_EXPANSION_IPV6_FRAG_EXT node in RSS expansion graph only helps
RSS expansion function to locate right node in graph from which start
to expand.

Fixes: 0e5a0d8f75 ("net/mlx5: support match on IPv6 fragment extension")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:22:26 +02:00
Xiaoyu Min
1c4f7044c6 net/mlx5: fix missing RSS expandable items
Some RSS expandable items are missing which leads to the expanded
rte flow rules with wrong patterns.

Fix by adding missed items.

Fixes: d91093b9a2 ("net/mlx5: fix RSS pattern expansion")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:22:25 +02:00
Gregory Etelson
c410e1d562 net/mlx5: support flow matchng on IPv4 IHL
Query MLX5 port hardware if it is capable to offload IPv4
IHL field.

Provide flow rules capability to match on IPv4 IHL field.
Minimal HCA firmware version required to offload IPv4 IHL is
xx_30_2000.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-15 16:22:20 +02:00
Suanming Mou
a5835d530f net/mlx5: optimize Rx queue match
As hrxq struct has the indirect table pointer, while matching the
hrxq, better to use the hrxq indirect table instead of searching
from the list.

This commit optimizes the hrxq indirect table matching.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:23 +02:00
Suanming Mou
cde19e8634 net/mlx5: change memory release configuration
This commit changes the index pool memory release configuration
to 0 when memory reclaim mode is not required.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:22 +02:00
Suanming Mou
f3020a331d net/mlx5: optimize hash list table allocate on demand
Currently, all the hash list tables are allocated during start up.
Since different applications may only use dedicated limited actions,
optimized the hash list table allocate on demand will save initial
memory.

This commit optimizes hash list table allocate on demand.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:22 +02:00
Suanming Mou
07b51bb9fe net/mlx5: enable indexed pool per-core cache
This commit enables the tag and header modify action indexed
pool per-core cache in non-reclaim memory mode.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:21 +02:00
Suanming Mou
f7c3f3c290 net/mlx5: adjust hash bucket size
With the new per core optimization to the list, the hash bucket size
can be tuned to a more accurate number.

This commit adjusts the hash bucket size.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:21 +02:00
Matan Azrad
4f3d8d0ea3 net/mlx5: move header modify allocator to ipool
Modify header actions are allocated by mlx5_malloc which has a big
overhead of memory and allocation time.

One of the action types under the modify header object is SET_TAG,

The SET_TAG action is commonly not reused by the flows and each flow has
its own value.

Hence, the mlx5_malloc becomes a bottleneck in flow insertion rate in
the common cases of SET_TAG.

Use ipool allocator for SET_TAG action.

Ipool allocator has less overhead of memory and insertion rate and has
better synchronization mechanism in multithread cases.

Different ipool is created for each optional size of modify header
handler.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 16:09:20 +02:00
Suanming Mou
7e1cf89271 common/mlx5: support list non-lcore operations
This commit supports the list non-lcore operations with
an extra sub-list and lock.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:20 +02:00
Suanming Mou
9a4c368807 common/mlx5: optimize cache list object memory
Currently, hash list uses the cache list as bucket list. The list
in the buckets have the same name, ctx and callbacks. This wastes
the memory.

This commit abstracts all the name, ctx and callback members in the
list to a constant struct and others to the inconstant struct, uses
the wrapper functions to satisfy both hash list and cache list can
set the list constant and inconstant struct individually.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:19 +02:00
Suanming Mou
25481e5025 common/mlx5: allocate cache list memory individually
Currently, the list's local cache instance memory is allocated with
the list. As the local cache instance array size is RTE_MAX_LCORE,
most of the cases the system will only have very limited cores.
allocate the instance memory individually per core will be more
economic to the memory.

This commit changes the instance array to pointer array, allocate
the local cache memory only when the core is to be used.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:19 +02:00
Matan Azrad
961b6774c4 common/mlx5: add per-lcore cache to hash list utility
Using the mlx5 list utility object in the hlist buckets.

This patch moves the list utility object to the common utility, creates
all the clone operations for all the hlist instances in the driver.

Also adjust all the utility callbacks to be generic for both list and
hlist.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 16:09:18 +02:00
Suanming Mou
6507c9f51d common/mlx5: call list callbacks with context
This commit optimizes to call the list callback functions with global
context directly.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 16:09:17 +02:00
Suanming Mou
d03b786005 common/mlx5: add per-lcore sharing flag in object list
Without lcores_share flag, mlx5 PMD was sharing the rdma-core objects
between all lcores.

Having lcores_share flag disabled, means each lcore will have its own
objects, which will eventually lead to increased insertion/deletion
rates.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:50:31 +02:00
Suanming Mou
9c373c524b common/mlx5: move list utility from net driver
Hash list is planned to be implemented with the cache list code.

This commit moves the list utility to common directory.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:19:13 +02:00
Matan Azrad
679f46c775 net/mlx5: allocate list memory in create function
Currently, the list memory was allocated by the list API caller.

Move it to be allocated by the create API in order to save consistence
with the hlist utility.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:13 +02:00
Matan Azrad
84fbba5b9e net/mlx5: relax list utility atomic operations
The atomic operation in the list utility no need a barriers because the
critical part are managed by RW lock.

Relax them.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:12 +02:00
Matan Azrad
a603b55ad9 net/mlx5: manage list cache entries release
When a cache entry is allocated by lcore A and is released by lcore B,
the driver should synchronize the cache list access of lcore A.

The design decision is to manage a counter per lcore cache that will be
increased atomically when the non-original lcore decreases the reference
counter of cache entry to 0.

In list register operation, before the running lcore starts a lookup in
its cache, it will check the counter in order to free invalid entries in
its cache.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:11 +02:00
Matan Azrad
0b4ce17a11 net/mlx5: minimize list critical sections
The mlx5 internal list utility is thread safe.

In order to synchronize list access between the threads, a RW lock is
taken for the critical sections.

The create\remove\clone\clone_free operations are in the critical
sections.

These operations are heavy and make the critical sections heavy because
they are used for memory and other resources allocations\deallocations.

Moved out the operations from the critical sections and use generation
counter in order to detect parallel allocations.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:11 +02:00
Matan Azrad
491b7137ff net/mlx5: add per-lcore cache to the list utility
When mlx5 list object is accessed by multiple cores, the list lock
counter is all the time written by all the cores what increases cache
misses in the memory caches.

In addition, when one thread accesses the list for add\remove\lookup
operation, all the other threads coming to do an operation in the list
are stuck in the lock.

Add per lcore cache to allow thread manipulations to be lockless when
the list objects are mostly reused.

Synchronization with atomic operations should be done in order to
allow threads to unregister an entry from other thread cache.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:10 +02:00
Matan Azrad
e78e5408da net/mlx5: remove cache term from the list utility
The internal mlx5 list tool is used mainly when the list objects need to
be synchronized between multiple threads.

The "cache" term is used in the internal mlx5 list API.

Next enhancements on this tool will use the "cache" term for per thread
cache management.

To prevent confusing, remove the current "cache" term from the API's
names.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:10 +02:00
Matan Azrad
e681eb0515 net/mlx5: optimize header modify action memory
Define the types of the modify header action fields to be with the
minimum size needed for the optional values range.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2021-07-15 15:19:09 +02:00
Suanming Mou
b4edeaf3ef net/mlx5: replace flow list with indexed pool
The flow list is used to save the create flows and to be used only
when port closes all the flows need to be flushed.

This commit takes advantage of the index pool foreach operation to
flush all the allocated flows.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:19:09 +02:00
Suanming Mou
42f463395f net/mlx5: support indexed pool non-lcore operations
This commit supports the index pool non-lcore operations with
an extra cache and lcore lock.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:19:08 +02:00
Suanming Mou
64a80f1c48 net/mlx5: add indexed pool iterator
In some cases, application may want to know all the allocated
index in order to apply some operations to the allocated index.

This commit adds the indexed pool functions to support foreach
operation.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:19:08 +02:00
Suanming Mou
d15c0946be net/mlx5: add indexed pool local cache
For object which wants efficient index allocate and free, local
cache will be very helpful.

Two level cache is introduced to allocate and free the index more
efficient. One as local and the other as global. The global cache
is able to save all the allocated index. That means all the allocated
index will not be freed. Once the local cache is full, the extra
index will be flushed to the global cache. Once local cache is empty,
first try to fetch more index from global, if global is still empty,
allocate new trunk with more index.

This commit adds new local cache mechanism for indexed pool.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:19:07 +02:00
Suanming Mou
58ecd3ad0b net/mlx5: allow limiting the indexed pool maximum index
Some ipool instances in the driver are used as ID\index allocator and
added other logic in order to work with limited index values.

Add a new configuration for ipool specify the maximum index value.
The ipool will ensure that no index bigger than the maximum value is
provided.

Use this configuration in ID allocator cases instead of the current
logics. This patch add the maximum ID configurable for the index pool.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-15 15:19:01 +02:00
Ruifeng Wang
1db288f941 net/mlx5: reduce unnecessary memory access in Rx
MR btree len is a constant during Rx replenish.
Moved retrieve of the value out of loop to reduce data loads.
Slight performance uplift was measured on both N1SDP and x86.

Suggested-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-15 15:17:22 +02:00
Ruifeng Wang
ff6fcd415f net/mlx5: remove redundant operations in NEON Rx
Mask of entries after the compressed CQE is covered by invalid mask of
non-compressed valid CQEs. Hence remove redundant calculation on mask.
The change showed slight performance uplift on N1SDP.

Fixes: 570acdb1da ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-15 15:16:26 +02:00
Rongwei Liu
630a587bfb net/mlx5: support matching on VXLAN reserved field
This adds matching on the reserved field of VXLAN
header (the last 8-bits). The capability from rdma-core
is detected by creating a dummy matcher using misc5
when the device is probed.

For non-zero groups and FDB domain, the capability is
detected from rdma-core, meanwhile for NIC domain group
zero it's relying on the HCA_CAP from FW.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Raslan Darawsheh <rasland@nvidia.com>
2021-07-13 15:06:43 +02:00
Huisong Li
fafa81dece net/hns3: support multiple TC MAC pause
MAC PAUSE can take effect on a single TC or multiple TCs, depending on the
hardware. For example, the Kunpeng 920 supports MAC pause in a single TC,
and the Kunpeng 930 supports MAC pause in multiple TCs. This patch
supports MAC PAUSE in multiple TC for some hardware.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-07-13 11:41:32 +02:00
Chengchang Tang
0f5bf5a856 net/hns3: support VLAN filter state modify for VF
Since the HW limitation for VF, the VLAN filter is default enabled, and
is not allowed to be closed. Now, the limitation has been removed in
Kunpeng930 network engine, so this patch add support for VF to modify the
VLAN filter state.

A capabilities bit is added to differentiate between different platforms
and achieve compatibility. When the VF runs on an incomatible platform or
an incompatible kernel-mode driver version is used, the VF behavior is
the same as that before.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-07-13 11:41:32 +02:00
Chengchang Tang
2735b35538 net/hns3: query basic info for VF
There are some features of VF depend on PF, so it's necessary for VF
to know whether current PF supports. Therefore, the final capability
set of VF will be composed of the capability set of hardware and the
capability set of PF.

For compatibility reasons, the mailbox HNS3_MBX_GET_TCINFO has been
modified to obatin more basic information about the current PF, including
the communication interface version and current PF capabilities set.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-07-13 11:41:32 +02:00
Dapeng Yu
ae2b3ba643 net/softnic: fix connection memory leak
In function softnic_conn_init(), a block of memory is allocated as
connection buffer, but it is never freed in softnic_conn_free(),
which cause memory leak.

Fixes: 7709a63bf1 ("net/softnic: add connection agent")
Cc: stable@dpdk.org

Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-13 11:34:57 +02:00
Jochen Behrens
046f116195 net/vmxnet3: support MSI-X interrupt
Add support for MSI-X interrupt vectors to the vmxnet3 driver.
This will allow more efficient deployments in cloud environments.

By default it will try to allocate 1 vector (0) for link
event and one MSI-X vector for each Rx queue.  To simplify
things, it will only be enabled if the number of Tx and Rx
queues are equal (so that Tx/Rx share the same vector).
If for any reason vmxnet3 cannot enable intr mode, it will
fall back to the LSC only mode.

Signed-off-by: Yong Wang <yongwang@vmware.com>
Signed-off-by: Jochen Behrens <jbehrens@vmware.com>
2021-07-13 11:31:10 +02:00
Martin Havlik
d844400966 net/bonding: check flow setting
Return value from bond_ethdev_8023ad_flow_set() is now checked
and appropriate message is logged on error.

Fixes: 112891cd27 ("net/bonding: add dedicated HW queues for LACP control")
Cc: stable@dpdk.org

Signed-off-by: Martin Havlik <xhavli56@stud.fit.vutbr.cz>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-13 11:04:55 +02:00
Martin Havlik
cb8dc97f9d net/bonding: fix error message on flow verify
Return value is now saved to errval and log message on error reports
correct function name, doesn't use q_id which was out of context,
and uses up-to-date errval.

Fixes: 112891cd27 ("net/bonding: add dedicated HW queues for LACP control")
Cc: stable@dpdk.org

Signed-off-by: Martin Havlik <xhavli56@stud.fit.vutbr.cz>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-13 10:58:14 +02:00
Jiawen Wu
cc63194e89 net/ngbe: support close and reset device
Support to close and reset device.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
aad91edd81 net/ngbe: add simple Tx flow
Initialize device with the simplest transmit functions.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
93dfebd2c2 net/ngbe: add simple Rx flow
Initialize device with the simplest receive function.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
62fc35e63d net/ngbe: support Rx queue start/stop
Initializes receive unit, support to start and stop receive unit for
specified queues.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
001c782330 net/ngbe: support Tx queue start/stop
Initializes transmit unit, support to start and stop transmit unit for
specified queues.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
3518df5774 net/ngbe: support device start/stop
Setup MSI-X interrupt, complete PHY configuration and set device link
speed to start device. Disable interrupt, stop hardware and clear queues
to stop device.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
a58e7c312c net/ngbe: support Tx queue setup/release
Setup device Tx queue and release Tx queue.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:22 +02:00
Jiawen Wu
43b7e5ea60 net/ngbe: support Rx queue setup/release
Setup device Rx queue and release Rx queue.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
3d0af70667 net/ngbe: setup PHY link
Setup PHY, determine link and speed status from PHY.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
b9246b8fa2 net/ngbe: support link update
Register to handle device interrupt.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
539d55dab6 net/ngbe: store MAC address
Store MAC addresses and init receive address filters.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
44e97550ca net/ngbe: identify and reset PHY
Identify PHY to get the PHY type, and perform a PHY reset.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
78710873c2 net/ngbe: add HW initialization
Initialize the hardware by resetting the hardware in base code.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
f501a195bd net/ngbe: initialize and validate EEPROM
Reset swfw lock before NVM access, init EEPROM and validate the
checksum.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
68eb13a1ef net/ngbe: set MAC type and LAN ID with initialization
Add basic init and uninit function.
Map device IDs and subsystem IDs to single ID for easy operation.
Then initialize the shared code.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
ed5f3bd337 net/ngbe: define registers
Define all registers that will be used.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
cc934df178 net/ngbe: add log and error types
Add log type and error type to trace functions.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
6ee7e574cd net/ngbe: support probe and remove
Add device IDs for Wangxun 1Gb NICs, map device IDs to register ngbe
PMD. Add basic PCIe ethdev probe and remove.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:21 +02:00
Jiawen Wu
26590b5200 net/ngbe: add build and doc infrastructure
Adding bare minimum PMD library and doc build infrastructure
and claim the maintainership for ngbe PMD.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-07-12 17:55:19 +02:00
Viacheslav Galaktionov
10eaf41d70 ethdev: keep count of representor ranges in API
In its current state, the API can overflow the user-passed buffer if a new
representor range appears between function calls.

In order to solve this problem, augment the representor info structure with
the numbers of allocated and initialized ranges. This way the users of this
structure can be sure they will not overrun the buffer.

Fixes: 85e1588ca7 ("ethdev: add API to get representor info")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
2021-07-10 11:29:11 +02:00
Ajit Khaparde
72d7b5959f net/bnxt: fix build
Fix build failures seen on Fedora Core 34 (GCC 11)
because of uninitialized variables.

In function ‘ulp_mapper_index_tbl_process’:
drivers/net/bnxt/tf_ulp/ulp_mapper.c:2252:43: error:
‘*(unsigned int *)((char *)&glb_res + offsetof(struct bnxt_ulp_glb_resource_info, resource_func))’
may be used uninitialized in this function
 2252 |         struct bnxt_ulp_glb_resource_info glb_res;
      |                                           ^~~~~~~
drivers/net/bnxt/tf_ulp/ulp_mapper.c:2252:43: error:
‘glb_res.resource_type’ may be used uninitialized in this function

In function ‘dpool_defrag’:
drivers/net/bnxt/tf_core/dpool.c:95:18: error:
‘index’ may be used uninitialized in this function
   95 |         uint32_t index;
      |                  ^~~~~

Fixes: 05b405d581 ("net/bnxt: add dpool allocator for EM allocation")

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-07-09 22:34:06 +02:00
Chengwen Feng
699fa1d40e net/hns3: fix Arm SVE build with GCC 8.3
If the target machine has SVE feature (e.g. '-march=armv8.2-a+sve'),
and compiler is gcc-8.3, it will fail, the error is arm_sve.h:
no such file or directory.

The solution:
a. If RTE_HAS_SVE_ACLE defined (it means the minimum instruction set
support SVE ACLE) then compiles it.
b. Else if the compiler support SVE ACLE then compiles it.
c. Otherwise don't compile it.

Fixes: 8c25b02b08 ("net/hns3: fix enabling SVE Rx/Tx")
Fixes: 952ebacce4 ("net/hns3: support SVE Rx")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-07-09 22:25:31 +02:00
Anatoly Burakov
43fb6eea49 net/af_xdp: support power monitoring
Implement support for .get_monitor_addr in AF_XDP driver.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-07-09 21:13:13 +02:00
Anatoly Burakov
6afc4baf4f eal: use callbacks for power monitoring comparison
Previously, the semantics of power monitor were such that we were
checking current value against the expected value, and if they matched,
then the sleep was aborted. This is somewhat inflexible, because it only
allowed us to check for a specific value in a specific way.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define
their own comparison semantics and decision making on how to detect the
need to abort the entering of power optimized state.

Existing implementations are adjusted to follow the new semantics.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
2021-07-09 21:13:13 +02:00
Juraj Linkeš
143b6270b0 net/virtio: fix aarch32 build
NEON vector path of the PMD needs aarch64 support. But it was
enabled for aarch32 build as well because aarch32 build had
cpu_family set to aarch64. So build for aarch32 will fail due
to unsupported intrinsics.

Fix aarch32 build by updating meson file to exclude NEON vector
implementation for aarch32.

Fixes: 749799482a ("net/virtio: add to meson build")
Cc: stable@dpdk.org

Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-09 20:00:06 +02:00
Ruifeng Wang
746d6f8388 net/bnxt: fix aarch32 build
NEON vector path of the PMD needs aarch64 support. But it was
enabled for aarch32 build as well because aarch32 build had
cpu_family set to aarch64. So build for aarch32 will fail due
to unsupported intrinsics.

Fix aarch32 build by updating meson file to exclude NEON vector
implementation for aarch32.

Fixes: 3983583414 ("net/bnxt: support NEON")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
2021-07-09 19:59:46 +02:00
Ruifeng Wang
a5f1b1e515 net/sfc: fix aarch32 build
The sfc PMD was enabled for aarch32 which is 32-bit mode but has
cpu_family set to aarch64.
As sfc support only 64-bit system, it should be disabled for aarch32.

Updated meson file to disable sfc for aarch32 build.

Fixes: 141d287067 ("net/sfc: support aarch64 architecture")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-07-09 19:58:20 +02:00
David Marchand
5898abedeb net/octeontx/base: fix debug build with clang
Remove conflicting declaration of this symbol.

Fixes: d0d6549860 ("net/octeontx: support event Rx adapter")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
2021-07-09 13:18:56 +02:00
Tejasree Kondoj
4a3e72a2ee crypto/cnxk: fix build with asserts
Removing usage of unavailable macro.

Fixes: baee42a6be ("crypto/cnxk: add IPsec datapath")

Reported-by: Ali Alnubani <alialnu@nvidia.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
2021-07-09 13:18:56 +02:00
Anoob Joseph
b146c30d3c crypto/cnxk: add PCI ID for cn9k
Add PCI ID for crypo_cn9k PMD.

To avoid conflicting PCI ID in crypto_octeontx2 and crypto_cn9k PMDs,
disable crypto_cn9k PMD when built with octeontx2 config.

The lack of PCI ID is causing debug build to fail on Ubuntu 18.04
for crypto_cn9k PMD.

Reported-by: Ali Alnubani <alialnu@nvidia.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Anoob Joseph <anoobj@marvell.com>
2021-07-09 13:18:56 +02:00
Dapeng Yu
75e4023dd7 net/ixgbe: fix flow entry access after freeing
The original code use a heap pointer after it is freed.
This patch fix it.

Fixes: a14de8b498 ("net/ixgbe: destroy consistent filter")
Cc: stable@dpdk.org

Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
2021-07-09 09:31:52 +02:00
Joyce Kong
65b2ec7b4f net/i40e: fix descriptor scan on Arm
For Arm platforms, reading descs can get re-ordered, then the
status of DD bits will be discontinuous, so add the logic to
only process continuous descs by checking DD bits.

Fixes: 4861cde461 ("i40e: new poll mode driver")
Cc: stable@dpdk.org

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-07-09 05:05:19 +02:00
Dapeng Yu
a87688852d net/ice: fix VXLAN flow director creation
In original implementation, error returned when creating VXLAN flow
director with SCTP or TCP as layer 4 protocol of inner segment.

There are several root causes for the error:
1. ice_fdir_input_set_hdrs() set ICE_FLOW_SEG_HDR_UDP into protocol
header flag of inner segment of VXLAN FDIR rule, even if it shall be
ICE_FLOW_SEG_HDR_TCP or ICE_FLOW_SEG_HDR_SCTP
2. ice_fdir_input_set_hdrs() set ICE_FLOW_SEG_HDR_VXLAN into protocol
header flag of segments of VXLAN FDIR rule, it not necessary, and can
be set automatically by ice_flow_set_fld() later
3. flow type: ICE_FLTR_PTYPE_NONF_IPV4_UDP_VXLAN hides the flow type of
inner segment of VXLAN FDIR rule, then further causes function:
ice_fdir_get_gen_prgm_pkt() cannot write correct protocol id into inner
segment of training packet.

This patch fixes those defects described above.

Fixes: 855d23a07b ("net/ice: support VXLAN VNI field in flow director")
Cc: stable@dpdk.org

Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-09 05:05:19 +02:00
Dapeng Yu
a7e1e2f764 net/ice/base: fix VXLAN flow director creation
In original implementation, error returned when creating VXLAN flow
director with SCTP or TCP as layer 4 protocol of inner segment.

There are several root causes for the error:
1. ice_fdir_udp4_vxlan_pkt[] is not adapted to the TCP and SCTP protocol.
Its length cannot hold TCP header, only UDP protocol was supported in
original implementation
2. VXLAN VNI offset: 45 is inconsistent with IETF RFC 7348

This patch fixes those defects described above.

Fixes: 608cd0a5e2 ("net/ice/base: support VXLAN VNI field in flow director")
Cc: stable@dpdk.org

Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-09 05:05:19 +02:00
Ting Xu
931ee54072 net/ice: support QoS bandwidth config after VF reset in DCF
When VF reset happens, the QoS bandwidth configuration will be lost. If
the reset is not caused by DCB change, it is supposed to replay the
bandwidth configuration to VF by DCF. In this patch, when a vsi update
PF event is received from PF after VF reset, and it is confirmed that
DCB is not changed, bandwidth configuration will be replayed.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-09 05:05:19 +02:00
Ting Xu
3442a8d66a net/ice: fix check for QoS in DCF
This patch fixed some unreasonable error check. Move all checks into one
helper function before configuring. Skip the check for DCF (VF0).

Fixes: 3a6bfc37ea ("net/ice: support QoS config VF bandwidth in DCF")
Cc: stable@dpdk.org

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-09 05:05:19 +02:00
Wenjun Wu
00af17037d net/iavf: simplify flow director rules for IP fragment
This patch simplify the pattern of flow rules of FDIR for IP fragment.

Flow rule can be created by the following command:
1. flow create 0 ingress pattern eth /
   ipv4 fragment_offset spec 0x2000 fragment_offset mask 0x2000 /
   end <actions>
2. flow create 0 ingress pattern eth / ipv6 /
   ipv6_frag_ext fragment_offset spec 0x0001 fragment_offset mask 0x0001 /
   end <actions>

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-09 05:05:19 +02:00
Wenjun Wu
5ae0906ee7 net/ice: simplify flow director rules for IP fragment
This patch simplify the pattern of flow rules of FDIR for IP fragment.

Flow rule can be created by the following command:
1. flow create 0 ingress pattern eth /
   ipv4 fragment_offset spec 0x2000 fragment_offset mask 0x2000 /
   end <actions>
2. flow create 0 ingress pattern eth / ipv6 /
   ipv6_frag_ext fragment_offset spec 0x0001 fragment_offset mask 0x0001 /
   end <actions>

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-07-09 05:05:19 +02:00
David Marchand
850989f938 net/ice: fix memzone leak when firmware is missing
Caught by our QE.
When the firmware is missing, memzones were not released.

$ dpdk-testpmd -c 0x1f -a 0:0:0.0 -- -i
...

testpmd> dump_memzone
...
Zone 6: name:<RTE_METRICS>, len:0x15040, virt:0x1661b24c0, socket_id:0,
flags:0
physical segments used:
  addr: 0x140000000 iova: 0x140000000 len: 0x40000000 pagesz: 0x40000000

testpmd> port attach 0000:5e:00.0
Attaching a new port...
EAL: Using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:5e:00.0 (socket 0)
ice_load_pkg(): failed to open file: /lib/firmware/intel/ice/ddp/ice.pkg

ice_dev_init(): Failed to load the DDP package,Use safe-mode-support=1 to
 enter Safe Mode
EAL: Releasing PCI mapped resource for 0000:5e:00.0
EAL: Calling pci_unmap_resource for 0000:5e:00.0 at 0x2200000000
EAL: Calling pci_unmap_resource for 0000:5e:00.0 at 0x2202000000
EAL: Driver cannot attach the device (0000:5e:00.0)
EAL: Failed to attach device on primary process
testpmd: Failed to attach port 0000:5e:00.0

testpmd> dump_memzone
...
Zone 139: name:<ice_dma_17168374657430093156>, len:0x1000,
  virt:0x1660ed800, socket_id:0, flags:0 physical segments used:
  addr: 0x140000000 iova: 0x140000000 len: 0x40000000 pagesz: 0x40000000

With 20 tries attaching a net/ice port, we would end up with:

EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:5e:00.0 (socket 0)
EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested memzone
  segments exceeds RTE_MAX_MEMZONE
ice_dev_init(): Failed to initialize HW

Fixes: a4c8c48fe3 ("net/ice: load OS default package")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
2021-07-09 04:34:07 +02:00
Viacheslav Ovsiienko
0fd928bbba common/mlx5: fix compatibility with OFED port query API
The compilation flag HAVE_MLX5DV_DR_DEVX_PORT depends on presence
of mlx5dv_query_devx_port routine in rdma-core library.

The mlx5dv_query_devx_port routine exists only in OFED versions
of rdma-core library and is being planned to be removed and replaced
with Upstream compatible mlx5dv_query_port.

As mlx5dv_query_devx_port is being removed all the dependencies on
the HAVE_MLX5DV_DR_DEVX_PORT compilation flag are reconsidered.

The new compilation flag HAVE_MLX5DV_DR_CREATE_DEST_IB_PORT is for
backward compatibility with older OFED versions.

Fixes: 6cfe84fbe7 ("net/mlx5: fix port action for LAG")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:38 +02:00
Viacheslav Ovsiienko
d0cf77e8c2 common/mlx5: use new port query API if available
In order to get E-Switch vport identifiers the mlx5 PMD relies
on two approaches:
  [a] use port query API if it is provided by rdma-core library
  [b] otherwise, deduce vport ids from the related VF index
The latter is not reliable and may not work with newer kernel
drivers and in some configurations (LAG), causing E-Switch
malfunction. Hence, engaging the port query API is highly
desirable.

Depending on rdma-core version the port query API is:
  - very old OFED versions have no query API (approach [b])
  - rdma-core OFED < 5.5 provides mlx5dv_query_devx_port,
    HAVE_MLX5DV_DR_DEVX_PORT flag is defined (approach [a])
  - rdma-core OFED >= 5.5 has mlx5dv_query_port, flag
    HAVE_MLX5DV_DR_DEVX_PORT_V35 is defined (approach [a])
  - future OFED versions might remove mlx5dv_query_devx_port
    and HAVE_MLX5DV_DR_DEVX_PORT will not be defined
  - Upstream rdma-core < v35 has no port query API (approach [b])
  - Upstream rdma-core >= v35 has  mlx5dv_query_port, flag
    HAVE_MLX5DV_DR_DEVX_PORT_V35 is defined (approach [a])

In order to support the new mlx5dv_query_port routine, the
conditional compilation flag HAVE_MLX5DV_DR_DEVX_PORT_V35
is introduced by this patch. The flag HAVE_MLX5DV_DR_DEVX_PORT
is kept for compatibility with previous rdma-core versions.

Despite this patch is not a bugfix (it follows the introduced API
variation in underlying library), it resolves the compatibility
issue and is highly desired to be ported to DPDK LTS.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:37 +02:00
Jiawei Wang
e39226bde5 net/mlx5: control flow rules with identical pattern
In order to allow\disallow configuring rules with identical
patterns, the new device argument 'allow_duplicate_pattern'
is introduced.
If allow, these rules be inserted successfully and only the
first rule take affect.
If disallow, the first rule will be inserted and other rules
be rejected.

The default is to allow.
Set it to 0 if disallow, for example:
	-a <PCI_BDF>,allow_duplicate_pattern=0

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:35 +02:00
Shun Hao
a3b7af90ba net/mlx5: validate meter action in policy
This adds the validation when creating a policy with meter action.

Currently meter action is only allowed for green color in policy, and
8 meters are supported at maximum in one meter hierarchy.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:35 +02:00
Shun Hao
f890b030e0 net/mlx5: add meter hierarchy destroy and cleanup
When creating hierarchy meter, its color rules will increase next
meter's reference count, so when destroy the hierarchy meter, also
need to dereference the next meter's count.

During flushing all meters of a port, need to destroy all hierarchy
meters and their policies first, to dereference the last meter in
hierarchy. Then all meters have no reference and can be destroyed.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:34 +02:00
Shun Hao
8e5c9fea44 net/mlx5: support meter hierarchy drop count
When using meter hierarchy with multiple meters, every meter may have
drop counter, so a packet being set red color by one meter should be
counted to that specific meter only.

To support this, add tag action in the color rule so packet going to
next new meter can have its meter id, so as to be counted to the
correct drop counter in drop table.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:33 +02:00
Shun Hao
50cc92dde8 net/mlx5: support meter action in meter policy
This makes the meter policy support meter action. So multiple meters
can be chained as a meter hierarchy.

Only termination meter is allowed as the last meter in a hierarchy,
and there're two cases:
1. The last meter has non-RSS policy, can directly create sub-policy
and color rules during each meter's policy creation.
2. The last meter has RSS policy, don't create sub-policy/rules when
creating meter policy. Only when a RTE flow is using the meter hierarchy,
will iterate all meters of the hierarchy and create needed sub-
policies and color rules for them.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:33 +02:00
Xiaoyu Min
a26cc30fa0 net/mlx5: limit inner RSS expansion for MPLS
If user wants to do MPLS inner RSS and only provides pattern
till MPLS without inner items [1], RSS expansion will expand flows
into 13 sub-flows[2] which is too many and it impacts flow insert
rate, stack usage becomes large as well.

This expansion into 13 sub-flows seems not worthy of and it can
be significantly reduced (i.e, 7 sub-flows [3]) by user providing
at least one inner L2/L3 item [4].

[1]:
pattern eth / ipv4 / udp / mpls / end actions rss type tcp udp ip
end level 2 / end

[2]:
eth / ipv4 / udp / mpls
eth / ipv4 / udp / mpls / ipv4
eth / ipv4 / udp / mpls / ipv4 / udp
eth / ipv4 / udp / mpls / ipv4 / tcp
eth / ipv4 / udp / mpls / ipv6
eth / ipv4 / udp / mpls / ipv6 / udp
eth / ipv4 / udp / mpls / ipv6 / tcp
eth / ipv4 / udp / mpls / eth / ipv4
eth / ipv4 / udp / mpls / eth / ipv4 / udp
eth / ipv4 / udp / mpls / eth / ipv4 / tcp
eth / ipv4 / udp / mpls / eth / ipv6
eth / ipv4 / udp / mpls / eth / ipv6 / udp
eth / ipv4 / udp / mpls / eth / ipv6 / tcp

[3]:
eth / ipv4 / udp / mpls / eth
eth / ipv4 / udp / mpls / eth / ipv4 / udp
eth / ipv4 / udp / mpls / eth / ipv4 / tcp
eth / ipv4 / udp / mpls / eth / ipv6
eth / ipv4 / udp / mpls / eth / ipv6 / udp
eth / ipv4 / udp / mpls / eth / ipv6 / tcp

[4]:
pattern eth / ipv4 / udp / mpls / eth / end actions rss type tcp udp ip
level 2 / end

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:32 +02:00
Xiaoyu Min
84f4764c22 net/mlx5: fix MPLS RSS expansion
MPLSoUDP and MPLSoGRE are supported by PMD from
rte flow point of view.

RSS expansion doesn't support above but, instead, supports
normal MPLS over L2, which actually will be rejected by PMD.

This patch removes RSS expansion support of the MPLS over L2
and adds support of MPLSoUDP and MPLSoGRE.

In addition to above, support for eth over MPLS expansion is
added too.

Fixes: a4a5cd21d2 ("net/mlx5: add flow MPLS item")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:31 +02:00
Xiaoyu Min
14ad99d78a net/mlx5: remove unsupported flow item MPLS over IP
HW doesn't support match MPLS over IP traffic.

Remove related code.

Fixes: d1abe664dd ("net/mlx5: add MPLS to Direct Verbs flow engine")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:31 +02:00
Alexander Kozyrev
8762718d72 net/mlx5: fix offset calculation for modify field action
Offsets are not taken into account during MAC addresses
manipulation for the MODIFY_FIELD action. That leads to
a wrong split between 0-15 and 16-47 bits and corrupted
data being copied to/from MAC addresses. Use both source
and destination offsets to calcucate the proper modify
header action specification.

Fixes: fdd0c046f4 ("net/mlx5: fix modify field action order for MAC")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:30 +02:00
Gregory Etelson
cdc32d127e net/mlx5: fix L4 integrity translation
MLX5 PMD supports L3 and L4 integrity bits.
L4 checksum-ok bit was not translated correctly.
The patch updates the l4_csum_ok integrity bit translation.

Fixes: 79f8952783 ("net/mlx5: support integrity flow item")
Cc: stable@dpdk.org

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:30 +02:00
Viacheslav Ovsiienko
32d1e4dbad common/mlx5: fix Netlink receive message buffer size
If there are many VFs the Netlink message length sent by kernel
in reply to RTM_GETLINK request can be large. We should query
the size of message being received in advance and allocate
the large enough buffer to handle these large messages.

Fixes: ccdcba53a3 ("net/mlx5: use Netlink to add/remove MAC addresses")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:29 +02:00
Xiaoyu Min
4b1cb50a86 net/mlx5: fix match MPLS over GRE with key
Currently PMD needs previous layer information in order to set
corresponding match field for MPLSoGRE or MPLSoUDP.

GRE_KEY item is missing as supported previous layer when translate
item MPLS, which causes flow[1] cannot match MPLS over GRE traffic.

According to RFC4023, MPLS over GRE tunnel with optional key
field needs to be supported too.

By adding missing GRE_KEY as supported previous layer fix problem.

[1]:
flow create 0 ingress pattern eth / ipv6 / gre k_bit is 1 / gre_key /
mpls label is 966138 / end actions queue index 1 / mark id 0xa / end

Fixes: a7a0365565 ("net/mlx5: match GRE key and present bits")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:28 +02:00
Gregory Etelson
be548b9c9d net/mlx5: fix pattern expansion in RSS flow rules
Flow rule pattern may be implicitly expanded by the PMD if the rule
has RSS flow action. The expansion adds network headers to the
original pattern. The new pattern lists all network levels that
participate in the rule RSS action.

The patch validates that buffer for expanded pattern has enough bytes
for new flow items.

Fixes: c7870bfe09 ("ethdev: move RSS expansion code to mlx5 driver")
Cc: stable@dpdk.org

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:28 +02:00
Haifei Luo
5db9318f76 net/mlx5: add more details to flow dump
Currently the flow dump provides few information about actions
- just the pointers. Add implementations to display details for
counter, modify_hdr and encap_decap actions.

For counter, the regular flow operation query is engaged and
the counter content information is provided, including hits
and bytes values.For modify_hdr, encap_and decap actions,
the information stored in the ipool objects is dumped.

There are the formats of information presented in the dump:
	Counter: rec_type,id,hits,bytes
	Modify_hdr: rec_type,id,actions_number,actions
	Encap_decap: rec_type,id,buf

Signed-off-by: Haifei Luo <haifeil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:27 +02:00
Feifei Wang
1c196da274 net/mlx5: fix r/w lock usage in DMA unmap
For mlx5 DMA unmap, write lock should be used for rebuilding memory
region cache table rather than read lock.

Fixes: 989e999d93 ("net/mlx5: support PCI device DMA map and unmap")
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:27 +02:00
Shun Hao
48fbc1be82 net/mlx5: fix meter policy flow match item
Currently when creating meter policy, a src port_id match item will
always be added in switch domain. So if one meter is used by another
port, it will not work correctly.

This issue is solved:
1. If policy fate action is port_id, add the src port_id match item,
and the meter cannot be shared by another port.
2. If policy fate action isn't port_id, don't add the src port_id
match, meter can be shared by another port.

This fix enables one meter being shared by different ports. User can
create a meter flow using a port_id match item to make this meter
shared by other port.

Fixes: afb4aa4f12 ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:26 +02:00
Shun Hao
3c481324ba net/mlx5: fix meter flow direction check
When preparing prefix flow using ASO meter, if it's tx flow, need
to make meter action the first one.

Currently the check of flow direction in switch domain is incorrect
that it checks the flow dev port only.

This adds the fix for the check that if there's port_id match item
in flow, use that port_id as src port to determine flow direction.

Fixes: c99b4f8bc2 ("net/mlx5: support ASO meter action")
Cc: stable@dpdk.org

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:25 +02:00
Shun Hao
efcce4dcdc net/mlx5: fix meter policy ID table container
The meter policy handlers are managed by user IDs and the driver used l3
table in order to map the user ID to the internal driver handler of the
policy.

The l3 table was wrongly saved in the shared device structure which
manages all the switch domain ports what made the user IDs shared
between different ethdev ports.

Move the policy l3 table to be per port by saving it in the port private
structure.

Fixes: afb4aa4f12 ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:25 +02:00
Shun Hao
a295c69a8b net/mlx5: optimize meter profile lookup
Currently a list is used to save all meter profile ids, which is
not efficient when looking up profile from huge amount of profiles.

This changes to use an l3 table instead to save meter profile ids,
so as to improve the lookup performance.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:24 +02:00
Viacheslav Ovsiienko
52e1ece50a net/mlx5: fix TSO multi-segment inline length
The inline data length for TSO ethernet segment should be
calculated from the TSO header instead of the inline size
configured by txq_inline_min devarg or reported by the NIC.
It is imposed by the nature of TSO offload - inline header
is being duplicated to every output TCP packet.

Fixes: cacb44a099 ("net/mlx5: add no-inline Tx flag")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:23 +02:00
Jiawei Wang
b3880af2ce net/mlx5: fix representor ID check for sampling
The representor definition was introduced in the latest code.
For non-representor port, like PF port, use the 0xffff instead of -1.

This patch updates the representor id checking during splitting sample
flow.

Fixes: cb95feefdd ("net/mlx5: support sub-function representor")
Cc: stable@dpdk.org

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
2021-07-08 22:09:22 +02:00
Michael Baum
2f6c2adbe5 common/mlx5: fix memory region leak
All the mlx5 drivers using MRs for data-path must unregister the mapped
memory when it is freed by the dpdk process.

Currently, only the net/eth driver unregisters MRs in free event.

Move the net callback handler from net driver to common.

Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:22 +02:00
Jiawei Wang
3057f33779 net/mlx5: fix flow modify action validation
The introduced MODIFY_FIELD action was used to manipulate
the packet header field through copy or set operations.

These modify header actions should be counted as one action
in low level, the current code used wrong actions flags
checking for modify field action.

This patch update the action flags checking into the correct
MODIFY_HDR_ACTIONS set.

Fixes: 641dbe4fb0 ("net/mlx5: support modify field flow action")
Cc: stable@dpdk.org

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:21 +02:00
Viacheslav Ovsiienko
ec837ad0fc net/mlx5: fix multi-segment inline for the first segments
Before 19.08 release the Tx burst routines of mlx5 PMD
provided data inline for the first short segments of the
multi-segment packets. In the release 19.08 mlx5 Tx datapath
was refactored and this behavior was broken, affecting the
performance.

For example, the T-Rex traffic generator might use small
leading segments to handle packet headers and performance
degradation was noticed.

If the first segments of the multi-segment packet are short
and the overall length is below the inline threshold it
should be inline into the WQE to fix the performance.

Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-08 22:09:20 +02:00
Li Zhang
9f4a192328 net/mlx5: fix meter policy with RSS action
When creating the meter sub-policy RSS rule,
the RSS descriptor was used before its update.
It also need update tunnel bit in RSS descriptor
after flow translate.

Use it only when it is updated.

Fixes: ec962bad14 ("net/mlx5: fix metering cleanup on stop")
Cc: stable@dpdk.org

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:20 +02:00
Tal Shnaiderman
a6a18d06f5 net/mlx5: add TCP and IPv6 to supported items for Windows
WINOF2 2.70 Windows kernel driver allows DevX rule creation
of types TCP and IPv6.

Added the types to the supported items in mlx5_flow_os_item_supported
to allow them to be created in the PMD.

Added description of new rules support in Windows kernel driver WINOF2 2.70
to the mlx5 driver guide.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-07-08 22:09:13 +02:00
Ajit Khaparde
5ba7c65864 net/bnxt: fix Rx interrupt setting
Don't set rxq interrupt config
Applications can set the rxq interrupt config to 1 or 0 as needed.
If an application is not interested in handling Rx interrupts and
prefers to poll Rx rings, there is no need for the PMD to set this
config option to 1.

Fixes: 1fe427fd08 ("net/bnxt: support enable/disable interrupt")
Cc: stable@dpdk.org

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
2021-07-08 05:55:57 +02:00
Ajit Khaparde
b3fa83945a net/bnxt: fix ring allocation and free
Fix handling of ring alloc and free logic to fix check for invalid ring and
context IDs. This also avoids code duplication.

Fixes: 6133f20797 ("net/bnxt: add Rx queue create/destroy")
Fixes: 51c87ebafc ("net/bnxt: add Tx queue create/destroy")
Cc: stable@dpdk.org

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
2021-07-08 05:55:56 +02:00
Ajit Khaparde
0105ea1296 net/bnxt: support runtime queue setup
Add support for runtime Rx and Tx queue setup. This will allow
Rx/Tx queue setup after the interface is started.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-07-08 05:55:56 +02:00