Initialize the global parameters structure to avoid a segmentation fault
in the TRUFLOW global configuration set API.
Fixes: 0a58be6f7c1e ("net/bnxt: add access to NAT global register")
Cc: stable@dpdk.org
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The pkt_type field in the profile TCAM table needs to be ignored and
should not be set to the normal packet type. Packets that are segmented
by the driver's transmit segment offload feature are not marked with the
normal pkt_type, which results in a profile TCAM table miss. The flow is
then not offloaded, which reduces throughput.
Fixes: fe82f3e02701 ("net/bnxt: support exact match templates")
Cc: stable@dpdk.org
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
In the Stingray use case, representors conventionally run inside the
SoC domain and represent functions that are in the x86 domain. To
support building representors for endpoints that are not in the same
host domain, additional devargs have been added to the PMD:
rep-based-pf=<physical index> rep-is-pf=<VF=0 or PF=1>
where `rep-based-pf` specifies the physical index of the base PF
from which the representor is derived.
Since representors can be created for endpoint PFs as well,
rename struct bnxt_vf_representor to bnxt_representor, along with the
related dev_ops and function names.
The devargs have also been extended to specify the exact CoS queue and
the flow control enablement to be used for the conduit between the
representor and the endpoint function.
This is how a sample devargs string looks with all the extended devargs:
-w 0000:06:02.0,host-based-truflow=1,representor=[1],rep-based-pf=8,
rep-is-pf=1,rep-q-r2f=1,rep-fc-r2f=0,rep-q-f2r=1,rep-fc-f2r=1
Call CFA_PAIR_ALLOC only in case of Stingray instead of CFA_VFR_ALLOC.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Set the TRUFLOW Enable bit in bp->flags only if the value passed in
devargs was 1. Otherwise set it to 0.
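
The set-or-clear behavior described above can be sketched as follows.
This is a minimal illustration, not the driver's code: the flag bit
position and the names example_apply_truflow_devarg() and
EXAMPLE_FLAG_TRUFLOW_EN are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical flag bit; the driver's real bit position is not given here. */
#define EXAMPLE_FLAG_TRUFLOW_EN (UINT32_C(1) << 5)

/*
 * Set the flag only when the devarg value is exactly 1 and clear it for
 * every other value, so a stale bit can never survive.
 */
static inline uint32_t
example_apply_truflow_devarg(uint32_t flags, unsigned long devarg_value)
{
	if (devarg_value == 1)
		flags |= EXAMPLE_FLAG_TRUFLOW_EN;
	else
		flags &= ~EXAMPLE_FLAG_TRUFLOW_EN;
	return flags;
}
```

Clearing the bit explicitly in the else branch is what distinguishes
this from a set-only update.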
Fixes: 313ac35ac701 ("net/bnxt: support ULP session manager init")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
In page_roundup(), left shifting by more than 31 bits is undefined
behavior because the return value is an int, and page_getenum() can
return a value as high as 63.
Fix this by capping the return value to less than 32.
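
A minimal sketch of the capping idea, assuming a helper that computes
the shift count and a caller that shifts by it. The names
example_page_shift(), example_page_roundup() and EXAMPLE_MAX_SHIFT are
hypothetical and are not the driver's functions; unsigned arithmetic is
used here so that a shift by 31 is also well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap: the shift count must stay below the 32-bit width. */
#define EXAMPLE_MAX_SHIFT 31u

/* Smallest n with (1 << n) >= size, never larger than EXAMPLE_MAX_SHIFT. */
static inline unsigned int
example_page_shift(size_t size)
{
	unsigned int n = 0;

	while (n < EXAMPLE_MAX_SHIFT && ((size_t)1 << n) < size)
		n++;
	return n;
}

/* Round up to a power of two; the capped shift keeps this well defined. */
static inline uint32_t
example_page_roundup(size_t size)
{
	return UINT32_C(1) << example_page_shift(size);
}
```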
Coverity issue: 343463
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The rx_queue_setup_op for representors was using a common function to
initialize the software data structures for the Rx ring. That routine
also initializes other rings that representors do not need, such as the
cp and agg rings.
Define and invoke a new function that sets up the structures for the
representor Rx ring only.
Fixes: 6dc83230b43b ("net/bnxt: support port representor data path")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Add a template to support IPv6 VXLAN flows, enabling VXLAN decap for
those flows.
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kernel v5.10 will introduce the ability to efficiently share a UMEM
between AF_XDP sockets bound to different queue ids on the same or
different devices. This patch integrates that functionality into the AF_XDP
PMD.
A PMD will attempt to share a UMEM with others if the shared_umem=1 vdev
arg is set. UMEMs can only be shared across PMDs with the same mempool, up
to a limited number of PMDs governed by the size of the given mempool.
Sharing UMEMs is not supported for non-zero-copy (aligned) mode.
The benefit of sharing UMEM across PMDs is a saving in memory due to not
having to register the UMEM multiple times. Throughput was measured to
remain within 2% of the default mode (not sharing UMEM).
A version of libbpf >= v0.2.0 is required and the appropriate pkg-config
file for libbpf must be installed such that meson can determine the
version.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
When starting a VF that has no initial MAC address assigned by the
underlying host PF driver, reuse the MAC address assigned during VF
initialization.
Fixes: f69166c9a3c9 ("net/ixgbe: fix reset error handling")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
To manage encap/decap header format actions, the mlx5 PMD used a singly
linked list. Lookup and insertion took too long when there were millions
of objects, which reduced the flow insertion/deletion rate.
To improve performance, a hashed list is used instead. The list
implementation is updated to support non-unique keys with few
collisions.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In case of bonding, the device ifindex was detected as the PF ifindex, so
any operation using the ifindex was applied to the PF instead of the bond
device. These operations include MTU get/set, up/down and MAC address
manipulation.
This patch detects the bond interface ifindex and name for a PF that
joins a bond interface and uses them by default for netdev operations.
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The Rx queue start/stop feature is not supported if the vectorized
rx_burst routine is engaged. There was a typo in the routine address
and the rx_burst type check was wrong.
Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The driver should not send the phy_cfg request to bring the link down
during reset recovery. If the driver sends the phy_cfg request
during recovery, the FW needs to re-establish the link, which
increases the recovery time depending on the PHY type and link partners.
Fixes: df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Use the correct value offset for PCI function stats.
Fixes: 5f9374de2a3a ("net/bnxt: add PCI function stats to extended stats")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Currently only the GTPU inner hash is enabled, not the GTPU outer hash.
The outer protocol still needs to co-exist with the inner protocol when
configuring a GTPU inner hash rule, so that the GTPU inner hash can be
supported for different outer protocols.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The hardware supports many kinds of FlexiMDs set into the Rx descriptor,
and the FlexiMDs can have different offsets in the descriptor according
to the DDP package setting.
The FlexiMD type and offset are identified by the RXDID, which is used
to set up the queue.
To allow support for different RXDIDs in the future, refactor the Rx
FlexiMD handling into functions mapped to the related RXDIDs.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Fix the Rx/Tx byte statistics counters overflowing at the 48-bit limit
by extending that limit.
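
The text does not say how the limit is extended. One common approach,
shown here purely as an assumption, is to accumulate the 48-bit hardware
value into a 64-bit software counter using modulo-2^48 deltas. All names
below are hypothetical.

```c
#include <stdint.h>

#define EXAMPLE_CNT_48_MASK ((UINT64_C(1) << 48) - 1)

struct example_counter {
	uint64_t last_raw; /* last 48-bit value read from the hardware */
	uint64_t total;    /* accumulated 64-bit software value */
};

/*
 * Fold a new raw reading into the 64-bit total. The masked subtraction
 * gives the correct delta as long as the hardware counter wrapped at
 * most once between two readings.
 */
static inline uint64_t
example_counter_update(struct example_counter *c, uint64_t raw)
{
	raw &= EXAMPLE_CNT_48_MASK;
	c->total += (raw - c->last_raw) & EXAMPLE_CNT_48_MASK;
	c->last_raw = raw;
	return c->total;
}
```

The sketch only works if the counter is polled more often than it can
wrap, which is a property of the caller rather than of this helper.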
Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
While receiving packets, reserving the fill queue can fail because the
buffer ring is shared between Tx and Rx and may be temporarily
unavailable. As a result, both the fill queue and the Rx queue will be
empty.
The kernel side is then unable to receive packets because the fill queue
is empty, and DPDK is unable to reserve the fill queue because it has no
packets to receive, so a deadlock occurs.
Move the fill queue reservation before xsk_ring_cons__peek to fix it.
Cc: stable@dpdk.org
Signed-off-by: RongQing Li <lirongqing@baidu.com>
Signed-off-by: Dongsheng Rong <rongdongsheng@baidu.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
The new HAL allows the driver to read extra ENI stats. The exact meaning
of each of them can be found in the ena_admin_eni_stats structure in the
base/ena_defs/ena_admin_defs.h file.
The ena_eni_stats structure is exactly the same as ena_admin_eni_stats,
but it was required to be added for compatibility with xstats macros.
Reading ENI stats requires communication with the admin queue.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
There are cases where admin queue commands are issued after the
configuration phase has finished. For example, the application could
ask for the driver statistics from multiple cores at once.
As the admin queue is not thread safe by design, a spinlock was added
to protect all uses of the admin queue after the configuration is
done.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The current ena_com version was generated on 26.04.2020.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Checking that cdesc is not NULL does not make sense if the idx argument
is not 0, so the check can be skipped, as the error would not be
detected anyway.
To simplify that, only the 'i' value is verified, and the code breaks
out of the infinite loop once all descriptors have been copied into the
buffer.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
When filling out the meta descriptor, all values should be converted to
the desired type (u32 in case of the meta descriptor) to prevent data
loss.
For example, io_sq->phase is of type u8. If
ENA_ETH_IO_TX_META_DESC_PHASE_SHIFT were greater than 8, all data
would be lost.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Static code analysis showed that meta_desc can be NULL. To avoid
dereferencing a NULL pointer, an extra check was added to verify that
the pointer is valid.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
To minimize chance of integer overflow, the type of admin statistics was
changed from u32 to u64.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
To align the error checking code with other parts of ena_com,
the condition that tests for the error was wrapped inside
unlikely().
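
As a rough sketch of the pattern, an error condition can be wrapped in a
branch hint as below. The macro and function names are hypothetical, and
the macro relies on the GCC/Clang __builtin_expect builtin, which is an
assumption about the toolchain.

```c
#include <stddef.h>

/* Branch hint: tells the compiler the condition is expected to be false. */
#define example_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical operation that rejects a NULL descriptor on the cold path. */
static inline int
example_submit(const void *desc)
{
	if (example_unlikely(desc == NULL))
		return -1;
	return 0;
}
```

The hint does not change the result of the check; it only influences how
the compiler lays out the hot and cold paths.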
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
* Function argument style improvement (space after *)
* Align indentation of the define
* Typo fix in the documentation
* Remove extra empty line after license (aligned with other files)
* Extra alignment of one line was fixed
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Setting the RSS hash function may not be supported by the device. In
that case there is no need to fill in the default hash key or even to
allocate the hash key.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
How the RSS key buffer looks from the device perspective is well
defined, so a constant value should be used instead of a magic
number. It also does not have to be calculated dynamically.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
There is no need to keep a single function for both the hash function
and the key. If the caller wanted to get only one of the values, it had
to pass NULL for the other, making the API harder to use.
Besides reading the function from the device, one can also use
ena_com_get_current_hash_function() to get the integer value that
represents the current hash function stored in the ena_com layer.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The Elastic Network Interface (ENI) stats can be acquired from the HW.
They can provide advanced values which can be further used by the
application for better flow management.
They are not available to the DPDK application yet, so the PMD must
expose them directly.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The purpose of this change is general code simplification and
type safety improvement for the logical values.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
As there is no replacement for mmiowb() and there is no need to use both
versions in the DPDK, this ifdef was simply removed.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The wait event is being accessed without making sure that the completion
context exists. The check for that is just below, so it can be used
to release the wait event safely.
Fixes: 3adcba9a8987 ("net/ena: update HAL to the newer version")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Instead of the fixed 5 ms delay in the polling functions, use values
from a given range (by default from 100 us to 5000 us) and increase
them exponentially each time the operation is not finished.
This change can improve the responsiveness of the driver for fast
operations.
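
A minimal sketch of such a delay schedule, assuming the delay doubles on
every retry and is clamped to the upper bound. The doubling factor and
all names are assumptions; only the 100 us and 5000 us defaults come from
the description above.

```c
#include <stdint.h>

#define EXAMPLE_MIN_DELAY_US 100u
#define EXAMPLE_MAX_DELAY_US 5000u

/* Delay to use before the given retry (0 = first poll), in microseconds. */
static inline uint32_t
example_backoff_delay_us(uint32_t attempt)
{
	uint32_t delay;

	/* Bound the exponent so the shift cannot overflow 32 bits. */
	if (attempt > 16)
		attempt = 16;
	delay = EXAMPLE_MIN_DELAY_US << attempt;
	return delay > EXAMPLE_MAX_DELAY_US ? EXAMPLE_MAX_DELAY_US : delay;
}
```

With these defaults the schedule is 100, 200, 400, 800, 1600 and 3200 us,
after which every further retry waits 5000 us.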
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The admin command can return the ENA_ADMIN_RESOURCE_BUSY status, which
means that the given resource cannot currently be used.
However, the request can be repeated, so the status is converted to the
ENA_COM_TRY_AGAIN error code.
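
The conversion can be pictured as a small status-to-error mapping. The
enum, its values and the EXAMPLE_* error codes below are hypothetical
stand-ins, not the ena_com definitions.

```c
#include <errno.h>

/* Hypothetical device status codes; real numeric values are not given here. */
enum example_admin_status {
	EXAMPLE_ADMIN_SUCCESS = 0,
	EXAMPLE_ADMIN_RESOURCE_BUSY,
	EXAMPLE_ADMIN_OTHER_FAILURE,
};

/* Hypothetical driver-side error codes. */
#define EXAMPLE_COM_OK        0
#define EXAMPLE_COM_TRY_AGAIN (-EAGAIN)
#define EXAMPLE_COM_INVAL     (-EINVAL)

static inline int
example_status_to_errno(enum example_admin_status status)
{
	switch (status) {
	case EXAMPLE_ADMIN_SUCCESS:
		return EXAMPLE_COM_OK;
	case EXAMPLE_ADMIN_RESOURCE_BUSY:
		/* Transient condition: the caller may repeat the request. */
		return EXAMPLE_COM_TRY_AGAIN;
	default:
		return EXAMPLE_COM_INVAL;
	}
}
```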
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
ENA_MSLEEP() and ENA_UDELAY() are expected to behave differently: the
first is expected to make the driver sleep, while the other is expected
to busy wait.
In both cases the rte_delay_(u|m)s() function was used, which could
either sleep or block, depending on the configuration.
To make the macros valid, the operations should be specified directly.
Because of that, rte_delay_us_sleep() and rte_delay_us_block() are
now used.
Fixes: 9ba7981ec992 ("ena: add communication layer for DPDK")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Using RTE_MIN(MAX) in ENA_MIN32, ENA_MIN16 and ENA_MIN8 (and the same for
MAX) was not enough, as the HAL code assumes that those macros
convert both arguments to the specified uintX_t type.
As RTE_MIN(MAX) uses the 'typeof' operator, the behavior is not the
same, especially if the arguments have different types (and it could
cause compilation warnings).
To satisfy that, the ENA_MIN_T and ENA_MAX_T macros were added, which
convert both arguments to the type passed as an argument.
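
A sketch of what type-converting min/max macros can look like. The
EXAMPLE_* names and bodies are illustrative assumptions and are not
copied from the driver.

```c
#include <stdint.h>

/*
 * Convert both arguments to the requested type before comparing.
 * Note: each argument is evaluated twice, so avoid side effects.
 */
#define EXAMPLE_MIN_T(type, x, y) \
	((type)(x) < (type)(y) ? (type)(x) : (type)(y))
#define EXAMPLE_MAX_T(type, x, y) \
	((type)(x) > (type)(y) ? (type)(x) : (type)(y))

/* Fixed-width wrappers built on top of the generic macros. */
#define EXAMPLE_MIN16(x, y) EXAMPLE_MIN_T(uint16_t, x, y)
#define EXAMPLE_MAX8(x, y)  EXAMPLE_MAX_T(uint8_t, x, y)
```

For example, EXAMPLE_MIN16(0x10005, 7) compares 5 with 7 because the
first argument is truncated to 16 bits first, which a typeof-based
minimum would not do.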
Fixes: 9ba7981ec992 ("ena: add communication layer for DPDK")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Update the mailbox data structures to sync with the AF driver mbox
changes made to retrieve the VF's base steering rule.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Harman Kalra <hkalra@marvell.com>
The macro defined for milliseconds sleep was not putting the thread
to sleep and was simply calling a delay routine. This fix redefines
the macro to call the correct rte sleep API.
Fixes: ec94dbc57362 ("qede: add base driver")
Cc: stable@dpdk.org
Signed-off-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
Sync the fix from the community patch "fix compile error for old glibc
caused by CLOCK_MONOTONIC_RAW".
Fixes: efeed0894e9c ("net/hinic/base: avoid system time jump")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
Get the default CoS of the PF driver from the chip configuration file.
Fixes: 6691acef0d3d ("net/hinic: support VF")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
The rx_mbuf_alloc_failed value is not reset to 0 when getting stats from
the driver, which may cause this counter to be added up again every time
this op is called.
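
The effect of the missing reset can be sketched as follows, assuming the
total is summed from per-queue counters. The structures and names are
hypothetical and only illustrate the reset-before-accumulate pattern.

```c
#include <stdint.h>

struct example_rxq {
	uint64_t mbuf_alloc_failed; /* per-queue failure count */
};

struct example_stats {
	uint64_t rx_mbuf_alloc_failed; /* device-wide total reported to the caller */
};

static inline void
example_stats_get(const struct example_rxq *rxqs, unsigned int nb_rxq,
		  struct example_stats *stats)
{
	unsigned int i;

	/* Start from zero so repeated calls do not add to an earlier sum. */
	stats->rx_mbuf_alloc_failed = 0;
	for (i = 0; i < nb_rxq; i++)
		stats->rx_mbuf_alloc_failed += rxqs[i].mbuf_alloc_failed;
}
```

Without the reset, calling the function twice on unchanged queues would
report double the real value.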
Fixes: cb7b6606ebff ("net/hinic: add RSS stats and promiscuous ops")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
hinic supports two filter methods: the linear table and the TCAM table.
If enabling the TCAM filter fails but the linear table works, the filter
still needs to be enabled, so in this scenario the driver should not
close the fdir switch.
Fixes: f4ca3fd54c4d ("net/hinic: create and destroy flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
If rte_zmalloc fails, the PMD should also delete the ntuple filter,
ethertype filter, or normal and TCAM filter that was already added.
Fixes: d7964ce192e7 ("net/hinic: check memory allocations in flow creation")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
These helpers will be reused by other libefx consumers, e.g. the vDPA
driver.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
The MCDI helpers interface is now independent of the network driver and
may be moved into the common driver.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>