Use raw mbuf free on inline security error to simulate
HW NPA free instead of doing rte_pktmbuf_free(). This
is needed as the callback will not be called from
DPDK lcore.
Fixes: 69daa9e502 ("net/cnxk: support inline security setup for cn10k")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Realloc inline dev XAQ when Rx/Tx security ie enabled with
new packet pool as XAQ should be large enough to hold all
mbufs if inline outbound reports error or all mbufs.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Register callback early to handle initial error packets from
inline device.
Fixes: 69daa9e502 ("net/cnxk: support inline security setup for cn10k")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Fix inline device RQ tagmask to get packets with receive errors
as type ETHDEV packets to callback handler so that packet buffers
can get freed. Currently only IPsec denied packets get the right
tag mask.
Fixes: ee48f711f3 ("common/cnxk: support NIX inline inbound and outbound setup")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Removed roc NPC APIs which tracks addition and deletion of
mark actions. It was earlier needed to track number of mark
actions added as part of flow rules. If mark actions count
is > 0, then the function pointer for Rx would get updated
to even read mark value from CQE/WQE and populate in mbuf.
Now the same switch is done based on new Rx meta data negotiate
ethdev API.
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Added rx_metadata_negotiate API to enable mark update RX offload.
Removed software logic to enable/disable mark update inside flow
create/destroy APIs.
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Allow force use of SSO device even when inline dev is available
so that in case driver needs events directly delivered to
event device.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Enable time counter based threshold for raising SSO
EXE_INT instead of IAQ threshold. Time counter based
threshold helps getting periodic interrupts and process
pkts in burst instead of getting HW to raise an interrupt
for every new work.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to enable AURA tail drop via RQ specifically
for inline device RQ's pkt pool. This is better than RQ
RED drop as it can be applied to all RQ's that are not
having security enabled but using same packet pool.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Enable the field in SA to write back L2, L3 headers in case
of errors during inline processing.
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Use common SA init API before doing initialization based on
params. This is better so that all HW specific default values
are at single place for lookaside and inline.
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Update the inline device functions to work when roc_nix is NULL.
This is required, as IPsec driver have to use these APIs to work
with inline IPsec device, but the IPsec driver might not have roc_nix
information.
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Provide a method to get floor values for a requested shaper rate,
which can assure packets should never be transmitted at a rate higher
than configured.
Keep the old API to get HW suggested values.
And introduce new parameter to select appropriate API.
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to realloc inline device XAQ AURA with more
buffers of new packet pool AURA.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
CN10K supports up to 832 resources at SMQ level, so increase
bitmap count to 1024.
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adds support for priority flow control support for CNXK
platforms.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
CNXK platforms support priority flow control(802.1qbb) to pause
respective traffic per class on that link.
Adds RoC interface to configure priority flow control on MAC
block i.e. CGX on cn9k and RPM on cn10k.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Fix bug in batch alloc issue failure path where it was
enqueuing invalid pointers back to the pool. The code
should rightly be falling back to default dequeue path
in such cases.
Fixes: 91531e63f4 ("mempool/cnxk: add cn10k batch dequeue")
Cc: stable@dpdk.org
Signed-off-by: Ashwin Sekhar T K <asekhar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The "tx_db_nc" devarg forces doorbell register mapping to non-cached
region eliminating the extra write memory barrier. This argument was
used in creating the UAR for Tx and thus affected its performance.
Recently [1] its use has been extended to all UAR creation in all mlx5
drivers, and now its name is no longer so accurate.
This patch changes its name to "sq_db_nc" to suit any send queue that
uses it. The old name will still work for backward compatibility.
[1] commit 5dfa003db5 ("common/mlx5: fix post doorbell barrier")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Adds new documentation for MLX5 common driver that contains:
- Its features list (doesn't exist for now).
- Its devargs description.
- Device configuration information and tutorial.
- Quick Start Guide for Mellanox OFED/EN.
Move into this doc all shared information from other MLX5 PMD docs and
add them reference to new common doc.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Vectorized routines were removed in result of Tx datapath refactoring,
and devarg keys documentation was updated.
However, more updating should have been done. In environment variables
doc, there was explanation according to vectorized Tx which isn't
relevant anymore.
This patch removes this irrelevant explanation.
Fixes: a6bd4911ad ("net/mlx5: remove Tx implementation")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
One of the E-Switch vports plays the special role - it is assigned as
"E-Switch manager" and has some special exclusive rights and duties - it
maintains all the representors, manages FDB domain flows, etc. By
default, the E-Switch vport index was supposed to be zero on standalone
NICs (regular ConnectX) and 0xFFFE SmartNIC (BlueField), but that was
not always correct - this index can be assigned with any value by
kernel/hypervisor.
Currently the E-Switch manager vport id is supposed to be default - 0
for standalone NICs, and 0xFFFE for the SmartNICs, and is deduced from
the device PCI id.
To handle this and do not suggest any default values, can use DevX API
to query E-Switch manager vport ID directly from the firmware during
initialization, and use that value by default. If the new method is not
provided (legacy firmware), fallback to use the PCI id approach.
Fixes: a564038699 ("net/mlx5: support E-Switch manager egress traffic match")
Cc: stable@dpdk.org
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Recently shared RxQ has been introduced. All shared Rx queues with same
group and queue ID share the same rxq_ctrl, but each one has
mlx5_rxq_priv structure.
The mlx5_rx_queue_setup generates a new rxq_priv structure, and looks
for a rxq_ctrl structure to refer to. If there is already a compatible
rxq_ctrl structure it refers it, otherwise it calls the mlx5_rxq_new
function that generates a new one.
This patch makes mlx5_rxq_new function "standalone", it generates a
rxq_ctrl structure regardless to specific rxq_priv structure. All
operations on the rxq_ctrl structure that depend on the new rxq_priv
structure are performed in the mlx5_rx_queue_setup function, at the same
place for either a new rxq_ctrl structure or an existing rxq_ctrl
structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The mlx5_rxq_new function creates control structure and if it from
shared group, it is inserted into the shared RXQs list.
After that, there are some validations which in case they fail, RxQ
control object is released.
In these cases, invalid pointer to the object still in the list, and
access it may cause a crash.
Move the list insertion to the end of the function where the RxQ control
object is surely valid.
Fixes: 09c2555303 ("net/mlx5: support shared Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Previously API flow_dv_query_count_ptr is defined to get counter's
action pointer. This DV function is directly called and the better
way is by the callback.
Add one arg in API mlx5_counter_query and the related callback
counter_query. The added arg is for counter's action pointer.
Signed-off-by: Haifei Luo <haifeil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
If meter policy action was RSS, the correct items were not provided
for sub-policy creation.
This fixes the issue by providing original items in meter split, so
the sub-policy creation gets the correct items.
Fixes: 3c481324ba ("net/mlx5: fix meter flow direction check")
Cc: stable@dpdk.org
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The mlx5_l3t_prepare_entry() function is not used anymore.
This commit removes the unused mlx5_l3t_prepare_entry() function.
Fixes: 92ef4b8f16 ("ethdev: remove deprecated shared counter attribute")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
While mlx5_hlist_create() failed, the rte_flow_error was not filled
with the corresponding error information.
This commit adds the missing rte_flow_error_set() for the failure case.
Fixes: f3020a331d ("net/mlx5: optimize hash list table allocate on demand")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
VDPA driver creates two QPs(1 queue pair include 1 send queue
and 1 receive queue) per virtio queue to get traffic events
from NIC to SW.
Two QPs(called FW QP and SW QP) are created as loopback QP
and FW QP'SQ is connected to SW QP'RQ internally.
When packet receive or send out, HW will send WQE by FW QP'SQ,
then SW will get CQE from the CQ of SW QP.
With large scale and heavy traffic, the SQ's request may fail
to get ACK from RQ HW, because HW is busy.
SQ will retry the request with qpc.retry_count times and each time
wait for 4.096 uS *2^(ack_timeout) for the response. If still can’t
get RQ’s HW response, SQ will go to an error state.
16 is experienced value. It should not be too high or too low.
Too high will make QP waits too long in case it’s packet drop.
Too low will cause QP to go to an error state(retry-exceeded) easily.
Fixes: 15c3807e86 ("common/mlx5: support DevX QP operations")
Cc: stable@dpdk.org
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This release contains multiple bug fixes and improvements, including
- Removal of the linearization function from the xmit Tx path. The
DPDK assumes checking for the mbuf segments number in the Tx prepare
function.
- Extra logs, statistics, checks...
- Cleanup of the unused variables and definitions.
- Configurable Link Status event.
- Improvements for the timer service and the reset.
- Usage of the optimized memcpy on ARM.
- MP awareness improvements - extra API support for the secondary
processes (like reading basic statistics).
- Support of the xstats API to get xstat names by ID.
- Configurable Tx completions timeout.
- Proper setting of the meta-descriptor's DF flag.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Some HW may invalidly set checksum error bit for the valid L4 checksum.
To avoid drop of the packets in that situation, do not indicate bad
checksum for L4 Rx csum offloads. Instead, set it as unknown, so the
application will re-verify this value.
The statistics counters will still work as previously.
Fixes: 05817057fa ("net/ena: fix indication of bad L4 Rx checksums")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
The ena_com_config_dev_mode() performs many calculations related to LLQ
and then performs an admin queue call to configure LLQ in the device.
All of the operations performed by ena_com_config_dev_mode() are
unnecessary if membar hasn't been found. Move the dev_mem_base check
before ena_com_config_dev_mode() call. This prevents the unnecessary
operations from being performed.
Fixes: 2fca2a98c0 ("net/ena: support LLQv2")
Cc: stable@dpdk.org
Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Add information about port id, queue id and req_id to error logs in
validate_tx_req_id.
Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Whenever Tx checksum offload is being used, the meta descriptor content
is taken into consideration. Setting DF field properly in the meta
descriptor may have huge impact on the performance both for the IPv4 and
IPv6 packets.
The requirements for the df field are as below:
* No offload used - value doesn't matter
* IPv4 - 0 or 1, depending on the DF flag in the IPv4 header
* IPv6 - 1
Setting DF to 0 causes the packet to enter the slow-path in the HW and
as a result can noticeable impact the performance.
Moreover, as 'true' may not always be mapped to 1 depending on it's
definition for the given platform/compiler, for safety DF field is being
set explicitly to 1.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
The default missing Tx completion timeout was set to 5 seconds.
In order to provide users with the interface to control this timeout
to adjust it with the application's watchdog, the device argument for
controlling this value was added.
The parameter is called 'miss_txc_to' and can be modified using the
devargs interface:
./app -a <bdf>,miss_txc_to=UINT_NUMBER
This parameter accepts values from 0 to 60 and indicates number of
seconds after which the Tx packet will be considered as missing.
HW hints for the Tx completions timeout were removed to do not overwrite
parameter from the user. Also specifying default Tx completion timeout
value was moved from the configuration to init phase in order to
simplify default value assignment.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
When triggering the reset, no check was performed to see if the reset
was already triggered. This could result in original reset reason being
overwritten. Add ena_trigger_reset helper function, which checks if the
reset was triggered and only sets the reset reason if the reset wasn't
triggered yet. Replace all occurrences of manually setting the reset
with ena_trigger_reset call.
Fixes: 2081d5e2e9 ("net/ena: add reset routine")
Cc: stable@dpdk.org
Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
ENA was only supporting retrieval of all the xstats name and wasn't
implementing the eth_xstats_get_names_by_id API.
As this API may be more efficient than retrieving all the names, it
tries to avoid excessive string copying.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
ENA driver did not allow applications to call tx_cleanup. Freeing Tx
mbufs was always done by the driver and it was not possible to manually
request the driver to free mbufs.
Modify ena_tx_cleanup function to accept maximum number of packets to
free and return number of packets that was freed.
Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Originally, the ena_com memzone counter was shared by ports, which
caused the memzones to be harder to identify and could potentially
lead to race and because of that the counter had to be atomic.
This atomic counter was global variable and it couldn't work in the
multiprocess implementation.
The memzone is now being identified by the local to port memzone counter
and the port ID - both of those information can be found in the shared
data, so it can be probed easily.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Since statistic gathering is now proxied safely to primary process, it
can be enabled in secondary processes.
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Due to how the ena_com compatibility layer is written, all AQ commands
triggering functions use stack to save results of AQ and then copy them
to user given function.
Therefore to keep the compatibility layer common, introduce ENA_PROXY
macro. It either calls the wrapped function directly (in primary
process) or proxies it to the primary via DPDK IPC mechanism. Since all
proxied calls are taken under a lock share the result data through
shared memory (in struct ena_adapter) to work around 256B IPC parameter
size limit.
New proxy calls can be added by
1. Adding a new message type at the end of enum ena_mp_req
2. Adding new message arguments to the struct ena_mp_body if needed
3. Defining proxy request descriptor with ENA_PROXY_DESC. Its arguments
include handlers for request preparation and response processing.
Any of those may be empty (aside of marking arguments as used).
4. Adding request handling logic to ena_mp_primary_handle()
5. Replacing proxied function calls with ENA_PROXY(adapter, <func>, ...)
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
As the default behavior for arm64 is to alias rte_memcpy as memcpy, ENA
cannot redefine memcpy as rte_memcpy as it would cause nested
declaration.
To make it possible to use optimized memcpy in the ena_com layer on Arm,
the driver now redefines memcpy when it is beneficial:
* For arm64 only when the flag RTE_ARCH_ARM64_MEMCPY was defined
* For arm only when the flag RTE_ARCH_ARM_NEON_MEMCPY was defined
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
To increase likelihood that current burst will fit in the HW rings,
perform Tx cleanup before pushing packets to the HW. It may increase
latency a bit for sparse bursts, but the Tx flow now should be more
smooth.
It's also common order in the Tx burst function for other PMDs.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Some user applications may not support PMD reset handling. If they will
support timer service it could cause a situation, when information
about the reset trigger is being showed every time the timer service is
being called.
Timer service is now being skipped if the reset was already triggered.
Fixes: d9b8b106bf ("net/ena: add watchdog and keep alive AENQ handler")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
ENA uses AENQ for notification about various events, like LSC, keep
alive etc. By default it was enabling all AENQ that were supported by
both the driver and the device. As a result the LSC was always processed
even if the application turned it off explicitly.
As the DPDK provides application with the possibility to configure the
LSC, ENA should respect that. AENQ groups are now being updated upon
configure step, thus LSC can be activated or disabled between ENA PMD
reconfigurations. Moreover, the LSC capability for the device is being
determined dynamically.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
* Split 'bad_csum' Rx statistic into 'l3_csum_bad' and 'l4_csum_bad' to
be able to check which checksum was not calculated properly.
* Add l4_csum_good statistic, which shows how many times L4 Rx checksum
was properly offloaded.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Those variables are being set, but never read. As they seem to be
leftover from the old offloads API and don't have any purpose right
now, they are simply being removed.
Fixes: a4996bd89c ("ethdev: new Rx/Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Artur Rojek <ar@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
The enumeration seems to be leftover from porting the Linux driver to
the DPDK. It was used nowhere and refers to the ethtool which is not
present in the DPDK.
Fixes: 372c1af5ed ("net/ena: add dedicated memory area for extra device info")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Artur Rojek <ar@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>