Commit Graph

313 Commits

Author SHA1 Message Date
Chengwen Feng
0d5c38bac7 ethdev: add error handling mode to device info
Currently, the defined error handling modes include:

1) NONE: no error handling mode is supported by this port.

2) PASSIVE: passive error handling; after the PMD detects that a reset
   is required, it reports the RTE_ETH_EVENT_INTR_RESET event, and the
   application invokes rte_eth_dev_reset() to recover the port.
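
A minimal application-side sketch of the PASSIVE flow described above
(rte_eth_dev_callback_register(), RTE_ETH_EVENT_INTR_RESET and
rte_eth_dev_reset() are standard ethdev APIs; the callback body is
illustrative only):

  #include <stdbool.h>
  #include <rte_common.h>
  #include <rte_ethdev.h>

  static bool reset_pending;

  /* Only record that a reset is needed; the main loop is expected to
   * stop traffic and call rte_eth_dev_reset() later. */
  static int
  reset_event_cb(uint16_t port_id, enum rte_eth_event_type type,
                 void *cb_arg, void *ret_param)
  {
      RTE_SET_USED(port_id);
      RTE_SET_USED(type);
      RTE_SET_USED(cb_arg);
      RTE_SET_USED(ret_param);
      reset_pending = true;
      return 0;
  }

  static void
  register_reset_handler(uint16_t port_id)
  {
      rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_RESET,
                                    reset_event_cb, NULL);
  }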

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2022-10-17 08:26:36 +02:00
David Marchand
1acb7f5474 dev: hide driver object
Make rte_driver opaque for non internal users.
This will make extending this object possible without breaking the ABI.

Introduce a new driver header and move rte_driver definition.
Update drivers and library to use the internal header.

Some applications may have been dereferencing rte_driver objects; mark
this object's accessors as stable.
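
Since applications should no longer peek inside struct rte_driver, a
minimal sketch of the usual alternative, reading the driver name that
rte_eth_dev_info_get() already exposes (standard ethdev API; the helper
name is illustrative):

  #include <stdio.h>
  #include <rte_ethdev.h>

  static void
  print_port_driver(uint16_t port_id)
  {
      struct rte_eth_dev_info info;

      if (rte_eth_dev_info_get(port_id, &info) == 0)
          printf("port %u driver: %s\n", port_id, info.driver_name);
  }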

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
1f37cb2bb4 bus/pci: make driver-only headers private
The pci bus interface is for drivers only.
Mark as internal and move the header in the driver headers list.

While at it, clean up the code:
- fix indentation,
- remove an unneeded reference to the bus-specific singleton object,
- remove an unneeded list head structure type,
- reorder the definitions and macros manipulating the bus singleton object,
- remove the inclusion of rte_bus.h and fix the code that relied on
  implicit inclusion.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
72206323a5 version: 22.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 23.0.
The map files are updated to the new ABI major number (23).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Special handling of removed drivers is also dropped in check-abi.sh and
a note has been added in libabigail.abignore as a reminder.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2022-07-21 12:13:48 +02:00
David Marchand
2449949584 net/ena: fix build with GCC 12
GCC 12 raises the following warning:

In file included from ../lib/mempool/rte_mempool.h:46,
                 from ../lib/mbuf/rte_mbuf.h:38,
                 from ../lib/net/rte_ether.h:22,
                 from ../drivers/net/ena/ena_ethdev.h:10,
                 from ../drivers/net/ena/ena_rss.c:6:
../drivers/net/ena/ena_rss.c: In function ‘ena_rss_key_fill’:
../lib/eal/x86/include/rte_memcpy.h:370:9: warning: array subscript 64 is
        outside array bounds of ‘uint8_t[40]’
        {aka ‘unsigned char[40]’} [-Warray-bounds]
  370 | rte_mov32((uint8_t *)dst + 2 * 32, (const uint8_t *)src + 2 * 32);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ena/ena_rss.c:51:24: note: while referencing ‘default_key’
   51 | static uint8_t default_key[ENA_HASH_KEY_SIZE];
      |                ^~~~~~~~~~~

This is a false positive because the copied size is checked against
ENA_HASH_KEY_SIZE in a (build) assert.
Silence this warning by calling memcpy with the minimal size.
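
A hedged sketch of this pattern (names and sizes mirror the warning
above but are illustrative, not the exact driver code): clamping the
copy length to the destination size lets the compiler prove the bounds.

  #include <string.h>
  #include <rte_common.h>

  #define ENA_HASH_KEY_SIZE 40

  static uint8_t default_key[ENA_HASH_KEY_SIZE];

  /* Copy at most sizeof(default_key) bytes so the array bound is explicit. */
  static void
  fill_key(uint8_t *dst, size_t dst_size)
  {
      memcpy(dst, default_key, RTE_MIN(dst_size, sizeof(default_key)));
  }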

Bugzilla ID: 849
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-15 10:19:18 +02:00
Michal Krawczyk
a0b1207584 net/ena: update version to 2.7.0
This release contains changes listed below.

  - Fast mbuf free feature support.
  - Device argument to disable the LLQ.
  - Simplification of the MTU verification.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2022-06-07 21:01:09 +02:00
Michal Krawczyk
9944919e2b net/ena: add device argument to disable LLQ
The PMD attempts to enable the LLQ (Low Latency Queue) whenever it's
possible. The LLQ requires the user to enable the Write Combining for
the supported igb_uio/vfio-pci modules.

The vfio-pci module officially doesn't support the WC. Moreover, in some
Linux distributions it can be built into the kernel, so any
modifications to the vfio-pci module require a full rebuild of the
kernel. This can make the configuration process much harder, and for
users who are not interested in the best possible network performance
it may be unnecessary. Such users requested the ability to turn off the
LLQ to avoid the hassle of this setup.

Disabling the LLQ is generally not recommended: it does not improve
performance, and on 6th generation AWS instances the lack of LLQ can
have a huge negative impact on hardware performance.

The device argument which controls the LLQ is called 'enable_llq' and
by default it's set to 1 (which means that the LLQ is enabled). Setting
it to 0 disables the LLQ.
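
The devarg is passed like the other ENA devargs ('<bdf>' stands for the
device PCI address; the application name is only an example):

  ./app -a <bdf>,enable_llq=0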

This commit also adds the explicit initialization of the devarg for the
'use_large_llq_hdr'. The PMD_REGISTER_PARAM_STRING() call for the ENA
was updated with all the available devargs (including
ENA_DEVARG_MISS_TXC_TO, which wasn't added previously).

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2022-06-07 21:01:09 +02:00
Dawid Gorecki
c3d31352bf net/ena: remove redundant MTU verification
Remove MTU verification from ena_mtu_set() and ena_start(). It is done
by rte_ethdev already, so there is no reason to repeat it inside the ENA
driver.

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2022-06-07 21:01:09 +02:00
Dawid Gorecki
c339f53823 net/ena: support fast mbuf free
Add support for RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE offload. It can be
enabled if all the mbufs for a given queue belong to the same mempool
and their reference count is equal to 1.
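
A minimal sketch of how an application requests this offload at
configure time (standard ethdev API, error handling trimmed); the
single-mempool and refcnt==1 conditions above must still hold:

  #include <rte_ethdev.h>

  static int
  configure_with_fast_free(uint16_t port_id, uint16_t nb_rxq, uint16_t nb_txq)
  {
      struct rte_eth_dev_info info;
      struct rte_eth_conf conf = { 0 };

      rte_eth_dev_info_get(port_id, &info);
      /* Request the offload only if the PMD reports it. */
      if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE)
          conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;

      return rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
  }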

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2022-06-07 21:01:09 +02:00
Michal Krawczyk
dbbdeb8b47 net/ena: update version to 2.6.0
This release contains multiple bug fixes and improvements, including
  - Removal of the linearization function from the Tx xmit path. DPDK
    expects the number of mbuf segments to be checked in the Tx prepare
    function.
  - Extra logs, statistics, checks...
  - Cleanup of the unused variables and definitions.
  - Configurable Link Status event.
  - Improvements for the timer service and the reset.
  - Usage of the optimized memcpy on ARM.
  - MP awareness improvements - extra API support for the secondary
    processes (like reading basic statistics).
  - Support of the xstats API to get xstat names by ID.
  - Configurable Tx completions timeout.
  - Proper setting of the meta-descriptor's DF flag.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
b2d2f1cf89 net/ena: fix checksum flag for L4
Some HW may wrongly set the checksum error bit for a valid L4 checksum.
To avoid dropping packets in that situation, do not indicate a bad
checksum for L4 Rx csum offloads. Instead, report it as unknown, so the
application will re-verify this value.

The statistics counters will still work as previously.
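
A sketch of the application-side check this implies (standard mbuf
flags; the software verification itself is only hinted at):

  #include <rte_mbuf.h>

  /* Returns 1 when the L4 checksum still needs software verification. */
  static int
  l4_csum_needs_check(const struct rte_mbuf *m)
  {
      uint64_t csum = m->ol_flags & RTE_MBUF_F_RX_L4_CKSUM_MASK;

      if (csum == RTE_MBUF_F_RX_L4_CKSUM_GOOD)
          return 0;   /* verified by HW */
      if (csum == RTE_MBUF_F_RX_L4_CKSUM_BAD)
          return 0;   /* drop or count as an error */
      /* UNKNOWN (or NONE): verify in software if the flow needs it. */
      return 1;
  }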

Fixes: 05817057fa ("net/ena: fix indication of bad L4 Rx checksums")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2022-02-23 19:01:03 +01:00
Dawid Gorecki
9ae7a13f82 net/ena: check memory BAR before initializing LLQ
The ena_com_config_dev_mode() performs many calculations related to LLQ
and then performs an admin queue call to configure LLQ in the device.

All of the operations performed by ena_com_config_dev_mode() are
unnecessary if the membar hasn't been found. Move the dev_mem_base check
before the ena_com_config_dev_mode() call. This prevents the unnecessary
operations from being performed.

Fixes: 2fca2a98c0 ("net/ena: support LLQv2")
Cc: stable@dpdk.org

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Dawid Gorecki
77e764c7ec net/ena: extend logs for invalid request ID resets
Add information about port id, queue id and req_id to error logs in
validate_tx_req_id.

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
022fb61b62 net/ena: fix meta descriptor DF flag setup
Whenever Tx checksum offload is being used, the meta descriptor content
is taken into consideration. Setting the DF field properly in the meta
descriptor may have a huge impact on the performance of both IPv4 and
IPv6 packets.

The requirements for the df field are as below:
* No offload used - value doesn't matter
* IPv4 - 0 or 1, depending on the DF flag in the IPv4 header
* IPv6 - 1

Setting DF to 0 causes the packet to enter the slow path in the HW and,
as a result, can noticeably impact the performance.

Moreover, as 'true' may not always be mapped to 1, depending on its
definition for the given platform/compiler, the DF field is set
explicitly to 1 for safety.

Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
cc0c5d2519 net/ena: make Tx completion timeout configurable
The default missing Tx completion timeout was set to 5 seconds.
In order to let users control this timeout and align it with the
application's watchdog, a device argument for controlling this value
was added.

The parameter is called 'miss_txc_to' and can be modified using the
devargs interface:

  ./app -a <bdf>,miss_txc_to=UINT_NUMBER

This parameter accepts values from 0 to 60 and indicates the number of
seconds after which a Tx packet will be considered as missing.

HW hints for the Tx completion timeout were removed so that they do not
overwrite the parameter provided by the user. Also, setting the default
Tx completion timeout value was moved from the configuration phase to
the init phase in order to simplify the default value assignment.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Dawid Gorecki
2bae75eaa2 net/ena: fix reset reason being overwritten
When triggering the reset, no check was performed to see if the reset
was already triggered. This could result in the original reset reason
being overwritten. Add the ena_trigger_reset helper function, which
checks if the reset was triggered and only sets the reset reason if the
reset wasn't triggered yet. Replace all occurrences of manually setting
the reset with an ena_trigger_reset call.
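
A hedged sketch of such a helper (struct, field and enum names are
illustrative, not the exact driver code):

  #include <stdbool.h>

  enum reset_reason { RESET_NONE, RESET_MISS_TX_CMPL, RESET_INV_REQ_ID };

  struct adapter_state {
      bool trigger_reset;
      enum reset_reason reset_reason;
  };

  /* Record the reset reason only once, so the first cause is preserved. */
  static void
  trigger_reset(struct adapter_state *adapter, enum reset_reason reason)
  {
      if (!adapter->trigger_reset) {
          adapter->reset_reason = reason;
          adapter->trigger_reset = true;
      }
  }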

Fixes: 2081d5e2e9 ("net/ena: add reset routine")
Cc: stable@dpdk.org

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
3cec73fabb net/ena: support xstat names by ID
ENA was only supporting retrieval of all the xstats names and wasn't
implementing the eth_xstats_get_names_by_id API.

As this API may be more efficient than retrieving all the names, it
tries to avoid excessive string copying.
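
From the application side this maps to rte_eth_xstats_get_names_by_id();
a minimal sketch (the two IDs are arbitrary examples):

  #include <rte_ethdev.h>

  /* Fetch the names of two specific xstats instead of the full list. */
  static int
  get_two_xstat_names(uint16_t port_id)
  {
      uint64_t ids[2] = { 0, 1 };
      struct rte_eth_xstat_name names[2];

      return rte_eth_xstats_get_names_by_id(port_id, names, 2, ids);
  }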

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Dawid Gorecki
a52b317e7d net/ena: support Tx mbuf free on demand
The ENA driver did not allow applications to call tx_cleanup. Freeing Tx
mbufs was always done by the driver and it was not possible to manually
request the driver to free mbufs.

Modify the ena_tx_cleanup function to accept a maximum number of packets
to free and to return the number of packets that were freed.
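
On the application side this feature is reached through the generic
rte_eth_tx_done_cleanup() call; a minimal sketch (queue 0 and the free
count are arbitrary examples):

  #include <rte_ethdev.h>

  /* Ask the PMD to free up to 64 already-transmitted mbufs on queue 0. */
  static int
  reclaim_tx_mbufs(uint16_t port_id)
  {
      return rte_eth_tx_done_cleanup(port_id, 0, 64);
  }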

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
850e1bb1c7 net/ena/base: make IO memzone unique per port
Originally, the ena_com memzone counter was shared by ports, which
made the memzones harder to identify and could potentially lead to
races, and because of that the counter had to be atomic.

This atomic counter was a global variable and it couldn't work in the
multiprocess implementation.

The memzone is now identified by a per-port memzone counter and the
port ID - both of these can be found in the shared data, so they can be
probed easily.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Stanislaw Kardach
3aa3fa851f net/ena: enable stats for multi-process mode
Since statistics gathering is now proxied safely to the primary process,
it can be enabled in secondary processes.

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Stanislaw Kardach
e3595539e0 net/ena: proxy AQ calls to primary process
Due to how the ena_com compatibility layer is written, all functions
that trigger AQ commands use the stack to save the AQ results and then
copy them to the caller-provided destination.
Therefore, to keep the compatibility layer common, introduce the
ENA_PROXY macro. It either calls the wrapped function directly (in the
primary process) or proxies it to the primary via the DPDK IPC
mechanism. Since all proxied calls are taken under a lock, the result
data is shared through shared memory (in struct ena_adapter) to work
around the 256B IPC parameter size limit. A sketch of this proxying
pattern is shown after the list below.

New proxy calls can be added by
1. Adding a new message type at the end of enum ena_mp_req
2. Adding new message arguments to the struct ena_mp_body if needed
3. Defining a proxy request descriptor with ENA_PROXY_DESC. Its arguments
   include handlers for request preparation and response processing.
   Any of those may be empty (aside from marking arguments as used).
4. Adding request handling logic to ena_mp_primary_handle()
5. Replacing proxied function calls with ENA_PROXY(adapter, <func>, ...)
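
A hedged, generic sketch of the secondary-to-primary proxying idea using
the DPDK multi-process API (the message name, payload layout and handler
are illustrative, not the actual ENA code):

  #include <stdlib.h>
  #include <string.h>
  #include <time.h>
  #include <rte_eal.h>
  #include <rte_string_fns.h>

  #define MP_NAME "example_proxy"      /* illustrative IPC message name */

  struct mp_body {                     /* must fit in the 256B IPC payload */
      int command;
      int result;
  };

  /* Primary side: execute the command and reply with the result.
   * Registered once with rte_mp_action_register(MP_NAME, primary_handle). */
  static int
  primary_handle(const struct rte_mp_msg *msg, const void *peer)
  {
      const struct mp_body *req = (const struct mp_body *)msg->param;
      struct rte_mp_msg reply;
      struct mp_body *rsp = (struct mp_body *)reply.param;

      memset(&reply, 0, sizeof(reply));
      rte_strlcpy(reply.name, MP_NAME, sizeof(reply.name));
      reply.len_param = sizeof(*rsp);
      rsp->result = req->command * 2;  /* placeholder for the real AQ call */
      return rte_mp_reply(&reply, peer);
  }

  /* Secondary side: send the request and wait synchronously for the reply. */
  static int
  proxy_call(int command, int *result)
  {
      struct rte_mp_msg req;
      struct rte_mp_reply replies;
      struct timespec ts = { .tv_sec = 5, .tv_nsec = 0 };
      struct mp_body *body = (struct mp_body *)req.param;
      int ret;

      memset(&req, 0, sizeof(req));
      rte_strlcpy(req.name, MP_NAME, sizeof(req.name));
      req.len_param = sizeof(*body);
      body->command = command;

      ret = rte_mp_request_sync(&req, &replies, &ts);
      if (ret == 0 && replies.nb_received == 1) {
          *result = ((struct mp_body *)replies.msgs[0].param)->result;
          free(replies.msgs);          /* reply array is owned by the caller */
      }
      return ret;
  }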

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
b2c1fe38ad net/ena/base: use optimized memcpy version also on Arm
As the default behavior for arm64 is to alias rte_memcpy as memcpy, ENA
cannot redefine memcpy as rte_memcpy, as it would cause a nested
declaration.

To make it possible to use optimized memcpy in the ena_com layer on Arm,
the driver now redefines memcpy when it is beneficial:
  * For arm64 only when the flag RTE_ARCH_ARM64_MEMCPY was defined
  * For arm only when the flag RTE_ARCH_ARM_NEON_MEMCPY was defined
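
A hedged sketch of the kind of guarded redefinition described above (the
surrounding ena_com plumbing is omitted):

  #include <rte_memcpy.h>

  /* Alias memcpy to the optimized copy only when the arch-specific
   * implementation is enabled; otherwise the libc memcpy stays in place. */
  #if defined(RTE_ARCH_ARM64_MEMCPY) || defined(RTE_ARCH_ARM_NEON_MEMCPY)
  #undef memcpy
  #define memcpy rte_memcpy
  #endif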

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
67216c31e4 net/ena: perform Tx cleanup before sending packets
To increase the likelihood that the current burst will fit in the HW
rings, perform Tx cleanup before pushing packets to the HW. It may
increase latency a bit for sparse bursts, but the Tx flow should now be
smoother.

This is also the common order in the Tx burst functions of other PMDs.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
e2174a5446 net/ena: skip timer if reset is triggered
Some user applications may not support PMD reset handling. If they
support the timer service, this could cause the information about the
reset trigger to be shown every time the timer service is called.

The timer service is now skipped if the reset was already triggered.

Fixes: d9b8b106bf ("net/ena: add watchdog and keep alive AENQ handler")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:03 +01:00
Michal Krawczyk
b9b05d6f86 net/ena: make link status change interrupt configurable
ENA uses AENQ for notifications about various events, like LSC, keep
alive, etc. By default it was enabling all AENQ groups that were
supported by both the driver and the device. As a result the LSC was
always processed even if the application turned it off explicitly.

As DPDK provides applications with the possibility to configure the
LSC, ENA should respect that. AENQ groups are now updated upon the
configure step, thus LSC can be activated or disabled between ENA PMD
reconfigurations. Moreover, the LSC capability of the device is now
determined dynamically.
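
From the application side, LSC is requested through the standard ethdev
interrupt configuration; a minimal sketch (the callback body is
illustrative):

  #include <stdio.h>
  #include <rte_common.h>
  #include <rte_ethdev.h>

  static int
  lsc_event_cb(uint16_t port_id, enum rte_eth_event_type type,
               void *cb_arg, void *ret_param)
  {
      struct rte_eth_link link;

      RTE_SET_USED(type);
      RTE_SET_USED(cb_arg);
      RTE_SET_USED(ret_param);
      rte_eth_link_get_nowait(port_id, &link);
      printf("port %u link is %s\n", port_id,
             link.link_status ? "up" : "down");
      return 0;
  }

  static int
  configure_with_lsc(uint16_t port_id)
  {
      struct rte_eth_conf conf = { 0 };

      conf.intr_conf.lsc = 1;   /* ask the PMD to deliver LSC interrupts */
      rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC,
                                    lsc_event_cb, NULL);
      return rte_eth_dev_configure(port_id, 1, 1, &conf);
  }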

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:02 +01:00
Michal Krawczyk
84daba9962 net/ena: add extra Rx checksum related xstats
* Split 'bad_csum' Rx statistic into 'l3_csum_bad' and 'l4_csum_bad' to
  be able to check which checksum was not calculated properly.
* Add l4_csum_good statistic, which shows how many times L4 Rx checksum
  was properly offloaded.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:01:02 +01:00
Michal Krawczyk
fa11980449 net/ena: remove unused offload variables
Those variables are being set, but never read. As they seem to be
leftovers from the old offloads API and don't have any purpose right
now, they are simply removed.

Fixes: a4996bd89c ("ethdev: new Rx/Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Artur Rojek <ar@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 19:00:49 +01:00
Michal Krawczyk
0f135d2fe2 net/ena: remove unused enumeration
The enumeration seems to be a leftover from porting the Linux driver to
DPDK. It was not used anywhere and refers to ethtool, which is not
present in DPDK.

Fixes: 372c1af5ed ("net/ena: add dedicated memory area for extra device info")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Artur Rojek <ar@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 18:53:49 +01:00
Michal Krawczyk
3d47e9b102 net/ena: assert on outstanding mbuf in Tx
To make sure there is no outstanding mbuf in the reused Tx queue (due to
improper cleanup, or some invalid logic on the Tx path), an assertion
was added on the Tx path.

As it is compiled out in release builds, it won't affect the IO path
performance.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 18:53:49 +01:00
Michal Krawczyk
96ffa8a70f net/ena: remove Tx mbuf linearization
Linearizing the mbuf isn't common practice for a PMD, as it can expose
its capabilities (such as the maximum number of Tx segments) to the
upper layer using rte_eth_dev_info_get().

Moreover, the rte_eth_tx_prepare() function should also verify that the
number of segments inside the mbuf isn't too high.

Because of those two circumstances, it is safer to avoid modifying the
mbuf on the PMD's Tx side and to remove linearization altogether.
Instead, add verification of the number of segments to
eth_ena_prep_pkts().
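
For applications, this shifts the verification to the Tx prepare step; a
minimal sketch of the usual call sequence (standard ethdev API, error
handling trimmed):

  #include <rte_ethdev.h>
  #include <rte_mbuf.h>

  /* Validate offloads/segment counts before actually sending the burst. */
  static uint16_t
  send_burst(uint16_t port_id, uint16_t queue_id,
             struct rte_mbuf **pkts, uint16_t nb_pkts)
  {
      uint16_t nb_ok = rte_eth_tx_prepare(port_id, queue_id, pkts, nb_pkts);

      /* Packets past nb_ok were rejected (e.g. too many segments). */
      return rte_eth_tx_burst(port_id, queue_id, pkts, nb_ok);
  }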

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Artur Rojek <ar@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2022-02-23 18:53:49 +01:00
Stephen Hemminger
06c047b680 remove unnecessary null checks
Functions like free, rte_free, and rte_mempool_free
already handle NULL pointers, so the checks here are not necessary.

Remove redundant NULL pointer checks before free functions,
found by nullfree.cocci.
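
The removed pattern looks like this (a generic illustration, not a
specific call site):

  /* Before: redundant guard. */
  if (ptr != NULL)
      rte_free(ptr);

  /* After: rte_free(NULL) is already a no-op. */
  rte_free(ptr);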

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-12 12:07:48 +01:00
Josh Soref
7be78d0279 fix spelling in comments and strings
The tool comes from https://github.com/jsoref

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2022-01-11 12:16:53 +01:00
Harman Kalra
d61138d4f0 drivers: remove direct access to interrupt handle
Remove direct access to the interrupt handle structure fields and
instead use the respective get/set APIs.
All drivers are changed to access the interrupt handle fields through
these APIs.
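
A hedged before/after sketch of the change in driver code (assuming the
rte_intr_fd_get()/rte_intr_fd_set() accessors introduced for this
purpose; the helper names are illustrative):

  #include <rte_interrupts.h>

  /* Before: intr_handle->fd was read and written directly. */

  /* After: go through the accessors. */
  static void
  update_fd(struct rte_intr_handle *intr_handle, int fd)
  {
      rte_intr_fd_set(intr_handle, fd);
  }

  static int
  read_fd(struct rte_intr_handle *intr_handle)
  {
      return rte_intr_fd_get(intr_handle);
  }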

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
2021-10-25 21:20:12 +02:00
Olivier Matz
daa02b5cdd mbuf: add namespace to offload flags
Fix the mbuf offload flags namespace by adding an RTE_ prefix to the
name. The old flags remain usable, but a deprecation warning is issued
at compilation.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
2021-10-24 13:37:43 +02:00
Ferruh Yigit
295968d174 ethdev: add namespace
Add 'RTE_ETH' namespace to all enums & macros in a backward compatible
way. The macros for backward compatibility can be removed in next LTS.
Also updated some struct names to have 'rte_eth' prefix.

All internal components switched to using new names.

Syntax fixed on lines that this patch touches.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
2021-10-22 18:15:38 +02:00
Michal Krawczyk
ba94dad4e0 net/ena: update version to 2.5.0
This version update contains:
  * Fix for verification of the offload capabilities (especially for
    IPv6 packets).
  * Support for Tx and Rx free threshold values.
  * Fixes for per-queue offload capabilities.
  * Announce support of the scattered Rx offload.
  * NUMA aware allocations.
  * Check for the missing Tx completions.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
f93e20e516 net/ena: check missing Tx completions
In some cases Tx descriptors may never be completed by the HW and as a
result they will never be released.

This patch adds checking for missing Tx completions to the ENA timer
service, so in order to use this feature, the application must call the
rte_timer_manage() function.

The missing Tx completion reset threshold is determined dynamically, by
taking the ring size and the default value into consideration.

Tx cleanup is associated with the Tx burst function. As DPDK
applications can call the Tx burst function at arbitrary times, the time
of the last cleanup must be tracked to avoid false detection of missing
Tx completions.
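
A minimal sketch of what the application must do for this feature
(standard rte_timer API; the surrounding main loop is trimmed):

  #include <rte_timer.h>

  /* Call once during initialization. */
  static void
  timer_init(void)
  {
      rte_timer_subsystem_init();
  }

  /* Call periodically from the main loop so the ENA timer service
   * (and with it the missing Tx completion check) actually runs. */
  static void
  poll_timers(void)
  {
      rte_timer_manage();
  }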

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
08180833cb net/ena: add NUMA-aware allocations
Only the IO rings memory was allocated taking the socket ID into
account, while the other structures were allocated using the regular
rte_zmalloc() API.

Ring specific structures are now being allocated using the ring's
socket ID.
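
A one-function sketch of the difference (names are illustrative;
rte_zmalloc_socket() is the standard NUMA-aware allocator):

  #include <rte_common.h>
  #include <rte_malloc.h>

  /* Allocate a ring context on the same NUMA socket as the ring itself
   * instead of relying on the default rte_zmalloc() placement. */
  static void *
  alloc_ring_ctx(size_t size, int socket_id)
  {
      return rte_zmalloc_socket("ring_ctx", size, RTE_CACHE_LINE_SIZE,
                                socket_id);
  }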

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
e2a6d08bef net/ena: advertise scattered Rx capability
ENA can't be forced to always pass a single descriptor for an Rx packet.
Even if the passed buffer size is big enough to hold the data, we can't
assume that the HW won't use an extra descriptor because of internal
optimizations. This assumption may be true, but only for some of the FW
revisions, which may differ depending on the used AWS instance type.

As scattered Rx support already exists on the Rx path, the driver just
needs to announce the DEV_RX_OFFLOAD_SCATTER capability by turning on
the rte_eth_dev_data::scattered_rx option.

Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
3a822d79c5 net/ena: fix per-queue offload capabilities
As ENA currently doesn't support offloads which could be configured
per-queue, only per-port flags should be set.

In addition, to make the code cleaner, parsing of the appropriate
offload flags is encapsulated into helper functions, in a similar
manner to how it's done by the other PMDs [1].

[1] https://doc.dpdk.org/guides/prog_guide/
    poll_mode_drv.html?highlight=offloads#hardware-offload

Fixes: 7369f88f88 ("net/ena: convert to new Rx offloads API")
Fixes: 56b8b9b7e5 ("net/ena: convert to new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
005064e505 net/ena: support Tx/Rx free thresholds
The caller can pass a Tx or Rx free threshold value in the configuration
structure for each ring. It determines when the Tx/Rx function should
start cleaning up/refilling the descriptors. ENA was ignoring this value
and doing its own calculations.

Now the user can configure ENA's behavior using this parameter, and if
this variable is not set, ENA will continue with the old behavior and
use its own threshold value.

The default value is not provided by the ENA in the ena_infos_get(), as
it's being determined dynamically, depending on the requested ring size.

Note that NULL check for Tx conf was removed from the function
ena_tx_queue_setup(), as at this place the configuration will be
either provided by the user or the default config will be used and it's
handled by the upper (rte_ethdev) layer.

The Tx threshold shouldn't be used as the Tx cleanup budget, as it can
be inadequate for the used burst. Now the PMD tries to release mbufs for
the ring until it is depleted.
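
A minimal sketch of how an application passes this threshold at queue
setup time (standard ethdev API; the numbers are arbitrary examples):

  #include <rte_ethdev.h>

  static int
  setup_tx_queue(uint16_t port_id, uint16_t queue_id, unsigned int socket_id)
  {
      struct rte_eth_dev_info info;
      struct rte_eth_txconf txconf;

      rte_eth_dev_info_get(port_id, &info);
      txconf = info.default_txconf;
      txconf.tx_free_thresh = 64;   /* threshold for starting Tx cleanup */

      return rte_eth_tx_queue_setup(port_id, queue_id, 1024, socket_id,
                                    &txconf);
  }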

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
e8c838fde9 net/ena: fix offload capabilities verification
The ENA PMD has multiple checksum offload flags, which are more
fine-grained than the DPDK offload capability flags.
As the driver wasn't storing its internal checksum offload capabilities
and was relying only on the DPDK capabilities, not all scenarios could
be properly covered (like when to prepare the pseudo header checksum and
when not).

Moreover, the user could request an offload capability which isn't
supported by the HW, and the PMD would quietly ignore the issue.

This commit reworks eth_ena_prep_pkts() function to perform additional
checks and to properly reflect the HW requirements. With the
RTE_LIBRTE_ETHDEV_DEBUG enabled, the function will do even more
verifications, to help the user find any issues with the mbuf
configuration.

Fixes: b3fc5a1ae1 ("net/ena: add Tx preparation")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Ferruh Yigit
b563c14212 ethdev: remove jumbo offload flag
Removing 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag.

Instead of drivers announcing this capability, the application can
deduce the capability by checking the reported 'dev_info.max_mtu' or
'dev_info.max_rx_pktlen'.

And instead of the application setting this flag explicitly to enable
jumbo frames, this can be deduced by the driver by comparing the
requested 'mtu' to 'RTE_ETHER_MTU'.

Removing this additional configuration for simplification.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
1bb4a528c4 ethdev: fix max Rx packet length
There is a confusion on setting max Rx packet length, this patch aims to
clarify it.

'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.

Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
stored into '(struct rte_eth_dev)->data->mtu'.

These two APIs are related, but they work in a disconnected way: they
store the set values in different variables, which makes it hard to
figure out which one to use, and having two different methods for a
related functionality is confusing for users.

Other issues causing confusion are:
* The maximum transmission unit (MTU) is the payload size of the
  Ethernet frame, while 'max_rx_pkt_len' is the size of the Ethernet
  frame. The difference is the Ethernet frame overhead, and this
  overhead may differ from device to device based on what the device
  supports, like VLAN and QinQ.
* 'max_rx_pkt_len' is only valid when the application requested jumbo
  frames, which adds additional confusion, and some APIs and PMDs
  already disregard this documented behavior.
* For the jumbo frame enabled case, 'max_rx_pkt_len' is a mandatory
  field, which adds configuration complexity for the application.

As a solution, both APIs get the MTU as a parameter, and both save the
result in the same variable, '(struct rte_eth_dev)->data->mtu'. For
this, 'max_rx_pkt_len' is replaced by 'mtu', and it is always valid,
independent of jumbo frames.

For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is the
user request; it should be used only within the configure function, and
the result should be stored in '(struct rte_eth_dev)->data->mtu'. After
that point both the application and the PMD use the MTU from this
variable.

When the application doesn't provide an MTU during
'rte_eth_dev_configure()', the default 'RTE_ETHER_MTU' value is used.

Additional clarification was done on the scattered Rx configuration, in
relation to the MTU and the Rx buffer size.
The MTU is used to configure the device for the physical Rx/Tx size
limitation; the Rx buffer is where Rx packets are stored, and many PMDs
use the mbuf data buffer size as the Rx buffer size.
PMDs compare the MTU against the Rx buffer size to decide whether to
enable scattered Rx. If scattered Rx is not supported by the device, an
MTU bigger than the Rx buffer size should fail.
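
A minimal sketch of the resulting configuration flow (standard ethdev
API after this change; the 9000-byte MTU is an arbitrary example):

  #include <rte_ethdev.h>

  static int
  configure_mtu(uint16_t port_id, uint16_t nb_rxq, uint16_t nb_txq)
  {
      struct rte_eth_conf conf = { 0 };

      conf.rxmode.mtu = 9000;   /* jumbo frames without any extra flag */
      /* rte_eth_dev_set_mtu(port_id, 9000) after configure works too;
       * both end up in (struct rte_eth_dev)->data->mtu. */
      return rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
  }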

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2021-10-18 19:20:20 +02:00
Ferruh Yigit
7a4edfd7bb net/ena: remove useless address check
Reported by "gcc (GCC) 12.0.0 20211003 (experimental)":

./drivers/net/ena/ena_rss.c: In function ‘ena_rss_reta_query’:
./drivers/net/ena/ena_rss.c:140:66:
	error: the comparison will always evaluate as ‘false’ for the
	pointer operand in ‘reta_conf + 136’ must not be NULL
	[-Werror=address]
  140 |  (reta_size > RTE_RETA_GROUP_SIZE && ((reta_conf + 1) == NULL)))
      |                                                       ^~

Fix it by removing the useless check.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
2021-10-11 17:47:31 +02:00
Xueming Li
7483341ae5 ethdev: change queue release callback
Currently, most ethdev callback APIs use the queue ID as a parameter,
but the Rx and Tx queue release callbacks use the queue object, which is
used by the Rx and Tx burst data plane callbacks.

To align with the other ethdev queue configuration callbacks:
- queue release callbacks are changed to use the queue ID
- all drivers are adapted

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 19:16:03 +02:00
Thomas Monjalon
fdab8f2e17 version: 21.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 22.0.
The map files are updated to the new ABI major number (22).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-08-17 08:37:52 +02:00
Ghalem Boudour
3e7008459d net/ena: enable multi-segment in Tx offload flags
The DPDK ENA driver does not provide the multi-segment Tx offload
capability. Let's add DEV_TX_OFFLOAD_MULTI_SEGS to the port offload
capabilities by default, and always set it in
dev->data->dev_conf.txmode.offloads.

This flag is not listed in doc/guides/nics/features/default.ini, so
ena.ini does not need to be updated.

Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org

Signed-off-by: Ghalem Boudour <ghalem.boudour@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
2021-07-30 12:10:20 +02:00
Michal Krawczyk
d00c799fda net/ena: update version to 2.4.0
This version update contains:
  * Rx interrupts feature,
  * Support for the RSS hash function reconfiguration,
  * Small rework of the works,
  * Reset trigger on Tx path fix.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2021-07-23 17:44:58 +02:00
Michal Krawczyk
34d5e97e8d net/ena: rework RSS configuration
Allow the user to specify their own hash key and hash ctrl if the
device supports that. The HW interprets the key in reverse byte order,
so the PMD reorders the key before passing it to the ena_com layer.

The default key is set in a random manner each time the device is
initialized.

Moreover, make minor adjustments to the reta size setting in terms
of the returned error values.

RSS code was moved to ena_rss.c file to improve readability.
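
A minimal sketch of how an application supplies its own key through the
standard ethdev RSS API (key contents, length and hash types are
arbitrary examples; ENA reorders the key bytes internally as described
above):

  #include <rte_ethdev.h>

  static int
  set_rss_key(uint16_t port_id, uint8_t *key, uint8_t key_len)
  {
      struct rte_eth_rss_conf rss_conf = {
          .rss_key = key,
          .rss_key_len = key_len,
          .rss_hf = ETH_RSS_IP | ETH_RSS_TCP,
      };

      return rte_eth_dev_rss_hash_update(port_id, &rss_conf);
  }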

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2021-07-23 17:44:09 +02:00