Commit Graph

30135 Commits

Author SHA1 Message Date
Sunil Kumar Kori
0fa36bc288 common/cnxk: support profile statistics
CN10K platform provides statistics per bandwidth profile and
per nixlf. Implement RoC API to read stats for given bandwidth
profile.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:23:53 +02:00
Sunil Kumar Kori
c8881e6ef0 common/cnxk: support bandwidth profile stats to index
CN10K platform supports different stats for HW bandwidth profiles.
Implement RoC API to get index for given stats type.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:23:32 +02:00
Sunil Kumar Kori
b609507b7c common/cnxk: support bandwidth profiles connection
To maintain a chain of bandwidth profiles, they need to be
connected. Implement RoC API to connect two bandwidth profiles
at different levels.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:23:18 +02:00
Sunil Kumar Kori
3396110111 common/cnxk: support precolor table setup
For initial coloring of input packets, CN10K platform maintains
precolor tables for VLAN, DSCP and Generic. Implement RoC
interface to set up the precolor table.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:23:03 +02:00
Sunil Kumar Kori
ab706fa825 common/cnxk: support bandwidth profile dump
Implement RoC API to dump bandwidth profile on CN10K
platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:22:50 +02:00
Sunil Kumar Kori
52511cd2ba common/cnxk: support profile state toggle
Implement RoC API to enable or disable HW bandwidth profiles
on CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:22:30 +02:00
Sunil Kumar Kori
4ad8bc2fc7 common/cnxk: support bandwidth profile configure
Implement RoC API to configure HW bandwidth profile for
CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:22:16 +02:00
Sunil Kumar Kori
bf7290c65f common/cnxk: support bandwidth profiles free
Implement RoC interface to free HW bandwidth profiles on
CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:22:04 +02:00
Sunil Kumar Kori
7a63d75ecb common/cnxk: support bandwidth profiles allocation
Implement RoC API to allocate HW resources i.e. bandwidth
profiles for policer processing on CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:21:40 +02:00
Sunil Kumar Kori
05a944fea3 common/cnxk: support to get profile count
Implement interface to get available profile count for given
NIXLF.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:21:25 +02:00
Sunil Kumar Kori
b7cb2203eb common/cnxk: support to get policer level to index
CN10K platform supports policers with up to 3 levels of hierarchy.
Implement RoC API to get corresponding index for given level.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:21:14 +02:00
Sunil Kumar Kori
cf8f6aa12a common/cnxk: update policer mbox API and HW definitions
To support ingress policer on CN10K, MBOX interfaces and HW
definitions are updated.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-19 16:20:29 +02:00
Tejasree Kondoj
206c9d5d92 net/octeontx2: use fast udata and mdata flags
Use fast metadata and userdata flags instead of
driver callbacks for set_pkt_metadata and
get_userdata in inline IPsec.

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
2021-10-19 16:19:52 +02:00
Lior Margalit
0c3fa68396 net/mlx5: fix RSS expansion for L2/L3 VXLAN
The RSS expansion algorithm uses a graph to find the possible
expansion paths. The current implementation does not differentiate
between standard (L2) VXLAN and L3 VXLAN. As a result, the flow is
expanded with all possible paths.
For example:
testpmd> flow create... / vxlan / end actions rss level 2 / end
It is currently expanded to the following paths:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN ETH IPV4 END
ETH IPV4 UDP VXLAN ETH IPV6 END
ETH IPV4 UDP VXLAN IPV4 END
ETH IPV4 UDP VXLAN IPV6 END

The fix is to adjust the expansion according to the outer UDP destination
port. If the flow pattern defines a match on the standard UDP port, 4789,
or does not define a match on the destination port, which also implies
the standard one, the expansion for the above example will be:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN ETH IPV4 END
ETH IPV4 UDP VXLAN ETH IPV6 END
Otherwise, the expansion will be:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN IPV4 END
ETH IPV4 UDP VXLAN IPV6 END
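
The port-based rule above can be sketched as a small decision function.
This is only an illustration of the rule as described in the message, not
the mlx5 implementation; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard VXLAN UDP destination port, as named in the commit message. */
#define VXLAN_STD_UDP_PORT 4789

/* Hypothetical names: what kind of header the expansion places after VXLAN. */
enum vxlan_inner {
	VXLAN_INNER_L2, /* ETH IPV4 / ETH IPV6 paths */
	VXLAN_INNER_L3, /* IPV4 / IPV6 paths */
};

/*
 * has_dst_port_match: the pattern specifies an outer UDP destination port.
 * dst_port: that port in host byte order (ignored when there is no match).
 */
static inline enum vxlan_inner
vxlan_inner_kind(bool has_dst_port_match, uint16_t dst_port)
{
	if (!has_dst_port_match || dst_port == VXLAN_STD_UDP_PORT)
		return VXLAN_INNER_L2;
	return VXLAN_INNER_L3;
}

/* Usage: the three cases described above. */
void vxlan_inner_kind_example(void)
{
	assert(vxlan_inner_kind(false, 0) == VXLAN_INNER_L2);
	assert(vxlan_inner_kind(true, 4789) == VXLAN_INNER_L2);
	assert(vxlan_inner_kind(true, 4790) == VXLAN_INNER_L3);
}
```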

Fixes: f4f06e3615 ("net/mlx5: add flow VXLAN item")
Cc: stable@dpdk.org

Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-10-18 09:12:42 +02:00
Ruifeng Wang
778602fe57 net/i40e: fix risk in descriptor read in NEON Rx
Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.

In the NEON vector PMD, a vector load reads two contiguous 8B words of
descriptor data into a vector register. Since a vector load does not
guarantee 16B atomicity, the read of the word that includes the DD field
could be reordered after the read of other words. In this case, some
words could contain invalid data.

A read barrier is added after the read of qword1, which includes the DD
field, and qword0 is reloaded to update the vector register. This ensures
that the fetched data is correct.
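
A scalar analogue of this ordering, written with C11 atomics rather than
NEON intrinsics, might look as follows. The descriptor layout, the DD bit
position and all names are assumptions made for illustration; the driver
itself works on vector registers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16B descriptor: qword1 carries the DD (descriptor done) bit. */
struct rx_desc {
	uint64_t qword0;
	_Atomic uint64_t qword1;
};

#define DESC_DD_BIT UINT64_C(1) /* assumed bit position, illustration only */

/* Returns true and stores qword0 in *q0 when the descriptor is done. */
bool rx_desc_read(struct rx_desc *d, uint64_t *q0)
{
	/* Read the word holding DD first. */
	uint64_t q1 = atomic_load_explicit(&d->qword1, memory_order_relaxed);

	if (!(q1 & DESC_DD_BIT))
		return false;

	/* Read barrier: later loads may not be reordered before the DD load. */
	atomic_thread_fence(memory_order_acquire);

	/* Only now (re)load the rest of the descriptor. */
	*q0 = d->qword0;
	return true;
}

/* Usage: qword0 is consumed only once DD is observed. */
void rx_desc_read_example(void)
{
	struct rx_desc d = { .qword0 = 0x1234, .qword1 = 0 };
	uint64_t q0 = 0;

	assert(!rx_desc_read(&d, &q0));
	atomic_store(&d.qword1, DESC_DD_BIT);
	assert(rx_desc_read(&d, &q0));
	assert(q0 == 0x1234);
}
```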

Testpmd single core test on N1SDP/ThunderX2 showed no performance drop.

Fixes: ae0eb310f2 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-10-19 13:13:55 +02:00
Alvin Zhang
1506c90029 net/i40e: fix IPv6 fragment RSS offload type in flow
To keep flow format uniform with ice, this patch adds support for
this RSS rule:
    flow create 0 ingress pattern eth / ipv6 / ipv6_frag_ext / end \
    actions rss types ipv6-frag end queues end queues end / end

Fixes: ef4c16fd91 ("net/i40e: refactor RSS flow")
Cc: stable@dpdk.org

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-10-19 13:06:42 +02:00
Leyi Rong
0d989ff9ca net/ice: fix generic build on FreeBSD
The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
the AVX-specific code is split into the new ice_rxtx_common_avx.h header
file to avoid such bugs.
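
The difference between the two kinds of macro can be illustrated with a
tiny function. It is a sketch, not driver code: the function name and the
returned strings are hypothetical, while __AVX2__ and __AVX512F__ are the
macros GCC and Clang predefine when the current translation unit is
compiled with those instruction sets enabled.

```c
#include <assert.h>
#include <string.h>

/*
 * A build-system flag in the style of CC_AVX2_SUPPORT says what the compiler
 * is able to do for the driver as a whole. The compiler-defined macros below
 * describe the flags of the translation unit being compiled right now.
 */
const char *rx_path_for_this_unit(void)
{
#if defined(__AVX512F__)
	return "avx512";
#elif defined(__AVX2__)
	return "avx2";
#else
	return "sse-or-scalar";
#endif
}

/* Usage: a unit built without AVX flags never selects an AVX path. */
void rx_path_example(void)
{
	const char *p = rx_path_for_this_unit();
#if !defined(__AVX2__) && !defined(__AVX512F__)
	assert(strcmp(p, "sse-or-scalar") == 0);
#else
	assert(strcmp(p, "sse-or-scalar") != 0);
#endif
}
```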

Bugzilla ID: 788
Fixes: a4e480de26 ("net/ice: optimize Tx by using AVX512")
Fixes: 20daa1c978 ("net/ice: fix crash in AVX512")
Cc: stable@dpdk.org

Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-19 13:02:37 +02:00
Leyi Rong
c454435d88 net/i40e: fix generic build on FreeBSD
The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
the AVX-specific code is split into the new i40e_rxtx_common_avx.h header
file to avoid such bugs.

Bugzilla ID: 788
Fixes: 0604b1f220 ("net/i40e: fix crash in AVX512")
Cc: stable@dpdk.org

Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-19 13:01:56 +02:00
Eli Britstein
292be511d2 net/mlx5: support more tunnel types
Accept RTE_FLOW_ITEM_TYPE_GRE, RTE_FLOW_ITEM_TYPE_NVGRE and
RTE_FLOW_ITEM_TYPE_GENEVE as valid tunnel types.

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-10-19 23:51:10 +02:00
Eli Britstein
ad6a8a20cb app/testpmd: add tunnel types
Current testpmd implementation supports VXLAN only for tunnel offload.
Add GRE, NVGRE and GENEVE for tunnel offload flow matches.

For example:
testpmd> flow tunnel create 0 type vxlan
port 0: flow tunnel #1 type vxlan
testpmd> flow tunnel create 0 type nvgre
port 0: flow tunnel #2 type nvgre
testpmd> flow tunnel create 0 type gre
port 0: flow tunnel #3 type gre
testpmd> flow tunnel create 0 type geneve
port 0: flow tunnel #4 type geneve

Fixes: 1b9f274623 ("app/testpmd: add commands for tunnel offload")
Cc: stable@dpdk.org

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gregory Etelson <getelson@nvidia.com>
2021-10-19 23:51:10 +02:00
Dapeng Yu
287ca31bea net/softnic: fix memory leak of meter policy
After the meter policies are created, they are not freed on device
close.

This patch fixes it.

Fixes: 5f0d54f372 ("ethdev: add pre-defined meter policy API")
Cc: stable@dpdk.org

Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
2021-10-19 22:45:19 +02:00
Sunil Kumar Kori
b314a4a664 app/testpmd: fix access to DSCP table entries
During parsing of DSCP entries, memory is allocated and assigned
to *dscp_table. Later on, same memory is accessed using
*dscp_table[i++].

Because the array subscript has higher precedence than the dereference,
dscp_table[i++] is evaluated first, which does not point to the memory
previously allocated for the DSCP table entries.
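
The precedence issue can be reproduced in a few lines. The sketch below
uses a plain int table and a hypothetical fill_table() helper in place of
the testpmd parser; only the pointer expression mirrors the commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parser: allocates and fills *table. */
int fill_table(int **table, size_t n)
{
	size_t i = 0;

	*table = calloc(n, sizeof(**table));
	if (*table == NULL)
		return -1;

	while (i < n) {
		/*
		 * Wrong: *table[i++] parses as *(table[i++]), which indexes the
		 * pointer-to-pointer itself and goes out of bounds for i > 0.
		 * Right: dereference first, then index the allocated array.
		 */
		(*table)[i] = (int)i * 10;
		i++;
	}
	return 0;
}

/* Usage: the caller sees every entry of the allocated table. */
void fill_table_example(void)
{
	int *t = NULL;

	assert(fill_table(&t, 4) == 0);
	assert(t[0] == 0 && t[3] == 30);
	free(t);
}
```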

Fixes: 459463ae6c ("app/testpmd: fix memory allocation for DSCP table")
Cc: stable@dpdk.org

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-19 18:10:28 +02:00
Michal Krawczyk
ba94dad4e0 net/ena: update version to 2.5.0
This version update contains:
  * Fix for verification of the offload capabilities (especially for
    IPv6 packets).
  * Support for Tx and Rx free threshold values.
  * Fixes for per-queue offload capabilities.
  * Announce support of the scattered Rx offload.
  * NUMA aware allocations.
  * Check for the missing Tx completions.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
f93e20e516 net/ena: check missing Tx completions
In some cases Tx descriptors may never be completed by the HW and as a
result they will never be released.

This patch adds checking for the missing Tx completions to the ENA timer
service, so in order to use this feature, the application must call the
function rte_timer_manage().

Missing Tx completion reset threshold is determined dynamically, by
taking into consideration ring size and the default value.

Tx cleanup is associated with the Tx burst function. As DPDK
applications can call the Tx burst function dynamically, the time when
the last cleanup was called must be tracked to avoid false detection of
a missing Tx completion.
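
The false-detection concern can be sketched as follows. This is a
simplified model under stated assumptions, not the ENA code: the structure,
field names and the single shared timeout are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-ring bookkeeping; none of these names are the driver's. */
struct tx_ring_state {
	uint64_t last_cleanup_ticks; /* when Tx cleanup last ran */
	uint64_t oldest_sent_ticks;  /* send time of oldest pending packet, 0 if none */
};

/*
 * Returns true when the oldest pending packet has waited longer than
 * `timeout` ticks although cleanup ran recently enough to have seen it.
 * Assumes `now` is not smaller than the stored timestamps.
 */
bool tx_completion_missing(const struct tx_ring_state *r, uint64_t now,
			   uint64_t timeout)
{
	if (r->oldest_sent_ticks == 0)
		return false; /* nothing pending */
	if (now - r->last_cleanup_ticks > timeout)
		return false; /* cleanup is stale: Tx burst was not called lately */
	return now - r->oldest_sent_ticks > timeout;
}

/* Usage: only a recently cleaned ring can report a missing completion. */
void tx_completion_missing_example(void)
{
	struct tx_ring_state fresh = { .last_cleanup_ticks = 990, .oldest_sent_ticks = 100 };
	struct tx_ring_state stale = { .last_cleanup_ticks = 100, .oldest_sent_ticks = 100 };
	struct tx_ring_state idle = { .last_cleanup_ticks = 990, .oldest_sent_ticks = 0 };

	assert(tx_completion_missing(&fresh, 1000, 50));
	assert(!tx_completion_missing(&stale, 1000, 50));
	assert(!tx_completion_missing(&idle, 1000, 50));
}
```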

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
08180833cb net/ena: add NUMA-aware allocations
Only the IO rings memory was allocated with respect to the socket ID,
while the other structures were allocated using the regular
rte_zmalloc() API.

Ring specific structures are now being allocated using the ring's
socket ID.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
e2a6d08bef net/ena: advertise scattered Rx capability
ENA can't be forced to always pass a single descriptor for the Rx packet.
Even if the passed buffer size is big enough to hold the data, we can't
assume that the HW won't use an extra descriptor because of internal
optimizations. This assumption may be true, but only for some of the FW
revisions, which may differ depending on the AWS instance type used.

As the scattered Rx support on the Rx path already exists, the driver
just needs to announce DEV_RX_OFFLOAD_SCATTER capability by turning on
the rte_eth_dev_data::scattered_rx option.

Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
3a822d79c5 net/ena: fix per-queue offload capabilities
As ENA currently doesn't support offloads which could be configured
per-queue, only per-port flags should be set.

In addition, to make the code cleaner, parsing of the appropriate offload
flags is encapsulated into helper functions, in a similar manner to how
it's done by the other PMDs.

[1] https://doc.dpdk.org/guides/prog_guide/
    poll_mode_drv.html?highlight=offloads#hardware-offload

Fixes: 7369f88f88 ("net/ena: convert to new Rx offloads API")
Fixes: 56b8b9b7e5 ("net/ena: convert to new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
005064e505 net/ena: support Tx/Rx free thresholds
The caller can pass a Tx or Rx free threshold value to the configuration
structure for each ring. It determines when the Tx/Rx function should
start cleaning up/refilling the descriptors. ENA was ignoring this value
and doing its own calculations.

Now the user can configure ENA's behavior using this parameter, and if
this variable isn't set, ENA will continue with the old behavior
and will use its own threshold value.

The default value is not provided by the ENA in the ena_infos_get(), as
it's being determined dynamically, depending on the requested ring size.

Note that NULL check for Tx conf was removed from the function
ena_tx_queue_setup(), as at this place the configuration will be
either provided by the user or the default config will be used and it's
handled by the upper (rte_ethdev) layer.

Tx threshold shouldn't be used for the Tx cleanup budget as it can be
inadequate for the burst in use. Now the PMD tries to release mbufs for
the ring until it is depleted.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Michal Krawczyk
e8c838fde9 net/ena: fix offload capabilities verification
ENA PMD has multiple checksum offload flags, which are more fine-grained
than the DPDK offload capabilities flags.
As the driver wasn't storing its internal checksum offload capabilities
and was relying only on the DPDK capabilities, not all scenarios could
be properly covered (like when to prepare pseudo header checksum and
when not).

Moreover, the user could request offload capability, which isn't
supported by the HW and the PMD would quietly ignore the issue.

This commit reworks eth_ena_prep_pkts() function to perform additional
checks and to properly reflect the HW requirements. With the
RTE_LIBRTE_ETHDEV_DEBUG enabled, the function will do even more
verifications, to help the user find any issues with the mbuf
configuration.

Fixes: b3fc5a1ae1 ("net/ena: add Tx preparation")
Cc: stable@dpdk.org

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
2021-10-19 15:04:17 +02:00
Viacheslav Galaktionov
26706314d4 net/sfc: implement transfer proxy port callback
In sfc, MAE admin serves as a transfer proxy. In order to track which
ethdev is privileged, augment every independent switch port structure
with information about its MAE privilege.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-10-18 20:56:02 +02:00
Viacheslav Galaktionov
2f577f0ea1 net/sfc: allow ports without MAE privilege
Register unprivileged ports in the switch domain registry in order to
allow redirecting traffic to them.

Differentiate between different levels of MAE support, update all MAE
status checks.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-10-18 20:56:02 +02:00
Viacheslav Galaktionov
40ccb31158 common/sfc_efx/base: support unprivileged MAE clients
In order to differentiate between privileged and unprivileged MAE clients,
add a separate boolean flag to represent a NIC's MAE privilege level.

Allow initializing unprivileged MAE clients by avoiding calls to functions
that can only be called by the admin NIC.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
2021-10-18 20:56:02 +02:00
Ferruh Yigit
d1576625f7 examples/ip_reassembly: remove unused option
Remove 'max-pkt-len' parameter.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
990912e676 ethdev: unify MTU checks
Both 'rte_eth_dev_configure()' & 'rte_eth_dev_set_mtu()' set the MTU but
have slightly different checks. For example, one checks the min MTU
against RTE_ETHER_MIN_MTU and the other against RTE_ETHER_MIN_LEN.

The checks are moved into a common function to unify them. This also
has the benefit of common error logs.

The default 'dev_info->min_mtu' (the one set by ethdev if the driver
doesn't provide one) is changed to ('RTE_ETHER_MIN_LEN' - overhead).
Previously it was 'RTE_ETHER_MIN_MTU', which is the min MTU for IPv4
packets. Since the intention is to provide the min MTU corresponding to
the minimum frame size, the new default value suits better.
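
As a worked example of the new default, the sketch below mirrors the
relevant rte_ether.h values as local constants and assumes the overhead is
just the Ethernet header plus CRC (a device may report a larger overhead,
e.g. for VLAN tags). The mtu_check() helper is hypothetical and only
illustrates a single check shared by both code paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local mirrors of the DPDK constants (values from rte_ether.h). */
#define ETHER_MIN_LEN 64 /* RTE_ETHER_MIN_LEN: minimum frame size, with CRC */
#define ETHER_HDR_LEN 14 /* RTE_ETHER_HDR_LEN */
#define ETHER_CRC_LEN 4  /* RTE_ETHER_CRC_LEN */
#define ETHER_MIN_MTU 68 /* RTE_ETHER_MIN_MTU: minimum MTU for IPv4 */

/* Assumption: overhead is header + CRC only. */
#define DEFAULT_OVERHEAD (ETHER_HDR_LEN + ETHER_CRC_LEN)
#define DEFAULT_MIN_MTU  (ETHER_MIN_LEN - DEFAULT_OVERHEAD)

/* Hypothetical common check shared by the configure and set-MTU paths. */
int mtu_check(uint16_t mtu, uint16_t min_mtu, uint16_t max_mtu)
{
	if (mtu < min_mtu || mtu > max_mtu)
		return -EINVAL;
	return 0;
}

/* Usage: 64 - 18 = 46 becomes the lowest accepted MTU. */
void mtu_check_example(void)
{
	assert(DEFAULT_MIN_MTU == 46);
	assert(mtu_check(46, DEFAULT_MIN_MTU, 1500) == 0);
	assert(mtu_check(45, DEFAULT_MIN_MTU, 1500) == -EINVAL);
	assert(mtu_check(ETHER_MIN_MTU, DEFAULT_MIN_MTU, 1500) == 0);
}
```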

Suggested-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
b563c14212 ethdev: remove jumbo offload flag
Remove the 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag.

Instead of drivers announcing this capability, the application can
deduce the capability by checking the reported 'dev_info.max_mtu' or
'dev_info.max_rx_pktlen'.

And instead of the application setting this flag explicitly to enable
jumbo frames, this can be deduced by the driver by comparing the
requested 'mtu' to 'RTE_ETHER_MTU'.

This additional configuration is removed for simplification.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
f7e04f57ad ethdev: move MTU set check to library
Move requested MTU value check to the API to prevent the duplicated
code.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
dd4e429c95 ethdev: move jumbo frame offload check to library
Setting MTU bigger than RTE_ETHER_MTU requires the jumbo frame support,
and application should enable the jumbo frame offload support for it.

When jumbo frame offload is not enabled by application, but MTU bigger
than RTE_ETHER_MTU is requested there are two options, either fail or
enable jumbo frame offload implicitly.

Enabling jumbo frame offload implicitly is selected by many drivers
since setting a big MTU value already implies it, and this increases
usability.

This patch moves this logic from drivers to the library, both to reduce
the duplicated code in the drivers and to make behaviour more visible.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
1bb4a528c4 ethdev: fix max Rx packet length
There is confusion about setting the max Rx packet length; this patch
aims to clarify it.

'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.

Also the 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and the
result is stored in '(struct rte_eth_dev)->data->mtu'.

These two APIs are related but they work in a disconnected way: they
store the set values in different variables, which makes it hard to
figure out which one to use. Having two different methods for related
functionality is also confusing for users.

Other issues causing confusion are:
* Maximum transmission unit (MTU) is the payload of the Ethernet frame,
  and 'max_rx_pkt_len' is the size of the Ethernet frame. The difference
  is the Ethernet frame overhead, and this overhead may differ from
  device to device based on what the device supports, like VLAN and QinQ.
* 'max_rx_pkt_len' is only valid when the application requested jumbo
  frames, which adds confusion, and some APIs and PMDs already disregard
  this documented behavior.
* For the jumbo frame enabled case, 'max_rx_pkt_len' is a mandatory
  field, which adds configuration complexity for the application.

As a solution, both APIs get the MTU as a parameter, and both save the
result in the same variable '(struct rte_eth_dev)->data->mtu'. For this,
'max_rx_pkt_len' is updated to 'mtu', and it is always valid,
independent of jumbo frames.

For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is the
user request; it should be used only within the configure function, and
the result should be stored in '(struct rte_eth_dev)->data->mtu'. After
that point both the application and the PMD use the MTU from this
variable.

When the application doesn't provide an MTU during
'rte_eth_dev_configure()', the default 'RTE_ETHER_MTU' value is used.

Additional clarification is done on scattered Rx configuration, in
relation to MTU and Rx buffer size.
MTU is used to configure the device for the physical Rx/Tx size
limitation. The Rx buffer is where Rx packets are stored; many PMDs use
the mbuf data buffer size as the Rx buffer size.
PMDs compare the MTU against the Rx buffer size to decide whether to
enable scattered Rx. If scattered Rx is not supported by the device, an
MTU bigger than the Rx buffer size should fail.
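
The relation between MTU, frame length and the scattered Rx decision can
be sketched as below. The helper names are hypothetical, and comparing the
full frame length (MTU plus overhead) against the Rx buffer size is an
assumption about how a PMD might apply the rule; individual drivers may
account for headroom or overhead differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14 /* mirrors RTE_ETHER_HDR_LEN */
#define ETHER_CRC_LEN 4  /* mirrors RTE_ETHER_CRC_LEN */
#define VLAN_TAG_LEN  4  /* one 802.1Q tag */

/* Hypothetical per-device overhead: header + CRC + supported VLAN tags. */
static inline uint32_t frame_overhead(unsigned int vlan_tags)
{
	return ETHER_HDR_LEN + ETHER_CRC_LEN + vlan_tags * VLAN_TAG_LEN;
}

static inline uint32_t mtu_to_frame_len(uint16_t mtu, unsigned int vlan_tags)
{
	return (uint32_t)mtu + frame_overhead(vlan_tags);
}

/* Scattered Rx is needed when one Rx buffer cannot hold a full frame. */
static inline bool needs_scattered_rx(uint16_t mtu, unsigned int vlan_tags,
				      uint32_t rx_buf_size)
{
	return mtu_to_frame_len(mtu, vlan_tags) > rx_buf_size;
}

/* Fails when scattering would be required but the device cannot do it. */
int rx_mtu_check(uint16_t mtu, unsigned int vlan_tags, uint32_t rx_buf_size,
		 bool scatter_supported)
{
	if (needs_scattered_rx(mtu, vlan_tags, rx_buf_size) && !scatter_supported)
		return -EINVAL;
	return 0;
}

/* Usage: standard and jumbo MTUs against a 2048-byte Rx buffer. */
void rx_mtu_check_example(void)
{
	assert(mtu_to_frame_len(1500, 0) == 1518);
	assert(mtu_to_frame_len(1500, 2) == 1526); /* QinQ */
	assert(!needs_scattered_rx(1500, 0, 2048));
	assert(needs_scattered_rx(9000, 0, 2048));
	assert(rx_mtu_check(9000, 0, 2048, false) == -EINVAL);
	assert(rx_mtu_check(9000, 0, 2048, true) == 0);
}
```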

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2021-10-18 19:20:20 +02:00
Georg Sauthoff
24f1955d1e net: fix aliasing in checksum computation
That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that an unsigned char
pointer may alias, while it doesn't include such a requirement
for uint8_t pointers.

The loop is also simplified, since a modern C compiler can speed it up
(i.e. auto-vectorize it) in a similar way. For example, GCC
auto-vectorizes it for Haswell using AVX registers while halving the
number of instructions in the generated code.
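
One aliasing-safe way to write such a loop is sketched below. It walks the
buffer through an unsigned char pointer and uses memcpy() to read each
16-bit word; the use of memcpy() and the function names are assumptions
for illustration and are not taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum 16-bit words in native byte order, in the style of a raw IP checksum. */
uint32_t raw_cksum(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf; /* unsigned char may alias any object */
	const unsigned char *end = p + (len & ~(size_t)1);

	for (; p != end; p += 2) {
		uint16_t v;

		memcpy(&v, p, sizeof(v)); /* no uint16_t pointer into buf */
		sum += v;
	}
	if (len & 1) {
		uint16_t left = 0;

		memcpy(&left, end, 1); /* trailing odd byte */
		sum += left;
	}
	return sum;
}

/* Fold the 32-bit sum into 16 bits (end-around carry). */
uint16_t cksum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Usage: byte patterns chosen to give the same sum on any endianness. */
void raw_cksum_example(void)
{
	const unsigned char a[4] = { 0x01, 0x01, 0x02, 0x02 };

	assert(raw_cksum(a, sizeof(a), 0) == 0x0303);
	assert(cksum_fold(0x1ffff) == 1);
}
```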

Fixes: 6006818cfb ("net: new checksum functions")
Fixes: e079655c41 ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org

Signed-off-by: Georg Sauthoff <mail@gms.tf>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-18 18:15:58 +02:00
Ferruh Yigit
54fe0cf1b8 net/enic: fix build with GCC 7.5
Build error:
../drivers/net/enic/enic_fm_flow.c: In function 'enic_fm_flow_parse':
../drivers/net/enic/enic_fm_flow.c:1467:24:
	error: 'dev' may be used uninitialized in this function
	[-Werror=maybe-uninitialized]
    struct rte_eth_dev *dev;
                        ^~~
../drivers/net/enic/enic_fm_flow.c:1580:24:
	error: 'dev' may be used uninitialized in this function
	[-Werror=maybe-uninitialized]
    struct rte_eth_dev *dev;
                        ^~~
../drivers/net/enic/enic_fm_flow.c:1599:24:
	error: 'dev' may be used uninitialized in this function
	[-Werror=maybe-uninitialized]
    struct rte_eth_dev *dev;
                        ^~~

The build error looks like a false positive, but to silence the compiler
the pointer is initialized with NULL.

Bugzilla ID: 812
Fixes: 54bd4ebe8b ("net/enic: support meta flow actions to overrule destinations")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-10-18 17:53:17 +02:00
William Tu
d2e5ab2b42 doc: fix emulated device names in e1000 guide
The device name should be 82574L Gigabit Ethernet Controller.
The patch also removes a redundant "*".

Fixes: fc1f2750a3 ("doc: programmers guide")
Cc: stable@dpdk.org

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
2021-10-15 15:50:50 +02:00
Pavan Nikhilesh
ac6deebb58 common/octeontx2: enable build only on 64-bit Linux
Since AARCH32 extension is not implemented on octeontx2 family, only
enable build for 64bit.
Due to Linux kernel AF(Admin Function) driver dependency, only enable
build for 64-bit Linux.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-15 19:13:13 +02:00
Pavan Nikhilesh
e1369718f5 common/octeontx: enable build only on 64-bit Linux
Since AARCH32 extension is not implemented on octeontx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for 64-bit Linux.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-15 19:13:13 +02:00
Pavan Nikhilesh
9ec67c12bd net/thunderx: enable build only on 64-bit Linux
Since AARCH32 extension is not implemented on thunderx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for Linux.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-15 19:13:13 +02:00
Jie Wang
655eae01f9 app/testpmd: fix RSS hash offload display
The driver may change RSS hash offloads in dev->data->dev_conf
during dev_configure, which may cause port->dev_conf and port->rx_conf
to contain outdated values.
Since testpmd uses its own configuration structures to display the
offloads configuration, it doesn't display the RSS hash offload.

This patch updates the testpmd offloads from device configuration
to fix this issue.

Fixes: ce8d561418 ("app/testpmd: add port configuration settings")

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-15 13:27:05 +02:00
Jie Wang
632be32735 ethdev: add API to get device configuration
The driver may change offloads info in dev->data->dev_conf
in dev_configure, which may cause apps to use outdated values.

Add a new API to get actual device configuration.

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-15 13:27:05 +02:00
Xueming Li
51c1b8f8a0 net/bonding: fix Tx queue release
When releasing a Tx queue, Rx queue data got freed because the wrong
Tx queue data was located.

This patch fixes the wrong Tx queue data location.

Fixes: 7483341ae5 ("ethdev: change queue release callback")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-15 12:45:58 +02:00
Li Zhang
771253ea8f net/mlx5: fix domains selection for meter policy
Fate actions are different per domain.
When all the domains, ingress, egress and FDB (transfer),
can support all the policy actions, i.e. [SET_TAG],
the policy prepares resources for all the domains and
failure happens if one of the domains misses its fate action
in the policy action list.

Remove the domains missing their fate action
from the meter policy preparation.

Now, the policy will prepare a domain only when the domain supports
all the actions and when one of the domain fate actions is on the list.
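
The selection rule can be expressed with bit masks, as in the sketch
below. The action bits, the per-domain capability sets and the function
name are all hypothetical and chosen only to illustrate the rule; they do
not reflect the actual mlx5 capabilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical action bits, for illustration only. */
#define ACT_SET_TAG (1u << 0)
#define ACT_QUEUE   (1u << 1)
#define ACT_PORT_ID (1u << 2)
#define ACT_DROP    (1u << 3)

struct domain_caps {
	uint32_t supported; /* actions the domain can execute */
	uint32_t fate;      /* subset that acts as a fate action in this domain */
};

/* A domain is prepared only if it supports every action and has a fate. */
bool domain_is_prepared(const struct domain_caps *d, uint32_t policy_actions)
{
	bool supports_all = (policy_actions & ~d->supported) == 0;
	bool has_fate = (policy_actions & d->fate) != 0;

	return supports_all && has_fate;
}

/* Usage: assumed capability sets for two domains. */
void domain_is_prepared_example(void)
{
	const struct domain_caps ingress = {
		.supported = ACT_SET_TAG | ACT_QUEUE | ACT_DROP,
		.fate = ACT_QUEUE | ACT_DROP,
	};
	const struct domain_caps transfer = {
		.supported = ACT_SET_TAG | ACT_PORT_ID | ACT_DROP,
		.fate = ACT_PORT_ID | ACT_DROP,
	};

	/* QUEUE is a fate only for ingress, so only ingress is prepared. */
	assert(domain_is_prepared(&ingress, ACT_SET_TAG | ACT_QUEUE));
	assert(!domain_is_prepared(&transfer, ACT_SET_TAG | ACT_QUEUE));

	/* SET_TAG alone is supported everywhere but carries no fate action. */
	assert(!domain_is_prepared(&ingress, ACT_SET_TAG));
	assert(!domain_is_prepared(&transfer, ACT_SET_TAG));
}
```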

Fixes: afb4aa4f12 ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-10-14 10:48:33 +02:00
Simei Su
9f8c4cf02d net/ice: fix dereferenced null pointer
This patch fixes a Coverity issue by avoiding use of a null pointer
when taking the false branch.

Coverity issue: 373360
Fixes: 437dbd2fd4 ("net/ice: support 1PPS")

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-10-14 08:26:43 +02:00
Dapeng Yu
fff4914bd9 net/ice: fix freeing queues on DCF device reset
In function ice_dcf_stop_queues(), RX queues and TX queues are actually
not freed, so their pointers shall not be set to NULL when queues are
stopped.

This patch adds a function call to free queues on DCF device close,
which also sets the RX and TX queues' pointers to NULL on freeing
queues, and avoids referring to the released resources when the device
is started again.

Fixes: 1a86f4dbdf ("net/ice: support DCF device reset")
Cc: stable@dpdk.org

Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2021-10-13 12:58:04 +02:00