Commit Graph

52 Commits

Author SHA1 Message Date
Chengwen Feng
9261fd3caf net/hns3: improve IO path data cache usage
This patch improves data cache usage by:
1. Rearrange the rxq frequency accessed fields in the IO path to the
   first 128B.
2. Rearrange the txq frequency accessed fields in the IO path to the
   first 64B.
3. Make sure ptype table align cacheline size which is 128B instead of
   min cacheline size which is 64B because the L1/L2 is 64B and L3 is
   128B on Kunpeng ARM platform.

The performance gains are 1.5% in 64B packet macfwd scenarios.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-05-04 18:02:14 +02:00
Min Hu (Connor)
55d5ad6bb8 net/hns3: remove unused macro
'HNS3_RXD_LKBK_B' was defined in previous versions but no used.
This patch deleted it.

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-20 12:55:28 +02:00
Chengwen Feng
bd7399291a net/hns3: simplify Rx checksum
Currently, the L3L4P/L3E/L4E/OL3E/OL4E fields in Rx descriptor used to
indicate hardware checksum result:
1. L3L4P: indicates hardware has processed L3L4 checksum for this
   packet, if this bit is 1 then L3E/L4E/OL3E/OL4E is trustable.
2. L3E: L3 checksum error indication, 1 means with error.
3. L4E: L4 checksum error indication, 1 means with error.
4. OL3E: outer L3 checksum error indication, 1 means with error.
5. OL4E: outer L4 checksum error indication, 1 means with error.

Driver will set the good checksum flag through packet type and
L3E/L4E/OL3E/OL4E when L3L4P is 1, it runs as follows:
1. If packet type indicates it's tunnel packet:
1.1. If packet type indicates it has inner L3 and L3E is zero, then
mark the IP checksum good.
1.2. If packet type indicates it has inner L4 and L4E is zero, then
mark the L4 checksum good.
1.3. If packet type indicates it has outer L4 and OL4E is zero, then
mark the outer L4 checksum good.
2. If packet type indicates it's not tunnel packet:
2.1. If packet type indicates it has L3 and L3E is zero, then mark the
IP checksum good.
2.2. If packet type indicates it has L4 and L4E is zero, then mark the
L4 checksum good.

As described above, the good checksum calculation is time consuming,
it impacts the Rx performance.

By balancing performance and functionality, driver uses the following
scheme to set good checksum flag when L3L4P is 1:
1. If L3E is zero, then mark the IP checksum good.
2. If L4E is zero, then mark the L4 checksum good.

The performance gains are 3% in small packet iofwd scenarios.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-19 19:15:45 +02:00
Min Hu (Connor)
81b129d419 net/hns3: remove unused macros
'HNS3_RXD_TSIND_S' and 'HNS3_RXD_TSIND_M' is unused, which should
be deleted.

This patch fixed it.

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-19 18:25:42 +02:00
Chengwen Feng
aa5baf47e1 net/hns3: rename Rx burst function
Currently, user could use runtime config "rx_func_hint=simple" to
select the hns3_recv_pkts API, but the API's name get from
rte_eth_rx_burst_mode_get is "Scalar" which has not reflected "simple".

So this patch renames hns3_recv_pkts to hns3_recv_pkts_simple, and
also change it's name which gets from rte_eth_rx_burst_mode_get to
"Scalar Simple" to maintain conceptual consistency.

Fixes: 521ab3e933 ("net/hns3: add simple Rx path")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-15 15:11:03 +02:00
Chengwen Feng
1f303606e8 net/hns3: fix some packet types
Currently, the packet type calculated by
vlan/ovlan/l3id/l4id/ol3id/ol4id fields have the following problems:
1) Identify error when exist VLAN strip which will lead to the data
   buffer has non VLAN header but mbuf's ptype have L2_ETHER_VLAN flag.
2) Some packet identifies error, eg: hardware report it's RARP or
   unknown packet, but ptype will marked with L2_ETHER .

So driver will calculate packet type only by l3id/l4id/ol3id/ol4id
fields.

Fixes: 0e98d5e6d9 ("net/hns3: fix packet type report in Rx")
Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-13 11:13:41 +02:00
Min Hu (Connor)
53e6f86cf5 net/hns3: fix copyright date
This patch updates copyright date for hns3 PMD files.

Fixes: 565829db8b ("net/hns3: add build and doc infrastructure")
Fixes: 952ebacce4 ("net/hns3: support SVE Rx")
Fixes: e31f123db0 ("net/hns3: support NEON Tx")
Fixes: c09c7847d8 ("net/hns3: support traffic management")

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-08 17:55:35 +02:00
Min Hu (Connor)
fa485faca2 net/hns3: update HiSilicon copyright syntax
According to the suggestion of our legal department,
to standardize the copyright license of our code to
avoid potential copyright risks, we make a unified
modification to the "Hisilicon", which was nonstandard,
in the main modules we maintain.

We change it to "HiSilicon", which is consistent with
the terms used on the following official website:
https://www.hisilicon.com/en/terms-of-use.

Fixes: 565829db8b ("net/hns3: add build and doc infrastructure")
Fixes: 952ebacce4 ("net/hns3: support SVE Rx")
Fixes: e31f123db0 ("net/hns3: support NEON Tx")
Fixes: c09c7847d8 ("net/hns3: support traffic management")
Cc: stable@dpdk.org

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-06 18:28:13 +02:00
Min Hu (Connor)
38b539d96e net/hns3: support IEEE 1588 PTP
Add hns3 support for new ethdev APIs to enable and read IEEE1588/
802.1AS PTP timestamps.

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-01 18:39:55 +02:00
Chengchang Tang
8f01e2f847 net/hns3: fix Tx checksum for UDP packets with special port
For Kunpeng920 network engine, UDP packets with destination port 6081,
4789 or 4790 will be identified as tunnel packets. If the UDP CKSUM
offload is set in the mbuf, and the TX tunnel mask is not set, the
CKSUM of these packets will be wrong. In this case, the upper layer
user may not identify the packet as a tunnel packet, and processes it
as non-tunnel packet, and expect to offload the outer UDP CKSUM, so
they may not fill the outer L2/L3 length to mbuf. However, the HW
identifies these packet as tunnel packets and therefore offload the
inner UDP CKSUM. As a result, the inner and outer UDP CKSUM are
incorrect. And for non-tunnel UDP packets with preceding special
destination port will also exist similar checksum error.

For the new generation Kunpeng930 network engine, the above errata
have been fixed. Therefore, the concept of udp_cksum_mode is
introduced. There are two udp_cksum_mode for hns3 PMD,
HNS3_SPECIAL_PORT_HW_CKSUM_MODE means HW could solve the above
problem. And in HNS3_SPECIAL_PORT_SW_CKSUM_MODE, hns3 PMD will check
packets in the Tx prepare and perform the UDP CKSUM for such packets
to avoid a checksum error.

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-03-30 12:30:46 +02:00
Hongbo Zheng
63e05f19b8 net/hns3: support Rx descriptor status query
Add support for query Rx descriptor status in hns3 driver. Check the
descriptor specified and provide the status information of the
corresponding descriptor.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-03-23 13:04:33 +01:00
Hongbo Zheng
656a6d9cc0 net/hns3: support Tx descriptor status query
Add support for query Tx descriptor status in hns3 driver. Check the
descriptor specified and provide the status information of the
corresponding descriptor.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-03-23 13:04:33 +01:00
Chengchang Tang
d0ab89e633 net/hns3: support outer UDP checksum
Kunpeng930 support outer UDP cksum, this patch add support for it.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-03-23 13:04:32 +01:00
Chengwen Feng
fb5e906940 net/hns3: support Rx descriptor advanced layout
Currently, the driver get packet type by parse the
L3_ID/L4_ID/OL3_ID/OL4_ID from Rx descriptor and then lookup multiple
tables, it's time consuming.

Now Kunpeng930 support advanced RXD layout, which:
1. Combine OL3_ID/OL4_ID to 8bit PTYPE filed, so the driver get packet
   type by lookup only one table.  Note: L3_ID/L4_ID become reserved
   fields.
2. The 1588 timestamp located at Rx descriptor instead of query from
   firmware.
3. The L3E/L4E/OL3E/OL4E will be zero when L3L4P is zero, so driver
   could optimize the good checksum calculations (when L3E/L4E is zero
   then mark PKT_RX_IP_CKSUM_GOOD/PKT_RX_L4_CKSUM_GOOD).

Considering compatibility, the firmware will report capability of
RXD advanced layout, the driver will identify and enable it by default.

This patch only provides basic function: identify and enable the RXD
advanced layout, and lookup ptype table if supported.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2021-03-04 15:07:14 +01:00
Chengwen Feng
dfecc3201f net/hns3: implement Tx mbuf free on demand
This patch add support tx_done_cleanup ops, which could support for
the API rte_eth_tx_done_cleanup to free consumed mbufs on Tx ring.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2021-03-04 15:07:13 +01:00
Chengchang Tang
2b6b09817d net/hns3: fix interrupt resources in Rx interrupt mode
For Kunpeng930, the NIC engine support 1280 tqps being taken over by
a PF. In this case, a maximum of 1281 interrupt resources are also
supported in this PF. To support the maximum number of queues, several
patches are made. But the interrupt related modification are missing.
So, in RX interrupt mode, a large number of queues will be aggregated
into one interrupt due to insufficient interrupts. It will lead to
waste of interrupt resources and reduces usability.

To utilize all these interrupt resources, related IMP command has been
extended. And, the I/O address of the extended interrupt resources are
different from the existing ones. So, a function used for calculating
the address offset has been added.

Fixes: 76d794566d ("net/hns3: maximize queue number")
Fixes: 27911a6e62 ("net/hns3: add Rx interrupts compatibility")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2021-01-29 18:16:12 +01:00
Huisong Li
86c551d1d8 net/hns3: move queue stats to xstats
One of the hot discussions in community recently was moving queue stats
to xstats. In this solution, a temporary
'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' device flag is created to implement
the smooth switch. And the first half of this work has been completed in
the ethdev framework. Now driver needs to remove the flag from the
driver initialization process and does the rest of work.

For better readability and reasonability, per-queue stats also should be
cleared when rte_eth_stats is cleared. Otherwise, the sum of one item in
per-queue stats may be greater than corresponding item in rte_eth_stats.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2021-01-29 18:16:11 +01:00
Huisong Li
9b77f1fe30 net/hns3: encapsulate DFX stats in datapath
pkt_len_errors and l2_errors in Rx datapath indicate that driver
needs to discard received packets. And driver does not discard
packets for l3/l4/ol3/ol4_csum_errors in Rx datapath and others
stats in Tx datapath. Therefore, it is necessary for improving
code readability and maintainability to encapsulate error stats
and dfx stats.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2021-01-29 18:16:11 +01:00
Chengchang Tang
80ec1bbd5b net/hns3: fix queue state after reset
FLR operation will reset the queue enabling state and
the driver needs to restore the state after reset.
If the driver does not restore the state, it will result
in unpredictable behavior with reset when user start or
stop queue by calling the relevant function if.

This patch fix it by add a queue enabling state restore
function to the reset handler.

Fixes: fa29fe45a7 ("net/hns3: support queue start and stop")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2020-11-13 19:43:26 +01:00
Lijun Ou
445b0c8eba net/hns3: cleanup includes
Some header files have included by others. Also,
some header files have a header file self-contained
error will trigger building warning. As a result,
it is unnecessary and move it into the correct
location.

Beside, here also remove some unused lines.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
2020-11-03 23:35:07 +01:00
Chengchang Tang
fb6eb9009f net/hns3: fix Tx checksum with fixed header length
Currently, the header length of all the layers are fixed, It would
lead to a csum error when the header length changed.

This patch fixes above problem by using the header length in mbuf
instead of the fixed header length to perform the TX cksum offload.

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2020-11-03 23:35:07 +01:00
Chengchang Tang
0e98d5e6d9 net/hns3: fix packet type report in Rx
Currently, hns3 supports recognizing a lot of ptypes, but most
tunnel packet types are not reported to the API
rte_eth_dev_get_supported_ptypes.

And there are some errors in L2 and L3 packet recognition. The
ARP and LLDP are classified to L3 field in RX descriptor. So,
the ptype of LLDP and ARP packets will be set twice. And ptypes
are assigned by bitwise OR, which will eventually cause the ptype
result to be incorrect.

Besides, when a packet with only L2 header, its ptype will not
report by hns3 PMD. This is because the L2/L3 ptype table is not
initialized properly. In this case, the table query result is 0
by default.

As a result, it fixes missing supported ptypes and the mistake in
L2/L3 packet recognition and the unreported L2 packet ptype by
reporting its L2 type when the L3 type unrecognized..

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
2020-11-03 23:35:06 +01:00
Lijun Ou
2be91035b3 net/hns3: get number of used descriptors of Rx queue
Implement the available and used rxd number count function.

In Kunpeng series, the NIC hardware supports to read the bd numbers
which wait processed from the hardware FBD (Full Buffer Descriptor),
and the driver maintains the bd number to be written back hardware.
Compare the number of FBDs with the number of BDs to be written back to
the hardware.

The number of used descriptors of a rx queue is computed as follows:
The fbd numbers of reading from FBD register plus the bd numbers to be
written back to hardware maintained by the driver.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
2020-11-03 23:35:06 +01:00
Chengwen Feng
f0c243a6cb net/hns3: support SVE Tx
This patch adds SVE vector instructions to optimize Tx burst process.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
2020-10-16 19:48:19 +02:00
Wei Hu (Xavier)
952ebacce4 net/hns3: support SVE Rx
This patch adds SVE vector instructions to optimize Rx burst process.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2020-10-16 19:48:19 +02:00
Chengchang Tang
fa29fe45a7 net/hns3: support queue start and stop
The new generation hns3 network engine supports independent enabling and
disabling of a single Tx/Rx queue. So, it can support the queue start
and stop feature. In addition, when different numbers of Tx and Rx
queues need to be enabled in some applications, hns3 pmd does not need
to create fake queues to enable these scenarios.

This patch Add queue start and stop feature for the new generation hns3
networking engine. Cancel the creation of fake queue on the new
generation network engine. And the previously improperly named queue
related function was renamed to improve readability.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-10-08 19:58:10 +02:00
Wei Hu (Xavier)
76d794566d net/hns3: maximize queue number
The maximum number of queues for hns3 PF and VF driver is 64 based on
hns3 network engine with revision_id equals 0x21. Based on hns3 network
engine with revision_id equals 0x30, the hns3 PF PMD driver can support
up to 1280 queues, and hns3 VF PMD driver can support up to 128 queues.

The following points need to be modified to support maximizing queue
number and maintain better compatibility:
1) Maximizing the number of queues for hns3 PF and VF PMD driver In
   current version, VF is not supported when PF is driven by hns3 PMD
   driver. If maximum queue numbers allocated to PF PMD driver is less
   than total tqps_num allocated to this port, all remaining number of
   queues are mapped to VF function, which is unreasonable. So we fix
   that all remaining number of queues are mapped to PF function.

   Using RTE_LIBRTE_HNS3_MAX_TQP_NUM_PER_PF which comes from
   configuration file to limit the queue number allocated to PF device
   based on hns3 network engine with revision_id greater than 0x30. And
   PF device still keep the maximum 64 queues based on hns3 network
   engine with revision_id equals 0x21.

   Remove restriction of the macro HNS3_MAX_TQP_NUM_PER_FUNC on the
   maximum number of queues in hns3 VF PMD driver and use the value
   allocated by hns3 PF kernel netdev driver.

2) According to the queue number allocated to PF device, a variable
   array for Rx and Tx queue is dynamically allocated to record the
   statistics of Rx and Tx queues during the .dev_init ops
   implementation function.
3) Add an extended field in hns3_pf_res_cmd to support the case that
   numbers of queue are greater than 1024.
4) Use new base address of Rx or Tx queue if QUEUE_ID of Rx or Tx queue
   is greater than 1024.
5) Remove queue id mask and use all bits of actual queue_id as the
   queue_id to configure hardware.
6) Currently, 0~9 bits of qset_id in hns3_nq_to_qs_link_cmd used to
   record actual qset id and 10 bit as VLD bit are configured to
   hardware. So we also need to use 11~15 bits when actual qset_id is
   greater than 1024.
7) The number of queue sets based on different network engine are
   different. We use it to calculate group number and configure to
   hardware in the backpressure configuration.
8) Adding check operations for number of Rx and Tx queue user configured
   when mapping queue to tc Rx queue numbers under a single TC must be
   less than rss_size_max supported by a single TC. Rx and Tx queue
   numbers are allocated to every TC by average. So Rx and Tx queue
   numbers must be an integer multiple of 2, or redundant queues are not
   available.
9) We can specify which packets enter the queue with a specific queue
   number, when creating flow table rules by rte_flow API. Currently,
   driver uses 0~9 bits to record the queue_id. So it is necessary to
   extend one bit field to record queue_id and configure to hardware, if
   the queue_id is greater than 1024.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-10-08 19:58:10 +02:00
Wei Hu (Xavier)
dd1e461182 net/hns3: add TSO pseudo header calculation compatibility
In kunpeng 920, when process pkts which need TSO, the network driver
need to erase the L4 len value of the TCP TSO pseudo header and
recalculate the pseudo header checksum. kunpeng930 support not need
to erase the L4 len value of the TCP TSO pseudo header.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2020-09-30 19:19:10 +02:00
Hongbo Zheng
da17b003f3 net/hns3: add max number of segments compatibility
Kunpeng 920 supports the maximum nb_segs of non-tso packet is 8 in Tx
direction, kunpeng 930 expands this limit value to 18, this patch sets
the corresponding value by querying the maximum number of non-tso nb_segs
supported by the device during initialization.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2020-09-30 19:19:10 +02:00
Wei Hu (Xavier)
992b24a1ce net/hns3: add VLAN configuration compatibility
Because of hardware limitation based on the old version of hns3 network
engine, there are some restrictions:
a) HNS3 PMD driver needs select different processing mode for VLAN based
   on whether PVID is set which means our driver need sense the PVID
   states.
b) For packets transmitting process, only two layer of VLAN tag is
   supported. If the total number of VLAN tags in mbuf and VLAN offload
   by hardware (VLAN insert by descriptor) exceeds two, the VLAN in mbuf
   will be overwritten by VLAN in the descriptor.
c) If port based VLAN is set, only one VLAN header is allowed in mbuf or
   it will be discard by hardware.

In order to solve these restriction, two change is implemented on the
new versions of network engine.
1) add a new VLAN tagged insertion mode, named tag shift mode;
2) add a new VLAN strip control bit, named strip hide enable;

The tag shift mode means that VLAN tag will shift automatically when the
inserted place has a tag. For PMD driver, the VLAN tag1 and tag2
configurations in Tx side do not need to be considered because the
hardware completes it. However, the related configuration will still be
retained to be compatible with the old version of network engine.

The VLAN strip hide means that hardware will strip the VLAN tag and hide
VLAN in descriptor (VLAN ID exposed as zero and related STRIP_TAGP is
off).

These changes make it no longer necessary for the hns3 PMD driver to be
aware of the PVID status and have the ability to send mult-layer (more
than two) VLANs packets. Therefore, hns3 PMD driver introduces the
concept of VLAN mode and adds a new VLAN mode named HNS3_PVID_MODE to
indicate that PVID-related IO process can be implemented by the
hardware. And VF driver does not need to be modified because the related
mailbox messages will not be sent by PF kernel mode netdev driver under
new network engine and all the related hardware configuration is on the
PF side.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-30 19:19:10 +02:00
Wei Hu (Xavier)
a3d4f4d291 net/hns3: support NEON Rx
This patch adds NEON vector instructions to optimize Rx burst process.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
e31f123db0 net/hns3: support NEON Tx
This patch adds NEON vector instructions to optimize Tx burst process.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
7ef933908f net/hns3: add simple Tx path
This patch adds simple Tx process function. When multiple segment
packets are not needed, Which means that DEV_TX_OFFLOAD_MBUF_FAST_FREE
offload is not set, we can simple Tx process.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
521ab3e933 net/hns3: add simple Rx path
This patch adds simple Rx process function and support chose Rx function
by real Rx offloads capability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
323df8941b net/hns3: reduce address calculation in Rx
This patch adds the internal function named hns3_write_reg_opt to avoid
performance loss from address calculation during register access in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

In addition, because hardware always access register in little-endian
mode based on hns3 network engine, so driver should also call
rte_cpu_to_le_32 to convert data in little-endian mode before writing
register and call rte_le_to_cpu_32 to convert data after reading from
register. Here the driver encapsulates the data conversion operation in
the register read/write operation function as below:
  hns3_write_reg
  hns3_write_reg_opt
  hns3_read_reg
Therefore, when calling these functions, conversion is not required
again.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
ceabee45be net/hns3: report Rx free threshold
This patch reports .rx_free_thresh value in the .dev_infos_get ops
implementation function named hns3_dev_infos_get and
hns3vf_dev_infos_get.
In addition, the name of the member variable of struct hns3_rx_queue is
modified and comments are added to improve code readability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
395b5e08ef net/hns3: add Tx short frame padding compatibility
There are difference about padding ultra-short frame in Tx procession
for different versions of hardware network engine.

If packet length is less than minimum packet length supported by
hardware in Tx direction, driver need to pad it to avoid error. The
minimum packet length in Tx direction is 33 based on kunpeng 920, and 9
based on kunpeng 930.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2020-09-18 18:55:07 +02:00
Wei Hu (Xavier)
27911a6e62 net/hns3: add Rx interrupts compatibility
There are difference about queue's interrupt configurations for
different versions of hardware network engine, such as queue's interrupt
mapping mode, coalesce configuration, etc.

The following uses the configuration differences of the interrupt
mapping mode as an example.
1) For some versions of hardware network engine, such as kunpeng 920,
   because of the hardware constraint, we need implement unmmapping
   relationship configurations by binding all queues to the last
   interrupt vector and reserving the last interrupt vector. This
   results in a decrease of the maximum queues when upper applications
   call the rte_eth_dev_configure API function to enable Rx interrupt.
2) And for another versions, such as kunpeng 930, hns3 PMD driver can
   map/unmmap all interrupt vectors with queues when Rx interrupt is
   enabled.

This patch resolves configuration differences about Rx interrupts based
on kunpeng 920 and kunpeng 930.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-18 18:55:07 +02:00
Huisong Li
091a0f95b5 net/hns3: support getting queue information
This patch adds support for querying Rx/Tx queue information.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-18 18:55:06 +02:00
Min Hu (Connor)
8973d7c4ca net/hns3: support keeping CRC
CRC is the end of frame, which occupies 4 bytes. Keeping CRC is a
feature of MAC, which will not strip CRC field when receiving frames.
The feature can be enabled using DEV_RX_OFFLOAD_KEEP_CRC offload by
upper level application. And the feature is only supported for hns3 PF
PMD driver, not supported for hns3 VF PMD driver

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-07-17 18:21:21 +02:00
Wei Hu (Xavier)
dfac40d93e net/hns3: fix Rx buffer size
Currently, rx_buf_size of hns3 PMD driver is fixed on, and it's value
depends on the firmware which will decrease the flexibility of PMD.

The receive side mbufs was allocated from the mempool given by upper
application calling rte_eth_rx_queue_setup API function. So the memory
chunk used for net device DMA is depend on the data room size of the
objects in this mempool. Hns3 PMD driver should set the rx_buf_len
smaller than the data room size of mempool and our hardware only support
the following four specifications: 512, 1024, 2148 and 4096.

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-07-07 23:38:26 +02:00
Chengchang Tang
8c7449779c net/hns3: decrease non-nearby memory access in Rx
Currently, hns3 PMD driver needs know the PVID configuration state and
do different processing in the 'rx_pkt_burst' ops implementation
function.

This patch adds a member to struct hns3_rx_queue/hns3_tx_queue of the
driver to indicate the PVID configuration status, so it isn't need
to access other data structure in the 'rx_pkt_burst' ops implementation,
to avoid performance loss because of reducing cache miss.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-07-07 23:38:26 +02:00
Wei Hu (Xavier)
1f295c40da net/hns3: support LRO
This patch adds support of LRO offload for hns3 PMD driver.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-07-07 23:38:26 +02:00
Wei Hu (Xavier)
c4b7d6761d net/hns3: get Tx abnormal errors in xstats
When upper level application calls the rte_eth_tx_burst API function to
send multiple packets at a time with burst mode based on hns3 network
engine, there are some abnormal conditions that cause the driver to fail
to operate the hardware to send packets correctly.

This patch adds some statistic counts for the abnormal errors of Tx data
path to the extend device statistics. The upper level application can
get them by calling the rte_eth_xstats_get API function.

Note: When using burst mode to call the rte_eth_tx_burst API function to
send multiple packets at a time. When the first abnormal error is
detected, add one to the relevant error statistics item, and then exit
the loop of sending multiple packets of the function. That is to say,
even if there are multiple packets in which abnormal errors may be
detected in the burst, the relevant error statistics in the driver will
only be increased by one.

The detail description of the Tx abnormal errors statistic items as
below:
 - TX_OVER_LENGTH_PKT_CNT Total number of greater than
   HNS3_MAX_FRAME_LEN the driver supported.

 - TX_EXCEED_LIMITED_BD_PKT_CNT
     Total number of exceeding the hardware limited bd which process a
     packet needed bd numbers.

 - TX_EXCEED_LIMITED_BD_PKT_REASSEMBLE_FAIL_CNT
     Total number of exceeding the hardware limited bd fail which
     process a packet needed bd numbers and reassemble fail.

 - TX_UNSUPPORTED_TUNNEL_PKT_CNT
     Total number of unsupported tunnel packet. The unsupported tunnel
     type: vxlan_gpe, gtp, ipip and MPLSINUDP, MPLSINUDP is a packet
     with MPLS-in-UDP RFC 7510 header.

 - TX_QUEUE_FULL_CNT
     Total count which the available bd numbers in current bd queue is
     less than the bd numbers with the pkt process needed.

 - TX_SHORT_PKT_PAD_FAIL_CNT
     Total count which the packet length is less than minimum packet
     size HNS3_MIN_PKT_SIZE and fail to be appended with 0.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Hao Chen <chenhao164@huawei.com>
2020-05-05 15:54:26 +02:00
Chengwen Feng
c4ae39b2cf net/hns3: fix Rx interrupt after reset
Currently, Rx interrupt cannot work normally after reset (such as FLR,
global reset and IMP reset), when running l3fwd-power application based
on hns3 network engine.

The root cause is that the hardware configuration about Rx interrupt
does not recover after reset.

This patch fixes it with the following modification.
1. The internal static function named hns3(vf)_init_ring_with_vector is
   moved from hns3_init_pf to hns3(vf)_init_hardware because
   hns3(vf)_init_hardware is called both in the initialization and the
   RESET_STAGE_DEV_INIT stage of the reset process.
2. The internal static function named hns3(vf)_restore_rx_interrupt is
   added in hns3(vf)_restore_conf, it is used to recover hardware
   configuration about interrupt vectors of rx queues in the
   RESET_STAGE_DEV_INIT stage of the reset process.
3. The internal static function named hns3_dev_all_rx_queue_intr_enable
   and hns3_enable_all_queues are added in hns3(vf)_dev_start(which
   called in the initialization, so after calling the rte_eth_dev_start
   API successfully, the driver is ready to work.
4. The function named hns3_dev_all_rx_queue_intr_enable and
   hns3_enable_all_queues are also added in hns3(vf)_start_service(which
   called in the RESET_STAGE_DEV_INIT stage of the reset process), so
   after start_service, the driver is ready to work.

Note:
1. Because FLR will clear queue's interrupt enable bit hardware
   configuration, so we add calling hns3_dev_all_rx_queue_intr_enable to
   enable interrupt before enabling queues.
2. After finished the initialization, we can enable queues to work by
   calling the internal function named hns3_enable_all_queues.

Fixes: 02a7b55657 ("net/hns3: support Rx interrupt")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
2020-04-21 13:57:07 +02:00
Wei Hu (Xavier)
ef2e785c36 net/hns3: fix Tx interrupt when enabling Rx interrupt
Currently, when receiving and transmitting packets based on hns3 network
engine there are probably unexpected and redundant Tx interrupts if Rx
interrupt is enabled.

The root cause as below:
Tx and Rx queues with the same number share the interrupt vector in hns3
network engine, and in this case there are the residual hardware mapping
relationship configuration between queue and interrupt vector configured
in hns3 kernel ethdev driver.

We should clear the all hardware mapping relationship configurations in
the initialization. Because of the hardware constraints, we have to
implement clearing the relationship by binding all queues to the last
interrupt vector and reserving the last interrupt vector, this method
results in a decrease of the maximum queues when upper applications call
the rte_eth_dev_configure API function to enable Rx interrupt.

Fixes: 02a7b55657 ("net/hns3: support Rx interrupt")
Cc: stable@dpdk.org

Signed-off-by: Hao Chen <chenhao164@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-03-18 10:21:42 +01:00
Wei Hu (Xavier)
ffd0ec015b net/hns3: add free threshold in Rx
This patch optimizes the Rx performance by adding the rx_free_thresh
related process in the '.rx_pkt_burst' ops implementation function named
hns3_recv_pkts. The related change as follows:
1. Adding the rx_free_thresh related process to reduce the number of
   writing the HNS3_RING_RX_HEAD_REG register.
2. Adjusting the internal macro named DEFAULT_RX_FREE_THRESH to 32 and
   adjusting HNS3_MIN_RING_DESC to 64 to make the effect of the thresh
   more obvious.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-01-17 19:46:26 +01:00
Wei Hu (Xavier)
5cf7a75b2c net/hns3: remove one IO barrier in Rx
When receiving a packet, hns3 hardware network engine firstly writes the
packet content to the memory pointed by the 'addr' field of the Rx
Buffer Descriptor, secondly fills the result of parsing the packet
include the valid field into the Rx Buffer Descriptor in one write
operation, and thirdly writes the number of the Buffer Descriptor not
processed by the driver to the HNS3_RING_RX_FBDNUM_REG register.

This patch optimizes the Rx performance by removing one rte_io_rmb call
in the '.rx_pkt_burst' ops implementation function named hns3_recv_pkts.
The change as follows:
1. Driver no longer read HNS3_RING_RX_FBDNUM_REG register, so remove one
   rte_io_rmb call, and directly read the valid flag of Rx Buffer
   Descriptor to check whether the BD is ready.
2. Delete the non_vld_descs field from the statistic information of the
   hns3 driver because now it has become a common case that the valid
   flag of Rx Buffer Descriptor read by the driver is invalid.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-01-17 19:46:26 +01:00
Wei Hu (Xavier)
a951c1ed3a net/hns3: support different numbers of Rx and Tx queues
Hardware does not support individually enable/disable/reset the Tx or Rx
queue in hns3 network engine, driver must enable/disable/reset Tx and Rx
queues at the same time.

Currently, hns3 PMD driver does not support the scenarios as below:
1) When calling the following function, the input parameter nb_rx_q and
   nb_tx_q are not equal.
     rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q,
                      uint16_t nb_tx_q,
		      const struct rte_eth_conf *dev_conf);
2) When calling the following functions to setup queues, the
   cumulatively setup Rx queues are not the same as the setup Tx queues.
     rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,,,);
     rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,,,);
However, these are common usage scenarios in some applications, such as,
l3fwd, ip_ressmbly and OVS-DPDK, etc.

This patch adds support for this usage of these functions by setup fake
Tx or Rx queues to adjust numbers of Tx/Rx queues. But these fake queues
are imperceptible, and can not be used by upper applications.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-01-17 19:46:26 +01:00
Hao Chen
02a7b55657 net/hns3: support Rx interrupt
This patch adds supports of receive packets through interrupt mode for
hns3 PF/VF driver. The following ops functions should be implemented
defined in the struct eth_dev_ops:
rx_queue_intr_enable
rx_queue_intr_disable

Signed-off-by: Hao Chen <chenhao164@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-01-17 19:46:01 +01:00