Commit Graph

24209 Commits

Author SHA1 Message Date
Phil Yang
f0f5d844d1 eal: remove deprecated coherent IO memory barriers
Since the 20.08 release deprecated rte_cio_*mb APIs because these APIs
provide the same functionality as rte_io_*mb APIs on all platforms, so
remove them and use rte_io_*mb instead.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-23 13:40:26 +02:00
Feifei Wang
46697431ad test/ring: enhance debug info in failure cases
Add more parameters into the macro TEST_RING_VERIFY and expand the scope
of application for it. Then replace all ring APIs check with
TEST_RING_VERIFY to facilitate debugging.

Furthermore, correct a spelling mistakes of the macro
TEST_RING_FULL_EMTPY_ITER.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 11:23:10 +02:00
Feifei Wang
6c583103a2 test/ring: factorize object checks
Do code clean up by moving repeated code inside 'test_ring_mem_cmp'
function to validate data and print information of enqueue/dequeue
elements if validation fails.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 11:23:10 +02:00
Feifei Wang
f68c206639 test/ring: validate single element enqueue/dequeue
Validate the return value of single element enqueue/dequeue operation in
the test.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 11:23:10 +02:00
Feifei Wang
c570da3671 test/ring: check dequeued object for single element
Add check in test_ring_basic_ex and test_ring_with_exact_size for single
element enqueue and dequeue operations to validate the dequeued objects.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 11:23:08 +02:00
Feifei Wang
d943c60565 test/ring: fix dequeued object checks
When using memcmp function to check data, the third param should be the
size of all elements, rather than the number of the elements.

Fixes: a9fe152363 ("test/ring: add custom element size functional tests")
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 08:53:33 +02:00
Feifei Wang
f642148eea test/ring: fix number of single element enqueue/dequeue
The ring capacity is (RING_SIZE - 1), thus only (RING_SIZE - 1) number of
elements can be enqueued into the ring.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 08:53:33 +02:00
Feifei Wang
6450023ca2 test/ring: fix object reference for single element enqueue
When enqueue one element to ring in the performance test, a pointer
should be passed to rte_ring_[sp|mp]enqueue APIs, not the pointer
to a table of void *pointers.

Fixes: a9fe152363 ("test/ring: add custom element size functional tests")
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-23 08:53:33 +02:00
Hyong Youb Kim
a4ab862e99 net/enic: support VXLAN decap action combined with VLAN pop
Flow Manager (flowman) provides DECAP_STRIP operation which
decapsulates VXLAN header and then removes VLAN header from the inner
packet. Use this operation to support vxlan_decap followed by
of_pop_vlan.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:10:38 +02:00
Hyong Youb Kim
d52054e70f net/enic: generate VXLAN src port if it is zero in template
When VXLAN source port in the template is zero, the adapter is
expected to generate a value based on the inner packet flow, when it
performs encapsulation. Flow Manager in the VIC adapter currently
lacks such ability. So, generate a random port when creating a flow if
the port is zero, to avoid transmitting packets with source port 0.

Fixes: ea7768b5bb ("net/enic: add flow implementation based on Flow Manager API")
Cc: stable@dpdk.org

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:10:38 +02:00
Hyong Youb Kim
473e9407a4 net/enic: ignore VLAN inner type when it is zero
When a VLAN pattern is present, the flow handler always copies its
inner_type to the match buffer regardless of its value (i.e. HW
matches inner_type against packet's inner ethertype). When inner_type
spec and mask are both 0, adding it to the match buffer is usually
harmless but breaks the following pattern used in some applications
like OVS-DPDK.

flow create 0 ingress ... pattern eth ... type is 0x0800 /
vlan tci spec 0x2 tci mask 0xefff / ipv4 / end actions count /
of_pop_vlan / ...

The VLAN pattern's inner_type is 0. And the outer eth pattern's type
actually specifies the inner ethertype. The outer ethertype (0x0800)
is first copied to the match buffer. Then, the driver copies
inner_type (0) to the match buffer, which overwrites the existing
0x0800 with 0 and breaks the app usage above.

Simply ignore inner_type when it is 0, which is the correct
behavior. As a byproduct, the driver can support the usage like the
above.

Fixes: ea7768b5bb ("net/enic: add flow implementation based on Flow Manager API")
Cc: stable@dpdk.org

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:10:38 +02:00
Hyong Youb Kim
f985387e44 net/enic: support priorities for TCAM flows
Group 0 corresponds to TCAM which supports priorities. Accept non-zero
priorities for group 0 flows.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:10:38 +02:00
Hyong Youb Kim
8ca08b7026 net/enic: support egress port id action
Use Flow Manager (flowman) to support egress PORT_ID action. It can
steer egress packets from PFs and VFs to any uplink port as long as
they are all on the same VIC adapter. It can also steer packets
between ports on the same VIC adapter (i.e. loopback).

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:10:38 +02:00
Hyong Youb Kim
87884276ff net/enic: remove obsolete code
The 'next' field in struct enic is unused. The comment in enic_cq_rq()
is out-of-date. Remove them.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:10:38 +02:00
Hyong Youb Kim
a93cf16926 net/enic: enable flow API for VF representor
Use Flow Manager (flowman) to support flow API for
representors. Representor's flow handlers simply invoke PF handlers
and pass the representor's flowman structure. The PF flowman handlers
are aware of representors and perform appropriate devcmds to create
flows on the NIC.

Also use flowman to create internal flows for implicit VF-representor
path. With that, representor Tx/Rx is now functional.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:05:38 +02:00
Hyong Youb Kim
859540e719 net/enic: extend flow handler to support VF representors
VF representor ports can create flows on VFs through the PF flowman
(Flow Manager) instance in the firmware. These flows match packets
egressing from VFs and apply flowman actions.

1. Make flow handler aware of VF representors
When a representor port invokes flow APIs, use the PF port's flowman
instance to perform flowman devcmd. If the port ID refers to a
representor, use VF handle instead of PF handle.

2. Serialize flow API calls
Multiple application thread may invoke flow APIs through PF and VF
representor ports simultaneously. This leads to races, as ports all
share the same PF flowman instance. Use a lock to serialize API
calls. Lock is used only when representors exist.

3. Add functions to create flows for implicit representor paths
There is an implicit path between VF and its representor. The
functions below create flow rules to implement that path.
- enic_fm_add_rep2vf_flow()
- enic_fm_add_vf2rep_flow()

The flows created for representor paths are marked as internal. These
are not visible to application, and the flush API does not destroy
them. They are automatically deleted when the representor port stops
(enic_fm_destroy).

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:05:38 +02:00
Hyong Youb Kim
edd0854815 net/enic: add single queue Tx and Rx to VF representor
A VF representor allocates queues from PF's pool of queues and use
them for its Tx and Rx. It supports 1 Tx queue and 1 Rx queue.

Implicit packet forwarding between representor queues and VF does not
yet exist. It will be enabled in subsequent commits using flowman API.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:05:38 +02:00
Hyong Youb Kim
39cf83f177 net/enic: add minimal VF representor
Enable the minimal VF representor without Tx/Rx and flow API support.

1. Enable the standard devarg 'representor'
When the devarg is specified, create VF representor ports.

2. Initialize flowman early during PF probe
Representors require the flowman API from the firmware. Initialize it
before creating VF representors, so probe can detect the flowman
support and fail if not available.

3. Add enic_fm_allocate_switch_domain() to allocate switch domain ID
PFs and VFs on the same VIC adapter can forward packets to each other,
so the switch domain is the physical adapter.

4. Create a vnic_dev lock to serialize concurrent devcmd calls
PF and VF representor ports may invoke devcmd (e.g. dump stats)
simultaneously. As they all share a single PF devcmd instance in the
firmware, use a lock to serialize devcmd calls.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:05:38 +02:00
Hyong Youb Kim
0e7312b9d2 net/enic: extend VNIC dev API for VF representors
VF representors need to proxy devcmd through the PF vnic_dev
instance. Extend vnic_dev to accommodate them as follows.

1. Add vnic_vf_rep_register()
A VF representor creates its own vnic_dev instance via this function
and saves VF ID. When performing devcmd, vnic_dev uses the saved VF ID
to proxy devcmd through the PF vnic_dev instance.

2. Add vnic_register_lock()
As PF and VF representors appear as independent ports to the
application, its threads may invoke APIs on them simultaneously,
leading to race conditions on the PF vnic_dev. For example, thread A
can query stats on PF port, while thread B queries stats on a VF
representor.

The PF port invokes this function to provide a lock to vnic_dev. This
lock is used to serialize devcmd calls from PF and VF representors.

3. Add utility functions to assist VF representor settings
vnic_dev_mtu() and vnic_dev_uif() retrieve vnic MTU and UIF number
(uplink index), respectively.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2020-09-21 18:05:38 +02:00
Chengchang Tang
e692c74691 net/hns3: add Rx buffer size to Rx queue info
Report hns3 PMD configured Rx buffer size in Rx queue information query.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-21 18:05:38 +02:00
Chengchang Tang
61efaf5b62 ethdev: support getting Rx buffer size in Rx queue info
Add a field named rx_buf_size in rte_eth_rxq_info to indicate the buffer
size used in receiving packets for HW.

In this way, upper-layer users can get this information by calling
rte_eth_rx_queue_info_get.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-21 18:05:38 +02:00
Long Li
d9fecbe97b net/netvsc: fix rndis packet addresses
The address should be calculated before type cast, not after.

Fixes: cc02518132 ("net/netvsc: split send buffers from Tx descriptors")
Cc: stable@dpdk.org

Reported-by: Souvik Dey <sodey@rbbn.com>
Signed-off-by: Long Li <longli@microsoft.com>
2020-09-21 18:05:38 +02:00
Qi Zhang
6faf884136 net/iavf: fix iterator for RSS LUT
Change RSS LUT iterator from uint8_t to uint16_t since the
RSS LUT size could exceed 255.

Fixes: 69dd4c3d08 ("net/avf: enable queue and device")
Cc: stable@dpdk.org

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ting Xu <ting.xu@intel.com>
2020-09-21 18:05:38 +02:00
Phil Yang
7ec9d6f4bd net/memif: relax barrier for zero copy path
Using 'rte_mb' to synchronize the shared ring head/tail between producer
and consumer will stall the pipeline and damage performance on the weak
memory model platforms, such like aarch64.

Relax the expensive barrier with c11 atomic with explicit memory
ordering can improve 3.6% performance on throughput.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jakub Grajciar <jgrajcia@cisco.com>
2020-09-21 18:05:38 +02:00
Chengchang Tang
0134a5c7b4 net/hns3: fix crash when Tx multiple buffer packets
Currently, there is a possibility that segment faults occur when sending
packets whose payloads are stored in multiple buffers based on hns3
network engine. The related core dump information as follows:

Program terminated with signal 11, Segmentation fault.
0  hns3_reassemble_tx_pkts
2512                            temp = temp->next;
Missing separate debuginfos, use:
(gdb) bt
0  hns3_reassemble_tx_pkts
1  0x0000000000969c60 in hns3_check_non_tso_pkt
2  0x000000000096adbc in hns3_xmit_pkts
3  0x000000000050d4d0 in rte_eth_tx_burst
4  0x000000000050fca4 in pkt_burst_transmit
5  0x00000000004ca6b8 in run_pkt_fwd_on_lcore
6  0x00000000004ca7fc in start_pkt_forward_on_core
7  0x00000000006975a4 in eal_thread_loop
8  0x0000ffffa6f7fc48 in start_thread
9  0x0000ffffa6ed1600 in thread_start

The root cause is that hns3 PMD driver invokes the rte_pktmbuf_free_seg
API function to release the same rte_mbuf multiple times. The rte_mbuf
pointer is not set to NULL in the internal function
hns3_rx_queue_release_mbufs which is invoked during queue setup, stop
and close. As a result the rte_mbuf in Rx queues will be repeatedly
released when the user application setup queues or stop/start the dev
for multiple times.
Probably for performance reasons, DPDK mempool lib does not check for
the repeated rte_mbuf releases. The Address of released rte_mbuf are
directly stored into the per lcore cache of the mempool. This makes the
rte_mbufs obtained from mempool by calling rte_mempool_get_bulk API
function repetitively.  ultimately, it causes to access to a NULL
pointer in PMD driver.

This patch fixes this problem by setting released mbuf pointer to NULL
in the internal function named hns3_rx_queue_release_mbuf. And the other
internal function named hns3_reassemble_tx_pkts is optimized to avoid a
similar problem.

Fixes: bba6366983 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
db433d5f75 net/hns3: add restriction on setting VF MTU
When Rx of scattered packets is off, we have some possibility of using
vector Rx process function or simple Rx functions in hns3 PMD driver.
If the input MTU is increased and the maximum length of received packets
is greater than the length of a buffer for Rx packets, the hardware
network engine needs to use multiple BDs and buffers to store these
packets. This will cause problems when still using vector Rx process
function or simple Rx function to receiving packets. So, when Rx of
scattered packets is off and device is started, it is not permitted to
increase MTU so that the maximum length of Rx packets is greater than Rx
buffer length.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
a3d4f4d291 net/hns3: support NEON Rx
This patch adds NEON vector instructions to optimize Rx burst process.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
e31f123db0 net/hns3: support NEON Tx
This patch adds NEON vector instructions to optimize Tx burst process.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
7ef933908f net/hns3: add simple Tx path
This patch adds simple Tx process function. When multiple segment
packets are not needed, Which means that DEV_TX_OFFLOAD_MBUF_FAST_FREE
offload is not set, we can simple Tx process.

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
521ab3e933 net/hns3: add simple Rx path
This patch adds simple Rx process function and support chose Rx function
by real Rx offloads capability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
323df8941b net/hns3: reduce address calculation in Rx
This patch adds the internal function named hns3_write_reg_opt to avoid
performance loss from address calculation during register access in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

In addition, because hardware always access register in little-endian
mode based on hns3 network engine, so driver should also call
rte_cpu_to_le_32 to convert data in little-endian mode before writing
register and call rte_le_to_cpu_32 to convert data after reading from
register. Here the driver encapsulates the data conversion operation in
the register read/write operation function as below:
  hns3_write_reg
  hns3_write_reg_opt
  hns3_read_reg
Therefore, when calling these functions, conversion is not required
again.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-21 18:05:38 +02:00
Wei Hu (Xavier)
ceabee45be net/hns3: report Rx free threshold
This patch reports .rx_free_thresh value in the .dev_infos_get ops
implementation function named hns3_dev_infos_get and
hns3vf_dev_infos_get.
In addition, the name of the member variable of struct hns3_rx_queue is
modified and comments are added to improve code readability.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2020-09-21 18:05:38 +02:00
Ivan Dyukov
db4e81351f examples: use new link status print format
Add usage of rte_eth_link_to_str function to example
applications.

Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-21 18:05:38 +02:00
Ivan Dyukov
ba5509a6a8 app: use new link status print format
Add usage of rte_eth_link_to_str function to applications and docs.

Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-21 18:05:37 +02:00
Ivan Dyukov
fbf931c9c3 ethdev: format link status text
There is new link_speed value introduced. It's INT_MAX value which
means that speed is unknown. To simplify processing of the value
in application, new function is added which convert link_speed to
string. Also dpdk examples have many duplicated code which format
entire link status structure to text.

This commit adds two functions:
  * rte_eth_link_speed_to_str - format link_speed to string
  * rte_eth_link_to_str - convert link status structure to string

Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-21 18:05:37 +02:00
Ciara Loftus
fb053c35c6 net/af_xdp: fix umem size
The kernel expects the start address of the UMEM to be page size
aligned.
Since the mempool is not guaranteed to have such alignment, we have been
aligning the address to the start of the page the mempool is on. However
when passing the 'size' of the UMEM during it's creation we did not take
this into account.

This commit adds the amount by which the address was aligned to the size
of the UMEM.

Bugzilla ID: 532
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
2020-09-18 19:09:04 +02:00
Vipul Ashri
a1412e05ca net/virtio: fix variable assignment in helper macro
Inside Macro ASSIGN_UNLESS_EQUAL(var, val), assignment to var is always
failing as assignment done using var_ having local scope only.
This leads to TX packets not going out and found broken due to cleanup
malfunctioning. This patch fixes the wrong variable assignment.

Fixes: 57f90f8945 ("net/virtio: reuse packed ring functions")
Cc: stable@dpdk.org

Signed-off-by: Vipul Ashri <vipul.ashri@oracle.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Matan Azrad
4fb86eb5e8 vdpa/mlx5: fix completion queue polling
The CQ polling is done in order to notify the guest about new traffic
bursts and to release FW resources for the next bursts management.

When HW is faster than SW, it may be that all the FW resources are busy
in SW due to late polling.
In this case, due to wrong WQE counter masking, the fullness
calculation of the completions number is 0 while the queue is full.

Change the WQE counter masking to 16-bit wideness instead of the CQ
size mask as defined by the CQE format.

Fixes: c5f714e50b ("vdpa/mlx5: optimize completion queue poll")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Matan Azrad
9c0e15a117 vdpa/mlx5: fix completion queue assertion
The CQ configuration enables the collapse feature in HW what cause HW to
write all the completions in the first CQE.
When this feature is enabled the HW doesn't switch the owner bit when it
starts a new cycle of the CQ, not like working without the collapse
feature.

The current SW CQ polling wrongly added an assertion to validate the
owner bit switch what causes a panic in debug mode.

Remove the aforementioned assertion.

Fixes: c5f714e50b ("vdpa/mlx5: optimize completion queue poll")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Eugenio Pérez
46d3f57537 vhost: fix IOTLB mempool single-consumer flag
Control thread (which handles iotlb msg) and forwarding thread
both use iotlb to translate address. The former may modify the
same entry of mempool and may cause a loop in iotlb_pending_entries
list.

Bugzilla ID: 523
Fixes: d012d1f293 ("vhost: add IOTLB helper functions")
Cc: stable@dpdk.org

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Xueming Li
e8671aca20 vdpa/mlx5: fix event channel setup
During vDPA device setup, if some error happens, event channel
release stucks at polling event channel.

Event channel fd is set to non-blocking in cqe setup, so if any
error happens before this function and after event channel created,
the pooling before releasing resources will stuck.

This patch moves event channel to non-blocking mode right after
creation.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Chenbo Xia
671cc679a5 vhost: add device reset status
vhost lib now does not have definition of reset status. This patch
adds the reset status definition and changes related log.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Adrian Moreno
ce40b4a881 net/virtio-user: enable feature checking
virtio 1.0 introduced a mechanism for the driver to verify that the
feature bits it sets are accepted by the device. This mechanism consists
in setting the VIRTIO_STATUS_FEATURE_OK status bit and re-reading it,
which gives a chance for the device to clear it if the features
were not accepted.

This is currently being done only in modern virtio-pci devices but since
the appropriate vhost-user messages have been added, it can also be done
in virtio-user (vhost-user only).

This patch activates this mechanism on virtio-user.

Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2020-09-18 18:55:12 +02:00
Adrian Moreno
0b0dc66c72 net/virtio-user: support vhost status getting
This patch adds support for VHOST_USER_GET_STATUS request.

Only vhost-user backed is supported for now

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-18 18:55:12 +02:00
Maxime Coquelin
5791282461 net/virtio-user: support vhost status setting
This patch adds support for VHOST_USER_SET_STATUS
request. It is used to make the backend aware of
Virtio devices status update.

It is useful for the backend to know when the Virtio
driver is done with the Virtio device configuration.

Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2020-09-18 18:55:12 +02:00
Adrian Moreno
e84a9dab7d net/virtio: add device reset status bit
For the sake of completeness, add the definition of the missing status
bit in accordance with the virtio spec

Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
2020-09-18 18:55:12 +02:00
Rahul Lakkireddy
f2d344dfaf net/cxgbe: support RSS redirection table update
Implement eth_dev_ops to manipulate RSS redirection table.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2020-09-18 18:55:12 +02:00
Rahul Lakkireddy
e30e5407fd net/cxgbe: improve Rx congestion control
Chelsio T6 NIC can support up to 8 priority channels to manage
congestion. So, increase to 8 congestion channels for T6. Also,
add Rxq state to avoid unnecessarily ringing doorbell and polling
the hardware for more traffic when the Rxq is stopped.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2020-09-18 18:55:12 +02:00
Rahul Lakkireddy
7b3d52989a net/cxgbe: rework queue allocation between ports
Firmware returns the max queues that can be allocated on the entire
PF. The driver evenly distributes them across all the ports belonging
to the PF. However, some ports may need more queues than others and
this equal distribution scheme prevents accessing these other ports
unused queues. So, remove the equal distribution scheme and allow the
ports to allocate as many queues as they need.

Also remove the hardcoded 64 max limit on queue allocation. Instead,
use the max limit given by firmware.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2020-09-18 18:55:12 +02:00
Rahul Lakkireddy
11df4a688d net/cxgbe: release port resources during port close
Enable RTE_ETH_DEV_CLOSE_REMOVE during PCI probe for all ports
enumerated under the PF. Free up the underlying port Virtual
Identifier (VI) and associated resources during port close.
Once all the ports under the PF are closed, free up the PF-wide
shared resources. Invoke port close function of all ports under
the PF, in PCI remove too.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2020-09-18 18:55:12 +02:00