19905 Commits

Author SHA1 Message Date
David Marchand
bbabce218d eal: remove deprecated malloc virt2phys function
Remove rte_malloc_virt2phy as announced previously.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-27 10:36:19 +01:00
David Marchand
637af85090 eal: remove deprecated CPU flags check function
Remove rte_cpu_check_supported as announced previously.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-27 10:35:58 +01:00
Stephen Hemminger
65661351ca eal: make lcore config private
The internal structure of lcore_config does not need to be part of
visible API/ABI. Make it private to EAL.

Rearrange the structure so it takes less memory (and cache footprint).

Since we change the ABI, bump the library version.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-27 10:35:11 +01:00
Ciara Power
ab6ebd7020 examples/load_balancer: remove example
This example can be removed because DPDK now has a range
of libraries, especially rte_eventdev, that did not exist
previously for load balancing, making this less relevant.
Also, modern NIC cards have greater ability to do load balancing,
e.g. using RSS, over a wider range of fields than earlier cards did.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2019-10-26 22:32:40 +02:00
Ciara Power
d82610b940 examples/netmap-compat: remove example
Rather than providing a shim layer on top of netmap,
we should instead encourage users to create apps using
the DPDK APIs directly.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2019-10-26 22:32:40 +02:00
Ciara Power
46971b273e examples/quota-watermark: remove example
Original DPDK rings code had explicit support for a
single watermark per-ring, but more recent releases of
DPDK had a more general mechanism where each enqueue
or dequeue call could return the remaining elements/free-slots
in the ring.
Therefore, this example is not as relevant as before and can be removed.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2019-10-26 22:32:40 +02:00
Bruce Richardson
bd6e98193a examples/l3fwd-vf: remove example
The main l3fwd app should work with both PF and VF devices, so remove the
VF-only l3fwd example.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2019-10-26 22:32:40 +02:00
Bruce Richardson
207361b350 examples/exception_path: remove example
The example app shows the use of TUN/TAP with DPDK, but DPDK has a built-in
TAP PMD, so this example is obsolete and so can be removed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2019-10-26 22:32:40 +02:00
Ciara Power
7eb115c0e7 doc: remove unnecessary sample app guide table
The table of examples in the sample application guide is now removed,
as it was unnecessary and difficult to update when adding/removing
examples.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2019-10-26 22:32:39 +02:00
Chunsong Feng
cea37e5133 net/hns3: fix FLR reset
PF FLR resets the PCIe ECAM space of all VFs under the PF
and does not automatically recover. Therefore, the VF driver
needs to restore the ECAM configuration, including
bus_master_en, msix_enable to avoid FLR reset failure.

Fixes: 2790c6464725 ("net/hns3: support device reset")

Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2019-10-25 19:23:23 +02:00
Hongbo Zheng
1a028f2bea net/hns3: fix error code for firmware timeout
Configuration commands are sent to firmware for processing.
When firmware processing timeout, the corresponding error
code is returned. Considering that it is more reasonable to
use error code -ETIME for timeout error, the error code for
processing timeout is changed from -EBADE to -ETIME.

Fixes: 737f30e1c3ab ("net/hns3: support command interface with firmware")

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2019-10-25 19:23:23 +02:00
Wei Hu (Xavier)
ad7cf94823 net/hns3: fix offload flag for RSS hash
This patch adds PKT_RX_RSS_HASH flag to rx packet's ol_flags
to repair the bug that hns3 pmd driver doesn't set
PKT_RX_RSS_HASH flag. In hns3 network engine RSS is always
enabled.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")

Signed-off-by: Hao Chen <chenhao164@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2019-10-25 19:23:23 +02:00
Hao Chen
1e627b8d99 net/hns3: fix statistics
This patch fixes the statistics problems for sending and receiving
message as belows:

1.In receiving direction, for FCS error messages, drivers should not
record them in rte_eth_stats.ipackets statistics.

2.In sending direction, for messages of illegal length, too long or
equals 0, drivers should not notify the network card hardware to
send them, should not continue to send the remaining message in burst,
and record them in rte_eth_stats.opackets statistics.

Fixes: 8839c5e202f3 ("net/hns3: support device stats")

Signed-off-by: Hao Chen <chenhao164@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
2019-10-25 19:23:23 +02:00
Andrew Rybchenko
1526dd0532 net/virtio: fix Tx checksum offloads
Missing parenthesis around expression before type cast to struct
virtio_net_hdr pointer makes the arithmetic to be in
sizeof(struct virtio_net_hdr) units.

Use rte_pktmbuf_mtod_offset() to fix the problem.

Type of head_size is changed to signed since some compilers bark
on unary minus applied to unsigned.

Fixes: 1ae55ad38e5e ("net/virtio: fix mbuf data and packet length mismatch")
Cc: stable@dpdk.org

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:23 +02:00
Venkat Duvvuru
5f5ccaa05c net/bnxt: uninit locks during resource cleanup
Currently, bnxt_uninit_locks is called inside bnxt_dev_uninit. However,
the more appropriate place to call it is inside bnxt_uninit_resources.
This is more logical, as all the resources uninitialization is happening
inside bnxt_uninit_resources.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:23 +02:00
Kalesh AP
f5019a53d7 net/bnxt: fix memory leak
During hot fw upgrade or error recovery, driver does cleanup and
reallocation of resources. But driver reallocates memory for ring
group info table without freeing the allocated memory during
device init.

Fix this memory leak by moving the freeing of ring group info table
to bnxt_free_mem() in bnxt_uninit_resources().

Fixes: c09f57b49c13 ("net/bnxt: add start/stop/link update operations")
Cc: stable@dpdk.org

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
2019-10-25 19:23:23 +02:00
Kalesh AP
34f2f8ad40 net/bnxt: add wrapper function for Thor device check
This patch adds a wrapper function bnxt_thor_device() which checks
for the pci device is Thor family.

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
2019-10-25 19:23:23 +02:00
Kalesh AP
382e273ab6 net/bnxt: add PCI IDs for 57500 series NPAR devices
Add PCI IDs for 57500 series NPAR devices

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:23 +02:00
Kalesh AP
bfbfe4aaa1 net/bnxt: add more checks during firmware reset
Added check for fw reset or fw errors in few more routines.
While processing the events, if it is fw fatal or non-fatal
event, there is no need to process the remaining events as
driver triggers recovery mechanism.

Also added a check for fw reset or fatal error in bnxt_disable_int()
to avoid bar access in case of fatal error.

Fixes: be14720def9c ("net/bnxt: support FW reset")

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Rahul Gupta <rahul.gupta@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-10-25 19:23:23 +02:00
Santoshkumar Karanappa Rastapur
8937597cfe net/bnxt: fix crash after removing and adding slaves
On removing the slave interface, slave_remove in bonding module calls
_rte_eth_dev_reset which in turn frees both Tx and Rx queues.
1. segfault is seen after removing/adding the slave interface and starting
bond interface.
In this below path, when mtu is set for the slave interface, queues are
not created yet and driver reference to queue[0] causes the segfault.
slave_configure:
	rte_eth_dev_set_mtu
	rte_eth_dev_configure
	rte_eth_rx_queue_setup

2. segfault is seen on starting the port after removing from bond device.
This is a testpmd bug where in, on starting the port, testpmd is supposed
to recreate the queues before starting the port.

Fixed these by adding check for queues created before accessing them.

Fixes: daef48efe5e5 ("net/bnxt: support set MTU")
Cc: stable@dpdk.org

Signed-off-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-10-25 19:23:23 +02:00
Somnath Kotur
26f044cebd net/bnxt: fix MAC address check in flow validation
Add another check to flag zero mac address while validating/parsing
the flow arguments

Fixes: 407244800389 ("net/bnxt: allow only unicast MAC address filter creation")

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
2019-10-25 19:23:23 +02:00
Somnath Kotur
bd94753fc7 net/bnxt: expose some missing counters in port stats
Rx/Tx Port statistics record PFC enabled frames for each priority.
Modify the Rx/Tx port stats array to report these as well.

Fixes: bfb9c2260be2 ("net/bnxt: support xstats get/reset")
Cc: stable@dpdk.org

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:22 +02:00
Rahul Gupta
24de0b605c net/bnxt: fix filter type for non-ntuple flows
Use HWRM_CFA_L2_FILTER filter_type, as its checked during SRC MAC
filter creation in bnxt_create_l2_filter().

Fixes: afef822b2e1b ("net/bnxt: support creating SMAC and inner DMAC filters")

Signed-off-by: Rahul Gupta <rahul.gupta@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-10-25 19:23:22 +02:00
Kalesh AP
a73b8e939f net/bnxt: fix race between start and interrupt handler
During port start, driver enables interrupts. At the end of port start
driver schedules the alarm for fw health check. This can cause race
between interrupt handler and schedule fw health check alarm thread
which falsely detects FW error.

Fixes: 9d0cbaecc91a ("net/bnxt: support periodic FW health monitoring")

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-10-25 19:23:22 +02:00
Venkat Duvvuru
037cc747d1 net/bnxt: fix default MAC check
bnxt_set_hwrm_vnic_filters programs default mac addr and the
same default mac is added by mac_add_addr_op routine as well.
This redundant mac add is avoided by checking if the default
mac is already added.

However, that check is wrong. The check should consider the
mac index as well to determine the default mac. This patch
fixes it by using mac index to determine the default mac.

Fixes: 68f589f2c728 ("net/bnxt: fix setting primary MAC address")
Cc: stable@dpdk.org

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-10-25 19:23:22 +02:00
Ajit Khaparde
04102f2ffc net/bnxt: update HWRM API to version 1.10.1.6
Update HWRM API to version 1.10.1.6

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
2019-10-25 19:23:22 +02:00
Santoshkumar Karanappa Rastapur
ce0ed702a1 net/bnxt: fix RSS configuration for Thor controller
rss_table structure stores receive, completion ring IDs for Thor based
controllers and receive queue id for others. Below commit accidentally
left the old code and hence was always overwriting receive ring id with
queue id for Thor. This causes subsequent VNIC_RSS_CFG command to fail.

Fixes: 38412304b50a ("net/bnxt: enable RSS for thor-based controllers")
Cc: stable@dpdk.org

Signed-off-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2019-10-25 19:23:22 +02:00
Ajit Khaparde
8d7bdc8c13 net/bnxt: fix writing MTU to FW
We are currently writing the MAX supported MTU size even though
the MTU passed by the application is a different value.
Configure the application specified MTU instead.

Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:22 +02:00
Ajit Khaparde
32e970c166 net/bnxt: fix setting max RSS contexts
Configure max RSS contexts in case the firmware indicates a maximal
resource reservation strategy.

Fixes: 6d8109bcb398 ("net/bnxt: check VF resources if resource manager is enabled")
Cc: stable@dpdk.org

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:22 +02:00
Santoshkumar Karanappa Rastapur
7d4e9e2606 net/bnxt: fix ping with MTU change
Driver was setting maximum receive unit differently in bnxt_mtu_set_op
and bnxt_hwrm_vnic_alloc. Moreover firmware adds the 4 bytes for FCS.
Fixed it by setting correct maximum receive unit value.

Fixes: daef48efe5e5 ("net/bnxt: support set MTU")
Cc: stable@dpdk.org

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:22 +02:00
Kalesh AP
fb33ac033a net/bnxt: fix check while freeing filter
This patch fixes the coverity UNUSED_VALUE issue.

Coverity issue: 349938
Fixes: 49d0709b257f ("net/bnxt: delete and flush L2 filters cleanly")

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
2019-10-25 19:23:22 +02:00
Tianfei Zhang
79e9ae58bf raw/ifpga/base: fix dereference before null check
Add pointer null check before dereference.

Coverity issue: 344976
Fixes: 12f92a513a13 ("raw/ifpga/base: fix retimer link status")
Cc: stable@dpdk.org

Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
2019-10-25 19:23:22 +02:00
Xiaoyu Min
739e045bf2 app/testpmd: support multiple raw encap/decap
In some scenarios, the raw_encap/raw_decap actions could be multiple in
one single flow (e,g. hirepin flow):

  ... actions raw_decap / raw_encap / raw_decap / raw_encap / ...

This requires the testpmd supports multiple raw_encap/raw_decap data
settings as well.

With the multiple raw_encap/raw_decap settings, the testpmd commands –
set raw_encap / set raw_decap will become:

  set raw_encap <index> <item pattern>
  set raw_decap <index> <item pattern>

And the actions – raw_encap/raw_decap also could optionally choose which
global raw_encap/raw_decap confs to be used by index:

  ... actions raw_decap index 1 / raw_encap index 2 / ...

If there is no `index` specified, the default index is 0:

  set raw_encap <item pattern>
  ... actions raw_decap / raw_encap / ...

which will use raw_encap index 0.

In addition to the set raw_encap/raw_decap commands,

  show <raw_encap/raw_decap> <index>
  show <raw_encap/raw_decap> all

are also introduced into in order to check which index is set and to
what.

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-10-25 19:23:22 +02:00
Peng Huang
0322272c9f app/testpmd: fix Tx checksum when TSO enabled
This patch fixed the TX checksum value problem when enabled TSO in
tunnel packets, because outer UDP checksum calculation depend on
the TSO configuration.

Fixes: 0f62d63593ed ("app/testpmd: support tunneled TSO in checksum engine")
Cc: stable@dpdk.org

Signed-off-by: Peng Huang <peng.huang@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2019-10-25 19:23:22 +02:00
Raslan Darawsheh
38e118c0d8 net/mlx: fix debug build with icc
Trying to compile mlx5 pmd in debug mode with icc
will lead to compilation failures due to the fact that
icc doesn't have support for the pragma of pedantic.

Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-10-25 19:23:22 +02:00
Flavio Leitner
84c39beb2f vhost: fix IPv4 checksum
Currently the IPv4 header checksum is calculated including its
current value, which can be a valid checksum or just garbage.
In any case, if the original value is not zero, then the result
is always wrong.

The IPv4 checksum is defined in RFC791, page 14 says:
  Header Checksum:  16 bits

  The checksum algorithm is:
  The checksum field is the 16 bit one's complement of the one's
  complement sum of all 16 bit words in the header.  For purposes of
  computing the checksum, the value of the checksum field is zero.

Thus force the csum field to always be zero.

Fixes: b08b8cfeb2ae ("vhost: fix IP checksum")
Cc: stable@dpdk.org

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
Jin Yu
19b8d75f64 examples: delete vhost SCSI example
This example is too old and SPDK will not maintain this example
anymore. Also SPDK has submitted a new vhost example vhost-blk.
We will keep on maintaining vhost-blk and It shows the packed
ring and live recovery support.

Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
Andrew Rybchenko
9fc963acc2 net/virtio: reject unsupported Tx multi-queue modes
This driver supports none of DCB or VMDQ modes, therefore must
check and return error if configured incorrectly.

Fixes: c1f86306a026 ("virtio: add new driver")
Cc: stable@dpdk.org

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
Dilshod Urazov
13b3137f3b net/virtio: reject unsupported Rx multi-queue modes
This driver supports none of DCB, RSS or VMDQ modes, therefore must
check and return error if configured incorrectly.

Virtio can distribute Rx packets across multi-queue, but there is
no controls (algorithm, redirection table, hash function) except
number of Rx queues and ETH_MQ_RX_NONE is the best fit meaning
no method is enforced on how to route packets to MQs.

Fixes: c1f86306a026 ("virtio: add new driver")
Cc: stable@dpdk.org

Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
Dilshod Urazov
c0e03310ee net/virtio: reject deferred Tx start
Deferred start Tx queue is not supported by the driver.

Fixes: 0748be2cf9a2 ("ethdev: queue start and stop")
Cc: stable@dpdk.org

Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
Dilshod Urazov
955d7c1f03 net/virtio: reject deferred Rx start
Deferred start Rx queue is not supported by the driver.

Fixes: 0748be2cf9a2 ("ethdev: queue start and stop")
Cc: stable@dpdk.org

Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
Ilya Maximets
70c7747689 vhost: disable host TSO for linear buffers without extbuf
If linear buffers requested and external buffers are not, vhost
will not be able to receive any buffer that doesn't fit in a
single mbuf.  Moreover, if such a buffer will appear in a vring
it will never be dequeued and the whole vring will become dead
breaking the network connection.

Disable segmentation offloading from the host side to avoid
having such a big buffers.

Fixes: c3ff0ac70acb ("vhost: improve performance by supporting large buffer")

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:06 +02:00
Ilya Maximets
19896c7393 vhost: return error message for mbuf allocation failure
mbuf allocation failure is a hard failure that highlights some
significant issues with memory pool size or a mbuf leak.

We still have the message for subsequent chained mbufs, but not
for the first one.  It was removed while introducing extbuf
support for large buffers.  But it was useful for catching
mempool issues and needs to be returned back.

Fixes: c3ff0ac70acb ("vhost: improve performance by supporting large buffer")

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2019-10-25 19:22:46 +02:00
Marvin Liu
f974ca7a29 vhost: optimize packed ring dequeue when in-order
When VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize dequeue
function by only update first used descriptor.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
31d6c6a5b8 vhost: optimize packed ring dequeue
Optimize vhost device packed ring dequeue function by splitting batch
and single functions. No-chained and direct descriptors will be handled
by batch and other will be handled by single as before.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
d1eafb5322 vhost: add packed ring zcopy batch and single dequeue
Add vhost packed ring zero copy batch and single dequeue functions like
normal dequeue path.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
0294211bb6 vhost: optimize packed ring enqueue
Optimize vhost device packed ring enqueue function by splitting batch
and single functions. Packets can be filled into one desc will be
handled by batch and others will be handled by single as before.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
c119edbc2d vhost: update packed ring dequeue
Buffer used ring updates as many as possible in vhost dequeue function
for coordinating with virtio driver. For supporting buffer, shadow used
ring element should contain descriptor's flags. First shadowed ring
index was recorded for calculating buffered number.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
f41516c309 vhost: flush batched enqueue descs directly
Flush used elements when batched enqueue function is finished.
Descriptor's flags are pre-calculated as they will be reset by vhost.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
33d4a554f9 vhost: flush enqueue updates by cacheline
Buffer vhost packed ring enqueue updates, flush ring descs if buffered
content filled up one cacheline. Thus virtio can receive packets at a
faster frequency.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00