Add option to enable packed queue support for virtio-user
devices.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Use packed virtqueue format when reading and writing descriptors
to/from the ring.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This implements the transmit path for devices with
support for packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support to dump packed virtqueue data to the
VIRTQUEUE_DUMP() macro.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When a guest is spanned on multiple NUMA nodes and
multiple Virtio devices are spanned onto these nodes,
we expect that their ring memory is allocated in the
right memory node.
Otherwise, vCPUs from node A may be polling Virtio rings
allocated on node B, which would increase QPI bandwidth
and impact performance.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
In case of running with not enough capabilities, i.e. running as
non-root user any application linked with DPDK prints the message
about IOPL call failure even if it was just called like
'./testpmd --help'. For example, this breaks most of the OVS unit
tests if it built with DPDK support.
Let's register the virtio driver unconditionally and print error
message while probing the device. Silent iopl() call left in the
constructor to have privileges as early as possible as it was before.
Fixes: 565b85dcd9f4 ("eal: set iopl only when needed")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
.dev_uninit calls .dev_stop and .dev_close. The work that is done in
those routines doesn't need repeated. Use started and opened to track
the adapter's status.
Fixes: c1f86306a026 ("virtio: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We need to check the status field in virtio net config structure
instead of the bits read from ISR register to know whether we need
to do guest announce.
Fixes: 7365504f77e3 ("net/virtio: support guest announce")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Get rid of the duplicated code in device features preparation
which looks awful.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We need to save the supported frontend features (which won't be
announced by vhost backend), otherwise we will lost them when the
connection to vhost-user backend is established in server mode.
Fixes: 201a41651715 ("net/virtio-user: fix multiple queues fail in server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When driver resets the device, virtio-user just needs to send
GET_VRING_BASE messages to stop the vhost backend, and that's
what QEMU does. With this change, we won't need to set owner
when starting virtio-user device anymore. This will help us to
get rid of below error message on startup:
vhost_kernel_ioctl(): VHOST_SET_OWNER failed: Device or resource busy
Fixes: bce7e9050f9b ("net/virtio-user: fix start with kernel vhost")
Fixes: 0d6a8752ac9d ("net/virtio-user: fix crash as features change")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
There is no need to make the vhost user channel nonblock, and
making it nonblock will make vhost_user_read() fail with EAGAIN
when vhost messages need a reply.
Fixes: bd8f50a45d0f ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Without this change, virtio-user still works, but it will show
annoying error messages like this on shutdown:
vhost_kernel_set_backend(): VHOST_NET_SET_BACKEND fails, Operation not permitted
vhost_kernel_ioctl(): VHOST_RESET_OWNER failed: Operation not permitted
Fixes: e3b434818bbb ("net/virtio-user: support kernel vhost")
Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Drop the duplicated reset() method in virtio_pci_ops. Currently
vtpci_reset() is implemented on set_status() and get_status()
directly. The reset() method in virtio_pci_ops isn't used and
its implementation in the legacy device isn't right.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Register and unregister the virtio interrupt handler when the device is
started and stopped. This allows a virtio device to be hotplugged or
unplugged.
Fixes: c1f86306a026 ("virtio: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Brian Russell <brussell@brocade.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Some global variables are defined with generic names, add component name
as prefix to variables to prevent collusion with application variables.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
In virtio_read_caps and vtpci_msix_detect, rte_pci_read_config returns
the number of bytes read from PCI config or < 0 on error.
If less than the expected number of bytes are read then log the
failure and return rather than carrying on with garbage.
Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")
Cc: stable@dpdk.org
Signed-off-by: Brian Russell <brussell@brocade.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The hotplug attach/detach features are implemented in EAL layer.
There is a new ethdev iterator to retrieve ports from ethdev layer.
As announced earlier, the (buggy) ethdev functions are now removed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
This is a clean-up of common ethdev data freeing.
All data freeing are moved to rte_eth_dev_release_port()
and done only in case of primary process.
It is probably fixing some memory leaks for PMDs which were
not freeing all data.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
eal: add shorthand __rte_weak macro
qat: update code to use __rte_weak macro
avf: update code to use __rte_weak macro
fm10k: update code to use __rte_weak macro
i40e: update code to use __rte_weak macro
ixgbe: update code to use __rte_weak macro
mlx5: update code to use __rte_weak macro
virtio: update code to use __rte_weak macro
acl: update code to use __rte_weak macro
bpf: update code to use __rte_weak macro
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
With the enabling for hotplug on multi-process,
rte_eth_dev_pci_generic_remove can be used to detach the device from
a secondary process also. But we need to take care of the uninit callback
parameter to make sure it handles the secondary case correctly.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
When adding or removing external memory from the memory map, there
may be actions that need to be taken on account of this memory (e.g.
DMA mapping). Add support for triggering callbacks when adding,
removing, attaching or detaching external memory.
Some memory event callback handlers will need additional logic to
handle external memory regions. For example, virtio callback has to
completely ignore externally allocated memory, because there is no
way to find file descriptors backing the memory address in a
generic fashion. All other callbacks have also been adjusted to
handle RTE_BAD_IOVA as IOVA address, as this is one of the expected
use cases for external memory support.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.
This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguousness, so adjust malloc code in a few places.
All current calls for memseg walk functions were adjusted to
ignore external segments where it made sense.
Mempools is a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match socket ID with page size when
calculating minimum page size for a mempool.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The virtio features VIRTIO_NET_F_CSUM, VIRTIO_NET_F_HOST_TSO4
and VIRTIO_NET_F_HOST_TSO6 are supported by the virtio PMD.
But they are missing in the supported feature set. And since
below commit:
commit 4174a7b59d05 ("net/virtio: improve Tx offload features negotiation")
Virtio PMD will announce the Tx offloading capabilities based
on the features read from the device. And virtio-user won't
report the features which are not in virtio-PMD's supported
feature set. So since that commit, virtio-user won't announce
the DEV_TX_OFFLOAD_UDP_CKSUM, DEV_TX_OFFLOAD_TCP_CKSUM and
DEV_TX_OFFLOAD_TCP_TSO offloading capabilities even if the
vhost backend supports them.
This patch adds these missing features, and virtio-user will
report them if the backend supports them.
Fixes: 142678d42959 ("net/virtio-user: fix wrongly get/set features")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The multiple queue support in vhost-kernel is broken because
the dev->vhostfd is only available for vhost-user. We should
always try to enable queue pairs when it's not in server mode.
Fixes: 201a41651715 ("net/virtio-user: fix multiple queues fail in server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
It's possible to have much more hugepage backed memory regions
than what vhost-kernel supports due to the memory hotplug, which
may cause problems. A better solution is to have the virtio-user
pass all the memory ranges reserved by DPDK to vhost-kernel.
Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Recently some memory APIs were introduced to allow users to
get the file descriptor and offset for each memory segment.
We can leverage those APIs to get rid of the /proc magic on
memory table preparation in vhost-user backend.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Deadlock can occur when allocating memory if a vhost-kernel
based virtio-user device is in use. To fix the deadlock,
we will take memory hotplug lock explicitly in virtio-user
when necessary, and always call the _thread_unsafe memory
functions.
Bugzilla ID: 81
Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
Cc: stable@dpdk.org
Reported-by: Seán Harte <seanbh@gmail.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Tested-by: Seán Harte <seanbh@gmail.com>
Reviewed-by: Seán Harte <seanbh@gmail.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Removed DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, default behavior by PMDs is to
strip CRC.
PMDs that support keeping CRC should advertise DEV_RX_OFFLOAD_KEEP_CRC
Rx offload capability.
Applications that require keeping CRC should check PMD capability first
and if it is supported can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in Rx offload flag in rte_eth_dev_configure()
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
This patch checks negotiated features to see if necessary to offload
before set the tap device offload capabilities. It also checks if kernel
support the TUNSETOFFLOAD operation.
Fixes: 5e97e4202563 ("net/virtio-user: enable offloading")
Cc: stable@dpdk.org
Signed-off-by: Eric Zhang <eric.zhang@windriver.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Add the missing param "server" to param string.
Also add the missing spaces after params.
Fixes: bd8f50a45d0f ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This macro isn't used any more after below commit:
Fixes: a4996bd89c42 ("ethdev: new Rx/Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
A constructor is usually declared with RTE_INIT* macros.
As it is a static function, no need to declare before its definition.
The macro is used directly in the function definition.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
In DPDK 17.11, the ethdev offloads API has changed:
commit cba7f53b717d ("ethdev: introduce Tx queue offloads API")
commit ce17eddefc20 ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload
For reminder, the main concepts in the new API were:
- All offloads are disabled by default
- Distinction between per port and per queue offloads.
The transition bits are now removed:
- Translation of the old API in ethdev
- rte_eth_conf.rxmode.ignore_offload_bitfield
- ETH_TXQ_FLAGS_IGNORE
The old API bits are now removed:
- Rx per-port rte_eth_conf.rxmode.[bit-fields]
- Tx per-queue rte_eth_txconf.txq_flags
- ETH_TXQ_FLAGS_NO*
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Instead of checking the multiple Virtio features bits for
every packet, let's do the check once at configure time and
store it in virtio_hw struct.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This patch improves the Tx offload features selection depending
on whether the application request for offloads.
When the application doesn't request for Tx offload features,
the corresponding features bits aren't negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The simple Tx path does not comply with the Virtio specification.
Now that VIRTIO_F_IN_ORDER feature is supported by the Virtio PMD,
let's use this optimized path instead.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
After IN_ORDER Rx/Tx paths added, need to update Rx/Tx path selection
logic.
Rx path select logic: If IN_ORDER and merge-able are enabled will select
IN_ORDER Rx path. If IN_ORDER is enabled, Rx offload and merge-able are
disabled will select simple Rx path. Otherwise will select normal Rx
path.
Tx path select logic: If IN_ORDER is enabled will select IN_ORDER Tx
path. Otherwise will select default Tx path.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
IN_ORDER Rx function depends on merge-able feature. Descriptors
allocation and free will be done in bulk.
Virtio dequeue logic:
dequeue_burst_rx(burst mbufs)
for (each mbuf b) {
if (b need merge) {
merge remained mbufs
add merged mbuf to return mbufs list
} else {
add mbuf to return mbufs list
}
}
if (last mbuf c need merge) {
dequeue_burst_rx(required mbufs)
merge last mbuf c
}
refill_avail_ring_bulk()
update_avail_ring()
return mbufs list
IN_ORDER Tx function can support offloading features. Packets which
matched "can_push" option will be handled by simple xmit function. Those
packets can't match "can_push" will be handled by original xmit function
with in-order flag.
Virtio enqueue logic:
xmit_cleanup(used descs)
for (each xmit mbuf b) {
if (b can inorder xmit) {
add mbuf b to inorder burst list
continue
} else {
xmit inorder burst list
xmit mbuf b by original function
}
}
if (inorder burst list not empty) {
xmit inorder burst list
}
update_avail_ring()
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
IN_ORDER virtio-user Tx function support Tx checksum offloading and
TSO which also support on normal Tx function. So extracts common part
into separated function for reuse.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>