Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency on
librte_net.
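For illustration, a minimal compatibility shim an application could keep while
migrating to the new names; this shim is hypothetical and not part of DPDK:

#include <rte_ether.h>

/* illustrative application-side shim, keeps the old names compiling */
#ifndef ETHER_ADDR_LEN
#define ETHER_ADDR_LEN RTE_ETHER_ADDR_LEN
#define ETHER_HDR_LEN  RTE_ETHER_HDR_LEN
#define ETHER_MAX_LEN  RTE_ETHER_MAX_LEN
#define ETHER_MTU      RTE_ETHER_MTU
#endif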
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency on
librte_net.
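As a sketch of the renamed structure in use (rte_pktmbuf_mtod() and
rte_be_to_cpu_16() are existing DPDK helpers; the function itself is
illustrative):

#include <rte_byteorder.h>
#include <rte_ether.h>
#include <rte_mbuf.h>

static inline uint16_t
peek_ether_type(struct rte_mbuf *m)
{
	const struct rte_ether_hdr *eth;

	/* read the Ethernet type through the renamed header structure */
	eth = rte_pktmbuf_mtod(m, const struct rte_ether_hdr *);
	return rte_be_to_cpu_16(eth->ether_type);
}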
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When eth_virtio_dev_init() is cleaning up, it does not correctly set
the mac_addrs variable to NULL, which will lead to a double free.
Found during unit-test fixes.
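A minimal sketch of the cleanup pattern assumed by the fix: clear the pointer
right after freeing it so that a later generic release of the port cannot free
it a second time.

	rte_free(eth_dev->data->mac_addrs);
	eth_dev->data->mac_addrs = NULL;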
Fixes: 43d18765c0 ("net/virtio: fix memory leak on failure")
Cc: stable@dpdk.org
Reported-by: Michael Santana <msantana@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The function rte_vlan_insert may allocate a new buffer for the
vlan header and return a different mbuf than the one originally passed.
In this case, the stored mbuf in the txm[] array could point to the
wrong buffer.
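A sketch of the intended usage pattern; txm[] and idx stand for the driver's
own bookkeeping and are illustrative here:

	struct rte_mbuf *m = txm[idx];

	/* rte_vlan_insert() may allocate a new mbuf and update the pointer */
	if (rte_vlan_insert(&m) == 0)
		txm[idx] = m;	/* refresh the stored pointer */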
Fixes: dd856dfcb9 ("virtio: use any layout on Tx")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed. The function changes in this patch were
auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place
and then the files edited using awk to add in the missing header:
gawk -i inplace '/include <rte_/ && ! seen { \
print "#include <rte_string_fns.h>"; seen=1} {print}'
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Since the previous test is for mtu < 1519, the next else if
is always true. This causes the lgtm static analysis tool to complain.
Not a real issue, just cosmetic.
Fixes: 76d4c652e0 ("virtio: add extended stats")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We are consistently passing 1 as the argument in the data path,
so there is no need to define avail/used flags as function-like
macros anymore. This patch changes the avail and used flags to
constants. And a frequently used combination is also introduced.
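A sketch of the resulting definitions; the bit positions follow the virtio 1.1
packed-ring layout, while the exact macro names used by the driver may differ:

	#define VRING_PACKED_DESC_F_AVAIL	(1 << 7)
	#define VRING_PACKED_DESC_F_USED	(1 << 15)
	/* the frequently used combination mentioned above */
	#define VRING_PACKED_DESC_F_AVAIL_USED \
		(VRING_PACKED_DESC_F_AVAIL | VRING_PACKED_DESC_F_USED)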
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the multi-process support for virtio-user.
Currently virtio-user just provides some limited secondary
process support. Only some basic operations can be done in
a secondary process on a virtio-user port, e.g. getting port stats.
Actions which trigger communication with the vhost backend
can't be done in a secondary process for now, as the fds are
not synced between processes. The processing of server mode
devargs is also moved into virtio_user_dev_init().
Fixes: cdb068f031 ("bus/vdev: scan by multi-process channel")
Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
"The macro name '_VHOST_NET_USER_H' of this include guard is used
in 2 different header files."
lib/librte_vhost/vhost_user.h has the same include guard.
Renamed the include guard in vhost.h to differentiate.
Fixes: 6a84c37e39 ("net/virtio-user: add vhost-user adapter layer")
Cc: stable@dpdk.org
Signed-off-by: Andrius Sirvys <andrius.sirvys@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This patch improves descriptor refill by using the same
batching strategy as done in the in-order and mergeable paths.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add a helper for sending commands in split ring to make the
code consistent with the corresponding code in packed ring.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add a helper for disabling interrupts in split ring to make the
code consistent with the corresponding code in packed ring.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Drop the unused field tx_indir_pq from virtio_tx_region
structure.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Drop redundant suffix (_packed and _event) from the fields in
packed ring structure.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Put split ring and packed ring specific fields into separate
sub-structures, and also union them as they won't be available
at the same time.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Cache the AVAIL, USED and WRITE bits to avoid calculating
them as much as possible. Note that, the WRITE bit isn't
cached for control queue.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Typically, after enabling Rx interrupt, a check should be done
to make sure that there are no new incoming packets before going
to sleep. So a barrier is needed to make sure that any following
check won't happen before the interrupt is actually enabled.
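A sketch of the intended ordering in the interrupt-enable callback; the field
names are simplified and the barrier choice is illustrative (the actual patch
may use a virtio-specific barrier):

	static int
	virtio_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
	{
		struct virtnet_rx *rxvq = dev->data->rx_queues[queue_id];

		virtqueue_enable_intr(rxvq->vq);
		/* make sure any later "packets pending?" check is not
		 * reordered before the interrupt is actually enabled */
		rte_mb();
		return 0;
	}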
Fixes: c056be239d ("net/virtio: add Rx interrupt enable/disable functions")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When disabling interrupt, the shadow event flags should also be
updated accordingly. The unnecessary wmb is also dropped.
Fixes: e9f4feb7e6 ("net/virtio: add packed virtqueue helpers")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The pointer to the event structure should be cast to uintptr_t first.
Fixes: f803734b0f ("net/virtio: vring init for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The multiqueue support in virtio-user with the vhost-kernel backend
is broken when the tap name isn't specified by users explicitly,
because the tap name returned by ioctl(TUNSETIFF) isn't saved
properly, and multiple tap interfaces will be created in this
case. Fix this by saving the dynamically allocated tap name
first before reusing the ifr structure. Besides, also make it
possible to support a format string in the tap name (e.g. foo%d)
specified by users explicitly.
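A sketch of the fix, assuming ifname is where the driver keeps the tap name;
TUNSETIFF fills ifr.ifr_name with the name actually chosen by the kernel
(e.g. "foo0" when "foo%d" was requested):

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
	strlcpy(ifr.ifr_name, requested_name, IFNAMSIZ);
	if (ioctl(tapfd, TUNSETIFF, &ifr) < 0)
		goto error;
	/* save the kernel-chosen name before ifr is reused for other ioctls */
	free(ifname);
	ifname = strdup(ifr.ifr_name);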
Fixes: 791b43e088 ("net/virtio-user: specify MAC of the tap")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Rename the macro to make things shorter and more comprehensible. For
both meson and make builds, keep the old macro around for backward
compatibility.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This patch introduces an optimized enqueue function in the packed
ring for the case where the virtio net header can be prepended to
the unchained mbuf.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch introduces a helper for clearing the virtio net header
to avoid code duplication. A macro is used as it shows slightly
better performance.
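A sketch of what such a helper macro could look like; the helper name and the
exact field handling are illustrative, the fields are those of struct
virtio_net_hdr from the virtio spec:

	#define virtqueue_clear_net_hdr(hdr) do {	\
		(hdr)->csum_start  = 0;			\
		(hdr)->csum_offset = 0;			\
		(hdr)->flags       = 0;			\
		(hdr)->gso_type    = 0;			\
		(hdr)->gso_size    = 0;			\
		(hdr)->hdr_len     = 0;			\
	} while (0)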
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When IN_ORDER feature is negotiated, device may just write out a
single used descriptor for a batch of buffers:
"""
Some devices always use descriptors in the same order in which they
have been made available. These devices can offer the VIRTIO_F_IN_ORDER
feature. If negotiated, this knowledge allows devices to notify the
use of a batch of buffers to the driver by only writing out a single
used descriptor with the Buffer ID corresponding to the last descriptor
in the batch.
The device then skips forward in the ring according to the size of the
batch. The driver needs to look up the used Buffer ID and calculate the
batch size to be able to advance to where the next used descriptor will
be written by the device.
"""
But the Tx path of packed ring can't handle this. With this patch,
when IN_ORDER is negotiated, driver will manage the IDs linearly,
look up the used buffer ID and advance to the next used descriptor
that will be written by the device.
Fixes: 892dc798fa ("net/virtio: implement Tx path for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When IN_ORDER feature is negotiated, device may just write out a
single used ring entry for a batch of buffers:
"""
Some devices always use descriptors in the same order in which they
have been made available. These devices can offer the VIRTIO_F_IN_ORDER
feature. If negotiated, this knowledge allows devices to notify the
use of a batch of buffers to the driver by only writing out a single
used ring entry with the id corresponding to the head entry of the
descriptor chain describing the last buffer in the batch.
The device then skips forward in the ring according to the size of
the batch. Accordingly, it increments the used idx by the size of
the batch.
The driver needs to look up the used id and calculate the batch size
to be able to advance to where the next used ring entry will be written
by the device.
"""
Currently, the in-order Tx path in split ring can't handle this.
With this patch, the driver will allocate desc_extra[] based on the
index in the avail/used ring instead of the index in the descriptor
table. And the driver can just rely on the used->idx written by the
device to reclaim the descriptors and Tx buffers.
Fixes: e5f456a98d ("net/virtio: support in-order Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We should try to clean up at least the 'need' number of descs.
Fixes: 892dc798fa ("net/virtio: implement Tx path for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This minor cleanup patch removes an unnecessary forward
declaration of virtio_intr_enable() in net/virtio PMD.
Fixes: fe19d49cb5 ("net/virtio: fix Rx interrupt with VFIO")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Port configuration fails because offload flags don't match the expected
value when max-pkt-len is set to a value that should enable receive port
offloading but doesn't.
The .dev_infos_get callback can be called before the configure callback.
At that time we don't know the maximum packet size yet because it is
only set up when ports are started. So in virtio_dev_info_get() just
always set the jumbo packet offload flag.
Check the maximum packet length at device configure time, because then we
have access to the max-pkt-len value provided by the user. If the
max-pkt-len exceeds the maximum MTU supported by the device we remove
the VIRTIO_NET_F_MTU flag from requested features.
Fixes: a4996bd89c ("ethdev: new Rx/Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A read barrier is required between reading the flags (desc_is_used)
and the content of the descriptor to ensure the ordering.
Otherwise, a speculative read of desc.id could be reordered with
the reading of desc.flags.
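A sketch of the required ordering when draining used descriptors (helper and
field names simplified):

	if (!desc_is_used(&ring[used_idx], vq))
		break;
	rte_smp_rmb();		/* order the flags check before reading id/len */
	id  = ring[used_idx].id;
	len = ring[used_idx].len;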
Fixes: a76290c8f1 ("net/virtio: implement Rx path for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
There should be a read barrier between checking VIRTQUEUE_NUSED (reading
the used->idx) and reading these descriptors. It's done for the first
checks at the beginning of these functions but missed while checking
for extra required descriptors.
Fixes: e5f456a98d ("net/virtio: support in-order Rx and Tx")
Fixes: 13ce5e7eb9 ("virtio: mergeable buffers")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A read barrier is required between reading the descriptor flags
and the descriptor id. Otherwise, in case of reordering, we could
read a wrong descriptor id.
For reference, the similar barrier for split rings is the read
barrier between VIRTQUEUE_NUSED (reading the used->idx) and
the call to virtio_xmit_cleanup().
Additionally, the double update of 'used_idx' is removed. It's enough
to set it at the end of the loop.
Fixes: 892dc798fa ("net/virtio: implement Tx path for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When updating the used ring, the id in the used element should be the
index of the first desc in the desc chain.
Fixes: f9b9d1a557 ("net/virtio-user: add multiple queues in device emulation")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Always use the virtio variants which support the platform
memory ordering.
Fixes: 9230ab8d79 ("net/virtio: support platform memory ordering")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the below issues in the packed ring based control
vq support in virtio-user:
1. The idx_hdr should be used_idx instead of the id in the desc;
2. We just need to write out a single used descriptor for each
descriptor list;
3. The avail/used bits should be initialized to 0;
Meanwhile, make the function name consistent with other parts.
Fixes: 48a4464029 ("net/virtio-user: support control VQ for packed")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch mainly fixes the below issues in the packed ring based
control vq support in the virtio driver:
1. When parsing the used descriptors, we have to track the
number of descs that we need to skip;
2. vq->vq_free_cnt was decreased twice for the same desc;
Meanwhile, make the function name consistent with other parts.
Fixes: ec194c2f18 ("net/virtio: support packed queue in send command")
Fixes: a4270ea4ff ("net/virtio: check head desc with correct wrap counter")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support to virtio-user for control virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In virtio_pq_send_command() we check for a used descriptor
and wait in an idle loop until it becomes used. We can't use
vq->used_wrap_counter here to check for the first descriptor
we made available because the ring could have wrapped. Let's use
the used_wrap_counter that matches the state of the head descriptor.
Fixes: ec194c2f18 ("net/virtio: support packed queue in send command")
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
VIRTIO_F_ORDER_PLATFORM is required to use proper memory barriers
in case of HW vhost implementations like vDPA.
DMA barriers (rte_cio_*) are sufficient for that purpose.
Previously known as VIRTIO_F_IO_BARRIER.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
We're not using IO ports in the case of a modern device, even on IA.
Also, this comment is useless for other architectures.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reading the used->flags could be reordered with the avail->idx update.
vhost in the kernel disables notifications while receiving packets,
like this:
1. disable notify
2. process packets
3. enable notify
4. has more packets ? goto 1
In case of reordering, the virtio driver could read the flags at
step 2 while notifications are disabled and update avail->idx after
step 4, i.e. vhost will exit the loop at step 4 with
notifications enabled, but virtio will not notify.
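A sketch of the fix on the virtio side (field names simplified; the barrier
flavor used by the actual patch may differ):

	vq->vq_ring.avail->idx = vq->vq_avail_idx;	/* publish new buffers */
	rte_smp_mb();					/* barrier added by the fix */
	if (!(vq->vq_ring.used->flags & VRING_USED_F_NO_NOTIFY))
		virtqueue_notify(vq);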
Fixes: c1f86306a0 ("virtio: add new driver")
Cc: stable@dpdk.org
Reported-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Add the RING_PACKED feature to dev->unsupported_features
when it's disabled, and add the missing packed vq param
string. And also revert the unexpected change to MAC option
introduced when adding packed vq option.
Fixes: 34f3966c7f ("net/virtio-user: add option to use packed queues")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch improves both descriptor dequeue and refill
by using the same batching strategy as done in the in-order path.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
This patch adds support for the in-order path when the mergeable buffers
feature hasn't been negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Accounting of bytes was moved to a common function, so at the moment we do
it twice. This patch fixes it for sending packets with packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Until we have support for control virtqueues, let's disable it and
fail device initialization if specified as a parameter.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add option to enable packed queue support for virtio-user
devices.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Use packed virtqueue format when reading and writing descriptors
to/from the ring.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This implements the transmit path for devices with
support for packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support to dump packed virtqueue data to the
VIRTQUEUE_DUMP() macro.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When a guest spans multiple NUMA nodes and
multiple Virtio devices are spread across these nodes,
we expect that their ring memory is allocated on the
right memory node.
Otherwise, vCPUs from node A may be polling Virtio rings
allocated on node B, which would increase QPI bandwidth
and impact performance.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
In case of running with not enough capabilities, i.e. running as a
non-root user, any application linked with DPDK prints a message
about the IOPL call failure even if it was just called like
'./testpmd --help'. For example, this breaks most of the OVS unit
tests if it is built with DPDK support.
Let's register the virtio driver unconditionally and print the error
message while probing the device. The silent iopl() call is left in the
constructor to obtain privileges as early as possible, as before.
Fixes: 565b85dcd9 ("eal: set iopl only when needed")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
.dev_uninit calls .dev_stop and .dev_close. The work that is done in
those routines doesn't need to be repeated. Use started and opened to
track the adapter's status.
Fixes: c1f86306a0 ("virtio: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We need to check the status field in virtio net config structure
instead of the bits read from ISR register to know whether we need
to do guest announce.
Fixes: 7365504f77 ("net/virtio: support guest announce")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Get rid of the duplicated code in device features preparation
which looks awful.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We need to save the supported frontend features (which won't be
announced by the vhost backend), otherwise we will lose them when the
connection to the vhost-user backend is established in server mode.
Fixes: 201a416517 ("net/virtio-user: fix multiple queues fail in server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When the driver resets the device, virtio-user just needs to send
GET_VRING_BASE messages to stop the vhost backend, and that's
what QEMU does. With this change, we won't need to set the owner
when starting the virtio-user device anymore. This will help us
get rid of the below error message on startup:
vhost_kernel_ioctl(): VHOST_SET_OWNER failed: Device or resource busy
Fixes: bce7e9050f ("net/virtio-user: fix start with kernel vhost")
Fixes: 0d6a8752ac ("net/virtio-user: fix crash as features change")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
There is no need to make the vhost-user channel non-blocking, and
making it non-blocking will make vhost_user_read() fail with EAGAIN
when vhost messages need a reply.
Fixes: bd8f50a45d ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Without this change, virtio-user still works, but it will show
annoying error messages like this on shutdown:
vhost_kernel_set_backend(): VHOST_NET_SET_BACKEND fails, Operation not permitted
vhost_kernel_ioctl(): VHOST_RESET_OWNER failed: Operation not permitted
Fixes: e3b434818b ("net/virtio-user: support kernel vhost")
Fixes: 12ecb2f63b ("net/virtio-user: support memory hotplug")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Drop the duplicated reset() method in virtio_pci_ops. Currently
vtpci_reset() is implemented on top of set_status() and get_status()
directly. The reset() method in virtio_pci_ops isn't used and
its implementation in the legacy device isn't right.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Register and unregister the virtio interrupt handler when the device is
started and stopped. This allows a virtio device to be hotplugged or
unplugged.
Fixes: c1f86306a0 ("virtio: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Brian Russell <brussell@brocade.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Some global variables are defined with generic names; add the component
name as a prefix to these variables to prevent collision with application
variables.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
In virtio_read_caps and vtpci_msix_detect, rte_pci_read_config returns
the number of bytes read from the PCI config or < 0 on error.
If fewer than the expected number of bytes are read, log the
failure and return rather than carrying on with garbage.
Fixes: 6ba1f63b5a ("virtio: support specification 1.0")
Cc: stable@dpdk.org
Signed-off-by: Brian Russell <brussell@brocade.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The hotplug attach/detach features are implemented in EAL layer.
There is a new ethdev iterator to retrieve ports from ethdev layer.
As announced earlier, the (buggy) ethdev functions are now removed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
This is a clean-up of common ethdev data freeing.
All data freeing is moved to rte_eth_dev_release_port()
and done only in the case of the primary process.
It is probably fixing some memory leaks for PMDs which were
not freeing all data.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
eal: add shorthand __rte_weak macro
qat: update code to use __rte_weak macro
avf: update code to use __rte_weak macro
fm10k: update code to use __rte_weak macro
i40e: update code to use __rte_weak macro
ixgbe: update code to use __rte_weak macro
mlx5: update code to use __rte_weak macro
virtio: update code to use __rte_weak macro
acl: update code to use __rte_weak macro
bpf: update code to use __rte_weak macro
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
With the enabling of hotplug on multi-process,
rte_eth_dev_pci_generic_remove can also be used to detach the device from
a secondary process. But we need to take care of the uninit callback
parameter to make sure it handles the secondary case correctly.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
When adding or removing external memory from the memory map, there
may be actions that need to be taken on account of this memory (e.g.
DMA mapping). Add support for triggering callbacks when adding,
removing, attaching or detaching external memory.
Some memory event callback handlers will need additional logic to
handle external memory regions. For example, virtio callback has to
completely ignore externally allocated memory, because there is no
way to find file descriptors backing the memory address in a
generic fashion. All other callbacks have also been adjusted to
handle RTE_BAD_IOVA as IOVA address, as this is one of the expected
use cases for external memory support.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.
This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguousness, so adjust malloc code in a few places.
All current calls for memseg walk functions were adjusted to
ignore external segments where it made sense.
Mempools is a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match socket ID with page size when
calculating minimum page size for a mempool.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The virtio features VIRTIO_NET_F_CSUM, VIRTIO_NET_F_HOST_TSO4
and VIRTIO_NET_F_HOST_TSO6 are supported by the virtio PMD.
But they are missing in the supported feature set. And since
below commit:
commit 4174a7b59d ("net/virtio: improve Tx offload features negotiation")
Virtio PMD will announce the Tx offloading capabilities based
on the features read from the device. And virtio-user won't
report the features which are not in virtio-PMD's supported
feature set. So since that commit, virtio-user won't announce
the DEV_TX_OFFLOAD_UDP_CKSUM, DEV_TX_OFFLOAD_TCP_CKSUM and
DEV_TX_OFFLOAD_TCP_TSO offloading capabilities even if the
vhost backend supports them.
This patch adds these missing features, and virtio-user will
report them if the backend supports them.
Fixes: 142678d429 ("net/virtio-user: fix wrongly get/set features")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The multiple queue support in vhost-kernel is broken because
the dev->vhostfd is only available for vhost-user. We should
always try to enable queue pairs when it's not in server mode.
Fixes: 201a416517 ("net/virtio-user: fix multiple queues fail in server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
It's possible to have many more hugepage-backed memory regions
than what vhost-kernel supports due to memory hotplug, which
may cause problems. A better solution is to have virtio-user
pass all the memory ranges reserved by DPDK to vhost-kernel.
Fixes: 12ecb2f63b ("net/virtio-user: support memory hotplug")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Recently some memory APIs were introduced to allow users to
get the file descriptor and offset for each memory segment.
We can leverage those APIs to get rid of the /proc magic on
memory table preparation in vhost-user backend.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Deadlock can occur when allocating memory if a vhost-kernel
based virtio-user device is in use. To fix the deadlock,
we will take memory hotplug lock explicitly in virtio-user
when necessary, and always call the _thread_unsafe memory
functions.
Bugzilla ID: 81
Fixes: 12ecb2f63b ("net/virtio-user: support memory hotplug")
Cc: stable@dpdk.org
Reported-by: Seán Harte <seanbh@gmail.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Tested-by: Seán Harte <seanbh@gmail.com>
Reviewed-by: Seán Harte <seanbh@gmail.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Removed the DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, the default behavior of PMDs is to
strip the CRC.
PMDs that support keeping the CRC should advertise the
DEV_RX_OFFLOAD_KEEP_CRC Rx offload capability.
Applications that require keeping the CRC should check the PMD capability
first and, if it is supported, can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in the Rx offload flags in rte_eth_dev_configure().
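A sketch of the application-side check described above, assuming port_id,
nb_rxq and nb_txq are already set up:

	struct rte_eth_dev_info dev_info;
	struct rte_eth_conf port_conf;

	memset(&port_conf, 0, sizeof(port_conf));
	rte_eth_dev_info_get(port_id, &dev_info);
	if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_KEEP_CRC)
		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_KEEP_CRC;
	rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &port_conf);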
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
This patch checks the negotiated features to see if offloading is needed
before setting the tap device offload capabilities. It also checks whether
the kernel supports the TUNSETOFFLOAD operation.
Fixes: 5e97e42025 ("net/virtio-user: enable offloading")
Cc: stable@dpdk.org
Signed-off-by: Eric Zhang <eric.zhang@windriver.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Add the missing param "server" to param string.
Also add the missing spaces after params.
Fixes: bd8f50a45d ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This macro isn't used anymore after the below commit:
Fixes: a4996bd89c ("ethdev: new Rx/Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
A constructor is usually declared with RTE_INIT* macros.
As it is a static function, no need to declare before its definition.
The macro is used directly in the function definition.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
In DPDK 17.11, the ethdev offloads API has changed:
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload
For reminder, the main concepts in the new API were:
- All offloads are disabled by default
- Distinction between per port and per queue offloads.
The transition bits are now removed:
- Translation of the old API in ethdev
- rte_eth_conf.rxmode.ignore_offload_bitfield
- ETH_TXQ_FLAGS_IGNORE
The old API bits are now removed:
- Rx per-port rte_eth_conf.rxmode.[bit-fields]
- Tx per-queue rte_eth_txconf.txq_flags
- ETH_TXQ_FLAGS_NO*
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Instead of checking the multiple Virtio feature bits for
every packet, let's do the check once at configure time and
store it in the virtio_hw struct.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This patch improves the Tx offload features selection depending
on whether the application requests offloads.
When the application doesn't request Tx offload features,
the corresponding feature bits aren't negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The simple Tx path does not comply with the Virtio specification.
Now that VIRTIO_F_IN_ORDER feature is supported by the Virtio PMD,
let's use this optimized path instead.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
After the IN_ORDER Rx/Tx paths were added, the Rx/Tx path selection
logic needs to be updated.
Rx path selection logic: if IN_ORDER and mergeable are enabled, the
IN_ORDER Rx path will be selected. If IN_ORDER is enabled while Rx
offload and mergeable are disabled, the simple Rx path will be
selected. Otherwise the normal Rx path will be selected.
Tx path selection logic: if IN_ORDER is enabled, the IN_ORDER Tx
path will be selected. Otherwise the default Tx path will be selected.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The IN_ORDER Rx function depends on the mergeable feature. Descriptor
allocation and freeing will be done in bulk.
Virtio dequeue logic:
dequeue_burst_rx(burst mbufs)
for (each mbuf b) {
if (b need merge) {
merge remained mbufs
add merged mbuf to return mbufs list
} else {
add mbuf to return mbufs list
}
}
if (last mbuf c need merge) {
dequeue_burst_rx(required mbufs)
merge last mbuf c
}
refill_avail_ring_bulk()
update_avail_ring()
return mbufs list
The IN_ORDER Tx function can support offloading features. Packets which
match the "can_push" option will be handled by the simple xmit function.
Packets which can't match "can_push" will be handled by the original xmit
function with the in-order flag.
Virtio enqueue logic:
xmit_cleanup(used descs)
for (each xmit mbuf b) {
if (b can inorder xmit) {
add mbuf b to inorder burst list
continue
} else {
xmit inorder burst list
xmit mbuf b by original function
}
}
if (inorder burst list not empty) {
xmit inorder burst list
}
update_avail_ring()
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The IN_ORDER virtio-user Tx function supports Tx checksum offloading and
TSO, which are also supported by the normal Tx function. So extract the
common part into a separate function for reuse.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add a new function for freeing IN_ORDER descriptors. As descriptors will
be allocated and freed sequentially when the IN_ORDER feature has been
negotiated, there is no need to use a chain for freed descriptor
management; an index update is enough.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add parameters for configuring the VIRTIO_NET_F_MRG_RXBUF and
VIRTIO_F_IN_ORDER feature bits. If a feature is disabled, also update the
corresponding unsupported feature bit.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch introduces an unsupported features mask for the virtio-user
device. For virtio-user server mode, when reconnecting, virtio-user will
retrieve the vhost device features as a base and then unmask unsupported
features.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If VIRTIO_F_IN_ORDER has been negotiated, the driver will use descriptors
in ring order: starting from offset 0 in the table, and wrapping around at
the end of the table. Also introduce the use_inorder_[rt]x flags for the
selection of the IN_ORDER [RT]x handlers.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The DEV_RX_OFFLOAD_KEEP_CRC offload flag is added. PMDs that support
keeping the CRC should advertise this offload capability.
The DEV_RX_OFFLOAD_CRC_STRIP flag will remain for one more release;
the default behavior in PMDs is to keep the CRC until this flag is removed.
Until the DEV_RX_OFFLOAD_CRC_STRIP flag is removed:
- Setting both KEEP_CRC & CRC_STRIP is INVALID
- Setting only CRC_STRIP: the PMD should strip the CRC
- Setting only KEEP_CRC: the PMD should keep the CRC
- Setting neither: the PMD should keep the CRC
A helper function rte_eth_dev_is_keep_crc() has been added to be able to
change the no-flag behavior with minimal changes in PMDs.
PMDs that don't report the DEV_RX_OFFLOAD_KEEP_CRC offload can
remove the rte_eth_dev_is_keep_crc() checks next release; the related code
is commented to help the maintenance task.
And DEV_RX_OFFLOAD_CRC_STRIP has been added to virtual drivers since
they don't use the CRC at all; when an application requires this offload,
virtual PMDs should not return an error.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
For virtio-user server mode, one use case hits a segmentation fault.
Step 1: launch the vhost side as client first.
Step 2: launch the virtio-user side as server.
The cause is: after registering virtio_interrupt_handler into the
eal-intr-thread, the two threads (main thread and eal-intr-thread) have
sync issues, so add rxvq pointer checking in the function
virtio_notify_peers to decide if the code can continue.
Fixes: bd8f50a45d ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In legacy-mem mode, memory event callback registering is not supported,
so we should not return an error in dev_init in this case.
Fixes: 12ecb2f63b ("net/virtio-user: support memory hotplug")
Suggested-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Currently VIRTIO_NET_F_MAC is set unconditionally when server
mode is used. It should be stripped when MAC isn't specified.
Fixes: bd8f50a45d ("net/virtio-user: support server mode")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When the backend is vhost-net, virtio-user must work in client mode and
needs to request features from the backend in virtio_user_dev_init().
But currently, virtio-user is assigned the default features in this case.
This patch fixes this inappropriate feature setting.
Fixes: bd8f50a45d ("net/virtio-user: support server mode")
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
This patch fixes the multiple queues failure when virtio-user works in
server mode.
It adds feature negotiation in the processing of the virtio-user
connection and enables multiple queue pairs.
Fixes: bd8f50a45d ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
A new hook function is added and called inside the PMDs at the end
of the device probing:
- in primary process, after allocating, init and config
- in secondary process, after attaching and local init
This new function is almost empty for now.
It will be used later to add some post-initialization processing.
For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.
Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
This patch checks whether an input requested offloading is valid or not.
Any requested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by the application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't been enabled in rte_eth_dev_configure(),
it can be enabled or disabled for an individual queue in
rte_eth_[rt]x_queue_setup().
A newly added offloading is one which hasn't been enabled in
rte_eth_dev_configure() and is requested to be enabled in
rte_eth_[rt]x_queue_setup(); it must be of per-queue type,
otherwise an error log is triggered.
The underlying PMD must be aware that the requested offloadings
passed to the PMD-specific queue_setup() function only carry those
newly added offloadings of per-queue type.
This patch makes the above checking in a common way in the rte_ethdev
layer to avoid the same checking in the underlying PMDs.
This patch assumes that all PMDs in 18.05-rc2 have already been
converted to the offload API defined in 17.11. It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().
In the beginning of [rt]x_queue_setup() of the underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep the same behavior as the
offload API defined in 17.11 and avoid breaking upper applications due to
the offload API change.
PMDs can use the fact that the input [rt]xconf->offloads only carry
the newly added per-queue offloads to do some optimization or some
code change on top of this patch.
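The compatibility line mentioned above, sketched for an Rx queue setup
function:

	/* in the PMD's rx_queue_setup(): behave as with the 17.11 offload API */
	uint64_t offloads = rx_conf->offloads |
			    dev->data->dev_conf.rxmode.offloads;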
Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
When memory is hot-added or hot-removed, the virtio-user driver has to
notify the vhost-user backend by sending a VHOST_USER_SET_MEM_TABLE
request with the new memory map as payload.
This patch implements and registers a mem_event callback, it pauses the
datapath and updates memory regions to vhost in case of hot-add or
hot-remove event. This memory region update has only to be done when the
device is already started, so a new status flag is added to the device to
keep track of the status.
As the device can now be managed by different threads, a mutex is
introduced to protect against concurrent device configuration.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
After the commit 2a04139f66 ("eal: add single file segments option"),
one hugepage file could contain multiple hugepages which are further
mapped to different memory regions.
The original enumeration implementation cannot handle this situation.
This patch filters out the duplicated files and adjusts the size after
the enumeration.
Fixes: 6a84c37e39 ("net/virtio-user: add vhost-user adapter layer")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If we want a virtio device to work in vDPA (vhost data path acceleration)
mode, we could add a "vdpa=1" devarg for this device to specify the mode.
This patch lets the virtio PMD skip device probe when this parameter is
detected.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a
return code is added to notify the caller (librte_ether) if an error
occurred in the PMD.
The new default MAC address is now copied in dev->data->mac_addrs[0]
only if the operation is successful.
The patch also updates all the PMDs accordingly.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In a container environment, if the vhost-user backend restarts, there's
no way for it to reconnect to virtio-user. To address this, support for
server mode is added. In this mode the socket file is created by virtio-
user, which the backend then connects to. This means that if the backend
restarts, it can reconnect to virtio-user and continue communications.
With the current implementation, LSC is enabled on the virtio-user side
to support accepting the incoming connection.
Server mode virtio-user only supports working with vhost-user.
The release note is updated in this patch.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
The public struct rte_eth_dev_info has a "struct rte_pci_device" field in it
although it is common for all ethdevs on all buses.
Replace the PCI-specific struct with the generic device struct and update
places that are using the PCI device so that they get this information from
the generic device.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
virtio-user port_id range should be increased from 8 bits to 16 bits.
Fixes: f8244c6399 ("ethdev: increase port id range")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.
In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit amounts of preallocated memory,
but can be used to place an upper limit on total amount of VA
memory that can be allocated by DPDK application.
So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
Pages in the list are also indexed by address. That is, in order
to figure out where the page belongs, one can simply look at base
address for a memseg list. Similarly, figuring out IOVA address
of a memzone is a matter of finding the right memseg list, getting
offset and dividing by page size to get the appropriate memseg.
This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.
On 32-bit targets due to limited VA space, DPDK will no longer
spread memory to different sockets like before. Instead, it will
(by default) allocate all of the memory on socket where master
lcore is. To override this behavior, --socket-mem must be used.
The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to earlier switch to _walk()
functions, most of the changes are simple fixes, however some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.
Additionally, we are also switching locks from flock() to fcntl().
Down the line, we will be introducing single-file segments option,
and we cannot use flock() locks to lock parts of the file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start legacy mem primary process
alongside an already working non-legacy mem-based primary process.
[1] http://dpdk.org/dev/patchwork/patch/34002/
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Add a check for cvq to judge if virtio_ack_link_announce should be called.
The existing code doesn't cause an issue; the check is added just to look
more reasonable.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
It is necessary to add pointer checking because in some cases the
code will crash. For example, the code can reach this point before
the memory allocation of rxvq is finished.
Fixes: 7365504f77 ("net/virtio: support guest announce")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When using virtio-user with vhost-kernel to exchange
packets with the kernel networking stack, the application can
set the MAC of the tap interface via a parameter.
Signed-off-by: Ning Li <muziding001@163.com>
Reviewed-by: Seán Harte <seanbh@gmail.com>
Tested-by: Seán Harte <seanbh@gmail.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Use new rte_eth_linkstatus_get/set helper functions to handle link
status update.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
intr_handle->fd was wrongly initialized as 0 (usually the stdin fd)
when virtio-user is used with vhost-kernel. So the interrupt thread
might wrongly treat stdin events as LSC interrupts.
Fixes: 3d4fb6fd25 ("net/virtio-user: support Rx interrupt")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
After resetting the owner in the below patch, we failed to set the owner
before sending further vhost messages. It is OK with the vhost-user
implementations in DPDK/VPP/Contrail, but an "Operation not permitted"
error is seen when used with vhost-kernel.
We fix this by setting the owner every time the device is started.
Fixes: 0d6a8752ac ("net/virtio-user: fix crash as features change")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Since commit efc83a1e7f ("net/virtio: fix queue setup consistency"),
when resuming a virtio port, the rx rings are refilled with new mbufs
until they are full (vq->vq_free_cnt == 0). This is done without
ensuring that the descriptor index remains a multiple of
RTE_VIRTIO_VPMD_RX_REARM_THRESH, which is a prerequisite when using the
vector mode. This can cause an out-of-bounds access in the rx ring.
This commit changes the vector refill method from
virtqueue_enqueue_recv_refill_simple() to virtio_rxq_rearm_vec(), which
properly checks that the refill is done by batch of
RTE_VIRTIO_VPMD_RX_REARM_THRESH.
As virtqueue_enqueue_recv_refill_simple() is no longer used, this
patch also removes the function.
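A sketch of the batched refill now used when resuming the port (names from
the existing simple Rx path):

	/* refill only by batches of RTE_VIRTIO_VPMD_RX_REARM_THRESH descriptors */
	while (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH)
		virtio_rxq_rearm_vec(rxvq);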
Fixes: efc83a1e7f ("net/virtio: fix queue setup consistency")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
The mbuf->data_off was was not properly set for the first received
mbufs. Fix this by setting it in virtqueue_enqueue_recv_refill_simple(),
which is used to enqueue the first mbuf in the ring.
The function virtio_rxq_rearm_vec(), which is used to rearm the ring
with new mbufs, is valid and does not need to be updated.
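A minimal sketch of the fix, assuming the standard mbuf headroom constant:

    /* Point the data offset past the headroom before exposing the
     * blank mbuf in the avail ring. */
    m->data_off = RTE_PKTMBUF_HEADROOM;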
Fixes: cab0461234 ("virtio: fill Rx avail ring with blank mbufs")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Report an error message if clearing the O_NONBLOCK flag fails,
then return from the function.
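A hedged sketch of the checked call (the log macro and fd variable are
illustrative):

    int flags = fcntl(fd, F_GETFL);

    if (flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0) {
        PMD_DRV_LOG(ERR, "error clearing O_NONBLOCK: %s", strerror(errno));
        return;
    }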
Coverity issue: 143439
Fixes: ef53b60300 ("net/virtio-user: support LSC")
Cc: stable@dpdk.org
Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
virtio_dev_free_mbufs was recently modified to free the
virtqueues but failed to check whether the array was
allocated. Added a check to ensure vqs is non-NULL.
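A minimal sketch of the added guard, assuming the array lives in hw->vqs:

    /* Nothing to free if the virtqueue array was never allocated. */
    if (hw->vqs == NULL)
        return;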
Fixes: bdb32afbb6 ("net/virtio: rationalize queue flushing")
Signed-off-by: David Harton <dharton@cisco.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This commit aligns the names for dynamic logging with
the newly defined logging format.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Use the same kind of loop as in virtio_free_queues() and factorize the
common code.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Free the previous queues and the attached mbufs before initializing new
ones.
The function virtio_dev_free_mbufs() is now called when reconfiguring the
device, so we also need to add a check to ensure that it won't crash for
uninitialized queues.
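A hedged sketch of the guard inside the free loop (the queue count and
field names are illustrative):

    for (i = 0; i < nr_vq; i++) {
        struct virtqueue *vq = hw->vqs[i];

        if (vq == NULL)        /* queue not set up yet: skip it */
            continue;
        /* ... detach and free the mbufs still attached to vq ... */
    }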
Cc: stable@dpdk.org
Fixes: 60e6f4707e ("net/virtio: reinitialize device when configuring")
Signed-off-by: Zijie Pan <zijie.pan@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
When using vector Rx mode (use_simple_rx = 1), vq->vq_descx[] is not
kept up to date. To properly detach the mbufs in this case, walk
sw_ring[] instead, as is done in virtqueue_rxvq_flush().
Since we need virtio_get_queue_type(), also move this function in
virtqueue.h as a static inline.
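A hedged sketch of the detach loop for the vector path (the bounds and
masking details are simplified):

    /* In simple/vector Rx mode the mbufs are tracked in sw_ring[],
     * not vq_descx[], so free them from there. */
    for (i = 0; i < vq->vq_nentries; i++) {
        m = vq->sw_ring[i];
        if (m != NULL)
            rte_pktmbuf_free(m);
    }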
Fixes: fc3d66212f ("virtio: add vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
On arm32, we were always selecting the simple handler, but it is only
available if neon is present.
This is due to a typo in the name of the config option.
CONFIG_RTE_ARCH_ARM is for Makefiles. One should use RTE_ARCH_ARM.
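For reference, how the distinction looks in C sources:

    #ifdef RTE_ARCH_ARM          /* correct: defined when compiling C code */
        /* select the NEON based Rx handler */
    #endif
    /* CONFIG_RTE_ARCH_ARM lives only in the Makefile/config namespace and
     * is never defined for the compiler, so the #ifdef was always false. */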
Fixes: 2d7c37194e ("net/virtio: add NEON based Rx handler")
Cc: stable@dpdk.org
Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Since commit 59fe5e17d9 ("vhost: propagate set features handling error"),
vhost does not allow setting different features without a reset.
The virtio-user driver failed to reset the device in the commit below.
To fix this, we send the reset message when stopping the device.
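A hedged sketch; the send_request hook mirrors the virtio-user backend
ops, but take the exact call as illustrative:

    /* Ask the backend to drop the negotiated state when stopping, so a
     * later start can renegotiate features cleanly. */
    dev->ops->send_request(dev, VHOST_USER_RESET_OWNER, NULL);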
Fixes: c12a26ee20 ("net/virtio-user: fix not properly reset device")
Cc: stable@dpdk.org
Reported-by: Lei Yao <lei.a.yao@intel.com>
Reported-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Add checks during build to ensure that all symbols in the EXPERIMENTAL
version map section have __experimental tags on their definitions, and
enable the warnings needed to announce their use. Also add an
ALLOW_EXPERIMENTAL_APIS define to allow individual libraries and files
to declare the acceptability of experimental API usage.
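A hedged sketch of how a single file can opt in (build systems typically
pass the define via CFLAGS instead):

    /* Declare that this file knowingly uses experimental APIs before
     * including any DPDK headers. */
    #define ALLOW_EXPERIMENTAL_APIS 1
    #include <rte_ethdev.h>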
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no change to the header content, and since ethdev.h is included
by ethdev_driver.h, nothing changes from the driver's point of view; this
is only a logical grouping of APIs. From the application's point of view,
driver-specific APIs can no longer be accessed, and they should not be.
More PMD-specific data structures still remain in ethdev.h because inline
functions in the header use them. Those will be handled separately.
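A minimal sketch of the change from a driver's point of view:

    /* before: #include <rte_ethdev.h> */
    #include <rte_ethdev_driver.h>  /* pulls in rte_ethdev.h plus PMD-only APIs */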
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
When live migration is done, either the virtio frontend or the vhost
backend of the backup VM needs to send out a gratuitous RARP packet
to announce its new network location.
This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support live
migration scenario where the vhost backend doesn't have the ability to
generate RARP packet.
A brief introduction to the workflow (a sketch of the handler follows the list):
1. QEMU finishes live migration, pokes the backup VM with an interrupt.
2. Virtio interrupt handler reads out the interrupt status value, and
realizes it needs to send out RARP packet to announce its location.
3. Pause device to stop worker thread touching the queues.
4. Inject a RARP packet into a Tx Queue.
5. Ack the interrupt via control queue.
6. Resume device to continue packet processing.
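A hedged sketch of the handler side of this flow; the status check and
the pause/inject/resume helper names are assumptions for illustration,
while rte_net_make_rarp_packet() is the librte_net helper used to build
the RARP frame:

    if (isr & VIRTIO_NET_S_ANNOUNCE) {               /* illustrative check */
        struct rte_mbuf *rarp;

        virtio_dev_pause(dev);                       /* step 3 */
        rarp = rte_net_make_rarp_packet(mbuf_pool,
                                        &dev->data->mac_addrs[0]);
        if (rarp != NULL)
            virtio_inject_pkts(dev, &rarp, 1);       /* step 4 */
        virtio_ack_link_announce(dev);               /* step 5 */
        virtio_dev_resume(dev);                      /* step 6 */
    }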
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
This patch adds the dev_pause, dev_resume and inject_pkts APIs to allow
the driver to pause the worker threads and inject special packets into
a Tx queue. The next patch will be based on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The virtio_send_command function may be called from the application's
configuration routine, but also from an interrupt handler invoked when
live migration completes on the backup side. So this patch first makes
the control queue thread-safe.
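A hedged sketch of the serialization; the lock placement and field name
are illustrative:

    rte_spinlock_lock(&cvq->lock);
    ret = virtio_send_command(cvq, &ctrl, dlen, pkt_num);
    rte_spinlock_unlock(&cvq->lock);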
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The max_mtu is kept at zero when there is no CTRL channel, which leads
to a failure when calling virtio_mtu_set().
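A hedged sketch of one way to avoid the zero value; the fallback
expression is illustrative:

    /* Without a control channel there is no negotiated MTU, so derive a
     * sane upper bound instead of leaving max_mtu at zero. */
    if (hw->max_mtu == 0)
        hw->max_mtu = VIRTIO_MAX_RX_PKTLEN - ETHER_HDR_LEN - hw->vtnet_hdr_size;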
Signed-off-by: Zhike Wang <wangzhike@jd.com>
Acked-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
The pointer to the user parameter of the callback registration is
automatically passed to the callback function.
There is no point in allowing a caller to change this user parameter.
That's why this parameter is always set to NULL by PMDs and set only
in the ethdev layer before calling the callback function.
The history is that the user parameter was initially used
by the callback implementation to pass some information
between the application and the driver:
c1ceaf3ad0 ("ethdev: add an argument to internal callback function")
Then a new parameter has been added to leave the user parameter
to its standard usage of context given at registration:
d6af1a13d7 ("ethdev: add return values to callback process API")
The NULL parameter in the internal callback processing function
is now removed. It makes clear that the callback parameter is user
managed and opaque from a DPDK point of view.
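For reference, a minimal sketch of the registration and the matching
callback signature; the callback body and the context pointer are
illustrative:

    #include <rte_ethdev.h>

    static int
    lsc_event_cb(uint16_t port_id, enum rte_eth_event_type type,
                 void *cb_arg, void *ret_param)
    {
        /* cb_arg is exactly the pointer given at registration time;
         * ret_param is used only by events that return a value. */
        RTE_SET_USED(port_id);
        RTE_SET_USED(type);
        RTE_SET_USED(ret_param);
        RTE_SET_USED(cb_arg);
        return 0;
    }

    /* called from init code: the last argument is the user-managed,
     * opaque context delivered back as cb_arg */
    static void
    register_lsc_cb(uint16_t port_id, void *my_ctx)
    {
        rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC,
                                      lsc_event_cb, my_ctx);
    }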
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>