In some cases when using the vHost PMD, certain vHost library functions
may still need to be accessed. One such example is the
rte_vhost_get_queue_num function which returns the number of virtqueues
reported by the guest - information which is not exposed by the PMD.
This commit introduces a new rte_eth_vhost function that returns the
'vid' associated with a given port id. This allows the PMD user to call
vHost library functions which require the 'vid' value.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Added neon based Rx vector implementation.
Selection of the new handler based neon availability at runtime.
Updated the release notes and MAINTAINERS file.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Introduced cpuflag based run-time detection to select the
SSE based simple Rx handler
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Split out SSE instruction based virtio simple Rx
implementation to a separate file
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Removed unnecessary compile time dependency on "use_simple_rxtx".
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Interestingly, clang and gcc has different prototype for _mm_prefetch().
For gcc, we have
_mm_prefetch (const void *__P, enum _mm_hint __I)
While for clang, it's
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
That's how the following error comes with clang:
error: cast from 'const void *' to 'void *' drops const qualifier
[-Werror,-Wcast-qual]
_mm_prefetch((const void *)rused, _MM_HINT_T0);
/usr/lib/llvm-3.8/bin/../lib/clang/3.8.0/include/xmmintrin.h:684:58:
note: expanded from macro '_mm_prefetch'
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a),
0, (sel)))
What's weird is that the build was actaully Okay before. I met it while
apply Jerin's vector support for ARM patch set: he just move this piece
of code to another file, nothing else changed.
This patch fix the issue when Jerin's patchset is applied. Thus, I think
it's still needed.
Similarly, make the same change to other _mm_prefetch users, just in case
this weird issue shows up again somehow later.
Fixes: fc3d66212f ("virtio: add vector Rx")
Fixes: c95584dc2b ("ixgbe: new vectorized functions for Rx/Tx")
Fixes: 9ed94e5bb0 ("i40e: add vector Rx")
Fixes: 7092be8437 ("fm10k: add vector Rx")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Currently, when virtio_user device fails to be started (e.g., vhost
unix socket does not exit), the init function does not return struct
rte_eth_dev (and some other structs) back to ether layer. And what's
more, it does not report the error to upper layer.
The fix is to free those structs and report error when failing to
start virtio_user devices.
Fixes: ce2eabdd43 ("net/virtio-user: add virtual device")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
When virtio_user is used with VPP's native vhost user, it cannot
send/receive any packets.
The root cause is that vpp-vhost-user translates the message
VHOST_USER_SET_FEATURES as puting this device into init state,
aka, zero all related structures. However, previous code
puts this message at last in the whole initialization process,
which leads to all previous information are zeroed.
To fix this issue, we rearrange the sequence of those messages.
- step 0, send VHOST_USER_SET_VRING_CALL so that vhost allocates
virtqueue structures;
- step 1, send VHOST_USER_SET_FEATURES to confirm the features;
- step 2, send VHOST_USER_SET_MEM_TABLE to share mem regions;
- step 3, send VHOST_USER_SET_VRING_NUM, VHOST_USER_SET_VRING_BASE,
VHOST_USER_SET_VRING_ADDR, VHOST_USER_SET_VRING_KICK for each
queue;
- ...
Fixes: 37a7eb2ae8 ("net/virtio-user: add device emulation layer")
Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
When virtio_user is used with OVS-DPDK (with mq disabled), it cannot
receive any packets. This is because no queue is enabled at all when
mq is disabled.
To fix it, we should consistently make sure the 1st queue is enabled,
which is also the behaviour QEMU takes.
Fixes: 37a7eb2ae8 ("net/virtio-user: add device emulation layer")
Reported-by: Ning Li <lining18@jd.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Indirect descriptors are usually supported by virtio-net devices,
allowing to dispatch a larger number of requests.
When the virtio device sends a packet using indirect descriptors,
only one slot is used in the ring, even for large packets.
The main effect is to improve the 0% packet loss benchmark.
A PVP benchmark using Moongen (64 bytes) on the TE, and testpmd
(fwd io for host, macswap for VM) on DUT shows a +50% gain for
zero loss.
On the downside, micro-benchmark using testpmd txonly in VM and
rxonly on host shows a loss between 1 and 4%. But depending on
the needs, feature can be disabled at VM boot time by passing
indirect_desc=off argument to vhost-user device in Qemu.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The dpdk-devbind.py script does not find/display the ifname for virtio
interfaces since the "net" directory is not directly under the device
directory but rather under a subdirectory.
eg.
> dpdk-devbind.py --status
0000:00:03.0 'Virtio network device' if= drv=virtio-pci unused=
This change searches for the first "net" directory under the device
directory hierarchy.
eg.
0000:00:03.0 'Virtio network device' if=ens3 drv=virtio-pci unused=
Fixes: 629395b063 ("igb_uio: remove PCI id table")
Signed-off-by: Gary Mussar <gmussar@ciena.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
We have a stats named "size_1024_1517_packets", while the code
actually counts the range "[1024, 1518]", which is obviously wrong.
The code is as follows in the function virtio_update_packet_stats.
else if (s < 1519)
stats->size_bins[6]++;
We could either fix it by correcting the "if" check in the code,
or fix it by just renaming the stats to conform to the code. The
latter solution is taken because that's what the RFC2819 suggests.
Fixes: 76d4c652e0 ("virtio: add extended stats")
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Virtio indirect descriptors are supported by the data-path
but the feature bit is never set during feature negociation.
This patch simply adds VIRTIO_RING_F_INDIRECT_DESC back to
the supported features bit mask, hence enabling the use of
indirect descriptors when the feature is negociated with the
device.
Signed-off-by: Pierre Pfister <ppfister@cisco.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
As new_device and destroy_device use an int instead of a
"struct virtio_net *", The comment about setting VIRTIO_DEV_RUNNING
doesn't make sense anymore, plus If I've correctly understand the
code, the drivers take care of setting the flag before calling the
callbacks, so I guess that this comment is obsolet and I've remove it.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
No need to use a pointer to store/retrieve features.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Invoke get_device() at the beginning of vhost_user_msg_handler, so that
we could check the return value once. Which could save tons of duplicate
get-and-check device.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Some functions are with prefix "user_", while others with "vhost_".
Making them all starting with "vhost_user_" to unify the function names.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Due to history reason (that we have 2 vhost implementations), some
messages are handled in two calls: vhost specific implementation
handles it first and then invoke the common one to do another handling.
We have one implementation only now, we could write one method for
each message. Here fold those common handles to corresponding vhost
user handler.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The code structure is a bit messy now. For example, vhost-user message
handling is spread to three different files:
vhost-net-user.c virtio-net.c virtio-net-user.c
Where, vhost-net-user.c is the entrance to handle all those messages
and then invoke the right method for a specific message. Some of them
are stored at virtio-net.c, while others are stored at virtio-net-user.c.
The truth is all of them should be in one file, vhost_user.c.
So this patch refactors the source code structure: mainly on renaming
files and moving code from one file to another file that is more suitable
for storing it. Thus, no functional changes are made.
After the refactor, the code structure becomes to:
- socket.c handles all vhost-user socket file related stuff, such
as, socket file creation for server mode, reconnection
for client mode.
- vhost.c mainly on stuff like vhost device creation/destroy/reset.
Most of the vhost API implementation are there, too.
- vhost_user.c all stuff about vhost-user messages handling goes there.
- virtio_net.c all stuff about virtio-net should go there. It has virtio
net Rx/Tx implementation only so far: it's just a rename
from vhost_rxtx.c
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We now have one vhost implementation; no sub source dir is needed.
Remove it by move them to upper dir.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
remove vhost-cuse code, including the eventfd_link kernel module that
is for vhost-cuse only.
The lib/virt/qemu-wrap.py is also removed, as it's mainly for vhost-cuse
usage.
As we have one vhost implementation now, one vhost config option is
needed only. Thus, CONFIG_RTE_LIBRTE_VHOST_USER is removed.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When VMDQ is enabled, different NICs have different behaviors for
disabling VLAN strip. In detail, i40e only enables/disables it of
PF's main vsi; fm10k cannot disable VLAN strip, etc. We now remove
this option, --vlan-strip, to reduce any confusion. And now, VLAN
strip will be enabled and cannot be disabled.
Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
When examples/vhost runs in client mode, only one QEMU can be connected.
This is because that examples/vhost just supports one socket file. This
patch is to add multiple sockets support for examples/vhost.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
In examples/vhost, "dev-basename" is a program option, which is to set
the vhost-net socket used by vhost-user, or the character device used
by vhost-cuse. Since vhost-cuse should be dropped, and "dev-basename"
is not a suitable name for the vhost-net socket. Therefore, this patch
is to change this option name for examples/vhost.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Using dpdk-pmdinfo with the '-r' flag does not produce a json output as
documented. Instead, the python representation of the json object is
shown, which is nearly the same, but cannot be properly parsed by a json
parser.
python repr (before):
{u'pci_ids': [[5549, 1968, 65535, 65535]], u'name': u'vmxnet3'}
json (after):
{"pci_ids": [[5549, 1968, 65535, 65535]], "name": "vmxnet3"}
Fixes: c67c9a5c64 ("tools: query binaries for HW and other support information")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Compile error:
CC mlx5.o.pmd.o
mlx5.o.pmd.c:1:227:
error: no newline at end of file [-Werror,-Wnewline-eof]
...__attribute__((used)) = "PMD_INFO_STRING= {...}";
^
Produced with clang 3.8.0 and MLX5_PMD and MLX5_DEBUG
config options enabled.
Fixes: 98b0fdb0ff ("pmdinfogen: add buildtools and pmdinfogen utility")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
In function rte_hash_cuckoo_insert_mw_tm, while looking for
an empty slot, only the first entry in the bucket was being checked,
as key_idx array was not being iterated.
Fixes: 5fc74c2e14 ("hash: check if slot is empty with key index")
Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The "imissed" stats represent RX packets dropped by the HW,
so we should not talk about mbufs as the hardware is not aware
of this structure. Buffer seems to be a better word.
Fixes: 4eadb8ba11 ("ethdev: do not deprecate imissed counter")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Fixes: 85226f9c52 ("mempool: introduce a function to create an empty pool")
Fixes: d1d914ebbc ("mempool: allocate in several memory chunks by default")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Existing cntvct_el0 based rte_rdtsc() provides portable
means to get wall clock counter at user space. Typically
it runs at <= 100MHz.
The alternative method to enable rte_rdtsc() for high resolution
wall clock counter is through armv8 PMU subsystem.
The PMU cycle counter runs at CPU frequency, However,
access to PMU cycle counter from user space is not enabled
by default in the arm64 linux kernel.
It is possible to enable cycle counter at user space access
by configuring the PMU from the privileged mode (kernel space).
by default rte_rdtsc() implementation uses portable
cntvct_el0 scheme. Application can choose the PMU based
implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Fixes: dbe6b4b61b ("pci: probe or close device")
Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Now that rte_device is available, drivers can start using its members
(numa, name) as well as link themselves into another rte_device list.
As of now no one is using this list, but can be used for moving over all
devices (pdev/vdev/Xdev) and perform bulk actions (like cleanup).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log for extra rte_device list]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
To register both vdev and pci drivers into the list of all rte_driver,
we have to call rte_eal_driver_register explicitly.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Remove the 'name' member from rte_pci_driver and move to generic
rte_driver.
Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver has been removed.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
There is no need to have a custom memory resource representation for
each infrastructure (PCI, ...) as it would always have the same members.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Further refactoring and generalization of PCI infrastructure will
require access to the rte_dev.h contents.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
- All devices register themselfs by calling a kind of DRIVER_REGISTER_XXX.
The PMD_REGISTER_DRIVER is not used anymore.
- PMD_VDEV type is also not being used - can be removed from all VDEVs.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
All PMD_VDEV drivers can now use rte_vdev_driver instead of the
rte_driver (which is embedded in the rte_vdev_driver).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
- Remove checks for VDEV from rte_eal_vdev_(init/uninint) as all devices
are inherently virtual here.
- PDEVs perform PCI specific inits - rte_eal_dev_init() need not call
rte_driver->init();
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Move all PMD_VDEV-specific code into a separate module and header
file to not polute the generic code anymore. There is now a list
of virtual devices available.
The rte_vdev_driver integrates the original rte_driver inside
(C inheritance). The rte_driver will be however change in the
future to serve as a common base for all other types of drivers.
The existing PMDs (PMD_VDEV) are to be modified later (there is
no change for them at the moment).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Now that hotplug has been moved to eal, there is no reason to keep the
device type in this layer.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Remove bus logic from ethdev hotplug by using eal for this.
Current api is preserved:
- the last port that has been created is tracked to return it to the
application when attaching,
- the internal device name is reused when detaching.
We can not get rid of ethdev hotplug yet since we still need some
mechanism to inform applications of port creation/removal to substitute
for ethdev hotplug api.
dev_type field in struct rte_eth_dev and rte_eth_dev_allocate are kept as
is, but this information is not needed anymore and is removed in the
following commit.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Hotplug invocations, which deals with devices, should come from the layer
that already handles them, i.e. EAL.
For both attach and detach operations, 'name' is used to select the bus
that will handle the request.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
No need to scan all devices, we only need to update the device being
attached.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
- Move rte_eth_dev_create_unique_device_name() from ether/rte_ethdev.c to
common/include/rte_pci.h as rte_eal_pci_device_name(). Being a common
method, can be used across crypto/net PCI PMDs.
- Remove crypto specific routine and fallback to common name function.
- Introduce a eal private Update function for PCI device naming.
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Shreyansh: Merge crypto/pci helper patches]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.
Exceptions:
- virtio and mlx* use RTE_INIT directly as they have custom initialization
steps.
- VDEV devices are not modified - they continue to use PMD_REGISTER_DRIVER.
Update documentation for replacing an example referring to
PMD_REGISTER_DRIVER.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
crypto and ethdev drivers aligned to PCI probe/remove. These wrappers are
mapped directly to PCI resources.
Existing handlers for init/uninit can be easily reused for this.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Introduce a RTE_INIT macro used to mark an init function as a constructor.
Current eal macros have been converted to use this (no functional impact).
DRIVER_REGISTER_PCI is added as a helper for pci drivers.
Suggested-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Shreyansh: Update PCI Registration macro name]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>