Those counters provide information regards sent/received bytes and
packets per queue.
Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
When wrong req_id is detected some previous mbufs could be used for
receiving different segments of received packets. In such cases chained
mbufs will be twice returned to pool.
To prevent it chained mbuf is now freed just after error detection.
To simplify cleaning, pointers taken for Rx ring are set to NULL.
As after ena_rx_queue_release_bufs and ena_tx_queue_release_bufs queues
are not used updating of next_to_clean pointer is not necessary.
Fixes: c2034976673d ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Use empty_rx_reqs instead of empty_tx_reqs.
As those two variables are part of union this not cause
any failure, but for consistency should be changed.
Fixes: c2034976673d ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The PMD was not passing RSS offloads values although it was supporting
the RSS. To allow application to probe the PMD for RSS support, the
missing information was added.
Fixes: 1173fca25af9 ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Only PMD_*_LOG is adding new line character to the log message.
All printouts were adjusted for consistency.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Reset function should return the port to initial state, in which no Tx
and Rx queues are setup. Then application should reconfigure the queues.
According to DPDK documentation the rte_eth_dev_reset() itself is a
generic function which only does some hardware reset operations through
calling dev_unint() and dev_init().
ena_com_dev_reset which perform NIC registers reset should be called
during stop.
Fixes: 2081d5e2e92d ("net/ena: add reset routine")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
If start function fails, previously created queues have to be removed.
ena_queue_restart_all() and ena_queue_restart() are renamed to
ena_queue_start_all() and ena_queue_start().
ena_free_io_queues_all() is renamed to ena_queue_stop_all().
Fixes: df238f84c0a2 ("net/ena: recreate HW IO rings on start and stop")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Before sending next packet, check if calling doorbell is needed.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Some of ENA devices supports 8k Rx rings. Maximum supported size is
received upon device initialization.
As ENA_DEFAULT_RING_SIZE_RX macro is upper limit, it needs to be
adjusted.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
LLQ (Low Latency Queue) is the feature that allows pushing header
directly to the device through PCI before even DMA is triggered.
It reduces latency, because device can start preparing packet before
payload is sent through DMA.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
When invalid req_id is received, the reset should be handled by the
application, as it is indicating invalid rings state, so further Rx
is not making any sense.
Fixes: c2034976673d ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The device now allows driver to reconfigure Tx and Rx queues depth
independently. Moreover, maximum size for Tx and Rx can be different.
Those maximum values are received from the device.
After reset, previous ring configuration is restored.
If number of descriptor is set to RTE_ETH_DEV_FALLBACK_RX_RINGSIZE
or RTE_ETH_DEV_FALLBACK_TX_RINGSIZE, the maximum value is restored.
Remove checks, if provided number is not too big, as this is done in
generic functions (rte_eth_rx_queue_setup and rte_eth_tx_queue_setup).
Maximum number of segments is being set for Rx packets and provided to
ena_com_rx_pkt() for validation.
Unused definitions were removed.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Whenever the driver will receive too many descriptors from the device,
it should trigger the device reset with reset reason set to
ENA_REGS_RESET_TOO_MANY_RX_DESCS.
Fixes: 241da076b1f7 ("net/ena: adjust error checking and cleaning")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The new ena_com allows the number of CPUs to be passed to the device in
the host info structure.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
ena_com is the communication layer provided by the vendor and common to
all ENA drivers.
This patch updates it to version from 2018.09.26.
It adds support for ENAv2 device together with LLQ feature, adds
doorbell optimization and reconfiguration of HW queues depth
independently.
The driver was adjusted to the new changes in the HAL.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Do four packets macswap in same loop iterate to squeeze more
CPU cycles.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
The patch optimizes the mac swap operation by taking advantage
of SSE instructions, it only impacts x86 platform.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Move macswap workload to dedicate function, so we can further enable
platform specific optimized version.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Add the SW assisted VDPA live migration feature into NIC doc.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In SW assisted live migration mode, driver will stop the device and
setup a mediated virtio ring to relay the communication between the
virtio driver and the VDPA device.
This data path intervention will allow SW to help on guest dirty page
logging for live migration.
This SW fallback is event driven relay thread, so when the network
throughput is low, this SW fallback will take little CPU resource, but
when the throughput goes up, the relay thread's CPU usage will goes up
accordingly.
User needs to take all the factors including CPU usage, guest perf
degradation, etc. into consideration when selecting the live migration
support mode.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Vhost lib has already provided a helper for used ring logging, driver
could use it to reduce code.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch series enables a new method for live migration, i.e. software
assisted live migration. This patch provides a device argument for user
to choose the methold.
When "sw-live-migration=1", driver/device will do live migration with a
relay thread dealing with dirty page logging. Without this parameter,
device will do dirty page logging and there's no relay thread consuming
CPU resource.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If user wants the VF to be used in VDPA (vhost data path acceleration)
mode, then the user can add a "vdpa=1" parameter for the device.
So if driver does not find this option, it should quit and let the bus
continue the probe.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If driver fails to register ifc VF device into vhost lib, then this
device should not be stored.
Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Driver probe may fail for different causes, debug message is helpful for
debugging issue.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch provides two helpers for vdpa device driver to perform a
relay between the guest virtio ring and a mediated virtio ring.
The available ring relay will synchronize the available entries, and
help to do desc validity checking.
The used ring relay will synchronize the used entries from mediated ring
to guest ring, and help to do dirty page logging for live migration.
The later patch will leverage these two helpers.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
VDPA driver can decide if it needs to enable/disable the host notifier
mapping, so exposing a API can allow flexibility. A later patch will
base on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
vhost_detach_vdpa_device() is internally defined but not used, remove
it in this patch.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Until we have support for control virtqueues let's disable it and
fail device initalization if specified as a parameter.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add option to enable packed queue support for virtio-user
devices.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Use packed virtqueue format when reading and writing descriptors
to/from the ring.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This implements the transmit path for devices with
support for packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support to dump packed virtqueue data to the
VIRTQUEUE_DUMP() macro.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
fdset_add can call fdset_shrink_nolock which call fdset_move
concurrently to poll that is call in fdset_event_dispatch.
This patch add a mutex to protect poll from been call at the same time
fdset_add call fdset_shrink_nolock.
Fixes: 1b815b89599c ("vhost: try to shrink pfdset when fdset_add fails")
Cc: stable@dpdk.org
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As of commit 364e08f2bbc0, DPDK allows an application to send and
receive raw packets using an AF_PACKET and PACKET_MMAP, when using
Linux Kernel. This complements it by adding a simple guide with the
following information:
- An introduction, where a brief explanation of this driver is given,
pointing out the dependency on PACKET_MMAP;
- Which options are supported at configuration time, while setting up an
interface, and it's inherent limitations;
- What the prerequisites are;
- A command line example of how to set up a DPDK port using the
af_packet driver.
Since there's a dependency in PACKET_MMAP, the guide also points to the
original Kernel documentation, so the reader can get more details.
Signed-off-by: Tiago Lam <tiago.lam@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
It is possible that the VF device exists but DPDK doesn't know
about it. This could happen if device was blacklisted or more
likely the necessary device (Mellanox) was not part of the DPDK
configuration.
In either case, the right thing to do is just keep working
but only with the slower para-virtual device.
Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
On device close or startup errors, the transmit descriptor pool
was being left behind.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
While calculating the input packet count per port, discarded packets
should be reduced, right now only PF VSI discarded packets are reduced.
But while calculating the input byte count per port, Rx byte count is
used, which should take all discarded packets into account, including
VF VSI ones.
This will cause inconsistency in stat counters in some cases.
This patch would take all VSI stats as packet and byte count to address
the issue.
Fixes: 763de290cbd1 ("net/i40e: fix packet count for PF")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The reset flags vf->vf_reset and vf->pend_msg are set when VF received
VIRTCHNL_EVENT_RESET_IMPENDING. So after resetting done, these flags
should be cleared.
Fixes: 8cacf78469a7 ("net/i40e: fix VF initialization error")
Cc: stable@dpdk.org
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Currently, extmem autotest only covers the external malloc heap
API. Extend it to also cover the non-heap, register/unregister
external memory API.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Extend the extmem autotest to check whether the memseg lists for
externally allocated memory are always marked as external.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, only scenario with valid IOVA table is tested. Fix this
by also testing without IOVA table - in these cases, EAL should
always return RTE_BAD_IOVA for all memsegs, and contiguous memzone
allocation should fail.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will be adding a new extmem test that will behave roughly similar
to already existing, so clarify function names to distinguish between
these tests, as well as factor out the common parts.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, malloc statistics and external heap creation code
use memory hotplug lock as a way to synchronize accesses to
heaps (as in, locking the hotplug lock to prevent list of heaps
from changing under our feet). At the same time, malloc
statistics code will also lock the heap because it needs to
access heap data and does not want any other thread to allocate
anything from that heap.
In such scheme, it is possible to enter a deadlock with the
following sequence of events:
thread 1 thread 2
rte_malloc()
rte_malloc_dump_stats()
take heap lock
take hotplug lock
failed to allocate,
attempt to take
hotplug lock
attempt to take heap lock
Neither thread will be able to continue, as both of them are
waiting for the other one to drop the lock. Adding an
additional lock will require an ABI change, so instead of
that, make malloc statistics calls thread-unsafe with
respect to creating/destroying heaps.
Fixes: 72cf92b31855 ("malloc: index heaps using heap ID rather than NUMA node")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>