The portal driver is bound to DPIO objects discovered on the fsl-mc bus and
provides services that:
- allow other drivers, such as the Ethernet driver, to enqueue and dequeue
frames for their respective objects
A system will typically allocate 1 DPIO object per CPU to allow queuing
operations to happen simultaneously across all CPUs.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This patch adds support in the fslmc VFIO process to
scan and parse the dpni and dpseci objects for net and crypto
devices. It adds the scanned devices to the fslmc bus.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Add support for using VFIO for the DPAA2-based fsl-mc bus.
VFIO usage for the fsl-mc bus differs from the EAL VFIO code
in a few ways:
- the bus is scanned for individual objects on the basis of
the DPRC container
- the MC portal is used and mapped for object access
With the evolution of the bus model, this can be further aligned
with the EAL VFIO code.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The DPBP object represents a hardware-based buffer pool instance
in the DPAA2 hardware.
Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This patch adds DPIO object support in the MC driver.
DPIO (Data Path Input/Output) represents the processing
context used to access the QBMAN HW for packet I/O.
Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This patch introduces the DPAA2 MC (Management Complex) driver.
This is a minimal set of low-level functions to send and
receive commands to the fsl-mc. It includes support for basic
management commands and commands to manipulate MC objects.
It is common code used by the various DPAA2 PMDs, e.g. net,
crypto and other drivers.
This is a low-level library also used in the kernel.
Signed-off-by: Cristian Sovaiala <cristian.sovaiala@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
QBMAN is a hardware block which interfaces with the other
accelerating hardware blocks (e.g. WRIOP) on NXP's DPAA2
SoC for queue, buffer and packet scheduling.
This patch introduces a userspace driver for interfacing with
the QBMAN hw block.
The qbman-portal component provides APIs to do the low-level
hardware bit twiddling for operations such as:
- initializing QMan software portals
- building and sending portal commands
- portal interrupt configuration and processing
The same/similar code is used in the kernel; a compat file is
used to make it work in user space.
Signed-off-by: Geoff Thorpe <geoff.thorpe@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
When TSO is disabled, we still have to calculate a partial checksum if the
DF bit is turned off. This is caused by a firmware bug.
First of all, we must make sure that we are dealing with an IPv4 packet.
If not, we just skip further checking of this packet and move to
the next one.
If the application did not set the l2_len field, we assume the packet was
an Ethernet frame, because we have to look inside the packet to check the
DF flag.
For this to work properly, the PMD assumes that the application called
rte_eth_tx_prepare() before sending the packet.
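A minimal sketch of that check, assuming the 17.05-era header names
(ipv4_hdr, IPV4_HDR_DF_FLAG); an illustration, not the PMD's exact code:

    /* Only IPv4 packets with the DF bit cleared need the
     * partial-checksum path; l2_len locates the IP header. */
    struct ipv4_hdr *ip_hdr = rte_pktmbuf_mtod_offset(mbuf,
                    struct ipv4_hdr *, mbuf->l2_len);
    uint16_t frag = rte_be_to_cpu_16(ip_hdr->fragment_offset);

    if ((mbuf->ol_flags & PKT_TX_IPV4) && !(frag & IPV4_HDR_DF_FLAG)) {
            /* request partial checksum computation for this packet */
    }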
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Jakub Palider <jpalider@gmail.com>
Acked-by: Jan Medala <jan.medala@outlook.com>
If the wrong number of descriptors for refilling was passed to the Rx
repopulate function, there was a memory leak which caused the memory pool
to run out of resources in the long run.
In case of failure when refilling Rx descriptors, all additional mbufs
have to be released.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Jakub Palider <jpalider@gmail.com>
Acked-by: Jan Medala <jan.medala@outlook.com>
On the Rx path, after receiving a bunch of packets, the variable tracking
available descriptors in the HW queue was not updated.
To fix this issue, the variable tracking used descriptors must be updated
after receiving packets: it must be reduced by the number of descriptors
received in the current batch.
Additionally, the next_to_clean variable in rx_ring must be updated before
entering ena_populate_rx_queue() to keep it in sync with the current
ring state.
Fixes: 1daff5260f ("net/ena: use unmasked head and tail")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Jakub Palider <jpalider@gmail.com>
Acked-by: Jan Medala <jan.medala@outlook.com>
When the application tried to allocate 1024 descriptors, the device was
not initialized properly.
This patch solves it by avoiding allocation of all descriptors in the
ring in one attempt: at least one descriptor must remain unused in the
HW ring.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Jakub Palider <jpalider@gmail.com>
Acked-by: Jan Medala <jan.medala@outlook.com>
Expose additional FPGA status registers.
Report hardware status during PMD init.
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Acked-by: John Miller <john.miller@atomicrules.com>
The timestamp was carried in the packet metadata, but not
placed in the mbuf. The new timestamp field is the proper
destination.
Remove the setting of the data offset since this is done by
rte_pktmbuf_free().
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Acked-by: John Miller <john.miller@atomicrules.com>
Removing this check improves performance as VLAN and CRC stripping are
enabled most of the time.
Convert MLX5_CQE_VLAN_STRIPPED to network order to speed up the check
instead of doing it on the completion queue entry field.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Trying to assign more queues to the stats struct breaks out of only the
inner loop when the maximum size is reached; the outer loop iteration
continues. This leads to out-of-bounds array access.
Fixes: 21e3fb0050 ("net/thunderx: add final bits for secondary queue support")
Cc: stable@dpdk.org
Signed-off-by: Marcin Wilk <marcin.wilk@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Create a new file rte_pmd_ixgbe.c for all the private
APIs. Move all the related code to the new file.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
To keep consistent with the previous release, Priority Flow Control (PFC)
needs to be disabled by default. This patch fixes it.
This also fixes an issue where traffic was occasionally not forwarded by
testpmd. In those cases only ~4770 pps were seen on one of the ports
rather than the full rate (>20 Mpps).
Fixes: 6f0a707e5b ("net/i40e: enable DCB on SRIOV VFs")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
There are some new PCTYPEs on X722, but they have not been announced
in the RTE library, so if the corresponding hash input set cannot be set
for these packet types, the hash function won't work.
We therefore need to handle them based on the translation between the new
PCTYPEs and the original PCTYPEs.
Fixes: b6a0ec4182 ("i40e: use AQ for Rx control register read/write")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Remove initialization of next and nb_segs mbuf fields in the Rx path
since they are now initialized in the mbuf pool.
See commit 8f094a9ac5 ("mbuf: set mbuf fields while in pool").
Signed-off-by: John Daley <johndale@cisco.com>
The base code was using native GCC macros
for endian detection. Change to portable
rte_byteorder.h based endian detection.
Fixes: 262d43fe20 ("net/thunderx/base: add HW register definitions")
Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Having packets received without any offload flags given in the mbuf is not
very useful, and performance tests with testpmd indicate little
benefit is gained with the current code by turning off the flags. This
makes the build-time option pointless, so we can remove it.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
With the mbuf rework, we now have 8 contiguous bytes to be rearmed in the
mbuf just before the 8 bytes of ol_flags. If we don't do the rearm write
inside the descriptor ring replenishment function, and instead delay it to
packet reception, we can do a single 16B write inside the Rx function
to set both the rearm data and the flags together.
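For illustration, a hedged sketch of the combined store (SSE, x86):
rearm_data and ol_flags are contiguous in struct rte_mbuf, so one 16-byte
store covers both. Names here are illustrative, not the exact driver code:

    __m128i v = _mm_set_epi64x((long long)ol_flags,
                               (long long)rearm_template);
    /* One aligned 16B write covers mb->rearm_data and mb->ol_flags. */
    _mm_store_si128((__m128i *)&mb->rearm_data, v);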
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Porting two changes from the x86 SSE implementation:
net/i40e: fix checksum flag in x86 vector Rx
net/i40e: eliminate mbuf write on rearm
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Having packets received without any offload flags given in the mbuf is not
very useful, and performance tests with testpmd indicate little to no
benefit is gained with the current code by turning off the flags. This
makes the build-time option pointless, so we can remove it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
With the mbuf rework, we now have 8 contiguous bytes to be rearmed in the
mbuf just before the 8 bytes of ol_flags. If we don't do the rearm write
inside the descriptor ring replenishment function, and instead delay it to
packet reception, we can do a single 16B write inside the Rx function
to set both the rearm data and the flags together.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Initialize the mailbox command to request VF FLR. The data field was
previously uninitialized, as it was not required, and caused the following
error during a Coverity scan.
Coverity issue: 1384518
Fixes: cdb166963c ("net/liquidio: add API for VF FLR")
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
When the PF notifies link speed to a VF, the link speed encoding needs
to be converted from ETH_SPEED_XXX to I40E_LINK_SPEED_XXX.
Fixes: bb6722fb5c ("net/i40e: fix VF bonded device link down")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Make i40e_notify_all_vfs_link_status a wrapper of
i40e_notify_vf_link_status so duplicate code can be removed.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Only the eth keyword should be accepted.
Fixes: d37705068e ("net/i40e: parse QinQ pattern")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The mbuf next pointer is NULL, and nb_segs and the reference counter are
equal to 1, when an mbuf is obtained from the mempool.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
* Core Tx packet moving functions
* Flesh out the ark_adapter struct to support the Tx code
(not all fields are used in this patch)
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Provide a C-level interface for Arkville's internal HW resources:
the DDM (Downstream Data Mover) and UDM (Upstream Data Mover) modules.
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Enable Arkville on supported configurations.
Add overview documentation.
Minimal driver support for a valid compile.
The Arkville PMD is not supported on ARM or PowerPC at this time.
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Signed-off-by: John Miller <john.miller@atomicrules.com>
The function i40evf_add_del_all_mac_addr did not check the return
value of rte_zmalloc().
Fixes: 97ac72aa71 ("i40e: support setting VF MAC address")
Cc: stable@dpdk.org
Signed-off-by: Henry Cai <caihe@huawei.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Check return value of malloc.
Fixes: 3bd122eef2 ("cxgbe/base: add hardware API for Chelsio T5 series adapters")
Cc: stable@dpdk.org
Signed-off-by: Henry Cai <caihe@huawei.com>
Acked-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
build error:
.../drivers/net/xenvirt/rte_eth_xenvirt.c:676:9:
error: ‘struct rte_eth_dev’ has no member named ‘driver’
eth_dev->driver = NULL;
^~
The ethdev driver field was removed; remove the assignment to it to fix
the build error.
Fixes: 9dca21fb80 ("ethdev: remove ethdev driver")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This moves the rte_eth_copy_pci_info() into the PCI specific ethdev
header. As a side effect this also removes it from the list of symbols
exported by the rte_ethdev library.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This removes the now unused struct eth_driver.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
There is no need to export this API. Remaining users should use the
rte_eal_vdev_init() function instead.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
A virtual device should get initialized through the rte_eal_vdev_init()
function to properly initialize the driver.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This driver can use the library function rte_eth_dma_zone_reserve()
instead of duplicating the code.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
To properly embed the generic rte_device into the rte_eth_dev this reworks
the bonding API to call through rte_eal_vdev_init().
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This is a preparation to embed the generic rte_device into the rte_eth_dev
also for virtual devices.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The secondary process doesn't properly attach to the rte_eth_dev
initialized by the primary process.
Accessing the device from the secondary process (e.g. via
rte_eth_rx_burst()) causes the process to crash, because rte_eth_dev_data
is not properly set.
The issue was exposed by
commit 7f95f78a8a ("ethdev: clear data when allocating device"),
which now clears the rte_eth_dev_data entry.
For PCI devices the struct is initialized by rte_eth_dev_pci_probe()
-> eth_dev_attach_secondary().
However, for virtio-user, virtio_user_pmd_probe() is called instead of
rte_eth_dev_pci_probe().
The fix is to call rte_eth_dev_attach_secondary(), for the secondary
process, from virtio_user_pmd_probe().
Fixes: 7f95f78a8a ("ethdev: clear data when allocating device")
Cc: stable@dpdk.org
Signed-off-by: Ami Sabo <amis@radware.com>
The patch changes the prototype of the callback function
(rte_intr_callback_fn) by removing an unnecessary parameter.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Destroying a tunnel filter rule failed if the action of
the tunnel filter is VF; the root cause is that the wrong VSI
was used.
Fixes: c50474f31e ("net/i40e: support tunnel filter to VF")
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
build error:
.../event/sw/sw_evdev_worker.c: In function ‘sw_event_release’:
.../event/sw/sw_evdev_worker.c:52:3: error: unknown field ‘op’ specified
in initializer
Fixed by updating struct initialization.
Fixes: 656af91800 ("event/sw: add worker core functions")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Disable for gcc < 4.7 and icc <= 14.0
The PMD uses some compiler builtins and new compiler options. Tested with
gcc 4.5.1, and the following were not supported:
option:
-Ofast
macros:
_Static_assert
__ORDER_LITTLE_ENDIAN__
__ORDER_BIG_ENDIAN__
__BYTE_ORDER__
__atomic_fetch_add
__ATOMIC_ACQUIRE
__atomic_load_n
__ATOMIC_RELAXED
__atomic_store_n
__ATOMIC_RELEASE
It is not easy to fix all of these in the PMD, so disable the PMD for
older compilers.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch deprecates the following functions in 17.05,
which will be removed in 17.08.
- rte_crpytodev_scheduler_mode_get()
- rte_crpytodev_scheduler_mode_set()
These two new functions replace them, fixing the typo in their names.
- rte_cryptodev_scheduler_mode_get()
- rte_cryptodev_scheduler_mode_set()
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch fixes the incorrect slave session free operation.
Fixes: 57523e682b ("crypto/scheduler: register operation functions")
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Refactor capabilities data structures to facilitate
defining different capability sets for different devices
without duplication of data.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
This patch adds an API to get the run-time number and list of slaves
of a cryptodev scheduler PMD.
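A hedged usage sketch of the new query API (the array size macro comes
from the scheduler header):

    uint8_t slaves[RTE_CRYPTODEV_SCHEDULER_MAX_NB_SLAVES];
    /* Returns the number of slaves attached, or a negative errno. */
    int nb_slaves = rte_cryptodev_scheduler_slaves_get(scheduler_id,
                                                       slaves);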
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch fixes a segmentation fault that may occur in case
of wrong parameters being provided to the cryptographic
session. Unused fields which would cause a null dereference
are removed.
Fixes: 1703e94ac5 ("qat: add driver for QuickAssist devices")
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
DOCSIS BPI mode is handled in the QAT PMD by sending full blocks to the
hardware device for encryption and using OpenSSL libcrypto for pre- or
post-processing of any partial blocks.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Adds support in the OpenSSL PMD for an algorithm following the DOCSIS
specification, which combines DES-CBC for full DES blocks (8 bytes)
and DES-CFB for the last runt block (less than 8 bytes).
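For illustration, a minimal sketch of this scheme using OpenSSL libcrypto
directly (an assumption-laden illustration of the block split, not the
PMD's code):

    #include <stdint.h>
    #include <string.h>
    #include <openssl/des.h>

    /* Encrypt: DES-CBC over the whole 8-byte blocks, then DES-CFB over
     * the runt, using the last ciphertext block (or the IV) as CFB IV. */
    static void
    docsis_des_encrypt(const uint8_t *in, uint8_t *out, size_t len,
                       DES_key_schedule *ks, const uint8_t iv[8])
    {
            size_t full = len & ~(size_t)7;
            DES_cblock cbc_iv, cfb_iv;
            int cfb_num = 0;

            memcpy(cbc_iv, iv, 8);
            if (full > 0)
                    DES_ncbc_encrypt(in, out, (long)full, ks,
                                     &cbc_iv, DES_ENCRYPT);
            if (len > full) {
                    memcpy(cfb_iv,
                           full ? (const uint8_t *)(out + full - 8) : iv, 8);
                    DES_cfb64_encrypt(in + full, out + full,
                                      (long)(len - full), ks,
                                      &cfb_iv, &cfb_num, DES_ENCRYPT);
            }
    }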
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Tested-by: Yang Gang <gangx.yang@intel.com>
The underlying IPSec Multi-buffer library implements the
DOCSIS specification, so this commit adds support
for this new feature, which combines AES-CBC for full
AES blocks (16 bytes) and AES-CFB for the last runt block
(less than 16 bytes).
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
IPSec Multi-buffer library v0.45 has been released,
which includes, among other features, support for the DOCSIS BPI
specification and AVX512 optimizations.
This new version added const qualifiers to some of the function
prototypes, so the PMD has been updated to include these changes.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Move the crypto processing from the enqueue burst to the dequeue burst,
to remove the requirement to continually call
rte_cryptodev_enqueue_burst() to guarantee that all operations get
flushed from the multi-buffer manager's buffers.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
There is a bug when more crypto ops are enqueued than dequeued.
The return value is not checked when trying to enqueue the
processed crypto op into the internal ring; in the case of the ring being
full this results in crypto ops and mbufs being leaked.
The issue is more obvious with different cores doing enqueue/dequeue.
This patch moves the crypto operation to the dequeue function, which
fixes the above issue without having to check for the number of free
entries in the ring.
Fixes: eec136f3c5 ("aesni_gcm: add driver for AES-GCM crypto operations")
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Fail-over mode works with 2 slaves, a primary slave and a secondary slave.
In this mode, the scheduler enqueues the incoming crypto op burst
to the primary slave. When one or more crypto ops fail to be
enqueued, they are then enqueued to the secondary slave.
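A sketch of the policy in terms of the public burst API (the PMD applies
this internally to its slaves' queue pairs; device ids are placeholders):

    uint16_t enq = rte_cryptodev_enqueue_burst(primary_id, qp_id,
                                               ops, nb_ops);
    if (enq < nb_ops) /* re-enqueue the leftovers to the secondary */
            enq += rte_cryptodev_enqueue_burst(secondary_id, qp_id,
                                               &ops[enq], nb_ops - enq);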
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Packet-size based distribution mode is a scheduling mode that works with 2
slaves, a primary slave and a secondary slave, and distributes the
enqueued crypto ops to them based on their data lengths. A crypto op is
distributed to the primary slave if its data length is equal to or bigger
than the designated threshold; otherwise it is handled by the
secondary slave.
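An illustrative sketch of the distribution rule (the threshold and slave
names are assumptions):

    for (i = 0; i < nb_ops; i++) {
            uint32_t pkt_len = ops[i]->sym->m_src->pkt_len;

            if (pkt_len >= threshold)
                    ; /* enqueue ops[i] to the primary slave */
            else
                    ; /* enqueue ops[i] to the secondary slave */
    }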
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Since the device configuration API has been updated, the crypto scheduler
PMD can make use of it to configure its slaves
automatically with the same configuration it received. As the
slaves originally had to be manually configured one by one, this patch
helps reduce coding complexity.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
This patch changes the device configuration API in the rte_cryptodev_ops
function prototypes, and updates all cryptodev PMDs for this change.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
This patch optimizes crypto op ordering by switching the
ordering method from the rte_reorder library to rte_ring,
avoiding unnecessary crypto op store-and-recover cost.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
This patch changes the enqueue and dequeue methods of the cryptodev
scheduler PMD. Originally a 2-layer function call was carried out
upon enqueuing or dequeuing a burst of crypto ops. This patch
removes one layer to improve performance.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
The macro RTE_PMD_TAP_MAX_QUEUES was defined twice.
On machines with kernel < 3.8, IFF_MULTI_QUEUE didn't exist, and thus
both definitions used different values.
Fixes: cf56436611 ("net/tap: move private elements to external header")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Since patch "mbuf: structure reorganization" the compiler complains
sometimes (in some conditions):
.../drivers/net/mlx5/mlx5_rxtx.c: In function ‘mlx5_rx_burst’:
.../drivers/net/mlx5/mlx5_rxtx.c:2082:17: error: ‘len’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
len is not initialised as it will be at the first segment of a received
packet, but it remains hard for the compiler to determine it.
Fixes: 9964b965ad ("net/mlx5: re-add Rx scatter support")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
build error:
.../drivers/event/octeontx/ssovf_worker.c(212):
error #592: variable "get_work0" is used before its value is set
RTE_SET_USED(get_work0);
^
.../drivers/event/octeontx/ssovf_worker.c(213):
error #592: variable "get_work1" is used before its value is set
RTE_SET_USED(get_work1);
^
For x86 these variables are set but not used; move the macros to below
where the values are assigned.
Fixes: f61808eaa9 ("event/octeontx: add start function")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
To avoid multiple stores on the fast path, Ethernet drivers
aggregate the writes to data_off, refcnt, nb_segs and port
into a single uint64_t and write that data in one shot
through a uint64_t * at the &mbuf->rearm_data address.
Some non-IA platforms have a store operation overhead
if the store address is not naturally aligned. This patch
fixes the performance issue on those targets.
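A hedged sketch of the one-shot write (little-endian packing following the
reworked struct rte_mbuf field order; values are illustrative):

    uint64_t rearm = (uint64_t)RTE_PKTMBUF_HEADROOM | /* data_off */
                     ((uint64_t)1 << 16) |            /* refcnt   */
                     ((uint64_t)1 << 32) |            /* nb_segs  */
                     ((uint64_t)port << 48);          /* port     */
    /* Single naturally aligned 8B store instead of four field writes. */
    *(uint64_t *)&mb->rearm_data = rearm;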
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Now that the m->next pointer and m->nb_segs are expected to be set (to
NULL and 1 respectively) after a mempool_get(), we can avoid writing them
in the Rx functions of drivers.
Only some drivers are patched; it's not an exhaustive patch, but it gives
the idea for doing the same in other drivers.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Set the value of m->refcnt to 1, m->nb_segs to 1 and m->next
to NULL when the mbuf is stored inside the mempool (unused).
This is done in rte_pktmbuf_prefree_seg(), before freeing or
recycling a mbuf.
Before this patch, the value of m->refcnt was expected to be 0
while in pool.
The objectives are:
- to avoid drivers having to set m->next to NULL in the early Rx path,
since this field is in the second 64B of the mbuf and its access could
trigger a cache miss
- to rationalize the behavior of raw_alloc/raw_free: one is now the
symmetric of the other, and refcnt is never changed in these functions.
To optimize the freeing of the segments, we try to only update
m->refcnt, m->next, and m->nb_segs when it's required (idea from
Konstantin Ananyev <konstantin.ananyev@intel.com>).
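A simplified sketch of the resulting free logic (it ignores indirect
mbufs, which the real rte_pktmbuf_prefree_seg() also detaches):

    static inline struct rte_mbuf *
    prefree_seg_sketch(struct rte_mbuf *m)
    {
            if (rte_mbuf_refcnt_read(m) == 1) {
                    /* Restore pool defaults only if they changed. */
                    if (m->next != NULL) {
                            m->next = NULL;
                            m->nb_segs = 1;
                    }
                    return m;
            } else if (rte_mbuf_refcnt_update(m, -1) == 0) {
                    rte_mbuf_refcnt_set(m, 1);
                    if (m->next != NULL) {
                            m->next = NULL;
                            m->nb_segs = 1;
                    }
                    return m;
            }
            return NULL; /* still referenced elsewhere, do not recycle */
    }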
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Rename __rte_mbuf_raw_free() as rte_mbuf_raw_free() and make
it public. The old function is kept for compat but is marked as
deprecated.
The next commit changes the behavior of rte_mbuf_raw_free() to
make it more consistent with rte_mbuf_raw_alloc().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Document the function and make it public, since it is used at several
places in the drivers. The old one is marked as deprecated.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
If the device is configured with the
RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT option, then use a different
fast-path dequeue handler that waits for the requested number of
nanoseconds if an event is not available.
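A sketch of the handler selection (the handler names are illustrative):

    /* Pick the fast-path dequeue handler at device setup time. */
    if (dev->data->dev_conf.event_dev_cfg &
        RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT)
            dev->dequeue = ssows_deq_timeout; /* waits up to the timeout */
    else
            dev->dequeue = ssows_deq;         /* non-waiting variant */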
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
The SSO co-processor runs at a different frequency than the core clock.
Request the PF to convert the ns value to the SSO get_work timeout period.
On dequeue, if the device is configured with the
RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT option, then
use a different fast-path dequeue handler that waits for the requested
number of nanoseconds if an event is not available.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Queue-to-port link and unlink establishment is done by
setting/resetting the queue/group membership in
SSOW_VHWS_GRPMSK_CHGX.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Add in the data structures for the ports used by workers
to send events to/from the HW scheduler. Also add a
function to release the resources allocated in setup.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Pretty much everything is done in HW. We need to configure
the priority associated with an event queue (aka SSO group)
through a mailbox request to the PF.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Add the info_get function to return details on the queues, flow,
prioritization capabilities, etc. that this device has.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
This adds the minimal changes to allow an octeontx eventdev
implementation to be compiled, linked and created at run time.
The eventdev does nothing, but can be created via vdev
on command line, e.g.
sudo ./build/app/test -c 0xff00 --vdev=event_octeontx
...
Initializing event_octeontx domain=4 max_queues=64 max_ports=32
RTE>>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
The ssovf VF device has a mailbox mechanism to communicate
with the PF device. This patch adds support for a mbox API to
send mailbox requests to the PF device.
The ssovf VF device will be used as the communication channel
to talk to the PF devices of all the network-accelerated
co-processors in Octeontx. It is exposed as a shared function
for use from pool, crypto and network devices.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
The ssovf and ssowvf PCIe VF devices are shared
between the eventdev PMD and the ethdev PMD. This patch
exposes a set of interface APIs to get info
about the probed ssovf and ssowvf VF resources, to be used
later with eventdev and ethdev vdev devices.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
An event device consists of event queues and event ports.
On Octeontx HW, each event queue (SSO group/ssovf) and
event port (SSO HWS/ssowvf) is enumerated as a separate
SR-IOV VF PCIe device. In order to expose them as an event device,
on PCIe probe the driver stores the information associated
with the PCIe device, and later the vdev infrastructure
creates the event device from the earlier probed PCIe VF devices.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Add support for xstats to report on the state of the eventdev.
Useful for debugging and for unit tests, as well as observability
at runtime and performance tuning of apps to work well with the
scheduler.
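A hedged usage sketch of the new query API (array sizes are arbitrary):

    struct rte_event_dev_xstats_name names[64];
    unsigned int ids[64];
    uint64_t values[64];
    int n = rte_event_dev_xstats_names_get(dev_id,
                    RTE_EVENT_DEV_XSTATS_DEVICE, 0, names, ids, 64);

    if (n > 0 && n <= 64)
            rte_event_dev_xstats_get(dev_id, RTE_EVENT_DEV_XSTATS_DEVICE,
                                     0, ids, values, (unsigned int)n);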
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Resolve a segfault that occurred when the device was only partially
configured and rte_event_dev_dump() was called before start().
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add in the scheduling function which takes the events from the
producer queues and buffers them before scheduling them to consumer
queues. The scheduling logic includes support for atomic, reordered,
and parallel scheduling of flows.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Add the event enqueue, dequeue and release functions to the eventdev.
These also include tracking of stats for observability of the load of
the scheduler.
Internally in the enqueue function, the various types of enqueue
operations (forwarding an existing event, sending a new event,
dropping a previous event) are converted to a series of flags which
are used by the scheduler code to perform the needed actions for that
event.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add in the data-structures for the ports used by workers to send
packets to/from the scheduler. Also add in the functions to
create/destroy those ports.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add in the data structures for the event queues, and the eventdev
functions to create and destroy those queues.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add in the info_get function to return details on the queues, flow,
prioritization capabilities, etc. that this device has.
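A hedged usage sketch of the resulting call:

    struct rte_event_dev_info info;

    rte_event_dev_info_get(dev_id, &info);
    printf("max queues: %u, max ports: %u\n",
           info.max_event_queues, info.max_event_ports);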
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This adds the minimal changes to allow a SW eventdev implementation to
be compiled, linked and created at run time. The eventdev does nothing,
but can be created via vdev on commandline, e.g.
sudo ./x86_64-native-linuxapp-gcc/app/test --vdev=event_sw0
...
PMD: Creating eventdev sw device event_sw0, numa_node=0, sched_quanta=128
RTE>>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
PMDs that only do a specific type of scheduling cannot provide
CFG_ALL_TYPES, so the eventdev infrastructure should not demand
that every PMD support CFG_ALL_TYPES.
By not overriding the default configuration of the queue as
suggested by the PMD, the eventdev_common unit tests can pass
on all PMDs, regardless of their capabilities.
Since RTE_EVENT_QUEUE_CFG_DEFAULT is no longer used by the eventdev layer,
it can be removed now. Applications should use CFG_ALL_TYPES
if they require enqueuing all event types to a queue, or specify which
type of queue they require.
The CFG_DEFAULT value is changed to CFG_ALL_TYPES in event/skeleton,
so as not to break the compile.
A capability flag is added that indicates whether the underlying PMD
supports creating queues of ALL_TYPES.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
An eventdev driver may return an error on dequeue timeout tick conversion.
Change the PMD callback interface to address this.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
rte_device is a generic device which is available to the applications
and EAL. This patch replaces rte_pci_device in 'struct rte_eventdev'
and in 'struct rte_event_dev_info' with common rte_device.
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Removed global-index-based device name
generation, as vdev uninit needs the exact driver
name used at vdev init.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Added a pointer to the rte_eventdev type in the event port
link and unlink callbacks. This device pointer shall be used by some
of the event drivers to fetch queue-related information.
Also, update the skeleton eventdev driver with corresponding changes.
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The skeleton driver facilitates bootstrapping new
eventdev drivers and provides a platform to verify
the northbound eventdev common code.
The driver supports both VDEV- and PCI-based eventdev
devices.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Rename "rte_virtio_net.h" to "rte_vhost.h", to not let it be virtio
net specific.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We used to use rte_vhost_driver_session_start() to trigger the vhost-user
session. It takes no argument, thus it's a global trigger. And it could
be problematic.
The issue is that, currently, rte_vhost_driver_register(path, flags)
actually tries to put the driver into the session loop (by fdset_add).
However, it takes a set of APIs to set up a vhost-user driver properly:
* rte_vhost_driver_register(path, flags);
* rte_vhost_driver_set_features(path, features);
* rte_vhost_driver_callback_register(path, vhost_device_ops);
If a new vhost-user driver is registered after the trigger (think
OVS-DPDK, which could add a port dynamically from the cmdline), the
current code effectively starts the session for the new driver just after
the first API rte_vhost_driver_register() is invoked, leaving the later
calls with no effect at all.
To handle the case properly, this patch introduces a new API,
rte_vhost_driver_start(path), to trigger a specific vhost-user driver.
To do that, rte_vhost_driver_register(path, flags) is simplified
to create the socket only, and rte_vhost_driver_start(path)
actually puts it into the session loop.
Meanwhile, rte_vhost_driver_session_start() is removed: we can hide
the session thread internally (create the thread if it has not been
created). This also simplifies the application.
NOTE: the API order in the programmer's guide is slightly adjusted to
show the correct invocation order.
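The resulting flow, sketched with the APIs listed above (feature bits and
the ops struct are application-specific):

    /* Register, configure, then start this vhost-user driver. */
    if (rte_vhost_driver_register(path, 0) < 0)
            return -1;
    rte_vhost_driver_set_features(path, features);
    rte_vhost_driver_callback_register(path, &vhost_device_ops);
    if (rte_vhost_driver_start(path) < 0)
            return -1;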
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
rename "virtio_net_device_ops" to "vhost_device_ops", to not let it
be virtio-net specific.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
They are virtio-net specific and should be defined inside the virtio-net
driver.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We used to use rte_vhost_get_queue_num() to tell how many vrings there
are. However, the return value is the number of "queue pairs", which is
very virtio-net specific. To make it generic, we should return the
number of vrings instead, and let the driver do the proper translation.
Say, the virtio-net driver could turn it into the number of queue pairs
by dividing by 2.
Meanwhile, mark rte_vhost_get_queue_num() as deprecated.
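For example, a virtio-net style driver would translate as follows (using
the new vring-count API introduced here):

    /* Each queue pair consists of one Rx and one Tx vring. */
    uint16_t vring_num = rte_vhost_get_vring_num(vid);
    uint16_t queue_pairs = vring_num / 2;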
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Assume there is an application supporting both vhost-user net and
vhost-user scsi; the callbacks should be different. Making the notify
ops per vhost driver allows the application to define a different set of
callbacks for each driver.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The rte_eth_vhost_feature_disable/enable/get APIs are just wrappers of
rte_vhost_feature_disable/enable/get. However, the latter are going to
be refactored: they are going to take an extra parameter (the socket_file
path) to make them per-device.
Instead of changing those vhost-pmd APIs to adapt to the new vhost APIs,
we can simply remove them and let vdev options serve this purpose. After
all, vdev options are better for disabling/enabling some features.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
So far, virtio-user with vhost-user as the backend can only support
client mode. So when the vhost-user backend is down, i.e., the unix
socket connection is broken, the connection cannot be re-established.
We forcibly set the link state to down.
Note: virtio-user with vhost-kernel as the backend still cannot
support LSC now, as we have not found a way to monitor the up/down
events of the backend tap device.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Originally, we did not report support of VIRTIO_NET_F_STATUS.
This feature is not reported by the vhost backend; instead, it
is added/removed by QEMU in the virtio PCI case.
We now report support of this feature, so that the following patch
can depend on it to enable the LSC interrupt.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
For rxq interrupts, the device (backend driver) notifies the driver
through the callfd. Each virtqueue has a callfd. To keep compatible
with the existing framework, we give these callfds to the
interrupt thread to listen for interrupts.
Before that, we need to allocate the intr_handle and fill the callfds
into it so that the driver can use it to set up Rx queue interrupt mode.
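A hedged sketch of that wiring (field names follow the 17.05-era struct
rte_intr_handle; the even-index Rx vq layout is an assumption):

    struct rte_intr_handle *ih = eth_dev->intr_handle;
    uint32_t i;

    for (i = 0; i < dev->max_queue_pairs; ++i)
            ih->efds[i] = dev->callfds[2 * i]; /* Rx virtqueue callfd */
    ih->nb_efd = dev->max_queue_pairs;
    ih->max_intr = dev->max_queue_pairs + 1;
    ih->type = RTE_INTR_HANDLE_VDEV;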
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Originally, the eventfd is opened when initializing each vq, and gets
closed in virtio_user_stop_device().
To make it possible to initialize the intr_handle struct in init() in the
following patch, we move the open() of all eventfds into init() and the
close() into uninit().
Suggested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch adds a new option 'iface' to change the interface name of
the tap device when vhost-kernel is the backend.
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch implements support for the Virtio MTU feature.
When negotiated, the host shares its maximum supported MTU,
which is used as the initial MTU and as the maximum MTU the
application can set.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch adds a call to rte_vhost_mtu_get() at device creation
time to fill the device's MTU property when available.
This makes the MTU value defined in QEMU cmdline accessible to the
application by calling rte_eth_dev_get_mtu().
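A sketch of the call added at creation time (per the new
rte_vhost_mtu_get() API; variable names are illustrative):

    uint16_t mtu = 0;

    if (rte_vhost_mtu_get(vid, &mtu) == 0)
            eth_dev->data->mtu = mtu; /* MTU from the QEMU cmdline */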
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The link state change interrupt can only be configured if the virtio
device supports MSI-X. Prior to this change, writing the vector to the
PCI config space caused the initial part of the MAC address to be
overwritten, since when MSI-X is not available the vector slot in the
config space is occupied by the MAC address.
This has been reproduced in Virtual Box (v5.0.30.r112061) in Windows 7.
Fixes: 954ea11540 ("virtio: do not report link state feature unless available")
Cc: stable@dpdk.org
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
virtio-user limits the queue number to 8 but provides no limit
check against the queue number input from the user. If a bigger queue
number (> 8) is given, there is an overflow issue. Doing a sanity
check can avoid it.
Fixes: 37a7eb2ae8 ("net/virtio-user: add device emulation layer")
Cc: stable@dpdk.org
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The valid tap file descriptor range should be zero or greater,
not just non-zero.
Fixes: e3b434818b ("net/virtio-user: support kernel vhost")
Cc: stable@dpdk.org
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch removes the include of the numaif.h header from
rte_eth_vhost.c.
Commit 586e390013 ("vhost: export numa node") moved the invocation of
get_mempolicy() from rte_eth_vhost.c to librte_vhost. So there is no need
to include the numaif.h header anymore in rte_eth_vhost.c.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The minor change aims to remove redundant computation and make
the code easier to understand.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Improve error checking to avoid a caller overflowing the MCDI
request buffer if the requested TXQ size was excessively large.
Coverity issue: 1305527
Fixes: e7cd430c86 ("net/sfc/base: import SFN7xxx family support")
CC: stable@dpdk.org
Signed-off-by: Andy Moreton <amoreton@solarflare.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ixgbe supports setting the relative bandwidth for the TCs.
It's a global setting for the PF and all the VFs of a
physical port. This feature provides the API to set the bandwidth.
This feature provide the API to set the bandwidth.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Some DPDK applications/examples check link status on their
start. NICVF does not wait for the link, so those apps fail.
Wait up to 9 seconds for the link, as other PMDs do, in order
to fix those apps/examples.
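A sketch of such a wait loop (constants and the status helper are
illustrative, not the driver's exact code):

    #define CHECK_INTERVAL 100 /* ms */
    #define MAX_CHECK_TIME  90 /* 90 * 100 ms = 9 s */

    for (i = 0; i < MAX_CHECK_TIME; i++) {
            nicvf_link_status_update(nic, &link); /* hypothetical helper */
            if (link.link_status)
                    break;
            rte_delay_ms(CHECK_INTERVAL);
    }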
Signed-off-by: Andriy Berestovskyy <andriy.berestovskyy@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
After adding a VLAN filter, VLAN promiscuous mode is
disabled, but there is no chance to re-enable it.
So add a check after deleting a VLAN filter: if there is
no VLAN filter left, enable VLAN promiscuous mode.
Fixes: 9f0645cd14 ("net/i40e: fix VLAN filter")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This is necessary to ensure packets with the new MAC address as
destination get redirected to the tap device.
Also change the MAC address only if the current one is different from
the requested one.
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Immediately after init (probing), the device MAC address is all zeroes.
It should be possible to get a correct MAC address as early as that,
without the need for a dev_configure().
With this patch, a MAC address is set in eth_dev_tap_create()
explicitly. It either comes from the remote if any was configured, or is
randomly generated. In any case, the device MAC address is guaranteed to
be the correct one when the tap netdevice actually gets created in
tun_alloc().
Fixes: f76d46b4ff ("net/tap: add MAC address management")
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Some errors received from the kernel are acceptable, such as a -ENOENT
for a rule deletion (the rule no longer existed in the
kernel). Make sure we consider return codes properly. For that,
nl_recv() has been simplified.
The qdisc_exists() function is no longer needed, as we can check whether
the kernel returned -EEXIST when requesting the qdisc creation. It's
simpler and faster.
Add a few messages for clarity when a netlink error occurs.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Flow validation already processes the final actions to verify if a rule can
be applied or not and the same is done during creation. As the creation
function relies on validation to generate and apply a rule, this job can be
fully handled by the validation function.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
With support for segmented packets, it is now possible to easily receive
packets of many sizes, given an adequate number of descriptors.
Remove the limitation on the minimum mbuf size: on reception, if a packet
won't fit in the queue's mbufs, it will be detected in the packet info
and the packet will be discarded.
Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Only a full mask (0xffff) is accepted; there is no way to specify a
partial mask for layer 4 ports to the kernel using TC rules.
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Remove unnecessary interface queries and the Resource Domain when
creating the Queue Pair. Since mlx5 PMD data path is on top of native
APIs, such Verbs library calls are no longer needed.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Currently mlx5_dev_rss_reta_update() just updates tables in the host,
so it isn't effective until the device is restarted by
calling mlx5_dev_stop()/mlx5_dev_start() to apply the changes on the
device side. This patch adds rebuilding the device-specific data
structure and applying it to the device right away.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
When querying and updating the RSS RETA table, the max size of
the device was always used instead of the configured value. This patch
fixes it and removes the related comments in the code.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
If the primary process has initialized all the queues to vector
PMD mode, the secondary process should not use the scalar code path,
because the per-queue data structures haven't been prepared for
that, e.g. txq->ops is for vector Tx rather than scalar Tx.
Fixes: a6ce64a975 ("fm10k: introduce vector driver")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
The QinQ filter uses big buffers, set the big_buffer flag
when restoring a QinQ filter.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The QinQ filter uses big buffers, set the big_buffer flag
when removing a QinQ filter.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add i40e_cloud_filter_qinq_create function, and call it
from i40e_dev_consistent_tunnel_filter_set function.
Replace the Outer IP filter with the QinQ filter.
QinQ allows multiple VLAN tags to be inserted into a single Ethernet
frame. A QinQ frame is a frame that has two VLAN 802.1Q headers.
802.1Q tunneling (QinQ) is a technique often used by Metro Ethernet
providers as a layer 2 VPN for customers.
Signed-off-by: Laura Stroe <laura.stroe@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The L3 MAP register is initialised to support QinQ
cloud filters.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Mark ID in the completion queue entry is 24 bits, the remaining 8 bits are
reserved and may be nonzero. Do not take them into account when looking for
marked packets.
Fixes: ea3bc3b1df ("net/mlx5: support mark flow action")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Device reconfigure should be done without close, which releases
all transmit and receive queues. The ethdev API assumes that previously
set up queues (the minimum of the counts configured before and now) are
kept across device reconfigure.
Fixes: aaa3f5f0f7 ("net/sfc: add configure and close stubs")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Fixes: a8ad8cf83f ("net/sfc: provide basic stubs for Tx subsystem")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Prepare to fix device reconfigure. Make it clear that corresponding
functions should be called on device configure and close operations.
No functional change.
Fixes: a8ad8cf83f ("net/sfc: provide basic stubs for Tx subsystem")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Prepare to fix device reconfigure. Make it clear that corresponding
functions should be called on device configure and close operations.
No functional change.
Fixes: a8e64c6b45 ("net/sfc: implement Rx subsystem stubs")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Port configuration should be initialized on attach to avoid reset to
defaults on device reconfigure.
Fixes: 03ed21195d ("net/sfc: minimum port control sufficient to receive traffic")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Prepare to fix device reconfigure.
Device arguments should be parsed once on attach.
Management event queue should be initialized once on attach.
Fixes: 58294ee65a ("net/sfc: support event queue")
Fixes: c22d3c508e ("net/sfc: support parameter to choose performance profile")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The EvQ info array keeps information about EvQs centralized; however,
EvQ pointers are available in the TxQ and RxQ structures. A single
array for all EvQs complicates device reconfigure handling, so
simply get rid of it.
This removes the notion of an EvQ software index, since there is no EvQ
array in software any more.
Fixes: 58294ee65a ("net/sfc: support event queue")
Fixes: 9a75f75cb1 ("net/sfc: maintain management event queue")
Fixes: ce35b05c63 ("net/sfc: implement Rx queue setup release operations")
Fixes: 28944ac098 ("net/sfc: implement Rx queue start and stop operations")
Fixes: b1b7ad933b ("net/sfc: set up and release Tx queues")
Fixes: fed9aeb46c ("net/sfc: implement transmit path start / stop")
Fixes: 3b809c27b1 ("net/sfc: support link status change interrupt")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Next step towards getting rid of EvQ info entirely.
Fixes: c22d3c508e ("net/sfc: support parameter to choose performance profile")
Fixes: 3b809c27b1 ("net/sfc: support link status change interrupt")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The EvQ info array is a problem on device reconfigure, when the number of
Rx and Tx queues may change. This is a step towards getting rid of it.
Fixes: 58294ee65a ("net/sfc: support event queue")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The index of an event queue is computed from the index of the
corresponding transmit or receive queue, and depends on the total
number of receive queues. As a consequence, the index of an event
queue bound to a transmit queue changes if the total number of
receive queues is changed.
Fixes: 58294ee65a ("net/sfc: support event queue")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Make it clear that corresponding functions should be called on device
configure and close operations. No functional change.
Fixes: 06bc197796 ("net/sfc: interrupts support sufficient for event queue init")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If the Tx datapath does not support TSO, TSO was dropped on device
configure. It is incorrect to change the advertised offloads.
Fixes: 7a4d44a639 ("net/sfc: make TSO a datapath-dependent feature")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Datapath choice requires knowledge of NIC capabilities and, therefore,
should be done after probe, whereas NIC resource estimation needs
to know the chosen datapath (e.g. whether the Tx datapath is going to
use TSO).
Fixes: df1bfde4ff ("net/sfc: factor out libefx-based Rx datapath")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Putting an mbuf back to the mempool does not free chained segments.
Fixes: e0b063941e ("net/sfc: support scattered Rx DMA")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
MPLSoUDP & MPLSoGRE are not supported by the tunnel
filter due to limited HW resources; this patch
enables the MPLS tunnel filter by replacing the inner_mac
filter.
This configuration is set when adding MPLSoUDP
and MPLSoGRE filter rules, and it is invalidated
only by a NIC core reset.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
This patch adds an MPLS parsing function to support MPLS filtering.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
The first segment size must be at least 18 bytes; packets not respecting
this are silently not sent by the NIC but are counted as sent by the PMD.
The only way to notice this is to compile the PMD in debug mode.
Fixes: 6579c27c11 ("net/mlx5: remove gather loop on segments")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The driver can handle dynamic MTU changes without the application
needing to stop the port explicitly. However, there is currently no
check to prevent I/O from happening on a different thread while the
port is going through reset internally. This patch fixes the issue by
assigning the Rx/Tx burst functions to a dummy function and also
reconfiguring the Rx buffer size for each Rx queue based on the new
MTU value.
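A minimal sketch of the dummy-burst approach, under assumed names (the
actual qede symbols differ):

    #include <rte_common.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* Does no I/O; installed while the internal reset is in progress. */
    static uint16_t
    dummy_pkt_burst(void *queue __rte_unused,
                    struct rte_mbuf **pkts __rte_unused,
                    uint16_t nb_pkts __rte_unused)
    {
        return 0;
    }

    static void
    quiesce_for_mtu_change(struct rte_eth_dev *dev)
    {
        /* Any concurrent Rx/Tx thread now spins harmlessly on the
         * dummy function instead of touching half-reset queues. */
        dev->rx_pkt_burst = dummy_pkt_burst;
        dev->tx_pkt_burst = dummy_pkt_burst;
    }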
Fixes: 200645ac79 ("net/qede: set MTU")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
The newer SR-IOV PF drivers expect Rx/Tx queues to be created before
applying the RSS configuration. This patch addresses this requirement
by deferring the RSS configuration until the queues are created. Even
though the issue is only seen in an SR-IOV context, the change is
applied to the PF as well to keep the behavior consistent between VF
and PF.
Fixes: 7ab35bf6b9 ("net/qede: fix RSS")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Both UDP and TCP based RSS offload types are supported by the device.
This patch adds the UDP protocol types, which were missed in the
original patch.
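From the application side, the added types correspond to requesting
UDP RSS alongside TCP through the generic ethdev configuration; a
small illustrative example:

    #include <rte_ethdev.h>

    static void
    request_udp_rss(struct rte_eth_conf *conf)
    {
        conf->rxmode.mq_mode = ETH_MQ_RX_RSS;
        /* TCP types were already handled; this fix makes the UDP
         * types below take effect as well. */
        conf->rx_adv_conf.rss_conf.rss_hf =
            ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV6_TCP |
            ETH_RSS_NONFRAG_IPV4_UDP | ETH_RSS_NONFRAG_IPV6_UDP;
    }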
Fixes: 4c98f2768e ("net/qede: support RSS hash configuration")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add the i40e_tunnel_type enumeration to refine the consistent
tunnel filter; it will make it easy to add new tunnel types for
i40e.
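A rough sketch of such an enumeration (member names are illustrative;
the driver header has the authoritative list):

    enum i40e_tunnel_type {
        I40E_TUNNEL_TYPE_NONE = 0,
        I40E_TUNNEL_TYPE_VXLAN,
        I40E_TUNNEL_TYPE_NVGRE,
        I40E_TUNNEL_TYPE_MPLSoUDP,
        I40E_TUNNEL_TYPE_MPLSoGRE,
        I40E_TUNNEL_TYPE_MAX,
    };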
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Previously, only tunnel filters to the PF were supported.
This patch adds the i40e_dev_consistent_tunnel_filter_set
function so that the consistent filter API supports tunnel
filters to VFs.
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rename the tunnel filter function to a VXLAN filter function,
in preparation for other tunnel filter functions.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rework the tunnel filter functions to align with the new
command buffer for adding/removing cloud filters.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Remove the Rx burst size limit (32 packets) in vhost and make a best
effort to receive packets.
Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Remove the Tx burst size limit (32 packets) in vhost and make a best
effort to transmit packets.
Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add a wrapper function to remove the Tx burst size limit and
implement the "best effort to transmit the packets" policy, as
sketched below. The patch makes the ixgbe vector function behave
consistently with ixgbe_xmit_pkts_simple and ixgbe_xmit_pkts.
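A minimal sketch of the wrapper pattern, assuming a size-limited inner
function named xmit_fixed_burst_vec() with a 32-packet limit (names
are illustrative, not the exact ixgbe symbols):

    #include <rte_common.h>
    #include <rte_mbuf.h>

    #define VPMD_TX_BURST 32  /* inner function's per-call limit */

    /* Size-limited vector Tx function, provided elsewhere. */
    uint16_t xmit_fixed_burst_vec(void *txq, struct rte_mbuf **pkts,
                                  uint16_t nb_pkts);

    static uint16_t
    xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
                  uint16_t nb_pkts)
    {
        uint16_t nb_tx = 0;

        while (nb_pkts) {
            uint16_t ret, n;

            /* Hand at most 32 packets to the size-limited function. */
            n = (uint16_t)RTE_MIN(nb_pkts, (uint16_t)VPMD_TX_BURST);
            ret = xmit_fixed_burst_vec(tx_queue, &tx_pkts[nb_tx], n);
            nb_tx = (uint16_t)(nb_tx + ret);
            nb_pkts = (uint16_t)(nb_pkts - ret);
            /* Stop early once the Tx ring is full. */
            if (ret < n)
                break;
        }
        return nb_tx;
    }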
Cc: Helin Zhang <helin.zhang@intel.com>
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add a wrapper function to remove the Tx burst size limit. The patch
makes the i40e vector function make a best effort to transmit packets,
consistent with the behavior of i40e_xmit_pkts_simple and
i40e_xmit_pkts.
Cc: Helin Zhang <helin.zhang@intel.com>
Cc: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add a wrapper function to remove the Tx burst size limit. The patch
makes the fm10k vector function make a best effort to transmit
packets, consistent with the behavior of fm10k_xmit_pkts.
Cc: Jing Chen <jing.d.chen@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch includes slowpath configuration and fastpath changes
to support LRO and TSO. A bit of revamping is needed to make use
of the existing packet classification schemes in the Rx fastpath
and for SG element processing in Tx.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add limited support for ntuple filter and flow director configuration,
as illustrated below. The filtering is based on the 4-tuple: src-ip,
dst-ip, src-port and dst-port. Mask fields, tcp_flags, flex masks,
priority fields, Rx queue drop, etc. are not supported.
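For illustration, the kind of exact-match rule this exposes through
the generic filter API of the time (addresses, ports and mask usage
are illustrative):

    #include <netinet/in.h>
    #include <stdint.h>
    #include <rte_byteorder.h>
    #include <rte_eth_ctrl.h>
    #include <rte_ethdev.h>

    static int
    add_udp_4tuple_rule(uint8_t port_id, uint16_t rx_queue)
    {
        struct rte_eth_ntuple_filter f = {
            .flags = RTE_5TUPLE_FLAGS,
            .dst_ip = rte_cpu_to_be_32(0xc0a80001),  /* 192.168.0.1 */
            .dst_ip_mask = UINT32_MAX,               /* exact match */
            .src_ip = rte_cpu_to_be_32(0xc0a80002),  /* 192.168.0.2 */
            .src_ip_mask = UINT32_MAX,
            .dst_port = rte_cpu_to_be_16(4000),
            .dst_port_mask = UINT16_MAX,
            .src_port = rte_cpu_to_be_16(5000),
            .src_port_mask = UINT16_MAX,
            .proto = IPPROTO_UDP,
            .proto_mask = UINT8_MAX,
            .queue = rx_queue,
        };

        return rte_eth_dev_filter_ctrl(port_id, RTE_ETH_FILTER_NTUPLE,
                                       RTE_ETH_FILTER_ADD, &f);
    }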
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add base driver APIs to enable accelerated RFS (aRFS) mode and a
ramrod to configure RFS and ntuple filters.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Merge hw_stop and hw_reset into one function.
Prevent a race condition between MFW attentions and the PF stop
command during the unload flow, which causes an ASSERT.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
A step toward having multi-Txq support on the same queue-zone for VFs.
This change takes care of:
- VFs assume a single CID per queue, where queue X receives CID X.
Switch to a model similar to that of the PF, i.e., use different CIDs
for Rx/Tx and use a mapping to acquire/release them. Each VF will
currently have 32 CIDs available [for its possible 16 Rx and 16 Tx
queues].
- To retain the same interface for PFs/VFs when initializing queues,
the base driver would have to retain a unique number per queue that
would be communicated in some extended TLV [the current TLV interface
allows the PF to send only the queue-id]. The new TLV isn't part of
the current change, but the base driver now starts adding such unique
keys internally to queue_cids. This also forces us to start having
alloc/setup/free for L2 [we've refrained from doing so until now].
The limit would be no more than 64 queues per qzone [this could be
changed if needed, but hopefully no one needs that many queues].
- In IOV, add infrastructure for up to 64 qids per qzone, although
for the moment '0' is hard-coded for Rx and '1' for Tx [since the VF
still isn't communicating via the new TLV which index to associate
with a given queue in its queue-zone].
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Add support for the new interface with the Management FW for setting
max values of "soft" resources.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Add new image types - RECOVERY and PK (Public Key) towards
the second phase of NVRAM security support.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Optimize cache-line access in ecore_chain: re-arrange fields so that
those needed in the fastpath [mostly produce/consume indices and their
derivatives] are in the first cache line, and the rest are in the
second.
This is true for both PBL and NEXT_PTR kinds of chains.
Advancing a page in a SINGLE_PAGE chain would still require the second
cache line as well, but as far as we know only the SPQ uses it, so it
isn't considered fastpath.
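A minimal, self-contained sketch of the split, assuming 64-byte cache
lines (field names are illustrative, not the actual ecore_chain
layout):

    #include <stdint.h>

    struct chain_sketch {
        /* First cache line: fastpath produce/consume state. */
        void *p_prod_elem;       /* next element to produce into */
        void *p_cons_elem;       /* next element to consume */
        uint32_t prod_idx;
        uint32_t cons_idx;
        uint16_t elem_per_page;
        uint16_t usable_per_page;
        /* Second cache line: setup/slowpath fields. */
        uint32_t page_cnt;
        uint32_t size;
        void *pbl_virt;          /* PBL chains only */
        uint64_t pbl_phys;
    } __attribute__((aligned(64)));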
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
L2 handler changes:
This change removes the queue-id/qzone difference for Tx queues.
It does that mainly by:
a. No longer determining VF queues by the SBs they're using.
Instead, the ecore client needs to maintain those and choose the
values to be used by the VF when initializing it.
b. Eliminating the HW-CID array in the hw-function.
To do that, all the Rx/Tx functionality becomes handle-based: when a
queue is started the caller gets a (void *) handle, which it later
uses with ecore for the various queue-related operations [update,
close].
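A minimal, self-contained sketch of the opaque-handle pattern this
moves to (all names and the CID mapping are illustrative, not the real
ecore API):

    #include <stdlib.h>

    /* Opaque to the caller in the real API; only ecore sees inside. */
    struct queue_handle {
        unsigned int queue_id;
        unsigned int cid;        /* CID now hidden behind the handle */
    };

    /* Starting a queue hands back a handle instead of a raw CID. */
    static struct queue_handle *
    queue_start(unsigned int queue_id, unsigned int cid)
    {
        struct queue_handle *h = malloc(sizeof(*h));

        if (h == NULL)
            return NULL;
        h->queue_id = queue_id;
        h->cid = cid;            /* acquired from the Rx/Tx CID map */
        return h;
    }

    /* Subsequent operations take the handle, never the CID itself. */
    static void
    queue_close(struct queue_handle *h)
    {
        free(h);                 /* releases the CID in the real driver */
    }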
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>