Introduce the fail-safe poll mode driver initialization and enable its
build infrastructure.
This PMD lets applications benefit from true hot-plug support
without having to implement it themselves.
It intercepts and manages Ethernet device removal events issued by
slave PMDs and re-initializes them transparently when brought back.
It also allows defining a contingency to the removal of a device, by
designating a fail-over device that will take on transmitting operations
if the preferred device is removed.
Applications only see a fail-safe instance, without caring for
underlying activity ensuring their continued operations.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Make the rte_eth_dev_count() return the number of available devices even
after some are detached by the hotplug API or put in a deferred state.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This device state means that the device is managed externally, by
whichever party has set this state (PMD or application).
Note: this new device state is purely informational. The related device
structure and operators are still valid and can be used normally.
It is however made private by device management helpers within ethdev,
making the device invisible to applications.
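
A minimal sketch of the idea, not the ethdev implementation: the state
names, struct dev_slot and count_visible() below are hypothetical. It only
shows that a helper can skip externally managed slots while the slots
themselves stay valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device states, modelled on the description above. */
enum dev_state {
	DEV_UNUSED = 0, /* slot is free */
	DEV_ATTACHED,   /* visible to the application */
	DEV_DEFERRED,   /* managed externally, hidden from the application */
};

/* Hypothetical per-port slot; real device data would live here too. */
struct dev_slot {
	enum dev_state state;
};

/* Count only the devices an application is allowed to see. */
static size_t
count_visible(const struct dev_slot *slots, size_t n)
{
	size_t i, count = 0;

	assert(slots != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (slots[i].state == DEV_ATTACHED)
			count++;
	return count;
}
```
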
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
VF performance is limited by the kernel PCI extended tag setting.
Update the document to explain the known issue and the workaround.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The x550 family does not support ipv6-other flow as well as
ipv4-other flow, so add this limitation.
Fixes: 7d629cacedee ("net/ixgbe: enable IPv6 for consistent API")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
82599ES supports the drop action for SCTP packets, but the
configuration differs from that of TCP or UDP packets, so
some FDIR related code needs to be reworked to handle
drop action rules for SCTP packets.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
A SW workaround for GL_SWR_PRI_JOIN_MAP was previously added for X710
performance. As the new FW version 6.0 supports ADQ, the
value of GL_SWR_PRI_JOIN_MAP should be changed, otherwise the
ethertype filter will be impacted.
Fixes: 973273c7a4b7 ("i40e: workaround for X710 performance")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Remove checks of Linux kernel version
in order to support kernel with backported features.
The expected behavior with a kernel that doesn't support flower
and other bits is the following:
- flow validate can return successfully
- flow create using the same rule fails.
Using the "remote" feature without kernel flower does not fail silently.
The TAP instance is not initialized if the requested parameters cannot
be satisfied.
It has been tested on an old kernel without the required support:
PMD: Kernel refused TC filter rule creation (2): No such file or directory
PMD: tap0 failed to create implicit rules.
PMD: Can't set up remote feature: No such file of directory(2)
PMD: TAP Unable to initialize net_tap0
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
The drop action is not supported by signature match, so return an
error when trying to create a signature match flow with a drop action.
Fixes: a948d33bc05a ("net/ixgbe: enable signature match for consistent API")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The PF driver and VF driver communicate with each other through virtual
channel messages. When the VF sends a message to the PF to enable some
offload capability, the PF should respond whether it succeeded or not.
VIRTCHNL_OP_ENABLE_VLAN_STRIPPING is a newly added message that the
old PF driver doesn't support, so no response is received by the
DPDK VF. The VF then blocks on this message and cannot roll back.
This patch clears pending command on VF side when the waiting duration
expires to avoid blocking following communication.
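
The behavior can be sketched as follows. This is an illustration under
assumptions, not the i40e VF code: struct vf_chan, OP_NONE, vf_send() and
vf_wait_reply() are hypothetical names, and the poll count stands in for
the real waiting duration.

```c
#include <assert.h>
#include <stdbool.h>

#define OP_NONE 0 /* hypothetical marker: no command awaiting a reply */

/* Hypothetical VF-side channel state; not the driver's real structure. */
struct vf_chan {
	volatile bool reply_ready; /* set by the reply handler when the PF answers */
	int pending_op;            /* opcode awaiting a reply, or OP_NONE */
};

/* Refuse to send while an earlier command is still pending. */
static int
vf_send(struct vf_chan *ch, int op)
{
	assert(op != OP_NONE);
	if (ch->pending_op != OP_NONE)
		return -1;
	ch->reply_ready = false;
	ch->pending_op = op;
	return 0;
}

/* Poll for a reply; clear the pending command whether or not one arrives. */
static int
vf_wait_reply(struct vf_chan *ch, unsigned int max_polls)
{
	unsigned int i;
	int ret = -1;

	for (i = 0; i < max_polls; i++) {
		if (ch->reply_ready) {
			ret = 0;
			break;
		}
	}
	/* On timeout this is the step that unblocks later messages. */
	ch->pending_op = OP_NONE;
	return ret;
}
```

Without the final reset, a PF that never answers would leave pending_op
set and every later vf_send() would fail.
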
Fixes: 5f0b95d59a98 ("net/i40e: support VLAN stripping for VF")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Invalid queue id checking is only done for the PF when creating FDIR
rules. This patch adds the invalid queue id check for the VF.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
If the LSC flag was turned off at the last device start, the
enable flag is not cleared in HW.
This patch fixes it.
Fixes: c3cd3de0ab50 ("igb: enable Rx queue interrupts for PF")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
If the LSC flag was turned off at the last device start, the
enable flag is not cleared in HW.
This patch fixes it.
Fixes: f4668a33efe5 ("net/i40e: fix link status change interrupt")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
If the LSC flag was turned off at the last device start, the
enable flag is not cleared in HW.
This patch fixes it.
Fixes: 0eb609239efd ("ixgbe: enable Rx queue interrupts for PF and VF")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
On a host having 128B cacheline size, some devices insert 64B padding in
each completion entry to avoid partial cacheline write by HW. But, as the
padding is ahead of completion data, casting a completion entry to
compressed mini-completions must start from the middle of the completion.
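
A rough sketch of the offset calculation, under stated assumptions: the
64B completion data size follows from the text, while struct mini_cqe and
both helpers are hypothetical and do not mirror the mlx5 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CQE_DATA_SIZE 64u /* assumed size of the completion data itself */

/* Hypothetical 8-byte mini-completion; the real layout is device specific. */
struct mini_cqe {
	uint32_t byte_cnt;
	uint32_t checksum;
};

/* Offset of the completion data: 0 for 64B entries, 64 for padded 128B ones. */
static size_t
cqe_data_offset(size_t cqe_size)
{
	assert(cqe_size == 64 || cqe_size == 128);
	return cqe_size - CQE_DATA_SIZE;
}

/* Read the idx-th mini-completion, skipping any leading padding. */
static struct mini_cqe
read_mini_cqe(const uint8_t *cqe, size_t cqe_size, unsigned int idx)
{
	struct mini_cqe m;

	assert(idx < CQE_DATA_SIZE / sizeof(m));
	memcpy(&m, cqe + cqe_data_offset(cqe_size) + idx * sizeof(m), sizeof(m));
	return m;
}
```
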
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
cq_limit field is added in cn88xx-pass2 and subsequent
versions. Reflect the change in the sq_config structure.
This change is backward compatible as the old pass versions
ignore this field.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tx CRC size is not counted by the VSI's stats register, so the driver
does not need to exclude it.
Fixes: 98abce237ba7 ("net/i40e: fix VF statistics")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
If MAC stats refresh is arranged to be done by periodic DMA,
the first DMA transaction is unlikely to occur right at
port start. If the user tries to get stats right after port
start and before the transaction occurs, bogus figures will
be collected. A one-off stats upload on port start fixes this.
Fixes: 1caab2f1e684 ("net/sfc: add basic statistics")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Some logs are missing the newline character \n.
The logs using only one line can be checked with this command:
git grep 'RTE_LOG(.*".*[^n]"' drivers/net/ring/
Fixes: 61934c0956d4 ("ring: convert to use of PMD_REGISTER_DRIVER and fix linking")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Some logs are missing the newline character \n.
The logs using only one line can be checked with this command:
git grep 'RTE_LOG(.*".*[^n]"' drivers/net/tap/
Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD")
Fixes: 268483dc2086 ("net/tap: add preliminary support for flow API")
Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Fixes: bf7b7f437b49 ("net/tap: create netdevice during probing")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
vhost-user protocol is common to many virtio devices, such as
virtio_net/virtio_scsi/virtio_blk. Since DPDK vhost library
removed the NET specific data structures, the vhost library
is common to other virtio devices, such as virtio-scsi.
Here we introduce a simple memory based block device that
can be presented to a guest VM through a vhost-user-scsi-pci
controller. Similar to vhost-net, the sample application
will process the I/Os sent via virt rings.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Exception handling is executed in the normal path and it will cause
vhost-user init failure.
Fixes: d6983a70e259 ("vhost: check return of pthread calls")
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Revert "devargs: make device types generic"
This commit broke the rte_devargs API by changing the meaning of
the rte_devtype enum.
Restore the previous API, unit tests and function calls.
Introduce parallel enum that acts as translation between previous API
and current structures.
Restoring the previous API means that -w and -b are not usable anymore
with any bus having implemented the "parse" operation. Only PCI devices
can be used with -w and -b, virtual devices are declared using vdev.
This (partially) reverts commit bd279a79366f50a4893fb84db91bbf64b56f9fb1.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The prior scan should link the relevant rte_devargs to the newly
allocated rte_device. As such, it is useless to pass device arguments to
the plug callback. Those arguments are available within the devargs
field of the rte_device structure.
Fixes: 7c8810f43f6e ("bus: introduce device plug/unplug")
Fixes: 00e62aae69c0 ("bus/pci: implement plug/unplug operations")
Fixes: a3ee360f4440 ("eal: add hotplug add/remove device")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The device handle is already known and does not have to be inferred from
the PCI address. The relevant helpers are already available within the
PCI bus to avoid searching for a handle already known.
Additionally, rte_memcpy.h was erroneously included.
Fixes: 00e62aae69c0 ("bus/pci: implement plug/unplug operations")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The field is set but never reset on error.
This marks the device as being attached while it is not, and forbids
further attempts to hotplug it.
Fixes: 7917d5f5ea46 ("pci: initialize generic driver pointer")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
When an application requests the use of a PCI device, it can currently
interchangeably use either the longform DomBDF format (0000:00:00.0) or
the shorter BDF format (00:00.0).
When a device is inserted via the hotplug API, it must first be scanned
and then will be identified by its name using `find_device`. The name of
the device must match the name given by the user to be found and then
probed.
A new function sets the expected name for a scanned PCI device. It was
previously generated from parsing the PCI address. This canonical name
is superseded when an rte_devargs exists describing the device. In such
a case, the device takes the given name found within the rte_devargs.
As the rte_devargs is linked to the rte_pci_device during scanning, it
can be avoided during the probe. Additionally, this fixes the issue of
the rte_devargs lookup not being done within rte_pci_probe_one.
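
The naming rule can be sketched as below. This is an illustration only:
pci_canonical_name() and pci_set_name() are hypothetical helpers, not the
function added by this commit, and the devargs name is passed as a plain
string for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Normalize "BB:DD.F" or "DDDD:BB:DD.F" into the long DomBDF form. */
static int
pci_canonical_name(const char *in, char *out, size_t len)
{
	unsigned int dom = 0, bus = 0, dev = 0, fn = 0;
	char tail;
	int n;

	assert(in != NULL && out != NULL);
	/* Long form first; %c must not match, so exactly 4 fields is success. */
	if (sscanf(in, "%x:%x:%x.%x%c", &dom, &bus, &dev, &fn, &tail) != 4) {
		dom = 0;
		if (sscanf(in, "%x:%x.%x%c", &bus, &dev, &fn, &tail) != 3)
			return -1;
	}
	if (dom > 0xffff || bus > 0xff || dev > 0x1f || fn > 0x7)
		return -1;
	n = snprintf(out, len, "%04x:%02x:%02x.%x", dom, bus, dev, fn);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* A name given in the device arguments supersedes the canonical one. */
static int
pci_set_name(const char *addr, const char *devargs_name, char *out, size_t len)
{
	size_t n;

	if (devargs_name == NULL)
		return pci_canonical_name(addr, out, len);
	n = strlen(devargs_name);
	if (n >= len)
		return -1;
	memcpy(out, devargs_name, n + 1);
	return 0;
}
```

Keeping the user-given spelling is what lets a later lookup by name find
the device regardless of which of the two formats was used.
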
Fixes: beec692c5157 ("eal: add name field to generic device")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The hotplug API requires a few properties that were not previously
explicitly enforced:
- Idempotency: two consecutive scans should result in the same state.
- Upon returning, internal devices are now allocated and available
  through the new `find_device` operator, meaning that they should be
  identifiable.
The current rte_eal_hotplug_add implementation identifies devices by
their names, as it is readily available and easy to define.
The device name must be passed to the internal rte_device handle in
order to be available during scan, when it is then assigned to the
device. The current way of passing down this information from the device
declaration is through the global rte_devargs list.
Furthermore, the rte_device cannot take a bus-specific generated name,
as it is then not identifiable by the `find_device` operator. The device
must take the user-defined name. Ideally, an rte_device name should not
change during its existence.
This commit generates a new rte_devargs associated with the plugged
device and inserts it in the global rte_devargs list. It consequently
releases it upon device removal.
Fixes: a3ee360f4440 ("eal: add hotplug add/remove device")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Some buses will operate either in whitelist or blacklist mode.
This mode is currently passed down by the rte_eal_devargs_add function
with the devtype argument.
When inserting devices using the hotplug API, the implicit assumption is
that this device is being whitelisted, meaning that it is explicitly
requested by the application to be used. This can conflict with the
initial bus configuration.
While the rte_eal_devargs_add API will be deprecated soon, it cannot
be modified at the moment to accommodate this situation.
As such, this new experimental API offers a bare interface for inserting
rte_devargs without directly manipulating the global rte_devargs list.
This new function expects a fully-formed rte_devargs, previously parsed
and allocated.
It does not check whether the new rte_devargs is compatible with the
current bus configuration, but will replace any existing one for the same
device, allowing the hotplug operation to proceed. For example, a
previously blacklisted device can be redefined as being whitelisted.
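
The replace-on-insert behavior might look like the sketch below. It is
only a model: struct devargs, struct devargs_list and devargs_insert() are
hypothetical and use a fixed array instead of the global rte_devargs list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVARGS 8 /* arbitrary capacity for the sketch */

enum policy { POLICY_WHITELISTED, POLICY_BLACKLISTED };

/* Hypothetical, simplified device arguments entry. */
struct devargs {
	char name[32];
	enum policy policy;
};

struct devargs_list {
	struct devargs e[MAX_DEVARGS];
	size_t n;
};

/* Insert a fully formed entry, replacing any entry for the same device. */
static int
devargs_insert(struct devargs_list *l, const struct devargs *da)
{
	size_t i;

	assert(l != NULL && da != NULL);
	for (i = 0; i < l->n; i++) {
		if (strcmp(l->e[i].name, da->name) == 0) {
			l->e[i] = *da; /* no compatibility check, as described */
			return 0;
		}
	}
	if (l->n == MAX_DEVARGS)
		return -1;
	l->e[l->n++] = *da;
	return 0;
}
```
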
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Hotplug support introduces the possibility of removing devices from the
system. Allocated resources must be freed.
Extend the rte_devargs API to allow freeing allocated resources.
This API is experimental and bound to change. It is currently designed
as the symmetrical counterpart of rte_eal_devargs_add(), but the latter
will evolve shortly anyway.
Its DEVTYPE parameter is currently only used to specify scan policies,
and those will evolve in the next release. This evolution should
rationalize the rte_devargs API.
As such, the proposed API here is not the most convenient, but is
tailored to follow the current design and integrate easily with its main
use within rte_eal_hotplug_* functions.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This method must be implemented to allow using a unified, generic API to
hotplug devices, including virtual ones.
VDEV devices actually exist unattached after performing a scan on the
rte_devargs list. As such it makes sense to be able to perform a device
hotplug afterward.
Finally, missing this generic interface forces the EAL to be dependent
on vdev-specific API, which hinders the plan of moving the vdev bus to
drivers/bus.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The public API (struct rte_metric_name) includes the NULL terminator
byte in RTE_METRICS_MAX_NAME_LENGTH but the library itself internally
excludes it. This makes it possible for an application to receive an
unterminated name string. Fix by enforcing the NULL termination of all
name strings to the length that the public API expects.
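
A minimal sketch of the enforced termination, assuming a 64-byte name
field: METRICS_MAX_NAME_LENGTH, struct metric_name and metric_set_name()
are stand-ins for illustration, not the library's definitions.

```c
#include <assert.h>
#include <string.h>

/* Stands in for RTE_METRICS_MAX_NAME_LENGTH; the value 64 is an assumption. */
#define METRICS_MAX_NAME_LENGTH 64

struct metric_name {
	char name[METRICS_MAX_NAME_LENGTH]; /* size includes the terminator */
};

/* Copy at most size - 1 bytes and always terminate the result. */
static void
metric_set_name(struct metric_name *m, const char *src)
{
	assert(m != NULL && src != NULL);
	strncpy(m->name, src, sizeof(m->name) - 1);
	m->name[sizeof(m->name) - 1] = '\0';
}
```
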
Fixes: 349950ddb9c5 ("metrics: add information metrics library")
Cc: stable@dpdk.org
Signed-off-by: Remy Horton <remy.horton@intel.com>
IANA assigns destination port 4789 to VXLAN in the Service
Name and Transport Protocol Port Number Registry. This is mentioned in
RFC 7348.
Fixes: f295a00a2b44 ("mbuf: add definitions of unified packet types")
Cc: stable@dpdk.org
Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This commit allows the -S (capital 's') option to be used to indicate
a corelist for Services. This is a "nice to have" patch, and does
not modify any of the service core functionality.
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
This is not required any more for A72 based dpaa2 systems.
(A57 based platform is not in production anymore)
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Introduce the DEV_TX_OFFLOAD_MT_LOCKFREE TX capability flag.
If a PMD advertises DEV_TX_OFFLOAD_MT_LOCKFREE, multiple threads
can invoke rte_eth_tx_burst() concurrently on the same TX queue without
a SW lock. This PMD feature is found in the OCTEON family of NPUs and is
useful in the following use cases:
1) Remove explicit spinlock in some applications where lcores
to TX queues are not mapped 1:1.
example: OVS has such an instance
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L299
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1859
See the usage of the tx_lock spinlock.
2) In the eventdev use case, avoid dedicating a separate TX core for
transmitting, which enables more scaling as all workers can
send the packets.
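
How an application might act on the capability, as a sketch only:
TX_OFFLOAD_MT_LOCKFREE is a local stand-in whose bit value is an
assumption, and tx_needs_lock() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for DEV_TX_OFFLOAD_MT_LOCKFREE; the bit value is an assumption. */
#define TX_OFFLOAD_MT_LOCKFREE (UINT64_C(1) << 14)

/*
 * Decide whether the application must serialize access to a TX queue.
 * tx_offload_capa is the capability mask reported by the PMD.
 */
static bool
tx_needs_lock(uint64_t tx_offload_capa, unsigned int nb_lcores,
	      unsigned int nb_txq)
{
	assert(nb_txq > 0);
	if (tx_offload_capa & TX_OFFLOAD_MT_LOCKFREE)
		return false; /* the PMD handles concurrent senders itself */
	/* Otherwise a lock is needed once lcores have to share queues. */
	return nb_lcores > nb_txq;
}
```
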
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
This commit shows how easy it is to enable a specific
DPDK component with a service callback, in order to get
CPU cycles for it.
The beauty of this method is that the service is unaware
of how much CPU time it is getting - the application can
decide how to split and slice cores and map them to the
registered services.
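
The split between services and cores can be pictured with the sketch
below. All names are hypothetical (struct service, count_work(),
service_core_run_once()); the real service API is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*service_cb)(void *arg);

/* Hypothetical registered service: a name, a callback and its argument. */
struct service {
	const char *name;
	service_cb cb;
	void *arg;
};

/* Example callback: the service does one unit of work per invocation. */
static int
count_work(void *arg)
{
	unsigned int *calls = arg;

	(*calls)++;
	return 0;
}

/* One iteration of a service core: run every service mapped to it. */
static void
service_core_run_once(const struct service *svcs, size_t n, uint64_t mapped)
{
	size_t i;

	assert(n <= 64);
	for (i = 0; i < n; i++)
		if (mapped & (UINT64_C(1) << i))
			svcs[i].cb(svcs[i].arg);
}
```

The callback never sees the mapping mask, so the application alone
decides how often each service gets to run.
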
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add a bunch of unit tests, to ensure that the service
core functions are operating as expected.
As part of these tests a dummy service is registered, which
uses the CPU tick counter to identify whether a service callback
has been invoked. This makes it possible to check that the
functions to start and stop service lcores actually have an
effect.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add logic for parsing a coremask from EAL, which allows
the application to be unaware of the cores being taken from
its coremask.
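
Coremask parsing in general can be sketched as below. This is not the EAL
parser: coremask_to_list() is a hypothetical helper limited to 64 cores.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Convert a hexadecimal coremask such as "0x6" into a list of core ids.
 * Returns the number of cores found, or -1 on a malformed mask or when
 * the output array is too small.
 */
static int
coremask_to_list(const char *hex, unsigned int *cores, size_t max)
{
	unsigned long long mask;
	unsigned int i;
	char *end;
	int n = 0;

	assert(hex != NULL && cores != NULL);
	errno = 0;
	mask = strtoull(hex, &end, 16);
	if (errno != 0 || end == hex || *end != '\0')
		return -1;
	for (i = 0; i < 64; i++) {
		if (!(mask & (1ULL << i)))
			continue;
		if ((size_t)n >= max)
			return -1;
		cores[n++] = i;
	}
	return n;
}
```
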
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This commit shows the changes required in rte_eal_init()
to transparently launch the service threads. The threads
are launched into the service worker functions here because
after rte_eal_init() the application is not guaranteed to
call any other DPDK API.
As the registration of services happens at initialization
time, the services that require CPU time are already available
when we reach the end of rte_eal_init().
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add header files, update .map files with new service
functions, and add the service header to the doxygen
for building.
This service header API allows DPDK to use services as
a concept of something that requires CPU cycles. An example
is a PMD that runs in software to schedule events, where a
hardware version exists that does not require a CPU.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>