Change return value of the callbacks from void to int. Make
implementations across all drivers return negative errno
values in case of error conditions.
Both callbacks are updated together because a large number of
drivers assign the same function to both callbacks.
Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Change rte_eth_xstats_reset() return value from void to int and
return negative errno values in case of error conditions.
Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_eth_promiscuous_enable()/rte_eth_promiscuous_disable() return
value was changed from void to int, so modify usage of these
functions across lib/librte_kni according to new return type.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Since driver callbacks return status code now, there is no necessity
to enable or disable promiscuous mode once again if it is already
successfully enabled or disabled.
Configuration restore at startup tries to ensure that configured
promiscuous mode is applied and start will return error if it fails.
Also it avoids theoretical cases when already configured promiscuous
mode is applied once again and fails. In this cases it is unclear
which value should be reported on get (configured or opposite).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Enabling/disabling of promiscuous mode is not always successful and
it should be taken into account to be able to handle it properly.
When correct return status is unclear from driver code, -EAGAIN is used.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Change rte_eth_promiscuous_enable()/rte_eth_promiscuous_disable()
return value from void to int and return negative errno values
in case of error conditions.
Modify usage of these functions across the ethdev according
to new return type.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
We need to close the old slave request fd if any first
before taking the new one.
Fixes: 275c3f9447 ("vhost: support slave requests channel")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds an operation callback which gets called every time
the library is waking up the guest trough an eventfd_write() call.
This can be used by 3rd party application, like OVS, to track the
number of times interrupts where generated. This might be of
interest to find out system-call were called in the fast path.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Change eth_dev_infos_get_t return value from void to int.
Make eth_dev_infos_get_t implementations across all drivers to return
negative errno values if case of error conditions.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_eth_dev_info_get() return value was changed from void to int,
so this patch modify rte_eth_dev_info_get() usage across
pdump component according to its new return type.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_eth_dev_info_get() return value was changed from void to
int, so this patch modify rte_eth_dev_info_get() usage across
latency component according to its new return type.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Change rte_eth_dev_info_get() return value from void to int and return
negative errno values in case of error conditions.
Modify rte_eth_dev_info_get() usage across the ethdev according
to new return type.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Primary process is responsible to initialize the data struct of each
crypto devices.
Secondary process should not override this data during the
initialization.
Fixes: d11b0f30df ("cryptodev: introduce API and framework for crypto devices")
Cc: stable@dpdk.org
Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
HFN can be given as a per packet value also.
As we do not have IV in case of PDCP, and HFN is
used to generate IV. IV field can be used to get the
per packet HFN while enq/deq
If hfn_ovrd field in pdcp_xform is set,
application is expected to set the per packet HFN
in place of IV. Driver will extract the HFN and perform
operations accordingly.
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Replace /**< with /** for multiline doxygen comments.
Fixes: c261d1431b ("security: introduce security API and framework")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Restrict this header inclusion to its real users.
Fixes: 028669bc9f ("eal: hide shared memory config")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Security Parameters Index (SPI) should be set with network endian
values.
While 0xffffffff == htonl(0xffffffff), this missing annotation is
caught by sparse when compiling ovs (dpdk-latest branch).
Fixes: d4b684f719 ("net: add ESP header to generic flow steering")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
There is no RTE_FDIR_DISABLE. The right name is RTE_FDIR_MODE_NONE.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
ARM is supporting maximum 4 hugepage sizes (64K, 2M, 32M
and 1G) when granule is 4KB since very long and DPDK
support maximum 3 hugepage sizes.
With all 4 hugepage sizes enabled, applications and some
stacks like VPP which are working over DPDK and using
"in-memory" eal option, or using separate mount points
on ARM based platform, fails at huge page initialization,
reporting error messages from eal:
EAL: FATAL: Cannot get hugepage information.
EAL: Cannot get hugepage information.
EAL: Error - exiting with code: 1
This issue is originated from Linux 5.0
(a21b0b78eaf7 "arm64: hugetlb: Register hugepages during arch init")
where kernel is by default creating directories for each supported
hugepage size in /sys/kernel/mm/hugepages/
On earlier Stable Kernel LTR's, the directories visible in
/sys/kernel/mm/hugepages/ were dependent upon what hugepage
sizes are configured at boot time.
This change increases the maximum supported hugepage sizes
to 4 for ARM based platforms.
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Sort the experimental symbols per release to make it easier/quicker to
check for how long we have them.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This function has never been used outside of this code unit.
Mark it static and remove it from the eal internal header.
Fixes: 9e29251b2a ("eal: thread affinity API")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
When using --no-huge mode, dynamic allocation is not supported.
Because of this limitation, the option --legacy-mem is implied
and -m may be needed to specify the amount of memory to allocate.
Otherwise the default amount MEMSIZE_IF_NO_HUGE_PAGE will be allocated.
The option --socket-mem can also be used with --legacy-mem
when hugepages are supported.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Left-shift of an integer constant is represented as 'int' type, but a left
shift of 1 by 31 bits in 'int' is undefined. Use the U suffix to force
a representation as unsigned.
Caught while running with ubsan under gcc.
Fixes: dc276b5780 ("acl: new library")
Cc: stable@dpdk.org
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The ctrl thread cpu affinity setting has been broken when using --lcores.
Using -l/-c options makes each lcore associated to a physical cpu in a 1:1
fashion.
On the contrary, when using --lcores, each lcore cpu affinity can be set
to a list of any online cpu on the system.
To handle both cases, each lcore cpu affinity is considered and removed
from the process startup cpu affinity.
Introduced macros to manipulate dpdk cpu sets in both Linux and FreeBSD.
Examples on a 8 cores Linux system:
$ cd /sys/fs/cgroup/cpuset/
$ mkdir dpdk
$ cd dpdk
$ echo 4-7 > cpuset.cpus
$ echo 0 > cpuset.mems
$ echo $$ > tasks
Before the fix:
$ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \
--no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048
8427 cpu_list=4-5,7 testpmd
8428 cpu_list=4-6 eal-intr-thread
8429 cpu_list=4-6 rte_mp_handle
8430 cpu_list=4-5,7 lcore-slave-7
$ taskset -c 7 \
./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \
--no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Failed to create thread for interrupt handling
EAL: FATAL: Cannot init interrupt-handling thread
EAL: Cannot init interrupt-handling thread
PANIC in main():
Cannot init EAL
After the fix:
$ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \
--no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048
15214 cpu_list=4-5,7 testpmd
15215 cpu_list=6 eal-intr-thread
15216 cpu_list=6 rte_mp_handle
15217 cpu_list=4-5,7 lcore-slave-7
$ taskset -c 7 \
./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \
--no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048
15297 cpu_list=4-5,7 testpmd
15298 cpu_list=4-5,7 eal-intr-thread
15299 cpu_list=4-5,7 rte_mp_handle
15300 cpu_list=4-5,7 lcore-slave-7
Bugzilla ID: 322
Fixes: c3568ea376 ("eal: restrict control threads to startup CPU affinity")
Cc: stable@dpdk.org
Reported-by: Johan Källström <johan.kallstrom@ericsson.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
When IOMMU is not available, /sys/kernel/iommu_groups will not be
populated. This is happening since at least 3.6 when VFIO support
was added. If the directory is empty, EAL should not pick IOVA as
VA as the default IOVA mode.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The Distributor autotest can lock if ran enough times. Worker and
distributor threads get into a livelock situation waiting on each
other.
To repeat:
`while sudo sh -c "echo 'distributor_autotest' |
./build/app/test/dpdk-test"; do :; done`
The root cause is where we are flushing on exit, and do not wait for
all worker packets to be returned before exiting.
Add a delay on flush so that all worker packets are returned before
completing the flush.
Bugzilla ID: 316
Fixes: 775003ad2f ("distributor: add new burst-capable library")
Cc: stable@dpdk.org
Reported-by: Michael Santana <msantana@redhat.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Tested-by: Michael Santana <msantana@redhat.com>
This was missed when promoting this API to stable.
Fixes: 7a0ac7cdb4 ("service: promote experimental functions to stable")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Sort the experimental symbols per release to make it easier/quicker to
check for how long we have them.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Michael Santana <msantana@redhat.com>
This reverts commit debacba029.
Reverting this patch as it currently breaks the initialization of
telemetry, more investigation is ongoing to fix the issue for the
printed error message for unrecognized argument.
Fixes: debacba029 ("eal: fix parsing option --telemetry")
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Clarify the corner case with incompressible data
whereby the output can actually be greater than the
uncompressed data.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
When using IOVA as VA mode, there is no need to map segments
page by page. This normally isn't a problem, but it becomes one
when attempting to use DPDK in no-huge mode, where VFIO subsystem
simply runs out of space to store mappings.
Fix this for x86 by triggering different callbacks based on whether
IOVA as VA mode is enabled.
Fixes: 73a6390859 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Andrius Sirvys <andrius.sirvys@intel.com>
rte_eth_dev_info_get() returns void and caller does know if the function
does its job or not. Changing of the return value to int would be
API/ABI breakage which requires deprecation process and cannot be
backported to stable branches. For now, make sure that device info is
initialized even in the case of invalid port ID.
Fixes: a30268e9a2 ("ethdev: reset whole dev info structure before filling")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The current ether_unformat_addr code was based off of
BSD ether_aton. That version changed what was allowed
by the cmdline ether address parser.
For example, it allows dropping leading zeros.
Change the code to be more restrictive and only allow the fully
expanded standard formats.
Bugzilla ID: 324
Fixes: 596d31092d ("net: add function to convert string to ethernet address")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Layer 2 length must be updated after the prepend to mbuf to keep
the length right to be used by other Tx offloads.
If the packet has tunnel encapsulation, outer_l2_len should be
updated. Otherwise l2_len should be updated.
Fixes: c974021a59 ("ether: add soft vlan encap/decap")
Cc: stable@dpdk.org
Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add new ack interrupt API to avoid using
VFIO_IRQ_SET_ACTION_TRIGGER(rte_intr_enable()) for
acking interrupt purpose for VFIO based interrupt handlers.
This implementation is specific to Linux.
Using rte_intr_enable() for acking interrupt has below issues
* Time consuming to do for every interrupt received as it will
free_irq() followed by request_irq() and all other initializations
* A race condition because of a window between free_irq() and
request_irq() with packet reception still on and device still
enabled and would throw warning messages like below.
[158764.159833] do_IRQ: 9.34 No irq handler for vector
In this patch, rte_intr_ack() is a no-op for VFIO_MSIX/VFIO_MSI interrupts
as they are edge triggered and kernel would not mask the interrupt before
delivering the event to userspace and we don't need to ack.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Shahed Shaikh <shshaikh@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
This reverts commit 89aac60e0b.
"vfio: fix interrupts race condition"
The above mentioned commit moves the interrupt's eventfd setup
to probe time but only enables one interrupt for all types of
interrupt handles i.e VFIO_MSI, VFIO_LEGACY, VFIO_MSIX, UIO.
It works fine with default case but breaks below cases specifically
for MSIX based interrupt handles.
* Applications like l3fwd-power that request rxq interrupts
while ethdev setup.
* Drivers that need > 1 MSIx interrupts to be configured for
functionality to work.
VFIO PCI for MSIx expects all the possible vectors to be setup up
when using VFIO_IRQ_SET_ACTION_TRIGGER so that they can be
allocated from kernel pci subsystem. Only way to increase the number
of vectors later is first free all by using VFIO_IRQ_SET_DATA_NONE
with action trigger and then enable new vector count.
Above commit changes the behavior of rte_intr_[enable|disable] to
only mask and unmask unlike earlier behavior and thereby
breaking above two scenarios.
Fixes: 89aac60e0b ("vfio: fix interrupts race condition")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Shahed Shaikh <shshaikh@marvell.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Added telemetry to EAL long options so that when
--telemetry is passed as an EAL arg that there is
no unrecognized argument error message printed.
Fixes: 8877ac688b ("telemetry: introduce infrastructure")
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Tested-by: John OLoughlin <john.oloughlin@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
When bus layer reports the preferred mode as RTE_IOVA_DC then
select the RTE_IOVA_VA mode:
- All drivers work in RTE_IOVA_VA mode, irrespective of physical
address availability.
- By default, a mempool asks for IOVA-contiguous memory using
RTE_MEMZONE_IOVA_CONTIG. This is slow in RTE_IOVA_PA mode and it
may affect the application boot time.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA which
was intended to mean "driver only supports VA" but had been understood
as "driver supports both PA and VA" by most net drivers and used to let
dpdk processes to run as non root (which do not have access to physical
addresses on recent kernels).
The check on physical addresses actually closed the gap for those
drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this
flag can retain its intended meaning.
Document explicitly its meaning.
We can check that a driver requirement wrt to IOVA mode is fulfilled
before trying to probe a device.
Finally, document the heuristic used to select the IOVA mode and hope
that we won't break it again.
Fixes: 703458e19c ("bus/pci: consider only usable devices for IOVA mode")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>