This commit addresses various issues that may lead to undefined behavior
when configuring Rx interrupts.
While failure to create a Rx queue completion channel in rxq_setup()
prevents that queue from being created, existing queues still have theirs.
Since the error handler disables dev_conf.intr_conf.rxq as well, subsequent
calls to rxq_setup() create Rx queues without interrupts. This leads to a
scenario where not all Rx queues support interrupts; missing checks on the
presence of completion channels may crash the application.
Considering that the PMD is not supposed to disable user-provided
configuration parameters (dev_conf.intr_conf.rxq), and that these can
change for subsequent rxq_setup() calls anyway, properly supporting a mixed
mode where not all Rx queues have interrupts enabled is a better approach.
To do so with a minimum set of changes, priv_intr_efd_enable() and
priv_create_intr_vec() are first refactored as a single
priv_rx_intr_vec_enable() function (same for their "disable" counterparts).
Since they had to be used together, there was no point in keeping them
separate.
Remaining changes:
- Always clean up before reconfiguring interrupts to avoid memory leaks.
- Always clean up when closing the device.
- Use malloc()/free() instead of their rte_*() counterparts since there is
no need to store the vector in huge pages-backed memory.
- Allow more Rx queues than the size of the event file descriptor array as
long as Rx interrupts are not requested on all of them.
- Properly clean up interrupt handle when disabling Rx interrupts (nb_efd
and intr_vec reset to 0).
- Check completion channel presence while toggling Rx interrupts on a given
queue.
Fixes: 9f05a4b818 ("net/mlx4: support user space Rx interrupt event")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
Several Ethernet device structures are allocated on top of a common PCI
device for mlx4 adapters with multiple ports. These inherit a common
interrupt handle from their parent PCI device, which prevents Rx interrupts
from working properly on all ports as their configuration is overwritten.
Use a local interrupt handle to address this issue.
Fixes: 9f05a4b818 ("net/mlx4: support user space Rx interrupt event")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
With this patch, it is possible to enable or disable the isolate
feature anytime, even immediately after a probe while the tap has not
been configured yet. It will do its job as soon as the netdevice gets
created.
A specific implicit flow rule is created with the lowest priority (all
other flow rules will be evaluated before), at the end of the list. If
isolated mode is enabled, the associated action will be to drop the
packet. Otherwise, the action would be passthrough.
In case of a remote netdevice, implicit rules on it will be removed in
isolated mode, to ensure only actual flow rules redirect packets to the
tap.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Drop flows being created when the port is stop should not access to the
drop table hash queues as it is invalid.
Fixes: 028761059a ("net/mlx5: use an RSS drop queue")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Macro for alignment is defined in the common library.
Use macro from the common library in own macro definition.
Signed-off-by: Matej Vido <vido@cesnet.cz>
Hardware PTYPE in Rx desc will be parsed to fill mbuf's packet_type.
Signed-off-by: Ray Kinsella <ray.kinsella@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The current drop action is implemented as a queue tail drop,
requiring to instantiate multiple WQs to maintain high drop rate.
This commit, implements the drop action in hardware classifier.
This enables to reduce the amount of contexts needed for the drop,
without affecting the drop rate.
Signed-off-by: Shachar Beiser <shacharbe@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In some environments, the PCI domain can be larger than 16 bits.
For example, a PCI device passed through in Azure gets a synthetic domain
id which is internally generated based on GUID. The PCI standard does
not restrict domain to be 16 bits.
This change breaks ABI for API's that expose PCI address structure.
The printf format for PCI remains unchanged, so that on most
systems (with only 16 bit domain) the output format is unchanged
and is 4 characters wide. For example: 0000:00:01.0
Only on sysetms with higher bits will the domain take up more
space; example: 12000:00:01.0
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Device name resides in two different locations, in rte_device->name and
in ethernet device private data.
For now, the copy in the ethernet device private data is required for
multi process support, the name is the how secondary process finds about
primary process device.
But for drivers there is no reason to use the copy in the ethernet
device private data.
This patch updates PMDs to use only rte_device->name.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since SSE4 is now part of minimum requirements for DPDK on x86, we no
longer need this fallback code.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Since SSE4 is now part of the minimum requirements for DPDK, we no longer
need these checks.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Since SSE4 is now minimum requirement for x86 platforms we can replace the
check for SSE4 with a check for x86
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Replaced usage of %a0 in inline assembly with [%x0]
Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
At some places, the log2() function is used despite this function
works on float. This introduces a dependency to the math lib but
most of the time it is not required because we want an integer log2.
Add a new helper to do this job and fix nfp driver.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Change the rte_eth_dev_callback_process function to return int,
and add a void *ret_param parameter.
The new parameter is used by ixgbe and i40e instead of abusing
the user data of the callback.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Socket id parsed from the user was checked
if it was in the range of available sockets.
This check is unnecessary, as the socket specified
might not have memory anyway, so it will fail
at memory allocation.
Therefore, the best solution is to remove this check.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Zero the whole memory zone instead of the first few bytes.
Fixes: c1f86306a0 ("virtio: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch implements the ops rx_queue_count for vhost PMD by adding
a helper function rte_vhost_rx_queue_count in vhost lib.
The ops rx_queue_count gets vhost RX queue avail count and helps to
understand the queue fill level.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Isolated mode can be requested by applications on individual ports to avoid
ingress traffic outside of the flow rules they define.
Besides making ingress more deterministic, it allows PMDs to safely reuse
resources otherwise assigned to handle the remaining traffic, such as
global RSS configuration settings, VLAN filters, MAC address entries,
legacy filter API rules and so on in order to expand the set of possible
flow rule types.
To minimize code complexity, PMDs implementing this mode may provide
partial (or even no) support for flow rules when not enabled (e.g. no
priorities, no RSS action). Applications written to use the flow API are
therefore encouraged to enable it.
Once effective, leaving isolated mode may not be possible depending on PMD
implementation.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
build error:
.../dpdk/drivers/net/mlx5/mlx5_fdir.c:
In function ‘fdir_filter_to_flow_desc’:
.../dpdk/drivers/net/mlx5/mlx5_fdir.c:146:18:
error: this statement may fall through [-Werror=implicit-fallthrough=]
desc->dst_port = fdir_filter->input.flow.udp4_flow.dst_port;
~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../dpdk/drivers/net/mlx5/mlx5_fdir.c:147:2: note: here
case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
^~~~
Fixed by adding fallthrough comment to the code.
Fixes: 76f5c99e68 ("mlx5: support flow director")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
build error:
.../dpdk/drivers/net/enic/base/vnic_dev.c:
In function ‘vnic_dev_get_mac_addr’:
.../dpdk/drivers/net/enic/base/vnic_dev.c:470:12:
error: ‘a0’ is used uninitialized in this function
[-Werror=uninitialized]
args[0] = *a0;
^~~
...dpdk/drivers/net/enic/base/vnic_dev.c:
In function ‘vnic_dev_classifier’:
...dpdk/drivers/net/enic/base/vnic_dev.c:471:12:
error: ‘a1’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
args[1] = *a1;
^~~
Fixed by providing initial values.
Fixes: 9913fbb91d ("enic/base: common code")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This causes build error with gcc 7.1.1 :
...dpdk/drivers/net/i40e/i40e_flow.c:2357:2:
error: ‘memset’ used with length equal to number of elements without
multiplication by element size [-Werror=memset-elt-size]
memset(off_arr, 0, I40E_MAX_FLXPLD_FIED);
^~~~~~
...dpdk/drivers/net/i40e/i40e_flow.c:2358:2:
error: ‘memset’ used with length equal to number of elements without
multiplication by element size [-Werror=memset-elt-size]
memset(len_arr, 0, I40E_MAX_FLXPLD_FIED);
^~~~~~
Fixed by providing correct size to memset.
Fixes: 6ced3dd72f ("net/i40e: support flexible payload parsing for FDIR")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Since the commit e84ad157b7 ("pci: unmap resources if probe fails"),
EAL unmaps the PCI device if ethdev probe returns positive or
negative value.
nicvf thunderx PMD needs special treatment for Secondary queue set(SQS)
PCIe VF devices, where, it expects to not unmap or free the memory
without registering the ethdev subsystem.
Enable the same behavior by using RTE_PCI_DRV_KEEP_MAPPED_RES
PCI driver flag.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_driver->name has the driver name and all physical and virtual
devices has access to it.
Previously it was not possible for virtual ethernet devices to access
rte_driver->name field (because eth_dev used to keep only pci_dev),
and it was required to save driver name in the device private struct.
After re-works on bus and vdev, it is possible for all bus types to
access rte_driver.
It is able to remove the driver name from ethdev device private data and
use eth_dev->device->driver->name.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
When ring PMD created via PMD specific API instead of EAL abstraction
it is missing the virtual device creation done by EAL vdev.
And this makes eth_dev unusable exact same as other PMDs used, because
of some missing fields, like rte_device->name.
Now API calls EAL APIs to create ring PMDs.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The eth_dev->device link was missing for ring PMD, adding it.
This is to generalize rte_device access from eth_dev.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This is to prepare for firmwares with multiple ibufs and obufs.
Ibufs and obufs are the modules in FPGA firmware implementing
the Ethernet port.
There is one ibuf+obuf per Ethernet port.
The cards and firmwares allow one physical port to be one Ethernet
port or split into more Ethernet ports, e.g. one 100GE physical
port can be one Ethernet port of 100GE or split into ten Ethernet
ports of 10GE.
All DMA queues in the device are shared between all Ethernet ports.
Offsets of ibufs and obufs are defined in array.
Functions which operate on ibufs and obufs iterate over this array.
Signed-off-by: Matej Vido <vido@cesnet.cz>
Remove unused read and write functions.
Use rte_read*, rte_write* functions to access ibuf and obuf
address space.
Signed-off-by: Matej Vido <vido@cesnet.cz>
Prefix "cgmii" is removed because it is too specific.
There are different ibuf/obuf modules in different firmwares
but the address space definition is the same.
This patch makes the name general.
Signed-off-by: Matej Vido <vido@cesnet.cz>
Avoid re-initializing of mbuf fields which are set while in pool.
Replaced lio_recv_buffer_alloc with rte_pktmbuf_alloc.
See commit 8f094a9ac5 ("mbuf: set mbuf fields while in pool").
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
This patch fixes incorrect reporting of link status
1) When link is down, set speed to zero. Otherwise a wrong non-zero
speed will be displayed.
2) DAC cables can detect there is a signal, but it necessarily does not
mean link is up. Code previously treated this as link up.
Fixes: 7bc8e9a227 ("net/bnxt: support async link notification")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>