This reverts commit ff00a0dc5600dbb0a29e4aa7fa4b078f98c7a360.
Support for several RSS parent queues was necessary to implement the RSS
flow rule action, dropped in a prior commit.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This reverts commit d7769c7c08cc08a9d1bc4e40b95524d9697707d9.
Existing RSS features rely on experimental Verbs provided by Mellanox OFED.
In order to replace this dependency with standard distribution packages,
RSS support must be temporarily removed to be re-implemented using a
different API.
Removing support for the RSS flow rule action is the first step toward this
goal.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Configuring several Rx queues enables RSS, which causes an additional
special parent queue to be created to manage them.
MAC flows are associated with the queue supposed to receive packets; either
the parent one in case of RSS or the single orphan otherwise.
For historical reasons the current implementation supports another scenario
with multiple orphans, in which case MAC flows are configured on all of
them. This is harmless but useless since it cannot happen.
Removing this feature allows dissociating the remaining MAC flow from Rx
queues and store it inside the private structure where it belongs.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Only the default port MAC address remains and is not configurable.
This is done in preparation for a major refactoring.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This option both sets the maximum number of segments for Rx/Tx packets and
whether scattered mode is supported at all. This commit removes the latter
as well as configuration file exposure since the most appropriate value
should be decided at run-time.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Current implementation is partial (Tx only), not convenient to use and
not of primary concern.
Remove this feature before refactoring the PMD.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Verbs support for RSS, inline receive and extended device query calls has
not been optional for a while. Their absence is untested and is therefore
unsupported.
Remove the related compilation checks and assume Mellanox OFED is up to
date, as described in the documentation.
Use this opportunity to remove a few useless data path debugging messages
behind compilation checks on never defined macros.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Use maximum number reported by hardware capabilities as replacement for the
static check on MLX4_PMD_MAX_PHYS_PORTS.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
The seemingly artificial limitation on the maximum number of instances for
this PMD is an historical leftover that predates its first public release.
It was used as a workaround to support multiple physical ports on a PCI
device exposing a single bus address when mlx4 was implemented directly as
an Ethernet device driver instead of a PCI driver spawning Ethernet
devices.
Getting rid of it simplifies device initialization.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Copyright lasts long enough not to require notices to be updated yearly.
The current approach of updating them occasionally while working on
unrelated tasks should be deprecated in favor of dedicated commits updating
all files at once when necessary.
Standardize on a single year per copyright owner.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Current mlx4 OFED version has bug which returns error to
ibv destroy functions when the device was plugged out, in
spite of the resources were destroyed correctly.
Hence, failsafe PMD was aborted, only in debug mode, when
it tries to remove the device in plug-out process.
The workaround added option to replace all claim_zero
assertions with debugging messages, by the way, this option
affects non ibv destroy assertions.
DPDK 18.02 release should work with Mellanox OFED-4.2 which will
include the verbs fix to this bug, then, this patch can
be removed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The corrupted code doesn't return error when probe function
fails due to error in device mac address getting.
By this way, the probe function may return success even if the
ETH dev is not allocated.
Hence, the probe caller, for example failsafe PMD, fails when it
tries to get ETH dev after the device was plugged out while mlx4
was probing it.
The fix adds error report to the probe caller when priv_get_mac fails
and in all other failure options which are missing it.
By this way, it prevents the unexpected behavior to miss ETH device
after the device was probed successfully.
Fixes: 7fae69eeff13 ("mlx4: new poll mode driver")
Fixes: 001a520e419f ("net/mlx4: add port parameter")
Fixes: 7b0661539229 ("mlx4: check if port is configured for ethernet")
Fixes: fec3608673e6 ("mlx4: query netdevice to get initial MAC address")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The corrupted code causes segmentation fault when user creates
flow with drop action before device starting.
For example, failsafe PMD recreates all the flows before calling
dev_start in plug-in sequence and mlx4 allocated its flow drop
queue in dev_start.
Hence, when failsafe created flow with drop action after plug-in
event, mlx4 tried to dereference flow drop queue which was
uninitialized.
The fix added check to the drop qp accessible and conditioned the
ibv_create_flow calling on device starting.
Fixes: 642fe56a1ba5 ("net/mlx4: use a single drop queue for all drop flows")
Fixes: 46d5736a7049 ("net/mlx4: support basic flow items and actions")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This PMD supports hotplug, it is able to be detached.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This commit adds support for the flow API RSS action with the following
limitations:
- Only supported when isolated mode is enabled.
- The number of queues specified by the action (rte_flow_action_rss.num)
must be a power of two.
- Each queue index can be specified at most once in the configuration
(rte_flow_action_rss.queue[]).
- Because a queue can be associated with a single RSS context, it cannot
be targeted by multiple RSS actions simultaneously.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
A special "parent" queue must be allocated in addition to a group of
standard Rx queues for RSS to work. This is done automatically outside of
isolated mode by the PMD when applications request several Rx queues.
Since each configured flow rule with the RSS action may target a different
set of queues, the PMD must have the ability to dynamically allocate
several parent queues, one per RSS group.
If isolated mode was requested the default RSS parent queue isn't created
in this case.
Refactor RSS parent queue allocations (currently limited to a single
parent) in preparation for flow API RSS action support.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The user must request isolated mode before device configuration.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
In debug mode, all mbuf ol_flags are temporarily enabled while sitting
in the Rx queue to detect otherwise silent data corruption, however
some of them are special (indirect and control) and must be cleared
before returning mbufs to the pool to avoid crashing.
Fixes: 7fae69eeff13 ("mlx4: new poll mode driver")
CC: stable@dpdk.org
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This commit addresses a compilation issue against Glibc >= 2.25, which
implements assert() through a nonstandard ({ }) construct. Such constructs
can normally not be used without __extension__ keyword when -pedantic is
enabled, as is the case when compiling mlx4 and mlx5 PMDs in debug mode.
While assert.h checks for the compiler ability to support GNU extensions,
Clang, unlike GCC, does not allow the above syntax when combining
-std=gnu99 with -pedantic.
Work around missing keyword by moving these PMDs to a stricter compliance
standard without GNU extensions but properly checked by Glibc. Doing so is
supported on the DPDK side since includes have been cleaned up.
Even in C11, using types other than _Bool or signed/unsigned int for
bit-fields is an extension. Some GCC versions complain about that when
-pedantic checks are enabled.
The RTE_STD_C11 macro correctly prevented this issue with C99 but not with
C11 as it becomes a no-op. Forcing the extension keyword addresses it.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Tested-by: Yongseok Koh <yskoh@mellanox.com>
The interrupt handler can sometimes be triggered for reasons other than a
link status event. An assertion failure happen when such events occur while
an asynchronous link status update is already scheduled.
Address this issue using the same approach as its mlx5 counterpart,
commit a9f2fbc42f0c ("net/mlx5: fix inconsistent link status")
Fixes: c4da6caa426d ("mlx4: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This commit addresses various issues that may lead to undefined behavior
when configuring Rx interrupts.
While failure to create a Rx queue completion channel in rxq_setup()
prevents that queue from being created, existing queues still have theirs.
Since the error handler disables dev_conf.intr_conf.rxq as well, subsequent
calls to rxq_setup() create Rx queues without interrupts. This leads to a
scenario where not all Rx queues support interrupts; missing checks on the
presence of completion channels may crash the application.
Considering that the PMD is not supposed to disable user-provided
configuration parameters (dev_conf.intr_conf.rxq), and that these can
change for subsequent rxq_setup() calls anyway, properly supporting a mixed
mode where not all Rx queues have interrupts enabled is a better approach.
To do so with a minimum set of changes, priv_intr_efd_enable() and
priv_create_intr_vec() are first refactored as a single
priv_rx_intr_vec_enable() function (same for their "disable" counterparts).
Since they had to be used together, there was no point in keeping them
separate.
Remaining changes:
- Always clean up before reconfiguring interrupts to avoid memory leaks.
- Always clean up when closing the device.
- Use malloc()/free() instead of their rte_*() counterparts since there is
no need to store the vector in huge pages-backed memory.
- Allow more Rx queues than the size of the event file descriptor array as
long as Rx interrupts are not requested on all of them.
- Properly clean up interrupt handle when disabling Rx interrupts (nb_efd
and intr_vec reset to 0).
- Check completion channel presence while toggling Rx interrupts on a given
queue.
Fixes: 9f05a4b81809 ("net/mlx4: support user space Rx interrupt event")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
Several Ethernet device structures are allocated on top of a common PCI
device for mlx4 adapters with multiple ports. These inherit a common
interrupt handle from their parent PCI device, which prevents Rx interrupts
from working properly on all ports as their configuration is overwritten.
Use a local interrupt handle to address this issue.
Fixes: 9f05a4b81809 ("net/mlx4: support user space Rx interrupt event")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
In some environments, the PCI domain can be larger than 16 bits.
For example, a PCI device passed through in Azure gets a synthetic domain
id which is internally generated based on GUID. The PCI standard does
not restrict domain to be 16 bits.
This change breaks ABI for API's that expose PCI address structure.
The printf format for PCI remains unchanged, so that on most
systems (with only 16 bit domain) the output format is unchanged
and is 4 characters wide. For example: 0000:00:01.0
Only on sysetms with higher bits will the domain take up more
space; example: 12000:00:01.0
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Change the rte_eth_dev_callback_process function to return int,
and add a void *ret_param parameter.
The new parameter is used by ixgbe and i40e instead of abusing
the user data of the callback.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Instead of many PMD define their own macro, define a generic one in
ethdev and use that in PMDs.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Some customers find adding MAC addr to VF sometimes can fail,
but it is still stored in dev->data->mac_addrs[ ]. So this
can lead to some errors that assumes the non-zero entry in
dev->data->mac_addrs[ ] is valid.
Following acknowledgements are from specific NIC PMD
maintainer for their managing part.
This patch changes the ethdev internal API, it should not be
backported to a stable/LTS release so far.
Fixes: af75078fece3 ("first public release")
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The PCI code will move to the bus drivers directory.
Rename functions from rte_eal_pci_ to rte_pci_
to prepare the move of the driver out of EAL.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Extend the LSC event handling to support the device removal as well. The
Verbs library will send several related events, that can conflict
with the LSC event itself.
The event handling has thus been made capable of receiving and signaling
several event types at once.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Elad Persiko <eladpe@mellanox.com>
Fixes issue where mlx4 driver stops receiving packets when mbuf
allocation fails in mlx4_rx_burst().
This issue appears to be caused because the code doesn't recycle the
existing mbuf to the sges array when mbuf allocation fails as is done
in the code right above it which handles (wc.status != IBV_WC_SUCCESS).
Copying the code from the above case fixes the issue.
Fixes: acac55f16412 ("mlx4: use MOFED 3.0 fast verbs interface for Rx operations")
Cc: stable@dpdk.org
Signed-off-by: Charles Myers <charles.myers@spirent.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The patch change the prototype of callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Let error messages in place, but return unambiguous values upon
probing errors.
Fixes: 66e1591687ac ("mlx4: avoid init errors when kernel modules are not loaded")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Toggle Rx scatter mode based on the scatter_enable flag and the maximum
packet size only instead of deriving this information from the jumbo_frame
setting and the MTU configuration.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Most ConnectX-3 adapters expose two physical ports on a single PCI bus
address.
Add a new port parameter allowing the user to choose
either or both physical ports to be used by the application.
This parameter is used as follows:
Selecting only the second port:
-w 00:00.0,port=1
Selecting both ports:
-w 00:00.0,port=0,port=1
If no parameter is given, the default behavior is unchanged: all ports
are probed.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Adding support for the next items: eth, vlan, ipv4, udp, tcp and for the
next actions: queue, drop
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Make priv_lock/priv_unlock functions and some other structs/defines visible
from different source files by placing them into mlx4.h header.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
If LSC interrupts are enabled, the application expects the link_update
ops to be executed by the PMD itself.
No link status change event is received upon probing, therefore the link
status update must be forced.
Fixes: c4da6caa426d ("mlx4: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Before this patch, the management of dependencies between directories
had several issues:
- the generation of .depdirs, done at configuration is slow: it can take
more than one minute on some slow targets (usually ~10s on a standard
PC without -j).
- for instance, it is possible to express a dependency like:
- app/foo depends on lib/librte_foo
- and lib/librte_foo depends on app/bar
But this won't work because the directories are traversed with a
depth-first algorithm, so we have to choose between doing 'app' before
or after 'lib'.
- the script depdirs-rule.sh is too complex.
- we cannot use "make -d" for debug, because the output of make is used for
the generation of .depdirs.
This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.
After this commit, "make config" is almost immediate.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Mellanox PMDs do not differentiate IP header with or without options, so
the advertised packet type for an IPv4 should not be RTE_PTYPE_L3_IPV4,
which explicitly means "does not contain any header option".
Change the driver to set
RTE_PTYPE(_INNER)_L3_IPV4_EXT_UNKNOWN or
RTE_PTYPE(_INNER)_L3_IPV6_EXT_UNKNOWN flags for all IPv4/IPv6 packets
received.
Fixes: 429df3803a16 ("mlx4: replace some offload flags with packet type")
Fixes: 67fa62bc672d ("mlx5: support checksum offload")
Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com>
Signed-off-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Retrieving link status information through the link update callback should
be quick and non-blocking.
Mellanox PMDs retrieve this information through ioctl() calls on the
related kernel netdevice. This appears to take a long time to
complete and may cause significant slowdowns in applications.
While these system calls cannot be accelerated, removing the lock on the
private structure allows applications to perform other control operations
from separate threads in the meantime. This function remains safe without
locking as it does not write the private structure, it is only used to
retrieve the name of the netdevice.
Signed-off-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>