All information about a device to probe can be grouped
in a common string, which is what we usually call devargs.
An application should not have to parse this string before
calling the EAL probe function.
And the syntax could evolve to be more complex and support
matching multiple devices in one string.
That's why the bus name and device name should be removed from
rte_eal_hotplug_add().
Instead of changing this function, a simpler one is added
and used in the old one, which may be deprecated later.
When removing a device, we already know its rte_device handle
which can be directly passed as parameter of rte_eal_hotplug_remove().
If the rte_device is not known, it can be retrieved with the devargs,
by iterating in the device list (future RTE_DEV_FOREACH()).
Similarly to the probing case, a new function is added
and used in the old one, which may be deprecated later.
The new function is used in failsafe, because the replacement is easy.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The fail-safe PMD registers to RMV event for each removable sub-device
port in order to cleanup the sub-device resources and switch the Tx
sub-device directly when it is plugged-out.
During removal time, the fail-safe PMD stops and closes the sub-device
but it doesn't unregister the LSC and RMV callbacks of the sub-device
port.
It can lead the callbacks to be called for a port which is no more
associated with the fail-safe sub-device, because there is not a
guarantee that a sub-device gets the same port ID for each plug-in
process. This port, for example, may belong to another sub-device of a
different fail-safe device.
Unregister the LSC and RMV callbacks for sub-devices which are not
used.
Fixes: 598fb8aec6f6 ("net/failsafe: support device removal")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
There is time between the sub-device port probing by the sub-device PMD
to the sub-device port ownership taking by a fail-safe port.
In this time, the port is available for the application usage. For
example, the port will be exposed to the applications which use
RTE_ETH_FOREACH_DEV iterator.
Thus, ownership unaware applications may manage the port in this time
what may cause a lot of problematic behaviors in the fail-safe
sub-device initialization.
Register to the ethdev NEW event to take the sub-device port ownership
before it becomes exposed to the application.
Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Aligning Mellanox SPDX copyrights to a single format.
In addition replace to SPDX licence files which were missed.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Fail-safe uses a periodic alarm mechanism, running from the host
thread, to manage the hot-plug events of its sub-devices. This
management requires a lot of sub-devices PMDs operations
(stop, close, start, configure, etc.).
While the hot-plug alarm runs in the host thread, the application may
call fail-safe operations, which directly trigger the sub-devices PMDs
operations as well. This call may occur from any thread decided by the
application (probably the master thread).
Thus, more than one operation can be executed to a sub-device at the
same time. This can initiate a lot of races in the sub-PMDs.
Moreover, some control operations update the fail-safe internal
databases, which can be used by the alarm mechanism at the same time.
This can also initiate races and crashes.
Fail-safe is the owner of its sub-devices and must synchronize their
use according to the ETHDEV ownership rules.
Synchronize hot-plug management by a new lock mechanism uses a mutex to
atomically defend each critical section in the fail-safe hot-plug
mechanism and control operations to prevent any races between them.
Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The fail-safe PMD uses a per sub-device flag called "remove" to
indicate the scope where the sub-device was removed physically and
whether its software resources should be released.
This flag is set when the fail-safe receives an RMV notification
about the physical removal of the sub-device, and should be unset when
all the sub-device resources are released.
The previous code wrongly unsets the flag in dev_configure(), instead
of when the software resources release is completed.
Change the remove flag unsetting to take action in the end of the
software resources release.
Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Call dedicated ethdev API to free port in remove time as was done in
other fail-safe places.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This commit adds the following functionality to failsafe PMD:
* Register and unregister slaves Rx interrupts.
* Enable and Disable slaves Rx interrupts.
The interrupts events generated by the slaves are not handled in this
commit.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The pointer to the user parameter of the callback registration is
automatically pass to the callback function.
There is no point to allow changing this user parameter by a caller.
That's why this parameter is always set to NULL by PMDs and set only
in ethdev layer before calling the callback function.
The history is that the user parameter was initially used
by the callback implementation to pass some information
between the application and the driver:
c1ceaf3ad056 ("ethdev: add an argument to internal callback function")
Then a new parameter has been added to leave the user parameter
to its standard usage of context given at registration:
d6af1a13d7a1 ("ethdev: add return values to callback process API")
The NULL parameter in the internal callback processing function
is now removed. It makes clear that the callback parameter is user
managed and opaque from a DPDK point of view.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fail-safe attempts to read an ultimate statistics on removal time; if
that fails, it uses the latest recorded snapshot.
This patch adds timestamp for each stats snapshot to allow a time report
since the last snapshot in case of the above failure.
By this way, the user can estimate the stats read accuracy.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The stats_get API was changed to signal a potential failure to read
stats. Furthermore, some PMDs are able to provide statistics even
after a removal event occurred.
Considering this, the fail-safe can try to access the latest
statistics of a PMD to improve statistics accuracy.
Attempt an ultimate statistics read on removal time; if that
fails, use the latest recorded snapshot.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Extend port_id definition from uint8_t to uint16_t in lib and drivers
data structures, specifically rte_eth_dev_data. Modify the APIs,
drivers and app using port_id at the same time.
Fix some checkpatch issues from the original code and remove some
unnecessary cast operations.
release_17_11 and deprecation docs have been updated in this patch.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The previous stats code returned only the current TX sub
device stats.
This enhancement extends it to return the sum of all sub
devices stats with history of removed sub-devices.
Dedicated stats accumulator saves the stat history of all
sub device remove events.
Each failsafe sub device contains the last stats asked by
the user and updates the accumulator in removal time.
I would like to implement ultimate snapshot on removal time.
The stats_get API needs to be changed to return error in the
case it is too late to retrieve statistics.
By this way, failsafe can get stats snapshot in removal interrupt
callback for each PMD which can give stats after removal event.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The corrupted code don't reply error in case of MAC
address adding failure while failsafe PMD was trying
to apply configuration to the sub device.
Hence, the application may get unwanted packets.
The fix adds error report for this case.
Fixes: ebea83f899d8 ("net/failsafe: add plug-in support")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Listen to INTR_RMV events issued by slaves.
Add atomic flags on slave queues to detect use of slave bursts function.
If a removal is detected, set the recollection flag on this slave.
During a slave upkeep round, if its recollection flag is set and its
burst functions are not in use by any thread, remove that slave.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Add the "exec" device type.
The parameters given to this type of device will be executed in a shell.
The output of this command is then used as a definition for a device.
That command can be re-interpreted if the related device is not
plugged-in. It allows for a device definition to react to system
changes (e.g. changing PCI bus for a given device).
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Periodically check for the existence of a device.
If a device has not been initialized and exists on the system, then it
is probed and configured.
The configuration process strives to synchronize the states between the
plugged-in sub-device and the fail-safe device.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>