The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.
The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
- trigger event callback
- reset state and few pointers
- free all generic port resources
The private port resources must be released in the .dev_close callback.
The .remove callback should:
- call .dev_close callback
- call rte_eth_dev_release_port()
- free multi-port device shared resources
Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.
* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.
* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.
* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.
* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.
* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The device operation .dev_close was returning void.
This driver interface is changed to return an int.
Note that the API rte_eth_dev_close() is still returning void,
although a deprecation notice is pending to change it as well.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The pointers .device and .intr_handle were already reset by the helper
rte_eth_dev_pci_generic_remove().
It is now made part of rte_eth_dev_release_port().
It makes rte_eth_dev_pci_release() meaningless,
so it is replaced with a call to rte_eth_dev_release_port().
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This patch defines various PCI config space access APIs
in order to read and find IOV specific PCI capabilities.
With these definitions implemented, it enables the base
driver to do SR-IOV specific initialization and HW specific
configuration required from PF-PMD driver instance.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
Add the exact match table type for the SWX pipeline. Used under the
hood by the SWX pipeline table instruction.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add the PCAP file-based source (input) and sink (output) port types
for the SWX pipeline. The sink port is typically used to implement the
packet drop pipeline action. Used under the hood by the pipeline rx
and tx instructions.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add the Ethernet device input/output port type for the SWX pipeline.
Used under the hood by the pipeline rx and tx instructions.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add support for building the SWX pipeline based on specification file
with syntax aligned to the P4 language. The specification file may be
generated by the P4C compiler in the future.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
High-level transaction-oriented API for SWX pipeline table updates. It
supports multi-table atomic updates, i.e. multiple tables can be
updated in a single step with only the before and after table set
visible to the packets. Uses the lower-level table update mechanisms.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Query API to be used by the control plane to detect the configuration
and state of the SWX pipeline and its internal objects.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Instruction optimizer. Detects frequent patterns and replaces them
with some more powerful vector-like pipeline instructions without any
user effort. Executes at instruction translation, not at run-time.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Instruction verifier. Executes at instruction translation time during
SWX pipeline build, i.e. at initialization instead of run-time.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The jump instructions are either unconditional (jmp) or conditional on
positive/negative tests such as header validity (jmpv/jmpnv), table
lookup hit/miss (jmph/jmpnh), executed action (jmpa/jmpna), equality
(jmpeq/jmpneq), comparison result (jmplt/jmpgt). The return
instruction resumes the pipeline execution after action subroutine.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The extern instruction calls one of the member functions of a given
extern object or it calls the given extern function. The function
arguments must be written in advance to the mailbox. The results
are available in the same place after execution.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The table instruction looks up the input key into the table and then
it triggers the execution of the action found in the table entry. On
lookup miss, the default table action is executed.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The shr (i.e. shift right) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The shl (i.e. shift left) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The xor (i.e. bitwise exclusive or) instruction source can be header
field (H), meta-data field (M), extern object (E) or function (F)
mailbox field, table entry action data field (T) or immediate value
(I). The destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The or (i.e. bitwise or) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The and (i.e. bitwise and) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The cksub (i.e. checksum subtract) instruction is used to update the
1's complement sum commonly used by protocols such as IPv4, TCP or
UDP.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The ckadd (i.e. checksum add) instruction is used to either compute,
verify or update the 1's complement sum commonly used by protocols
such as IPv4, TCP or UDP.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The sub (i.e. subtract) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The add instruction source can be header field (H), meta-data field
(M), extern object (E) or function (F) mailbox field, table entry
action data field (T) or immediate value (I). The destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The DMA instruction handles the bulk read transfer of one header from
the table entry action data. Typically used to generate headers, i.e.
headers that are not extracted from the input packet.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The mov (i.e. move) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add instructions to flag a header as valid or invalid. This flag can
be tested by the jmpv (jump if header valid) and jmpnv (jump if header
not valid) instructions.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add header emit and packet transmission instructions. Emit adds to the
output packet a header that is either generated (e.g. read from table
entry by action) or extracted from the input packet. Tx ends the
pipeline processing; discard is implemented by tx to special port.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add packet reception and header extraction instructions. The Rx must
be the first pipeline instruction. Each extracted header is logically
removed from the packet, then it can be read/written by instructions,
emitted into the outgoing packet or discarded.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The SWX pipeline instructions represent the main program that defines
the life of the packet. As packets go through tables that trigger
action subroutines, the headers and meta-data get transformed along
the way.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add tables to the SWX pipeline. The match fields are flexibly selected
from the headers and meta-data. The set of table actions is flexibly
selected for each table from the set of pipeline actions.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add SWX actions that are dynamically-defined through instructions as
opposed to pre-defined. The actions are subroutines of the pipeline
program that triggered by table lookup. The input arguments are the
action data from the table entry (format defined by struct), the
headers and meta-data are in/out.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add extern objects and functions to plug into the SWX pipeline any
functionality that cannot be efficiently implemented with existing
instructions, e.g. special checksum/ECC, crypto, meters, stats arrays,
heuristics, etc. In/out arguments are passed through mailbox with
format defined by struct.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add support for dynamically-defined packet headers and meta-data to
the SWX pipeline. The header and meta-data format are defined by the
struct type they instantiate.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add output ports to the newly introduced SWX pipeline type. Each port
instantiates a port type that defines the port operations, e.g. ethdev
port, PCAP port, etc. The TX interface is single packet, with packet
batching internally for performance.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add input ports to the newly introduced SWX pipeline type. Each port
instantiates a port type that defines the port operations, e.g. ethdev
port, PCAP port, etc. The RX interface is single packet, with packet
batching internally for performance.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add new improved Software Switch (SWX) pipeline type that supports
dynamically-defined packet headers, meta-data, actions and pipelines.
Actions and pipelines are defined through instructions.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
This patch fixes an issue that uninitialized 'success'
is used to be compared with '0'.
Coverity issue: 337676
Fixes: 3340202f59 ("stack: add lock-free implementation")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Replace the store-release by relaxed for the CAS success at the end of
pop. Release isn't needed, because there is not write to data that need
to be synchronized.
The only preceding write is when the length is decreased, but the length
CAS loop already ensures the right synchronization.
The situation to avoid is when a thread sees the old length but the new
list, that doesn't have enough items for pop to success.
But the CAS success on length before the pop loop ensures any core reads
and updates the latest length, preventing this situation.
The store-release is also used to make sure that the items are read
before the head is updated, in order to prevent a core in pop to read an
incorrect value because another core rewrites it with push.
But this isn't needed, because items are read only when removed from the
used list. Right after this, they are pushed to the free list, and the
store-release in push makes sure the items are read before they are
visible in the free list.
Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
List head must be loaded right before continue (when failed to
find the new head).
Without this, one thread might keep trying and failing to pop items
without ever loading the new correct head.
Fixes: 7e6e609939 ("stack: add C11 atomic implementation")
Cc: gage.eads@intel.com
Cc: stable@dpdk.org
Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
The load-acquire of list->len on pop function is redundant.
Only the CAS success needs to be load-acquire.
It synchronizes with the store release in push, to ensure that the
updated head is visible when the new length is visible.
Without this, one thread in pop could see the increased length but the
old list, which doesn't have enough items yet for pop to succeed.
Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
An acquire fence is used to make sure loads after the fence can observe
all store operations before a specific store-release.
But push doesn't read any data, except for the head which is part of a
CAS operation (the items on the list are not read).
So there is no need for the acquire barrier.
Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Fix cmpexchange usage of weak / strong.
The generated code is the same on x86 and ARM (there is no weak
cmpexchange), but the old usage was inconsistent.
For push and pop update size, weak is used because cmpexchange is inside
a loop.
For pop update root, strong is used even though cmpexchange is inside a
loop, because there may be a lot of operations to do in a loop iteration
(locate the new head).
Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
When generating the documentation, a new warning can be seen:
.../dpdk/lib/librte_ethdev/rte_ethdev.h:2441:
warning: argument 'link_speed' of command @param is not found in the
argument list of rte_eth_link_speed_to_str(uint32_t speed_link)
.../dpdk/lib/librte_ethdev/rte_ethdev.h:2455: warning: The following
parameters of rte_eth_link_speed_to_str(uint32_t speed_link) are not
documented: parameter 'speed_link'
Align the function prototype to its doxygen description.
Fixes: fbf931c9c3 ("ethdev: format link status text")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch fixes the possible time-of-check to time-of-use (TOCTOU)
attack problem by copying request data and descriptor index to local
variable prior to process.
Also the original sequential read of descriptors may lead to TOCTOU
attack. This patch fixes the problem by loading all descriptors of a
request to local buffer before processing.
CVE-2020-14375
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
This patch fixes the incorrect data length check to vhost crypto.
Instead of blindly accepting the descriptor length as data length, the
change compare the request provided data length and descriptor length
first. The security issue CVE-2020-14374 is not fixed alone by this
patch, part of the fix is done through:
"vhost/crypto: fix missed request check for copy mode".
CVE-2020-14374
Fixes: 3c79609fda ("vhost/crypto: handle virtually non-contiguous buffers")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
This patch fixes vhost crypto library for the incorrect source and
destination buffer calculation in the copy mode.
Fixes: cd1e8f03ab ("vhost/crypto: fix packet copy in chaining mode")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>