This commit implements the eventdev ABI changes required by
the DLB/DLB2 PMDs. Several data structures and constants are modified
or added in this patch, thereby requiring modifications to the
dependent apps and examples.
The DLB/DLB2 hardware does not conform exactly to the eventdev interface.
1) It has a limit on the number of queues that may be linked to a port.
2) Some ports a further restricted to a maximum of 1 linked queue.
3) DLB does not have the ability to carry the flow_id as part
of the event (QE) payload. Note that the DLB2 hardware is capable of
carrying the flow_id.
Following is a detailed description of the changes that have been made.
1) Add new fields to the rte_event_dev_info struct. These fields allow
the device to advertise its capabilities so that applications can take
the appropriate actions based on those capabilities.
struct rte_event_dev_info {
uint32_t max_event_port_links;
/**< Maximum number of queues that can be linked to a single event
* port by this device.
*/
uint8_t max_single_link_event_port_queue_pairs;
/**< Maximum number of event ports and queues that are optimized for
* (and only capable of) single-link configurations supported by this
* device. These ports and queues are not accounted for in
* max_event_ports or max_event_queues.
*/
}
2) Add a new field to the rte_event_dev_config struct. This field allows
the application to specify how many of its ports are limited to a single
link, or will be used in single link mode.
/** Event device configuration structure */
struct rte_event_dev_config {
uint8_t nb_single_link_event_port_queues;
/**< Number of event ports and queues that will be singly-linked to
* each other. These are a subset of the overall event ports and
* queues; this value cannot exceed *nb_event_ports* or
* *nb_event_queues*. If the device has ports and queues that are
* optimized for single-link usage, this field is a hint for how many
* to allocate; otherwise, regular event ports and queues can be used.
*/
}
3) Replace the dedicated implicit_release_disabled field with a bit field
of explicit port capabilities. The implicit_release_disable functionality
is assigned to one bit, and a port-is-single-link-only attribute is
assigned to other, with the remaining bits available for future assignment.
* Event port configuration bitmap flags */
#define RTE_EVENT_PORT_CFG_DISABLE_IMPL_REL (1ULL << 0)
/**< Configure the port not to release outstanding events in
* rte_event_dev_dequeue_burst(). If set, all events received through
* the port must be explicitly released with RTE_EVENT_OP_RELEASE or
* RTE_EVENT_OP_FORWARD. Must be unset if the device is not
* RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE capable.
*/
#define RTE_EVENT_PORT_CFG_SINGLE_LINK (1ULL << 1)
/**< This event port links only to a single event queue.
*
* @see rte_event_port_setup(), rte_event_port_link()
*/
#define RTE_EVENT_PORT_ATTR_IMPLICIT_RELEASE_DISABLE 3
/**
* The implicit release disable attribute of the port
*/
struct rte_event_port_conf {
uint32_t event_port_cfg;
/**< Port cfg flags(EVENT_PORT_CFG_) */
}
This patch also removes the depreciation notice and announce
the new eventdev ABI changes in release note.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add minimum burst throughout the scheduler pipeline and a flush counter.
Use a single threaded ring implementation for the reorder buffer free list.
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Issue has been observed in case of multi segments where mbuf
data gets corrupted due to missing barriers. Changes made to
mbuf just before LMTST by one core gets updatded when the
same mbuf is in use by another core, leading to corruption.
It should be ensured that all changes made to mbuf should be
written before LMTST.
Fixes: cbd5710db48d ("net/octeontx2: add Tx multi segment version")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Improve single flow performance by moving the point of coherence
to the end of transmit sequence.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add SWTAG flush operation at the end of transmit sequence to
immediately release the tag held by the core.
Reuse Tag address to check SWTAG completion status.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
In the op new mode of crypto adapter, the completed crypto operation
is submitted to the event device by the OCTEON TX2 crypto PMD.
During event device dequeue the result of crypto operation is checked.
Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
The crypto adapter callback functions and associated data structures
are added.
Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Coverity flags that 'portal' variable is used before
it's checked for NULL. This patch fixes this issue.
Coverity issue: 323516
Fixes: 4ab57b042e7c ("event/dpaa2: affine portal at runtime during I/O")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Validate events configured in ssopf against the total number of
events configured across all the RX/TIM event adapters.
Events available to ssopf can be reconfigured by passing the required
amount to kernel bootargs and are only limited by DRAM size.
Example:
ssopf.max_events= 2097152
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Since the 20.08 release deprecated rte_cio_*mb APIs because these APIs
provide the same functionality as rte_io_*mb APIs on all platforms, so
remove them and use rte_io_*mb instead.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Start a new release cycle with empty release notes.
The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
DPAA2 eventdev device is capable of all type queue feature.
Fix the capability flag to reflect the same.
Fixes: 8f4a294c23 ("event/dpaa2: apply new capability flags")
Cc: stable@dpdk.org
Signed-off-by: Apeksha Gupta <apeksha.gupta@nxp.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Minimize the number of different thread variables
Add all the thread specific variables in dpaa_portal
structure to optimize TLS Usage.
Signed-off-by: Rohit Raj <rohit.raj@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
When event device is transmitting packet on OCTEONTX2 it needs to access
the destined ethernet device TXq data.
Currently, we get the TXq data through rte_eth_devices global array.
Instead save the TXq address inside event port memory.
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
In OCTEONTX2 event device we use sub_event_type to store the ethernet
port identifier when we receive work from OCTEONTX2 ethernet device.
This violates the event device spec as sub_event_type should be 0 in
the initial receive stage.
Set sub_event_type to 0 after copying the port id.
Fixes: 0fe4accd8ec8 ("event/octeontx2: add Rx adapter fastpath ops")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
When event device is re-configured maintain the event queue to event port
links and event port status instead of resetting them.
Fixes: cd24e70258bd ("event/octeontx2: add device configure function")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Since PMD enqueues a single event at a time, fixing the issue by
passing 1 rather than nb_events to avoid any out of bound access as
reported by coverity.
Coverity issue: 358447
Fixes: 56a96aa42464 ("event/octeontx: add framework for Rx/Tx offloads")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Introduce the RTE_LOG_REGISTER macro to avoid the code duplication
in the logtype registration process.
It is a wrapper macro for declaring the logtype, registering it and
setting its level in the constructor context.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Add device arguments to lock NPA aura and pool contexts in NDC cache.
The device args take hexadecimal bitmask where each bit represent the
corresponding aura/pool id.
Example:
-w 0002:02:00.0,npa_lock_mask=0xf // Lock first 4 aura/pool ctx
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Currently rte_mcp_ptr_list is being shared as a variable
across libs. This is only used in control path.
This patch change it to a exported function based access.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
This is to reduce the number of variables getting exposed
from the dpaa bus. They are not required to be in bus.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The returned number from rte_event_enqueue_*()
wouldn't include events marked with RTE_EVENT_OP_RELEASE.
Fixes: 1c8e3caa3 ("event/dsw: add event scheduling and device start/stop")
Cc: stable@dpdk.org
Signed-off-by: Yuri Chipchev <yuric@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Acked-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Update the portal allocation failure log to print the thread id
as well.
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Can be reproduced with "make EXTRA_CFLAGS='-O1'" command using
gcc 7.3.0
Build error
In file included from .../drivers/event/octeontx2/ot
x2_evdev.c:15:0:
.../drivers/event/octeontx2/otx2_evdev_stats.h:
In function ‘otx2_sso_xstats_get’:
.../drivers/event/octeontx2/otx2_evdev_stats.h:124:9:
error: ‘xstats’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
xstat = &xstats[ids[i] - start_offset];
~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is false positive, 'xstats_mode_count' should be preventing taking
the loop and accessing 'xstats'.
Returning in that case to silence the compiler warning.
Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Avoid reusing recorded events when performing a migration, since this
may make the migration selection logic pick an already-moved flow.
Fixes: f6257b22e767 ("event/dsw: add load balancing")
Cc: stable@dpdk.org
Reported-by: Venky Venkatesh <vvenkatesh@paloaltonetworks.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Adding support for rx checksum offload. In case of wrong
checksum received (inner/outer l3/l4) it reports the
corresponding layer which has bad checksum. It also adds
rx burst function pointer hook for rx checksum offload to
event PMD.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adding macro based framework to hook dequeue/enqueue function
pointers to the appropriate function based on rx/tx offloads.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
When eth port queue is removed from Rx adapter using
rte_event_eth_rx_adapter_queue_del() it incorrectly
initializes CQ context instead of modifying it. This
might lead to a crash when CQ context is modified
as a part of rte_eth_dev_stop() sequence as CQ will
hold invalid entries. This is responsibility of an
application to call rte_event_eth_rx_adapter_queue_del()
to remove eth port queue from Rx adapter in tear down
sequence.
Fixes: 37720fc1fba8 ("event/octeontx2: add Rx adapter")
Cc: stable@dpdk.org
Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Adding macro based framework to hook rx/tx burst function
pointers to the appropriate function based on rx/tx offloads.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Adding multi segment support to the octeontx PMD. Also
adding the logic to share rx/tx ofloads with the eventdev
code.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Add libatomic as a global dependency when compiling for 32-bit using
clang. As we need libatomic for 64-bit atomic ops.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Add redundant stack variable initialization to work around
false-positive warnings in older versions of GCC.
Fixes: 1f2b99e8d9b1 ("event/dsw: improve migration mechanism")
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Use c11 atomics with RELAXED ordering instead of rte_atomic ops which
enforce unnessary barriers on arm64.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
DSW keeps an internal port load estimate, used by the load balancing
mechanism. As a side effect, it keeps track of the total number of
busy cycles since startup. This metric is indirectly exposed in the
form of DSW xstats' "port_<n>_event_proc_latency", which is the total
number of busy cycles divided by the total number of events processed
on a particular port.
An external application can take (event_latency * dequeued) to go back
to busy_cycles. One reason for doing this is to measure the port's
load during a longer time period, without resorting to sampling
"port_<n>_load". However, as the number dequeued events grows, a
rounding error in event_latency renders the application-calculated
busy_cycles inaccurate.
Thus, it makes sense to directly expose the number of busy cycles as a
DSW xstats, even though it might seem redundant.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
On dequeue, polling the control ring once is enough.
Fixes: f6257b22e767 ("event/dsw: add load balancing")
Cc: stable@dpdk.org
Suggested-by: Ola Liljedahl <ola.liljedahl@arm.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
DSW limits the rate of migrations on a per-port basis. Hence, as the
number of cores grows, so does the total migration capacity.
In high core-count systems, this allows for a situation where flows
are migrated to a lightly loaded port which recently already received
a number of new flows (from other ports). The processing load
generated by these new flows may not yet be reflected in the lightly
loaded port's load estimate. The result is that the previously lightly
loaded port is now overloaded.
This patch adds a rough estimate of the size of the inbound migrations
to a particular port, which can be factored into the migration logic,
avoiding the above problem.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Allowing moving multiple flows in one migration transaction, to
rebalance load more quickly.
Introduce a threshold to avoid migrating flows between ports with very
similar load.
Simplify logic for selecting which flow to migrate. The aim is now to
move flows in such a way that the receiving port is as lightly-loaded
as possible (after receiving the flow), while still migrating enough
flows from the source port to reduce its load. This is essentially how
legacy strategy work as well, but the code is more readable.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
To allow visualization of migrations, track the number flow
immigrations in "port_<N>_immigrations". The "port_<N>_migrations"
retains legacy semantics, but is renamed "port_<N>_emigrations".
Expose the number of events currently undergoing processing
(i.e. pending releases) at a particular port.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Reduce the maximum number of DSW flows from 32k to 8k, to be able
rebalance load faster.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
In DSW, in case a port can't produce any events for the application to
consume, the port is considered idle.
To slightly reduce wall-time latency, flush the port's output buffer
in case of such an empty dequeue.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Each workslot is always bound to a specific lcore there is no multi-core
contention to cause cache trashing as a result it is safe to remove the
WFE. Also, in dual workslot dequeue work will mostlikely be available on
the pair workslot making WFE impractical.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This changes reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>