When attempting to link a port and queue immediately after unlinking,
the CQ inflights may not all be processed. Poll the h/w register for
outstanding inflights instead of reading once, in case the inflights
are still being processed. Also return EBUSY if the inflight
processing is not completed in a suitable amount of time.
Fixes: 1857f1922c ("event/dlb2: use new implementation of resource file")
Cc: stable@dpdk.org
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the rolling mask used in dequeue operations to
fix the vector optimized dequeue.
Fixes: 000a7b8e75 ("event/dlb2: optimize dequeue operation")
Cc: stable@dpdk.org
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Improve vWQE and CQ Rx performance by tuning perfetches to 64B
cacheline size.
Also, prefetch the vWQE array offsets at cacheline boundaries.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Setting WAITW bit enables default min dequeue timeout of 1us.
Avoid the min dequeue timeout by setting WAITW only when dequeue_timeout
is configured.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Store and reuse workslot status for TT, GRP and HEAD status
instead of reading from GWC as reading from GWC imposes
additional latency.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Tx command is prepared based on offloads enabled and stored in
Tx queue structure at tx_queue_setup phase.
In fastpath the command is copied from Tx queue to LMT line for
all the packets.
Since, the command contents are mostly constants we can move the
command preparation to fastpath and avoid accessing Tx queue
memory.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Functions like free, rte_free, and rte_mempool_free
already handle NULL pointer so the checks here are not necessary.
Remove redundant NULL pointer checks before free functions
found by nullfree.cocci
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add cn10k segregated Rx and event Tx enqueue template functions,
these help in parallelizing the build.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add cn10k segregated Rx and event dequeue functions to build,
add macros to make future modifications simpler.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add cn10k segregated Rx and event dequeue template functions,
these help in parallelizing the build.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add cn9k segregated Tx and event Tx functions to build,
add macros to make future modifications simpler.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add cn9k segregated Rx and event Tx enqueue template functions,
these help in parallelizing the build.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add cn9k segregated Rx and event dequeue functions to build,
add macros to make future modifications simpler.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Split template functions to multiple files based on the range
of offloads. This allows them to be built in parallel reducing
time spent on compiling single files containing all the template
functions.
The files are added to the build system in later patches modifying
the existing scheme of selecting template lookup with a simple
flat array based lookup.
Add cn9k segregated Rx and event dequeue template functions,
these help in parallelizing the build.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Support the tx enqueue in order queue mode, where queue id
for each event may be different.
Signed-off-by: Jun Yang <jun.yang@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Add external clock support for cnxk timer adapter.
External clock mapping is as follows:
RTE_EVENT_TIMER_ADAPTER_EXT_CLK0 = TIM_CLK_SRC_10NS,
RTE_EVENT_TIMER_ADAPTER_EXT_CLK1 = TIM_CLK_SRC_GPIO,
RTE_EVENT_TIMER_ADAPTER_EXT_CLK2 = TIM_CLK_SRC_PTP,
RTE_EVENT_TIMER_ADAPTER_EXT_CLK3 = TIM_CLK_SRC_SYNCE,
TIM supports clock input from external GPIO, PTP, SYNCE clocks.
Input resolution is adjusted based on CNTVCT frequency for better
estimation.
Since TIM is unaware of input clock frequency, application is
expected to pass the frequency.
Example:
-a 0002:0e:00.0,tim_eclk_freq=122880000-0-0
The order of frequencies above is GPIO-PTP-SYNCE.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Minimum supported interval should now be retrieved from
mailbox based on the clock source and clock frequency.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
As per the deprecation notice, In the view of enabling unified driver
for octeontx2(cn9k)/octeontx3(cn10k), removing drivers/octeontx2
drivers and replace with drivers/cnxk/ which
supports both octeontx2(cn9k) and octeontx3(cn10k) SoCs.
This patch does the following
- Replace drivers/common/octeontx2/ with drivers/common/cnxk/
- Replace drivers/mempool/octeontx2/ with drivers/mempool/cnxk/
- Replace drivers/net/octeontx2/ with drivers/net/cnxk/
- Replace drivers/event/octeontx2/ with drivers/event/cnxk/
- Replace drivers/crypto/octeontx2/ with drivers/crypto/cnxk/
- Rename config/arm/arm64_octeontx2_linux_gcc as
config/arm/arm64_cn9k_linux_gcc
- Update the documentation and MAINTAINERS to reflect the same.
- Change the reference to OCTEONTX2 as OCTEON 9. Old release notes and
the kernel related documentation is not accounted for this change.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Number of events scheduled and available for dequeue
after token pop was set to dequeue_depth-1 instead of
dequeue_depth in test_delayed_pop. The expectation is
that all dequeue_depth number of events can be dequeued
once the last event is released.
Fixes: 07d55c418d ("event/dlb2: add delayed token pop logic")
Cc: stable@dpdk.org
Signed-off-by: Rashmi Shetty <rashmi.shetty@intel.com>
Reviewed-by: Mike Ximing Chen <mike.ximing.chen@intel.com>
Replace RTE_EVENT_DEV_CAP_REQUIRES_MAINT, which signaled the need
for the application to call rte_event_maintain(), with
RTE_EVENT_DEV_CAP_MAINTENANCE_FREE, which does the opposite (i.e.,
signifies that the event device does not require maintenance).
This approach is more in line with how other eventdev hardware and/or
software limitations are handled in the Eventdev API.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reported by clang 13.
This patch removes the inflights variable from the sw_dump function
within the software section of the event driver as it is an unused but
set variable.
Bugzilla ID: 881
Fixes: c66baa68e4 ("event/sw: add dump function for easier debugging")
Cc: stable@dpdk.org
Reported-by: Liang Longfeng <longfengx.liang@intel.com>
Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Set the RTE_EVENT_DEV_CAP_REQUIRES_MAINT flag, and perform DSW
background tasks on rte_event_maintain() calls.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Tested-by: Richard Eklycke <richard.eklycke@ericsson.com>
Tested-by: Liron Himi <lironh@marvell.com>
SSO group base addresses are always are always contiguous we
need not store all the base addresses in workslot memory, instead
just store the base address and compute the group address offset
when required.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
The transmit loop incorrectly assumes that nb_mbufs is always
a multiple of 4 when transmitting an event vector. The max
size of the vector might not be reached and pushed out early
due to timeout.
Fixes: 761a321acf ("event/cnxk: support vectorized Tx event fast path")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Disable drop_re i.e dropping packets with receive errors on
vector enable for few cn10k revisions due to HW errata.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Removing direct access to interrupt handle structure fields,
rather use respective get set APIs for the same.
Making changes to all the drivers access the interrupt handle fields.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
FINISHED state seems to be used to indicate that the worker's update
of the 'state' is not visible to other threads. There seems to be no
requirement to have such a state.
Since the FINISHED state is removed, the API rte_eal_wait_lcore
is updated to always return the status of the last function that
ran in the worker core.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Feifei Wang <feifei.wang2@arm.com>
Fix the mbuf offload flags namespace by adding an RTE_ prefix to the
name. The old flags remain usable, but a deprecation warning is issued
at compilation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
The device specific structures - rte_cryptodev
and rte_cryptodev_data are moved to cryptodev_pmd.h
to hide it from the applications.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Tested-by: Rebecca Troy <rebecca.troy@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
CNF10KA does not differ it terms of RVU resources from
CN10KA platform hence add it to list of devices respective
drivers support.
Otherwise devices on CNF10KA are not probed even though
compatible drivers exist.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
This commit implements the changes required for using suggested
port type hint feature. Each port uses different credit quanta
based on port type specified using port configuration flags.
Each port has separate quanta defined in dlb2_priv.h
Producer and consumer ports will need larger quanta value to reduce number
of credit calls they make. Workers can use small quanta as they mostly
work out of locally cached credits and don't request/return credits often.
Signed-off-by: Pravin Pathak <pravin.pathak@intel.com>
Hide rte_event_timer_adapter_pmd.h file as it is an internal file.
Remove rte_ prefix from rte_event_timer_adapter_ops structure.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Invoke event_dev_probing_finish() function at the end of probing,
this function sets the function pointers in the fp_ops flat array.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Mark all the driver specific functions as internal, remove
`rte` prefix from `struct rte_eventdev_ops`.
Remove experimental tag from internal functions.
Remove `eventdev_pmd.h` from non-internal header files.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Include vector configuration into the structure
``rte_event_eth_rx_adapter_queue_conf`` that is used to configure
Rx adapter ethernet device Rx queue parameters.
This simplifies event vector configuration as it avoids splitting
configuration per Rx queue.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reduced max chunk pool cache size from RTE_MEMPOOL_CACHE_MAX_SIZE(512)
to 128.
If chunk pool cache is empty, it gets filled during arm. Filling 512
entries at a time will fail arm if timeout is shorter, hence
reduce the pool cache size.
Fixes: 0e792433d0 ("event/cnxk: create and free timer adapter")
Cc: stable@dpdk.org
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Type of kvargs value and handler function argument should match to avoid
spilling memory.
Fixes: 7ffa737996 ("event/cnxk: add option to configure getwork mode")
Cc: stable@dpdk.org
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Since AARCH32 extension is not implemented on octeontx2 family, only
enable build for 64bit.
Due to Linux kernel AF(Admin Function) driver dependency, only enable
build for 64-bit Linux.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Since AARCH32 extension is not implemented on octeontx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for 64-bit Linux.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add RTE_ prefix to internal API defined in public header.
Use the prefix instead of double underscore.
Use uppercase for macros in the case of name conflict.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Fix the mempool flags namespace by adding an RTE_ prefix to the name.
The old flags remain usable, to be deprecated in the future.
Flag MEMPOOL_F_NON_IO added in the release is just renamed to have RTE_
prefix.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add support to create and submit CPT instructions on Tx
on CN10K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to receive CPT processed packets on Rx via
second pass on CN10K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to create and submit CPT instructions on Tx
on CN9K SoC.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to receive CPT processed packets on Rx for
CN9K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support for inline inbound and outbound IPSec for SA create,
destroy and other NIX / CPT LF configurations.
This patch also changes dpdk-devbind.py to list new inline
device as misc device.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The rte_cryptodev_pmd.* files are for drivers only and should be
private to DPDK, and not installed for app use.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Added a common macro to set eventdev enqueue and
dequeue operations to reduce code.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Start a new release cycle with empty release notes.
The ABI version becomes 22.0.
The map files are updated to the new ABI major number (22).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Reads to Tx queue FC memory need to be atomic to avoid cores using
same Tx queue spinning on stale values.
Fixes: 313e884a22 ("event/cnxk: support Tx adapter fast path")
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add Rx event vector fastpath to convert HW defined metadata into
rte_mbuf and rte_event_vector.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add event vector support for cnxk event Rx adapter, add control path
APIs to get vector limits and ability to configure event vectorization
on a given Rx queue.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support for event eth Tx adapter fastpath operations.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support for event eth Rx adapter fastpath operations.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support for event eth Rx adapter.
Resize cn10k workslot fastpath structure to fit in 64B cacheline size.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Previously, the semantics of power monitor were such that we were
checking current value against the expected value, and if they matched,
then the sleep was aborted. This is somewhat inflexible, because it only
allowed us to check for a specific value in a specific way.
This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define
their own comparison semantics and decision making on how to detect the
need to abort the entering of power optimized state.
Existing implementations are adjusted to follow the new semantics.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Added support for crypto adapter OP_FORWARD mode.
As OcteonTx CPT crypto completions could be out of order, each crypto op
is enqueued to CPT, dequeued from CPT and enqueued to SSO one-by-one.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
In poll mode driver of octeontx2 the RQ is connected to a CQ and it is
responsible for asserting backpressure to the CGX channel.
When event eth Rx adapter is configured, the RQ is connected to a event
queue, to enable backpressure we need to configure AURA assigned to a
given RQ to backpressure CGX channel.
Event device expects unique AURA to be configured per ethernet device.
If multiple RQ from different ethernet devices use the same AURA,
the backpressure will be disabled, application can override this
using devargs:
-a 0002:0e:00.0,force_rx_bp=1
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Set the appropriate capability flags for the RX, crypto and timer
eventdev adapters to use.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Tested-by: Heng Wang <heng.wang@ericsson.com>
clang-10 build issue log:
drivers/event/cnxk/cnxk_tim_worker.h:372:23:
warning: value size does not match register size
specified by the constraint and modifier [-Wasm-operand-widths]
: [rem] "=&r"(rem)
^
cnxk/cnxk_tim_worker.h:365:17: note: use constraint modifier "w"
"ldxr %[rem], [%[crem]] \n"
^~~~~~
%w[rem]
Changed variable type to match register size, which placates clang.
Fixes: 300b796262 ("event/cnxk: add timer arm routine")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Currently LSO formats setup initially are expected to be
compile time constants and start from 0.
Change the logic in slow and fast path so that LSO format indexes
are only determined runtime.
Fixes: 3b635472a9 ("net/octeontx2: support TSO offload")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Optimized dequeue using x86 vector instructions was added
in 21.05, but due to limited testing the default has been
changed back to the scalar mode implementation. The vector mode
implementation can be enabled via the devargs option
"vector_opts_enabled=<y/Y>".
Fixes: 000a7b8e75 ("event/dlb2: optimize dequeue operation")
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
The HW scheduling type was not being extracted properly
in the vector optimized dequeue path. It was also not
being recorded in the xstats.
Fixes: 000a7b8e75 ("event/dlb2: optimize dequeue operation")
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Deferred scheduling is a DLB v1.0 feature, and is not valid for
DLB v2.0 or v2.5.
Fixes: bc62748bd7 ("event/dlb2: add private data structures and constants")
Fixes: a2e4f1f5e7 ("event/dlb2: add dequeue and its burst variants")
Cc: stable@dpdk.org
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Let's try to enforce the convention where most drivers use a pmd. logtype
with their class reflected in it, and libraries use a lib. logtype.
Introduce two new macros:
- RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is
used in a component. It is associated to the default name provided
by the build system,
- RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used,
and then the passed name is appended to the default name,
RTE_LOG_REGISTER is left untouched for existing external users
and for components that do not comply with the convention.
There is a new Meson variable log_prefix to adapt the default name
for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes.
Note: achieved with below commands + reverted change on net/bonding +
edits on crypto/virtio, compress/mlx5, regex/mlx5
$ git grep -l RTE_LOG_REGISTER drivers/ |
while read file; do
pattern=${file##drivers/};
class=${pattern%%/*};
pattern=${pattern#$class/};
drv=${pattern%%/*};
case "$class" in
baseband) pattern=pmd.bb.$drv;;
bus) pattern=bus.$drv;;
mempool) pattern=mempool.$drv;;
*) pattern=pmd.$class.$drv;;
esac
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
done
$ git grep -l RTE_LOG_REGISTER lib/ |
while read file; do
pattern=${file##lib/};
pattern=lib.${pattern%%/*};
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Add devargs to control each event timer adapter i.e. TIM rings internal
parameters uniquely. The following dict format is expected
[ring-chnk_slots-disable_npa-stats_ena]. 0 represents default values.
Example:
--dev "0002:1e:00.0,tim_ring_ctl=[2-1023-1-0]"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add event timer adapter statistics get and reset functions.
Stats are disabled by default and can be enabled through devargs.
Example:
--dev "0002:1e:00.0,tim_stats_ena=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add function to cancel event timer that has been armed.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add event timer arm timeout burst function.
All the timers requested to be armed have the same timeout.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add TIM bucket operations used for event timer arm and cancel.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add devargs to control default chunk size and max numbers of
timer rings to attach to a given RVU PF.
Example:
--dev "0002:1e:00.0,tim_chnk_slots=1024"
--dev "0002:1e:00.0,tim_rings_lmt=4"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add internal SSO functions to allow event adapters to resize SSO buffers
that are used to hold in-flight events in DRAM.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
If the chunks are allocated from NPA then TIM can automatically free
them when traversing the list of chunks.
Add devargs to disable NPA and use software mempool to manage chunks.
Example:
--dev "0002:0e:00.0,tim_disable_npa=1"
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
When the application calls timer adapter create the following is used:
- Allocate a TIM LF based on number of LF's provisioned.
- Verify the config parameters supplied.
- Allocate memory required for
* Buckets based on min and max timeout supplied.
* Allocate the chunk pool based on the number of timers.
On Free:
- Free the allocated bucket and chunk memory.
- Free the TIM lf allocated.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add eventdev start function along with few cleanup API's to maintain
sanity.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add devargs to configure the platform specific getwork mode.
CN9K getwork mode by default is set to use dual workslot mode.
Add option to force single workslot mode.
Example:
--dev "0002:0e:00.0,single_ws=1"
CN10K supports multiple getwork prefetch modes, by default the
prefetch mode is set to none.
Add option to select getwork prefetch mode
Example:
--dev "0002:1e:00.0,gw_mode=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
SSO HWGRPs i.e. queue uses DRAM & SRAM buffers to hold in-flight
events. By default the buffers are assigned to the SSO HWGRPs to
satisfy minimum HW requirements. SSO is free to assign the remaining
buffers to HWGRPs based on a preconfigured threshold.
We can control the QoS of SSO HWGRP by modifying the above mentioned
thresholds. HWGRPs that have higher importance can be assigned higher
thresholds than the rest.
Example:
--dev "0002:0e:00.0,qos=[1-50-50-50]" // [Qx-XAQ-TAQ-IAQ]
Qx -> Event queue Aka SSO GGRP.
XAQ -> DRAM In-flights.
TAQ & IAQ -> SRAM In-flights.
The values need to be expressed in terms of percentages, 0 represents
default.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
The number of events for a *open system* event device is specified
as -1 as per the eventdev specification.
Since, SSO inflight events are only limited by DRAM size, the
xae_cnt devargs parameter is introduced to provide upper limit for
in-flight events.
Example:
--dev "0002:0e:00.0,xae_cnt=8192"
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Allocate buffers in DRAM that hold inflight events.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add platform specific event device configuration that attaches the
requested number of SSO HWS(event ports) and HWGRP(event queues) LFs
to the RVU PF/VF.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add platform specific event device probe and remove, also add
event device info get function.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Add the info_get function to return details on the queues, flow,
prioritization capabilities, etc. which this device has.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add meson build infra structure along with the event device
SSO initialization and teardown functions.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Convert code to use x86 vector instructions, thereby significantly
improving dequeue performance.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
The new devarg names and their default values
are listed below. The defaults have not changed, and
none of these parameters are accessed in the fast path.
poll_interval=1000
sw_credit_quantai=32
default_depth_thresh=256
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
All references to the old register map have been removed,
so it is safe to rename the new combined file that supports
both DLB v2.0 and DLB v2.5. Also fixed all places where this
file is included.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
As support for DLB v2.5 was added, modifications were made to
dlb_hw_types_new.h, but the old file needed to be preserved during
the port in order to meet the requirement that individual patches in
a series each compile successfully. Since the DLB v2.5 support is
completely integrated, it is now safe to remove the old (original)
file, as well as the DLB2_USE_NEW_HEADERS define that was used to
control which version of the file was to be included in certain
source files.
It is now safe to rename the new file, and use it unconditionally
in all DLB source files.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
The file dlb_resource_new.c now contains all of the low level
functions required to support both DLB v2.0 and DLB v2.5, and
the original file (dlb_resource.c) was removed in the previous
commit, so rename dlb_resource_new.c to dlb_resource.c, and
update the meson build file so that the new file is built.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
A temporary version of dlb_resource.h (dlb_resource_new.h) was used
by the previous commits in this patch series. Merge the two files
now that DLB v2.5 support has been fully added to dlb_resource.c.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions that perform the sequence number
management functions. These include getting a groups number of
sequence numbers per queue, managing in-use slots, getting the
current occupancy, and setting sequence numbers for a group.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
configuring sparse CQ mode, where each cache line
contains just one QE instead of 4.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
finishing the queue map/unmap operation, which is an
asynchronous operation.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level hardware functions responsible for
getting the queue depth. The command arguments are also
validated.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
DLB v2.5 uses a different credit scheme than was used in DLB v2.0 .
Specifically, there is a single credit pool for both load balanced
and directed traffic, instead of a separate pool for each as is
found with DLB v2.0.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
starting the scheduling domain. Once a domain is
started, its resources can no longer be configured,
except for QID remapping and port enable/disable.
The start domain arguments are validated, and an error
is returned if validation fails, or if the domain is
not configured or has already been started.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
removing the linkage between a queue and a load
balanced port. Runtime checks are performed on the
port and queue to make sure the state is appropriate
for the unmap operation, and the unmap arguments
are also validated.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
mapping queues to ports. These functions also validate
the map arguments and verify that the maximum number
of queues linked to a load balanced port does not
exceed the capabilities of the hardware.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware
version, v2.0 or v2.5.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
creating directed queues. These functions configure
the depth threshold, configure queue depth, and
validate the queue creation arguments.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware
version, v2.0 or v2.5.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
creating directed ports. These functions create the
producer port (PP), configure the consumer queue (CQ),
configure queue depth, and validate the port creation
arguments.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware
version, v2.0 or v2.5.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update the low level HW functions responsible for
creating load balanced ports. These functions create the
producer port (PP), configure the consumer queue (CQ), and
validate the port creation arguments.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware
version, v2.0 or v2.5.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Updated low level hardware functions related to configuring
load balanced queues. These functions create the queues,
as well as attach related resources required by load
balanced queues, such as sequence numbers.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action based on the hardware
version, v2.0 or v2.5.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Reset hardware registers, consumer queues, ports,
interrupts and software. Queues must also be drained
as part of the reset process.
The logic is very similar to what was done for v2.0,
but the new combined register map for v2.0 and v2.5
uses new register names and bit names. Additionally,
new register access macros are used so that the code
can perform the correct action, based on the hardware
version, v2.0 or v2.5.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Update domain creation logic to account for DLB v2.5
credit scheme, new register map, and new register access
macros.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
DLB v2.5 uses a new credit scheme, where directed and load balanced
credits are unified, instead of having separate directed and load
balanced credit pools.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Add support for DLB v2.5 probe-time hardware init,
and sets up a framework for incorporating the remaining
changes required to support DLB v2.5.
DLB v2.0 and DLB v2.5 are similar in many respects, but their
register offsets and definitions are different. As a result of these,
differences, the low level hardware functions must take the device
version into consideration. This requires that the hardware version be
passed to many of the low level functions, so that the PMD can
take the appropriate action based on the device version.
To ease the transition and keep the individual patches small, three
temporary files are added in this commit. These files have "new"
in their names. The files with "new" contain changes specific to a
consolidated PMD that supports both DLB v2.0 and DLB 2.5. Their sister
files of the same name (minus "new") contain the old DLB v2.0 specific
code. The intent is to remove code from the original files as that code
is ported to the combined DLB 2.0/2.5 PMD model and added to the "new"
files in a series of commits. At end of the patch series, the old files
will be empty and the "new" files will have the logic needed
to implement a single PMD that supports both DLB v2.0 and DLB v2.5.
At that time, the original DLB v2.0 specific files will be deleted,
and the "new" files will be renamed and replace them.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Add auto-generated register definitions, updated to
support both DLB v2.0 and v2.5 devices.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
This commit adds dlb v2.5 probe support, and updates
parameter parsing.
The dlb v2.5 device differs from dlb v2, in that the
number of resources (ports, queues, ...) is different,
so macros have been added to take the device version
into account.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
- Remove references of FPGA.
- Do not include dlb2_mbox.h as it is not needed.
- Remove duplicate macros/defines that were
present in both dlb2_priv.h and dlb2_hw_types.h.
Update dlb2_resource.c to include dlb2_priv.h
so that it picks up the macros/defines that
have now been consolidated.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Configure xaq pool based on number of in-use crypto queues to avoid CPT
add work failure due to xaq buffer run out. This patch configures
OTX2_CPT_DEFAULT_CMD_QLEN number of xae entries per queue pair.
Fixes: 29768f78d5 ("event/octeontx2: add crypto adapter framework")
Cc: stable@dpdk.org
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Parameter queue_pair_id of crypto adapter queue pair add/del operation
can be -1 to select all pre configured crypto queue pairs. Added support
for the same in driver. Also added a member in cpt qp structure to
indicate binding state of a queue pair to an event queue.
Fixes: 29768f78d5 ("event/octeontx2: add crypto adapter framework")
Cc: stable@dpdk.org
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Ankur Dwivedi <adwivedi@marvell.com>
Ensure all lists of drivers are standardized:
* one driver per line
* lists double-indented with spaces (as they are line continuations)
* elements in alphabetical order
* opening and closing list brackets "[" & "]" on own lines
* last element has trailing comma
Any code snippets in the list files is adjusted to single-indent using
whitespace to correspond to the new style also.
The lists of standard library dependencies per class, and other short
lists are not formatted one-per-line as these lists are not expected to
grow beyond 2 or 3 entries.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
These headers are used but not included explicitly, including them.
"arpa/inet.h" is included for 'htons' and friends.
"netinet/in.h" is included for 'IPPROTO_IP'.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Rasesh Mody <rmody@marvell.com>
When device is re-configured, memory allocated for work slot is freed
and new memory is allocated. Due to this we may loose some important
configurations/mappings done with initial work slot memory.
For example, whenever rte_event_eth_tx_adapter_queue_add is called
some important meta i.e. txq handle is stored in work slot structure.
If device gets reconfigured after this tx adaptor add, txq to work
slot mapping will be lost resulting in seg fault during packet
processing, as txq handle could not be retrieved from work slot.
Fixes: 67b5f46864 ("event/octeontx2: add port config functions")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Use virtual counter for estimating current bucket as PMU cannot be
reliably used to estimate time.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reduce amount of memory used by chunk pool when the mempool used
is OCTEONTX2 NPA.
Previously, the number of chunks configured when NPA is used is
equal to the number of timers requested plus the number of buckets
and if the max timeout is long enough w.r.t. resolution requested
there will a large number of buckets which would cause high memory
usage.
Reduce the number of chunks when NPA is used to the number of timers
requested as buckets that are processed chunk lists are automatically
freed.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>