Add devargs support to modify number of chunk slots. Chunks are used to
store event timers, a chunk can be visualised as an array where the last
element points to the next chunk and rest of them are used to store
events. TIM traverses the list of chunks and enqueues the event timers
to SSO.
If no argument is passed then a default value of 255 is taken.
Example:
--dev "0002:0e:00.0,tim_chnk_slots=511"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
If the chunks are allocated from NPA then TIM can automatically free
them when traversing the list of chunks.
Add devargs to disable NPA and use software mempool to manage chunks.
Example:
--dev "0002:0e:00.0,tim_disable_npa=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Allow TIM to optimize user supplied configuration based on
RTE_EVENT_TIMER_ADAPTER_F_ADJUST_RES flag.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
When the application calls timer adapter create the following is used:
- Allocate a TIM lf based on number of lf's provisioned.
- Verify the config parameters supplied.
- Allocate memory required for
* Buckets based on min and max timeout supplied.
* Allocate the chunk pool based on the number of timers.
On Free:
- Free the allocated bucket and chunk memory.
- Free the TIM lf allocated.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add selftest to verify sanity of SSO.
Can be run by passing devargs to SSO PF as follows:
Example:
--dev "0002:0e:00.0,selftest=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
SSO GGRPs i.e. queue uses DRAM & SRAM buffers to hold in-flight
events. By default the buffers are assigned to the SSO GGRPs to
satisfy minimum HW requirements. SSO is free to assign the remaining
buffers to GGRPs based on a preconfigured threshold.
We can control the QoS of SSO GGRP by modifying the above mentioned
thresholds. GGRPs that have higher importance can be assigned higher
thresholds than the rest.
Example:
--dev "0002:0e:00.0,qos=[1-50-50-50]" // [Qx-XAQ-TAQ-IAQ]
Qx -> Event queue Aka SSO GGRP.
XAQ -> DRAM In-flights.
TAQ & IAQ -> SRAM In-flights.
The values need to be expressed in terms of percentages, 0 represents
default.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Octeontx2 SSO by default is set to use dual workslot mode.
Add devargs option to force legacy mode i.e. single workslot mode.
Example:
--dev "0002:0e:00.0,single_ws=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
OcteonTx2 AP core SSO cache contains two entries each entry caches
state of an single GWS aka event port.
AP core requests events from SSO by using following sequence :
1. Write to SSOW_LF_GWS_OP_GET_WORK
2. Wait for SSO to complete scheduling by polling on SSOW_LF_GWS_TAG[63]
3. SSO notifies core by clearing SSOW_LF_GWS_TAG[63] and if work is
valid SSOW_LF_GWS_WQP is non-zero.
The above sequence uses only one in-core cache entry.
In dual workslot mode we try to use both the in-core cache entries by
triggering GET_WORK on a second workslot as soon as the above sequence
completes. This effectively hides the schedule latency of SSO if there
are enough events with unique flow_tags in-flight.
This mode reserves two SSO GWS lf's for each event port effectively
doubling single core performance.
Dual workslot mode is the default mode of operation in octeontx2.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Add support for retrieving statistics from SSO GWS and GGRP.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Register and implement SSO GWS and GGRP IRQ handlers for error
interrupts.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Links between queues and ports are controlled by setting/clearing GGRP
membership in SSOW_LF_GWS_GRPMSK_CHG.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
The number of events for a *open system* event device is specified
as -1 as per the eventdev specification.
Since, Octeontx2 SSO inflight events are only limited by DRAM size, the
xae_cnt devargs parameter is introduced to provide upper limit for
in-flight events.
Example:
--dev "0002:0e:00.0,xae_cnt=8192"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add the device configure function that attaches the requested number of
SSO GWS(event ports) and GGRP(event queues) LF's to the PF.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add the info_get function to return details on the queues, flow,
prioritization capabilities, etc. which this device has.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
SSO object needs to be initialized to communicate with the kernel AF
driver through mbox using the common API's.
Also, initialize the internal eventdev structure to defaults.
Attach NPA lf to the PF if needed.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add the make and meson based build infrastructure along with the
eventdev(SSO) device probe.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
When producer type is event timer adapter producer lcore checks are
skipped. Since, timer adapter relies on SW to arm timers producer lcore
is essential for its functionality.
Verify producer lcore validity when producer type is event timer
adapter.
Fixes: b01974da9f25 ("app/eventdev: add ethernet device producer option")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Configure event ports based on the underlying event device info rather
than using hardcoded values.
Fixes: 5710e751813e ("app/testeventdev: add order port setup")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Replace the mbuf pointer array in the event eth Rx adapter
callback with an event array. Using an event array allows
the application to change attributes of the events enqueued
by the SW adapter.
The callback can drop packets and populate a callback
argument with the number of dropped packets. Add a Rx adapter
stats field to keep track of the total number of dropped packets.
This commit removes the experimental tags from
the callback and stats APIs, the experimental tag from eventdev
is also removed and eventdev functions become part of the
main DPDK API/ABI.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
This patch introduces a new version of the event timer adapter software
PMD. In the original design, timer event producer lcores in the primary
and secondary processes enqueued event timers into a ring, and a
service core in the primary process dequeued them and processed them
further. To improve performance, this version does away with the ring
and lets lcores insert timers directly into timer skiplist data
structures; the service core directly accesses the lists as well, when
looking for timers that have expired.
To compare the burst and non-burst performance of the original and new
versions of the software event timer adapter, I ran the following
commands:
$ sudo ./build/app/dpdk-test-eventdev -c 0xFFE -s 0xC --vdev=event_sw0 \
-- --test=perf_queue --plcores=4,5,6 --wlcore=7,8,9 --stlist=p \
--prod_type_timerdev --worker_deq_depth=32
$ sudo ./build/app/dpdk-test-eventdev -c 0xFFE -s 0xC --vdev=event_sw0 \
-- --test=perf_queue --plcores=4,5,6 --wlcore=7,8,9 --stlist=p \
--prod_type_timerdev_burst --worker_deq_depth=32
With the new version, I see a 151% improvement in throughput for the
non-burst case, and a 270% improvement in throughput for the burst case.
I also see a 53% improvement in arm latency in the non-burst case and a
65% improvement in arm latency in the burst case.
Note: To perform the test, I commented out a check in the original
version that checks the adapter tick interval against a minimum value.
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Setup event when the Rx queue is added to the
adapter in place of generating the event when it is
being enqueued to the event device.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Remove copy from temporary event array on the stack to the
enqueue buffer event array entry, instead initialize event in the
enqueue buffer event array entry.
Suggested-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Correct the wrong -march=-mcpu=armv8.2-a+crc+crypto+lse for
octeontx2 target. Since rte_cc_has_argument drops invalid
CFLAG and -mcpu=octeontx2 picks up the correct optimization,
this typo is not noticed in performance testing.
Fixes: 01d184798731 ("config: add octeontx2 machine")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
For each driver where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
For each library where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
When configuring with meson we print out a list of enabled components, but
it is also useful to list out the disabled components and the reasons why.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The vhost_crypto example app did not check for a libdpdk pkg-config file
and attempt to build using that. Add support for that method of compile to
align the app with the other examples.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The vdpa example app did not check for a libdpdk pkg-config file and
attempt to build using that. Add support for that method of compile to
align the app with the other examples.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The pkg-config file generated as part of the build of DPDK should allow
applications to be built with an installed DPDK. We can test this as
part of the build by doing an install of DPDK to a temporary directory
within the build folder, and by then compiling up a few sample apps
using make working off that directory.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Allow the script to run with a reduced set of builds if clang, or
other compilers, are missing.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
For testing of DPDK, we want to override the prefix given by the
pkg-config file, so that we can get correct paths for DPDK installed
in an unusual location.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
When running self-tests, the driver needs to know the device on which to
run the tests, so we need to take the device ID as parameter. Only the
skeleton driver is providing this selftest capability right now, so we can
easily update it for this change.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Use a variable value rather than compile-time constant zero as the
device id for the skeleton rawdev tests. This ensures we can make the
tests work even if other rawdevs are present.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>