raw/ioat: remove deprecated driver

The ioat rawdev driver has been superseded by the ioat and idxd dmadev
drivers, and has been deprecated for some time, so remove it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Author: Bruce Richardson, 2022-09-20 16:54:45 +01:00
Committed by: Thomas Monjalon
parent 8841fbadda
commit 8c76e2f693
20 changed files with 0 additions and 3880 deletions


@@ -1375,11 +1375,6 @@ T: git://dpdk.org/next/dpdk-next-net-intel
F: drivers/raw/ifpga/
F: doc/guides/rawdevs/ifpga.rst

IOAT Rawdev - DEPRECATED
M: Bruce Richardson <bruce.richardson@intel.com>
F: drivers/raw/ioat/
F: doc/guides/rawdevs/ioat.rst

Marvell CNXK BPHY
M: Jakub Palider <jpalider@marvell.com>
M: Tomasz Duszynski <tduszynski@marvell.com>


@@ -46,7 +46,6 @@ The public API headers are grouped by topics:
[i40e](@ref rte_pmd_i40e.h),
[ice](@ref rte_pmd_ice.h),
[iavf](@ref rte_pmd_iavf.h),
[ioat](@ref rte_ioat_rawdev.h),
[bnxt](@ref rte_pmd_bnxt.h),
[cnxk](@ref rte_pmd_cnxk.h),
[dpaa](@ref rte_pmd_dpaa.h),


@@ -24,7 +24,6 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/drivers/net/softnic \
@TOPDIR@/drivers/raw/dpaa2_cmdif \
@TOPDIR@/drivers/raw/ifpga \
@TOPDIR@/drivers/raw/ioat \
@TOPDIR@/lib/eal/include \
@TOPDIR@/lib/eal/include/generic \
@TOPDIR@/lib/acl \


@@ -15,5 +15,4 @@ application through rawdev API.
cnxk_gpio
dpaa2_cmdif
ifpga
ioat
ntb


@@ -1,333 +0,0 @@
.. SPDX-License-Identifier: BSD-3-Clause
Copyright(c) 2019 Intel Corporation.
.. include:: <isonum.txt>
IOAT Rawdev Driver
===================
.. warning::
As of DPDK 21.11 the rawdev implementation of the IOAT driver has been deprecated.
Please use the dmadev library instead.
The ``ioat`` rawdev driver provides a poll-mode driver (PMD) for Intel\ |reg|
Data Streaming Accelerator `(Intel DSA)
<https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator>`_ and for Intel\ |reg|
QuickData Technology, part of Intel\ |reg| I/O Acceleration Technology
`(Intel I/OAT)
<https://www.intel.com/content/www/us/en/wireless-network/accel-technology.html>`_.
This PMD, when used on supported hardware, allows data copies, for example,
cloning packet data, to be accelerated by that hardware rather than having to
be done by software, freeing up CPU cycles for other tasks.
Hardware Requirements
----------------------
The ``dpdk-devbind.py`` script, included with DPDK,
can be used to show the presence of supported hardware.
Running ``dpdk-devbind.py --status-dev misc`` will show all the miscellaneous,
or rawdev-based devices on the system.
For Intel\ |reg| QuickData Technology devices, the hardware will often be listed as "Crystal Beach DMA",
or "CBDMA".
For Intel\ |reg| DSA devices, they currently (at the time of writing) appear as devices with type "0b25",
due to the absence of pci-id database entries for them at this point.
Compilation
------------
For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based.
No additional compilation steps are necessary.
.. note::
Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver
will only be built if their ``dmadev`` counterparts are not built.
The following can be used to disable the ``dmadev`` drivers,
if the raw drivers are to be used instead::
$ meson -Ddisable_drivers=dma/* <build_dir>
Device Setup
-------------
Depending on support provided by the PMD, HW devices can either use the kernel configured driver
or be bound to a user-space IO driver for use.
For example, Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers,
such as ``vfio-pci``.
Intel\ |reg| DSA devices using idxd kernel driver
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured.
The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration.
.. note::
The device configuration can also be done by directly interacting with the sysfs nodes.
An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py``
included in the driver source directory.
There are some mandatory configuration steps before being able to use a device with an application.
The internal engines, which do the copies or other operations,
and the work-queues, which are used by applications to assign work to the device,
need to be assigned to groups, and the various other configuration options,
such as priority or queue depth, need to be set for each queue.
To assign an engine to a group::
$ accel-config config-engine dsa0/engine0.0 --group-id=0
$ accel-config config-engine dsa0/engine0.1 --group-id=1
To assign work queues to groups for passing descriptors to the engines, a similar accel-config command can be used.
However, the work queues also need to be configured depending on the use case.
Some configuration options include:
* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously.
* priority: WQ priority between 1 and 15. Larger value means higher priority.
* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device.
* type: WQ type (kernel/mdev/user). Determines how the device is presented.
* name: identifier given to the WQ.
Example configuration for a work queue::
$ accel-config config-wq dsa0/wq0.0 --group-id=0 \
--mode=dedicated --priority=10 --wq-size=8 \
--type=user --name=dpdk_app1
Once the devices have been configured, they need to be enabled::
$ accel-config enable-device dsa0
$ accel-config enable-wq dsa0/wq0.0
Check the device configuration::
$ accel-config list
Devices using VFIO/UIO drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The HW devices to be used will need to be bound to a user-space IO driver for use.
The ``dpdk-devbind.py`` script can be used to view the state of the devices
and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
For example::
$ dpdk-devbind.py -b vfio-pci 00:04.0 00:04.1
Device Probing and Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will
be found as part of the device scan done at application initialization time without
the need to pass parameters to the application.
For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the
maximum number of workqueues available on it, partitioning all resources equally
among the queues.
If fewer workqueues are required, then the ``max_queues`` parameter may be passed to
the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.::
$ dpdk-test -a <b:d:f>,max_queues=4
For devices bound to the IDXD kernel driver,
the DPDK ioat driver will automatically perform a scan for available workqueues to use.
Any workqueues found listed in ``/dev/dsa`` on the system will be checked in ``/sys``,
and any which have ``dpdk_`` prefix in their name will be automatically probed by the
driver to make them available to the application.
Alternatively, to support use by multiple DPDK processes simultaneously,
the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue name prefix,
instead of ``dpdk_``,
allowing each DPDK application instance to only use a subset of configured queues.
Once probed successfully, irrespective of kernel driver, the device will appear as a ``rawdev``,
that is a "raw device type" inside DPDK, and can be accessed using APIs from the
``rte_rawdev`` library.
Using IOAT Rawdev Devices
--------------------------
To use the devices from an application, the rawdev API can be used, along
with definitions taken from the device-specific header file
``rte_ioat_rawdev.h``. This header is needed to get the definition of
structure parameters used by some of the rawdev APIs for IOAT rawdev
devices, as well as providing key functions for using the device for memory
copies.
Getting Device Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Basic information about each rawdev device can be queried using the
``rte_rawdev_info_get()`` API. For most applications, this API will be
needed to verify that the rawdev in question is of the expected type. For
example, the following code snippet can be used to identify an IOAT
rawdev device for use by an application:
.. code-block:: C
for (i = 0; i < count && !found; i++) {
struct rte_rawdev_info info = { .dev_private = NULL };
found = (rte_rawdev_info_get(i, &info, 0) == 0 &&
strcmp(info.driver_name,
IOAT_PMD_RAWDEV_NAME_STR) == 0);
}
When calling the ``rte_rawdev_info_get()`` API for an IOAT rawdev device,
the ``dev_private`` field in the ``rte_rawdev_info`` struct should either
be NULL, or else be set to point to a structure of type
``rte_ioat_rawdev_config``, in which case the size of the configured device
input ring will be returned in that structure.
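As a minimal sketch of such a query (assuming ``dev_id`` identifies an
already-configured IOAT rawdev), the configured ring size can be read back
as follows:
.. code-block:: C

   struct rte_ioat_rawdev_config conf = { 0 };
   struct rte_rawdev_info info = { .dev_private = &conf };

   /* on success, conf.ring_size holds the configured input ring size */
   if (rte_rawdev_info_get(dev_id, &info, sizeof(conf)) == 0)
           printf("Configured ring size: %u\n", conf.ring_size);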
Device Configuration
~~~~~~~~~~~~~~~~~~~~~
Configuring an IOAT rawdev device is done using the
``rte_rawdev_configure()`` API, which takes the same structure parameters
as the previously referenced ``rte_rawdev_info_get()`` API. The main
difference is that, because the parameter is used as input rather than
output, the ``dev_private`` structure element cannot be NULL, and must
point to a valid ``rte_ioat_rawdev_config`` structure, containing the ring
size to be used by the device. The ring size must be a power of two,
between 64 and 4096.
If it is not needed, the driver's tracking of user-provided completion
handles may be disabled by setting the ``hdls_disable`` flag in
the configuration structure.
The following code shows how the device is configured in
``test_ioat_rawdev.c``:
.. code-block:: C
#define IOAT_TEST_RINGSIZE 512
struct rte_ioat_rawdev_config p = { .ring_size = -1 };
struct rte_rawdev_info info = { .dev_private = &p };
/* ... */
p.ring_size = IOAT_TEST_RINGSIZE;
if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) {
printf("Error with rte_rawdev_configure()\n");
return -1;
}
Once configured, the device can then be made ready for use by calling the
``rte_rawdev_start()`` API.
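For example (a minimal sketch, assuming ``dev_id`` is the device configured
above):
.. code-block:: C

   /* start the configured rawdev; 0 indicates success */
   if (rte_rawdev_start(dev_id) != 0) {
           printf("Error with rte_rawdev_start()\n");
           return -1;
   }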
Performing Data Copies
~~~~~~~~~~~~~~~~~~~~~~~
To perform data copies using IOAT rawdev devices, the functions
``rte_ioat_enqueue_copy()`` and ``rte_ioat_perform_ops()`` should be used.
Once copies have been completed, the completion will be reported back when
the application calls ``rte_ioat_completed_ops()``.
The ``rte_ioat_enqueue_copy()`` function enqueues a single copy to the
device ring for copying at a later point. The parameters to that function
include the IOVA addresses of both the source and destination buffers,
as well as two "handles" to be returned to the user when the copy is
completed. These handles can be arbitrary values, but two are provided so
that the library can track handles for both source and destination on
behalf of the user, e.g. virtual addresses for the buffers, or mbuf
pointers if packet data is being copied.
While the ``rte_ioat_enqueue_copy()`` function enqueues a copy operation on
the device ring, the copy will not actually be performed until after the
application calls the ``rte_ioat_perform_ops()`` function. This function
informs the device hardware of the elements enqueued on the ring, and the
device will begin to process them. It is expected that, for efficiency
reasons, a burst of operations will be enqueued to the device via multiple
enqueue calls between calls to the ``rte_ioat_perform_ops()`` function.
The following code from ``test_ioat_rawdev.c`` demonstrates how to enqueue
a burst of copies to the device and start the hardware processing of them:
.. code-block:: C
struct rte_mbuf *srcs[32], *dsts[32];
unsigned int j;
for (i = 0; i < RTE_DIM(srcs); i++) {
char *src_data;
srcs[i] = rte_pktmbuf_alloc(pool);
dsts[i] = rte_pktmbuf_alloc(pool);
srcs[i]->data_len = srcs[i]->pkt_len = length;
dsts[i]->data_len = dsts[i]->pkt_len = length;
src_data = rte_pktmbuf_mtod(srcs[i], char *);
for (j = 0; j < length; j++)
src_data[j] = rand() & 0xFF;
if (rte_ioat_enqueue_copy(dev_id,
srcs[i]->buf_iova + srcs[i]->data_off,
dsts[i]->buf_iova + dsts[i]->data_off,
length,
(uintptr_t)srcs[i],
(uintptr_t)dsts[i]) != 1) {
printf("Error with rte_ioat_enqueue_copy for buffer %u\n",
i);
return -1;
}
}
rte_ioat_perform_ops(dev_id);
To retrieve information about completed copies, the API
``rte_ioat_completed_ops()`` should be used. This API will return to the
application a set of completion handles passed in when the relevant copies
were enqueued.
The following code from ``test_ioat_rawdev.c`` shows the test code
retrieving information about the completed copies and validating the data
is correct before freeing the data buffers using the returned handles:
.. code-block:: C
if (rte_ioat_completed_ops(dev_id, 64, (void *)completed_src,
(void *)completed_dst) != RTE_DIM(srcs)) {
printf("Error with rte_ioat_completed_ops\n");
return -1;
}
for (i = 0; i < RTE_DIM(srcs); i++) {
char *src_data, *dst_data;
if (completed_src[i] != srcs[i]) {
printf("Error with source pointer %u\n", i);
return -1;
}
if (completed_dst[i] != dsts[i]) {
printf("Error with dest pointer %u\n", i);
return -1;
}
src_data = rte_pktmbuf_mtod(srcs[i], char *);
dst_data = rte_pktmbuf_mtod(dsts[i], char *);
for (j = 0; j < length; j++)
if (src_data[j] != dst_data[j]) {
printf("Error with copy of packet %u, byte %u\n",
i, j);
return -1;
}
rte_pktmbuf_free(srcs[i]);
rte_pktmbuf_free(dsts[i]);
}
Filling an Area of Memory
~~~~~~~~~~~~~~~~~~~~~~~~~~
The IOAT driver also has support for the ``fill`` operation, where an area
of memory is overwritten, or filled, with a short pattern of data.
Fill operations can be performed in much the same way as copy operations
described above, just using the ``rte_ioat_enqueue_fill()`` function rather
than the ``rte_ioat_enqueue_copy()`` function.
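As a minimal sketch, mirroring the driver's self-test code and assuming
``dev_id`` is a started IOAT rawdev, ``dst`` is an allocated mbuf and ``len``
is the number of bytes to fill, a fill operation could be enqueued as follows:
.. code-block:: C

   uint64_t pattern = 0xfedcba9876543210;

   /* enqueue the fill of "len" bytes, using "dst" as the completion handle */
   if (rte_ioat_enqueue_fill(dev_id, pattern,
                   dst->buf_iova + dst->data_off, len,
                   (uintptr_t)dst) != 1) {
           printf("Error with rte_ioat_enqueue_fill\n");
           return -1;
   }
   /* kick off the operation, completion is later reported via
    * rte_ioat_completed_ops() as for copies
    */
   rte_ioat_perform_ops(dev_id);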
Querying Device Statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The statistics from the IOAT rawdev device can be retrieved via the xstats
functions in the ``rte_rawdev`` library, i.e.
``rte_rawdev_xstats_names_get()``, ``rte_rawdev_xstats_get()`` and
``rte_rawdev_xstats_by_name_get()``. A short example of reading these
counters is given after the list below. The statistics returned for each
device instance are:
* ``failed_enqueues``
* ``successful_enqueues``
* ``copies_started``
* ``copies_completed``
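The following is a minimal sketch of reading and printing these counters,
assuming ``dev_id`` is a valid IOAT rawdev:
.. code-block:: C

   struct rte_rawdev_xstats_name names[16];
   unsigned int ids[16];
   uint64_t values[16];
   int n, i;

   /* query how many stats exist and fetch their names */
   n = rte_rawdev_xstats_names_get(dev_id, names, RTE_DIM(names));
   if (n <= 0)
           return -1;
   if (n > (int)RTE_DIM(names))
           n = RTE_DIM(names);
   for (i = 0; i < n; i++)
           ids[i] = i;
   /* fetch the values for those ids and print them */
   if (rte_rawdev_xstats_get(dev_id, ids, values, n) == n)
           for (i = 0; i < n; i++)
                   printf("%s: %"PRIu64"\n", names[i].name, values[i]);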


@@ -180,10 +180,3 @@ Deprecation Notices
* raw/dpaa2_cmdif: The ``dpaa2_cmdif`` rawdev driver will be deprecated
  in DPDK 22.11, as it is no longer in use, no active user known.
* raw/ioat: The ``ioat`` rawdev driver has been deprecated, since its
  functionality is provided through the new ``dmadev`` infrastructure.
  To continue to use hardware previously supported by the ``ioat`` rawdev driver,
  applications should be updated to use the ``dmadev`` library instead,
  with the underlying HW functionality being provided by the ``ioat`` or
  ``idxd`` dma drivers; a minimal migration sketch is shown below.
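As a rough illustration of that migration path, the rawdev copy flow maps
onto the ``dmadev`` API along these lines. This is a minimal sketch only:
it assumes ``dev_id`` and ``vchan`` identify a configured and started dmadev
device, and that ``src_iova``, ``dst_iova`` and ``len`` describe the buffers
to copy.
.. code-block:: C

   #include <stdbool.h>
   #include <rte_dmadev.h>

   bool has_error = false;
   uint16_t last_idx;

   /* enqueue one copy, then kick the hardware (cf. rte_ioat_perform_ops()) */
   if (rte_dma_copy(dev_id, vchan, src_iova, dst_iova, len, 0) < 0)
           return -1;
   rte_dma_submit(dev_id, vchan);

   /* poll for completion (cf. rte_ioat_completed_ops()) */
   while (rte_dma_completed(dev_id, vchan, 1, &last_idx, &has_error) == 0)
           ;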


@@ -1 +0,0 @@
../../dma/idxd/dpdk_idxd_cfg.py


@@ -1,365 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2021 Intel Corporation
*/
#include <dirent.h>
#include <libgen.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <bus_driver.h>
#include <rte_log.h>
#include <rte_string_fns.h>
#include "ioat_private.h"
/* default value for DSA paths, but allow override in environment for testing */
#define DSA_DEV_PATH "/dev/dsa"
#define DSA_SYSFS_PATH "/sys/bus/dsa/devices"
static unsigned int devcount;
/** unique identifier for a DSA device/WQ instance */
struct dsa_wq_addr {
uint16_t device_id;
uint16_t wq_id;
};
/** a DSA device instance */
struct rte_dsa_device {
struct rte_device device; /**< Inherit core device */
TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */
char wq_name[32]; /**< the workqueue name/number e.g. wq0.1 */
struct dsa_wq_addr addr; /**< Identifies the specific WQ */
};
/* forward prototypes */
struct dsa_bus;
static int dsa_scan(void);
static int dsa_probe(void);
static struct rte_device *dsa_find_device(const struct rte_device *start,
rte_dev_cmp_t cmp, const void *data);
static enum rte_iova_mode dsa_get_iommu_class(void);
static int dsa_addr_parse(const char *name, void *addr);
/** List of devices */
TAILQ_HEAD(dsa_device_list, rte_dsa_device);
/**
* Structure describing the DSA bus
*/
struct dsa_bus {
struct rte_bus bus; /**< Inherit the generic class */
struct rte_driver driver; /**< Driver struct for devices to point to */
struct dsa_device_list device_list; /**< List of PCI devices */
};
struct dsa_bus dsa_bus = {
.bus = {
.scan = dsa_scan,
.probe = dsa_probe,
.find_device = dsa_find_device,
.get_iommu_class = dsa_get_iommu_class,
.parse = dsa_addr_parse,
},
.driver = {
.name = "rawdev_idxd"
},
.device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list),
};
static inline const char *
dsa_get_dev_path(void)
{
const char *path = getenv("DSA_DEV_PATH");
return path ? path : DSA_DEV_PATH;
}
static inline const char *
dsa_get_sysfs_path(void)
{
const char *path = getenv("DSA_SYSFS_PATH");
return path ? path : DSA_SYSFS_PATH;
}
static const struct rte_rawdev_ops idxd_vdev_ops = {
.dev_close = idxd_rawdev_close,
.dev_selftest = ioat_rawdev_test,
.dump = idxd_dev_dump,
.dev_configure = idxd_dev_configure,
.dev_info_get = idxd_dev_info_get,
.xstats_get = ioat_xstats_get,
.xstats_get_names = ioat_xstats_get_names,
.xstats_reset = ioat_xstats_reset,
};
static void *
idxd_vdev_mmap_wq(struct rte_dsa_device *dev)
{
void *addr;
char path[PATH_MAX];
int fd;
snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name);
fd = open(path, O_RDWR);
if (fd < 0) {
IOAT_PMD_ERR("Failed to open device path: %s", path);
return NULL;
}
addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);
if (addr == MAP_FAILED) {
IOAT_PMD_ERR("Failed to mmap device %s", path);
return NULL;
}
return addr;
}
static int
read_wq_string(struct rte_dsa_device *dev, const char *filename,
char *value, size_t valuelen)
{
char sysfs_node[PATH_MAX];
int len;
int fd;
snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s",
dsa_get_sysfs_path(), dev->wq_name, filename);
fd = open(sysfs_node, O_RDONLY);
if (fd < 0) {
IOAT_PMD_ERR("%s(): opening file '%s' failed: %s",
__func__, sysfs_node, strerror(errno));
return -1;
}
len = read(fd, value, valuelen - 1);
close(fd);
if (len < 0) {
IOAT_PMD_ERR("%s(): error reading file '%s': %s",
__func__, sysfs_node, strerror(errno));
return -1;
}
value[len] = '\0';
return 0;
}
static int
read_wq_int(struct rte_dsa_device *dev, const char *filename,
int *value)
{
char sysfs_node[PATH_MAX];
FILE *f;
int ret = 0;
snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s",
dsa_get_sysfs_path(), dev->wq_name, filename);
f = fopen(sysfs_node, "r");
if (f == NULL) {
IOAT_PMD_ERR("%s(): opening file '%s' failed: %s",
__func__, sysfs_node, strerror(errno));
return -1;
}
if (fscanf(f, "%d", value) != 1) {
IOAT_PMD_ERR("%s(): error reading file '%s': %s",
__func__, sysfs_node, strerror(errno));
ret = -1;
}
fclose(f);
return ret;
}
static int
read_device_int(struct rte_dsa_device *dev, const char *filename,
int *value)
{
char sysfs_node[PATH_MAX];
FILE *f;
int ret = 0;
snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s",
dsa_get_sysfs_path(), dev->addr.device_id, filename);
f = fopen(sysfs_node, "r");
if (f == NULL) {
IOAT_PMD_ERR("%s(): opening file '%s' failed: %s",
__func__, sysfs_node, strerror(errno));
return -1;
}
if (fscanf(f, "%d", value) != 1) {
IOAT_PMD_ERR("%s(): error reading file '%s': %s",
__func__, sysfs_node, strerror(errno));
ret = -1;
}
fclose(f);
return ret;
}
static int
idxd_rawdev_probe_dsa(struct rte_dsa_device *dev)
{
struct idxd_rawdev idxd = {{0}}; /* double {} to avoid error on BSD12 */
int ret = 0;
IOAT_PMD_INFO("Probing device %s on numa node %d",
dev->wq_name, dev->device.numa_node);
if (read_wq_int(dev, "size", &ret) < 0)
return -1;
idxd.max_batches = ret;
idxd.qid = dev->addr.wq_id;
idxd.u.vdev.dsa_id = dev->addr.device_id;
idxd.public.portal = idxd_vdev_mmap_wq(dev);
if (idxd.public.portal == NULL) {
IOAT_PMD_ERR("WQ mmap failed");
return -ENOENT;
}
ret = idxd_rawdev_create(dev->wq_name, &dev->device, &idxd, &idxd_vdev_ops);
if (ret) {
IOAT_PMD_ERR("Failed to create rawdev %s", dev->wq_name);
return ret;
}
return 0;
}
static int
is_for_this_process_use(const char *name)
{
char *runtime_dir = strdup(rte_eal_get_runtime_dir());
char *prefix = basename(runtime_dir);
int prefixlen = strlen(prefix);
int retval = 0;
if (strncmp(name, "dpdk_", 5) == 0)
retval = 1;
if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_')
retval = 1;
free(runtime_dir);
return retval;
}
static int
dsa_probe(void)
{
struct rte_dsa_device *dev;
TAILQ_FOREACH(dev, &dsa_bus.device_list, next) {
char type[64], name[64];
if (read_wq_string(dev, "type", type, sizeof(type)) < 0 ||
read_wq_string(dev, "name", name, sizeof(name)) < 0)
continue;
if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) {
dev->device.driver = &dsa_bus.driver;
idxd_rawdev_probe_dsa(dev);
continue;
}
IOAT_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name);
}
return 0;
}
static int
dsa_scan(void)
{
const char *path = dsa_get_dev_path();
struct dirent *wq;
DIR *dev_dir;
dev_dir = opendir(path);
if (dev_dir == NULL) {
if (errno == ENOENT)
return 0; /* no bus, return without error */
IOAT_PMD_ERR("%s(): opendir '%s' failed: %s",
__func__, path, strerror(errno));
return -1;
}
while ((wq = readdir(dev_dir)) != NULL) {
struct rte_dsa_device *dev;
int numa_node = -1;
if (strncmp(wq->d_name, "wq", 2) != 0)
continue;
if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) {
IOAT_PMD_ERR("%s(): wq name too long: '%s', skipping",
__func__, wq->d_name);
continue;
}
IOAT_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name);
dev = malloc(sizeof(*dev));
if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) {
IOAT_PMD_ERR("Error parsing WQ name: %s", wq->d_name);
free(dev);
continue;
}
dev->device.bus = &dsa_bus.bus;
strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name));
TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next);
devcount++;
read_device_int(dev, "numa_node", &numa_node);
dev->device.numa_node = numa_node;
dev->device.name = dev->wq_name;
}
closedir(dev_dir);
return 0;
}
static struct rte_device *
dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
const void *data)
{
struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list);
/* the rte_device struct must be at start of dsa structure */
RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0);
if (start != NULL) /* jump to start point if given */
dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next);
while (dev != NULL) {
if (cmp(&dev->device, data) == 0)
return &dev->device;
dev = TAILQ_NEXT(dev, next);
}
return NULL;
}
static enum rte_iova_mode
dsa_get_iommu_class(void)
{
/* if there are no devices, report don't care, otherwise VA mode */
return devcount > 0 ? RTE_IOVA_VA : RTE_IOVA_DC;
}
static int
dsa_addr_parse(const char *name, void *addr)
{
struct dsa_wq_addr *wq = addr;
unsigned int device_id, wq_id;
if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) {
IOAT_PMD_DEBUG("Parsing WQ name failed: %s", name);
return -1;
}
wq->device_id = device_id;
wq->wq_id = wq_id;
return 0;
}
RTE_REGISTER_BUS(dsa, dsa_bus.bus);


@@ -1,380 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2020 Intel Corporation
*/
#include <bus_pci_driver.h>
#include <rte_memzone.h>
#include <rte_devargs.h>
#include "ioat_private.h"
#include "ioat_spec.h"
#define IDXD_VENDOR_ID 0x8086
#define IDXD_DEVICE_ID_SPR 0x0B25
#define IDXD_PMD_RAWDEV_NAME_PCI rawdev_idxd_pci
const struct rte_pci_id pci_id_idxd_map[] = {
{ RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) },
{ .vendor_id = 0, /* sentinel */ },
};
static inline int
idxd_pci_dev_command(struct idxd_rawdev *idxd, enum rte_idxd_cmds command)
{
uint8_t err_code;
uint16_t qid = idxd->qid;
int i = 0;
if (command >= idxd_disable_wq && command <= idxd_reset_wq)
qid = (1 << qid);
rte_spinlock_lock(&idxd->u.pci->lk);
idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid;
do {
rte_pause();
err_code = idxd->u.pci->regs->cmdstatus;
if (++i >= 1000) {
IOAT_PMD_ERR("Timeout waiting for command response from HW");
rte_spinlock_unlock(&idxd->u.pci->lk);
return err_code;
}
} while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK);
rte_spinlock_unlock(&idxd->u.pci->lk);
return err_code & CMDSTATUS_ERR_MASK;
}
static uint32_t *
idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx)
{
return RTE_PTR_ADD(pci->wq_regs_base,
(uintptr_t)wq_idx << (5 + pci->wq_cfg_sz));
}
static int
idxd_is_wq_enabled(struct idxd_rawdev *idxd)
{
uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[WQ_STATE_IDX];
return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1;
}
static void
idxd_pci_dev_stop(struct rte_rawdev *dev)
{
struct idxd_rawdev *idxd = dev->dev_private;
uint8_t err_code;
if (!idxd_is_wq_enabled(idxd)) {
IOAT_PMD_ERR("Work queue %d already disabled", idxd->qid);
return;
}
err_code = idxd_pci_dev_command(idxd, idxd_disable_wq);
if (err_code || idxd_is_wq_enabled(idxd)) {
IOAT_PMD_ERR("Failed disabling work queue %d, error code: %#x",
idxd->qid, err_code);
return;
}
IOAT_PMD_DEBUG("Work queue %d disabled OK", idxd->qid);
}
static int
idxd_pci_dev_start(struct rte_rawdev *dev)
{
struct idxd_rawdev *idxd = dev->dev_private;
uint8_t err_code;
if (idxd_is_wq_enabled(idxd)) {
IOAT_PMD_WARN("WQ %d already enabled", idxd->qid);
return 0;
}
if (idxd->public.desc_ring == NULL) {
IOAT_PMD_ERR("WQ %d has not been fully configured", idxd->qid);
return -EINVAL;
}
err_code = idxd_pci_dev_command(idxd, idxd_enable_wq);
if (err_code || !idxd_is_wq_enabled(idxd)) {
IOAT_PMD_ERR("Failed enabling work queue %d, error code: %#x",
idxd->qid, err_code);
return err_code == 0 ? -1 : err_code;
}
IOAT_PMD_DEBUG("Work queue %d enabled OK", idxd->qid);
return 0;
}
static const struct rte_rawdev_ops idxd_pci_ops = {
.dev_close = idxd_rawdev_close,
.dev_selftest = ioat_rawdev_test,
.dump = idxd_dev_dump,
.dev_configure = idxd_dev_configure,
.dev_start = idxd_pci_dev_start,
.dev_stop = idxd_pci_dev_stop,
.dev_info_get = idxd_dev_info_get,
.xstats_get = ioat_xstats_get,
.xstats_get_names = ioat_xstats_get_names,
.xstats_reset = ioat_xstats_reset,
};
/* each portal uses 4 x 4k pages */
#define IDXD_PORTAL_SIZE (4096 * 4)
static int
init_pci_device(struct rte_pci_device *dev, struct idxd_rawdev *idxd,
unsigned int max_queues)
{
struct idxd_pci_common *pci;
uint8_t nb_groups, nb_engines, nb_wqs;
uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */
uint16_t wq_size, total_wq_size;
uint8_t lg2_max_batch, lg2_max_copy_size;
unsigned int i, err_code;
pci = malloc(sizeof(*pci));
if (pci == NULL) {
IOAT_PMD_ERR("%s: Can't allocate memory", __func__);
goto err;
}
rte_spinlock_init(&pci->lk);
/* assign the bar registers, and then configure device */
pci->regs = dev->mem_resource[0].addr;
grp_offset = (uint16_t)pci->regs->offsets[0];
pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100);
wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16);
pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100);
pci->portals = dev->mem_resource[2].addr;
pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F;
/* sanity check device status */
if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) {
/* need function-level-reset (FLR) or is enabled */
IOAT_PMD_ERR("Device status is not disabled, cannot init");
goto err;
}
if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) {
/* command in progress */
IOAT_PMD_ERR("Device has a command in progress, cannot init");
goto err;
}
/* read basic info about the hardware for use when configuring */
nb_groups = (uint8_t)pci->regs->grpcap;
nb_engines = (uint8_t)pci->regs->engcap;
nb_wqs = (uint8_t)(pci->regs->wqcap >> 16);
total_wq_size = (uint16_t)pci->regs->wqcap;
lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F;
lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F;
IOAT_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u",
nb_groups, nb_engines, nb_wqs);
/* zero out any old config */
for (i = 0; i < nb_groups; i++) {
pci->grp_regs[i].grpengcfg = 0;
pci->grp_regs[i].grpwqcfg[0] = 0;
}
for (i = 0; i < nb_wqs; i++)
idxd_get_wq_cfg(pci, i)[0] = 0;
/* limit queues if necessary */
if (max_queues != 0 && nb_wqs > max_queues) {
nb_wqs = max_queues;
if (nb_engines > max_queues)
nb_engines = max_queues;
if (nb_groups > max_queues)
nb_engines = max_queues;
IOAT_PMD_DEBUG("Limiting queues to %u", nb_wqs);
}
/* put each engine into a separate group to avoid reordering */
if (nb_groups > nb_engines)
nb_groups = nb_engines;
if (nb_groups < nb_engines)
nb_engines = nb_groups;
/* assign engines to groups, round-robin style */
for (i = 0; i < nb_engines; i++) {
IOAT_PMD_DEBUG("Assigning engine %u to group %u",
i, i % nb_groups);
pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i);
}
/* now do the same for queues and give work slots to each queue */
wq_size = total_wq_size / nb_wqs;
IOAT_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u",
wq_size, lg2_max_batch, lg2_max_copy_size);
for (i = 0; i < nb_wqs; i++) {
/* add engine "i" to a group */
IOAT_PMD_DEBUG("Assigning work queue %u to group %u",
i, i % nb_groups);
pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i);
/* now configure it, in terms of size, max batch, mode */
idxd_get_wq_cfg(pci, i)[WQ_SIZE_IDX] = wq_size;
idxd_get_wq_cfg(pci, i)[WQ_MODE_IDX] = (1 << WQ_PRIORITY_SHIFT) |
WQ_MODE_DEDICATED;
idxd_get_wq_cfg(pci, i)[WQ_SIZES_IDX] = lg2_max_copy_size |
(lg2_max_batch << WQ_BATCH_SZ_SHIFT);
}
/* dump the group configuration to output */
for (i = 0; i < nb_groups; i++) {
IOAT_PMD_DEBUG("## Group %d", i);
IOAT_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]);
IOAT_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg);
IOAT_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags);
}
idxd->u.pci = pci;
idxd->max_batches = wq_size;
/* enable the device itself */
err_code = idxd_pci_dev_command(idxd, idxd_enable_dev);
if (err_code) {
IOAT_PMD_ERR("Error enabling device: code %#x", err_code);
return err_code;
}
IOAT_PMD_DEBUG("IDXD Device enabled OK");
return nb_wqs;
err:
free(pci);
return -1;
}
static int
idxd_rawdev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev)
{
struct idxd_rawdev idxd = {{0}}; /* Double {} to avoid error on BSD12 */
uint8_t nb_wqs;
int qid, ret = 0;
char name[PCI_PRI_STR_SIZE];
unsigned int max_queues = 0;
rte_pci_device_name(&dev->addr, name, sizeof(name));
IOAT_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
dev->device.driver = &drv->driver;
if (dev->device.devargs && dev->device.devargs->args[0] != '\0') {
/* if the number of devargs grows beyond just 1, use rte_kvargs */
if (sscanf(dev->device.devargs->args,
"max_queues=%u", &max_queues) != 1) {
IOAT_PMD_ERR("Invalid device parameter: '%s'",
dev->device.devargs->args);
return -1;
}
}
ret = init_pci_device(dev, &idxd, max_queues);
if (ret < 0) {
IOAT_PMD_ERR("Error initializing PCI hardware");
return ret;
}
nb_wqs = (uint8_t)ret;
/* set up one device for each queue */
for (qid = 0; qid < nb_wqs; qid++) {
char qname[32];
/* add the queue number to each device name */
snprintf(qname, sizeof(qname), "%s-q%d", name, qid);
idxd.qid = qid;
idxd.public.portal = RTE_PTR_ADD(idxd.u.pci->portals,
qid * IDXD_PORTAL_SIZE);
if (idxd_is_wq_enabled(&idxd))
IOAT_PMD_ERR("Error, WQ %u seems enabled", qid);
ret = idxd_rawdev_create(qname, &dev->device,
&idxd, &idxd_pci_ops);
if (ret != 0) {
IOAT_PMD_ERR("Failed to create rawdev %s", name);
if (qid == 0) /* if no devices using this, free pci */
free(idxd.u.pci);
return ret;
}
}
return 0;
}
static int
idxd_rawdev_destroy(const char *name)
{
int ret;
uint8_t err_code;
struct rte_rawdev *rdev;
struct idxd_rawdev *idxd;
if (!name) {
IOAT_PMD_ERR("Invalid device name");
return -EINVAL;
}
rdev = rte_rawdev_pmd_get_named_dev(name);
if (!rdev) {
IOAT_PMD_ERR("Invalid device name (%s)", name);
return -EINVAL;
}
idxd = rdev->dev_private;
if (!idxd) {
IOAT_PMD_ERR("Error getting dev_private");
return -EINVAL;
}
/* disable the device */
err_code = idxd_pci_dev_command(idxd, idxd_disable_dev);
if (err_code) {
IOAT_PMD_ERR("Error disabling device: code %#x", err_code);
return err_code;
}
IOAT_PMD_DEBUG("IDXD Device disabled OK");
/* free device memory */
IOAT_PMD_DEBUG("Freeing device driver memory");
rdev->dev_private = NULL;
rte_free(idxd->public.batch_idx_ring);
rte_free(idxd->public.desc_ring);
rte_free(idxd->public.hdl_ring);
rte_memzone_free(idxd->mz);
/* rte_rawdev_close is called by pmd_release */
ret = rte_rawdev_pmd_release(rdev);
if (ret)
IOAT_PMD_DEBUG("Device cleanup failed");
return 0;
}
static int
idxd_rawdev_remove_pci(struct rte_pci_device *dev)
{
char name[PCI_PRI_STR_SIZE];
int ret = 0;
rte_pci_device_name(&dev->addr, name, sizeof(name));
IOAT_PMD_INFO("Closing %s on NUMA node %d",
name, dev->device.numa_node);
ret = idxd_rawdev_destroy(name);
return ret;
}
struct rte_pci_driver idxd_pmd_drv_pci = {
.id_table = pci_id_idxd_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
.probe = idxd_rawdev_probe_pci,
.remove = idxd_rawdev_remove_pci,
};
RTE_PMD_REGISTER_PCI(IDXD_PMD_RAWDEV_NAME_PCI, idxd_pmd_drv_pci);
RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_RAWDEV_NAME_PCI, pci_id_idxd_map);
RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_RAWDEV_NAME_PCI,
"* igb_uio | uio_pci_generic | vfio-pci");
RTE_PMD_REGISTER_PARAM_STRING(rawdev_idxd_pci, "max_queues=0");


@@ -1,273 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2020 Intel Corporation
*/
#include <rte_rawdev_pmd.h>
#include <rte_memzone.h>
#include <rte_common.h>
#include <rte_string_fns.h>
#include "ioat_private.h"
RTE_LOG_REGISTER_DEFAULT(ioat_rawdev_logtype, INFO);
static const char * const xstat_names[] = {
"failed_enqueues", "successful_enqueues",
"copies_started", "copies_completed"
};
int
ioat_xstats_get(const struct rte_rawdev *dev, const unsigned int ids[],
uint64_t values[], unsigned int n)
{
const struct rte_ioat_rawdev *ioat = dev->dev_private;
const uint64_t *stats = (const void *)&ioat->xstats;
unsigned int i;
for (i = 0; i < n; i++) {
if (ids[i] > sizeof(ioat->xstats)/sizeof(*stats))
values[i] = 0;
else
values[i] = stats[ids[i]];
}
return n;
}
int
ioat_xstats_get_names(const struct rte_rawdev *dev,
struct rte_rawdev_xstats_name *names,
unsigned int size)
{
unsigned int i;
RTE_SET_USED(dev);
if (size < RTE_DIM(xstat_names))
return RTE_DIM(xstat_names);
for (i = 0; i < RTE_DIM(xstat_names); i++)
strlcpy(names[i].name, xstat_names[i], sizeof(names[i]));
return RTE_DIM(xstat_names);
}
int
ioat_xstats_reset(struct rte_rawdev *dev, const uint32_t *ids, uint32_t nb_ids)
{
struct rte_ioat_rawdev *ioat = dev->dev_private;
uint64_t *stats = (void *)&ioat->xstats;
unsigned int i;
if (!ids) {
memset(&ioat->xstats, 0, sizeof(ioat->xstats));
return 0;
}
for (i = 0; i < nb_ids; i++)
if (ids[i] < sizeof(ioat->xstats)/sizeof(*stats))
stats[ids[i]] = 0;
return 0;
}
int
idxd_rawdev_close(struct rte_rawdev *dev __rte_unused)
{
return 0;
}
int
idxd_dev_dump(struct rte_rawdev *dev, FILE *f)
{
struct idxd_rawdev *idxd = dev->dev_private;
struct rte_idxd_rawdev *rte_idxd = &idxd->public;
int i;
fprintf(f, "Raw Device #%d\n", dev->dev_id);
fprintf(f, "Driver: %s\n\n", dev->driver_name);
fprintf(f, "Portal: %p\n", rte_idxd->portal);
fprintf(f, "Config: {ring_size: %u, hdls_disable: %u}\n\n",
rte_idxd->cfg.ring_size, rte_idxd->cfg.hdls_disable);
fprintf(f, "max batches: %u\n", rte_idxd->max_batches);
fprintf(f, "batch idx read: %u\n", rte_idxd->batch_idx_read);
fprintf(f, "batch idx write: %u\n", rte_idxd->batch_idx_write);
fprintf(f, "batch idxes:");
for (i = 0; i < rte_idxd->max_batches + 1; i++)
fprintf(f, "%u ", rte_idxd->batch_idx_ring[i]);
fprintf(f, "\n\n");
fprintf(f, "hdls read: %u\n", rte_idxd->max_batches);
fprintf(f, "hdls avail: %u\n", rte_idxd->hdls_avail);
fprintf(f, "batch start: %u\n", rte_idxd->batch_start);
fprintf(f, "batch size: %u\n", rte_idxd->batch_size);
return 0;
}
int
idxd_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
size_t info_size)
{
struct rte_ioat_rawdev_config *cfg = dev_info;
struct idxd_rawdev *idxd = dev->dev_private;
struct rte_idxd_rawdev *rte_idxd = &idxd->public;
if (info_size != sizeof(*cfg))
return -EINVAL;
if (cfg != NULL)
*cfg = rte_idxd->cfg;
return 0;
}
int
idxd_dev_configure(const struct rte_rawdev *dev,
rte_rawdev_obj_t config, size_t config_size)
{
struct idxd_rawdev *idxd = dev->dev_private;
struct rte_idxd_rawdev *rte_idxd = &idxd->public;
struct rte_ioat_rawdev_config *cfg = config;
uint16_t max_desc = cfg->ring_size;
if (config_size != sizeof(*cfg))
return -EINVAL;
if (dev->started) {
IOAT_PMD_ERR("%s: Error, device is started.", __func__);
return -EAGAIN;
}
rte_idxd->cfg = *cfg;
if (!rte_is_power_of_2(max_desc))
max_desc = rte_align32pow2(max_desc);
IOAT_PMD_DEBUG("Rawdev %u using %u descriptors",
dev->dev_id, max_desc);
rte_idxd->desc_ring_mask = max_desc - 1;
/* in case we are reconfiguring a device, free any existing memory */
rte_free(rte_idxd->desc_ring);
rte_free(rte_idxd->hdl_ring);
rte_free(rte_idxd->hdl_ring_flags);
/* allocate the descriptor ring at 2x size as batches can't wrap */
rte_idxd->desc_ring = rte_zmalloc(NULL,
sizeof(*rte_idxd->desc_ring) * max_desc * 2, 0);
if (rte_idxd->desc_ring == NULL)
return -ENOMEM;
rte_idxd->desc_iova = rte_mem_virt2iova(rte_idxd->desc_ring);
rte_idxd->hdl_ring = rte_zmalloc(NULL,
sizeof(*rte_idxd->hdl_ring) * max_desc, 0);
if (rte_idxd->hdl_ring == NULL) {
rte_free(rte_idxd->desc_ring);
rte_idxd->desc_ring = NULL;
return -ENOMEM;
}
rte_idxd->hdl_ring_flags = rte_zmalloc(NULL,
sizeof(*rte_idxd->hdl_ring_flags) * max_desc, 0);
if (rte_idxd->hdl_ring_flags == NULL) {
rte_free(rte_idxd->desc_ring);
rte_free(rte_idxd->hdl_ring);
rte_idxd->desc_ring = NULL;
rte_idxd->hdl_ring = NULL;
return -ENOMEM;
}
rte_idxd->hdls_read = rte_idxd->batch_start = 0;
rte_idxd->batch_size = 0;
rte_idxd->hdls_avail = 0;
return 0;
}
int
idxd_rawdev_create(const char *name, struct rte_device *dev,
const struct idxd_rawdev *base_idxd,
const struct rte_rawdev_ops *ops)
{
struct idxd_rawdev *idxd;
struct rte_idxd_rawdev *public;
struct rte_rawdev *rawdev = NULL;
const struct rte_memzone *mz = NULL;
char mz_name[RTE_MEMZONE_NAMESIZE];
int ret = 0;
RTE_BUILD_BUG_ON(sizeof(struct rte_idxd_hw_desc) != 64);
RTE_BUILD_BUG_ON(offsetof(struct rte_idxd_hw_desc, size) != 32);
RTE_BUILD_BUG_ON(sizeof(struct rte_idxd_completion) != 32);
if (!name) {
IOAT_PMD_ERR("Invalid name of the device!");
ret = -EINVAL;
goto cleanup;
}
/* Allocate device structure */
rawdev = rte_rawdev_pmd_allocate(name, sizeof(struct idxd_rawdev),
dev->numa_node);
if (rawdev == NULL) {
IOAT_PMD_ERR("Unable to allocate raw device");
ret = -ENOMEM;
goto cleanup;
}
/* Allocate memory for the primary process or else return the memory
* of primary memzone for the secondary process.
*/
snprintf(mz_name, sizeof(mz_name), "rawdev%u_private", rawdev->dev_id);
if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
mz = rte_memzone_lookup(mz_name);
if (mz == NULL) {
IOAT_PMD_ERR("Unable lookup memzone for private data\n");
ret = -ENOMEM;
goto cleanup;
}
rawdev->dev_private = mz->addr;
rawdev->dev_ops = ops;
rawdev->device = dev;
return 0;
}
mz = rte_memzone_reserve(mz_name, sizeof(struct idxd_rawdev),
dev->numa_node, RTE_MEMZONE_IOVA_CONTIG);
if (mz == NULL) {
IOAT_PMD_ERR("Unable to reserve memzone for private data\n");
ret = -ENOMEM;
goto cleanup;
}
rawdev->dev_private = mz->addr;
rawdev->dev_ops = ops;
rawdev->device = dev;
rawdev->driver_name = IOAT_PMD_RAWDEV_NAME_STR;
idxd = rawdev->dev_private;
*idxd = *base_idxd; /* copy over the main fields already passed in */
idxd->rawdev = rawdev;
idxd->mz = mz;
public = &idxd->public;
public->type = RTE_IDXD_DEV;
public->max_batches = idxd->max_batches;
public->batch_idx_read = 0;
public->batch_idx_write = 0;
/* allocate batch index ring. The +1 is because we can never fully use
* the ring, otherwise read == write means both full and empty.
*/
public->batch_idx_ring = rte_zmalloc(NULL,
sizeof(uint16_t) * (idxd->max_batches + 1), 0);
if (public->batch_idx_ring == NULL) {
IOAT_PMD_ERR("Unable to reserve memory for batch data\n");
ret = -ENOMEM;
goto cleanup;
}
return 0;
cleanup:
if (mz)
rte_memzone_free(mz);
if (rawdev)
rte_rawdev_pmd_release(rawdev);
return ret;
}


@@ -1,84 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2020 Intel Corporation
*/
#ifndef _IOAT_PRIVATE_H_
#define _IOAT_PRIVATE_H_
/**
* @file idxd_private.h
*
* Private data structures for the idxd/DSA part of ioat device driver
*
* @warning
* @b EXPERIMENTAL: these structures and APIs may change without prior notice
*/
#include <rte_spinlock.h>
#include <rte_rawdev_pmd.h>
#include "rte_ioat_rawdev.h"
extern int ioat_rawdev_logtype;
#define IOAT_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \
ioat_rawdev_logtype, "IOAT: %s(): " fmt "\n", __func__, ##args)
#define IOAT_PMD_DEBUG(fmt, args...) IOAT_PMD_LOG(DEBUG, fmt, ## args)
#define IOAT_PMD_INFO(fmt, args...) IOAT_PMD_LOG(INFO, fmt, ## args)
#define IOAT_PMD_ERR(fmt, args...) IOAT_PMD_LOG(ERR, fmt, ## args)
#define IOAT_PMD_WARN(fmt, args...) IOAT_PMD_LOG(WARNING, fmt, ## args)
struct idxd_pci_common {
rte_spinlock_t lk;
uint8_t wq_cfg_sz;
volatile struct rte_idxd_bar0 *regs;
volatile uint32_t *wq_regs_base;
volatile struct rte_idxd_grpcfg *grp_regs;
volatile void *portals;
};
struct idxd_rawdev {
struct rte_idxd_rawdev public; /* the public members, must be first */
struct rte_rawdev *rawdev;
const struct rte_memzone *mz;
uint8_t qid;
uint16_t max_batches;
union {
struct {
unsigned int dsa_id;
} vdev;
struct idxd_pci_common *pci;
} u;
};
int ioat_xstats_get(const struct rte_rawdev *dev, const unsigned int ids[],
uint64_t values[], unsigned int n);
int ioat_xstats_get_names(const struct rte_rawdev *dev,
struct rte_rawdev_xstats_name *names,
unsigned int size);
int ioat_xstats_reset(struct rte_rawdev *dev, const uint32_t *ids,
uint32_t nb_ids);
extern int ioat_rawdev_test(uint16_t dev_id);
extern int idxd_rawdev_create(const char *name, struct rte_device *dev,
const struct idxd_rawdev *idxd,
const struct rte_rawdev_ops *ops);
extern int idxd_rawdev_close(struct rte_rawdev *dev);
extern int idxd_dev_configure(const struct rte_rawdev *dev,
rte_rawdev_obj_t config, size_t config_size);
extern int idxd_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
size_t info_size);
extern int idxd_dev_dump(struct rte_rawdev *dev, FILE *f);
#endif /* _IOAT_PRIVATE_H_ */


@@ -1,332 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2019 Intel Corporation
*/
#include <rte_cycles.h>
#include <bus_pci_driver.h>
#include <rte_memzone.h>
#include <rte_string_fns.h>
#include <rte_rawdev_pmd.h>
#include "rte_ioat_rawdev.h"
#include "ioat_spec.h"
#include "ioat_private.h"
static struct rte_pci_driver ioat_pmd_drv;
#define IOAT_VENDOR_ID 0x8086
#define IOAT_DEVICE_ID_SKX 0x2021
#define IOAT_DEVICE_ID_BDX0 0x6f20
#define IOAT_DEVICE_ID_BDX1 0x6f21
#define IOAT_DEVICE_ID_BDX2 0x6f22
#define IOAT_DEVICE_ID_BDX3 0x6f23
#define IOAT_DEVICE_ID_BDX4 0x6f24
#define IOAT_DEVICE_ID_BDX5 0x6f25
#define IOAT_DEVICE_ID_BDX6 0x6f26
#define IOAT_DEVICE_ID_BDX7 0x6f27
#define IOAT_DEVICE_ID_BDXE 0x6f2E
#define IOAT_DEVICE_ID_BDXF 0x6f2F
#define IOAT_DEVICE_ID_ICX 0x0b00
#define DESC_SZ sizeof(struct rte_ioat_generic_hw_desc)
#define COMPLETION_SZ sizeof(__m128i)
static int
ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config,
size_t config_size)
{
struct rte_ioat_rawdev_config *params = config;
struct rte_ioat_rawdev *ioat = dev->dev_private;
char mz_name[RTE_MEMZONE_NAMESIZE];
unsigned short i;
if (dev->started)
return -EBUSY;
if (params == NULL || config_size != sizeof(*params))
return -EINVAL;
if (params->ring_size > 4096 || params->ring_size < 64 ||
!rte_is_power_of_2(params->ring_size))
return -EINVAL;
ioat->ring_size = params->ring_size;
ioat->hdls_disable = params->hdls_disable;
if (ioat->desc_ring != NULL) {
rte_memzone_free(ioat->desc_mz);
ioat->desc_ring = NULL;
ioat->desc_mz = NULL;
}
/* allocate one block of memory for both descriptors
* and completion handles.
*/
snprintf(mz_name, sizeof(mz_name), "rawdev%u_desc_ring", dev->dev_id);
ioat->desc_mz = rte_memzone_reserve(mz_name,
(DESC_SZ + COMPLETION_SZ) * ioat->ring_size,
dev->device->numa_node, RTE_MEMZONE_IOVA_CONTIG);
if (ioat->desc_mz == NULL)
return -ENOMEM;
ioat->desc_ring = ioat->desc_mz->addr;
ioat->hdls = (void *)&ioat->desc_ring[ioat->ring_size];
ioat->ring_addr = ioat->desc_mz->iova;
/* configure descriptor ring - each one points to next */
for (i = 0; i < ioat->ring_size; i++) {
ioat->desc_ring[i].next = ioat->ring_addr +
(((i + 1) % ioat->ring_size) * DESC_SZ);
}
return 0;
}
static int
ioat_dev_start(struct rte_rawdev *dev)
{
struct rte_ioat_rawdev *ioat = dev->dev_private;
if (ioat->ring_size == 0 || ioat->desc_ring == NULL)
return -EBUSY;
/* inform hardware of where the descriptor ring is */
ioat->regs->chainaddr = ioat->ring_addr;
/* inform hardware of where to write the status/completions */
ioat->regs->chancmp = ioat->status_addr;
/* prime the status register to be set to the last element */
ioat->status = ioat->ring_addr + ((ioat->ring_size - 1) * DESC_SZ);
return 0;
}
static void
ioat_dev_stop(struct rte_rawdev *dev)
{
RTE_SET_USED(dev);
}
static int
ioat_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
size_t dev_info_size)
{
struct rte_ioat_rawdev_config *cfg = dev_info;
struct rte_ioat_rawdev *ioat = dev->dev_private;
if (dev_info == NULL || dev_info_size != sizeof(*cfg))
return -EINVAL;
cfg->ring_size = ioat->ring_size;
cfg->hdls_disable = ioat->hdls_disable;
return 0;
}
static int
ioat_dev_close(struct rte_rawdev *dev __rte_unused)
{
return 0;
}
static int
ioat_rawdev_create(const char *name, struct rte_pci_device *dev)
{
static const struct rte_rawdev_ops ioat_rawdev_ops = {
.dev_configure = ioat_dev_configure,
.dev_start = ioat_dev_start,
.dev_stop = ioat_dev_stop,
.dev_close = ioat_dev_close,
.dev_info_get = ioat_dev_info_get,
.xstats_get = ioat_xstats_get,
.xstats_get_names = ioat_xstats_get_names,
.xstats_reset = ioat_xstats_reset,
.dev_selftest = ioat_rawdev_test,
};
struct rte_rawdev *rawdev = NULL;
struct rte_ioat_rawdev *ioat = NULL;
const struct rte_memzone *mz = NULL;
char mz_name[RTE_MEMZONE_NAMESIZE];
int ret = 0;
int retry = 0;
if (!name) {
IOAT_PMD_ERR("Invalid name of the device!");
ret = -EINVAL;
goto cleanup;
}
/* Allocate device structure */
rawdev = rte_rawdev_pmd_allocate(name, sizeof(struct rte_ioat_rawdev),
dev->device.numa_node);
if (rawdev == NULL) {
IOAT_PMD_ERR("Unable to allocate raw device");
ret = -ENOMEM;
goto cleanup;
}
/* Allocate memory for the primary process or else return the memory
* of primary memzone for the secondary process.
*/
snprintf(mz_name, sizeof(mz_name), "rawdev%u_private", rawdev->dev_id);
if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
mz = rte_memzone_lookup(mz_name);
if (mz == NULL) {
IOAT_PMD_ERR("Unable lookup memzone for private data\n");
ret = -ENOMEM;
goto cleanup;
}
rawdev->dev_private = mz->addr;
rawdev->dev_ops = &ioat_rawdev_ops;
rawdev->device = &dev->device;
rawdev->driver_name = dev->device.driver->name;
return 0;
}
mz = rte_memzone_reserve(mz_name, sizeof(struct rte_ioat_rawdev),
dev->device.numa_node, RTE_MEMZONE_IOVA_CONTIG);
if (mz == NULL) {
IOAT_PMD_ERR("Unable to reserve memzone for private data\n");
ret = -ENOMEM;
goto cleanup;
}
rawdev->dev_private = mz->addr;
rawdev->dev_ops = &ioat_rawdev_ops;
rawdev->device = &dev->device;
rawdev->driver_name = dev->device.driver->name;
ioat = rawdev->dev_private;
ioat->type = RTE_IOAT_DEV;
ioat->rawdev = rawdev;
ioat->mz = mz;
ioat->regs = dev->mem_resource[0].addr;
ioat->doorbell = &ioat->regs->dmacount;
ioat->ring_size = 0;
ioat->desc_ring = NULL;
ioat->status_addr = ioat->mz->iova +
offsetof(struct rte_ioat_rawdev, status);
/* do device initialization - reset and set error behaviour */
if (ioat->regs->chancnt != 1)
IOAT_PMD_ERR("%s: Channel count == %d\n", __func__,
ioat->regs->chancnt);
if (ioat->regs->chanctrl & 0x100) { /* locked by someone else */
IOAT_PMD_WARN("%s: Channel appears locked\n", __func__);
ioat->regs->chanctrl = 0;
}
ioat->regs->chancmd = RTE_IOAT_CHANCMD_SUSPEND;
rte_delay_ms(1);
ioat->regs->chancmd = RTE_IOAT_CHANCMD_RESET;
rte_delay_ms(1);
while (ioat->regs->chancmd & RTE_IOAT_CHANCMD_RESET) {
ioat->regs->chainaddr = 0;
rte_delay_ms(1);
if (++retry >= 200) {
IOAT_PMD_ERR("%s: cannot reset device. CHANCMD=0x%"PRIx8", CHANSTS=0x%"PRIx64", CHANERR=0x%"PRIx32"\n",
__func__,
ioat->regs->chancmd,
ioat->regs->chansts,
ioat->regs->chanerr);
ret = -EIO;
}
}
ioat->regs->chanctrl = RTE_IOAT_CHANCTRL_ANY_ERR_ABORT_EN |
RTE_IOAT_CHANCTRL_ERR_COMPLETION_EN;
return 0;
cleanup:
if (rawdev)
rte_rawdev_pmd_release(rawdev);
return ret;
}
static int
ioat_rawdev_destroy(const char *name)
{
int ret;
struct rte_rawdev *rdev;
if (!name) {
IOAT_PMD_ERR("Invalid device name");
return -EINVAL;
}
rdev = rte_rawdev_pmd_get_named_dev(name);
if (!rdev) {
IOAT_PMD_ERR("Invalid device name (%s)", name);
return -EINVAL;
}
if (rdev->dev_private != NULL) {
struct rte_ioat_rawdev *ioat = rdev->dev_private;
rdev->dev_private = NULL;
rte_memzone_free(ioat->desc_mz);
rte_memzone_free(ioat->mz);
}
/* rte_rawdev_close is called by pmd_release */
ret = rte_rawdev_pmd_release(rdev);
if (ret)
IOAT_PMD_DEBUG("Device cleanup failed");
return 0;
}
static int
ioat_rawdev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
{
char name[32];
int ret = 0;
rte_pci_device_name(&dev->addr, name, sizeof(name));
IOAT_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
dev->device.driver = &drv->driver;
ret = ioat_rawdev_create(name, dev);
return ret;
}
static int
ioat_rawdev_remove(struct rte_pci_device *dev)
{
char name[32];
int ret;
rte_pci_device_name(&dev->addr, name, sizeof(name));
IOAT_PMD_INFO("Closing %s on NUMA node %d",
name, dev->device.numa_node);
ret = ioat_rawdev_destroy(name);
return ret;
}
static const struct rte_pci_id pci_id_ioat_map[] = {
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_SKX) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX0) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX1) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX2) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX3) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX4) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX5) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX6) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDX7) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDXE) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_BDXF) },
{ RTE_PCI_DEVICE(IOAT_VENDOR_ID, IOAT_DEVICE_ID_ICX) },
{ .vendor_id = 0, /* sentinel */ },
};
static struct rte_pci_driver ioat_pmd_drv = {
.id_table = pci_id_ioat_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
.probe = ioat_rawdev_probe,
.remove = ioat_rawdev_remove,
};
RTE_PMD_REGISTER_PCI(IOAT_PMD_RAWDEV_NAME, ioat_pmd_drv);
RTE_PMD_REGISTER_PCI_TABLE(IOAT_PMD_RAWDEV_NAME, pci_id_ioat_map);
RTE_PMD_REGISTER_KMOD_DEP(IOAT_PMD_RAWDEV_NAME, "* igb_uio | uio_pci_generic");


@@ -1,734 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2019 Intel Corporation
*/
#include <unistd.h>
#include <inttypes.h>
#include <rte_mbuf.h>
#include "rte_rawdev.h"
#include "rte_ioat_rawdev.h"
#include "ioat_private.h"
#define MAX_SUPPORTED_RAWDEVS 64
#define TEST_SKIPPED 77
#define COPY_LEN 1024
int ioat_rawdev_test(uint16_t dev_id); /* pre-define to keep compiler happy */
static struct rte_mempool *pool;
static unsigned short expected_ring_size[MAX_SUPPORTED_RAWDEVS];
#define PRINT_ERR(...) print_err(__func__, __LINE__, __VA_ARGS__)
static inline int
__rte_format_printf(3, 4)
print_err(const char *func, int lineno, const char *format, ...)
{
va_list ap;
int ret;
ret = fprintf(stderr, "In %s:%d - ", func, lineno);
va_start(ap, format);
ret += vfprintf(stderr, format, ap);
va_end(ap);
return ret;
}
static int
do_multi_copies(int dev_id, int split_batches, int split_completions)
{
struct rte_mbuf *srcs[32], *dsts[32];
struct rte_mbuf *completed_src[64];
struct rte_mbuf *completed_dst[64];
unsigned int i, j;
for (i = 0; i < RTE_DIM(srcs); i++) {
char *src_data;
if (split_batches && i == RTE_DIM(srcs) / 2)
rte_ioat_perform_ops(dev_id);
srcs[i] = rte_pktmbuf_alloc(pool);
dsts[i] = rte_pktmbuf_alloc(pool);
src_data = rte_pktmbuf_mtod(srcs[i], char *);
for (j = 0; j < COPY_LEN; j++)
src_data[j] = rand() & 0xFF;
if (rte_ioat_enqueue_copy(dev_id,
srcs[i]->buf_iova + srcs[i]->data_off,
dsts[i]->buf_iova + dsts[i]->data_off,
COPY_LEN,
(uintptr_t)srcs[i],
(uintptr_t)dsts[i]) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy for buffer %u\n",
i);
return -1;
}
}
rte_ioat_perform_ops(dev_id);
usleep(100);
if (split_completions) {
/* gather completions in two halves */
uint16_t half_len = RTE_DIM(srcs) / 2;
if (rte_ioat_completed_ops(dev_id, half_len, NULL, NULL,
(void *)completed_src,
(void *)completed_dst) != half_len) {
PRINT_ERR("Error with rte_ioat_completed_ops - first half request\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (rte_ioat_completed_ops(dev_id, half_len, NULL, NULL,
(void *)&completed_src[half_len],
(void *)&completed_dst[half_len]) != half_len) {
PRINT_ERR("Error with rte_ioat_completed_ops - second half request\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
} else {
/* gather all completions in one go */
if (rte_ioat_completed_ops(dev_id, RTE_DIM(completed_src), NULL, NULL,
(void *)completed_src,
(void *)completed_dst) != RTE_DIM(srcs)) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
}
for (i = 0; i < RTE_DIM(srcs); i++) {
char *src_data, *dst_data;
if (completed_src[i] != srcs[i]) {
PRINT_ERR("Error with source pointer %u\n", i);
return -1;
}
if (completed_dst[i] != dsts[i]) {
PRINT_ERR("Error with dest pointer %u\n", i);
return -1;
}
src_data = rte_pktmbuf_mtod(srcs[i], char *);
dst_data = rte_pktmbuf_mtod(dsts[i], char *);
for (j = 0; j < COPY_LEN; j++)
if (src_data[j] != dst_data[j]) {
PRINT_ERR("Error with copy of packet %u, byte %u\n",
i, j);
return -1;
}
rte_pktmbuf_free(srcs[i]);
rte_pktmbuf_free(dsts[i]);
}
return 0;
}
static int
test_enqueue_copies(int dev_id)
{
unsigned int i;
/* test doing a single copy */
do {
struct rte_mbuf *src, *dst;
char *src_data, *dst_data;
struct rte_mbuf *completed[2] = {0};
src = rte_pktmbuf_alloc(pool);
dst = rte_pktmbuf_alloc(pool);
src_data = rte_pktmbuf_mtod(src, char *);
dst_data = rte_pktmbuf_mtod(dst, char *);
for (i = 0; i < COPY_LEN; i++)
src_data[i] = rand() & 0xFF;
if (rte_ioat_enqueue_copy(dev_id,
src->buf_iova + src->data_off,
dst->buf_iova + dst->data_off,
COPY_LEN,
(uintptr_t)src,
(uintptr_t)dst) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy\n");
return -1;
}
rte_ioat_perform_ops(dev_id);
usleep(10);
if (rte_ioat_completed_ops(dev_id, 1, NULL, NULL, (void *)&completed[0],
(void *)&completed[1]) != 1) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
return -1;
}
if (completed[0] != src || completed[1] != dst) {
PRINT_ERR("Error with completions: got (%p, %p), not (%p,%p)\n",
completed[0], completed[1], src, dst);
return -1;
}
for (i = 0; i < COPY_LEN; i++)
if (dst_data[i] != src_data[i]) {
PRINT_ERR("Data mismatch at char %u [Got %02x not %02x]\n",
i, dst_data[i], src_data[i]);
return -1;
}
rte_pktmbuf_free(src);
rte_pktmbuf_free(dst);
/* check ring is now empty */
if (rte_ioat_completed_ops(dev_id, 1, NULL, NULL, (void *)&completed[0],
(void *)&completed[1]) != 0) {
PRINT_ERR("Error: got unexpected returned handles from rte_ioat_completed_ops\n");
return -1;
}
} while (0);
/* test doing a multiple single copies */
do {
const uint16_t max_ops = 4;
struct rte_mbuf *src, *dst;
char *src_data, *dst_data;
struct rte_mbuf *completed[32] = {0};
const uint16_t max_completions = RTE_DIM(completed) / 2;
src = rte_pktmbuf_alloc(pool);
dst = rte_pktmbuf_alloc(pool);
src_data = rte_pktmbuf_mtod(src, char *);
dst_data = rte_pktmbuf_mtod(dst, char *);
for (i = 0; i < COPY_LEN; i++)
src_data[i] = rand() & 0xFF;
/* perform the same copy <max_ops> times */
for (i = 0; i < max_ops; i++) {
if (rte_ioat_enqueue_copy(dev_id,
src->buf_iova + src->data_off,
dst->buf_iova + dst->data_off,
COPY_LEN,
(uintptr_t)src,
(uintptr_t)dst) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy\n");
return -1;
}
rte_ioat_perform_ops(dev_id);
}
usleep(10);
if (rte_ioat_completed_ops(dev_id, max_completions, NULL, NULL,
(void *)&completed[0],
(void *)&completed[max_completions]) != max_ops) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (completed[0] != src || completed[max_completions] != dst) {
PRINT_ERR("Error with completions: got (%p, %p), not (%p,%p)\n",
completed[0], completed[max_completions], src, dst);
return -1;
}
for (i = 0; i < COPY_LEN; i++)
if (dst_data[i] != src_data[i]) {
PRINT_ERR("Data mismatch at char %u\n", i);
return -1;
}
rte_pktmbuf_free(src);
rte_pktmbuf_free(dst);
} while (0);
/* test doing multiple copies */
	if (do_multi_copies(dev_id, 0, 0) != 0) /* enqueue and complete one batch at a time */
		return -1;
	if (do_multi_copies(dev_id, 1, 0) != 0) /* enqueue 2 batches and then complete both */
		return -1;
	if (do_multi_copies(dev_id, 0, 1) != 0) /* enqueue 1 batch, then complete in two halves */
		return -1;
return 0;
}
static int
test_enqueue_fill(int dev_id)
{
const unsigned int lengths[] = {8, 64, 1024, 50, 100, 89};
struct rte_mbuf *dst = rte_pktmbuf_alloc(pool);
char *dst_data = rte_pktmbuf_mtod(dst, char *);
struct rte_mbuf *completed[2] = {0};
uint64_t pattern = 0xfedcba9876543210;
unsigned int i, j;
for (i = 0; i < RTE_DIM(lengths); i++) {
/* reset dst_data */
memset(dst_data, 0, lengths[i]);
/* perform the fill operation */
if (rte_ioat_enqueue_fill(dev_id, pattern,
dst->buf_iova + dst->data_off, lengths[i],
(uintptr_t)dst) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_fill\n");
return -1;
}
rte_ioat_perform_ops(dev_id);
usleep(100);
if (rte_ioat_completed_ops(dev_id, 1, NULL, NULL, (void *)&completed[0],
(void *)&completed[1]) != 1) {
PRINT_ERR("Error with completed ops\n");
return -1;
}
/* check the result */
for (j = 0; j < lengths[i]; j++) {
char pat_byte = ((char *)&pattern)[j % 8];
if (dst_data[j] != pat_byte) {
PRINT_ERR("Error with fill operation (lengths = %u): got (%x), not (%x)\n",
lengths[i], dst_data[j], pat_byte);
return -1;
}
}
}
rte_pktmbuf_free(dst);
return 0;
}
static int
test_burst_capacity(int dev_id)
{
#define BURST_SIZE 64
const unsigned int ring_space = rte_ioat_burst_capacity(dev_id);
struct rte_mbuf *src, *dst;
unsigned int length = 1024;
unsigned int i, j, iter;
unsigned int old_cap, cap;
uintptr_t completions[BURST_SIZE];
src = rte_pktmbuf_alloc(pool);
dst = rte_pktmbuf_alloc(pool);
old_cap = ring_space;
/* to test capacity, we enqueue elements and check capacity is reduced
* by one each time - rebaselining the expected value after each burst
* as the capacity is only for a burst. We enqueue multiple bursts to
* fill up half the ring, before emptying it again. We do this twice to
* ensure that we get to test scenarios where we get ring wrap-around
*/
for (iter = 0; iter < 2; iter++) {
for (i = 0; i < ring_space / (2 * BURST_SIZE); i++) {
cap = rte_ioat_burst_capacity(dev_id);
if (cap > old_cap) {
PRINT_ERR("Error, avail ring capacity has gone up, not down\n");
return -1;
}
old_cap = cap;
for (j = 0; j < BURST_SIZE; j++) {
if (rte_ioat_enqueue_copy(dev_id, rte_pktmbuf_iova(src),
rte_pktmbuf_iova(dst), length, 0, 0) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy\n");
return -1;
}
if (cap - rte_ioat_burst_capacity(dev_id) != j + 1) {
PRINT_ERR("Error, ring capacity did not change as expected\n");
return -1;
}
}
rte_ioat_perform_ops(dev_id);
}
usleep(100);
for (i = 0; i < ring_space / (2 * BURST_SIZE); i++) {
if (rte_ioat_completed_ops(dev_id, BURST_SIZE,
NULL, NULL,
completions, completions) != BURST_SIZE) {
PRINT_ERR("Error with completions\n");
return -1;
}
}
if (rte_ioat_burst_capacity(dev_id) != ring_space) {
PRINT_ERR("Error, ring capacity has not reset to original value\n");
return -1;
}
old_cap = ring_space;
}
rte_pktmbuf_free(src);
rte_pktmbuf_free(dst);
return 0;
}
static int
test_completion_status(int dev_id)
{
#define COMP_BURST_SZ 16
const unsigned int fail_copy[] = {0, 7, 15};
struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ];
struct rte_mbuf *completed_src[COMP_BURST_SZ * 2];
struct rte_mbuf *completed_dst[COMP_BURST_SZ * 2];
unsigned int length = 1024;
unsigned int i;
uint8_t not_ok = 0;
/* Test single full batch statuses */
for (i = 0; i < RTE_DIM(fail_copy); i++) {
uint32_t status[COMP_BURST_SZ] = {0};
unsigned int j;
for (j = 0; j < COMP_BURST_SZ; j++) {
srcs[j] = rte_pktmbuf_alloc(pool);
dsts[j] = rte_pktmbuf_alloc(pool);
if (rte_ioat_enqueue_copy(dev_id,
(j == fail_copy[i] ? (phys_addr_t)NULL :
(srcs[j]->buf_iova + srcs[j]->data_off)),
dsts[j]->buf_iova + dsts[j]->data_off,
length,
(uintptr_t)srcs[j],
(uintptr_t)dsts[j]) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy for buffer %u\n", j);
return -1;
}
}
rte_ioat_perform_ops(dev_id);
usleep(100);
if (rte_ioat_completed_ops(dev_id, COMP_BURST_SZ, status, &not_ok,
(void *)completed_src, (void *)completed_dst) != COMP_BURST_SZ) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (not_ok != 1 || status[fail_copy[i]] == RTE_IOAT_OP_SUCCESS) {
unsigned int j;
PRINT_ERR("Error, missing expected failed copy, %u\n", fail_copy[i]);
for (j = 0; j < COMP_BURST_SZ; j++)
printf("%u ", status[j]);
printf("<-- Statuses\n");
return -1;
}
for (j = 0; j < COMP_BURST_SZ; j++) {
rte_pktmbuf_free(completed_src[j]);
rte_pktmbuf_free(completed_dst[j]);
}
}
/* Test gathering status for two batches at once */
for (i = 0; i < RTE_DIM(fail_copy); i++) {
uint32_t status[COMP_BURST_SZ] = {0};
unsigned int batch, j;
unsigned int expected_failures = 0;
for (batch = 0; batch < 2; batch++) {
for (j = 0; j < COMP_BURST_SZ/2; j++) {
srcs[j] = rte_pktmbuf_alloc(pool);
dsts[j] = rte_pktmbuf_alloc(pool);
if (j == fail_copy[i])
expected_failures++;
if (rte_ioat_enqueue_copy(dev_id,
(j == fail_copy[i] ? (phys_addr_t)NULL :
(srcs[j]->buf_iova + srcs[j]->data_off)),
dsts[j]->buf_iova + dsts[j]->data_off,
length,
(uintptr_t)srcs[j],
(uintptr_t)dsts[j]) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy for buffer %u\n",
j);
return -1;
}
}
rte_ioat_perform_ops(dev_id);
}
usleep(100);
if (rte_ioat_completed_ops(dev_id, COMP_BURST_SZ, status, &not_ok,
(void *)completed_src, (void *)completed_dst) != COMP_BURST_SZ) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (not_ok != expected_failures) {
unsigned int j;
PRINT_ERR("Error, missing expected failed copy, got %u, not %u\n",
not_ok, expected_failures);
for (j = 0; j < COMP_BURST_SZ; j++)
printf("%u ", status[j]);
printf("<-- Statuses\n");
return -1;
}
for (j = 0; j < COMP_BURST_SZ; j++) {
rte_pktmbuf_free(completed_src[j]);
rte_pktmbuf_free(completed_dst[j]);
}
}
/* Test gathering status for half batch at a time */
for (i = 0; i < RTE_DIM(fail_copy); i++) {
uint32_t status[COMP_BURST_SZ] = {0};
unsigned int j;
for (j = 0; j < COMP_BURST_SZ; j++) {
srcs[j] = rte_pktmbuf_alloc(pool);
dsts[j] = rte_pktmbuf_alloc(pool);
if (rte_ioat_enqueue_copy(dev_id,
(j == fail_copy[i] ? (phys_addr_t)NULL :
(srcs[j]->buf_iova + srcs[j]->data_off)),
dsts[j]->buf_iova + dsts[j]->data_off,
length,
(uintptr_t)srcs[j],
(uintptr_t)dsts[j]) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy for buffer %u\n", j);
return -1;
}
}
rte_ioat_perform_ops(dev_id);
usleep(100);
if (rte_ioat_completed_ops(dev_id, COMP_BURST_SZ / 2, status, &not_ok,
(void *)completed_src,
(void *)completed_dst) != (COMP_BURST_SZ / 2)) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (fail_copy[i] < COMP_BURST_SZ / 2 &&
(not_ok != 1 || status[fail_copy[i]] == RTE_IOAT_OP_SUCCESS)) {
PRINT_ERR("Missing expected failure in first half-batch\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (rte_ioat_completed_ops(dev_id, COMP_BURST_SZ / 2, status, &not_ok,
(void *)&completed_src[COMP_BURST_SZ / 2],
(void *)&completed_dst[COMP_BURST_SZ / 2]) != (COMP_BURST_SZ / 2)) {
PRINT_ERR("Error with rte_ioat_completed_ops\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
if (fail_copy[i] >= COMP_BURST_SZ / 2 && (not_ok != 1 ||
status[fail_copy[i] - (COMP_BURST_SZ / 2)]
== RTE_IOAT_OP_SUCCESS)) {
PRINT_ERR("Missing expected failure in second half-batch\n");
rte_rawdev_dump(dev_id, stdout);
return -1;
}
for (j = 0; j < COMP_BURST_SZ; j++) {
rte_pktmbuf_free(completed_src[j]);
rte_pktmbuf_free(completed_dst[j]);
}
}
/* Test gathering statuses with fence */
for (i = 1; i < RTE_DIM(fail_copy); i++) {
uint32_t status[COMP_BURST_SZ * 2] = {0};
unsigned int j;
uint16_t count;
for (j = 0; j < COMP_BURST_SZ; j++) {
srcs[j] = rte_pktmbuf_alloc(pool);
dsts[j] = rte_pktmbuf_alloc(pool);
/* always fail the first copy */
if (rte_ioat_enqueue_copy(dev_id,
(j == 0 ? (phys_addr_t)NULL :
(srcs[j]->buf_iova + srcs[j]->data_off)),
dsts[j]->buf_iova + dsts[j]->data_off,
length,
(uintptr_t)srcs[j],
(uintptr_t)dsts[j]) != 1) {
PRINT_ERR("Error with rte_ioat_enqueue_copy for buffer %u\n", j);
return -1;
}
/* put in a fence which will stop any further transactions
* because we had a previous failure.
*/
if (j == fail_copy[i])
rte_ioat_fence(dev_id);
}
rte_ioat_perform_ops(dev_id);
usleep(100);
count = rte_ioat_completed_ops(dev_id, COMP_BURST_SZ * 2, status, &not_ok,
(void *)completed_src, (void *)completed_dst);
if (count != COMP_BURST_SZ) {
PRINT_ERR("Error with rte_ioat_completed_ops, got %u not %u\n",
count, COMP_BURST_SZ);
for (j = 0; j < count; j++)
printf("%u ", status[j]);
printf("<-- Statuses\n");
return -1;
}
if (not_ok != COMP_BURST_SZ - fail_copy[i]) {
PRINT_ERR("Unexpected failed copy count, got %u, expected %u\n",
not_ok, COMP_BURST_SZ - fail_copy[i]);
for (j = 0; j < COMP_BURST_SZ; j++)
printf("%u ", status[j]);
printf("<-- Statuses\n");
return -1;
}
if (status[0] == RTE_IOAT_OP_SUCCESS || status[0] == RTE_IOAT_OP_SKIPPED) {
PRINT_ERR("Error, op 0 unexpectedly did not fail.\n");
return -1;
}
for (j = 1; j <= fail_copy[i]; j++) {
if (status[j] != RTE_IOAT_OP_SUCCESS) {
PRINT_ERR("Error, op %u unexpectedly failed\n", j);
return -1;
}
}
for (j = fail_copy[i] + 1; j < COMP_BURST_SZ; j++) {
if (status[j] != RTE_IOAT_OP_SKIPPED) {
PRINT_ERR("Error, all descriptors after fence should be invalid\n");
return -1;
}
}
for (j = 0; j < COMP_BURST_SZ; j++) {
rte_pktmbuf_free(completed_src[j]);
rte_pktmbuf_free(completed_dst[j]);
}
}
return 0;
}
int
ioat_rawdev_test(uint16_t dev_id)
{
#define IOAT_TEST_RINGSIZE 512
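	/* reading "type" via the idxd struct works for both IOAT and IDXD
	 * devices, since it is the first field of both rawdev private structures
	 */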
const struct rte_idxd_rawdev *idxd =
(struct rte_idxd_rawdev *)rte_rawdevs[dev_id].dev_private;
const enum rte_ioat_dev_type ioat_type = idxd->type;
struct rte_ioat_rawdev_config p = { .ring_size = -1 };
struct rte_rawdev_info info = { .dev_private = &p };
struct rte_rawdev_xstats_name *snames = NULL;
uint64_t *stats = NULL;
unsigned int *ids = NULL;
unsigned int nb_xstats;
unsigned int i;
if (dev_id >= MAX_SUPPORTED_RAWDEVS) {
printf("Skipping test. Cannot test rawdevs with id's greater than %d\n",
MAX_SUPPORTED_RAWDEVS);
return TEST_SKIPPED;
}
rte_rawdev_info_get(dev_id, &info, sizeof(p));
if (p.ring_size != expected_ring_size[dev_id]) {
PRINT_ERR("Error, initial ring size is not as expected (Actual: %d, Expected: %d)\n",
(int)p.ring_size, expected_ring_size[dev_id]);
return -1;
}
p.ring_size = IOAT_TEST_RINGSIZE;
if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) {
PRINT_ERR("Error with rte_rawdev_configure()\n");
return -1;
}
rte_rawdev_info_get(dev_id, &info, sizeof(p));
if (p.ring_size != IOAT_TEST_RINGSIZE) {
PRINT_ERR("Error, ring size is not %d (%d)\n",
IOAT_TEST_RINGSIZE, (int)p.ring_size);
return -1;
}
expected_ring_size[dev_id] = p.ring_size;
if (rte_rawdev_start(dev_id) != 0) {
PRINT_ERR("Error with rte_rawdev_start()\n");
return -1;
}
pool = rte_pktmbuf_pool_create("TEST_IOAT_POOL",
p.ring_size * 2, /* n == num elements */
32, /* cache size */
0, /* priv size */
2048, /* data room size */
info.socket_id);
if (pool == NULL) {
PRINT_ERR("Error with mempool creation\n");
return -1;
}
/* allocate memory for xstats names and values */
nb_xstats = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
snames = malloc(sizeof(*snames) * nb_xstats);
if (snames == NULL) {
PRINT_ERR("Error allocating xstat names memory\n");
goto err;
}
rte_rawdev_xstats_names_get(dev_id, snames, nb_xstats);
ids = malloc(sizeof(*ids) * nb_xstats);
if (ids == NULL) {
PRINT_ERR("Error allocating xstat ids memory\n");
goto err;
}
for (i = 0; i < nb_xstats; i++)
ids[i] = i;
stats = malloc(sizeof(*stats) * nb_xstats);
if (stats == NULL) {
PRINT_ERR("Error allocating xstat memory\n");
goto err;
}
/* run the test cases */
printf("Running Copy Tests\n");
for (i = 0; i < 100; i++) {
unsigned int j;
if (test_enqueue_copies(dev_id) != 0)
goto err;
rte_rawdev_xstats_get(dev_id, ids, stats, nb_xstats);
for (j = 0; j < nb_xstats; j++)
printf("%s: %"PRIu64" ", snames[j].name, stats[j]);
printf("\r");
}
printf("\n");
/* test enqueue fill operation */
printf("Running Fill Tests\n");
for (i = 0; i < 100; i++) {
unsigned int j;
if (test_enqueue_fill(dev_id) != 0)
goto err;
rte_rawdev_xstats_get(dev_id, ids, stats, nb_xstats);
for (j = 0; j < nb_xstats; j++)
printf("%s: %"PRIu64" ", snames[j].name, stats[j]);
printf("\r");
}
printf("\n");
printf("Running Burst Capacity Test\n");
if (test_burst_capacity(dev_id) != 0)
goto err;
/* only DSA devices report address errors, and we can only use null pointers
* to generate those errors when DPDK is in VA mode.
*/
if (rte_eal_iova_mode() == RTE_IOVA_VA && ioat_type == RTE_IDXD_DEV) {
printf("Running Completions Status Test\n");
if (test_completion_status(dev_id) != 0)
goto err;
}
rte_rawdev_stop(dev_id);
if (rte_rawdev_xstats_reset(dev_id, NULL, 0) != 0) {
PRINT_ERR("Error resetting xstat values\n");
goto err;
}
rte_mempool_free(pool);
free(snames);
free(stats);
free(ids);
return 0;
err:
rte_rawdev_stop(dev_id);
rte_rawdev_xstats_reset(dev_id, NULL, 0);
rte_mempool_free(pool);
free(snames);
free(stats);
free(ids);
return -1;
}

View File

@ -1,336 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) Intel Corporation
*/
/**
* \file
* I/OAT specification definitions
*
* Taken from ioat_spec.h from SPDK project, with prefix renames and
* other minor changes.
*/
#ifndef RTE_IOAT_SPEC_H
#define RTE_IOAT_SPEC_H
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#define RTE_IOAT_PCI_CHANERR_INT_OFFSET 0x180
#define RTE_IOAT_INTRCTRL_MASTER_INT_EN 0x01
#define RTE_IOAT_VER_3_0 0x30
#define RTE_IOAT_VER_3_3 0x33
/* DMA Channel Registers */
#define RTE_IOAT_CHANCTRL_CHANNEL_PRIORITY_MASK 0xF000
#define RTE_IOAT_CHANCTRL_COMPL_DCA_EN 0x0200
#define RTE_IOAT_CHANCTRL_CHANNEL_IN_USE 0x0100
#define RTE_IOAT_CHANCTRL_DESCRIPTOR_ADDR_SNOOP_CONTROL 0x0020
#define RTE_IOAT_CHANCTRL_ERR_INT_EN 0x0010
#define RTE_IOAT_CHANCTRL_ANY_ERR_ABORT_EN 0x0008
#define RTE_IOAT_CHANCTRL_ERR_COMPLETION_EN 0x0004
#define RTE_IOAT_CHANCTRL_INT_REARM 0x0001
/* DMA Channel Capabilities */
#define RTE_IOAT_DMACAP_PB (1 << 0)
#define RTE_IOAT_DMACAP_DCA (1 << 4)
#define RTE_IOAT_DMACAP_BFILL (1 << 6)
#define RTE_IOAT_DMACAP_XOR (1 << 8)
#define RTE_IOAT_DMACAP_PQ (1 << 9)
#define RTE_IOAT_DMACAP_DMA_DIF (1 << 10)
struct rte_ioat_registers {
uint8_t chancnt;
uint8_t xfercap;
uint8_t genctrl;
uint8_t intrctrl;
uint32_t attnstatus;
uint8_t cbver; /* 0x08 */
uint8_t reserved4[0x3]; /* 0x09 */
uint16_t intrdelay; /* 0x0C */
uint16_t cs_status; /* 0x0E */
uint32_t dmacapability; /* 0x10 */
uint8_t reserved5[0x6C]; /* 0x14 */
uint16_t chanctrl; /* 0x80 */
uint8_t reserved6[0x2]; /* 0x82 */
uint8_t chancmd; /* 0x84 */
uint8_t reserved3[1]; /* 0x85 */
uint16_t dmacount; /* 0x86 */
uint64_t chansts; /* 0x88 */
uint64_t chainaddr; /* 0x90 */
uint64_t chancmp; /* 0x98 */
uint8_t reserved2[0x8]; /* 0xA0 */
uint32_t chanerr; /* 0xA8 */
uint32_t chanerrmask; /* 0xAC */
} __rte_packed;
#define RTE_IOAT_CHANCMD_RESET 0x20
#define RTE_IOAT_CHANCMD_SUSPEND 0x04
#define RTE_IOAT_CHANSTS_STATUS 0x7ULL
#define RTE_IOAT_CHANSTS_ACTIVE 0x0
#define RTE_IOAT_CHANSTS_IDLE 0x1
#define RTE_IOAT_CHANSTS_SUSPENDED 0x2
#define RTE_IOAT_CHANSTS_HALTED 0x3
#define RTE_IOAT_CHANSTS_ARMED 0x4
#define RTE_IOAT_CHANSTS_UNAFFILIATED_ERROR 0x8ULL
#define RTE_IOAT_CHANSTS_SOFT_ERROR 0x10ULL
#define RTE_IOAT_CHANSTS_COMPLETED_DESCRIPTOR_MASK (~0x3FULL)
#define RTE_IOAT_CHANCMP_ALIGN 8 /* CHANCMP address must be 64-bit aligned */
struct rte_ioat_dma_hw_desc {
uint32_t size;
union {
uint32_t control_raw;
struct {
uint32_t int_enable: 1;
uint32_t src_snoop_disable: 1;
uint32_t dest_snoop_disable: 1;
uint32_t completion_update: 1;
uint32_t fence: 1;
uint32_t null: 1;
uint32_t src_page_break: 1;
uint32_t dest_page_break: 1;
uint32_t bundle: 1;
uint32_t dest_dca: 1;
uint32_t hint: 1;
uint32_t reserved: 13;
#define RTE_IOAT_OP_COPY 0x00
uint32_t op: 8;
} control;
} u;
uint64_t src_addr;
uint64_t dest_addr;
uint64_t next;
uint64_t reserved;
uint64_t reserved2;
uint64_t user1;
uint64_t user2;
};
struct rte_ioat_fill_hw_desc {
uint32_t size;
union {
uint32_t control_raw;
struct {
uint32_t int_enable: 1;
uint32_t reserved: 1;
uint32_t dest_snoop_disable: 1;
uint32_t completion_update: 1;
uint32_t fence: 1;
uint32_t reserved2: 2;
uint32_t dest_page_break: 1;
uint32_t bundle: 1;
uint32_t reserved3: 15;
#define RTE_IOAT_OP_FILL 0x01
uint32_t op: 8;
} control;
} u;
uint64_t src_data;
uint64_t dest_addr;
uint64_t next;
uint64_t reserved;
uint64_t next_dest_addr;
uint64_t user1;
uint64_t user2;
};
struct rte_ioat_xor_hw_desc {
uint32_t size;
union {
uint32_t control_raw;
struct {
uint32_t int_enable: 1;
uint32_t src_snoop_disable: 1;
uint32_t dest_snoop_disable: 1;
uint32_t completion_update: 1;
uint32_t fence: 1;
uint32_t src_count: 3;
uint32_t bundle: 1;
uint32_t dest_dca: 1;
uint32_t hint: 1;
uint32_t reserved: 13;
#define RTE_IOAT_OP_XOR 0x87
#define RTE_IOAT_OP_XOR_VAL 0x88
uint32_t op: 8;
} control;
} u;
uint64_t src_addr;
uint64_t dest_addr;
uint64_t next;
uint64_t src_addr2;
uint64_t src_addr3;
uint64_t src_addr4;
uint64_t src_addr5;
};
struct rte_ioat_xor_ext_hw_desc {
uint64_t src_addr6;
uint64_t src_addr7;
uint64_t src_addr8;
uint64_t next;
uint64_t reserved[4];
};
struct rte_ioat_pq_hw_desc {
uint32_t size;
union {
uint32_t control_raw;
struct {
uint32_t int_enable: 1;
uint32_t src_snoop_disable: 1;
uint32_t dest_snoop_disable: 1;
uint32_t completion_update: 1;
uint32_t fence: 1;
uint32_t src_count: 3;
uint32_t bundle: 1;
uint32_t dest_dca: 1;
uint32_t hint: 1;
uint32_t p_disable: 1;
uint32_t q_disable: 1;
uint32_t reserved: 11;
#define RTE_IOAT_OP_PQ 0x89
#define RTE_IOAT_OP_PQ_VAL 0x8a
uint32_t op: 8;
} control;
} u;
uint64_t src_addr;
uint64_t p_addr;
uint64_t next;
uint64_t src_addr2;
uint64_t src_addr3;
uint8_t coef[8];
uint64_t q_addr;
};
struct rte_ioat_pq_ext_hw_desc {
uint64_t src_addr4;
uint64_t src_addr5;
uint64_t src_addr6;
uint64_t next;
uint64_t src_addr7;
uint64_t src_addr8;
uint64_t reserved[2];
};
struct rte_ioat_pq_update_hw_desc {
uint32_t size;
union {
uint32_t control_raw;
struct {
uint32_t int_enable: 1;
uint32_t src_snoop_disable: 1;
uint32_t dest_snoop_disable: 1;
uint32_t completion_update: 1;
uint32_t fence: 1;
uint32_t src_cnt: 3;
uint32_t bundle: 1;
uint32_t dest_dca: 1;
uint32_t hint: 1;
uint32_t p_disable: 1;
uint32_t q_disable: 1;
uint32_t reserved: 3;
uint32_t coef: 8;
#define RTE_IOAT_OP_PQ_UP 0x8b
uint32_t op: 8;
} control;
} u;
uint64_t src_addr;
uint64_t p_addr;
uint64_t next;
uint64_t src_addr2;
uint64_t p_src;
uint64_t q_src;
uint64_t q_addr;
};
struct rte_ioat_raw_hw_desc {
uint64_t field[8];
};
union rte_ioat_hw_desc {
struct rte_ioat_raw_hw_desc raw;
struct rte_ioat_generic_hw_desc generic;
struct rte_ioat_dma_hw_desc dma;
struct rte_ioat_fill_hw_desc fill;
struct rte_ioat_xor_hw_desc xor_desc;
struct rte_ioat_xor_ext_hw_desc xor_ext;
struct rte_ioat_pq_hw_desc pq;
struct rte_ioat_pq_ext_hw_desc pq_ext;
struct rte_ioat_pq_update_hw_desc pq_update;
};
/*** Definitions for Intel(R) Data Streaming Accelerator Follow ***/
#define IDXD_CMD_SHIFT 20
enum rte_idxd_cmds {
idxd_enable_dev = 1,
idxd_disable_dev,
idxd_drain_all,
idxd_abort_all,
idxd_reset_device,
idxd_enable_wq,
idxd_disable_wq,
idxd_drain_wq,
idxd_abort_wq,
idxd_reset_wq,
};
/* General bar0 registers */
struct rte_idxd_bar0 {
uint32_t __rte_cache_aligned version; /* offset 0x00 */
uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */
uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */
uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */
uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */
uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */
uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */
uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */
uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */
uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */
uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */
uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */
uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */
uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */
};
/* workqueue config is provided by an array of uint32_t values. */
#define WQ_SIZE_IDX 0 /* size is in first 32-bit value */
#define WQ_THRESHOLD_IDX 1 /* WQ threshold second 32-bits */
#define WQ_MODE_IDX 2 /* WQ mode and other flags */
#define WQ_SIZES_IDX 3 /* WQ transfer and batch sizes */
#define WQ_OCC_INT_IDX 4 /* WQ occupancy interrupt handle */
#define WQ_OCC_LIMIT_IDX 5 /* WQ occupancy limit */
#define WQ_STATE_IDX 6 /* WQ state and occupancy state */
#define WQ_MODE_SHARED 0
#define WQ_MODE_DEDICATED 1
#define WQ_PRIORITY_SHIFT 4
#define WQ_BATCH_SZ_SHIFT 5
#define WQ_STATE_SHIFT 30
#define WQ_STATE_MASK 0x3
struct rte_idxd_grpcfg {
uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */
uint64_t grpengcfg; /* offset 32 */
uint32_t grpflags; /* offset 40 */
};
#define GENSTS_DEV_STATE_MASK 0x03
#define CMDSTATUS_ACTIVE_SHIFT 31
#define CMDSTATUS_ACTIVE_MASK (1 << 31)
#define CMDSTATUS_ERR_MASK 0xFF
#ifdef __cplusplus
}
#endif
#endif /* RTE_IOAT_SPEC_H */

View File

@ -1,36 +0,0 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright 2019 Intel Corporation
build = dpdk_conf.has('RTE_ARCH_X86')
# only use ioat rawdev driver if we don't have the equivalent dmadev ones
if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT')
build = false
reason = 'replaced by dmadev drivers'
subdir_done()
endif
reason = 'only supported on x86'
sources = files(
'ioat_common.c',
'ioat_rawdev_test.c',
)
if not dpdk_conf.has('RTE_DMA_IDXD')
sources += files(
'idxd_bus.c',
'idxd_pci.c',
)
endif
if not dpdk_conf.has('RTE_DMA_IOAT')
sources += files (
'ioat_rawdev.c',
)
endif
deps += ['bus_pci', 'mbuf', 'rawdev']
headers = files(
'rte_ioat_rawdev.h',
'rte_idxd_rawdev_fns.h',
'rte_ioat_rawdev_fns.h',
)

View File

@ -1,394 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2021 Intel Corporation
*/
#ifndef _RTE_IDXD_RAWDEV_FNS_H_
#define _RTE_IDXD_RAWDEV_FNS_H_
/**
* @file
* This header file contains the implementation of the various ioat
* rawdev functions for DSA hardware. The API specification and key
* public structures are defined in "rte_ioat_rawdev.h".
*
* This file should not be included directly, but instead applications should
* include "rte_ioat_rawdev.h", which then includes this file - and the
* IOAT/CBDMA equivalent header - in turn.
*/
#include <stdint.h>
#include <rte_errno.h>
/*
* Defines used in the data path for interacting with IDXD hardware.
*/
#define IDXD_CMD_OP_SHIFT 24
enum rte_idxd_ops {
idxd_op_nop = 0,
idxd_op_batch,
idxd_op_drain,
idxd_op_memmove,
idxd_op_fill
};
#define IDXD_FLAG_FENCE (1 << 0)
#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2)
#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3)
#define IDXD_FLAG_CACHE_CONTROL (1 << 8)
#define IOAT_COMP_UPDATE_SHIFT 3
#define IOAT_CMD_OP_SHIFT 24
enum rte_ioat_ops {
ioat_op_copy = 0, /* Standard DMA Operation */
ioat_op_fill /* Block Fill */
};
/**
* Hardware descriptor used by DSA hardware, for both bursts and
* for individual operations.
*/
struct rte_idxd_hw_desc {
uint32_t pasid;
uint32_t op_flags;
rte_iova_t completion;
RTE_STD_C11
union {
rte_iova_t src; /* source address for copy ops etc. */
rte_iova_t desc_addr; /* descriptor pointer for batch */
};
rte_iova_t dst;
uint32_t size; /* length of data for op, or batch size */
uint16_t intr_handle; /* completion interrupt handle */
/* remaining 26 bytes are reserved */
uint16_t __reserved[13];
} __rte_aligned(64);
/**
* Completion record structure written back by DSA
*/
struct rte_idxd_completion {
uint8_t status;
uint8_t result;
/* 16-bits pad here */
uint32_t completed_size; /* data length, or descriptors for batch */
rte_iova_t fault_address;
uint32_t invalid_flags;
} __rte_aligned(32);
/**
* structure used to save the "handles" provided by the user to be
* returned to the user on job completion.
*/
struct rte_idxd_user_hdl {
uint64_t src;
uint64_t dst;
};
/**
* @internal
* Structure representing an IDXD device instance
*/
struct rte_idxd_rawdev {
enum rte_ioat_dev_type type;
struct rte_ioat_xstats xstats;
void *portal; /* address to write the batch descriptor */
struct rte_ioat_rawdev_config cfg;
rte_iova_t desc_iova; /* base address of desc ring, needed for completions */
/* counters to track the batches */
unsigned short max_batches;
unsigned short batch_idx_read;
unsigned short batch_idx_write;
unsigned short *batch_idx_ring; /* store where each batch ends */
/* track descriptors and handles */
unsigned short desc_ring_mask;
unsigned short hdls_avail; /* handles for ops completed */
unsigned short hdls_read; /* the read pointer for hdls/desc rings */
unsigned short batch_start; /* start+size == write pointer for hdls/desc */
unsigned short batch_size;
struct rte_idxd_hw_desc *desc_ring;
struct rte_idxd_user_hdl *hdl_ring;
/* flags to indicate handle validity. Kept separate from ring, to avoid
* using 8 bytes per flag. Upper 8 bits holds error code if any.
*/
uint16_t *hdl_ring_flags;
};
#define RTE_IDXD_HDL_NORMAL 0
#define RTE_IDXD_HDL_INVALID (1 << 0) /* no handle stored for this element */
#define RTE_IDXD_HDL_OP_FAILED (1 << 1) /* return failure for this one */
#define RTE_IDXD_HDL_OP_SKIPPED (1 << 2) /* this op was skipped */
static __rte_always_inline uint16_t
__idxd_burst_capacity(int dev_id)
{
struct rte_idxd_rawdev *idxd =
(struct rte_idxd_rawdev *)rte_rawdevs[dev_id].dev_private;
uint16_t write_idx = idxd->batch_start + idxd->batch_size;
uint16_t used_space, free_space;
/* Check for space in the batch ring */
if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) ||
idxd->batch_idx_write + 1 == idxd->batch_idx_read)
return 0;
/* for descriptors, check for wrap-around on write but not read */
if (idxd->hdls_read > write_idx)
write_idx += idxd->desc_ring_mask + 1;
used_space = write_idx - idxd->hdls_read;
/* Return amount of free space in the descriptor ring
* subtract 1 for space for batch descriptor and 1 for possible null desc
*/
free_space = idxd->desc_ring_mask - used_space;
if (free_space < 2)
return 0;
return free_space - 2;
}
static __rte_always_inline rte_iova_t
__desc_idx_to_iova(struct rte_idxd_rawdev *idxd, uint16_t n)
{
return idxd->desc_iova + (n * sizeof(struct rte_idxd_hw_desc));
}
static __rte_always_inline int
__idxd_write_desc(int dev_id,
const uint32_t op_flags,
const rte_iova_t src,
const rte_iova_t dst,
const uint32_t size,
const struct rte_idxd_user_hdl *hdl)
{
struct rte_idxd_rawdev *idxd =
(struct rte_idxd_rawdev *)rte_rawdevs[dev_id].dev_private;
uint16_t write_idx = idxd->batch_start + idxd->batch_size;
uint16_t mask = idxd->desc_ring_mask;
/* first check batch ring space then desc ring space */
if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) ||
idxd->batch_idx_write + 1 == idxd->batch_idx_read)
goto failed;
/* for descriptor ring, we always need a slot for batch completion */
if (((write_idx + 2) & mask) == idxd->hdls_read ||
((write_idx + 1) & mask) == idxd->hdls_read)
goto failed;
/* write desc and handle. Note, descriptors don't wrap */
idxd->desc_ring[write_idx].pasid = 0;
idxd->desc_ring[write_idx].op_flags = op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID;
idxd->desc_ring[write_idx].completion = __desc_idx_to_iova(idxd, write_idx & mask);
idxd->desc_ring[write_idx].src = src;
idxd->desc_ring[write_idx].dst = dst;
idxd->desc_ring[write_idx].size = size;
if (hdl == NULL)
idxd->hdl_ring_flags[write_idx & mask] = RTE_IDXD_HDL_INVALID;
else
idxd->hdl_ring[write_idx & mask] = *hdl;
idxd->batch_size++;
idxd->xstats.enqueued++;
rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]);
return 1;
failed:
idxd->xstats.enqueue_failed++;
rte_errno = ENOSPC;
return 0;
}
static __rte_always_inline int
__idxd_enqueue_fill(int dev_id, uint64_t pattern, rte_iova_t dst,
unsigned int length, uintptr_t dst_hdl)
{
const struct rte_idxd_user_hdl hdl = {
.dst = dst_hdl
};
return __idxd_write_desc(dev_id,
(idxd_op_fill << IDXD_CMD_OP_SHIFT) | IDXD_FLAG_CACHE_CONTROL,
pattern, dst, length, &hdl);
}
static __rte_always_inline int
__idxd_enqueue_copy(int dev_id, rte_iova_t src, rte_iova_t dst,
unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
{
const struct rte_idxd_user_hdl hdl = {
.src = src_hdl,
.dst = dst_hdl
};
return __idxd_write_desc(dev_id,
(idxd_op_memmove << IDXD_CMD_OP_SHIFT) | IDXD_FLAG_CACHE_CONTROL,
src, dst, length, &hdl);
}
static __rte_always_inline int
__idxd_enqueue_nop(int dev_id)
{
/* only op field needs filling - zero src, dst and length */
return __idxd_write_desc(dev_id, idxd_op_nop << IDXD_CMD_OP_SHIFT,
0, 0, 0, NULL);
}
static __rte_always_inline int
__idxd_fence(int dev_id)
{
	/* only the fence flag needs setting - zero op, src, dst and length */
return __idxd_write_desc(dev_id, IDXD_FLAG_FENCE, 0, 0, 0, NULL);
}
static __rte_always_inline void
__idxd_movdir64b(volatile void *dst, const struct rte_idxd_hw_desc *src)
{
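	/* raw encoding of MOVDIR64B rax, [rdx] (66 0F 38 F8 /r): atomically
	 * writes the 64-byte descriptor pointed to by "src" (in rdx) to the
	 * device portal address held in "dst" (in rax); emitted as raw bytes,
	 * presumably to avoid requiring assembler support for the instruction
	 */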
asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
:
: "a" (dst), "d" (src)
: "memory");
}
static __rte_always_inline int
__idxd_perform_ops(int dev_id)
{
struct rte_idxd_rawdev *idxd =
(struct rte_idxd_rawdev *)rte_rawdevs[dev_id].dev_private;
if (!idxd->cfg.no_prefetch_completions)
rte_prefetch1(&idxd->desc_ring[idxd->batch_idx_ring[idxd->batch_idx_read]]);
if (idxd->batch_size == 0)
return 0;
if (idxd->batch_size == 1)
/* use a NOP as a null descriptor, so batch_size >= 2 */
if (__idxd_enqueue_nop(dev_id) != 1)
return -1;
/* write completion beyond last desc in the batch */
uint16_t comp_idx = (idxd->batch_start + idxd->batch_size) & idxd->desc_ring_mask;
*((uint64_t *)&idxd->desc_ring[comp_idx]) = 0; /* zero start of desc */
idxd->hdl_ring_flags[comp_idx] = RTE_IDXD_HDL_INVALID;
const struct rte_idxd_hw_desc batch_desc = {
.op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) |
IDXD_FLAG_COMPLETION_ADDR_VALID |
IDXD_FLAG_REQUEST_COMPLETION,
.desc_addr = __desc_idx_to_iova(idxd, idxd->batch_start),
.completion = __desc_idx_to_iova(idxd, comp_idx),
.size = idxd->batch_size,
};
_mm_sfence(); /* fence before writing desc to device */
__idxd_movdir64b(idxd->portal, &batch_desc);
idxd->xstats.started += idxd->batch_size;
idxd->batch_start += idxd->batch_size + 1;
idxd->batch_start &= idxd->desc_ring_mask;
idxd->batch_size = 0;
idxd->batch_idx_ring[idxd->batch_idx_write++] = comp_idx;
if (idxd->batch_idx_write > idxd->max_batches)
idxd->batch_idx_write = 0;
return 0;
}
static __rte_always_inline int
__idxd_completed_ops(int dev_id, uint8_t max_ops, uint32_t *status, uint8_t *num_unsuccessful,
uintptr_t *src_hdls, uintptr_t *dst_hdls)
{
struct rte_idxd_rawdev *idxd =
(struct rte_idxd_rawdev *)rte_rawdevs[dev_id].dev_private;
unsigned short n, h_idx;
while (idxd->batch_idx_read != idxd->batch_idx_write) {
uint16_t idx_to_chk = idxd->batch_idx_ring[idxd->batch_idx_read];
volatile struct rte_idxd_completion *comp_to_chk =
(struct rte_idxd_completion *)&idxd->desc_ring[idx_to_chk];
uint8_t batch_status = comp_to_chk->status;
if (batch_status == 0)
break;
comp_to_chk->status = 0;
if (unlikely(batch_status > 1)) {
/* error occurred somewhere in batch, start where last checked */
uint16_t desc_count = comp_to_chk->completed_size;
uint16_t batch_start = idxd->hdls_avail;
uint16_t batch_end = idx_to_chk;
if (batch_start > batch_end)
batch_end += idxd->desc_ring_mask + 1;
/* go through each batch entry and see status */
for (n = 0; n < desc_count; n++) {
uint16_t idx = (batch_start + n) & idxd->desc_ring_mask;
volatile struct rte_idxd_completion *comp =
(struct rte_idxd_completion *)&idxd->desc_ring[idx];
if (comp->status != 0 &&
idxd->hdl_ring_flags[idx] == RTE_IDXD_HDL_NORMAL) {
idxd->hdl_ring_flags[idx] = RTE_IDXD_HDL_OP_FAILED;
idxd->hdl_ring_flags[idx] |= (comp->status << 8);
comp->status = 0; /* clear error for next time */
}
}
/* if batch is incomplete, mark rest as skipped */
for ( ; n < batch_end - batch_start; n++) {
uint16_t idx = (batch_start + n) & idxd->desc_ring_mask;
if (idxd->hdl_ring_flags[idx] == RTE_IDXD_HDL_NORMAL)
idxd->hdl_ring_flags[idx] = RTE_IDXD_HDL_OP_SKIPPED;
}
}
/* avail points to one after the last one written */
idxd->hdls_avail = (idx_to_chk + 1) & idxd->desc_ring_mask;
idxd->batch_idx_read++;
if (idxd->batch_idx_read > idxd->max_batches)
idxd->batch_idx_read = 0;
}
n = 0;
h_idx = idxd->hdls_read;
while (h_idx != idxd->hdls_avail) {
uint16_t flag = idxd->hdl_ring_flags[h_idx];
if (flag != RTE_IDXD_HDL_INVALID) {
if (!idxd->cfg.hdls_disable) {
src_hdls[n] = idxd->hdl_ring[h_idx].src;
dst_hdls[n] = idxd->hdl_ring[h_idx].dst;
}
if (unlikely(flag != RTE_IDXD_HDL_NORMAL)) {
if (status != NULL)
status[n] = flag == RTE_IDXD_HDL_OP_SKIPPED ?
RTE_IOAT_OP_SKIPPED :
/* failure case, return err code */
idxd->hdl_ring_flags[h_idx] >> 8;
if (num_unsuccessful != NULL)
*num_unsuccessful += 1;
}
n++;
}
idxd->hdl_ring_flags[h_idx] = RTE_IDXD_HDL_NORMAL;
if (++h_idx > idxd->desc_ring_mask)
h_idx = 0;
if (n >= max_ops)
break;
}
/* skip over any remaining blank elements, e.g. batch completion */
while (idxd->hdl_ring_flags[h_idx] == RTE_IDXD_HDL_INVALID && h_idx != idxd->hdls_avail) {
idxd->hdl_ring_flags[h_idx] = RTE_IDXD_HDL_NORMAL;
if (++h_idx > idxd->desc_ring_mask)
h_idx = 0;
}
idxd->hdls_read = h_idx;
idxd->xstats.completed += n;
return n;
}
#endif

View File

@ -1,214 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2019 Intel Corporation
*/
#ifndef _RTE_IOAT_RAWDEV_H_
#define _RTE_IOAT_RAWDEV_H_
#ifdef __cplusplus
extern "C" {
#endif
/**
* @file rte_ioat_rawdev.h
*
* Definitions for using the ioat rawdev device driver
*
* @warning
* @b EXPERIMENTAL: these structures and APIs may change without prior notice
*/
#include <rte_common.h>
/** Name of the device driver */
#define IOAT_PMD_RAWDEV_NAME rawdev_ioat
/** String reported as the device driver name by rte_rawdev_info_get() */
#define IOAT_PMD_RAWDEV_NAME_STR "rawdev_ioat"
/**
* Configuration structure for an ioat rawdev instance
*
* This structure is to be passed as the ".dev_private" parameter when
 * calling the rte_rawdev_info_get() and rte_rawdev_configure() APIs on
* an ioat rawdev instance.
*/
struct rte_ioat_rawdev_config {
unsigned short ring_size; /**< size of job submission descriptor ring */
bool hdls_disable; /**< if set, ignore user-supplied handle params */
/** set "no_prefetch_completions", if polling completions on separate core
* from the core submitting the jobs
*/
bool no_prefetch_completions;
};
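/*
 * Illustrative sketch (not part of the original header): configuring and
 * starting an ioat rawdev using this structure, modelled on the driver
 * self-test; "dev_id" is assumed to be a valid rawdev id and error handling
 * is reduced to rte_exit() for brevity.
 *
 *	struct rte_ioat_rawdev_config conf = { .ring_size = 512 };
 *	struct rte_rawdev_info info = { .dev_private = &conf };
 *
 *	if (rte_rawdev_configure(dev_id, &info, sizeof(conf)) != 0 ||
 *			rte_rawdev_start(dev_id) != 0)
 *		rte_exit(EXIT_FAILURE, "cannot set up ioat rawdev %u\n", dev_id);
 */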
/**
* Enqueue a fill operation onto the ioat device
*
* This queues up a fill operation to be performed by hardware, but does not
* trigger hardware to begin that operation.
*
* @param dev_id
* The rawdev device id of the ioat instance
* @param pattern
* The pattern to populate the destination buffer with
* @param dst
* The physical address of the destination buffer
* @param length
* The length of the destination buffer
* @param dst_hdl
* An opaque handle for the destination data, to be returned when this
* operation has been completed and the user polls for the completion details.
* NOTE: If hdls_disable configuration option for the device is set, this
* parameter is ignored.
* @return
* Number of operations enqueued, either 0 or 1
*/
static inline int
__rte_experimental
rte_ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
unsigned int length, uintptr_t dst_hdl);
/**
* Enqueue a copy operation onto the ioat device
*
* This queues up a copy operation to be performed by hardware, but does not
* trigger hardware to begin that operation.
*
* @param dev_id
* The rawdev device id of the ioat instance
* @param src
* The physical address of the source buffer
* @param dst
* The physical address of the destination buffer
* @param length
* The length of the data to be copied
* @param src_hdl
* An opaque handle for the source data, to be returned when this operation
* has been completed and the user polls for the completion details.
* NOTE: If hdls_disable configuration option for the device is set, this
* parameter is ignored.
* @param dst_hdl
* An opaque handle for the destination data, to be returned when this
* operation has been completed and the user polls for the completion details.
* NOTE: If hdls_disable configuration option for the device is set, this
* parameter is ignored.
* @return
* Number of operations enqueued, either 0 or 1
*/
static inline int
__rte_experimental
rte_ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl);
/**
* Add a fence to force ordering between operations
*
* This adds a fence to a sequence of operations to enforce ordering, such that
* all operations enqueued before the fence must be completed before operations
* after the fence.
* NOTE: Since this fence may be added as a flag to the last operation enqueued,
* this API may not function correctly when called immediately after an
* "rte_ioat_perform_ops" call i.e. before any new operations are enqueued.
*
* @param dev_id
* The rawdev device id of the ioat instance
* @return
* Number of fences enqueued, either 0 or 1
*/
static inline int
__rte_experimental
rte_ioat_fence(int dev_id);
/**
* Trigger hardware to begin performing enqueued operations
*
* Writes the "doorbell" to the hardware to trigger it
* to begin the operations previously enqueued by rte_ioat_enqueue_copy()
*
* @param dev_id
* The rawdev device id of the ioat instance
* @return
* 0 on success. Non-zero return on error.
*/
static inline int
__rte_experimental
rte_ioat_perform_ops(int dev_id);
/*
* Status codes for operations.
*/
#define RTE_IOAT_OP_SUCCESS 0 /**< Operation completed successfully */
#define RTE_IOAT_OP_SKIPPED 1 /**< Operation was not attempted (Earlier fenced op failed) */
/* Values >1 indicate a failure condition */
/* Error codes taken from Intel(R) Data Streaming Accelerator Architecture
* Specification, section 5.7
*/
#define RTE_IOAT_OP_ADDRESS_ERR 0x03 /**< Page fault or invalid address */
#define RTE_IOAT_OP_INVALID_LEN 0x13 /**< Invalid/too big length field passed */
#define RTE_IOAT_OP_OVERLAPPING_BUFS 0x16 /**< Overlapping buffers error */
/**
* Returns details of operations that have been completed
*
* The status of each operation is returned in the status array parameter.
* If the hdls_disable option was not set when the device was configured,
* the function will return to the caller the user-provided "handles" for
* the copy operations which have been completed by the hardware, and not
* already returned by a previous call to this API.
* If the hdls_disable option for the device was set on configure, the
* src_hdls and dst_hdls parameters will be ignored, and the
* function returns the number of newly-completed operations.
* If status is also NULL, then max_copies parameter is also ignored and the
* function returns a count of the number of newly-completed operations.
*
* @param dev_id
* The rawdev device id of the ioat instance
* @param max_copies
* The number of entries which can fit in the status, src_hdls and dst_hdls
* arrays, i.e. max number of completed operations to report.
* NOTE: If hdls_disable configuration option for the device is set, this
* parameter applies only to the "status" array if specified
* @param status
* Array to hold the status of each completed operation. Array should be
* set to zeros on input, as the driver will only write error status values.
* A value of 1 implies an operation was not attempted, and any other non-zero
* value indicates operation failure.
* Parameter may be NULL if no status value checking is required.
* @param num_unsuccessful
* Returns the number of elements in status where the value is non-zero,
* i.e. the operation either failed or was not attempted due to an earlier
* failure. If this value is returned as zero (the expected case), the
* status array will not have been modified by the function and need not be
* checked by software
* @param src_hdls
* Array to hold the source handle parameters of the completed ops.
* NOTE: If hdls_disable configuration option for the device is set, this
* parameter is ignored, and may be NULL
* @param dst_hdls
* Array to hold the destination handle parameters of the completed ops.
* NOTE: If hdls_disable configuration option for the device is set, this
* parameter is ignored, and may be NULL
* @return
* -1 on device error, with rte_errno set appropriately and parameters
* unmodified.
* Otherwise number of returned operations i.e. number of valid entries
* in the status, src_hdls and dst_hdls array parameters. If status is NULL,
* and the hdls_disable config option is set, this value may be greater than
 *   the max_copies parameter.
*/
static inline int
__rte_experimental
rte_ioat_completed_ops(int dev_id, uint8_t max_copies,
uint32_t *status, uint8_t *num_unsuccessful,
uintptr_t *src_hdls, uintptr_t *dst_hdls);
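/*
 * Illustrative datapath sketch (not part of the original header), tying the
 * calls above together: enqueue work, add a fence where ordering matters,
 * ring the doorbell with rte_ioat_perform_ops() and poll
 * rte_ioat_completed_ops() for the results. The "src"/"dst" mbufs, "len"
 * and "dev_id" used below are assumptions, and retry/cleanup logic is
 * omitted.
 *
 *	if (rte_ioat_enqueue_fill(dev_id, 0xdeadbeefULL,
 *			rte_pktmbuf_iova(dst), len, (uintptr_t)dst) != 1)
 *		return -1;	// ring full: perform ops, poll, then retry
 *	rte_ioat_fence(dev_id);	// the copy below must follow the fill
 *	if (rte_ioat_enqueue_copy(dev_id, rte_pktmbuf_iova(src),
 *			rte_pktmbuf_iova(dst), len,
 *			(uintptr_t)src, (uintptr_t)dst) != 1)
 *		return -1;
 *	rte_ioat_perform_ops(dev_id);	// submit both jobs to hardware
 *
 *	uint32_t status[8] = {0};
 *	uint8_t failed = 0;
 *	uintptr_t src_hdls[8], dst_hdls[8];
 *	int n = rte_ioat_completed_ops(dev_id, 8, status, &failed,
 *			src_hdls, dst_hdls);
 *	// n handles returned; "failed" counts non-zero entries in status[]
 */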
/* include the implementation details from a separate file */
#include "rte_ioat_rawdev_fns.h"
#ifdef __cplusplus
}
#endif
#endif /* _RTE_IOAT_RAWDEV_H_ */

View File

@ -1,379 +0,0 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2019-2020 Intel Corporation
*/
#ifndef _RTE_IOAT_RAWDEV_FNS_H_
#define _RTE_IOAT_RAWDEV_FNS_H_
/**
* @file
* This header file contains the implementation of the various ioat
* rawdev functions for IOAT/CBDMA hardware. The API specification and key
* public structures are defined in "rte_ioat_rawdev.h".
*
* This file should not be included directly, but instead applications should
* include "rte_ioat_rawdev.h", which then includes this file - and the IDXD/DSA
* equivalent header - in turn.
*/
#include <x86intrin.h>
#include <rte_rawdev.h>
#include <rte_memzone.h>
#include <rte_prefetch.h>
/**
* @internal
* Identify the data path to use.
* Must be first field of rte_ioat_rawdev and rte_idxd_rawdev structs
*/
enum rte_ioat_dev_type {
RTE_IOAT_DEV,
RTE_IDXD_DEV,
};
/**
* @internal
* some statistics for tracking, if added/changed update xstats fns
*/
struct rte_ioat_xstats {
uint64_t enqueue_failed;
uint64_t enqueued;
uint64_t started;
uint64_t completed;
};
#include "rte_idxd_rawdev_fns.h"
/**
* @internal
* Structure representing a device descriptor
*/
struct rte_ioat_generic_hw_desc {
uint32_t size;
union {
uint32_t control_raw;
struct {
uint32_t int_enable: 1;
uint32_t src_snoop_disable: 1;
uint32_t dest_snoop_disable: 1;
uint32_t completion_update: 1;
uint32_t fence: 1;
uint32_t reserved2: 1;
uint32_t src_page_break: 1;
uint32_t dest_page_break: 1;
uint32_t bundle: 1;
uint32_t dest_dca: 1;
uint32_t hint: 1;
uint32_t reserved: 13;
uint32_t op: 8;
} control;
} u;
uint64_t src_addr;
uint64_t dest_addr;
uint64_t next;
uint64_t op_specific[4];
};
/**
* @internal
* Structure representing an IOAT device instance
*/
struct rte_ioat_rawdev {
/* common fields at the top - match those in rte_idxd_rawdev */
enum rte_ioat_dev_type type;
struct rte_ioat_xstats xstats;
struct rte_rawdev *rawdev;
const struct rte_memzone *mz;
const struct rte_memzone *desc_mz;
volatile uint16_t *doorbell __rte_cache_aligned;
phys_addr_t status_addr;
phys_addr_t ring_addr;
unsigned short ring_size;
bool hdls_disable;
struct rte_ioat_generic_hw_desc *desc_ring;
__m128i *hdls; /* completion handles for returning to user */
unsigned short next_read;
unsigned short next_write;
/* to report completions, the device will write status back here */
volatile uint64_t status __rte_cache_aligned;
/* pointer to the register bar */
volatile struct rte_ioat_registers *regs;
};
#define RTE_IOAT_CHANSTS_IDLE 0x1
#define RTE_IOAT_CHANSTS_SUSPENDED 0x2
#define RTE_IOAT_CHANSTS_HALTED 0x3
#define RTE_IOAT_CHANSTS_ARMED 0x4
static __rte_always_inline uint16_t
__ioat_burst_capacity(int dev_id)
{
struct rte_ioat_rawdev *ioat =
(struct rte_ioat_rawdev *)rte_rawdevs[dev_id].dev_private;
unsigned short size = ioat->ring_size - 1;
unsigned short read = ioat->next_read;
unsigned short write = ioat->next_write;
unsigned short space = size - (write - read);
return space;
}
static __rte_always_inline int
__ioat_write_desc(int dev_id, uint32_t op, uint64_t src, phys_addr_t dst,
unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
{
struct rte_ioat_rawdev *ioat =
(struct rte_ioat_rawdev *)rte_rawdevs[dev_id].dev_private;
unsigned short read = ioat->next_read;
unsigned short write = ioat->next_write;
unsigned short mask = ioat->ring_size - 1;
unsigned short space = mask + read - write;
struct rte_ioat_generic_hw_desc *desc;
if (space == 0) {
ioat->xstats.enqueue_failed++;
return 0;
}
ioat->next_write = write + 1;
write &= mask;
desc = &ioat->desc_ring[write];
desc->size = length;
	/* set the op and request a completion write-back on every 16th descriptor */
desc->u.control_raw = (uint32_t)((op << IOAT_CMD_OP_SHIFT) |
(!(write & 0xF) << IOAT_COMP_UPDATE_SHIFT));
desc->src_addr = src;
desc->dest_addr = dst;
if (!ioat->hdls_disable)
ioat->hdls[write] = _mm_set_epi64x((int64_t)dst_hdl,
(int64_t)src_hdl);
rte_prefetch0(&ioat->desc_ring[ioat->next_write & mask]);
ioat->xstats.enqueued++;
return 1;
}
static __rte_always_inline int
__ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
unsigned int length, uintptr_t dst_hdl)
{
static const uintptr_t null_hdl;
return __ioat_write_desc(dev_id, ioat_op_fill, pattern, dst, length,
null_hdl, dst_hdl);
}
/*
* Enqueue a copy operation onto the ioat device
*/
static __rte_always_inline int
__ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
{
return __ioat_write_desc(dev_id, ioat_op_copy, src, dst, length,
src_hdl, dst_hdl);
}
/* add fence to last written descriptor */
static __rte_always_inline int
__ioat_fence(int dev_id)
{
struct rte_ioat_rawdev *ioat =
(struct rte_ioat_rawdev *)rte_rawdevs[dev_id].dev_private;
unsigned short write = ioat->next_write;
unsigned short mask = ioat->ring_size - 1;
struct rte_ioat_generic_hw_desc *desc;
write = (write - 1) & mask;
desc = &ioat->desc_ring[write];
desc->u.control.fence = 1;
return 0;
}
/*
* Trigger hardware to begin performing enqueued operations
*/
static __rte_always_inline int
__ioat_perform_ops(int dev_id)
{
struct rte_ioat_rawdev *ioat =
(struct rte_ioat_rawdev *)rte_rawdevs[dev_id].dev_private;
ioat->desc_ring[(ioat->next_write - 1) & (ioat->ring_size - 1)].u
.control.completion_update = 1;
rte_compiler_barrier();
*ioat->doorbell = ioat->next_write;
ioat->xstats.started = ioat->xstats.enqueued;
return 0;
}
/**
* @internal
* Returns the index of the last completed operation.
*/
static __rte_always_inline int
__ioat_get_last_completed(struct rte_ioat_rawdev *ioat, int *error)
{
uint64_t status = ioat->status;
/* lower 3 bits indicate "transfer status" : active, idle, halted.
* We can ignore bit 0.
*/
*error = status & (RTE_IOAT_CHANSTS_SUSPENDED | RTE_IOAT_CHANSTS_ARMED);
return (status - ioat->ring_addr) >> 6;
}
/*
* Returns details of operations that have been completed
*/
static __rte_always_inline int
__ioat_completed_ops(int dev_id, uint8_t max_copies,
uintptr_t *src_hdls, uintptr_t *dst_hdls)
{
struct rte_ioat_rawdev *ioat =
(struct rte_ioat_rawdev *)rte_rawdevs[dev_id].dev_private;
unsigned short mask = (ioat->ring_size - 1);
unsigned short read = ioat->next_read;
unsigned short end_read, count;
int error;
int i = 0;
end_read = (__ioat_get_last_completed(ioat, &error) + 1) & mask;
count = (end_read - (read & mask)) & mask;
if (error) {
rte_errno = EIO;
return -1;
}
if (ioat->hdls_disable) {
read += count;
goto end;
}
if (count > max_copies)
count = max_copies;
for (; i < count - 1; i += 2, read += 2) {
__m128i hdls0 = _mm_load_si128(&ioat->hdls[read & mask]);
__m128i hdls1 = _mm_load_si128(&ioat->hdls[(read + 1) & mask]);
_mm_storeu_si128((__m128i *)&src_hdls[i],
_mm_unpacklo_epi64(hdls0, hdls1));
_mm_storeu_si128((__m128i *)&dst_hdls[i],
_mm_unpackhi_epi64(hdls0, hdls1));
}
for (; i < count; i++, read++) {
uintptr_t *hdls = (uintptr_t *)&ioat->hdls[read & mask];
src_hdls[i] = hdls[0];
dst_hdls[i] = hdls[1];
}
end:
ioat->next_read = read;
ioat->xstats.completed += count;
return count;
}
static inline uint16_t
rte_ioat_burst_capacity(int dev_id)
{
enum rte_ioat_dev_type *type =
(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
if (*type == RTE_IDXD_DEV)
return __idxd_burst_capacity(dev_id);
else
return __ioat_burst_capacity(dev_id);
}
static inline int
rte_ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
unsigned int len, uintptr_t dst_hdl)
{
enum rte_ioat_dev_type *type =
(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
if (*type == RTE_IDXD_DEV)
return __idxd_enqueue_fill(dev_id, pattern, dst, len, dst_hdl);
else
return __ioat_enqueue_fill(dev_id, pattern, dst, len, dst_hdl);
}
static inline int
rte_ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
{
enum rte_ioat_dev_type *type =
(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
if (*type == RTE_IDXD_DEV)
return __idxd_enqueue_copy(dev_id, src, dst, length,
src_hdl, dst_hdl);
else
return __ioat_enqueue_copy(dev_id, src, dst, length,
src_hdl, dst_hdl);
}
static inline int
rte_ioat_fence(int dev_id)
{
enum rte_ioat_dev_type *type =
(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
if (*type == RTE_IDXD_DEV)
return __idxd_fence(dev_id);
else
return __ioat_fence(dev_id);
}
static inline int
rte_ioat_perform_ops(int dev_id)
{
enum rte_ioat_dev_type *type =
(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
if (*type == RTE_IDXD_DEV)
return __idxd_perform_ops(dev_id);
else
return __ioat_perform_ops(dev_id);
}
static inline int
rte_ioat_completed_ops(int dev_id, uint8_t max_copies,
uint32_t *status, uint8_t *num_unsuccessful,
uintptr_t *src_hdls, uintptr_t *dst_hdls)
{
enum rte_ioat_dev_type *type =
(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
uint8_t tmp; /* used so functions don't need to check for null parameter */
if (num_unsuccessful == NULL)
num_unsuccessful = &tmp;
*num_unsuccessful = 0;
if (*type == RTE_IDXD_DEV)
return __idxd_completed_ops(dev_id, max_copies, status, num_unsuccessful,
src_hdls, dst_hdls);
else
return __ioat_completed_ops(dev_id, max_copies, src_hdls, dst_hdls);
}
static inline void
__rte_deprecated_msg("use rte_ioat_perform_ops() instead")
rte_ioat_do_copies(int dev_id) { rte_ioat_perform_ops(dev_id); }
static inline int
__rte_deprecated_msg("use rte_ioat_completed_ops() instead")
rte_ioat_completed_copies(int dev_id, uint8_t max_copies,
uintptr_t *src_hdls, uintptr_t *dst_hdls)
{
return rte_ioat_completed_ops(dev_id, max_copies, NULL, NULL,
src_hdls, dst_hdls);
}
#endif /* _RTE_IOAT_RAWDEV_FNS_H_ */

View File

@ -1,3 +0,0 @@
DPDK_23 {
local: *;
};

View File

@ -10,7 +10,6 @@ drivers = [
'cnxk_gpio', 'cnxk_gpio',
'dpaa2_cmdif', 'dpaa2_cmdif',
'ifpga', 'ifpga',
'ioat',
'ntb', 'ntb',
'skeleton', 'skeleton',
] ]