Removing direct access to interrupt handle structure fields,
rather use respective get set APIs for the same.
Making changes to all the libraries access the interrupt handle fields.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
Removing direct access to interrupt handle structure fields,
rather use respective get set APIs for the same.
Making changes to all the libraries access the interrupt handle fields.
Implementing alarm cleanup routine, where the memory allocated
for interrupt instance can be freed.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
Making changes to the interrupt framework to use interrupt handle
APIs to get/set any field.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
Prototype/Implement get set APIs for interrupt handle fields.
User won't be able to access any of the interrupt handle fields
directly while should use these get/set APIs to access/manipulate
them.
Internal interrupt header i.e. rte_eal_interrupt.h is rearranged,
as APIs defined are moved to rte_interrupts.h and epoll specific
definitions are moved to a new header rte_epoll.h.
Later in the series rte_eal_interrupt.h will be removed.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
Windows EAL did not detect IOVA mode and worked incorrectly
if physical addresses could not be obtained
(if virt2phys driver was missing or inaccessible).
In this case, rte_mem_virt2iova() reported RTE_BAD_IOVA for any address.
Inability to obtain IOVA, be it PA or VA, should cause a failure
for the DPDK allocator, but it was hidden by the implementation,
so allocations did not fail when they should.
The mode when DPDK cannot obtain PA but can work is IOVA-as-VA mode.
However, rte_eal_iova_mode() always returned RTE_IOVA_DC
(while it should only ever return RTE_IOVA_PA or RTE_IOVA_VA),
because IOVA mode detection was not implemented.
Implement IOVA mode detection:
1. Always allow to force --iova-mode=va.
2. Allow to force --iova-mode=pa only if virt2phys is available.
3. If no mode is forced and virt2phys is available,
select the mode according to bus requests, default to PA.
4. If no mode is forced but virt2phys is unavailable, default to VA.
Fix rte_mem_virt2iova() by returning VA when using IOVA-as-VA.
Fix rte_eal_iova_mode() by returning the selected mode.
Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org
Reported-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
This patch fixes buffer overflow reported by ASAN,
please reference https://bugs.dpdk.org/show_bug.cgi?id=819
The rte_lpm6 keeps routing information for control plane purpose
inside the rte_hash table which uses rte_jhash() as a hash function.
From the rte_jhash() documentation: If input key is not aligned to
four byte boundaries or a multiple of four bytes in length,
the memory region just after may be read (but not used in the
computation).
rte_lpm6 uses 17 bytes keys consisting of IPv6 address (16 bytes) +
depth (1 byte).
This patch increases the size of the depth field up to uint32_t
and sets the alignment to 4 bytes.
Bugzilla ID: 819
Fixes: 86b3b21952 ("lpm6: store rules in hash table")
Cc: stable@dpdk.org
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Fixes: 7574c3ef74 ("hash: add toeplitz algorithm used by RSS")
Cc: stable@dpdk.org
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Ensure that the memory operations before the call to
rte_eal_remote_launch are visible to the worker thread.
Use the function pointer to execute in worker thread
as the guard variable.
Ensure that the memory operations in worker thread, that happen
before it returns the status of the assigned function, are
visible to the main thread. Use the variable containing the
lcore's state as the guard variable.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Feifei Wang <feifei.wang2@arm.com>
FINISHED state seems to be used to indicate that the worker's update
of the 'state' is not visible to other threads. There seems to be no
requirement to have such a state.
Since the FINISHED state is removed, the API rte_eal_wait_lcore
is updated to always return the status of the last function that
ran in the worker core.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Feifei Wang <feifei.wang2@arm.com>
In the rte_eal_remote_launch function, the lcore function
pointer is checked for NULL. However, the pointer is never
reset to NULL. Reset the lcore function pointer and argument
after the worker has completed executing the lcore function.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Feifei Wang <feifei.wang2@arm.com>
Functions and macros in x86 rte_memcpy.h may cause cast-align warnings,
when using strict cast align flag with supporting gcc:
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
CFLAGS="-Wcast-align=strict" make V=1 -C examples/l2fwd clean static
For example:
In file included from main.c:24:
/dpdk/build/include/rte_memcpy.h: In function 'rte_mov16':
/dpdk/build/include/rte_memcpy.h:306:25: warning: cast increases
required alignment of target type [-Wcast-align]
306 | xmm0 = _mm_loadu_si128((const __m128i *)src);
| ^
As the code assumes correct alignment, add first a (void *) or (const
void *) castings, to avoid the warnings.
Fixes: 9484092baa ("eal/x86: optimize memcpy for AVX512 platforms")
Cc: stable@dpdk.org
Signed-off-by: Eli Britstein <elibr@nvidia.com>
In rte_pktmbuf_mtod_offset macro, there is a casting from char * to type
't', which may cause cast-align warning when using strict cast align
flag with supporting gcc:
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
CFLAGS="-Wcast-align=strict" make V=1 -C examples/l2fwd clean static
main.c: In function 'l2fwd_mac_updating':
/dpdk/build/include/rte_mbuf_core.h:719:3: warning: cast increases
required alignment of target type [-Wcast-align]
719 | ((t)((char *)(m)->buf_addr + (m)->data_off + (o)))
| ^
/dpdk/build/include/rte_mbuf_core.h:733:32: note: in expansion of macro
'rte_pktmbuf_mtod_offset'
733 | #define rte_pktmbuf_mtod(m, t) rte_pktmbuf_mtod_offset(m, t, 0)
| ^~~~~~~~~~~~~~~~~~~~~~~
As the code assumes correct alignment, add first a (void *) casting, to
avoid the warning.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
In rte_vlan_insert there is a casting of rte_pktmbuf_prepend returned
value to (struct rte_ether_hdr *), which causes cast-align warning when
using strict cast align flag with supporting gcc:
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
CFLAGS="-Wcast-align=strict" make V=1 -C examples/l2fwd clean static
In file included from main.c:35:
/dpdk/build/include/rte_ether.h:370:7: warning: cast increases required
alignment of target type [-Wcast-align]
370 | nh = (struct rte_ether_hdr *)
| ^
As the code assumes correct alignment, add first a (void *) casting, to
avoid the warning.
Fixes: c974021a59 ("ether: add soft vlan encap/decap")
Cc: stable@dpdk.org
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
When mempool had been created with RTE_MEMPOOL_F_NO_IOVA_CONTIG flag
but later populated with valid IOVA, RTE_MEMPOOL_F_NON_IO was unset,
while it should be kept. The unit test did not catch this
because rte_mempool_populate_default() it used was populating
with RTE_BAD_IOVA.
Keep setting RTE_MEMPOOL_NON_IO at an empty mempool creation
and add an assert for it in the unit test (remove the separate case).
Do not reset the flag if RTE_MEMPOOL_F_ON_IOVA_CONTIG is set.
Fixes: 11541c5c81 ("mempool: add non-IO flag")
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This API was introduced in 18.05, therefore removing
experimental tag to promote it to stable state
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Enable restricting the scope of an action to regular table entries or
to the table default entry in order to support the P4 language
tableonly or defaultonly annotations.
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add support for configurable number of loops through the input PCAP
file for the source port. Added an additional parameter to source
port CLI command.
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The instruction_data array was incorrectly indexed, which resulted in
the array index getting out of bounds and sometimes segfault.
Fixes: a1711f (“pipeline: add SWX Rx and extract instructions“)
Cc: stable@dpdk.org
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Build is broken on RHEL7 following introduction of this new protocol.
Fixes: 3a929df1f2 ("ethdev: support L2TPv2 and PPP procotol")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
Fix the mbuf offload flags namespace by adding an RTE_ prefix to the
name. The old flags remain usable, but a deprecation warning is issued
at compilation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
The flags PKT_TX_VLAN_PKT and PKT_TX_QINQ_PKT are
marked as deprecated since commit 380a7aab1a ("mbuf: rename deprecated
VLAN flags") (2017). But they were not using the RTE_DEPRECATED
macro, because it did not exist at this time. Add it, and replace
usage of these flags.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The flags PKT_RX_L4_CKSUM_BAD and PKT_RX_IP_CKSUM_BAD are defined
twice with the same value. Remove one of the occurrence, which was
marked as "deprecated".
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Set correct tunnel type telemetry text - tunnel type
was wrongly set as IPv4-UDP for all types.
Fixes: bf5b65a8e781 ("ipsec: support SA telemetry")
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
The device specific structures - rte_cryptodev
and rte_cryptodev_data are moved to cryptodev_pmd.h
to hide it from the applications.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Tested-by: Rebecca Troy <rebecca.troy@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Rework fast-path cryptodev functions to use rte_crypto_fp_ops[].
While it is an API/ABI breakage, this change is intended to be
transparent for both users (no changes in user app is required) and
PMD developers (no changes in PMD is required).
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Added a rte_cryptodev_pmd_probing_finish API which
need to be called by the PMD after the device is initialized
completely. This will set the fast path function pointers
in the flat array for secondary process. For primary process,
these are set in rte_cryptodev_start.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Move fastpath inline function pointers from rte_cryptodev into a
separate structure accessed via a flat array.
The intention is to make rte_cryptodev and related structures private
to avoid future API/ABI breakages.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Tested-by: Rebecca Troy <rebecca.troy@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
At queue_pair config stage, allocate memory for maximum
number of queue pair pointers that a device can support.
This will allow fast path APIs(enqueue_burst/dequeue_burst) to
refer pointer to internal QP data without checking for currently
configured QPs.
This is required to hide the rte_cryptodev and rte_cryptodev_data
structure from user.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
A new header file rte_cryptodev_core.h is added and all
internal data structures which need not be exposed directly to
application are moved to this file. These structures are mostly
used by drivers, but they need to be in the public header file
as they are accessed by datapath inline functions for
performance reasons.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Tested-by: Rebecca Troy <rebecca.troy@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The macros RTE_BIT32 and RTE_BIT64 are used to replace single bit masks.
Do not switch VLAN offload flags since type is not fixed size.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'RTE_ETH' namespace to all enums & macros in a backward compatible
way. The macros for backward compatibility can be removed in next LTS.
Also updated some struct names to have 'rte_eth' prefix.
All internal components switched to using new names.
Syntax fixed on lines that this patch touches.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
rte_eth_dev_configure() always sets MTU to either dev_conf.rxmode.mtu
or RTE_ETHER_MTU if application doesn't provide the value.
So, there is no point to allow rte_eth_dev_set_mtu() before since
set value will be overwritten on configure anyway.
Fixes: 1bb4a528c4 ("ethdev: fix max Rx packet length")
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds API to return name of device capability.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
In current DPDK framework, each Rx queue is pre-loaded with mbufs to
save incoming packets. For some PMDs, when number of representors scale
out in a switch domain, the memory consumption became significant.
Polling all ports also leads to high cache miss, high latency and low
throughput.
This patch introduces shared Rx queue. Ports in same Rx domain and
switch domain could share Rx queue set by specifying non-zero sharing
group in Rx queue configuration.
Shared Rx queue is identified by share_rxq field of Rx queue
configuration. Port A RxQ X can share RxQ with Port B RxQ Y by using
same shared Rx queue ID.
No special API is defined to receive packets from shared Rx queue.
Polling any member port of a shared Rx queue receives packets of that
queue for all member ports, port_id is identified by mbuf->port. PMD is
responsible to resolve shared Rx queue from device and queue data.
Shared Rx queue must be polled in same thread or core, polling a queue
ID of any member port is essentially same.
Multiple share groups are supported. PMD should support mixed
configuration by allowing multiple share groups and non-shared Rx queue
on one port.
Example grouping and polling model to reflect service priority:
Group1, 2 shared Rx queues per port: PF, rep0, rep1
Group2, 1 shared Rx queue per port: rep2, rep3, ... rep127
Core0: poll PF queue0
Core1: poll PF queue1
Core2: poll rep2 queue0
PMD advertise shared Rx queue capability via RTE_ETH_DEV_CAPA_RXQ_SHARE.
PMD is responsible for shared Rx queue consistency checks to avoid
member port's configuration contradict each other.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
If calling rte_dev_remove() after rte_eth_dev_close(), in
rte_eth_dev_pci_generic_remove() function, the released eth device still
can be found by its name in shared memory. As a result, the eth device
will be released repeatedly. The state of the eth device is modified to
RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this state can be used
to avoid this problem.
Fixes: dcd5c8112b ("ethdev: add PCI driver helpers")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The use of IOMMU has many advantages, such as isolation and address
translation. This patch extends the capability of DMA engine to use
IOMMU if the DMA engine is bound to vfio.
When set memory table, the guest memory will be mapped
into the default container of DPDK.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Currently, if we map a memory area A, then map a separate memory area B
that by coincidence happens to be adjacent to A, current implementation
will merge these two segments into one, and if partial unmapping is not
supported, these segments will then be only allowed to be unmapped in
one go. In other words, given segments A and B that are adjacent, it
is currently not possible to map A, then map B, then unmap A.
Fix this by adding a notion of "chunk size", which will allow
subdividing segments into equally sized segments whenever we are dealing
with an IOMMU that does not support partial unmapping. With this change,
we will still be able to merge adjacent segments, but only if they are
of the same size. If we keep with our above example, adjacent segments A
and B will be stored as separate segments if they are of different
sizes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The index in rte_vhost_set_last_inflight_io_split is from
the frontend driver, check if it's in the virtqueue range.
Fixes: bb0c2de960 ("vhost: add APIs to operate inflight ring")
Cc: stable@dpdk.org
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Added flow pattern items and header formats of L2TPv2 and PPP.
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Full stop at the end of short comment just make line longer. It
should be either everywhere or nowhere to be consistent.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add empty lines to separate fields commented using different styles.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix both in one changeset since they share line in a number of cases.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix it everywhere in ethdev including log messages.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Documentation in the next separate line is confusing. If documentation
requires own line it should be before, not after.
Move documentation to the previous line if documentation on the same
line makes it too long.
Fix a number of incorrect markups on the way.
When a lines is touched by the patch anyway, do other cosmetics
changes to avoid changes in next patches.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Malloc cl in the cmdline_stdin_new function, so release in the
cmdline_stdin_exit function is logical, so that cl will not be
released alone.
Fixes: af75078fec ("first public release")
Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
Reviewed-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Zhihong Peng <zhihongx.peng@intel.com>
Hide struct rdline definition and some RDLINE_* constants in order
to be able to change internal buffer sizes transparently to the user.
Add new functions:
* rdline_new(): allocate and initialize struct rdline.
This function replaces rdline_init() and takes an extra parameter:
opaque user data for the callbacks.
* rdline_free(): deallocate struct rdline.
* rdline_get_history_buffer_size(): for use in tests.
* rdline_get_opaque(): to obtain user data in callback functions.
Remove rdline_init() function from library headers and export list,
because using it requires the knowledge of sizeof(struct rdline).
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Remove the definition of `struct cmdline` from public header.
Deprecation notice:
https://mails.dpdk.org/archives/dev/2020-September/183310.html
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Rather than maintaining a separate list of libraries which are to be
built on windows, use the standard library list and explicitly add to
each library that is not to be built a check for windows and disable
the library at that per-lib level. As well as shortening the main
lib/meson.build file, this also leads to the build summary at the end of
the meson config run correctly listing the libraries which are not to be
built.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The dmadev library was not added to the list of libraries built on
Windows, meaning it was skipped in those builds and also that none of
the drivers were being considered for build. Adding dmadev to the list
fixes this, and also enables the skeleton dmadev driver to be built -
all-be-it with a small fix necessary.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
Inline helpers have no global symbols in shared libraries.
There is no reason to ask for versioning (plus this library would not
build on Windows).
Fixes: 91e581e5c9 ("dmadev: add data plane API")
Fixes: ea8cf0f853 ("dmadev: add burst capacity API")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This enhances the DPDK pdump library to support new
pcapng format and filtering via BPF.
The internal client/server protocol is changed to support
two versions: the original pdump basic version and a
new pcapng version.
The internal version number (not part of exposed API or ABI)
is intentionally increased to cause any attempt to try
mismatched primary/secondary process to fail.
Add new API to do allow filtering of captured packets with
DPDK BPF (eBPF) filter program. It keeps statistics
on packets captured, filtered, and missed (because ring was full).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
When debugging converted (and other) programs it is useful
to see disassembled eBPF output.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
The pcap library emits classic BPF (32 bit) and is useful for
creating filter programs. The DPDK BPF library only implements
extended BPF (eBPF). Add an function to convert from old to
new.
The rte_bpf_convert function uses rte_malloc to put the resulting
program in hugepage shared memory so it can be passed from a
secondary process to a primary process.
The code to convert was originally done as part of the Linux
kernel implementation then converted to a userspace program.
See https://github.com/tklauser/filter2xdp
Both authors have agreed that it is allowable to create a modified
version of this code and license it with BSD license used by DPDK.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Some BPF programs may use XOR of a register with itself
as a way to zero register in one instruction.
The BPF filter converter generates this in the prolog
to the generated code.
The BPF validator would not allow this because the value of
register was undefined. But after this operation it always zero.
Fixes: 8021917293 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This is utility library for writing pcapng format files
used by Wireshark family of utilities. Older tcpdump
also knows how to read (but not write) this format.
See
https://github.com/pcapng/pcapng/
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
The current version of the pdump library was building on
Windows, but it was useless since the pdump utility was not being
built and Windows does not have multi-process support.
The new version of pdump with filtering now has dependency
on bpf. But bpf library is not available on Windows.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
No need to expose rte_dma_devices out of the dmadev library.
Existing helpers should be enough, and inlines make use of
rte_dma_fp_objs.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
This commit introduces 3 flags to the port configuration flags.
These flags allow the application to indicate what type of work
is expected to be performed by an eventdev port.
The three new flags are
- RTE_EVENT_PORT_CFG_HINT_PRODUCER (mostly RTE_EVENT_OP_NEW events)
- RTE_EVENT_PORT_CFG_HINT_CONSUMER (mostly RTE_EVENT_OP_RELEASE events)
- RTE_EVENT_PORT_CFG_HINT_WORKER (mostly RTE_EVENT_OP_FORWARD events)
These flags are only hints, and the PMDs must operate under the
assumption that any port can enqueue an event with any type of op.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
When a poll queue is removed from a rx_adapter instance, the WRR poll
array is recomputed. The wrr array length is reduced in this case. The
next wrr position to poll is stored in wrr_pos variable of rx_adapter
instance. This wrr_pos can become invalid in some cases after wrr is
recomputed. Using this variable to get the next queue and device pair
may leed to wrr buffer overruns.
Resetting the wrr_pos to zero after recomputation of wrr array fixes
the buffer overrun issue.
Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Mark rte_trace global variables as internal i.e. remove them
from experimental section of version map.
Some of them are used in inline APIs, mark those as global.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Slowpath trace APIs are only used in rte_eventdev.c so make them
as internal.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Move memory used by timer adapters to hugepage.
Allocate memory on the first adapter create or lookup to address
both primary and secondary process usecases.
This will prevent TLB misses if any and aligns to memory structure
of other subsystems.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Rearrange fields in rte_event_timer data structure to remove holes.
Also, remove use of volatile from rte_event_timer.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Remove rte_ prefix from rte_eth_event_enqueue_buffer,
rte_event_eth_rx_adapter and rte_event_crypto_adapter
as they are only used in rte_event_eth_rx_adapter.c and
rte_event_crypto_adapter.c
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Hide rte_event_timer_adapter_pmd.h file as it is an internal file.
Remove rte_ prefix from rte_event_timer_adapter_ops structure.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Use new driver interface for the fastpath enqueue/dequeue inline
functions.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Move fastpath inline function pointers from rte_eventdev into a
separate structure accessed via a flat array.
The intention is to make rte_eventdev and related structures private
to avoid future API/ABI breakages.`
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Allocate max space for internal port, port config, queue config and
link map arrays.
Introduce new macro RTE_EVENT_MAX_PORTS_PER_DEV and set it to max
possible value.
This simplifies the port and queue reconfigure scenarios and will
also allow inline functions to refer pointer to internal port data
without extra checking of current number of configured queues.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Create rte_eventdev_core.h and move all the internal data structures
to this file. These structures are mostly used by drivers, but they
need to be in the public header file as they are accessed by datapath
inline functions for performance reasons.
The accessibility of these data structures is not changed.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Mark all the driver specific functions as internal, remove
`rte` prefix from `struct rte_eventdev_ops`.
Remove experimental tag from internal functions.
Remove `eventdev_pmd.h` from non-internal header files.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Added telemetry callbacks to get Rx adapter stats, reset stats and
to get Rx queue config information.
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Added per queue buffer. To configure per queue event buffer size,
application sets rte_event_eth_rx_adapter_params::use_queue_event_buf
flag as true while using rte_event_eth_rx_adapter_create_with_params().
The per queue event buffer size is populated in
rte_event_eth_rx_adapter_queue_conf::event_buf_size and passed
to rte_event_eth_rx_adapter_queue_add().
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Currently event buffer is static array with a default size defined
internally.
To configure event buffer size from application,
rte_event_eth_rx_adapter_create_with_params() API is added which
takes struct rte_event_eth_rx_adapter_params to configure event
buffer size in addition other params. The event buffer size is
rounded up for better buffer utilization and performance. In case
of NULL params argument, default event buffer size is used.
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Added rte_event_eth_rx_adapter_queue_conf_get() API to get rx queue
information - event queue identifier, flags for handling received packets,
scheduler type, event priority, polling frequency of the receive queue
and flow identifier in rte_event_eth_rx_adapter_queue_conf structure
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to register timestamp dynamic field in mbuf.
Update the timestamp in mbuf for each packet before enqueuing
to event device if the timestamp is not already set.
Adding the timestamp in Rx adapter avoids additional latency
due to the event device.
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Include vector configuration into the structure
``rte_event_eth_rx_adapter_queue_conf`` that is used to configure
Rx adapter ethernet device Rx queue parameters.
This simplifies event vector configuration as it avoids splitting
configuration per Rx queue.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Event crypto adapter spec does not mention about cryptodev start and
stop. Cryptodev attached to the adapter should be started before calling
crypto adapter start. Added the same in spec and test application.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Rx adapter uses memove() to move unprocessed events to the beginning of
the packet enqueue buffer. The use memmove() was found to consume good
amount of CPU cycles (about 20%).
This patch removes the use of memove() while implementing a circular
buffer to avoid copying of data. With this change RX adapter is able
to fill the buffer of 16384 events.
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Global devargs syntax is used as device iteration filter like
"class=vdpa", a devargs without bus args is valid from parsing
perspective.
This patch makes bus args optional.
Fixes: d2a66ad794 ("bus: add device arguments name parsing")
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
Slash is used to split global device arguments.
To support path value which contains slash, this patch parses devargs by
locating both slash and layer name key:
bus=a,name=/some/path/class=b,k1=v1/driver=c,k2=v2
"/class=" and "/driver" are valid start of a layer.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
m->nb_seg must be reset on mbuf free whatever the value of m->next,
because it can happen that m->nb_seg is != 1. For instance in this
case:
m1 = rte_pktmbuf_alloc(mp);
rte_pktmbuf_append(m1, 500);
m2 = rte_pktmbuf_alloc(mp);
rte_pktmbuf_append(m2, 500);
rte_pktmbuf_chain(m1, m2);
m0 = rte_pktmbuf_alloc(mp);
rte_pktmbuf_append(m0, 500);
rte_pktmbuf_chain(m0, m1);
As rte_pktmbuf_chain() does not reset nb_seg in the initial m1
segment (this is not required), after this code the mbuf chain
have 3 segments:
- m0: next=m1, nb_seg=3
- m1: next=m2, nb_seg=2
- m2: next=NULL, nb_seg=1
Then split this chain between m1 and m2, it would result in 2 packets:
- first packet
- m0: next=m1, nb_seg=2
- m1: next=NULL, nb_seg=2
- second packet
- m2: next=NULL, nb_seg=1
Freeing the first packet will not restore nb_seg=1 in the second
segment. This is an issue because it is expected that mbufs stored
in pool have their nb_seg field set to 1.
Fixes: 8f094a9ac5 ("mbuf: set mbuf fields while in pool")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Use correct define for the name array size. The change breaks ABI and
hence cannot be backported to stable branches.
Fixes: 38c9817ee1 ("mempool: adjust name size in related data types")
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The macros RTE_BIT32 and RTE_BIT64 are used to replace bit shifts.
The macro UINT64C is also used to replace remaining occurrences of ULL.
The bit shifts of ETH_RSS_LEVEL_* are kept for aesthetic reason.
The API of rte_mtr and rte_tm is using enums for 64-bit variables.
As they are enums, unsigned bit cannot be used.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Ethernet device must be stopped first before close in accordance
with the documentation.
Fixes: 980995f8cc ("ethdev: improve API comments of close and detach functions")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
1. Introduction and Retrospective
Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.
The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?
Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:
- only fixed size network header can be represented
- the entire network header pattern of fixed format
(header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
might be triggered), and actually is not supported
by existing PMDs
- no explicitly specified relations with preceding
and following items
- no tunnel hint support
As the result, implementing the support for tunnel protocols
like aforementioned GENEVE with variable extra protocol option
with flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).
This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.
2. Flex Item Life Cycle
Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:
- header format
- header length, (can be variable, depending on options)
- potential presence of extra options following or included
in the header the header
- the relations with preceding protocols. For example,
the GENEVE follows UDP, eCPRI can follow either UDP
or L2 header
- the relations with following protocols. For example,
the next layer after tunnel header can be L2 or L3
- whether the new protocol is a tunnel and the header
is a splitting point between outer and inner layers
The supposed way to operate with flex item:
- application defines the header structures according to
protocol specification
- application calls rte_flow_flex_item_create() with desired
configuration according to the protocol specification, it
creates the flex item object over specified ethernet device
and prepares PMD and underlying hardware to handle flex
item. On item creation call PMD backing the specified
ethernet device returns the opaque handle identifying
the object has been created
- application uses the rte_flow_item_flex with obtained handle
in the flows, the values/masks to match with fields in the
header are specified in the flex item per flow as for regular
items (except that pattern buffer combines all fields)
- flows with flex items match with packets in a regular fashion,
the values and masks for the new protocol header match are
taken from the flex items in the flows
- application destroys flows with flex items
- application calls rte_flow_flex_item_release() as part of
ethernet device API and destroys the flex item object in
PMD and releases the engaged hardware resources
3. Flex Item Structure
The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol item was configured
for.
struct rte_flow_item_flex {
void *handle;
uint32_t length;
const uint8_t* pattern;
};
The handle is some opaque object maintained on per device basis
by underlying driver.
The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.
The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much suitable for
the application to maintain the single structure within the
single pattern buffer.
4. Flex Item Configuration
The flex item configuration consists of the following parts:
- header field descriptors:
- next header
- next protocol
- sample to match
- input link descriptors
- output link descriptors
The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.
The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.
There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.
The sample fields are used to represent the data be sampled
from the packet and then matched with established flows.
There are several methods supposed to calculate field offset
in runtime depending on configuration and packet content:
- FIELD_MODE_FIXED - fixed offset. The bit offset from
header beginning is permanent and defined by field_base
configuration parameter.
- FIELD_MODE_OFFSET - the field bit offset is extracted
from other header field (indirect offset field). The
resulting field offset to match is calculated from as:
field_base + (*offset_base & offset_mask) << offset_shift
This mode is useful to sample some extra options following
the main header with field containing main header length.
Also, this mode can be used to calculate offset to the
next protocol header, for example - IPv4 header contains
the 4-bit field with IPv4 header length expressed in dwords.
One more example - this mode would allow us to skip GENEVE
header variable length options.
- FIELD_MODE_BITMASK - the field bit offset is extracted
from other header field (indirect offset field), the latter
is considered as bitmask containing some number of one bits,
the resulting field offset to match is calculated as:
field_base + bitcount(*offset_base & offset_mask) << offset_shift
This mode would be useful to skip the GTP header and its
extra options with specified flags.
- FIELD_MODE_DUMMY - dummy field, optionally used for byte
boundary alignment in pattern. Pattern mask and data are
ignored in the match. All configuration parameters besides
field size and offset are ignored.
Note: "*" - means the indirect field offset is calculated
and actual data are extracted from the packet by this
offset (like data are fetched by pointer *p from memory).
The offset mode list can be extended by vendors according to
hardware supported options.
The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
Input link specified the preceding header pattern, for example
for GENEVE it can be UDP item specifying match on destination
port with value 6081. The flex item can follow multiple header
types and multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and driver will select the correct flex
item settings, depending on the actual flow pattern.
The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.
The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.
5. Flex Item Chaining
If there are multiple protocols supposed to be supported with
flex items in chained fashion - two or more flex items within
the same flow and these ones might be neighbors in the pattern,
it means the flex items are mutual referencing. In this case,
the item that occurred first should be created with empty
output link list or with the list including existing items,
and then the second flex item should be created referencing
the first flex item as input arc, drivers should adjust
the item configuration.
Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage not overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.
6. Example of New Protocol Handling
Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:
struct new_protocol_header {
rte_be32 header_length; /* length in dwords, including options */
rte_be32 specific0; /* some protocol data, no intention */
rte_be32 specific1; /* to match in flows on these fields */
rte_be32 crucial; /* data of interest, match is needed */
rte_be32 options[0]; /* optional protocol data, variable length */
};
The supposed flex item configuration:
struct rte_flow_item_flex_field field0 = {
.field_mode = FIELD_MODE_DUMMY, /* Affects match pattern only */
.field_size = 96, /* three dwords from the beginning */
};
struct rte_flow_item_flex_field field1 = {
.field_mode = FIELD_MODE_FIXED,
.field_size = 32, /* Field size is one dword */
.field_base = 96, /* Skip three dwords from the beginning */
};
struct rte_flow_item_udp spec0 = {
.hdr = {
.dst_port = RTE_BE16(0xFADE),
}
};
struct rte_flow_item_udp mask0 = {
.hdr = {
.dst_port = RTE_BE16(0xFFFF),
}
};
struct rte_flow_item_flex_link link0 = {
.item = {
.type = RTE_FLOW_ITEM_TYPE_UDP,
.spec = &spec0,
.mask = &mask0,
};
struct rte_flow_item_flex_conf conf = {
.next_header = {
.tunnel = FLEX_TUNNEL_MODE_SINGLE,
.field_mode = FIELD_MODE_OFFSET,
.field_base = 0,
.offset_base = 0,
.offset_mask = 0xFFFFFFFF,
.offset_shift = 2 /* Expressed in dwords, shift left by 2 */
},
.sample = {
&field0,
&field1,
},
.nb_samples = 2,
.input_link[0] = &link0,
.nb_inputs = 1
};
Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:
struct new_protocol_header spec_pattern =
{
.crucial = RTE_BE32(0x00112233),
};
struct new_protocol_header mask_pattern =
{
.crucial = RTE_BE32(0xFFFFFFFF),
};
struct rte_flow_item_flex spec_flex = {
.handle = 0x123456789A
.length = sizeiof(struct new_protocol_header),
.pattern = &spec_pattern,
};
struct rte_flow_item_flex mask_flex = {
.length = sizeof(struct new_protocol_header),
.pattern = &mask_pattern,
};
struct rte_flow_item item_to_match = {
.type = RTE_FLOW_ITEM_TYPE_FLEX,
.spec = &spec_flex,
.mask = &mask_flex,
};
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.
There is the new Flow items with variable lengths coming - flex
item. In order to handle this item (and potentially other new ones
with variable pattern length) in flow copy and conversion routines
the helper function is introduced.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Both 'rte_eth_dev_configure()' & 'rte_eth_dev_set_mtu()' sets MTU but
have slightly different checks. Like one checks min MTU against
RTE_ETHER_MIN_MTU and other RTE_ETHER_MIN_LEN.
Checks moved into common function to unify the checks. Also this has
benefit to have common error logs.
Default 'dev_info->min_mtu' (the one set by ethdev if driver doesn't
provide one), changed to ('RTE_ETHER_MIN_LEN' - overhead). Previously it
was 'RTE_ETHER_MIN_MTU' which is min MTU for IPv4 packets. Since the
intention is to provide min MTU corresponding minimum frame size, new
default value suits better.
Suggested-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Removing 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag.
Instead of drivers announce this capability, application can deduct the
capability by checking reported 'dev_info.max_mtu' or
'dev_info.max_rx_pktlen'.
And instead of application setting this flag explicitly to enable jumbo
frames, this can be deduced by driver by comparing requested 'mtu' to
'RTE_ETHER_MTU'.
Removing this additional configuration for simplification.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Move requested MTU value check to the API to prevent the duplicated
code.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Setting MTU bigger than RTE_ETHER_MTU requires the jumbo frame support,
and application should enable the jumbo frame offload support for it.
When jumbo frame offload is not enabled by application, but MTU bigger
than RTE_ETHER_MTU is requested there are two options, either fail or
enable jumbo frame offload implicitly.
Enabling jumbo frame offload implicitly is selected by many drivers
since setting a big MTU value already implies it, and this increases
usability.
This patch moves this logic from drivers to the library, both to reduce
the duplicated code in the drivers and to make behaviour more visible.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
There is a confusion on setting max Rx packet length, this patch aims to
clarify it.
'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.
Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
stored into '(struct rte_eth_dev)->data->mtu'.
These two APIs are related but they work in a disconnected way, they
store the set values in different variables which makes hard to figure
out which one to use, also having two different method for a related
functionality is confusing for the users.
Other issues causing confusion is:
* maximum transmission unit (MTU) is payload of the Ethernet frame. And
'max_rx_pkt_len' is the size of the Ethernet frame. Difference is
Ethernet frame overhead, and this overhead may be different from
device to device based on what device supports, like VLAN and QinQ.
* 'max_rx_pkt_len' is only valid when application requested jumbo frame,
which adds additional confusion and some APIs and PMDs already
discards this documented behavior.
* For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory
field, this adds configuration complexity for application.
As solution, both APIs gets MTU as parameter, and both saves the result
in same variable '(struct rte_eth_dev)->data->mtu'. For this
'max_rx_pkt_len' updated as 'mtu', and it is always valid independent
from jumbo frame.
For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user
request and it should be used only within configure function and result
should be stored to '(struct rte_eth_dev)->data->mtu'. After that point
both application and PMD uses MTU from this variable.
When application doesn't provide an MTU during 'rte_eth_dev_configure()'
default 'RTE_ETHER_MTU' value is used.
Additional clarification done on scattered Rx configuration, in
relation to MTU and Rx buffer size.
MTU is used to configure the device for physical Rx/Tx size limitation,
Rx buffer is where to store Rx packets, many PMDs use mbuf data buffer
size as Rx buffer size.
PMDs compare MTU against Rx buffer size to decide enabling scattered Rx
or not. If scattered Rx is not supported by device, MTU bigger than Rx
buffer size should fail.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that a unsigned char
pointer may alias while the C standard doesn't include such requirement
for uint8_t pointers.
Also simplified the loop since a modern C compiler can speed up (i.e.
auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.
Fixes: 6006818cfb ("net: new checksum functions")
Fixes: e079655c41 ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org
Signed-off-by: Georg Sauthoff <mail@gms.tf>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The driver may change offloads info into dev->data->dev_conf
in dev_configure which may cause apps use outdated values.
Add a new API to get actual device configuration.
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
RTE IPv4 header definition combines the `version' and `ihl' fields
into a single structure member.
This patch introduces dedicated structure members for both `version'
and `ihl' IPv4 fields. Separated header fields definitions allow to
create simplified code to match on the IHL value in a flow rule.
The original `version_ihl' structure member is kept for backward
compatibility.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
EXPERIMENTAL tag was missed in rte_flow_action_modify_data
structure description.
Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
The generic modify field flow action introduced in [1] has
some issues related to the immediate source operand:
- immediate source can be presented either as an unsigned
64-bit integer or pointer to data pattern in memory.
There was no explicit pointer field defined in the union.
- the byte ordering for 64-bit integer was not specified.
Many fields have shorter lengths and byte ordering
is crucial.
- how the bit offset is applied to the immediate source
field was not defined and documented.
- 64-bit integer size is not enough to provide IPv6
addresses.
In order to cover the issues and exclude any ambiguities
the following is done:
- introduce the explicit pointer field
in rte_flow_action_modify_data structure
- replace the 64-bit unsigned integer with 16-byte array
- update the modify field flow action documentation
Appropriate deprecation notice has been removed.
[1] commit 73b68f4c54 ("ethdev: introduce generic modify flow action")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Not all DPDK ports in a given switching domain may have the
privilege to manage "transfer" flows. Add an API to find a
port with sufficient privileges by any port in the domain.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Attributes "ingress" and "egress" can only apply unambiguosly
to non-"transfer" flows. In "transfer" flows, the standpoint
is effectively shifted to the embedded switch. There can be
many different endpoints connected to the switch, so the
use of "ingress" / "egress" does not shed light on which
endpoints precisely can be considered as traffic sources.
Add relevant deprecation notices and suggest the use of precise
traffic source items (PORT_REPRESENTOR and REPRESENTED_PORT).
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
PF, VF and PHY_PORT require that applications have extra
knowledge of the underlying NIC and thus are hard to use.
Also, the corresponding items depend on the direction
attribute (ingress / egress), which complicates their
use in applications and interpretation in PMDs.
The concept of PORT_ID is ambiguous as it doesn't say whether
the port in question is an ethdev or the represented entity.
Items and actions PORT_REPRESENTOR, REPRESENTED_PORT
should be used instead.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
For use in "transfer" flows. Supposed to send matching traffic to the
entity represented by the given ethdev, at embedded switch level.
Such an entity can be a network (via a network port), a guest
machine (via a VF) or another ethdev in the same application.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
For use in "transfer" flows. Supposed to send matching traffic to
the given ethdev (to the application), at embedded switch level.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
For use in "transfer" flows. Supposed to match traffic entering the
embedded switch from the entity represented by the given ethdev.
Such an entity can be a network (via a network port), a guest
machine (via a VF) or another ethdev in the same application.
Must not be combined with direction attributes.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
For use in "transfer" flows. Supposed to match traffic
entering the embedded switch from the given ethdev.
Must not be combined with direction attributes.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Move rte_eth_dev, rte_eth_dev_data, rte_eth_rxtx_callback and related
data into private header (ethdev_driver.h).
Few minor changes to keep DPDK building after that.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Introduce rte_eth_macaddrs_get() to allow user to retrieve all ethernet
addresses assigned to given port.
Change testpmd to use this new function and avoid referencing directly
rte_eth_devices[].
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Rework fast-path ethdev functions to use rte_eth_fp_ops[].
While it is an API/ABI breakage, this change is intended to be
transparent for both users (no changes in user app is required) and
PMD developers (no changes in PMD is required).
One extra thing to note - RX/TX callback invocation will cause extra
function call with these changes. That might cause some insignificant
slowdown for code-path where RX/TX callbacks are heavily involved.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Copy public function pointers (rx_pkt_burst(), etc.) and related
pointers to internal data from rte_eth_dev structure into a
separate flat array. That array will remain in a public header.
The intention here is to make rte_eth_dev and related structures internal.
That should allow future possible changes to core eth_dev structures
to be transparent to the user and help to avoid ABI/API breakages.
The plan is to keep minimal part of data from rte_eth_dev public,
so we still can use inline functions for fast-path calls
(like rte_eth_rx_burst(), etc.) to avoid/minimize slowdown.
The whole idea beyond this new schema:
1. PMDs keep to setup fast-path function pointers and related data
inside rte_eth_dev struct in the same way they did it before.
2. Inside rte_eth_dev_start() and inside rte_eth_dev_probing_finish()
(for secondary process) we call eth_dev_fp_ops_setup, which
copies these function and data pointers into rte_eth_fp_ops[port_id].
3. Inside rte_eth_dev_stop() and inside rte_eth_dev_release_port()
we call eth_dev_fp_ops_reset(), which resets rte_eth_fp_ops[port_id]
into some dummy values.
4. fast-path ethdev API (rte_eth_rx_burst(), etc.) will use that new
flat array to call PMD specific functions.
That approach should allow us to make rte_eth_devices[] private
without introducing regression and help to avoid changes in drivers code.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Currently majority of fast-path ethdev ops take pointers to internal
queue data structures as an input parameter.
While eth_rx_queue_count() takes a pointer to rte_eth_dev and queue
index.
For future work to hide rte_eth_devices[] and friends it would be
plausible to unify parameters list of all fast-path ethdev ops.
This patch changes eth_rx_queue_count() to accept pointer to internal
queue data as input parameter.
While this change is transparent to user, it still counts as an ABI change,
as eth_rx_queue_count_t is used by ethdev public inline function
rte_eth_rx_queue_count().
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
At queue configure stage always allocate space for maximum possible
number (RTE_MAX_QUEUES_PER_PORT) of queue pointers.
That will allow 'fast' inline functions (eth_rx_burst, etc.) to refer
pointer to internal queue data without extra checking of current number
of configured queues.
That would help in future to hide rte_eth_dev and related structures.
It means that from now on, each ethdev port will always consume:
((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT)
bytes of memory for its queue pointers.
With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.
Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
flow API responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.
The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by mbuf library.
It is also out of scope of the new API in question.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Indirect actions should be used to do shared counters.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The patch is required for all PMDs which do not provide representors
info on the representor itself.
The function, rte_eth_representor_id_get(), is used in
eth_representor_cmp() which is required in ethdev class iterator to
search ethdev port ID by name (representor case). Before the patch
the function is called on the representor itself and tries to get
representors info to match.
Search of port ID by name is used after hotplug to find out port ID
of the just plugged device.
Getting a list of representors from a representor does not make sense.
Instead, a backer device should be used.
To this end, extend the rte_eth_dev_data structure to include the port ID
of the backing device for representors.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
rte_eth_rx_descriptor_status() should be used as a replacement.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
In struct rte_security_ipsec_sa_options, for every new option
added, there is an ABI breakage, to avoid, a reserved_opts
bitfield is added to for the remaining bits available in the
structure.
Now for every new sa option, these reserved_opts can be reduced
and new option can be added.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
rte_security_dynfield_register() is an internal
API to be used by the driver, hence moving it to internal.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Added device information to capture explicitly the assumption
of the input/output data byte endianness being processed.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
If no next segment available the “for” loop will fail and it still
returns i+1 i.e. 2, which is wrong as it has filled only 1 buffer.
Fixes: 7adf992fb9 ("cryptodev: introduce CPU crypto API")
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
The structure rte_crypto_sym_vec is updated to
add dest_sgl to support out of place processing.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The current crypto raw data vectors is extended to support
rte_security usecases, where we need total data length to know
how much additional memory space is available in buffer other
than data length so that driver/HW can write expanded size
data after encryption.
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch renames the sgl to src_sgl in struct rte_crypto_sym_vec
to help differentiating between source and destination sgl.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add telemetry support for ipsec SAs.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Add ESP tunnel type to the tunnel types list that can be specified
for TSO or checksum on the inner part of tunnel packets.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add support for the IPsec NAT-Traversal use case for Tunnel mode
packets.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Add support for specifying UDP port params for UDP encapsulation option.
RFC3948 section-2.1 does not enforce using specific the UDP ports for
UDP-Encapsulated ESP Header
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Added support for AES_CCM, CHACHA20_POLY1305 and AES_GMAC.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Update ipsec_xform definition to include ESN field.
This allows the application to control the ESN starting value.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
As described in [1] and as announced in [2], The field ``dataunit_len``
of the ``struct rte_crypto_cipher_xform`` moved to the end of the
structure and extended to ``uint32_t``.
In this way, sizes bigger than 64K bytes can be supported for data-unit
lengths.
[1] commit d014dddb2d ("cryptodev: support multiple cipher
data-units")
[2] commit 9a5c09211b ("doc: announce extension of crypto data-unit
length")
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
As reported by Dmitry, RTE_MEMPOOL_F_POOL_CREATED is a flag only
manipulated internally.
This flag is not supposed to be requested from an application and would
probably result in an incorrect behavior if an application did pass it.
At least one other internal flag has been added recently and more may be
introduced later.
Rework the check and export a mask of valid user flags for use in the
unit test.
Fixes: b240af8b10 ("mempool: enforce valid flags at creation")
Reported-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
MEMPOOL_PG_NUM_DEFAULT and MEMPOOL_PG_SHIFT_MAX are not used.
Fixes: fd943c764a ("mempool: deprecate xmem functions")
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add RTE_ prefix to macro used to register mempool driver.
The old one is still available but deprecated.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add RTE_ prefix to helper macro to calculate mempool header size and
make it internal. Old macro is still available, but deprecated.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add RTE_ prefix to internal API defined in public header.
Use the prefix instead of double underscore.
Use uppercase for macros in the case of name conflict.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Fix the mempool flags namespace by adding an RTE_ prefix to the name.
The old flags remain usable, to be deprecated in the future.
Flag MEMPOOL_F_NON_IO added in the release is just renamed to have RTE_
prefix.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Move documentation into a separate line just before define.
Prepare to have a bit longer flag name because of namespace prefix.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>