The NIC can have multiple PCIe links and can be attached to the multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. To provide the unambiguous identification
of the PCIe function the controller index is added. The full representor
identifier consists of three indices - controller index, PF index, and
VF or SF index (if any).
This patch introduces controller index to ethdev representor syntax,
examples:
[[c#]pf#]vf#: VF port representor/s, example: pf0vf1
[[c#]pf#]sf#: SF port representor/s, example: c1pf1sf[0-3]
c# is controller(host) ID/range in case of multi-host, optional.
For user application (e.g. OVS), PMD is responsible to interpret and
locate representor device based on controller ID, PF ID and VF/SF ID in
representor syntax.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
With Kernel bonding, multiple underlying PFs are bonded, VFs come
from different PF, need to identify representor of VFs unambiguously by
adding PF index.
This patch introduces optional 'pf' section to representor devargs
syntax, examples:
representor=pf0vf0 - single VF representor
representor=pf[0-1]sf[0-1023] - SF representors from 2 PFs
PF type representor is supported by using standalone 'pf' section:
representor=pf1 - PF representor
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
This patch updates kvargs parser to support value of multiple lists or
ranges:
k1=v[1,2]v[3-5]
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
SubFunction is a portion of the PCI device, created on demand, a SF
netdev has its own dedicated queues(txq, rxq). A SF netdev supports
eswitch representation offload similar to existing PF and VF
representors.
To support SF representor, this patch introduces new devargs syntax,
examples:
representor=sf0 - single SubFunction representor
representor=sf[1,3,5] - single list
representor=sf[0-3], - single range
representor=sf[0,2-6,8,10-12] - list with singles and ranges
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Current VF representor syntax:
representor=2 - single representor
representor=[0-3] - single range
To prepare for more representor types, this patch adds compatible VF
representor devargs syntax:
vf#:
representor=vf2 - single representor
representor=vf[1,3,5] - single list
representor=vf[0-3] - single range
representor=vf[0,1,4-7] - list with singles and range
For backwards compatibility, representor "#" is interpreted as "vf#".
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To the extended representor syntax which need to reuse the value parsing
function for controller and PF section, this patch refactors the port
list parsing.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To support more representor type, this patch introduces representor type
enum. The enum is subject to be extended to support new representor in
patches upcoming.
For each devarg structure, only one type supported.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
The comment got it wrong. The payload length field
does not include the fixed IPv6 header size.
Fixes: 7eca7f7fd0 ("net: add missing endianness annotations")
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Some functions were introduced in DPDK 21.05 to query the version parts
(prefix, year, month, minor, suffix, release) at runtime.
Per guidelines, these new public functions must be marked with
__rte_experimental and ABI versioned as EXPERIMENTAL.
Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The macro RTE_VERSION was broken since updated with function calls.
It is a build-time version number, and must be built with macros.
For a run-time version number, there is the function rte_version().
Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
The hard-coded thread priority for Windows threads in EAL
is REALTIME_PRIORITY_CLASS/THREAD_PRIORITY_TIME_CRITICAL.
This results in issues with DPDK threads causing OS thread starvation
and eventually a bugcheck.
The fix reduce the thread priority to
NORMAL_PRIORITY_CLASS/THREAD_PRIORITY_NORMAL.
Bugzilla ID: 600
Fixes: 53ffd9f080 ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org
Reported-by: Odi Assli <odia@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
This patch allows the same set of rte_metrics_tel_* functions to be
exported no matter JANSSON is available or not, by doing following:
1. Leverage dpdk_conf to set configuration flag RTE_HAS_JANSSON
when Jansson dependency is found.
2. In rte_metrics_telemetry.c, leverage RTE_HAS_JANSSON to handle the
case when JANSSON is not available by adding stubs for all the instances.
3. In meson.build, per dpdk/doc/guides/rel_notes/release_20_05.rst,
it is claimed that "Telemetry library is no longer dependent on the
external Jansson library, which allows Telemetry be enabled by default.",
thus make the deps and includes of Telemetry as not conditional anymore.
Signed-off-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
In Linux by default DPDK log goes to stdout, as well as syslog.
It is possible for an application to change the library output stream
via 'rte_openlog_stream()' API, to set it to stderr, it can be used as:
rte_openlog_stream(stderr);
But still updating the default log output to 'stderr'.
Bugzilla ID: 8
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
For using a DPDK application, such as OVS, which is dynamically linked, the
DPDK version in use should always report the actual version, not the
version used at build time. This incorrect behaviour can be seen by
building OVS against one version of DPDK and running it against a later
one. Using "ovs-vsctl list Open_vSwitch" to query basic info, the
dpdk_version returned will be the build version not the currently running
one - which can be verified using the DPDK telemetry library client.
$ sudo ovs-vsctl list Open_vSwitch | grep dpdk_version
dpdk_version : "DPDK 20.11.0-rc4"
$ echo quit | sudo dpdk-telemetry.py
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.02.0-rc2", "pid": 405659, "max_output_len": 16384}
-->
To fix this, we need to convert the rte_version() function, and any other
necessary parts of the rte_version.h, to be actual functions in EAL, not
just inlines/macros. The only complication in doing so is that telemetry
library cannot call rte_version() directly, and instead needs the version
string passed in on init.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Field IDs for the MODIFY_FIELD action lack doxygen comments
and not visible in online DPDK documentation because of that.
Provide a meaningful description for every Field ID for the
rte_flow_field_id enumeration.
Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Rename PKT_RX_EIP_CKSUM_BAD to PKT_RX_OUTER_IP_CKSUM_BAD and
deprecate the original name. The new name is better aligned
with existing PKT_RX_OUTER_* flags, which should help reduce
confusion about its use.
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
When file truncation fails, the log message attempts to print a path of
file we failed to truncate, but this path was never set to anything and,
what's worse, was uninitialized. Fix it by passing path from the caller.
Coverity issue: 366122
Fixes: c44d09811b ("eal: add shared indexed file-backed array")
Cc: stable@dpdk.org
Reported-by: Andrew Boyer <aboyer@pensando.io>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Return a PPC specific value for get_tsc_freq_arch() rather than
depending on the EAL framework to estimate the frequency.
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Currently, we don't detach the shared memory on EAL cleanup, which
leaves the page table descriptors still holding on to the file
descriptors as well as memory space occupied by them. Fix it by adding
another detach stage that closes the internal memory allocator resource
references, detaches shared fbarrays and unmaps the shared mem config.
Bugzilla ID: 380
Bugzilla ID: 381
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This is causing build error, like:
https://travis-ci.com/github/ovsrobot/dpdk/jobs/482121104
Also '@internal' marker removed from doxygen comment, since public API
should not be internal.
Experimental tag removed from 'rte_power_guest_channel_send_msg()'
Fixes: 4d3892dcd7 ("power: make channel message functions public")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Partial unmapping is not supported for VFIO IOMMU type1
by kernel. Though kernel gives return as zero, the unmapped size
returned will not be same as expected. So check for
returned unmap size and return error.
For IOVA as PA, DMA mapping is already at memseg size
granularity. Do the same even for IOVA as VA mode as
DMA map/unmap triggered by heap allocations,
maintain granularity of memseg page size so that heap
expansion and contraction does not have this issue.
For user requested DMA map/unmap disallow partial unmapping
for VFIO type1.
Fixes: 73a6390859 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
In order to save DMA entries limited by kernel both for external
memory and hugepage memory, an attempt was made to map physically
contiguous memory in one go. This cannot be done as VFIO IOMMU type1
does not support partially unmapping a previously mapped memory
region while Heap can request for multi page mapping and
partial unmapping.
Hence for going back to old method of mapping/unmapping at
memseg granularity, this commit reverts
commit d1c7c0cdf7 ("vfio: map contiguous areas in one go")
Also add documentation on what module parameter needs to be used
to increase the per-container dma map limit for VFIO.
Fixes: d1c7c0cdf7 ("vfio: map contiguous areas in one go")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
When vhost is doing dequeue offloading, it parses ethernet and L3/L4
headers of the packet. Then vhost will set corresponding value in mbuf
attributes. It means offloading action should be after packet data copy.
Fixes: 75ed516978 ("vhost: add packed ring batch dequeue")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Clarify what is the scope and impact of the UDP port tunnel API.
There are still missing infos to be improved in future:
- no capability flag
- dependency between ports of the same device
- required privilege
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
When checking the loading of EAL shared lib to see if we have a shared
DPDK build, we only want to include part of the ABI version in the check
rather than the whole thing. For example, with ABI version 21.1 for DPDK
release 21.02, the linker links the binary against librte_eal.so.21,
without the ".1".
To avoid any further brittleness in this area, we can check for multiple
versions when doing the check, since just about any version of EAL implies
a shared build. Therefore we check for presence of librte_eal.so with full
ABI_VERSION extension, and then repeatedly remove the end part of the
filename after the last dot, checking each time. For example (debug log
output for static build):
EAL: Checking presence of .so 'librte_eal.so.21.1'
EAL: Checking presence of .so 'librte_eal.so.21'
EAL: Checking presence of .so 'librte_eal.so'
EAL: Detected static linkage of DPDK
Fixes: 7781950f4d ("eal: fix shared lib mode detection")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
The "rte_telemetry_init()" function is for use by "rte_eal_init()" and
should not be part of the public API. Mark it as internal only.
Fixes: 6dd571fd07 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
There is no need for the direct inclusion of the generic/ header [1]
now that we don't use the rte_atomic API anymore.
It was the last case of direct inclusion of the generic/ headers,
so the flag -Wno-unused-function can be dropped.
1: https://git.dpdk.org/dpdk/commit/?id=3eb860b08eb7
Fixes: e41d27a68d ("mbuf: remove atomic reference counters")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When doing a mempool dump or an audit, the application can panic because
the length of the cache is greater than the flush threshold, which is
seen as a fatal error. But this can temporarily happen when the mempool
is in use.
Fix the panic condition to abort only when the cache length is greater
than the array.
Fixes: ea5dd2744b ("mempool: cache optimisations")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
The PMD info get API has a void return type. Remove the
@return 0 Success doxygen comment as it doesn't make sense here.
Fixes: 5223a1f3b8 ("eventdev: define southbound driver interface")
Cc: stable@dpdk.org
Reported-by: Fredrik A Lindgren <fredrik.a.lindgren@tietoevry.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
vhost_new_device might be called in different threads at
the same time.
thread 1(config thread)
rte_vhost_driver_start
->vhost_user_start_client
->vhost_user_add_connection
-> vhost_new_device
thread 2(vhost-events)
vhost_user_read_cb
->vhost_user_msg_handler (return value < 0)
-> vhost_user_start_client
-> vhost_new_device
So there could be a case that a same vid has been allocated
twice, or some vid might be lost in DPDK lib however still
held by the upper applications.
Another place where race would happen is at the func
*vhost_destroy_device*, but after a detailed investigation,
the race does not exist as long as no two devices have the
same vid: Calling vhost_destroy_devices in different
threads with different vids is actually safe.
Fixes: a277c71598 ("vhost: refactor code structure")
Cc: stable@dpdk.org
Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Previous fix used `rte_malloc_heap_socket_is_external()` to check if the
heap was an external heap. However, that API is thread-safe, and when
we're inside the allocation process, we're already write-locked, so
calling `rte_malloc_heap_socket_is_external()` will result in a
deadlock followed by a timeout.
Fix it by replacing the API call with a check against maximum number of
NUMA nodes, because external heaps always have higher socket ID's.
Fixes: 7ac31e82bc ("mem: improve parameter checking on memory hotplug")
Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch defines new RSS offload types for MPLS. The distribution
will on the basis of MPLS tag.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
IPv6 DSCP field ID is missing from the original list of Field IDs
for MODIFY_FIELD action. Add it to support IPv6 header fully.
Add ipv6_dscp option for the corresponding header field in testpmd.
Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
If a failure happens when closing a port,
it was unnecessarily failing again in the function eth_err(),
because of a check against HW removal cause.
Indeed there is a big chance the port is released at this point.
Given the port is in the middle (or at the end) of a close process,
checking the error cause by accessing the port is a non-sense.
The error check is replaced by a simple return in the close function.
Bugzilla ID: 624
Fixes: 8a5a0aad5d ("ethdev: allow close function to return an error")
Cc: stable@dpdk.org
Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Different hardware gathers statistics differently, so some general
rules need to be established.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
To verify that all DPDK headers are ok for inclusion directly in a C file,
and are not missing any other pre-requisite headers, we can auto-generate
for each header an empty C file that includes that header. Compiling these
files will throw errors if any header has unmet dependencies.
For some libraries, there may be some header files which are not for direct
inclusion, but rather are to be included via other header files. To allow
later checking of these files for missing includes, we separate out the
indirect include files from the direct ones.
To ensure ongoing compliance, we enable this build test as part of the
default x86 build in "test-meson-builds.sh".
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The rte_eventdev_pmd*.h files are for drivers only and should be private
to DPDK, and not installed for app use.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
The rte_rib6 header was using RTE_MIN macro from rte_common.h but not
including the header file.
Fixes: f7e861e21c ("rib: support IPv6")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
The rte_power_guest_channel.h file did not include its dependent
headers, so add them.
Fixes: 5f443cc0f9 ("power: create guest channel public header file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Clang does not have an "error" attribute for functions, so for marking
internal functions we need to check for the error attribute, and provide
a fallback if it is not present. For clang, we can use "diagnose_if"
attribute, similarly checking for its presence before use.
Fixes: fba5af82ad ("eal: add internal ABI tag definition")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Some parameters of typedef'ed function pointers were not properly listed
in the doxygen comments.
The error is seen with doxygen 1.9 which added this specific check:
https://github.com/doxygen/doxygen/commit/d34236ba4037
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Add a simple on/off switch that will enable saving power when no
packets are arriving. It is based on counting the number of empty
polls and, when the number reaches a certain threshold, entering an
architecture-defined optimized power state that will either wait
until a TSC timestamp expires, or when packets arrive.
This API mandates a core-to-single-queue mapping (that is, multiple
queued per device are supported, but they have to be polled on different
cores).
This design is using PMD RX callbacks.
1. UMWAIT/UMONITOR:
When a certain threshold of empty polls is reached, the core will go
into a power optimized sleep while waiting on an address of next RX
descriptor to be written to.
2. TPAUSE/Pause instruction
This method uses the pause (or TPAUSE, if available) instruction to
avoid busy polling.
3. Frequency scaling
Reuse existing DPDK power library to scale up/down core frequency
depending on traffic volume.
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Currently, the API documentation is ambiguous as to what happens when
certain conditions are met. Document the behavior explicitly, as well as
fix some typos and outdated comments.
Fixes: 6a17919b0e ("eal: change power intrinsics API")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The `data_sz` name is fine, but it looks out of place because nothing
else has "data" prefix in that structure. Rename it to "size", as well
as add more clarity to the comments around each struct member.
Fixes: 6a17919b0e ("eal: change power intrinsics API")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
For legacy modes, rename ring_generic/c11 to ring_generic/c11_pvt.
Furthermore, add new file ring_elem_pvt.h which includes ring_do_eq/deq
and ring element copy/delete APIs.
The update_tail internal helper has been prefixed with the library prefix.
For other modes, rename xx_c11_mem to xx_elem_pvt. Move all private APIs
into these new header files.
Finally, the external APIs and internal APIs will be separated from each
other. This can remind users not to use internal APIs and make ring
library easier to maintain.
Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
re-organise the including of the new public header file and
remove un-needed includes
Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Adjust meson.build so that 'ninja install' copies the new header
file into the installation directory.
Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Rename the #defines to have an RTE_POWER_ prefix
Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Rename the public structs to have an rte_power_ prefix.
Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Move the 2 public functions into rte_power_guest_channel.h
Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
In preparation for making the header file public, we first rename
channel_commands.h as rte_power_guest_channel.h.
Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Currently, we don't check anything that comes in through memory hotplug
subsystem using the IPC, because we always assume the data is correct.
This is okay as anyone having access to the IPC socket would also have
rights to crash the DPDK process through other means, but it's still a
good practice to do parameter checking, so fix the code to do that.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Compiling with "meson build -Dbuildtype=debug --cross-file
config/arm/arm64_thunderx2_linux_gcc" shows the warnings
"function returns an aggregate [-Waggregate-return]":
../../dpdk/lib/librte_eal/arm/include/rte_atomic_64.h: In
function ‘__cas_128_relaxed’:
../../dpdk/lib/librte_eal/arm/include/rte_atomic_64.h:81:20:
error: function returns an aggregate [-Werror=aggregate-return]
__ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
^~~~~~~~~~~~~~~~~
Fix the compiling issue by defining __ATOMIC128_CAS_OP as a void
function and passing the address pointer into it.
Fixes: 7e2c3e17fe ("eal/arm64: add 128-bit atomic compare exchange")
Cc: stable@dpdk.org
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Meson can use cmake as a fallback for detecting packages, and this can
lead to picking up 64-libs for 32-bit builds. To work around this, force
the use of pkg-config only for detecting libcrypto, zlib, jansson and
other package dependencies.
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: Liron Himi <lironh@marvell.com>
Tested-by: Lee Daly <lee.daly@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Martin Spinler <spinler@cesnet.cz>
The rte_compat header file is needed for the '__rte_experimental' macro.
Fixes: f00708c2aa ("node: add IPv4 rewrite and lookup control")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The global variable "tel_met_data" was declared in a header file, rather
than in a C file, leading to duplicate definitions if more than one C
file included the header.
Fixes: c5b7197f66 ("telemetry: move some functions to metrics library")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The stdio.h header needs to be included to get the definition of the
FILE type.
Fixes: b32c0a2c5e ("pipeline: add SWX table update high level API")
Fixes: 3ca60ceed7 ("pipeline: add SWX pipeline specification file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The rte_lru_x86.h header, included from the main rte_lru.h header, uses
the RTE_CC_IS_GNU macro from rte_common.h but fails to include that
header file.
Fixes: 0c9a5735a9 ("eal: fix compiler detection in public headers")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Add stdint.h to get definitions of standard integer types
Fixes: 39e9272484 ("fib: add FIB library")
Fixes: 40d41a8a7b ("fib: support IPv6")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The rte_ipsec_sad.h header used the standard uintXX_t types, but did not
include stdint.h header for them.
Fixes: 401633d9c1 ("ipsec: add inbound SAD API")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The vhost header files were missing definitions from headers to allow
them to be compiled up individually.
Fixes: d7280c9fff ("vhost: support selective datapath")
Fixes: a49f758d11 ("vhost: split vDPA header file")
Fixes: 939066d965 ("vhost/crypto: add public function implementation")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The standard integer types, and the size_t types are missing their
required header includes in the rib header file.
Fixes: 5a5793a5ff ("rib: add RIB library")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The rte_compat.h header file must be included to get the definition of
__rte_experimental.
Fixes: f030bff72f ("bitrate: add free function")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The rte_mbuf_dyn.h header file uses a number of types and macros without
including the required header files to get the definitions of those
macros/types. Similarly, the rte_mbuf_core.h file was missing an
include for rte_byteorder.h header.
Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Fixes: 3eb860b08e ("mbuf: move definitions into a separate file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The define for RTE_ETH_FLOW_MAX is defined in rte_ethdev.h, so that
header should be included in rte_eth_ctrl.h to allow it to be compiled
independently.
Fixes: 7fa96d696f ("ethdev: unification of flow types")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The telemetry header file uses the rte_cpuset_t type, but does not
include any header providing that type. Include rte_os.h to provide the
necessary type.
Fixes: febbebf7f2 ("telemetry: keep threads separate from data plane")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The rte_reciprocal header file used standard __rte_always_inline from
rte_common.h but does not include that header file, leading to compiler
errors when the reciprocal header is included alone.
Fixes: 6d45659eac ("eal: add u64-bit variant for reciprocal divide")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
rte_thread.h was missing the compat header to get the __rte_experimental
macro definition.
Fixes: b1fd151267 ("eal: add generic thread-local-storage functions")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
As announced in the deprecation note, remove all compatibility build
defines from previous make/meson versions and use only the standardized
ones - RTE_LIB_<name> for libraries, and RTE_<CLASS>_<NAME> for drivers.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Implement the generic modify flow API to allow manipulations on
an arbitrary header field (as well as mark, metadata or tag) using
data from another field or a user-specified value.
This generic modify mechanism removes the necessity to implement
a separate RTE Flow action every time we need to modify a new packet
field in the future.
Supported operation are:
- set: copy data from source to destination.
- add: integer addition, stores the result in destination.
- sub: integer subtraction, stores the result in destination.
The field ID is used to specify the desired source/destination packet
field in order to simplify the API for various encapsulation models.
Specifying the packet field ID with the needed encapsulation level
is able to quickly get a packet field for any inner packet header.
Alternatively, the special ID (ITEM_START) can be used to point to
the very beginning of a packet. This ID in conjunction with the
offset parameter provides great flexibility to copy/modify any part of
a packet as needed.
The number of bits to use from a source as well as the offset can be
be specified to allow a partial copy or dividing a big packet field
into multiple small fields (e.g. copying 128 bits of IPv6 to 4 tags).
An immediate value (or a pointer to it) can be specified instead of the
level and the offset for the special FIELD_VALUE ID (or FIELD_POINTER).
Can be used as a source only.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
When querying the link status via telemetry interface, we don't want the
client to have to wait for multiple seconds for a reply. Therefore use
"rte_eth_link_get_nowait()" rather than "rte_eth_link_get()" in the
telemetry callback.
Fixes: c190daedb9 ("ethdev: add telemetry callbacks")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The Geneve tunneling protocol is designed to allow the
user to specify some data context on the packet.
The GENEVE TLV (Type-Length-Variable) Option
is the mean intended to present the user data.
In order to support GENEVE TLV Option the new rte_flow
item "rte_flow_item_geneve_opt" is added.
The new item contains the values and masks for the
following fields:
-option class
-option type
-length
-data
New item will be added to testpmd to support match and
raw encap/decap actions.
Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ethdev is using default Ethernet overhead to decide if provided
'max_rx_pkt_len' value is bigger than max (non jumbo) MTU value,
and limits it to MAX if it is.
Since the application/driver used Ethernet overhead is different than
the ethdev one, check result is wrong.
If the driver is using Ethernet overhead bigger than the default one,
the provided 'max_rx_pkt_len' is trimmed down, and in the driver when
correct Ethernet overhead is used to convert back, the resulting MTU is
less than the intended one, causing some packets to be dropped.
Like,
app -> max_rx_pkt_len = 1500/*mtu*/ + 22/*overhead*/ = 1522
ethdev -> 1522 > 1518/*MAX*/; max_rx_pkt_len = 1518
driver -> MTU = 1518 - 22 = 1496
Packets with size 1497-1500 are dropped although intention is to be able
to send/receive them.
The fix is to make ethdev use the correct Ethernet overhead for port,
instead of default one.
Fixes: 59d0ecdbf0 ("ethdev: MTU accessors")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel type.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch introduces the GTP header individual flag bit fields
and the header optional word with N-PDU number, Sequence Number
and Next Extension Header type.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds APIs to add/remove callback functions on crypto
enqueue/dequeue burst. The callback function will be called for
each burst of crypto ops received/sent on a given crypto device
queue pair.
Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Checkpatch prefers 'unsigned int' over bare 'unsigned'.
Reword the error messages for brevity and clarity
so they don't have to be split across multiple lines.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The device string has an existing size in rte_dev.h
use that instead of defining our own.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use rte_pktmbuf_free_bulk instead of loop when freeing
packets.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
When we're attaching fbarrays in secondary processes, we check for
whether the intended memory address for the fbarray is already in use by
some other, local fbarray. However, the check for end-overlap (i.e. to
see if our memory area's end overlaps with some other fbarray) is
incorrectly counting end offset as part of the overlap. Fix the check.
Fixes: 5b61c62cfd ("fbarray: add internal tailq for mapped areas")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Zhihong Peng <zhihongx.peng@intel.com>
When no hugepages are found, we log a message about it, but we never
specify on which node. We also implicitly declare the page size based
on the directory name, but that's not very user friendly.
Fix both by changing the text of the message to note the NUMA node (if
applicable) and explicitly mention page size in kilobytes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Add a simple API to allow getting the monitor conditions for
power-optimized monitoring of the Rx queues from the PMD, as well as
release notes information.
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Now that we have everything in a C file, we can store the information
about our sleep, and have a native mechanism to wake up the sleeping
core. This mechanism would however only wake up a core that's sleeping
while monitoring - waking up from `rte_power_pause` won't work.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Currently, the "sync" version of power monitor intrinsic is supposed to
be used for purposes of waking up a sleeping core. However, there are
better ways to achieve the same result, so remove the unneeded function.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Instead of passing around pointers and integers, collect everything
into struct. This makes API design around these intrinsics much easier.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Currently, the API documentation mandates that if the user wants to use
the power management intrinsics, they need to call the
`rte_cpu_get_intrinsics_support` API and check support for specific
intrinsics.
However, if the user does not do that, it is possible to get illegal
instruction error because we're using raw instruction opcodes, which may
or may not be supported at runtime.
Now that we have everything in a C file, we can check for support at
startup and prevent the user from possibly encountering illegal
instruction errors.
We also add return values to the API's as well, because why not.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Currently, power intrinsics are inline functions. Make them part of the
ABI so that we can have various internal data associated with them
without exposing said data to the outside world.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The librte_cfgfile lib is functional on Windows.
Enable compilation of this lib for Windows.
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Currently bitmap line not empty check API assumes cache line
of 64B and only checks 8 slabs. Since in 128B cacheline, we
have 16 slabs per cacheline, rte_bitmap_clear() will mark
complete line as empty as soon as 8 slabs are empty thereby
breaking bitmap scan functionality. Fix it by defining new
__rte_bitmap_line_not_empty() for 128B cacheline platform.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
In some cases, a device or infrastructure may want to enable hotplug
but application may also try and start hotplug as well. Therefore
change the monitor_started from a boolean into a reference count.
Signed-off-by: Long Li <longli@microsoft.com>
Explicitly cast void * to type * so that EAL headers may be compiled
as C or C++.
Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Currently, when rte_service_init() fails at initialization, the
application always gets a ENOEXEC error code. For example, with testpmd,
this is displayed as:
Cannot init EAL: Exec format error
This error code does not describe the real issue. Instead, use the error
code returned by the function.
Fixes: e398245008 ("service: initialize with EAL")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
In some situations, we would get several ip fragments, which total
data length is less than min_ip_len(64) and padding with zeros.
We simulated intermediate fragments by modifying the MTU.
To illustrate the problem, we simplify the packet format and
ignore the impact of the packet header.In namespace2,
a packet whose data length is 1520 is sent.
When the packet passes tap2, the packet is divided into two
fragments: fragment A and B, similar to (1520 = 1510 + 10).
When the packet passes tap3, the larger fragment packet A is
divided into two fragments A1 and A2, similar to (1510 = 1500 + 10).
Finally, the bond interface receives three fragments:
A1, A2, and B (1520 = 1500 + 10 + 10).
One fragmented packet A2 is smaller than the minimum Ethernet
frame length, so it needs to be padded.
|---------------------------------------------------|
| HOST |
| |--------------| |----------------------------| |
| | ns2 | | |--------------| | |
| | |--------| | | |--------| |--------| | |
| | | tap1 | | | | tap2 | ns1| tap3 | | |
| | |mtu=1510| | | |mtu=1510| |mtu=1500| | |
| |--|1.1.1.1 |--| |--|1.1.1.2 |----|2.1.1.1 |--| |
| |--------| |--------| |--------| |
| | | | |
| |-----------------| | |
| | |
| |--------| |
| | bond | |
|--------------------------------------|mtu=1500|---|
|--------|
When processing the preceding packets above,
DPDK would aggregate fragmented packets A2 and B.
And error packets are generated, which padding(zero)
is displayed in the middle of the packet.
A2 + B:
0000 fa 16 3e 9f fb 82 fa 47 b2 57 dc 20 08 00 45 00
0010 00 33 b4 66 00 ba 3f 01 c1 a5 01 01 01 01 02 01
0020 01 02 c0 c1 c2 c3 c4 c5 c6 c7 00 00 00 00 00 00
0030 00 00 00 00 00 00 00 00 00 00 00 00 c8 c9 ca cb
0040 cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db
0050 dc dd de df e0 e1 e2 e3 e4 e5 e6
So, we would calculate the length of padding, and remove
the padding in pkt_len and data_len before aggregation.
And also we have the fix for both ipv4 and ipv6.
Fixes: 7f0983ee33 ("ip_frag: check fragment length of incoming packet")
Cc: stable@dpdk.org
Signed-off-by: Yicai Lu <luyicai@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As most NICs do not support segmentation for VXLAN-encapsulated
UDP/IPv4 packets, this patch adds VXLAN UDP/IPv4 GSO support.
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
The file rte_random.c is required to build i40e PMD on Windows.
Add rte_rand variable to export file.
Redefine _m_prefetchw for Clang toolchain due to following error
with respect to conflicting types:
FAILED: lib/76b5a35@@rte_eal@sta/librte_eal_common_rte_random.c.obj
clang @lib/76b5a35@@rte_eal@sta/librte_eal_common_rte_random.c.obj.rsp
In file included from ../lib/librte_eal/common/rte_random.c:13:
In file included from ..\lib/librte_eal/include\rte_eal.h:20:
In file included from ..\lib/librte_eal/include\rte_per_lcore.h:25:
In file included from ..\lib/librte_eal/windows/include\pthread.h:21:
In file included from ..\lib/librte_eal/windows/include\rte_windows.h:27:
In file included from C:\Program Files (x86)\Windows Kits\10\include\
10.0.18362.0\um\windows.h:171:
In file included from C:\Program Files (x86)\Windows Kits\10\include\
10.0.18362.0\shared\windef.h:24:
In file included from C:\Program Files (x86)\Windows Kits\10\include\
10.0.18362.0\shared\minwindef.h:182:
C:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um\
winnt.h:3324:1: error: conflicting types for '_m_prefetchw'
_m_prefetchw (
^
C:\Program Files\LLVM\lib\clang\10.0.0\include\prfchwintrin.h:50:1:
note: previous definition is here
_m_prefetchw(void *__P)
^
1 error generated.
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Async enqueue offloads large copies to DMA devices, and small copies
are still performed by the CPU. However, it requires users to get
enqueue completed packets by rte_vhost_poll_enqueue_completed(), even
if they are completed by the CPU when rte_vhost_submit_enqueue_burst()
returns. This design incurs extra overheads of tracking completed
pktmbufs and function calls, thus degrading performance on small packets.
This patch enhances async enqueue for small packets by enabling
rte_vhost_submit_enqueue_burst() to return completed packets.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch removes unnecessary check and function calls, and it changes
appropriate types for internal variables and fixes typos.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Added new path to do lpm4 lookup by using scalable vector extension.
The SVE path will be selected if compiler has flag SVE set.
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
rte_lpm_lookupx4 could return wrong next hop when more than 256 tbl8
groups are created. This is caused by incorrect type casting of tbl8
group index that been stored in tbl24 entry. The casting caused group
index truncation and hence wrong tbl8 group been searched.
Issue fixed by applying proper mask to tbl24 entry to get tbl8 group index.
Fixes: dc81ebbaca ("lpm: extend IPv4 next hop field")
Fixes: cbc2f1dccf ("lpm/arm: support NEON")
Fixes: d2cc795934 ("lpm: add AltiVec for ppc64")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Improved performance for AVX512 FIB6 lookup by doubling the number
of flows being processed
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
When scanning a buffer it is possible that the scan will abort
due to some internal resource limit.
This commit adds such response flag, so application can handle such cases.
Signed-off-by: Francis Kelly <fkelly@nvidia.com>
Signed-off-by: Ori Kam <orika@nvidia.com>
Add support for TLS functionality in EAL.
The following functions are added:
rte_thread_tls_key_create - create a TLS data key.
rte_thread_tls_key_delete - delete a TLS data key.
rte_thread_tls_value_set - set value bound to the TLS key
rte_thread_tls_value_get - get value bound to the TLS key
TLS key is defined by the new type rte_tls_key.
The API allocates the thread local storage (TLS) key.
Any thread of the process can subsequently use this key
to store and retrieve values that are local to the thread.
Those functions are added in addition to TLS capability
in rte_per_lcore.h to allow abstraction of the pthread
layer for all operating systems.
Windows implementation is under librte_eal/windows and
implemented using WIN32 API for Windows only.
Unix implementation is under librte_eal/unix and
implemented using pthread for UNIX compilation.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Move the definition of the functions
rte_thread_set_affinity and rte_thread_get_affinity
to new file, rte_thread.h
The file will implement generic threading functionality
and will only host threading functions which do not reference
pthread API.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
This patch moves memory region mmaping and related
preparation in a dedicated function in order to simplify
VHOST_USER_SET_MEM_TABLE request handling function.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch moves the registration of postcopy to a
dedicated function, with the goal of simplifying
VHOST_USER_SET_MEM_TABLE request handling function.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch moves the registration of memory regions to
userfaultfd to a dedicated function, with the goal of
simplifying VHOST_USER_SET_MEM_TABLE request handling
function.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Simply replace the smp barriers with atomic thread fence for vhost control
path, if there are no synchronization points.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Simply replace smp barriers with atomic thread fence for
virtio packed vring.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Used idx can be synchronized by one-way barrier instead of full
write barrier for split vring.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Relax the full read barrier to one-way barrier for desc flags in
packed vring.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The ordering between avail index and desc reads has been enforced
by load-acquire for split vring, so smp_rmb barrier is not needed
behind it.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As function desc_is_avail performs a load-acquire barrier to
enforce the ordering between desc flags and desc content, it is
unnecessary to add a rte_smp_rmb barrier around the trace which
follows desc_is_avail.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Use rte_atomic_thread_fence wrapper which has been provided for
__atomic_thread_fence builtins to support optimized code for
__ATOMIC_SEQ_CST memory order on x86 platforms.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
The header was missing the extern "C" directive which causes name
mangling of functions by C++ compilers, leading to linker errors
complaining of undefined references to these functions.
Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Ashish Sadanandan <ashish.sadanandan@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The "rev->epdata.event" assigned to "events.epdata.event" directly, which
was wrong in case of epoll events. It should be set to the "evs.events".
Fixes: 9efe9c6cdc ("eal/linux: add epoll wrappers")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
According to GCC documentation for __builtin_clz:
Returns the number of leading 0-bits in x,
starting at the most significant bit position.
If x is 0, the result is undefined.
__builtin_clz will be called with 0 if the existing
prefix address matches the one we want to insert.
Fixes: 5a5793a5ff ("rib: add RIB library")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
When building with clang (11.0,--buildtype=debug), eal_lcore.c
produces a -Wformat-nonliteral warning from the vfprintf call
in log_early.
Add __rte_format_printf annotation.
Fixes: b8a36b0866 ("eal/windows: improve CPU and NUMA node detection")
Cc: stable@dpdk.org
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
Compiling with MinGW in --buildtype=debug produces a redefinition
error for strncasecmp.
The root cause is that rte_os.h shouldn't be injecting POSIX definitions
into the environment. It is the applications responsibility to decide
how to handle missing functionality.
Resolving this properly will require further work, but in the meantime
wrap all such definitions with #ifndef/#endif. This resolves the specific
issue with strncasecmp and handles similar issues that applications may
encounter.
Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
MinGW-w64 above 8.0.0 exposes VirtualAlloc2() API in headers, but lacks
it in import libraries. Hence, availability of this API at compile-time
can't be used to choose between locating VirtualAlloc2() manually or
relying on the dynamic linker.
Fix redefinition compile-time errors.
Always link VirtualAlloc2() when using GCC.
Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org
Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: Thomas Monjalon <thomas@monjalon.net>
A new generic shared actions API may be used to create shared
counter. There is no point to keep duplicate COUNT action specific
capability to create shared counters.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC build with '-O0' on platforms with RTE_ARM_FEATURE_ATOMICS set
failed for:
../lib/librte_efd/rte_efd.c
Assembler messages:
3866: Error: selected processor does not support `crc32cb w0,w0,w1'
3890: Error: selected processor does not support `crc32ch w0,w0,w1'
3914: Error: selected processor does not support `crc32cw w0,w0,w1'
3938: Error: selected processor does not support `crc32cx w0,w0,x1'
This was caused by an architecture specifier added for Clang.
Unlike Clang, GCC considers each inline assembly block to be dependent
and therefore, the architecture specifier impacts assemble of some
blocks require certain extension support.
Removed the architecture for GCC to fix the issue.
Fixes: 8fce34cd0a ("eal/arm: fix clang build of native target")
Cc: stable@dpdk.org
Reported-by: Feifei Wang <feifei.wang2@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Commit 49b536fc30 ("eal: load only shared libs from driver plugin directories")
introduced a check that any shared library must ends with .so, but it can't
work, at least, on Fedora/CentOS/RHEL since .so symlinks are not installed
when you install dpdk package, but only when you install dpdk-devel package.
This commit adds also a check for .so.ABI_VERSION to check for shared lib.
See Fedora Packaging Guidelines for more information:
https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages
Fixes: 49b536fc30 ("eal: load only shared libs from driver plugin directories")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Commit 06c7871dde ("eal: restrict default plugin path to shared lib mode")
introduced a check that enabled shared lib mode when librte_eal.so can
be loaded, but it can't work, at least, on Fedora/CentOS/RHEL since .so
symlinks are not installed when you install dpdk package, but only when
you install dpdk-devel package.
This commit uses librte_eal.so.ABI_VERSION to check for shared lib,
since it exists on any linux distributions.
See Fedora Packaging Guidelines for more information:
https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages
Fixes: 06c7871dde ("eal: restrict default plugin path to shared lib mode")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
The initialization me->locked=1 in lock() must happen before
next->locked=0 in unlock(), otherwise a thread may hang forever,
waiting me->locked become 0. On weak memory systems (such as ARMv8),
the current implementation allows me->locked=1 to be reordered with
announcing the node (pred->next=me) and, consequently, to be
reordered with next->locked=0 in unlock().
This fix adds a release barrier to pred->next=me, forcing
me->locked=1 to happen before this operation.
Fixes: 2173f3333b ("mcslock: add MCS queued lock implementation")
Cc: stable@dpdk.org
Signed-off-by: Diogo Behrens <diogo.behrens@huawei.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Linking with the 'pci' driver when building with MinGW on
Windows fails with undefined symbol 'GUID_DEVCLASS_NET'.
This occurs because devguid.h is included in rte_windows.h
before INITGUID is defined.
Move the include of devguid.h after the definition of INITGUID.
Fixes: b762221ac2 ("bus/pci: support Windows with bifurcated drivers")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Tal Shnaiderman <talshn@nvidia.com>
Cleanup code style issue reported by kernel checkpatch. As follows:
* ERROR:CODE_INDENT: code indent should use tabs where possible
* ERROR:SPACING: spaces required around that '?' (ctx:VxE)
* WARNING:INDENTED_LABEL: labels should not be indented
Fixes: b0489e7bca ("malloc: fix linear complexity")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
In the experimental function rte_flow_shared_action_destroy()
introduced in DPDK 20.11, the errno ETOOMANYREFS was used.
This errno is not always available on Windows,
so it is preferred using EBUSY instead.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Tal Shnaiderman <talshn@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch defines new RSS offload types for eCPRI. For eCPRI with
Message Type 0, the hash field is physical channel ID.
Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reasons for building not supported generally start with lowercase
because printed as the second part of a line.
Other changes:
- "linux" should be "Linux" with a capital letter.
- ARCH_X86_64 may be simply x86_64.
- aarch64 is preferred over arm64.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
The definition of ETOOMANYREFS is reverted as it breaks build of
external applications already defining it.
Fixes: c917b54b0c ("eal/windows: add definition of ETOOMANYREFS")
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Reviewed-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
The ETOOMANYREFS errno was missing from the Windows build.
It is used in initialization of flow error structures.
It is defined with the same error code used by WSAETOOMANYREFS.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Replace -w / --pci-whitelist with -a / --allow options
and --pci-blacklist with --block.
The -b short option remains unchanged.
Allow the old options for now, but print a nag
warning since old options are deprecated.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Rename the enum values in the EAL include files.
As a backward compatible temporary migration tool, define
a replacement mapping for old values.
The old names relating to blacklist and whitelist are replaced
by block list and allow list, but applications may be using the
older compatibility macros, marked as deprecated.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Gaetan Rivet <grive@u256.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Fix the detection of instruction pattern with multiple emits followed
by TX. Once detected, this is one of the instruction patterns that is
internally replaced with a single optimized instruction, as long as
none of the instructions to be replaced is referenced by a jump
instruction. The fix enforces this check for the TX instruction of
the pattern.
Fixes: 31035e87b2 ("pipeline: add SWX instruction optimizer")
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
This patch fixes a file descriptor leak which happens
in the error path of vhost_user_set_vring_kick().
Fixes: 4796ad63ba ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
This patch fixes a file descriptor leak which happens
in the error path of vhost_user_set_log_base().
Fixes: 4796ad63ba ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org
Reported-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
If an error is encountered before the memory regions are
parsed, the file descriptors for these shared buffers are
leaked.
This patch fixes this by closing the message file descriptors
on error, taking care of avoiding double closing of the file
descriptors. guest_pages is also freed, even though it was not
leaked as its pointer was not overridden on subsequent function
calls.
Fixes: 8f972312b8 ("vhost: support vhost-user")
Cc: stable@dpdk.org
Reported-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
This patches fixes virtqueue initialization issue causing
segfault or file descriptor being closed unexpectedly.
The wrong index was passed to init_vring_queue() by
alloc_vring_queue() when a hole in the virtqueue array was
met.
Fixes: 8acd7c2133 ("vhost: fix virtqueues metadata allocation")
Cc: stable@dpdk.org
Reported-by: Yu Jiang <yux.jiang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: Yu Jiang <yux.jiang@intel.com>
Async inflight packet counter should take failed packets into account.
Failed packets will be deducted in the error handling logic.
Fixes: 6b3c81db8b ("vhost: simplify async copy completion")
Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
For VxLAN packets, GRO will mistakenly reassemble them
if inner L3 is IPv6, inner L4 is TCP or UDP, and outer L3
is IPv4 because the value of IS_IPV4_VXLAN_TCP4/UDP4_PKT
is true for them.
This fix makes sure IS_IPV4_TCP_PKT, IS_IPV4_UDP_PKT,
IS_IPV4_VXLAN_TCP4_PKT and IS_IPV4_VXLAN_UDP4_PKT can make
decision precisely.
Fixes: e2d8110636 ("gro: support VXLAN UDP/IPv4")
Fixes: 1ca5e67408 ("gro: support UDP/IPv4")
Fixes: 9e0b9d2ec0 ("gro: support VxLAN GRO")
Fixes: 0d2cbe59b7 ("lib/gro: support TCP/IPv4")
Cc: stable@dpdk.org
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Meson versions >= 0.54.0 include support for handling /implib
with msvc link. Specifying it explicitly causes failures when
linking against the dll. Tested using Link 14.27.29112.0 and
Clang 11.0.0.
There were a number of changes to the way that import libraries
are handled between 0.47.1 and 0.54.0. Only make the change
for >= 0.54.0, leaving the behaviour unchanged for earlier
versions.
Fixes: 77cca7ccec ("build: fix drivers library path on Windows")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Tested-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Khoa To <khot@microsoft.com>
When doing Clang build with '-mcpu=native' on N1 platform, build failed
with:
../lib/librte_eal/arm/include/rte_atomic_64.h:76:39:
error: instruction requires: lse
__ATOMIC128_CAS_OP(__cas_128_release, "caspl")
This is because native detection for Neoverse N1 was added in Clang-11.
Prior version of Clang's assembler doesn't know LSE support on hardware.
Fixed this for Clang earlier than version 11 by specifying architecture
for assembler.
Referred to [1] for this fix.
Fixes: 7e2c3e17fe ("eal/arm64: add 128-bit atomic compare exchange")
Cc: stable@dpdk.org
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e0d5896bd356cd577f9710a02d7a474cdf58426b
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
The SPAPR IOMMU requires that a DMA window size be defined before memory
can be mapped for DMA. Current code dynamically modifies the DMA window
size in response to every new memory allocation which is potentially
dangerous because all existing mappings need to be unmapped/remapped in
order to resize the DMA window, leaving hardware holding IOVA addresses
that are temporarily unmapped. The new SPAPR code statically assigns
the DMA window size on first use, using the largest physical memory
memory address when IOVA=PA and the highest existing memseg virtual
address when IOVA=VA.
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The mbuf header files had some commenting style errors that affected the
API documentation.
Also, the RTE_ prefix was missing on a macro and a definition.
Note: This patch does not touch the offload and attachment flags that are
also missing the RTE_ prefix.
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
cmdline_numtype member names clash with Windows system identifiers.
Add RTE_ prefix to cmdline constants to avoid this and possible
future conflicts.
Suggested-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Jie Zhou <jizh@microsoft.com>
Tested-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The dynamic flag management is broken if rte_mbuf_dynflag_lookup()
is done in a secondary process because the local pointer to
the memzone is not ever initialized.
Fix it by using the same checks as dynfield_register().
I.e if shared memory zone has not been looked up already,
then discover it.
Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The ethdev port id is 16 bits now. This patch fixes the data type
of the variable for 'pid', which changing from uint32_t to uint16_t.
RTE_MAX_ETHPORTS is the maximum number of ports, which customized by
the user. To avoid 16-bit unsigned integer overflow, the valid value
of RTE_MAX_ETHPORTS should be set from 0 to UINT16_MAX, and it is
safer to cut one more port from space.
So we use RTE_BUILD_BUG_ON() to ensure that RTE_MAX_ETHPORTS is less
to UINT16_MAX.
Fixes: 5b7ba31148 ("ethdev: add port ownership")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Coverity flags that 'rx_conf' variable is used before
it's checked for NULL. This patch fixes this issue.
Coverity issue: 363570
Fixes: 4ff702b5df ("ethdev: introduce Rx buffer split")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
In a flow rule, attribute "transfer" means operation level
at which both traffic is matched and actions are conducted.
Add the very same attribute to shared action configuration.
If a driver needs to prepare HW resources in two different
ways, depending on the operation level, in order to set up
an action, then this new attribute will indicate the level.
Also, when handling a flow rule insertion, the driver will
be able to turn down a shared action if its level is unfit.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrey Vesnovaty <andreyv@nvidia.com>
Updated description of rte_eth_rx_burst() to reflect what drivers,
when using vector instructions, expect from nb_pkts.
Also discussed on the mailing list here:
http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35C61257@smartserver.smartshare.dk/
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
net/ixgbe driver is the only user of the struct rte_eth_l2_tunnel_conf.
Move it to the driver and use ixgbe_ prefix instead of rte_eth_.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The legacy filter API, including rte_eth_dev_filter_supported() and
rte_eth_dev_filter_ctrl() is removed. Flow API should be used.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of FDIR filters RTE flow API should be used.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Global filter configuration request was supported by net/i40e
driver only to configure GRE key length.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of HASH filter RTE flow API should be used.
Preserve RTE_ETH_FILTER_HASH since it is used in drivers
internally in RTE flow API support.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of TUNNEL filter RTE flow API should be used.
Move corresponding defines and helper structure to ethdev
driver interface since it is still used by drivers internally.
Preserve RTE_ETH_FILTER_TUNNEL because of usage in drivers.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of N-tuple filter RTE flow API should be used.
Preserve struct rte_eth_ntuple_filter in ethdev API since
the structure and related defines are used in flow classify
library and a number of drivers.
Preserve RTE_ETH_FILTER_NTUPLE because of usage in drivers.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of SYN filter RTE flow API should be used.
Move corresponding definitions to ethdev internal driver API
since it is used by drivers internally.
Preserve RTE_ETH_FILTER_SYN because of it as well.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
net/e1000 driver is the only user of the struct rte_eth_flex_filter
and helper defines. Move it to the driver and use igb_ prefix
instead of rte_eth_.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of EtherType filter RTE flow API should be used.
Move corresponding definitions to ethdev internal driver API
since it is used by drivers internally.
Preserve RTE_ETH_FILTER_ETHERTYPE because of it as well.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
net/i40e driver is the only user of the enum rte_mac_filter_type.
Move the define to the driver and use i40e_ prefix instead of rte_.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Instead of MACVLAN filter RTE flow API should be used.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch initializes a local parameter in async data path to avoid
compiler warnings.
Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
gpa_to_hpa() function almost always fails due to the wrong setup of
the binary tree search key. Since there has already been a similar
function gpa_to_first_hpa() available in the vhost, instead of fixing
the issue in its original logic, gpa_to_hpa() function is rewritten to
be a wrapper of the gpa_to_first_hpa() to avoid code redundancy.
Fixes: e246896178 ("vhost: get guest/host physical address mappings")
Fixes: faa9867c4d ("vhost: use binary search in address conversion")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The definitions of RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP
and RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP were inserted
before the last comment of Tx offloads.
It is moved in a better place,
with comments moved to be before the definition.
A group comment is added to better describe device capabilities.
Fixes: cac923cfea ("ethdev: support runtime queue setup")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
When a flow API RSS rule is issued in testpmd, device RSS key is changed
unexpectedly, device RSS key is changed to the testpmd default RSS key.
Consider the following usage with testpmd:
1. first, startup testpmd:
testpmd> show port 0 rss-hash key
RSS functions: all ipv4-frag ipv4-other ipv6-frag ipv6-other ip
RSS key: 6D5A56DA255B0EC24167253D43A38FB0D0CA2BCBAE7B30B477CB2DA38030F
20C6A42B73BBEAC01FA
2. create a rss rule
testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end \
actions rss types ipv4-udp end queues end / end
3. show rss-hash key
testpmd> show port 0 rss-hash key
RSS functions: all ipv4-udp udp
RSS key: 74657374706D6427732064656661756C74205253532068617368206B65792
C206F76657272696465
This is because testpmd always sends a key with the RSS rule,
if user provides a key as part of the rule that key is used, if user
doesn't provide a key, testpmd default key is sent to the PMDs, which is
causing device programmed RSS key to be changed.
There was a previous attempt to fix the same issue [1], but it has been
reverted back [2] because of the crash when 'key_len' is provided
without 'key'.
This patch follows the same approach with the initial fix [1] but also
addresses the crash.
After change, testpmd RSS key is 'NULL' by default, if user provides a
key as part of rule it is used, if not no key is sent to the PMDs at all
[1]
Commit a4391f8bae ("app/testpmd: set default RSS key as null")
[2]
Commit f3698c3d09 ("app/testpmd: revert setting default RSS")
Fixes: d0ad8648b1 ("app/testpmd: fix RSS flow action configuration")
Cc: stable@dpdk.org
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
By design, async enqueue API should return directly if async device
is not registered. This patch removes the corrupted implementation of
the enqueue fallback from async mode to sync mode.
Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch checks whether the virtqueue metadata pointer
is valid before dereferencing it. It is not considered
a fix as earlier patch ensures there are no holes in the
array of virtqueue metadata pointers.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch validates the queue index parameter, in order
to ensure no out-of-bound accesses happen.
Fixes: 9eed6bfd2e ("vhost: allow to enable or disable features")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch validates the queue index parameter, in order
to ensure neither out-of-bound accesses nor NULL pointer
dereferencing happen.
Fixes: 4d891f77dd ("vhost: add APIs to get inflight ring")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch validates the queue index parameter, in order
to ensure no out-of-bound accesses happen.
Fixes: bd2e0c3fe5 ("vhost: add APIs for live migration")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch validates the queue index parameter, in order
to ensure neither out-of-bound accesses nor NULL pointer
dereferencing happen.
Fixes: 9eed6bfd2e ("vhost: allow to enable or disable features")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch validates the queue index parameter, in order
to ensure neither out-of-bound accesses nor NULL pointer
dereferencing happen.
Fixes: a67f286a65 ("vhost: export queue free entries")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
The Vhost-user backend implementation assumes there will be
no holes in the device's array of virtqueues metadata
pointers.
It can happen though, and would cause segmentation faults,
memory leaks or undefined behaviour.
This patch keep the assumption that there is no holes in this
array, and allocate all uninitialized virtqueues metadata up
to requested index.
Fixes: 160cbc815b ("vhost: remove a hack on queue allocation")
Cc: stable@dpdk.org
Suggested-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
If an error occurred when allocating memory for metrics or names,
the function returned without freeing allocated memory. This is now
fixed to avoid the resource leak in the case that either metrics or
names had been successfully allocated memory.
Coverity issue: 362053
Fixes: c5b7197f66 ("telemetry: move some functions to metrics library")
Cc: stable@dpdk.org
Reported-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Currently, the intrinsics documentation refers to `rte_cpu_get_features`
as a check for whether these intrinsics are supported at runtime. This
is incorrect, because actually the user should use the
`rte_cpu_get_intrinsics_support` API to do said check. Fix the typo.
Fixes: 1280214212 ("eal: add intrinsics support check infrastructure")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
rte_gso_segment decreased refcnt of pkt by one, but
it is wrong if pkt is external mbuf, pkt won't be
freed because of incorrect refcnt, the result is
application can't allocate mbuf from mempool because
mbufs in mempool are run out of.
One correct way is application should call
rte_pktmbuf_free after calling rte_gso_segment to free
pkt explicitly. rte_gso_segment must not handle it, this
should be responsibility of application.
This commit changed rte_gso_segment in functional behavior
and return value, so the application must take appropriate
actions according to return values, "ret < 0" means it
should free and drop 'pkt', "ret == 0" means 'pkt' isn't
GSOed but 'pkt' can be transmitted as a normal packet,
"ret > 0" means 'pkt' has been GSOed into two or multiple
segments, it should use "pkts_out" to transmit these
segments. The application must free 'pkt' after call
rte_gso_segment when return value isn't equal to 0.
Fixes: 119583797b ("gso: support TCP/IPv4 GSO")
Cc: stable@dpdk.org
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Windows alarms are both armed and executed from the interrupt thread.
rte_eal_alarm_set() dispatched alarm-arming code to that thread and
waited for its completion via a spinlock. However, if called from alarm
callback (i.e. from the interrupt thread), this caused a deadlock,
because arming could not be run until its dispatcher exits, but it could
only exit after it finished waiting for arming to complete.
Call arming code directly when running in the interrupt thread.
Fixes: f4cbdbc7fb ("eal/windows: implement alarm API")
Reported-by: Pallavi Kadam <pallavi.kadam@intel.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Currently, since there is no runtime directory set, the code tries to
create a file in C:\ which is only writable with administrator
privileges. As a result, if the user is not admin, the application will
fail.
So, forcing no_shconf to 1 to prevent the code having to create files in
the runtime directory.
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Previously, the Tx timestamp field and flag were registered in testpmd,
as described in mlx5 guide.
For consistency between Rx and Tx timestamps,
managing mbuf registrations inside the driver, as properly documented,
is a simpler expectation.
The only driver to support this feature (mlx5) is updated
as well as the testpmd application.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The function rte_mbuf_dyn_tx_timestamp_register()
can be used to register the required field and flag.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The offload flag DEV_RX_OFFLOAD_TIMESTAMP had no documentation.
After switching to dynamic mbuf flag and field,
it becomes even more important to explicit the feature behaviour.
A doxygen comment for the timesync API was mentioning
the deprecated timestamp field, so it is also updated.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced with the dynamic one.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
There is already a dynamic field for timestamp,
used only for Tx scheduling with the dedicated Tx offload flag.
The same field can be used for Rx timestamp filled by drivers.
A new dynamic flag is defined for Rx usage.
A new function wraps the registration of both field and Rx flag.
The type rte_mbuf_timestamp_t is defined for the API users.
After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This a revert of the commit 569758758d ("eventdev: add Rx timestamp").
If the Rx timestamp is not configured on the ethdev port,
there is no reason to set one.
Also the accuracy of the timestamp was bad because set at a late stage.
Anyway there is no trace of the usage of this timestamp.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Rather than have drivers check for this, let's ensure the passed FILE *
is not NULL.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
This patch increases the immediate operand size from 32 to 64 bits.
Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The eventdev drivers have been hacking the deprecated field seqn for
internal test usage.
It is moved to a dynamic mbuf field in order to allow removal of seqn.
Signed-off-by: David Marchand <david.marchand@redhat.com>
The reorder library used sequence numbers stored in the deprecated field
seqn.
It is moved to a dynamic mbuf field in order to allow removal of seqn.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
The device-specific metadata was stored in the deprecated field udata64.
It is moved to a dynamic mbuf field in order to allow removal of udata64.
The name rte_security_dynfield is not very descriptive
but it should be replaced later by separate fields for each type of data
that drivers pass to the upper layer.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
The node_mbuf_priv1 was stored in the deprecated mbuf field udata64.
It is moved to a dynamic field in order to allow removal of udata64.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Replace "in a in PMD" with "in a PMD".
Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Since the kernel module is not part of EAL anymore,
there is no need to have the common KNI header file in EAL.
The file rte_kni_common.h is moved to librte_kni.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The ctf metadata is written to the metadata file without any check for
length, so this string must be null terminated.
Fixes: f1a099f5b1 ("trace: create CTF TDSL metadata in memory")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Kumar Kori <skori@mavell.com>
Rework registration so that it uses dynamic allocations and has no size
limit.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Sunil Kumar Kori <skori@mavell.com>
CTF event description is currently built by appending all fields in a
single string at trace point registration.
When dumping the metadata, this string is split again and inspected to
fixup reserved keywords and special tokens like "." or "->".
Move this fixup per field at trace point registration time so that there
is no need for inspecting / string parsing when dumping metadata.
Use dynamic allocations to remove an artificial size limit on the CTF
event description manipulations.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Kumar Kori <skori@mavell.com>
Currently, it is not possible to check support for intrinsics that
are platform-specific, cannot be abstracted in a generic way, or do not
have support on all architectures. The CPUID flags can be used to some
extent, but they are only defined for their platform, while intrinsics
will be available to all code as they are in generic headers.
This patch introduces infrastructure to check support for certain
platform-specific intrinsics, and adds support for checking support for
IA power management-related intrinsics for UMWAIT/UMONITOR and TPAUSE.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Add two new power management intrinsics, and provide an implementation
in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
are implemented as raw byte opcodes because there is not yet widespread
compiler support for these instructions.
The power management instructions provide an architecture-specific
function to either wait until a specified TSC timestamp is reached, or
optionally wait until either a TSC timestamp is reached or a memory
location is written to. The monitor function also provides an optional
comparison, to avoid sleeping when the expected write has already
happened, and no more writes are expected.
For more details, please refer to Intel(R) 64 and IA-32 Architectures
Software Developer's Manual, Volume 2.
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Add a new CPUID flag indicating processor support for UMONITOR/UMWAIT
and TPAUSE instructions instruction.
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Coverity flags that 'h' variable is used before
it's checked for NULL. This patch fixes this issue.
Coverity issue: 363625
Fixes: 769b2de7fb ("hash: implement RCU resources reclamation")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
This patch fixes (dereference after null check) coverity issue.
For this reason, we should add null check at the beginning of the
function and return error directly if the 'intr_handle' is null.
Coverity issue: 357695, 357751
Fixes: 05c4105738 ("trace: add interrupt tracepoints")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Add zero-copy APIs. These APIs provide the capability to
copy the data to/from the ring memory directly, without
having a temporary copy (for ex: an array of mbufs on
the stack). Use cases that involve copying large amount
of data to/from the ring can benefit from these APIs.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add new lookup implementation for FIB6 trie algorithm using
AVX512 instruction set
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Move trie table layout and lookup definition into the
private header file. This is necessary for implementing a
vectorized lookup function in a separate .с file.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add type argument to trie_get_lookup_fn()
Now it only supports RTE_FIB6_LOOKUP_TRIE_SCALAR
Add new rte_fib6_select_lookup() - user can change lookup
function type runtime.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add new lookup implementation for DIR24_8 algorithm using
AVX512 instruction set
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Move dir24_8 table layout and lookup definition into the
private header file. This is necessary for implementing a
vectorized lookup function in a separate .с file.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add type argument to dir24_8_get_lookup_fn()
Now it supports 3 different lookup implementations:
RTE_FIB_LOOKUP_DIR24_8_SCALAR_MACRO
RTE_FIB_LOOKUP_DIR24_8_SCALAR_INLINE
RTE_FIB_LOOKUP_DIR24_8_SCALAR_UNI
Add new rte_fib_select_lookup() - user can change lookup
function type runtime.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
FIB type RTE_FIB_TYPE_MAX is used only for sanity checks,
remove it to prevent applications start using it.
The same is for FIB6's RTE_FIB6_TYPE_MAX.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
No functional change intended.
service_dump_calls_per_lcore() was always called with a 0 reset flag.
service_dump_one() was called with either a 0 reset flag or a NULL
FILE pointer.
We can split the code for readability sake.
Note: there is no path to resetting calls_per_service[], this is left as
is.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Fields except tbl24 and tbl8 in rte_lpm structure have no
need to be exposed to the user.
Hide the unneeded exposure of structure fields for better
ABI maintainability.
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
The container structure should be freed instead of rte_lpm structure
after wrapping rte_lpm into internal structure __rte_lpm.
Fixes: 8a9f8564e9 ("lpm: implement RCU rule reclamation")
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Currently, users have to use external RCU mechanisms to free resources
when using lock free hash algorithm.
Integrate RCU QSBR process to make it easier for the applications to use
lock free algorithm.
Refer to RCU documentation to understand various aspects of
integrating RCU library into other libraries.
Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
The config options CONFIG_RTE_* are simple RTE_* defines with meson.
Now that make support is dropped, update the names in logs and comments.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
When introducing the new option CONFIG_RTE_MAX_MEM_MB_PER_TYPE,
some logs were referencing a wrong name: CONFIG_RTE_MAX_MEM_PER_TYPE.
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
The main initialization function (rte_eal_init) has documentation
about a feature from another era: memory partition.
Curiously, this lost treasure is found only now,
suggesting there may be other interesting things to discover in the doc.
To all aspiring Indiana Jones: the hunt is open!
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
We should return an error value, when the callback is already exist.
Fixes: a753e53d51 ("eal: add device event monitor framework")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Fix return value, using -EAGAIN instead of 0 when the callback is busy
and using -ENOENT instead of 0 when the callback is not found.
Fixes: a753e53d51 ("eal: add device event monitor framework")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
The event_cb->dev_name is not freed when freeing event_cb,
and this causes a memory leak.
Fixes: a753e53d51 ("eal: add device event monitor framework")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
In rte_efd_create() allocated memory for tailq entry, we should
free it when error happens, otherwise it will lead to memory leak.
Fixes: 56b6ef874f ("efd: new Elastic Flow Distributor library")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
jhash has been forgotten when factorising the x86 arch check.
Fixes: dbf17d44f3 ("hash: use common x86 flag")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Libraries can use the headers variable to install headers.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Fixes: 63b3907833 ("build: remove library name from version map file name")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When the memory for uevent.devname is allocated in dev_uev_parse(). It
is not freed when parse the subsystem layer fails in dev_uev_parse().
Before return, it is also not freed in dev_uev_handler(). These cause a
memory leak.
Fixes: 0d0f478d04 ("eal/linux: add uevent parse and process")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Following the addition of the in_addr/in6_addr structs
to in.h the header file must have stdint.h included
for the definitions of the uint8_t/uint32_t types used
within the new structs.
Not having it could results in the following errors
in places where in.h is included:
in.h:30:2: error: unknown type name 'uint32_t'
uint32_t s_addr;
in.h:34:2: error: unknown type name 'uint8_t'
uint8_t s6_addr[16];
Fixes: f40a74cfcf ("eal/windows: improve compatibility networking headers")
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Replace master lcore with main lcore and
replace slave lcore with worker lcore.
Keep the old functions and macros but mark them as deprecated
for this release.
The "--master-lcore" command line option is also deprecated
and any usage will print a warning and use "--main-lcore"
as replacement.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add a macro that causes GCC and CLANG to emit a warning when
a deprecated macro is used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Rename new rte_flow ops callbacks to emphasize relation to tunnel
offload API.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Fixes spelling in comment and message about thread error.
Found while looking at checkpatch complaints about "thead"
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>