Currently, even though memory is mapped with PROT_NONE, this does not
cause it to be excluded from core dumps. This is counter-productive,
because in a lot of cases, this memory will go unused (e.g. when the
memory subsystem preallocates VA space but hasn't yet mapped physical
pages into it).
Use `madvise()` call with MADV_DONTDUMP/MADV_NOCORE to exclude the
unused memory from being dumped.
Signed-off-by: Li Feng <fengli@smartx.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Commit 8a4baf06c17a ("mem: mark pages as not accessed when reserving VA")
has mapped the initialized memory with PROT_NONE, and when it's unmapped,
eal_memalloc.c should remmap the anonymous memory with PROT_NONE too.
Fixes: 8a4baf06c17a ("mem: mark pages as not accessed when reserving VA")
Cc: stable@dpdk.org
Signed-off-by: Li Feng <fengli@smartx.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Fix spelling errors in comments (found with codespell).
Note that "inbetween" is not correct in English and should
either be two words or better yet, the in can be dropped.
https://www.grammarly.com/blog/in-between-or-inbetween/
Fixes: 12f45fa7e29b ("eal/arm: read timer from PMU if enabled")
Fixes: 096ffd811fe2 ("eal/x86: use lock-prefixed instructions for SMP barrier")
Fixes: 1d406458db47 ("mem: make segment preallocation OS-specific")
Fixes: bb372060dad4 ("malloc: make heap a doubly-linked list")
Fixes: 7353ee7344b4 ("fbarray: add API to find biggest used or free chunks")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Table entries do not need to be updated if the same rules can be found.
Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
The code didn't compile when using exported mempool functions
under Windows.
compilation error logs:
rte_mempool_exports.def : error LNK2001:
unresolved external symbol rte_mempool_cache_flush
rte_mempool_exports.def : error LNK2001:
unresolved external symbol rte_mempool_default_cache
rte_mempool_exports.def : error LNK2001:
unresolved external symbol rte_mempool_generic_get
rte_mempool_exports.def : error LNK2001:
unresolved external symbol rte_mempool_generic_put
lib\librte_mempool.dll.a : fatal error LNK1120: 4 unresolved externals
clang: error: linker command failed with exit code 1120 (use -v to see invocation)
The cause was that there were some inline functions that were included
in the export list.
To solve this the functions, which are implemented in the header
and shouldn't be exported, were removed from rte_mempool_version.map
export list.
Fixes: 4b5062755aa7 ("mempool: allow user-owned cache")
Fixes: 656f2d3ede96 ("mempool: deprecate specific get and put functions")
Cc: stable@dpdk.org
Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Valid checks for optional function pointers inside dev-ops
were disabled by undefined macro.
Fixes: b6ee98547847 ("security: fix verification of parameters")
Cc: stable@dpdk.org
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Using a global mbuf dynamic field for metadata incurs some
performance penalty on a datapath. Store this information in
the Rx queue descriptor for a better cache locality.
Fixes: a18ac6113331 ("net/mlx5: add metadata support to Rx datapath")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Defines some new RSS offload types for ETH/S_VLAN/C_VLAN/L2TPV3/
/PFCP/L2_SRC_ONLY/L2_DST_ONLY.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
One of the reasons to destroy a flow is the fact that no packet matches
the flow for "timeout" time.
For example, when TCP\UDP sessions are suddenly closed.
Currently, there is not any DPDK mechanism for flow aging and the
applications use their own ways to detect and destroy aged-out flows.
The flow aging implementation need include:
- A new rte_flow action: RTE_FLOW_ACTION_TYPE_AGE to set the timeout and
the application flow context for each flow.
- A new ethdev event: RTE_ETH_EVENT_FLOW_AGED for the driver to report
that there are new aged-out flows.
- A new rte_flow API: rte_flow_get_aged_flows to get the aged-out flows
contexts from the port.
- Support input flow aging command line in Testpmd.
The new event type addition in the enum is flagged as an ABI breakage,
so an ignore rule is added for these reasons:
- It is not changing value of existing types (except MAX)
- The new value is not used by existing API if the event is not
registered
In general, it is safe adding new ethdev event types at the end of the
enum, because of event callback registration mechanism.
Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Matan Azrad <matan@mellanox.com>
When ring size or enqueue packets not aligned with batch number, it is
possible that descs update still kept in shadowed used structure when
batched enqueue. Fix this issue by flushing remained shadowed used descs
before batch flush.
Fixes: f41516c309d7 ("vhost: flush batched enqueue descs directly")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Defer shadow ring update introduces functional issue which has been
described in Eugenio's fix patch.
The current implementation of vhost_net in packed vring tries to fill
the shadow vector before send any actual changes to the guest. While
this can be beneficial for the throughput, it conflicts with some
bufferfloats methods like the linux kernel napi, that stops
transmitting packets if there are too much bytes/buffers in the
driver.
It also introduces performance issue when frontend run much faster than
backend. Frontend may not be able to collect available descs when shadow
update is deferred. That will harm RFC2544 throughput.
Appropriate choice is to remove deferred shadowed update method.
Now shadowed used descs are flushed at the end of dequeue function.
Fixes: 31d6c6a5b820 ("vhost: optimize packed ring dequeue")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In order to avoid potential conflicts, rename the PCI_ADDR
enum value to VDPA_ADDR_PCI in vdpa_addr_type_enum.
All symbols referencing this enum are experimental, so it
does not break API policy.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, iotlb cache name is comprised of vid and virtqueue
index. For example, "iotlb_cache_0_0". Because vid is assigned
per process, iotlb cache name is not unique among multi processes.
For example a secondary process uses a vhost
(ex. eth_vhost0,iface=/tmp/sock0) and another secondary process
uses a vhost (ex. eth_vhost1,iface=/tmp/sock1), iotlb cache
name of both vhost ("iotlb_cache_0_0") are same and as a result
iotlb cache is broken.
This patch makes iotlb cache name unique among milti processes
by adding process id to the iotlb cache name.
The prefix of the name is shortened to "iotlb_" since the maximum
length of pool name is 25 bytes (RTE_MEMPOOL_NAMESIZE is 26).
Note that it is just 25 characters in maximum at the moment.
Here,
* pid_t == int: max 10 digits.
* vid < MAX_VHOST_DECICE(1024): max 4 digits.
* vq_index < VHOST_MAX_VRING(256): max 3 digits.
Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
Cc: stable@dpdk.org
Signed-off-by: Itsuro Oda <oda@valinux.co.jp>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
VHOST_FEATURES has been removed in previous refactoring.
Fixes: 0917f9d1f059 ("vhost: use new APIs to handle features")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Available buffer ID should be stored in the zmbuf in the packed-ring
dequeue path. There's no guarantee that local queue avail index is
equal to buffer ID.
Fixes: d1eafb532268 ("vhost: add packed ring zcopy batch and single dequeue")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reported-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the vhost crypto missed
"VHOST_USER_PROTOCOL_F_CONFIG" flag problem during initialization.
Newer Qemu version requires this feature enabled.
Fixes: 939066d96563 ("vhost/crypto: add public function implementation")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add the previous prototype for the 'profile_hook_rx_burst_cb' function
to fix the compiler warning when the DPDK is built with
'RTE_ETHDEV_PROFILE_WITH_VTUNE' config option enabled:
/home/dpdk/lib/librte_ethdev/ethdev_profile.c:17:1: warning:
no previous prototype for profile_hook_rx_burst_cb [-Wmissing-prototypes]
Fixes: 2c1bbab7f09d ("ethdev: change vtune profiling approach")
Cc: stable@dpdk.org
Signed-off-by: Eugeny Parshutin <eugeny.parshutin@linux.intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add tracepoints at important and mandatory APIs for tracing support.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add tracepoints at important and mandatory APIs for tracing support.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add tracepoints at important and mandatory APIs for tracing support.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add tracepoints at important and mandatory APIs for tracing support.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Trace library exposes --trace-mode eal parameter to configure
event record mode when ring buffers are full.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Trace library exposes --trace-bufsz EAL parameter to configure
maximum size of ring buffer where events are to be stored.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Trace library exposes --trace-dir EAL parameter to configure
directory where traces will be generated.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add the following interrupt related tracepoints.
- rte_eal_trace_intr_callback_register()
- rte_eal_trace_intr_callback_unregister()
- rte_eal_trace_intr_enable()
- rte_eal_trace_intr_disable()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add the following thread related tracepoints.
- rte_eal_trace_thread_remote_launch()
- rte_eal_trace_thread_lcore_ready()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add the following memzone related tracepoints.
- rte_eal_trace_memzone_reserve()
- rte_eal_trace_memzone_lookup()
- rte_eal_trace_memzone_free()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add the following memory-related tracepoints.
- rte_eal_trace_mem_zmalloc()
- rte_eal_trace_mem_malloc()
- rte_eal_trace_mem_realloc()
- rte_eal_trace_mem_free()
rte_malloc() and rte_free() has been used in the trace implementation,
in order to avoid tracing implementation specific events, added
an internal no trace version rte_malloc() and rte_free().
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add following alarm related trace points.
- rte_eal_trace_alarm_set()
- rte_eal_trace_alarm_cancel()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
This patch creates the following generic tracepoint for
generic tracing when there is no dedicated tracepoint is
available.
- rte_eal_trace_generic_void()
- rte_eal_trace_generic_u64()
- rte_eal_trace_generic_u32()
- rte_eal_trace_generic_u16()
- rte_eal_trace_generic_u8()
- rte_eal_trace_generic_i64()
- rte_eal_trace_generic_i32()
- rte_eal_trace_generic_i16()
- rte_eal_trace_generic_i8()
- rte_eal_trace_generic_int()
- rte_eal_trace_generic_long()
- rte_eal_trace_generic_float()
- rte_eal_trace_generic_double()
- rte_eal_trace_generic_ptr()
- rte_eal_trace_generic_str()
For example, if an application wishes to emit an int datatype,
it can call rte_eal_trace_generic_int(val) to emit the trace.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Connect the internal trace interface API to FreeBSD EAL.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Connect the internal trace interface API to Linux EAL.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
The trace function payloads such as rte_trace_point_emit_* have
dual functions. The first to emit the payload for the registration
function and the second one to act as trace mem emitters a.k.a
provider payload.
When it is used as provider payload, those function copy the trace
field to trace memory based on the tracing mode.
Added payload definitions under ALLOW_EXPERIMENTAL_API define
to allow the existing applications to compile without enabling
experimental APIs.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
The trace function payloads such as rte_trace_point_emit_* have
dual functions. The first to emit the payload for the registration
function and the second one to act as trace memory emitters.
When it is used as registration payload, it will do the following to
fulfill the registration job.
- Find out the size of the event,
- Generate metadata field string using __rte_trace_point_emit_field().
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Implement rte_trace_save(), which will save the metadata
file and trace memory snapshot to the trace directory.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Implement rte_trace_metadata_dump() and rte_trace_dump()
functions. Former one used to dump the CTF metadata file and
the latter one to dump all the registered events and its status.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Trace memory will be allocated per thread to enable lockless trace
events updates to the memory. The allocator will first attempt to
allocate from hugepage, then if not available from hugepage or
finally fallback to malloc memory.
Later in the patches series, this API will be hooked to DPDK fast path
and control plane thread creation API. It is possible for non
DPDK thread to use trace events. In that case, trace memory
will be allocated on the first event emission.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Some of the keyword like align, event, "." and "->" etc will be
used in CTF metadata syntax. This patch support for handling
those keywords with DPDK events name.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Common trace format(CTF) defines the metadata[1][2] for trace events,
This patch creates the metadata for the DPDK events in memory and
later this will be saved to trace directory on rte_trace_save()
invocation.
[1] https://diamon.org/ctf/#specification
[2] https://diamon.org/ctf/#examples
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Find epoch_sec, epoch_nsec and uptime_ticks time information
on eal_trace_init()/bootup to derive the time in the trace.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Define eal_trace_init() and eal_trace_fini() EAL interface
functions that rte_eal_init() and rte_eal_cleanup() function can
use to initialize and finalize the trace subsystem.
eal_trace_init() function will add the following functionality if
trace is enabled through EAL command line param.
- Test for trace registration failure.
- Test for duplicate trace name registration.
- Generate UUID ver 4.
- Create a trace directory.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
The consumers of trace API defines the tracepoint and registers
to eal. Internally these tracepoints will be stored in STAILQ
for future use. This patch implements the tracepoint
registration function.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Define the public API for trace support.
This patch also adds support for the build infrastructure and
update the MAINTAINERS file for the trace subsystem.
The 8 bytes tracepoint object is a global variable, and can be used in
fast path. Created a new __rte_trace_point section to store the
tracepoint objects as,
- It is a mostly read-only data and not to mix with other "write"
global variables.
- Chances that the same subsystem fast path variables come in the same
fast path cache line. i.e, it will enable a more predictable
performance number from build to build.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Introduce rte_thread_getname() API to get the thread name
and implement it for Linux and FreeBSD.
FreeBSD does not support getting the thread name.
One of the consumers of this API will be the trace subsystem where
it used as an informative purpose.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>