Latency calculation logic is not correct for the case where
packets gets dropped before TX. As for the dropped packets,
the timestamp is not cleared, and such packets still gets
counted for latency calculation in next runs, that will result
in inaccurate latency measurement.
So fix this issue as below,
Before setting timestamp in mbuf, check mbuf don't have
any prior valid time stamp flag set and after marking
the timestamp, set mbuf flags to indicate timestamp is
valid.
Before calculating timestamp check mbuf flags are set to
indicate timestamp is valid.
With the above logic it is guaranteed that correct timestamps
have been used.
Fixes: 5cd3cac9ed ("latency: added new library for latency stats")
Cc: stable@dpdk.org
Reported-by: Bao-Long Tran <longtb5@viettel.com.vn>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Tested-by: Bao-Long Tran <longtb5@viettel.com.vn>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch fixes an issue caught with ASAN where a vdev_scan()
to a secondary bus was failing to free some memory.
The doxygen comment in EAL is fixed at the same time.
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
Fixes: 783b6e54971d ("eal: add synchronous multi-process communication")
Cc: stable@dpdk.org
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_devargs_parsef will leak memory each time it is called.
The device string must be freed.
Fixes: a23bc2c4e01b ("devargs: add non-variadic parsing function")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
eal: add shorthand __rte_weak macro
qat: update code to use __rte_weak macro
avf: update code to use __rte_weak macro
fm10k: update code to use __rte_weak macro
i40e: update code to use __rte_weak macro
ixgbe: update code to use __rte_weak macro
mlx5: update code to use __rte_weak macro
virtio: update code to use __rte_weak macro
acl: update code to use __rte_weak macro
bpf: update code to use __rte_weak macro
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The cast of hpet_msb_inc is causing a warning in some compilations.
Yet the cast is unnecessary, the function is used only one place
just use the correct signature.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
rte_init_alert already adds a newline, don't do it twice.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Packet Data Convergence Protocol (PDCP) is added in rte_security
for 3GPP TS 36.323 for LTE.
The patchset provide the structure definitions for configuring the
PDCP sessions and relevant documentation is added.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Anoob Joseph <anoob.joseph@caviumnetworks.com>
In no-shconf mode the rte_mp_request_sync() wasn't initializing
the `reply` parameter, which contained e.g. a number of sent
requests. Callers of rte_mp_request_sync() might check that
param afterwards and might read potentially unitialized memory.
The no-shconf check that makes us return early (with rc = 0) was
placed before the `reply` initialization. Fix this by making the
`reply` initialization occur first.
Fixes: 5848e3d2813c ("ipc: support --no-shconf mode")
Cc: stable@dpdk.org
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
In the doxygen description of rte_kvargs_process(), it is said:
If *kvlist* is NULL function does nothing.
It has been added by mistake here instead of rte_kvargs_free().
Anyway, null list should be correctly handled in both functions.
Comments are fixed in both functions and NULL handling is added
to rte_kvargs_process().
Fixes: c34af7424e09 ("kvargs: fix freeing behaviour for null")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Segment preallocation code allocates an array of structures on the
heap but does not free the memory afterwards. Fix it by freeing it
at the end of the function, and changing control flow to always go
through that code path.
Coverity issue: 323524
Fixes: 1dd342d0fdc4 ("mem: improve segment list preallocation")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
A crash may appear when removing some PCI devices because
dev->devargs is not always initialized. So use dev->bus instead of
dev->devargs->bus when building devargs string to remove a device.
Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Current code to preallocate segment lists is trying to do
everything in one go, and thus ends up being convoluted,
hard to understand, and, most importantly, does not scale beyond
initial assumptions about number of NUMA nodes and number of
page sizes, and therefore has issues on some configurations.
Instead of fixing these issues in the existing code, simply
rewrite it to be slightly less clever but much more logical, and
provide ample comments to explain exactly what is going on.
We cannot use the same approach for 32-bit code because the
limitations of the target dictate current socket-centric
approach rather than type-centric approach we use on 64-bit
target, so 32-bit code is left unmodified. FreeBSD doesn't
support NUMA so there's no complexity involved there, and thus
its code is much more readable and not worth changing.
Fixes: 1d406458db47 ("mem: make segment preallocation OS-specific")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Musl complains about pthread id being of wrong size, because on
musl, pthread_t is a struct pointer, not an unsigned int. Fix the
printing code by casting pthread id to unsigned pointer type and
adjusting the format specifier to be of appropriate size.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Musl wraps various string functions such as strlcpy in order to
harden them. However, the fortify wrappers are included without
including the actual string functions being wrapped, which
throws missing definition compile errors. Fix by including
string.h in string functions header.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When built against musl, fcntl.h doesn't silently get included.
Fix by including it explicitly.
Bugzilla ID: 31
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When built against musl, fcntl.h doesn't silently get included.
Fix by including it explicitly.
Bugzilla ID: 33
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When built against musl, fcntl.h doesn't silently get included.
Fix by including it explicitly.
Bugzilla ID: 34
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
We use _GNU_SOURCE all over the place, but often times we miss
defining it, resulting in broken builds on musl. Rather than
fixing every library's and driver's and application's makefile,
fix it by simply defining _GNU_SOURCE by default for all
builds.
Remove all usages of _GNU_SOURCE in source files and makefiles,
and also fixup a couple of instances of using __USE_GNU instead
of _GNU_SOURCE.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
After calling unplug function of a bus, the device is expected
to be freed. It is too late for getting devargs to remove.
Anyway, the buses which implement unplug are already freeing
the devargs, except the PCI bus.
So the call to rte_devargs_remove() is removed from EAL and
added in PCI.
Fixes: 2effa126fbd8 ("devargs: simplify parameters of removal function")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Postcopy live-migration feature requires the application to
not populate the guest memory. As the vhost library cannot
prevent the application to that (e.g. preventing the
application to call mlockall()), the feature is disabled by
default.
The application should only enable the feature if it does not
force the guest memory to be populated.
In case the user passes the RTE_VHOST_USER_POSTCOPY_SUPPORT
flag at registration but the feature was not compiled,
registration fails.
For the same reason, postcopy and dequeue zero copy features
are not compatible, so don't advertize postcopy support if
dequeue zero copy is requested.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The master sends this message before stopping handling
userfaults, so that the backend closes the userfaultfd.
The master waits for the slave to acknowledge the request
with an empty 64bits payload for synchronization purpose.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The VHOST_USER_SET_MEM_TABLE payload is copied when handled,
whereas it could directly be referenced.
This is not very important, but next, we'll need to update the
payload and send it back to Qemu.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
This patch opens a userfaultfd and sends it back to Qemu's
VHOST_USER_POSTCOPY_ADVISE request.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Postcopy live-migration features relies on userfaultfd,
which was only introduced in kernel v4.3.
This patch introduces a new define to allow building vhost
library on kernels not supporting userfaultfd.
With legacy build system, user has to explicitly set
CONFIG_RTE_LIBRTE_VHOST_POSTCOPY to 'y'.
With Meson build system, RTE_LIBRTE_VHOST_POSTCOPY gets
automatically defined if userfaultfd kernel header is
present.
Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Passing userfault fds to Qemu will be required for postcopy
live-migration feature.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This is not used for now, but will be needed for the
special handling of VHOST_USER_SET_MEM_TABLE message
once postcopy will be supported.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As soon as some ancillary data (fds) are received, it is copied
without checking its length.
This patch adds the number of fds received to the message,
which is set in read_vhost_message().
This is preliminary work to support sending fds to Qemu.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
When the memory table gets updated, the rings addresses need
to be translated again. If it fails, we need to exit cleanly
by unmapping memory regions.
Fixes: d5022533c20a ("vhost: retranslate vring addr when memory table changes")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
QEMU doesn't expect any payload for the reply of
VHOST_USER_SET_LOG_BASE request, so don't send any.
Note that the Vhost-user specification isn't clear about
it and would need to be fixed.
Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Cc: stable@dpdk.org
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
For messages that require a reply, a second ack should not be
sent when reply-ack protocol feature is negotiated, even if
the corresponding flag is set in the message.
The code is compliant with the spec but it isn't clear it is,
so this patch adds a comment to make it explicit.
Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
VHOST_USER_GET_PROTOCOL_FEATURES, VHOST_USER_GET_VRING_BASE
and VHOST_USER_SET_LOG_BASE require replies, so their handlers
should return VH_RESULT_REPLY, not VH_RESULT_OK.
Fixes: 0bff510b5ea6 ("vhost: unify message handling function signature")
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Return of message handling has now changed to an enum that can
take non-negative value that is not zero in case a reply is
needed. But the code checking the variable afterwards has not
been updated, leading to success messages handling being
treated as errors.
External post and pre callbacks return type needs also to be
changed to the new enum, so that its handling is consistent.
This is done in this patch alongside with the convertion of
its only user, vhost-crypto backend.
Fixes: 0bff510b5ea6 ("vhost: unify message handling function signature")
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As APIs in rte_vdpa.h are public, we need to add doxygen comments
to all APIs and structures.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The notification can't be disabled in packed ring when
application tries to disable notification, because the
device event flags field is overwritten by an unexpected
value. This patch fixes this issue.
Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
rte_flow actions:
- RTE_FLOW_ACTION_TYPE_SET_MAC_SRC
- RTE_FLOW_ACTION_TYPE_SET_MAC_DST
added in order to offload to NIC
The rte_flow_itme_eth must be present in rte_flow pattern
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
rewrite TTL by decrease or just set it directly
it's not necessary to check if the final result
is zero or not
This is slightly different from the one defined
by openflow and more generic
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Primary and secondary processes share a per-device private data. With
current design it is not possible to have data per-device per-process.
This is required for handling properly the CPP interface inside the NFP
PMD with multiprocess support.
There is also at least another PMD driver, tap, with similar
requirements for per-process device data.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch fixes the cryptodev library version number that was
missed updating in DPDK 18.08.
Fixes: a4493be5bdfa ("cryptodev: replace bus specific struct with generic dev")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The documentation of rte_crypto_op_pool_create indicates that
specifying RTE_CRYPTO_OP_TYPE_UNDEFINED would create a pool that
supports all operation types. This change makes the code
consistent with documentation.
Fixes: c0f87eb5252b ("cryptodev: change burst API to be crypto op oriented")
Cc: stable@dpdk.org
Signed-off-by: Junxiao Shi <git@mail1.yoursunny.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
In the devargs syntax for device representors, it is possible to add
several devices at once: -w dbdf,representor=[0-3]
It will become a more frequent case when introducing wildcards
and ranges in the new devargs syntax.
If a devargs string is provided for probing, and updated with a bigger
range for a new probing, then we do not want it to fail because
part of this range was already probed previously.
There can be new ports to create from an existing rte_device.
That's why the check for an already probed device
is moved as bus responsibility.
In the case of vdev, a global check is kept in insert_vdev(),
assuming that a vdev will always have only one port.
In the case of ifpga and vmbus, already probed devices are checked.
In the case of NXP buses, the probing is done only once (no hotplug),
though a check is added at bus level for consistency.
In the case of PCI, a driver flag is added to allow PMD probing again.
Only the PMD knows the ports attached to one rte_device.
As another consequence of being able to probe in several steps,
the field rte_device.devargs must not be considered as a full
representation of the rte_device, but only the latest probing args.
Anyway, the field rte_device.devargs is used only for probing.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>