A new device feature flag, RTE_COMPDEV_FF_OP_DONE_IN_DEQUEUE
is added. A PMD should set this if the bulk of the
processing is done during the dequeue. It should leave it
cleared if the bulk of the processing is done during the
enqueue (default).
Applications can use this as a hint for tuning.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
This patch adds AES-CTR cipher algorithm support to ipsec
library.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The string compare to the length of driver name might give false
positives when there are drivers with similar names (one being the
subset of another).
Following is such a naming which could result in false positive.
1. crypto_driver
2. crypto_driver1
When strncmp with len = strlen("crypto_driver") is done, it could give
a false positive when compared against "crypto_driver1". For such cases,
'strlen + 1' is done, so that the NULL termination also would be
considered for the comparison.
Fixes: d11b0f30df88 ("cryptodev: introduce API and framework for crypto devices")
Cc: stable@dpdk.org
Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
This commit adds result field to be used when modular exponentiation or
modular multiplicative inverse operation is used
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
in 18.08 new cache-aligned structure rte_crypto_asym_op was introduced.
As it also was included into rte_crypto_op, it caused implicit change
in rte_crypto_op layout and alignment: now rte_crypto_op is cache-line
aligned has a hole of 40/104 bytes between phys_addr and sym/asym op.
It looks like unintended ABI breakage, plus such change can cause
negative performance effects:
- now status and sym[0].m_src lies on different cache-lines, so
post-process code would need extra cache-line read.
- new alignment causes grow of the space requirements and cache-line
reads/updates for structures that contain rte_crypto_op inside.
As there seems no actual need to have rte_crypto_asym_op cache-line
aligned, and rte_crypto_asym_op is not intended to be used on it's own -
the simplest fix is just to remove cache-line alignment for it.
As the immediate positive effect: on IA ipsec-secgw performance increased
by 5-10% (depending on the crypto-dev and algo used).
My guess that on machines with 128B cache-line and lookaside-protocol
capable crypto devices the impact will be even more noticeable.
Fixes: 26008aaed14c ("cryptodev: add asymmetric xform and op definitions")
Cc: stable@dpdk.org
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Do not allow creating an Ethernet device with a name over the
allowed maximum (or zero length).
This is safer than silently truncating which is what happens now.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ali Alnubani <alialnu@mellanox.com>
All-multicast is a part of receive mode configuration and it is
better to mention explicitly that it is retained across restart.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The documentation says MAC addresses array is retained and
it is logical to assume that default MAC address is retained
as well.
Also some PMDs do not allow to change the default MAC in
running state (see RTE_ETH_DEV_NOLIVE_MAC_ADDR).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Changing MTU in running state may return -EBUSY saying that
MTU cannot be changed when the port is running. It assumes
that changes may be done in stopped and started (but some
PMDs may reject it) state and it is logical to require that
changes done in any of these states are retained.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
External backends may have specific requests to handle, and so
we don't want the vhost-user lib to handle these requests as
errors.
This patch also changes the experimental API by introducing
RTE_VHOST_MSG_RESULT_NOT_HANDLED so that vhost-user lib
can report an error if a message is handled neither by
the vhost-user library nor by the external backend.
The logic changes a bit so that if the callback returns
with ERR, OK or REPLY, it is considered the message
is handled by the external backend so it won't be
handled by the vhost-user library.
It is still possible for an external backend to listen
to requests that have to be handled by the vhost-user
library like SET_MEM_TABLE, but the callback have to
return NOT_HANDLED in that case.
Vhost-crypto backend is also adapted to this API change.
Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
rte_vhost_driver_set_protocol_features API is to be used
by external backends to advertise vhost-user protocol
features it supports.
It has to be called after rte_vhost_driver_register() and
before rte_vhost_driver_start().
Example of usage to advertize VHOST_USER_PROTOCOL_F_FOOBAR
protocol feature:
const char *path = "/tmp/vhost-user";
uint64_t protocol_features;
rte_vhost_driver_register(path, 0);
rte_vhost_driver_get_protocol_features(path, &protocol_features);
protocol_features |= VHOST_USER_PROTOCOL_F_FOOBAR;
rte_vhost_driver_set_protocol_features(path, protocol_features);
rte_vhost_driver_start(path);
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The VIRTIO_RING_F_EVENT_IDX feature of split ring might
be broken, as the value of signalled_used is invalid
after live migration, start up and virtio driver reload.
This patch fixes it by using signalled_used_valid.
In addition, this patch makes the VIRTIO_RING_F_EVENT_IDX
implementation of split ring match kernel backend to suppress
more interrupts.
Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification suppression")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The vhost-user spec says that once the vring is disabled, the
client has to stop processing it. But it can happen when
dequeue zero-copy is enabled if outstanding descriptors buffers
are still being processed by an external NIC or another guest.
The fix consists in draining the zmbufs list to ensure no more
descriptors buffers are in the wild.
Note that this fix is only working in the case REPLY_ACK
protocol feature is enabled, which is not the case by default
for now (it is only enabled when IOMMU feature is enabled in
the vhost library).
Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This patch fixes rte_ethdev header file to use the correct method name,
namely to use rte_eth_dev_info_get() instead of
rte_eth_dev_infos_get().
Fixes: a4996bd89c42 ("ethdev: new Rx/Tx offloads API")
Fixes: 4f5701f28bd4 ("examples: fix RSS hash function configuration")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Don't need to use snprintf for simple name copy.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
The set_port_owner was copying a string between structures of the
same type, therefore the name could never be truncated (unless source
string was not null terminated). Use strlcpy which does it better.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Currently, rte_realloc will not respect original allocation's
NUMA node when memory cannot be resized, and there is no
NUMA-aware equivalent of rte_realloc. This patch adds such a function.
The new API will ensure that reallocated memory stays on
requested NUMA node, as well as allow moving allocated memory
to a different NUMA node.
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
EventDev i.e consumer needs to be started before starting the
event producers.
Update documentation of EventDev and EventDev adapters.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Reviewed-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Rather than using linuxapp and bsdapp everywhere, we can change things to
use the, more readable, terms "linux" and "freebsd" in our build configs.
Rather than renaming the configs we can just duplicate the existing ones
with the new names using symlinks, and use the new names exclusively
internally. ["make showconfigs" also only shows the new names to keep the
list short] The result is that backward compatibility is kept fully but any
new builds or development can be done using the newer names, i.e. both
"make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc"
work.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Rename the macro and all instances in DPDK code, but keep a copy of
the old macro defined for legacy code linking against DPDK
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Rename the macro to make things shorter and more comprehensible. For
both meson and make builds, keep the old macro around for backward
compatibility.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The term "linuxapp" is a legacy one, but just calling the subdirectory
"linux" is just clearer for all concerned.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The term "bsdapp" is a legacy one, but just calling the subdirectory
"freebsd" is just clearer for all concerned.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This patch changes modular exponentiation and modular multiplicative
inverse API comments to make it more precise.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
-l and -c options are two ways to select the cores used by DPDK.
Their format differs, but the checks on the selected cores are the same.
Use an intermediate array to separate the specific parsing checks from
the common consistency checks.
The parsing functions now concentrate on validating the passed string
and do nothing more.
We can report all invalid core indexes rather than only the first error.
In the error log message, reporting [0, cfg->lcore_count - 1] as a valid
range is then wrong when the core list is not continuous.
Example on my 8 cpus laptop with core 2 and 6 disabled.
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu6/online
Before:
./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048
EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: invalid core list, please check core numbers are in [0, 5] range
...
After:
./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048
EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: lcore 2 unavailable
EAL: lcore 6 unavailable
EAL: invalid core list, please check specified cores are part of 0-1,3-5,7
...
Fixes: d888cb8b9613 ("eal: add core list input format")
Fixes: b38693b612b4 ("eal: fix core number validation")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
We don't need to look for trailing spaces.
This is a copy/paste block from eal_parse_coremask().
Remove it and the associated comment.
Fixes: d888cb8b9613 ("eal: add core list input format")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Spawning the ctrl threads on anything that is not part of the eal
coremask is not that polite to the rest of the system, especially
when you took good care to pin your processes on cpu resources with
tools like taskset (linux) / cpuset (freebsd).
Rather than introduce yet another eal options to control on which cpu
those ctrl threads are created, let's take the startup cpu affinity
as a reference and remove the eal coremask from it.
If no cpu is left, then we default to the master core.
The cpuset is computed once at init before the original cpu affinity
is lost.
Introduced a RTE_CPU_AND macro to abstract the differences between linux
and freebsd respective macros.
Examples in a 4 cores FreeBSD vm:
$ ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
-- -i --total-num-mbufs=2048
$ procstat -S 1057
PID TID COMM TDNAME CPU CSID CPU MASK
1057 100131 testpmd - 2 1 2
1057 100140 testpmd eal-intr-thread 1 1 0-1
1057 100141 testpmd rte_mp_handle 1 1 0-1
1057 100142 testpmd lcore-slave-3 3 1 3
$ cpuset -l 1,2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
-- -i --total-num-mbufs=2048
$ procstat -S 1061
PID TID COMM TDNAME CPU CSID CPU MASK
1061 100131 testpmd - 2 2 2
1061 100144 testpmd eal-intr-thread 1 2 1
1061 100145 testpmd rte_mp_handle 1 2 1
1061 100147 testpmd lcore-slave-3 3 2 3
$ cpuset -l 2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
-- -i --total-num-mbufs=2048
$ procstat -S 1065
PID TID COMM TDNAME CPU CSID CPU MASK
1065 100131 testpmd - 2 2 2
1065 100148 testpmd eal-intr-thread 2 2 2
1065 100149 testpmd rte_mp_handle 2 2 2
1065 100150 testpmd lcore-slave-3 3 2 3
Fixes: d651ee4919cd ("eal: set affinity for control threads")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
pthread_setaffinity_np returns a >0 value on error.
We could end up letting the ctrl threads on the current process cpu
affinity.
Fixes: d651ee4919cd ("eal: set affinity for control threads")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
pthread_getaffinity_np returns a >0 value when failing.
This is mainly for the sake of correctness.
The only case where it could fail is when passing an incorrect cpuset
size wrt to the kernel.
Fixes: 2eba8d21f3c9 ("eal: restrict cores auto detection")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
The RTE_PMD_DEBUG_TRACE was only enabled for EVENTDEV_DEBUG and
that configuration is now handled by RTE_EDEV_LOG macros.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The driver already has RTE_EDEV_XXX log macros so use
them in two more places.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The rte_vhost API to put data into virtqueues operates
on mbufs and hence it is strictly vhost-net specific.
External backends need to implement virtqueue handling
from scratch and that's just not possible without APIs
to get/set vring base addresses.
Those relevant APIs are there, but they have a check that
prevents them from working with any non-vhost-net device.
This patch removes those checks.
rte_vhost_get_log_base() is not necessarily needed for
external backends, as other, higher level vhost APIs for
live migration are available and could be used instead.
We remove the extra check from it anyway for consistency.
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reclaim outstanding zmbufs first before freeing memory regions,
otherwise there could be use-after-free.
Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Don't free the zero copy mbufs before they have been consumed,
otherwise there could be use-after-free.
Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The mbufs should also be restored in free_zmbufs().
Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Fixes: 3ebd930588b7 ("vhost: fix mbuf free")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Use dependency() instead of manual append to ldflags.
Move libbsd inclusion to librte_eal, so that all other libraries and
PMDs will inherit it.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Most libraries and PMDs depend on eal, and eal depends only on kvargs,
so reorder the list in Meson to reflect this and take advantage of this
dependency chain.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Whenever possible (if the library ships a pkg-config file) use meson's
dependency() function to look for it, as it will automatically add it
to the Requires.private list if needed, to allow for static builds to
succeed for reverse dependencies of DPDK. Otherwise the recursive
dependencies are not parsed, and users doing static builds have to
resolve them manually by themselves.
When using this API avoid additional checks that are superfluous and
take extra time, and avoid adding the linker flag manually which causes
it to be duplicated.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Rather than relying on the target machine architecture, use the
size of a pointer from the compiler to determine if we are 64-bits
or not. This allows correct behaviour when you pass -m32 as a compile
option. It also allows us to use this value repeatedly throughout the
repo rather than continually testing for the sizeof(void*).
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Luca Boccassi <bluca@debian.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Since compat library is only a single header, we can easily move it into
the EAL common headers instead of tracking it separately. The downside of
this is that it becomes a little more difficult to have any libs that are
built before EAL depend on it. Thankfully, this is not a major problem as
the only library which uses rte_compat.h and is built before EAL (kvargs)
already has the path to the compat.h header file explicitly called out as
an include path.
However, to ensure that we don't hit problems later with this, we can add
EAL common headers folder to the global include list in the meson build
which means that all common headers can be safely used by all libraries, no
matter what their build order.
As a side-effect, this patch also fixes an issue with building on BSD using
meson, due to compat lib no longer needing to be listed as a dependency.
Fixes: a8499f65a1d1 ("log: add missing experimental tag")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
sprintf function is not secure as it doesn't check the length of string.
More secure function snprintf is used.
Fixes: d7280c9fffcb ("vhost: support selective datapath")
Cc: stable@dpdk.org
Signed-off-by: Pallantla Poornima <pallantlax.poornima@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
In rte_vhost_driver_unregister(), the connection fd is
removed from the fdset using fdset_try_del(). Call to
this function may fail if the corresponding fd is in
busy state, indicating that event dispatcher is
executing the read or write callback on this fd.
When it happens, rte_vhost_driver_unregister() keeps
trying to remove the fd from the set until it is no
more busy.
This situation is causing a deadlock, because
rte_vhost_driver_unregister() keeps trying to remove
the fd from the set with vhost_user.mutex held, while
the callback executed by the dispatcher,
vhost_user_read_cb(), also takes this mutex at
numerous places.
The fix consists in releasing vhost_user.mutex between
each retry in vhost_driver_unregister().
Fixes: 8b4b949144b8 ("vhost: fix dead lock on closing in server mode")
Cc: stable@dpdk.org
Signed-off-by: Wenjie Sun <findtheonlyway@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When removing the old attach function, the racy variable for getting
the last port id became unused.
Fixes: c9cce42876f5 ("ethdev: remove deprecated attach/detach functions")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Add the strlcat function to DPDK to exist alongside the strlcpy one.
While strncat is generally safe for use for concatenation, the API for the
strlcat function is perhaps a little nicer to use, and supports truncation
detection.
See commit 5364de644a4b ("eal: support strlcpy function") for more
details on the function selection logic, since we only should be using the
DPDK-provided version when no system-provided version is present.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
External message callbacks are used e.g. by vhost crypto
to parse crypto-specific vhost-user messages.
We are now publishing the API to register those callbacks,
so that other backends outside of DPDK can use them as well.
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We don't need to relay available ring and check the desc, vdpa device
can access the available ring in the guest directly. With this patch,
we can achieve better throughput and lower CPU usage.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>