The current mbuf scatter gather feature flag is
too ambiguous, as it is not clear if input and/or output
buffers can be scatter gather mbufs or not.
Therefore, three new flags will replace this flag:
- RTE_COMP_FF_OOP_SGL_IN_SGL_OUT
- RTE_COMP_FF_OOP_SGL_IN_LB_OUT
- RTE_COMP_FF_OOP_LB_IN_SGL_OUT
Note that out-of-place flat buffers are supported by default
and in-place is not supported by the library.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shally.verma@caviumnetworks.com>
rte_security_session_destroy should return -EINVAL if session is NULL,
but segfaults because of rte_mempool_from_obj(NULL) call.
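A minimal sketch of the guard described above. The types and helpers
(demo_pool, demo_session, demo_pool_from_obj, demo_session_destroy) are
hypothetical stand-ins, not the library's definitions; the only point is
that the NULL check has to come before any lookup that dereferences the
session.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real session and mempool types. */
struct demo_pool { int in_use; };
struct demo_session { struct demo_pool *pool; };

/* Stand-in for a lookup that dereferences its argument. */
static struct demo_pool *
demo_pool_from_obj(const struct demo_session *sess)
{
	return sess->pool;
}

/* Validate the session before touching it; return -EINVAL on NULL. */
static int
demo_session_destroy(struct demo_session *sess)
{
	struct demo_pool *pool;

	if (sess == NULL)
		return -EINVAL;

	pool = demo_pool_from_obj(sess);
	if (pool != NULL)
		pool->in_use--;
	return 0;
}
```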
Fixes: c261d1431b ("security: introduce security API and framework")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
For cryptodev dynamic logging, conditional compilation of
debug logs is not actually required.
Signed-off-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
Reviewed-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Add code to set up packed queues when enabled.
Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
Add some helper functions to check descriptor flags
and check if a vring is of type packed.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
This is an optimization to prefetch next buffer while the
current one is being processed.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
This is an optimization to prefetch next buffer while the
current one is being processed.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
To ease packed ring layout integration, this patch makes
the dequeue path re-use the buffer vectors implemented for
the enqueue path.
Doing this, copy_desc_to_mbuf() is now ring layout type
agnostic.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
Relax used ring contention by reusing the shadow used
ring feature used by the enqueue path.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
The ethdev layer introduced checks for application-requested RSS hash
functions and returns an error for ones unsupported by hardware.
This check breaks some sample applications which blindly configure
RSS hash functions without checking underlying hardware support.
Update the examples to mask out unsupported RSS hash functions during
device configuration.
Print a log if configuration values are updated by this check.
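The masking step can be sketched as below. This is an illustration
under assumptions, not the code from the examples: mask_rss_hf is a
hypothetical helper, 'requested' stands for the application's RSS hash
setting and 'supported' for the capability mask reported by the device.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Keep only the RSS hash functions the device reports as supported.
 * A log line is printed when the requested value had to be changed.
 */
static uint64_t
mask_rss_hf(uint64_t requested, uint64_t supported)
{
	uint64_t masked = requested & supported;

	if (masked != requested)
		printf("RSS hash functions adjusted: requested %#llx, using %#llx\n",
		       (unsigned long long)requested,
		       (unsigned long long)masked);
	return masked;
}
```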
Fixes: aa1a6d87f1 ("ethdev: force RSS offload rules again")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Meijuan Zhao <meijuanx.zhao@intel.com>
Tested-by: Yingya Han <yingyax.han@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
In DPDK 17.11, the ethdev offloads API has changed:
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload
As a reminder, the main concepts in the new API were:
- All offloads are disabled by default
- Distinction between per port and per queue offloads.
The transition bits are now removed:
- Translation of the old API in ethdev
- rte_eth_conf.rxmode.ignore_offload_bitfield
- ETH_TXQ_FLAGS_IGNORE
The old API bits are now removed:
- Rx per-port rte_eth_conf.rxmode.[bit-fields]
- Tx per-queue rte_eth_txconf.txq_flags
- ETH_TXQ_FLAGS_NO*
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Documents the assumption that 'xstats[i].id == i' and
key=xstats_names[i].name, value=xstats[i].value
xstats[i].id is still used for xstats _by_id() APIs.
This patch reverts some part of the commit 6d52d1d4af ("ethdev:
clarify extended statistics documentation")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
If devices always use descriptors in the same order in which they have
been made available, these devices can offer the VIRTIO_F_IN_ORDER
feature. If negotiated, this knowledge allows devices to notify the use
of a batch of buffers to the virtio driver by only writing the used ring
index.
The vhost-user device supports this feature by default. If vhost
dequeue zero copy is enabled, VIRTIO_F_IN_ORDER should be disabled, as
vhost can't assure that descriptors returned from the NIC are in order.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
DEV_RX_OFFLOAD_KEEP_CRC offload flag is added. PMDs that support
keeping CRC should advertise this offload capability.
The DEV_RX_OFFLOAD_CRC_STRIP flag will remain for one more release;
the default behavior in PMDs is to keep the CRC until this flag is
removed.
Until the DEV_RX_OFFLOAD_CRC_STRIP flag is removed:
- Setting both KEEP_CRC & CRC_STRIP is INVALID
- Setting only CRC_STRIP: the PMD should strip the CRC
- Setting only KEEP_CRC: the PMD should keep the CRC
- Setting neither: the PMD should keep the CRC
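The four cases listed above can be written as a small decision
function. This is only a sketch: the DEMO_* bit values are placeholders
chosen for illustration, and demo_crc_decide is a hypothetical name,
not the helper added by this patch.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define DEMO_RX_OFFLOAD_CRC_STRIP (UINT64_C(1) << 0)
#define DEMO_RX_OFFLOAD_KEEP_CRC  (UINT64_C(1) << 1)

enum demo_crc_action {
	DEMO_CRC_INVALID = -1,
	DEMO_CRC_STRIP = 0,
	DEMO_CRC_KEEP = 1,
};

/* Map the two Rx offload bits to the behavior listed above. */
static enum demo_crc_action
demo_crc_decide(uint64_t rx_offloads)
{
	int strip = (rx_offloads & DEMO_RX_OFFLOAD_CRC_STRIP) != 0;
	int keep = (rx_offloads & DEMO_RX_OFFLOAD_KEEP_CRC) != 0;

	if (strip && keep)
		return DEMO_CRC_INVALID;
	if (strip)
		return DEMO_CRC_STRIP;
	return DEMO_CRC_KEEP; /* only KEEP_CRC set, or neither flag set */
}
```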
A helper function rte_eth_dev_is_keep_crc() has been added to be able to
change the no flag behavior with minimal changes in PMDs.
The PMDs that don't report the DEV_RX_OFFLOAD_KEEP_CRC offload can
remove the rte_eth_dev_is_keep_crc() checks in the next release; the
related code is commented to help the maintenance task.
DEV_RX_OFFLOAD_CRC_STRIP has also been added to virtual drivers. Since
they don't use CRC at all, virtual PMDs should not return an error when
an application requests this offload.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Introduce a helper for PMDs to easily expand a flow items list with an
RSS action into multiple flow items lists with priority information.
For instance a user items list being "eth / end" with rss action types
"ipv4-udp ipv6-udp end" needs to be expanded into three items lists:
- eth
- eth / ipv4 / udp
- eth / ipv6 / udp
to match the user request. Some drivers are unable to fulfil such a
request without this expansion; this API is there to help them.
Only PMDs should use this API for their internal processing; the
application will still handle a single flow.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The COUNT action has been modified and has several fields not addressable
through testpmd. In addition, as those fields cannot be defined, testpmd
provides an empty configuration, which is undefined.
Fixes: fb8fd96d42 ("ethdev: add shared counter to flow API")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The static logging macro RTE_PMD_DEBUG_TRACE is enabled with a few DEBUG
config options, including RTE_LIBRTE_ETHDEV_DEBUG.
RTE_LIBRTE_ETHDEV_DEBUG is still used for data path logging, but all
ethdev logging has switched to dynamic logging, so there is no need to
enable the static logging macro for ethdev.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Replace RTE_PMD_DEBUG_TRACE with RTE_ETHDEV_LOG.
RTE_PMD_DEBUG_TRACE is using hardcoded PMD logtype and ERR log level,
controlled by compile time flags.
RTE_ETHDEV_LOG is using dynamic ethdev_logtype.
Also a few minor cleanups, like
- use %u for unsigned values like port_id which is uint16_t
- use PRIx64 for owner_id
- Join some log lines
- Unify to not have a "." at the end of the log
- Unify log start with uppercase
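The format specifier cleanups can be illustrated with a standalone
snippet. format_owner_log and the message text are hypothetical; the
snippet only shows %u for a uint16_t value and PRIx64 for a 64-bit one,
with an uppercase start and no trailing ".".

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a log line for a 16-bit port id and a 64-bit owner id. */
static int
format_owner_log(char *buf, size_t len, uint16_t port_id, uint64_t owner_id)
{
	return snprintf(buf, len, "Port %u owner id %016" PRIx64,
			port_id, owner_id);
}
```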
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The macro is moved to the header so that logging usage in the header can
be converted.
Since it has been moved to a public header, it is renamed with the RTE
prefix: ethdev_log -> RTE_ETHDEV_LOG.
The logtype variable also needs to be added to the map file since the
logging macro is used from other libraries.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The log_cache_nb_elem was never incremented, resulting
in all dirty pages being missed during live migration.
Fixes: c16915b871 ("vhost: improve dirty pages logging performance")
Cc: stable@dpdk.org
Reported-by: Peng He <xnhp0320@icloud.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Previously, we were putting an exclusive lock to prevent secondary
processes spinning up while we are sending our messages. However,
using exclusive locks had the effect of disallowing multiple
simultaneous unrelated messages/requests being sent, which was
not the intention behind locking.
Fix it to put a shared lock on the directory. That way, we still
prevent secondary process initializations while sending data over
IPC, but allow multiple unrelated transmissions to proceed.
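The property relied on here can be sketched with flock(): several
shared locks on the same path can be held at once, whereas an exclusive
lock admits a single holder. This is an assumption-laden illustration;
take_two_shared_locks is a hypothetical helper and the path is supplied
by the caller.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Open 'path' twice and take a non-blocking shared lock on each
 * descriptor. Returns 0 if both locks are granted, -1 otherwise.
 */
static int
take_two_shared_locks(const char *path)
{
	int fd1 = open(path, O_RDONLY);
	int fd2 = open(path, O_RDONLY);
	int ret = -1;

	if (fd1 >= 0 && fd2 >= 0 &&
	    flock(fd1, LOCK_SH | LOCK_NB) == 0 &&
	    flock(fd2, LOCK_SH | LOCK_NB) == 0)
		ret = 0;

	/* Closing a descriptor releases the lock held through it. */
	if (fd1 >= 0)
		close(fd1);
	if (fd2 >= 0)
		close(fd2);
	return ret;
}
```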
Fixes: 89f1fe7e6d ("eal: lock IPC directory on init and send")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Qi Zhang <qi.z.zhang@intel.com>
In 17.08, the crypto operation was restructured,
and some reserved bytes (5) were added to have the mempool
pointer aligned to 64 bits, since the structure is expected
to be aligned to 64 bits, allowing future additions with no
ABI breakage needed.
In 18.05, a new 2-byte field was added, so the reserved bytes
were reduced to 3. However, this field was added after the first 3 bytes
of the structure, causing it to be placed at an offset of 4 bytes,
and therefore forcing the mempool pointer to be placed after 16 bytes
instead of 8 bytes, unintentionally causing the ABI breakage.
This commit fixes the breakage, by swapping the reserved bytes
and the private_data_offset field, so the latter is aligned to 2 bytes
and the offset of the mempool pointer returns to its original offset,
8 bytes.
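The offsets described above can be reproduced with three simplified
structures. This is a sketch assuming a 64-bit target with 8-byte
pointers; the struct names and the names of the first three 1-byte
fields are illustrative, and only the fields relevant to the layout are
shown.

```c
#include <stddef.h>
#include <stdint.h>

/* 17.08-style layout: three 1-byte fields, then 5 reserved bytes. */
struct op_1708 {
	uint8_t type;
	uint8_t status;
	uint8_t sess_type;
	uint8_t reserved[5];
	void *mempool;                /* offset 8 */
};

/* 18.05 layout with the breakage: the 2-byte field comes first. */
struct op_1805_broken {
	uint8_t type;
	uint8_t status;
	uint8_t sess_type;
	uint16_t private_data_offset; /* aligned up to offset 4 */
	uint8_t reserved[3];
	void *mempool;                /* pushed to offset 16 */
};

/* Fixed layout: reserved bytes first, then the 2-byte field. */
struct op_fixed {
	uint8_t type;
	uint8_t status;
	uint8_t sess_type;
	uint8_t reserved[3];
	uint16_t private_data_offset; /* lands on offset 6 */
	void *mempool;                /* back at offset 8 */
};
```

Checking offsetof() on the mempool member of each structure is a cheap
way to confirm the layout on a given compiler.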
Fixes: 54c8368466 ("cryptodev: set private data for session-less mode")
Cc: stable@dpdk.org
Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Converted the license header of the files
that still have the old full header.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
This patch renames u16 to u16_buf. u16 as a variable name causes a shadowed
declaration warning if, for example, the application also typedefs u16
(e.g. by including a header containing "typedef unsigned short u16") and
the application is built with -Wshadow.
Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Rather than copy the string, we can use a precision in the format string
given to printf.
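A minimal sketch of the technique, using the standard "%.*s" conversion
where the precision is passed as an int argument. format_prefix is a
hypothetical helper that writes into a buffer so the result can be
inspected.

```c
#include <stdio.h>

/* Write at most 'n' characters of 'src' without copying it first. */
static int
format_prefix(char *out, size_t out_len, const char *src, int n)
{
	return snprintf(out, out_len, "%.*s", n, src);
}
```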
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Rather than copy the log message, we can use a precision in the format
string given to syslog.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
The Rx adapter stop call does not guarantee that the
SW service function will not execute after the
rte_event_eth_rx_adapter_stop() call.
Add a "started" flag to prevent the adapter from executing
if stop has been called.
Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Set the internal_event_port flag when the ethdev-eventdev
packet transfer is implemented in hardware and add a check
for the flag to ignore the connection when setting up the
WRR polling sequence.
Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add an event buffer flush when the current invocation
of the Rx adapter is completed.
This patch provides lower latency in case there is a
BATCH_SIZE of events in the event buffer.
Cc: stable@dpdk.org
Suggested-by: Narender Vangati <narender.vangati@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
After dequeuing Rx packets and enqueueing them to the
temporary buffer towards eventdev, the packet Rx loop exits
if the temporary buffer is full but the current WRR position
is not saved.
Save away the current value of the WRR position, so packets
are dequeued from the correct Rx queue in the next invocation.
Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Suggested-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
The dev_id parameter to fill_event_buffer() should be 16 bit;
also rename it to eth_dev_id to avoid confusion with event device
id elsewhere in the file.
Fixes: c2189c907d ("eventdev: make ethdev port identifiers 16-bit")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
vhost_vring_call() used rte_mb(), which translates into
mfence instruction on x86.
This patch changes it to use rte_smp_mb(), which changed recently
to translate into a locked ADD instruction for performance
reasons.
The measured gain is up to 3% with the testpmd benchmarks.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Introduce a new common helper to avoid redundancy.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When a vDPA device is attached, vhost user will try to
register host notifiers to QEMU to allow notifications
to be delivered between the driver in the guest and the
vDPA device in the host directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Make sure a valid device id is found before allocating
virtio_net; if not, return directly. This avoids
allocating and freeing virtio_net when there is
no valid device id.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
PMDs should provide supported RSS hash functions via
dev_info.flow_type_rss_offloads variable.
There is a check in ethdev whether the requested RSS hash function is
supported by the PMD or not.
This check was relaxed in the previous release to not return an error
when an unsupported hash function is requested [1]; this was done to
avoid breaking applications.
Adding the error return back.
PMDs need to provide the correct list of supported hash functions and
applications need to take this information into account before
configuring RSS, otherwise they will get an error from the APIs:
rte_eth_dev_rss_hash_update()
rte_eth_dev_configure()
[1] commit af7551e2bf ("ethdev: remove error return on RSS hash check")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The error path was disabled in the previous release to let applications
be more flexible.
In this release it is enabled; applications have to obey the offload API
rules, otherwise they will get errors from the following APIs:
rte_eth_dev_configure
rte_eth_rx_queue_setup
rte_eth_tx_queue_setup
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Instead of copying batch_copy_nb_elems into the stack,
this patch uses it directly.
A small performance gain of 3% is seen when running the PVP
benchmark.
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch reworks the vhost enqueue path so that a single
code path is used for both the Rx mergeable and non-mergeable cases.
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The *_callback_* functions are not implemented,
so they are removed from the export map file.
Fixes: ed7dd94f7f ("compressdev: add basic device management")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The functions
- vfio_get_container_fd
- vfio_get_group_fd
- vfio_get_group_no
have been renamed to
- rte_vfio_get_container_fd
- rte_vfio_get_group_fd
- rte_vfio_get_group_num
The old names are removed from the map file.
Fixes: 964b2f3bfb ("vfio: export some internal functions")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Deprecate rte_eal_mbuf_default_mempool_ops(), it shall be replaced by
rte_mbuf_best_mempool_ops().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Currently, memzone allocations with length set to 0 that are also
IOVA-contiguous are not supported. Document this limitation.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Mem event and validator callbacks may not be supported under all
circumstances (such as when running in legacy memory mode, or on
FreeBSD), and this case needs to be handled by any code that will
use these callbacks. Spell this out more clearly, because it's not
immediately obvious that this is an expected use case.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
The rte_tm_get_number_of_leaf_nodes() API function was added in DPDK
17.08. However, it was added to the .map file with the wrong function
name (rte_tm_get_leaf_nodes), which was subsequently removed in commit
3e8ea3d ('lib: remove unused map symbols').
Add it back under the 17.08 release with the correct function name.
Fixes: 5d109deffa ("ethdev: add traffic management API")
Cc: stable@dpdk.org
Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This change adds extra information on the name parameter for the APIs
rte_eth_dev_get_name_by_port and rte_eth_dev_get_port_by_name.
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Like original commit mentioned below, this fix synchronizes flow rule copy
function with testpmd's own implementation following "app/testpmd: fix copy
of raw flow item (revisited)".
It addresses a crash that occurs when feeding a RAW pattern item to
rte_flow_copy(). Besides external applications, two PMDs (bonding and
failsafe) rely on this function internally.
Note the scope of this patch is limited to the RAW pattern item and has no
impact on all others.
Fixes: 972bf36106 ("ethdev: fix shallow copy of flow API RSS action")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
eth_dev_last_created_port is used to store a port id and should
be extended to 16 bits, corresponding to the ethdev port id range.
Fixes: f8244c6399 ("ethdev: increase port id range")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_event_eth_rx_adapter_create allocates eth_devices for
currently available eth devices. For newly created eth
devices a new instance for rx adapter has to be created.
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Nikhil Rao <nikhil.rao@intel.com>
Fix the call to rte_timer_reset_sync() in sw_event_timer_cb(). The
second parameter is the number of ticks, the third is the timer type.
Fixes: 6750b21bd6 ("eventdev: add default software timer adapter")
Signed-off-by: Dan Gora <dg@adax.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
rte_cryptodev_get_header_session_size() and
rte_cryptodev_get_private_session_size() functions are
targeting symmetric sessions.
With the future addition of asymmetric operations,
these functions need to be renamed from *cryptodev_*_session_*
to *cryptodev_sym_*_session_* to be symmetric specific.
The two original functions are marked as deprecated
and will be removed in 18.08, so applications can still
use the functions in 18.05.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Shally Verma <shally.verma@caviumnetworks.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Functions rte_cryptodev_queue_pair_start/stop
are not really used in any of the crypto drivers
(they all just return 0 or -ENOTSUP).
Therefore, this API can be deprecated from 18.05
and removed in 18.08.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Functions rte_cryptodev_queue_pair_attach_sym_session
and rte_cryptodev_queue_pair_detach_sym_sessions
are not really used in any of the crypto drivers
(only one driver implements it and it just returns 0).
Therefore, this API can be deprecated from 18.05
and removed in 18.08.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Add extra clarification about offset in source and
destination mbuf used in compressdev, when they
are a chain of mbufs.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Shally Verma <shally.verma@caviumnetworks.com>
Session data can only be cleared once there are no
inflight operations using the session. It is the application's
responsibility to make sure of this.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
As the private_xform data can be shared by many operations
and across queue_pairs, it would be performance impacting
for PMDs to track inflights associated with one. It makes
more sense to push the responsibility to the application to
keep track of its usage and only delete the private_xform when
there are no more ops using it.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Driver id field is not set/used anywhere,
so it should be removed from rte_compressdev structure.
Fixes: ed7dd94f7f ("compressdev: add basic device management")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Security protocol flag string was not added
when the actual flag was added.
Fixes: eadb4fa1e1 ("cryptodev: support security APIs")
Cc: stable@dpdk.org
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
Extend the description of cryptodev feature flags,
adding extra information.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Crypto capability structure contains supported
sizes for key, IV, digest, etc. on different algorithms.
These sizes can be expressed as a single value or
a range of values.
The check was broken when a size was checked against
a range with multiple values.
Also, for more clarity, the param_range_check macro
has been converted into a function.
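A sketch of such a range check written as a function. The demo_range
struct and demo_range_check are hypothetical and written for
illustration; they are not the library's definitions. An increment of 0
is assumed to denote a single supported value (min == max).

```c
#include <stdint.h>

/* Hypothetical range descriptor: a single value when increment is 0. */
struct demo_range {
	uint16_t min;
	uint16_t max;
	uint16_t increment;
};

/* Return 0 if 'size' is one of the values in 'range', -1 otherwise. */
static int
demo_range_check(uint16_t size, const struct demo_range *range)
{
	if (size < range->min || size > range->max)
		return -1;
	if (range->increment == 0)
		return 0; /* single-value range, bounds already checked */
	/* Multi-value range: size must sit on a step starting at min. */
	return ((size - range->min) % range->increment == 0) ? 0 : -1;
}
```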
Fixes: 38227c0e3a ("cryptodev: retrieve device info")
Cc: stable@dpdk.org
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
GCC 8.1 warned:
In function 'rte_eth_rx_burst':
rte_ethdev.h:3836:18: warning: conversion to 'int16_t'
{aka 'short int'} from 'uint16_t' {aka 'short unsigned int'}
may change the sign of the result [-Wsign-conversion]
int16_t nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
^
rte_ethdev.h:3844:50: warning: conversion to 'uint16_t'
{aka 'short unsigned int'} from 'int16_t' {aka 'short int'}
may change the sign of the result [-Wsign-conversion]
nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
^~~~~
rte_ethdev.h:3844:12: warning: conversion to 'int16_t'
{aka 'short int'} from 'uint16_t' {aka 'short unsigned int'}
may change the sign of the result [-Wsign-conversion]
nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
^~
rte_ethdev.h:3851:9: warning: conversion to 'uint16_t'
{aka 'short unsigned int'} from 'int16_t' {aka 'short int'}
may change the sign of the result [-Wsign-conversion]
return nb_rx;
^~~~~
The second part of the patch is solved by its own basic
block because it is inside a preprocessor conditional.
Bringing the declaration of the var to the top of the
function would require that also being given its own
preprocessor conditional, or a (void)var to avoid an
unused var warning. The basic block is no worse than
those imho.
Fixes: 467465d86d ("ethdev: add packet count parameter to Rx callback")
Fixes: 4dc294158c ("ethdev: support optional Rx and Tx callbacks")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_pktmbuf_prepend':
rte_mbuf.h:1908:17: warning: conversion from 'int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
m->data_off -= len;
^~~
m->data_off is a uint16_t
uint16_t data_off;
len (a uint16_t) is promoted to an int using -=. Do the
subtraction explicitly and cast the result to uint16_t.
The below += or -= changes are solving the same thing.
In function 'rte_pktmbuf_adj':
rte_mbuf.h:1969:17: warning: conversion from 'int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
m->data_off += len;
^~~
In function 'rte_pktmbuf_chain':
rte_mbuf.h:2082:19: warning: conversion from 'int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
head->nb_segs += tail->nb_segs;
^~~~
Also uint16_t
uint16_t nb_segs; /**< Number of segments. */
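The pattern behind these fixes can be shown in isolation. The helpers
below are hypothetical and operate on plain values rather than on an
mbuf; they only demonstrate doing the arithmetic explicitly and casting
the int result back to uint16_t.

```c
#include <stdint.h>

/* Move a 16-bit offset back by 'len' with the narrowing made explicit. */
static uint16_t
offset_prepend(uint16_t data_off, uint16_t len)
{
	/* Both operands are promoted to int; cast the result back. */
	return (uint16_t)(data_off - len);
}

/* Same pattern for an addition, as in the nb_segs case. */
static uint16_t
segs_add(uint16_t head_segs, uint16_t tail_segs)
{
	return (uint16_t)(head_segs + tail_segs);
}
```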
Fixes: 08b563ffb1 ("mbuf: replace data pointer by an offset")
Fixes: 1a60a0daa6 ("mbuf: fix segments number type increase")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_validate_tx_offload':
rte_mbuf.h:2112:19: warning: conversion to 'uint64_t'
{aka 'long unsigned int'} from 'int' may change the
sign of the result [-Wsign-conversion]
inner_l3_offset += m->outer_l2_len + m->outer_l3_len;
^~
uint64_t inner_l3_offset...
/* fields for TX offloading of tunnels */
uint64_t outer_l3_len:9; /**< Outer L3 (IP) Hdr Length. */
uint64_t outer_l2_len:7; /**< Outer L2 (MAC) Hdr Length. */
We want to do the arithmetic entirely in uint64_t
space, but with the +=, the rhs type becomes int since the
bitfields will fit in int.
Elaborate the arithmetic to be u64 = u64 + int + int, so
the type of the result is correct to be stored in the u64.
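The shape of the fix can be sketched with a cut-down structure that
holds only the two bit-fields quoted above. demo_tx_offload and
add_outer_lengths are hypothetical names; 64-bit bit-fields are assumed
to be accepted by the compiler, as they are by GCC.

```c
#include <stdint.h>

/* Hypothetical struct holding only the two bit-fields quoted above. */
struct demo_tx_offload {
	uint64_t outer_l3_len:9;
	uint64_t outer_l2_len:7;
};

static uint64_t
add_outer_lengths(uint64_t inner_l3_offset, const struct demo_tx_offload *m)
{
	/* Written as u64 = u64 + field + field rather than with +=. */
	inner_l3_offset = inner_l3_offset + m->outer_l2_len + m->outer_l3_len;
	return inner_l3_offset;
}
```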
Fixes: 4fb7e803eb ("ethdev: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_pktmbuf_linearize':
rte_mbuf.h:1873:32: warning: conversion to 'int' from 'uint32_t'
{aka 'unsigned int'} may change the sign of the result [-Wsign-conversion]
rte_mbuf.h:2166:13: note: in expansion of macro 'rte_pktmbuf_pkt_len'
copy_len = rte_pktmbuf_pkt_len(mbuf) - rte_pktmbuf_data_len(mbuf);
rte_mbuf.h:2180:51: warning: conversion to 'size_t'
{aka 'long unsigned int'} from 'int' may change the
sign of the result [-Wsign-conversion]
rte_memcpy(buffer, rte_pktmbuf_mtod(m, char *), seg_len);
^~~~~~~
The temp is consumed as a size_t. So let's make it
a size_t in the first place.
Fixes: 1feda4d8fc ("mbuf: add a function to linearize a packet")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_pktmbuf_detach':
rte_mbuf.h:1583:17: warning: conversion from 'uint32_t'
{aka 'unsigned int'} to 'uint16_t' {aka 'short unsigned int'}
may change value [-Wconversion]
m->priv_size = priv_size;
^~~~~~~~~
The temp priv_size is declared as a uint32_t. But it
only deals in uint16_t. m->priv_size is a uint16_t.
Change it to a uint16_t.
Fixes: 355e6735b3 ("mbuf: fix cloning with private mbuf data")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_ipv4_udptcp_cksum':
rte_byteorder.h:51:24: warning: conversion from 'long unsigned int' to
'uint32_t' {aka 'unsigned int'} may change value [-Wconversion]
#define rte_bswap16(x) ((uint16_t) (__builtin_constant_p(x) ? \
^
rte_byteorder.h:85:29: note: in expansion of macro 'rte_bswap16'
#define rte_be_to_cpu_16(x) rte_bswap16(x)
^~~~~~~~~~~
rte_ip.h:321:11: note: in expansion of macro 'rte_be_to_cpu_16'
l4_len = rte_be_to_cpu_16(ipv4_hdr->total_length) -
^~~~~~~~~~~~~~~~
Also with this one, it is a cast that always occurred
and is just being done explicitly, with no changes to
the generated code.
The warning stack is misleading: it points to the last
element in the macro that produced the lhs of the subtraction
above. But the only "unsigned long int" in the expression is
the result of the sizeof() on the rhs, which promotes the
subtraction result to unsigned long. So the error actually
relates to the result of the outer subtraction.
The actual error is "you are trying to put an unsigned long
into a uint32_t". We always did so, the fix is just to inform
the compiler it is intentional with an explicit cast.
Fixes: 6006818cfb ("net: new checksum functions")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_rwlock_read_lock':
rte_rwlock.h:74:12: warning: conversion to 'uint32_t'
{aka 'unsigned int'} from 'int32_t' {aka 'int'} may
change the sign of the result [-Wsign-conversion]
x, x + 1);
^
rte_rwlock.h:74:17: warning: conversion to 'uint32_t'
{aka 'unsigned int'} from 'int' may change the sign
of the result [-Wsign-conversion]
x, x + 1);
~~^~~
In function 'rte_rwlock_write_lock':
rte_rwlock.h:110:15: warning: unsigned conversion
from 'int' to 'uint32_t' {aka 'unsigned int'}
changes value from '-1' to '4294967295' [-Wsign-conversion]
0, -1);
^~
Again in this case we are making explicit the exact cast
that was always happening implicitly. The patch does not
change the generated code.
The int32_t temp "x" is required to be signed to detect
a < 0 error condition from the lock status. Afterwards,
it has always been implicitly cast to uint32_t when it
is used in the arguments to rte_atomic32_cmpset()...
gcc8.1 objects to the implicit cast now and requires us
to cast it explicitly.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
rte_memcpy.h:793:2: note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
^~~~~~~~~~~~~~~~~~~~
rte_memcpy.h:649:51: warning: conversion from 'size_t'
{aka 'long unsigned int'} to 'int' may change value [-Wconversion]
case 0x0B: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0B); break;
^
rte_memcpy.h:616:15: note: in definition of macro 'MOVEUNALIGNED_LEFT47_IMM'
tmp = len;
^~~
rte_memcpy.h:793:2: note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
^~~~~~~~~~~~~~~~~~~~
rte_memcpy.h:618:13: warning: conversion to 'size_t'
{aka 'long unsigned int'} from 'int'
may change the sign of the result [-Wsign-conversion]
tmp -= len;
^~
rte_memcpy.h:649:16: note: in expansion of macro 'MOVEUNALIGNED_LEFT47_IMM'
case 0x0B: MOVEUNALIGNED_LEFT47_IMM(dst, src, n, 0x0B); break;
^~~~~~~~~~~~~~~~~~~~~~~~
rte_memcpy.h:793:2: note: in expansion of macro 'MOVEUNALIGNED_LEFT47'
MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
^~~~~~~~~~~~~~~~~~~~
rte_memcpy.h:618:13: warning: conversion to 'size_t'
{aka 'long unsigned int'} from 'int'
may change the sign of the result [-Wsign-conversion]
tmp -= len;
^~
We can eliminate the problems by setting the type of tmp to
size_t in the first place.
Fixes: d35cc1fe6a ("eal/x86: revert select optimized memcpy at run-time")
Cc: stable@dpdk.org
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The log level is set incorrectly when ":" is used as the separator, as in
--log-type="user:debug"
This is because fnmatch() returns zero on success. Fix the check of the
fnmatch() return value.
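A small sketch of the return value convention involved. The helper name
demo_type_matches() is hypothetical; only fnmatch() itself is a real call.

```c
#include <fnmatch.h>

/* Returns 1 when name matches the glob pattern, 0 otherwise. */
static int demo_type_matches(const char *pattern, const char *name)
{
	/* fnmatch() returns 0 on a match, so "if (fnmatch(...))" is inverted */
	return fnmatch(pattern, name, 0) == 0;
}
```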
Fixes: 7f0bb634a1 ("log: add ability to match log type with globbing")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Executable bit must be set on directories for normal users to enter them.
This patch addresses the inability to start DPDK applications as non-root
due to errors such as:
EAL: failed to bind /tmp/dpdk/rte/mp_socket: Permission denied
Fixes: 56236363b4 ("eal: add directory for runtime data")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This patch fixes the clang compile issue reported on ARM64
build hosts. ev is a pointer, 64 bits in size, but the size passed
here should be that of the object it points to.
lib/librte_eventdev/rte_event_crypto_adapter.c:530:49: error:
'rte_memcpy' call operates on objects of type 'struct rte_event'
while the size is based on a different type
'struct rte_event *' [-Werror,-Wsizeof-pointer-memaccess]
rte_memcpy(ev, &m_data->response_info, sizeof(ev));
lib/librte_eventdev/rte_event_crypto_adapter.c:530:49:
note: did you mean to dereference the argument to 'sizeof' (and multiply
it by the number of elements)?
rte_memcpy(ev, &m_data->response_info, sizeof(ev));
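The difference can be sketched as follows, with a hypothetical
struct demo_event standing in for struct rte_event and plain memcpy()
standing in for rte_memcpy().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte event, larger than a pointer. */
struct demo_event {
	uint64_t word;
	uint64_t payload;
};

static void demo_copy_event(struct demo_event *ev, const struct demo_event *src)
{
	/* sizeof(ev) is the pointer size; sizeof(*ev) is the object size */
	memcpy(ev, src, sizeof(*ev));
}
```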
Fixes: 7901eac340 ("eventdev: add crypto adapter implementation")
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
GCC 8.1 produces a warning:
rte_ethdev.h: In function 'rte_eth_rx_queue_count':
rte_ethdev.h:3882:10: warning: conversion to 'int' from 'uint32_t'
{aka 'unsigned int'} may change the sign of the result [-Wsign-conversion]
return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 33cf6be04d ("ethdev: add sanity checks to functions")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
GCC 8.1 warned:
In function 'rte_ipv6_phdr_cksum':
rte_ip.h:378:18: warning: conversion to 'uint32_t' {aka 'unsigned int'}
from 'int' may change the sign of the result [-Wsign-conversion]
psd_hdr.proto = (ipv6_hdr->proto << 24);
Fixes: 6006818cfb ("net: new checksum functions")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
GCC 8.1 warned:
In function 'rte_raw_cksum_mbuf':
rte_ip.h:225:22: warning: conversion from 'uint32_t'
{aka 'unsigned int'} to 'uint16_t' {aka 'short unsigned int'}
may change value [-Wconversion]
tmp = rte_bswap16(tmp);
^~~
In function 'rte_ipv4_cksum':
rte_ip.h:256:35: warning: conversion from 'int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
return (cksum == 0xffff) ? cksum : ~cksum;
~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
rte_ip.h:332:9: warning: conversion from 'uint32_t'
{aka 'unsigned int'} to 'uint16_t' {aka 'short unsigned int'}
may change value [-Wconversion]
return cksum;
^~~~~
In function 'rte_ipv6_udptcp_cksum':
rte_ip.h:421:9: warning: conversion from 'uint32_t' {aka 'unsigned int'}
to 'uint16_t' {aka 'short unsigned int'} may change value [-Wconversion]
return cksum;
^~~~~
Fixes: 6006818cfb ("net: new checksum functions")
Fixes: 4199fdea60 ("mbuf: generic support for TCP segmentation offload")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
GCC 8.1 warned:
rte_ether.h:213:13:
warning: conversion from 'int' to 'uint8_t'
{aka 'unsigned char'} may change value [-Wconversion]
addr[0] &= ~ETHER_GROUP_ADDR;
Fixes: 7ef0072910 ("ethdev: random MAC address")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
In function 'rte_pktmbuf_detach':
rte_mbuf.h:1580:14: warning: conversion from 'long unsigned int'
to 'uint32_t' {aka 'unsigned int'} may change value [-Wconversion]
mbuf_size = sizeof(struct rte_mbuf) + priv_size;
^~~~~~
Fixes: 355e6735b3 ("mbuf: fix cloning with private mbuf data")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
GCC 8.1 warned:
rte_common.h:141:34:
warning: conversion from 'long unsigned int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
#define RTE_PTR_DIFF(ptr1, ptr2) ((uintptr_t)(ptr1) - (uintptr_t)(ptr2))
^
rte_mbuf.h:1360:13:
note: in expansion of macro 'RTE_PTR_DIFF'
*buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
Fixes: a53aa2b9f3 ("mbuf: support attaching external buffer")
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warned:
rte_common.h:384:2:
warning: conversion from 'int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
__extension__ ({ \
^~~~~~~~~~~~~
rte_mbuf.h:1204:16:
note: in expansion of macro 'RTE_MIN'
m->data_off = RTE_MIN(RTE_PKTMBUF_HEADROOM, (uint16_t)m->buf_len);
RTE_PKTMBUF_HEADROOM is typically 128, so it does not cause trouble.
Fixes: 08b563ffb1 ("mbuf: replace data pointer by an offset")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
differences to the atomic16 are signed, but the
atomic16 itself is unsigned. It needs to be
made explicit with casts.
Fixes: af75078fec ("first public release")
Fixes: a53aa2b9f3 ("mbuf: support attaching external buffer")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
GCC 8.1 warned:
"1 + value", where value is a uint16_t, causes promotion
to a signed int. The compiler complained that we are
shoving an int into a uint16_t return type with different
size and sign.
Bumping and returning value directly instead removes the
promotion and the problem.
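A reduced sketch of the idea, using the hypothetical helper demo_next()
rather than the real mbuf refcnt functions:

```c
#include <stdint.h>

static uint16_t demo_next(uint16_t value)
{
	/* "return 1 + value;" would compute an int and narrow it on return */
	++value;
	return value;
}
```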
Fixes: f20b50b946 ("mbuf: optimize refcnt update")
Fixes: a53aa2b9f3 ("mbuf: support attaching external buffer")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
GCC 8.1 warns:
rte_ring.h:350:46:
warning: conversion to 'uint32_t' {aka 'unsigned int'}
from 'int' may change the sign of the result
[-Wsign-conversion]
update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
The visible apis take unsigned int, then call a private
api taking an int, which finally calls an api taking an
unsigned int.
Convert the private API to take unsigned int, removing
five warnings similar to the one shown above.
Fixes: 0dfc98c507 ("ring: separate out head index manipulation")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
There were warnings with GCC 8.1:
In function '__rte_ring_move_prod_head':
rte_ring_generic.h:76:3:
warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
const uint32_t cons_tail = r->cons.tail;
^~~~~
In function '__rte_ring_move_cons_head':
rte_ring_generic.h:147:3:
warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
const uint32_t prod_tail = r->prod.tail;
Fixes: 0dfc98c507 ("ring: separate out head index manipulation")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
GCC 8.1 warns:
rte_byteorder.h: In function 'rte_constant_bswap16':
rte_byteorder.h:54:45: warning: conversion from
'int' to 'uint16_t' {aka 'short unsigned int'}
may change value [-Wconversion]
((((uint16_t)(v) & UINT16_C(0x00ff)) << 8) | \
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
(((uint16_t)(v) & UINT16_C(0xff00)) >> 8))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
rte_byteorder.h:126:9: note: in expansion of macro
'RTE_STATIC_BSWAP16'
return RTE_STATIC_BSWAP16(x);
^~~~~~~~~~~~~~~~~~
The other two sizes are affected in the same way,
so they get the same fix.
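The message does not spell out the fix. One way to silence this kind of
warning, shown here as an assumption on the hypothetical macro
DEMO_BSWAP16, is to cast the whole expression back to uint16_t, since the
shifts and the OR are evaluated as int:

```c
#include <stdint.h>

/* The operands are promoted to int; the outer cast narrows the result. */
#define DEMO_BSWAP16(v) \
	((uint16_t)((((uint16_t)(v) & UINT16_C(0x00ff)) << 8) | \
		    (((uint16_t)(v) & UINT16_C(0xff00)) >> 8)))
```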
Fixes: b75667ef9f ("eal: add static endianness conversion macros")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
GCC 8.1 warns:
In function 'rte_srand':
rte_random.h:34:10:
warning: conversion to 'long int' from 'long unsigned int'
may change the sign of the result [-Wsign-conversion]
srand48((long unsigned int)seedval);
rte_random.h:51:8:
warning: conversion to 'uint64_t' {aka 'long unsigned int'}
from 'long int' may change the sign of the result
[-Wsign-conversion]
val = lrand48();
^~~~~~~
rte_random.h:53:6:
warning: conversion to 'long unsigned int' from 'long int'
may change the sign of the result [-Wsign-conversion]
val += lrand48();
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
GCC 8.1 warns:
rte_string_fns.h: In function 'rte_strlcpy':
rte_string_fns.h:58:9:
warning: conversion to 'size_t' {aka 'long unsigned int'} from
'int' may change the sign of the result [-Wsign-conversion]
return snprintf(dst, size, "%s", src);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 5364de644a ("eal: support strlcpy function")
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Including rte_mbuf.h in C++ triggers the following warning as C++ does not
allow implicit casting of a void *.
In file included from test.cpp:1:0:
rte_mbuf.h: In function ‘rte_mbuf_ext_shared_info*
rte_pktmbuf_ext_shinfo_init_helper(void*, uint16_t*,
rte_mbuf_extbuf_free_callback_t, void*)’:
rte_mbuf.h:1349:9: error: invalid conversion
from ‘void*’ to ‘rte_mbuf_ext_shared_info*’ [-fpermissive]
shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
^
Fixes: a53aa2b9f3 ("mbuf: support attaching external buffer")
Signed-off-by: David Marchand <david.marchand@6wind.com>
This patch caches all dirty pages logging until the used ring index
is updated.
The goal of this optimization is to fix a performance regression
introduced when the vhost library started to use atomic operations
to set bits in the shared dirty log map. While the fix was valid
as previous implementation wasn't safe against concurrent accesses,
contention was induced.
With this patch, during migration, we have:
1. Fewer atomic operations, as only a single atomic OR operation
is needed per 32 or 64 (depending on CPU) pages.
2. Fewer atomic operations, as during a burst the same page will
be marked dirty only once.
3. Fewer write memory barriers.
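The caching scheme can be sketched roughly as below. This is not the vhost
code: struct demo_log_cache, the slot count and both function names are
assumptions, and the atomic OR uses a GCC/Clang builtin.

```c
#include <stdint.h>

#define DEMO_CACHE_SLOTS 32
#define DEMO_WORD_BITS (sizeof(unsigned long) * 8)

struct demo_log_cache {
	unsigned long offset[DEMO_CACHE_SLOTS]; /* word index in the log */
	unsigned long bits[DEMO_CACHE_SLOTS];   /* pending dirty bits */
	unsigned int nb;
};

/* Record a dirty page locally; no atomic operation on the common path. */
static void demo_cache_page(struct demo_log_cache *c, unsigned long *log,
			    uint64_t page)
{
	unsigned long off = page / DEMO_WORD_BITS;
	unsigned long bit = 1UL << (page % DEMO_WORD_BITS);
	unsigned int i;

	for (i = 0; i < c->nb; i++) {
		if (c->offset[i] == off) {
			c->bits[i] |= bit; /* same word: merge the bit */
			return;
		}
	}
	if (c->nb == DEMO_CACHE_SLOTS) {
		/* cache full: fall back to a direct atomic update */
		__sync_fetch_and_or(&log[off], bit);
		return;
	}
	c->offset[c->nb] = off;
	c->bits[c->nb] = bit;
	c->nb++;
}

/* Flush once, e.g. right before the used ring index is updated. */
static void demo_cache_sync(struct demo_log_cache *c, unsigned long *log)
{
	unsigned int i;

	for (i = 0; i < c->nb; i++)
		__sync_fetch_and_or(&log[c->offset[i]], c->bits[i]);
	c->nb = 0;
}
```

Marking the same page twice costs nothing extra, and all pages that share
a log word are flushed with a single atomic OR.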
Fixes: 897f13a1f7 ("vhost: make page logging atomic")
Cc: stable@dpdk.org
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This patch enables the handling of buffers non-contiguous in
virtual address space in the vhost_crypto. Instead of using
rte_vhost_va_from_guest_pa(), the host virtual address is
converted by vhost_iova_to_vva() for wider use cases.
For copy mode, the copy length is limited to the chunk size,
next chunks VAs being fetched afterward.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the redundant descriptor move in the copy mode
of vhost crypto. Originally the redundant descriptor move will
cause the message parsing error.
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Currently, populate_virt will check if mempool is already populated.
This will cause inability to reserve multi-chunk mempools if
contiguous memory is not a hard requirement, because if allocating
all-contiguous memory fails, mempool will retry with virtual addresses
and will call populate_virt. It seems that the original code never
anticipated more than one non-physically contiguous area.
Fix it by removing the check in populate_virt(). The populate_anon() function
calls populate_virt() also, and it can be reasonably inferred that it is
expecting that virtual area is not already populated. Even though a
similar check is already in place there, also add the check that was
part of populate_virt() just in case.
Fixes: aab4f62d6c ("mempool: support no hugepage mode")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The intention of the original code was to create runtime data
directory as early as possible, however it was moved too early,
before the arguments were parsed, resulting in --file-prefix
option essentially not working.
Fix this by moving eal_create_runtime_dir() to after command
line arguments parsing.
Fixes: 56236363b4 ("eal: add directory for runtime data")
Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Fix all calls to functions in eal_filesystem to produce paths
residing inside dedicated DPDK runtime directory. Leaving DPDK
runtime config in place as 3rd-party applications within the
DPDK ecosystem might rely on this path to determine whether
DPDK is running, so moving that will be postponed to the next
release cycle.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, during runtime, DPDK will store a bunch of files here
and there (in /var/run, /tmp or in $HOME). Fix it by creating a
DPDK-specific runtime directory, under which all runtime data
will be placed. The template for creating this runtime directory
is the following:
<base path>/dpdk/<DPDK prefix>/
Where <base path> is set to either "/var/run" if run as root, or
$XDG_RUNTIME_DIR if run as non-root, with a fallback to /tmp if
$XDG_RUNTIME_DIR is not defined. So, for example, if run as root,
by default all runtime data will be stored at /var/run/dpdk/rte/.
There is no equivalent of "mkdir -p", so we will be creating the
path step by step.
Nothing uses this new path yet, changes for that will come in
next commit.
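Creating a path step by step can be sketched as below. This is a generic
illustration, not the EAL implementation; demo_mkdir_p() and the 512-byte
limit are assumptions. Mode 0700 keeps the executable bit set so that the
owner can enter the directories.

```c
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create every component of path in turn; existing directories are fine. */
static int demo_mkdir_p(const char *path, mode_t mode)
{
	char buf[512];
	char *p;

	if (path[0] == '\0' || strlen(path) >= sizeof(buf))
		return -1;
	strcpy(buf, path);

	for (p = buf + 1; *p != '\0'; p++) {
		if (*p != '/')
			continue;
		*p = '\0'; /* temporarily cut the path at this component */
		if (mkdir(buf, mode) < 0 && errno != EEXIST)
			return -1;
		*p = '/';
	}
	/* last component */
	if (mkdir(buf, mode) < 0 && errno != EEXIST)
		return -1;
	return 0;
}
```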
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Reshma Pattan <reshma.pattan@intel.com>
The original name for this path was confusing and not very
descriptive. Rename it to a more appropriate and descriptive name:
it stores data about hugepages, so name it eal_hugepage_data_path().
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Reshma Pattan <reshma.pattan@intel.com>
The define was a leftover from IVSHMEM library.
Fixes: c711ccb309 ("ivshmem: remove library and its EAL integration")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Description of rte_eth_dev_get_name_by_port() calls
port ID argument a pointer, which is misleading.
Also, output buffer minimal size is not mentioned.
These points need to be improved.
Fixes: bde516d5a8 ("ethdev: get port by name")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Relax the check for queue setup, since some devices
may not update queue states during dev_stop.
Fixes: cac923cfea ("ethdev: support runtime queue setup")
Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
When an ethdev port is released, a destroy event is triggered to notify
the users about the released port.
A bit before the destroy event is triggered, the port becomes invalid
by changing its state to UNUSED and cleaning its data. Therefore, the
port is invalid for the destroy event callback process and the users
may get wrong information about the port.
Move the destroy event emitting to be called before the port
invalidation.
Fixes: 133b54779a ("ethdev: fix port data reset timing")
Fixes: 29aa41e36d ("ethdev: add notifications for probing and removal")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The new device was notified as soon as it was allocated.
This led to using a device which was not yet initialized.
The notification must be published after the initialization is done
by the PMD, but before the state is changed, in order to let
notified entities take ownership before general availability.
Fixes: 29aa41e36d ("ethdev: add notifications for probing and removal")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The port was set to the state ATTACHED during allocation.
The consequence was to iterate over ports which are not initialized.
The state ATTACHED is now set as the last step of probing.
The uniqueness of port name is now checked before the availability
of a port id for allocation (order reversed).
As the state is not set on allocation anymore, it is also not checked
in the function telling whether a port is allocated or not.
The name of the port is set on allocation, so it is enough as a check.
Fixes: 5588909af2 ("ethdev: add device iterator")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
When comparing the port name, there can be a race condition with
a thread allocating a new port and writing the name at the same time.
It can lead to match with a partial name by error.
The check of the port is now considered as a critical section
protected with locks.
This fix will be even more required for multi-process when the
port availability will rely only on the name, in a following patch.
Fixes: 84934303a1 ("ethdev: synchronize port allocation")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
When the state will be updated later than in allocation,
we may need to update the ownership of a port which is
still in state unused.
It will be used to take ownership of a port before it is
declared as available for other entities.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
A new hook function is added and called inside the PMDs at the end
of the device probing:
- in primary process, after allocating, init and config
- in secondary process, after attaching and local init
This new function is almost empty for now.
It will be used later to add some post-initialization processing.
For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.
Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The enum rte_eth_dev_state was not properly documented.
Its values did not appear in the doxygen output,
and may be misunderstood.
The state RTE_ETH_DEV_DEFERRED has no interest anymore
since the ownership mechanism brings a more flexible categorization.
This state could be removed later.
Fixes: d52268a8b2 ("ethdev: expose device states")
Fixes: cb894d99ec ("ethdev: add deferred intermediate device state")
Fixes: 5b7ba31148 ("ethdev: add port ownership")
Fixes: 7106edc123 ("ethdev: add devop to check removal status")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The owner id is 64-bit.
On 32-bit environment, it must be printed with PRIX64.
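For illustration, a portable way to print a 64-bit id. The function name
and the message text are hypothetical; PRIX64 comes from <inttypes.h>.

```c
#include <inttypes.h>
#include <stdio.h>

static int demo_format_owner(char *buf, size_t len, uint64_t owner_id)
{
	/* "%lX" would be wrong where long is 32-bit; PRIX64 is always right */
	return snprintf(buf, len, "owner id %016" PRIX64, owner_id);
}
```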
Fixes: 5b7ba31148 ("ethdev: add port ownership")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
This patch checks whether a requested offloading is valid or not.
Any requested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't been enabled in rte_eth_dev_configure(),
it can be enabled or disabled for an individual queue in
rte_eth_[rt]x_queue_setup().
A newly added offloading is one which hasn't been enabled in
rte_eth_dev_configure() and is requested to be enabled in
rte_eth_[rt]x_queue_setup(). It must be of per-queue type,
otherwise an error log is triggered.
The underlying PMD must be aware that the offloadings passed
to the PMD specific queue_setup() function only carry those
newly added offloadings of per-queue type.
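The queue-level check can be sketched as follows. The function and
parameter names are hypothetical; the real logic lives in the ethdev layer
and uses the device info structure.

```c
#include <stdint.h>

/*
 * port_offloads: offloads enabled at configure time (apply to all queues)
 * queue_req:     offloads requested at queue setup time
 * queue_capa:    offloads the device supports per queue
 * On success, *new_added holds what is passed down to the PMD.
 */
static int demo_queue_offloads(uint64_t port_offloads, uint64_t queue_req,
			       uint64_t queue_capa, uint64_t *new_added)
{
	/* drop what is already enabled on the whole port */
	uint64_t added = queue_req & ~port_offloads;

	/* whatever remains must be of per-queue type */
	if ((added & queue_capa) != added)
		return -1;
	*new_added = added;
	return 0;
}
```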
This patch performs the above checks in a common way in the rte_ethdev
layer to avoid duplicating them in each underlying PMD.
This patch assumes that all PMDs in 18.05-rc2 have already
been converted to the offload API defined in 17.11. It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().
At the beginning of [rt]x_queue_setup() of the underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep the same behavior as the
offload API defined in 17.11 and avoid breaking applications due to the
offload API change.
A PMD can use the fact that the input [rt]xconf->offloads only carries
the newly added per-queue offloads to do some optimization or some
code change on top of this patch.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Calling the dev_infos_get() devop directly in rte_eth_dev_configure() causes
random values in uninitialized fields because the devop doesn't reset the
dev_info structure.
Call the rte_eth_dev_info_get() API instead, which memsets the struct.
Also remove duplicated dev_infos_get existence check.
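The pattern can be sketched with hypothetical, reduced types (neither the
structure nor the function names below are the ethdev ones):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced device info structure. */
struct demo_dev_info {
	uint32_t max_rx_queues;
	uint32_t default_ring_size; /* not filled by every driver */
};

/* Hypothetical driver callback: sets only the fields it knows about. */
static void demo_devop_infos_get(struct demo_dev_info *info)
{
	info->max_rx_queues = 4;
}

/* API-level wrapper: clear the structure first, then call the driver. */
static void demo_info_get(struct demo_dev_info *info)
{
	memset(info, 0, sizeof(*info));
	demo_devop_infos_get(info);
}
```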
Fixes: 3be82f5cc5 ("ethdev: support PMD-tuned Tx/Rx parameters")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
When the vhost-user master sends memory updates using the
VHOST_USER_SET_MEM request, the user backend unmaps and then
mmaps again the memory regions in its address space.
If the ring addresses have already been translated, they need to
be translated again as they point to unmapped memory.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A bracket was misplaced in a condition check, this patch
fixes it.
Coverity issue: 277232, 277237
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In the loop to copy virtio-net header to the descriptor buffer,
destination pointer was incremented instead of the source
pointer.
Fixes: fb3815cc61 ("vhost: handle virtually non-contiguous buffers in Rx-mrg")
Fixes: 6727f5a739 ("vhost: handle virtually non-contiguous buffers in Rx")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When rte_vhost_driver_unregister() destroys the vsocket, we
should set it to NULL after freeing it, because in client mode
the conn may be added to the reconnect thread while the vsocket is
being destroyed. In one case, if qemu creates a vhostuser port as a
server with the same unix path, the reconnect thread will
reconnect to it while the vsocket is destroyed.
To fix this:
1. set vsocket to NULL after freeing it.
2. remove the reconnection from the reconnection thread at a suitable
position.
Cc: stable@dpdk.org
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When qemu closes the unix socket fd of the vhostuser as a
server, and the vhostuser port is then immediately deleted on
openvswitch, there will be a deadlock.
A thread (fdset event thread):     B thread:
1. fdset_event_dispatch            rte_vhost_driver_unregister
2. set the fd busy to 1.           lock vsocket->conn_mutex
3. vhost_user_read_cb              fdset_del waits busy changed to 0.
4. vhost peer closed, remove the
   conn from vsocket->conn_list:
   lock vsocket->conn_mutex
5. set the fd busy to 0
Fixes: 65388b43f5 ("vhost: fix fd leaks for vhost-user server mode")
Cc: stable@dpdk.org
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tx offloads will be converted to txq_flags automatically during
rte_eth_dev_info_get and rte_eth_tx_queue_info_get, so PMDs can
clean up their code and get rid of txq_flags entirely while old
applications are not impacted.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
In ip_frag_process, some IP_FRAG_LOG content is wrong.
Fixes: 4f1a8f6338 ("ip_frag: add IPv6 reassembly")
Cc: stable@dpdk.org
Signed-off-by: Li Han <han.li1@zte.com.cn>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
- add EXPERIMENTAL tag for the section in MAINTAINERS.
- add EXPERIMENTAL tag to BPF public API files.
- add attribute __rte_experimental to BPF public API declarations.
Fixes: 94972f35a0 ("bpf: add BPF loading and execution framework")
Fixes: 5dba93ae5f ("bpf: add ability to load eBPF program from ELF object file")
Fixes: a93ff62a89 ("bpf: introduce basic Rx/Tx filters")
Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Remove version tag from experimental block in linker version scripts
(.map files).
That label is not used by the linker and is informational only. It is useful
for version blocks, but for the experimental block it is not useful and
only confusing, so remove those labels.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Currently, page deallocation might fail if allocator cannot get page
fd, which will leave VA space still mapped, and will also not mark
page as free.
Fix page deallocation function to always unmap space before trying
to get rid of the page itself, and always mark page as free even if
page deallocation failed.
Fixes: a5ff05d60f ("mem: support unmapping pages at runtime")
Fixes: 1a7dc2252f ("mem: revert to using flock and add per-segment lockfiles")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Return value should be zero for success, but if unlock and unlink
have succeeded, return value was 1, which triggered failure message
in calling code.
Fixes: a5ff05d60f ("mem: support unmapping pages at runtime")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Segment index was calculated incorrectly, causing free_seg to
attempt to free segments that do not exist.
Fixes: a5ff05d60f ("mem: support unmapping pages at runtime")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
If total memory is already bigger than max memory, an underflow
will occur on subtraction. Fix it by simply stopping whenever
we already have amount of memory that is bigger than maximum.
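The guard can be sketched as follows. demo_mem_left() is a hypothetical
helper, not the function touched by the patch.

```c
#include <stdint.h>

/* Memory still allowed to be reserved; never wraps around. */
static uint64_t demo_mem_left(uint64_t total_mem, uint64_t max_mem)
{
	/* max_mem - total_mem would underflow to a huge value here */
	if (total_mem >= max_mem)
		return 0;
	return max_mem - total_mem;
}
```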
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, reserving a memzone with length set to 0 will not trigger
any memory allocations, and memzone will instead be looking through
already allocated memory only. Document this limitation.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Size of malloc heap elements include overhead, which should not
be counted as part of memzone.
Fixes: fafcc11985 ("mem: rework memzone to be allocated by malloc")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Deallocation used the wrong function, which could have resulted in
race conditions because the function does not use locks internally.
Fixes: 1403f87d4f ("malloc: enable memory hotplug support")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
When we ask to reserve virtual areas, we usually include
alignment in the mapping size, and that memory ends up
being wasted. Wasting a gigabyte of VA space while trying to
reserve one gigabyte is pretty expensive on 32-bit, so after
we're done mapping, unmap unneeded space.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Mapping size is a 64-bit integer, but mmap() will accept size_t for
size mappings. A user could request a mapping with an alignment, which
would have overflown size_t, so check if (size + alignment) will
overflow size_t.
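One way to write such a check, as a sketch with the hypothetical helper
demo_map_size():

```c
#include <stddef.h>
#include <stdint.h>

/* Compute size + align as a size_t, or fail if it does not fit. */
static int demo_map_size(uint64_t size, uint64_t align, size_t *out)
{
	/* compare before adding, so the sum itself can never wrap */
	if (size > SIZE_MAX || align > SIZE_MAX - size)
		return -1;
	*out = (size_t)(size + align);
	return 0;
}
```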
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The code aimed to pick and remember the value of
mempool ops name from EAL command line arguments does not
copy the string and remembers the pointer provided
by getopt_long() directly. The latter could be clobbered
later and result in reading wrong mbuf pool ops name
by rte_mempool library.
Typically, this flaw could be avoided by using strdup()
to remember the string value of the option.
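A minimal sketch of keeping a private copy of an option argument. The
variable and function names are hypothetical and this is not the EAL
option parsing code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#endif
#include <stdlib.h>
#include <string.h>

static char *demo_ops_name;

/* arg would typically be optarg as handed out by getopt_long() */
static int demo_set_ops_name(const char *arg)
{
	char *copy = strdup(arg);

	if (copy == NULL)
		return -1;
	free(demo_ops_name); /* drop a previously stored value, if any */
	demo_ops_name = copy;
	return 0;
}
```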
Fixes: a103a97e71 ("eal: allow user to override default mempool driver")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
In function 'crc32c_sse42_u64_mimic':
rte_hash_crc.h:402:40:
warning: conversion from 'uint64_t' {aka 'long unsigned int'}
to 'uint32_t' {aka 'unsigned int'} may change value [-Wconversion]
init_val = crc32c_sse42_u32(d.u32[0], init_val);
Fixes: 00bf774bab ("hash: add assembly implementation of CRC32 intrinsics")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
In function 'crc32c_2words':
rte_hash_crc.h:347:2:
warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
uint32_t crc, term1, term2;
Fixes: d983cf4169 ("hash: add software CRC32 implementation")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
In function 'rte_eth_tx_buffer_flush':
rte_ethdev.h:4248:55:
warning: conversion from 'int' to 'uint16_t'
{aka 'short unsigned int'} may change value [-Wconversion]
buffer->error_callback(&buffer->pkts[sent], to_send - sent,
Fixes: d6c99e62c8 ("ethdev: add buffered Tx")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
In function 'rte_try_tm':
rte_spinlock.h:82:2:
warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
int retries = RTE_RTM_MAX_RETRIES;
Fixes: ba7468997e ("spinlock: add HTM lock elision for x86")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
rte_lcore.h: In function 'rte_lcore_index':
rte_lcore.h:122:14:
warning: conversion to 'int' from 'unsigned int' may change
the sign of the result [-Wsign-conversion]
lcore_id = rte_lcore_id();
Fixes: 5583037a79 ("eal: get relative core index")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
rte_common.h:416:9:
warning: conversion to 'uint32_t' {aka 'unsigned int'} from
'int' may change the sign of the result [-Wsign-conversion]
return __builtin_ctz(v);
^~~~~~~~~~~~~~~~
The builtin is defined to return int, but we want to
return it as uint32_t. Its only defined valid return
values are positive integers or zero, which is OK for
uint32_t. So just add an explicit cast.
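A small sketch of the cast, using a stand-in function name (ctz32) rather
than the one in rte_common.h:

```c
#include <stdint.h>

/* Count trailing zero bits. __builtin_ctz() returns int, and its result
 * for a non-zero input is in the range 0..31, so the cast to uint32_t
 * cannot change the value. The result is undefined for v == 0. */
static inline uint32_t
ctz32(uint32_t v)
{
	return (uint32_t)__builtin_ctz(v);
}
```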
Fixes: 03f6bced5b ("eal: use intrinsic function")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Meson 0.46 fixed a bug where "extract_all_objects" would not recursively
extract objects not compiled from source for a target. To keep backward
compatibility, a "recursive" keyword-arg was added to make this optional.
The value is "false" by default for now, but will change to "true" in
future, so we hard-code it to "false" in our code to ensure future
compatibility.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Introduce API to install BPF based filters on ethdev RX/TX path.
Current implementation is pure SW one, based on ethdev RX/TX
callback mechanism.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add checks for:
- all instructions are valid ones
(known opcodes, correct syntax, valid reg/off/imm values, etc.)
- no unreachable instructions
- no loops
- basic stack boundaries checks
- division by zero
Still need to add checks for:
- use/return only initialized registers and stack data.
- memory boundaries violation
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Introduce rte_bpf_elf_load() function to provide ability to
load eBPF program from ELF object file.
It also adds dependency on libelf.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
librte_bpf provides a framework to load and execute eBPF bytecode
inside user-space DPDK-based applications.
It supports a basic set of features from the eBPF spec
(https://www.kernel.org/doc/Documentation/networking/filter.txt).
Not currently supported features:
- JIT
- cBPF
- tail-pointer call
- eBPF MAP
- skb
- function calls for 32-bit apps
- mbuf pointer as input parameter for 32-bit apps
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Splitting Queue Groups into UL/DL Groups in Turbo Software
Driver. They are independent for Decode/Encode.
Release note updated accordingly.
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
New test created to measure offload cost.
Changes were introduced in API, turbo software driver
and test application
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
Support for optional CRC overlap in decode processing implemented
in Turbo Software driver
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
Update Turbo Software driver for Wireless Baseband Device:
- function scaling input LLR values to specific range [-16, 16] added
- new test vectors to check device capabilities added
- release note updated accordingly
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
Added API to retrieve the device id provided the device name.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
Added structure which each PMD will fill out,
providing the capabilities of each driver
(containing mainly which compression services
it supports).
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
- Added hash algo enumeration and params in xform and rte_comp_op
- Updated compress/decompress xform to input hash algorithm
- Updated struct rte_comp_op to input hash buffer
In the capability query, the user learns which hashes are supported via
the device info comp_feature_flag. If supported, the application can set
the desired algorithm enumeration in the xform structure and pass a valid
hash buffer during enqueue_burst().
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Sunila Sahu <sunila.sahu@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
Added stream data (stream) in compression operation,
which will contain the private data from each PMD
to support stateful operations.
Also, added functions to create/free this data.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
Added private transform data (priv_xform) in compression
operation, which will contain the private data from each
PMD to support stateless operations.
Also, added functions to create/free this data.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
Added structures and enums specific to compression,
including the compression operation structure and the
different supported algorithms, checksums and compression
levels.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
Add basic functions to manage compress devices,
including driver and device allocation, and the basic
interface with compressdev PMDs.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Signed-off-by: Ashish Gupta <ashish.gupta@caviumnetworks.com>
Ethernet port ID data size has been extended to 16 bits since 17.11.
Update the Rx event adapter interface and implementation accordingly.
This commit bumps the library version to reflect the ABI change
caused by extending the ethernet port parameter in Rx adapter
functions from 8 to 16 bits.
Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
This patch adds common code for the crypto adapter to support
SW and HW based transfer mechanisms. The adapter uses an EAL
service core function for SW based packet transfer and uses
the eventdev PMD functions to configure HW based packet
transfer between the crypto device and the event device.
This patch also adds adapter to the meson build system &
updates the necessary makefile & map file.
Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
This patch defines capabilities & functions to be called
for eventdev PMDs.
Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
This patch introduces event crypto adapter APIs. It
also provides information on working model/adapter
modes & their usage. Application is expected to use
this interface to transfer packets between the crypto
device & the event device.
Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Add dedicated parameter structure for cuckoo hash. The cuckoo hash from
librte_hash uses slightly different prototype for the hash function (no
key_mask parameter, 32-bit seed and return value) that require either
of the following approaches:
1/ Function pointer conversion: gcc 8.1 warning [1], misleading [2]
2/ Union within the parameter structure: pollutes a very generic API
parameter structure with some implementation dependent detail
(i.e. key mask not available for one of the available
implementations)
3/ Using opaque pointer for hash function: same issue from 2/
4/ Different parameter structure: avoid issue from 2/; hopefully,
it won't be long before librte_hash implements the key mask feature,
so the generic API structure could be used.
[1] http://www.dpdk.org/ml/archives/dev/2018-April/094950.html
[2] http://www.dpdk.org/ml/archives/dev/2018-April/096250.html
Fixes: 5a80bf0ae6 ("table: add cuckoo hash")
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add a new API function to add more pipe configuration profiles
after initialization to the set of existing profiles specified during
the creation of the scheduler port.
This API removes the current limitation that forces the user
to define the full set of pipe profiles as part of the port parameters
while the port is being created.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
WRED thresholds can be specified in bytes if the TM leaf
node supports it. Also extend WRED thresholds to 32 bits from 16.
TM capability (port/level/queue) fields cman_wred_packet_mode_supported and
cman_wred_byte_mode_supported, when non-zero, indicate support for WRED
thresholds in packets and bytes respectively.
The packet_mode member of struct rte_tm_wred_params, when non-zero,
indicates that the min and max thresholds are specified in
packets and when zero, indicates that the min and max thresholds
are specified in bytes.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
The rte_tm_node_wfq_weight_mode_update() API function operates on
non-leaf nodes, not leaf nodes.
Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
It may be useful to pass arbitrary data to the callback (such
as device pointers), so add this to the mem event callback API.
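The pattern can be sketched as below. All names here (mem_event_cb_t,
mem_event_cb_register, demo_device and so on) are hypothetical stand-ins
for illustration and do not reproduce the EAL prototypes.

```c
#include <stddef.h>

enum mem_event { MEM_EVENT_ALLOC, MEM_EVENT_FREE };

/* Callback type with an extra opaque argument supplied at registration. */
typedef void (*mem_event_cb_t)(enum mem_event ev, const void *addr,
		size_t len, void *arg);

struct mem_event_cb_entry {
	mem_event_cb_t cb;
	void *arg; /* passed back unchanged on every invocation */
};

static struct mem_event_cb_entry registered;

static void
mem_event_cb_register(mem_event_cb_t cb, void *arg)
{
	registered.cb = cb;
	registered.arg = arg;
}

static void
mem_event_notify(enum mem_event ev, const void *addr, size_t len)
{
	if (registered.cb != NULL)
		registered.cb(ev, addr, len, registered.arg);
}

/* Example consumer: a device that tracks how much memory it has mapped. */
struct demo_device {
	size_t mapped_bytes;
};

static void
demo_dev_mem_cb(enum mem_event ev, const void *addr, size_t len, void *arg)
{
	struct demo_device *dev = arg; /* recovered from the opaque pointer */

	(void)addr;
	if (ev == MEM_EVENT_ALLOC)
		dev->mapped_bytes += len;
	else
		dev->mapped_bytes -= len;
}
```

Without the extra argument, the callback would need a global variable to
find the device it belongs to.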
Suggested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When populating a mempool with the default function, if there is not
enough virtually contiguous memory for the whole mempool, it will be
populated with several chunks. A chunk of the maximum available length
is requested with:
mz = rte_memzone_reserve_aligned(..., len=0, ..., align=x)
If align is smaller than the page size, the address and the length of
the memzone may not be a multiple of the page size. This makes
rte_mempool_populate_virt() fail because it requires them to be
page-aligned. This patch fixes that.
The problem can be reproduced easily by allocating more than available
memory:
./build/app/testpmd -l 0,1 -- --total-num-mbufs=65536
...
Cause: Creation of mbuf pool for socket 0 failed: Invalid argument
After the patch, the error code is correct:
./build/app/testpmd -l 0,1 -- --total-num-mbufs=65536
...
Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
Fixes: ba0009560c ("mempool: support new allocation methods")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Currently, when deallocating pages, malloc will fixup other
elements' headers if there is not enough space to store a full
element in leftover space. This leads to race conditions because
there are some functions that check for pad size with an unlocked
heap, expecting pad size to be constant.
Fix it by being more conservative and only freeing pages when
there is enough space before and after the page to store a free
element.
Fixes: 1403f87d4f ("malloc: enable memory hotplug support")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
The pad value is not used unless element is in pad state, but it
will show up in heap dumps and may be confusing.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
After the commit below, we encounter some strange issues:
1) Deadlock as described here:
http://dpdk.org/ml/archives/dev/2018-April/099806.html
2) SIGSEGV when starting testpmd in a VM.
Since that commit changed the barrier to use dynamic memory instead
of the stack, we suspect the issues are caused by a use-after-free.
Fixes: 3d09a6e26d ("eal: fix threads block on barrier")
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reported-by: Lei Yao <lei.a.yao@intel.com>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
params is not freed if pthread_create() fails. The fix is
straight-forward.
Fixes: 3d09a6e26d ("eal: fix threads block on barrier")
Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Many sample applications fail because of the
dev_info.flow_type_rss_offloads check in rte_eth_dev_configure().
The sample applications need to be fixed/updated before
rte_eth_dev_configure() and rte_eth_dev_rss_hash_update() can return
an error. This patch keeps the error logs but no longer returns errors.
Fixes: 8863a1fbfc ("ethdev: add supported hash function check")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
When heap initializes, we need to add already allocated segments
onto the heap. However, in doing that, we never increased total
heap size. Fix it by adding segment length to total heap length
when initializing the heap.
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
At hugepage info initialization, EAL takes out a write lock on
hugetlbfs directories, and drops it after the memory init is
finished. However, in non-legacy mode, if "-m" or "--socket-mem"
switches are passed, this leads to a deadlock because EAL tries
to allocate pages (and thus take out a write lock on hugedir)
while still holding a separate hugedir write lock in EAL.
Fix it by checking if write lock in hugepage info is active, and
not trying to lock the directory if the hugedir fd is valid.
Fixes: 1a7dc2252f ("mem: revert to using flock and add per-segment lockfiles")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Shahaf Shuler <shahafs@mellanox.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
The original implementation used flock() locks, but was later
switched to using fcntl() locks for page locking, because
fcntl() locks allow locking parts of a file, which is useful
for single-file segments mode, where locking the entire file
isn't as useful because we still need to grow and shrink it.
However, according to fcntl()'s Ubuntu manpage [1], semantics of
fcntl() locks have a giant oversight:
This interface follows the completely stupid semantics of System
V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all
locks associated with a file for a given process are removed
when any file descriptor for that file is closed by that process.
This semantic means that applications must be aware of any files
that a subroutine library may access.
Basically, closing *any* fd with an fcntl() lock (which we do because
we don't want to leak fd's) will drop the lock completely.
So, in this commit, we revert to using flock() locks
everywhere. However, that still leaves the problem of locking parts
of a memseg list file in single file segments mode, and we solve
it by creating a separate lock file for each page and
tracking those with flock().
We will also be removing all of this tailq business and replacing it
with a simple array - saving a few bytes is not worth the extra
hassle of dealing with pointers and potential memory allocation
failures. Also, remove the tailq lock since it is not needed - these
fd lists are per-process, and within a given process, it is always
only one thread handling access to hugetlbfs.
So, first one to allocate a segment will create a lockfile, and put
a shared lock on it. When we're shrinking the page file, we will be
trying to take out a write lock on that lockfile, which would fail if
any other process is holding onto the lockfile as well. This way, we
can know if we can shrink the segment file. Also, if no other locks
are found in the lock list for a given memseg list, the memseg list
fd is automatically closed.
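The lockfile scheme can be sketched with plain flock() calls. This is an
illustration under assumptions, not the EAL code: the helper names are
hypothetical, and the probe opens a second descriptor instead of
upgrading the holder's own lock, so a caller must drop its own shared
lock before probing.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Mark a page as in use: create the lockfile if needed and take a shared
 * lock on it. Returns the fd holding the lock, or -1 on failure. */
static int
page_lock_shared(const char *lock_path)
{
	int fd = open(lock_path, O_CREAT | O_RDWR, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_SH | LOCK_NB) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Probe whether anyone still holds a lock on the page's lockfile.
 * Returns 1 if the exclusive lock could be taken (page unused),
 * 0 if another holder exists, -1 on error. */
static int
page_is_unused(const char *lock_path)
{
	int fd = open(lock_path, O_RDWR);
	int ret;

	if (fd < 0)
		return -1;

	if (flock(fd, LOCK_EX | LOCK_NB) == 0)
		ret = 1;
	else if (errno == EWOULDBLOCK)
		ret = 0;
	else
		ret = -1;

	/* flock() locks belong to the open file description, so closing this
	 * fd releases only the lock taken through it. */
	close(fd);
	return ret;
}
```

Unlike fcntl() locks, closing the probe descriptor here does not drop a
lock held through a different descriptor for the same file.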
One other thing to note is, according to flock() Ubuntu manpage [2],
upgrading the lock from shared to exclusive is implemented by dropping
and reacquiring the lock, which is not atomic and thus would have
created race conditions. So, on attempting to perform operations in
hugetlbfs, we will take out a writelock on hugetlbfs directory, so
that only one process could perform hugetlbfs operations concurrently.
[1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html
[2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Fixes: 582bed1e1d ("mem: support mapping hugepages at runtime")
Fixes: a5ff05d60f ("mem: support unmapping pages at runtime")
Fixes: 2a04139f66 ("eal: add single file segments option")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Currently, memseg lists for secondary process are allocated on
sync (triggered by init), when they are accessed for the first
time. Move this initialization to a separate init stage for
memalloc.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
For non-legacy mode, we are preallocating space for hugepages, so
we know in advance which pages we will be able to allocate, and
which we won't. However, the init procedure was using hugepage
counts gathered from sysfs and paid no attention to hugepage
sizes that were actually available for reservation, and failed
on attempts to reserve unavailable pages.
Fix this by limiting total page counts by number of pages
actually preallocated.
Also, VA preallocate procedure only looks at mountpoints that are
available, and expects pages to exist if a mountpoint exists. That
might not necessarily be the case, so also check if there are
hugepages available for a particular page size on a particular
NUMA node.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
Previously, if we couldn't preallocate VA space on 32-bit for
one page size, we simply bailed out, even though we could've
tried allocating VA space with other page sizes.
For example, if user had both 1G and 2M pages enabled, and
has asked DPDK to allocate memory on both sockets, DPDK
would've tried to allocate VA space for 1x1G page on both
sockets, failed and never tried again, even though it
could've allocated the same 1G of VA space for 512x2M pages.
Fix this by retrying with different page sizes if VA space
reservation failed.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
32-bit mode has an upper limit on amount of VA space it can preallocate,
but the original implementation used the wrong constant, resulting in
failure to initialize due to integer overflow. Fix it by using the
correct constant.
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
Previous code checked for both first/last elements being NULL,
but if they weren't, the expectation was that they're both
non-NULL, which will be the case under normal conditions, but
may not be the case due to heap structure corruption.
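A sketch of a stricter check, using simplified stand-in structures (the
struct layouts and heap_ends_state() are assumptions for the example):

```c
#include <stddef.h>

struct elem {
	struct elem *prev, *next;
};

struct heap {
	struct elem *first, *last;
};

/* Classify the list ends: 0 = empty heap, 1 = both ends set,
 * -1 = inconsistent (only one end is NULL), which would indicate
 * heap structure corruption. */
static int
heap_ends_state(const struct heap *h)
{
	if (h->first == NULL && h->last == NULL)
		return 0;
	if (h->first == NULL || h->last == NULL)
		return -1;
	return 1;
}
```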
Coverity issue: 272566
Fixes: bb372060da ("malloc: make heap a doubly-linked list")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Technically, while the pointer would've been invalid if msl_idx
were invalid, we wouldn't have actually attempted to access the
pointer until verifying the index. Fix it by moving array access
to after we've verified validity of the index.
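The reordering amounts to the following pattern, shown here with
hypothetical names and a made-up array size:

```c
#include <stddef.h>

#define DEMO_MAX_MEMSEG_LISTS 8 /* hypothetical limit for the example */

struct demo_memseg_list {
	size_t len;
};

static struct demo_memseg_list demo_lists[DEMO_MAX_MEMSEG_LISTS];

/* Validate the index first, and only then form the pointer into the
 * array. Returns NULL for an out-of-range index. */
static struct demo_memseg_list *
demo_get_list(int msl_idx)
{
	if (msl_idx < 0 || msl_idx >= DEMO_MAX_MEMSEG_LISTS)
		return NULL;

	return &demo_lists[msl_idx];
}
```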
Coverity issue: 272574
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
If user has specified a flag to unmap the area right after mapping it,
we were passing an already-unmapped pointer to RTE_LOG. This is not an
issue since RTE_LOG doesn't actually dereference the pointer, but fix
it anyway by moving call to RTE_LOG to before unmap.
Coverity issue: 272584
Fixes: b7cc54187e ("mem: move virtual area function in common directory")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Coverity reports these lines as having no effect. Technically, we do
want those lines to have no effect; however, they would likely have
been optimized out. Add volatile qualifiers to ensure the code has
effects.
Coverity issue: 272608
Fixes: 582bed1e1d ("mem: support mapping hugepages at runtime")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Previously, if mmap failed to map page address at requested
address, we were attempting to unmap the wrong address. Fix it
by unmapping our actual mapped address, and jump further to
avoid unmapping memory that is not allocated.
Coverity issue: 272602
Fixes: 582bed1e1d ("mem: support mapping hugepages at runtime")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Previous code had an old rebase leftover from the time when
oldpolicy was an actual int, instead of a pointer. Fix it to
do comparison with dereferencing the pointer.
Coverity issue: 272589
Fixes: 582bed1e1d ("mem: support mapping hugepages at runtime")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Normally, tailq entry should have a valid fd by the time we attempt
to map the segment. However, in case it doesn't, we're leaking fd,
so fix it.
Coverity issue: 272570
Fixes: 2a04139f66 ("eal: add single file segments option")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We close fd if we managed to find it in the list of allocated
segment lists (which should always be the case under normal
conditions), but if we didn't, the fd was leaking. Close it if
we couldn't find it in the segment list. This is not an issue
as if the segment is zero length, we're getting rid of it
anyway, so there's no harm in not storing the fd anywhere.
Coverity issue: 272568
Fixes: 2a04139f66 ("eal: add single file segments option")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
We were closing descriptor before checking if mapping has
failed, but if it did, we did a second close afterwards. Fix
it by moving closing descriptor to after we've done all error
checks.
Coverity issue: 272560
Fixes: 2a04139f66 ("eal: add single file segments option")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
resize_hugefile() returns either 0 (which indicates success) or -1
(which indicates failure). The success case was not checked correctly
when the --single-file-segments option is used.
Fixes: 2a04139f66 ("eal: add single file segments option")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Below commit introduced pthread barrier for synchronization.
But two IPC threads block on the barrier, and never wake up.
(gdb) bt
#0 futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1 futex_wait_simple (private=0, expected=0, futex_word=0x7fffffffcff4)
at ../sysdeps/nptl/futex-internal.h:135
#2 __pthread_barrier_wait (barrier=0x7fffffffcff0) at pthread_barrier_wait.c:184
#3 rte_thread_init (arg=0x7fffffffcfe0)
at ../dpdk/lib/librte_eal/common/eal_common_thread.c:160
#4 start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
#5 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Through analysis, we find the barrier defined on the stack could be the
root cause. This patch will change to use heap memory as the barrier.
Fixes: d651ee4919 ("eal: set affinity for control threads")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
This patch introduces a new way of attaching an external buffer to a mbuf.
Attaching an external buffer is quite similar to mbuf indirection in
replacing buffer addresses and length of a mbuf, but a few differences:
- When an indirect mbuf is attached, refcnt of the direct mbuf would be
2 as long as the direct mbuf itself isn't freed after the attachment.
In such cases, the buffer area of a direct mbuf must be read-only. But
external buffer has its own refcnt and it starts from 1. Unless
multiple mbufs are attached to a mbuf having an external buffer, the
external buffer is writable.
- There's no need to allocate buffer from a mempool. Any buffer can be
attached with appropriate free callback.
- Smaller metadata is required to maintain shared data such as refcnt.
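The reference counting rules above can be sketched with simplified
stand-in types. Everything in this block (struct names, function names,
the non-atomic counter) is an assumption for illustration and is not the
rte_mbuf API.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*ext_free_cb_t)(void *buf_addr, void *opaque);

/* Small shared metadata kept alongside the external buffer. */
struct ext_shared_info {
	ext_free_cb_t free_cb;
	void *opaque;
	uint16_t refcnt; /* a real implementation would use an atomic */
};

struct demo_mbuf {
	void *buf_addr;
	uint16_t buf_len;
	struct ext_shared_info *shinfo;
};

/* The external buffer's refcnt starts from 1. */
static void
ext_shinfo_init(struct ext_shared_info *si, ext_free_cb_t cb, void *opaque)
{
	si->free_cb = cb;
	si->opaque = opaque;
	si->refcnt = 1;
}

/* First attachment: the mbuf takes over the initial reference. */
static void
demo_attach_extbuf(struct demo_mbuf *m, void *buf, uint16_t len,
		struct ext_shared_info *si)
{
	m->buf_addr = buf;
	m->buf_len = len;
	m->shinfo = si;
}

/* Attach another mbuf to the same external buffer. */
static void
demo_share_extbuf(struct demo_mbuf *clone, const struct demo_mbuf *m)
{
	*clone = *m;
	clone->shinfo->refcnt++;
}

/* The buffer is writable only while a single mbuf references it. */
static int
demo_extbuf_writable(const struct demo_mbuf *m)
{
	return m->shinfo->refcnt == 1;
}

/* Drop one reference; the last one triggers the user's free callback. */
static void
demo_detach_extbuf(struct demo_mbuf *m)
{
	struct ext_shared_info *si = m->shinfo;

	if (--si->refcnt == 0)
		si->free_cb(m->buf_addr, si->opaque);
	m->buf_addr = NULL;
	m->buf_len = 0;
	m->shinfo = NULL;
}

/* Example free callback: counts invocations through the opaque pointer. */
static void
count_free_cb(void *buf_addr, void *opaque)
{
	(void)buf_addr;
	(*(int *)opaque)++;
}
```

Because the buffer is released through a callback, it does not have to
come from a mempool.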
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch fixes the final condition check while moving virtqueue
descriptors.
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the missing head descriptor correction for
indirect descriptors.
Fixes: 0aee242841 ("vhost/crypto: move to safe GPA translation API")
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We should call set_features callback after setting features in virtio_net
structure, otherwise vDPA driver cannot get the right features.
Fixes: 07718b4f87 ("vhost: adapt library for selective datapath")
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This reverts commit 394313fff3.
While the patch did solve the concurrency issue, it induces more
page copies, as some clean pages are marked as dirty for
performance reasons. Moreover, as there is no more contention
when doing the logging, the rate of packets that can be processed
is higher, leading to even more pages being dirtied.
It has been reported that with more than one queue pair, and
with a relatively low packet rate (1Mpps), the live migration
never converges until the flow is stopped.
Until a better solution is found, it is better to revert to the
old behaviour, i.e. using atomic operations for dirty page
logging.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Library folder names and output library names match, with a few
exceptions including librte_ether.
This library is the network device abstraction layer; the name "ethdev"
fits better than "ether", and the library and header files are already
named ethdev.
Also, there is a rte_ether.h in the net library, which can cause confusion.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Add rte_flow_action_count action data structure to enable shared
counters across multiple flows on a single port or across multiple
flows on multiple ports within the same switch domain. Also this enables
multiple count actions to be specified in a single flow action.
This patch also modifies the existing rte_flow_query API to take the
rte_flow_action structure as an input parameter instead of the
rte_flow_action_type enumeration to allow querying a specific action
from a flow rule when multiple actions of the same type are specified.
This patch also contains updates for the bonding, failsafe and mlx5 PMDs
and testpmd application which are affected by this API change.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Introduces a new item type RTE_FLOW_ITEM_TYPE_MARK, which enables
flow patterns to match against arbitrary integer values
set by the RTE_FLOW_ACTION_TYPE_MARK action in previously matched
flows.
Add support for specification of new MARK flow item in testpmd's cli.
Update testpmd documentation to describe new MARK flow item support.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Add a jump action type, which allows a matched flow to be
redirected to the specified group. This allows physical and
logical flow table/group hierarchies to be defined through rte_flow.
This breaks ABI compatibility for the following public functions (as it
modifies the ordering of the rte_flow_action_type enumeration):
- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()
Add support for specification of new JUMP action to testpmd's flow
cli, and update the testpmd documentation to describe this new
action.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Add new flow action types and associated action data structures to
support the encapsulation and decapsulation of VXLAN and NVGRE tunnel
endpoints.
The RTE_FLOW_ACTION_TYPE_[VXLAN/NVGRE]_ENCAP action will cause the
matching flow to be encapsulated in the tunnel endpoint overlay
defined in the [vxlan/nvgre]_encap action data.
The RTE_FLOW_ACTION_TYPE_[VXLAN/NVGRE]_DECAP action will cause all
headers associated with the outermost tunnel endpoint of the specified
type to be removed from the matching flows.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Add switch domain allocate and free API to enable NET devices to
synchronise switch domain allocation.
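A minimal sketch of what such an allocate/free pair might look like is
given below. Every name in it is hypothetical, the table size is
arbitrary, and the code is not thread-safe; it only shows the idea of
handing out unique domain ids from a shared table.

```c
#include <stdint.h>

#define SKETCH_MAX_DOMAINS 8          /* arbitrary size for this sketch */
#define SKETCH_DOMAIN_INVALID 0xffffu /* hypothetical "no domain" value */

/* Shared allocation table; a real version would need locking. */
static uint8_t sketch_domain_used[SKETCH_MAX_DOMAINS];

/* Reserve the lowest free domain id. Returns 0, or -1 if none is left. */
static int
sketch_switch_domain_alloc(uint16_t *domain_id)
{
	uint16_t i;

	*domain_id = SKETCH_DOMAIN_INVALID;
	for (i = 0; i < SKETCH_MAX_DOMAINS; i++) {
		if (!sketch_domain_used[i]) {
			sketch_domain_used[i] = 1;
			*domain_id = i;
			return 0;
		}
	}
	return -1;
}

/* Release an allocated id. Returns 0, or -1 if it was not in use. */
static int
sketch_switch_domain_free(uint16_t domain_id)
{
	if (domain_id >= SKETCH_MAX_DOMAINS ||
	    !sketch_domain_used[domain_id])
		return -1;
	sketch_domain_used[domain_id] = 0;
	return 0;
}
```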
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Introduces a new structure, rte_eth_devargs, to support generic
ethdev arguments common across NET PMDs, with a new API,
rte_eth_devargs_parse, to support PMDs parsing these arguments. The
patch adds support for a representor argument passed with
the EAL -w option. The representor parameter allows the user to specify
which representor ports to initialise on a device.
The argument supports passing a single representor port, a list of
port values or a range of port values.
-w BDF,representor=1 # create representor port 1 on pci device BDF
-w BDF,representor=[1,2,5,6,10] # create representor ports in list
-w BDF,representor=[0-31] # create representor ports in range
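The three forms above could be parsed along the following lines. This
is an independent sketch and not the rte_eth_devargs_parse
implementation: sketch_parse_representor is a hypothetical helper that
takes only the text after "representor=", and allowing ranges inside a
list is an assumption of the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the value of "representor=" into ports[]. Accepts "N",
 * "[a,b,c]" and "[a-b]"; list entries may themselves be ranges.
 * Returns the number of ports stored, or -1 on malformed input or
 * when more than max ports are requested.
 */
static int
sketch_parse_representor(const char *s, uint16_t *ports, int max)
{
	int n = 0;
	int bracket = (*s == '[');

	if (bracket)
		s++;
	for (;;) {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		lo = hi = strtoul(s, &end, 10);
		s = end;
		if (bracket && *s == '-') { /* range: "lo-hi" */
			s++;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
			s = end;
		}
		if (hi < lo || hi > UINT16_MAX)
			return -1;
		for (; lo <= hi; lo++) {
			if (n == max)
				return -1;
			ports[n++] = (uint16_t)lo;
		}
		if (bracket && *s == ',') { /* next list entry */
			s++;
			continue;
		}
		break;
	}
	if (bracket) {
		if (*s != ']')
			return -1;
		s++;
	}
	return *s == '\0' ? n : -1;
}
```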
Signed-off-by: Remy Horton <remy.horton@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add new device flag to specify that an ethdev port is a port
representor. Extend rte_eth_dev_info structure to expose device flags
to the user which enables applications to discover if a port is a
representor port.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add new bus generic ethdev create/destroy APIs which are bus independent
and provide hooks for bus specific initialisation.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Introduces a new port attribute to ethdev ports which denotes the
switch domain a port belongs to. By default all ports' switch
identifiers are set to RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID. Ports
which support the concept of switch domains can be configured with
the same switch domain id.
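An application might use the attribute roughly as in the sketch below.
The structure and helper are hypothetical, the constant merely stands
in for RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID with an assumed value, and
treating two unset ports as unrelated is an interpretation made by
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the invalid-domain constant named above. */
#define SKETCH_SWITCH_DOMAIN_ID_INVALID UINT16_MAX

struct sketch_port {
	uint16_t port_id;
	uint16_t switch_domain_id; /* INVALID unless the driver sets it */
};

/* Two ports share a switch only if both carry the same valid id. */
static bool
sketch_same_switch(const struct sketch_port *a,
		   const struct sketch_port *b)
{
	if (a->switch_domain_id == SKETCH_SWITCH_DOMAIN_ID_INVALID ||
	    b->switch_domain_id == SKETCH_SWITCH_DOMAIN_ID_INVALID)
		return false;
	return a->switch_domain_id == b->switch_domain_id;
}
```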
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Add support for the following OpenFlow-defined actions:
- RTE_FLOW_ACTION_OF_POP_VLAN: pop the outer VLAN tag.
- RTE_FLOW_ACTION_OF_PUSH_VLAN: push a new VLAN tag.
- RTE_FLOW_ACTION_OF_SET_VLAN_VID: set the 802.1q VLAN id.
- RTE_FLOW_ACTION_OF_SET_VLAN_PCP: set the 802.1q priority.
- RTE_FLOW_ACTION_OF_POP_MPLS: pop the outer MPLS tag.
- RTE_FLOW_ACTION_OF_PUSH_MPLS: push a new MPLS tag.
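The effect of the VLAN push and pop actions on a frame can be sketched
as below, using the standard 802.1Q layout (TPID 0x8100 followed by a
TCI holding the 3-bit PCP and 12-bit VID). The sketch_* helpers are
hypothetical and operate on a raw byte buffer; they are not part of
rte_flow and do not cover the MPLS actions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN 12 /* destination + source MAC addresses */
#define SKETCH_TAG_LEN 4  /* 802.1Q tag: 16-bit TPID + 16-bit TCI */

/* Push a VLAN tag with the given PCP and VID. Returns 0 on failure. */
static size_t
sketch_push_vlan(uint8_t *f, size_t len, size_t cap,
		 uint8_t pcp, uint16_t vid)
{
	uint16_t tci = (uint16_t)((pcp & 0x7) << 13 | (vid & 0x0fff));

	if (len < SKETCH_MAC_LEN + 2 || cap < len + SKETCH_TAG_LEN)
		return 0;
	/* Shift the EtherType and payload to make room for the tag. */
	memmove(f + SKETCH_MAC_LEN + SKETCH_TAG_LEN, f + SKETCH_MAC_LEN,
		len - SKETCH_MAC_LEN);
	f[12] = 0x81; /* TPID 0x8100 */
	f[13] = 0x00;
	f[14] = (uint8_t)(tci >> 8);
	f[15] = (uint8_t)tci;
	return len + SKETCH_TAG_LEN;
}

/* Pop the outer VLAN tag. Returns the new length, or 0 if untagged. */
static size_t
sketch_pop_vlan(uint8_t *f, size_t len)
{
	if (len < SKETCH_MAC_LEN + SKETCH_TAG_LEN + 2 ||
	    f[12] != 0x81 || f[13] != 0x00)
		return 0;
	memmove(f + SKETCH_MAC_LEN, f + SKETCH_MAC_LEN + SKETCH_TAG_LEN,
		len - SKETCH_MAC_LEN - SKETCH_TAG_LEN);
	return len - SKETCH_TAG_LEN;
}
```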
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This patch adds new tunnel types for MPLS-in-GRE and MPLS-in-UDP.
MPLS-in-GRE protocol link:
https://tools.ietf.org/html/rfc4023
MPLS-in-UDP protocol link:
https://tools.ietf.org/html/rfc7510
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>