Now that DPDK supports more than one mempool driver, and each mempool
driver works best with a specific PMD, for example:
- SW ring based mempool for Intel PMDs.
- dpaa2 HW mempool manager for the dpaa2 PMD.
- fpa HW mempool manager for the Octeontx PMD.
an application would like to know the best mempool handle for any port.
Introduce the rte_eth_dev_pool_ops_supported() API, which allows a PMD
to advertise its supported pool capability to the application.
Supported pools are categorized with the following priorities:
- Best mempool handle for this port (highest priority, '0').
- Port supports this mempool handle (priority '1').
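A minimal usage sketch follows (not part of the patch); the candidate ops
names and the exact return-value handling are illustrative assumptions based
on the priorities above:

    #include <rte_common.h>
    #include <rte_ethdev.h>

    /* Pick the first candidate ops name the port reports as usable;
     * candidate names are examples only. */
    static const char *
    select_pool_ops(uint16_t port_id)
    {
        static const char * const candidates[] = {
            "octeontx_fpavf", "dpaa2", "ring_mp_mc"
        };
        const char *fallback = NULL;
        unsigned int i;

        for (i = 0; i < RTE_DIM(candidates); i++) {
            int ret = rte_eth_dev_pool_ops_supported(port_id, candidates[i]);

            if (ret == 0)                 /* best handle for this port */
                return candidates[i];
            if (ret == 1 && fallback == NULL)
                fallback = candidates[i]; /* supported, but not the best */
        }
        return fallback != NULL ? fallback : "ring_mp_mc";
    }
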
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
DPDK has support for both SW and HW mempools, but the user is currently
limited to the ring_mp_mc pool. If the user wants to use another pool
handle, they need to update the RTE_MEMPOOL_OPS_DEFAULT config option,
then build and run with the desired pool handle.
Introduce an EAL option to override the default pool handle. The user can
now override RTE_MEMPOOL_OPS_DEFAULT by passing a pool handle to the EAL
option `--mbuf-pool-ops-name=""`.
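As an illustration only (the core list and ops name are arbitrary), an
application could pass the new option at EAL init time:

    #include <rte_common.h>
    #include <rte_eal.h>

    int
    main(int argc, char **argv)
    {
        char *eal_argv[] = {
            argv[0], "-l", "0-3",
            "--mbuf-pool-ops-name=ring_sp_sc", /* example ops name */
        };

        (void)argc;
        if (rte_eal_init(RTE_DIM(eal_argv), eal_argv) < 0)
            return -1;
        /* mbuf pools created from now on use the requested ops */
        return 0;
    }
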
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
IOVA autodetection depends on the rte_bus_scan result. The bus scan updates
the device_list, and each device in that list has its '.kdrv' state
updated. That kdrv state is used to detect the IOVA mapping mode for that
device.
_device_parse() depends on rte_bus_scan, so the following calls are moved
up in the EAL initialization order:
- eal_option_device_parse
- rte_bus_scan
The IOVA mapping mode is then selected based on the result of
rte_bus_get_iommu_class().
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Introduce the rte_eal_iova_mode() helper API. This API is used by non-EAL
libraries to detect the IOVA mode.
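A minimal sketch of how a library outside EAL might use it, assuming the
enum rte_iova_mode values introduced by this series:

    #include <rte_eal.h>

    /* Translate buffer addresses only when the EAL is not running in VA mode. */
    static int
    need_physical_addr(void)
    {
        return rte_eal_iova_mode() != RTE_IOVA_VA;
    }
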
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The rte_bus_get_iommu_class() API helps to automatically detect and select
the appropriate IOVA mapping scheme for IOMMU-capable devices on a bus.
Algorithm for IOVA scheme selection for a bus:
0. Iterate through the bus_list.
1. Collect each bus's IOVA mode value into a 'mode' variable.
2. The mode selection scheme is:
if mode == 0 then the IOVA mode is _pa,
if mode == 1 then the IOVA mode is _pa,
if mode == 2 then the IOVA mode is _va,
if mode == 3 then the IOVA mode is _pa.
So any mode != 2 falls back to the default IOVA mode (_pa).
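The selection logic can be sketched as below (not the exact implementation;
it assumes each bus exposes a get_iommu_class callback returning one of the
RTE_IOVA_* values):

    #include <sys/queue.h>
    #include <rte_bus.h>
    #include <rte_eal.h>

    static enum rte_iova_mode
    bus_select_iova_mode(void)
    {
        struct rte_bus *bus;
        unsigned int mode = RTE_IOVA_DC; /* 0 */

        TAILQ_FOREACH(bus, &rte_bus_list, next) {
            if (bus->get_iommu_class != NULL)
                mode |= bus->get_iommu_class(); /* collect each bus mode */
        }
        /* only a pure RTE_IOVA_VA (2) result selects VA, anything else is PA */
        return mode == RTE_IOVA_VA ? RTE_IOVA_VA : RTE_IOVA_PA;
    }
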
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Get the IOMMU class of PCI devices on the bus and return the preferred IOVA
mapping mode for that bus.
The patch also introduces the RTE_PCI_DRV_IOVA_AS_VA driver flag, used when
a driver needs to operate in iova=va mode.
Algorithm for IOVA scheme selection for the PCI bus:
0. If no device is bound, return the RTE_IOVA_DC mapping mode; else go to 1).
1. Look for a device attached to the vfio kdrv with .drv_flags set to
RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to a UIO class of driver.
3. Check whether vfio-noiommu mode is enabled.
If 2) and 3) are false and 1) is true, select RTE_IOVA_VA as the mapping
scheme. Otherwise use the default mapping scheme (RTE_IOVA_PA).
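The decision can be sketched as follows; the boolean inputs stand in for the
internal checks performed in steps 0-3 and are named here for illustration
only:

    #include <stdbool.h>
    #include <rte_eal.h>

    static enum rte_iova_mode
    pci_select_iova_mode(bool any_device_bound, bool has_iova_va_driver,
                         bool any_device_bound_uio, bool vfio_noiommu)
    {
        if (!any_device_bound)
            return RTE_IOVA_DC;            /* step 0 */
        if (has_iova_va_driver &&          /* step 1: vfio + RTE_PCI_DRV_IOVA_AS_VA */
            !any_device_bound_uio &&       /* step 2 */
            !vfio_noiommu)                 /* step 3 */
            return RTE_IOVA_VA;
        return RTE_IOVA_PA;                /* default mapping scheme */
    }
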
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Introduce the rte_pci_get_iommu_class() API, which gets the IOMMU class of
PCI devices on the bus and returns the preferred IOVA mapping mode for the
PCI bus.
The patch also adds an rte_pci_get_iommu_class definition for:
- bsdapp: the API returns the default IOVA mode.
- linuxapp: a stub implementation; a follow-up patch has the complete
implementation.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Export the rte_pci_match() function, as it is needed in a follow-up patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The testpmd flow command, which has used uint16_t since the beginning by
chance, is switched to portid_t for consistency.
Fixes: 14ab03825b1d ("ethdev: increase port id range")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
In order to support more than 256 virtual ports, the field "port"
in rte_mbuf has been increased to 16 bits. The initialization/reset
value of the field "port" should be changed from 0xff to 0xffff
accordingly.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Extend the port_id definition from uint8_t to uint16_t in lib and driver
data structures, specifically rte_eth_dev_data. Modify the APIs, drivers
and apps that use port_id at the same time.
Fix some checkpatch issues from the original code and remove some
unnecessary cast operations.
release_17_11 and deprecation docs have been updated in this patch.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The compilation with gcc-6.3.0 and EXTRA_CFLAGS=-Og gives the following
error:
CC rte_metrics.o
rte_metrics.c: In function ‘rte_metrics_reg_names’:
rte_metrics.c:153:22: error: ‘entry’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
entry->idx_next_set = 0;
~~~~~~~~~~~~~~~~~~~~^~~
This is a false positive, gcc is not able to see that we always go in
the loop at least once, initializing entry.
Fix the warning by initializing entry to NULL.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The compilation with gcc-6.3.0 and EXTRA_CFLAGS=-Og gives the following
error:
CC cmdline_parse.o
cmdline_parse.c: In function ‘match_inst’:
cmdline_parse.c:227:5: error: ‘token_p’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (token_p) {
^
This is a false positive, gcc is not able to see that we always go in
the loop at least once, initializing token_p.
Fix the warning by initializing token_p to NULL.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The compilation with gcc-6.3.0 and EXTRA_CFLAGS=-Og gives the following
error:
CC eal_pci_uio.o
eal_pci_uio.c: In function ‘pci_get_uio_dev’:
eal_pci_uio.c:221:9: error: ‘uio_num’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
return uio_num;
^~~~~~~
This is a false positive: gcc is not able to see that when e != NULL,
uio_num is always set. Fix the warning by initializing uio_num to -1,
and by the way, change the type to int.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Remove the unnecessary check for a new flow type in the RSS hash filter update.
Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
PMDs which expose this offload capability support an optimization for fast
release of mbufs following successful Tx.
Such optimization requires that, per queue, all mbufs come from the same
mempool and have refcnt = 1.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce a new API to configure Tx offloads.
In the new API, offloads are divided into per-port and per-queue
offloads. The PMD reports capability for each of them.
Offloads are enabled using the existing DEV_TX_OFFLOAD_* flags.
To enable per-port offload, the offload should be set on both device
configuration and queue configuration. To enable per-queue offload, the
offloads can be set only on queue configuration.
In addition, the Tx offloads are disabled by default and enabled per
application needs. This greatly simplifies PMD management of the different
offloads.
Applications should set the ETH_TXQ_FLAGS_IGNORE flag on the txq_flags
field in order to move to the new API.
The old Tx offloads API is kept for the time being, in order to enable a
smooth transition for PMDs and applications to the new API.
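A minimal sketch of opting in on the queue side (queue id, descriptor count
and offload values are illustrative; offloads must come from what the PMD
reports in rte_eth_dev_info):

    #include <rte_ethdev.h>

    static int
    tx_queue_setup_new_api(uint16_t port_id, uint64_t tx_offloads)
    {
        struct rte_eth_dev_info dev_info;
        struct rte_eth_txconf txconf;

        rte_eth_dev_info_get(port_id, &dev_info);
        txconf = dev_info.default_txconf;
        txconf.txq_flags = ETH_TXQ_FLAGS_IGNORE; /* switch to the new API */
        txconf.offloads = tx_offloads;           /* per-queue offloads */

        return rte_eth_tx_queue_setup(port_id, 0, 512,
                                      rte_eth_dev_socket_id(port_id), &txconf);
    }

Per-port Tx offloads would additionally be set in the device configuration
(txmode offloads) before rte_eth_dev_configure().
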
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce a new API to configure Rx offloads.
In the new API, offloads are divided into per-port and per-queue
offloads. The PMD reports capability for each of them.
Offloads are enabled using the existing DEV_RX_OFFLOAD_* flags.
To enable per-port offload, the offload should be set on both device
configuration and queue configuration. To enable per-queue offload, the
offloads can be set only on queue configuration.
Applications should set the ignore_offload_bitfield bit on rxmode
structure in order to move to the new API.
The old Rx offloads API is kept for the time being, in order to enable a
smooth transition for PMDs and applications to the new API.
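A minimal sketch of opting in on the port side (the offload flag is just an
example and must be advertised by the PMD):

    #include <string.h>
    #include <rte_ethdev.h>

    static int
    configure_rx_new_api(uint16_t port_id)
    {
        struct rte_eth_conf conf;

        memset(&conf, 0, sizeof(conf));
        conf.rxmode.ignore_offload_bitfield = 1;          /* switch to the new API */
        conf.rxmode.offloads = DEV_RX_OFFLOAD_IPV4_CKSUM; /* example per-port offload */

        return rte_eth_dev_configure(port_id, 1, 1, &conf);
    }
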
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The return value of the open() function is asserted against 0, while 0 is a
valid value: open() returns -1 in case of an error.
This causes problems when running as a daemon, in which case this call to
open() can return 0 as a valid descriptor.
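A sketch of the corrected check (the path is illustrative):

    #include <fcntl.h>

    static int
    open_cpu_flags_source(void)
    {
        int fd = open("/proc/self/auxv", O_RDONLY); /* illustrative path */

        if (fd < 0) /* only -1 signals an error; fd == 0 is a valid descriptor */
            return -1;
        return fd;
    }
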
Fixes: b94e5c9406b5 ("eal/arm: add CPU flags for ARMv7")
Fixes: 97523f822ba9 ("eal/arm: add CPU flags for ARMv8")
Fixes: 9ae155385686 ("eal/ppc: cpu flag checks for IBM Power")
Cc: stable@dpdk.org
Signed-off-by: Lukasz Majczak <lma@semihalf.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
When compiled on Ubuntu with extra warnings enabled, the rte_strerror()
function triggered a warning about an unused return value from
strerror_r(). Rather than always having this warning disabled, we fix it,
and in the process do some cleanup of the code to reduce the complexity of
the fix, e.g. not having the #ifdef macros inside the snprintf call.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Due to the uint32_t accesses in the hash computation, keys that aren't
aligned to a uint32_t boundary or aren't a multiple of uint32_t in length
may see accesses beyond the end of the key. Such accesses may cross a page
boundary.
Signed-off-by: Chas Williams <chas3@att.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: John McNamara <john.mcnamara@intel.com>
The inner L2 length returned by rte_net_get_ptype() is not
properly initialized. If the caller does not zero the header
lengths structure, the inner_l2 field will be undefined.
Fix it by initializing inner_l2 to 0 when parsing an inner layer.
Fixes: 2c15c5377da2 ("net: support NVGRE in software packet type parser")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
There is no reason to prevent a ring from being larger than 0x0FFFFFFF.
Increase the maximum size to 0x7FFFFFFF, which is the maximum possible
without changing the code and the structure definition (the size is stored
in a uint32_t).
Link: http://dpdk.org/ml/archives/dev/2017-September/074701.html
Suggested-by: Venkatesh Nuthula <venki497@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
When DPDK is compiled on Ubuntu with extra warnings turned on, there is a
warning about the return value from write() being unchecked. Rather than
having builds disable the warning, which may mask other cases we do care
about, we can add a dummy use of the return value in the code to silence it
in this instance.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
The ethdev ABI has been broken in release 17.08 without being bumped.
Fixes: c33ade1227a5 ("doc: notify ethdev callback process API change")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Any log messages during bus initialization may be lost because the bus
registration constructor is called before the logging constructor.
Fixes: a97725791eec ("bus: introduce bus abstraction")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
It is a reminder that the constructors without priority
get the lowest priority.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The patch simplifies the analysis of DPDK applications for developers who
use the Intel® VTune Amplifier.
Empty cycles are iterations that yield no Rx packets. Since DPDK runs in
poll mode, wasting cycles is equal to wasting CPU time. Tracing such
iterations can show that a device is underutilized. Tracing empty cycles
becomes even more important when a system uses a lot of Ethernet ports.
The patch makes it possible to analyze empty cycles without changing
application code. All that needs to be done is to reconfigure and rebuild
DPDK itself with CONFIG_RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS
enabled. The important thing here is that this does not affect DPDK code:
the profiling code is not compiled if the user does not set the config
flag.
The patch provides a common way to inject Rx queue profiling and a VTune
specific implementation.
Signed-off-by: Ilia Kurakin <ilia.kurakin@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Some compilers detect this error:
error: ‘ids[0]’ may be used uninitialized in this function
ret = rte_service_map_lcore_set(i, ids[lcore_iter], 1);
It can be reproduced very easily on Fedora 21 with
gcc-4.9.2-6.fc21.x86_64.
Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
rte_cfgfile_section_num_entries_by_index() is added to get the number of
entries of a section when multiple sections of the same name are
present.
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
lcore_states stores the state of each lcore. Fix the invalid dereference of
lcore_states with the service number.
The unit test case service_lcore_start_stop fails with the above fix: the
service core was stopped without stopping the service. This commit fixes
the test by adding negative and positive cases of stopping the service
lcore before and after stopping the service, respectively.
Fixes: 21698354c832 ("service: introduce service cores concept")
Fixes: f038a81e1c56 ("service: add unit tests")
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
This commit adds a section to the service register function to make it
clear that registering a service must not configure service cores
(e.g. adding lcores or changing mappings).
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit adds a new flag that the component (or "backend") can use to
indicate readiness. The service function callback will not be called until
the component sets itself as ready.
The use case behind adding this feature is, for example, a service that
requires configuration before it can start. Any service that emulates an
ethdev will have rte_eth_dev_configure() called, and only after that will
the service know how many queues etc. to allocate. Once that configuration
is complete, the service marks itself as ready using
rte_service_component_runstate_set().
This feature request results from prototyping services, which required a
flag in each service to note "internal" readiness. That logic is now lifted
into the service library.
The unit tests have been updated to test the component runstate.
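A sketch of the intended flow, assuming the setter takes the service id and
a 0/1 runstate (the configure hook name is hypothetical):

    #include <rte_service.h> /* component prototypes may live in a separate service header */

    /* Called by the component once e.g. rte_eth_dev_configure() has run and
     * the service knows how many queues to allocate. */
    static int
    my_component_finish_config(uint32_t service_id)
    {
        /* ... allocate queues and other component state here ... */
        return rte_service_component_runstate_set(service_id, 1); /* now ready */
    }
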
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This aligns with the service stats, which are currently
also reset on read. A generic statistics API would be
helpful for the service library in future.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit ensures that after an lcore is added, it is in the WAIT state.
Previously, adding an lcore did not ensure that the core was ready to be
relaunched, which would cause errors during lcore_start(). Now that the
lcore is ensured to be in the WAIT state by the lcore_add() function, this
is no longer an issue.
Fixes: 21698354c832 ("service: introduce service cores concept")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Prior to this commit, the return value of the following functions was the
mask value instead of a neat 0 or 1.
This commit fixes this using the "!!" trick, to force the number to 1
instead of the bitmask value itself, bringing the return value in line
with the function documentation.
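The idiom in isolation, with illustrative names:

    #include <stdint.h>

    /* "!!" collapses any non-zero bitmask result to exactly 1. */
    static int32_t
    lcore_has_service(uint64_t service_mask, uint32_t service_id)
    {
        return !!(service_mask & (UINT64_C(1) << service_id));
    }
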
Fixes: 21698354c832 ("service: introduce service cores concept")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Services can be registered and unregistered, and "holes" can appear in the
contiguous array of services if a service is unregistered. As a result, we
must never iterate up to the number of services (as counted by
rte_service_count); instead we scan the service array and check whether
each service is valid.
After this commit, the rte_service_count variable is only used for its
intended purpose: tracking the number of services that are present.
Fixes: 21698354c832 ("service: introduce service cores concept")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit fixes an issue in the service runner function, where the atomic
value was not cleared on exiting the service function. This made future
attempts to run the service look as though the function was still running,
while in reality it was deadlocked.
This commit refactors the atomic handling to be more readable,
by splitting the implementation code into a new static inline
function. The remaining flow control of atomics in the existing
function is refactored for readability.
Fixes: 21698354c832 ("service: introduce service cores concept")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit reworks the service_get_by_name() function to
accept an integer, and removes the service_get_by_id() function.
All functions now accept an integer argument representing the
service, so it is no longer required to expose the service_spec
pointers to the application.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit reworks the unregister API to accept an integer.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit reworks the statistics functions to use integer ids
for services instead of pointers. Passing UINT32_MAX to the dump
function prints all info, similar to passing NULL previously.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
This commit reworks the API to move from two separate start and stop
functions to a "runstate" API which allows setting the runstate. The
is_running API is replaced with a function to query the runstate. The
runstate functions take an id value for the service. Unit tests and the
eventdev SW PMD are updated.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>