3054 Commits

Author SHA1 Message Date
Yongseok Koh
0b259b8e96 net/mlx4: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
f4efc0eb97 net/mlx4: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx4_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smalle
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
c18cf501a7 net/mlx5: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
dceb502942 net/mlx5: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx5_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smaller
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
207fe7ac72 net/mlx5: fix external memory registration
Secondary process is not allowed to register MR due to a restriction of
library and kernel driver.

Fixes: 7e43a32ee060 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
0203d33a10 net/mlx4: support secondary process
In order to support secondary process, a few features are required.

a) rdma-core library should allocate device resources using DPDK's
   memory allocator.

b) UAR should be remapped for secondary processes. Currently, in order
   not to use different data structure for secondary processes, PMD
   tries to reserve identical virtual address space for both primary
   and secondary processes.

c) IPC channel is necessary, which can be easily set with rte_mp APIs.
   Through the channel, Verbs command FD is delivered to the secondary
   process and the device stop/start event is also broadcast from
   primary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Dekel Peled
205d9d3cf8 doc: fix typos in testpmd user guide
Correct typing mistakes in several places.
1) "Those command will" ==> "These commands will"
2) icmp6_nd_opt_sla_eth ==> icmp6_nd_opt_tla_eth
   (change 's' to 't' for 'target' where applicable)

Fixes: 1960be7d32f8 ("app/testpmd: add VXLAN encap/decap")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2019-04-05 17:45:22 +02:00
Dekel Peled
3fdac78a2b doc: fix typos in mlx5 guide
Correct typing mistakes:
appiled ==> applied
tarffic ==> traffic

Fixes: 0280f2812284 ("doc: add mlx5 E-Switch VXLAN tunnels limitations")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Kevin Traynor
219a6e74ca doc: update LTS section
Update the LTS section to mention the branch and how LTS support ends.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-04-05 15:46:10 +02:00
Kevin Traynor
cf6b07d206 doc: note validation and timeline required for stables
If a stable branch for a specific DPDK release is to proceed,
along with needing a maintainer, there should also be commitment
from major contributors for validation of the releases.

Also, as decided in the March 27th techboard, to facilitate user
planning, a release should be designated as a stable release
no later than 1 month after it's initial master release.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-04-05 15:39:15 +02:00
David Marchand
560a3580ba doc: fix ABI check script examples
The doc examples are not aligned on the script following the
incriminated commit.

Fixes: c4a5fe3bf832 ("devtools: rework ABI checker script")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2019-04-05 10:40:56 +02:00
Rami Rosen
eeafaf40c9 doc: fix two typos in contributing guide
This patch fixes two typos in the coding style part of
DPDK contributing guide:

- The header entry should have .h file instead of .c file.
- The will->This will

Fixes: 44a6dface13b ("doc: describe how to add new components")
Cc: stable@dpdk.org

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
2019-04-05 10:40:56 +02:00
Dekel Peled
047b663a59 doc: fix links to doxygen and sphinx sites
Update broken links, replace with valid links.

Fixes: 7798f17a0d62 ("doc: add documentation guidelines")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
2019-04-05 10:40:56 +02:00
Shreyansh Jain
a2831648b3 doc: bump NXP SDK support version for dpaa2
With the change in MC firmware, minimum supported version of
the Layerscape SDK too needs to be changed.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2019-04-04 23:42:15 +02:00
Gage Eads
e75bc77f98 mempool/stack: add lock-free stack mempool handler
This commit adds support for lock-free (linked list based) stack mempool
handler.

In mempool_perf_autotest the lock-based stack outperforms the
lock-free handler for certain lcore/alloc count/free count
combinations*, however:
- For applications with preemptible pthreads, a standard (lock-based)
  stack's worst-case performance (i.e. one thread being preempted while
  holding the spinlock) is much worse than the lock-free stack's.
- Using per-thread mempool caches will largely mitigate the performance
  difference.

*Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4,
running on isolcpus cores with a tickless scheduler. The lock-based stack's
rate_persec was 0.6x-3.5x the lock-free stack's.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
2019-04-04 22:06:16 +02:00
Gage Eads
3340202f59 stack: add lock-free implementation
This commit adds support for a lock-free (linked list based) stack to the
stack API. This behavior is selected through a new rte_stack_create() flag,
RTE_STACK_F_LF.

The stack consists of a linked list of elements, each containing a data
pointer and a next pointer, and an atomic stack depth counter.

The lock-free push operation enqueues a linked list of pointers by pointing
the tail of the list to the current stack head, and using a CAS to swing
the stack head pointer to the head of the list. The operation retries if it
is unsuccessful (i.e. the list changed between reading the head and
modifying it), else it adjusts the stack length and returns.

The lock-free pop operation first reserves num elements by adjusting the
stack length, to ensure the dequeue operation will succeed without
blocking. It then dequeues pointers by walking the list -- starting from
the head -- then swinging the head pointer (using a CAS as well). While
walking the list, the data pointers are recorded in an object table.

This algorithm stack uses a 128-bit compare-and-swap instruction, which
atomically updates the stack top pointer and a modification counter, to
protect against the ABA problem.

The linked list elements themselves are maintained in a lock-free LIFO
list, and are allocated before stack pushes and freed after stack pops.
Since the stack has a fixed maximum depth, these elements do not need to be
dynamically created.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-04 22:06:16 +02:00
Gage Eads
05d3b5283c stack: introduce stack library
The rte_stack library provides an API for configuration and use of a
bounded stack of pointers. Push and pop operations are MT-safe, allowing
concurrent access, and the interface supports pushing and popping multiple
pointers at a time.

The library's interface is modeled after another DPDK data structure,
rte_ring, and its lock-based implementation is derived from the stack
mempool handler. An upcoming commit will migrate the stack mempool handler
to rte_stack.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-04 22:06:16 +02:00
Fan Zhang
3ed37e0934 doc: update supported algorithms in IPsec guide
This patch updates the ipsec library programmer's guide with
the additional algorithms which are now supported.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-04-03 13:50:58 +02:00
Fan Zhang
f4dc5af262 doc: announce cryptodev xform API change
This patch adds the deprecation notice of changing Cryptodev
symmetric xform structure. The proposed change is to making
key pointers in the crypto xforms (cipher, auth, aead) to
indicate neither the library or the drivers will not change
the content of the key buffer.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
2019-04-03 11:33:16 +02:00
Konstantin Ananyev
2ad46e6831 doc: add IPsec library in release notes
Add librte_ipsec into 'Shared Library Versions' list in the release notes.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2019-04-02 16:50:24 +02:00
Ayuj Verma
378e08eba8 crypto/openssl: set RSA private op feature flag
openssl PMD support RSA private key operation
using both qt and exp key type.
Set rsa key type feature flag

Signed-off-by: Ayuj Verma <ayverma@marvell.com>
Signed-off-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2019-04-02 16:50:24 +02:00
Ayuj Verma
398ba4c13f cryptodev: add RSA private key feature flag
Add feature flag to reflect RSA private key
operation support using quintuple (crt) or
exponent type key. if PMD support both,
then it should set both.

App should query cryptodev feature flag to check
if Sign and Decryt with CRT keys or exponent is
supported, thus call operation with relevant
key type.

Signed-off-by: Ayuj Verma <ayverma@marvell.com>
Signed-off-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2019-04-02 16:50:24 +02:00
Tomasz Jozwiak
352332744c compress/qat: add dynamic SGL allocation
This patch adds dynamic SGL allocation instead of static one.
The number of element in SGL can be adjusted in each operation
depend of the request.

Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
2019-04-02 16:50:24 +02:00
Fan Zhang
7b2d4706c9 crypto/aesni_mb: support newer library version only
As stated in 19.02 deprecation notice, this patch updates the
aesni_mb PMD to remove the support of older Intel-ipsec-mb
library version earlier than 0.52.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2019-04-02 16:50:24 +02:00
Fan Zhang
2d0c29a37a crypto/aesni_mb: enable out of place processing
Add out-of-place processing, i.e. different source and
destination m_bufs, plus related capability update, tests
and documentation.

Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2019-04-02 16:50:24 +02:00
Arek Kusztal
8245972c04 crypto/qat: add modular multiplicative inverse
This commit adds modular multiplicative inverse to Intel
QuickAssist Technology driver. For capabilities or limitations
please refer to qat.rst or qat_asym_capabilities.h.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
2019-04-02 16:50:24 +02:00
Arek Kusztal
fb70b33b05 crypto/qat: add modular exponentiation
This commit adds modular exponentiation to Intel QuickAssist
Technology driver. For capabilities or limitations please refer to
qat.rst or qat_asym_capabilities.h.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
2019-04-02 16:50:24 +02:00
Arek Kusztal
f81cbc208f crypto/qat: add asymmetric crypto PMD
This patch adds Poll Mode Driver for asymmetric crypto
functions of Intel QuickAssist Technology hardware.

It contains plain driver with no functions implemented, specific
algorithms will be introduced in separate patches.

This patch depends on a QAT PF driver for device initialization. See
the file docs/guides/cryptodevs/qat.rst for configuration details.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
2019-04-02 16:50:24 +02:00
Anoob Joseph
728762e37e doc: announce ABI change for cryptodev config
Add new field ff_disable in rte_cryptodev_config. This enables
applications to control the features enabled on the crypto device.

Proposed new layout:

/** Crypto device configuration structure */
struct rte_cryptodev_config {
    int socket_id;            /**< Socket to allocate resources on */
    uint16_t nb_queue_pairs;
    /**< Number of queue pairs to configure on device */
+   uint64_t ff_disable;
+   /**< Feature flags to be disabled. Only the following features are
+    * allowed to be disabled,
+    *  - RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO
+    *  - RTE_CRYPTODEV_FF_ASYMMETRIC_CRYPTO
+    *  - RTE_CRYTPODEV_FF_SECURITY
+    */
};

For eth devices, rte_eth_conf.rx_mode.offloads and
rte_eth_conf.tx_mode.offloads fields are used by applications to
control the offloads enabled on the eth device. This proposal adds a
similar ability for the crypto device.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2019-04-02 16:50:24 +02:00
Pavan Nikhilesh
f0959283ed app/eventdev: add option for global dequeue timeout
Add option to provide a global dequeue timeout that is used to create
the eventdev.
The dequeue timeout provided will be common across all the worker
ports. If the eventdev hardware supports power management through
dequeue timeout then this option can be used for verifying power
demands at various packet rates.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2019-04-02 03:11:08 +02:00
Dharmik Thakkar
f401363d98 hash: support lock-free extendable bucket
This patch enables lock-free read-write concurrency support for
extendable bucket feature.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
2019-04-03 20:52:35 +02:00
Anand Rawat
196c650b8b doc: add guide for Windows
Added documentation to build helloworld example
on Windows using meson and clang.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Signed-off-by: Anand Rawat <anand.rawat@intel.com>
Reviewed-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Tested-by: Harini Ramakrishnan <harini.ramakrishnan@microsoft.com>
Acked-by: Harini Ramakrishnan <harini.ramakrishnan@microsoft.com>
2019-04-03 01:21:31 +02:00
Anatoly Burakov
1e3380a2f4 mem: do not use lockfiles for single file segments mode
Due to internal glibc limitations [1], DPDK may exhaust internal
file descriptor limits when using smaller page sizes, which results
in inability to use system calls such as select() by user
applications.

Single file segments option stores lock files per page to ensure
that pages are deleted when there are no more users, however this
is not necessary because the processes will be holding onto the
pages anyway because of mmap(). Thus, removing pages from the
filesystem is safe even though they may be used by some other
secondary process. As a result, single file segments mode no
longer stores inordinate amounts of segment fd's, and the above
issue with fd limits is solved.

However, this will not work for legacy mem mode. For that, simply
document that using bigger page sizes is the only option.

[1] https://mails.dpdk.org/archives/dev/2019-February/124386.html

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-02 16:07:25 +02:00
Yongseok Koh
82010ef55e app/testpmd: make txonly mode generate multiple flows
Testpmd can generate multiple flows without taking much cost and this
could be a simple traffic generator for developer's quick tests.
If "--txonly-multi-flow" is specified in the command line, IP source
address is varied to generate multiple flows.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2019-03-29 19:42:42 +01:00
Stephen Hemminger
ad97ceece1 ethdev: add min/max MTU to device info
This addresses the usability issue raised by OVS at DPDK Userspace
summit. It adds general min/max MTU into device info. For compatibility,
and to save space, it fits in a hole in existing structure.

The initial version sets max MTU to normal Ethernet, it is up to
PMD to set larger value if it supports Jumbo frames.

Also remove the deprecation notice introduced in 18.11 regarding this
change and bump ethdev ABI version.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-03-29 18:57:42 +01:00
Qiming Yang
d7d150b930 net/ice: enable RSS when device init
This patch enabled RSS for UPD/TCP/SCTP+IPV4/IPV6 packets.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-03-29 17:25:31 +01:00
Qiming Yang
37f4ac5f59 net/ice: add safe mode
If E810 download package failed, driver need to go to safe mode.
In the safe mode, some advanced features will not be supported.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-03-29 17:25:31 +01:00
Qiming Yang
a4c8c48fe3 net/ice: load OS default package
This patch enables package downloading to the device. The package is
to be in the /lib/firmware/intel/ice/ddp directory and named ice.pkg.
The package is shared by the kernel driver and the DPDK PMD.

There is no per device package be supported so far, all the
devices can only download the same package. This limitation will
be removed in the future.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-03-29 17:25:31 +01:00
David Marchand
53324971a1 app/testpmd: display/clear forwarding stats on demand
Add a new "show/clear fwd stats all" command to display fwd and port
statistics on the fly.

To be able to do so, the (testpmd only) rte_port structure can't be used
to maintain any statistics.
Moved the stats dump parts from stop_packet_forwarding() and merge with
fwd_port_stats_display() into fwd_stats_display().
fwd engine statistics are then aggregated into a local per port array.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-03-29 17:25:31 +01:00
Wenzhuo Lu
2d5f6953d5 net/ice: support vector AVX2 in Tx
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-03-29 17:25:31 +01:00
Wenzhuo Lu
f88de4694d net/ice: support Tx SSE vector
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-03-29 17:25:31 +01:00
Wenzhuo Lu
c68a52b8b3 net/ice: support vector SSE in Rx
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-03-29 17:25:31 +01:00
Shahaf Shuler
8dd56ce261 doc: announce deprecation of VFIO DMA map functions
As those should be replaced by rte_dev_dma_map and rte_dev_dma_unmap
APIs.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2019-03-30 16:48:58 +01:00
Shahaf Shuler
c33a675b62 bus: introduce device level DMA memory mapping
The DPDK APIs expose 3 different modes to work with memory used for DMA:

1. Use the DPDK owned memory (backed by the DPDK provided hugepages).
This memory is allocated by the DPDK libraries, included in the DPDK
memory system (memseg lists) and automatically DMA mapped by the DPDK
layers.

2. Use memory allocated by the user and register to the DPDK memory
systems. Upon registration of memory, the DPDK layers will DMA map it
to all needed devices. After registration, allocation of this memory
will be done with rte_*malloc APIs.

3. Use memory allocated by the user and not registered to the DPDK memory
system. This is for users who wants to have tight control on this
memory (e.g. avoid the rte_malloc header).
The user should create a memory, register it through rte_extmem_register
API, and call DMA map function in order to register such memory to
the different devices.

The scope of the patch focus on #3 above.

Currently the only way to map external memory is through VFIO
(rte_vfio_dma_map). While VFIO is common, there are other vendors
which use different ways to map memory (e.g. Mellanox and NXP).

The work in this patch moves the DMA mapping to vendor agnostic APIs.
Device level DMA map and unmap APIs were added. Implementation of those
APIs was done currently only for PCI devices.

For PCI bus devices, the pci driver can expose its own map and unmap
functions to be used for the mapping. In case the driver doesn't provide
any, the memory will be mapped, if possible, to IOMMU through VFIO APIs.

Application usage with those APIs is quite simple:
* allocate memory
* call rte_extmem_register on the memory chunk.
* take a device, and query its rte_device.
* call the device specific mapping function for this device.

Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap
APIs, leaving the rte device APIs as the preferred option for the user.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2019-03-30 16:48:56 +01:00
Shahaf Shuler
4106d89a18 vfio: allow DMA map to the default container
Enable users the option to call rte_vfio_dma_map with request to map
to the default vfio fd.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2019-03-30 16:47:54 +01:00
Liron Himi
ff1e35fb5f kni: calculate MTU from mbuf size
- mbuf_size and mtu are now being calculated according
to the given mb-pool.

- max_mtu is now being set according to the given mtu

the above two changes provide the ability to work with jumbo frames

Signed-off-by: Liron Himi <lironh@marvell.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-03-30 00:59:59 +01:00
Nikhil Rao
db9f4430c2 service: fix parameter type for attribute
The type of value parameter to rte_service_attr_get
should be uint64_t *, since the attributes
are of type uint64_t.

Fixes: 4d55194d76a4 ("service: add attribute get function")

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2019-03-28 21:07:48 +01:00
Joyce Kong
184104fc61 ticketlock: introduce fair ticket based locking
The spinlock implementation is unfair, some threads may take locks
aggressively while leaving the other threads starving for long time.

This patch introduces ticketlock which gives each waiting thread a
ticket and they can take the lock one by one. First come, first serviced.
This avoids starvation for too long time and is more predictable.

Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-03-28 14:58:49 +01:00
Ferruh Yigit
d818e454df doc: deprecate KNI ethtool support
Announce removal of KNI ethtool support.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-03-27 14:35:19 +01:00
Ferruh Yigit
7abe4a24cc doc: add deprecation marker usage
Define '__rte_deprecated' usage process.

Suggests keeping old API with '__rte_deprecated' marker including
next LTS, they will be removed just after the LTS release.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2019-03-27 14:28:51 +01:00