When no memory is available on the same numa node than the device, the
initialization of the device fails. However, the use case where the
cores and memory are on a different socket than the device is valid,
even if not optimal.
To fix this issue, this commit introduces an infrastructure to select
the socket on which to allocate the verbs objects based on the ethdev
configuration and the object type, rather than the PCI numa node.
Fixes: 1e3a39f72d ("net/mlx5: allocate verbs object into shared memory")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
On error, mlx5_dev_start() does not return a negative value
as it is supposed to do. The consequence is that the application
(ex: testpmd) does not notice that the port is not started
and begins the rxtx on an uninitialized port, which crashes.
Fixes: e1016cb733 ("net/mlx5: fix Rx interrupts management")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
All multi code should not be handled in exit part of the code but in the
mainline of the function.
Fixes: 0a40a1363a ("net/mlx5: fix flow type for allmulti rules")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Check if the security enable bits are not fused before setting
offload capabilities for security.
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Issue detected by coverity. Could never actually cause a
problem as truncated value (0x7f7f7f7f->0x7f) is what's needed.
But fix in code for correctness.
Coverity issue: 194998
Fixes: 571365dd4c ("crypto/qat: enable Rx head writes coalescing")
Cc: stable@dpdk.org
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
If auth algorithm is RTE_CRYPTO_AUTH_NULL and digest_length is 0
in the xform and digest pointer is set in the op, then
the PMD may overwrite memory at the digest pointer.
With this patch the memory is not overwritten.
Fixes: db0e952a5c ("crypto/qat: add NULL capability")
Cc: stable@dpdk.org
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
This commit fixes right cast from qat_cipher_get_block_size
function. This function can return -EFAULT in case of
any error, and that value must be cast to int instead of uint8_t
Fixes: d18ab45f76 ("crypto/qat: support DOCSIS BPI mode")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
This commit fixes
- bpi_cipher_encrypt to prevent before 'array subscript is
above array bounds' error
- bpi_cipher_decrypt to prevent before 'array subscript is
above array bounds' error
Fixes: d18ab45f76 ("crypto/qat: support DOCSIS BPI mode")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Seen with GCC 7.2.0, a switch fall through is detected and
cannot be fixed with a fall-through comment or attribute:
drivers/crypto/dpaa2_sec/hw/rta/operation_cmd.h:89:6: error:
this statement may fall through [-Werror=implicit-fallthrough=]
if (rta_sec_era < RTA_SEC_ERA_2)
^
The check is disabled in dpaa2_sec Makefile but not in dpaa_sec Makefile
which uses source code shared by dpaa2_sec.
The workaround is to disable the check at the beginning of the file.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Add checks during build to ensure that all symbols in the EXPERIMENTAL
version map section have __experimental tags on their definitions, and
enable the warnings needed to announce their use. Also add an
ALLOW_EXPERIMENTAL_APIS define to allow individual libraries and files
to declare the acceptability of experimental api usage
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Append the __rte_experimental tag to api calls appearing in the
EXPERIMENTAL section of their libraries version map
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Polling a new packet is basically sensing the generation bit in a
completion entry. For some processors not having strongly-ordered memory
model, there has to be a memory barrier between reading the generation bit
and other fields of the entry in order to guarantee data is not stale.
Fixes: 570acdb1da ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
ICC reports the issue at compile time as follows.
error #592: variable "i" is used before its value is set
RTE_SET_USED(i);
The patch is to fix it. GCC and CLANG has been tested as well.
Fixes: d548ef513c ("event/opdl: add unit tests")
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
Remove CFLAGS -std=c11 and -pedantic in order to guarantee
a successful vdev_netvsc compilation on old Linux distributions.
Otherwise old GCC compilers may complain as follows:
cc1: error: unrecognized command line option -std=c11
Fixes: 6086ab3bb3 ("net/vdev_netvsc: introduce Hyper-V platform driver")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
eBPF has a graceful approach: it must successfully compile on all Linux
distributions. If a specific kernel cannot support eBPF it will gracefully
refuse the eBPF netlink message sent to it.
The kernel header file linux/bpf.h (if present) on different Linux
distributions may not include all definitions required for TAP
compilation.
In order to guarantee a successful eBPF compilation everywhere all the
required definitions for TAP have been locally added instead of including
file <linux/bpf.h>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
This commit reworks the loop counter variable declarations
to be in line with the DPDK source code.
Fixes: 3c7f3dcfb0 ("event/opdl: add PMD main body and helper function")
Fixes: 8ca8e3b48e ("event/opdl: add event queue config get/set")
Fixes: d548ef513c ("event/opdl: add unit tests")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
align the config option name with config/common_base
Fixes: aaa4a221da ("event/sw: add new software-only eventdev driver")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.
More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
There is time between the physical removal of the device until
sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
applications still don't know about the removal and may call sub-device
control operation which should return an error.
In previous code this error is reported to the application contrary to
fail-safe principle that the app should not be aware of device removal.
Add an removal check in each relevant control command error flow and
prevent an error report to application when the sub-device is removed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while keep redirecting packets to the same RSS
queues on both devices.
TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.
TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.
eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.
TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
This commit include BPF API to be used by TAP.
tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
File tap_bpf_insns.h was added. It includes eBPF bytes code
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytes code is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c
1. The C file was compiled via LLVM into an object file in ELF
format as:
clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
-filetype=obj -o <tap_bpf_program.o>
clang version must be 3.7 and above
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel
2. Using an external tool the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using an BPF systm call. The external tool that generates the C arrays
will be added in separate commits.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
File tap_bpf_program.c was added with two ELF sections
corresponding to two BPF programs and one BPF map.
Section cls_q - BPF classifier to classify packets to their
corresponding queue after an RSS hash was calculated on the packet
and saved in skb->cb[1]
Section l3_l4 - BPF action to calculate RSS hash on packet
layers 3 and 4
This file is not part of DPDK tree compilation.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Add a generic TC actions handling for TC actions: "mirred",
"gact", "skbedit". This will be useful when introducing
BPF actions, as it uses TCA_BPF_ACT instead of TCA_FLOWER_ACT
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
This patch reverts:
commit 3a6f2eb8c5 ("net/mlx5: fix Memory Region registration")
Although granularity of chunks in a mempool is a cacheline, addresses are
extended to align to page boundary for performance reason in device when
registering a MR (Memory Region). This could make some regions overlap,
then can cause Tx completion error due to incorrect LKEY search. If the
error occurs, the Tx queue will get stuck. It is because buffer address is
compared against aligned addresses for Memory Region. Saving original
addresses of mempool for comparison doesn't create any overlap.
Fixes: b0b0938457 ("net/mlx5: use buffer address for LKEY search")
Fixes: 3a6f2eb8c5 ("net/mlx5: fix Memory Region registration")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Replace strncmp with strncasecmp in i40e_update_customized_ptype
function for compatibility.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
There're new metadata IPV4FRAG and IPV6FRAG in PPP
profile, this patch improves ptype parser to support
IPV4FRAG and IPV6FRAG.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Fail to update SW ptype mapping table when loading
PPP profile, though profile can be loaded successfully.
It will cause fail to parse SW ptype during receiving
packets. This patch fixes this issue.
Fixes: 11556c915a ("net/i40e: improve packet type parser")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Using DPDK in Hyper-V VM systems requires vdev_netvsc driver to pair
the NetVSC netdev device with the same MAC address PCI device by
fail-safe PMD.
Add vdev_netvsc custom scan in vdev bus to allow automatic probing in
Hyper-V VM systems unless it was already specified by command line.
Add "ignore" parameter to disable this auto-detection.
Signed-off-by: Matan Azrad <matan@mellanox.com>
This parameter allows specifying any non-NetVSC interface or routed
NetVSC interfaces to use with tap sub-devices for development purposes.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
NetVSC netdevices which are already routed should not be probed because
they are used for management purposes by the HyperV.
prevent routed netvsc devices probing.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
As described in more details in the attached documentation (see patch
contents), this virtual device driver manages NetVSC interfaces in virtual
machines hosted by Hyper-V/Azure platforms.
This driver does not manage traffic nor Ethernet devices directly; it acts
as a thin configuration layer that automatically instantiates and controls
fail-safe PMD instances combining tap and PCI sub-devices, so that each
NetVSC interface is exposed as a single consolidated port to DPDK
applications.
PCI sub-devices being hot-pluggable (e.g. during VM migration),
applications automatically benefit from increased throughput when present
and automatic fallback on NetVSC otherwise without interruption thanks to
fail-safe's hot-plug handling.
Once initialized, the sole job of the vdev_netvsc driver is to regularly
scan for PCI devices to associate with NetVSC interfaces and feed their
addresses to corresponding fail-safe instances.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
This patch lays the groundwork for this driver (draft documentation,
copyright notices, code base skeleton and build system hooks). While it can
be successfully compiled and invoked, it's an empty shell at this stage.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Previous fail-safe code didn't support probed sub-devices capture and
failed when it tried to probe them.
Skip fail-safe sub-device probing when it already was probed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
rte_free() is not supposed to work with pointers returned by calloc().
Fixes: a0194d8281 ("net/failsafe: add flexible device definition")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>