Starting from FW version 22.27.4002, it is required to
configure protection domain (PD) for each virtq created by
DevX.
Add PD requirement in virtq DevX APIs.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A new vDPA driver feature was added to query the virtq
statistics from the HW.
Use this feature to show the HW queues statistics for the virtqs.
Command description: stats X Y.
X is the device ID.
Y is the queue ID, Y=0xffff to show all the virtio queues
statistics of the device X.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support for statistics operations.
A DevX counter object is allocated per virtq in order to
manage the virtq statistics.
The counter object is allocated before the virtq creation
and destroyed after it, so the statistics are valid only in
the life time of the virtq.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add DevX API to create and query virtio queue statistics
from the HW. The next counters are supported by the HW per
virtio queue:
received_desc.
completed_desc.
error_cqes.
bad_desc_errors.
exceed_max_chain.
invalid_buffer.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The vDPA device offloads all the datapath of the vhost
device to the HW device.
In order to expose to the user traffic information this
patch introduces new 3 APIs to get traffic statistics, the
device statistics name and to reset the statistics per
virtio queue.
The statistics are taken directly from the vDPA driver
managing the HW device and can be different for each vendor
driver.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As announced during v20.05 release cycle, this
patch makes reply-ack protocol feature to be enabled
unconditionally.
This protocol feature makes the communication between the
master and the slave more robust, avoiding for example
possible undefined behaviour with VHOST_USER_SET_MEM_TABLE.
Also, reply-ack support will be required for upcoming
VHOST_USER_SET_STATUS request.
Note that this protocol feature was disabled by default
because Qemu version 2.7.0 to 2.9.0 had a bug causing a
deadlock when reply-ack was negotiated and multiqueue
enabled. These Qemu version are now very old and no more
maintained, so we can reasonably consider we no more
support them.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Replacing flow profile locks with RSS profile locks in the function to
remove all RSS rules for a given VSI. This is to align the locks used
for RSS rule addition to VSI and removal during VSI teardown to avoid
a race condition owing to several iterations of the above operations.
In function to get RSS rules for given VSI and protocol header replacing
the pointer reference of the RSS entry with a copy of hash value to
ensure thread safety.
Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
set_rss_lut failed due to incorrect vsi_id mask. vsi_id is 10 bit
but mask was 0x1FF whereas it should be 0x3FF.
For vsi_num >= 512, FW set_rss_lut has been failing with return code
EACCESS (vsi ownership issue) because software was providing
incorrect vsi_num (dropping 10th bit due to incorrect mask) for
set_rss_lut admin command
Fixes: a90fae1d0755 ("net/ice/base: add admin queue structures and commands")
Cc: stable@dpdk.org
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
In order to find proper dummy packets for switch filter,
it need to check ipv4 next protocol number, if it is 0x06,
which means next payload is TCP, we need to use TCP
format dummy packet.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
This patch add support to get tunnel type of recipe
after get recipe from FW. This will fix the issue in
function ice_find_recp() for tunnel type comparing.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Add FDIR support for MAC_IPV6_GTPU type with outer IPv6 address, teid
and qfi fields matching.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
The grst_delay variable in ice_check_reset contains the maximum time
(in 100 msec units) that the driver will wait for a reset event to
transition to the Device Active state. The value is the sum of three
separate components:
1) The maximum time it may take for the firmware to process its
outstanding command before handling the reset request.
2) The value in RSTCTL.GRSTDEL (the delay firmware inserts between first
seeing the driver reset request and the actual hardware assertion).
3) The maximum expected reset processing time in hardware.
Referring to this total time as "grst_delay" is misleading and
potentially confusing to someone checking the code and cross-referencing
the hardware specification.
Fix this by renaming the variable to "grst_timeout", which is more
descriptive of its actual use.
Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
System diagnostic solution extend the ability to fetch FW
internal status data and error indication.
Signed-off-by: Sharon Haroni <sharon.haroni@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Add outer IP address fields while generating the training packets for
GTPU, so that we can support FDIR based on outer IP of GTPU.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
The ice_discover_caps function is used to read the device and function
capabilities, updating the hardware capabilities structures with
relevant data.
The exact number of capabilities returned by the hardware is unknown
ahead of time. The AdminQ command will report the total number of
capabilities in the return buffer.
The current implementation involves requesting capabilities once,
reading this returned size, and then re-requested with that size.
This isn't really necessary. The firmware interface has a maximum size
of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than
ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities.
Avoid the retry loop by simply allocating a buffer of size
ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The
extra allocation isn't a big deal, as it will be released after we
finish parsing the capabilities.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
The profile id map lock should be held till the caller completes
all references of that profile entries.
The current code releases the lock right after the match search.
This caused a driver issue when the profile map entries were
referenced after it was freed in other thread after the lock was
released earlier.
Also return type of get/set profile functions were changed to
return the ice status instead of the profile entry pointer.
This will prevent the caller referencing the profile fields
outside the lock.
Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Some places were calling the meson function host_machine.system()
instead of the variables is_windows and is_linux defined
in config/meson.build.
At the same time, the missing "Linux restriction" reason is added to
pfe and octeontx2 crypto PMDs.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
The macro __rte_cache_aligned is better suited for aligning
a structure on a cache line (of any size).
Fixes: 15c431864000 ("app/flow-perf: add packet forwarding support")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Wisam Jaddo <wisamm@mellanox.com>
The Meson cross file is renamed from meson_mingw.txt to cross-mingw,
and is added to test-meson-builds.sh.
The only example supported on Windows so far is "helloworld",
that's why the default list of examples is overridden.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Add cross-compilation support of a PPC target in the build test matrix.
The CPU is defined as Power8, running as little endian.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
If a compiler is not found in $PATH, the compilation test is skipped.
In some cases, the compiler could be found after extending $PATH
in an environment configuration script (called by load-devel-config).
The decision to skip is deferred to a later stage, after loading the
configuration script.
In such case, the variable DPDK_TARGET, used by the configuration script
as input, is the compiler name.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Each cross-compilation case needs to define the target compiler
and the meson cross file.
Given the compiler is already defined in the cross file,
the latter is enough.
The function "build" is changed to accept a cross file alternatively
to the compiler name. In the case of a file (detected if readable),
the compiler is extracted with sed and tr, and the option --cross-file
is automatically added.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Casting thread ID to handle is not accurate way to get thread handle.
Need to use OpenThread function to get thread handle from thread ID.
pthread_setaffinity_np and pthread_getaffinity_np functions
for Windows are affected because of it.
Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
Uses SetupAPI.h functions to scan PCI tree.
Uses DEVPKEY_Device_Numa_Node to get the PCI NUMA node.
Uses SPDRP_BUSNUMBER and SPDRP_BUSNUMBER to get the BDF.
scanning currently supports types RTE_KDRV_NONE.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
the struct rte_pci_addr defines domain as uint32_t variable however
the PCI_PRI_FMT macro used for logging the struct sets the format
of domain to uint16_t.
The mismatch causes the following warning messages
in Windows clang build:
format specifies type 'unsigned short' but the argument
has type 'uint32_t' (aka 'unsigned int') [-Wformat]
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Added <sys/types.h> in rte_pci header file
to include off_t type since it is missing for Windows.
Define the implementation of the Linux function rte_pci_get_sysfs_path
in pci_common.c for Linux OS only as it is unneeded for other OSs
and to avoid the warning on deprecated call to getenv() on Windows:
"warning: 'getenv' is deprecated: This function or variable may be unsafe.
Consider using _dupenv_s instead."
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Changing all of PCIs Unix memory mapping to the
new memory allocation API wrapper.
Change all of PCI mapping function usage in
bus/pci to support the new API.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Move common functions between Unix and Windows to eal_common_options.c.
Those functions are getter functions for rte_application_usage_hook.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Move common functions between Unix and Windows to eal_common_config.c.
Those functions are getter functions for IOVA,
configuration, Multi-process.
Move rte_config, internal_config, early_mem_config and runtime_dir
to be defined in the common file with getter functions.
Refactor the users of the config variables above to use
the getter functions.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
The MinGW build for Windows has special cases where exported
function contain additional prefix:
__emutls_v.per_lcore__*
To avoid adding those prefixed functions to the version.map file
the map_to_def.py script was modified to create a map file for MinGW
with the needed changed.
The file name was changed to map_to_win.py and lib/meson.build map output
was unified with drivers/meson.build output
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
import library (/IMPLIB) in meson.build should use
the 'drivers' and not 'libs' folder.
The error is: fatal error LNK1149: output filename matches input filename.
The fix uses the correct folder.
Fixes: 5ed3766981 ("drivers: process shared link dependencies as for libs")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
pmdinfogen generation is currently unsupported for Windows.
The relevant part in meson.build is skipped.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
DPDK 20.05 had some deprecation notes after "make config"
and after the build.
For DPDK 20.08, the config note is replaced with a warning
before the config and before the build.
After the warning, there is a pause which can be skipped
with the variable MAKE_PAUSE.
This deprecation process was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Before removing the "make" build system completely,
the Linux guide instructions are made more concise and accurate.
Some detailed explanations are also available in
doc/guides/prog_guide/dev_kit_root_make_help.rst
This is the swan song for makefile system,
in order to have accurate information backported in LTS.
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
The build should be described only in few places,
in order to maintain up-to-date, accurate and detailed instructions.
This change is removing some of the unneeded repetitions.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jay Zhou <jianjay.zhou@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
There was a doc about how to extend DPDK by adding a library.
It could have been useful but was never updated,
so it is lacking a lot of explanations about doxygen,
meson, versioning, maintainership, etc.
Anyway such guidelines should fit in the contributors guide.
Better to completely remove this obsolete document.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Introduce packet forwarding support to the app to do
some performance measurements.
The measurements are reported in term of packet per
second unit. The forwarding will start after the end
of insertion/deletion operations.
The support has single and multi performance measurements.
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
Introduce new feature to dump memory statistics of each socket
and a total for all before and after the creation.
This will give two main advantage:
1- Check the memory consumption for large number of flows
"insertion rate scenario alone"
2- Check that no memory leackage after doing insertion then
deletion.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
Add the ability to test deletion rate for flow performance
application.
This feature is disabled by default, and can be enabled by
add "--deletion-rate" in the application command line options.
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
Add insertion rate calculation feature into flow
performance application.
The application now provide the ability to test
insertion rate of specific rte_flow rule, by
stressing it to the NIC, and calculate the
insertion rate.
The application offers some options in the command
line, to configure which rule to apply.
After that the application will start producing
rules with same pattern but increasing the outer IP
source address by 1 each time, thus it will give
different flow each time, and all other items will
have open masks.
The current design have single core insertion rate.
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
For each mbuf byte, free_space[i] == 0 means the space is occupied,
free_space[i] != 0 means space is free.
Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The value free_space[i] is used to save the size of biggest aligned
element that can fit in the zone, current implementation has one flaw,
for example, if user registers dynfield1 (size = 4, align = 4, req = 124)
first, the free_space would be as below after registration:
0070: 08 08 08 08 08 08 08 08
0078: 08 08 08 08 00 00 00 00
Then if user continues to register dynfield2 (size = 4, align = 4),
free_space would become:
0070: 00 00 00 00 04 04 04 04
0078: 04 04 04 04 00 00 00 00
Further request dynfield3 (size = 8, align = 8) would fail to register
due to alignment requirement can't be satisfied, though there is enough
space remained in mbuf.
This patch fixes above issue by saving alignment only in aligned zone,
after the fix, above registrations order can be satisfied, free_space
would be like:
After dynfield1 registration:
0070: 08 08 08 08 08 08 08 08
0078: 04 04 04 04 00 00 00 00
After dynfield2 registration:
0070: 08 08 08 08 08 08 08 08
0078: 00 00 00 00 00 00 00 00
After dynfield3 registration:
0070: 00 00 00 00 00 00 00 00
0078: 00 00 00 00 00 00 00 00
This patch also reduces iterations in process_score() by jumping align
steps in each loop.
Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Set rte_errno as ENOMEM when allocation failure.
Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
We should make sure off + size < sizeof(struct rte_mbuf) to avoid
possible out-of-bounds access of free_space array, there is no issue
currently due to the low bits of free_flags (which is adjacent to
free_space) are always set to 0. But we shouldn't rely on it since it's
fragile and layout of struct mbuf_dyn_shm may be changed in the future.
This patch adds boundary check explicitly to avoid potential risk of
out-of-bounds access.
Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
To fix CVE-2020-12888, the linux vfio-pci module will invalidate mmaps
and block MMIO access on disabled memory, it will send a SIGBUS to the
application:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abafbc551fdd
When the application opens the vfio PCI device, the vfio-pci module will
enable the bus memory space through PCI read/write access. According to
the PCIe specification, the 'Memory Space Enable' is always zero for VF:
Table 9-13 Command Register Changes
Bit Location | PF and VF Register Differences | PF | VF
| From Base | Attributes | Attributes
-------------+--------------------------------+------------+-----------
| Memory Space Enable - Does not | |
| apply to VFs. Must be hardwired| Base | 0b
1 | to 0b for VFs. VF Memory Space | |
| is controlled by the VF MSE bit| |
| in the VF Control register. | |
-------------+--------------------------------+------------+-----------
Afterwards the vfio-pci will initialize its own virtual PCI config space
data ('vconfig') by reading the VF's physical PCI config space, then the
'Memory Space Enable' bit in vconfig will always be 0b value. This will
make the vfio-pci treat the BAR memory space as disabled, and the SIGBUS
will be triggered if access these BARs.
By investigation, the VF PCI device *passthrough* into the Guest OS by
QEMU has the 'Memory Space Enable' with 1b value. That's because every
PCI driver will start to enable the memory space, and this action will
be hooked by vfio-pci virtual PCI read/write to set the 'Memory Space
Enable' in vconfig space to 1b. So VF runs in guest OS has 'Mem+', but
VF runs in host OS has 'Mem-'.
Align with PCI working mode in Guest/QEMU/Host, in DPDK, enable the PCI
bus memory space explicitly to avoid access on disabled memory.
Fixes: 33604c31354a ("vfio: refactor PCI BAR mapping")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Tested-by: Thierry Martin <thierry.martin.public@gmail.com>
Caught by code review, bitops test name is incorrect.
Fixes: 7660614c11e2 ("test/bitops: add bit operations test case")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Phil Yang <phil.yang@arm.com>