22892 Commits

Author SHA1 Message Date
Matan Azrad
473d8e67d8 common/mlx5: add virtio queue protection domain
Starting from FW version 22.27.4002, it is required to
configure protection domain (PD) for each virtq created by
DevX.

Add PD requirement in virtq DevX APIs.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
6505865aa8 examples/vdpa: add statistics show command
A new vDPA driver feature was added to query the virtq
statistics from the HW.

Use this feature to show the HW queues statistics for the virtqs.

Command description: stats X Y.
X is the device ID.
Y is the queue ID, Y=0xffff to show all the virtio queues
statistics of the device X.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
7de66d823e vdpa/mlx5: support virtio queue statistics get
Add support for statistics operations.

A DevX counter object is allocated per virtq in order to
manage the virtq statistics.

The counter object is allocated before the virtq creation
and destroyed after it, so the statistics are valid only in
the life time of the virtq.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
796ae7bb6a common/mlx5: support DevX virtq stats operations
Add DevX API to create and query virtio queue statistics
from the HW. The next counters are supported by the HW per
virtio queue:
	received_desc.
	completed_desc.
	error_cqes.
	bad_desc_errors.
	exceed_max_chain.
	invalid_buffer.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
1cb4415751 vhost: introduce operation to get vDPA queue stats
The vDPA device offloads all the datapath of the vhost
device to the HW device.

In order to expose to the user traffic information this
patch introduces new 3 APIs to get traffic statistics, the
device statistics name and to reset the statistics per
virtio queue.

The statistics are taken directly from the vDPA driver
managing the HW device and can be different for each vendor
driver.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Maxime Coquelin
d1c074bd76 vhost: enable reply-ack systematically
As announced during v20.05 release cycle, this
patch makes reply-ack protocol feature to be enabled
unconditionally.

This protocol feature makes the communication between the
master and the slave more robust, avoiding for example
possible undefined behaviour with VHOST_USER_SET_MEM_TABLE.

Also, reply-ack support will be required for upcoming
VHOST_USER_SET_STATUS request.

Note that this protocol feature was disabled by default
because Qemu version 2.7.0 to 2.9.0 had a bug causing a
deadlock when reply-ack was negotiated and multiqueue
enabled. These Qemu version are now very old and no more
maintained, so we can reasonably consider we no more
support them.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
3990ea41c4 net/ice/base: replace RSS profile locks
Replacing flow profile locks with RSS profile locks in the function to
remove all RSS rules for a given VSI. This is to align the locks used
for RSS rule addition to VSI and removal during VSI teardown to avoid
a race condition owing to several iterations of the above operations.
In function to get RSS rules for given VSI and protocol header replacing
the pointer reference of the RSS entry with a copy of hash value to
ensure thread safety.

Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
072158c652 net/ice/base: fix VSI ID mask to 10 bits
set_rss_lut failed due to incorrect vsi_id mask. vsi_id is 10 bit
but mask was 0x1FF whereas it should be 0x3FF.

For vsi_num >= 512, FW set_rss_lut has been failing with return code
EACCESS (vsi ownership issue) because software was providing
incorrect vsi_num (dropping 10th bit due to incorrect mask) for
set_rss_lut admin command

Fixes: a90fae1d0755 ("net/ice/base: add admin queue structures and commands")
Cc: stable@dpdk.org

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
1ffe6670e0 net/ice/base: choose TCP dummy packet by protocol
In order to find proper dummy packets for switch filter,
it need to check ipv4 next protocol number, if it is 0x06,
which means next payload is TCP, we need to use TCP
format dummy packet.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
418d2563d1 net/ice/base: get tunnel type for recipe
This patch add support to get tunnel type of recipe
after get recipe from FW. This will fix the issue in
function ice_find_recp() for tunnel type comparing.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
3c0b91c387 net/ice/base: support flow director for GTPU with outer IPv6
Add FDIR support for MAC_IPV6_GTPU type with outer IPv6 address, teid
and qfi fields matching.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
efae14de50 net/ice/base: rename misleading variable
The grst_delay variable in ice_check_reset contains the maximum time
(in 100 msec units) that the driver will wait for a reset event to
transition to the Device Active state. The value is the sum of three
separate components:
1) The maximum time it may take for the firmware to process its
outstanding command before handling the reset request.
2) The value in RSTCTL.GRSTDEL (the delay firmware inserts between first
seeing the driver reset request and the actual hardware assertion).
3) The maximum expected reset processing time in hardware.

Referring to this total time as "grst_delay" is misleading and
potentially confusing to someone checking the code and cross-referencing
the hardware specification.

Fix this by renaming the variable to "grst_timeout", which is more
descriptive of its actual use.

Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
e014cd42e9 net/ice/base: add commands for system diagnostic
System diagnostic solution extend the ability to fetch FW
internal status data and error indication.

Signed-off-by: Sharon Haroni <sharon.haroni@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
c1bec172e8 net/ice/base: support flow director for outer IP of GTPU
Add outer IP address fields while generating the training packets for
GTPU, so that we can support FDIR based on outer IP of GTPU.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
7621dd771c net/ice/base: refactor to avoid need to retry
The ice_discover_caps function is used to read the device and function
capabilities, updating the hardware capabilities structures with
relevant data.

The exact number of capabilities returned by the hardware is unknown
ahead of time. The AdminQ command will report the total number of
capabilities in the return buffer.

The current implementation involves requesting capabilities once,
reading this returned size, and then re-requested with that size.

This isn't really necessary. The firmware interface has a maximum size
of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than
ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities.

Avoid the retry loop by simply allocating a buffer of size
ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The
extra allocation isn't a big deal, as it will be released after we
finish parsing the capabilities.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
d6e0dca1d5 net/ice/base: adjust profile ID map locks
The profile id map lock should be held till the caller completes
all references of that profile entries.

The current code releases the lock right after the match search.
This caused a driver issue when the profile map entries were
referenced after it was freed in other thread after the lock was
released earlier.

Also return type of get/set profile functions were changed to
return the ice status instead of the profile entry pointer.
This will prevent the caller referencing the profile fields
outside the lock.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Thomas Monjalon
4f299b7169 build: replace meson OS detection with variable
Some places were calling the meson function host_machine.system()
instead of the variables is_windows and is_linux defined
in config/meson.build.

At the same time, the missing "Linux restriction" reason is added to
pfe and octeontx2 crypto PMDs.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-06-30 15:29:59 +02:00
Thomas Monjalon
f7a4996c04 app/flow-perf: use macro for cache alignment
The macro __rte_cache_aligned is better suited for aligning
a structure on a cache line (of any size).

Fixes: 15c431864000 ("app/flow-perf: add packet forwarding support")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Wisam Jaddo <wisamm@mellanox.com>
2020-06-30 11:57:46 +02:00
Fady Bader
8d05adbd54 ring: enable for Windows
Building ring on Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
2020-06-30 01:30:22 +02:00
Thomas Monjalon
3b6431396a devtools: add Windows cross-build test with MinGW
The Meson cross file is renamed from meson_mingw.txt to cross-mingw,
and is added to test-meson-builds.sh.

The only example supported on Windows so far is "helloworld",
that's why the default list of examples is overridden.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-06-30 01:18:35 +02:00
Thomas Monjalon
b2b3858a08 devtools: add ppc64 in meson build test
Add cross-compilation support of a PPC target in the build test matrix.
The CPU is defined as Power8, running as little endian.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2020-06-30 01:18:35 +02:00
Thomas Monjalon
3932237f10 devtools: allow non-standard toolchain in meson test
If a compiler is not found in $PATH, the compilation test is skipped.
In some cases, the compiler could be found after extending $PATH
in an environment configuration script (called by load-devel-config).

The decision to skip is deferred to a later stage, after loading the
configuration script.

In such case, the variable DPDK_TARGET, used by the configuration script
as input, is the compiler name.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2020-06-30 01:18:35 +02:00
Thomas Monjalon
444e556776 devtools: shrink cross-compilation test definition
Each cross-compilation case needs to define the target compiler
and the meson cross file.
Given the compiler is already defined in the cross file,
the latter is enough.

The function "build" is changed to accept a cross file alternatively
to the compiler name. In the case of a file (detected if readable),
the compiler is extracted with sed and tr, and the option --cross-file
is automatically added.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2020-06-30 01:18:35 +02:00
Tasnim Bashar
5652a4ad24 eal/windows: fix thread handle
Casting thread ID to handle is not accurate way to get thread handle.
Need to use OpenThread function to get thread handle from thread ID.

pthread_setaffinity_np and pthread_getaffinity_np functions
for Windows are affected because of it.

Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
2020-06-30 00:50:26 +02:00
Tal Shnaiderman
b762221ac2 bus/pci: support Windows with bifurcated drivers
Uses SetupAPI.h functions to scan PCI tree.
Uses DEVPKEY_Device_Numa_Node to get the PCI NUMA node.
Uses SPDRP_BUSNUMBER and SPDRP_BUSNUMBER to get the BDF.
scanning currently supports types RTE_KDRV_NONE.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
33031608e8 bus/pci: introduce Windows support with stubs
Addition of stub eal and bus/pci functions to compile
bus/pci for Windows.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
8517072c87 pci: fix address domain format size
the struct rte_pci_addr defines domain as uint32_t variable however
the PCI_PRI_FMT macro used for logging the struct sets the format
of domain to uint16_t.

The mismatch causes the following warning messages
in Windows clang build:

format specifies type 'unsigned short' but the argument
has type 'uint32_t' (aka 'unsigned int') [-Wformat]

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
b137f95366 pci: build on Windows
Added <sys/types.h> in rte_pci header file
to include off_t type since it is missing for Windows.

Define the implementation of the Linux function rte_pci_get_sysfs_path
in pci_common.c for Linux OS only as it is unneeded for other OSs
and to avoid the warning on deprecated call to getenv() on Windows:

"warning: 'getenv' is deprecated: This function or variable may be unsafe.
Consider using _dupenv_s instead."

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
2fd3567e54 pci: use OS generic memory mapping functions
Changing all of PCIs Unix memory mapping to the
new memory allocation API wrapper.

Change all of PCI mapping function usage in
bus/pci to support the new API.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
2ab745bd8f eal: move OS common options usage functions
Move common functions between Unix and Windows to eal_common_options.c.

Those functions are getter functions for rte_application_usage_hook.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
57a2efb304 eal: move OS common config objects
Move common functions between Unix and Windows to eal_common_config.c.

Those functions are getter functions for IOVA,
configuration, Multi-process.

Move rte_config, internal_config, early_mem_config and runtime_dir
to be defined in the common file with getter functions.

Refactor the users of the config variables above to use
the getter functions.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
309bf90bf9 build: generate version map file for MinGW
The MinGW build for Windows has special cases where exported
function contain additional prefix:

__emutls_v.per_lcore__*

To avoid adding those prefixed functions to the version.map file
the map_to_def.py script was modified to create a map file for MinGW
with the needed changed.

The file name was changed to map_to_win.py and lib/meson.build map output
was unified with drivers/meson.build output

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
77cca7ccec build: fix drivers library path on Windows
import library (/IMPLIB) in meson.build should use
the 'drivers' and not 'libs' folder.

The error is: fatal error LNK1149: output filename matches input filename.
The fix uses the correct folder.

Fixes: 5ed3766981 ("drivers: process shared link dependencies as for libs")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
abd5c69bf6 build: skip pmdinfogen on Windows
pmdinfogen generation is currently unsupported for Windows.
The relevant part in meson.build is skipped.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Thomas Monjalon
014a7ec6c4 mk: add a paused deprecation warning before each build
DPDK 20.05 had some deprecation notes after "make config"
and after the build.
For DPDK 20.08, the config note is replaced with a warning
before the config and before the build.
After the warning, there is a pause which can be skipped
with the variable MAKE_PAUSE.

This deprecation process was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-06-29 16:37:22 +02:00
Thomas Monjalon
520bbb9cd9 doc: update build instructions in the Linux guide
Before removing the "make" build system completely,
the Linux guide instructions are made more concise and accurate.
Some detailed explanations are also available in
doc/guides/prog_guide/dev_kit_root_make_help.rst

This is the swan song for makefile system,
in order to have accurate information backported in LTS.

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-06-29 16:37:17 +02:00
Thomas Monjalon
582e9d7765 doc: remove some build instructions where unneeded
The build should be described only in few places,
in order to maintain up-to-date, accurate and detailed instructions.
This change is removing some of the unneeded repetitions.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jay Zhou <jianjay.zhou@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-06-29 16:33:39 +02:00
Thomas Monjalon
4a4ca46ae2 doc: remove outdated guidelines for library addition
There was a doc about how to extend DPDK by adding a library.
It could have been useful but was never updated,
so it is lacking a lot of explanations about doxygen,
meson, versioning, maintainership, etc.

Anyway such guidelines should fit in the contributors guide.
Better to completely remove this obsolete document.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-06-29 16:28:54 +02:00
Wisam Jaddo
15c4318640 app/flow-perf: add packet forwarding support
Introduce packet forwarding support to the app to do
some performance measurements.

The measurements are reported in term of packet per
second unit. The forwarding will start after the end
of insertion/deletion operations.

The support has single and multi performance measurements.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
662a72342a app/flow-perf: add memory dump to app
Introduce new feature to dump memory statistics of each socket
and a total for all before and after the creation.

This will give two main advantage:
1- Check the memory consumption for large number of flows
"insertion rate scenario alone"

2- Check that no memory leackage after doing insertion then
deletion.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
c12f4f217d app/flow-perf: add deletion rate calculation
Add the ability to test deletion rate for flow performance
application.

This feature is disabled by default, and can be enabled by
add "--deletion-rate" in the application command line options.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
bf3688f1e8 app/flow-perf: add insertion rate calculation
Add insertion rate calculation feature into flow
performance application.

The application now provide the ability to test
insertion rate of specific rte_flow rule, by
stressing it to the NIC, and calculate the
insertion rate.

The application offers some options in the command
line, to configure which rule to apply.

After that the application will start producing
rules with same pattern but increasing the outer IP
source address by 1 each time, thus it will give
different flow each time, and all other items will
have open masks.

The current design have single core insertion rate.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
3344cf2e30 app/flow-perf: add flow performance skeleton
Add flow performance application skeleton.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Xiaolong Ye
6248368a9a mbuf: add dump of free dynamic flags
Add support to dump free_flags as below format:

Free bit in mbuf->ol_flags (0 = occupied, 1 = free):
  0000: 0 0 0 0 0 0 0 0
  0008: 0 0 0 0 0 0 0 0
  0010: 0 0 0 0 0 0 0 1
  0018: 1 1 1 1 1 1 1 1
  0020: 1 1 1 1 1 1 1 1
  0028: 1 0 0 0 0 0 0 0
  0030: 0 0 0 0 0 0 0 0
  0038: 0 0 0 0 0 0 0 0

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
d27a4cb3ef mbuf: fix dynamic field dump log
For each mbuf byte, free_space[i] == 0 means the space is occupied,
free_space[i] != 0 means space is free.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
3a7fb882fd mbuf: fix free space update for dynamic field
The value free_space[i] is used to save the size of biggest aligned
element that can fit in the zone, current implementation has one flaw,
for example, if user registers dynfield1 (size = 4, align = 4, req = 124)
first, the free_space would be as below after registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 08 08 08 08 00 00 00 00

Then if user continues to register dynfield2 (size = 4, align = 4),
free_space would become:

  0070: 00 00 00 00 04 04 04 04
  0078: 04 04 04 04 00 00 00 00

Further request dynfield3 (size = 8, align = 8) would fail to register
due to alignment requirement can't be satisfied, though there is enough
space remained in mbuf.

This patch fixes above issue by saving alignment only in aligned zone,
after the fix, above registrations order can be satisfied, free_space
would be like:

After dynfield1 registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 04 04 04 04 00 00 00 00

After dynfield2 registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 00 00 00 00 00 00 00 00

After dynfield3 registration:

  0070: 00 00 00 00 00 00 00 00
  0078: 00 00 00 00 00 00 00 00

This patch also reduces iterations in process_score() by jumping align
steps in each loop.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
359386e15b mbuf: fix error code in dynamic field/flag registration
Set rte_errno as ENOMEM when allocation failure.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
f8eb26dda8 mbuf: fix boundary check at dynamic field registration
We should make sure off + size < sizeof(struct rte_mbuf) to avoid
possible out-of-bounds access of free_space array, there is no issue
currently due to the low bits of free_flags (which is adjacent to
free_space) are always set to 0. But we shouldn't rely on it since it's
fragile and layout of struct mbuf_dyn_shm may be changed in the future.
This patch adds boundary check explicitly to avoid potential risk of
out-of-bounds access.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Haiyue Wang
54f3fb127d bus/pci: fix VF memory access
To fix CVE-2020-12888, the linux vfio-pci module will invalidate mmaps
and block MMIO access on disabled memory, it will send a SIGBUS to the
application:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abafbc551fdd

When the application opens the vfio PCI device, the vfio-pci module will
enable the bus memory space through PCI read/write access. According to
the PCIe specification, the 'Memory Space Enable' is always zero for VF:

             Table 9-13 Command Register Changes

Bit Location | PF and VF Register Differences | PF         | VF
             | From Base                      | Attributes | Attributes
-------------+--------------------------------+------------+-----------
             | Memory Space Enable - Does not |            |
             | apply to VFs. Must be hardwired|  Base      |  0b
     1       | to 0b for VFs. VF Memory Space |            |
             | is controlled by the VF MSE bit|            |
             | in the VF Control register.    |            |
-------------+--------------------------------+------------+-----------

Afterwards the vfio-pci will initialize its own virtual PCI config space
data ('vconfig') by reading the VF's physical PCI config space, then the
'Memory Space Enable' bit in vconfig will always be 0b value. This will
make the vfio-pci treat the BAR memory space as disabled, and the SIGBUS
will be triggered if access these BARs.

By investigation, the VF PCI device *passthrough* into the Guest OS by
QEMU has the 'Memory Space Enable' with 1b value. That's because every
PCI driver will start to enable the memory space, and this action will
be hooked by vfio-pci virtual PCI read/write to set the 'Memory Space
Enable' in vconfig space to 1b. So VF runs in guest OS has 'Mem+', but
VF runs in host OS has 'Mem-'.

Align with PCI working mode in Guest/QEMU/Host, in DPDK, enable the PCI
bus memory space explicitly to avoid access on disabled memory.

Fixes: 33604c31354a ("vfio: refactor PCI BAR mapping")
Cc: stable@dpdk.org

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Tested-by: Thierry Martin <thierry.martin.public@gmail.com>
2020-06-25 15:22:51 +02:00
David Marchand
7506edbc1c test/bitops: fix command name
Caught by code review, bitops test name is incorrect.

Fixes: 7660614c11e2 ("test/bitops: add bit operations test case")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Phil Yang <phil.yang@arm.com>
2020-06-25 01:26:26 +02:00