Commit Graph

11618 Commits

Author SHA1 Message Date
Harman Kalra
987984204b net/octeontx2: fix DMAC filtering
Issue has been observed where packets are getting dropped
at DMAC filtering if a new dmac address is added before
starting of port.

Fixes: c43adf6168 ("net/octeontx2: add unicast MAC filter")
Cc: stable@dpdk.org

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
a49f758d11 vhost: split vDPA header file
This patch split the vDPA header file in two, making
rte_vdpa_device structure opaque to the application.

Applications should only include rte_vdpa.h, while drivers
should include both rte_vdpa.h and rte_vdpa_dev.h.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
2263f13941 vhost: replace vDPA device ID in Vhost
This removes the notion of device ID in Vhost library
as a preliminary step to get rid of the vDPA device ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
81a6b7fe06 vhost: replace device ID in vDPA ops
This patch is a preliminary step to get rid of the
vDPA device ID. It makes vDPA callbacks to use the
vDPA device struct as a reference instead of the ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
38f8ab0bbc vhost: make vDPA framework bus agnostic
This patch makes the vDPA framework to no more
support only PCI devices, but any devices by relying
on the generic device name as identifier.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
4425524543 bus/fslmc: fix iterating on a class type
This patches fixes a null pointer dereferencing that happens
when the device string passed to the iterator is NULL. This
situation can happen when iterating on a class type.
For example:

RTE_DEV_FOREACH(dev, "class=eth", &dev_iter) {
    ...
}

Fixes: e67a61614d ("bus/fslmc: support device iteration")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
be2ee360fe bus/dpaa: fix iterating on a class type
This patches fixes a null pointer dereferencing that happens
when the device string passed to the iterator is NULL. This
situation can happen when iterating on a class type.
For example:

RTE_DEV_FOREACH(dev, "class=eth", &dev_iter) {
    ...
}

Fixes: e79df833d3 ("bus/dpaa: support hotplug ops")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Ruifeng Wang
78bfe1666b net/i40e: support aarch32
Expand vector PMD support to aarch32.
Enable i40e PMD by default for armv7 make build.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-06-30 14:52:30 +02:00
Ruifeng Wang
2b7a54f091 net/ixgbe: fix include of vector header file
The include of 'arm_neon.h' causes issues to old gcc and aarch32.
Including 'rte_vect.h' instead fixes these issues.

Fixes: b20971b6cc ("net/ixgbe: implement vector driver for ARM")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-06-30 14:52:30 +02:00
Ruifeng Wang
50dd63b9bf net/ixgbe: support aarch32
Expand vector PMD support to aarch32.
Enable ixgbe PMD by default for armv7 make build.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-06-30 14:52:30 +02:00
David Marchand
b52d25ae4b net/mvpp2: fix non-EAL thread support
Caught by code inspection, for a non-EAL thread identified with
rte_lcore_id() == LCORE_ID_ANY, the code currently arbitrarily uses
lcore 0 while there is no guarantee this lcore is used.

Fixes: 3588aaa68e ("net/mrvl: fix HIF objects allocation")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Liron Himi <lironh@marvell.com>
2020-06-30 14:52:30 +02:00
Devendra Singh Rawat
b10231aed1 net/qede: fix multicast drop in promiscuous mode
After enabling promiscuous mode all packets whose destination MAC
address is a multicast address were being dropped. This fix configures
H/W to receive all traffic in promiscuous mode. Promiscuous mode also
overrides allmulticast mode on/off status.

Fixes: 40e9f6fc15 ("net/qede: enable VF-VF traffic with unmatched dest address")
Cc: stable@dpdk.org

Signed-off-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
2020-06-30 14:52:30 +02:00
Harman Kalra
9311beeea4 net/octeontx2: support CN98xx
New cn98xx SOC comes up with two NIX blocks wrt
cn96xx, cn93xx, to achieve higher performance.
Also the no of cores increased to 36 from 24.

Adding support for cn98xx where need a logic to
detect if the LF is attached to NIX0 or NIX1 and
then accordingly use the respective NIX block.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-06-30 14:52:30 +02:00
Viacheslav Ovsiienko
420bbdae89 net/mlx5: fix host physical function representor naming
The new kernel adds the names like "pf0" for Host PCI physical
function representor on Bluefield SmartNIC hosts. This patch
provides correct HPF representor recognition over the kernel
versions 5.7 and laters.

The following port naming formats are supported:

  - missing physical port name (no sysfs/netlink key) at all,
    master is assumed

  - decimal digits (for example "12"), representor is
    assumed, the value is the index of attached VF

  - "p" followed by decimal digits, for example "p2", master
    is assumed

  - "pf" followed by PF index, for example "pf0", Host PF
     representor is assumed on SmartNIC systems.

  - "pf" followed by PF index concatenated with "vf" followed by
     VF index, for example "pf0vf1", representor is assumed.
     If index of VF is "-1" it is a special case of Host PF
     representor, this representor must be indexed in devargs
     as 65535, for example representor=[0-3,65535] will
     allow representors for VF0, VF1, VF2, VF3 and for host PF.

Fixes: 79aa430721 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:30 +02:00
Junyu Jiang
4717a12cfa net/ice: initialize and update RSS based on user config
Initialize and update RSS configure based on user request
(rte_eth_rss_conf) from dev_configure and .rss_hash_update ops.
All previous default configure has been removed.

Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2020-06-30 14:52:30 +02:00
Ori Kam
262c7ad0dd common/mlx5: move doorbell record from net driver
The creation of DBR can be used by a number of different
Mellanox PMDs. for example RegEx / Net / VDPA.

This commits moves the DBR creation and release functions to common
folder.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-06-30 14:52:30 +02:00
Ophir Munk
391b8bcc81 common/mlx5: move some getter functions from net driver
Getter functions such as: 'mlx5_os_get_ctx_device_name',
'mlx5_os_get_ctx_device_path', 'mlx5_os_get_dev_device_name',
'mlx5_os_get_umem_id' are implemented under net directory. To enable
additional devices (e.g. regex, vdpa) to access these getter functions
they are moved under common directory.

As part of this commit string sizes DEV_SYSFS_NAME_MAX and
DEV_SYSFS_PATH_MAX are increased by 1 to make sure that the destination
string size in strncpy() function is bigger than the source string size.
This update will avoid GCC version 8 error -Werror=stringop-truncation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:30 +02:00
Suanming Mou
ac79183dc6 net/mlx5: optimize free counter lookup
Currently, when allocate a new counter, it needs loop the whole
container pool list to get a free counter.

In the case with millions of counters allocated, and all the pools
are empty, allocate the new counter will still need to loop the
whole container pool list first, then allocate a new pool to get a
free counter. It wastes the cycles during the pool list traversal.

Add a global free counter list in the container helps to get the free
counters more efficiently.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:30 +02:00
Suanming Mou
b1cc226644 net/mlx5: optimize single counter pool search
For single counter, when allocate a new counter, it needs to find the pool
it belongs in order to do the query together.

Once there are millions of counters allocated, the pool array in the
counter container will become very large. In this case, the pool search
from the pool array will become extremely slow.

Save the minimum and maximum counter ID to have a quick check of current
counter ID range. And start searching the pool from the last pool in the
container will mostly get the needed pool since counter ID increases
sequentially.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:29 +02:00
Suanming Mou
632f0f1905 net/mlx5: manage shared counters in three-level table
Currently, to check if any shared counter with same ID existing, it will
have to loop the counter pools to search for the counter. Even add the
counter to the list will also not so helpful while there are thousands
of shared counters in the list.

Change Three-Level table to look up the counter index saved in the
relevant table entry will be more efficient.

This patch introduces the Three-level table to save the ID relevant
counter index in the table. Then the next while the same ID comes, just
check the table entry of this ID will get the counter index directly.
No search will be needed.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-30 14:52:29 +02:00
Suanming Mou
bd81eaebd9 net/mlx5: add three-level table utility
For the case which data is linked with sequence increased index, the
array table will be more efficient than hash table once need to search
one data entry in large numbers of entries. Since the traditional hash
tables has fixed table size, when huge numbers of data saved to the hash
table, it also comes lots of hash conflict.

But simple array table also has fixed size, allocates all the needed
memory at once will waste lots of memory. For the case don't know the
exactly number of entries will be impossible to allocate the array.

Then the multiple level table helps to balance the two disadvantages.
Allocate a global high level table with sub table entries at first,
the global table contains the sub table entries, and the sub table will
be allocated only once the corresponding index entry need to be saved.
e.g. for up to 32-bits index, three level table with 10-10-12 splitting,
with sequence increased index, the memory grows with every 4K entries.

The currently implementation introduces 10-10-12 32-bits splitting
Three-Level table to help the cases which have millions of entries to
save. The index entries can be addressed directly by the index, no
search will be needed.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-06-30 14:52:29 +02:00
David Marchand
63783b0172 net/mlx5: remove redundant newline from logs
The DRV_LOG macro already appends a newline.

Fixes: 46287eacc1 ("net/mlx5: introduce hash list")
Fixes: 860897d289 ("net/mlx5: reorganize flow tables with hash list")
Fixes: e484e40323 ("net/mlx5: optimize tag traversal with hash list")
Fixes: 6801116688 ("net/mlx5: fix multiple flow table hash list")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-30 14:52:29 +02:00
Andrew Rybchenko
ff49ac69d4 net/sfc: reap Tx descriptors at least once
Improve cache hit and increase packet rate on benchmarks.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
441476b000 vdpa/mlx5: support MTU feature
The guest virtio device may request MTU updating when the vhost backend
device exposes a capability to support it.

Expose the MTU feature capability.

At configuration time, check the requested MTU and update it in the HW
device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
aec086c9f1 common/mlx5: share kernel interface name getter
Some configuration of the mlx5 port are done by the kernel net device
associated to the IB device represents the PCI device.

The DPDK mlx5 driver uses Linux system calls, for example ioctl, in
order to configure per port configurations requested by the DPDK user.

One of the basic knowledges required to access the correct kernel net
device is its name.

Move function to get interface name from IB device path to the common
library.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
04e7beeb12 vdpa/mlx5: adjust virtio queue protection domain
In other to fill the new requirement for virtq
configuration, set the single PD managed by the driver for
all the virtqs.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
473d8e67d8 common/mlx5: add virtio queue protection domain
Starting from FW version 22.27.4002, it is required to
configure protection domain (PD) for each virtq created by
DevX.

Add PD requirement in virtq DevX APIs.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
7de66d823e vdpa/mlx5: support virtio queue statistics get
Add support for statistics operations.

A DevX counter object is allocated per virtq in order to
manage the virtq statistics.

The counter object is allocated before the virtq creation
and destroyed after it, so the statistics are valid only in
the life time of the virtq.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
796ae7bb6a common/mlx5: support DevX virtq stats operations
Add DevX API to create and query virtio queue statistics
from the HW. The next counters are supported by the HW per
virtio queue:
	received_desc.
	completed_desc.
	error_cqes.
	bad_desc_errors.
	exceed_max_chain.
	invalid_buffer.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
3990ea41c4 net/ice/base: replace RSS profile locks
Replacing flow profile locks with RSS profile locks in the function to
remove all RSS rules for a given VSI. This is to align the locks used
for RSS rule addition to VSI and removal during VSI teardown to avoid
a race condition owing to several iterations of the above operations.
In function to get RSS rules for given VSI and protocol header replacing
the pointer reference of the RSS entry with a copy of hash value to
ensure thread safety.

Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
072158c652 net/ice/base: fix VSI ID mask to 10 bits
set_rss_lut failed due to incorrect vsi_id mask. vsi_id is 10 bit
but mask was 0x1FF whereas it should be 0x3FF.

For vsi_num >= 512, FW set_rss_lut has been failing with return code
EACCESS (vsi ownership issue) because software was providing
incorrect vsi_num (dropping 10th bit due to incorrect mask) for
set_rss_lut admin command

Fixes: a90fae1d07 ("net/ice/base: add admin queue structures and commands")
Cc: stable@dpdk.org

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
1ffe6670e0 net/ice/base: choose TCP dummy packet by protocol
In order to find proper dummy packets for switch filter,
it need to check ipv4 next protocol number, if it is 0x06,
which means next payload is TCP, we need to use TCP
format dummy packet.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
418d2563d1 net/ice/base: get tunnel type for recipe
This patch add support to get tunnel type of recipe
after get recipe from FW. This will fix the issue in
function ice_find_recp() for tunnel type comparing.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
3c0b91c387 net/ice/base: support flow director for GTPU with outer IPv6
Add FDIR support for MAC_IPV6_GTPU type with outer IPv6 address, teid
and qfi fields matching.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
efae14de50 net/ice/base: rename misleading variable
The grst_delay variable in ice_check_reset contains the maximum time
(in 100 msec units) that the driver will wait for a reset event to
transition to the Device Active state. The value is the sum of three
separate components:
1) The maximum time it may take for the firmware to process its
outstanding command before handling the reset request.
2) The value in RSTCTL.GRSTDEL (the delay firmware inserts between first
seeing the driver reset request and the actual hardware assertion).
3) The maximum expected reset processing time in hardware.

Referring to this total time as "grst_delay" is misleading and
potentially confusing to someone checking the code and cross-referencing
the hardware specification.

Fix this by renaming the variable to "grst_timeout", which is more
descriptive of its actual use.

Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
e014cd42e9 net/ice/base: add commands for system diagnostic
System diagnostic solution extend the ability to fetch FW
internal status data and error indication.

Signed-off-by: Sharon Haroni <sharon.haroni@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
c1bec172e8 net/ice/base: support flow director for outer IP of GTPU
Add outer IP address fields while generating the training packets for
GTPU, so that we can support FDIR based on outer IP of GTPU.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
7621dd771c net/ice/base: refactor to avoid need to retry
The ice_discover_caps function is used to read the device and function
capabilities, updating the hardware capabilities structures with
relevant data.

The exact number of capabilities returned by the hardware is unknown
ahead of time. The AdminQ command will report the total number of
capabilities in the return buffer.

The current implementation involves requesting capabilities once,
reading this returned size, and then re-requested with that size.

This isn't really necessary. The firmware interface has a maximum size
of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than
ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities.

Avoid the retry loop by simply allocating a buffer of size
ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The
extra allocation isn't a big deal, as it will be released after we
finish parsing the capabilities.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Qi Zhang
d6e0dca1d5 net/ice/base: adjust profile ID map locks
The profile id map lock should be held till the caller completes
all references of that profile entries.

The current code releases the lock right after the match search.
This caused a driver issue when the profile map entries were
referenced after it was freed in other thread after the lock was
released earlier.

Also return type of get/set profile functions were changed to
return the ice status instead of the profile entry pointer.
This will prevent the caller referencing the profile fields
outside the lock.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2020-06-30 14:52:29 +02:00
Thomas Monjalon
4f299b7169 build: replace meson OS detection with variable
Some places were calling the meson function host_machine.system()
instead of the variables is_windows and is_linux defined
in config/meson.build.

At the same time, the missing "Linux restriction" reason is added to
pfe and octeontx2 crypto PMDs.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-06-30 15:29:59 +02:00
Tal Shnaiderman
b762221ac2 bus/pci: support Windows with bifurcated drivers
Uses SetupAPI.h functions to scan PCI tree.
Uses DEVPKEY_Device_Numa_Node to get the PCI NUMA node.
Uses SPDRP_BUSNUMBER and SPDRP_BUSNUMBER to get the BDF.
scanning currently supports types RTE_KDRV_NONE.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
33031608e8 bus/pci: introduce Windows support with stubs
Addition of stub eal and bus/pci functions to compile
bus/pci for Windows.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
b137f95366 pci: build on Windows
Added <sys/types.h> in rte_pci header file
to include off_t type since it is missing for Windows.

Define the implementation of the Linux function rte_pci_get_sysfs_path
in pci_common.c for Linux OS only as it is unneeded for other OSs
and to avoid the warning on deprecated call to getenv() on Windows:

"warning: 'getenv' is deprecated: This function or variable may be unsafe.
Consider using _dupenv_s instead."

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
2fd3567e54 pci: use OS generic memory mapping functions
Changing all of PCIs Unix memory mapping to the
new memory allocation API wrapper.

Change all of PCI mapping function usage in
bus/pci to support the new API.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Tal Shnaiderman
309bf90bf9 build: generate version map file for MinGW
The MinGW build for Windows has special cases where exported
function contain additional prefix:

__emutls_v.per_lcore__*

To avoid adding those prefixed functions to the version.map file
the map_to_def.py script was modified to create a map file for MinGW
with the needed changed.

The file name was changed to map_to_win.py and lib/meson.build map output
was unified with drivers/meson.build output

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
77cca7ccec build: fix drivers library path on Windows
import library (/IMPLIB) in meson.build should use
the 'drivers' and not 'libs' folder.

The error is: fatal error LNK1149: output filename matches input filename.
The fix uses the correct folder.

Fixes: 5ed3766981 ("drivers: process shared link dependencies as for libs")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
abd5c69bf6 build: skip pmdinfogen on Windows
pmdinfogen generation is currently unsupported for Windows.
The relevant part in meson.build is skipped.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Haiyue Wang
54f3fb127d bus/pci: fix VF memory access
To fix CVE-2020-12888, the linux vfio-pci module will invalidate mmaps
and block MMIO access on disabled memory, it will send a SIGBUS to the
application:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abafbc551fdd

When the application opens the vfio PCI device, the vfio-pci module will
enable the bus memory space through PCI read/write access. According to
the PCIe specification, the 'Memory Space Enable' is always zero for VF:

             Table 9-13 Command Register Changes

Bit Location | PF and VF Register Differences | PF         | VF
             | From Base                      | Attributes | Attributes
-------------+--------------------------------+------------+-----------
             | Memory Space Enable - Does not |            |
             | apply to VFs. Must be hardwired|  Base      |  0b
     1       | to 0b for VFs. VF Memory Space |            |
             | is controlled by the VF MSE bit|            |
             | in the VF Control register.    |            |
-------------+--------------------------------+------------+-----------

Afterwards the vfio-pci will initialize its own virtual PCI config space
data ('vconfig') by reading the VF's physical PCI config space, then the
'Memory Space Enable' bit in vconfig will always be 0b value. This will
make the vfio-pci treat the BAR memory space as disabled, and the SIGBUS
will be triggered if access these BARs.

By investigation, the VF PCI device *passthrough* into the Guest OS by
QEMU has the 'Memory Space Enable' with 1b value. That's because every
PCI driver will start to enable the memory space, and this action will
be hooked by vfio-pci virtual PCI read/write to set the 'Memory Space
Enable' in vconfig space to 1b. So VF runs in guest OS has 'Mem+', but
VF runs in host OS has 'Mem-'.

Align with PCI working mode in Guest/QEMU/Host, in DPDK, enable the PCI
bus memory space explicitly to avoid access on disabled memory.

Fixes: 33604c3135 ("vfio: refactor PCI BAR mapping")
Cc: stable@dpdk.org

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Tested-by: Thierry Martin <thierry.martin.public@gmail.com>
2020-06-25 15:22:51 +02:00
Long Li
1aef0aef36 bus/vmbus: fix ring buffer mapping
vmbus_map_addr is used as the next start virtual address for mapping ring
buffer. However it's updated based on ring_buf, which is a pointer to an
address on the stack. The next ring buffer may be mapped to an unexpected
address.

Fix this by calculating vmbus_map_addr based on returned virtual address.

Fixes: 3f9277031a ("bus/vmbus: fix check for mmap failure")
Cc: stable@dpdk.org

Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-06-25 01:04:17 +02:00
Jerin Jacob
c79a1c6746 bus/pci: optimize bus scan
In order to optimize the PCI management, RTE_KDRV_NONE based
device driver probing removed by not adding them to list in
the scan phase.

The legacy virtio is the only consumer of RTE_KDRV_NONE based device
driver probe scheme. The legacy virtio support will be available
through the existing VFIO/UIO based kernel driver scheme.

This patch also removes the deprecation notice for the same.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
2020-06-24 23:49:15 +02:00