Currently, when no Rx buffer descriptors are available in the receive
direction on the hns3 network engine, incoming packets are always
dropped by hardware. This patch reports the '.rx_drop_en' information to
the DPDK framework in the '.dev_infos_get', '.rxq_info_get' and
'.rx_queue_setup' ops implementation functions.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
To manage header modify actions, the mlx5 PMD used a singly linked list.
Lookup and insertion took too long when there were millions of objects,
which impacted the flow insertion/deletion rate.
To improve performance, a hashed list is used instead. The list
implementation is updated to support non-unique keys with few
collisions.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The mlx5 PMD hashed list was designed to contain items with unique keys
only. Now there is a need to store objects with possible key collisions.
Many collisions are not expected (most likely only a few), but keys are
no longer unique.
This commit adds extended hash list functions to support insertion and
lookup for lists with non-unique keys.
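A minimal sketch of the idea, not the mlx5 implementation: insertion
skips the uniqueness check, and lookup takes a caller-supplied match
callback to tell entries with the same key apart. All names below
(hlist_insert_ex, hlist_lookup_ex, struct item, the bucket count) are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HLIST_BUCKETS 64u /* hypothetical table size, power of two */

struct hlist_entry {
	struct hlist_entry *next; /* next entry in the same bucket */
	uint64_t key;             /* may be shared by several entries */
};

struct hlist {
	struct hlist_entry *heads[HLIST_BUCKETS];
};

/* Caller-supplied predicate: returns 0 when the entry is the wanted one. */
typedef int (*hlist_match_cb)(const struct hlist_entry *entry, void *ctx);

static size_t
hlist_bucket(uint64_t key)
{
	return (size_t)(key & (HLIST_BUCKETS - 1));
}

/* Insert at the bucket head without checking whether the key exists. */
static void
hlist_insert_ex(struct hlist *h, struct hlist_entry *entry)
{
	size_t idx = hlist_bucket(entry->key);

	entry->next = h->heads[idx];
	h->heads[idx] = entry;
}

/* Return the first entry with this key for which cb() returns 0. */
static struct hlist_entry *
hlist_lookup_ex(struct hlist *h, uint64_t key, hlist_match_cb cb, void *ctx)
{
	struct hlist_entry *e;

	for (e = h->heads[hlist_bucket(key)]; e != NULL; e = e->next)
		if (e->key == key && cb(e, ctx) == 0)
			return e;
	return NULL;
}

/* Example user object; the list entry is the first member. */
struct item {
	struct hlist_entry entry;
	int id;
};

/* Example callback: match on the id pointed to by ctx. */
static int
item_match_id(const struct hlist_entry *entry, void *ctx)
{
	const struct item *it = (const struct item *)entry;

	return it->id == *(const int *)ctx ? 0 : 1;
}
```

With few collisions per key, the extra callback comparisons stay cheap
compared with walking one long linked list.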
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The data from the host is trusted but checked by the driver.
One check is missing: the sum of the packet offset and length
may wrap around.
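As an illustration only (the helper name and the 32-bit field widths
are assumptions, not taken from the driver), a range check can be
written so that no addition is performed and nothing can wrap:

```c
#include <stdint.h>

/*
 * Return 1 if [offset, offset + len) lies inside a buffer of buf_size
 * bytes, 0 otherwise. A naive "offset + len <= buf_size" test can pass
 * by accident when the 32-bit sum wraps around.
 */
static int
pkt_range_ok(uint32_t offset, uint32_t len, uint32_t buf_size)
{
	if (offset > buf_size)
		return 0;
	/* buf_size - offset cannot underflow after the check above. */
	return len <= buf_size - offset;
}
```

For example, offset 16 with length 0xFFFFFFF8 sums to 8 in 32-bit
arithmetic, which a naive comparison against a 4096-byte buffer would
accept; the subtraction form rejects it.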
Cc: stable@dpdk.org
Reported-by: Nan Chen <whutchennan@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Long Li <longli@microsoft.com>
chim_index could be reused by another hn_txdesc after re-allocation.
Mark it as invalid to prevent a stale value from being used.
Fixes: cc0251813277 ("net/netvsc: split send buffers from Tx descriptors")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
netvsc is a high-speed VMBus device that uses the monitor bit to signal
the host. It is not necessary to send interrupts via the INT bit.
Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
netvsc uses rxbuf_info buffer to track received packets attached via
rte_pktmbuf_attach_extbuf() and ack the host based on usage count. It
uses the transaction_id in the VMBus packet to locate where to use
memory in the rxbuf_info.
This is not correct in a multi-channel setup, as different channels may
return identical transaction_ids at the same time, which may corrupt the
rxbuf_info buffer.
Fix this by defining rxbuf_info for each queue.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds a missing LLQ-related check in the
ena_com_is_doorbell_needed() routine, which is relevant for the feature
supported by the next generation HW of the ENA.
Fixes: b2b02edeb0d6 ("net/ena/base: upgrade HAL for new HW features")
CC: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Signed-off-by: Artur Rojek <ar@semihalf.com>
This declaration is the same as the one a few lines before.
Fixes: 6844d146ff39 ("eal: add bus pointer in device structure")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Now that the pci_map_resource API is private to the PCI bus, we can drop
the compatibility workaround we had implemented in 20.08.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
As reported during 20.08 work for Windows, the pci_map_resource API was
built with the assumption that its flags would be passed to mmap().
This introduced a regression when adding the rte_mem_map API as reported
in the workaround commit 9d2b24593724 ("pci: keep API compatibility with
mmap values").
This API was only used in the PCI bus code, so move it there.
There is no code change happening during the move.
The only change is in the pci_map_resource description where the
additional flags are now documented as rte_mem_map API flags:
- * The additional flags for the mapping range.
+ * The additional rte_mem_map() flags for the mapping range.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
The rte_kernel_driver enum actually only pointed at PCI drivers and is
only used in the PCI subsystem.
Remove it from the generic device API and use a private enum in the PCI
code.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This field was not generic as it was filled with PCI kernel drivers only.
It has no known in-tree user (and I could not find opensource projects
using it).
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Remove the deprecated buf_physaddr union field from rte_mbuf.
It is replaced with buf_iova which is at the same offset.
The single field buf_physaddr in rte_kni_mbuf is also renamed.
This concludes a 3-year process of semantic change.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Remove the deprecated functions
- rte_mbuf_data_dma_addr
- rte_mbuf_data_dma_addr_default
which aliased the more recent functions
- rte_mbuf_data_iova
- rte_mbuf_data_iova_default
Remove the deprecated macros
- rte_pktmbuf_mtophys
- rte_pktmbuf_mtophys_offset
which aliased the more recent macros
- rte_pktmbuf_iova
- rte_pktmbuf_iova_offset
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Remove the deprecated unioned fields phys_addr
from the structures rte_memseg and rte_memzone.
They are replaced with the fields iova which are at the same offsets.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
The checks for libfdt try dependency() first, which would only work if
a pkg-config file were present, but libfdt has none.
Then they probe for the lib path itself via cc.find_library.
But later the result of either probe is added to ext_deps, which ends up
in the build and also causes the resulting pkg-config file to contain
toolchain-versioned paths in Libs.private like:
/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libfdt.so
which obviously breaks on toolchain updates.
In general libs used multiple times - ipn3ke + ifpga in this case - are
checked centrally in config/meson.build so move it there and fix the
adding of dependencies to not use the full file path.
The result is libfdt in pkg-config now showing up as:
Libs.private: -pthread -lm -ldl -lnuma -lfdt -lpcap
Fixes: e1defba4cf66 ("raw/ifpga/base: support device tree")
Cc: stable@dpdk.org
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
The driver APIs for returning the queue default config can fail if the
parameters are invalid, or for other reasons, so allow them to return
error codes to the rawdev layer and from there to the app.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
The queue setup and queue defaults query functions take a void * parameter
as configuration data, preventing any compile-time checking of the
parameters and limiting runtime checks. Adding in the length of the
expected structure provides a measure of typechecking, and can also be used
for ABI compatibility in future, since ABI changes involving structs almost
always involve a change in size.
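A rough sketch of the size check, using a hypothetical driver config
struct and callback name rather than the actual rawdev prototypes:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-specific queue configuration. */
struct my_queue_conf {
	uint16_t ring_size;
	uint16_t flags;
};

/*
 * Hypothetical driver callback: conf is opaque to the generic layer,
 * conf_size is what the caller believes the struct size to be.
 */
static int
my_queue_setup(uint16_t queue_id, const void *conf, size_t conf_size)
{
	const struct my_queue_conf *qc = conf;

	(void)queue_id;
	/* A size mismatch suggests a wrong type or an older/newer ABI. */
	if (conf == NULL || conf_size != sizeof(*qc))
		return -EINVAL;
	if (qc->ring_size == 0)
		return -EINVAL;
	return 0;
}
```

The caller simply passes sizeof() of the struct it filled in, so a
mismatch is caught at runtime even though the pointer is void *.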
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Currently with the rawdev API there is no way to check that the structure
passed in via the dev_private pointer in the structure passed to configure
API is of the correct type - it's just checked that it is non-NULL. Adding
in the length of the expected structure provides a measure of typechecking,
and can also be used for ABI compatibility in future, since ABI changes
involving structs almost always involve a change in size.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Since we now allow some parameter checking inside the driver info_get()
functions, it makes sense to allow error return from those functions to the
caller. Therefore we change the driver callback return type from void to
int.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Currently with the rawdev API there is no way to check that the structure
passed in via the dev_private pointer in the dev_info structure is of the
correct type - it's just checked that it is non-NULL. Adding in the length
of the expected structure provides a measure of typechecking, and can also
be used for ABI compatibility in future, since ABI changes involving
structs almost always involve a change in size.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Swap subsystem vendor id and subsystem device id.
Parse the SPDRP_HARDWAREID string with correct type values.
Fixes: b762221ac24 ("bus/pci: support Windows with bifurcated drivers")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
The current support builds vdev with empty MP functions, as MP is
currently unsupported on Windows.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Added missing code to free Input/Output buffers and memory
registration.
Also added calls to this code in case of error in the qp setup
procedure.
The rollback code itself did not handle rollback properly
and did not check return value from the fastpath setup.
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Make is no longer supported for compiling DPDK, so the config files are
no longer needed.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
A decision was made [1] to no longer support Make in DPDK. This patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Start a new release cycle with empty release notes.
The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
OVS-DPDK seems to set the reset bit for every flow query. Honor
the bit by resetting the SW counter values after assigning them.
Also set the 'hit' bit only if the counter value retrieved by HW
is non-zero.
While querying flow stats, scan the maximum possible number of entries
in the fc table for valid entries, rather than the number of active
entries, as an active entry can be in any slot in the table.
This is a critical fix for OVS-DPDK flow aging.
Fixes: 306c2d28e247 ("net/bnxt: support count action in flow query")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Adjusted resource allocations for the hardware resources
like TCAM entries, action records, stat counters, exact match records to
scale up offload flows.
Also increased IPv4 NAT entries to 1023.
This patch is a critical fix to enable driver load on current and all
FW versions going forward.
Fixes: cef3749d501e2 ("net/bnxt: update TruFlow resource allocation numbers")
Signed-off-by: Shahaji Bhosle <sbhosle@broadcom.com>
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Different devices have different hash capabilities, so it should not be
expected that every default hash setting can be applied successfully on
all devices. Remove the return value check when setting the default
hash. Also remove the GTPU default hash setting; iavf only enables hash
for general protocols.
Fixes: c94366cfc641 ("net/iavf: add GTPU in default hash")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Wei Xie <weix.xie@intel.com>
The two fixes do not address the real root cause of the MDD event; they
only mitigate the failure rate in different test modes, so revert them.
Fixes: 2a0c9ae4f646 ("net/ice: fix TCP checksum offload")
Fixes: 7365a3cee51f ("net/ice: calculate TCP header size for offload")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The variables 'td_offset' and 'td_tag' should be reset to 0 for every
packet in a burst; otherwise the Tx descriptor fields will be set
incorrectly, which causes an MDD event error and a Tx hang.
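A simplified illustration of the pattern, with hypothetical packet and
descriptor layouts rather than the ice driver's real ones: declaring the
two variables inside the loop guarantees that nothing leaks from the
previous packet.

```c
#include <stdint.h>

/* Hypothetical, simplified packet and descriptor layouts. */
struct pkt {
	uint16_t l2_len;
	uint16_t vlan_tci;
	int has_vlan;
};

struct tx_desc {
	uint32_t offset;
	uint16_t tag;
};

static void
fill_tx_descs(const struct pkt *pkts, struct tx_desc *descs, uint16_t nb)
{
	uint16_t i;

	for (i = 0; i < nb; i++) {
		/* Reset for every packet of the burst. */
		uint32_t td_offset = 0;
		uint16_t td_tag = 0;

		td_offset |= pkts[i].l2_len;
		if (pkts[i].has_vlan)
			td_tag = pkts[i].vlan_tci;

		descs[i].offset = td_offset;
		descs[i].tag = td_tag;
	}
}
```

If the two variables were declared once before the loop, the OR into
td_offset and the conditional write of td_tag would carry bits from
earlier packets into later descriptors.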
Fixes: 17c7d0f9d6a4 ("net/ice: support basic Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The driver name was registered as "net_mlx5_regex".
It is renamed as "regex_mlx5".
The same name is used in mlx5_regex_driver.pci_driver.driver.name,
instead of "mlx5_regex", for consistency.
The string used for log registration (pmd.regex.mlx5) could be derived
from the driver name. A macro is created so name definitions are close.
Fixes: cfc672a90b74 ("regex/mlx5: support probing")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ori Kam <orika@mellanox.com>
When a virtq is destroyed, the SW should be able to continue the virtq
processing from where the HW stopped.
The current destroy behavior in the driver saves the virtq state (used
and available indexes) only when LM is requested.
So, when LM is not requested the queue state is not saved and the SW
indexes stay invalid.
Save the virtq state in the virtq destroy process.
Fixes: bff735011078 ("vdpa/mlx5: prepare virtio queues")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The driver CQ event management is done by a non-vhost library thread,
either the DPDK host thread or the internal vDPA driver thread.
When a queue is updated, the CQ may be destroyed and created by the vhost
library thread via the queue state operation.
When the queue update feature was added, it did not synchronize the CQ
management with the queue update, which may cause invalid memory access.
Add this synchronization using a new per-device configuration mutex.
Fixes: c47d6e83334e ("vdpa/mlx5: support queue update")
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The code should look into "slab" to figure out the index returned from
rte_bitmap_scan().
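A small sketch of the index computation, assuming the scan reports the
bit position of the start of the slab in 'pos' and the 64-bit word
itself in 'slab'. The helper name is hypothetical and
__builtin_ctzll() is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the slab base position with the lowest set bit of the slab
 * to get the absolute index of the first set bit.
 */
static uint32_t
slab_first_index(uint32_t pos, uint64_t slab)
{
	/* Counting trailing zeros is undefined for a zero word. */
	assert(slab != 0);
	return pos + (uint32_t)__builtin_ctzll(slab);
}
```

Using 'pos' alone would return the first bit of the slab whether or not
that particular bit is set.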
Fixes: cc02518132 ("net/netvsc: split send buffers from Tx descriptors")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
When the parameter reta_size < RTE_RETA_GROUP_SIZE, reta_count will be 0.
This function will then loop forever.
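One way to avoid a zero group count is to round the division up. This
is only a sketch of that arithmetic; it is not necessarily how the patch
itself fixes the loop, and RETA_GROUP_SIZE is a local stand-in for
RTE_RETA_GROUP_SIZE.

```c
#define RETA_GROUP_SIZE 64u /* stand-in for RTE_RETA_GROUP_SIZE */

/* Number of groups needed to hold reta_size entries, rounded up. */
static unsigned int
reta_group_count(unsigned int reta_size)
{
	return (reta_size + RETA_GROUP_SIZE - 1) / RETA_GROUP_SIZE;
}
```

With plain truncating division, any reta_size below the group size gives
0, so a loop that advances by that count never makes progress.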
Fixes: 734ce47f71e0 ("bonding: support RSS dynamic configuration")
Cc: stable@dpdk.org
Signed-off-by: Zhiguang He <hezhiguang3@huawei.com>
Acked-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
A previous fix added a definition of the number of retries for UAR
allocation.
This value is adequate for x86 systems with 4K pages.
On a Power9 system with 64K pages the required value is 32.
This patch updates the defined value from 2 to 32.
Fixes: a0bfe9d56f74 ("net/mlx5: fix UAR memory mapping type")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Commit cc0251813277 ("net/netvsc: split send buffers from Tx
descriptors") changed the way that transmit descriptors are
allocated. They come from a single pool instead of being
individually attached to each mbuf. To find the IOVA, you need
to calculate the offset from the base of the pool.
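The offset calculation can be sketched as follows. The struct and
function names are hypothetical, and the sketch assumes the pool is a
single region that is contiguous in IO address space.

```c
#include <stdint.h>

/* Hypothetical view of a contiguous descriptor pool. */
struct desc_pool {
	void *base_va;      /* virtual address of the first byte */
	uint64_t base_iova; /* IO address of the first byte */
};

/* IO address of an object that lives inside the pool. */
static uint64_t
desc_iova(const struct desc_pool *pool, const void *desc)
{
	uintptr_t off = (uintptr_t)desc - (uintptr_t)pool->base_va;

	return pool->base_iova + off;
}
```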
Fixes: cc0251813277 ("net/netvsc: split send buffers from Tx descriptors")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Store port_id in pmd_internals when eth kni device is created.
Then set packet port of rte_mbuf in function eth_kni_rx.
Cc: stable@dpdk.org
Signed-off-by: Jecky Pei <jpei@sonicwall.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The reserved field should be 0x60 instead of 0x40.
Otherwise the FW check fails.
Fixes: 9428310ae1f1 ("regex/mlx5: add engine status check")
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
mlx5_nl_mac_addr_flush should flush all allocated MAC
addresses.
The MAC addresses array size should be of size
MLX5_MAX_MAC_ADDRESSES, but currently we return without
flushing the addresses if size is MLX5_MAX_MAC_ADDRESSES.
This was fixed by not allowing an array larger than
MLX5_MAX_MAC_ADDRESSES.
Fixes: e9a8ac59b6e2 ("common/mlx5: fix MAC addresses assert")
Cc: stable@dpdk.org
Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
To detect the timestamp mode configured on the NIC, the mlx5
PMD uses the firmware command ACCESS_REGISTER_USER. This
command is relatively new and may not be supported by
older firmware versions, which reject it and cause annoying
messages in the kernel log.
This patch adds a check of the attribute flag that indicates whether
the firmware supports the command, and avoids the call if it does not.
Fixes: bb7ef9a96281 ("common/mlx5: add register access DevX routine")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Setting the flags of tapfd may fail and the return value
should be checked.
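A generic sketch of checking both fcntl() calls. The helper name is
hypothetical, and the choice of O_NONBLOCK as the flag being set is an
assumption made for illustration.

```c
#include <fcntl.h>

/* Set O_NONBLOCK on fd; return 0 on success, -1 if either call fails. */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;
	return 0;
}
```

The caller can then close the descriptor and fail the setup instead of
continuing with a descriptor in an unexpected mode.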
Coverity issue: 140739
Fixes: e3b434818bbb ("net/virtio-user: support kernel vhost")
Cc: stable@dpdk.org
Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
At .new_device() time, only the first vring pair is
now ready; the other vrings are configured later.
The problem is that when the application sets up and enables
interrupts, only the Rx interrupt of the first queue pair is
enabled.
This patch fixes the issue by setting the maximum number of
interrupts to the number of Rx queues that will be
initialized later. Then, as soon as an Rx vring is ready
and its interrupt is enabled by the application, it removes the
corresponding uninitialized epoll event and installs a
new one with the valid FD.
Fixes: 604052ae5395 ("net/vhost: support queue update")
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Now that the vhost library saves the guest notification
enablement value in its virtqueue metadata, it is not
necessary to do it in the vring_state_changed callback.
One effect of the patch is also to prevent a possible
deadlock in the vhost library.
Fixes: 604052ae5395 ("net/vhost: support queue update")
Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Remove the memory management scheme for Extended Exact Match
that uses system memory. Use the host memory scheme instead, which
was the default anyway.
Fixes: b2da02480cb7 ("net/bnxt: support EEM system memory")
Signed-off-by: Randy Schacher <stuart.schacher@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Peter Spreadborough <peter.spreadborough@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>