Normally, when closing the device, the queue memzone should be
freed. But the memzone is not freed when the device setup ops are
as follows:
rte_eth_bond_slave_remove
-->__eth_bond_slave_remove_lock_free
---->slave_remove
------>rte_eth_dev_internal_reset
-------->rte_eth_dev_rx_queue_config
---------->eth_dev_rx_queue_config
------------>em_rx_queue_release
rte_eth_dev_close
-->eth_em_close
---->em_dev_free_queues
------>em_rx_queue_release
(not called because nb_rx_queues and nb_tx_queues are 0)
And when the queue number is changed to a smaller size, the memzones
with the bigger queue indexes are lost. This leads to a memory leak.
So we should release the memzone when releasing queues.
Fixes: 460d167958 ("drivers/net: delete HW rings while freeing queues")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
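For illustration, a minimal sketch of the pattern this fix describes: keep the HW ring memzone handle in the per-queue state and free it in the queue release path, so a later re-configuration to fewer queues cannot leak it. The structure and function names below are hypothetical, not the em driver's actual symbols.

    #include <rte_ethdev.h>
    #include <rte_malloc.h>
    #include <rte_memzone.h>

    /* Hypothetical per-queue state: remember the ring memzone so the
     * release path can free it even if the queue count later shrinks. */
    struct my_rx_queue {
            const struct rte_memzone *mz;   /* HW descriptor ring memzone */
            /* ... descriptor ring pointers, sw ring, etc. ... */
    };

    static void
    my_rx_queue_release(struct my_rx_queue *rxq)
    {
            if (rxq == NULL)
                    return;
            /* Free the HW ring memzone here, not only in dev_close(). */
            rte_memzone_free(rxq->mz);
            rte_free(rxq);
    }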
The more fine-grained flow API action RTE_FLOW_ACTION_TYPE_SAMPLE
should be used instead.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
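As an application-level illustration of the replacement, a hedged sketch of mirroring all ingress traffic to another port with RTE_FLOW_ACTION_TYPE_SAMPLE; the port ID, ratio and empty pattern are illustrative, and error handling is omitted.

    #include <rte_flow.h>

    /* Mirror every ingress packet (ratio = 1) to DPDK port 1 while the
     * original packet continues on its normal path. */
    static struct rte_flow *
    mirror_all_to_port1(uint16_t port_id, struct rte_flow_error *err)
    {
            struct rte_flow_attr attr = { .ingress = 1 };
            struct rte_flow_item pattern[] = {
                    { .type = RTE_FLOW_ITEM_TYPE_ETH },
                    { .type = RTE_FLOW_ITEM_TYPE_END },
            };
            struct rte_flow_action_port_id mirror_port = { .id = 1 };
            struct rte_flow_action sample_actions[] = {
                    { .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &mirror_port },
                    { .type = RTE_FLOW_ACTION_TYPE_END },
            };
            struct rte_flow_action_sample sample = {
                    .ratio = 1,                /* sample every packet */
                    .actions = sample_actions, /* what to do with each sample */
            };
            struct rte_flow_action actions[] = {
                    { .type = RTE_FLOW_ACTION_TYPE_SAMPLE, .conf = &sample },
                    { .type = RTE_FLOW_ACTION_TYPE_END },
            };

            return rte_flow_create(port_id, &attr, pattern, actions, err);
    }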
The link state change interrupt handler of the NFP PMD will delay the
actual LSC work for a short period to ensure the link is stable. If the
link of the port changes state and the port is closed immediately
after the link event, a segmentation fault will occur. This happens
because the delayed LSC work eventually triggers and tries to access
private port data that was released when the port was closed.
Fixes: 6c53f87b34 ("nfp: add link status interrupt")
Cc: stable@dpdk.org
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
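A generic, hedged sketch of the underlying pattern (not the actual NFP code): delayed LSC work scheduled with the EAL alarm API must be cancelled on close, before the private data it dereferences is freed. Names and the delay value are illustrative.

    #include <rte_alarm.h>

    /* Hypothetical private port data used by the delayed LSC work. */
    struct my_port {
            int link_up;
            /* ... */
    };

    #define LSC_SETTLE_DELAY_US (500 * 1000)    /* illustrative delay */

    static void
    my_delayed_lsc_work(void *arg)
    {
            struct my_port *port = arg;

            /* Finish the deferred LSC handling using private port data. */
            port->link_up = 1;
    }

    static void
    my_lsc_interrupt_handler(struct my_port *port)
    {
            /* Defer the real work until the link has settled. */
            rte_eal_alarm_set(LSC_SETTLE_DELAY_US, my_delayed_lsc_work, port);
    }

    static void
    my_port_close(struct my_port *port)
    {
            /* Cancel pending delayed work *before* freeing private data,
             * otherwise the alarm may later fire on released memory. */
            rte_eal_alarm_cancel(my_delayed_lsc_work, port);
            /* ... free port ... */
    }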
Currently, most ethdev callback APIs use a queue ID as parameter, but
the Rx and Tx queue release callbacks take the queue object that is
used by the Rx and Tx burst data plane callbacks.
To align with other eth device queue configuration callbacks:
- queue release callbacks are changed to use queue ID
- all drivers are adapted
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
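A hedged sketch of what the driver-side adaptation looks like: the release callback now receives the device and a queue ID and looks the queue object up itself. The driver names are hypothetical, and the internal header and typedef may differ in detail.

    #include <stdint.h>
    #include <rte_malloc.h>
    #include <ethdev_driver.h>    /* internal header used by PMDs */

    /* Before: static void my_rx_queue_release(void *rxq);
     * After:  the device and a queue ID, like the other queue callbacks. */
    static void
    my_rx_queue_release(struct rte_eth_dev *dev, uint16_t qid)
    {
            void *rxq = dev->data->rx_queues[qid];

            if (rxq == NULL)
                    return;
            /* ... free per-queue HW and SW resources ... */
            rte_free(rxq);
            dev->data->rx_queues[qid] = NULL;
    }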
Some drivers don't need the Rx and Tx queue release callbacks, so make them
optional. Clean up empty queue release callbacks for some drivers.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reflect globally enabled Rx and Tx offloads in the queue configuration.
Also fix an issue with LMT data preparation for multi-segment packets.
Fixes: a24af6361e ("net/cnxk: add Tx queue setup and release")
Fixes: a86144cd9d ("net/cnxk: add Rx queue setup and release")
Fixes: 305ca2c4c3 ("net/cnxk: support multi-segment vector Tx")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
This patch adds support to configure a channel mask which will
be used by rte_flow when adding flow rules with the inline IPsec
action.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adds capabilities for AES_CBC and HMAC_SHA1 for 9k
security offload.
Signed-off-by: Srujana Challa <schalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Sets IP6_UDP_OPT in the NIX RX config to allow optional
UDP checksum for IPv6 in case of security offload.
Also disables drop_re when inline inbound is enabled.
Signed-off-by: Srujana Challa <schalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adds support to update the ethertype for mixed IPsec tunnel
versions, and also sets et_overwr for inbound IPsec.
Signed-off-by: Srujana Challa <schalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adds anti-replay support for the cn9k platform using
a SW anti-replay check.
Signed-off-by: Srujana Challa <schalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
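For background, a generic sliding-window anti-replay check of the kind such SW paths typically implement (RFC 4303 style, 64-packet window); this is a hedged sketch, not the cn9k driver's actual code.

    #include <stdbool.h>
    #include <stdint.h>

    struct replay_win {
            uint64_t top;    /* highest sequence number accepted so far */
            uint64_t bitmap; /* bit i set => (top - i) already received */
    };

    /* Return true if 'seq' is acceptable and record it in the window. */
    static bool
    replay_check_and_update(struct replay_win *w, uint64_t seq)
    {
            if (seq == 0)
                    return false;                  /* sequence 0 is invalid */
            if (seq > w->top) {                    /* new highest: slide window */
                    uint64_t shift = seq - w->top;

                    w->bitmap = (shift >= 64) ? 0 : w->bitmap << shift;
                    w->bitmap |= 1ULL;             /* mark 'seq' itself */
                    w->top = seq;
                    return true;
            }
            uint64_t off = w->top - seq;
            if (off >= 64)
                    return false;                  /* too old, outside window */
            if (w->bitmap & (1ULL << off))
                    return false;                  /* replayed packet */
            w->bitmap |= (1ULL << off);
            return true;
    }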
Add support to create and submit CPT instructions on Tx
on CN10K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to receive CPT processed packets on Rx via
second pass on CN10K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to create and submit CPT instructions on Tx
on CN9K SoC.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support to receive CPT processed packets on Rx for
CN9K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support for inline inbound and outbound IPSec for SA create,
destroy and other NIX / CPT LF configurations.
This patch also changes dpdk-devbind.py to list the new inline
device as a misc device.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add support for inline inbound and outbound IPSec for SA create,
destroy and other NIX / CPT LF configurations.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
reported by "gcc (GCC) 12.0.0 20211003 (experimental)":
../drivers/net/cxgbe/cxgbe_ethdev.c:
In function ‘cxgbe_dev_rx_queue_setup’:
../drivers/net/cxgbe/cxgbe_ethdev.c:682:24:
error: the comparison will always evaluate as ‘true’ for the
address of ‘fl’ will never be NULL [-Werror=address]
682 | if ((&rxq->fl) != NULL)
| ^~
Fix it by removing the useless check.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Adjust the parameter order to match the eth_xstats_get_by_id_t
prototype: make ids the second parameter, as in eth_xstats_get_by_id_t.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
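For reference, a hedged sketch of a driver callback using that parameter order (ids right after the device); the function name and body are hypothetical, and the authoritative prototype is the one in the ethdev driver header.

    #include <stdint.h>

    struct rte_eth_dev;    /* full definition lives in ethdev_driver.h */

    static int
    my_xstats_get_by_id(struct rte_eth_dev *dev, const uint64_t *ids,
                        uint64_t *values, unsigned int n)
    {
            unsigned int i;

            (void)dev;
            (void)ids;
            for (i = 0; i < n; i++)
                    values[i] = 0;    /* fill the counter selected by ids[i] */
            return (int)n;
    }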
The af_packet pmd driver binds to a raw socket and allows sending and
receiving of packets through the kernel.
Since commit [1], the kernel strips the vlan tags early in
__netif_receive_skb_core(), so we receive untagged packets while running
with the af_packet pmd.
Luckily for us, the skb vlan-related fields are still populated from the
stripped vlan tags, so we end up having all the information that we need
in the mbuf.
Having the pmd driver support DEV_RX_OFFLOAD_VLAN_STRIP allows the
application to control the desired vlan stripping behavior, until we
have a way to describe offloads that can't be disabled by pmd drivers.
This patch changes the default way that the af_packet pmd
treats received vlan-tagged frames. Previously, the application
was required to check the PKT_RX_VLAN_STRIPPED flag; after this patch,
the pmd re-inserts the vlan tag transparently to the user, unless
DEV_RX_OFFLOAD_VLAN_STRIP is enabled in rxmode.offloads.
I've attempted a preliminary benchmark to understand if the change could
cause a sizable performance hit.
Setup:
Two virtual machines running on top of an ESXi hypervisor
Tx: DPDK app (running on top of vmxnet3 PMD)
Rx: af_packet (running on top of a kernel vmxnet3 interface)
Packet size: 68 (packet contains a vlan tag)
Rates:
Tx - 1.419 Mpps
Rx (without vlan insertion) - 1227636 pps
Rx (with vlan insertion) - 1220081 pps
At a first glance, we don't seem to have a large degradation in terms of
packet rate.
[1]
https://github.com/torvalds/linux/commit/bcc6d47903612c3861201cc3a866fb60
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
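A hedged application-side sketch of keeping the pre-patch behaviour: request VLAN stripping at configure time if the port advertises it. Function and variable names are illustrative.

    #include <rte_ethdev.h>

    static int
    configure_with_vlan_strip(uint16_t port_id, uint16_t nb_rxq, uint16_t nb_txq)
    {
            struct rte_eth_conf conf = { 0 };
            struct rte_eth_dev_info info;
            int ret;

            ret = rte_eth_dev_info_get(port_id, &info);
            if (ret != 0)
                    return ret;

            /* Without this offload, the af_packet PMD now re-inserts the
             * VLAN tag into the packet data instead of leaving it stripped. */
            if (info.rx_offload_capa & DEV_RX_OFFLOAD_VLAN_STRIP)
                    conf.rxmode.offloads |= DEV_RX_OFFLOAD_VLAN_STRIP;

            return rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
    }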
Add support to fetch port and queue stats via xstats API. Also remove
queue stats from basic stats because they're now available via xstats
API for the VF.
Signed-off-by: Nikhil Vasoya <nikhil.vasoya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Previously, the memif socket hash was always allocated on NUMA socket 0.
If the application is entirely running on another NUMA socket and EAL
--socket-limit prevents memory allocation on NUMA socket 0, memif
creation fails with "HASH: memory allocation failed" error.
This patch allows allocating memif socket hash on any NUMA socket.
Signed-off-by: Junxiao Shi <git@mail1.yoursunny.com>
Reviewed-by: Jakub Grajciar <jgrajcia@cisco.com>
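A hedged illustration of the general idea with the rte_hash API: pass the caller's NUMA node (or SOCKET_ID_ANY) instead of an implicit socket 0. Entry count and key length are illustrative, not memif's actual values.

    #include <rte_hash.h>
    #include <rte_jhash.h>
    #include <rte_lcore.h>

    static struct rte_hash *
    create_local_hash(const char *name)
    {
            struct rte_hash_parameters params = {
                    .name = name,
                    .entries = 256,
                    .key_len = 108,               /* e.g. a UNIX socket path */
                    .hash_func = rte_jhash,
                    .hash_func_init_val = 0,
                    .socket_id = rte_socket_id(), /* caller's NUMA node */
            };

            return rte_hash_create(&params);
    }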
Definition of the `rte_ether_hdr` structure used a workaround allowing DPDK
and Windows SDK headers to be used in the same file, because Windows SDK
defines `s_addr` as a macro. Rename `s_addr` to `src_addr` and `d_addr`
to `dst_addr` to avoid the conflict and remove the workaround.
Deprecation notice:
https://mails.dpdk.org/archives/dev/2021-July/215270.html
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
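After the rename, code touching the Ethernet header simply uses the new field names; a minimal sketch:

    #include <rte_ether.h>

    /* Swap source and destination MACs in place, using the renamed
     * fields (formerly eth->s_addr and eth->d_addr). */
    static void
    swap_macs(struct rte_ether_hdr *eth)
    {
            struct rte_ether_addr tmp = eth->src_addr;

            eth->src_addr = eth->dst_addr;
            eth->dst_addr = tmp;
    }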
This patch enables building the ixgbe driver for Windows.
It also enables its dependencies on security and cryptodev.
I tested on AWS using an ixgbe VF device, with dpdk-testpmd.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
This patch adds a comment to explain how the dpaa_port_fmc_ccnode_parse
function works to get the HW queue from the FMC policy file.
Signed-off-by: Rohit Raj <rohit.raj@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This patch updates the RSS support to add the following additional
distributions:
- VLAN
- ESP
- AH
- PPPOE
Signed-off-by: Vanshika Shukla <vanshika.shukla@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Add Tx redirection support via the flow actions RTE_FLOW_ACTION_TYPE_PHY_PORT
and RTE_FLOW_ACTION_TYPE_PORT_ID.
These actions are executed by the HW to forward packets between ports.
If ingress packets match the rule, the packets are switched
without software involvement, and performance is improved as well.
Signed-off-by: Jun Yang <jun.yang@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
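From the application side, such a rule can be expressed as below; a hedged sketch with an empty match on all ingress Ethernet traffic and an illustrative destination port ID.

    #include <rte_flow.h>

    /* Switch all matching ingress traffic of 'src_port' to 'dst_port'
     * entirely in hardware. Error handling is omitted. */
    static struct rte_flow *
    redirect_to_port(uint16_t src_port, uint16_t dst_port,
                     struct rte_flow_error *err)
    {
            struct rte_flow_attr attr = { .ingress = 1 };
            struct rte_flow_item pattern[] = {
                    { .type = RTE_FLOW_ITEM_TYPE_ETH },
                    { .type = RTE_FLOW_ITEM_TYPE_END },
            };
            struct rte_flow_action_port_id dst = { .id = dst_port };
            struct rte_flow_action actions[] = {
                    { .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &dst },
                    { .type = RTE_FLOW_ACTION_TYPE_END },
            };

            return rte_flow_create(src_port, &attr, pattern, actions, err);
    }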
Update the MC firmware support APIs to the latest version. The new
version supports an improved DPDMUX (SR-IOV equivalent) for traffic
split between DPNIs, and additional PTP APIs.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Fixed with ./devtools/update-abi.sh $(cat ABI_VERSION)
Fixes: e73a7ab224 ("net/softnic: promote manage API")
Fixes: 8f532a34c4 ("fib: promote API to stable")
Fixes: 4aeb92396b ("rib: promote API to stable")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Code changes for the build issue reported in Bug 817:
error: dereferencing type-punned pointer will break strict-aliasing rules
Added a union to avoid the type-punned pointer dereference.
The build issue has been reported with both gcc 4.8.5 (RHEL 7) and
gcc 5.4.0 (Ubuntu 16.04).
Bugzilla ID: 817
Fixes: 39925373a3 ("net/ice/base: add parser execution main loop")
Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
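A generic illustration of the technique (not the actual ice parser code): reading a value's bit pattern through a cast pointer violates strict aliasing, while a union makes the type pun well defined for GCC.

    #include <stdint.h>

    static uint32_t
    float_bits(float f)
    {
            union {
                    float f;
                    uint32_t u;
            } pun;

            pun.f = f;
            return pun.u;    /* instead of *(uint32_t *)&f, which warns/breaks */
    }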
VFs are not allowed to change physical link params when a port
module change is detected. The firmware already returns an appropriate
permission error when a VF tries to change physical link params. But
make sure to avoid sending the command to the firmware from the VF in
the first place, to prevent flooding the firmware debug logs with
permission errors.
Fixes: a83041b1e9 ("net/cxgbe: rework and simplify link handling")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Vasoya <nikhil.vasoya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
The metadata can be set in the mbuf dynamic field and then used in
flow rules steering for egress direction. The hardware requires
network order for both the insertion of a rule and sending a packet.
Indeed, there is no strict restriction on the endianness; the order
used for sending a packet and in its steering rule only needs to be
consistent.
In the past, there was no endianness conversion, for performance
reasons. The flow rule converted the metadata into little endian for
the hardware (if needed) and the packet hit the flow rule also with
little endian.
After the metadata was converted to big endian, the missing adaptation
in the data path resulted in a flow miss for egress packets.
Converting the metadata to big endian before posting a WQE to the
hardware solves this issue.
Fixes: b57e414b48 ("net/mlx5: convert meta register to big-endian")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
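A hedged data-path sketch, assuming the rte_flow dynamic metadata helpers: convert the metadata to network byte order before attaching it to the egress mbuf so it matches the rule programmed into the hardware.

    #include <rte_byteorder.h>
    #include <rte_flow.h>
    #include <rte_mbuf.h>

    /* The metadata dynamic field must have been registered at init time
     * with rte_flow_dynf_metadata_register(). */
    static void
    set_egress_metadata(struct rte_mbuf *m, uint32_t meta)
    {
            rte_flow_dynf_metadata_set(m, rte_cpu_to_be_32(meta));
            m->ol_flags |= PKT_TX_DYNF_METADATA;
    }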
In the function mlx5_alloc_shared_dr(), there are various reasons
that can result in a failure and trigger the error clean-up process.
In the caller, mlx5_dev_spawn(), once an error occurs after
mlx5_alloc_shared_dr(), mlx5_os_free_shared_dr() is called
to release all the resources.
To prevent a double release, the pointers of the resources should
be checked before the release and set to NULL once it is done.
In mlx5_free_table_hash_list(), the pointer was not set to NULL after
the release, so a double release may cause a crash.
Setting the table pointer to NULL, as is done for the other resources,
solves the double release and the crash.
Fixes: 54534725d2 ("net/mlx5: fix flow table hash list conversion")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This patch supports the new global device syntax, for example:
bus=pci,addr=BB:DD.F/class=eth/driver=mlx5,devargs,..
In the driver parameters check, the "driver" key, which is part of the
new global device syntax, is now ignored instead of being reported as
an error.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch removes the simplification in Virtio descriptors
handling, where their buffer addresses are IOVAs for Virtio
PCI devices, and VA-only for Virtio-user devices, which
added a requirement on Virtio-user that it only supported
IOVA as VA.
This change introduced a regression for applications using
Virtio-user and other physical PMDs that require IOVA as PA
because they don't use an IOMMU.
This patch reverts to the old behaviour, but it needed to be
reworked because of the refactoring that happened in v21.02.
Fixes: 17043a2909 ("net/virtio: force IOVA as VA mode for virtio-user")
Cc: stable@dpdk.org
Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
When attaching to an existing mono-queue tap, virtio-user did not
report that the virtio device was not properly initialised, which
prevented the port from being started later.
$ ip tuntap add test mode tap
$ dpdk-testpmd --vdev \
net_virtio_user0,iface=test,path=/dev/vhost-net,queues=2 -- -i
...
virtio_user_dev_init_mac(): (/dev/vhost-net) No valid MAC in devargs or
device, use random
vhost_kernel_open_tap(): TUNSETIFF failed: Invalid argument
vhost_kernel_enable_queue_pair(): fail to open tap for vhost kernel
virtio_user_start_device(): (/dev/vhost-net) Failed to start device
...
Configuring Port 0 (socket 0)
vhost_kernel_open_tap(): TUNSETIFF failed: Invalid argument
vhost_kernel_enable_queue_pair(): fail to open tap for vhost kernel
virtio_set_multiple_queues(): Multiqueue configured but send command
failed, this is too late now...
Fail to start port 0: Invalid argument
Please stop the ports first
Done
The virtio-user driver with the vhost-kernel backend was going through
a lot of complications to initialise tap fds only when using them.
For each qp enabled for the first time, a tapfd was created via
TUNSETIFF with unneeded additional steps (see below) and then mapped to
the right qp in the vhost-net backend.
Unneeded steps (as long as they have been done once for the port):
- tap features were queried, although they are a constant on a running
system,
- the device name in DPDK was updated,
- the mac address of the tap was set.
On subsequent qp state changes, the vhost-net backend fd mapping was
updated and the associated queue/tapfd was disabled/enabled via
TUNSETQUEUE.
Now, this patch simplifies the whole logic by keeping all tapfds open
and enabled (from the tap point of view) at all times.
Unused ioctl defines are removed.
Tap features are validated earlier to fail initialisation asap.
Tap name discovery and mac address configuration are moved to the
configuration of qp 0.
To support attaching to a mono-queue tap, the virtio-user driver now
tries to attach in multi-queue mode first, then falls back to mono queue.
Finally (but this is more for consistency), the VIRTIO_NET_F_MQ feature
is exposed only if the underlying tap supports multi queue.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
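A generic, hedged sketch of the attach fallback (not the driver's code): request a multi-queue tap first, and retry without IFF_MULTI_QUEUE when the existing tap was created as mono-queue.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <net/if.h>
    #include <linux/if_tun.h>

    static int
    tap_attach(const char *ifname)
    {
            struct ifreq ifr;
            int fd = open("/dev/net/tun", O_RDWR);

            if (fd < 0)
                    return -1;

            memset(&ifr, 0, sizeof(ifr));
            snprintf(ifr.ifr_name, IFNAMSIZ, "%s", ifname);

            /* Try multi-queue first so more tapfds can be attached later. */
            ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
            if (ioctl(fd, TUNSETIFF, &ifr) == 0)
                    return fd;

            /* Fall back for taps created without IFF_MULTI_QUEUE,
             * e.g. "ip tuntap add test mode tap". */
            ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
            if (ioctl(fd, TUNSETIFF, &ifr) == 0)
                    return fd;

            close(fd);
            return -1;
    }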
Fix the tunnel port counting logic.
Currently we increment the port count without checking whether
bnxt_hwrm_tunnel_dst_port_alloc returns success or failure.
Modify the logic to increment it only if the firmware returns success.
Fixes: 10d074b202 ("net/bnxt: support tunneling")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
The error recovery async event messages are often mistaken
for errors. Improve the wording to clarify the meaning of
these events.
Also, take a first step towards more inclusive language:
references to "master" will be changed to "primary".
For example, "bnxt_is_master_func()" will be renamed to
"bnxt_is_primary_func()".
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
The device cleanup logic was freeing most of the ring related memory,
but was not freeing up the memzone associated with the rings.
This patch fixes the issue.
Fixes: 2eb53b134a ("net/bnxt: add initial Rx code")
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
The default state of the Tx queues on startup is not correct.
Fix this by setting the state when the port is started.
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
1. Fix to use correct fields in the request structure of
HWRM_FUNC_DRV_RGTR.
2. Remove the "flags" argument to bnxt_hwrm_func_driver_unregister()
as it is not needed.
Fixes: beb3087f50 ("net/bnxt: add driver register/unregister")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Add ice support for the new ethdev APIs to enable/disable and
read/write/adjust IEEE 1588 PTP timestamps. Currently, only the scalar
path supports 1588 PTP; the vector path doesn't.
The example command for running ptpclient is as below:
./build/examples/dpdk-ptpclient -c 1 -n 3 -- -T 0 -p 0x1
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
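A hedged sketch of the generic ethdev calls an application such as dpdk-ptpclient relies on; the queue index and adjustment value are illustrative.

    #include <time.h>
    #include <rte_ethdev.h>

    static int
    ptp_rx_example(uint16_t port_id, uint16_t rx_queue)
    {
            struct timespec ts;
            int ret;

            ret = rte_eth_timesync_enable(port_id);
            if (ret != 0)
                    return ret;

            /* After an mbuf flagged with PKT_RX_IEEE1588_PTP arrives ... */
            ret = rte_eth_timesync_read_rx_timestamp(port_id, &ts, rx_queue);
            if (ret != 0)
                    return ret;

            /* Example correction: step the device clock forward by 1 us. */
            return rte_eth_timesync_adjust_time(port_id, 1000);
    }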
The request to get the VF VSI map may fail when the DCF is busy;
this patch adds a retry mechanism so that it can succeed.
Fixes: b09d34ac85 ("net/ice: fix flow redirector")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
In the iavf_config_rx_queues_irqs function, the memory pointed to by
the intr_handle->intr_vec and qv_map addresses is not released in the
subsequent hook branch, resulting in a resource leak.
Fixes: f593944fc9 ("net/iavf: enable IRQ mapping configuration for large VF")
Cc: stable@dpdk.org
Signed-off-by: Qiming Chen <chenqiming_huawei@163.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Use the dynamic mbuf API to register the timestamp field and flag.
The ice hardware can dump the Rx timestamp value into the dynamic
mbuf field by flex descriptor. This feature is turned on by the dev
config "enable-rx-timestamp". Currently, it's only supported
on the scalar path.
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
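On the application side, the registered field and flag can be looked up through the standard dynamic mbuf names; a hedged sketch:

    #include <stdint.h>
    #include <rte_mbuf.h>
    #include <rte_mbuf_dyn.h>

    static int ts_off = -1;          /* dynfield offset in the mbuf */
    static uint64_t ts_flag;         /* ol_flags bit set by the PMD */

    static int
    rx_timestamp_init(void)
    {
            int bit;

            ts_off = rte_mbuf_dynfield_lookup(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
                                              NULL);
            bit = rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
                                          NULL);
            if (ts_off < 0 || bit < 0)
                    return -1;       /* PMD did not register the timestamp */
            ts_flag = UINT64_C(1) << bit;
            return 0;
    }

    /* Return the Rx timestamp of an mbuf, or 0 if none was attached. */
    static uint64_t
    rx_timestamp_get(struct rte_mbuf *m)
    {
            if ((m->ol_flags & ts_flag) == 0)
                    return 0;
            return *RTE_MBUF_DYNFIELD(m, ts_off, uint64_t *);
    }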
The default VF driver for the Intel 700 Series Ethernet Controller
already switched to iavf in DPDK 21.05, and i40evf no longer needs to
be maintained, so remove the i40evf-related code.
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>