Fix a buffer overrun issue spotted by Coverity while accessing the
ulp_device_params array.
Note that the issue was observed in an internal Coverity scan.
Fixes: 313ac35ac701 ("net/bnxt: support ULP session manager init")
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Clang 6.0.0 does not define the _mm512_maskz_set1_epi64 intrinsic for
the i686 target. Fix it by replacing the function with
_mm512_set4_epi64 when doing a 32-bit build.
Warning message during build:
../drivers/net/virtio/virtio_rxtx_packed_avx.c:385:19: warning:
implicit declaration of function '_mm512_maskz_set1_epi64' is invalid
in C99 [-Wimplicit-function-declaration]
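For reference, when the mask is a repeating four-bit pattern the masked
set1 can be expressed with _mm512_set4_epi64; a minimal sketch of the
substitution (the 0xaa mask value is illustrative, not necessarily the
one used in the driver):

#include <immintrin.h>

/* Lanes 1, 3, 5 and 7 get 'val', lanes 0, 2, 4 and 6 get zero, i.e. the
 * same result as _mm512_maskz_set1_epi64(0xaa, val), but it also builds
 * with clang 6.0.0 for 32-bit targets. */
static inline __m512i
masked_set1_sketch(long long val)
{
        return _mm512_set4_epi64(val, 0, val, 0);
}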
Fixes: 77d66da83834 ("net/virtio: add vectorized packed ring Rx")
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When doing virtio device initialization, virtqueues are reset in
server mode if the ring type is packed. This causes an issue because
the queues have already been freed at the beginning of device
initialization.
Fix this issue by checking whether the device has been initialized
before the reset. If the device has not been initialized, there is no
need to reset the queues.
Fixes: 6ebbf4109f35 ("net/virtio-user: fix packed ring server mode")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The current formulas to calculate the TQM slow path and fast path ring
context memory sizes are not quite correct. TQM slow path entry is
array index 0 of ctx->tqm_mem[]. The other array entries are for fast
path. Fix these sizes according to the firmware specification for
57500 and newer chips.
Fixes: cc5e26b8ef98 ("net/bnxt: increase TQM entry allocation")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Newer firmware advertises the number of TQM rings to allocate
context memory for. Use the firmware specified value and fall back
to the old value derived from "bp->max_q" if it is not available.
Fixes: f8168ca0e690 ("net/bnxt: support thor controller")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Recent Linux kernels support extended acknowledgment of netlink
messages. This is quite useful for diagnosing kernel configuration
errors when using the TAP driver.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The tap_nl_recv() function does not need the full, complex recvmsg()
system call; a basic recv() works here. Likewise, tap_nl_send() does
not need the full sendmsg().
Also add logic to retry on EINTR rather than forcing error handling
back into the driver or, worse, up to the ethdev API.
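A minimal sketch of the simplified receive path with the retry (the
helper name is illustrative, not the driver's):

#include <errno.h>
#include <sys/socket.h>

static ssize_t
nl_recv_retry(int fd, void *buf, size_t len)
{
        ssize_t ret;

        do {
                /* plain recv(), no struct msghdr needed */
                ret = recv(fd, buf, len, 0);
        } while (ret < 0 && errno == EINTR); /* retry instead of failing */

        return ret;
}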
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The TAP driver does not initialize all the elements of the rte_flow
structure. This can lead to a crash in rte_flow_destroy().
(gdb) where
flow=0x100e99280, error=0x0)
at drivers/net/tap/tap_flow.c:1514
(gdb) p remote_flow
$1 = (struct rte_flow *) 0x6b6b6b6b6b6b6b6b
Which is here:
static int
tap_flow_destroy_pmd(struct pmd_internals *pmd,
                     struct rte_flow *flow,
                     struct rte_flow_error *error)
{
        struct rte_flow *remote_flow = flow->remote_flow;
        ...
        if (remote_flow) {
                remote_flow->msg.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
The simplest fix is to use rte_zmalloc() so that remote_flow and the
other fields are always initialized to zero.
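A sketch of the allocation change (simplified; whether the driver uses
__func__ as the zone name is an assumption):

/* before: remote_flow and other fields keep whatever was in memory */
flow = rte_malloc(__func__, sizeof(struct rte_flow), 0);

/* after: every field, including remote_flow, starts out as zero */
flow = rte_zmalloc(__func__, sizeof(struct rte_flow), 0);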
Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The number of queues in the queue group should be checked before it
is used. This patch fixes the issue.
Fixes: 47d460d63233 ("net/ice: rework switch filter")
Cc: stable@dpdk.org
Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Tested-by: Qimai Xiao <qimaix.xiao@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
This patch fixes the issue that the mark-only case is not supported.
A mark-only action is equivalent to a mark + passthru action.
Fixes: f5cafa961fae ("net/ice: add flow director create and destroy")
Cc: stable@dpdk.org
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
DPDK does not implement an interrupt mechanism on BSD,
so force NIC status synchronization.
Fixes: dc66e5fd01b9 ("net/ixgbe: improve link state check on VF")
Cc: stable@dpdk.org
Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
Tested-by: Zhimin Huang <zhiminx.huang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
When we download a switch rule for IPv6 with an ESP payload, such as
"eth / ipv6 / esp spi is 1 / end actions queue index 2 / end"
and we do not add the bm bit set check for tun_type, then an IPv4
packet with an ESP payload, e.g.
"sendp([Ether(dst="00:00:00:00:01:00")/IP(proto=50)/ESP(spi=1)/
("X"*480)], iface="ens5f0", count=10)"
will also go to queue index 2. We also need to do the tun_type check,
otherwise the second of the following rules cannot be downloaded
because it is rejected by the switch rule download function
ice_aq_sw_rules().
"eth / ipv4 / esp spi is 1 / end actions queue index 5 / end"
"eth / ipv6 / esp spi is 1 / end actions queue index 2 / end"
Fixes: 4f11962fce84 ("net/ice/base: support AH ESP and NAT-T on switch")
Fixes: 99d8ba79efbe ("net/ice/base: force switch to use different recipe")
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Tested-by: Qi Fu <qi.fu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
If the PF driver does not support the new speed reporting capabilities
then use link_event instead of link_event_adv to get the speed.
Fixes: 48de41ca11f0 ("net/avf: enable link status update")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
We see stack smashing as a result of missing defensive code. When
nb_pkts is less than RTE_BNXT_DESCS_PER_LOOP, it becomes zero after
the floor alignment, and the following packet receive loop cannot
exit. The buffers are then overwritten and the stack frame is
corrupted.
Fix the problem by adding defensive code: when nb_pkts is zero, just
return directly with no packets.
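A sketch of the defensive check (the surrounding receive function is
omitted; RTE_ALIGN_FLOOR is shown only to illustrate the alignment):

/* round the request down to a multiple of the vector loop width */
nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_BNXT_DESCS_PER_LOOP);

/* a request smaller than the loop width rounds down to zero; return
 * immediately instead of entering the receive loop */
if (!nb_pkts)
        return 0;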
Fixes: bc4a000f2f53 ("net/bnxt: implement SSE vector mode")
Cc: stable@dpdk.org
Signed-off-by: Linsi Yuan <yuanlinsi01@baidu.com>
Signed-off-by: Dongsheng Rong <rongdongsheng@baidu.com>
Acked-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
This fixes the problem where the driver would not start if it had
only a single Rx queue and multiple Tx queues. In that case, RSS
should stay disabled.
Fixes: 92d23a57cafe ("net/netvsc: support configuring RSS parameters")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
If the number of Tx queues is greater than the number of Rx queues,
the driver ends up allocating more channels than Rx queues.
The problem is that the RSS indirection table is programmed such
that some packets will end up on a channel that is never polled.
The fix is to limit the RSS indirection table to the number of Rx
queues, not channels.
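A sketch of the intended indirection table setup (table size and array
name are illustrative):

/* spread the indirection table over Rx queues only, so that no entry
 * points at a channel that is never polled */
for (i = 0; i < rss_ind_table_size; i++)
        rss_ind_table[i] = i % dev->data->nb_rx_queues;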
Fixes: 92d23a57cafe ("net/netvsc: support configuring RSS parameters")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
With multiple channels, the primary channel may receive a
notification that the VF has been added or removed while a secondary
channel is in the process of receiving or transmitting. Resolve this
race by converting the existing vf_lock to a reader/writer lock.
Users of the lock (tx/rx/stats) acquire it for read, and actions like
add/remove acquire it for write.
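A sketch of the locking pattern with DPDK's reader/writer lock (the
surrounding code and function names are simplified for illustration):

#include <rte_rwlock.h>

static rte_rwlock_t vf_lock = RTE_RWLOCK_INITIALIZER;

/* datapath (tx/rx/stats): many readers can hold the lock at once */
static void datapath_use_vf(void)
{
        rte_rwlock_read_lock(&vf_lock);
        /* ... use the VF device ... */
        rte_rwlock_read_unlock(&vf_lock);
}

/* control path (VF add/remove): exclusive access */
static void control_switch_vf(void)
{
        rte_rwlock_write_lock(&vf_lock);
        /* ... attach or detach the VF ... */
        rte_rwlock_write_unlock(&vf_lock);
}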
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
VF notifications are handled as VMBUS notifications on the primary
channel (and not as hotplug), so the channel should be checked before
deciding to use the VF for Rx or Tx.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Some PMDs do not respect the eth_dev API when allocating their
rte_eth_dev. As a result, on the device add event resulting from the
rte_eth_dev_probing_finish() call, the eth_dev being processed is
incomplete. A segfault is a good way to focus the developer on the
issue, but it does not inspire confidence. Instead, warn the user of
the error repeatedly.
The failsafe PMD can warn of the issue and continue: it will
repeatedly attempt to initialize the failed port and complain about
it, which should result in the same developer focus but with less
crashing.
Signed-off-by: Gaetan Rivet <grive@u256.net>
Zero is a valid fd, but when the fd is zero it is not closed, which
leads to an fd leak.
The service proxy is also zero-initialized, on the assumption that
all of its fields are invalid at 0. The issue is that a file
descriptor of 0 is a valid one.
The value -1 is used as a sentinel during cleanup, so initialize the
Rx proxy file descriptor to -1.
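A sketch of the sentinel handling (the variable name is simplified for
illustration):

/* init: 0 is a valid descriptor, so use -1 to mean "unset" */
rxp_fd = -1;

/* cleanup: an "if (rxp_fd)" test would leak a descriptor of 0 */
if (rxp_fd >= 0) {
        close(rxp_fd);
        rxp_fd = -1;
}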
Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Gaetan Rivet <grive@u256.net>
Tested-by: Ali Alnubani <alialnu@mellanox.com>
The issue can be reproduced with the "make EXTRA_CFLAGS='-O1'"
command using gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2).
Build error:
.../drivers/net/ena/ena_ethdev.c: In function ‘eth_ena_dev_init’:
.../drivers/net/ena/ena_ethdev.c:1815:20:
error: ‘wd_state’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
1815 | adapter->wd_state = wd_state;
| ~~~~~~~~~~~~~~~~~~^~~~~~~~~~
This looks like a false positive; fix it by assigning an initial
value to the 'wd_state' variable.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
rte_pci_probe() is private to the PCI bus.
Clean the remaining references in the documentation and comments.
Fixes: c752998b5e2e ("pci: introduce library and driver")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
Add support for Rx checksum offload. When a bad checksum is received
(inner/outer L3/L4), the corresponding layer with the bad checksum is
reported. Also add an Rx burst function pointer hook for Rx checksum
offload to the event PMD.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add a macro-based framework to hook the dequeue/enqueue function
pointers to the appropriate function based on Rx/Tx offloads.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The flow_verbs_translate() function accumulates hash fields while
iterating through the flow items (SRC_IPV4, DST_IPV4, SRC_IPV6,
DST_IPV6, SRC_PORT_TCP, DST_PORT_TCP, SRC_PORT_UDP, DST_PORT_UDP).
Before this commit, the dev_flow handle structure was reused in each
new flow_verbs_translate() call; however, the dev_flow->hash_fields
variable was not reset before each call. As a result, hash_fields
from previous calls remained present in the current flow, which led
to invalid combinations (e.g. simultaneous IPv4 and IPv6 specs). This
scenario happens, for example, with the following flow sequence when
running in verbs mode (dv_flow_en=0).
flow create 0 ingress group 0 pattern eth / ipv4 / end <rss actions>
flow create 0 ingress group 0 pattern eth / ipv6 / end <rss actions>
The fix is to reset dev_flow->hash_fields in flow_verbs_prepare().
Fixes: e7bfa3596a0a ("net/mlx5: separate the flow handle resource")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The HW is optimized for IPv4/IPv6. For such cases, avoid matching on
the ethertype and use the ip_version field instead.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Introduce a helper function to set the ip_version match.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
gcc 10.0.1 reports warnings when using mlx5_rte_flow enums
with rte_flow type enums. For example:
../drivers/net/mlx5/mlx5_flow.c: In function ‘flow_hairpin_split’:
../drivers/net/mlx5/mlx5_flow.c:3406:19:
warning: implicit conversion from ‘enum mlx5_rte_flow_action_type’ to
‘enum rte_flow_action_type’ [-Wenum-conversion]
3406 | tag_action->type = MLX5_RTE_FLOW_ACTION_TYPE_TAG;
| ^
../drivers/net/mlx5/mlx5_flow.c:3419:13:
warning: implicit conversion from ‘enum mlx5_rte_flow_item_type’
to ‘enum rte_flow_item_type’ [-Wenum-conversion]
3419 | item->type = MLX5_RTE_FLOW_ITEM_TYPE_TAG;
| ^
Fix by casting to the correct enum.
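For the first reported line, the cast can look like this (shown for
illustration):

tag_action->type = (enum rte_flow_action_type)
                   MLX5_RTE_FLOW_ACTION_TYPE_TAG;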
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
All comparisons should be done in CPU endianness, otherwise they will
not give the right results.
For example:
255 after converting into RTE_BE16 will be bigger than 4096 after
converting into RTE_BE16.
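On a little-endian CPU, for instance, rte_cpu_to_be_16(255) is 0xFF00
(65280) while rte_cpu_to_be_16(4096) is 0x0010 (16), so comparing the
big-endian values reverses the expected order.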
Fixes: a5f2da0b816b ("net/mlx5: support modify VLAN ID on new VLAN header")
Cc: stable@dpdk.org
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The assertion was added incorrectly when converting the modify
actions into the format of the low-level driver.
There is no mask specified in the rte_flow actions, and the PMD
driver will give a mask of all 1s to the field to be modified. For
each field, the mask cannot be zero, but for the whole header that
contains this field, the masks of the other fields could be zero. The
assertion needs to be removed for debug mode.
Fixes: 72a944dba163 ("net/mlx5: fix header modify action validation")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Currently, there is no flow aging check or age-out event callback
mechanism in the mlx5 driver; this patch implements one. It includes:
- Splitting the current counter container into aged and non-aged
  containers to reduce memory consumption. The aged container
  allocates extra memory to save the aging parameters from the user
  configuration.
- An aging check and age-out event callback mechanism based on the
  current counters. When a flow is detected as aged-out, the
  RTE_ETH_EVENT_FLOW_AGED event is triggered to applications.
- Implementation of the new API rte_flow_get_aged_flows, which
  applications can use to get the aged flows (see the usage sketch
  below).
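For reference, a minimal usage sketch of the new API (not part of the
patch; error handling is omitted and 'port_id' is assumed to be a
started port):

#include <stdlib.h>
#include <rte_flow.h>

/* after receiving RTE_ETH_EVENT_FLOW_AGED for 'port_id' */
struct rte_flow_error error;
void **contexts;
int nb;

/* a first call with a NULL array only returns the number of aged flows */
nb = rte_flow_get_aged_flows(port_id, NULL, 0, &error);
if (nb > 0) {
        contexts = calloc(nb, sizeof(*contexts));
        nb = rte_flow_get_aged_flows(port_id, contexts, nb, &error);
        /* each entry is the 'context' given in that flow's AGE action */
}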
Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, the counter pool needs 512 ext-counter memory entries for
non-batch counters; they are allocated separately, in one piece,
behind the 512 basic-counter entries. This makes it hard to get the
ext-counter pointer from the corresponding basic-counter pointer, and
also hard to expand to other potential additional types of counter
memory.
So, allocate each ext-counter together with its basic-counter as a
single piece of memory. The same will hold for further additional
types of counter memory. In this case, one piece of memory contains
all types of memory for one counter, and each type of memory is easy
to reach by offsetting.
Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The PMD creates some default control rules with the RSS action when
not in isolated mode.
However, whether the default control rules need to do RSS or not
should be controlled by the device configuration, specifically the
mq_mode of the rxmode configuration.
In other words, RSS is needed for the default rules only when mq_mode
is configured with ETH_MQ_RX_RSS_FLAG set.
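A sketch of the intended condition (illustrative; the actual rule
creation code is omitted):

struct rte_eth_conf *conf = &dev->data->dev_conf;

if (conf->rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG) {
        /* default control rules are expanded with the RSS action */
} else {
        /* default control rules target a single queue, no RSS */
}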
Fixes: c64ccc0eca2f ("mlx5: fix overwritten RSS configuration")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The ULP mark manager originally assumed that zero was an invalid
mark and used it for invalidation and deletion. The mark manager
now supports adding zero as a mark and flags for validity and type,
and adds explicit bounds checking instead of relying on the mask.
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The current mark handling uses the meta data field of the rxcmp as
the first-level check for determining gfid vs lfid. When the meta
data is zero because only the lowest 16 bits of the gfid are set, the
cfa code is incorrectly interpreted as an lfid. Change the code to
look at the meta fmt instead of the meta data directly for this
determination.
Fixes: b87abb2e55cb ("net/bnxt: support marking packet")
Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Due to an errata, the RED algorithm needs to be set to discard
instead of stall on 96XX C0 silicon for two-rate shaping. This
workaround is already handled for a newly created hierarchy but not
for dynamic shaper update cases. This patch therefore applies the
workaround for dynamic shaper updates as well.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
On detecting the outer L4 checksum as bad, both the outer and inner
checksums are marked as bad. There is no need to explicitly check the
inner L4 checksum in this case.
Outer L4 UDP checksum error => PKT_RX_OUTER_L4_CKSUM_BAD
and PKT_RX_L4_CKSUM_BAD
Inner L4 UDP checksum error => PKT_RX_L4_CKSUM_BAD
Fixes: 41fe7a3a11fd ("net/octeontx2: offload bad L2/L3/L4 UDP lengths detection")
Signed-off-by: Amit Gupta <agupta3@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
When octeontx_create() is cleaning up, it does not correctly set
the mac_addrs variable to NULL, which will lead to a double free.
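The usual pattern to avoid the double free is shown below (a sketch
only; the actual cleanup path may differ):

rte_free(eth_dev->data->mac_addrs);
/* clear the pointer so a later release path does not free it again */
eth_dev->data->mac_addrs = NULL;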
Fixes: 9e399b88ce2f ("net/octeontx: fix memory leak of MAC address table")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
Add support for the get firmware version operation.
Get and dump the multi-boot image (MBI) version as part of the
firmware version string, along with the management firmware (MFW)
version.
Use qede_fw_version_get() for PMD info logs.
Signed-off-by: Yash Sharma <ysharma@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
In case VIRTIO_F_ORDER_PLATFORM (36) is not negotiated, the frontend
and backend are assumed to be implemented in software, that is, they
can run on identical CPUs in an SMP configuration.
Thus a weak form of memory barriers, like rte_smp_r/wmb rather than
rte_cio_r/wmb, is sufficient for this case (vq->hw->weak_barriers ==
1) and yields better performance.
For the above case, this patch yields even better performance by
replacing the two-way barriers with C11 one-way barriers for the
avail index in the split ring.
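A sketch of the substitution for the avail index update (field access
simplified; the exact driver code may differ):

struct vring_avail *avail = vq->vq_split.ring.avail;

/* before: two-way barrier plus plain store */
rte_smp_wmb();
avail->idx = idx;

/* after: one-way release store; the earlier descriptor writes are
 * ordered before the index becomes visible to the device */
__atomic_store_n(&avail->idx, idx, __ATOMIC_RELEASE);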
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In case VIRTIO_F_ORDER_PLATFORM (36) is not negotiated, the frontend
and backend are assumed to be implemented in software, that is, they
can run on identical CPUs in an SMP configuration.
Thus a weak form of memory barriers, like rte_smp_r/wmb rather than
rte_cio_r/wmb, is sufficient for this case (vq->hw->weak_barriers ==
1) and yields better performance.
For the above case, this patch yields even better performance by
replacing the two-way barriers with C11 one-way barriers for the
used index in the split ring.
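Similarly, a sketch for reading the used index (field access
simplified; the exact driver code may differ):

struct vring_used *used = vq->vq_split.ring.used;

/* before: plain load followed by a two-way read barrier */
idx = used->idx;
rte_smp_rmb();

/* after: one-way acquire load; the used ring entries read afterwards
 * cannot be observed ahead of the index itself */
idx = __atomic_load_n(&used->idx, __ATOMIC_ACQUIRE);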
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Since the return value of the '.stats_reset' and '.xstats_reset'
callback functions is int, the relevant callback function should
return a non-zero value when it fails to issue the command to the
firmware to clear the statistics.
Fixes: 8839c5e202f3 ("net/hns3: support device stats")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Currently, an error may occur during hns3 VF device initialization.
The root cause is as follows:
When the following formula is executed during initialization, the
private variable hw->tqps_num has not yet been obtained from the PF
driver through the mailbox, which further causes a failure when
mapping interrupts and queues.
hw->num_msi = (num_msi > hw->tqps_num + 1) ? hw->tqps_num + 1 : num_msi;
We need to use hw->tqps_num only after it has been correctly assigned.
On the other hand, because the private variable hw->num_msi, which
represents the number of MSI-X interrupts of the hns3 PF/VF device,
is used in the '.get_reg' ops implementation function to dump all
interrupt-related registers, it should be obtained directly from the
firmware, and we had better not modify it in the driver.
Fixes: ef2e785c36cf ("net/hns3: fix Tx interrupt when enabling Rx interrupt")
Fixes: 02a7b55657b2 ("net/hns3: support Rx interrupt")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Hao Chen <chenhao164@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
In the current version, when an upper-level application calls the
rte_eth_dev_configure API function, the hns3 PMD also sets the VLAN
pvid related configuration to hardware even if the pvid config is not
set in the rte_eth_conf input parameter, and this is not reasonable.
For example, if the pvid has been set to 100 by
rte_eth_dev_set_vlan_pvid and the pvid config is not set in
rte_eth_conf, rte_eth_dev_configure will tell the driver to delete
pvid 0, and that is meaningless.
This patch fixes it to ensure that the driver does not set the VLAN
pvid related configuration to hardware when the pvid config is not
set in rte_eth_conf.
Fixes: 411d23b9eafb ("net/hns3: support VLAN")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>