This commit includes the add of:
- ConnectX-6 device ID
- ConnectX-6 SRIOV device ID
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch adds to MLX4 PMD the option to set flow rules
with empty ether MAC and specific VLAN ID.
Note that this fix works with
1. CX-3 FW 2.42.5016 or higher.
2. mlxconfig configuration STEER_FORCE_VLAN set to 1.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch implements the set of actions to support offload
of packet header modifications to MLX5 NIC.
Implementation is based on RFC [1].
[1] http://mails.dpdk.org/archives/dev/2018-November/119971.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
rte_eth_dev_rss_hash_conf_get API doesn't force 'rss_conf.rss_key'
to be not NULL, so rss_key = NULL should be alllowed in i40e
driver.
Fixes: 16321de093 ("ethdev: allow to get RSS hash functions and key")
Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Software indexes are PMD internal and should not be passed outside.
Right now SW and HW indexes of the Tx queue match, so it is just
a cosmetic fix.
Fixes: dbdc82416b ("net/sfc: factor out libefx-based Tx datapath")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
This capability is reported when supported by the current emitting
sub-device. Failsafe PMD itself does not exercise fast free logic.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
DCR is Intel internal information, no need to be in public code.
Related commit:
1a0833efde ("net/i40e/base: fix comment referencing internal data")
Fixes: e5b2a9e957 ("net/avf/base: add base code for avf PMD")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When setting the MTU, eth_dev_mtu_set() is called to validate the
provided MTU. As part of that, it calculates the useful area to store
data and compares it against the MTU, to guarantee that there's enough
space to store the data. It calculates that as:
"tp_frame_size - TPACKET2_HDRLEN - sizeof(struct sockaddr_ll)"
However, the TPACKET2_HDRLEN macro already increments sizeof(struct
sockaddr_ll) internally, meaning the useuful area of data above will
have sizeof(struct sockaddr_ll) decremented twice.
Instead, the useful area of data should be calculated as:
"tp_frame_size - TPACKET2_HDRLEN"
This makes sure that there's enough useful area to fit the provided MTU
after excluding tpacket2_hdr and sockaddr_ll.
Fixes: cc68ac4847 ("net/af_packet: support MTU change")
Cc: stable@dpdk.org
Signed-off-by: Tiago Lam <tiago.lam@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch improves both descriptors dequeue and refill,
by using the same batching strategy as done in in-order path.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
This patch adds support for in-order path when meargeable buffers
feature hasn't been negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Accounting of bytes was moved to a common function, so at the moment we do
it twice. This patches fixes it for sending packets with packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The common data freeing has been moved to rte_eth_dev_release_port(),
so freeing mac_addrs like this in eth_dev_close() is unnecessary and
will cause double free.
Fixes: e16adf08e5 ("ethdev: free all common data when releasing port")
Cc: stable@dpdk.org
Signed-off-by: Hideyuki Yamashita <yamashita.hideyuki@po.ntt-tx.co.jp>
Signed-off-by: Yasufumi Ogawa <ogawa.yasufumi@lab.ntt.co.jp>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The failsafe driver spews messages every time it is polling
making debugging other things more difficult. Just remove it.
Example:
net_failsafe: Hot-plug mutex was locked by thread 140182066431744 by the hot-plug alarm
net_failsafe: Hot-plug mutex was locked by thread 140182066431744 by the hot-plug alarm
...
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The checksum calculation APIs take only the packet headers pointers as
parameters, so they assume that the lengths reported in those headers
are correct. However, a malicious packet could claim to be far larger
than it is, so we need to check the header lengths in the driver before
calling the checksum API.
A better fix would be to allow the lengths to be passed into the API
function, but that would be an API break, so fixing in TAP driver for
now.
Fixes: 8ae3023387 ("net/tap: add Rx/Tx checksum offload support")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
When MPRQ mempool is freed, the pointer stored in priv structure must be
reset to null. Otherwise, the mempool can be freed again if the port is
restarted.
Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch respond the VIRTCHNL_OP_REQUEST_QUEUES msg from VF, and
process to allocated more queues for the requested VF. If successful,
PF will notify VF to reset. If unsuccessful, PF will send message to
inform VF.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Before this patch, VF gets a default number of queues from the PF.
This patch enables VF to request a different number. When VF configures
more queues, it will send VIRTCHNL_OP_REQUEST_QUEUES to PF to request
more queues, if success, PF will reset the VF.
User can run "port stop all", "port config port_id rxq/txq queue_num"
and "port start all" to reconfigure queue number.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Fix following build errors reported by Intel C++ compiler in Windows
build.
C:\> t4_hw.c(5105): warning #147: declaration is incompatible with
"int t4_bar2_sge_qregs(struct adapter *, unsigned int, unsigned int,
u64={uint64_t={unsigned __int64}} *, unsigned int *)"
(declared at line 524 of "..\..\..\..\drivers\net\cxgbe\base\common.h")
int t4_bar2_sge_qregs(struct adapter *adapter, unsigned int qid,
^
C:\> cxgbe_filter.c(42): error : expected an expression
n_user_filters = mult_frac(adap->tids.nftids,
^
C:\> sge.c(400): error : expression must be a pointer to a complete
object type
(uint16_t)(RTE_PTR_ALIGN((char *)mbuf->buf_addr +
^
Build Environment:
1. Target OS: Microsoft Windows Server 2016
2. Compiler: Intel C++ Compiler from Intel Parallel Studio XE 2019 [1]
3. Development Tools:
3.1 Microsoft Visual Studio 2017 Professional
3.2 Windows Software Development Kit (SDK) v10.0.17763
3.3 Windows Driver Kit (WDK) v10.0.17763
[1] https://software.intel.com/en-us/parallel-studio-xe
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Define symbols only when they are not available.
This fixes following types of issues reported by Intel C++ compiler
in Windows build.
C:\> cxgbe_compat.h(154): warning #47: incompatible redefinition of
macro "min"
#define min(a, b) RTE_MIN(a, b)
^
C:\> t4_hw.c(338): warning #266: function "bzero" declared implicitly
bzero(p, 0, size);
^
C:\> t4_hw.c(5337): warning #266: function "htonl" declared implicitly
rvc.op_to_viid = htonl(V_FW_CMD_OP(FW_RSS_VI_CONFIG_CMD) |
^
C:\> sge.c(361): error : expected an expression
struct sge_eth_rxq *rxq = container_of(q, struct sge_eth_rxq, fl);
^
C:\> sge.c(1350): error : identifier "caddr_t" is undefined
static void inline_tx_mbuf(const struct sge_txq *q, caddr_t from,
^
[...]
Build Environment:
1. Target OS: Microsoft Windows Server 2016
2. Compiler: Intel C++ Compiler from Intel Parallel Studio XE 2019 [1]
3. Development Tools:
3.1 Microsoft Visual Studio 2017 Professional
3.2 Windows Software Development Kit (SDK) v10.0.17763
3.3 Windows Driver Kit (WDK) v10.0.17763
[1] https://software.intel.com/en-us/parallel-studio-xe
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Replace "args..." with "fmt, ..." and directly use __VA_ARGS__.
This fixes following errors reported by Intel C++ compiler in Windows
build.
C:\> cxgbe_compat.h(28): error : expected a ")"
#define dev_printf(level, fmt, args...) \
^
C:\> cxgbe_compat.h(31): error : expected a ")"
#define dev_err(x, args...) dev_printf(ERR, args)
^
[...]
Build Environment:
1. Target OS: Microsoft Windows Server 2016
2. Compiler: Intel C++ Compiler from Intel Parallel Studio XE 2019 [1]
3. Development Tools:
3.1 Microsoft Visual Studio 2017 Professional
3.2 Windows Software Development Kit (SDK) v10.0.17763
3.3 Windows Driver Kit (WDK) v10.0.17763
[1] https://software.intel.com/en-us/parallel-studio-xe
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
The Intel C++ compiler is not able to locate the header files without
relative paths in Windows build. Following errors are seen for these
header files.
C:\> base\common.h(9): error : cannot open source file "cxgbe_compat.h"
#include "cxgbe_compat.h"
^
[...]
Fix by explicitly stating header file location using relative paths.
Also, remove automatically including header files for Linux, to keep
it consistent across both OS.
Build Environment:
1. Target OS: Microsoft Windows Server 2016
2. Compiler: Intel C++ Compiler from Intel Parallel Studio XE 2019 [1]
3. Development Tools:
3.1 Microsoft Visual Studio 2017 Professional
3.2 Windows Software Development Kit (SDK) v10.0.17763
3.3 Windows Driver Kit (WDK) v10.0.17763
[1] https://software.intel.com/en-us/parallel-studio-xe
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Add ops link_update.
LSC interrupt is also enabled in this patch.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Normally when starting/stopping the device the queue
should be started and stopped. Support them both in
this patch.
Below ops are added,
dev_configure
dev_start
dev_stop
dev_close
dev_reset
rx_queue_start
rx_queue_stop
tx_queue_start
tx_queue_stop
rx_queue_setup
rx_queue_release
tx_queue_setup
tx_queue_release
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Add some MACRO defination and small functions which
are specific for DPDK.
Add readme too.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the structures that define how the Rx/Tx queues
are used.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the structures and defines that define what
protocols the NIC can handle.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add various headers that define status codes and
basic defines for use in the code.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add code that multiple other features use.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add code to read/write/query the NVM image.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add code to handle the virtual switch within the NIC.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add code for the basic TX scheduler.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the structures for the control queues.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add all the device IDs that represent the NIC.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the commands, error codes, and structures
for the sideband queue.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the commands, error codes, and structures for
the admin queue.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the structures required by the NIC.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the registers that comprise the Intel(R) 800
Series NIC. There is no functionality in this patch.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Skip parsing pattern match items that have no spec. This fixes NULL
dereference when accessing their non-existent spec.
Fixes: ee61f5113b ("net/cxgbe: parse and validate flows")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Location of filter TID table should be after active TID table memory,
and not from the beginning of TID table memory. This fixes memory
corruption due to overlapping regions.
Fixes: 3a381a4116 ("net/cxgbe: query firmware for HASH filter resources")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
The ENAv2 is introducing many new features, mainly the LLQ feature
(Low Latency Queue) which allows the device to process packets faster
and as a result, the latency is noticeably lower.
The second major feature is configurable depth of hw queues where Rx
and Tx can be reconfigured independently and maximum depth of Rx queue
is 8k.
The release also includes many bug fixes and minor new features, like
improved statistics counters and extended statistics.
Driver is still compatible with ENAv1 device.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
After Rx or Tx cleanup update completion queue head by calling
ena_com_update_dev_comp_head().
Fixes: 1daff5260f ("net/ena: use unmasked head and tail")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The Rx drops cannot be acquired using the older API. Now, it must be
read in keep alive message.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
ENA PMD is having it's own custom statistics counters. They are exposed
to the application by using the xstats DPDK API.
The deprecated and unused statistics are removed, together with old API.
Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Those counters provide information regards sent/received bytes and
packets per queue.
Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
When wrong req_id is detected some previous mbufs could be used for
receiving different segments of received packets. In such cases chained
mbufs will be twice returned to pool.
To prevent it chained mbuf is now freed just after error detection.
To simplify cleaning, pointers taken for Rx ring are set to NULL.
As after ena_rx_queue_release_bufs and ena_tx_queue_release_bufs queues
are not used updating of next_to_clean pointer is not necessary.
Fixes: c203497667 ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Use empty_rx_reqs instead of empty_tx_reqs.
As those two variables are part of union this not cause
any failure, but for consistency should be changed.
Fixes: c203497667 ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The PMD was not passing RSS offloads values although it was supporting
the RSS. To allow application to probe the PMD for RSS support, the
missing information was added.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Only PMD_*_LOG is adding new line character to the log message.
All printouts were adjusted for consistency.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Reset function should return the port to initial state, in which no Tx
and Rx queues are setup. Then application should reconfigure the queues.
According to DPDK documentation the rte_eth_dev_reset() itself is a
generic function which only does some hardware reset operations through
calling dev_unint() and dev_init().
ena_com_dev_reset which perform NIC registers reset should be called
during stop.
Fixes: 2081d5e2e9 ("net/ena: add reset routine")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
If start function fails, previously created queues have to be removed.
ena_queue_restart_all() and ena_queue_restart() are renamed to
ena_queue_start_all() and ena_queue_start().
ena_free_io_queues_all() is renamed to ena_queue_stop_all().
Fixes: df238f84c0 ("net/ena: recreate HW IO rings on start and stop")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Before sending next packet, check if calling doorbell is needed.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Some of ENA devices supports 8k Rx rings. Maximum supported size is
received upon device initialization.
As ENA_DEFAULT_RING_SIZE_RX macro is upper limit, it needs to be
adjusted.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
LLQ (Low Latency Queue) is the feature that allows pushing header
directly to the device through PCI before even DMA is triggered.
It reduces latency, because device can start preparing packet before
payload is sent through DMA.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
When invalid req_id is received, the reset should be handled by the
application, as it is indicating invalid rings state, so further Rx
is not making any sense.
Fixes: c203497667 ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The device now allows driver to reconfigure Tx and Rx queues depth
independently. Moreover, maximum size for Tx and Rx can be different.
Those maximum values are received from the device.
After reset, previous ring configuration is restored.
If number of descriptor is set to RTE_ETH_DEV_FALLBACK_RX_RINGSIZE
or RTE_ETH_DEV_FALLBACK_TX_RINGSIZE, the maximum value is restored.
Remove checks, if provided number is not too big, as this is done in
generic functions (rte_eth_rx_queue_setup and rte_eth_tx_queue_setup).
Maximum number of segments is being set for Rx packets and provided to
ena_com_rx_pkt() for validation.
Unused definitions were removed.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Whenever the driver will receive too many descriptors from the device,
it should trigger the device reset with reset reason set to
ENA_REGS_RESET_TOO_MANY_RX_DESCS.
Fixes: 241da076b1 ("net/ena: adjust error checking and cleaning")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The new ena_com allows the number of CPUs to be passed to the device in
the host info structure.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
ena_com is the communication layer provided by the vendor and common to
all ENA drivers.
This patch updates it to version from 2018.09.26.
It adds support for ENAv2 device together with LLQ feature, adds
doorbell optimization and reconfiguration of HW queues depth
independently.
The driver was adjusted to the new changes in the HAL.
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
In SW assisted live migration mode, driver will stop the device and
setup a mediated virtio ring to relay the communication between the
virtio driver and the VDPA device.
This data path intervention will allow SW to help on guest dirty page
logging for live migration.
This SW fallback is event driven relay thread, so when the network
throughput is low, this SW fallback will take little CPU resource, but
when the throughput goes up, the relay thread's CPU usage will goes up
accordingly.
User needs to take all the factors including CPU usage, guest perf
degradation, etc. into consideration when selecting the live migration
support mode.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Vhost lib has already provided a helper for used ring logging, driver
could use it to reduce code.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch series enables a new method for live migration, i.e. software
assisted live migration. This patch provides a device argument for user
to choose the methold.
When "sw-live-migration=1", driver/device will do live migration with a
relay thread dealing with dirty page logging. Without this parameter,
device will do dirty page logging and there's no relay thread consuming
CPU resource.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If user wants the VF to be used in VDPA (vhost data path acceleration)
mode, then the user can add a "vdpa=1" parameter for the device.
So if driver does not find this option, it should quit and let the bus
continue the probe.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If driver fails to register ifc VF device into vhost lib, then this
device should not be stored.
Fixes: a3f8150eac ("net/ifcvf: add ifcvf vDPA driver")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Driver probe may fail for different causes, debug message is helpful for
debugging issue.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
VDPA driver can decide if it needs to enable/disable the host notifier
mapping, so exposing a API can allow flexibility. A later patch will
base on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Until we have support for control virtqueues let's disable it and
fail device initalization if specified as a parameter.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add option to enable packed queue support for virtio-user
devices.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Use packed virtqueue format when reading and writing descriptors
to/from the ring.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This implements the transmit path for devices with
support for packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support to dump packed virtqueue data to the
VIRTQUEUE_DUMP() macro.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
It is possible that the VF device exists but DPDK doesn't know
about it. This could happen if device was blacklisted or more
likely the necessary device (Mellanox) was not part of the DPDK
configuration.
In either case, the right thing to do is just keep working
but only with the slower para-virtual device.
Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
On device close or startup errors, the transmit descriptor pool
was being left behind.
Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
While calculating the input packet count per port, discarded packets
should be reduced, right now only PF VSI discarded packets are reduced.
But while calculating the input byte count per port, Rx byte count is
used, which should take all discarded packets into account, including
VF VSI ones.
This will cause inconsistency in stat counters in some cases.
This patch would take all VSI stats as packet and byte count to address
the issue.
Fixes: 763de290cb ("net/i40e: fix packet count for PF")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The reset flags vf->vf_reset and vf->pend_msg are set when VF received
VIRTCHNL_EVENT_RESET_IMPENDING. So after resetting done, these flags
should be cleared.
Fixes: 8cacf78469 ("net/i40e: fix VF initialization error")
Cc: stable@dpdk.org
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
With a larger PAGE_SIZE it is possible for the MSI table to very
close to the end of the BAR s.t. when we align the start and end
of the MSI table to the PAGE_SIZE, the end offset of the MSI
table is out of the PCI BAR boundary.
This patch addresses the issue by comparing both the start and the
end offset of the MSI table with the BAR size, and skip the mapping
if it is out of Bar scope.
The patch fixes the debug log as below:
EAL: Skipping BAR0
Signed-off-by: Tone Zhang <tone.zhang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch increments error counter (stats.dequeue_err_count)
in case of any error detection during qat_comp_process_response
function.
Fixes: 3cc14fc48e ("compress/qat: check that correct firmware is in use")
Fixes: 32842f2a6d ("compress/qat: create FW request and process response")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
This patch fixes error status which should be set inside
qat_comp_build_request function in case any errors are detected.
In these cases op.status is set to
RTE_COMP_OP_STATUS_INVALID_ARGS to help application debug.
Fixes: 1947bd1858 ("compress/qat: support scatter-gather buffers")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
This reverts commit d09973f6c4 ("common/qat: fix for invalid
response from firmware") due to incorrectly reporting failures
on some older firmware versions.
Fixes: d09973f6c4 ("common/qat: fix for invalid response from firmware")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
NULL algo algo does not to set counter flag so it should be zeroed.
Fixes: db0e952a5c ("crypto/qat: add NULL capability")
Cc: stable@dpdk.org
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
AES-CCM algo does not to set counter flag so it should be zeroed.
Fixes: ab56c4d9ed ("crypto/qat: support AES-CCM")
Cc: stable@dpdk.org
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
Error code of qat_hash_get_block_size needs to be handle properly.
Fixes: 10b49880e3 ("crypto/qat: make the session struct variable in size")
Cc: stable@dpdk.org
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Tested-by: Marko Kovacevic <marko.kovacevic@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
Building Turbo Software as shared library for AVX512 failed
due to wrong order of library in the library list (LDLIBS)
Fixes: b8cfe2c9ae ("bb/turbo_sw: add software turbo driver")
Cc: stable@dpdk.org
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
Test application and Turbo Software driver were adapted
to support chained-mbuf for bigger TB sizes.
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
Offload cost test was improved in order to collect
more accurate results.
Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com>
Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>
Remove if() condition prior to calling BN_free() as
BN_free(a) does nothing if a is NULL.
Signed-off-by: Akash Saxena <akash.saxena@caviumnetworks.com>
Signed-off-by: Shally Verma <shally.verma@caviumnetworks.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
The DPAA bus support code put out messages like:
dpaax: read_memory_node(): Unable to glob device-tree memory node:
(/proc/device-tree/memory[@0-9]*/reg)(3)
dpaax: PA->VA translation not available;
dpaax: Expect performance impact.
These are unnecessary and likely to confuse the end user.
Fix this by doing nothing if bus is empty.
Fixes: 5a7dbb934d ("dpaa: enable dpaax library")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
NDIS multi-queue support is only in WS2012 or later. Check the NDIS
version to limit to single queue on older versions. Similar code
exists in Linux driver.
Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Make DPDK enable SRIOV flag in same way as Linux and FreeBSD.
Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
When using multiple queues, there was a race with the kernel
in setting up the second channel. This regression is due to a kernel change
which does not allow accessing sysfs files for Hyper-V channels that are not opened.
The fix is simple, just move the logic to detect not ready
sub channels earlier in the existing loop.
Fixes: 831dba47bd ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Reported-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Before this patch, there are two functions will call
i40e_dev_free_queues to free queues. For rte_eth_dev_close(), its
redundant because of duplication. For rte_eth_dev_reset() its
redundant because of not necessary, since following dev_configure
is required after dev_reset and it will be updated correctly.
This patch removes redundant code in i40e_dev_free_queues().
Fixes: 6b45371283 ("i40e: free queue memory when closing")
Cc: stable@dpdk.org
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This reverts
commit 09a62d7569 ("net/i40e: fix offload not supported mask")
Contrary to what is said in above patch commit log,
I40E_TX_OFFLOAD_NOTSUP_MASK is the mask of Tx offload bits that are part
of PKT_TX_OFFLOAD_MASK but not included in I40E_TX_OFFLOAD_MASK.
Above patch erroneously includes all PKT_RX_OFFLOAD_ bits in the
I40E_TX_OFFLOAD_NOTSUP_MASK, this is not what is expected.
Restore the initial xor that gives the expected result.
Fixes: 09a62d7569 ("net/i40e: fix offload not supported mask")
Cc: stable@dpdk.org
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When a guest is spanned on multiple NUMA nodes and
multiple Virtio devices are spanned onto these nodes,
we expect that their ring memory is allocated in the
right memory node.
Otherwise, vCPUs from node A may be polling Virtio rings
allocated on node B, which would increase QPI bandwidth
and impact performance.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Since below commit, several tx_prep functions are broken, they fail to
pass supported Tx offload features check:
PKT_TX_IPVx must be set when any PKT_TX_L4 checksum is requested,
but these values are not present in the mask of supported Tx offloads
of several drivers that advertise PKT_TX_L4_MASK.
So any packet sent to those drivers with a L4 checksum request and
one of PKT_TX_IPVx bit set is rejected by the tx prepare function.
Fixes: 1037ed842c ("mbuf: fix Tx offload mask")
Cc: stable@dpdk.org
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch introduces changes for supporting multiprocess support.
This is trivial for VFs but comes with some limitations for the PF.
Due to restrictions when using NFP CPP interface, just one secondary
process is supported by now for the PF.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
The VIC hardware has 64 MAC filters per vNIC, which can be either
unicast or multicast. Use one half for unicast and the other half for
multicast, as the VIC kernel drivers for Linux and Windows do.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Cisco VIC adapters run firmware. Add the fw_version_get handler to
help diagnostics.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Set RTE_ETH_DEV_CLOSE_REMOVE upon probe so rte_eth_dev_close() can
later free port resources including mac_addrs.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
When multiple Chelsio adapters are present, the current naming
convention of using underlying physical port number of the adapter
creates conflicts when allocating control queue mbuf pool for
multiple adapters and hence results in allocation failure. Fix by
using port_id from rte_eth_dev_data, instead of physical port number.
Fixes: 3a3aaabc7c ("net/cxgbe: add control queue to communicate filter requests")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Function mlx5_ctrl_flow_vlan() is used to set the rss rule in
MLX5 PMD, using priv->reta_idx_n as number of Rx queues.
This number is passed to mlx5_flow_validate_action_rss(), which
attempts to access the Rx queues at priv->rxqs.
In case priv->rxqs_n is 0, priv->rxqs is empty, and
mlx5_flow_validate_action_rss() will crash with segmentation fault.
priv->reta_idx_n can never be 0, even if priv->rxqs_n is set to 0.
But when priv->rxqs_n is set to 0, setting the rss rule is invalid.
This patch updates mlx5_ctrl_flow_vlan(), if priv->rxqs_n is 0 the
function will fail with EINVAL errno.
Fixes: 8086cf08b2 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Port reset will call i40evf_uninit_vf() to release resource. It wants
to call i40evf_dev_close() to release resources. Before this patch,
hw->adapter_stopped was used to mark the status about start/stop and
close. So it will never call i40evf_dev_close() after stopping the port.
This patch added hw->adapter_closed flag in i40evf_dev_close() and
i40e_dev_close() to control the status of close.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When starting the device, the RSS table is initialized. So the RSS
update before device_start would be overwritten. This patch allows users
to update the RSS reta table before device_start.
Fixes: db5b65301d ("ethdev: allow to set RSS hash computation flags and/or key")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When starting the device, the RSS table is initialized. So the RSS
update before dev_start would be overwritten. This patch allows users
to update the RSS reta table before dev_start and adjusts the order
to set entries sequentially.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Device can't function properly with a firmware that is in recovery
mode.
Check the firmware status at initialization time. If the firmware is
in recovery mode, alert the user to check it.
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
RSS hash configure get API (i40e_dev_rss_hash_conf_get()) didn't check
the return value of i40e_get_rss_key().
i40e_dev_rss_hash_conf_get() will return success even getting RSS hash
key failed. This patch fixes the issue.
Fixes: d0a349409b ("i40e: support AQ based RSS config")
Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The struct should be ifcvf_net_config other than ifcvf_net_device_config
The variable is used as named opaque variable, struct fields are not
accessed at all, so using wrong struct type has no effect but it should
be fixed.
Fixes: a3f8150eac ("net/ifcvf: add ifcvf vDPA driver")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
tso and vlan parameters were removed from the signature
of txq_mbuf_to_swp function.
The documentation of the function was not updated accordingly.
Remove the tso and vlan documentation to match the function signature.
Fixes: 8f6d9e13a9 ("net/mlx5: remove redundant checks")
Cc: stable@dpdk.org
Signed-off-by: Asaf Penso <asafp@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
If vdev_netvsc is run with debug logging enabled, then the
log output will fill with:
net_vdev_netvsc: interface lo is non-ethernet device
Remove the message since it is not useful.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matan Azrad <matan@mellanox.com>
Reduced number of INFO logs in BNX2X PMD by converting some INFO
logs to DEBUG and few NOTICE logs to INFO, removing extra new lines,
printing banner bar once for the adapter and device specific info.
Fixes: ba7eeb035a ("net/bnx2x: fix logging to include device name")
Fixes: 540a211084 ("bnx2x: driver core")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
The imissed counters (number of packets dropped because the queues were
full) were actually reported through xstats as "rx_out_of_buffer"
but was not reported through stats.
Following a recent discussion on the ML, as there is no way to tell the
user if a counter is implemented or not, this should be considered a
bug. For example, user looking at imissed will think the packets are
lost before reaching the device.
Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
ixgbe is able to recognize 1G SX and LX id, but it is missing the LHA.
Add it, so that it can handle LHA SFP plugin.
Fixes: d2e72774e5 ("ixgbe/base: support X550")
Cc: stable@dpdk.org
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Take the 'other interrupt' into account when setting up
MSI-X interrupts and use the proper mask when enabling it.
Also, rearm the MSI-X vector after the LSC interrupt fires.
This change allows both LSC and RXQ interrupts to work at
the same time when using MSI-X interrupts.
Cc: stable@dpdk.org
Signed-off-by: Timmons C. Player <timmons.player@spirent.com>
Acked-by: Wei Zhao <wei.zhao1@intel.com>
Setting up a device that wasn't setup in the primary
process will possibly break the primary process. That's
because the IPC message to retrieve the group fd in the
primary will also *open* that group if it wasn't opened
before. Even though the secondary process closes that fd
soon after as a part of its error handling path, the
primary process leaks it.
What's worse, opening that fd on the primary will
increment the process-local counter of opened groups.
If it was 0 before, then the group will never be added
to the vfio container, nor dpdk memory will be ever
mapped.
This patch moves the proper error checks earlier in the
code to fully prevent setting up devices in secondary
processes that weren't setup in the primary process.
Fixes: 2f4adfad0a ("vfio: add multiprocess support")
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The pci_resource_by_index called strlen() on uninitialized
memory which would lead to the wrong size of memory allocated
for the path portion of the resource map. This would either cause
excessively large allocation, or worse memory corruption.
Coverity issue: 300868
Fixes: ea9d56226e ("pci: introduce function to map uio resource by index")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In case of running with not enough capabilities, i.e. running as
non-root user any application linked with DPDK prints the message
about IOPL call failure even if it was just called like
'./testpmd --help'. For example, this breaks most of the OVS unit
tests if it built with DPDK support.
Let's register the virtio driver unconditionally and print error
message while probing the device. Silent iopl() call left in the
constructor to have privileges as early as possible as it was before.
Fixes: 565b85dcd9 ("eal: set iopl only when needed")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The VLAN push action on E-Switch supports only 802.1Q (0x8100)
and 802.1AD (0x88A8) Tag Protocol ID (TPID) insertions. The
parameter check for RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN action
is added.
Fixes: 57123c00c1 ("net/mlx5: add Linux TC flower driver for E-Switch flow")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The TCA_FLOWER_KEY_VLAN_ETH_TYPE should be specified for the E-Switch
Flows with VLAN and L3 pattern items in the Netlink messages. The patch
adds missing flower key to the messages. This patch partially reverts to
the code smashed by http://patches.dpdk.org/patch/47781
Fixes: 251e8d02cf ("net/mlx5: add VXLAN to flow translate routine")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The only time that software should write to the TDH register
is after a reset (hardware reset or CTRL.RST) and
before enabling the transmit function (TXDCTL.ENABLE).
If software were to write to this register while the transmit
function was enabled, the on-chip descriptor buffers might
be invalidated and the hardware could become confused.
Fixes: 029fd06d40 ("ixgbe: queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Following commit updated the Tx offload mask
commit 1037ed842c ("mbuf: fix Tx offload mask").
So, not having PKT_TX_IPV6 and PKT_TX_IPV4 in qede PMDs supported
Tx offload mask breaks TSO support since application will fail in transmit
prepare function.
Fixes: 1037ed842c ("mbuf: fix Tx offload mask")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
rx_buffer_info should be refill not linearly, but out of order.
IDs should be taken from empty_rx_reqs array.
rx_refill_buffer is introduced to temporary storage
bulk of mbufs taken from pool.
In case of error unused mbufs are put back to pool.
Fixes: c203497667 ("net/ena: add Rx out of order completion")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
The memcpy operation overwrites the device info fields set
by ethdev API and leaves the device field undefined.
Thus, replaces memcpy by only updating the required fields
at the driver layer.
error log: testpmd> show port info 1
**** Infos for port 1 ****
MAC address: 00:00:00:00:00:00
Device name: net_softnic0
Segmentation fault.
port_infos_display (port_id=1) at /dpdk/app/test-pmd/config.c:418
if (dev_info.device->devargs && dev_info.device->devargs->args)
Fixes: cc6d421574 ("net/softnic: add softnic PMD")
Cc: stable@dpdk.org
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
DCR is Intel internal information, no need to be in public code.
Fixes: de2cd512b1 ("net/i40e/base: new AQ commands for cloud filter")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
When compiling with gcc -O1, this error appears:
drivers/net/mlx4/mlx4_ethdev.c: In function ‘mlx4_rxmode_toggle’:
rte_log.h:321:3: error:
‘mode’ may be used uninitialized in this function
The function mlx4_rxmode_toggle is never called with a value which
is not in the switch block, but GCC complains about it with -O1.
So the default case is "fixed" by setting string "undefined".
Fixes: eacaac7bae ("net/mlx4: restore promisc and allmulti support")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
This patch checks the return value of function
rte_eth_bond_8023ad_agg_selection_set() in bond_ethdev_configure
for error return value.
Fixes: 6d72657ce3 ("net/bonding: add other aggregator modes")
Cc: stable@dpdk.org
Signed-off-by: Lee Daly <lee.daly@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Private vdev was the way previously, when pdump developed, now with
shared device mode on virtual devices, pcap data path in secondary
is not working.
When secondary adds a virtual device, related data transferred to
primary and primary creates the device and shares device back with
secondary. When pcap device created in primary, pcap handlers
(pointers) are process local and they are not valid for secondary
process. This breaks secondary.
So we can't directly share the pcap handlers, but need to create a new
set of handlers for secondary, that's what we done in this patch.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
This is prework for data path enabling for secondary process.
To prevent pcap handler opened by one process be overwritten by
another process, each process should have their private copy,
`rte_eth_dev->process_private` is exactly what we needed.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tx offload mask is updated in following commit 1037ed842c
("mbuf: fix Tx offload mask"). Currently, the new added offload
flags are not supported in PMD and application will fail to call
PMD transmit prepare function.
This patch updates AVF_TX_OFFFLOAD_MASK.
Fixes: 1037ed842c ("mbuf: fix Tx offload mask")
Cc: stable@dpdk.org
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When eth_octeontx is initialized before event_octeontx, eth_octeontx
would initiate setting up of the event device with the max ports and
queues available. If number of ports is more than the number of queues,
some ports would be left unused when the ports and queues are mapped
1:1. But even in that case the ports need to be setup, or otherwise it
would lead to a segmentation fault when event device is started.
Fixes: f7be70e513 ("net/octeontx: add net device probe and remove")
Cc: stable@dpdk.org
Signed-off-by: Anoob Joseph <anoob.joseph@caviumnetworks.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
When queues are stopped release Tx buffers.
During start initialize array of empty Tx/Rx reqs with default values.
Fixes: df238f84c0 ("net/ena: recreate HW IO rings on start and stop")
Cc: stable@dpdk.org
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Update the mlx5_flow_validate_item_mpls() function to allow
MPLS over IP, UDP, and GRE.
Modify the flow_dv_validate() function with the new logic introduced
in previous patch of this series: set new variable last_item
after each validation, update item_flags after each item iteration.
The new variable last_item is sent to mlx5_flow_validate_item_mpls()
and used to validate the MPLS item.
Same change implemented in flow_verbs_validate().
Update the mlx5_flow_validate_item_mpls() function to verify that
device configuration has mpls_en set to 1.
This code was added earlier this year, but unintentionaly removed
in recent versions.
Fixes: 84c406e745 ("net/mlx5: add flow translate function")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The support in MPLS on this flow engine was overlooked. It's absence is
critical because there are required actions for MPLS which can be done
only with the DV engine.
To set correctly the MPLS filter, we need to reason about the flow item
before the MPLS (UDP, GRE or other).
To do that, a new variable last_item was added and updated after each
translation. the full item flags are updated after each item iteration.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Unnecessary volatile attribute keeps compiler from further optimizing the
code and this results in a little performance drop (~2%). Because of memory
barriers, it is safe to remove.
Fixes: 6bf10ab69b ("net/mlx5: support 32-bit systems")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
There's some performance drop due to extra condition checks on the
datapath. Checking for external memory registration should be consolidated
to the existing bottom-half.
Fixes: 7e43a32ee0 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
There's some performance drop due to extra condition checks on the
datapath. Checking for external memory registration should be consolidated
to the existing bottom-half.
Fixes: 31912d9924 ("net/mlx4: support externally allocated static memory")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The flow_dv_query() just returns -ENOTSUP value and does not
set provided error parameter structure, that crashes the
port_flow_query(). The patch fixes flow_db_query(), now it
sets an error parameter structure.
Fixes: 684dafe795 ("net/mlx5: add flow query abstraction interface")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch fixes compilation errors with meson and the clang
compiler caused by some of the struct members not being
initialized.
```
../drivers/net/mlx5/mlx5_mr.c:345:37: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
struct mlx5_mr_cache entry = { 0, };
^
../drivers/net/mlx5/mlx5_mr.c:389:36: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
struct mlx5_mr_cache ret = { 0, };
^
../drivers/net/mlx5/mlx5_mr.c:691:35: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
struct mlx5_mr_cache ret = { 0, };
^
```
The compilation errors reproduce with
clang version 3.4.2 (tags/RELEASE_34/dot2-final) on RHEL.
Fixes: e1114ff6a5 ("net/mlx5: support e-switch flow count action")
Fixes: db48f9db5d ("net/mlx5: support new flow counter API")
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Fixes: 65c9d24170 ("net/mlx5: enable loopback by configured mode")
Fixes: 87011737b7 ("mlx5: add software counters")
Cc: stable@dpdk.org
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch fixes compilation errors with meson and the clang
compiler caused by some of the struct members not being
initialized.
```
../drivers/net/mlx4/mlx4_mr.c:357:37: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
struct mlx4_mr_cache entry = { 0, };
^
../drivers/net/mlx4/mlx4_mr.c:401:36: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
struct mlx4_mr_cache ret = { 0, };
^
../drivers/net/mlx4/mlx4_mr.c:691:35: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
struct mlx4_mr_cache ret = { 0, };
^
```
The compilation errors reproduce with
clang version 3.4.2 (tags/RELEASE_34/dot2-final) on RHEL.
Fixes: 9797bfcce1 ("net/mlx4: add new memory region support")
Cc: stable@dpdk.org
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch adds the in_hw flag check for tc flower filter rules.
If rule is applied without skip_sw flag set driver should check
whether the in_hw flag is set after rule applying. If no in_hw
flag set found it means the rule is not hardware offloaded and
error should be returned to application.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The tc flower filter rules are used to control E-switch from
the application side. In order to gain garanteed rule hardware
offload the skip_sw flag should be specified while applying the
rule. But some tc rules is rejected by kernel if skip_sw flag
is set by design. Currently this regards VXLAN tunneling rules,
which are applied to special VXLAN virtual devices. Albeit these
rules are applied with skip_sw flag reset kernel tries to
perform hardware offload. If kernel succeeded the in_hw flag
is set in rule properties and application should check this
flag to get know whether the applied rule is actually hardware
offloaded.
This patch prepares for the rule flags query and check in_hw flag.
The driver checks only the rules with skip_sw flag reset, so we
need to test the actual flags of rule is being applied. The pointer
to flags field into translated rule is introduced and flags are
tested directly in the rule body. It is more reliable than save and
check flags copy.
Also patch swaps the flow_tcf_apply() and flow_tcf_remove(),
we are going to call flow_tcf_remove() from flow_tcf_appy() if
no in_hw flag set found in applied rule.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
While receiving the Netlink reply messages we should stop at DONE
or ACK message. The existing implementation stops at DONE message
or if no multiple message flag set ( NLM_F_MULTI). It prevents
the single query requests from working, these requests send the
single reply message without multi-message flag followed by
ACK message. This patch fixes receiving part of Netlink
communication routine.
Fixes: 6e74990b34 ("net/mlx5: update E-Switch VXLAN netlink routines")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch removes the unused message length parameter, we
do not send multiple commands in the single message anymore,
message length can be taken from the prepared message header,
so length parameter can be removed.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The assignment should be done after initialization, or it will be zero.
Fixes: 264b23e3d2 ("net/i40e: add parameter check for RSS flow init")
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Tested-by: Yuan Peng <yuan.peng@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
As the code has changed the max wait time to 1000ms, the comment should
be changed accordingly.
Fixes: 64f1c8539c ("net/ixgbe: wait longer for link after fiber MAC setup")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
There's an issue that X710 can't receive any packet after
reading some special registers. That's because these
registers are only valid for X722, read access for non-X722
will cause ECC error.
Fixes: d9efd0136a ("i40e: add EEPROM and registers dumping")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Using the rte_bsf32 function to find last set bit. This avoids
the for loop and hence the bad bit shift operation.
Coverity issue: 323477
Fixes: 351fbee219 ("common/cpt: support hash")
Signed-off-by: Ankur Dwivedi <ankur.dwivedi@caviumnetworks.com>
Signed-off-by: Anoob Joseph <anoob.joseph@caviumnetworks.com>
drivers/bus/pci/linux/pci_vfio.c:45:23: error:
‘failure_handle_lock’ defined but not used
Fixes: 8ffe738651 ("vfio: add lock for hot-unplug failure handler")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
When the sigbus handler be enabled for hot-unplug, whatever hot-unplug
sigbus or origin sigbus occur, the sigbus handler will be invoked and
it will access the bus and device. While in the control path, the vfio
req handler also will process the bus and device, so a protection of
the resources in vfio req handler should be need. This patch add a lock
in vfio req handler when process bus and device resource, to avoid the
synchronization issue when device be hot-unplugged.
Fixes: c115fd000c ("vfio: handle hotplug request notifier")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
The function otx_cpt_get_resource() would be setting the pointer
'instance'. In case of error, 'instance' would be set to NULL, and
returns rte_errno. If rte_errno when 'instance' is set to NULL, it can
lead to NULL pointer dereferencing.
Coverity Issue: 323486, 323489
Fixes: bfe2ae495e ("crypto/octeontx: add PMD skeleton")
Fixes: 0961348fdf ("crypto/octeontx: add queue pair functions")
Signed-off-by: Anoob Joseph <anoob.joseph@caviumnetworks.com>
If the length of string pointed by 'name' is equal to or greater than
the sizeof cptvf->dev_name string, the resultant string will not be
null terminated. Using strlcpy would make sure the string would always
be null terminated.
Coverity Issue: 323492
Fixes: 0dc1cffa4d ("crypto/octeontx: add hardware init routine")
Signed-off-by: Ankur Dwivedi <ankur.dwivedi@caviumnetworks.com>
Signed-off-by: Anoob Joseph <anoob.joseph@caviumnetworks.com>
This patch fixes the queue pair free for AESNI-MB PMD. Originally
the queue pair ring name are different than the object name,
caused the aesni_mb_pmd_qp_release() cannot find the ring to
release. This patch uses the same name between queue pair object
name and ring name.
Fixes: a831c318c5 ("crypto/aesni_mb: free ring memory on queue release")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
File *fp, opened but not closed, this patch add fclose(fp)
Fixes: ef4b04f87f ("crypto/ccp: support device init")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Ravi Kumar <ravi1.kumar@amd.com>
There need an parameter check for RSS flow init, or it may cause
core dump if pointer is NULL in memory copy.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
For ixgbevf kernel driver, link status changes from down to up
will trigger vf kernel driver send IXGBE_VF_RESET message to pf
kernel driver, after this, vf kernel driver will disable and enable
it self. By these series operations, the vf kernel driver report
link up. Besides, all these operations handles in kernel thread.
For DPDK user space driver, it only gets link status changes from
down to up, but miss IXGBE_VF_RESET message sending and reset itself.
If we will add fully implementation of link status change for DPDK
user space driver, we need take much more modification. We have
aligned that for link status changes from down to up we only notify
link is up, users need to reset vf port.
Fixes: dc66e5fd01 ("net/ixgbe: improve link state check on VF")
Cc: stable@dpdk.org
Signed-off-by: Rosen Xu <rosen.xu@intel.com>
Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
As mlx5_flow_hashfields_adjust() refers to flow->rss, actions must be
translated prior to items like in Verbs. Otherwise, hash fields are not
correctly set.
Fixes: d02cb06912 ("net/mlx5: add Direct Verbs translate actions")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The current code wrongly assumes that packets are non-TSO and ends up
rejecting large TSO packets. Check non-TSO and TSO max packet sizes
separately.
Fixes: 5a12c38740 ("net/enic: check maximum packet size in Tx prepare handler")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Fields in the struct efx_filter_spec_t starting from efs_outer_vid
are hashed for software filter lookup. efs_mark is not a matching
criteria. Exclude efs_mark from hash.
Fixes: 5f78af5239 ("net/sfc: support MARK and FLAG actions in flow API")
Cc: stable@dpdk.org
Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tunneling offloads are represented by multi-bit values. So, feature
wise tunneling offload can only be entirely supported/unsupported
using PKT_TX_TUNNEL_MASK. Its upon PMDs to further isolate which of
the tunneling offload types are supported by respective PMD.
Using subset of bits from PKT_TX_TUNNEL_MASK to indicate supported vs
unsupported offloads can lead to undesired result.
Use PKT_TX_TUNNEL_MASK in QEDE_TX_OFFLOAD_MASK and use independent
value of supported PKT_TX_TUNNEL_* in .tx_pkt_prepare() to mark
supported tunnel offloads.
Fixes: 44346c24b7 ("net/qede: fix VXLAN tunnel Tx offload flag setting")
Cc: stable@dpdk.org
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Reviewed-by: Rasesh Mody <rasesh.mody@cavium.com>
When the priv_size of the mbuf is > 128 bytes, the mbuf would not be
properly constructed. This would lead to a corrupt mbuf.
This patch fixes the issue by accounting for
rte_pktmbuf_priv_size(pool) and RTE_PKTMBUF_HEADROOM
while configuring first skip register calculation.
Fixes: 197438ee9f ("net/octeontx: add Rx queue setup and release ops")
Cc: stable@dpdk.org
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Anoob Joseph <anoob.joseph@caviumnetworks.com>
Signed-off-by: Nitin Saxena <nitin.saxena@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The ethdev flag RTE_ETH_DEV_CLOSE_REMOVE is set for drivers
having migrated to the new behaviour of rte_eth_dev_close().
As any other flag, it can be useful to know about its value
as soon as the port is probed.
Unfortunately, it was set inside the close operation,
just before being erased by memset() in rte_eth_dev_release_port().
The flag assignment is moved to the probing stage, so it can
be checked by the application in order to anticipate the behaviour.
Fixes: 42603bbdb5 ("net/mlx5: release port on close")
Fixes: 6c99085d97 ("net/vmxnet3: fix hot-unplug")
Fixes: 4d7877fde2 ("net/ena: remove resources when port is being closed")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, if configuration fails (for example if a 100G card is used
with an odd number of RX/TX queues) QEDE crashes due to a null pointer
dereference.
This commit fixes it by checking that the pointer is not NULL before
using it.
Fixes: 7105b24f4b ("net/qede: fix memory alloc for multiple port reconfig")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
If rule contains tunneling action (like VXLAN encapsulation)
the VTEP (Virtual Tunneling EndPoint) device is pre-configured
before applying the rule. If kernel returns an error this
VTEP configuration should be rolled back to the origin state.
The patch adds the missing VTEP configuration restoration.
Fixes: 95a464cecc ("net/mlx5: add E-switch VXLAN tunnel devices management")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The VXLAN related rule cleanup routine queries and gathers all
existing local IP and neigh rules into buffer list. One buffer
may contain multiple rule deletion commands and is prepared
to send into Netlink as single message. But, if error occurs
for some deletion commands in the buffer, the multiple ACK
message with errors can be send back by the kernel. It breaks
the Netlink communication sequence numbers, because we expect
only one ACK message and it smashes out futher Netlik
communication.
The workaround of this problem is to send rule deletion commands
from buffer in one-by-one fashion and get ACK message for every
command sent. We do not expect too may rules preexist, so there
should not be critical performance degradation at VXLAN outer
interface initialization.
Fixes: f420f03d67 ("net/mlx5: add E-switch VXLAN rule cleanup routines")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch is preparation for the following fix, we are going to send
Netlink message from buffer in one-by-one fashion. It is highly
desirable to check multimessage buffer consistency for debug purposes.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The Netlink message buffer is allocated and there is the typo,
the other pointer is checked instead of returned one. If no
memory is allocated and NULL is returned by allocation routine
the bug causes segmentation fault. The patch fixes typo,
returned pointer is validated.
Fixes: f420f03d67 ("net/mlx5: add E-switch VXLAN rule cleanup routines")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Don't fail the build if pkg-config can't be found, instead print the
linker flag as it was doing before the change.
Fixes: b6b8793919 ("net/mlx5: use pkg-config to handle SUSE libmnl")
Cc: stable@dpdk.org
Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Fix the flow_fdir_cmp() function, used by flow_fdir_filter_lookup().
This function is used by flow_fdir_filter_add() to check if same rule
exists, and by flow_fdir_filter_delete() to find flow rule to delete.
The function compared actions conf pointers, changed to compare
actions type only.
Fixes: 2720f833d4 ("net/mlx5: add missing flow director delete")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
When stopping a bonded port all slaves are deactivated. Attempting
to deactivate a slave that was never activated will result in a segfault
when mode 4 is used.
Fixes: 7486331308 ("net/bonding: stop and deactivate slaves on stop")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Chas Williams <chas3@att.com>
In general the VF driver should not access the chip. For VF link status
update, VF driver should not use HW lock, use bnx2x_link_report_locked()
instead.
Add few prints for releasing previously held HW locks.
Fixes: a9b58b15ed ("net/bnx2x: fix to add PHY lock")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Use rte_log() rather than RTE_LOG() for dynamic logging. Rearrange
dynamic log types to the top and configurable log types to bottom.
Remove unused RTE_LIBRTE_BNX2X_DEBUG_TX_FREE
Fixes: ba7eeb035a ("net/bnx2x: fix logging to include device name")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
For performance reasons, word1 of send_hdr_s
sub descriptor was not cleared assuming it is always
having default value of zero since it comes from fixed
offsets of SQ buffer.
This is causing issues in case of SG mode because,
the size of send command might change and hence the word1
of send_hdr_s is not always at fixed offsets of the SQ buffer
and hence not having default value of zero.
This fixes the issue by clearing the word1 in case of SG mode
for every packet.
Fixes: 1c421f18e0 ("net/thunderx: add single and multi-segment Tx")
Cc: stable@dpdk.org
Signed-off-by: Subrahmanyam Nilla <snilla@caviumnetworks.com>
Signed-off-by: Nithin Dabilpuram <nithin.dabilpuram@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Individual MPS TCAM entries are not allocated as separate entities.
All entries are allocated once as an array. So, fix bug with attempting
to free illegal memory location.
Also add missing MPS TCAM initialization for CXGBEVF.
Fixes: 6fda3f0ddd ("net/cxgbe: add API to program hardware MPS table")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Set the "value" in the filter spec and not the "mask". Also, remove
duplicate comparison for hardware supported match items already done
earlier.
Fixes: af44a57798 ("net/cxgbe: support to offload flows to HASH region")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Under heavy load, flow related operations can take more time to
complete. Increase max completion wait time to 10 seconds. Also
increase max receive budget to read more replies from firmware
in every cycle.
Fixes: 9eb2c9a480 ("net/cxgbe: implement flow create operation")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Fix issue where ethertype is set to 0x800 for IPv4 by ether match
item and IPv4 match item also inherently sets ethertype to 0x800.
Current logic returns error when IPv4 match item is parsed. So, fix
by only bailing if the redefined match items have different values.
Fixes: ee61f5113b ("net/cxgbe: parse and validate flows")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
The AltiVec header file breaks boolean type:
error: incompatible types when initializing type
'__vector _bool int' {aka '_vector(4) __bool int'} using type 'int'
If __APPLE_ALTIVEC__ is defined, then bool type is redefined
and conflicts with stdbool.h.
There is no good solution to fix it for the whole project without
breaking something else, so a workaround is inserted in mlx5 PMD.
This workaround is not compatible with C++ but there is no C++ in DPDK.
Suggested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: David Wilder <dwilder@us.ibm.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
This patch fixes the variable 'phy_word' may be used uninitialized.
Fixes: 5b6439cf03 ("e1000/base: support different EEARBC for i210")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
In scenarios for multiq or flowq setup failure
`rte_eth_dev_probing_finish()` has to be invoked for successful device
registration.
Fixes: fbe90cdd77 ("ethdev: add probing finish function")
Cc: stable@dpdk.org
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
- Currently, no device supports partial mask for protocol in IP header.
- As there could be multiple IP items, next_protocol variable in flow
validation has to be reset for inner layer. Otherwise, inner TCP/UDP
will see protocol number of outer IP header.
- Remove redundant protocol checking for MPLS, which is done in
mlx5_flow_validate_item_mpls().
Fixes: 3d69434113 ("net/mlx5: add Direct Verbs validation function")
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
dev_flow->verbs is mistakenly used instead of dev_flow->dv. A sanity
check is added for debugging purpose.
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Static analysis tools don't like the fact that fd could be zero
in the error path. This won't happen in real world because
stdin would have to be closed, then other error occurring.
Coverity issue: 14079
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
If sysfs directory was incorrectly formatted then the vmbus
setup code would leak a directory handle in the error path.
Coverity issue: 302848
Fixes: 831dba47bd ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Some toolchain has fls() definition in string.h as argument type int,
which is conflicting uint32_t argument type.
/export/dpdk.org/lib/librte_eal/common/rte_reciprocal.c:47:19:
error: conflicting types for ‘fls’
static inline int fls(uint32_t x)
^~~
/opt/marvell-tools-201/aarch64-marvell-elf/include/strings.h:59:6:
note: previous declaration of ‘fls’ was here
int fls(int) __pure2;
FreeBSD string.h also has fls() with argument as int type.
https://www.freebsd.org/cgi/man.cgi?query=fls&sektion=3
Fixing the conflict by using rte version of fls.
Fixes: ffe3ec811e ("sched: introduce reciprocal divide")
Fixes: faf2b25c9f ("fm10k: support VMDQ in multi-queue configuration")
Cc: stable@dpdk.org
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The devargs of a device can be replaced by a newly allocated one
when trying to probe again the same device (multi-process or
multi-ports scenarios). This is breaking some pointer references.
It can be avoided by copying the new content, freeing the new devargs,
and returning the already inserted pointer.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The netvsc device calls VF (if present) to update the link status
with the wrong device. This leads to errors in mlx5 device when it
can't find the ifindex.
Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
ag_cons is used uninitialized, it is used when DEBUG enabled, remove
debug code.
Fixes: 0958d8b643 ("net/bnxt: support LRO")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Previously, there is instability during vector Rx if descriptor
number is not power of 2, e.g. process hang and some Rx packets
are unexpectedly empty. That's because vector Rx mode assumes Rx
descriptor number is power of 2 when doing bit mask.
This patch allows vector mode only when the number of Rx descriptor
is power of 2.
Fixes: 8e109464c0 ("i40e: allow vector Rx and Tx usage")
Fixes: a3c83a2527 ("net/i40e: enable runtime queue setup")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch adds support for the rx_queue_count API in mlx5 driver
Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Tx offload mask is updated in following commit 1037ed842c
("mbuf: fix Tx offload mask") Currently, the new added offload
flags are not supported in PMD and application will fail to call
PMD transmit prepare function.
This patch updates IGB_TX_OFFFLOAD_MASK.
Fixes: 1037ed842c ("mbuf: fix Tx offload mask")
Cc: stable@dpdk.org
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Even though flow_drv_prepare() takes item_flags and action_flags to be
filled in, those are not used and will be overwritten by parsing of
flow_drv_translate(). There's no reason to keep the flags and fill it.
Appropriate notes are added to the documentation of flow_drv_prepare() and
flow_drv_translate().
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
1) Fix layer parsing
In translation of tunneled flows, dev_flow->layers must not be used to
check tunneled layer as it contains all the layers parsed from
flow_drv_prepare(). Checking tunneled layer is needed to distinguish
between outer and inner item. This should be based on dynamic parsing. With
dev_flow->layers on a tunneled flow, items will always be interpreted as
inner as dev_flow->layer already has all the items. Dynamic parsing
(item_flags) is added as there's no such code.
2) Refactoring code
- flow_dv_create_item() and flow_dv_create_action() are merged into
flow_dv_translate() for consistency with Verbs and *_validate().
Fixes: 2466364115 ("net/mlx5: fix flow tunnel handling")
Fixes: d02cb06912 ("net/mlx5: add Direct Verbs translate actions")
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
1) Fix layer parsing
In translation of tunneled flows, dev_flow->layers must not be used to
check tunneled layer as it contains all the layers parsed from
flow_drv_prepare(). Checking tunneled layer is needed to set
IBV_FLOW_SPEC_INNER and it should be based on dynamic parsing. With
dev_flow->layers on a tunneled flow, items will always be interpreted as
inner as dev_flow->layer already has all the items.
2) Refactoring code
It is partly because flow_verbs_translate_item_*() sets layer flag. Same
code is repeating in multiple locations and that could be error-prone.
- Introduce VERBS_SPEC_INNER() to unify setting IBV_FLOW_SPEC_INNER.
- flow_verbs_translate_item_*() doesn't set parsing result -
MLX5_FLOW_LAYER_*.
- flow_verbs_translate_item_*() doesn't set priority or adjust hashfields
but does only item translation. Both have to be done outside.
- Make more consistent between Verbs and DV.
3) Remove flow_verbs_mark_update()
This code can never be reached as validation prohibits specifying mark and
flag actions together. No need to convert flag to mark.
Fixes: 84c406e745 ("net/mlx5: add flow translate function")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Applications which add RSS rules must supply an RSS key and length.
If an application is only interested in default RSS operation it
should not care about the exact RSS key.
By setting the key to NULL - the PMD will use the default RSS key.
In addition if the application does not care about the RSS type it can
set it to 0 and the PMD will use the default type (ETH_RSS_IP).
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Due to a limitation on driver/FW, priority ranges from 1 to 16 in kernel.
Priority in rte_flow attribute starts from 0 and is added by 1 in
translation. This is subject to be changed to determine the max priority
based on trial-and-error like Verbs driver once the restriction is lifted
or the range is extended.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Add txqs_max_vec parameter to configure the maximum number of Tx queues to
enable vectorized Tx. And its default value is set according to the
architecture and device type.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When a device is spawned, it does make more sense that the configuration
parameters are passed by callee. Furthermore, setting default value for
some configuration would need PCIe device ID which can be found in the
probe function.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The last part of patchset contains the rule cleanup routines.
These ones is the part of outer interface initialization at
the moment of VXLAN VTEP attaching. These routines query
the list of attached VXLAN devices, the list of local IP
addresses with peer and link scope attribute and the list
of permanent neigh rules, then all found abovementioned
items on the specified outer device are flushed.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
VXLAN encap rules are applied to the VF ingress traffic and have the
VTEP as actual redirection destinations instead of outer PF.
The encapsulation rule should provide:
- redirection action VF->PF
- VF port ID
- some inner network parameters (MACs/IP)
- the tunnel outer source IP (v4/v6)
- the tunnel outer destination IP (v4/v6). Current
- VNI - Virtual Network Identifier
There is no direct way found to provide kernel with all required
encapsulatioh header parameters. The encapsulation VTEP is created
attached to the outer interface and assumed as default path for
egress encapsulated traffic. The outer tunnel IP address are
assigned to interface using Netlink, the implicit route is
created like this:
ip addr add <src_ip> peer <dst_ip> dev <outer> scope link
Peer address provides implicit route, and scode link reduces
the risk of conflicts. At initialization time all local scope
link addresses are flushed from device (see next part of patchset).
The destination MAC address is provided via permenent neigh rule:
ip neigh add dev <outer> lladdr <dst_mac> to <dst_ip> nud permanent
At initialization time all neigh rules of this type are flushed
from device (see the next part of patchset).
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
VXLAN interfaces are dynamically created for each local UDP port
of outer networks and then used as targets for TC "flower" filters
in order to perform encapsulation. These VXLAN interfaces are
system-wide, the only one device with given UDP port can exist
in the system (the attempt of creating another device with the
same UDP local port returns EEXIST), so PMD should support the
shared device instances database for PMD instances. These VXLAN
implicitly created devices are called VTEPs (Virtual Tunnel
End Points).
Creation of the VTEP occurs at the moment of rule applying. The
link is set up, root ingress qdisc is also initialized.
Encapsulation VTEPs are created on per port basis, the single
VTEP is attached to the outer interface and is shared for all
encapsulation rules on this interface. The source UDP port is
automatically selected in range 30000-60000.
For decapsulaton one VTEP is created per every unique UDP
local port to accept tunnel traffic. The name of created
VTEP consists of prefix "vmlx_" and the number of UDP port in
decimal digits without leading zeros (vmlx_4789). The VTEP
can be preliminary created in the system before the launching
application, it allows to share UDP ports between primary
and secondary processes.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The counters for E-Switch rules were erroneously deleted in
flow_tcf_remove() routine. The counters deletion is moved to
flow_tcf_destroy() routine.
Fixes: e1114ff6a5 ("net/mlx5: support e-switch flow count action")
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This part of patchset updates Netlink exchange routine. Message
sequence numbers became not random ones, the multipart reply messages
are supported, not propagating errors to the following socket calls,
Netlink replies buffer size is increased to MNL_SOCKET_BUFFER_SIZE
and now is preallocated at context creation time instead of stack
usage. This update is needed to support Netlink query operations.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This part of patchset adds support of VXLAN-related items and
actions to the flow translation routine. Later some tunnel types,
other than VXLAN can be addedd (GRE). No VTEP devices are created at
this point, the flow rule is just translated, not applied yet.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The e-switch Flow prepare function is updated to support VXLAN
encapsulation/and decapsulation actions. The function calculates
buffer size for Netlink message and Flow description structures,
including optional ones for tunneling purposes.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch adds VXLAN support for flow item/action lists validation.
The following entities are now supported:
- RTE_FLOW_ITEM_TYPE_VXLAN, contains the tunnel VNI
- RTE_FLOW_ACTION_TYPE_VXLAN_DECAP, if this action is specified
the items in the flow items list treated as outer network
parameters for tunnel outer header match. The ethernet layer
addresses always are treated as inner ones.
- RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP, contains the item list to
build the encapsulation header. In current implementation the
values is the subject for some constraints:
- outer source MAC address will be always unconditionally
set to the one of MAC addresses of outer egress interface
- no way to specify source UDP port
- all abovementioned parameters are ignored if specified
in the rule, warning messages are sent to the log
Minimal tunneling support is also added. If VXLAN decapsulation
action is specified the ETH item can follow the VXLAN VNI item,
the content of this ETH item is treated as inner MAC addresses
and type. The outer ETH item for VXLAN decapsulation action
is always ignored.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The rule validation function for E-Switch checks item list first,
then action list is checked. This patch swaps the validation order,
now actions are checked first. This is preparation for validation
function update with VXLAN tunnel actions. VXLAN decapsulation
action requires to check the items in special way. We could do
this special check in the single item check pass if the action
flags were gathered before. This is the reason to swap the
item/actions checking loops.
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>