In order to map the guest physical addresses used by the virtio device
guest side to the host physical addresses used by the HW as the host
side, memory regions are created.
By this way, for example, the HW can translate the addresses of the
packets posted by the guest and to take the packets from the correct
place.
The design is to work with single MR which will be configured to the
virtio queues in the HW, hence a lot of direct MRs are grouped to single
indirect MR.
Create functions to prepare and release MRs with all the related
resources that are required for it.
Create a new file mlx5_vdpa_mem.c to manage all the MR related code
in the driver.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add support for get_features and get_protocol_features operations.
Part of the features are reported by the DevX capabilities.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Support get_queue_num operation to get the maximum number of queues
supported by the device.
This number comes from the DevX capabilities.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add a new driver to support vDPA operations by Mellanox devices.
The first Mellanox devices which support vDPA operations are
ConnectX-6 Dx and Bluefield1 HCA for their PF ports and VF ports.
This driver is depending on rdma-core like the mlx5 PMD, also it is
going to use mlx5 DevX to create HW objects directly by the FW.
Hence, the common/mlx5 library is linked to the mlx5_vdpa driver.
This driver will not be compiled by default due to the above
dependencies.
Register a new log type for this driver.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Get a burst mode information for Rx/Tx queues in mlx5.
Provide callback functions to show this information in
a "show rxq info" and "show txq info" output.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Earlier after a successful mac_addr_add operation, index was returned
by underlying layer which was unused but same as provided by DPDK API.
So API is enhanced to use application provided index location to add
MAC address entry.
Fixes: e4373bf1b3f5 ("net/octeontx: add unicast MAC filter")
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
MAC address table is allocated during octeontx device create and
same is used to maintain list of MAC address associated to port.
This table is not getting freed niether in case of error nor during
graceful shutdown of port.
Patch fixes memory required memory for both the cases as mentioned.
Fixes: f18b146c498d ("net/octeontx: create ethdev ports")
Cc: stable@dpdk.org
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
Tx flow controlled is disabled in the Ax silicon version due to an errata.
This errata is not applicable for HIGIG Tx flow control, therefore
not enabling in HIGIG case.
Fixes: 602009ee2dfb ("net/octeontx2: support HIGIG2")
Cc: stable@dpdk.org
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The interrupt vector which bind to queues should not be larger than
the max available vector. It will cause port start failed. This patch
changed the judgement condition of the limited vector id. It can
effectively avoid vector id out of range.
Fixes: 6a6cf5f88b4a ("net/i40e: enable multi-queue Rx interrupt for VF")
Signed-off-by: Lunyuan Cui <lunyuanx.cui@intel.com>
Reviewed-by: Qiming Yang <qiming.yang@intel.com>
The patch distinguishes fdir rules for GTPU with or without
extend header, so flow to match below patterns can be created
correctly.
1. eth / ipv4 / udp / gtpu teid is 10 / ...
2. eth / ipv4 / udp / gtpu teid is 10 / gtp_psc / ...
3. eth / ipv4 / udp / gtpu / gtp_psc qfi is 10 / ...
4. eth / ipv4 / udp / gtpu teid is 10 / gtp_psc is 10 / ...
Fixes: efc16c621415 ("net/ice: support flow director GTPU tunnel")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Use PCI_PRI_FMT instead of "%04d:%02d:%02d:%d" print format.
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Currently, when using VXLAN-GPE or VXLAN item in the flow
both are being treated the same with flags 0x8 in VXLAN
header. Which mean the matching of the item VXLAN-GPE
will match any VXLAN packet.
This fixes the translation of VXLAN GPE item into PMD flow
item. Which will by default set the flags to VXLAN-GPE
to be 0xc.
Fixes: 3d69434113d1 ("net/mlx5: add Direct Verbs validation function")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
thunderx-nic uses secondary VF's to provide more queues to DPDK.
Current instructions explain the concept but don't show an easy way to
find which PCI id is primary and which is secondary VF's.
This patch extending the documentation of secondary VF w.r.t
the enumeration details.
Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Before C0 HW revision, The RSS adder was computed based the following
static formula.
rss_adder<7:0> = flow_tag<7:0> ^ flow_tag<15:8> ^
flow_tag<23:16> ^ flow_tag<31:24>
The above scheme has the following drawbacks:
1) It is not in line with other standard NIC behavior.
2) There can be an SW use case where SW can compute the hash
upfront using Toeplitz function and predict the queue selection
to optimize some packet lookup function. The nonstandard
way of doing XOR makes the consumer to not predict the queue selection.
C0 HW revision onward, The HW can configure the
rss_adder<7:0> as flow_tag<7:0> to align with standard NICs.
This patch adds an option to select legacy RSS adder mode
using tag_as_xor=1 devargs option while keeping the standard NIC
behavior as default.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
HW seems to populate the cfa code in the Rx descriptor even
if an explicit flow rule is not configured via application as
there might be a default rule configured in HW even for promisc
mode.
Fixes: 94eb699bc82e ("net/bnxt: support flow mark action")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Use the MLX5_ASSERT macros instead of the standard assert clause.
Depends on the RTE_LIBRTE_MLX5_DEBUG configuration option to define it.
If RTE_LIBRTE_MLX5_DEBUG is enabled MLX5_ASSERT is equal to RTE_VERIFY
to bypass the global CONFIG_RTE_ENABLE_ASSERT option.
If RTE_LIBRTE_MLX5_DEBUG is disabled, the global CONFIG_RTE_ENABLE_ASSERT
can still make this assert active by calling RTE_VERIFY inside RTE_ASSERT.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Use the RTE_LIBRTE_MLX5_DEBUG configuration flag to get rid of dependency
on the NDEBUG definition. This is a preparation step to switch
from standard assert clauses to DPDK RTE_ASSERT ones in MLX5 driver.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Use the MLX4_ASSERT macros instead of the standard assert clause.
Depends on the RTE_LIBRTE_MLX4_DEBUG configuration option to define it.
If RTE_LIBRTE_MLX4_DEBUG is enabled MLX4_ASSERT is equal to RTE_VERIFY
to bypass the global CONFIG_RTE_ENABLE_ASSERT option.
If RTE_LIBRTE_MLX4_DEBUG is disabled, the global CONFIG_RTE_ENABLE_ASSERT
can still make this assert active by calling RTE_VERIFY inside RTE_ASSERT.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Use the RTE_LIBRTE_MLX4_DEBUG compilation flag to get rid of dependency
on the NDEBUG definition. This is a preparation step to switch
from standard assert clauses to DPDK RTE_ASSERT ones in MLX4 driver.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Remove -Werror-all flag in ICC configuration file to stop treating ICC
warnings as errors in DPDK due to many false positives. We are using
GCC and Clang as a benchmark for warnings anyway for simplification.
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This patch adds support for dynamic flag that hints transmit
datapath do not copy data to the descriptors. This flag is
useful when data are located in the memory of another (not NIC)
physical device and copying to the host memory is undesirable.
This hint flag is per mbuf for multi-segment packets.
This hint flag might be partially ignored if:
- hardware requires minimal data header to be inline into
descriptor, it depends on the hardware type and its configuration.
In this case PMD copies the minimal required number of bytes to
the descriptor, ignoring the no inline hint flag, the rest of data
is not copied.
- VLAN tag insertion offload is requested and hardware does not
support this options. In this case the VLAN tag is inserted by
software means and at least 18B are copied to descriptor.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The inline feature is designed to save PCI bandwidth by copying some
of the data to the wqe. This feature if enabled works for all packets.
In some cases when using external memory, the PCI bandwidth is not
relevant since the memory can be accessed by other means.
This commit introduce the ability to control the inline with mbuf
granularity.
In order to use this feature the application should register the field
name, and restart the port.
Signed-off-by: Ori Kam <orika@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
There are RDMA-CORE versions which are not supported multi-table for
some Mellanox mlx5 devices.
Hence, the optimization added in commit [1] which forwards all the FDB
traffic to table 1 cannot be configured.
Make the above optimization optional:
Do not fail when either table 1 cannot be created or the jump rule
(all =>jump to table 1) is not configured successfully.
In this case, all the flows will be configured to table 0.
[1] commit b67b4ecbde22 ("net/mlx5: skip table zero to improve
insertion rate")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add new 4 Netlink commands to support enable/disable ROCE:
1. mlx5_nl_devlink_family_id_get to get the Devlink family ID of
Netlink general command.
2. mlx5_nl_enable_roce_get to get the ROCE current status.
3. mlx5_nl_driver_reload - to reload the device kernel driver.
4. mlx5_nl_enable_roce_set - to set the ROCE status.
When the user changes the ROCE status, the IB device may disappear and
appear again, so DPDK driver should wait for it and to restart itself.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Move Netlink mechanism and its dependencies from net/mlx5 to
common/mlx5 in order to be ready to use by other mlx5 drivers.
The dependencies are BITFIELD defines, the ppc64 compilation workaround
for bool type and the function mlx5_translate_port_name.
Update build mechanism accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
As an arrangment for Netlink command moving to the common library,
reduce the net/mlx5 dependencies.
Replace ethdev class command parameters.
Improve Netlink sequence number mechanism to be controlled by the
mlx5 Netlink mechanism.
Move mlx5_nl_check_switch_info to mlx5_nl.c since it is the only one
which uses it.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The Netlink commands interfaces is included in the mlx5.h file with a
lot of other PMD interfaces.
As an arrangement to make the Netlink commands shared with different
PMDs, this patch moves the Netlink interface to a new file called
mlx5_nl.h.
Move non Netlink pure vlan commands from mlx5_nl.c to the
mlx5_vlan.c.
Rename Netlink commands and structures to use prefix mlx5_nl.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
There might be a case that one Mellanox device can be probed by
multiple mlx5 drivers.
One case is that any mlx5 vDPA device can be probed by both net/mlx5
and vdpa/mlx5.
Add a new mlx5 common API to get the requested driver by devargs:
class=[net/vdpa].
Skip net/mlx5 PMD probing while the device is selected to be probed by
the vDPA driver.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
In order to allow RQT size configuration which is limited to the
correct maximum value, add log_max_rqt_size for DevX capability
structure.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RQ table can be changed to support different list of queues.
Add DevX command to modify DevX RQT object to point on new RQ list.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The DevX TIR object configuration should get L3 and L4 protocols
expected to be forwarded by the TIR.
Add the PRM constant values needed to configure the L3 and L4 protocols.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Allow virtio queue type configuration in the RQ table.
The needed fields and configuration was added.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
QP creation is needed for vDPA virtq support.
Add 2 DevX commands to create QP and to modify QP state.
The support is for RC QP only in force loopback address mode.
By this way, the packets can be sent to other inernal destinations in
the nic. For example: other QPs or virtqs.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Virtio emulation offload allows SW to offload the I/O operations of a
virtio virtqueue, using the device, allowing an improved performance
for its users.
While supplying all the relevant Virtqueue information (type, size,
memory location, doorbell information, etc.). The device can then
offload the I/O operation of this queue, according to its device type
characteristics.
Some of the virtio features can be supported according to the device
capability, for example, TSO and checksum.
Add virtio queue create, modify and query DevX commands.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Virtio access region(VAR) is the UAR that allocated for virtio emulation
access.
Add rdma-core operations to allocate and free VAR.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
HW implements completion queues(CQ) used to post completion reports upon
completion of work request.
Used for Rx and Tx datapath.
Add DevX command to create a CQ.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The isolated, protected and independent direct access to the HW by
multiple processes is implemented via User Access Region (UAR)
mechanism.
The UAR is part of PCI address space that is mapped for direct access to
the HW from the CPU.
UAR is comprised of multiple pages, each page containing registers that
control the HW operation.
UAR mechanism is used to post execution or control requests to the HW.
It is used by the HW to enforce protection and isolation between
different processes.
Add a glue command to allocate and free an UAR.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add the next commands to glue in order to support interrupt event
channel operations associated to events in the EQ:
devx_create_event_channel,
devx_destroy_event_channel,
devx_subscribe_devx_event,
devx_subscribe_devx_event_fd,
devx_get_event.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The event queue is managed only by the kernel.
Add the rdma-core command in glue to query the kernel event queue
details.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add option to create an indirect mkey by the current
mlx5_devx_cmd_mkey_create command.
Indirect mkey points to set of direct mkeys.
By this way, the HW\SW can reference fragmented memory by one object.
Align the net/mlx5 driver usage in the above command.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add support for rdma-core API to allocate NULL MR.
When the device HW get a NULL MR address, it will do nothing with the
address, no read and no write.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add the DevX capabilities for vDPA configuration and information of
Mellanox devices.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The CQE has owner bit to indicate if it is in SW control or HW.
Share a CQE check for all the mlx5 drivers.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Move the vendor information, vendor ID and device IDs from net/mlx5 PMD
to the common mlx5 file.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Move PCI detection by IB device from mlx5 PMD to the common code.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
A new Mellanox vdpa PMD will be added to support vdpa operations by
Mellanox adapters.
This vdpa PMD design includes mlx5_glue and mlx5_devx operations and
large parts of them are shared with the net/mlx5 PMD.
Create a new common library in drivers/common for mlx5 PMDs.
Move mlx5_glue, mlx5_devx_cmds and their dependencies to the new mlx5
common library in drivers/common.
The files mlx5_devx_cmds.c, mlx5_devx_cmds.h, mlx5_glue.c,
mlx5_glue.h and mlx5_prm.h are moved as is from drivers/net/mlx5 to
drivers/common/mlx5.
Share the log mechanism macros.
Separate also the log mechanism to allow different log level control to
the common library.
Build files and version files are adjusted accordingly.
Include lines are adjusted accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The DevX commands interface is included in the mlx5.h file with a lot
of other PMD interfaces.
As an arrangement to make the DevX commands shared with different PMDs,
this patch moves the DevX interface to a new file called mlx5_devx_cmds.h.
Also remove shared device structure dependency on DevX commands.
Replace the DevX commands log mechanism from the mlx5 driver log
mechanism to the EAL log mechanism.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
For VLAN packet the tci is saved in rx_mb->vlan_tci, however the
STRIPPED offload flag is not set along with PKT_RX_VLAN flag.
Set the PKT_RX_VLAN_STRIPPED flag as well.
Fixes: 380a7aab1ae2 ("mbuf: rename deprecated VLAN flags")
Fixes: b37b528d957c ("mbuf: add new Rx flags for stripped VLAN")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rmody@marvell.com>
return value stored in "ret" but it has been overwritten before use.
Coverity issue: 353621
Fixes: 7fe5668d2ea3 ("net/bnxt: support VLAN filter and strip")
Fixes: df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>