This patch adds support for dynamic flag that hints transmit
datapath do not copy data to the descriptors. This flag is
useful when data are located in the memory of another (not NIC)
physical device and copying to the host memory is undesirable.
This hint flag is per mbuf for multi-segment packets.
This hint flag might be partially ignored if:
- hardware requires minimal data header to be inline into
descriptor, it depends on the hardware type and its configuration.
In this case PMD copies the minimal required number of bytes to
the descriptor, ignoring the no inline hint flag, the rest of data
is not copied.
- VLAN tag insertion offload is requested and hardware does not
support this options. In this case the VLAN tag is inserted by
software means and at least 18B are copied to descriptor.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The inline feature is designed to save PCI bandwidth by copying some
of the data to the wqe. This feature if enabled works for all packets.
In some cases when using external memory, the PCI bandwidth is not
relevant since the memory can be accessed by other means.
This commit introduce the ability to control the inline with mbuf
granularity.
In order to use this feature the application should register the field
name, and restart the port.
Signed-off-by: Ori Kam <orika@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
There are RDMA-CORE versions which are not supported multi-table for
some Mellanox mlx5 devices.
Hence, the optimization added in commit [1] which forwards all the FDB
traffic to table 1 cannot be configured.
Make the above optimization optional:
Do not fail when either table 1 cannot be created or the jump rule
(all =>jump to table 1) is not configured successfully.
In this case, all the flows will be configured to table 0.
[1] commit b67b4ecbde ("net/mlx5: skip table zero to improve
insertion rate")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add new 4 Netlink commands to support enable/disable ROCE:
1. mlx5_nl_devlink_family_id_get to get the Devlink family ID of
Netlink general command.
2. mlx5_nl_enable_roce_get to get the ROCE current status.
3. mlx5_nl_driver_reload - to reload the device kernel driver.
4. mlx5_nl_enable_roce_set - to set the ROCE status.
When the user changes the ROCE status, the IB device may disappear and
appear again, so DPDK driver should wait for it and to restart itself.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Move Netlink mechanism and its dependencies from net/mlx5 to
common/mlx5 in order to be ready to use by other mlx5 drivers.
The dependencies are BITFIELD defines, the ppc64 compilation workaround
for bool type and the function mlx5_translate_port_name.
Update build mechanism accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
As an arrangment for Netlink command moving to the common library,
reduce the net/mlx5 dependencies.
Replace ethdev class command parameters.
Improve Netlink sequence number mechanism to be controlled by the
mlx5 Netlink mechanism.
Move mlx5_nl_check_switch_info to mlx5_nl.c since it is the only one
which uses it.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The Netlink commands interfaces is included in the mlx5.h file with a
lot of other PMD interfaces.
As an arrangement to make the Netlink commands shared with different
PMDs, this patch moves the Netlink interface to a new file called
mlx5_nl.h.
Move non Netlink pure vlan commands from mlx5_nl.c to the
mlx5_vlan.c.
Rename Netlink commands and structures to use prefix mlx5_nl.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
There might be a case that one Mellanox device can be probed by
multiple mlx5 drivers.
One case is that any mlx5 vDPA device can be probed by both net/mlx5
and vdpa/mlx5.
Add a new mlx5 common API to get the requested driver by devargs:
class=[net/vdpa].
Skip net/mlx5 PMD probing while the device is selected to be probed by
the vDPA driver.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
In order to allow RQT size configuration which is limited to the
correct maximum value, add log_max_rqt_size for DevX capability
structure.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
RQ table can be changed to support different list of queues.
Add DevX command to modify DevX RQT object to point on new RQ list.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The DevX TIR object configuration should get L3 and L4 protocols
expected to be forwarded by the TIR.
Add the PRM constant values needed to configure the L3 and L4 protocols.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Allow virtio queue type configuration in the RQ table.
The needed fields and configuration was added.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
QP creation is needed for vDPA virtq support.
Add 2 DevX commands to create QP and to modify QP state.
The support is for RC QP only in force loopback address mode.
By this way, the packets can be sent to other inernal destinations in
the nic. For example: other QPs or virtqs.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Virtio emulation offload allows SW to offload the I/O operations of a
virtio virtqueue, using the device, allowing an improved performance
for its users.
While supplying all the relevant Virtqueue information (type, size,
memory location, doorbell information, etc.). The device can then
offload the I/O operation of this queue, according to its device type
characteristics.
Some of the virtio features can be supported according to the device
capability, for example, TSO and checksum.
Add virtio queue create, modify and query DevX commands.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Virtio access region(VAR) is the UAR that allocated for virtio emulation
access.
Add rdma-core operations to allocate and free VAR.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
HW implements completion queues(CQ) used to post completion reports upon
completion of work request.
Used for Rx and Tx datapath.
Add DevX command to create a CQ.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The isolated, protected and independent direct access to the HW by
multiple processes is implemented via User Access Region (UAR)
mechanism.
The UAR is part of PCI address space that is mapped for direct access to
the HW from the CPU.
UAR is comprised of multiple pages, each page containing registers that
control the HW operation.
UAR mechanism is used to post execution or control requests to the HW.
It is used by the HW to enforce protection and isolation between
different processes.
Add a glue command to allocate and free an UAR.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add the next commands to glue in order to support interrupt event
channel operations associated to events in the EQ:
devx_create_event_channel,
devx_destroy_event_channel,
devx_subscribe_devx_event,
devx_subscribe_devx_event_fd,
devx_get_event.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The event queue is managed only by the kernel.
Add the rdma-core command in glue to query the kernel event queue
details.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add option to create an indirect mkey by the current
mlx5_devx_cmd_mkey_create command.
Indirect mkey points to set of direct mkeys.
By this way, the HW\SW can reference fragmented memory by one object.
Align the net/mlx5 driver usage in the above command.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add support for rdma-core API to allocate NULL MR.
When the device HW get a NULL MR address, it will do nothing with the
address, no read and no write.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add the DevX capabilities for vDPA configuration and information of
Mellanox devices.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The CQE has owner bit to indicate if it is in SW control or HW.
Share a CQE check for all the mlx5 drivers.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Move the vendor information, vendor ID and device IDs from net/mlx5 PMD
to the common mlx5 file.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Move PCI detection by IB device from mlx5 PMD to the common code.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
A new Mellanox vdpa PMD will be added to support vdpa operations by
Mellanox adapters.
This vdpa PMD design includes mlx5_glue and mlx5_devx operations and
large parts of them are shared with the net/mlx5 PMD.
Create a new common library in drivers/common for mlx5 PMDs.
Move mlx5_glue, mlx5_devx_cmds and their dependencies to the new mlx5
common library in drivers/common.
The files mlx5_devx_cmds.c, mlx5_devx_cmds.h, mlx5_glue.c,
mlx5_glue.h and mlx5_prm.h are moved as is from drivers/net/mlx5 to
drivers/common/mlx5.
Share the log mechanism macros.
Separate also the log mechanism to allow different log level control to
the common library.
Build files and version files are adjusted accordingly.
Include lines are adjusted accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The DevX commands interface is included in the mlx5.h file with a lot
of other PMD interfaces.
As an arrangement to make the DevX commands shared with different PMDs,
this patch moves the DevX interface to a new file called mlx5_devx_cmds.h.
Also remove shared device structure dependency on DevX commands.
Replace the DevX commands log mechanism from the mlx5 driver log
mechanism to the EAL log mechanism.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
For VLAN packet the tci is saved in rx_mb->vlan_tci, however the
STRIPPED offload flag is not set along with PKT_RX_VLAN flag.
Set the PKT_RX_VLAN_STRIPPED flag as well.
Fixes: 380a7aab1a ("mbuf: rename deprecated VLAN flags")
Fixes: b37b528d95 ("mbuf: add new Rx flags for stripped VLAN")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rmody@marvell.com>
return value stored in "ret" but it has been overwritten before use.
Coverity issue: 353621
Fixes: 7fe5668d2e ("net/bnxt: support VLAN filter and strip")
Fixes: df6cd7c1f7 ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Certain applications(Ex: OVS-DPDK) can issue rte_flow_create with mark
id set to 0. The mark table entry creation in the driver was assuming
non-zero mark ids. Fix it to have a valid flag for each entry.
Also, it is possible that both RSS flags and cfa_code/ mark id are set
as part of the Rx packet completion descriptor if the 'action' for a
flow is MARK + RSS.
Fix Rx completion processing to look for both fields as opposed to only
either.
Fixes: 94eb699bc8 ("net/bnxt: support flow mark action")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Use "dev->data->dev_started" state, instead of local "dev_stopped"
to check whether port has been started or not.
Fixes: 316e412299 ("net/bnxt: fix crash when closing")
Cc: stable@dpdk.org
Reviewed-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
This change could help in reducing the size of bnxt PMD private
data structure by converting a uint8_t variable to use bit map flag.
Fixes: 5cd0e2889c ("net/bnxt: support NIC Partitioning")
Cc: stable@dpdk.org
Reviewed-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Use "dev->data->dev_started" state, instead of local BNXT_FLAG_INIT_DONE
to check whether device has been initialised or not.
Fixes: ed2ced6fe9 ("net/bnxt: check initialization before accessing stats")
Cc: stable@dpdk.org
Reviewed-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Since "eth_dev->data->dev_started" has been assigned to 0 at the
beginning of bnxt_dev_stop_op() routine, the code inside the if()
condition is redundant. Remove it.
Anyways "eth_dev->data->dev_link.link_status" will be set to 0 in
bnxt_dev_set_link_down_op() later in the routine.
Fixes: 316e412299 ("net/bnxt: fix crash when closing")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Allow RSS action with group ID 0. The RSS match check will ensure if
requested RSS action configuration parameters should be allowed as per
the VNIC's RSS configuration or not and if it does match as it is
by design in OVS-DPDK use case, the default vnic can be used to create
the filter. As part of this ensure that rx_queue_cnt for the default
vnic is setup during init as this field was being set only for non-zero
vnics while handling RSS action for flow create.
Check for out of bounds error while accessing the vnic array.
Fixes: adc0f81c65 ("net/bnxt: support RSS action")
Cc: stable@dpdk.org
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Now that the L2 filter reference count is bumped up in all cases
including bnxt_alloc_filter() which is issued in init, just move this
ref count bump inside the routine issuing the HWRM cmd so that it is
bumped up only if the cmd is successful.
Fixes: 5c1171c972 ("net/bnxt: refactor filter/flow")
Cc: stable@dpdk.org
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Invoke bnxt_get_unused_filter() inside bnxt_alloc_filter() so that
all filters are allocated from one common routine.
Fixes: f92735db1e ("net/bnxt: add L2 filter alloc/init/free")
Cc: stable@dpdk.org
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
If HIGIG mode is enabled on configure, This needs to be disabled
on port stop. Adding support to send mbox message on port stop
to configure the port to default.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Extend RSS offload types for octeontx2. Add support to select
L3 SRC, L3 DST, L4 SRC and L4 DST for RSS calculation.
Add support to select L3 SRC or DST only, L4 SRC or DST only for RSS
calculation.
With this requirement there will be following combinations,
IPV[4,6]_SRC_ONLY, IPV[4,6]_DST_ONLY, [TCP,UDP,SCTP]_SRC_ONLY,
[TCP,UDP,SCTP]_DST_ONLY. So, instead of creating a bit for each
combination, we are using upper 4 bits (31:28) in the flow_key_cfg
to represent the SRC, DST selection. 31 => L3_SRC, 30 => L3_DST,
29 => L4_SRC, 28 => L4_DST. These won't be part of flow_cfg, so that
we don't need to change the existing ABI.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
PMD handles fast path completions in the Rx handler and control path
completions in the interrupt handler. They both are processing
completions from the same fastpath completion queue. There is a
potential for race condition when these two paths are processing
the completions from the same queue and trying to updating Rx Producer.
Add a fastpath Rx lock between these two paths to close this race.
Fixes: 540a211084 ("bnx2x: driver core")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rmody@marvell.com>
The fastpath task queue handler resets the fastpath scan flag
unconditionally, this patch changes that to reset the flag
only if it was set.
Fixes: 08a6e472c3 ("net/bnx2x: fix packet drop")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rmody@marvell.com>
Flow with meter will split to three subflows, the prefix subflow with
meter action do the color, the meter subflow filter the packets, the
suffix subflow do all the left actions for packets pass the filter.
Both the color and the subflow match between prefix and suffix use the
register to store the tag.
For some of the NICs with meter color register share capability, it
only uses 8 LSB of the register for color, the left 24 MSB can be used
for flow id match between meter prefix subflow and suffix subflow.
Currently, one entire register is allocated for flow matching which
causes the NICs with limited registers don't have enough register for
other matching.
Add the meter color share capability checking to fix lacking of
registers issue.
Fixes: 9ea9b049a9 ("net/mlx5: split meter flow")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The id allocated is for the register unique id match. Some registers may
not use the full 32 bits. Add the maximum id to avoid allocate id over
the register restriction.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The ConnectX-5 HW cannot calculate the checksum for ICMPv6,
therefore flows with pattern 'ipv6 proto is 58' with actions that change
the header should be rejected. the actions that change the header
in this type of flow are 'set_ipv6_src' and 'set_ipv6_dst'.
Fixes: 4bb14c83df ("net/mlx5: support modify header using Direct Verbs")
Cc: stable@dpdk.org
Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Previous patch fixed the setting of port-id for eswitch rules, which
are ingress only.
This patch expands the fix, to support NIC rules as well, which can
be ingress or egress.
Fixes: ce777b147b ("net/mlx5: fix E-Switch flow without port item")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
The cited commit zeroed the UDP checksum for raw-encap case.
Add the same handling for vxlan-encap case.
Fixes: bf1d7d9a03 ("net/mlx5: zero out UDP checksum in encapsulation")
Cc: stable@dpdk.org
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Compilation massage example:
"dpdk/drivers/net/mlx5/mlx5_flow_dv.c:1087:10: error: comparison of
unsigned enum expression < 0 is always false
[-Werror,-Wtautological-compare]
if (reg < 0)
~~~ ^ ~
"
enum modify_reg holds only non-negative integers and in some places in
the code it was used to be compared with negative value, hence
compilation was failed.
Change all thus places to use integer instead of enum modify_reg.
Fixes: 3e8edd0ef8 ("net/mlx5: update metadata register ID query")
Fixes: 55deee1715 ("net/mlx5: extend flow mark support")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Description of several functions is not accurate.
This patch updates the description, parameter names etc.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Validation function of 'set VLAN VID' action checks twice for existing
same action in flow rule.
This patch updates the validation function logic, to check the same
restrictions more efficiently.
Fixes: 5f163d520c ("net/mlx5: support modify VLAN ID on existing VLAN header")
Fixes: b8c0372bc5 ("net/mlx5: fix set VLAN ID/PCP in new header")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Validation function of 'POP VLAN' action includes check for other
'POP VLAN' actions present in flow.
It doesn't check for 'PUSH VLAN' actions present in flow.
This patch adds check for 'PUSH VLAN' actions present in flow.
Fixes: b41e47da25 ("net/mlx5: support pop flow action on VLAN header")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>