Switch rules usually match packet fields to take actions. Add capability
to add a switch rule that does not match any packet fields, but instead
matches the profile that the packet hits in the switch block.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Added new value for cmd_flag:
Persistently set the DCB configuration mode for the current port.
Added new value for valid_flags: represent bit#1 of command flag
Signed-off-by: Sharon Haroni <sharon.haroni@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Add new API function to allow user to choose port
on which mac vlan rule going to be added.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
We will remove the logic of configuring the flow control out of the
ice_set_fc(...) function. The goal is to enable any driver to combine
all PHY related flow logic, without repeatedly call ice_aq_set_phy_cfg.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
In order to support switch rule for ethertype filter
with ipv6 ethertype id, it has to check ethertype
then find a proper dummy packet.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
When setting the PHY cfg (CQ cmd 0x0601), if the firmware responds
with an EMODE error, software will ignore the error as it simply
means that manageability (ex: BMC) is in control of the link and that
the new setting may not be applied.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
The scope of these variables can be reduced, so do so. This also
eliminates the need for the extra wrapping/braces.
Also, compact a line since it can fit within 80 columns
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Add macro for FDID priority 0 and 3, also adjust the
fdid_prio position to sync with kernel driver.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
The Hw counters is defined as 32bit unsigned value and read from
the sysfs. Firstly read the base value while application start,
then fetch the new value while do query and minus the base value.
If the new value is less than base value, will result in a
negative value and convert to the big value as unsigned 64bit.
PMD add xstats field to store the last successfully read counter,
use it if failed to read hw counter from sysfs.
PMD also record the last output value to handle the wrap around case,
if overflow happened, increase the wrap count by 1 and save into the
higher 32bit, and update the new value into lower 32bit, finally
return the 64bit counter value.
Fixes: ce9494d76c47 ("net/mlx5: report imissed statistics")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
When creating a hairpin queue, the total data size and the maximal
number of packets are interrelated. The differ is the stride size.
Larger buffer size means big packet like jumbo could be supported,
but in the meanwhile, it will introduce more cache misses and have a
side effect on the performance.
Now a new device parameter "hp_buf_log_sz" is introduced for
applications to set the total data buffer size (the logarithm value).
Then the maximal number of packets will also be calculated
automatically by this value.
Applications could also change this value to a larger one in order
to support larger packets in hairpin case. A smaller value will be
beneficial for memory consumption.
If it is not set, the default value will be used.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The PMD static logtype is original DPDK legacy and should not be used.
Fixes: 43362c6a7647 ("net/thunderx: support RSS and RETA query and update")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The static PMD logtype should not be used by drivers.
Instead, use existing log macros in this driver.
Also use standard rte_ether routine to format ether address.
Fixes: 37f9b54bd3cf ("net/dpaa: support Tx and Rx queue setup")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The PMD logtype is a legacy from original DPDK logging.
All drivers must use their own dynamic log type.
Fixes: b1bc1afa4a0e ("net/pfe: support dynamic logging")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The PMD logtype is legacy and drivers should use their own logtype.
Fixes: 050316a88313 ("net/tap: support TSO (TCP Segment Offload)")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Virtio driver has its own logtype and should not use legacy
PMD logtype.
Fixes: 32c118fd0059 ("virtio: free mbuf's with threshold")
Fixes: e5f456a98d3c ("net/virtio: support in-order Rx and Tx")
Fixes: 1c8489da561b ("net/virtio-user: fix multi-process support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The PMD logtype is legacy and should not be used.
Fixes: 3378383dceab ("net/mvneta: support statistics reset")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
To match the pools/queues configuration, the pools/queues output
should start from VMDQ base queue. This patch fixed the issue.
Fixes: 6bb97df521aa ("examples/vmdq: new app")
Cc: stable@dpdk.org
Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Tested-by: Yingya Han <yingyax.han@intel.com>
For DCF, flow ops may be executed in different threads,
so an thread safe option for generic flow APIs is needed.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Add a new ops "redirect" to flow engine, it's used to implement the
function that redirect a flow's destination. Currently only support
VSI-Redirect which will be used by DCF for handling VF-VSI mapping
table change.
A new API "ice_flow_redirect" is exposed, current usage is: it could
be called when there's VF-VSI mapping table change caused by VF reset.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
In function mlx5_devx_cmd_create_tir(), the 40 bytes of RSS key are
copied in 10 iterations, 4 bytes each time using the MLX5_SET macro.
As result the RSS key is copied into TIR context in swapped byte order.
This patch fixes the issue, using memcpy() to copy the RSS key as is.
The struct member mlx5_devx_tir_attr.rx_hash_toeplitz_key is updated
to byte array type.
Fixes: c3aea272eed8 ("net/mlx5: create advanced Rx object via DevX")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
By default, flows are categorized into two types of a mlx5 device.
1. The PMD driver will create some default flows to enable the
traffic and give some default behaviors on the packets. And
this is transparent to the upper layer application.
2. Other flows will be created in the application based on its
needs.
When in the old cached mode for application flows, it is allowed
to created the flow before the device is started. And when
starting the device, all the flows will be applied to the hardware
and take effect. The cached flows will be also applied in the same
time.
In non-cached mode, all the flows will never be cached when stopping
a device. So it makes no sense to insert any flow into the device
before it is started. Default flows owned by PMD driver are not
affected in this case.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Only the members of flow handle structure will be used when trying
to destroy a flow. Other members of mlx5 device flow resource will
only be used for flow creating, and they could be reused for different
flows.
So only the device flow handle structure needs to be saved for further
usage. This could be separated from the whole mlx5 device flow and
stored with a list for each rte flow.
Other members will be pre-allocated with an array, and an index will
be used to help to apply each device flow to the hardware.
The flow handle sizes of Verbs and DV mode will be different, and
some calculation could be done before allocating a verbs handle.
Then the total memory consumption will less for Verbs when there is
no inbox driver being used.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Common structures used for mlx5 flow creating and destroying are
reorganized in order to separating the parts only for destroying
from all the items.
The "mlx5_flow" will contain the common items of DV and Verbs flow,
specific items for DV / Verbs only. These items will only be used
when creating a flow.
At the end of "mlx5_flow", a nested structure "mlx5_flow_handle"
located. It contains all the items used both for creating and
destroying a flow. Also, it consists of common items, and DV / Verbs
specific items.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
When stopping a mlx5 device, all the flows inserted will be flushed
since they are with non-cached mode. And no more action will be done
for these flows in the device closing stage.
If the device restarts after stopped, no flow with non-cached mode
will be re-inserted.
The flush operation through rte interface will remain the same, and
all the flows will be flushed actively.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
MARK and META items are interrelated with datapath -
they might move from/to the applications in mbuf.
zero value for these items has the special meaning -
it means "no metadata are provided", not zero values
are treated by applications and PMD as valid ones.
Moreover in the flow engine domain the value zero is
acceptable to match and set, and we should allow to
specify zero values as rte_flow parameters for the
META and MARK items and actions. In the same time
zero mask has no meaning and should be rejected
on validation stage.
Fixes: fcc8d2f716fd ("net/mlx5: extend flow metadata support")
Fixes: e554b672aa05 ("net/mlx5: support flow tag")
Fixes: 55deee1715f0 ("net/mlx5: extend flow mark support")
Cc: stable@dpdk.org
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The item is being set according to the attribute value, whether it is
udp/tcp or ipv4/6.
Also, there are two condition calls.
If the attribute is neither udp/tcp or ipv4/6 the item is not
initialized at all, but the call to the flow_dv_convert_modify_action is
still being done.
Also, even if the attribute is tcp/udp or ipv4/6, we still have two
conditions.
This patch changes the conditions, so the item will always be set.
By doing this, there is also a save in the number of condition calls.
Fixes: 4bb14c83df95 ("net/mlx5: support modify header using Direct Verbs")
Cc: stable@dpdk.org
Signed-off-by: Asaf Penso <asafp@mellanox.com>
Reviewed-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Flow rule in NIC table on VF representor should not contain VLAN pop
or push actions, and encap or decap actions. Using these actions in
NIC table on VF representor is not a valid use case.
This patch updates the various validation functions to reject such
rules.
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Jack Min <jackmin@mellanox.com>
Invalidation of consumed CQEs incurs a performance penalty
due to many cache misses caused by a non-sequential CQEs access.
Prefetch CQEs to get a better data locality and speed up the
decompression of CQEs. Prefetching reduces CPI rate of the
rxq_cq_decompress_v() function from 1 to 0.85 in my environment,
resulting in 2% boost in mpps for 64B frames single core test.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
If packets with the same metadata are received with compressed CQE
the metadata value is not copied from the title packet in vectorized
rx_burst routines, it causes wrong metadata values seeing by
applications.
Fixes: a18ac6113331 ("net/mlx5: add metadata support to Rx datapath")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In the current state, when preforming read/write
transactions we must wait for a completion in order
to run the next transaction, and all transactions are
performed by order.
Relaxed Ordering is a PCI optimization which by enabling it
we allow the system to perform read/writes in a different
order without having to wait for completion and improve
the performance in that matter.
This commit introduces the creation of relaxed ordering
memory regions in mlx5.
As relaxed ordering is an optimization, drivers that
do not support it can simply ignore it and therefore
it is enabled by default.
Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Trying to create zero spec for vni wasn't allowed, to
avoid matching all packets from previous layer (udp).
This behavior is incorrect, since VXLAN is being identified
through the outer UDP destination port.
Currently, if the user didn't specify outer UDP destination
port the PMD will automatically match only on outer
UDP port of 4798, and if the user want to match on some none
standard port he need to specify it explicitly in the rule.
This removes the limitation of vni spec to be able to match any
vni.
Fixes: 23c1d42c7138 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
e_tag_ether_type has been wrongly defined as bool type which introduces
a bug for etag/etag_strip for x550 NIC. Fixes it by defining it as
uint16_t.
Fixes: ad43b7bce95b ("net/ixgbe: avoid multiple definitions of bool")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Setting LINK UP or LINK DOWN is divided into two parts, with
the main task done in a separate thread, which can take up
to 9 seconds. If cancel the thread in execution, may cause state
inconsistencies. Therefore, must wait for the previous setting
to exit normally before setting the new state.
Note: before using threads, use alarm to handle main tasks.
When canceling alarm, the execution of alarm will not be interrupted.
Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
Cc: stable@dpdk.org
Signed-off-by: Tao Zhu <taox.zhu@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Tested-by: Yu Jiang <yux.jiang@intel.com>
The DCF (Device Config Function) needs the hardware index of the VFs to
control the flow setting. And also if the VF resets, the index may be
changed, so it should handle this in VF reset event.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The DCF (Device Config Function) works at the user PF level, it can't
access the real PF hardware directly. So it will pass through the PF's
AdminQ command by the DCF's mailbox.
And the DCF is mainly used to control the flow setting of other VFs, so
it only needs to initialize some core functions related to the flow.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
A new DCF PMD will be introduced, which runs on Intel VF hardware, and
it is a pure software design to control the advance functionality (such
as switch, ACL) for rest of the VFs.
The DCF (Device Config Function) feature shares the core functions of
the ICE PMD, like it needs to export the DDP definition symbols for the
new DCF PMD use.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The DCF (Device Config Function) splits the AdminQ command into two
parts: one is the descriptor of AdminQ command, the other is the buffer
of AdminQ command (the descriptor has BUF flag set). When both of them
are received by the PF, the PF will handle them as one command.
And also, the filled descriptor and buffer of the response will be sent
back to DCF one by one through the virtchnl from PF.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Since the DCF (Device Config Function) controls the flow setting of
other VFs by the mailbox with PF, for security, it needs to acquire
the DCF capability from PF when starts, and disable it when exits.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Introduce the DCF (Device Config Function) feature in the ice PMD, it
works as a standalone PMD which doesn't handle the packet Rx/Tx related
things. Its hardware entity is the VF.
Add the basic DCF hardware initialization, this is specified by devarg
'cap=dcf'.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
A new DCF PMD will be introduced, which runs on Intel VF hardware, and
it is a pure software design to control the advance functionality (such
as switch, ACL) for rest of the VFs.
So if the DCF (Device Config Function) mode is specified by the devarg
'cap=dcf', then it will stop the PCI probe in the iavf PMD.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch unified the Rx ptype table.
Signed-off-by: Shougang Wang <shougangx.wang@intel.com>
Acked-by: Leyi Rong <leyi.rong@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Add an opcode for getting VSI mapping table.
Add an virtchnl event code for VF reset done.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add DCF capability flag for VF.
Add an opcode for disabling DCF capability.
Add an virtchnl event code for AVF resetting completion.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add two virtchnl opcodes to send the AdminQ command, one is used to
send the descriptor, the other is used to send the buffer payload if
the AdminQ command has BUF flag set.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Update copyright date to 2020.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
In order to support switch rule for PPPOE packet
with ipv6 payload, it has to use a new dummy packet
with ipv6 format.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Add a lock for protecting the tunnel table while adding, removing
and searching tunnels.
Add reference counting to tunnels so that multiple instances
of the same tunnel port can be created. Only physically
destroy the tunnel when all instances of that tunnel have been
destroyed.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
In order to support switch rule for NVGRE packets,
it need to check ipv4 next protocol number, if it is 0x2F,
which means next payload is NVGRE, we need to use NVGRE
format dummy packet.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>