The mlx5 PMD exposes the following new introduced
extended statistics counter to report the errors
of packet send scheduling on timestamps:
- txpp_err_miss_int - rearm queue interrupt was not handled
was not handled in time and service routine might miss
the completions
- txpp_err_rearm_queue - reports errors in rearm queue
- txpp_err_clock_queue - reports errors in clock queue
- txpp_err_ts_past - timestamps in the packet being sent
were found in the past, timestamps were ignored
- txpp_err_ts_future - timestamps in the packet being sent
were found in the too distant future (beyond HW/clock queue
capabilities to schedule, typically it is about 16M of
tx_pp devarg periods)
- txpp_jitter - estimated jitter in device clocks between
8K completions of Clock Queue.
- txpp_wander - estimated wander in device clocks between
16M completions of Clock Queue.
- txpp_sync_lost - error flag, the Clock Queue completions
synchronization is lost, accurate packet scheduling can
not be handled, timestamps are being ignored, the restart
of all ports using scheduling must be performed.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
If send schedule feature is engaged there is the Clock Queue
created, that reports reliable the current device clock counter
value. The device clock counter can be read directly from the
Clock Queue CQE.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch adds send scheduling on timestamps into tx_burst
routine template. The feature is controlled by static configuration
flag, the actual routines supporting the new feature are generated
over this updated template.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The new static control flag is introduced to control
routine generating from template, enabling the scheduling
on timestamps.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The application provides timestamps in Tx mbuf as clocks,
the hardware performs scheduling on Clock Queue completion index
match. This patch introduces the timestamp-to-completion-index
inline routine.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The fields to support send scheduling on dynamic timestamp
field are introduced and initialized on device start.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Service routine is invoked periodically on Rearm Queue
completion interrupts, typically once per some milliseconds
(1-16) to track clock jitter and wander in robust fashion.
It performs the following:
- fetches the completed CQEs for Rearm Queue
- restarts Rearm Queue on errors
- pushes new requests to Rearm Queue to make it
continuously running and pushing cross-channel requests
to Clock Queue
- reads and caches the Clock Queue CQE to be used in datapath
- gathers statistics to estimate clock jitter and wander
- gathers Clock Queue errors statistics
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch allocates the Packet Pacing context from the kernel,
configures one according to requested pace send scheduling
granularity and assigns to Clock Queue.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
To provide the packet send schedule on mbuf timestamp the Tx
queue must be attached to the same UAR as Clock Queue is.
UAR is special hardware related resource mapped to the host
memory and provides doorbell registers, the assigning UAR
to the queue being created is provided via DevX API only.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The dedicated Rearm Queue is needed to fire the work requests to
the Clock Queue in realtime. The Clock Queue should never stop,
otherwise the clock synchronization might be broken and packet
send scheduling would fail. The Rearm Queue uses cross channel
SEND_EN/WAIT operations to provides the requests to the
Clock Queue in robust way.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch creates the special completion queue providing
reference completions to schedule packet send from
other transmitting queues.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This is preparation step before moving the Tx queue creation
to the DevX approach. Some features require the shared UAR
for Tx queues and scheduling completion queues, the patch
manages the shared UAR.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The master and representors might be created over the multiport
Infiniband devices and the UAR resource allocated for sibling
ports might belong to the same underlying Infiniband device.
Hardware requires the write access to the UAR must be performed
as atomic 64-bit write, on 32-bit systems this is two sequential
writes, protected by lock. Due to possibility to share the same
UAR between sibling devices the locks must be moved to shared
context.
Fixes: f048f3d479 ("net/mlx5: switch to the shared IB device context")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch introduces the new devargs:
tx_pp - enables accurate packet send scheduling on mbuf timestamps
in the PMD. On the device start if "rte_dynflag_timestamp"
dynamic flag is registered and this devarg non-zero value is
specified, the driver initializes all necessary internal
infrastructure to provide packet scheduling. The parameter
value specifies scheduling granularity in nanoseconds.
tx_skew - the parameter adjusts the send packet scheduling on
timestamps and represents the average delay between beginning
of the transmitting descriptor processing by the hardware and
appearance of actual packet data on the wire. The value should
be provided in nanoseconds and is valid only if tx_pp parameter
is specified. The default value is zero.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch prepares the common part of the mlx5 PMDs to
support packet send scheduling on mbuf timestamps:
- the DevX routine to query the packet pacing HCA capabilities
- packet pacing Send Queue attributes support
- the hardware related definitions
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In case the ibverbs glue is a separate library to dlopen,
the PMD library must allocate a glue structure to be filled by dlopen.
The glue management was in mlx5_common.c and moved to mlx5_common_os.c,
but the variable allocation was not removed from the original file.
The consequence was a link failure, if ibverbs dlopen option is enabled,
because of the redefinition of the variable (with GCC 10):
multiple definition of 'mlx5_glue'
The original definition is removed to keep only the one moved
in the Linux sub-directory.
Fixes: 79aa430721 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
As tx mbuf is not set for some advanced descriptors, if there is no
mbuf checking before rte_pktmbuf_free_seg() function be called on
the process of tx done clean up, that will cause a segfault. So add
a NULL pointer check to fix it.
Bugzilla ID: 501
Fixes: 8d907d2b79 ("net/igb: free consumed Tx buffers on demand")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>
When the configure pattern involve GTPU inner l3 and l4, even the
configure input set only l3 but not l4, the different l4 protocol
header should also be configured for the different l4 protocol.
Fixes: 215a247b5f ("net/iavf: refactor hash flow")
Fixes: 642f201950 ("net/iavf: support RSS for IPv4 IPv6 mix of GTP")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When the configure pattern involve GTPU inner l3 and l4, even the
configure input set only l3 but not l4, the different l4 protocol
header should also be configured for the different l4 protocol.
Fixes: 0b952714e9 ("net/ice: refactor PF hash flow")
Fixes: de32fa2ba2 ("net/ice: support RSS for IPv6 prefix")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Some protocol don't support symmetric hash, need to handle these cases.
When set an invalid symmetric hash rule, just return failed.
Fixes: 4eafe71ee9 ("net/ice: fix RSS type")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The rte_eth_dev_set_vlan_offload function will check vlan rx offload
capability, the i350/i210/i211 nics have vlan extend feature but
DEV_RX_OFFLOAD_VLAN_EXTEND is not set into the capability, that will
cause setting fail. So need to add this capability in
igb_get_rx_port_offloads_capa function.
Fixes: ef990fb56e ("net/e1000: convert to new Rx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>
The rte_eth_dev_set_vlan_offload function will check vlan rx offload
capability, the i40e vf has vlan filter feature but
DEV_RX_OFFLOAD_VLAN_FILTER is not set into the capability, that will
cause setting fail. So need to add this capability in
i40e_vf_representor_dev_infos_get function.
Fixes: e0cb96204b ("net/i40e: add support for representor ports")
Cc: stable@dpdk.org
Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
DEV_RX_OFFLOAD_TIMESTAMP is per port, so the internal implementation
shall enable it on per port basis only.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This experimental API is no longer required as the same
purpose can be solved with standard DEV_RX_OFFLOAD_TIMESTAMP
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
A dpdk bonding 802.3ad network as follows:
+----------+ +-----------+
|dpdk lacp |bond1.1 <------> bond2.1|switch lacp|
| |bond1.2 <------> bond2.2| |
+----------+ +-----------+
If a fiber optic go wrong about single pass during normal running like
this:
bond1.2 -----> bond2.2 ok
bond1.2 <--x-- bond2.2 error: bond1.2 receive no LACPDU Some packets
from switch to dpdk will choose bond2.2
and lost.
DPDK lacp state machine will transits to the expired state if no LACPDU
is received before the current_while_timer expires. But if no LACPDU is
received before the current_while_timer expires again, DPDK lacp state
machine has no change. Bond2.2 can not change to inactive depend on the
received LACPDU.
According to IEEE 802.3ad, if no lacpdu is received before the
current_while_timer expires again, the state machine should transits
from expired to defaulted. Bond2.2 will change to inactive depend on the
LACPDU with defaulted state.
This patch adds a state machine change from expired to defaulted when no
lacpdu is received before the current_while_timer expires again
according to IEEE 802.3ad:
If no LACPDU is received before the current_while timer expires again,
the state machine transits to the DEFAULTED state. The record Default
function overwrites the current operational parameters for the Partner
with administratively configured values. This allows configuration of
aggregations and individual links when no protocol partner is present,
while still permitting an active partner to override default settings.
The update_Default_Selected function sets the Selected variable FALSE
if the Link Aggregation Group has changed. Since all operational
parameters are now set to locally administered values there can be no
disagreement as to the Link Aggregation Group, so the Matched variable
is set TRUE.
The relevant description is in the chapter 43.4.12 of the link below:
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=850426
Signed-off-by: Weifeng Li <liweifeng96@126.com>
Acked-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
The function valid_bonded_port_id() has already contains function
rte_eth_dev_is_valid_port(), so delete redundant check.
Fixes: 588ae95e79 ("net/bonding: fix port ID check")
Cc: stable@dpdk.org
Signed-off-by: Dongyang Pan <197020236@qq.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Added support for exact match templates
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
The default egress rule should include buffer descriptor action
record only if the VF representor is enabled.
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The protocol header are implicitly matched based on the proto
field data. For instance, if ether type is set as 0x800 in the
ether header then ipv4 protocol header is assumed to be present
for template matching even if ipv4 header is not present in the
given flow pattern.
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
OVS-DPDK is accumulating the flow counters that are returned as part of
the flow_query API and it is being issued at least 3 times every second.
So there is no need to accumulate the counts internally in the driver.
Fixes: 306c2d28e2 ("net/bnxt: support count action in flow query")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Removed the check to enable default flows only when VF representor
are enabled. It should be enabled all the time in truflow mode.
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Initialize table scope resource manager parameter.
Clear out rm_is_allocated parms before calling as base_index was added
and used incorrectly in this instance.
Signed-off-by: Farah Smith <farah.smith@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Add support for new resource manager to manage CFA resources.
TCAM is split into high and low regions now and CFA resource types
are being updated accordingly.
Signed-off-by: Peter Spreadborough <peter.spreadborough@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
mac_ctrl_frame_fwd assignment is missing, so
setting mac_ctrl_frame_fwd should be added in
ixgbe_flow_ctrl_get().
The patch fixes the issue.
Fixes: 56ea46a997 ("ethdev: retrieve flow control configuration")
Cc: stable@dpdk.org
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Bo Chen <box.c.chen@intel.com>
mac_ctrl_frame_fwd shouldn't be cleared when port stop,
otherwise it will be inconsistent with the actual status.
This patch fixes the issue.
Fixes: a524f550da ("net/ixgbe: fix flow control mode setting")
Cc: stable@dpdk.org
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The i40e_filter_pctype TCP_SYN_NO_ACK, UNICAST_IPV4_UDP and
MULTICAST_IPV4_UDP for x722 were missing when translating RSS type to
i40e_filter_pctype. This patch fixes it.
Fixes: da7018ec29 ("net/i40e: fix queue region in RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Shougang Wang <shougangx.wang@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>