Each call to the AVX2 vector burst receive function makes at
least one pass through the function's inner loop, loading
256 bytes of completion descriptors and copying 8 rte_mbuf
pointers regardless of whether there are any packets to be
received.
Unidirectional forwarding performance is improved by about
3-4% if we ensure that at least one packet can be received
before entering the inner loop.
Fixes: c4e4c18963 ("net/bnxt: add AVX2 RX/Tx")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Mempool registration code had a wrong assumption that it is always
dealing with packet mempools and always called rte_pktmbuf_priv_flags(),
which returned a random value for different types of mempools.
In particular, it could consider MPRQ mempools as having externally
pinned buffers, which is wrong.
Packet mempools cannot be reliably recognized, but it is sufficient to
check that the mempool is not a packet one, so it cannot have externally
pinned buffers.
Compare mempool private data size to that of packet mempools to check.
Fixes: 690b2a88c2 ("common/mlx5: add mempool registration facilities")
Fixes: fec28ca0e3 ("net/mlx5: support mempool registration")
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The lock sh->txpp.mutex was not correctly released on one path
of cleanup function return, potentially causing the deadlock.
Fixes: d133f4cdb7 ("net/mlx5: create clock queue for packet pacing")
Cc: stable@dpdk.org
Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch uses tx_free_thresh to control mbufs free when the common
xmit algorithm is used.
This patch also modifies the implementation of PMD's tx_done_cleanup
because the mbuf free algorithm changed.
In the testpmd single core MAC forwarding scenario, the performance is
improved by 10% at 64B on Kunpeng920 platform.
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Currently the vector and simple xmit algorithm don't support multi_segs,
so if Tx offload support MBUF_FAST_FREE, driver could invoke
rte_mempool_put_bulk() to free Tx mbufs in this situation.
In the testpmd single core MAC forwarding scenario, the performance is
improved by 8% at 64B on Kunpeng920 platform.
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
When a port starts in non-isolated mode,
an internal indirect RSS is created that includes all configured queues
and a flow rule is created that references this indirect RSS.
If before switching to non-isolated mode an indirect RSS was created
that includes the same set of queues, it would be reused at this point.
However, because the port had been stopped (or not yet started),
the TIR for this indirect RSS had been destroyed (or not yet created).
The flow rule could not be created and the port start failed.
Creation of TIRs is moved before configuring non-isolated mode flows,
but it is not enough because of the following issue.
Commit 0cedf34da7 ("net/mlx5: move Rx queue reference count")
changed mlx5_rxq_get() not to increment RxQ control structure
reference count, mlx5_rxq_ref() was introduced for this purpose.
mlx5_ind_table_obj_attach() was not updated to use the new function,
so when the port was stopped, the control structure reference count
of an RxQ used in RSS reached zero and the structure was destroyed.
Use mlx5_rxq_ref() to keep RxQ control structure
needed for indirect RSS persistence across port restart.
Fixes: ec4e11d41d ("net/mlx5: preserve indirect actions on restart")
Fixes: 0cedf34da7 ("net/mlx5: move Rx queue reference count")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The RSS can be one of the fate actions when creating a meter with
policy. In the previous implementation, the RSS validation was missed
when creating a flow rule with such meter due to the fact that a
policy meter was created firstly and then used in the rule. In the
stage of meter creation, no rte_flow_item* information was provided.
A unnecessary RSS expansion might be called since the validation was
missed and would cause an unexpected error of the rule creation. Even
though the rule should be rejected from the very beginning, it would
cause confusion. There might be some other errors when the validation
was missed.
Adding the RSS validation inside the meter action validation will
prevent the code from continuing when there is a conflict between the
items, other actions and the policy meter RSS action.
Fixes: 4443201863 ("net/mlx5: support meter creation with policy")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Reviewed-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
When application creates several flows to match on GRE tunnel
without explicitly specifying GRE protocol type value in
flow rules, PMD will translate that to zero mask.
RDMA-CORE cannot distinguish between different inner flow types and
produces identical matchers for each zero mask.
The patch extracts inner header type from flow rule and forces it
in GRE protocol type, if application did not specify
any without explicitly specifying GRE protocol type value in
flow rules, protocol type value.
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When application creates several flows to match on GENEVE tunnel
without explicitly specifying GENEVE protocol type value in
flow rules, PMD will translate that to zero mask.
RDMA-CORE cannot distinguish between different inner flow types and
produces identical matchers for each zero mask.
The patch extracts inner header type from flow rule and forces it
in GENEVE protocol type, if application did not specify
any without explicitly specifying GENEVE protocol type value in
flow rules, protocol type value.
Fixes: e59a5dbcfd ("net/mlx5: add flow match on GENEVE item")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
RFC-2784 allows any valid Ethernet type in GRE protocol type field.
Add Ethernet to GRE RSS expansion.
Fixes: f4b901a46a ("net/mlx5: add flow GRE item")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
VXLAN-GPE extends VXLAN protocol and provides the next protocol
field specifying the first inner header type.
The application can assign some explicit value to
VXLAN-GPE::next_protocol field or set it to the default one. In the
latter case, the rdma-core library cannot recognize the matcher
built by PMD correctly, and it results in hardware configuration
missing inner headers match.
The patch forces VXLAN-GPE::next_protocol assignment if the
application did not explicitly assign it to the non-default value
Fixes: 90456726eb ("net/mlx5: fix VXLAN-GPE item translation")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Secondary process depends on the vector offload flag to select right
Rx offload path. This patch adds a variable in share memory to store
the vector offload flag that can be directly read by secondary process.
Fixes: 808a17b3c1 ("net/ice: add Rx AVX512 offload path")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Tested-by: Qin Sun <qinx.sun@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Just fix the wrong defines of GTPU flags between UL and DL. These two
are defined are misplaced to each other.
Fixes: 8ebb93942b ("net/ice/base: add function to set HW profile for raw flow")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Inside the MR control structure there is a pointer to the common device.
This pointer enables access to the global cache as well as hardware
objects that may be required in case a new MR needs to be created.
The purpose of adding this pointer into the MR control structure was to
avoid its transfer as a parameter to all the functions of searching MR
in the caches.
However, adding it to this structure increased the Rx and Tx data-path
structures, all the fields that followed it were slightly moved away
which caused to a reduction in performance.
This patch removes the pointer from the structure. It can be accessed
through the "dev_gen_ptr" existing field using the "container_of"
operator.
Fixes: 334ed198ab ("common/mlx5: remove redundant parameter in MR search")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The completed variable is used for debug logs even though clang 13
reports it as unused.
Bugzilla ID: 881
Fixes: c3ecdbb376 ("vmxnet3: support TSO")
Cc: stable@dpdk.org
Reported-by: Liang Longfeng <longfengx.liang@intel.com>
Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
eicr serves as a placeholder for some read-on-clear nic register.
clang 13 reports it as unused.
Bugzilla ID: 881
Fixes: b7311360fb ("net/txgbe: support VF interrupt")
Cc: stable@dpdk.org
Reported-by: Liang Longfeng <longfengx.liang@intel.com>
Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Since v0.4.0, if the underlying kernel supports it, libbpf uses 'bpf
link' to manage the programs on the interfaces of the XDP sockets (xsks).
This is not compatible with the PMD's custom XDP program loading feature
which uses the netlink-based method for loading custom programs.
The conflict arises when libbpf searches for a custom program on the
interface using bpf link, but doesn't find one because the netlink
method was used. libbpf then proceeds to try to load the default program
on the interface, but fails due to the presence of the custom program.
To work around this, the PMD now uses the
XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD flag which prevents libbpf from
attempting to search for or load a program. One repercussion is that
DPDK must now insert the xsk into the xsks_map as this was previously
handled by libbpf during the routines for program loading/probing.
Ideally, the PMD would use bpf link to load the custom program, however
at present there is no convenient and reliable way of detecting whether
the underlying kernel supports bpf link. Perhaps this may become
available in a future libbpf release, at which point we can switch the
PMD over to the new bpf link based method.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
The commit ae70cc6e89 ("net/af_xdp: use BPF link for XDP programs")
caused compilation errors on kernels older than v5.8 due to absence of
the bpf_link_info struct and some definitions in the linux/bpf.h header.
Since relying on the reported kernel version is not a robust solution
and also since there doesn't appear to be a suitable definition in the
bpf header that the preprocessor could rely on to determine support for
bpf link, we will take a different approach to solving the issue that
the original patch attempted to solve. The next commit will address
this.
Fixes: ae70cc6e89 ("net/af_xdp: use BPF link for XDP programs")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Remove the szedata2 device driver as the platform is no longer
supported.
Signed-off-by: Martin Spinler <spinler@cesnet.cz>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds Rx/Tx queue configuration setup operations.
On packet reception the respective BD Ring status bit is set
which is then used for packet processing.
Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
Signed-off-by: Apeksha Gupta <apeksha.gupta@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Implemented the fec-uio driver in kernel. enetfec PMD uses
UIO interface to interact with "fec-uio" driver implemented in
kernel for PHY initialisation and for mapping the allocated memory
of register & BD from kernel to DPDK which gives access to
non-cacheable memory for BD.
Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
Signed-off-by: Apeksha Gupta <apeksha.gupta@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The attribute to record the global control of hairpin queues' delay
drop was defined as a bit-field with one bit, and the intention was
to reduce the memory overhead. In the meanwhile, the macro was
defined as an enumerated value 0x2.
No matter what value inputted via devarg, the lowest bit was always
zero and the higher bits would be ignored. For hairpin queues, the
delay drop attribute couldn't be enabled.
With the commit, the double logical negation is used to fix this.
Fixes: febcac7b46 ("net/mlx5: support Rx queue delay drop")
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
There was a redundant check for the enabled E-Switch, this
resulted in device probing failure if the Tx scheduling was
requested and E-Switch was enabled.
Fixes: f17e4b4ffe ("net/mlx5: add Tx scheduling check on queue creation")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch fixes coverity issue by directly passing the address
of the meta data to subfunction.
Coverity issue: 373867
Fixes: 5ad3db8d4b ("net/ice: enable advanced RSS")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch fixes coverity issue by directly passing the address
of the meta data to lower function.
Coverity issue: 373867
Fixes: 7be10c3004 ("net/iavf: add RSS configuration for VF")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Wrong offset used to clear the extended stats section resulting
in eth stats not being reset.
Fixes: ccb49b834c ("net/iavf: support xstats for inline IPsec crypto")
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.
Since the entire descriptor is not read atomically, on relaxed memory
ordered systems like Aarch64, read of the word containing DD field
could be reordered after read of other words.
Read barrier is inserted between read of the word with DD field
and read of other words. The barrier ensures that the fetched data
is correct.
Testpmd single core test showed no performance drop on x86 or N1SDP.
On ThunderX2, 22% performance regression was observed.
Fixes: 7b0cf70135 ("net/i40e: support ARM platform")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
DCF tries to handle AdminQ when DCF is reset by PF, however the invalid
data may be returned, and error log may be output in this situation.
This patch stops handling AdminQ when a passive reset is detected to
avoid this situation.
Fixes: 7564d55096 ("net/ice: add DCF hardware initialization")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The routine converting RTE flow modify field action into
field driver's presentation did not specify the field mask
correctly and this resulted into wrong conversion for
the actions with shifted fields.
Fixes: 40c8fb1fd3 ("net/mlx5: update modify field action")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The global redirection table is used to create the default flow
rules for the ingress traffic with the lowest priority. It is also
used to create the default RSS rule in the destination table when
there is a tunnel offload.
To update the RETA in-flight, there is no restriction in the ethdev
API. In the previous implementation of mlx5, a port restart was
needed to make the new configuration take effect.
The restart is heavy, e.g., all the queues will be released and
reallocated, users' rules will be flushed. Since the restart is
internal, there is a risk to crash the application when some change
in the ethdev is introduced but no workaround is done in mlx5 PMD.
The users' rules, including the default miss rule for tunnel
offload, should not be impacted by the RETA update. It is improper
to flush all rules when updating RETA.
With this patch, only the default rules will be flushed and
re-created with the new table configuration.
Fixes: 3f2fe392bd ("net/mlx5: fix crash during RETA update")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
For the flows containing sample action, the tag action was added
implicitly to store the unique flow index into metadata register in the
split prefix subflow, and then match on this index in the split suffix
subflow. The metadata register for flow index of sample split subflows
was also used to store application metadata TAG 0 item, this might cause
TAG 0 corruption in the flows with sample actions.
This patch uses the same metadata register C index as used for
ASO action since it's reserved and not used directly by the application,
and adds the checking in validation to make sure not to conflict
with ASO CT in the same flow.
Fixes: b4c0ddbfcc ("net/mlx5: split sample flow into two sub-flows")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Tunnel offload API allows the application to restore packet to
its original form if the chain of flows is missed after DECAP action.
MLX5 PMD provides tunnel offload support only if DV API was enabled.
The patch verifies DV availability before processing with
tunnel offload tasks.
Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Registration of packet mempools with RTE_PKTMBUF_POOL_PINNED_EXT_MEM
was performed incorrectly: after population of such mempool chunks
only contain memory for rte_mbuf structures, while pointers to actual
external memory are not yet filled. MR LKeys could not be obtained
for external memory addresses of such mempools. Rx datapath assumes
all used mempools are registered and does not fallback to dynamic
MR creation in such case, so no packets could be received.
Skip registration of extmem pools on population because it is useless.
If used for Rx, they are registered at port start.
During registration, recognize such pools, inspect their mbufs
and recover the pages they reside in.
While MRs for these pages may already be created by rte_dev_dma_map(),
they are not reused to avoid synchronization on Rx datapath
in case these MRs are changed in the database.
Fixes: 690b2a88c2 ("common/mlx5: add mempool registration facilities")
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When a user specifies meter policy like "g_actions queue / end
y_actions queue / r_action drop / end", validation logic missed
to set meter policy mode and it took a random value from the stack.
Define ALL policy modes for the mentioned cases.
Fixes: 4b7bf3ffb4 ("net/mlx5: support yellow in meter policy validation")
Cc: stable@dpdk.org
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Bing Zhao <bingz@nvidia.com>
After yellow color actions in the metering policy were supported,
the RSS could be used for both green and yellow colors and only the
queues attribute could be different.
When specifying the attributes of a RSS, some fields can be ignored
and some default values will be used in PMD. For example, there is a
default RSS key in the PMD and it will be used to create the TIR if
nothing is provided by the application.
The default value cases were missed in the current implementation
and it would cause some false positives or crashes.
The comparison function should be adjusted to take all cases into
consideration when RSS is used for both green and yellow colors.
Fixes: 4b7bf3ffb4 ("net/mlx5: support yellow in meter policy validation")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Due to kernel driver / FW issues in direct MKEY creation using the DevX
API, this patch replaces the counter MR creation to use wrapped mkey
API.
Fixes: 5382d28c21 ("net/mlx5: accelerate DV flow counter transactions")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
The RTE_PORT_PCAP variable is used to signal libpcap availability,
though its name seems to refer to pcap support in the port library.
Prefer a generic name and add explicit link dependencies where needed.
Fixes: 7a944656b3 ("test/pcapng: test pcapng library")
Fixes: 2eccf6afbe ("bpf: add function to convert classic BPF to DPDK BPF")
Fixes: cbb44143be ("app/dumpcap: add new packet capture application")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>