The shared IB device (sh) has per port data updated in port creation.
In port closing this port data is updated even when the SH still exist.
However, this updating is happened after SH has been released and for
last port it actually accesses to freed memory.
This patch updates the port data before SH releasing.
Fixes: 13c5c09390 ("net/mlx5: fix port event cleaning order")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The item translation split causes GENEVE TLV option resource register
function flow_dev_geneve_tlv_option_resource_register() to be called
twice incorrectly both in spec and mask translation.
In SWS mode the refcnt will only be decreased by 1 in flow release.
The refcnt will never be 0 again, it causes the resource be leaked.
In HWS mode the resource is allocated as global, the refcnt should
not be increased after the resource be allocated. And the resource
should be released during PMD exists.
This commit fixes GENEVE resource management.
Fixes: 75a00812b1 ("net/mlx5: add hardware steering item translation")
Fixes: cd4ab74206 ("net/mlx5: split flow item matcher and value translation")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
If HW Steering is enabled, Rx queues were configured to receive MARKs
when a table with MARK actions was created. After stopping the port,
Rx queue configuration is released, but during starting the port
the mark flag was not updated in the Rx queue configuration.
This patch introduces a reference count on the MARK action and it
increases/decreases per template_table create/destroy.
When the port is stopped, Rx queue configuration is not cleared if
reference count is not zero.
Fixes: 3a2f674b6a ("net/mlx5: add queue and RSS HW steering action")
Cc: stable@dpdk.org
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The shared IB device (sh) has per port data with filed for interrupt
handler port_id. It used by shared interrupt handler to find the
corresponding rte_eth device by IB port index.
If value is equal or greater RTE_MAX_ETHPORTS it means there is no
subhandler installed for specified IB port index.
When a few ports are created under same sh, the sh is created with the
first port and the interrupt handler port_id is initialized to
RTE_MAX_ETHPORTS for each port.
In port creation, the interrupt handler port_id is updated with the
correct value. Since this updating, the mlx5_dev_interrupt_nl_cb
function uses this port and its priv structure.
However, when the ports are closed, this filed isn't updated and the
interrupt handler continue working until it is uninstalled in SH
destruction.
If mlx5_dev_interrupt_nl_cb is called between port closing and SH
destruction, it uses invalid port causing a crash.
This patch adds interrupt handler port_id updating to the close function
and add memory barrier to make sure it is done before priv reset.
Fixes: 655c3c26c1 ("net/mlx5: fix initial link status detection")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The macro HAVE_MLX5_HWS_SUPPORT was introduced for HWS only. And
HWS was not supported on Windows. So macro HAVE_MLX5_HWS_SUPPORT
should be only around the code which HWS uses, but avoid including
the code block shared by Linux and Windows.
Fixes: 22681deead ("net/mlx5/hws: enable hardware steering")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Counter management structure has array of counter pools. This array is
invalid in management structure initialization and grows on demand.
The resizing include:
1. Allocate memory for the new size.
2. Copy the existing data to the new memory.
3. Move the pointer to the new memory.
4. Free the old memory.
The third step can be performed before for this function, and compiler
may do that, but another thread might read the pointer before coping and
read invalid data or even crash.
This patch allocates memory for this array once in management structure
initialization and limit the counters number by 16M.
Fixes: 3aa279157f ("net/mlx5: synchronize flow counter pool creation")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
In some E-Switch use cases, applications want to receive all traffic
on a single port. Since currently, flow API does not provide a way to
match traffic forwarded to any port representor, this patch adds
support for controlling representor matching on ingress flow rules.
Representor matching is controlled through a new device argument
repr_matching_en.
- If representor matching is enabled (default setting),
then each ingress pattern template has an implicit REPRESENTED_PORT
item added. Flow rules based on this pattern template will match
the vport associated with the port on which the rule is created.
- If representor matching is disabled, then there will be no implicit
item added. As a result ingress flow rules will match traffic
coming to any port, not only the port on which the flow rule is
created.
Representor matching is enabled by default, to provide an expected
default behavior.
This patch enables egress flow rules on representors when E-Switch is
enabled in the following configurations:
- repr_matching_en=1 and dv_xmeta_en=4
- repr_matching_en=1 and dv_xmeta_en=0
- repr_matching_en=0 and dv_xmeta_en=0
When representor matching is enabled, the following logic is
implemented:
1. Creating an egress template table in group 0 for each port. These
tables will hold default flow rules defined as follows:
pattern SQ
actions MODIFY_FIELD (set available bits in REG_C_0 to
vport_meta_tag)
MODIFY_FIELD (copy REG_A to REG_C_1, only when
dv_xmeta_en == 4)
JUMP (group 1)
2. Egress pattern templates created by an application have an implicit
MLX5_RTE_FLOW_ITEM_TYPE_TAG item prepended to the pattern, which
matches available bits of REG_C_0.
3. Egress flow rules created by an application have an implicit
MLX5_RTE_FLOW_ITEM_TYPE_TAG item prepended to the pattern, which
matches vport_meta_tag placed in available bits of REG_C_0.
4. Egress template tables created by an application, which are in
group n, are placed in group n + 1.
5. Items and actions related to META are operating on REG_A when
dv_xmeta_en == 0 or REG_C_1 when dv_xmeta_en == 4.
When representor matching is disabled and extended metadata is disabled,
no changes to the current logic are required.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support for AGE action for HW steering.
This patch includes:
1. Add new structures to manage aging.
2. Initialize all of them in configure function.
3. Implement per second aging check using CNT background thread.
4. Enable AGE action in flow create/destroy operations.
5. Implement a queue-based function to report aged flow rules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add the ability to create an indirect action handle for METER_MARK.
It allows sharing one Meter between several different actions.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This commit adds the support of connection tracking to HW steering as
SW steering did before.
The difference from SW steering implementation is that it takes
advantage of HW steering bulk action allocation support, in HW
steering only one single CT pool is needed.
An indexed pool is introduced to record allocated actions from bulk and
CT action state etc. Once one CT action is allocated from bulk, one
indexed object will also be allocated from the indexed pool, similar to
deallocating. That makes mlx5_aso_ct_action can also be managed by that
indexed pool, no need to be reserved from mlx5_aso_ct_pool. The single
CT pool is also saved to mlx5_aso_ct_action struct directly.
The ASO operation functions are shared with SW steering implementation.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adapts mlx5 PMD to changes in mlx5dr API regarding the action
templates. It changes the following:
1. Actions template creation:
- Flow actions types are translated to mlx5dr action types in order
to create mlx5dr_action_template object.
- An offset is assigned to each flow action. This offset is used to
predetermine the action's location in the rule_acts array passed
on the rule creation.
2. Template table creation:
- Fixed actions are created and put in the rule_acts cache using
predetermined offsets
- mlx5dr matcher is parametrized by action templates bound to
template table.
- mlx5dr matcher is configured to optimize rule creation based on
passed rule indices.
3. Flow rule creation:
- mlx5dr rule is parametrized by the action template on which these
rule's actions are based.
- Rule index hint is provided to mlx5dr.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This commit adds HW steering counter action support.
The pool mechanism is the basic data structure for the HW steering
counter.
The HW steering's counter pool is based on the rte_ring of zero-copy
variation.
There are two global rte_rings:
1. free_list:
Store the counters indexes, which are ready for use.
2. wait_reset_list:
Store the counters indexes, which are just freed from the user and
need to query the hardware counter to get the reset value before
this counter can be reused again.
The counter pool also supports cache per HW steering's queues, which are
also based on the rte_ring of zero-copy variation.
The cache can be configured in size, preload, threshold, and fetch size,
they are all exposed via device args.
The main operations of the counter pool are as follows:
- Get one counter from the pool:
1. The user call _get_* API.
2. If the cache is enabled, dequeue one counter index from the local
cache:
2. A: if the dequeued one from the local cache is still in reset
status (counter's query_gen_when_free is equal to pool's query
gen):
I. Flush all counters in the local cache back to global
wait_reset_list.
II. Fetch _fetch_sz_ counters into the cache from the global
free list.
III. Fetch one counter from the cache.
3. If the cache is empty, fetch _fetch_sz_ counters from the global
free list into the cache and fetch one counter from the cache.
- Free one counter into the pool:
1. The user calls _put_* API.
2. Put the counter into the local cache.
3. If the local cache is full:
A: Write back all counters above _threshold_ into the global
wait_reset_list.
B: Also, write back this counter into the global wait_reset_list.
When the local cache is disabled, _get_/_put_ cache directly from/into
global list.
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The new mode 4 of devarg "dv_xmeta_en" is added for HWS only. In this
mode, the Rx / Tx metadata with 32b width copy between FDB and NIC is
supported.
The mark is only supported in NIC and there is no copy supported.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch implements creating and caching of port action for use with
HW Steering FDB flows.
Actions are created on flow template API configuration and created
only on the port designated as the master. Attaching and detaching ports
in the same switching domain causes an update to the port actions cache
by, respectively, creating and destroying actions.
A new devarg fdb_def_rule_en is being added and it's used to control
the default dedicated E-Switch rules that are created by the PMD
implicitly or not, and PMD sets this value to 1 by default.
If set to 0, the default E-Switch rule will not be created and the user
can create the specific E-Switch rules on the root table if needed.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This stores the available tags that can be used by the
application in a global array that will be used to
transfer the TAG item directly from the ID to the REG_C_x
since these can't be changed after startup.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
This adds conversion functions between both ethdev port_id and IB
context to internal corresponding tag/mask values.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
One of the conditions to allow LRO offload is the DV configuration.
The function incorrectly checks the DV configuration before initializing
it by the user devarg; hence, LRO cannot be allowed.
This patch moves this check to mlx5_shared_dev_ctx_args_config, where DV
configuration is initialized.
Fixes: c4b8620135 ("net/mlx5: refactor to detect operation by DevX")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reported-by: Gal Shalom <galshalom@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The pci bus interface is for drivers only.
Mark as internal and move the header in the driver headers list.
While at it, cleanup the code:
- fix indentation,
- remove unneeded reference to bus specific singleton object,
- remove unneeded list head structure type,
- reorder the definitions and macro manipulating the bus singleton object,
- remove inclusion of rte_bus.h and fix the code that relied on implicit
inclusion,
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Add mlx5 specific available descriptor threshold configuration
and query handler.
In mlx5 PMD, available descriptor threshold is also called
LWM (limit watermark).
While the Rx queue fullness reaches the LWM limit, the driver catches
an HW event and invokes the user callback.
The query handler finds the next Rx queue with pending LWM event
if any, starting from the given Rx queue index.
Signed-off-by: Spike Du <spiked@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
When LWM meets RQ WQE, the kernel driver raises an event to SW.
Use devx event_channel to catch this and to notify the user.
Allocate this channel per shared device.
The channel has a cookie that informs the specific event port and queue.
Signed-off-by: Spike Du <spiked@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This patch adds the support of meter action for yellow meter policy
flow, so can use meter action for both green and yellow policy flows
in meter hierarchy.
Currently must use the same meter within one meter policy. Packets
passing green/yellow policy flow will have previous meter color of
green/yellow in subsequent meter.
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
E-Switch DV flow is supported only when DV flow is supported and
enabled.
The mlx5_shared_dev_ctx_args_config() function ensures that when the
environment does not support DV, the "dv_esw_en" flag is turned off.
However, when the environment is supportive but the user has requested
to disable it, the "dv_esw_en" flag remains on and causes the PMD to try
to create an E-Switch through the Verbs engine.
This patch adds check to ensure that "dv_esw_en" flag will be turned off
when DV flow is disabled.
Fixes: a13ec19c19 ("net/mlx5: add shared device context config structure")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Sometimes net/mlx5 devices did not detect link status change to "up".
Each shared device was monitoring IBV_EVENT_PORT_{ACTIVE,ERR}
and queried the link status upon receiving the event.
IBV_EVENT_PORT_ACTIVE is delivered when the logical link status
(UP flag) is set, but the physical link status (RUNNING flag)
may be down at that time, in which case the new link status
would be erroneously considered down.
IBV interface is insufficient for the task.
Monitor interface events using Netlink.
Fixes: 198a3c339a ("mlx5: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support queue/RSS action for external Rx queue.
In indirection table creation, the queue index will be taken from
mapping array.
This feature supports neither LRO nor Hairpin.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
External queue is a queue that has been created and managed outside the
PMD. The queues owner might use PMD to generate flow rules using these
external queues.
When the queue is created in hardware it is given an ID represented by
32 bits. In contrast, the index of the queues in PMD is represented by
16 bits. To enable the use of PMD to generate flow rules, the queue
owner must provide a mapping between the HW index and a 16-bit index
corresponding to the ethdev API.
This patch adds an API enabling to insert/cancel a mapping between HW
queue id and ethdev queue id.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Flow table is a group of flows with the same matching criteria
and the same actions defined for them. The table defines rules
that have the same matching fields but with different matching
values. For example, matching on 5 tuple, the table will be
(IPv4 source + IPv4 dest + s_port + d_port + next_proto)
while the values for each rule will be different.
The templates' relevant matching criteria and action instances
will be created in the table creation and saved in the table.
As table attributes indicate the supported flow number, the flow
memory will also be allocated at the same time.
This commit adds the table management functions.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The hardware steering is backend to support rte_flow_async API in
mlx5 PMD. The port configuration function creates the queues and
needed flow management resources.
The PMD layer configuration function allocates the queues' context
and per-queue job descriptor pool. The job descriptor pool size
is equal to the queue size, and the job descriptors will be popped
from pool with LIFO strategy to convey the flow information during
flow insertion/destruction. Then, while polling the queued operation
result, the flow information will be extracted from the job descriptor
and the descriptor will be pushed back to the LIFO pool.
The commit creates the flow port queues and the job descriptor pools.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The new hardware steering engine relies on using dedicated steering WQEs
instead of writing to the low-level steering table entries directly.
In the first implementation the hardware steering engine supports the
new queue based Flow API, the existing synchronous non-queue based Flow
API is not supported.
A new dv_flow_en value 2 is added to manage mlx5 PMD steering engine:
dv_flow_en rte_flow API rte_flow_async API
------------------------------------------------
0 support not support
1 support not support
2 not support support
This commit introduces the extra dv_flow_en = 2 to specify the new
flow initialize and manage operation routine.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Improve the devargs handling in two aspects:
- Parse the devargs string only once.
- Return error and report for unknown keys.
The common driver parses once the devargs string into a dictionary, then
provides it to all the drivers' probe. Each driver updates within it
which keys it has used, then common driver receives the updated
dictionary and reports about unknown devargs.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add configuration structure for port (ethdev). This structure contains
all configurations coming from devargs which oriented to port. It is a
field of mlx5_priv structure, and is updated in spawn function for each
port.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add configuration structure for shared device context. This structure
contains all configurations coming from devargs which oriented to
device. It is a field of shared device context (SH) structure, and is
updated once in mlx5_alloc_shared_dev_ctx() function.
This structure cannot be changed when probing again, so add function to
prevent it. The mlx5_probe_again_args_validate() function creates a
temporary IB context configure structure according to new devargs
attached in probing again, then checks the match between the temporary
structure and the existing IB context configure structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Move all device configure to be performed by mlx5_os_cap_config()
function instead of the spawn function.
In addition move all relevant fields from mlx5_dev_config structure to
mlx5_dev_cap.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Rearrange the mlx5_os_get_dev_attr() function in such a way that it
first executes the queries and only then updates the fields.
In addition, it changed its name in preparation for expanding its
operations to configure the capabilities inside it.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This patch adds in SH structure a flag which indicates whether is
E-Switch mode.
When configure "dv_esw_en" from devargs, it is enabled only when is
E-switch mode. So, since dv_esw_en has been configure, it is enough to
check if "dv_esw_en" is valid.
This patch also removes E-Switch mode check when "dv_esw_en" is checked
too.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The mlx5_flow_counter_mode_config function exists for both Linux and
Windows with the same name and content.
This patch moves its implementation to the folder shared between the
operating systems, removing the duplication.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The realtime timestamp configure work for Linux as same as Windows.
This patch removes it to the function implemented in the folder shared
between the operating systems, removing the duplication.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The sharing device context structure has a field named "devx" which
indicates if DevX is supported.
The common configure structure has also field named "devx" with the same
meaning.
There is no need for this duplication, because there is a reference to
the common structure from within the sharing device context structure.
This patch removes it from sharing device context structure and uses the
common config structure instead.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The HCA attribute structure is field of net configure structure.
It is also field of common configure structure.
There is no need for this duplication, because there is a reference to
the common structure from within the net structures.
This patch removes it from net configure structure and uses the common
config structure instead.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The device arguments are parsed and updated twice during spawning. First
time before creating the share device context, and again later after
updating a default value to one of the arguments.
This patch consolidates them into one parsing and updates the default
values before it.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The mlx5_alloc_shared_dev_ctx() function has a local variable named
"err" which contains the errno value in case of failure.
When functions called by this function are failed, this variable is
updated with their return value (that should be a positive errno value).
However, some functions doesn't update errno value by themselves or
return negative errno value. If one of them fails, the "err" variable
contains negative value what cause to assertion failure.
This patch updates all functions uses by mlx5_alloc_shared_dev_ctx()
function to update rte_errno and take this value instead of "err" value.
Fixes: 5dfa003db5 ("common/mlx5: fix post doorbell barrier")
Fixes: 5d55a494f4 ("net/mlx5: split multi-thread flow handling per OS")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The ASO connection tracking structure is initialized once for sharing
device context.
Its release takes place in the close function which is called for each
ethdev individually. i.e. when there is more than one ethdev under the
same sharing device context, it will be destroyed when one of them is
closed. If the other wants to use it later, it may cause it to crash.
In addition, the creation of this structure is performed in the spawn
function. If one of the creations of the objects following it fails, it
is supposed to be destroyed but this does not happen.
This patch moves its release to the sharing device context free function
and thus solves both problems.
Fixes: 0af8a2298a ("net/mlx5: release connection tracking management")
Fixes: ee9e5fad03 ("net/mlx5: initialize connection tracking management")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Multiple PMDs have dummy/noop Rx/Tx packet burst functions.
These dummy functions are very simple, introduce a common function in
the ethdev and update drivers to use it instead of each driver having
its own functions.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
In the striding RQ management there are two important parameters, the
size of the single stride in bytes and the number of strides.
Both the data-path structure and config structure keep the log of the
above parameters. However, in their names there is no mention that the
value is a log which may be misleading as if the fields represent the
values themselves.
This patch updates their names describing the values more accurately.
Fixes: ecb160456a ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The mlx5_args function reads the devargs and checks if they are valid
for this driver and if not it returns an error.
This was normal behavior as long as all the devargs come to this driver,
but since it is possible to run several drivers together, the function
may return an error for another driver's devarg even though it is
completely valid.
In addition the function does not allow the user to know which of the
devargs is incorrect, but returns an error without printing the
unknown devarg.
This patch eliminates the error return in the case of an unknown devarg,
and prints a warning for each such devarg specifically.
Fixes: 7b4f1e6bd3 ("common/mlx5: introduce common library")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The attribute to record the global control of hairpin queues' delay
drop was defined as a bit-field with one bit, and the intention was
to reduce the memory overhead. In the meanwhile, the macro was
defined as an enumerated value 0x2.
No matter what value inputted via devarg, the lowest bit was always
zero and the higher bits would be ignored. For hairpin queues, the
delay drop attribute couldn't be enabled.
With the commit, the double logical negation is used to fix this.
Fixes: febcac7b46 ("net/mlx5: support Rx queue delay drop")
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Due to kernel driver / FW issues in direct MKEY creation using the DevX
API, this patch replaces the counter MR creation to use wrapped mkey
API.
Fixes: 5382d28c21 ("net/mlx5: accelerate DV flow counter transactions")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
The rdma-core library can map doorbell register in two ways, depending
on the environment variable "MLX5_SHUT_UP_BF":
- as regular cached memory, the variable is either missing or set to
zero. This type of mapping may cause the significant doorbell
register writing latency and requires an explicit memory write
barrier to mitigate this issue and prevent write combining.
- as non-cached memory, the variable is present and set to not "0"
value. This type of mapping may cause performance impact under
heavy loading conditions but the explicit write memory barrier is
not required and it may improve core performance.
The UAR creation function maps a doorbell in one of the above ways
according to the system. In run time, it always adds an explicit memory
barrier after writing to.
In cases where the doorbell was mapped as non-cached memory, the
explicit memory barrier is unnecessary and may impair performance.
The commit [1] solved this problem for a Tx queue. In run time, it
checks the mapping type and provides the memory barrier after writing to
a Tx doorbell register if it is needed. The mapping type is extracted
directly from the uar_mmap_offset field in the queue properties.
This patch shares this code between the drivers and extends the above
solution for each of them.
[1] commit 8409a28573
("net/mlx5: control transmit doorbell register mapping")
Fixes: f8c97babc9 ("compress/mlx5: add data-path functions")
Fixes: 8e196c08ab ("crypto/mlx5: support enqueue/dequeue operations")
Fixes: 4d4e245ad6 ("regex/mlx5: support enqueue")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The Tx doorbell has different virtual addresses per process.
The secondary process takes the UAR physical page ID of the primary and
mmap it to its own virtual address.
The primary doorbell references were saved in two shared memory
locations: the TxQ structure and a dedicated doorbell array.
Remove the doorbell reference from the TxQ structure and move the
primary processes to take the UAR information from the primary doorbell
array.
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>