Commit Graph

6 Commits

Author SHA1 Message Date
Michael Baum
d37435dc3f net/mlx5: assert for enough space in counter rings
There is a by-design assumption in the code that the global counter
rings can contain all the port counters.
So, enqueuing to these global rings should always succeed.

Add assertions to help for debugging this assumption.

In addition, change mlx5_hws_cnt_pool_put() function to return void due
to those assumptions.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
2022-11-10 18:15:45 +01:00
Michael Baum
77ca194b4e net/mlx5: add assertions in counter get/put for HWS
Add assertions to help debug in case of counter double alloc/free.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
2022-11-10 18:15:45 +01:00
Michael Baum
2fd25a6d13 net/mlx5: fix counter elements copies for HWS
The __hws_cnt_r2rcpy() function copies elements from one zero-copy ring
to another zero-copy ring in place.
This routine needs to consider the situation that the address was given
by source and destination could be both wrapped.

It uses 4 different "n" local variables to manage it:
 - n:  Number of elements to copy in total.
 - n1: Number of elements to copy from ptr1, it is the minimal value
       from source/dest n1 field.
 - n2: Number of elements to copy from src->ptr1 to dst->ptr2 or from
       src->ptr2 to dst->ptr1, this variable is 0 when both source and
       dest n1 field are equal.
 - n3: Number of elements to copy from src->ptr2 to dst->ptr2.

The function copies the first n1 elements. If n2 isn't zero it copies
more elements and check whether n3 is zero.
This logic is wrong since n3 may be bigger than zero even when n2 is
zero. This scenario is commonly happening in counters when the internal
mlx5 service thread copies elements from the reset ring into the reuse
ring.

This patch changes the function to copy n3 regardless of n2 value.

Fixes: 4d368e1da3 ("net/mlx5: support flow counter action for HWS")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
2022-11-10 18:15:45 +01:00
Michael Baum
5b21f92556 net/mlx5: fix counter access for HWS
The HWS counter has 2 different identifiers:
1. Type "cnt_id_t" which represents the counter inside caches and in
   the flow structure. This index cannot be zero and is mostly called
   "cnt_id".
 2. Internal index, the index in counters array with type "uint32_t".
    mostly it is called "iidx".
The second ID is calculated from the first using "mlx5_hws_cnt_iidx()"
function.

When a direct counter is allocated, if the queue cache is not empty, the
counter represented by cnt_id is popped from the cache. This counter may
be invalid according to the query_gen field. Thus, the "iidx" is parsed
from cnt_id and if it is valid, it is used to update the fields of the
counter structure.
When this counter is invalid, all the cache is flashed and new counters
are fetched into the cache. After fetching, another counter represented
by cnt_id is taken from the cache.
Unfortunately, for updating fields like "in_used" or "age_idx", the
function wrongly may use the old "iidx" coming from an invalid cnt_id.

Update the "iidx" in case of an invalid counter popped from the cache.

Fixes: 4d368e1da3 ("net/mlx5: support flow counter action for HWS")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
2022-11-10 18:15:44 +01:00
Michael Baum
04a4de756e net/mlx5: support flow age action with HWS
Add support for AGE action for HW steering.
This patch includes:

 1. Add new structures to manage aging.
 2. Initialize all of them in configure function.
 3. Implement per second aging check using CNT background thread.
 4. Enable AGE action in flow create/destroy operations.
 5. Implement a queue-based function to report aged flow rules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-10-26 13:33:41 +02:00
Xiaoyu Min
4d368e1da3 net/mlx5: support flow counter action for HWS
This commit adds HW steering counter action support.
The pool mechanism is the basic data structure for the HW steering
counter.

The HW steering's counter pool is based on the rte_ring of zero-copy
variation.

There are two global rte_rings:
1. free_list:
     Store the counters indexes, which are ready for use.
2. wait_reset_list:
     Store the counters indexes, which are just freed from the user and
     need to query the hardware counter to get the reset value before
     this counter can be reused again.

The counter pool also supports cache per HW steering's queues, which are
also based on the rte_ring of zero-copy variation.

The cache can be configured in size, preload, threshold, and fetch size,
they are all exposed via device args.

The main operations of the counter pool are as follows:

 - Get one counter from the pool:
   1. The user call _get_* API.
   2. If the cache is enabled, dequeue one counter index from the local
      cache:
      2. A: if the dequeued one from the local cache is still in reset
        status (counter's query_gen_when_free is equal to pool's query
        gen):
        I. Flush all counters in the local cache back to global
           wait_reset_list.
        II. Fetch _fetch_sz_ counters into the cache from the global
            free list.
        III. Fetch one counter from the cache.
   3. If the cache is empty, fetch _fetch_sz_ counters from the global
      free list into the cache and fetch one counter from the cache.
 - Free one counter into the pool:
   1. The user calls _put_* API.
   2. Put the counter into the local cache.
   3. If the local cache is full:
      A: Write back all counters above _threshold_ into the global
         wait_reset_list.
      B: Also, write back this counter into the global wait_reset_list.

When the local cache is disabled, _get_/_put_ cache directly from/into
global list.

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-10-26 13:33:39 +02:00