Commit Graph

29 Commits

Author SHA1 Message Date
Wisam Jaddo
97544f85bd app/flow-perf: simplify objects initialization
Since items are static then the default values will be zero,
thus the memset to zero value is just a redundant code.

Also remove the all not needed variables, that can be replaced
with direct set to the structure itself.

Fixes: bf3688f1e8 ("app/flow-perf: add insertion rate calculation")
Cc: stable@dpdk.org

Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
2021-01-07 15:52:29 +01:00
Wisam Jaddo
b9a9404fa9 app/flow-perf: change clock measurement
The clock() function is not good practice to use for multiple
cores/threads, since it measures the CPU time used by the process
and not the wall clock time, while when running through multiple
cores/threads simultaneously, we can burn through CPU time much
faster.

As a result this commit will change the way of measurement to use
rd_tsc, and the results will be divided by the processor frequency.

Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
2021-01-07 15:28:06 +01:00
Wisam Jaddo
070316d01d app/flow-perf: add multi-core rule insertion and deletion
One of the ways to increase the insertion/deletion rate is to use
multi-threaded insertion/deletion. Thus it's needed to have support
for testing and measure those rates using flow-perf application.

Now we generate cores and distribute all flows to those cores,
and start inserting/deleting in parallel.

The app now receive the cores count to use from command line option,
then it distribute the rte_flow rules evenly between the cores, and
start inserting/deleting. Each worker will report it's own results,
and in the end the MAIN worker will report the total results for all
cores.

The total results are calculated using RULES_COUNT divided over
max time used between all cores.

Also this touches the memory area, since inserting using multiple cores
in same time the pre solution for memory is not valid, thus now we save
memory before and after each allocation for all cores. In the end we
pick the min pre memory and the max post memory from all cores.

The difference between those values represent the total memory consumed
by the total rte_flow rules from all cores, and then report the total
size of single rte_flow in byte for each port.

How to use this feature:
--cores=N

Where 1 =< N <= RTE_MAX_LCORE

Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
2021-01-07 15:14:02 +01:00
Wisam Jaddo
8ccb4e3ef1 app/flow-perf: refactor flows handler
Provide the flows_handler() function the ability to control
flow performance processes. It is made possible after the
introduction of the insert_flows() function.

Also provide to the flows_handler() function the ability to print
the DPDK layer memory consumption of rte_flow rule, regardless
if deletion feature is enabled or not, while in previous
solution it was printing all memory changes after flows_handler().
Thus if deletion is there, it will not provide any memory that
represents the rte_flow rule size.

Also current design is easier to read and understand.

Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
2021-01-07 15:13:58 +01:00
Xiaoyu Min
8b91a7cd2c app/flow-perf: fix raw encapsulation size
The rte_flow_item_eth and rte_flow_item_vlan items are refined.
The structs do not exactly represent the packet bits captured on the
wire anymore so add_*_header functions should use real header instead of
the using rte_flow_item_* struct.

Replace the rte_flow_item_* with the existing corresponding rte_*_hdr.

Fixes: 09315fc838 ("ethdev: add VLAN attributes to ethernet and VLAN items")

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
2020-11-22 17:07:34 +01:00
Georgios Katsikas
4c0708ab7e app/flow-perf: configure rule batches
Currently, flow-perf measures the performance of
rule installation/deletion operations by breaking
down the entire number of operations into windows
of fixed size (i.e., 100000 operations per window).
Then, flow-perf measures the total time per window
and computes an average time across all windows.

This commit allows flow-perf users to configure
the number of rules per window instead of using
a fixed pre-compiled value. To do so, users must
pass --rules-batch=N, where N is the number of
rules per window (or batch).
For consistency reasons, flow_count variable is
now renamed to rules_count. This variable is the
total number of rules to be installed/deleted.

For example, if a user wants to measure how much
time it takes to install 1M rules in a certain NIC,
he/she can input:
--rules-count=1000000
This way flow-perf will break down 1M flow rules into
10 batches of 100k flow rules each (this is the default
batch size) and compute an average across the 10
measurements.
Now, if the user modifies the number of rules per
batch as follows:
--rules-count=1000000 --rules-batch=500000
then flow-perf will break down 1M flow rules into
2 batches of 500k flow rules each and compute the
average across the 2 measurements.

Finally, this commit also adds default variables
to the usage function instead of hardcoded values.

Signed-off-by: Georgios Katsikas <katsikas.gp@gmail.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
2020-11-04 21:17:35 +01:00
Stephen Hemminger
cb056611a8 eal: rename lcore master and slave
Replace master lcore with main lcore and
replace slave lcore with worker lcore.

Keep the old functions and macros but mark them as deprecated
for this release.

The "--master-lcore" command line option is also deprecated
and any usage will print a warning and use "--main-lcore"
as replacement.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-10-20 13:17:08 +02:00
Ivan Ilchenko
5a3f9f7f9c app/flow-perf: check stop call status
rte_eth_dev_stop() return value was changed from void to int,
so this patch modify usage of this function across app/flow-perf
according to new return type.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-10-16 22:26:41 +02:00
Wisam Jaddo
9783092022 app/flow-perf: allow fixed values for actions
Sometime the user want to have fixed values of
encap/decap or header modify for all flows.

This will introduce the ability to choose from
fixed or dynamic values by setting the flag in
config.h

To use different value for each flow:
config.h: #define FIXED_VALUES 0

To use single value for all flows:
config.h: #define FIXED_VALUES 1

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
7bcd402d7e app/flow-perf: support ICMP matching
Start support matching on icmpv4 and icmpv6.

Usage:
--icmpv4: add icmp item to match on.
--icmpv6: add icmpv6 item to match on.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
325bd805e4 app/flow-perf: add port mask option
Sometimes you need to check flow performance for
certain port and not all ports. Thus a portmask
option is needed.

Usage:
--portmask=N

Where N represent the hexadecimal bitmask of ports
used.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
b60fceb5ef app/flow-perf: add random mark values
Instead of having single id value, use up to 256
values, thus we make sure that all flows will not
use same mark action.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
cfa7554de1 app/flow-perf: fix IPv4 source matching
All value must be converted into intended endianness.

Fixes: bf3688f1e8 ("app/flow-perf: add insertion rate calculation")
Cc: stable@dpdk.org

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
0a0757a0db app/flow-perf: support VXLAN encap/decap actions
Introduce vxlan-encap and vxlan-decap actions.

vxlan-encap have fixed pattern and values for
encap data.

Usage example:
--vxlan-encap
--vxlan-decap

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
0c8f1f4ab9 app/flow-perf: support raw encap/decap actions
Introduce raw-encap and raw-decap actions.
The two actions are added in command line
options, and for the data to encap or decap
the user need to parse it within the command
line.

All values of raw-encap data is set to be fixed
values.

Usage example:

--raw-encap=ether,ipv4,udp,vxlan

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:11 +02:00
Wisam Jaddo
b777d9d046 app/flow-perf: fix memory leak from RSS action
Currently, each call for add_rss_action will allocate
extra memory for rss_data, which will reflect bad results
on memory consumption for all flows, and will leads into
memory leak.

In this fix, it will check if it's allocated before
reallocating it.

Fixes: bf3688f1e8 ("app/flow-perf: add insertion rate calculation")
Cc: stable@dpdk.org

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:10 +02:00
Wisam Jaddo
d71bc9e99c app/flow-perf: support flag action
Introduce flag action support to flow perf
application.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:10 +02:00
Wisam Jaddo
ef9ae0cf57 app/flow-perf: support header modify actions
Introduce headers modify actions in the app.
All header modify actions will add different value
for each flow, to make sure each flow will create
and use it's own actions.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:10 +02:00
Wisam Jaddo
9001a863f4 app/flow-perf: support user order
The old design was using the bit mask to identify
items, action and attributes.

So it was all based on the order of the code itself,
to place the order of the actions, items & attributes
inside the flows. Such design will lead into many failures
when some PMD support order different than other PMD,
in the end the rules will fail to create. Also sometimes
the user needs to have one action before other actions
and vice versa, so using new design of arrays that
take user order into consideration make more sense.

After this patch, we start supporting inner items
and more than one instance of same action.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:10 +02:00
Wisam Jaddo
849543648e app/flow-perf: fix actions mask
Actions have it's own macro which is FLOW_ACTION_MASK

Fixes: bf3688f1e8 ("app/flow-perf: add insertion rate calculation")
Cc: stable@dpdk.org

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Alexander Kozyrev <akozyrev@nvidia.com>
2020-09-18 18:55:10 +02:00
Ciara Power
3cc6ecfdfe build: remove makefiles
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.

[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:09:50 +02:00
Wisam Jaddo
e7554ebd07 app/flow-perf: fix typo in usage help
From hairping-rss into hairpin-rss.

Fixes: bf3688f1e8 ("app/flow-perf: add insertion rate calculation")

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
2020-07-19 15:14:50 +02:00
Wisam Jaddo
c75f036723 app/flow-perf: fix hairpin queues setup
The hairpin queue is the one that start from normal rxq,
and will be less than nr_queues where nr_queues is the
sum of normal and hairpin.

Fixes: bf3688f1e8 ("app/flow-perf: add insertion rate calculation")

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Reviewed-by: Asaf Penso <asafp@mellanox.com>
2020-07-19 15:10:30 +02:00
Thomas Monjalon
f7a4996c04 app/flow-perf: use macro for cache alignment
The macro __rte_cache_aligned is better suited for aligning
a structure on a cache line (of any size).

Fixes: 15c4318640 ("app/flow-perf: add packet forwarding support")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Wisam Jaddo <wisamm@mellanox.com>
2020-06-30 11:57:46 +02:00
Wisam Jaddo
15c4318640 app/flow-perf: add packet forwarding support
Introduce packet forwarding support to the app to do
some performance measurements.

The measurements are reported in term of packet per
second unit. The forwarding will start after the end
of insertion/deletion operations.

The support has single and multi performance measurements.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
662a72342a app/flow-perf: add memory dump to app
Introduce new feature to dump memory statistics of each socket
and a total for all before and after the creation.

This will give two main advantage:
1- Check the memory consumption for large number of flows
"insertion rate scenario alone"

2- Check that no memory leackage after doing insertion then
deletion.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
c12f4f217d app/flow-perf: add deletion rate calculation
Add the ability to test deletion rate for flow performance
application.

This feature is disabled by default, and can be enabled by
add "--deletion-rate" in the application command line options.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
bf3688f1e8 app/flow-perf: add insertion rate calculation
Add insertion rate calculation feature into flow
performance application.

The application now provide the ability to test
insertion rate of specific rte_flow rule, by
stressing it to the NIC, and calculate the
insertion rate.

The application offers some options in the command
line, to configure which rule to apply.

After that the application will start producing
rules with same pattern but increasing the outer IP
source address by 1 each time, thus it will give
different flow each time, and all other items will
have open masks.

The current design have single core insertion rate.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00
Wisam Jaddo
3344cf2e30 app/flow-perf: add flow performance skeleton
Add flow performance application skeleton.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
2020-06-29 15:47:36 +02:00