numam-dpdk/doc/guides/sample_app_ug/packet_ordering.rst
Phil Yang 016493307a examples/packet_ordering: add stats per worker thread
The current implementation uses the '__sync' built-ins to synchronize
statistics across worker threads. The '__sync' built-in functions are
full barriers, which hurts performance, so add per-worker packet
statistics to remove the synchronisation between worker threads.

Since the maximum core count can reach 256, disable the per-core
stats print by default and add the --insight-worker option to enable it.

For example:
sudo examples/packet_ordering/arm64-armv8a-linuxapp-gcc/packet_ordering \
-l 112-115 --socket-mem=1024,1024 -n 4 -- -p 0x03 --insight-worker

RX thread stats:
 - Pkts rxd:                            226539223
 - Pkts enqd to workers ring:           226539223

Worker thread stats on core [113]:
 - Pkts deqd from workers ring:         77557888
 - Pkts enqd to tx ring:                77557888
 - Pkts enq to tx failed:               0

Worker thread stats on core [114]:
 - Pkts deqd from workers ring:         148981335
 - Pkts enqd to tx ring:                148981335
 - Pkts enq to tx failed:               0

Worker thread stats:
 - Pkts deqd from workers ring:         226539223
 - Pkts enqd to tx ring:                226539223
 - Pkts enq to tx failed:               0

TX stats:
 - Pkts deqd from tx ring:              226539223
 - Ro Pkts transmitted:                 226539168
 - Ro Pkts tx failed:                   0
 - Pkts transmitted w/o reorder:        0
 - Pkts tx failed w/o reorder:          0

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
2019-07-08 16:33:06 +02:00


..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2015 Intel Corporation.

Packet Ordering Application
============================

The Packet Ordering sample application shows the impact of reordering a packet stream.
It is meant to stress the reorder library with different configurations so that
its performance can be evaluated.


Overview
--------

The application uses at least three CPU cores:

* RX core (master core) receives traffic from the NIC ports and feeds Worker
  cores with traffic through software queues.

* Worker cores (slave cores) do some light work on each packet.
  Currently they modify the output port of the packet for configurations with
  more than one port enabled.

* TX core (slave core) receives traffic from Worker cores through software queues,
  inserts out-of-order packets into the reorder buffer, extracts ordered packets
  from the reorder buffer and sends them to the NIC ports for transmission.

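The insert-then-drain behaviour of the TX core can be illustrated with a small, self-contained sketch. It is not the DPDK reorder library API: the ``reorder_window`` structure, the ``WINDOW`` size and both functions are hypothetical names chosen for illustration, and a plain sequence number stands in for a packet pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW 8u /* hypothetical window size; must be a power of two */

/* Hypothetical minimal reorder window, not the DPDK rte_reorder API. */
struct reorder_window {
    uint32_t min_seq;         /* next sequence number expected on output */
    bool     present[WINDOW]; /* slot occupancy */
    uint32_t seq[WINDOW];     /* stands in for a packet pointer */
};

/* Store a packet by sequence number; false if it falls outside the window. */
static bool window_insert(struct reorder_window *w, uint32_t seq)
{
    uint32_t offset = seq - w->min_seq; /* unsigned, so late packets wrap to a huge value */

    if (offset >= WINDOW)
        return false;                   /* too late or too far ahead */

    uint32_t slot = seq & (WINDOW - 1);
    w->present[slot] = true;
    w->seq[slot] = seq;
    return true;
}

/* Copy out the longest in-order run starting at min_seq; returns its length. */
static size_t window_drain(struct reorder_window *w, uint32_t *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        uint32_t slot = w->min_seq & (WINDOW - 1);

        if (!w->present[slot])
            break;                      /* gap: wait for the missing packet */
        out[n++] = w->seq[slot];
        w->present[slot] = false;
        w->min_seq++;
    }
    return n;
}
```

With a zero-initialised window, inserting sequence numbers 2, 0 and 1 in that order releases nothing after the first insert, then packet 0, then packets 1 and 2 together once the gap is filled.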

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``packet_ordering`` sub-directory.


Running the Application
-----------------------

Refer to the *DPDK Getting Started Guide* for general information on running applications
and the Environment Abstraction Layer (EAL) options.


Application Command Line
~~~~~~~~~~~~~~~~~~~~~~~~

The application execution command line is:

.. code-block:: console

    ./packet_ordering [EAL options] -- -p PORTMASK [--disable-reorder] [--insight-worker]

The ``-c`` EAL CPU_COREMASK option must contain at least 3 CPU cores.
The first CPU core in the core mask is the master core and is assigned to the
RX core, the last to the TX core, and the rest to Worker cores.

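The role assignment described above can be sketched as follows. This is an illustration only, not the application's code: ``assign_roles``, ``enum core_role`` and the 64-core ``MAX_CORES`` limit are hypothetical names and values introduced here.

```c
#include <stdint.h>

enum core_role { ROLE_NONE, ROLE_RX, ROLE_WORKER, ROLE_TX };

#define MAX_CORES 64 /* hypothetical limit: one bit per core in a 64-bit mask */

/*
 * Fill roles[] from a core mask: first enabled core is RX, last is TX,
 * every enabled core in between is a worker.
 * Returns the number of worker cores, or -1 if fewer than three cores are set.
 */
static int assign_roles(uint64_t coremask, enum core_role roles[MAX_CORES])
{
    int first = -1, last = -1, enabled = 0;

    for (int core = 0; core < MAX_CORES; core++) {
        roles[core] = ROLE_NONE;
        if (coremask & (UINT64_C(1) << core)) {
            if (first < 0)
                first = core;
            last = core;
            enabled++;
        }
    }
    if (enabled < 3)
        return -1;

    for (int core = first + 1; core < last; core++)
        if (coremask & (UINT64_C(1) << core))
            roles[core] = ROLE_WORKER;
    roles[first] = ROLE_RX;
    roles[last] = ROLE_TX;
    return enabled - 2;
}
```

For a mask of ``0x1E`` (cores 1 to 4), core 1 becomes the RX core, core 4 the TX core, and cores 2 and 3 the workers; a mask of ``0x3`` is rejected because it enables only two cores.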
The PORTMASK parameter must enable either a single port or an even number of ports.
When more than one port is enabled, traffic is forwarded in pairs.
For example, if 4 ports are enabled, traffic goes from port 0 to 1 and from 1 to 0,
and for the other pair from 2 to 3 and from 3 to 2, giving the pairs [0,1] and [2,3].

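One simple way to express the port-count rule and the pairing is shown below. This is a sketch under two assumptions that the text does not state: the enabled ports are numbered contiguously from 0, as in the 4-port example, and pairing is done by flipping the lowest bit of the port number. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the set bits in the mask, i.e. the number of enabled ports. */
static unsigned int count_ports(uint32_t portmask)
{
    unsigned int n = 0;

    for (; portmask != 0; portmask &= portmask - 1) /* clear lowest set bit */
        n++;
    return n;
}

/* A mask is acceptable if it enables one port or an even number of ports. */
static bool portmask_is_valid(uint32_t portmask)
{
    unsigned int n = count_ports(portmask);

    return n == 1 || (n != 0 && n % 2 == 0);
}

/*
 * Output port for a packet received on in_port.
 * Assumption: ports are contiguous from 0, so flipping bit 0 yields
 * the pairs [0,1], [2,3], ... With a single port the packet goes back
 * out of the port it arrived on.
 */
static uint16_t paired_port(uint16_t in_port, unsigned int nb_ports)
{
    return nb_ports > 1 ? (uint16_t)(in_port ^ 1u) : in_port;
}
```

With ``-p 0xF`` the mask is valid (four ports) and ports 0, 1, 2, 3 map to 1, 0, 3, 2 respectively; a mask of ``0x7`` enables three ports and would be rejected.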
The ``--disable-reorder`` long option, as its name implies, disables the reordering
of traffic, which helps evaluate the performance impact of reordering.

The ``--insight-worker`` long option enables printing of the packet statistics of each worker thread.
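The idea behind per-worker statistics is that each worker updates only its own counters, so no atomic operations are needed on the fast path, and the totals are computed only when the statistics are printed. A minimal sketch follows; the structure layout and the names ``worker_stats`` and ``sum_worker_stats`` are assumptions made for illustration and may differ from the application's actual code.

```c
#include <stdint.h>

/* Hypothetical per-worker counters; each worker writes only its own entry. */
struct worker_stats {
    uint64_t dequeue_pkts;        /* packets dequeued from the workers ring */
    uint64_t enqueue_pkts;        /* packets enqueued to the TX ring */
    uint64_t enqueue_failed_pkts; /* packets that could not be enqueued */
};

/* Aggregate the per-worker counters into one total at print time. */
static struct worker_stats
sum_worker_stats(const struct worker_stats *per_worker, unsigned int nb_workers)
{
    struct worker_stats total = {0, 0, 0};

    for (unsigned int i = 0; i < nb_workers; i++) {
        total.dequeue_pkts += per_worker[i].dequeue_pkts;
        total.enqueue_pkts += per_worker[i].enqueue_pkts;
        total.enqueue_failed_pkts += per_worker[i].enqueue_failed_pkts;
    }
    return total;
}
```

Using the figures from the example output above, two workers that dequeued 77557888 and 148981335 packets sum to the 226539223 packets reported in the combined worker statistics. In a real multi-core program, aligning each entry to a cache line is a common refinement to avoid false sharing between workers.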