numam-spdk/test/nvme/perf
Tomasz Zawadzki 1c9125e78e test: add bdev_wait_for_examine to static JSON configs
Runtime RPCs such as bdev creation has no chance to wait for
bdev examination to finish. To handle this case commit below
introduced bdev_wait_for_examine RPC, and built it into all newly
saved JSON configurations:
(e57bb1af)lib/bdev: build bdev_wait_for_examine into subsystem

Some tests generate the configuration by hand, rather than
saving it from an existing application.
This patch embeds this RPC into the test configs.

Fixes #1760

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I79f998d722a2d19aa98b78333c64dbd2c1151444
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7861
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-05-13 10:07:07 +00:00
common.sh test: add bdev_wait_for_examine to static JSON configs 2021-05-13 10:07:07 +00:00
config.fio.tmp test/nvme_perf: move fio config generation into one place 2020-07-24 09:40:49 +00:00
README.md Fix Markdown MD022 linter warnings - headers blank lines 2020-02-17 10:07:21 +00:00
run_perf.sh test/nvme_perf: add option for alternative fio job layout 2021-01-18 13:02:49 +00:00

Automated script for NVMe performance test

Compile SPDK with LTO

The link time optimization (LTO) gcc flag allows the linker to run a post-link optimization pass on the code. During that pass, the linker inlines thin wrappers, such as those around DPDK calls, which results in a shallow call stack and significantly improves performance. We therefore recommend compiling SPDK with the LTO flag before running this benchmark script to achieve optimal performance. Link time optimization can be enabled in SPDK as follows:

~~~{.sh}
./configure --enable-lto
~~~

Configuration

The test is configured using command-line options.
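As a rough illustration, a full invocation might be assembled as below. This is a sketch only: every value is a placeholder, the fio path is an assumption about the local install, and the `--name=value` form follows the `--cpu-allowed=...` example given later in this document. The command is printed rather than executed.

```shell
# Placeholder values; adjust to the system under test.
perf_args=(
    --driver=bdev
    --disk-no=4
    --cpu-allowed=1,2
    --rw=randrw
    --rwmixread=70
    --iodepth=128
    --block-size=4096
    --numjobs=1
    --run-time=600
    --ramp-time=60
    --repeat-no=3
    --fio-bin=/usr/src/fio/fio   # assumed location of the fio binary
)

# Dry run: print the command line. Remove "echo" to actually run the script.
echo ./run_perf.sh "${perf_args[@]}"
```

Keeping the options in an array makes it easy to comment out or swap a single option between runs.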

Available options

-h, --help

Prints available commands and help.

--run-time

Tell fio to terminate processing after the specified period of time. Value in seconds.

--ramp-time

Fio will run the specified workload for this amount of time before logging any performance numbers. Value in seconds.

--fio-bin

Path to fio binary.

--driver

Selects between the SPDK driver and the Linux kernel driver. The kernel driver has three configurations: default mode, hybrid polling and classic polling. The SPDK driver supports two fio_plugin modes: bdev and NVMe PMD. Before running the test with SPDK, bind the NVMe devices to the Linux uio_pci_generic or vfio-pci driver. When running the test with the kernel driver, the NVMe devices use the kernel driver. The five valid values for this option are: 'bdev', 'nvme', 'kernel-libaio', 'kernel-classic-polling' and 'kernel-hybrid-polling'.
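The five values fall into two families, which a wrapper script could check before launching a run. The helper below is a minimal sketch; `driver_family` is a hypothetical name and is not part of run_perf.sh.

```shell
# Hypothetical helper: classify a --driver value as SPDK or kernel.
# Prints "invalid" and returns non-zero for anything else.
driver_family() {
    case "$1" in
        bdev | nvme)
            echo "spdk" ;;
        kernel-libaio | kernel-classic-polling | kernel-hybrid-polling)
            echo "kernel" ;;
        *)
            echo "invalid"
            return 1 ;;
    esac
}

driver_family nvme                    # prints: spdk
driver_family kernel-hybrid-polling   # prints: kernel
```

The family determines which driver the NVMe devices need to be bound to before the test starts.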

--max-disk

Runs multiple fio jobs with a varying number of NVMe devices. The script starts with max-disk devices, then decreases the number of disks by two until no devices are left. If set to 'all', max-disk is set to the number of all available devices. Only one of the max-disk and disk-no options can be used.
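The sequence of disk counts described above can be sketched as a simple countdown. `disk_sweep` is a hypothetical helper written for illustration; the loop in run_perf.sh may be structured differently.

```shell
# Hypothetical helper: list the disk counts a --max-disk sweep would visit,
# starting at the given maximum and stepping down by two.
disk_sweep() {
    local max=$1 n
    local counts=()
    for ((n = max; n >= 1; n -= 2)); do
        counts+=("$n")
    done
    echo "${counts[*]}"
}

disk_sweep 8   # prints: 8 6 4 2
disk_sweep 5   # prints: 5 3 1
```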

--disk-no

Runs the fio job on the specified number of NVMe devices. If set to 'all', the number is set to all available devices. Only one of the max-disk and disk-no options can be used.

--cpu-allowed

Specifies the CPU cores that fio will use to execute the performance test cases. When the SPDK driver is chosen, the script attempts to assign NVMe devices to CPU cores on the same NUMA node. The script first tries to align each core with devices on the core's own NUMA node; if there are no devices left on that node, it uses devices from the other NUMA node. It is important to choose cores that ensure the best NUMA node alignment.

For example, consider a system with 8 devices on NUMA node 0 and 8 devices on NUMA node 1, cores 0-27 on NUMA node 0 and cores 28-55 on NUMA node 1. If the test is set to use 16 disks and four cores, then "--cpu-allowed=1,2,28,29" can be used, resulting in 4 devices from node 0 for each of cores 1 and 2, and 4 devices from node 1 for each of cores 28 and 29. If 10 cores are required, then the best option would be "--cpu-allowed=1,2,3,4,28,29,30,31,32,33" because cores 1-4 will be aligned with 2 devices on node 0 per core and cores 28-33 will be aligned with 1 device on node 1 per core.

If the kernel driver is chosen, then for each job with an NVMe device, all CPU cores on the corresponding NUMA node are picked.
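The "same node first, other node as fallback" rule can be sketched in bash as follows. This is a simplified illustration, not the code from common.sh: `assign_disks` is a hypothetical helper, it assumes the same number of disks for every core, and the `name:node` input format and the disk names `d0`..`d15` are invented for the example.

```shell
# Hypothetical helper (requires bash).
# Usage: assign_disks <disks_per_core> "<core:node ...>" "<disk:node ...>"
assign_disks() {
    local per_core=$1
    local -a cores=($2) disks=($3)
    local -a used=()
    local c core cnode i picked pass line

    for c in "${cores[@]}"; do
        core=${c%%:*}
        cnode=${c##*:}
        picked=0
        line="core $core:"
        # Pass 0 takes only unused disks on the core's own NUMA node.
        # Pass 1 takes any unused disk, i.e. falls back to the other node.
        for pass in 0 1; do
            for i in "${!disks[@]}"; do
                ((picked >= per_core)) && break
                [[ -n ${used[$i]:-} ]] && continue
                if ((pass == 1)) || [[ ${disks[$i]##*:} == "$cnode" ]]; then
                    used[$i]=1
                    line+=" ${disks[$i]%%:*}"
                    picked=$((picked + 1))
                fi
            done
        done
        echo "$line"
    done
}

# The four-core example: 8 disks on node 0, 8 disks on node 1.
disks=""
for i in {0..7}; do disks+="d$i:0 "; done
for i in {8..15}; do disks+="d$i:1 "; done
assign_disks 4 "1:0 2:0 28:1 29:1" "$disks"
```

With these inputs, cores 1 and 2 each receive four node 0 disks and cores 28 and 29 each receive four node 1 disks, matching the four-core case described above. Listing only node 0 cores instead would force the later cores onto node 1 disks, which is the misalignment the option description warns about.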

--rw

Type of I/O pattern. Accepted values are: randrw, rw

--rwmixread

Percentage of a mixed workload that should be reads.

--iodepth

Number of I/O units to keep in flight against each file.

--block-size

The block size in bytes used for I/O units.

--numjobs

Create the specified number of clones of a job.
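Several of the options above correspond to fio job parameters of the same or similar name (`runtime`, `ramp_time`, `rw`, `rwmixread`, `iodepth`, `bs`, `numjobs`). The sketch below shows one plausible mapping; `print_fio_global` is a hypothetical helper, and the layout of the configuration that the script actually generates may differ, for instance in which section each parameter is placed.

```shell
# Hypothetical helper: emit a fio [global] section from the option values.
# Argument order: run-time ramp-time rw rwmixread iodepth block-size numjobs
print_fio_global() {
    local run_time=$1 ramp_time=$2 rw=$3 rwmixread=$4
    local iodepth=$5 bs=$6 numjobs=$7
    cat <<EOF
[global]
runtime=${run_time}
ramp_time=${ramp_time}
rw=${rw}
rwmixread=${rwmixread}
iodepth=${iodepth}
bs=${bs}
numjobs=${numjobs}
EOF
}

# Example: 600 s run, 60 s ramp, 70% reads, queue depth 128, 4096-byte blocks.
print_fio_global 600 60 randrw 70 128 4096 1
```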

--repeat-no

Specifies how many times to run each workload. The final results are averages across these runs.

--no-preconditioning

By default, disks are preconditioned before the test using fio with the parameters: size=100%, loops=2, bs=1M, rw=write, iodepth=32, ioengine=spdk. Preconditioning is skipped when this option is set.

--no-io-scaling

For the SPDK fio plugin, iodepth is multiplied by the number of devices. Setting this option disables that multiplication.
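The scaling rule amounts to a single multiplication. A minimal sketch, using a hypothetical helper `effective_iodepth` that is not taken from the script:

```shell
# Hypothetical helper: compute the queue depth used for the SPDK fio plugin.
# Usage: effective_iodepth <iodepth> <number_of_devices> [no_io_scaling]
effective_iodepth() {
    local iodepth=$1 devices=$2 no_io_scaling=${3:-false}
    if [[ $no_io_scaling == true ]]; then
        echo "$iodepth"
    else
        echo $((iodepth * devices))
    fi
}

effective_iodepth 128 4        # prints: 512
effective_iodepth 128 4 true   # prints: 128
```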

Results

Results are stored in the "results" folder. After each workload, the following files are copied to this folder: the fio configuration file, JSON files with fio results, and latency logs sampled at a 250 ms interval. The number of copied files depends on the number of repeats of each workload. Additionally, a CSV file is created with the averaged results of all workloads.
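Averaging a metric over the repeats of one workload can be sketched as below. `average` is a hypothetical helper and the input numbers are made up; the script's own aggregation and the columns of its CSV file are not reproduced here.

```shell
# Hypothetical helper: arithmetic mean of the values passed as arguments,
# printed with two decimal places.
average() {
    if (($# == 0)); then
        return 1
    fi
    printf '%s\n' "$@" |
        LC_ALL=C awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# Three repeats of the same workload (made-up values).
average 100 110 120   # prints: 110.00
```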