Nearly all tests now make extensive use of the stub application,
so they're effectively all testing multiprocess all the time.
Further, we believe it to be the best policy to not attempt
to support scenarios where the primary process crashes unexpectedly.
We consider this equivalent to a kernel panic and all of the
processes will need to be halted and restarted.
Given the two things above, we can make some fairly dramatic
simplifications to the NVMe multiprocess testing. Only
one piece of functionality - multiple simultaneous secondary
processes - was not already tested by the other regular
tests. This patch removes all other multiprocess tests
and adds a simple test of multiple secondaries.
Change-Id: If99f85913b99862f02c3815ea7c10cd80ea3ce02
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/368208
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia9ea94a1dc4e583710374d453302319aa59ce62a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368206
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A short delay is required between starting up a primary and a
secondary process with DPDK, depending on what the secondary
depends on with respect to the primary. As the SPDK sample apps
are not designed to be dependent on each other, when we use them
as primary/secondary in test scripts with no deterministic
synchronization, it is possible for one or more to hang,
resulting in fatal DPDK init failures. Often this would
show up as a failure to get hugepages in vtophys.
A related fix for the same failure signature in the same test
script is also included here. The stub app, which is designed
to act as primary in certain sections of the test script, was
being killed by the test script, but the next primary app could
start before the stub process was dead and therefore came up as a
secondary. A wait was added to ensure that the stub process is
gone before the next app tries to start.
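The wait described above can be sketched roughly as follows. This is
an illustration in Python rather than the shell used by the test
scripts; stop_and_wait and the placeholder child process are
hypothetical names and are not part of SPDK.

```python
import subprocess
import sys


def stop_and_wait(proc, timeout=10.0):
    """Ask a process to exit and block until it is really gone."""
    proc.terminate()  # polite request to exit (SIGTERM on POSIX)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # escalate if the process ignores the request
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Placeholder child standing in for the stub (primary) process.
    stub = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    stop_and_wait(stub)
    # Only after this point would the next would-be primary be launched.
```

The point of blocking on wait() is that the next app cannot observe a
half-dead primary and attach to it as a secondary.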
Change-Id: If2f6fc25e76b769ad8edafa8e965be246e98dab9
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/367725
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It seems that ASan does not work well with:
(1) Valgrind. If ASan is enabled, we do not use Valgrind.
(2) The SPDK fio plugin. If ASan is enabled, the plugin cannot be
used, even with the suggested LD_PRELOAD workaround.
(3) Hotplug. If ASan is enabled, it catches the SEGV earlier
than our defined handler.
Change-Id: Id4bd5ae0f545aaba7d028e3da14fdddc18682429
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/364917
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
For now, just hardcode the shm_id to 0 for any test apps
that currently do not support command-line arguments.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic8de44d4badc4c9b8858596b7f55dcc04371371b
Reviewed-on: https://review.gerrithub.io/365732
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Also add the nvme.sh test script to enable this new histogram
when running the overhead tool.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I825de58362ad631808173a1d3d1b4ccb7df3bcf2
Reviewed-on: https://review.gerrithub.io/365265
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
fio has a race between reap_threads() and free_ioengine(). free_ioengine()
will call the ioengine's cleanup routine and then dlclose it if it
is dynamically linked (like the spdk fio plugin). free_ioengine() does
not set td->io_ops = NULL though until after dlclose() is complete. If
reap_threads() tries to dereference td->io_ops after our plugin has been
closed but before io_ops was set to NULL, it will segfault.
The workaround (until an upstream fio fix is available) is to use
LD_PRELOAD instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ide4eb3cb92a636513289107fc211fdf1f98b616f
Reviewed-on: https://review.gerrithub.io/365272
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Should find the root cause later.
Change-Id: I28ef058038c105c03e53555f7b972c75ac7121ae
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/365110
Reviewed-by: GangCao <gang.cao@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
A single -L can be used to get the latency summary.
Two -L's (or -LL) can be used to get both the latency
summary and the detailed histogram.
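A repeated flag of this kind can be modeled as a counter. The sketch
below uses Python's argparse purely for illustration; the real tool is
written in C, and parse_latency_level is a hypothetical helper name.

```python
import argparse


def parse_latency_level(argv):
    """Return how many times -L was given (0, 1, 2, ...)."""
    parser = argparse.ArgumentParser()
    # action="count" increments the value once per occurrence of -L,
    # so "-L -L" and "-LL" both yield 2.
    parser.add_argument("-L", dest="latency_level", action="count", default=0)
    return parser.parse_args(argv).latency_level
```

A level of 1 would select the summary only, and a level of 2 or more
would add the detailed histogram.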
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3fc0f4e2dfff7b041a665fe35aa33f11e4c3ebad
Reviewed-on: https://review.gerrithub.io/362270
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The latency tracking is done with ranges of bucket arrays.
The bucket for any given I/O is determined solely by TSC
deltas - any translation to microseconds is only done after
the test is finished and statistics are printed.
Each range has a number of buckets determined by a
NUM_BUCKETS_PER_RANGE value which is currently set to 128.
The buckets in ranges 0 and 1 each map to one specific TSC
delta. The buckets in subsequent ranges each map to twice
as many TSC deltas as buckets in the previous range:
Range 0: 1 TSC each - 128 buckets cover deltas 0 to 127
Range 1: 1 TSC each - 128 buckets cover deltas 128 to 255
Range 2: 2 TSC each - 128 buckets cover deltas 256 to 511
Range 3: 4 TSC each - 128 buckets cover deltas 512 to 1023
Range 4: 8 TSC each - 128 buckets cover deltas 1024 to 2047
Range 5: 16 TSC each - 128 buckets cover deltas 2048 to 4095
etc.
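The mapping in the table above can be reproduced with a few shifts.
This is a minimal Python sketch derived only from the ranges listed
here, not the C code in the patch; bucket_for_delta and BUCKET_SHIFT
are names made up for the example.

```python
NUM_BUCKETS_PER_RANGE = 128  # value stated in the text above
BUCKET_SHIFT = 7             # assumption: log2(NUM_BUCKETS_PER_RANGE)


def bucket_for_delta(delta):
    """Map a TSC delta to a (range, bucket index) pair."""
    if delta < NUM_BUCKETS_PER_RANGE:
        # Range 0: one bucket per TSC delta.
        return 0, delta
    # The position of the highest set bit selects the range:
    # 128..255 -> range 1, 256..511 -> range 2, and so on.
    rng = delta.bit_length() - BUCKET_SHIFT
    # Buckets in range r (r >= 1) are 2**(r - 1) TSC wide.
    shift = rng - 1
    index = (delta >> shift) - NUM_BUCKETS_PER_RANGE
    return rng, index
```

For example, deltas 256 and 257 share bucket 0 of range 2, while 258
falls into bucket 1 of that range.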
While here, change some variable names and usage
messages to differentiate between the existing latency
tracking via vendor-specific NVMe log pages on Intel
NVMe SSDs, and the newly added latency tracking done
in software.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I299f1c1f6dbfa7ea0e73085f7a685e71fc687a2b
Fix up a second copy of linux_iter_pci in the same manner as commit
6562e95092 (scripts/setup.sh: support systems where more than one
domain is used).
Change-Id: I3d9b842891d70c2960de8287e3b11c1a11b02d1f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Also add an automated test case for using remote
NVMe devices exported by an NVMe-oF target.
Change-Id: I2b839a4eeec33d5b0c30d654e6013ad8c7949e23
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Core dumps stop working if the path is longer than 64 bytes.
Use readlink to shorten some of the very long relative paths.
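The effect of canonicalizing a path can be sketched as follows. This
Python example uses os.path.realpath, which behaves similarly to
readlink -f; the directory names are made up for illustration and
canonical is a hypothetical helper.

```python
import os


def canonical(path):
    """Resolve '..' components and symlinks, like readlink -f."""
    return os.path.realpath(path)


if __name__ == "__main__":
    # A relative-style path with redundant components...
    long_path = os.path.join(os.getcwd(), "test", "lib", "..", "..", "app")
    # ...collapses to a shorter absolute path.
    print(len(long_path), len(canonical(long_path)))
```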
Change-Id: I5e7eb6580ca581c5ac3a71afb7b62953836e2660
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
By default, SPDK applications do not share memory.
To share memory, start the applications with the same
shared memory id.
Change-Id: Ib6180369ef0ed12d05983a21d7943e467402b21a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Now that unittest.sh is run as part of the automated tests, drop the
various unit test calls scattered throughout the tree.
Change-Id: Iea98314bb7f04620d72d81d25e24f8e706b50ce1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently we use the PCI functions provided by DPDK, which
identify the device by class id related info but not by
PCI BDF info, so add filtering by pci_addr in the
pcie_nvme_enum_cb function.
Change-Id: I5942e98853f00fc10fa6aae5c113517653d1b357
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The PCIe-specific unit tests still need to be updated; this patch just
moves the existing tests over and stubs out the necessary external
functions.
Change-Id: Ia6d46013231d8880df111b744523d02b56b9b37a
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is an intermittent bug in the multi-process support causing test
failures, so disable the tests for now until the multi-process code is
fixed.
Change-Id: I778004c8276390accb06eab5b86265169886c45f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This version of multi-process support needs to have DPDK 16.11 built in.
Change-Id: I3352944516f327800b4bd640347afc6127d82ed4
Signed-off-by: GangCao <gang.cao@intel.com>
NVMe opcodes contain a two-bit field that encodes the expected data
direction for each command. Add an enum and a function to extract these
bits.
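As a rough illustration of the two-bit field, the sketch below masks
the low two bits of an opcode. It is written in Python rather than the
driver's C, and the enum and function names are invented for the
example; the bit meanings follow the NVMe specification.

```python
from enum import IntEnum


class DataTransfer(IntEnum):
    """Data direction encoded in opcode bits 1:0 (illustrative names)."""
    NONE = 0                # no data transfer
    HOST_TO_CONTROLLER = 1  # e.g. a write
    CONTROLLER_TO_HOST = 2  # e.g. a read
    BIDIRECTIONAL = 3


def opc_get_data_transfer(opc):
    """Extract the data direction from the low two bits of an opcode."""
    return DataTransfer(opc & 0x3)
```

For the NVM command set, Flush (0x00) carries no data, Write (0x01)
moves data from host to controller, and Read (0x02) moves data from
controller to host.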
Change-Id: Ie214319f121cf0899c6aa5663866f2988b128dd2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For controllers that support the end-to-end data protection
feature, add the support in the driver layer.
Change-Id: Ifac3dd89dec9860773c850416a6116113a6ce22a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
This test is only intended to validate functionality, not actual
performance, so one second is plenty.
Change-Id: I2ff198c035226b50a113f9ff189c1abbd0fd1c17
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
NVMe reservations provide capabilities that may be used by two or more
hosts to coordinate access to a shared namespace. Add the four
reservation commands: reservation register/acquire/release/report.
Change-Id: Ib03ae2120a57dd14aa64311a6ffeb39fda73018c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
To support different types of scattered input payloads, such as
iovs or scatter lists, define a common method in the NVMe
driver. Users implement their own functions to iterate over each
memory segment.
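One way to picture the user-supplied iteration functions is a pair of
"reset to offset" and "give me the next segment" callbacks. The Python
sketch below is only a model of that idea; IovPayload, reset,
next_segment, and gather are hypothetical names and do not reflect the
driver's actual C interface.

```python
class IovPayload:
    """A scattered payload described as (address, length) segments."""

    def __init__(self, segments):
        self.segments = list(segments)
        self.index = 0   # segment the iterator currently points at
        self.offset = 0  # byte offset inside that segment

    def reset(self, offset):
        """Position the iterator `offset` bytes into the payload."""
        for i, (_, length) in enumerate(self.segments):
            if offset < length:
                self.index, self.offset = i, offset
                return
            offset -= length
        self.index, self.offset = len(self.segments), 0

    def next_segment(self):
        """Return the next (address, length) piece and advance."""
        addr, length = self.segments[self.index]
        seg = (addr + self.offset, length - self.offset)
        self.index += 1
        self.offset = 0
        return seg


def gather(payload, total_len):
    """Driver-side loop: walk the payload using only the two callbacks."""
    payload.reset(0)
    pieces, remaining = [], total_len
    while remaining > 0:
        addr, length = payload.next_segment()
        length = min(length, remaining)
        pieces.append((addr, length))
        remaining -= length
    return pieces
```

Because the consumer only calls reset and next_segment, the same loop
could serve any payload layout for which the user supplies those two
functions.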
Change-Id: Id2765747296a66997518281af0db04888ffc4b53
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
The top-level autotest.sh script will catch any core dumps at the end of
the test run, so sprinkling process_core in the individual test scripts
is unnecessary.
Also make the per-component test scripts run with 'set -e' (exit on
error).
Change-Id: I85f124e164ca93d35eaf672a428a841c119c550b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, the I/O splitting code and child I/O completion were not
being tested (aside from in the unit tests).
Tweak the I/O size so that it will get split into two child I/Os on
devices with 128 KB striping.
This causes nvme_cb_complete_child() to be exercised, improving code
coverage.
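The splitting behaviour can be sketched as below. This Python example
only illustrates cutting an I/O at stripe boundaries and is not the
driver's implementation; split_io is a made-up name and all sizes are
in bytes.

```python
def split_io(offset, length, stripe_size):
    """Cut [offset, offset + length) into children at stripe boundaries."""
    children = []
    end = offset + length
    while offset < end:
        # First stripe boundary strictly after the current offset.
        boundary = (offset // stripe_size + 1) * stripe_size
        child_len = min(end, boundary) - offset
        children.append((offset, child_len))
        offset += child_len
    return children
```

With a 128 KB stripe, a 192 KB I/O starting at offset 0 yields two
children of 128 KB and 64 KB, whereas a 128 KB I/O at offset 0 is not
split at all.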
Change-Id: Ia0299f7f809f8edb93452b5ad45939977a9f67f1
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
nvme_ut is testing multi-threaded operations, and Valgrind's thread
behavior is different than running the program natively. Under
Valgrind, the unit test essentially hangs waiting for the global
variable to be updated.
Change-Id: Id3665002c16ac3e695c50325375305a76f72cee4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>