# SPDK vhost Test Plan

## Current Tests

### Integrity tests

#### vhost self test

- compiles SPDK and Qemu
- launches SPDK vhost
- starts a VM with 1 NVMe device attached to it
- issues a controller "reset" command using sg3_utils on the guest system
- performs a data integrity check using dd to write and read data from the device
  (as sketched below)
- runs on 3 host systems (Ubuntu 16.04, CentOS 7.3 and Fedora 25)
  and 1 guest system (Ubuntu 16.04)
- runs against vhost scsi and vhost blk

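A minimal sketch of the reset and dd-based integrity steps, assuming the vhost
device shows up as /dev/sdb in the guest (device name and sizes are illustrative):

```bash
dev=/dev/sdb                                   # hypothetical vhost-attached disk
sg_reset -d "$dev"                             # issue a device reset via sg3_utils

dd if=/dev/urandom of=/tmp/pattern bs=1M count=64           # random test pattern
dd if=/tmp/pattern of="$dev" bs=1M count=64 oflag=direct    # write it to the disk
dd if="$dev" of=/tmp/readback bs=1M count=64 iflag=direct   # read it back
cmp /tmp/pattern /tmp/readback                              # fail on any mismatch
```
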
#### FIO Integrity tests

- NVMe device is split into 4 LUNs, each attached to a separate vhost controller
- FIO uses a job configuration with randwrite mode to verify that a random pattern
  is written to and read back correctly on each LUN (an example job is shown below)
- runs on Fedora 25 and Ubuntu 16.04 guest systems
- runs against vhost scsi and vhost blk

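One way to express such a verification job as an FIO command line; the device
path and job parameters are illustrative, not the exact job file used by the test:

```bash
fio --name=integrity --filename=/dev/sdb --direct=1 --ioengine=libaio \
    --rw=randwrite --bs=4k --iodepth=16 --size=1G \
    --verify=crc32c --verify_backlog=1024 --do_verify=1
```
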
#### Lvol tests

- starts vhost with at least 1 NVMe device
- starts 1 VM or multiple VMs
- an lvol store is constructed on each NVMe device (see the RPC sketch below)
- on each lvol store 1 lvol bdev is constructed for each running VM
- logical volume block devices are used as backends instead of using
  the NVMe device backend directly
- after setup, a data integrity check is performed by an FIO randwrite
  operation with the verify flag enabled
- optionally, nested lvols can be tested with use of an appropriate flag;
  on each base lvol store an additional lvol bdev is created, which
  serves as a base for nested lvol stores.
  On each of the nested lvol stores 1 lvol bdev is created for each
  running VM. Nested lvol bdevs are used along with base lvol bdevs for
  the data integrity check.
- runs against vhost scsi and vhost blk

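The lvol setup can be sketched with SPDK RPC calls roughly as follows (RPC
command names vary between SPDK releases; the names below match recent
releases, and the bdev names and size are illustrative):

```bash
rpc.py bdev_lvol_create_lvstore Nvme0n1 lvs0      # lvol store on the NVMe bdev
rpc.py bdev_lvol_create -l lvs0 lbd0 10240        # one lvol bdev per VM (size in MiB)
rpc.py vhost_create_scsi_controller naa.vhost.0   # expose it through a vhost controller
rpc.py vhost_scsi_controller_add_target naa.vhost.0 0 lvs0/lbd0
```
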
#### Filesystem integrity

- runs SPDK with 1 VM with 1 NVMe device attached
- creates a partition table and filesystem on the passed device, and mounts it
- a 1GB test file is created on the mounted file system and FIO randrw traffic
  (with verification enabled) is run, as sketched below
- tested file systems: ext4, btrfs, ntfs, xfs
- runs against vhost scsi and vhost blk

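The per-filesystem steps can be sketched as follows, assuming the vhost device
appears as /dev/sdb in the guest (mkfs.ext4 stands in for each tested filesystem):

```bash
disk=/dev/sdb                                    # hypothetical vhost-attached disk
parted -s "$disk" mklabel gpt mkpart primary 1MiB 100%
mkfs.ext4 "${disk}1"                             # repeated with btrfs, ntfs and xfs
mkdir -p /mnt/test && mount "${disk}1" /mnt/test

fio --name=fs-integrity --filename=/mnt/test/testfile --size=1G \
    --rw=randrw --bs=4k --ioengine=libaio --iodepth=16 \
    --verify=crc32c --do_verify=1
```
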
#### Windows HCK SCSI Compliance Test 2.0.

- Runs SPDK with 1 VM with the Windows Server 2012 R2 operating system
- 4 devices are passed into the VM: NVMe, Split NVMe, Malloc and Split Malloc
- On each device the Windows HCK SCSI Compliance Test 2.0 is run

#### MultiOS test

- starts 3 VMs with guest systems: Ubuntu 16.04, Fedora 25 and Windows Server 2012 R2
- 3 physical NVMe devices are split into 9 LUNs
- each guest uses 3 LUNs from 3 different physical NVMe devices
- Linux guests run FIO integrity jobs to verify read/write operations,
  while the Windows HCK SCSI Compliance Test 2.0 is running on the Windows guest

#### vhost hot-remove tests

- removes an NVMe device (unbinds it from the driver, as sketched below) that is
  already claimed by a controller in vhost
- hot-remove tests are performed with and without I/O traffic to the device
- I/O traffic, if present in a test, has verification enabled
- checks that vhost and/or VMs do not crash
- checks that other devices are unaffected by the hot-remove of an NVMe device
- performed against vhost blk and vhost scsi

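The unbind itself can be done through sysfs; a minimal sketch, assuming a
hypothetical PCI address and that the drive is bound to a userspace driver
such as uio_pci_generic or vfio-pci:

```bash
bdf=0000:82:00.0                                          # hypothetical NVMe PCI address
echo "$bdf" > "/sys/bus/pci/devices/$bdf/driver/unbind"   # vhost sees a hot-remove
```
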
#### vhost scsi hot-attach and hot-detach tests

- adds and removes devices via RPC on a controller that is already in use by a VM
  (see the sketch below)
- I/O traffic is generated with FIO read/write operations, verification enabled
- checks that vhost and/or VMs do not crash
- checks that other devices in the same controller are unaffected by hot-attach
  and hot-detach operations

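A sketch of the RPC calls involved (RPC names match recent SPDK releases and
may differ in older ones; controller and bdev names are illustrative):

```bash
# hot-attach: add a bdev as SCSI target 0 on a controller already in use
rpc.py vhost_scsi_controller_add_target naa.vhost.0 0 Nvme0n1p0
# hot-detach: remove the same target while I/O runs on the other targets
rpc.py vhost_scsi_controller_remove_target naa.vhost.0 0
```
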
#### virtio initiator tests

- virtio user mode: connects to vhost-scsi controller sockets directly on the host
  (see the RPC sketch below)
- virtio pci mode: connects to virtual pci devices on the guest virtual machine
- 6 concurrent jobs are run simultaneously on 7 devices, each with 8 virtqueues

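In user mode the initiator attaches to the vhost socket with an RPC along
these lines (flag names match recent SPDK releases; the socket path and bdev
name are illustrative):

```bash
rpc.py bdev_virtio_attach_controller -d scsi -t user -a /tmp/vhost.0 \
       --vq-count 8 VirtioScsi0
```
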
##### kernel virtio-scsi-pci device

- tests support for the kernel vhost-scsi device
- creates a 1GB ramdisk using targetcli (as sketched below)
- creates a target and adds the ramdisk to it using targetcli
- adds the created device to the virtio pci tests

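The targetcli steps can be sketched as below; the exact paths depend on the
targetcli version and require the kernel vhost fabric module to be available
(the WWN is generated by targetcli, so naa.&lt;wwn&gt; is a placeholder):

```bash
targetcli /backstores/ramdisk create name=rd0 size=1GB   # 1GB kernel ramdisk
targetcli /vhost create                                  # new vhost-scsi target
targetcli /vhost/naa.<wwn>/tpg1/luns create /backstores/ramdisk/rd0
```
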
##### emulated virtio-scsi-pci device

- tests support for the QEMU emulated virtio-scsi-pci device
- adds the emulated virtio device "Virtio0" to the virtio pci tests

##### Test configuration

- SPDK vhost application is used for testing
- FIO using the SPDK fio_plugin: rw, randrw, randwrite, write, with verification
  enabled (an example invocation is shown below)
- trim sequential and trim random, then write on trimmed areas with verification
  enabled, only on devices that support unmap
- FIO job configuration: iodepth=128, block size=4k, runtime=10s
- all test cases run jobs in parallel on multiple bdevs
- 8 queues per device

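A sketch of such an FIO run through the SPDK bdev fio_plugin (the plugin path
and configuration-file option differ between SPDK releases; the bdev name is
illustrative):

```bash
LD_PRELOAD=./examples/bdev/fio_plugin/fio_plugin \
fio --ioengine=spdk_bdev --spdk_conf=./bdev.conf --thread=1 \
    --name=integrity --filename=VirtioScsi0t0 --rw=randrw --bs=4k \
    --iodepth=128 --runtime=10 --verify=crc32c --do_verify=1
```
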
##### vhost configuration

- scsi controller with 4 NVMe splits (the full setup is sketched below)
- 2 block controllers, each with 1 NVMe split
- scsi controller with a malloc bdev with 512 block size
- scsi controller with a malloc bdev with 4096 block size

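This configuration can be sketched with RPC calls roughly as follows (RPC
names match recent SPDK releases; bdev and controller names are illustrative):

```bash
rpc.py bdev_split_create Nvme0n1 6                       # Nvme0n1p0 .. Nvme0n1p5
rpc.py vhost_create_scsi_controller naa.scsi.0           # scsi controller, 4 splits
for i in 0 1 2 3; do
    rpc.py vhost_scsi_controller_add_target naa.scsi.0 $i Nvme0n1p$i
done
rpc.py vhost_create_blk_controller naa.blk.0 Nvme0n1p4   # 2 block controllers,
rpc.py vhost_create_blk_controller naa.blk.1 Nvme0n1p5   # 1 split each
rpc.py bdev_malloc_create -b Malloc512 128 512           # 512B block size
rpc.py bdev_malloc_create -b Malloc4k 128 4096           # 4096B block size
rpc.py vhost_create_scsi_controller naa.malloc.0
rpc.py vhost_scsi_controller_add_target naa.malloc.0 0 Malloc512
rpc.py vhost_create_scsi_controller naa.malloc.1
rpc.py vhost_scsi_controller_add_target naa.malloc.1 0 Malloc4k
```
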
##### Test case 1

- virtio user on host
- perform FIO rw, randwrite, randrw and write jobs in parallel on all devices

##### Test case 2

- virtio user on host
- perform FIO trim, randtrim, rw, randwrite, randrw and write jobs in parallel,
  then write on trimmed areas, on devices that support unmap

##### Test case 3

- virtio pci on VM
- same config as in TC#1

##### Test case 4

- virtio pci on VM
- same config as in TC#2

### Live migration

The live migration feature allows moving running virtual machines between SPDK
vhost instances.
The following tests include scenarios with SPDK vhost instances running both on
the same physical server and on remote servers.
Additional configuration of utilities like an SSHFS share, NIC IP address
adjustment, etc., might be necessary.

#### Test case 1 - single vhost migration

- Start the SPDK vhost application.
  - Construct a single Malloc bdev.
  - Construct two SCSI controllers and add the previously created Malloc bdev
    to both.
- Start the first VM (VM_1) and connect it to the Vhost_1 controller.
  Verify that the attached disk is visible in the system.
- Start the second VM (VM_2), but with the "-incoming" option enabled, and
  connect it to the Vhost_2 controller. Use the same VM image as for VM_1.
- On VM_1 start an FIO write job with verification enabled against the connected
  Malloc bdev.
- Start VM migration from VM_1 to VM_2 while FIO is still running on VM_1.
- Once migration is complete, check the result using the Qemu monitor (as
  sketched below). Migration info on VM_1 should report
  "Migration status: completed".
- VM_2 should be up and running after the migration. Log in via SSH and check
  the FIO job result - the exit code should be 0 and there should be no data
  verification errors.
- Cleanup:
  - Shutdown both VMs.
  - Gracefully shutdown the vhost instance.

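The migration mechanics rely on standard Qemu features; a minimal sketch
(port, IP address and the rest of the command line are illustrative):

```bash
# VM_2: started paused, waiting for an incoming migration on TCP port 4444
qemu-system-x86_64 ... -incoming tcp:0:4444

# VM_1: issued on its Qemu monitor while FIO is running
# (qemu) migrate -d tcp:<vm_2_host_ip>:4444
# (qemu) info migrate          # should end with "Migration status: completed"
```
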
#### Test case 2 - single server migration

- Detect RDMA NICs; at least 1 RDMA NIC is needed to run the test.
  If no physical NIC is available, an emulated Soft-RoCE NIC is used instead.
- Create a /tmp/share directory and put a test VM image in there.
- Start the SPDK NVMeoF Target application (see the RPC sketch below).
  - Construct a single NVMe bdev from the available bound NVMe drives.
  - Create an NVMeoF subsystem with the NVMe bdev as its single namespace.
- Start the first SPDK vhost application instance (later referred to as "Vhost_1").
  - Use a different shared memory ID and CPU mask than the NVMeoF Target.
  - Construct an NVMe bdev by connecting to the NVMeoF Target
    (using trtype: rdma).
  - Construct a single SCSI controller and add the NVMe bdev to it.
- Start the first VM (VM_1) and connect it to the Vhost_1 controller. Verify
  that the attached disk is visible in the system.
- Start the second SPDK vhost application instance (later referred to as "Vhost_2").
  - Use a different shared memory ID and CPU mask than the previous SPDK instances.
  - Construct an NVMe bdev by connecting to the NVMeoF Target. Connect to the
    same subsystem as Vhost_1; multiple connections are allowed.
  - Construct a single SCSI controller and add the NVMe bdev to it.
- Start the second VM (VM_2), but with the "-incoming" option enabled.
- Check the states of both VMs using the Qemu monitor utility.
  VM_1 should be in the running state.
  VM_2 should be in the paused (inmigrate) state.
- Run FIO I/O traffic with verification enabled against the attached NVMe
  device on VM_1.
- While FIO is running, issue a command for VM_1 to migrate.
- When the migrate call returns, check the states of the VMs again.
  VM_1 should be in the paused (postmigrate) state and "info migrate" should
  report "Migration status: completed".
  VM_2 should be in the running state.
- Verify that the FIO task completed successfully on VM_2 after migrating.
  There should be no I/O failures, no verification failures, etc.
- Cleanup:
  - Shutdown both VMs.
  - Gracefully shutdown the vhost instances and the NVMeoF Target instance.
  - Remove the /tmp/share directory and its contents.
  - Clean the RDMA NIC / Soft-RoCE configuration.

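The target-side setup and the vhost-side connection can be sketched with RPC
calls as follows (RPC names match recent SPDK releases; the NQN, IP address
and port are illustrative):

```bash
# on the NVMeoF Target instance
rpc.py nvmf_create_transport -t RDMA
rpc.py nvmf_create_subsystem nqn.2018-01.io.spdk:cnode1 -a -s SPDK001
rpc.py nvmf_subsystem_add_ns nqn.2018-01.io.spdk:cnode1 Nvme0n1
rpc.py nvmf_subsystem_add_listener nqn.2018-01.io.spdk:cnode1 \
       -t rdma -a 192.0.2.1 -s 4420

# on each vhost instance: connect to the shared subsystem
rpc.py bdev_nvme_attach_controller -b NvmeRemote -t rdma -a 192.0.2.1 \
       -f ipv4 -s 4420 -n nqn.2018-01.io.spdk:cnode1
```
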
#### Test case 3 - remote server migration

- Detect RDMA NICs on the physical hosts; at least 1 RDMA NIC per host is
  needed to run the test.
- On Host 1 create a /tmp/share directory and put a test VM image in there.
- On Host 2 create a /tmp/share directory. Using SSHFS, mount /tmp/share from
  Host 1 (as sketched below) so that the same VM image can be used on both hosts.
- Start the SPDK NVMeoF Target application on Host 1.
  - Construct a single NVMe bdev from the available bound NVMe drives.
  - Create an NVMeoF subsystem with the NVMe bdev as its single namespace.
- Start the first SPDK vhost application instance on Host 1 (later referred to
  as "Vhost_1").
  - Use a different shared memory ID and CPU mask than the NVMeoF Target.
  - Construct an NVMe bdev by connecting to the NVMeoF Target
    (using trtype: rdma).
  - Construct a single SCSI controller and add the NVMe bdev to it.
- Start the first VM (VM_1) and connect it to the Vhost_1 controller. Verify
  that the attached disk is visible in the system.
- Start the second SPDK vhost application instance on Host 2 (later referred to
  as "Vhost_2").
  - Construct an NVMe bdev by connecting to the NVMeoF Target. Connect to the
    same subsystem as Vhost_1; multiple connections are allowed.
  - Construct a single SCSI controller and add the NVMe bdev to it.
- Start the second VM (VM_2), but with the "-incoming" option enabled.
- Check the states of both VMs using the Qemu monitor utility.
  VM_1 should be in the running state.
  VM_2 should be in the paused (inmigrate) state.
- Run FIO I/O traffic with verification enabled against the attached NVMe
  device on VM_1.
- While FIO is running, issue a command for VM_1 to migrate.
- When the migrate call returns, check the states of the VMs again.
  VM_1 should be in the paused (postmigrate) state and "info migrate" should
  report "Migration status: completed".
  VM_2 should be in the running state.
- Verify that the FIO task completed successfully on VM_2 after migrating.
  There should be no I/O failures, no verification failures, etc.
- Cleanup:
  - Shutdown both VMs.
  - Gracefully shutdown the vhost instances and the NVMeoF Target instance.
  - Remove the /tmp/share directory and its contents.
  - Clean the RDMA NIC configuration.

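A minimal sketch of the SSHFS share on Host 2 (the hostname and paths are
illustrative):

```bash
mkdir -p /tmp/share
sshfs root@host1:/tmp/share /tmp/share    # same VM image visible on both hosts
```
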
### Performance tests

Tests verifying the performance and efficiency of the module.

#### FIO Performance 6 NVMes

- SPDK and the created controllers run on 2 CPU cores.
- Each NVMe drive is split into 2 Split NVMe bdevs, which gives a total of 12
  in the test setup.
- 12 vhost controllers are created, one for each Split NVMe bdev. All controllers
  use the same CPU mask as the one used for running the vhost instance.
- 12 virtual machines are run as guest systems (with Ubuntu 16.04.2); each VM
  connects to a single corresponding vhost controller.
  Per-VM configuration is: 2 pass-through host CPUs, 1 GB RAM, 2 IO controller queues.
- NVMe drives are preconditioned before the test starts. Preconditioning is done
  by writing over the whole disk sequentially at least 2 times.
- FIO configurations used for the tests (an example invocation is shown below):
  - IO depths: 1, 8, 128
  - Blocksize: 4k
  - RW modes: read, randread, write, randwrite, rw, randrw
  - Write modes are additionally run with a 15 minute ramp-up time to allow
    better measurements. Randwrite mode uses a longer ramp-up preconditioning
    of 90 minutes per run.
- Each FIO job result is compared with baseline results to allow detection of
  performance drops.

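One of these runs might look like the following FIO invocation (device path,
runtime and output file are illustrative; ramp_time=900 corresponds to the
15 minute ramp-up used for write modes):

```bash
fio --name=perf --filename=/dev/sdb --direct=1 --ioengine=libaio \
    --bs=4k --iodepth=128 --rw=write --time_based --runtime=300 \
    --ramp_time=900 --output-format=json --output=write_qd128.json
```
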
## Future tests and improvements

### Stress tests

- Add stability and stress tests (long duration tests, long looped start/stop
  tests, etc.) to the test pool