markdownlint: enable rule MD013 - line_length
Fixed all MD013 errors

Signed-off-by: wawryk <maciejx.wawryk@intel.com>
Change-Id: I24846414ae6283e27a17caced16ac798a7e93018
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8938
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
commit 12fcbc9b94 (parent e4d23dc757)
@@ -3055,7 +3055,8 @@ See the [Pmem Block Device](http://www.spdk.io/doc/bdev.html#bdev_config_pmem) d
 A userspace driver for Virtio SCSI devices has been added.
 The driver is capable of creating block devices on top of LUNs exposed by another SPDK vhost-scsi application.

-See the [Virtio SCSI](http://www.spdk.io/doc/virtio.html) documentation and [Getting Started](http://www.spdk.io/doc/bdev.html#bdev_config_virtio_scsi) guide for more information.
+See the [Virtio SCSI](http://www.spdk.io/doc/virtio.html) documentation and
+[Getting Started](http://www.spdk.io/doc/bdev.html#bdev_config_virtio_scsi) guide for more information.

 ### Vhost target

@@ -3362,7 +3363,8 @@ user code.

 ## v1.2.0: IOAT user-space driver

-This release adds a user-space driver with support for the Intel I/O Acceleration Technology (I/OAT, also known as "Crystal Beach") DMA offload engine.
+This release adds a user-space driver with support for the Intel I/O Acceleration Technology
+(I/OAT, also known as "Crystal Beach") DMA offload engine.

 - IOAT
   - New user-space driver supporting DMA memory copy offload

@@ -10,7 +10,10 @@ acceleration capabilities. ISA/L is used for optimized CRC32C calculation within
 the software module.

 The framework includes an API for getting the current capabilities of the
-selected module. See [`spdk_accel_get_capabilities`](https://spdk.io/doc/accel__engine_8h.html) for more details. For the software module, all capabilities will be reported as supported. For the hardware modules, only functions accelerated by hardware will be reported however any function can still be called, it will just be backed by software if it is not reported as a supported capability.
+selected module. See [`spdk_accel_get_capabilities`](https://spdk.io/doc/accel__engine_8h.html) for more details.
+For the software module, all capabilities will be reported as supported. For the hardware modules, only functions
+accelerated by hardware will be reported however any function can still be called, it will just be backed by
+software if it is not reported as a supported capability.

 # Acceleration Framework Functions {#accel_functions}

@@ -63,8 +66,11 @@ To use the IOAT engine, use the RPC [`ioat_scan_accel_engine`](https://spdk.io/d

 ## IDXD Module {#accel_idxd}

-To use the DSA engine, use the RPC [`idxd_scan_accel_engine`](https://spdk.io/doc/jsonrpc.html) with an optional parameter of `-c` and provide a configuration number of either 0 or 1. These pre-defined configurations determine how the DSA engine will be setup in terms
-of work queues and engines. The DSA engine is very flexible allowing for various configurations of these elements to either account for different quality of service requirements or to isolate hardware paths where the back end media is of varying latency (i.e. persistent memory vs DRAM). The pre-defined configurations are as follows:
+To use the DSA engine, use the RPC [`idxd_scan_accel_engine`](https://spdk.io/doc/jsonrpc.html) with an optional parameter
+of `-c` and provide a configuration number of either 0 or 1. These pre-defined configurations determine how the DSA engine
+will be setup in terms of work queues and engines. The DSA engine is very flexible allowing for various configurations of
+these elements to either account for different quality of service requirements or to isolate hardware paths where the back
+end media is of varying latency (i.e. persistent memory vs DRAM). The pre-defined configurations are as follows:

 0: A single work queue backed with four DSA engines. This is a generic configuration
 that enables the hardware to best determine which engine to use as it pulls in new

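An illustrative aside, not from the diff: the RPC above would be driven via SPDK's `rpc.py` roughly as follows (the `scripts/rpc.py` path and a running SPDK app on the default RPC socket are assumptions):

~~~{.sh}
# Enable the DSA accel engine with pre-defined configuration 0
# (single work queue backed by four engines, per the list above)
scripts/rpc.py idxd_scan_accel_engine -c 0
~~~
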
@@ -2,7 +2,8 @@

 # Target Audience {#bdev_ug_targetaudience}

-This user guide is intended for software developers who have knowledge of block storage, storage drivers, issuing JSON-RPC commands and storage services such as RAID, compression, crypto, and others.
+This user guide is intended for software developers who have knowledge of block storage, storage drivers, issuing JSON-RPC
+commands and storage services such as RAID, compression, crypto, and others.

 # Introduction {#bdev_ug_introduction}

@@ -78,7 +78,8 @@ Your output should look something like this:
 ~~~{.sh}
 $ sudo docker run --privileged -v /dev/hugepages:/dev/hugepages hello:1.0
 Starting SPDK v20.01-pre git sha1 80da95481 / DPDK 19.11.0 initialization...
-[ DPDK EAL parameters: hello_world -c 0x1 --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk0 --proc-type=auto ]
+[ DPDK EAL parameters: hello_world -c 0x1 --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa
+--base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk0 --proc-type=auto ]
 EAL: No available hugepages reported in hugepages-1048576kB
 Initializing NVMe Controllers
 Attaching to 0000:06:00.0

@@ -2756,7 +2756,8 @@ Example response:

 ## bdev_nvme_set_options {#rpc_bdev_nvme_set_options}

-Set global parameters for all bdev NVMe. This RPC may only be called before SPDK subsystems have been initialized or any bdev NVMe has been created.
+Set global parameters for all bdev NVMe. This RPC may only be called before SPDK subsystems have been initialized
+or any bdev NVMe has been created.

 ### Parameters

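An illustrative aside: a minimal sketch of calling this RPC, assuming SPDK's `scripts/rpc.py` and its usual dashed spelling for the `timeout_ms` parameter:

~~~{.sh}
# Must run before subsystem init / before any NVMe bdev exists, per the text above
scripts/rpc.py bdev_nvme_set_options --timeout-ms 10000
~~~
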
@@ -4720,7 +4721,9 @@ data_out_pool_size | Optional | number | Number of data out buffer

 To load CHAP shared secret file, its path is required to specify explicitly in the parameter `auth_file`.

-Parameters `disable_chap` and `require_chap` are mutually exclusive. Parameters `no_discovery_auth`, `req_discovery_auth`, `req_discovery_auth_mutual`, and `discovery_auth_group` are still available instead of `disable_chap`, `require_chap`, `mutual_chap`, and `chap_group`, respectivey but will be removed in future releases.
+Parameters `disable_chap` and `require_chap` are mutually exclusive. Parameters `no_discovery_auth`, `req_discovery_auth`,
+`req_discovery_auth_mutual`, and `discovery_auth_group` are still available instead of `disable_chap`, `require_chap`,
+`mutual_chap`, and `chap_group`, respectivey but will be removed in future releases.

 ### Example

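An illustrative aside: these parameters belong to the iSCSI global options RPC; a sketch under the assumption that `rpc.py` spells them as dashed flags, with a placeholder secrets path:

~~~{.sh}
# Load CHAP secrets from a file and require CHAP for discovery sessions
scripts/rpc.py iscsi_set_options --auth-file /usr/local/etc/spdk/auth.conf --require-chap
~~~
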
@@ -6870,7 +6873,8 @@ tgt_name | Optional | string | Parent NVMe-oF target nam
 ### Response

 The response is an object containing NVMf subsystem statistics.
-In the response, `admin_qpairs` and `io_qpairs` are reflecting cumulative queue pair counts while `current_admin_qpairs` and `current_io_qpairs` are showing the current number.
+In the response, `admin_qpairs` and `io_qpairs` are reflecting cumulative queue pair counts while
+`current_admin_qpairs` and `current_io_qpairs` are showing the current number.

 ### Example

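An illustrative aside: a sketch of fetching these statistics with `rpc.py` (default target assumed):

~~~{.sh}
scripts/rpc.py nvmf_get_stats
~~~
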
@@ -7820,7 +7824,8 @@ Example response:

 ## bdev_lvol_inflate {#rpc_bdev_lvol_inflate}

-Inflate a logical volume. All unallocated clusters are allocated and copied from the parent or zero filled if not allocated in the parent. Then all dependencies on the parent are removed.
+Inflate a logical volume. All unallocated clusters are allocated and copied from the parent or zero filled
+if not allocated in the parent. Then all dependencies on the parent are removed.

 ### Parameters

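An illustrative aside, with placeholder names:

~~~{.sh}
# Allocate every cluster of the clone so it no longer depends on its parent
scripts/rpc.py bdev_lvol_inflate lvs0/clone0
~~~
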
@@ -7855,7 +7860,9 @@ Example response:

 ## bdev_lvol_decouple_parent {#rpc_bdev_lvol_decouple_parent}

-Decouple the parent of a logical volume. For unallocated clusters which is allocated in the parent, they are allocated and copied from the parent, but for unallocated clusters which is thin provisioned in the parent, they are kept thin provisioned. Then all dependencies on the parent are removed.
+Decouple the parent of a logical volume. For unallocated clusters which is allocated in the parent, they are
+allocated and copied from the parent, but for unallocated clusters which is thin provisioned in the parent,
+they are kept thin provisioned. Then all dependencies on the parent are removed.

 ### Parameters

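An illustrative aside, with placeholder names; unlike inflate, this removes only the immediate parent, so a chain may need repeated calls:

~~~{.sh}
# Copy only the clusters the immediate parent actually allocates
scripts/rpc.py bdev_lvol_decouple_parent lvs0/clone0
~~~
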
@@ -8506,7 +8513,8 @@ Example response:

 Request notifications. Returns array of notifications that happend since the specified id (or first that is available).

-Notice: Notifications are kept in circular buffer with limited size. Older notifications might be inaccesible due to being overwritten by new ones.
+Notice: Notifications are kept in circular buffer with limited size. Older notifications might be inaccesible
+due to being overwritten by new ones.

 ### Parameters

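An illustrative aside, assuming the usual `rpc.py` flag for the starting id:

~~~{.sh}
# Fetch whatever is still in the circular buffer, starting from id 1
scripts/rpc.py notify_get_notifications -i 1
~~~
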
@@ -8565,7 +8573,8 @@ Example response:

 # Linux Network Block Device (NBD) {#jsonrpc_components_nbd}

-SPDK supports exporting bdevs through Linux nbd. These devices then appear as standard Linux kernel block devices and can be accessed using standard utilities like fdisk.
+SPDK supports exporting bdevs through Linux nbd. These devices then appear as standard Linux kernel block devices
+and can be accessed using standard utilities like fdisk.

 In order to export a device over nbd, first make sure the Linux kernel nbd driver is loaded by running 'modprobe nbd'.

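An illustrative aside showing that flow end to end (bdev and nbd device names are placeholders):

~~~{.sh}
sudo modprobe nbd
# Export an existing bdev as a kernel block device, then inspect it
scripts/rpc.py nbd_start_disk Malloc0 /dev/nbd0
sudo fdisk -l /dev/nbd0
~~~
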
@@ -8809,7 +8818,8 @@ Example response:

 Set cache pool size for blobfs filesystems. This RPC is only permitted when the cache pool is not already initialized.

-The cache pool is initialized when the first blobfs filesystem is initialized or loaded. It is freed when the all initialized or loaded filesystems are unloaded.
+The cache pool is initialized when the first blobfs filesystem is initialized or loaded. It is freed when the all
+initialized or loaded filesystems are unloaded.

 ### Parameters

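An illustrative aside, assuming the RPC is `blobfs_set_cache_size` and takes a size in megabytes:

~~~{.sh}
# Must run before the first blobfs filesystem is initialized or loaded
scripts/rpc.py blobfs_set_cache_size 512
~~~
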
@@ -8993,7 +9003,8 @@ Example response:

 Send NVMe command directly to NVMe controller or namespace. Parameters and responses encoded by base64 urlsafe need further processing.

-Notice: bdev_nvme_send_cmd requires user to guarentee the correctness of NVMe command itself, and also optional parameters. Illegal command contents or mismatching buffer size may result in unpredictable behavior.
+Notice: bdev_nvme_send_cmd requires user to guarentee the correctness of NVMe command itself, and also optional parameters.
+Illegal command contents or mismatching buffer size may result in unpredictable behavior.

 ### Parameters

@@ -1,6 +1,8 @@
 # JSON-RPC Remote access {#jsonrpc_proxy}

-SPDK provides a sample python script `rpc_http_proxy.py`, that provides http server which listens for JSON objects from users. It uses HTTP POST method to receive JSON objects including methods and parameters described in this chapter.
+SPDK provides a sample python script `rpc_http_proxy.py`, that provides http server which listens for JSON
+objects from users. It uses HTTP POST method to receive JSON objects including methods and parameters
+described in this chapter.

 ## Parameters

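An illustrative aside: posting to the proxy from the shell; host, port, and credentials are placeholders, and HTTP basic auth is assumed:

~~~{.sh}
curl -u spdkuser:spdkpass -X POST \
     -d '{"id": 1, "method": "bdev_get_bdevs"}' \
     http://127.0.0.1:3333/
~~~
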
@@ -26,7 +28,8 @@ Status 200 with resultant JSON object included on success.

 ## Client side

-Below is a sample python script acting as a client side. It sends `bdev_get_bdevs` method with optional `name` parameter and prints JSON object returned from remote_rpc script.
+Below is a sample python script acting as a client side. It sends `bdev_get_bdevs` method with optional `name`
+parameter and prints JSON object returned from remote_rpc script.

 ~~~
 import json

@@ -47,5 +50,8 @@ Output:

 ~~~
 python client.py
-[{u'num_blocks': 2621440, u'name': u'Malloc0', u'uuid': u'fb57e59c-599d-42f1-8b89-3e46dbe12641', u'claimed': True, u'driver_specific': {}, u'supported_io_types': {u'reset': True, u'nvme_admin': False, u'unmap': True, u'read': True, u'nvme_io': False, u'write': True, u'flush': True, u'write_zeroes': True}, u'qos_ios_per_sec': 0, u'block_size': 4096, u'product_name': u'Malloc disk', u'aliases': []}]
+[{u'num_blocks': 2621440, u'name': u'Malloc0', u'uuid': u'fb57e59c-599d-42f1-8b89-3e46dbe12641', u'claimed': True,
+u'driver_specific': {}, u'supported_io_types': {u'reset': True, u'nvme_admin': False, u'unmap': True, u'read': True,
+u'nvme_io': False, u'write': True, u'flush': True, u'write_zeroes': True}, u'qos_ios_per_sec': 0, u'block_size': 4096,
+u'product_name': u'Malloc disk', u'aliases': []}]
 ~~~

doc/lvol.md
@@ -1,6 +1,7 @@
 # Logical Volumes {#logical_volumes}

-The Logical Volumes library is a flexible storage space management system. It provides creating and managing virtual block devices with variable size. The SPDK Logical Volume library is built on top of @ref blob.
+The Logical Volumes library is a flexible storage space management system. It provides creating and managing virtual
+block devices with variable size. The SPDK Logical Volume library is built on top of @ref blob.

 # Terminology {#lvol_terminology}

@@ -9,15 +10,19 @@ The Logical Volumes library is a flexible storage space management system. It pr
 * Shorthand: lvolstore, lvs
 * Type name: struct spdk_lvol_store

-A logical volume store uses the super blob feature of blobstore to hold uuid (and in future other metadata). Blobstore types are implemented in blobstore itself, and saved on disk. An lvolstore will generate a UUID on creation, so that it can be uniquely identified from other lvolstores.
-By default when creating lvol store data region is unmapped. Optional --clear-method parameter can be passed on creation to change that behavior to writing zeroes or performing no operation.
+A logical volume store uses the super blob feature of blobstore to hold uuid (and in future other metadata).
+Blobstore types are implemented in blobstore itself, and saved on disk. An lvolstore will generate a UUID on
+creation, so that it can be uniquely identified from other lvolstores.
+By default when creating lvol store data region is unmapped. Optional --clear-method parameter can be passed
+on creation to change that behavior to writing zeroes or performing no operation.

 ## Logical volume {#lvol}

 * Shorthand: lvol
 * Type name: struct spdk_lvol

-A logical volume is implemented as an SPDK blob created from an lvolstore. An lvol is uniquely identified by its UUID. Lvol additional can have alias name.
+A logical volume is implemented as an SPDK blob created from an lvolstore. An lvol is uniquely identified by
+its UUID. Lvol additional can have alias name.

 ## Logical volume block device {#lvol_bdev}

@@ -25,14 +30,19 @@ A logical volume is implemented as an SPDK blob created from an lvolstore. An lv
 * Type name: struct spdk_lvol_bdev

 Representation of an SPDK block device (spdk_bdev) with an lvol implementation.
-A logical volume block device translates generic SPDK block device I/O (spdk_bdev_io) operations into the equivalent SPDK blob operations. Combination of lvol name and lvolstore name gives lvol_bdev alias name in a form "lvs_name/lvol_name". block_size of the created bdev is always 4096, due to blobstore page size. Cluster_size is configurable by parameter.
-Size of the new bdev will be rounded up to nearest multiple of cluster_size.
-By default lvol bdevs claim part of lvol store equal to their set size. When thin provision option is enabled, no space is taken from lvol store until data is written to lvol bdev.
-By default when deleting lvol bdev or resizing down, allocated clusters are unmapped. Optional --clear-method parameter can be passed on creation to change that behavior to writing zeroes or performing no operation.
+A logical volume block device translates generic SPDK block device I/O (spdk_bdev_io) operations into the
+equivalent SPDK blob operations. Combination of lvol name and lvolstore name gives lvol_bdev alias name in
+a form "lvs_name/lvol_name". block_size of the created bdev is always 4096, due to blobstore page size.
+Cluster_size is configurable by parameter. Size of the new bdev will be rounded up to nearest multiple of
+cluster_size. By default lvol bdevs claim part of lvol store equal to their set size. When thin provision
+option is enabled, no space is taken from lvol store until data is written to lvol bdev.
+By default when deleting lvol bdev or resizing down, allocated clusters are unmapped. Optional --clear-method
+parameter can be passed on creation to change that behavior to writing zeroes or performing no operation.

 ## Thin provisioning {#lvol_thin_provisioning}

-Thin provisioned lvols rely on dynamic cluster allocation (e.g. when the first write operation on a cluster is performed), only space required to store data is used and unallocated clusters are obtained from underlying device (e.g. zeroes_dev).
+Thin provisioned lvols rely on dynamic cluster allocation (e.g. when the first write operation on a cluster is performed), only space
+required to store data is used and unallocated clusters are obtained from underlying device (e.g. zeroes_dev).

 Sample write operations of thin provisioned blob are shown on the diagram below:

@@ -44,9 +54,11 @@ Sample read operations and the structure of thin provisioned blob are shown on t

 ## Snapshots and clone {#lvol_snapshots}

-Logical volumes support snapshots and clones functionality. User may at any given time create snapshot of existing logical volume to save a backup of current volume state.
-When creating snapshot original volume becomes thin provisioned and saves only incremental differences from its underlying snapshot. This means that every read from unallocated cluster is actually a read from the snapshot and
-every write to unallocated cluster triggers new cluster allocation and data copy from corresponding cluster in snapshot to the new cluster in logical volume before the actual write occurs.
+Logical volumes support snapshots and clones functionality. User may at any given time create snapshot of existing
+logical volume to save a backup of current volume state. When creating snapshot original volume becomes thin provisioned
+and saves only incremental differences from its underlying snapshot. This means that every read from unallocated cluster
+is actually a read from the snapshot and every write to unallocated cluster triggers new cluster allocation and data copy
+from corresponding cluster in snapshot to the new cluster in logical volume before the actual write occurs.

 The read operation is performed as shown in the diagram below:
 ![Reading cluster from clone](lvol_clone_snapshot_read.svg)

@@ -54,26 +66,32 @@ The read operation is performed as shown in the diagram below:
 The write operation is performed as shown in the diagram below:
 ![Writing cluster to the clone](lvol_clone_snapshot_write.svg)

-User may also create clone of existing snapshot that will be thin provisioned and it will behave in the same way as logical volume from which snapshot is created.
-There is no limit of clones and snapshots that may be created as long as there is enough space on logical volume store. Snapshots are read only. Clones may be created only from snapshots or read only logical volumes.
+User may also create clone of existing snapshot that will be thin provisioned and it will behave in the same way as logical volume
+from which snapshot is created. There is no limit of clones and snapshots that may be created as long as there is enough space on
+logical volume store. Snapshots are read only. Clones may be created only from snapshots or read only logical volumes.

-A snapshot can be removed only if there is a single clone on top of it. The relation chain will be updated accordingly. The cluster map of clone and snapshot will be merged and entries for unallocated clusters in the clone
-will be updated with addresses from the snapshot cluster map. The entire operation modifies metadata only - no data is copied during this process.
+A snapshot can be removed only if there is a single clone on top of it. The relation chain will be updated accordingly.
+The cluster map of clone and snapshot will be merged and entries for unallocated clusters in the clone will be updated with
+addresses from the snapshot cluster map. The entire operation modifies metadata only - no data is copied during this process.

 ## Inflation {#lvol_inflation}

-Blobs can be inflated to copy data from backing devices (e.g. snapshots) and allocate all remaining clusters. As a result of this operation all dependencies for the blob are removed.
+Blobs can be inflated to copy data from backing devices (e.g. snapshots) and allocate all remaining clusters. As a result of this
+operation all dependencies for the blob are removed.

 ![Removing backing blob and bdevs relations using inflate call](lvol_inflate_clone_snapshot.svg)

 ## Decoupling {#lvol_decoupling}

-Blobs can be decoupled from their parent blob by copying data from backing devices (e.g. snapshots) for all allocated clusters. Remaining unallocated clusters are kept thin provisioned.
-Note: When decouple is performed, only single dependency is removed. To remove all dependencies in a chain of blobs depending on each other, multiple calls need to be issued.
+Blobs can be decoupled from their parent blob by copying data from backing devices (e.g. snapshots) for all allocated clusters.
+Remaining unallocated clusters are kept thin provisioned.
+Note: When decouple is performed, only single dependency is removed. To remove all dependencies in a chain of blobs depending
+on each other, multiple calls need to be issued.

 # Configuring Logical Volumes

-There is no static configuration available for logical volumes. All configuration is done trough RPC. Information about logical volumes is kept on block devices.
+There is no static configuration available for logical volumes. All configuration is done trough RPC. Information about
+logical volumes is kept on block devices.

 # RPC overview {#lvol_rpc}

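An illustrative aside, not from the diff: since everything is RPC-driven, a typical lvol setup looks roughly like this (bdev, lvolstore, and lvol names are placeholders, and the size argument is assumed to be in MiB):

~~~{.sh}
scripts/rpc.py bdev_lvol_create_lvstore Malloc0 lvs0   # lvolstore on top of a bdev
scripts/rpc.py bdev_lvol_create -l lvs0 lvol0 64       # lvol "lvs0/lvol0"
scripts/rpc.py bdev_lvol_snapshot lvs0/lvol0 snap0     # read-only snapshot
scripts/rpc.py bdev_lvol_clone lvs0/snap0 clone0       # thin-provisioned clone
~~~
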
@@ -186,7 +186,8 @@ Basic Types
 year = 4 * digit ;
 month = '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' ;
 digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
-hex digit = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
+hex digit = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '0' |
+'1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;

 NQN Definition
 NVMe Qualified Name = ( NVMe-oF Discovery NQN | NVMe UUID NQN | NVMe Domain NQN ), '\0' ;

@@ -1,8 +1,19 @@
 # spdk_top {#spdk_top}

-The spdk_top application is designed to resemble the standard top in that it provides a real-time insights into CPU cores usage by SPDK lightweight threads and pollers. Have you ever wondered which CPU core is used most by your SPDK instance? Are you building your own bdev or library and want to know if your code is running efficiently? Are your new pollers busy most of the time? The spdk_top application uses RPC calls to collect performance metrics and displays them in a report that you can analyze and determine if your code is running efficiently so that you can tune your implementation and get more from SPDK.
+The spdk_top application is designed to resemble the standard top in that it provides a real-time insights into CPU cores usage by SPDK
+lightweight threads and pollers. Have you ever wondered which CPU core is used most by your SPDK instance? Are you building your own bdev
+or library and want to know if your code is running efficiently? Are your new pollers busy most of the time? The spdk_top application uses
+RPC calls to collect performance metrics and displays them in a report that you can analyze and determine if your code is running efficiently
+so that you can tune your implementation and get more from SPDK.

-Why doesn't the classic top utility work for SPDK? SPDK uses a polled-mode design; a reactor thread running on each CPU core assigned to an SPDK application schedules SPDK lightweight threads and pollers to run on the CPU core. Therefore, the standard Linux top utility is not effective for analyzing the CPU usage for polled-mode applications like SPDK because it just reports that they are using 100% of the CPU resources assigned to them. The spdk_top utility was developed to analyze and report the CPU cycles used to do real work vs just polling for work. The utility relies on instrumentation added to pollers to track when they are doing work vs. polling for work. The spdk_top utility gets the fine grained metrics from the pollers, analyzes and report the metrics on a per poller, thread and core basis. This information enables users to identify CPU cores that are busy doing real work so that they can determine if the application needs more or less CPU resources.
+Why doesn't the classic top utility work for SPDK? SPDK uses a polled-mode design; a reactor thread running on each CPU core assigned to
+an SPDK application schedules SPDK lightweight threads and pollers to run on the CPU core. Therefore, the standard Linux top utility is
+not effective for analyzing the CPU usage for polled-mode applications like SPDK because it just reports that they are using 100% of the
+CPU resources assigned to them. The spdk_top utility was developed to analyze and report the CPU cycles used to do real work vs just
+polling for work. The utility relies on instrumentation added to pollers to track when they are doing work vs. polling for work. The
+spdk_top utility gets the fine grained metrics from the pollers, analyzes and report the metrics on a per poller, thread and core basis.
+This information enables users to identify CPU cores that are busy doing real work so that they can determine if the application
+needs more or less CPU resources.

 # Run spdk_top
 Before running spdk_top you need to run the SPDK application whose performance you want to analyze using spdk_top.

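An illustrative aside: the shortest path to trying it, assuming a default in-tree build layout:

~~~{.sh}
./build/bin/spdk_tgt &   # any running SPDK application will do
./build/bin/spdk_top     # connects over RPC and starts the UI
~~~
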
@@ -18,7 +29,8 @@ Menu at the bottom of SPDK top window shows many options for changing displayed

 * Quit - quits the SPDK top application.
 * Switch tab - allows to select THREADS/POLLERS/CORES tabs.
-* Previous page/Next page - scrolls up/down to the next set of rows displayed. Indicator in the bottom-left corner shows current page and number of all available pages.
+* Previous page/Next page - scrolls up/down to the next set of rows displayed. Indicator in the bottom-left corner shows current page and number
+  of all available pages.
 * Item details - displays details pop-up window for highlighted data row. Selection is changed by pressing UP and DOWN arrow keys.
 * Help - displays help pop-up window.

@@ -31,7 +43,8 @@ The threads tab displays a line item for each spdk thread. The information displ
 * Idle/Busy - how many microseconds the thread was idle/busy.

 \n
-By pressing ENTER key a pop-up window appears, showing above and a list of pollers running on selected thread (with poller name, type, run count and period).
+By pressing ENTER key a pop-up window appears, showing above and a list of pollers running on selected
+thread (with poller name, type, run count and period).
 Pop-up then can be closed by pressing ESC key.

 To learn more about spdk threads see @ref concurrency.

@@ -59,7 +72,8 @@ The cores tab provides insights into how the application is using the CPU cores
 * Idle/Busy - how many microseconds core was idle (including time when core ran pollers but did not find any work) or doing actual work.

 \n
-Pressing ENTER key makes a pop-up window appear, showing above information, along with a list of threads running on selected core. Cores details window allows to select a thread and display thread details pop-up on top of it. To close both pop-ups use ESC key.
+Pressing ENTER key makes a pop-up window appear, showing above information, along with a list of threads running on selected core. Cores details
+window allows to select a thread and display thread details pop-up on top of it. To close both pop-ups use ESC key.

 # Help Window
 Help window pop-up can be invoked by pressing H key inside any tab. It contains explanations for each key used inside the spdk_top application.

@@ -10,7 +10,7 @@ exclude_rule 'MD009'
 exclude_rule 'MD010'
 exclude_rule 'MD011'
 exclude_rule 'MD012'
-exclude_rule 'MD013'
+rule 'MD013', :line_length => 170
 exclude_rule 'MD014'
 exclude_rule 'MD018'
 exclude_rule 'MD019'

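An illustrative aside: this is an mdl style file, so with the rule enabled a check along these lines would flag any doc line over 170 characters (the style-file path is an assumption):

~~~{.sh}
mdl --style .mdl_rules.rb doc/*.md
~~~
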
@@ -46,7 +46,8 @@ Quick start instructions for OSX:

 ## Linux Setup

-Following the generic instructions should be sufficient for most Linux distributions. For more thorough instructions on installing VirtualBox on your distribution of choice, please see the following [guide](https://www.virtualbox.org/wiki/Linux_Downloads).
+Following the generic instructions should be sufficient for most Linux distributions. For more thorough instructions on installing
+VirtualBox on your distribution of choice, please see the following [guide](https://www.virtualbox.org/wiki/Linux_Downloads).

 Examples on Fedora26/Fedora27/Fedora28

@@ -76,7 +77,9 @@ If you are behind a corporate firewall, configure the following proxy settings.

 ## Download SPDK from GitHub

-Use git to clone a new spdk repository. GerritHub can also be used. See the instructions at [spdk.io](http://www.spdk.io/development/#gerrithub) to setup your GerritHub account. Note that this spdk repository will be rsync'd into your VM, so you can use this repository to continue development within the VM.
+Use git to clone a new spdk repository. GerritHub can also be used. See the instructions at
+[spdk.io](http://www.spdk.io/development/#gerrithub) to setup your GerritHub account. Note that this spdk
+repository will be rsync'd into your VM, so you can use this repository to continue development within the VM.

 ## Create a Virtual Box

@@ -117,7 +120,9 @@ $ spdk/scripts/vagrant/create_vbox.sh -h
 ./scripts/vagrant/create_vbox.sh fedora26
 ```

-It is recommended that you call the `create_vbox.sh` script from outside of the spdk repository. Call this script from a parent directory. This will allow the creation of multiple VMs in separate <distro> directories, all using the same spdk repository. For example:
+It is recommended that you call the `create_vbox.sh` script from outside of the spdk repository.
+Call this script from a parent directory. This will allow the creation of multiple VMs in separate
+<distro> directories, all using the same spdk repository. For example:

 ```
 $ spdk/scripts/vagrant/create_vbox.sh -s 2048 -n 2 fedora26

|
||||
6. rsync a copy of the `~/vagrant_tools` directory to `/home/vagrant/tools` (optional)
|
||||
7. execute vm_setup.sh on the guest to install all spdk dependencies (optional)
|
||||
|
||||
This arrangement allows the provisioning of multiple, different VMs within that same directory hierarchy using the same spdk repository. Following the creation of the vm you'll need to ssh into your virtual box and finish the VM initialization.
|
||||
This arrangement allows the provisioning of multiple, different VMs within that same directory hierarchy using thesame
|
||||
spdk repository. Following the creation of the vm you'll need to ssh into your virtual box and finish the VM initialization.
|
||||
|
||||
```
|
||||
$ cd <distro>
|
||||
@ -142,7 +148,9 @@ This arrangement allows the provisioning of multiple, different VMs within that
|
||||
|
||||
## Finish VM Initialization
|
||||
|
||||
A copy of the `spdk` repository you cloned will exist in the `spdk_repo` directory of the `/home/vagrant` user account. After using `vagrant ssh` to enter your VM you must complete the initialization of your VM by running the `scripts/vagrant/update.sh` script. For example:
|
||||
A copy of the `spdk` repository you cloned will exist in the `spdk_repo` directory of the `/home/vagrant` user
|
||||
account. After using `vagrant ssh` to enter your VM you must complete the initialization of your VM by running
|
||||
the `scripts/vagrant/update.sh` script. For example:
|
||||
|
||||
```
|
||||
$ script -c 'sudo spdk_repo/spdk/scripts/vagrant/update.sh' update.log
|
||||
@ -154,7 +162,8 @@ The `update.sh` script completes initialization of the VM by automating the foll
|
||||
2. Runs the scripts/pdkdep.sh script
|
||||
3. Installs the FreeBSD source in /usr/sys (FreeBSD only)
|
||||
|
||||
This only needs to be done once. This is also not necessary for Fedora VMs provisioned with the -d flag. The `vm_setup` script performs these operations instead.
|
||||
This only needs to be done once. This is also not necessary for Fedora VMs provisioned with the -d flag. The `vm_setup`
|
||||
script performs these operations instead.
|
||||
|
||||
## Post VM Initialization
|
||||
|
||||
@@ -190,7 +199,8 @@ Following VM initialization you must:

 ### Running autorun.sh with vagrant

-After running vm_setup.sh the `run-autorun.sh` can be used to run `spdk/autorun.sh` on a Fedora vagrant machine. Note that the `spdk/scripts/vagrant/autorun-spdk.conf` should be copied to `~/autorun-spdk.conf` before starting your tests.
+After running vm_setup.sh the `run-autorun.sh` can be used to run `spdk/autorun.sh` on a Fedora vagrant machine.
+Note that the `spdk/scripts/vagrant/autorun-spdk.conf` should be copied to `~/autorun-spdk.conf` before starting your tests.

 ```
 $ cp spdk/scripts/vagrant/autorun-spdk.conf ~/

@@ -2,7 +2,10 @@

 ## Compile SPDK with LTO

-The link time optimization (lto) gcc flag allows the linker to run a post-link optimization pass on the code. During that pass the linker inlines thin wrappers such as those around DPDK calls which results in a shallow call stack and significantly improves performance. Therefore, we recommend compiling SPDK with the lto flag prior to running this benchmark script to archieve optimal performance.
+The link time optimization (lto) gcc flag allows the linker to run a post-link optimization pass on the code. During
+that pass the linker inlines thin wrappers such as those around DPDK calls which results in a shallow call stack and
+significantly improves performance. Therefore, we recommend compiling SPDK with the lto flag prior to running this
+benchmark script to archieve optimal performance.
 Link time optimization can be enabled in SPDK by doing the following:

 ~~~{.sh}

@@ -35,7 +38,9 @@ Path to fio binary.

 #### --driver

 Select between SPDK driver and kernel driver. The Linux Kernel driver has three configurations:
-Default mode, Hybrid Polling and Classic Polling. The SPDK driver supports 2 fio_plugin modes: bdev and NVMe PMD. Before running test with spdk, you will need to bind NVMe devics to the Linux uio_pci_generic or vfio-pci driver. When running test with the Kernel driver, NVMe devices use the Kernel driver. The 5 valid values for this option are:
+Default mode, Hybrid Polling and Classic Polling. The SPDK driver supports 2 fio_plugin modes: bdev and NVMe PMD.
+Before running test with spdk, you will need to bind NVMe devics to the Linux uio_pci_generic or vfio-pci driver.
+When running test with the Kernel driver, NVMe devices use the Kernel driver. The 5 valid values for this option are:
 'bdev', 'nvme', 'kernel-libaio', 'kernel-classic-polling' and 'kernel-hybrid-polling'.

 #### --max-disk

@@ -52,9 +57,10 @@ will be set to all available devices. Only one of the max-disk or disk-no option

 #### --cpu-allowed

-Specifies the CPU cores that will be used by fio to execute the performance test cases. When spdk driver is chosen, Nthe script attempts to assign NVMe devices to CPU cores on the same NUMA node. The script will try to align each core with devices matching
-core's NUMA first but if the is no devices left within the CPU core NUMA then it will use devices from the other
-NUMA node. It is important to choose cores that will ensure best NUMA node alignment. For example:
+Specifies the CPU cores that will be used by fio to execute the performance test cases. When spdk driver is chosen,
+the script attempts to assign NVMe devices to CPU cores on the same NUMA node. The script will try to align each
+core with devices matching core's NUMA first but if the is no devices left within the CPU core NUMA then it will use
+devices from the other NUMA node. It is important to choose cores that will ensure best NUMA node alignment. For example:
 On System with 8 devices on NUMA node 0 and 8 devices on NUMA node 1, cores 0-27 on numa node 0 and 28-55
 on numa node 1, if test is set to use 16 disk and four cores then "--cpu-allowed=1,2,28,29" can be used
 resulting with 4 devices with node0 per core 1 and 2 and 4 devices with node1 per core 28 and 29. If 10 cores