markdownlint: enable rule MD046
MD046 - Code block style
Fixed all errors

Signed-off-by: Maciej Wawryk <maciejx.wawryk@intel.com>
Change-Id: I0a5f711a54e1859a6c8d0f26dcabf210496fb819
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9273
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
parent
63ee471b64
commit
111d42765d
24
doc/blob.md
@ -295,19 +295,23 @@ contribute to the Blobstore effort itself.
|
||||
The Blobstore owns the entire storage device. The device is divided into clusters starting from the beginning, such
|
||||
that cluster 0 begins at the first logical block.
|
||||
|
||||
LBA 0 LBA N
|
||||
+-----------+-----------+-----+-----------+
|
||||
| Cluster 0 | Cluster 1 | ... | Cluster N |
|
||||
+-----------+-----------+-----+-----------+
|
||||
```text
|
||||
LBA 0 LBA N
|
||||
+-----------+-----------+-----+-----------+
|
||||
| Cluster 0 | Cluster 1 | ... | Cluster N |
|
||||
+-----------+-----------+-----+-----------+
|
||||
```
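As a rough illustration of the layout above, the cluster holding a given LBA can be found by integer division. This is a minimal sketch, not Blobstore code: the function name, the `blocks_per_cluster` parameter, and the 1 MiB / 512-byte sizes are assumptions chosen only for the example.

```python
def cluster_index(lba: int, blocks_per_cluster: int) -> int:
    """Return the index of the cluster containing the given logical block.

    Assumes clusters are laid out back to back starting at LBA 0, as in
    the diagram above. `blocks_per_cluster` is a hypothetical parameter.
    """
    if lba < 0 or blocks_per_cluster <= 0:
        raise ValueError("lba must be >= 0 and blocks_per_cluster > 0")
    return lba // blocks_per_cluster


# Assumed sizes for illustration: 1 MiB clusters, 512-byte logical blocks,
# so each cluster spans 2048 logical blocks.
blocks_per_cluster = (1024 * 1024) // 512
first = cluster_index(0, blocks_per_cluster)      # LBA 0 is in cluster 0
second = cluster_index(2048, blocks_per_cluster)  # first block of cluster 1
```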
|
||||
|
||||
Cluster 0 is special and has the following format, where page 0 is the first page of the cluster:
|
||||
|
||||
+--------+-------------------+
|
||||
| Page 0 | Page 1 ... Page N |
|
||||
+--------+-------------------+
|
||||
| Super | Metadata Region |
|
||||
| Block | |
|
||||
+--------+-------------------+
|
||||
```text
|
||||
+--------+-------------------+
|
||||
| Page 0 | Page 1 ... Page N |
|
||||
+--------+-------------------+
|
||||
| Super | Metadata Region |
|
||||
| Block | |
|
||||
+--------+-------------------+
|
||||
```
|
||||
|
||||
The super block is a single page located at the beginning of the partition. It contains basic information about
|
||||
the Blobstore. The metadata region is the remainder of cluster 0 and may extend to additional clusters. Refer
|
||||
|
@ -140,6 +140,7 @@ function `foo` performs some asynchronous operation and when that completes
|
||||
function `bar` is called, then function `bar` performs some operation that
|
||||
calls function `baz` on completion, a good way to write it is as such:
|
||||
|
||||
```c
|
||||
void baz(void *ctx) {
|
||||
...
|
||||
}
|
||||
@ -151,6 +152,7 @@ calls function `baz` on completion, a good way to write it is as such:
|
||||
void foo(void *ctx) {
|
||||
async_op(bar, ctx);
|
||||
}
|
||||
```
|
||||
|
||||
Don't split these functions up - keep them as a nice unit that can be read from bottom to top.
|
||||
|
||||
@ -162,6 +164,7 @@ them in C we can still write them out by hand. As an example, here's a
|
||||
callback chain that performs `foo` 5 times and then calls `bar` - effectively
|
||||
an asynchronous for loop.
|
||||
|
||||
```c
|
||||
enum states {
|
||||
FOO_START = 0,
|
||||
FOO_END,
|
||||
@ -244,6 +247,7 @@ an asynchronous for loop.
|
||||
|
||||
run_state_machine(sm);
|
||||
}
|
||||
```
|
||||
|
||||
This is complex, of course, but the `run_state_machine` function can be read
|
||||
from top to bottom to get a clear overview of what's happening in the code
|
||||
|
@ -27,6 +27,7 @@ well as their validity, as some of the data will be invalidated by subsequent wr
|
||||
logical address. The L2P mapping can be restored from the SSD by reading this information in order
|
||||
from the oldest band to the youngest.
|
||||
|
||||
```text
|
||||
+--------------+ +--------------+ +--------------+
|
||||
band 1 | zone 1 +--------+ zone 1 +---- --- --- --- --- ---+ zone 1 |
|
||||
+--------------+ +--------------+ +--------------+
|
||||
@ -42,16 +43,19 @@ from the oldest band to the youngest.
|
||||
+--------------+ +--------------+ +--------------+
|
||||
|
||||
parallel unit 1 pu 2 pu n
|
||||
```
|
||||
|
||||
The address map and valid map are, along with several other things (e.g. UUID of the device it's
|
||||
part of, number of surfaced LBAs, band's sequence number, etc.), parts of the band's metadata. The
|
||||
metadata is split in two parts:
|
||||
|
||||
```text
|
||||
head metadata band's data tail metadata
|
||||
+-------------------+-------------------------------+------------------------+
|
||||
|zone 1 |...|zone n |...|...|zone 1 |...| | ... |zone m-1 |zone m|
|
||||
|block 1| |block 1| | |block x| | | |block y |block y|
|
||||
+-------------------+-------------+-----------------+------------------------+
|
||||
```
|
||||
|
||||
- the head part, containing information already known when opening the band (device's UUID, band's
|
||||
sequence number, etc.), located at the beginning blocks of the band,
|
||||
@ -73,6 +77,7 @@ support writes to a single block, the data needs to be buffered. The write buffe
|
||||
this problem. It consists of a number of pre-allocated buffers called batches, each of size allowing
|
||||
for a single transfer to the SSD. A single batch is divided into block-sized buffer entries.
|
||||
|
||||
```text
|
||||
write buffer
|
||||
+-----------------------------------+
|
||||
|batch 1 |
|
||||
@ -89,6 +94,7 @@ for a single transfer to the SSD. A single batch is divided into block-sized buf
|
||||
| |entry 1|entry 2| |entry n| |
|
||||
| +-----------------------------+ |
|
||||
+-----------------------------------+
|
||||
```
|
||||
|
||||
When a write is scheduled, it needs to acquire an entry for each of its blocks and copy the data
|
||||
onto this buffer. Once all blocks are copied, the write can be signalled as completed to the user.
|
||||
@ -108,12 +114,14 @@ situation in which all of the bands contain some valid data and no band can be e
|
||||
can be executed anymore. Therefore a mechanism is needed to move valid data and invalidate whole
|
||||
bands, so that they can be reused.
|
||||
|
||||
```text
|
||||
band band
|
||||
+-----------------------------------+ +-----------------------------------+
|
||||
| ** * * *** * *** * * | | |
|
||||
|** * * * * * * *| +----> | |
|
||||
|* *** * * * | | |
|
||||
+-----------------------------------+ +-----------------------------------+
|
||||
```
|
||||
|
||||
Valid blocks are marked with an asterisk '\*'.
|
||||
|
||||
|
@ -18,4 +18,6 @@ a new version of the *env* library. The new implementation can be
|
||||
integrated into the SPDK build by updating the following line
|
||||
in CONFIG:
|
||||
|
||||
CONFIG_ENV?=$(SPDK_ROOT_DIR)/lib/env_dpdk
|
||||
```bash
|
||||
CONFIG_ENV?=$(SPDK_ROOT_DIR)/lib/env_dpdk
|
||||
```
|
||||
|
@ -8,36 +8,48 @@ the GPL license.
|
||||
|
||||
Clone the fio source repository from https://github.com/axboe/fio
|
||||
|
||||
```bash
|
||||
git clone https://github.com/axboe/fio
|
||||
cd fio
|
||||
```
|
||||
|
||||
Compile the fio code and install:
|
||||
|
||||
```bash
|
||||
make
|
||||
make install
|
||||
```
|
||||
|
||||
## Compiling SPDK
|
||||
|
||||
Clone the SPDK source repository from https://github.com/spdk/spdk
|
||||
|
||||
```bash
|
||||
git clone https://github.com/spdk/spdk
|
||||
cd spdk
|
||||
git submodule update --init
|
||||
```
|
||||
|
||||
Then, run the SPDK configure script to enable fio (point it to the root of the fio repository):
|
||||
|
||||
```bash
|
||||
cd spdk
|
||||
./configure --with-fio=/path/to/fio/repo <other configuration options>
|
||||
```
|
||||
|
||||
Finally, build SPDK:
|
||||
|
||||
```bash
|
||||
make
|
||||
```
|
||||
|
||||
**Note to advanced users**: These steps assume you're using the DPDK submodule. If you are using your
|
||||
own version of DPDK, the fio plugin requires that DPDK be compiled with -fPIC. You can compile DPDK
|
||||
with -fPIC by modifying your DPDK configuration file and adding the line:
|
||||
|
||||
EXTRA_CFLAGS=-fPIC
|
||||
```bash
|
||||
EXTRA_CFLAGS=-fPIC
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
@ -45,20 +57,28 @@ To use the SPDK fio plugin with fio, specify the plugin binary using LD_PRELOAD
|
||||
fio and set ioengine=spdk_bdev in the fio configuration file (see example_config.fio in the same
|
||||
directory as this README).
|
||||
|
||||
LD_PRELOAD=<path to spdk repo>/build/fio/spdk_bdev fio
|
||||
```bash
|
||||
LD_PRELOAD=<path to spdk repo>/build/fio/spdk_bdev fio
|
||||
```
|
||||
|
||||
The fio configuration file must contain one new parameter:
|
||||
|
||||
spdk_json_conf=./examples/bdev/fio_plugin/bdev.json
|
||||
```bash
|
||||
spdk_json_conf=./examples/bdev/fio_plugin/bdev.json
|
||||
```
|
||||
|
||||
You can specify which block device to run against by setting the filename parameter
|
||||
to the block device name:
|
||||
|
||||
filename=Malloc0
|
||||
```bash
|
||||
filename=Malloc0
|
||||
```
|
||||
|
||||
Or for NVMe devices:
|
||||
|
||||
filename=Nvme0n1
|
||||
```bash
|
||||
filename=Nvme0n1
|
||||
```
|
||||
|
||||
fio by default forks a separate process for every job. It also supports just spawning a separate
|
||||
thread in the same process for every job. The SPDK fio plugin is limited to this latter thread
|
||||
@ -79,7 +99,9 @@ NVMe Zoned Namespaces (ZNS), and the virtual zoned block device SPDK module.
|
||||
|
||||
If you wish to run fio against a SPDK zoned block device, you can use the fio option:
|
||||
|
||||
zonemode=zbd
|
||||
```bash
|
||||
zonemode=zbd
|
||||
```
|
||||
|
||||
It is recommended to use a fio version newer than version 3.26, if using --numjobs > 1.
|
||||
If using --numjobs=1, fio version >= 3.23 should suffice.
|
||||
@ -108,7 +130,9 @@ zones limit, the easiest way to work around that fio does not manage this constr
|
||||
with a clean state each run (except for read-only workloads), by resetting all zones before fio
|
||||
starts running its jobs by using the engine option:
|
||||
|
||||
--initial_zone_reset=1
|
||||
```bash
|
||||
--initial_zone_reset=1
|
||||
```
|
||||
|
||||
### Zone Append
|
||||
|
||||
@ -116,7 +140,9 @@ When running fio against a zoned block device you need to specify --iodepth=1 to
|
||||
"Zone Invalid Write: The write to a zone was not at the write pointer." I/O errors.
|
||||
However, if your zoned block device supports Zone Append, you can use the engine option:
|
||||
|
||||
--zone_append=1
|
||||
```bash
|
||||
--zone_append=1
|
||||
```
|
||||
|
||||
To send zone append commands instead of write commands to the zoned block device.
|
||||
When using zone append, you will be able to specify a --iodepth greater than 1.
|
||||
|
@ -4,33 +4,45 @@
|
||||
|
||||
First, clone the fio source repository from https://github.com/axboe/fio
|
||||
|
||||
```bash
|
||||
git clone https://github.com/axboe/fio
|
||||
```
|
||||
|
||||
Then check out the latest fio version and compile the code:
|
||||
|
||||
```bash
|
||||
make
|
||||
```
|
||||
|
||||
## Compiling SPDK
|
||||
|
||||
First, clone the SPDK source repository from https://github.com/spdk/spdk
|
||||
|
||||
```bash
|
||||
git clone https://github.com/spdk/spdk
|
||||
git submodule update --init
|
||||
```
|
||||
|
||||
Then, run the SPDK configure script to enable fio (point it to the root of the fio repository):
|
||||
|
||||
```bash
|
||||
cd spdk
|
||||
./configure --with-fio=/path/to/fio/repo <other configuration options>
|
||||
```
|
||||
|
||||
Finally, build SPDK:
|
||||
|
||||
```bash
|
||||
make
|
||||
```
|
||||
|
||||
**Note to advanced users**: These steps assume you're using the DPDK submodule. If you are using your
|
||||
own version of DPDK, the fio plugin requires that DPDK be compiled with -fPIC. You can compile DPDK
|
||||
with -fPIC by modifying your DPDK configuration file and adding the line:
|
||||
|
||||
EXTRA_CFLAGS=-fPIC
|
||||
```bash
|
||||
EXTRA_CFLAGS=-fPIC
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
@ -38,20 +50,28 @@ To use the SPDK fio plugin with fio, specify the plugin binary using LD_PRELOAD
|
||||
fio and set ioengine=spdk in the fio configuration file (see example_config.fio in the same
|
||||
directory as this README).
|
||||
|
||||
LD_PRELOAD=<path to spdk repo>/build/fio/spdk_nvme fio
|
||||
```bash
|
||||
LD_PRELOAD=<path to spdk repo>/build/fio/spdk_nvme fio
|
||||
```
|
||||
|
||||
To select NVMe devices, you pass an SPDK Transport Identifier string as the filename. These are in the
|
||||
form:
|
||||
|
||||
filename=key=value [key=value] ... ns=value
|
||||
```bash
|
||||
filename=key=value [key=value] ... ns=value
|
||||
```
|
||||
|
||||
Specifically, for local PCIe NVMe devices it will look like this:
|
||||
|
||||
filename=trtype=PCIe traddr=0000.04.00.0 ns=1
|
||||
```bash
|
||||
filename=trtype=PCIe traddr=0000.04.00.0 ns=1
|
||||
```
|
||||
|
||||
And remote devices accessed via NVMe over Fabrics will look like this:
|
||||
|
||||
filename=trtype=RDMA adrfam=IPv4 traddr=192.168.100.8 trsvcid=4420 ns=1
|
||||
```bash
|
||||
filename=trtype=RDMA adrfam=IPv4 traddr=192.168.100.8 trsvcid=4420 ns=1
|
||||
```
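The PCIe form shown above can be generated from a conventional PCI address by swapping the ':' separators for '.'. The helper below is a minimal sketch; `pcie_filename` is a hypothetical name and not part of fio or SPDK.

```python
def pcie_filename(pci_addr: str, ns: int) -> str:
    """Build a fio filename line for a local PCIe NVMe namespace.

    The PCIe address is written with '.' in place of ':'
    (for example 0000:04:00.0 becomes 0000.04.00.0).
    """
    traddr = pci_addr.replace(":", ".")
    return f"filename=trtype=PCIe traddr={traddr} ns={ns}"


# Reproduces the PCIe example line from the text above.
line = pcie_filename("0000:04:00.0", 1)
```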
|
||||
|
||||
**Note**: The specification of the PCIe address should not use the normal ':'
|
||||
and instead only use '.'. This is a limitation in fio - it splits filenames on
|
||||
@ -83,17 +103,23 @@ but it is not good to use one thread against many I/O devices.
|
||||
To run with protection information (PI) enabled, the following setup steps are required.
|
||||
First, format the device namespace with the proper PI setting. For example:
|
||||
|
||||
```bash
|
||||
nvme format /dev/nvme0n1 -l 1 -i 1 -p 0 -m 1
|
||||
```
|
||||
|
||||
In the fio configuration file, add PRACT and set PRCHK using the flags (GUARD|REFTAG|APPTAG) as appropriate. For example:
|
||||
|
||||
pi_act=0
|
||||
pi_chk=GUARD
|
||||
```bash
|
||||
pi_act=0
|
||||
pi_chk=GUARD
|
||||
```
|
||||
|
||||
Blocksize should be set to the sum of the data and metadata sizes. For example, if the data blocksize is 512 bytes and the
|
||||
host-generated PI metadata is 8 bytes, then the blocksize in the fio configuration file should be 520 bytes:
|
||||
|
||||
bs=520
|
||||
```bash
|
||||
bs=520
|
||||
```
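The arithmetic behind that value is a plain sum of the data and metadata sizes. The snippet below is only a sketch of that calculation; `fio_blocksize` is a hypothetical helper, not an fio or SPDK function.

```python
def fio_blocksize(data_bytes: int, metadata_bytes: int) -> int:
    """Return the fio bs value when PI metadata is counted with the data."""
    return data_bytes + metadata_bytes


# 512-byte data blocks plus 8 bytes of host-generated PI metadata.
bs_line = f"bs={fio_blocksize(512, 8)}"
```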
|
||||
|
||||
The storage device may use a block format that requires separate metadata (DIX). In this scenario, the fio_plugin
|
||||
will automatically allocate an extra 4KiB buffer per I/O to hold this metadata. For some cases, such as 512 byte
|
||||
@ -108,18 +134,24 @@ tag mask are set to 0x1234 and 0xFFFF by default.
|
||||
|
||||
To enable VMD enumeration, add the enable_vmd flag to the fio configuration file:
|
||||
|
||||
enable_vmd=1
|
||||
```bash
|
||||
enable_vmd=1
|
||||
```
|
||||
|
||||
## ZNS
|
||||
|
||||
To use Zoned Namespaces, build the io-engine against, and run it with, a fio version >= 3.23 and add:
|
||||
|
||||
zonemode=zbd
|
||||
```bash
|
||||
zonemode=zbd
|
||||
```
|
||||
|
||||
to your fio script. Also have a look at the example scripts provided with fio:
|
||||
|
||||
fio/examples/zbd-seq-read.fio
|
||||
fio/examples/zbd-rand-write.fio
|
||||
```bash
|
||||
fio/examples/zbd-seq-read.fio
|
||||
fio/examples/zbd-rand-write.fio
|
||||
```
|
||||
|
||||
### Maximum Open Zones
|
||||
|
||||
@ -140,7 +172,9 @@ When running with the SPDK/NVMe fio io-engine you can be exposed to error messag
|
||||
completion errors, with the NVMe status code of 0xbd ("Too Many Active Zones"). To work around this,
|
||||
you can reset all zones before fio starts running its jobs by using the engine option:
|
||||
|
||||
--initial_zone_reset=1
|
||||
```bash
|
||||
--initial_zone_reset=1
|
||||
```
|
||||
|
||||
### Zone Append
|
||||
|
||||
@ -148,7 +182,9 @@ When running FIO against a Zoned Namespace you need to specify --iodepth=1 to av
|
||||
"Zone Invalid Write: The write to a zone was not at the write pointer." I/O errors.
|
||||
However, if your controller supports Zone Append, you can use the engine option:
|
||||
|
||||
--zone_append=1
|
||||
```bash
|
||||
--zone_append=1
|
||||
```
|
||||
|
||||
To send zone append commands instead of write commands to the controller.
|
||||
When using zone append, you will be able to specify a --iodepth greater than 1.
|
||||
@ -157,7 +193,9 @@ When using zone append, you will be able to specify a --iodepth greater than 1.
|
||||
|
||||
If your device has a lot of zones, fio can give you errors such as:
|
||||
|
||||
smalloc: OOM. Consider using --alloc-size to increase the shared memory available.
|
||||
```bash
|
||||
smalloc: OOM. Consider using --alloc-size to increase the shared memory available.
|
||||
```
|
||||
|
||||
This is because fio needs to allocate memory for the zone-report, that is, retrieve the state of
|
||||
zones on the device including auxiliary accounting information. To solve this, you can follow
|
||||
|
@ -9,4 +9,3 @@ exclude_rule 'MD031'
|
||||
exclude_rule 'MD033'
|
||||
exclude_rule 'MD034'
|
||||
exclude_rule 'MD041'
|
||||
exclude_rule 'MD046'
|
||||
|