Compare commits

47 Commits

Author SHA1 Message Date
Tomasz Zawadzki
2499efedb9 SPDK 19.10.1
Change-Id: I3aa4fadc32a381cfe0686650db8bf409f0834ae0
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478449
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2020-01-02 15:24:47 +00:00
Seth Howell
92b1758771 nvmf/rdma: make disconnect qp from cm event safe.
The call to update_ibv_state could result in a segfault if the other
thread had already freed the qp and was just spinning on handling the
rdma event. By not updating the qpair state here, I don't think that we
lose any information about the qpair state. Especially since just a
little bit later we update the qpair state to be in error.
There is nothing in the man pages about the cm events changing the ib
state although I imagine they are closely related. I just say that
because I believe that's why the update was originally in that spot.

fixes: GitHub issue #1110

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477877 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>

(cherry picked from commit a27a377ac4)
Change-Id: I3f87ff009bc2019464ed7c6920dd71e2b286b3fd
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478785
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-02 15:24:47 +00:00
Ben Walker
f8a0750385 nvme/rdma: Increase timeout when waiting for CM_EVENTS
In some real data center deployments, 100ms is not enough. Increase
the timeout to 1 second.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478638 (master)

(cherry picked from commit 3d06a83fa4)
Change-Id: I8195a1c1e987b7eff2d8541509f79381be32ed4b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478724
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-02 15:24:47 +00:00
Alexey Marchuk
16282e8fc3 nvme/pcie: Don't use contig SGL commands for admin qpair
Command with cns SPDK_NVME_IDENTIFY_ACTIVE_NS_LIST is issued during
controller initialization and if the controller supports SGL,
this command will be built as a contig SGL. This leads
to a failed completion with the following status:
INVALID FIELD (00/02) sqid:0 cid:95 cdw0:0 sqhd:0004 p:1 m:0 dnr:0
The first identify command SPDK_NVME_IDENTIFY_CTRLR passed since
it was built as a PRP command - we didn't know that the controller
supported SGL at that time. Fix: do not build SGL requests
for the admin qpair.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478320 (master)

(cherry picked from commit 71159819b0)
Change-Id: I72ab7fe33c03e60ea9f20a9c8afd7c79c40843aa
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478586
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Shuhei Matsumoto
0a5c002bb0 lib/nvmf: Accept KATO for discovery controller
Some NVMe applications require SPDK NVMe-oF target to support
KATO for discovery controller.  Hence change discovery controller
to accept KATO.  Update unit tests accordingly.

Fixes the issue #1089.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476810 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit e37fc5a32a)
Change-Id: Ib56e3b0b0faaf58276f9e692704763c1e5e5b042
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478361
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Zawadzki
23fd32ce2f ut/crypto: redirect mock rte_lcore_count
With DPDK 19.11 rte_eal_get_configuration() and rte_config
structure were made private. Those were only used in
the inline rte_lcore_count() included from DPDK.
After the update they were no longer available.

Since only rte_lcore_count() was used directly in crypto,
mock that and return 1, as was done previously.

This was tested with DPDK 19.08 and DPDK 19.11.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477841 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 1d11ab120d)
Change-Id: I13e4d9743b17a34ad786283f8b567d01e036d368
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478360
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Zawadzki
1f7c38b94d ut/compress: remove rte_eal_get_configuration stub
This went unused in the unit tests.
Tested with DPDK 19.08 and DPDK 19.11.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477840 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 2fbeb7ea52)
Change-Id: I738919379b5751697f9533f72fbaf77993cb6fb5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478359
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Zawadzki
a6fdc98121 dpdk: update submodule to include fix for vhost CVE-2019-14818
Three patches from 19.08.1..19.08.2 that include:
vhost: fix possible denial of service by leaking FDs
vhost: fix possible denial of service on SET_VRING_NUM
vhost: fix vring requests validation broken if no FD

The first two resolve CVE-2019-14818.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477827 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit c4acbcb706)
Change-Id: I67cd3ea4cddf9413b318957c28635b08c3b3c4b2
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478358
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Alexey Marchuk
b8f22c56fe rdma: Fix incoming_queue cleanup when RDMA qpair is destroyed
An RDMA qpair might be destroyed by the defunct timer while it still has
active recv elements in incoming_queue. The queue was cleaned
incorrectly, so a recv element for the destroyed qpair could still
be present in the queue and be processed later, leading
to undefined behaviour.

Fixes #1086

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477957 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 4af2b9bfb9)
Change-Id: Ieae186b2d2dce4ec88ab886b26165f6ef98e8d05
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478357
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Alexey Marchuk
d1fc4a28e9 nvme/pci: fix mapping length initialization for contig SGL request
mapping_length is initialized with 0 and spdk_vtophys() returns
min(*mapping_length, cur_size) or 0. So length -= mapping_length has no
effect and the request fails when nseg reaches NVME_MAX_SGL_DESCRIPTORS.
Initialize mapping_length to the request length.
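
The failure mode described above can be illustrated with a small model. This is only a sketch, not the SPDK code: vtophys() here is a toy identity translation, and REGION_SIZE, MAX_SGL_DESCRIPTORS and build_contig_sgl() are hypothetical names chosen for the example. It shows why a mapping length that starts at 0 never makes progress, while one that starts at the remaining request length does.

```python
REGION_SIZE = 4096        # hypothetical size of one physically contiguous region
MAX_SGL_DESCRIPTORS = 4   # hypothetical stand-in for NVME_MAX_SGL_DESCRIPTORS


def vtophys(vaddr, mapping_length):
    """Toy translation: identity mapping, contiguous only within one region.

    Returns (phys_addr, mapped_length), where mapped_length is
    min(mapping_length, bytes left in the current region).
    """
    left_in_region = REGION_SIZE - (vaddr % REGION_SIZE)
    return vaddr, min(mapping_length, left_in_region)


def build_contig_sgl(vaddr, length, init_to_request_length=True):
    """Split a virtually contiguous buffer into (phys_addr, length) segments.

    With init_to_request_length=False the mapping length starts at 0, so
    every translation maps 0 bytes and the loop only stops when the
    descriptor limit is hit (modelled here by returning None).
    """
    segments = []
    while length > 0:
        if len(segments) >= MAX_SGL_DESCRIPTORS:
            return None  # request fails: too many descriptors
        mapping_length = length if init_to_request_length else 0
        phys, mapping_length = vtophys(vaddr, mapping_length)
        segments.append((phys, mapping_length))
        vaddr += mapping_length
        length -= mapping_length
    return segments
```

In this model a 10000-byte buffer starting at address 0 needs three segments when the length is initialized correctly, and fails when it is initialized to 0.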

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477575 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit a092fac4a2)
Change-Id: I9082866b7f8055d99fa6930a78335b3b0fdf9b2b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478356
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Alexey Marchuk
af1784a971 rdma: Handle IBV_EVENT_SQ_DRAINED in asynchronous way
Send a message to the qpair thread to avoid modifying qpair
attributes in the acceptor poller thread which handles ibv events

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476715 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 5e3f93a75c)
Change-Id: If685d8b57aa7cb8d29fb1c2c270023c2ed0c1f84
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Alexey Marchuk
702eab9199 rdma: Handle IBV_EVENT_QP_FATAL in asynchronous way
Send a message to the qpair thread to avoid modifying qpair
attributes in the acceptor poller thread which handles ibv events

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476714 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit d238788bcc)
Change-Id: I8ea5658a2b226b0be9838eb375a8b80d15c456c5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478443
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Alexey Marchuk
f5cee07778 rdma: Add synchronization for LAST_WQE_REACHED event
The following scenario might occur when nvmf_tgt is stopped:
1. nvmf_tgt receives SIGINT, changes state to NVMF_TGT_FINI_STOP_SUBSYSTEMS
2. In this state nvmf_tgt stops all subsystems and disconnects associated qpairs
3. In the case of an RDMA qpair, its state will be changed to IBV_QPS_ERR.
Once the qpair changes state to IBV_QPS_ERR, the RDMA device generates a
LAST_WQE_REACHED event when there are no more WQEs that can be consumed
from the SRQ by this qpair.
4. When all subsystems are stopped, some qpairs may still be alive since they
haven't received the LAST_WQE_REACHED event yet.
5. nvmf_tgt stops all poll groups and forcefully destroys any qpairs linked to them.
6. At this moment the LAST_WQE_REACHED event might be generated and received in another thread.
The handler of this event sends a message with a pointer to the qpair. The qpair itself may already
be destroyed.
7. The thread that owned the qpair receives a message (LAST_WQE_REACHED) with a pointer to the already destroyed qpair and
destroys it a second time, when all its pointers are invalid.

ibv events related to a qpair should be handled by the thread that
owns this qpair. This commit adds a new structure that describes an
ibv event, helper functions for sending the event, and a list
of events per rdma qpair. It also adds synchronization for the LAST_WQE_REACHED event.
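
The idea of a per-qpair event list can be sketched as follows. This is a simplified single-threaded model under stated assumptions, not the SPDK implementation: Qpair, IbvEventCtx, send_qpair_event() and process_events() are hypothetical names, and a deque stands in for the message queue of the thread that owns the qpair. The point is that destroying a qpair invalidates every event still in flight for it, so a late LAST_WQE_REACHED message cannot trigger a second destroy.

```python
from collections import deque


class IbvEventCtx:
    """Describes one ibv event that is in flight to the owning thread."""

    def __init__(self, qpair, handler):
        self.qpair = qpair      # set to None if the qpair is destroyed first
        self.handler = handler


class Qpair:
    def __init__(self):
        self.pending_events = []  # events sent but not yet processed
        self.destroy_count = 0

    def destroy(self):
        # Detach every in-flight event so its handler is skipped later.
        for ctx in self.pending_events:
            ctx.qpair = None
        self.pending_events.clear()
        self.destroy_count += 1


def send_qpair_event(thread_queue, qpair, handler):
    """Queue an event for the owning thread and remember it on the qpair."""
    ctx = IbvEventCtx(qpair, handler)
    qpair.pending_events.append(ctx)
    thread_queue.append(ctx)


def process_events(thread_queue):
    """Runs on the owning thread: handle events whose qpair still exists."""
    while thread_queue:
        ctx = thread_queue.popleft()
        if ctx.qpair is None:
            continue  # qpair was destroyed after the event was sent
        ctx.qpair.pending_events.remove(ctx)
        ctx.handler(ctx.qpair)
```

If the qpair is destroyed between send_qpair_event() and process_events(), the handler is never called and destroy_count stays at 1.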

Fixes #1075

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476712 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit dc84fbaaa1)
Change-Id: I22bff89741708df2518760934ecb4e33fad49473
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478355
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-23 08:04:56 +00:00
Ben Walker
db4208d3a1 nvme: Use sgls, if available, even for contiguous memory
The hardware sgl format can describe large contiguous
buffers using just a single element, so it's more
efficient than a prp list even for a single memory
segment. Always use the sgl format.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475542 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit bed4cdf6c7)
Change-Id: I9c62582829f0d64dcd1babdbc48930ddb4d9e626
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478354
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
a6302a4cea lib/nvme: prevent creating existing cuse device
This patch attempts to solve naming conflict between
CUSE devices created by different SPDK instances.

Each NVMe device is enumerated by the SPDK process from 0
up to 127. When a process attempts to start a cuse device, it
tries to set an exclusive lock on the temporary file
"/tmp/spdk_nvme_cuse_lock_<index>" and keeps it until the
device is stopped.
If setting the lock fails, the index is incremented.

This prevents several SPDK instances from using the same
controller index.
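
The lock-file scheme described above can be sketched as below. This is an illustration only: it uses flock() as a stand-in, since the commit message does not say which locking call SPDK uses, and claim_cuse_index() and release_cuse_index() are hypothetical helpers. The lock directory is a parameter so the sketch can run against a scratch directory instead of /tmp.

```python
import fcntl
import os
import tempfile

LOCK_PREFIX = "spdk_nvme_cuse_lock_"  # file name pattern from the commit message
MAX_INDEX = 128                        # indices 0..127, per the commit message


def claim_cuse_index(lock_dir):
    """Return (index, fd) for the first index whose lock file can be locked.

    Returns None if every index is already held by someone else.
    """
    for index in range(MAX_INDEX):
        path = os.path.join(lock_dir, LOCK_PREFIX + str(index))
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)
            continue  # index taken, try the next one
        return index, fd
    return None


def release_cuse_index(fd):
    """Closing the descriptor drops the lock and frees the index."""
    os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as lock_dir:
        first = claim_cuse_index(lock_dir)
        second = claim_cuse_index(lock_dir)
        print(first[0], second[0])
        release_cuse_index(first[1])
        release_cuse_index(second[1])
```

Because the lock is tied to an open file descriptor, it is released automatically if the holding process exits, so a crashed instance does not block its index permanently.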

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474829 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>

(cherry picked from commit 46316bb5db)
Change-Id: If744ac23f813bd992efb80ae2b61a1acefb5054c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478353
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
50d13ab3e0 lib/nvme: remove device name parameter from nvme cuse
This patch removes the possibility of setting the cuse device path. Instead,
the "/dev/spdk/nvme*" path is used.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474598 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>

(cherry picked from commit b7b45bc7bc)
Change-Id: I7c3087772a3661eebe03fce21356c35cc8204b49
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478352
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Zawadzki
79006b9e56 blob: fix sequentially allocated clusters starting from 0
When serializing extents, run-length encoding is supposed to
1) RLE all sequential LBAs
2) RLE zero LBAs (unallocated)

There is one special case: sequential LBAs that start
with LBA 0. This is encoded as case 1), but results in a descriptor
matching case 2), which causes loss of allocated clusters.

This requires the following conditions to be met:
- blobstore has just a single cluster reserved for MD
- blob is thin provisioned
- first allocation occurs on cluster_num=1

For the last part to be true, the very first write to the blob has to be
issued to an LBA between cluster_size and 2*cluster_size,
causing allocation of the second cluster in the blobstore and assigning
it an LBA equal to the number of LBAs per cluster.

To fix this, case 1) no longer run-length encodes zero LBAs.
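
The ambiguity can be reproduced with a toy encoder. This is a sketch under stated assumptions, not the blobstore code: a blob is modelled as a list of per-cluster start LBAs where 0 means unallocated, each run is a (start_lba, cluster_count) pair, and encode_extents() and decode_extents() are hypothetical names. The allow_zero_in_sequence flag switches between the behaviour described as broken and the fixed one.

```python
def encode_extents(cluster_lbas, lba_per_cluster, allow_zero_in_sequence=False):
    """Run-length encode per-cluster start LBAs; LBA 0 means unallocated."""
    runs = []
    i = 0
    while i < len(cluster_lbas):
        start = cluster_lbas[i]
        length = 1
        while i + length < len(cluster_lbas):
            nxt = cluster_lbas[i + length]
            sequential = nxt == start + length * lba_per_cluster
            if start == 0:
                # Case 2: a run of unallocated clusters. With the flag set,
                # a run starting at 0 is also extended as "sequential",
                # which is the problematic behaviour.
                extend = nxt == 0 or (allow_zero_in_sequence and sequential)
            else:
                # Case 1: a run of sequential allocated clusters.
                extend = sequential
            if not extend:
                break
            length += 1
        runs.append((start, length))
        i += length
    return runs


def decode_extents(runs, lba_per_cluster):
    """Expand runs; a run starting at LBA 0 decodes as unallocated clusters."""
    clusters = []
    for start, length in runs:
        for k in range(length):
            clusters.append(0 if start == 0 else start + k * lba_per_cluster)
    return clusters
```

With 8 LBAs per cluster, the cluster list [0, 8, 16] encodes to a single run (0, 3) when zeroes are allowed in a sequential run, and that run decodes as three unallocated clusters. With the flag off, it encodes to (0, 1) followed by (8, 2) and round-trips correctly.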

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475494 (master)

(cherry picked from commit 0d1aa0252d)
Change-Id: I136282407966310c882ca97c960e9a71c442c469
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478351
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
bf3d670796 lib/nvme: stop all NVMe io producers on detach
Now all registered producers should be stopped (unregistered) before
NVMe detach, otherwise NVMe controller cannot be safely detached.

This patch makes it possible to stop all io producers that have not been
unregistered before NVMe detach:

1. Add a callback to "struct nvme_io_msg_producer" to stop a producer
   started on the selected controller.
2. In nvme_io_msg_ctrlr_detach(), if some producers have not been unregistered,
   stop them all before freeing resources.

This approach also fixes the issue of the CUSE device not being stopped when
the NVMe controller is detached without unregistering the producer (github
issue #1033).

	Fixes #1033

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474273 (master)

(cherry picked from commit fd2af7afa9)
Change-Id: Ia1ffef566bb745edb55c54d6786ea481a35bbefd
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Ben Walker
1b0b8d04f8 env: Check supported iommu address width before using iova-mode=va
DPDK by default guesses that it should be using iova-mode=va
so that it can support running as an unprivileged user. However,
some systems (especially virtual machines) don't have an IOMMU capable
of handling the full virtual address space and DPDK doesn't
currently catch that. Add a check in SPDK and force iova-mode=pa
here.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475149 (master)

(cherry picked from commit 97b0f7733f)
Change-Id: Ib3a5691a584190feaab4b9064b5a500e361328f2
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478349
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
848927d96a lib/nvme: cuse device avoid using signals
This patch uses low-level fuse functions to process messages to
eliminate the need to use signals to interrupt blocking read
operation in fuse_session_loop().

  Fixes #1032

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473233 (master)

(cherry picked from commit 88808c5ab7)
Change-Id: Ie9c9ea76cc135c383f5757864aa2d84ac9eb3da3
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Ben Walker
1c8d673f75 env: Force iova-mode=pa on ppc
In DPDK, the ppc iommu support does not currently allow for
iova-mode=va, but DPDK doesn't detect ppc and so still attempts
to guess iova-mode=va in some modes. Force iova-mode=pa from
SPDK to fix this.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475148 (master)

(cherry picked from commit 07ca02210a)
Change-Id: I6a1ee25ab74873826ac211c3e0dfdf54afc74502
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478347
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
277cb377d9 lib/nvme: change api for io message
API changes in this patch:

 1) nvme_io_msg_ctrlr_start                         => nvme_io_msg_ctrlr_register
 2) nvme_io_msg_ctrlr_stop with (shutdown == false) => nvme_io_msg_ctrlr_unregister
 3) nvme_io_msg_ctrlr_stop with (shutdown == true)  => nvme_io_msg_ctrlr_detach

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474096 (master)

(cherry picked from commit 9eb0ffa90c)
Change-Id: I60153ebbfb0d0b22575128d106f9333c3887213d
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478346
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
44e398469d bdev/nvme: fix handle error on rpc cuse register
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473711 (master)

(cherry picked from commit b078bc8e42)
Change-Id: Ie746af29026bb6f9fdbcb67fb454a6eb8b9bec11
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478345
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2019-12-23 08:04:56 +00:00
Tomasz Kulasek
9cd5302810 lib/nvme: fix do not use external_io_msg_qpair after free
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473282 (master)

(cherry picked from commit 53184430a5)
Change-Id: I20ef8303c2fae6abf43d15ebb025ea368c0dfd67
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478585
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Jim Harris
673fe94f7e nvme: don't monitor hotplug events in secondary process
NVMe hotplug must be monitored in the primary process -
DPDK doesn't support trying to handle it in the
secondary process.

This issue was somewhat masked previously in secondary
processes, since usually it would just probe(NULL) which
meant probe all attached NVMe controllers.  So in the
secondary process, we would probe just once, and create
the hotplug fd - it would never actually try to monitor
it.

But when explicitly specifying multiple trids in a
secondary process, probe would get called multiple
times.  First time would be fine since it only creates
the hotplug fd.  But second time would segfault since
monitoring for hotplug requires checking the DPDK-allocated
context which doesn't exist in the secondary process.

Fixes issue #1063.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475015 (master)

(cherry picked from commit c3aaaa0181)
Change-Id: I2a9a91e222c206034293d90e30e3f598c8d7baa8
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478344
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-23 08:04:56 +00:00
Jim Harris
863814e60d nvme: add g_ prefix to hotplug_fd
Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475014 (master)

(cherry picked from commit 27e88b8d91)
Change-Id: I8cc03e1a8b5d2eb28bf945115f3c9b3980b30f1c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478343
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-12-23 08:04:56 +00:00
Michael Haeuptle
da657d4383 json: increase the json rpc client value limit
The jsonrpc client has a limit of 1024 JSON values per
request which is hardly enough for any meaningful config.
For example, calling getbdevs for 24 NVMe drives requires
~2300 JSON values.
I kept the original 1024 limit for the RPC server where
it makes sense to have a smaller limit and introduced
a separate limit for the client.

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473568 (master)

(cherry picked from commit 92df995526)
Change-Id: Id0300991b76151e4003e323f5ea29bc5fc0d2d11
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478342
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2019-12-23 08:04:56 +00:00
Jim Harris
90c4efcb63 thread: fix set-but-unused warning
In release builds, the assert() is compiled out, making
it look like the rc value is never referenced after it's
set.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473472 (master)

(cherry picked from commit 750f2b4b3d)
Change-Id: I59305b0e928f2044146e30b7addc86f81e7a1d3f
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478584
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-12-23 08:04:56 +00:00
Jim Harris
a2c3412dd8 test: add additional asan suppressions related to fio
We are already suppressing fio (not SPDK fio_plugin)
leaks in a couple of other places, which could likely
be causing the indirect leaks we are now going to
suppress here.

Fixes issue #1003.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473495 (master)

(cherry picked from commit 2be2b6eba5)
Change-Id: Ie5283280495e7155cda1a93d2bd3d48ffbb6cba7
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478583
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Jim Harris
1cbc41021c thread: return int from spdk_thread_send_msg
This at least allows the caller to know there was a
problem, and that the message wasn't actually sent.

SPDK by default creates huge rings so this problem
should never occur, but out-of-tree use cases may
send messages much more often and require at least
a notification when it fails.

While here, change the thread check to an assert.
There's no need to work around someone calling
this function with a null thread parameter.
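
The change in contract can be sketched with a bounded queue. This is a minimal model, not the SPDK API: Thread, thread_send_msg() and thread_poll() are hypothetical names, and the use of -ENOMEM for a full ring is an arbitrary choice for the example, since the commit message does not name the error codes.

```python
import errno
from collections import deque


class Thread:
    """Toy thread object with a bounded message ring."""

    def __init__(self, ring_size):
        self.ring_size = ring_size
        self.messages = deque()


def thread_send_msg(thread, fn, ctx):
    """Queue fn(ctx) for the thread; return 0 or a negative errno value."""
    # A missing thread is a programming error, so assert instead of
    # silently working around it.
    assert thread is not None
    if len(thread.messages) >= thread.ring_size:
        return -errno.ENOMEM  # ring full: the message was not sent
    thread.messages.append((fn, ctx))
    return 0


def thread_poll(thread):
    """Run all queued messages and return how many were executed."""
    count = 0
    while thread.messages:
        fn, ctx = thread.messages.popleft()
        fn(ctx)
        count += 1
    return count
```

A caller that sends messages at a high rate can now check the return value and retry or back off, instead of losing the message without any notification.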

Fixes issue #811.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472438 (master)

(cherry picked from commit 4036f95bf8)
Change-Id: Ie6d432d616be45c7a4232aff1548cef198702bc0
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478442
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-23 08:04:56 +00:00
Jim Harris
58da6e7000 nvme: don't enable adminq until we know discovery_ctrlr exists
Fixes issue #1029.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473237 (master)

(cherry picked from commit e0a0f90b0f)
Change-Id: I489dfc853804b005d385b1c51815f0e7f342b39b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478341
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-23 08:04:56 +00:00
Jim Harris
a60e966556 test/raid: remove unused spdk_thread_send_msg stub
Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472437 (master)

(cherry picked from commit e58deb0257)
Change-Id: I7fc128a82b3d1d1f780c1c396644f331306de600
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478441
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-23 08:04:56 +00:00
Tomasz Zawadzki
e660235c9d SPDK 19.10
Change-Id: Ib2ae8d8d95152dac0c8a9180e3f479fef5d882e0
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472777
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 21:33:05 +00:00
Tomasz Zawadzki
41de7a4e1c changelog: add RPC rename entry
Added info on new names of RPC.

Moved rest of the RPC section to the top to accompany it.

Change-Id: I3ee265ab2f163d0bf01b74ca87d4510835041c3a
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472990
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 20:33:03 +00:00
Shuhei Matsumoto
1c51dfdb6e lib/iscsi: Remove iSCSI task left in PDU receive process due to connection exit
Previously iSCSI task was created after allocating data buffer
and reading all data, and hence creating iSCSI task and processing
iSCSI task were not separated.

However, the recent refactoring separated PDU header handling and
PDU payload handling, and then inserted allocating data buffer and
reading data segment in the middle.

If any critical error occurs during allocating data buffer or
reading data segment, PDU payload handling is not done, and hence
created iSCSI task is left in PDU receive process.

If any critical error occurs, the current connection starts exiting
and there is no way to continue PDU receive process.

The task left in PDU receive process is never freed, and hence
LUN hotplug or exiting connection never complete.

This patch does the following:
- Consolidate freeing pre-allocated PDU to spdk_iscsi_conn_destruct()
  because this is the only path to exit connection.
- Abort SCSI task of the task left in PDU receive process if found
  when freeing pre-allocated PDU. If the task is not SCSI or Data Out,
  remove it simply.

    Fixes issue #1018.
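
The cleanup rule described in this commit message can be sketched roughly as follows. This is a minimal illustration only: `demo_conn`, `demo_pdu`, `demo_task` and `demo_conn_destruct()` are hypothetical stand-ins, not the real SPDK iSCSI structures or `spdk_iscsi_conn_destruct()`, and the "abort" is reduced to a counter where real code would abort the SCSI task.

```c
#include <stdlib.h>

/* Hypothetical types for illustration; not SPDK's real definitions. */
enum demo_task_kind { DEMO_TASK_SCSI, DEMO_TASK_DATA_OUT, DEMO_TASK_OTHER };

struct demo_task {
	enum demo_task_kind kind;
};

struct demo_pdu {
	struct demo_task *task;		/* task created during header handling */
};

struct demo_conn {
	struct demo_pdu *pdu_in_progress;	/* pre-allocated PDU being received */
	int aborted;			/* tasks that needed a SCSI abort */
	int removed;			/* tasks that were simply removed */
};

/* Single exit path: free the pre-allocated PDU and whatever task is
 * still attached to it, so nothing is left behind in the receive path. */
static void
demo_conn_destruct(struct demo_conn *conn)
{
	struct demo_pdu *pdu = conn->pdu_in_progress;

	if (pdu == NULL) {
		return;
	}
	if (pdu->task != NULL) {
		if (pdu->task->kind == DEMO_TASK_SCSI ||
		    pdu->task->kind == DEMO_TASK_DATA_OUT) {
			conn->aborted++;	/* real code would abort the SCSI task */
		} else {
			conn->removed++;
		}
		free(pdu->task);
		pdu->task = NULL;
	}
	free(pdu);
	conn->pdu_in_progress = NULL;
}
```

Because the PDU pointer is cleared at the end, calling the sketch a second time is a no-op, which mirrors the idea of having exactly one place where the connection releases the PDU.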

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472896 (master)

(cherry picked from commit cd654cc512)
Change-Id: I8a2464c446c43bf4cfb5afbc0cd78b5bdef7d080
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472988
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 19:26:02 +00:00
Shuhei Matsumoto
01988f644f Revert "lib/iscsi: Close the being hot-removed LUN even if connection is in exiting"
This reverts commit Iad6ecdc37493fa9f2d7ccab262a2c75dac2fcd48.

Both the estimated cause and the code change were wrong and didn't fix
the issue.

The next patch will fix the issue.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472895 (master)

(cherry picked from commit 832d90c1e2)
Change-Id: I00c8bb515ee39522c0e744dccfb839af15e946c4
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472987
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 19:26:02 +00:00
Ben Walker
0680344863 nvme: Document new cuse functions
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472834 (master)

(cherry picked from commit 368de579b6)
Change-Id: I2644d7909899fd7aa4e9690eec0fe20de5f17289
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472964
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-31 19:26:02 +00:00
Ben Walker
5a472f2779 nvme/cuse: Poll the io_msg queue when the admin queue is polled
Users already have to poll the admin queue, so embed the io_msg
queue polling there to simplify the API.
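
A rough sketch of the idea, assuming a hypothetical `cuse_demo_ctrlr` with two counters in place of real queues (none of these names are SPDK APIs): the admin poll function that callers already invoke also drains the io_msg queue, so no second polling call is needed.

```c
/* Hypothetical controller with two work sources; not the SPDK API. */
struct cuse_demo_ctrlr {
	int admin_pending;	/* admin completions waiting to be reaped */
	int io_msg_pending;	/* messages waiting on the io_msg queue */
};

/* Drain the message queue and report how many messages were handled. */
static int
cuse_demo_process_io_msgs(struct cuse_demo_ctrlr *ctrlr)
{
	int n = ctrlr->io_msg_pending;

	ctrlr->io_msg_pending = 0;
	return n;
}

/* The one entry point users already call. It reaps admin completions
 * and, as a side effect, polls the io_msg queue as well.
 * Returns the number of admin completions processed. */
static int
cuse_demo_process_admin_completions(struct cuse_demo_ctrlr *ctrlr)
{
	int n = ctrlr->admin_pending;

	ctrlr->admin_pending = 0;
	(void)cuse_demo_process_io_msgs(ctrlr);
	return n;
}
```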

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472833 (master)

(cherry picked from commit 11739f3cb1)
Change-Id: I4d4d3be100be0798bee4096e0bbda96e20d2405e
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472963
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 19:26:02 +00:00
Tomasz Zawadzki
43890b3c4c doc/nvme: nvme character device documentation
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468827 (master)

(cherry picked from commit e9b5bef8d4)
Change-Id: Ieab32f3e7aca103a270d88329d4df5fc85302795
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472962
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-31 19:26:02 +00:00
Tomasz Zawadzki
df1fead54c dpdk: update submodule to include vhost compile fix
DPDK submodule is now updated to include:
https://review.gerrithub.io/c/spdk/dpdk/+/472258
"b5c9624: vhost: fix compile error"

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472773 (master)

(cherry picked from commit ab1faf3379)
Change-Id: I898ba6cd3d71874f1d55d363c9efe5263d944562
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472961
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 19:26:02 +00:00
Seth Howell
ec6de131f7 nvme: don't disconnect qpairs from admin thread.
Disconnecting qpairs from the admin thread during a reset led to an
inevitable race with the data thread. QP related memory is freed during
the disconnect and cannot be touched from the other threads.

The only way to fix this is to force the qpair disconnect onto the
data thread.

This requires a small change in the way that resets are handled for
pcie. Please see the code in reset.c for that change.

fixes: bb01a089
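
One common way to force work onto the owning thread is a flag that the admin thread sets and the data thread acts on during its next poll. The sketch below assumes that approach; it is not taken from the patch itself, and `demo_qp`, `demo_reset_mark()` and `demo_qp_poll()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical qpair; not SPDK's struct spdk_nvme_qpair. */
struct demo_qp {
	atomic_bool needs_disconnect;	/* written by the admin thread */
	void *resources;		/* touched only by the data thread */
};

/* Admin thread (reset path): flag the qpair, never free its memory. */
static void
demo_reset_mark(struct demo_qp *qp)
{
	atomic_store(&qp->needs_disconnect, true);
}

/* Data thread poll: performs the disconnect itself when flagged.
 * Returns -1 if the qpair was disconnected during this call, else 0. */
static int
demo_qp_poll(struct demo_qp *qp)
{
	if (atomic_exchange(&qp->needs_disconnect, false)) {
		free(qp->resources);
		qp->resources = NULL;
		return -1;
	}
	return 0;
}
```

With this split, qpair memory is only ever freed on the thread that also reads it, so the reset path cannot race with in-flight polling.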

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472749 (master)

(cherry picked from commit 13f30a254e)
Change-Id: I8a39e444c7cbbe85fafca42ffd040e929721ce95
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472960
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 19:26:02 +00:00
Tomasz Zawadzki
1f737887cb changelog: added missing entries for 19.10
Added a couple of missing entries after comparing changes in public
headers in SPDK.

While here, added backticks around functions
to improve readability in MD.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472732 (master)

(cherry picked from commit 808ad5f398)
Change-Id: I3c723a2e76dc02a84e8277e0bd8db96f10ba2222
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472856
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-31 19:26:02 +00:00
Tomasz Zawadzki
b01b4a119d changelog: consolidated sections for SPDK 19.10
There were a couple of sections that were duplicated, so they
are now consolidated.

Moved around sections so that relevant ones are closer
to each other.

No change in content of section/entry was done in this patch.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472731 (master)

(cherry picked from commit cc25bd4aa9)
Change-Id: I1838d9057548c5f65f7304f783ee81e21d3b624c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472855
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-31 19:26:02 +00:00
Shuhei Matsumoto
2de916785b lib/iscsi: Close the being hot-removed LUN even if connection is in exiting
_iscsi_conn_remove_lun() which is the callback to LUN hot-removal
returns immediately without closing the LUN if the connection is
already exiting, then expects that the LUN will be closed
after the connection moves to the exited state.

LUN hot removal process doesn't check any R2T task if it is not
pending in SCSI layer but connection close process checks any R2T
task even if it is not pending in SCSI layer.

LUN hot removal will not complete until all LUN accesses are closed.

iscsi_conn_close_lun() checks if the LUN is already closed or not,
and so there is no harm even if _iscsi_conn_remove_lun() calls
iscsi_conn_close_lun(). If the connection is in exited state,
all LUNs are already closed.

This patch changes _iscsi_conn_remove_lun() to return immediately
if the connection is in exited state.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472507 (master)

(cherry picked from commit 1ef8449feb)
Change-Id: Iad6ecdc37493fa9f2d7ccab262a2c75dac2fcd48
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472776
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-30 16:57:49 +00:00
Shuhei Matsumoto
6012461966 lib/iscsi: Send R2T in SCSI Data-Out PDU Header handling
Recent patches refactored iSCSI target to separate PDU header
and payload handling. However for SCSI Data-Out PDU, the division
of roles done by refactoring was wrong. Before refactoring, LUN
hotplug was checked after sending R2T, but after refactoring LUN
hotplug is checked before sending R2T. This change stopped PDU
exchange between iSCSI initiator and target and caused timeout of
LUN removal.

This patch restores the original ordering of checking LUN hotplug
and sending R2T by changing the division of roles.

SCSI Write Command PDU handling does not have any issue related to
this.

Fixes #1004

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472308 (master)

(cherry picked from commit 84f59335c2)
Change-Id: I7b2866d8394b522fb5420d2936de2fbddc7d1daa
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472775
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-30 16:57:49 +00:00
Seth Howell
4130dd8ea5 nvme: take the lock when disconnecting qpairs.
If we disconnect qpairs without taking the lock, we run the risk of
trying to double free qpair resources before they have been marked as
NULL.
For example, consider polling and calling
nvme_rdma_qpair_disconnect from one thread while doing an
nvme_ctrlr_reset on another thread. nvme_ctrlr_reset will call down to
nvme_rdma_qpair_disconnect on the same qpair and without any locking it
can result in trying to destroy the qpair resources multiple times.
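
The double free described above can be avoided by making the check, the free, and the NULL assignment one critical section. The following is a minimal sketch using a plain pthread mutex; `lock_demo_ctrlr`, `lock_demo_qpair` and the functions below are hypothetical and do not reproduce SPDK's controller lock or `nvme_rdma_qpair_disconnect`.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not SPDK's real structures. */
struct lock_demo_qpair {
	void *resources;	/* freed on disconnect */
};

struct lock_demo_ctrlr {
	pthread_mutex_t lock;
	struct lock_demo_qpair qpair;
	int free_count;		/* counts real frees, for demonstration */
};

static void
lock_demo_ctrlr_init(struct lock_demo_ctrlr *ctrlr)
{
	pthread_mutex_init(&ctrlr->lock, NULL);
	ctrlr->qpair.resources = malloc(64);
	ctrlr->free_count = 0;
}

/* May be reached from both the polling thread and the reset path.
 * Whoever takes the lock second sees NULL and does nothing. */
static void
lock_demo_qpair_disconnect(struct lock_demo_ctrlr *ctrlr)
{
	pthread_mutex_lock(&ctrlr->lock);
	if (ctrlr->qpair.resources != NULL) {
		free(ctrlr->qpair.resources);
		ctrlr->qpair.resources = NULL;
		ctrlr->free_count++;
	}
	pthread_mutex_unlock(&ctrlr->lock);
}
```

Without the mutex, two threads could both observe a non-NULL `resources` pointer before either clears it, which is the window the commit message describes.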

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472413 (master)

(cherry picked from commit a4925ba744)
Change-Id: I9eef6f2f92961ef8e3f8ece0e4a3d54f3434cff8
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472711
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-30 16:57:49 +00:00
Konrad Sztyber
ecf2ccec7b lib/vmd: make sure pcie_cap is not NULL before dereferencing it
Fixes #1006

Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472385 (master)

(cherry picked from commit 4121477d91)
Change-Id: I761e1cbb49c09318a8d2eda9b4a2ee0bcdcebc37
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472710
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-30 16:57:49 +00:00
1279 changed files with 71695 additions and 153974 deletions

@ -30,7 +30,6 @@ fi
echo "Running make with $COMP ..."
echo "${MAKE} clean " > make.log
$MAKE clean >> make.log 2>&1
echo "${MAKE} CONFIG_DEBUG=n CONFIG_WERROR=y " >> make.log
$MAKE CONFIG_DEBUG=n CONFIG_WERROR=y >> make.log 2>&1
rc=$?
@ -76,6 +75,64 @@ fi
echo "$MAKE clean " >> make.log
$MAKE clean >> make.log 2>&1
if [ "$SYSTEM" = "FreeBSD" ]; then
echo
echo "Pushing to $1 $2"
exit $rc
fi
if ! hash clang 2>/dev/null; then
echo "clang not found; skipping the clang tests"
echo
echo "Pushing to $1 $2"
exit $rc
fi
echo "Running make with clang ..."
echo "make CONFIG_DEBUG=n CONFIG_WERROR=y CC=clang CXX=clang++ " >> make.log
$MAKE CONFIG_DEBUG=n CONFIG_WERROR=y CC=clang CXX=clang++ >> make.log 2>&1
rc=$?
if [ $rc -ne 0 ]; then
tail -20 make.log
echo ""
echo "ERROR make CC=clang CXX=clang++ returned errors!"
echo "ERROR Fix the problem and use 'git commit' to update your changes."
echo "ERROR See `pwd`/make.log for more information."
echo ""
exit $rc
fi
echo "make clean CC=clang CXX=clang++ SKIP_DPDK_BUILD=1 " >> make.log
$MAKE clean CC=clang CXX=clang++ SKIP_DPDK_BUILD=1 >> make.log 2>&1
echo "make CONFIG_DEBUG=y CONFIG_WERROR=y CC=clang CXX=clang++ SKIP_DPDK_BUILD=1 " >> make.log
$MAKE CONFIG_DEBUG=y CONFIG_WERROR=y CC=clang CXX=clang++ SKIP_DPDK_BUILD=1 >> make.log 2>&1
rc=$?
if [ $rc -ne 0 ]; then
tail -20 make.log
echo ""
echo "ERROR make CC=clang CXX=clang++ returned errors!"
echo "ERROR Fix the problem and use 'git commit' to update your changes."
echo "ERROR See `pwd`/make.log for more information."
echo ""
exit $rc
fi
echo "Running unittest.sh ..."
echo "./test/unit/unittest.sh" >> make.log
"./test/unit/unittest.sh" >> make.log 2>&1
rc=$?
if [ $rc -ne 0 ]; then
tail -20 make.log
echo ""
echo "ERROR unittest returned errors!"
echo "ERROR Fix the problem and use 'git commit' to update your changes."
echo "ERROR See `pwd`/make.log for more information."
echo ""
exit $rc
fi
${MAKE} clean CC=clang CXX=clang++ 2> /dev/null
echo "Pushing to $1 $2"
exit $rc

@ -1,8 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: SPDK Community
url: https://spdk.io/community/
about: Please ask and answer questions here.
- name: SPDK Common Vulnerabilities and Exposures (CVE) Process
url: https://spdk.io/cve_threat/
about: Please follow CVE process to responsibly disclose security vulnerabilities.

@ -1,23 +0,0 @@
---
name: CI Intermittent Failure
about: Create a report with CI failure unrelated to the patch tested.
title: '[test_name] Failure description'
labels: 'Intermittent Failure'
assignees: ''
---
<!--- Provide a [test_name] where the issue occurred and brief description in the Title above. -->
<!--- Name of the test can be found by last occurrence of: -->
<!--- ************************************ -->
<!--- START TEST [test_name] -->
<!--- ************************************ -->
## Link to the failed CI build
<!--- Please provide a link to the failed CI build -->
## Execution failed at
<!--- Please provide the first failure in the test. Pointed to by the first occurrence of: -->
<!--- ========== Backtrace start: ========== -->

@ -1,10 +0,0 @@
filters:
- true
commentBody: |
Thanks for your contribution! Unfortunately, we don't use GitHub pull
requests to manage code contributions to this repository. Instead, please
see https://spdk.io/development which provides instructions on how to
submit patches to the SPDK Gerrit instance.
addLabel: false

10
.gitignore vendored

@ -2,23 +2,17 @@
*.a
*.cmd
*.d
*.dll
*.exe
*.gcda
*.gcno
*.kdev4
*.ko
*.lib
*.log
*.o
*.obj
*.pdb
*.pyc
*.so
*.so.*
*.swp
*.DS_Store
build/
ut_coverage/
tags
cscope.out
@ -31,11 +25,7 @@ CONFIG.local
.project
.cproject
.settings
.gitreview
mk/cc.mk
mk/config.mk
mk/cc.flags.mk
PYTHON_COMMAND
test_completions.txt
timing.txt
test/common/build_config.sh

5
.gitmodules vendored

@ -1,6 +1,6 @@
[submodule "dpdk"]
path = dpdk
url = https://git.quacker.org/d/numam-dpdk.git
url = https://github.com/spdk/dpdk.git
[submodule "intel-ipsec-mb"]
path = intel-ipsec-mb
url = https://github.com/spdk/intel-ipsec-mb.git
@ -10,6 +10,3 @@
[submodule "ocf"]
path = ocf
url = https://github.com/Open-CAS/ocf.git
[submodule "libvfio-user"]
path = libvfio-user
url = https://github.com/nutanix/libvfio-user.git

File diff suppressed because it is too large

37
CONFIG

@ -43,6 +43,9 @@ CONFIG_CROSS_PREFIX=
# Build with debug logging. Turn off for performance testing and normal usage
CONFIG_DEBUG=n
# Build with support of backtrace printing in log messages. Requires libunwind.
CONFIG_LOG_BACKTRACE=n
# Treat warnings as errors (fail the build on any warning).
CONFIG_WERROR=n
@ -67,18 +70,9 @@ CONFIG_UBSAN=n
# Build with Thread Sanitizer enabled
CONFIG_TSAN=n
# Build functional tests
# Build tests
CONFIG_TESTS=y
# Build unit tests
CONFIG_UNIT_TESTS=y
# Build examples
CONFIG_EXAMPLES=y
# Build with Control-flow Enforcement Technology (CET)
CONFIG_CET=n
# Directory that contains the desired SPDK environment library.
# By default, this is implemented using DPDK.
CONFIG_ENV=
@ -87,9 +81,6 @@ CONFIG_ENV=
# installation.
CONFIG_DPDK_DIR=
# This directory should contain 'include' and 'lib' directories for WPDK.
CONFIG_WPDK_DIR=
# Build SPDK FIO plugin. Requires CONFIG_FIO_SOURCE_DIR set to a valid
# fio source code directory.
CONFIG_FIO_PLUGIN=n
@ -102,8 +93,6 @@ CONFIG_FIO_SOURCE_DIR=/usr/src/fio
# Requires ibverbs development libraries.
CONFIG_RDMA=n
CONFIG_RDMA_SEND_WITH_INVAL=n
CONFIG_RDMA_SET_ACK_TIMEOUT=n
CONFIG_RDMA_PROV=verbs
# Enable NVMe Character Devices.
CONFIG_NVME_CUSE=n
@ -119,14 +108,11 @@ CONFIG_RBD=n
# Build vhost library.
CONFIG_VHOST=y
CONFIG_VHOST_INTERNAL_LIB=n
# Build vhost initiator (Virtio) driver.
CONFIG_VIRTIO=y
# Build custom vfio-user transport for NVMf target and NVMe initiator.
CONFIG_VFIO_USER=n
CONFIG_VFIO_USER_DIR=
# Build with PMDK backends
CONFIG_PMDK=n
CONFIG_PMDK_DIR=
@ -134,6 +120,10 @@ CONFIG_PMDK_DIR=
# Enable the dependencies for building the compress vbdev
CONFIG_REDUCE=n
# Build with VPP
CONFIG_VPP=n
CONFIG_VPP_DIR=
# Requires libiscsi development libraries.
CONFIG_ISCSI_INITIATOR=n
@ -147,6 +137,9 @@ CONFIG_SHARED=n
CONFIG_VTUNE=n
CONFIG_VTUNE_DIR=
# Build the dpdk igb_uio driver
CONFIG_IGB_UIO_DRIVER=n
# Build Intel IPSEC_MB library
CONFIG_IPSEC_MB=n
@ -166,9 +159,3 @@ CONFIG_URING_PATH=
# Build with FUSE support
CONFIG_FUSE=n
# Build with RAID5 support
CONFIG_RAID5=n
# Build with IDXD support
CONFIG_IDXD=n

@ -1,28 +1,19 @@
---
name: Bug report
about: Create a report to help us improve. Please use the issue tracker only for reporting suspected issues.
title: ''
labels: 'Sighting'
assignees: ''
Please use the issue tracker only for reporting suspected issues.
---
See [The SPDK Community Page](http://www.spdk.io/community/) for other SPDK communications channels.
<!--- Provide a general summary of the issue in the Title above -->
## Expected Behavior
<!--- Tell us what should happen -->
## Current Behavior
<!--- Tell us what happens instead of the expected behavior -->
## Possible Solution
<!--- Not obligatory, but suggest a fix/reason for the bug, -->
## Steps to Reproduce
<!--- Provide a link to a live example, or an unambiguous set of steps to -->
<!--- reproduce this bug. Include code to reproduce, if relevant -->
1.
@ -31,5 +22,4 @@ assignees: ''
4.
## Context (Environment including OS version, SPDK version, etc.)
<!--- Providing context helps us come up with a solution that is most useful in the real world -->

13
LICENSE

@ -1,16 +1,3 @@
The SPDK repo contains multiple git submodules each with its own
license info. Unless otherwise noted all other code in this repo
is BSD as stated below.
Submodule license info:
dpdk: see dpdk/license
intel-ipsec-mb: see intel-ipsec-mb/LICENSE
isa-l: see isa-l/LICENSE
libvfio-user: see libvfio-user/LICENSE
ocf: see ocf/LICENSE
The rest of the SPDK repo:
BSD LICENSE
Copyright (c) Intel Corporation.

@ -2,7 +2,6 @@
# BSD LICENSE
#
# Copyright (c) Intel Corporation.
# Copyright (c) 2020, Mellanox Corporation.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@ -40,20 +39,15 @@ include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
DIRS-y += lib
DIRS-y += module
DIRS-$(CONFIG_SHARED) += shared_lib
DIRS-y += app include
DIRS-$(CONFIG_EXAMPLES) += examples
DIRS-y += test
DIRS-y += examples app include
DIRS-$(CONFIG_TESTS) += test
DIRS-$(CONFIG_IPSEC_MB) += ipsecbuild
DIRS-$(CONFIG_ISAL) += isalbuild
DIRS-$(CONFIG_VFIO_USER) += vfiouserbuild
.PHONY: all clean $(DIRS-y) include/spdk/config.h mk/config.mk \
cc_version cxx_version .libs_only_other .ldflags ldflags install \
uninstall
# Workaround for ninja. See dpdkbuild/Makefile
export MAKE_PID := $(shell echo $$PPID)
ifeq ($(SPDK_ROOT_DIR)/lib/env_dpdk,$(CONFIG_ENV))
ifeq ($(CURDIR)/dpdk/build,$(CONFIG_DPDK_DIR))
ifneq ($(SKIP_DPDK_BUILD),1)
@ -63,13 +57,6 @@ endif
endif
endif
ifeq ($(OS),Windows)
ifeq ($(CURDIR)/wpdk/build,$(CONFIG_WPDK_DIR))
WPDK = wpdk
DIRS-y += wpdk
endif
endif
ifeq ($(CONFIG_SHARED),y)
LIB = shared_lib
else
@ -86,20 +73,9 @@ LIB += isalbuild
DPDK_DEPS += isalbuild
endif
ifeq ($(CONFIG_VFIO_USER),y)
VFIOUSERBUILD = vfiouserbuild
LIB += vfiouserbuild
endif
all: mk/cc.mk $(DIRS-y)
clean: $(DIRS-y)
$(Q)rm -f include/spdk/config.h
$(Q)rm -rf build/bin
$(Q)rm -rf build/fio
$(Q)rm -rf build/examples
$(Q)rm -rf build/include
$(Q)rm -rf build/lib/pkgconfig
$(Q)find build/lib ! -name .gitignore -type f -delete
install: all
$(Q)echo "Installed to $(DESTDIR)$(CONFIG_PREFIX)"
@ -108,11 +84,10 @@ uninstall: $(DIRS-y)
$(Q)echo "Uninstalled spdk"
ifneq ($(SKIP_DPDK_BUILD),1)
dpdkdeps $(DPDK_DEPS): $(WPDK)
dpdkbuild: $(WPDK) $(DPDK_DEPS)
dpdkbuild: $(DPDK_DEPS)
endif
lib: $(WPDK) $(DPDKBUILD) $(VFIOUSERBUILD)
lib: $(DPDKBUILD)
module: lib
shared_lib: module
app: $(LIB)
@ -121,23 +96,18 @@ examples: $(LIB)
pkgdep:
sh ./scripts/pkgdep.sh
$(DIRS-y): mk/cc.mk build_dir include/spdk/config.h
$(DIRS-y): include/spdk/config.h
mk/cc.mk:
$(Q)echo "Please run configure prior to make"
false
build_dir: mk/cc.mk
$(Q)mkdir -p build/lib/pkgconfig/tmp
$(Q)mkdir -p build/bin
$(Q)mkdir -p build/fio
$(Q)mkdir -p build/examples
$(Q)mkdir -p build/include/spdk
include/spdk/config.h: mk/config.mk scripts/genconfig.py
$(Q)echo "#ifndef SPDK_CONFIG_H" > $@.tmp; \
$(Q)PYCMD=$$(cat PYTHON_COMMAND 2>/dev/null) ; \
test -z "$$PYCMD" && PYCMD=python ; \
echo "#ifndef SPDK_CONFIG_H" > $@.tmp; \
echo "#define SPDK_CONFIG_H" >> $@.tmp; \
scripts/genconfig.py $(MAKEFLAGS) >> $@.tmp; \
$$PYCMD scripts/genconfig.py $(MAKEFLAGS) >> $@.tmp; \
echo "#endif /* SPDK_CONFIG_H */" >> $@.tmp; \
cmp -s $@.tmp $@ || mv $@.tmp $@ ; \
rm -f $@.tmp

@ -10,7 +10,6 @@ interrupts, which avoids kernel context switches and eliminates interrupt
handling overhead.
The development kit currently includes:
* [NVMe driver](http://www.spdk.io/doc/nvme.html)
* [I/OAT (DMA engine) driver](http://www.spdk.io/doc/ioat.html)
* [NVMe over Fabrics target](http://www.spdk.io/doc/nvmf.html)
@ -18,7 +17,7 @@ The development kit currently includes:
* [vhost target](http://www.spdk.io/doc/vhost.html)
* [Virtio-SCSI driver](http://www.spdk.io/doc/virtio.html)
# In this readme
# In this readme:
* [Documentation](#documentation)
* [Prerequisites](#prerequisites)
@ -26,7 +25,6 @@ The development kit currently includes:
* [Build](#libraries)
* [Unit Tests](#tests)
* [Vagrant](#vagrant)
* [AWS](#aws)
* [Advanced Build Options](#advanced)
* [Shared libraries](#shared)
* [Hugepages and Device Binding](#huge)
@ -53,9 +51,6 @@ git submodule update --init
## Prerequisites
The dependencies can be installed automatically by `scripts/pkgdep.sh`.
The `scripts/pkgdep.sh` script will automatically install the bare minimum
dependencies required to build SPDK.
Use `--help` to see information on installing dependencies for optional components
~~~{.sh}
./scripts/pkgdep.sh
@ -97,23 +92,14 @@ success or failure.
A [Vagrant](https://www.vagrantup.com/downloads.html) setup is also provided
to create a Linux VM with a virtual NVMe controller to get up and running
quickly. Currently this has been tested on MacOS, Ubuntu 16.04.2 LTS and
Ubuntu 18.04.3 LTS with the VirtualBox and Libvirt provider.
The [VirtualBox Extension Pack](https://www.virtualbox.org/wiki/Downloads)
or [Vagrant Libvirt] (https://github.com/vagrant-libvirt/vagrant-libvirt) must
quickly. Currently this has only been tested on MacOS and Ubuntu 16.04.2 LTS
with the [VirtualBox](https://www.virtualbox.org/wiki/Downloads) provider. The
[VirtualBox Extension Pack](https://www.virtualbox.org/wiki/Downloads) must
also be installed in order to get the required NVMe support.
Details on the Vagrant setup can be found in the
[SPDK Vagrant documentation](http://spdk.io/doc/vagrant.html).
<a id="aws"></a>
## AWS
The following setup is known to work on AWS:
Image: Ubuntu 18.04
Before running `setup.sh`, run `modprobe vfio-pci`
then: `DRIVER_OVERRIDE=vfio-pci ./setup.sh`
<a id="advanced"></a>
## Advanced Build Options
@ -186,20 +172,16 @@ of the SPDK static ones.
In order to start a SPDK app linked with SPDK shared libraries, make sure
to do the following steps:
- run ldconfig specifying the directory containing SPDK shared libraries
- provide proper `LD_LIBRARY_PATH`
If DPDK shared libraries are used, you may also need to add DPDK shared
libraries to `LD_LIBRARY_PATH`
Linux:
~~~{.sh}
./configure --with-shared
make
ldconfig -v -n ./build/lib
LD_LIBRARY_PATH=./build/lib/:./dpdk/build/lib/ ./build/bin/spdk_tgt
LD_LIBRARY_PATH=./build/lib/ ./app/spdk_tgt/spdk_tgt
~~~
<a id="huge"></a>

@ -41,13 +41,8 @@ DIRS-y += iscsi_top
DIRS-y += iscsi_tgt
DIRS-y += spdk_tgt
DIRS-y += spdk_lspci
ifneq ($(OS),Windows)
# TODO - currently disabled on Windows due to lack of support for curses
DIRS-y += spdk_top
endif
ifeq ($(OS),Linux)
DIRS-$(CONFIG_VHOST) += vhost
DIRS-y += spdk_dd
endif
.PHONY: all clean $(DIRS-y)

@ -43,14 +43,13 @@ CFLAGS += -I$(SPDK_ROOT_DIR)/lib
C_SRCS := iscsi_tgt.c
SPDK_LIB_LIST = $(ALL_MODULES_LIST) event_iscsi event_net
ifeq ($(SPDK_ROOT_DIR)/lib/env_dpdk,$(CONFIG_ENV))
SPDK_LIB_LIST += env_dpdk_rpc
endif
SPDK_LIB_LIST = $(ALL_MODULES_LIST)
SPDK_LIB_LIST += event_bdev event_copy event_iscsi event_net event_scsi event_vmd event
SPDK_LIB_LIST += jsonrpc json rpc bdev_rpc bdev iscsi scsi copy trace conf
SPDK_LIB_LIST += thread util log log_rpc app_rpc net sock notify
ifeq ($(OS),Linux)
SPDK_LIB_LIST += event_nbd
SPDK_LIB_LIST += event_nbd nbd
endif
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -41,6 +41,21 @@
static int g_daemon_mode = 0;
static void
spdk_sigusr1(int signo __attribute__((__unused__)))
{
char *config_str = NULL;
if (spdk_app_get_running_config(&config_str, "iscsi.conf") < 0) {
fprintf(stderr, "Error getting config\n");
} else {
fprintf(stdout, "============================\n");
fprintf(stdout, " iSCSI target running config\n");
fprintf(stdout, "=============================\n");
fprintf(stdout, "%s", config_str);
}
free(config_str);
}
static void
iscsi_usage(void)
{
@ -75,7 +90,7 @@ main(int argc, char **argv)
int rc;
struct spdk_app_opts opts = {};
spdk_app_opts_init(&opts, sizeof(opts));
spdk_app_opts_init(&opts);
opts.name = "iscsi";
if ((rc = spdk_app_parse_args(argc, argv, &opts, "b", NULL,
iscsi_parse_arg, iscsi_usage)) !=
@ -91,6 +106,7 @@ main(int argc, char **argv)
}
opts.shutdown_cb = NULL;
opts.usr1_handler = spdk_sigusr1;
/* Blocks until the application is exiting */
rc = spdk_app_start(&opts, spdk_startup, NULL);

@ -33,14 +33,21 @@
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
include $(SPDK_ROOT_DIR)/mk/spdk.app_cxx.mk
CXXFLAGS += $(ENV_CXXFLAGS)
CXXFLAGS += -I$(SPDK_ROOT_DIR)/lib
CXX_SRCS = iscsi_top.cpp
APP = iscsi_top
SPDK_LIB_LIST = rpc
all: $(APP)
@:
CFLAGS += -I$(SPDK_ROOT_DIR)/lib
$(APP) : $(OBJS)
$(LINK_CXX)
C_SRCS := iscsi_top.c
clean:
$(CLEAN_C) $(APP)
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk
include $(SPDK_ROOT_DIR)/mk/spdk.deps.mk

@ -33,106 +33,93 @@
#include "spdk/stdinc.h"
#include "spdk/event.h"
#include "spdk/jsonrpc.h"
#include "spdk/rpc.h"
#include "spdk/string.h"
#include "spdk/trace.h"
#include "spdk/util.h"
#include <algorithm>
#include <map>
#include <vector>
extern "C" {
#include "spdk/trace.h"
#include "iscsi/conn.h"
}
static char *exe_name;
static int g_shm_id = 0;
struct spdk_jsonrpc_client *g_rpc_client;
static void usage(void)
{
fprintf(stderr, "usage:\n");
fprintf(stderr, " %s <option>\n", exe_name);
fprintf(stderr, " option = '-i' to specify the shared memory ID,"
" (required)\n");
fprintf(stderr, " -r <path> RPC listen address (default: /var/tmp/spdk.sock\n");
}
struct rpc_conn_info {
uint32_t id;
uint32_t cid;
uint32_t tsih;
uint32_t lcore_id;
char *initiator_addr;
char *target_addr;
char *target_node_name;
};
static struct rpc_conn_info g_conn_info[1024];
static const struct spdk_json_object_decoder rpc_conn_info_decoders[] = {
{"id", offsetof(struct rpc_conn_info, id), spdk_json_decode_uint32},
{"cid", offsetof(struct rpc_conn_info, cid), spdk_json_decode_uint32},
{"tsih", offsetof(struct rpc_conn_info, tsih), spdk_json_decode_uint32},
{"lcore_id", offsetof(struct rpc_conn_info, lcore_id), spdk_json_decode_uint32},
{"initiator_addr", offsetof(struct rpc_conn_info, initiator_addr), spdk_json_decode_string},
{"target_addr", offsetof(struct rpc_conn_info, target_addr), spdk_json_decode_string},
{"target_node_name", offsetof(struct rpc_conn_info, target_node_name), spdk_json_decode_string},
};
static int
rpc_decode_conn_object(const struct spdk_json_val *val, void *out)
/* Group by poll group */
static bool
conns_compare(struct spdk_iscsi_conn *first, struct spdk_iscsi_conn *second)
{
struct rpc_conn_info *info = (struct rpc_conn_info *)out;
if ((uintptr_t)first->pg < (uintptr_t)second->pg) {
return true;
}
return spdk_json_decode_object(val, rpc_conn_info_decoders,
SPDK_COUNTOF(rpc_conn_info_decoders), info);
if ((uintptr_t)first->pg > (uintptr_t)second->pg) {
return false;
}
if (first->id < second->id) {
return true;
}
return false;
}
static void
print_connections(void)
{
struct spdk_jsonrpc_client_response *json_resp = NULL;
struct spdk_json_write_ctx *w;
struct spdk_jsonrpc_client_request *request;
int rc;
size_t conn_count, i;
struct rpc_conn_info *conn;
std::vector<struct spdk_iscsi_conn *> v;
std::vector<struct spdk_iscsi_conn *>::iterator iter;
size_t conns_size;
struct spdk_iscsi_conn *conns, *conn;
void *conns_ptr;
int fd, i;
char shm_name[64];
request = spdk_jsonrpc_client_create_request();
if (request == NULL) {
return;
snprintf(shm_name, sizeof(shm_name), "/spdk_iscsi_conns.%d", g_shm_id);
fd = shm_open(shm_name, O_RDONLY, 0600);
if (fd < 0) {
fprintf(stderr, "Cannot open shared memory: %s\n", shm_name);
usage();
exit(1);
}
w = spdk_jsonrpc_begin_request(request, 1, "iscsi_get_connections");
spdk_jsonrpc_end_request(request, w);
spdk_jsonrpc_client_send_request(g_rpc_client, request);
conns_size = sizeof(*conns) * MAX_ISCSI_CONNECTIONS;
do {
rc = spdk_jsonrpc_client_poll(g_rpc_client, 1);
} while (rc == 0 || rc == -ENOTCONN);
if (rc <= 0) {
goto end;
conns_ptr = mmap(NULL, conns_size, PROT_READ, MAP_SHARED, fd, 0);
if (conns_ptr == MAP_FAILED) {
fprintf(stderr, "Cannot mmap shared memory (%d)\n", errno);
exit(1);
}
json_resp = spdk_jsonrpc_client_get_response(g_rpc_client);
if (json_resp == NULL) {
goto end;
conns = (struct spdk_iscsi_conn *)conns_ptr;
for (i = 0; i < MAX_ISCSI_CONNECTIONS; i++) {
if (!conns[i].is_valid) {
continue;
}
v.push_back(&conns[i]);
}
if (spdk_json_decode_array(json_resp->result, rpc_decode_conn_object, g_conn_info,
SPDK_COUNTOF(g_conn_info), &conn_count, sizeof(struct rpc_conn_info))) {
goto end;
stable_sort(v.begin(), v.end(), conns_compare);
for (iter = v.begin(); iter != v.end(); iter++) {
conn = *iter;
printf("pg %p conn %3d T:%-8s I:%s (%s)\n",
conn->pg, conn->id,
conn->target_short_name, conn->initiator_name,
conn->initiator_addr);
}
for (i = 0; i < conn_count; i++) {
conn = &g_conn_info[i];
printf("Connection: %u CID: %u TSIH: %u Initiator Address: %s Target Address: %s Target Node Name: %s\n",
conn->id, conn->cid, conn->tsih, conn->initiator_addr, conn->target_addr, conn->target_node_name);
}
end:
spdk_jsonrpc_client_free_request(request);
printf("\n");
munmap(conns, conns_size);
close(fd);
}
int main(int argc, char **argv)
@ -140,7 +127,6 @@ int main(int argc, char **argv)
void *history_ptr;
struct spdk_trace_histories *histories;
struct spdk_trace_history *history;
const char *rpc_socket_path = SPDK_DEFAULT_RPC_ADDR;
uint64_t tasks_done, last_tasks_done[SPDK_TRACE_MAX_LCORE];
int delay, old_delay, history_fd, i, quit, rc;
@ -154,13 +140,10 @@ int main(int argc, char **argv)
int op;
exe_name = argv[0];
while ((op = getopt(argc, argv, "i:r:")) != -1) {
while ((op = getopt(argc, argv, "i:")) != -1) {
switch (op) {
case 'i':
g_shm_id = spdk_strtol(optarg, 10);
break;
case 'r':
rpc_socket_path = optarg;
g_shm_id = atoi(optarg);
break;
default:
usage();
@ -168,12 +151,6 @@ int main(int argc, char **argv)
}
}
g_rpc_client = spdk_jsonrpc_client_connect(rpc_socket_path, AF_UNIX);
if (!g_rpc_client) {
fprintf(stderr, "spdk_jsonrpc_client_connect() failed: %d\n", errno);
return 1;
}
snprintf(spdk_trace_shm_name, sizeof(spdk_trace_shm_name), "/iscsi_trace.%d", g_shm_id);
history_fd = shm_open(spdk_trace_shm_name, O_RDONLY, 0600);
if (history_fd < 0) {
@ -271,7 +248,5 @@ cleanup:
munmap(history_ptr, sizeof(*histories));
close(history_fd);
spdk_jsonrpc_client_close(g_rpc_client);
return (0);
}

@ -39,14 +39,20 @@ APP = nvmf_tgt
C_SRCS := nvmf_main.c
SPDK_LIB_LIST = $(ALL_MODULES_LIST) event_nvmf
ifeq ($(SPDK_ROOT_DIR)/lib/env_dpdk,$(CONFIG_ENV))
SPDK_LIB_LIST += env_dpdk_rpc
endif
SPDK_LIB_LIST = $(ALL_MODULES_LIST)
SPDK_LIB_LIST += event_bdev event_copy event_nvmf event_net event_vmd
SPDK_LIB_LIST += nvmf event log trace conf thread util bdev copy rpc jsonrpc json net sock
SPDK_LIB_LIST += app_rpc log_rpc bdev_rpc notify
ifeq ($(OS),Linux)
SPDK_LIB_LIST += event_nbd
SPDK_LIB_LIST += event_nbd nbd
endif
ifeq ($(CONFIG_FC),y)
ifneq ($(strip $(CONFIG_FC_PATH)),)
SYS_LIBS += -L$(CONFIG_FC_PATH)
endif
SYS_LIBS += -lufc
endif
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -63,7 +63,7 @@ main(int argc, char **argv)
struct spdk_app_opts opts = {};
/* default value in opts */
spdk_app_opts_init(&opts, sizeof(opts));
spdk_app_opts_init(&opts);
opts.name = "nvmf";
if ((rc = spdk_app_parse_args(argc, argv, &opts, "", NULL,
nvmf_parse_arg, nvmf_usage)) !=

@ -1 +0,0 @@
spdk_dd

@ -1,44 +0,0 @@
#
# BSD LICENSE
#
# Copyright (c) Intel Corporation.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
# * Neither the name of Intel Corporation nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
APP = spdk_dd
C_SRCS := spdk_dd.c
SPDK_LIB_LIST = $(ALL_MODULES_LIST) event_bdev
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

File diff suppressed because it is too large

@ -33,12 +33,11 @@
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
APP = spdk_lspci
C_SRCS := spdk_lspci.c
SPDK_LIB_LIST = $(SOCK_MODULES_LIST) nvme vmd
SPDK_LIB_LIST = vmd log
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -117,7 +117,5 @@ main(int argc, char **argv)
dev = spdk_pci_get_next_device(dev);
}
spdk_vmd_fini();
return 0;
}

@ -41,17 +41,28 @@ C_SRCS := spdk_tgt.c
SPDK_LIB_LIST = $(ALL_MODULES_LIST)
SPDK_LIB_LIST += event_iscsi event_nvmf
ifeq ($(SPDK_ROOT_DIR)/lib/env_dpdk,$(CONFIG_ENV))
SPDK_LIB_LIST += env_dpdk_rpc
ifeq ($(OS),Linux)
ifeq ($(CONFIG_VHOST),y)
SPDK_LIB_LIST += vhost event_vhost
ifeq ($(CONFIG_VHOST_INTERNAL_LIB),y)
SPDK_LIB_LIST += rte_vhost
endif
endif
endif
SPDK_LIB_LIST += event_bdev event_copy event_iscsi event_net event_scsi event_nvmf event_vmd event
SPDK_LIB_LIST += nvmf trace log conf thread util bdev iscsi scsi copy rpc jsonrpc json
SPDK_LIB_LIST += app_rpc log_rpc bdev_rpc net sock notify
ifeq ($(OS),Linux)
SPDK_LIB_LIST += event_nbd
ifeq ($(CONFIG_VHOST),y)
SPDK_LIB_LIST += event_vhost
SPDK_LIB_LIST += event_nbd nbd
endif
ifeq ($(CONFIG_FC),y)
ifneq ($(strip $(CONFIG_FC_PATH)),)
SYS_LIBS += -L$(CONFIG_FC_PATH)
endif
SYS_LIBS += -lufc
endif
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -109,7 +109,7 @@ main(int argc, char **argv)
struct spdk_app_opts opts = {};
int rc;
spdk_app_opts_init(&opts, sizeof(opts));
spdk_app_opts_init(&opts);
opts.name = "spdk_tgt";
if ((rc = spdk_app_parse_args(argc, argv, &opts, g_spdk_tgt_get_opts_string,
NULL, spdk_tgt_parse_arg, spdk_tgt_usage)) !=

@ -1 +0,0 @@
spdk_top

@ -1,44 +0,0 @@
#
# BSD LICENSE
#
# Copyright (c) Intel Corporation.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
# * Neither the name of Intel Corporation nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
APP = spdk_top
C_SRCS := spdk_top.c
SPDK_LIB_LIST = rpc
LIBS=-lncurses -lpanel -lmenu
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -1,74 +0,0 @@
Contents
========
- Overview
- Installation
- Usage
Overview
========
This application provides SPDK live statistics regarding usage of cores,
threads, pollers, execution times, and relations between those. All data
is being gathered from SPDK by calling appropriate RPC calls. Application
consists of three selectable tabs providing statistics related to three
main topics:
- Threads
- Pollers
- Cores
Installation
============
spdk_top requires the Ncurses library (can be installed by running
spdk/scripts/pkgdep.sh) and is compiled by default when SPDK compiles.
Usage
=====
To run spdk_top:
sudo spdk_top [options]
options:
-r <path> RPC listen address (optional, default: /var/tmp/spdk.sock)
-h show help message
Application consists of:
- Tabs list (on top)
- Statistics window (main window in the middle)
- Options window (below statistics window)
- Page indicator / Error status
Tabs list shows available tabs and highlights currently selected tab.
Statistics window displays current statistics. Available statistics
depend on which tab is currently selected. All time and run counter
related statistics are relative: they show elapsed time / number of runs
since the previous data refresh. The options window provides a list of
hotkeys to change application settings. Available options are:
- [q] Quit - quit the application
- [1-3] TAB selection - select tab to be displayed
- [PgUp] Previous page - go to previous page
- [PgDown] Next page - go to next page
- [c] Columns - select which columns should be visible / hidden:
Use arrow up / down and space / enter keys to select which columns
should be visible. Select 'CLOSE' to confirm changes and close
the window.
- [s] Sorting - change data sorting:
Use arrow up / down to select based on which column data should be
sorted. Use enter key to confirm or esc key to exit without
changing current sorting scheme.
- [r] Refresh rate - change data refresh rate:
Enter new data refresh rate value. Refresh rate accepts value
between 0 and 255 seconds. Use enter key to apply or escape key
to cancel.
The page indicator shows the current data page. An error status is
displayed at the bottom right of the screen when the application
encounters an error.

File diff suppressed because it is too large

@ -33,11 +33,19 @@
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
APP = spdk_trace
SPDK_NO_LINK_ENV = 1
include $(SPDK_ROOT_DIR)/mk/spdk.app_cxx.mk
CXX_SRCS := trace.cpp
include $(SPDK_ROOT_DIR)/mk/spdk.app_cxx.mk
APP = spdk_trace
all: $(APP)
@:
$(APP): $(OBJS) $(SPDK_LIBS)
$(LINK_CXX)
clean:
$(CLEAN_C) $(APP)
include $(SPDK_ROOT_DIR)/mk/spdk.deps.mk

@ -613,8 +613,6 @@ int main(int argc, char **argv)
file_name = optarg;
break;
case 'h':
usage();
exit(EXIT_SUCCESS);
default:
usage();
exit(1);

@ -39,10 +39,16 @@ APP = vhost
C_SRCS := vhost.c
SPDK_LIB_LIST = $(ALL_MODULES_LIST) event_vhost event_nbd
SPDK_LIB_LIST = $(ALL_MODULES_LIST)
SPDK_LIB_LIST += vhost event_vhost
ifeq ($(SPDK_ROOT_DIR)/lib/env_dpdk,$(CONFIG_ENV))
SPDK_LIB_LIST += env_dpdk_rpc
ifeq ($(CONFIG_VHOST_INTERNAL_LIB),y)
SPDK_LIB_LIST += rte_vhost
endif
SPDK_LIB_LIST += event_bdev event_copy event_net event_scsi event_vmd event
SPDK_LIB_LIST += jsonrpc json rpc bdev_rpc bdev scsi copy trace conf
SPDK_LIB_LIST += thread util log log_rpc app_rpc
SPDK_LIB_LIST += event_nbd nbd net sock notify
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -33,6 +33,7 @@
#include "spdk/stdinc.h"
#include "spdk/conf.h"
#include "spdk/event.h"
#include "spdk/vhost.h"
@ -88,7 +89,7 @@ main(int argc, char *argv[])
struct spdk_app_opts opts = {};
int rc;
spdk_app_opts_init(&opts, sizeof(opts));
spdk_app_opts_init(&opts);
opts.name = "vhost";
if ((rc = spdk_app_parse_args(argc, argv, &opts, "f:S:", NULL,

@ -8,34 +8,21 @@ if [[ ! -f $1 ]]; then
exit 1
fi
rootdir=$(readlink -f $(dirname $0))
source "$1"
rootdir=$(readlink -f $(dirname $0))
source "$rootdir/test/common/autotest_common.sh"
source "$rootdir/scripts/common.sh"
out=$output_dir
if [ -n "$SPDK_TEST_NATIVE_DPDK" ]; then
scanbuild_exclude=" --exclude $(dirname $SPDK_RUN_EXTERNAL_DPDK)"
else
scanbuild_exclude="--exclude $rootdir/dpdk/"
fi
scanbuild="scan-build -o $output_dir/scan-build-tmp $scanbuild_exclude --status-bugs"
config_params=$(get_config_params)
trap '[[ -d $SPDK_WORKSPACE ]] && rm -rf "$SPDK_WORKSPACE"' 0
SPDK_WORKSPACE=$(mktemp -dt "spdk_$(date +%s).XXXXXX")
export SPDK_WORKSPACE
out=$PWD
umask 022
cd $rootdir
# Print some test system info out for the log
date -u
git describe --tags
function ocf_precompile() {
if [ "$SPDK_TEST_OCF" -eq 1 ]; then
# We compile OCF sources ourselves
# They don't need to be checked with scanbuild and code coverage is not applicable
# So we precompile OCF now for further use as standalone static library
@ -44,199 +31,128 @@ function ocf_precompile() {
CC=gcc CCAR=ar $MAKE $MAKEFLAGS -C lib/env_ocf exportlib O=$rootdir/build/ocf.a
# Set config to use precompiled library
config_params="$config_params --with-ocf=/$rootdir/build/ocf.a"
# need to reconfigure to avoid clearing ocf related files on future make clean.
./configure $config_params
}
fi
function build_native_dpdk() {
local external_dpdk_dir
local external_dpdk_base_dir
./configure $config_params
external_dpdk_dir="$SPDK_RUN_EXTERNAL_DPDK"
external_dpdk_base_dir="$(dirname $external_dpdk_dir)"
# Print some test system info out for the log
echo "** START ** Info for Hostname: $HOSTNAME"
uname -a
$MAKE cc_version
$MAKE cxx_version
echo "** END ** Info for Hostname: $HOSTNAME"
if [[ ! -d "$external_dpdk_base_dir" ]]; then
sudo mkdir -p "$external_dpdk_base_dir"
sudo chown -R $(whoami) "$external_dpdk_base_dir"/..
fi
orgdir=$PWD
timing_enter autobuild
rm -rf "$external_dpdk_base_dir"
git clone --branch $SPDK_TEST_NATIVE_DPDK --depth 1 http://dpdk.org/git/dpdk "$external_dpdk_base_dir"
git -C "$external_dpdk_base_dir" log --oneline -n 5
timing_enter check_format
if [ $SPDK_RUN_CHECK_FORMAT -eq 1 ]; then
./scripts/check_format.sh
fi
timing_exit check_format
dpdk_cflags="-fPIC -g -Werror -fcommon"
dpdk_ldflags=""
scanbuild=''
make_timing_label='make'
if [ $SPDK_RUN_SCANBUILD -eq 1 ] && hash scan-build; then
scanbuild="scan-build -o $out/scan-build-tmp --status-bugs"
make_timing_label='scanbuild_make'
report_test_completion "scanbuild"
# the drivers we use
# net/i40e driver is not really needed by us, but it's built as a workaround
# for DPDK issue: https://bugs.dpdk.org/show_bug.cgi?id=576
DPDK_DRIVERS=("bus" "bus/pci" "bus/vdev" "mempool/ring" "net/i40e" "net/i40e/base")
# all possible DPDK drivers
DPDK_ALL_DRIVERS=($(find "$external_dpdk_base_dir/drivers" -mindepth 1 -type d | sed -n "s#^$external_dpdk_base_dir/drivers/##p"))
fi
if [[ "$SPDK_TEST_CRYPTO" -eq 1 ]]; then
git clone --branch v0.54 --depth 1 https://github.com/intel/intel-ipsec-mb.git "$external_dpdk_base_dir/intel-ipsec-mb"
cd "$external_dpdk_base_dir/intel-ipsec-mb"
$MAKE $MAKEFLAGS all SHARED=y EXTRA_CFLAGS=-fPIC
DPDK_DRIVERS+=("crypto")
DPDK_DRIVERS+=("crypto/aesni_mb")
DPDK_DRIVERS+=("crypto/qat")
DPDK_DRIVERS+=("compress/qat")
DPDK_DRIVERS+=("common/qat")
dpdk_cflags+=" -I$external_dpdk_base_dir/intel-ipsec-mb"
dpdk_ldflags+=" -L$external_dpdk_base_dir/intel-ipsec-mb"
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$external_dpdk_base_dir/intel-ipsec-mb
fi
if [ $SPDK_RUN_VALGRIND -eq 1 ]; then
report_test_completion "valgrind"
fi
if [[ "$SPDK_TEST_REDUCE" -eq 1 ]]; then
isal_dir="$external_dpdk_base_dir/isa-l"
git clone --branch v2.29.0 --depth 1 https://github.com/intel/isa-l.git "$isal_dir"
if [ $SPDK_RUN_ASAN -eq 1 ]; then
report_test_completion "asan"
fi
cd $isal_dir
./autogen.sh
./configure CFLAGS="-fPIC -g -O2" --enable-shared=yes --prefix="$isal_dir/build"
ln -s $PWD/include $PWD/isa-l
$MAKE $MAKEFLAGS all
$MAKE install
DPDK_DRIVERS+=("compress")
DPDK_DRIVERS+=("compress/isal")
DPDK_DRIVERS+=("compress/qat")
DPDK_DRIVERS+=("common/qat")
export PKG_CONFIG_PATH="$PKG_CONFIG_PATH:$isal_dir/build/lib/pkgconfig"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$isal_dir/build/lib"
fi
if [ $SPDK_RUN_UBSAN -eq 1 ]; then
report_test_completion "ubsan"
fi
# Use difference between DPDK_ALL_DRIVERS and DPDK_DRIVERS as a set of DPDK drivers we don't want or
# don't need to build.
DPDK_DISABLED_DRIVERS=($(sort <(printf "%s\n" "${DPDK_DRIVERS[@]}") <(printf "%s\n" "${DPDK_ALL_DRIVERS[@]}") | uniq -u))
echo $scanbuild
cd $external_dpdk_base_dir
if [ "$(uname -s)" = "Linux" ]; then
dpdk_cflags+=" -Wno-stringop-overflow"
# Fix for freeing device if not kernel driver configured.
# TODO: Remove once this is merged in upstream DPDK
if grep "20.08.0" $external_dpdk_base_dir/VERSION; then
wget https://github.com/spdk/dpdk/commit/64f1ced13f974e8b3d46b87c361a09eca68126f9.patch -O dpdk-pci.patch
wget https://github.com/spdk/dpdk/commit/c2c273d5c8fbf673623b427f8f4ab5af5ddf0e08.patch -O dpdk-qat.patch
elif grep "20.11\|21.02" $external_dpdk_base_dir/VERSION; then
wget https://github.com/karlatec/dpdk/commit/3219c0cfc38803aec10c809dde16e013b370bda9.patch -O dpdk-pci.patch
wget https://github.com/karlatec/dpdk/commit/adf8f7638de29bc4bf9ba3faf12bbdae73acda0c.patch -O dpdk-qat.patch
else
wget https://github.com/karlatec/dpdk/commit/f95e331be3a1f856b816948990dd2afc67ea4020.patch -O dpdk-pci.patch
wget https://github.com/karlatec/dpdk/commit/6fd2fa906ffdcee04e6ce5da40e61cb841be9827.patch -O dpdk-qat.patch
fi
git config --local user.name "spdk"
git config --local user.email "nomail@all.com"
git am dpdk-pci.patch
git am dpdk-qat.patch
fi
timing_enter "$make_timing_label"
meson build-tmp --prefix="$external_dpdk_dir" --libdir lib \
-Denable_docs=false -Denable_kmods=false -Dtests=false \
-Dc_link_args="$dpdk_ldflags" -Dc_args="$dpdk_cflags" \
-Dmachine=native -Ddisable_drivers=$(printf "%s," "${DPDK_DISABLED_DRIVERS[@]}")
ninja -C "$external_dpdk_base_dir/build-tmp" $MAKEFLAGS
ninja -C "$external_dpdk_base_dir/build-tmp" $MAKEFLAGS install
$MAKE $MAKEFLAGS clean
if [ $SPDK_BUILD_SHARED_OBJECT -eq 1 ]; then
$rootdir/test/make/check_so_deps.sh
report_test_completion "shared_object_build"
fi
# Save this path. If tests are run using autorun.sh, then the autotest.sh
# script will be unaware of LD_LIBRARY_PATH and tests will fail.
echo "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH" > /tmp/spdk-ld-path
cd "$orgdir"
}
function make_fail_cleanup() {
fail=0
./configure $config_params
time $scanbuild $MAKE $MAKEFLAGS || fail=1
if [ $fail -eq 1 ]; then
if [ -d $out/scan-build-tmp ]; then
scanoutput=$(ls -1 $out/scan-build-tmp/)
mv $out/scan-build-tmp/$scanoutput $out/scan-build
rm -rf $out/scan-build-tmp
chmod -R a+rX $out/scan-build
fi
false
}
exit 1
else
rm -rf $out/scan-build-tmp
fi
timing_exit "$make_timing_label"
function scanbuild_make() {
pass=true
$scanbuild $MAKE $MAKEFLAGS > $out/build_output.txt && rm -rf $out/scan-build-tmp || make_fail_cleanup
xtrace_disable
rm -f $out/*files.txt
for ent in $(find app examples lib module test -type f | grep -vF ".h"); do
if [[ $ent == lib/env_ocf* ]]; then continue; fi
if file -bi $ent | grep -q 'text/x-c'; then
echo $ent | sed 's/\.cp\{0,2\}$//g' >> $out/all_c_files.txt
fi
done
xtrace_restore
grep -E "CC|CXX" $out/build_output.txt | sed 's/\s\s\(CC\|CXX\)\s//g' | sed 's/\.o//g' > $out/built_c_files.txt
cat $rootdir/test/common/skipped_build_files.txt >> $out/built_c_files.txt
sort -o $out/all_c_files.txt $out/all_c_files.txt
sort -o $out/built_c_files.txt $out/built_c_files.txt
# from comm manual:
# -2 suppress column 2 (lines unique to FILE2)
# -3 suppress column 3 (lines that appear in both files)
# comm may exit 1 if no lines were printed (undocumented, unreliable)
comm -2 -3 $out/all_c_files.txt $out/built_c_files.txt > $out/unbuilt_c_files.txt || true
if [ $(wc -l < $out/unbuilt_c_files.txt) -ge 1 ]; then
echo "missing files"
cat $out/unbuilt_c_files.txt
pass=false
fi
$pass
}
function porcelain_check() {
if [ $(git status --porcelain --ignore-submodules | wc -l) -ne 0 ]; then
echo "Generated files missing from .gitignore:"
git status --porcelain --ignore-submodules
exit 1
fi
}
# Check for generated files that are not listed in .gitignore
timing_enter generated_files_check
if [ $(git status --porcelain --ignore-submodules | wc -l) -ne 0 ]; then
echo "Generated files missing from .gitignore:"
git status --porcelain --ignore-submodules
exit 1
fi
timing_exit generated_files_check
# Check that header file dependencies are working correctly by
# capturing a binary's stat data before and after touching a
# header file and re-making.
function header_dependency_check() {
STAT1=$(stat $SPDK_BIN_DIR/spdk_tgt)
sleep 1
touch lib/nvme/nvme_internal.h
$MAKE $MAKEFLAGS
STAT2=$(stat $SPDK_BIN_DIR/spdk_tgt)
timing_enter dependency_check
STAT1=$(stat examples/nvme/identify/identify)
sleep 1
touch lib/nvme/nvme_internal.h
$MAKE $MAKEFLAGS
STAT2=$(stat examples/nvme/identify/identify)
if [ "$STAT1" == "$STAT2" ]; then
echo "Header dependency check failed"
false
fi
}
if [ "$STAT1" == "$STAT2" ]; then
echo "Header dependency check failed"
exit 1
fi
timing_exit dependency_check
function test_make_uninstall() {
# Create empty file to check if it is not deleted by target uninstall
touch "$SPDK_WORKSPACE/usr/lib/sample_xyz.a"
$MAKE $MAKEFLAGS uninstall DESTDIR="$SPDK_WORKSPACE" prefix=/usr
if [[ $(find "$SPDK_WORKSPACE/usr" -maxdepth 1 -mindepth 1 | wc -l) -ne 2 ]] || [[ $(find "$SPDK_WORKSPACE/usr/lib/" -maxdepth 1 -mindepth 1 | wc -l) -ne 1 ]]; then
ls -lR "$SPDK_WORKSPACE"
echo "Make uninstall failed"
exit 1
fi
}
# Test 'make install'
timing_enter make_install
rm -rf /tmp/spdk
mkdir /tmp/spdk
$MAKE $MAKEFLAGS install DESTDIR=/tmp/spdk prefix=/usr
timing_exit make_install
function build_doc() {
local doxygenv
doxygenv=$(doxygen --version)
# Test 'make uninstall'
timing_enter make_uninstall
# Create empty file to check if it is not deleted by target uninstall
touch /tmp/spdk/usr/lib/sample_xyz.a
$MAKE $MAKEFLAGS uninstall DESTDIR=/tmp/spdk prefix=/usr
if [[ $(ls -A /tmp/spdk/usr | wc -l) -ne 2 ]] || [[ $(ls -A /tmp/spdk/usr/lib/ | wc -l) -ne 1 ]]; then
ls -lR /tmp/spdk
rm -rf /tmp/spdk
echo "Make uninstall failed"
exit 1
else
rm -rf /tmp/spdk
fi
timing_exit make_uninstall
timing_enter doxygen
if [ $SPDK_BUILD_DOC -eq 1 ] && hash doxygen; then
$MAKE -C "$rootdir"/doc --no-print-directory $MAKEFLAGS &> "$out"/doxygen.log
if [ -s "$out"/doxygen.log ]; then
cat "$out"/doxygen.log
echo "Doxygen errors found!"
eq "$doxygenv" 1.8.20 || exit 1
echo "Doxygen $doxygenv detected, all warnings are potentially false positives, continuing the test"
exit 1
fi
if hash pdflatex 2> /dev/null; then
if hash pdflatex 2>/dev/null; then
$MAKE -C "$rootdir"/doc/output/latex --no-print-directory $MAKEFLAGS &>> "$out"/doxygen.log
fi
mkdir -p "$out"/doc
@ -246,58 +162,10 @@ function build_doc() {
fi
$MAKE -C "$rootdir"/doc --no-print-directory $MAKEFLAGS clean &>> "$out"/doxygen.log
if [ -s "$out"/doxygen.log ]; then
# Save the log as an artifact in case we are working with potentially broken version
eq "$doxygenv" 1.8.20 || rm "$out"/doxygen.log
rm "$out"/doxygen.log
fi
rm -rf "$rootdir"/doc/output
}
function autobuild_test_suite() {
run_test "autobuild_check_format" ./scripts/check_format.sh
run_test "autobuild_external_code" sudo -E --preserve-env=PATH LD_LIBRARY_PATH=$LD_LIBRARY_PATH $rootdir/test/external_code/test_make.sh $rootdir
if [ "$SPDK_TEST_OCF" -eq 1 ]; then
run_test "autobuild_ocf_precompile" ocf_precompile
fi
run_test "autobuild_check_so_deps" $rootdir/test/make/check_so_deps.sh $1
./configure $config_params --without-shared
run_test "scanbuild_make" scanbuild_make
run_test "autobuild_generated_files_check" porcelain_check
run_test "autobuild_header_dependency_check" header_dependency_check
run_test "autobuild_make_install" $MAKE $MAKEFLAGS install DESTDIR="$SPDK_WORKSPACE" prefix=/usr
run_test "autobuild_make_uninstall" test_make_uninstall
run_test "autobuild_build_doc" build_doc
}
if [ $SPDK_RUN_VALGRIND -eq 1 ]; then
run_test "valgrind" echo "using valgrind"
fi
timing_exit doxygen
if [ $SPDK_RUN_ASAN -eq 1 ]; then
run_test "asan" echo "using asan"
fi
if [ $SPDK_RUN_UBSAN -eq 1 ]; then
run_test "ubsan" echo "using ubsan"
fi
if [ -n "$SPDK_TEST_NATIVE_DPDK" ]; then
run_test "build_native_dpdk" build_native_dpdk
fi
./configure $config_params
echo "** START ** Info for Hostname: $HOSTNAME"
uname -a
$MAKE cc_version
$MAKE cxx_version
echo "** END ** Info for Hostname: $HOSTNAME"
if [ "$SPDK_TEST_AUTOBUILD" -eq 1 ]; then
run_test "autobuild" autobuild_test_suite $1
else
if [ "$SPDK_TEST_OCF" -eq 1 ]; then
run_test "autobuild_ocf_precompile" ocf_precompile
fi
# if we aren't testing the unittests, build with shared objects.
./configure $config_params --with-shared
run_test "make" $MAKE $MAKEFLAGS
fi
timing_exit autobuild

@ -13,37 +13,6 @@ source "$1"
rootdir=$(readlink -f $(dirname $0))
source "$rootdir/test/common/autotest_common.sh"
function build_rpms() (
local version rpms
# Make sure linker will not attempt to look under DPDK's repo dir to get the libs
unset -v LD_LIBRARY_PATH
install_uninstall_rpms() {
rpms=("$HOME/rpmbuild/RPMS/x86_64/"spdk{,-devel,{,-dpdk}-libs}-$version-1.x86_64.rpm)
sudo rpm -i "${rpms[@]}"
rpms=("${rpms[@]##*/}") rpms=("${rpms[@]%.rpm}")
# Check if we can find one of the apps in the PATH now and verify if it doesn't miss
# any libs.
LIST_LIBS=yes "$rootdir/rpmbuild/rpm-deps.sh" "${SPDK_APP[@]##*/}"
sudo rpm -e "${rpms[@]}"
}
build_rpm() {
MAKEFLAGS="$MAKEFLAGS" SPDK_VERSION="$version" DEPS=no "$rootdir/rpmbuild/rpm.sh" "$@"
install_uninstall_rpms
}
version="test_shared"
run_test "build_shared_rpm" build_rpm --with-shared
if [[ -n $SPDK_TEST_NATIVE_DPDK ]]; then
version="test_shared_native_dpdk"
run_test "build_shared_native_dpdk_rpm" build_rpm --with-shared --with-dpdk="$SPDK_RUN_EXTERNAL_DPDK"
fi
)
out=$PWD
MAKEFLAGS=${MAKEFLAGS:--j16}
@ -59,28 +28,69 @@ if [ $(git status --porcelain --ignore-submodules | wc -l) -ne 0 ]; then
fi
timing_exit porcelain_check
if [[ $SPDK_TEST_RELEASE_BUILD -eq 1 ]]; then
run_test "build_rpms" build_rpms
$MAKE clean
fi
if [[ $RUN_NIGHTLY -eq 0 ]]; then
if [ $RUN_NIGHTLY -eq 0 ]; then
timing_finish
exit 0
fi
timing_enter build_release
timing_enter autopackage
config_params="$(get_config_params | sed 's/--enable-debug//g')"
if [ $(uname -s) = Linux ]; then
./configure $config_params --enable-lto
else
# LTO needs a special compiler to work on BSD.
./configure $config_params
spdk_pv=spdk-$(date +%Y_%m_%d)
spdk_tarball=${spdk_pv}.tar
dpdk_pv=dpdk-$(date +%Y_%m_%d)
dpdk_tarball=${dpdk_pv}.tar
ipsec_pv=ipsec-$(date +%Y_%m_%d)
ipsec_tarball=${ipsec_pv}.tar
isal_pv=isal-$(date +%Y_%m_%d)
isal_tarball=${isal_pv}.tar
ocf_pv=ocf-$(date +%Y_%m_%d)
ocf_tarball=${ocf_pv}.tar
find . -iname "spdk-*.tar* dpdk-*.tar* ipsec-*.tar* isal-*.tar*" -delete
git archive HEAD^{tree} --prefix=${spdk_pv}/ -o ${spdk_tarball}
# Build from packaged source
tmpdir=$(mktemp -d)
echo "tmpdir=$tmpdir"
tar -C "$tmpdir" -xf $spdk_tarball
if [ -z "$WITH_DPDK_DIR" ]; then
cd dpdk
git archive HEAD^{tree} --prefix=dpdk/ -o ../${dpdk_tarball}
cd ..
tar -C "$tmpdir/${spdk_pv}" -xf $dpdk_tarball
fi
$MAKE ${MAKEFLAGS}
$MAKE ${MAKEFLAGS} clean
timing_exit build_release
if [ -d "intel-ipsec-mb" ]; then
cd intel-ipsec-mb
git archive HEAD^{tree} --prefix=intel-ipsec-mb/ -o ../${ipsec_tarball}
cd ..
tar -C "$tmpdir/${spdk_pv}" -xf $ipsec_tarball
fi
if [ -d "isa-l" ]; then
cd isa-l
git archive HEAD^{tree} --prefix=isa-l/ -o ../${isal_tarball}
cd ..
tar -C "$tmpdir/${spdk_pv}" -xf $isal_tarball
fi
if [ -d "ocf" ]; then
cd ocf
git archive HEAD^{tree} --prefix=ocf/ -o ../${ocf_tarball}
cd ..
tar -C "$tmpdir/${spdk_pv}" -xf $ocf_tarball
fi
(
cd "$tmpdir"/spdk-*
# use $config_params to get the right dependency options, but disable coverage and ubsan
# explicitly since they are not needed for this build
./configure $config_params --disable-debug --enable-werror --disable-coverage --disable-ubsan
time $MAKE ${MAKEFLAGS}
)
rm -rf "$tmpdir"
timing_exit autopackage
timing_finish

@ -4,8 +4,7 @@ set -e
rootdir=$(readlink -f $(dirname $0))
default_conf=~/autorun-spdk.conf
conf=${1:-${default_conf}}
conf=~/autorun-spdk.conf
# If the configuration of tests is not provided, no tests will be carried out.
if [[ ! -f $conf ]]; then
@ -18,5 +17,5 @@ cat "$conf"
# Runs agent scripts
$rootdir/autobuild.sh "$conf"
sudo -E $rootdir/autotest.sh "$conf"
sudo -E WITH_DPDK_DIR="$WITH_DPDK_DIR" $rootdir/autotest.sh "$conf"
$rootdir/autopackage.sh "$conf"

@ -19,61 +19,62 @@ def highest_value(inp):
def generateTestCompletionTables(output_dir, completion_table):
data_table = pd.DataFrame(completion_table, columns=["Agent", "Domain", "Test", "With Asan", "With UBsan"])
data_table = pd.DataFrame(completion_table, columns=["Agent", "Test", "With Asan", "With UBsan"])
data_table.to_html(os.path.join(output_dir, 'completions_table.html'))
os.makedirs(os.path.join(output_dir, "post_process"), exist_ok=True)
pivot_by_agent = pd.pivot_table(data_table, index=["Agent", "Domain", "Test"])
pivot_by_agent = pd.pivot_table(data_table, index=["Agent", "Test"])
pivot_by_agent.to_html(os.path.join(output_dir, "post_process", 'completions_table_by_agent.html'))
pivot_by_test = pd.pivot_table(data_table, index=["Domain", "Test", "Agent"])
pivot_by_test = pd.pivot_table(data_table, index=["Test", "Agent"])
pivot_by_test.to_html(os.path.join(output_dir, "post_process", 'completions_table_by_test.html'))
pivot_by_asan = pd.pivot_table(data_table, index=["Domain", "Test"], values=["With Asan"], aggfunc=highest_value)
pivot_by_asan = pd.pivot_table(data_table, index=["Test"], values=["With Asan"], aggfunc=highest_value)
pivot_by_asan.to_html(os.path.join(output_dir, "post_process", 'completions_table_by_asan.html'))
pivot_by_ubsan = pd.pivot_table(data_table, index=["Domain", "Test"], values=["With UBsan"], aggfunc=highest_value)
pivot_by_ubsan = pd.pivot_table(data_table, index=["Test"], values=["With UBsan"], aggfunc=highest_value)
pivot_by_ubsan.to_html(os.path.join(output_dir, "post_process", 'completions_table_by_ubsan.html'))
def generateCoverageReport(output_dir, repo_dir):
coveragePath = os.path.join(output_dir, '**', 'cov_total.info')
covfiles = [os.path.abspath(p) for p in glob.glob(coveragePath, recursive=True)]
for f in covfiles:
print(f)
if len(covfiles) == 0:
return
lcov_opts = [
'--rc lcov_branch_coverage=1',
'--rc lcov_function_coverage=1',
'--rc genhtml_branch_coverage=1',
'--rc genhtml_function_coverage=1',
'--rc genhtml_legend=1',
'--rc geninfo_all_blocks=1',
]
cov_total = os.path.abspath(os.path.join(output_dir, 'cov_total.info'))
coverage = os.path.join(output_dir, 'coverage')
lcov = 'lcov' + ' ' + ' '.join(lcov_opts) + ' -q -a ' + ' -a '.join(covfiles) + ' -o ' + cov_total
genhtml = 'genhtml' + ' ' + ' '.join(lcov_opts) + ' -q ' + cov_total + ' --legend' + ' -t "Combined" --show-details -o ' + coverage
try:
subprocess.check_call([lcov], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
except subprocess.CalledProcessError as e:
print("lcov failed")
print(e)
return
cov_total_file = open(cov_total, 'r')
replacement = "SF:" + repo_dir
file_contents = cov_total_file.readlines()
cov_total_file.close()
os.remove(cov_total)
with open(cov_total, 'w+') as file:
for Line in file_contents:
Line = re.sub("^SF:.*/repo", replacement, Line)
file.write(Line + '\n')
try:
subprocess.check_call([genhtml], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
except subprocess.CalledProcessError as e:
print("genhtml failed")
print(e)
for f in covfiles:
os.remove(f)
with open(os.path.join(output_dir, 'coverage.log'), 'w+') as log_file:
coveragePath = os.path.join(output_dir, '**', 'cov_total.info')
covfiles = [os.path.abspath(p) for p in glob.glob(coveragePath, recursive=True)]
for f in covfiles:
print(f, file=log_file)
if len(covfiles) == 0:
return
lcov_opts = [
'--rc lcov_branch_coverage=1',
'--rc lcov_function_coverage=1',
'--rc genhtml_branch_coverage=1',
'--rc genhtml_function_coverage=1',
'--rc genhtml_legend=1',
'--rc geninfo_all_blocks=1',
]
cov_total = os.path.abspath(os.path.join(output_dir, 'cov_total.info'))
coverage = os.path.join(output_dir, 'coverage')
lcov = 'lcov' + ' ' + ' '.join(lcov_opts) + ' -q -a ' + ' -a '.join(covfiles) + ' -o ' + cov_total
genhtml = 'genhtml' + ' ' + ' '.join(lcov_opts) + ' -q ' + cov_total + ' --legend' + ' -t "Combined" --show-details -o ' + coverage
try:
subprocess.check_call([lcov], shell=True, stdout=log_file, stderr=log_file)
except subprocess.CalledProcessError as e:
print("lcov failed", file=log_file)
print(e, file=log_file)
return
cov_total_file = open(cov_total, 'r')
replacement = "SF:" + repo_dir
file_contents = cov_total_file.readlines()
cov_total_file.close()
os.remove(cov_total)
with open(cov_total, 'w+') as file:
for Line in file_contents:
Line = re.sub("^SF:.*/repo", replacement, Line)
file.write(Line + '\n')
try:
subprocess.check_call([genhtml], shell=True, stdout=log_file, stderr=log_file)
except subprocess.CalledProcessError as e:
print("genhtml failed", file=log_file)
print(e, file=log_file)
for f in covfiles:
os.remove(f)
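
Note on the SF: rewrite step in generateCoverageReport: `readlines()` keeps each line's trailing newline, so `file.write(Line + '\n')` writes the tracefile back with a blank line after every record. Whether lcov or genhtml tolerates those blank lines is not stated here. The sketch below shows one way to do the same path rewrite while preserving the original line endings. The helper name `rewrite_source_paths` and the sample paths are hypothetical and not part of the script.

```python
import re


def rewrite_source_paths(lines, repo_dir):
    """Return the lines with every lcov 'SF:' record pointed at repo_dir.

    Hypothetical helper: mirrors the re.sub() call in generateCoverageReport
    but emits exactly one newline per input line.
    """
    replacement = "SF:" + repo_dir
    rewritten = []
    for line in lines:
        # Strip the newline that readlines() left in place, rewrite the
        # prefix, then add a single newline back.
        body = line.rstrip("\n")
        # A callable replacement avoids backslash-escape processing of
        # repo_dir by re.sub().
        body = re.sub(r"^SF:.*/repo", lambda m: replacement, body)
        rewritten.append(body + "\n")
    return rewritten


if __name__ == "__main__":
    sample = [
        "TN:\n",
        "SF:/home/ci/workspace/repo/lib/nvme/nvme.c\n",
        "end_of_record\n",
    ]
    for out_line in rewrite_source_paths(sample, "/src/spdk"):
        print(out_line, end="")
```

Only lines that start with `SF:` and contain `/repo` are changed; all other records pass through untouched.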
def collectOne(output_dir, dir_name):
@ -91,96 +92,91 @@ def collectOne(output_dir, dir_name):
shutil.rmtree(d)
def getCompletions(completionFile, test_list, test_completion_table):
agent_name = os.path.basename(os.path.dirname(completionFile))
with open(completionFile, 'r') as completionList:
completions = completionList.read()
asan_enabled = "asan" in completions
ubsan_enabled = "ubsan" in completions
for line in completions.splitlines():
try:
domain, test_name = line.strip().split()
test_list[test_name] = (True, asan_enabled | test_list[test_name][1], ubsan_enabled | test_list[test_name][2])
test_completion_table.append([agent_name, domain, test_name, asan_enabled, ubsan_enabled])
try:
test_completion_table.remove(["None", "None", test_name, False, False])
except ValueError:
continue
except KeyError:
continue
def printList(header, test_list, index, condition):
print("\n\n-----%s------" % header)
executed_tests = [x for x in sorted(test_list) if test_list[x][index] is condition]
print(*executed_tests, sep="\n")
def printListInformation(table_type, test_list):
printList("%s Executed in Build" % table_type, test_list, 0, True)
printList("%s Missing From Build" % table_type, test_list, 0, False)
printList("%s Missing ASAN" % table_type, test_list, 1, False)
printList("%s Missing UBSAN" % table_type, test_list, 2, False)
def getSkippedTests(repo_dir):
skipped_test_file = os.path.join(repo_dir, "test", "common", "skipped_tests.txt")
if not os.path.exists(skipped_test_file):
return []
else:
with open(skipped_test_file, "r") as skipped_test_data:
return [x.strip() for x in skipped_test_data.readlines() if "#" not in x and x.strip() != '']
def confirmPerPatchTests(test_list, skiplist):
missing_tests = [x for x in sorted(test_list) if test_list[x][0] is False
and x not in skiplist]
if len(missing_tests) > 0:
print("Not all tests were run. Failing the build.")
print(missing_tests)
exit(1)
def aggregateCompletedTests(output_dir, repo_dir, skip_confirm=False):
def aggregateCompletedTests(output_dir, repo_dir):
test_list = {}
test_with_asan = {}
test_with_ubsan = {}
test_completion_table = []
testFiles = glob.glob(os.path.join(output_dir, '**', 'all_tests.txt'), recursive=True)
completionFiles = glob.glob(os.path.join(output_dir, '**', 'test_completions.txt'), recursive=True)
asan_enabled = False
ubsan_enabled = False
test_unit_with_valgrind = False
testFilePath = os.path.join(output_dir, '**', 'all_tests.txt')
completionFilePath = os.path.join(output_dir, '**', 'test_completions.txt')
testFiles = glob.glob(testFilePath, recursive=True)
completionFiles = glob.glob(completionFilePath, recursive=True)
testSummary = os.path.join(output_dir, "test_execution.log")
if len(testFiles) == 0:
print("Unable to perform test completion aggregator. No input files.")
return 0
with open(testFiles[0], 'r') as raw_test_list:
item = testFiles[0]
with open(item, 'r') as raw_test_list:
for line in raw_test_list:
try:
test_name = line.strip()
except Exception:
print("Failed to parse a test type.")
return 1
test_list[line.strip()] = (False, False, False)
test_completion_table.append(["None", line.strip(), False, False])
for item in completionFiles:
agent_name = os.path.split(os.path.split(item)[0])[1]
with open(item, 'r') as completion_list:
completions = completion_list.read()
test_list[test_name] = (False, False, False)
test_completion_table.append(["None", "None", test_name, False, False])
if "asan" not in completions:
asan_enabled = False
else:
asan_enabled = True
for completionFile in completionFiles:
getCompletions(completionFile, test_list, test_completion_table)
if "ubsan" not in completions:
ubsan_enabled = False
else:
ubsan_enabled = True
if "valgrind" in completions and "unittest" in completions:
test_unit_with_valgrind = True
test_completion_table.append([agent_name, "valgrind", asan_enabled, ubsan_enabled])
for line in completions.split('\n'):
try:
test_list[line.strip()] = (True, asan_enabled | test_list[line.strip()][1], ubsan_enabled | test_list[line.strip()][1])
test_completion_table.append([agent_name, line.strip(), asan_enabled, ubsan_enabled])
try:
test_completion_table.remove(["None", line.strip(), False, False])
except ValueError:
continue
except KeyError:
continue
with open(testSummary, 'w') as fh:
fh.write("\n\n-----Tests Executed in Build------\n")
for item in sorted(test_list):
if test_list[item][0]:
fh.write(item + "\n")
fh.write("\n\n-----Tests Missing From Build------\n")
if not test_unit_with_valgrind:
fh.write("UNITTEST_WITH_VALGRIND\n")
for item in sorted(test_list):
if test_list[item][0] is False:
fh.write(item + "\n")
fh.write("\n\n-----Tests Missing ASAN------\n")
for item in sorted(test_list):
if test_list[item][1] is False:
fh.write(item + "\n")
fh.write("\n\n-----Tests Missing UBSAN------\n")
for item in sorted(test_list):
if test_list[item][2] is False:
fh.write(item + "\n")
with open(testSummary, 'r') as fh:
print(fh.read())
printListInformation("Tests", test_list)
generateTestCompletionTables(output_dir, test_completion_table)
skipped_tests = getSkippedTests(repo_dir)
if not skip_confirm:
confirmPerPatchTests(test_list, skipped_tests)
def main(output_dir, repo_dir, skip_confirm=False):
print("-----Begin Post Process Script------")
def main(output_dir, repo_dir):
generateCoverageReport(output_dir, repo_dir)
collectOne(output_dir, 'doc')
collectOne(output_dir, 'ut_coverage')
aggregateCompletedTests(output_dir, repo_dir, skip_confirm)
aggregateCompletedTests(output_dir, repo_dir)
if __name__ == "__main__":
@ -189,7 +185,5 @@ if __name__ == "__main__":
help="The location of your build's output directory")
parser.add_argument("-r", "--repo_directory", type=str, required=True,
help="The location of your spdk repository")
parser.add_argument("-s", "--skip_confirm", required=False, action="store_true",
help="Do not check if all autotest.sh tests were executed.")
args = parser.parse_args()
main(args.directory_location, args.repo_directory, args.skip_confirm)
main(args.directory_location, args.repo_directory)
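
The completion bookkeeping in getCompletions can be hard to follow in diff form, so here is a minimal standalone sketch of the same idea. It assumes the data shapes visible in the script: `test_list` maps a test name to an `(executed, asan, ubsan)` tuple, and the table holds `[agent, domain, test, asan, ubsan]` rows with a `"None"` placeholder row per test. The function name `record_completions` is hypothetical, and skipping lines that do not have exactly two fields is a choice made for this sketch rather than the script's behavior.

```python
def record_completions(completions_text, agent_name, test_list, table):
    """Merge one agent's completion file into the aggregate structures."""
    # Sanitizer flags apply to the whole file, as in the script.
    asan = "asan" in completions_text
    ubsan = "ubsan" in completions_text

    for line in completions_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            # Sketch-only choice: ignore lines that are not "<domain> <test>".
            continue
        domain, name = parts
        if name not in test_list:
            # Unknown tests are ignored, matching the KeyError handling.
            continue

        _, had_asan, had_ubsan = test_list[name]
        # A test counts as sanitized if any agent ran it with the sanitizer.
        test_list[name] = (True, asan or had_asan, ubsan or had_ubsan)
        table.append([agent_name, domain, name, asan, ubsan])

        # Drop the "never executed" placeholder row once the test has run.
        placeholder = ["None", "None", name, False, False]
        if placeholder in table:
            table.remove(placeholder)


if __name__ == "__main__":
    tests = {"nvme": (False, False, False), "lvol": (False, False, False)}
    rows = [["None", "None", t, False, False] for t in tests]
    record_completions("unittest nvme\nasan\n", "agent1", tests, rows)
    print(tests)
    print(rows)
```

Each tuple index is merged with its own previous value, so the ASAN and UBSAN columns cannot leak into each other.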


@ -9,16 +9,6 @@ if [[ ! -f $1 ]]; then
exit 1
fi
# always test with SPDK shared objects.
export SPDK_LIB_DIR="$rootdir/build/lib"
# Autotest.sh, as part of autorun.sh, runs in a different
# shell process than autobuild.sh. Use helper file to pass
# over env variable containing libraries paths.
if [[ -e /tmp/spdk-ld-path ]]; then
source /tmp/spdk-ld-path
fi
source "$1"
source "$rootdir/test/common/autotest_common.sh"
source "$rootdir/test/nvmf/common.sh"
@ -29,25 +19,12 @@ if [ $EUID -ne 0 ]; then
fi
if [ $(uname -s) = Linux ]; then
old_core_pattern=$(< /proc/sys/kernel/core_pattern)
mkdir -p "$output_dir/coredumps"
# set core_pattern to a known value to avoid ABRT, systemd-coredump, etc.
echo "|$rootdir/scripts/core-collector.sh %P %s %t $output_dir/coredumps" > /proc/sys/kernel/core_pattern
echo 2 > /proc/sys/kernel/core_pipe_limit
# Make sure that the hugepage state for our VM is fresh so we don't fail
# hugepage allocation. Allow time for this action to complete.
echo 1 > /proc/sys/vm/drop_caches
sleep 3
echo "core" > /proc/sys/kernel/core_pattern
# make sure nbd (network block device) driver is loaded if it is available
# this ensures that when tests need to use nbd, it will be fully initialized
modprobe nbd || true
if udevadm=$(type -P udevadm); then
"$udevadm" monitor --property &> "$output_dir/udev.log" &
udevadm_pid=$!
fi
fi
trap "process_core; autotest_cleanup; exit 1" SIGINT SIGTERM EXIT
@ -57,16 +34,14 @@ timing_enter autotest
create_test_list
src=$(readlink -f $(dirname $0))
out=$output_dir
out=$PWD
cd $src
./scripts/setup.sh status
freebsd_update_contigmem_mod
# lcov takes considerable time to process clang coverage.
# Disabling lcov allow us to do this.
# More information: https://github.com/spdk/spdk/issues/1693
CC_TYPE=$(grep CC_TYPE mk/cc.mk)
if hash lcov && ! [[ "$CC_TYPE" == *"clang"* ]]; then
if hash lcov; then
# setup output dir for unittest.sh
export UT_COVERAGE=$out/ut_coverage
export LCOV_OPTS="
@ -81,7 +56,7 @@ if hash lcov && ! [[ "$CC_TYPE" == *"clang"* ]]; then
# Print lcov version to log
$LCOV -v
# zero out coverage data
$LCOV -q -c -i -t "Baseline" -d $src -o $out/cov_base.info
$LCOV -q -c -i -t "Baseline" -d $src -o cov_base.info
fi
# Make sure the disks are clean (no leftover partition tables)
@ -92,47 +67,43 @@ rm -f /var/tmp/spdk*.sock
# Load the kernel driver
./scripts/setup.sh reset
# Let the kernel discover any filesystems or partitions
sleep 10
if [ $(uname -s) = Linux ]; then
# OCSSD devices drivers don't support IO issues by kernel so
# detect OCSSD devices and block them (unbind from any driver).
# detect OCSSD devices and blacklist them (unbind from any driver).
# If test scripts want to use this device it needs to do this explicitly.
#
# If some OCSSD device is bound to other driver than nvme we won't be able to
# discover if it is OCSSD or not so load the kernel driver first.
while IFS= read -r -d '' dev; do
while IFS= read -r -d '' dev
do
# Send Open Channel 2.0 Geometry opcode "0xe2" - not supported by NVMe device.
if nvme admin-passthru $dev --namespace-id=1 --data-len=4096 --opcode=0xe2 --read > /dev/null; then
if nvme admin-passthru $dev --namespace-id=1 --data-len=4096 --opcode=0xe2 --read >/dev/null; then
bdf="$(basename $(readlink -e /sys/class/nvme/${dev#/dev/}/device))"
echo "INFO: blocking OCSSD device: $dev ($bdf)"
PCI_BLOCKED+=" $bdf"
echo "INFO: blacklisting OCSSD device: $dev ($bdf)"
PCI_BLACKLIST+=" $bdf"
OCSSD_PCI_DEVICES+=" $bdf"
fi
done < <(find /dev -maxdepth 1 -regex '/dev/nvme[0-9]+' -print0)
done < <(find /dev -maxdepth 1 -regex '/dev/nvme[0-9]+' -print0)
export OCSSD_PCI_DEVICES
# Now, bind blocked devices to pci-stub module. This will prevent
# Now, bind blacklisted devices to pci-stub module. This will prevent
# automatic grabbing these devices when we add device/vendor ID to
# proper driver.
if [[ -n "$PCI_BLOCKED" ]]; then
# shellcheck disable=SC2097,SC2098
PCI_ALLOWED="$PCI_BLOCKED" \
PCI_BLOCKED="" \
DRIVER_OVERRIDE="pci-stub" \
if [[ -n "$PCI_BLACKLIST" ]]; then
PCI_WHITELIST="$PCI_BLACKLIST" \
PCI_BLACKLIST="" \
DRIVER_OVERRIDE="pci-stub" \
./scripts/setup.sh
# Export our blocked list so it will take effect during next setup.sh
export PCI_BLOCKED
# Export our blacklist so it will take effect during next setup.sh
export PCI_BLACKLIST
fi
run_test "setup.sh" "$rootdir/test/setup/test-setup.sh"
fi
./scripts/setup.sh status
if [[ $(uname -s) == Linux ]]; then
# Revert NVMe namespaces to default state
nvme_namespace_revert
fi
# Delete all leftover lvols and gpt partitions
@ -151,13 +122,12 @@ timing_enter afterboot
./scripts/setup.sh
timing_exit afterboot
timing_enter nvmf_setup
rdma_device_init
timing_exit nvmf_setup
if [[ $SPDK_TEST_CRYPTO -eq 1 || $SPDK_TEST_REDUCE -eq 1 ]]; then
# Make sure that memory is distributed across all NUMA nodes - by default, all goes to
# node0, but if QAT devices are attached to a different node, all of their VFs will end
# up under that node too and memory needs to be available there for the tests.
CLEAR_HUGE=yes HUGE_EVEN_ALLOC=yes ./scripts/setup.sh
./scripts/setup.sh status
if [[ $SPDK_TEST_USE_IGB_UIO -eq 1 ]]; then
if grep -q '#define SPDK_CONFIG_IGB_UIO_DRIVER 1' $rootdir/include/spdk/config.h; then
./scripts/qat_setup.sh igb_uio
else
./scripts/qat_setup.sh
@ -173,163 +143,132 @@ opal_revert_cleanup
#####################
if [ $SPDK_TEST_UNITTEST -eq 1 ]; then
run_test "unittest" ./test/unit/unittest.sh
run_test "env" test/env/env.sh
timing_enter unittest
run_test suite ./test/unit/unittest.sh
report_test_completion "unittest"
timing_exit unittest
fi
if [ $SPDK_RUN_FUNCTIONAL_TEST -eq 1 ]; then
timing_enter lib
run_test "rpc" test/rpc/rpc.sh
run_test "rpc_client" test/rpc_client/rpc_client.sh
run_test "json_config" ./test/json_config/json_config.sh
run_test "alias_rpc" test/json_config/alias_rpc/alias_rpc.sh
run_test "spdkcli_tcp" test/spdkcli/tcp.sh
run_test "dpdk_mem_utility" test/dpdk_memory_utility/test_dpdk_mem_info.sh
run_test "event" test/event/event.sh
run_test "accel_engine" test/accel_engine/accel_engine.sh
run_test suite test/env/env.sh
run_test suite test/rpc_client/rpc_client.sh
run_test suite ./test/json_config/json_config.sh
run_test suite test/json_config/alias_rpc/alias_rpc.sh
run_test suite test/spdkcli/tcp.sh
if [ $SPDK_TEST_BLOCKDEV -eq 1 ]; then
run_test "blockdev_general" test/bdev/blockdev.sh
run_test "bdev_raid" test/bdev/bdev_raid.sh
run_test "bdevperf_config" test/bdev/bdevperf/test_config.sh
if [[ $(uname -s) == Linux ]]; then
run_test "spdk_dd" test/dd/dd.sh
run_test "reactor_set_interrupt" test/interrupt/reactor_set_interrupt.sh
run_test suite test/bdev/blockdev.sh
if [[ $RUN_NIGHTLY -eq 1 ]]; then
run_test suite test/bdev/bdev_raid.sh
fi
fi
if [ $SPDK_TEST_JSON -eq 1 ]; then
run_test "test_converter" test/config_converter/test_converter.sh
run_test suite test/config_converter/test_converter.sh
fi
if [ $SPDK_TEST_EVENT -eq 1 ]; then
run_test suite test/event/event.sh
fi
if [ $SPDK_TEST_NVME -eq 1 ]; then
run_test "blockdev_nvme" test/bdev/blockdev.sh "nvme"
run_test "blockdev_nvme_gpt" test/bdev/blockdev.sh "gpt"
run_test "nvme" test/nvme/nvme.sh
if [[ $SPDK_TEST_NVME_PMR -eq 1 ]]; then
run_test "nvme_pmr" test/nvme/nvme_pmr.sh
run_test suite test/nvme/nvme.sh
if [[ $SPDK_TEST_NVME_CLI -eq 1 ]]; then
run_test suite test/nvme/spdk_nvme_cli.sh
fi
if [[ $SPDK_TEST_NVME_CUSE -eq 1 ]]; then
run_test "nvme_cuse" test/nvme/cuse/nvme_cuse.sh
run_test suite test/nvme/spdk_nvme_cli_cuse.sh
fi
run_test "nvme_rpc" test/nvme/nvme_rpc.sh
# Only test hotplug without ASAN enabled. Since if it is
# enabled, it catches SEGV earlier than our handler which
# breaks the hotplug logic.
if [ $SPDK_RUN_ASAN -eq 0 ]; then
run_test "nvme_hotplug" test/nvme/hotplug.sh root
run_test suite test/nvme/hotplug.sh intel
fi
fi
if [ $SPDK_TEST_IOAT -eq 1 ]; then
run_test "ioat" test/ioat/ioat.sh
run_test suite test/ioat/ioat.sh
fi
timing_exit lib
if [ $SPDK_TEST_ISCSI -eq 1 ]; then
run_test "iscsi_tgt" ./test/iscsi_tgt/iscsi_tgt.sh
run_test "spdkcli_iscsi" ./test/spdkcli/iscsi.sh
run_test suite ./test/iscsi_tgt/iscsi_tgt.sh posix
run_test suite ./test/spdkcli/iscsi.sh
# Run raid spdkcli test under iSCSI since blockdev tests run on systems that can't run spdkcli yet
run_test "spdkcli_raid" test/spdkcli/raid.sh
run_test suite test/spdkcli/raid.sh
fi
if [ $SPDK_TEST_VPP -eq 1 ]; then
run_test suite ./test/iscsi_tgt/iscsi_tgt.sh vpp
fi
if [ $SPDK_TEST_BLOBFS -eq 1 ]; then
run_test "rocksdb" ./test/blobfs/rocksdb/rocksdb.sh
run_test "blobstore" ./test/blobstore/blobstore.sh
run_test "blobfs" ./test/blobfs/blobfs.sh
run_test "hello_blob" $SPDK_EXAMPLE_DIR/hello_blob \
examples/blob/hello_world/hello_blob.json
run_test suite ./test/blobfs/rocksdb/rocksdb.sh
run_test suite ./test/blobstore/blobstore.sh
run_test suite ./test/blobfs/blobfs.sh
fi
if [ $SPDK_TEST_NVMF -eq 1 ]; then
# The NVMe-oF run test cases are split out like this so that the parser that compiles the
# list of all tests can properly differentiate them. Please do not merge them into one line.
if [ "$SPDK_TEST_NVMF_TRANSPORT" = "rdma" ]; then
timing_enter rdma_setup
rdma_device_init
timing_exit rdma_setup
run_test "nvmf_rdma" ./test/nvmf/nvmf.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
run_test "spdkcli_nvmf_rdma" ./test/spdkcli/nvmf.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
elif [ "$SPDK_TEST_NVMF_TRANSPORT" = "tcp" ]; then
timing_enter tcp_setup
tcp_device_init
timing_exit tcp_setup
run_test "nvmf_tcp" ./test/nvmf/nvmf.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
run_test "spdkcli_nvmf_tcp" ./test/spdkcli/nvmf.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
run_test "nvmf_identify_passthru" test/nvmf/target/identify_passthru.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
run_test "nvmf_dif" test/nvmf/target/dif.sh
elif [ "$SPDK_TEST_NVMF_TRANSPORT" = "fc" ]; then
run_test "nvmf_fc" ./test/nvmf/nvmf.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
run_test "spdkcli_nvmf_fc" ./test/spdkcli/nvmf.sh
else
echo "unknown NVMe transport, please specify rdma, tcp, or fc."
exit 1
fi
run_test suite ./test/nvmf/nvmf.sh --transport=$SPDK_TEST_NVMF_TRANSPORT
run_test suite ./test/spdkcli/nvmf.sh
fi
if [ $SPDK_TEST_VHOST -eq 1 ]; then
run_test "vhost" ./test/vhost/vhost.sh
run_test suite ./test/vhost/vhost.sh
report_test_completion "vhost"
fi
if [ $SPDK_TEST_LVOL -eq 1 ]; then
run_test "lvol" ./test/lvol/lvol.sh
run_test "blob_io_wait" ./test/blobstore/blob_io_wait/blob_io_wait.sh
timing_enter lvol
run_test suite ./test/lvol/lvol.sh --test-cases=all
run_test suite ./test/blobstore/blob_io_wait/blob_io_wait.sh
report_test_completion "lvol"
timing_exit lvol
fi
if [ $SPDK_TEST_VHOST_INIT -eq 1 ]; then
timing_enter vhost_initiator
run_test "vhost_blockdev" ./test/vhost/initiator/blockdev.sh
run_test "spdkcli_virtio" ./test/spdkcli/virtio.sh
run_test "vhost_shared" ./test/vhost/shared/shared.sh
run_test "vhost_fuzz" ./test/vhost/fuzz/fuzz.sh
run_test suite ./test/vhost/initiator/blockdev.sh
run_test suite ./test/spdkcli/virtio.sh
run_test suite ./test/vhost/shared/shared.sh
run_test suite ./test/vhost/fuzz/fuzz.sh
report_test_completion "vhost_initiator"
timing_exit vhost_initiator
fi
if [ $SPDK_TEST_PMDK -eq 1 ]; then
run_test "blockdev_pmem" ./test/bdev/blockdev.sh "pmem"
run_test "pmem" ./test/pmem/pmem.sh -x
run_test "spdkcli_pmem" ./test/spdkcli/pmem.sh
run_test suite ./test/pmem/pmem.sh -x
run_test suite ./test/spdkcli/pmem.sh
fi
if [ $SPDK_TEST_RBD -eq 1 ]; then
run_test "blockdev_rbd" ./test/bdev/blockdev.sh "rbd"
run_test "spdkcli_rbd" ./test/spdkcli/rbd.sh
run_test suite ./test/spdkcli/rbd.sh
fi
if [ $SPDK_TEST_OCF -eq 1 ]; then
run_test "ocf" ./test/ocf/ocf.sh
run_test suite ./test/ocf/ocf.sh
fi
if [ $SPDK_TEST_FTL -eq 1 ]; then
run_test "ftl" ./test/ftl/ftl.sh
run_test suite ./test/ftl/ftl.sh
fi
if [ $SPDK_TEST_VMD -eq 1 ]; then
run_test "vmd" ./test/vmd/vmd.sh
run_test suite ./test/vmd/vmd.sh
fi
if [ $SPDK_TEST_REDUCE -eq 1 ]; then
run_test "compress_qat" ./test/compress/compress.sh "qat"
run_test "compress_isal" ./test/compress/compress.sh "isal"
fi
if [ $SPDK_TEST_REDUCE -eq 1 ]; then
run_test suite ./test/compress/compress.sh
fi
if [ $SPDK_TEST_OPAL -eq 1 ]; then
run_test "nvme_opal" ./test/nvme/nvme_opal.sh
fi
if [ $SPDK_TEST_CRYPTO -eq 1 ]; then
run_test "blockdev_crypto_aesni" ./test/bdev/blockdev.sh "crypto_aesni"
# Proceed with the test only if QAT devices are in place
if [[ $(lspci -d:37c8) ]]; then
run_test "blockdev_crypto_qat" ./test/bdev/blockdev.sh "crypto_qat"
fi
fi
if [[ $SPDK_TEST_SCHEDULER -eq 1 ]]; then
run_test "scheduler" ./test/scheduler/scheduler.sh
run_test suite ./test/nvme/nvme_opal.sh
fi
fi
@ -345,10 +284,10 @@ trap - SIGINT SIGTERM EXIT
# catch any stray core files
process_core
if hash lcov && ! [[ "$CC_TYPE" == *"clang"* ]]; then
if hash lcov; then
# generate coverage data and combine with baseline
$LCOV -q -c -d $src -t "$(hostname)" -o $out/cov_test.info
$LCOV -q -a $out/cov_base.info -a $out/cov_test.info -o $out/cov_total.info
$LCOV -q -c -d $src -t "$(hostname)" -o cov_test.info
$LCOV -q -a cov_base.info -a cov_test.info -o $out/cov_total.info
$LCOV -q -r $out/cov_total.info '*/dpdk/*' -o $out/cov_total.info
$LCOV -q -r $out/cov_total.info '/usr/*' -o $out/cov_total.info
git clean -f "*.gcda"

configure

@ -5,9 +5,9 @@ set -e
trap 'echo -e "\n\nConfiguration failed\n\n" >&2' ERR
rootdir=$(readlink -f $(dirname $0))
source "$rootdir/scripts/common.sh"
function usage() {
function usage()
{
echo "'configure' configures SPDK to compile on supported platforms."
echo ""
echo "Usage: ./configure [OPTION]..."
@ -24,6 +24,7 @@ function usage() {
echo " example: aarch64-linux-gnu"
echo ""
echo " --enable-debug Configure for debug builds"
echo " --enable-log-bt Enable support of backtrace printing in SPDK logs (requires libunwind)."
echo " --enable-werror Treat compiler warnings as errors"
echo " --enable-asan Enable address sanitizer"
echo " --enable-ubsan Enable undefined behavior sanitizer"
@ -31,10 +32,7 @@ function usage() {
echo " --enable-lto Enable link-time optimization"
echo " --enable-pgo-capture Enable generation of profile guided optimization data"
echo " --enable-pgo-use Use previously captured profile guided optimization data"
echo " --enable-cet Enable Intel Control-flow Enforcement Technology (CET)"
echo " --disable-tests Disable building of functional tests"
echo " --disable-unit-tests Disable building of unit tests"
echo " --disable-examples Disable building of examples"
echo " --disable-tests Disable building of tests"
echo ""
echo "Specifying Dependencies:"
echo "--with-DEPENDENCY[=path] Use the given dependency. Optionally, provide the"
@ -48,27 +46,31 @@ function usage() {
echo " example: /usr/share/dpdk/x86_64-default-linuxapp-gcc"
echo " env Use an alternate environment implementation instead of DPDK."
echo " Implies --without-dpdk."
echo " idxd Build the IDXD library and accel framework plug-in module."
echo " Disabled while experimental. Only built for x86 when enabled."
echo " igb-uio-driver Build DPDK's igb-uio driver."
echo " Required on some systems to use qat devices. This flag is"
echo " effective only with the default dpdk submodule."
echo " No path required"
echo " crypto Build vbdev crypto module."
echo " No path required."
echo " fio Build fio_plugin."
echo " default: /usr/src/fio"
echo " example: /usr/src/fio"
echo " vhost Build vhost target. Enabled by default."
echo " No path required."
echo " internal-vhost-lib Use the internal copy of rte_vhost. By default, the upstream"
echo " rte_vhost from DPDK will be used."
echo " No path required."
echo " virtio Build vhost initiator and virtio-pci bdev modules."
echo " No path required."
echo " vfio-user Build custom vfio-user transport for NVMf target and NVMe initiator."
echo " example: /usr/src/libvfio-user"
echo " pmdk Build persistent memory bdev."
echo " example: /usr/share/pmdk"
echo " reduce Build vbdev compression module."
echo " No path required."
echo " vpp Build VPP net module."
echo " example: /vpp_repo/build-root/rpmbuild/vpp-18.01.1.0/build-root/install-vpp-native/vpp"
echo " rbd Build Ceph RBD bdev module."
echo " No path required."
echo " rdma Build RDMA transport for NVMf target and initiator."
echo " Accepts optional RDMA provider name. Can be \"verbs\" or \"mlx5_dv\"."
echo " If no provider specified, \"verbs\" provider is used by default."
echo " No path required."
echo " fc Build FC transport for NVMf target."
echo " If an argument is provided, it is considered a directory containing"
echo " libufc.a and fc_lld.h. Otherwise the regular system paths will"
@ -84,9 +86,9 @@ function usage() {
echo " If argument is file, interpret it as compiled OCF lib"
echo " If no argument is specified, OCF git submodule is used by default"
echo " example: /usr/src/ocf/"
echo " isal Build with ISA-L. Enabled by default on x86 and aarch64 architectures."
echo " isal Build with ISA-L. Enabled by default on x86 architecture."
echo " No path required."
echo " uring Build I/O uring bdev or socket module."
echo " uring Build I/O uring bdev."
echo " If an argument is provided, it is considered a directory containing"
echo " liburing.a and io_uring.h. Otherwise the regular system paths will"
echo " be searched."
@ -94,10 +96,6 @@ function usage() {
echo " No path required."
echo " nvme-cuse Build NVMe driver with support for CUSE-based character devices."
echo " No path required."
echo " raid5 Build with bdev_raid module RAID5 support."
echo " No path required."
echo " wpdk Build using WPDK to provide support for Windows (experimental)."
echo " The argument must be a directory containing lib and include."
echo ""
echo "Environment variables:"
echo ""
@ -118,65 +116,6 @@ declare -A CONFIG
source $rootdir/CONFIG.sh
rm $rootdir/CONFIG.sh
for i in "$@"; do
case "$i" in
--cross-prefix=*)
CONFIG[CROSS_PREFIX]="${i#*=}"
;;
--enable-lto)
CONFIG[LTO]=y
;;
--disable-lto)
CONFIG[LTO]=n
;;
esac
done
# Detect the compiler toolchain
$rootdir/scripts/detect_cc.sh --cc="$CC" --cxx="$CXX" --lto="${CONFIG[LTO]}" --ld="$LD" --cross-prefix="${CONFIG[CROSS_PREFIX]}" > $rootdir/mk/cc.mk
CC=$(grep "DEFAULT_CC=" "$rootdir/mk/cc.mk" | sed s/DEFAULT_CC=//)
CC_TYPE=$(grep "CC_TYPE=" "$rootdir/mk/cc.mk" | cut -d "=" -f 2)
arch=$($CC -dumpmachine)
sys_name=$(uname -s)
if [[ $arch == *mingw* ]] || [[ $arch == *windows* ]]; then
sys_name=Windows
fi
# Sanitize default configuration. All parameters set by user explicit should fail
# Force no ISA-L if non-x86 or non-aarch64 architecture
if [[ "${CONFIG[ISAL]}" = "y" ]]; then
if [[ $arch != x86_64* ]] && [[ $arch != aarch64* ]]; then
CONFIG[ISAL]=n
echo "Notice: ISA-L not supported for ${arch}. Turning off default feature."
fi
fi
if [[ $sys_name != "Linux" ]]; then
# Vhost, rte_vhost library and virtio are only supported on Linux.
CONFIG[VHOST]="n"
CONFIG[VIRTIO]="n"
echo "Notice: Vhost, rte_vhost library and virtio are only supported on Linux. Turning off default feature."
fi
#check nasm only on x86
if [[ $arch == x86_64* ]]; then
ver=$(nasm -v 2> /dev/null | awk '{print $3}')
if lt "$ver" 2.14; then
# ISA-L, compression & crypto require NASM version 2.14 or newer.
CONFIG[ISAL]=n
CONFIG[CRYPTO]=n
CONFIG[IPSEC_MB]=n
CONFIG[REDUCE]=n
HAVE_NASM=n
echo "Notice: ISA-L, compression & crypto require NASM version 2.14 or newer. Turning off default ISA-L and crypto features."
else
HAVE_NASM=y
fi
fi
function check_dir() {
arg="$1"
dir="${arg#*=}"
@ -188,27 +127,31 @@ function check_dir() {
for i in "$@"; do
case "$i" in
-h | --help)
-h|--help)
usage
exit 0
;;
--cross-prefix=*) ;&
--enable-lto) ;&
--disable-lto)
# Options handled before detecting CC.
;;
--prefix=*)
CONFIG[PREFIX]="${i#*=}"
;;
--target-arch=*)
CONFIG[ARCH]="${i#*=}"
;;
--cross-prefix=*)
CONFIG[CROSS_PREFIX]="${i#*=}"
;;
--enable-debug)
CONFIG[DEBUG]=y
;;
--disable-debug)
CONFIG[DEBUG]=n
;;
--enable-log-bt)
CONFIG[LOG_BACKTRACE]=y
;;
--disable-log-bt)
CONFIG[LOG_BACKTRACE]=n
;;
--enable-asan)
CONFIG[ASAN]=y
;;
@ -233,6 +176,12 @@ for i in "$@"; do
--disable-coverage)
CONFIG[COVERAGE]=n
;;
--enable-lto)
CONFIG[LTO]=y
;;
--disable-lto)
CONFIG[LTO]=n
;;
--enable-pgo-capture)
CONFIG[PGO_CAPTURE]=y
;;
@ -251,30 +200,12 @@ for i in "$@"; do
--disable-tests)
CONFIG[TESTS]=n
;;
--enable-unit-tests)
CONFIG[UNIT_TESTS]=y
;;
--disable-unit-tests)
CONFIG[UNIT_TESTS]=n
;;
--enable-examples)
CONFIG[EXAMPLES]=y
;;
--disable-examples)
CONFIG[EXAMPLES]=n
;;
--enable-werror)
CONFIG[WERROR]=y
;;
--disable-werror)
CONFIG[WERROR]=n
;;
--enable-cet)
CONFIG[CET]=y
;;
--disable-cet)
CONFIG[CET]=n
;;
--with-dpdk=*)
check_dir "$i"
CONFIG[DPDK_DIR]=$(readlink -f ${i#*=})
@ -282,10 +213,6 @@ for i in "$@"; do
--without-dpdk)
CONFIG[DPDK_DIR]=
;;
--with-wpdk=*)
check_dir "$i"
CONFIG[WPDK_DIR]=$(readlink -f ${i#*=})
;;
--with-env=*)
CONFIG[ENV]="${i#*=}"
;;
@ -295,13 +222,8 @@ for i in "$@"; do
--without-rbd)
CONFIG[RBD]=n
;;
--with-rdma=*)
CONFIG[RDMA]=y
CONFIG[RDMA_PROV]=${i#*=}
;;
--with-rdma)
CONFIG[RDMA]=y
CONFIG[RDMA_PROV]="verbs"
;;
--without-rdma)
CONFIG[RDMA]=n
@ -342,24 +264,18 @@ for i in "$@"; do
--without-vhost)
CONFIG[VHOST]=n
;;
--with-internal-vhost-lib)
CONFIG[VHOST_INTERNAL_LIB]=y
;;
--without-internal-vhost-lib)
CONFIG[VHOST_INTERNAL_LIB]=n
;;
--with-virtio)
CONFIG[VIRTIO]=y
;;
--without-virtio)
CONFIG[VIRTIO]=n
;;
--with-vfio-user)
CONFIG[VFIO_USER]=y
CONFIG[VFIO_USER_DIR]=""
;;
--with-vfio-user=*)
CONFIG[VFIO_USER]=y
check_dir "$i"
CONFIG[VFIO_USER_DIR]=$(readlink -f ${i#*=})
;;
--without-vfio-user)
CONFIG[VFIO_USER]=n
;;
--with-pmdk)
CONFIG[PMDK]=y
CONFIG[PMDK_DIR]=""
@ -378,15 +294,24 @@ for i in "$@"; do
--without-reduce)
CONFIG[REDUCE]=n
;;
--with-fio) ;&
--with-vpp)
CONFIG[VPP]=y
;;
--with-vpp=*)
CONFIG[VPP]=y
check_dir "$i"
CONFIG[VPP_DIR]=$(readlink -f ${i#*=})
;;
--without-vpp)
CONFIG[VPP]=n
;;
--with-fio=*)
if [[ ${i#*=} != "$i" ]]; then
CONFIG[FIO_SOURCE_DIR]=$(readlink -f "${i#*=}")
fi
check_dir "--with-fio=${CONFIG[FIO_SOURCE_DIR]}"
check_dir "$i"
CONFIG[FIO_SOURCE_DIR]="${i#*=}"
CONFIG[FIO_PLUGIN]=y
;;
--without-fio)
CONFIG[FIO_SOURCE_DIR]=
CONFIG[FIO_PLUGIN]=n
;;
--with-vtune=*)
@ -398,6 +323,12 @@ for i in "$@"; do
CONFIG[VTUNE_DIR]=
CONFIG[VTUNE]=n
;;
--with-igb-uio-driver)
CONFIG[IGB_UIO_DRIVER]=y
;;
--without-igb-uio-driver)
CONFIG[IGB_UIO_DRIVER]=n
;;
--with-ocf)
CONFIG[OCF]=y
CONFIG[OCF_PATH]=$(readlink -f "./ocf")
@ -440,18 +371,6 @@ for i in "$@"; do
--without-nvme-cuse)
CONFIG[NVME_CUSE]=n
;;
--with-raid5)
CONFIG[RAID5]=y
;;
--without-raid5)
CONFIG[RAID5]=n
;;
--with-idxd)
CONFIG[IDXD]=y
;;
--without-idxd)
CONFIG[IDXD]=n
;;
--)
break
;;
@ -459,69 +378,41 @@ for i in "$@"; do
echo "Unrecognized option $i"
usage
exit 1
;;
esac
done
# Detect the compiler toolchain
$rootdir/scripts/detect_cc.sh --cc="$CC" --cxx="$CXX" --lto="${CONFIG[LTO]}" --ld="$LD" --cross-prefix="${CONFIG[CROSS_PREFIX]}" > $rootdir/mk/cc.mk
CC=$(cat $rootdir/mk/cc.mk | grep "DEFAULT_CC=" | cut -d "=" -f 2)
CC_TYPE=$(cat $rootdir/mk/cc.mk | grep "CC_TYPE=" | cut -d "=" -f 2)
arch=$($CC -dumpmachine)
if [[ $arch == x86_64* ]]; then
BUILD_CMD=("$CC" -o /dev/null -x c $CPPFLAGS $CFLAGS $LDFLAGS "-march=native")
BUILD_CMD=($CC -o /dev/null -x c $CPPFLAGS $CFLAGS $LDFLAGS -march=native)
else
BUILD_CMD=("$CC" -o /dev/null -x c $CPPFLAGS $CFLAGS $LDFLAGS)
fi
BUILD_CMD+=(-I/usr/local/include -L/usr/local/lib)
if [[ "${CONFIG[VFIO_USER]}" = "y" ]]; then
if ! hash cmake; then
echo "ERROR: --with-vfio-user requires cmake"
echo "Please install then re-run this script"
exit 1
fi
if [[ ! -d /usr/include/json-c ]] && [[ ! -d /usr/local/include/json-c ]]; then
echo "ERROR: --with-vfio-user requires json-c-devel"
echo "Please install then re-run this script"
exit 1
fi
if [[ ! -e /usr/include/cmocka.h ]] && [[ ! -e /usr/local/include/cmocka.h ]]; then
echo "ERROR: --with-vfio-user requires libcmocka-devel"
echo "Please install then re-run this script"
exit 1
fi
BUILD_CMD=($CC -o /dev/null -x c $CPPFLAGS $CFLAGS $LDFLAGS)
fi
# IDXD uses Intel specific instructions.
if [[ "${CONFIG[IDXD]}" = "y" ]]; then
if [ $(uname -s) == "FreeBSD" ]; then
intel="hw.model: Intel"
cpu_vendor=$(sysctl -a | grep hw.model | cut -c 1-15)
else
intel="GenuineIntel"
cpu_vendor=$(grep -i 'vendor' /proc/cpuinfo --max-count=1)
fi
if [[ "$cpu_vendor" != *"$intel"* ]]; then
echo "ERROR: IDXD cannot be used due to CPU incompatiblity."
exit 1
fi
fi
# Detect architecture and force no ISA-L if non-x86 or non-aarch64 architecture
# Detect architecture and force no ISA-L if non-x86 architecture
if [[ "${CONFIG[ISAL]}" = "y" ]]; then
if [[ $arch != x86_64* ]] && [[ $arch != aarch64* ]]; then
echo "ERROR: ISA-L cannot be used due to CPU incompatiblity."
exit 1
if [[ $arch != x86_64* ]]; then
echo "Notice: ISA-L disabled due to CPU incompatiblity."
CONFIG[ISAL]=n
fi
fi
if [[ "${CONFIG[ISAL]}" = "n" ]] && [[ "${CONFIG[REDUCE]}" = "y" ]]; then
echo "ERROR Conflicting options: --with-reduce is not compatible with --without-isal."
exit 1
echo "ERROR Conflicting options: --with-reduce is not compatible with --without-isal."
exit 1
fi
if [ -z "${CONFIG[ENV]}" ]; then
CONFIG[ENV]=$rootdir/lib/env_dpdk
echo "Using default SPDK env in ${CONFIG[ENV]}"
if [ -z "${CONFIG[DPDK_DIR]}" ]; then
if [ ! -f "$rootdir"/dpdk/config/meson.build ]; then
if [ ! -f "$rootdir"/dpdk/config/common_base ]; then
echo "DPDK not found; please specify --with-dpdk=<path> or run:"
echo
echo " git submodule update --init"
@ -530,6 +421,31 @@ if [ -z "${CONFIG[ENV]}" ]; then
CONFIG[DPDK_DIR]="${rootdir}/dpdk/build"
echo "Using default DPDK in ${CONFIG[DPDK_DIR]}"
fi
if [[ "${CONFIG[VHOST]}" = "y" ]] && [[ "${CONFIG[VHOST_INTERNAL_LIB]}" = "n" ]]; then
# We lookup "common_linux" file to check if DPDK version is >= 19.05.
# "common_linux" is available since exactly DPDK 19.05 - it was renamed
# from "common_linuxapp".
if [ ! -f "$rootdir"/dpdk/config/common_linux ]; then
echo "Notice: Using internal, legacy rte_vhost library due to DPDK" \
"version < 19.05"
CONFIG[VHOST_INTERNAL_LIB]=y
fi
fi
else
if [[ "${CONFIG[VHOST]}" = "y" ]] && [[ "${CONFIG[VHOST_INTERNAL_LIB]}" = "n" ]]; then
# DPDK must be already built, so we can simply try to use the new rte_vhost.
# It has a number of internal dependencies though, so don't try to link the
# program, just compile it
if ! echo -e '#include <rte_vhost.h>\n' \
'int main(void) { return rte_vhost_extern_callback_register(0, NULL, NULL); }\n' \
| ${BUILD_CMD[@]} -c -Wno-deprecated-declarations -Werror \
-I"${CONFIG[DPDK_DIR]}/include" - &>/dev/null; then
echo "Notice: DPDK's rte_vhost not found or version < 19.05, using internal," \
"legacy rte_vhost library."
CONFIG[VHOST_INTERNAL_LIB]=y
fi
fi
fi
else
if [ -n "${CONFIG[DPDK_DIR]}" ]; then
@ -549,21 +465,13 @@ else
CONFIG[VIRTIO]="n"
fi
if [[ $sys_name == "Windows" ]]; then
if [ -z "${CONFIG[WPDK_DIR]}" ]; then
if [ ! -f "$rootdir"/wpdk/Makefile ]; then
echo "WPDK not found; please specify --with-wpdk=<path>. See https://wpdk.github.io."
exit 1
else
CONFIG[WPDK_DIR]="${rootdir}/wpdk/build"
echo "Using default WPDK in ${CONFIG[WPDK_DIR]}"
fi
fi
else
if [ -n "${CONFIG[WPDK_DIR]}" ]; then
echo "ERROR: --with-wpdk is only supported for Windows"
if [ "${CONFIG[FIO_PLUGIN]}" = "y" ]; then
if [ -z "${CONFIG[FIO_SOURCE_DIR]}" ]; then
echo "When fio is enabled, you must specify the fio directory using --with-fio=path"
exit 1
fi
else
CONFIG[FIO_SOURCE_DIR]=
fi
if [ "${CONFIG[VTUNE]}" = "y" ]; then
@ -573,12 +481,12 @@ if [ "${CONFIG[VTUNE]}" = "y" ]; then
fi
fi
if [[ "${CONFIG[ASAN]}" = "y" && "${CONFIG[TSAN]}" = "y" ]]; then
if [ "${CONFIG[ASAN]}" = "y" -a "${CONFIG[TSAN]}" = "y" ]; then
echo "ERROR: ASAN and TSAN cannot be enabled at the same time."
exit 1
fi
if [[ $sys_name == "FreeBSD" ]]; then
if [[ "$OSTYPE" == "freebsd"* ]]; then
# FreeBSD doesn't support all configurations
if [[ "${CONFIG[COVERAGE]}" == "y" ]]; then
echo "ERROR: CONFIG_COVERAGE not available on FreeBSD"
@ -586,34 +494,33 @@ if [[ $sys_name == "FreeBSD" ]]; then
fi
fi
if [[ $sys_name != "Linux" ]]; then
if [[ "$OSTYPE" == "freebsd"* ]]; then
if [[ "${CONFIG[VHOST]}" == "y" ]]; then
echo "Vhost is only supported on Linux."
exit 1
echo "Vhost is only supported on Linux. Disabling it."
CONFIG[VHOST]="n"
fi
if [[ "${CONFIG[VHOST_INTERNAL_LIB]}" == "y" ]]; then
echo "Internal rte_vhost library is only supported on Linux. Disabling it."
CONFIG[VHOST_INTERNAL_LIB]="n"
fi
if [[ "${CONFIG[VIRTIO]}" == "y" ]]; then
echo "Virtio is only supported on Linux."
exit 1
echo "Virtio is only supported on Linux. Disabling it."
CONFIG[VIRTIO]="n"
fi
fi
if [ "${CONFIG[RDMA]}" = "y" ]; then
if [[ ! "${CONFIG[RDMA_PROV]}" == "verbs" ]] && [[ ! "${CONFIG[RDMA_PROV]}" == "mlx5_dv" ]]; then
echo "Invalid RDMA provider specified, must be \"verbs\" or \"mlx5_dv\""
exit 1
fi
if ! echo -e '#include <infiniband/verbs.h>\n#include <rdma/rdma_verbs.h>\n' \
'int main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -libverbs -lrdmacm - 2> /dev/null; then
echo "--with-rdma requires libverbs and librdmacm."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -libverbs -lrdmacm - 2>/dev/null; then
echo --with-rdma requires libverbs and librdmacm.
echo Please install then re-run this script.
exit 1
fi
if echo -e '#include <infiniband/verbs.h>\n' \
'int main(void) { return !!IBV_WR_SEND_WITH_INV; }\n' \
| "${BUILD_CMD[@]}" -c - 2> /dev/null; then
| ${BUILD_CMD[@]} -c - 2>/dev/null; then
CONFIG[RDMA_SEND_WITH_INVAL]="y"
else
CONFIG[RDMA_SEND_WITH_INVAL]="n"
@ -632,29 +539,6 @@ of libibverbs, so Linux kernel NVMe-oF initiators based on kernels greater
than or equal to 4.14 will see significantly reduced performance.
*******************************************************************************"
fi
if echo -e '#include <rdma/rdma_cma.h>\n' \
'int main(void) { return !!RDMA_OPTION_ID_ACK_TIMEOUT; }\n' \
| "${BUILD_CMD[@]}" -c - 2> /dev/null; then
CONFIG[RDMA_SET_ACK_TIMEOUT]="y"
else
CONFIG[RDMA_SET_ACK_TIMEOUT]="n"
echo "RDMA_OPTION_ID_ACK_TIMEOUT is not supported"
fi
if [ "${CONFIG[RDMA_PROV]}" == "mlx5_dv" ]; then
if ! echo -e '#include <spdk/stdinc.h>\n' \
'#include <infiniband/mlx5dv.h>\n' \
'#include <rdma/rdma_cma.h>\n' \
'int main(void) { return rdma_establish(NULL) || ' \
'!!IBV_QP_INIT_ATTR_SEND_OPS_FLAGS || !!MLX5_OPCODE_RDMA_WRITE; }\n' \
| "${BUILD_CMD[@]}" -lmlx5 -I${rootdir}/include -c - 2> /dev/null; then
echo "mlx5_dv provider is not supported"
exit 1
fi
fi
echo "Using '${CONFIG[RDMA_PROV]}' RDMA provider"
fi
if [[ "${CONFIG[FC]}" = "y" ]]; then
@ -667,10 +551,15 @@ if [[ "${CONFIG[FC]}" = "y" ]]; then
fi
if [[ "${CONFIG[ISAL]}" = "y" ]] || [[ "${CONFIG[CRYPTO]}" = "y" ]]; then
if [[ "${HAVE_NASM}" = "n" ]] && [[ $arch == x86_64* ]]; then
echo "ERROR: ISA-L, compression & crypto require NASM version 2.14 or newer."
echo "Please install or upgrade them re-run this script."
exit 1
ver=$(nasm -v | awk '{print $3}' | sed 's/[^0-9]*//g')
if [[ "${ver:0:1}" -le "2" ]] && [[ "${ver:0:3}" -le "213" ]] && [[ "${ver:0:5}" -lt "21303" ]]; then
echo "Notice: ISA-L, compression & crypto auto-disabled due to nasm dependency."
echo "These features require NASM version 2.13.03 or newer. Please install"
echo "or upgrade then re-run this script."
CONFIG[ISAL]=n
CONFIG[CRYPTO]=n
CONFIG[IPSEC_MB]=n
CONFIG[REDUCE]=n
else
if [[ "${CONFIG[CRYPTO]}" = "y" ]]; then
CONFIG[IPSEC_MB]=y
@ -678,29 +567,63 @@ if [[ "${CONFIG[ISAL]}" = "y" ]] || [[ "${CONFIG[CRYPTO]}" = "y" ]]; then
fi
fi
if [[ "${CONFIG[ISAL]}" = "y" ]]; then
if [ ! -f "$rootdir"/isa-l/autogen.sh ]; then
echo "ISA-L was not found; To install ISA-L run:"
echo " git submodule update --init"
exit 1
fi
if [[ "${CONFIG[RBD]}" = "y" ]]; then
echo "ISAL and RBD cannot co-exist currently so disabling ISAL and compression."
CONFIG[ISAL]=n
CONFIG[REDUCE]=n
else
cd $rootdir/isa-l
ISAL_LOG=/tmp/spdk-isal.log
echo -n "Configuring ISA-L (logfile: $ISAL_LOG)..."
./autogen.sh &> $ISAL_LOG
./configure CFLAGS="-fPIC -g -O2" --enable-shared=no >> $ISAL_LOG 2>&1
echo "done."
cd $rootdir
fi
fi
if [[ "${CONFIG[PMDK]}" = "y" ]]; then
if ! echo -e '#include <libpmemblk.h>\nint main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -lpmemblk - 2> /dev/null; then
echo "--with-pmdk requires libpmemblk."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -lpmemblk - 2>/dev/null; then
echo --with-pmdk requires libpmemblk.
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[REDUCE]}" = "y" ]]; then
if ! echo -e '#include <libpmem.h>\nint main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -lpmem - 2> /dev/null; then
echo "--with-reduce requires libpmem."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -lpmem - 2>/dev/null; then
echo --with-reduce requires libpmem.
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[VPP]}" = "y" ]]; then
if [ ! -z "${CONFIG[VPP_DIR]}" ]; then
VPP_CFLAGS="-L${CONFIG[VPP_DIR]}/lib -I${CONFIG[VPP_DIR]}/include"
fi
if ! echo -e '#include <vnet/session/application_interface.h>\nint main(void) { return 0; }\n' \
| ${BUILD_CMD[@]} ${VPP_CFLAGS} -lvppinfra -lsvm -lvlibmemoryclient - 2>/dev/null; then
echo --with-vpp requires installed vpp.
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[NVME_CUSE]}" = "y" ]]; then
if ! echo -e '#define FUSE_USE_VERSION 31\n#include <fuse3/cuse_lowlevel.h>\n#include <fuse3/fuse_lowlevel.h>\n#include <fuse3/fuse_opt.h>\nint main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -lfuse3 -D_FILE_OFFSET_BITS=64 - 2> /dev/null; then
echo "--with-cuse requires libfuse3."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -lfuse3 -D_FILE_OFFSET_BITS=64 - 2>/dev/null; then
echo --with-cuse requires libfuse3.
echo Please install then re-run this script.
exit 1
fi
fi
@ -708,9 +631,9 @@ fi
if [[ "${CONFIG[RBD]}" = "y" ]]; then
if ! echo -e '#include <rbd/librbd.h>\n#include <rados/librados.h>\n' \
'int main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -lrados -lrbd - 2> /dev/null; then
echo "--with-rbd requires librados and librbd."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -lrados -lrbd - 2>/dev/null; then
echo --with-rbd requires librados and librbd.
echo Please install then re-run this script.
exit 1
fi
fi
@ -722,39 +645,46 @@ if [[ "${CONFIG[ISCSI_INITIATOR]}" = "y" ]]; then
'#error\n' \
'#endif\n' \
'int main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -L/usr/lib64/iscsi -liscsi - 2> /dev/null; then
echo "--with-iscsi-initiator requires libiscsi with"
echo "LIBISCSI_API_VERSION >= 20150621."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -L/usr/lib64/iscsi -liscsi - 2>/dev/null; then
echo --with-iscsi-initiator requires libiscsi with
echo 'LIBISCSI_API_VERSION >= 20150621.'
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[LOG_BACKTRACE]}" = "y" ]]; then
if ! echo -e '#include <libunwind.h>\nint main(void) { return 0; }\n' \
| ${BUILD_CMD[@]} -lunwind - 2>/dev/null; then
echo --enable-log-bt requires libunwind.
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[ASAN]}" = "y" ]]; then
if ! echo -e 'int main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -fsanitize=address - 2> /dev/null; then
echo "--enable-asan requires libasan."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -fsanitize=address - 2>/dev/null; then
echo --enable-asan requires libasan.
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[UBSAN]}" = "y" ]]; then
if ! echo -e 'int main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -fsanitize=undefined - 2> /dev/null; then
echo "--enable-ubsan requires libubsan."
echo "Please install then re-run this script."
echo "If installed, please check that the GCC version is at least 6.4"
echo "and synchronize CC accordingly."
| ${BUILD_CMD[@]} -fsanitize=undefined - 2>/dev/null; then
echo --enable-ubsan requires libubsan.
echo Please install then re-run this script.
exit 1
fi
fi
if [[ "${CONFIG[TSAN]}" = "y" ]]; then
if ! echo -e 'int main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -fsanitize=thread - 2> /dev/null; then
echo "--enable-tsan requires libtsan."
echo "Please install then re-run this script."
| ${BUILD_CMD[@]} -fsanitize=thread - 2>/dev/null; then
echo --enable-tsan requires libtsan.
echo Please install then re-run this script.
exit 1
fi
fi
@ -785,11 +715,6 @@ if [[ "${CONFIG[URING]}" = "y" ]]; then
echo "${CONFIG[URING_PATH]}: directory not found"
exit 1
fi
elif ! echo -e '#include <liburing.h>\nint main(void) { return 0; }\n' \
| "${BUILD_CMD[@]}" -luring - 2> /dev/null; then
echo "--with-uring requires liburing."
echo "Please build and install then re-run this script."
exit 1
fi
fi
@ -801,51 +726,22 @@ if [[ "${CONFIG[FUSE]}" = "y" ]]; then
fi
fi
if [ "${CONFIG[CET]}" = "y" ]; then
if ! echo -e 'int main(void) { return 0; }\n' | "${BUILD_CMD[@]}" -fcf-protection - 2> /dev/null; then
echo "--enable-cet requires compiler/linker that supports CET."
echo "Please install then re-run this script."
exit 1
fi
fi
if [[ "${CONFIG[ISAL]}" = "y" ]]; then
if [ ! -f "$rootdir"/isa-l/autogen.sh ]; then
echo "ISA-L was not found; To install ISA-L run:"
echo " git submodule update --init"
exit 1
fi
cd $rootdir/isa-l
ISAL_LOG=$rootdir/isa-l/spdk-isal.log
if [[ -n "${CONFIG[CROSS_PREFIX]}" ]]; then
ISAL_OPTS=("--host=${CONFIG[CROSS_PREFIX]}")
else
ISAL_OPTS=()
fi
echo -n "Configuring ISA-L (logfile: $ISAL_LOG)..."
./autogen.sh &> $ISAL_LOG
./configure CFLAGS="-fPIC -g -O2" "${ISAL_OPTS[@]}" --enable-shared=no >> $ISAL_LOG 2>&1
echo "done."
cd $rootdir
fi
# We are now ready to generate final configuration. But first do sanity
# check to see if all keys in CONFIG array have its reflection in CONFIG file.
if (($(grep -cE "^\s*CONFIG_[[:alnum:]_]+=" "$rootdir/CONFIG") != ${#CONFIG[@]})); then
if [ $(egrep -c "^\s*CONFIG_[[:alnum:]_]+=" $rootdir/CONFIG) -ne ${#CONFIG[@]} ]; then
echo ""
echo "BUG: Some configuration options are not present in CONFIG file. Please update this file."
echo "Missing options in CONFIG (+) file and in current config (-): "
diff -u --label "CONFIG file" --label "CONFIG[@]" \
<(sed -r -e '/^\s*$/d; /^\s*#.*/d; s/(CONFIG_[[:alnum:]_]+)=.*/\1/g' CONFIG | sort) \
<(printf "CONFIG_%s\n" "${!CONFIG[@]}" | sort)
<(printf "CONFIG_%s\n" ${!CONFIG[@]} | sort)
exit 1
fi
echo -n "Creating mk/config.mk..."
cp -f $rootdir/CONFIG $rootdir/mk/config.mk
for key in "${!CONFIG[@]}"; do
sed -i.bak -r "s#[[:space:]]*CONFIG_${key}=.*#CONFIG_${key}\?=${CONFIG[$key]}#g" $rootdir/mk/config.mk
for key in ${!CONFIG[@]}; do
sed -i.bak -r "s#^\s*CONFIG_${key}=.*#CONFIG_${key}\?=${CONFIG[$key]}#g" $rootdir/mk/config.mk
done
# On FreeBSD sed -i 'SUFFIX' - SUFFIX is mandatory. So no way but to delete the backed file.
rm -f $rootdir/mk/config.mk.bak
@ -860,12 +756,7 @@ rm -f $rootdir/mk/cc.flags.mk
[ -n "$DESTDIR" ] && echo "DESTDIR?=$DESTDIR" >> $rootdir/mk/cc.flags.mk
echo "done."
# Create .sh with build config for easy sourcing|lookup during the tests.
for conf in "${!CONFIG[@]}"; do
echo "CONFIG_$conf=${CONFIG[$conf]}"
done > "$rootdir/test/common/build_config.sh"
if [[ $sys_name == "FreeBSD" ]]; then
if [[ "$OSTYPE" == "freebsd"* ]]; then
echo "Type 'gmake' to build."
else
echo "Type 'make' to build."


@ -1,42 +0,0 @@
# ABI and API Deprecation {#deprecation}
This document details the policy for maintaining stability of SPDK ABI and API.
Major ABI version can change at most once for each quarterly SPDK release.
ABI versions are managed separately for each library and follow [Semantic Versioning](https://semver.org/).
API and ABI deprecation notices shall be posted in the next section.
Each entry must describe what will be removed and may suggest a replacement or alternative.
A specific future SPDK release must be named for the removal.
An ABI cannot be removed without a deprecation notice being in place for at least one SPDK release.
# Deprecation Notices {#deprecation-notices}
## net
The net library is deprecated and will be removed in the 21.07 release.
## nvmf
The following APIs have been deprecated and will be removed in SPDK 21.07:
- `spdk_nvmf_poll_group_get_stat` (function in `nvmf.h`),
- `spdk_nvmf_transport_poll_group_get_stat` (function in `nvmf.h`),
- `spdk_nvmf_transport_poll_group_free_stat` (function in `nvmf.h`),
- `spdk_nvmf_rdma_device_stat` (struct in `nvmf.h`),
- `spdk_nvmf_transport_poll_group_stat` (struct in `nvmf.h`),
- `poll_group_get_stat` (transport op in `nvmf_transport.h`),
- `poll_group_free_stat` (transport op in `nvmf_transport.h`).
Please use `spdk_nvmf_poll_group_dump_stat` and `poll_group_dump_stat` instead.
## rpc
Parameter `enable-zerocopy-send` of RPC `sock_impl_set_options` is deprecated and will be removed in SPDK 21.07,
use `enable-zerocopy-send-server` or `enable-zerocopy-send-client` instead.
Parameter `disable-zerocopy-send` of RPC `sock_impl_set_options` is deprecated and will be removed in SPDK 21.07,
use `disable-zerocopy-send-server` or `disable-zerocopy-send-client` instead.
## rpm
`pkg/spdk.spec` is considered to be deprecated and scheduled for removal in SPDK 21.07.
Please use `rpmbuild/spdk.spec` instead and see
[RPM documentation](https://spdk.io/doc/rpm.html) for more details.


@ -234,7 +234,7 @@ ALIASES =
# A mapping has the form "name=value". For example adding "class=itcl::class"
# will allow you to use the command class in the itcl::class meaning.
# TCL_SUBST =
TCL_SUBST =
# Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C sources
# only. Doxygen will then generate output that is more tailored for C. For
@ -795,16 +795,13 @@ INPUT += \
misc.md \
driver_modules.md \
tools.md \
ci_tools.md \
performance_reports.md \
# All remaining pages are listed here in alphabetical order by filename.
INPUT += \
about.md \
accel_fw.md \
applications.md \
bdev.md \
bdevperf.md \
bdev_module.md \
bdev_pg.md \
blob.md \
@ -812,35 +809,27 @@ INPUT += \
changelog.md \
compression.md \
concurrency.md \
containers.md \
../deprecation.md \
event.md \
ftl.md \
gdb_macros.md \
getting_started.md \
idxd.md \
ioat.md \
iscsi.md \
jsonrpc.md \
jsonrpc_proxy.md \
libraries.md \
lvol.md \
memory.md \
notify.md \
nvme.md \
nvme-cli.md \
nvme_spec.md \
nvmf.md \
nvmf_tgt_pg.md \
nvmf_tracing.md \
overview.md \
peer_2_peer.md \
pkgconfig.md \
porting.md \
rpm.md \
scheduler.md \
shfmt.md \
spdkcli.md \
spdk_top.md \
ssd_internals.md \
system_configuration.md \
userspace.md \
@ -848,7 +837,7 @@ INPUT += \
vhost.md \
vhost_processing.md \
virtio.md \
vmd.md
vpp_integration.md
# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses
@ -1105,7 +1094,7 @@ ALPHABETICAL_INDEX = YES
# Minimum value: 1, maximum value: 20, default value: 5.
# This tag requires that the tag ALPHABETICAL_INDEX is set to YES.
# COLS_IN_ALPHA_INDEX = 5
COLS_IN_ALPHA_INDEX = 5
# In case all classes in a project start with a common prefix, all classes will
# be put under the same header in the alphabetical index. The IGNORE_PREFIX tag
@ -1666,7 +1655,7 @@ EXTRA_SEARCH_MAPPINGS =
# If the GENERATE_LATEX tag is set to YES, doxygen will generate LaTeX output.
# The default value is: YES.
GENERATE_LATEX = NO
GENERATE_LATEX = YES
# The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put. If a
# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of
@ -2170,7 +2159,7 @@ EXTERNAL_PAGES = YES
# interpreter (i.e. the result of 'which perl').
# The default file (with absolute path) is: /usr/bin/perl.
# PERL_PATH = /usr/bin/perl
PERL_PATH = /usr/bin/perl
#---------------------------------------------------------------------------
# Configuration options related to the dot tool
@ -2192,7 +2181,7 @@ CLASS_DIAGRAMS = YES
# the mscgen tool resides. If left empty the tool is assumed to be found in the
# default search path.
# MSCGEN_PATH =
MSCGEN_PATH =
# You can include diagrams made with dia in doxygen documentation. Doxygen will
# then run dia to produce the diagram and insert it in the documentation. The


@ -1,4 +1,4 @@
# What is SPDK {#about}
# What is SPDK? {#about}
The Storage Performance Development Kit (SPDK) provides a set of tools and
libraries for writing high performance, scalable, user-mode storage


@ -1,107 +0,0 @@
# Acceleration Framework {#accel_fw}
SPDK provides a framework for abstracting general acceleration capabilities
that can be implemented through plug-in modules and low-level libraries. These
plug-in modules include support for hardware acceleration engines such as
the Intel(R) I/O Acceleration Technology (IOAT) engine and the Intel(R) Data
Streaming Accelerator (DSA) engine. Additionally, a software plug-in module
exists to enable use of the framework in environments without hardware
acceleration capabilities. ISA/L is used for optimized CRC32C calculation within
the software module.
The framework includes an API for getting the current capabilities of the
selected module. See [`spdk_accel_get_capabilities`](https://spdk.io/doc/accel__engine_8h.html) for more details. For the software module, all capabilities will be reported as supported. For the hardware modules, only functions accelerated by hardware will be reported; however, any function can still be called, and it will be backed by software if it is not reported as a supported capability.
# Acceleration Framework Functions {#accel_functions}
Functions implemented via the framework can be found in the Doxygen documentation of the
framework public header file, [accel_engine.h](https://spdk.io/doc/accel__engine_8h.html).
# Acceleration Framework Design Considerations {#accel_dc}
The general interface is defined by `/include/accel_engine.h` and implemented
in `/lib/accel`. These functions may be called by an SPDK application and in
most cases, except where otherwise documented, are asynchronous and follow the
standard SPDK model for callbacks with a callback argument.
If the acceleration framework is started without initializing a hardware module,
optimized software implementations of the functions will back the public API.
Additionally, if any hardware module does not support a specific function and that
hardware module is initialized, the specific function will fall back to a software
optimized implementation. For example, IOAT does not support the dualcast function
in hardware but if the IOAT module has been initialized and the public dualcast API
is called, it will actually be done via software behind the scenes.
# Acceleration Low Level Libraries {#accel_libs}
Low level libraries provide only the most basic functions that are specific to
the hardware. Low level libraries are located in the '/lib' directory with the
exception of the software implementation which is implemented as part of the
framework itself. The software low level library does not expose a public API.
Applications may choose to interact directly with a low level library if there are
specific needs/considerations not met via accessing the library through the
framework/module. Note that when using the low level libraries directly, the
framework abstracted interface is bypassed as the application will call the public
functions exposed by the individual low level libraries. Thus, code written this
way needs to be certain that the underlying hardware exists everywhere that it runs.
The low level library for IOAT is located in `/lib/ioat`. The low level library
for DSA is in `/lib/idxd` (IDXD stands for Intel(R) Data Acceleration Driver).
# Acceleration Plug-In Modules {#accel_modules}
Plug-in modules depend on low level libraries to interact with the hardware and
add additional functionality such as queueing during busy conditions or flow
control in some cases. The framework in turn depends on the modules to provide
the complete implementation of the acceleration component. A module must be
selected via startup RPC when the application is started. Otherwise, if no startup
RPC is provided, the framework is available and will use the software plug-in module.
## IOAT Module {#accel_ioat}
To use the IOAT engine, use the RPC [`ioat_scan_accel_engine`](https://spdk.io/doc/jsonrpc.html) before starting the application.
## IDXD Module {#accel_idxd}
To use the DSA engine, use the RPC [`idxd_scan_accel_engine`](https://spdk.io/doc/jsonrpc.html) with an optional parameter of `-c` and provide a configuration number of either 0 or 1. These pre-defined configurations determine how the DSA engine will be set up in terms
of work queues and engines. The DSA engine is very flexible, allowing for various configurations of these elements to either account for different quality of service requirements or to isolate hardware paths where the back end media is of varying latency (e.g. persistent memory vs DRAM). The pre-defined configurations are as follows:
0: A single work queue backed with four DSA engines. This is a generic configuration
that enables the hardware to best determine which engine to use as it pulls in new
operations.
1: Two separate work queues each backed with two DSA engines. This is another
generic configuration that is documented in the specification and allows the
application to partition submissions across two work queues. This would be useful
when different priorities might be desired per group.
There are several other configurations that are possible that include quality
of service parameters on the work queues that are not currently utilized by
the module. Specialized use of DSA may require different configurations that
can be added to the module as needed.
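As a hedged illustration of the choice described above, the sketch below validates the configuration number and prints the RPC invocation instead of running it. The helper name `idxd_scan_cmd` and the `scripts/rpc.py` path are assumptions made for this example; only the RPC name, the optional `-c` parameter, and the values 0 and 1 come from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the rpc.py command line that selects a
# pre-defined DSA configuration. It only echoes the command; nothing is run.
idxd_scan_cmd() {
	local config="${1-}"
	# The -c parameter is optional, so an empty argument omits it.
	if [ -z "$config" ]; then
		echo "scripts/rpc.py idxd_scan_accel_engine"
		return 0
	fi
	# Only configurations 0 and 1 are described in this document.
	case "$config" in
		0 | 1) ;;
		*)
			echo "ERROR: DSA configuration must be 0 or 1" >&2
			return 1
			;;
	esac
	echo "scripts/rpc.py idxd_scan_accel_engine -c $config"
}

# Two work queues, each backed by two DSA engines.
idxd_scan_cmd 1
```

Because the module has to be selected by a startup RPC, the printed command would be issued while the application is still waiting for RPCs, for example when it was started with `--wait-for-rpc`.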
## Software Module {#accel_sw}
The software module is enabled by default. If no hardware engine is explicitly
enabled via startup RPC as discussed earlier, the software module will use ISA-L
if available for functions such as CRC32C. Otherwise, standard glibc calls are
used to back the framework API.
## Batching {#batching}
Batching is exposed by the acceleration framework and provides an interface to
batch sets of commands up and then submit them with a single command. The public
API is consistent with the implementation; however, each plug-in module behaves
differently depending on its capabilities.
The DSA engine has complete support for batching all supported commands together
into one submission. This is advantageous as it reduces the overhead incurred in
the submission process to the hardware.
The software engine supports batching only to be consistent with the framework API.
In software there are no savings from batching sets of commands versus submitting them
individually.
The IOAT engine supports batching, but it is only beneficial for `memmove` and `memfill`
as these are supported by the hardware. All other commands can still be batched, but the
framework handles them in software.


@ -35,22 +35,26 @@ Param | Long Param | Type | Default | Descript
-i | --shm-id | integer | | @ref cmd_arg_multi_process
-m | --cpumask | CPU mask | 0x1 | application @ref cpu_mask
-n | --mem-channels | integer | all channels | number of memory channels used for DPDK
-p | --main-core | integer | first core in CPU mask | main (primary) core for DPDK
-p | --master-core | integer | first core in CPU mask | master (primary) core for DPDK
-r | --rpc-socket | string | /var/tmp/spdk.sock | RPC listen address
-s | --mem-size | integer | all hugepage memory | @ref cmd_arg_memory_size
| | --silence-noticelog | flag | | disable notice level logging to `stderr`
-u | --no-pci | flag | | @ref cmd_arg_disable_pci_access.
| | --wait-for-rpc | flag | | @ref cmd_arg_deferred_initialization
-B | --pci-blocked | B:D:F | | @ref cmd_arg_pci_blocked_allowed.
-A | --pci-allowed | B:D:F | | @ref cmd_arg_pci_blocked_allowed.
-B | --pci-blacklist | B:D:F | | @ref cmd_arg_pci_blacklist_whitelist.
-W | --pci-whitelist | B:D:F | | @ref cmd_arg_pci_blacklist_whitelist.
-R | --huge-unlink | flag | | @ref cmd_arg_huge_unlink
| | --huge-dir | string | the first discovered | allocate hugepages from a specific mount
-L | --logflag | string | | @ref cmd_arg_log_flags
-L | --logflag | string | | @ref cmd_arg_debug_log_flags
### Configuration file {#cmd_arg_config_file}
SPDK applications are configured using a JSON RPC configuration file.
See @ref jsonrpc for details.
Historically, the SPDK applications were configured using a configuration file.
This is still supported, but is considered deprecated in favor of JSON RPC
configuration. See @ref jsonrpc for details.
Note that `--config` and `--wait-for-rpc` cannot be used at the same time.
### Limit coredump {#cmd_arg_limit_coredump}
@ -121,12 +125,12 @@ If SPDK is run with PCI access disabled it won't detect any PCI devices. This
includes primarily NVMe and IOAT devices. Also, the VFIO and UIO kernel modules
are not required in this mode.
### PCI address blocked and allowed lists {#cmd_arg_pci_blocked_allowed}
### PCI address blacklist and whitelist {#cmd_arg_pci_blacklist_whitelist}
If a blocked list is used, then all devices with the provided PCI address will be
ignored. If an allowed list is used, only allowed devices will be probed.
`-B` or `-A` can be used more than once, but cannot be mixed together. That is,
`-B` and `-A` cannot be used at the same time.
If a blacklist is used, then all devices with the provided PCI address will be
ignored. If a whitelist is used, only whitelisted devices will be probed.
`-B` or `-W` can be used more than once, but cannot be mixed together. That is,
`-B` and `-W` cannot be used at the same time.
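The repeat-but-do-not-mix rule can be sketched with a small argument builder. This is a minimal sketch, not part of SPDK: `pci_filter_args` is a hypothetical helper, and the PCI addresses are placeholders. It takes exactly one option letter and repeats it once per address, so the two list types cannot end up on the same command line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build PCI filter arguments for an SPDK application.
# $1 is the single option letter to use (-B, or the letter that selects the
# other list type in your SPDK version); remaining arguments are addresses.
pci_filter_args() {
	local flag="$1"
	shift
	local args=()
	local addr
	for addr in "$@"; do
		# The same option is repeated once per address.
		args+=("$flag" "$addr")
	done
	echo "${args[*]}"
}

# Prints the arguments that would exclude two (placeholder) devices.
pci_filter_args -B 0000:01:00.0 0000:02:00.0
```

Passing the option letter once, rather than per address, is a deliberate choice: it makes a mixed command line impossible to construct with this helper.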
### Unlink hugepage files after initialization {#cmd_arg_huge_unlink}
@ -134,11 +138,11 @@ By default, each DPDK-based application tries to remove any orphaned hugetlbfs
files during its initialization. This option removes hugetlbfs files of the current
process as soon as they're created, but is not compatible with `--shm-id`.
### Log flag {#cmd_arg_log_flags}
### Debug log {#cmd_arg_debug_log_flags}
Enable a specific log type. This option can be used more than once. A list of
Enable a specific debug log type. This option can be used more than once. A list of
all available types is provided in the `--help` output, with `--logflag all`
enabling all of them. Additionally enables debug print level in debug builds of SPDK.
enabling all of them. Debug logs are only available in debug builds of SPDK.
## CPU mask {#cpu_mask}


@ -1,9 +1,5 @@
# Block Device User Guide {#bdev}
# Target Audience {#bdev_ug_targetaudience}
This user guide is intended for software developers who have knowledge of block storage, storage drivers, issuing JSON-RPC commands and storage services such as RAID, compression, crypto, and others.
# Introduction {#bdev_ug_introduction}
The SPDK block device layer, often simply called *bdev*, is a C library
@ -39,12 +35,72 @@ directly from SPDK application by running `scripts/rpc.py rpc_get_methods`.
Detailed help for each command can be displayed by adding `-h` flag as a
command parameter.
# Configuring Block Device Modules {#bdev_ug_general_rpcs}
# General Purpose RPCs {#bdev_ug_general_rpcs}
Block devices can be configured using JSON RPCs. A complete list of available RPC commands
with detailed information can be found on the @ref jsonrpc_components_bdev page.
## bdev_get_bdevs {#bdev_ug_get_bdevs}
# Common Block Device Configuration Examples
A list of currently available block devices, including detailed information about
them, can be retrieved with the `bdev_get_bdevs` RPC command. The optional `name`
parameter restricts the output to the bdev with that name.
Example response
~~~
{
"num_blocks": 32768,
"assigned_rate_limits": {
"rw_ios_per_sec": 10000,
"rw_mbytes_per_sec": 20
},
"supported_io_types": {
"reset": true,
"nvme_admin": false,
"unmap": true,
"read": true,
"write_zeroes": true,
"write": true,
"flush": true,
"nvme_io": false
},
"driver_specific": {},
"claimed": false,
"block_size": 4096,
"product_name": "Malloc disk",
"name": "Malloc0"
}
~~~
## bdev_set_qos_limit {#bdev_set_qos_limit}
Users can use the `bdev_set_qos_limit` RPC command to enable, adjust, and disable
rate limits on an existing bdev. Two types of rate limits are supported:
IOPS and bandwidth. The rate limits can be enabled, adjusted, and disabled at any
time for the specified bdev. The bdev name is a required parameter for this
RPC command and at least one of `rw_ios_per_sec` and `rw_mbytes_per_sec` must be
specified. When both rate limits are enabled, whichever limit is reached first
takes effect. The value 0 may be specified to disable the corresponding rate
limit. Users can run this command with `-h` or `--help` for more information.
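The parameter rules above can be sketched as a small wrapper that refuses to build a command unless at least one limit is given. This is an illustrative sketch only: `qos_limit_cmd` is a hypothetical helper, and the option spellings `--rw_ios_per_sec` and `--rw_mbytes_per_sec` are assumed to mirror the parameter names in the text; check `rpc.py bdev_set_qos_limit -h` for the exact form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an rpc.py command that sets QoS limits on a bdev.
# Usage: qos_limit_cmd <bdev> <iops or ""> <mbytes_per_sec or "">
qos_limit_cmd() {
	local bdev="${1-}" iops="${2-}" mbps="${3-}"
	if [ -z "$bdev" ]; then
		echo "ERROR: the bdev name is required" >&2
		return 1
	fi
	if [ -z "$iops" ] && [ -z "$mbps" ]; then
		echo "ERROR: specify rw_ios_per_sec, rw_mbytes_per_sec, or both" >&2
		return 1
	fi
	local cmd="rpc.py bdev_set_qos_limit $bdev"
	# Option names below are assumptions modeled on the parameter names.
	if [ -n "$iops" ]; then
		cmd+=" --rw_ios_per_sec $iops"
	fi
	if [ -n "$mbps" ]; then
		cmd+=" --rw_mbytes_per_sec $mbps"
	fi
	echo "$cmd"
}

# Limit Malloc0 to 20000 IOPS, leaving the bandwidth limit untouched.
qos_limit_cmd Malloc0 20000 ""
# A value of 0 disables the corresponding limit.
qos_limit_cmd Malloc0 0 ""
```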
## Histograms {#rpc_bdev_histogram}
The `bdev_enable_histogram` RPC command enables or disables the gathering of
latency data for the specified bdev. The histogram can be downloaded by
calling `bdev_get_histogram` and parsed using the `scripts/histogram.py` script.
Example command
`rpc.py bdev_enable_histogram Nvme0n1 --enable`
The command will enable gathering data for histogram on Nvme0n1 device.
`rpc.py bdev_get_histogram Nvme0n1 | histogram.py`
The command will download the gathered histogram data. The script will parse
the data and show a table containing the I/O count for each latency range.
`rpc.py bdev_enable_histogram Nvme0n1 --disable`
The command will disable histogram on Nvme0n1 device.
# Ceph RBD {#bdev_config_rbd}
@ -63,12 +119,6 @@ To remove a block device representation use the bdev_rbd_delete command.
`rpc.py bdev_rbd_delete Rbd0`
To resize a bdev use the bdev_rbd_resize command.
`rpc.py bdev_rbd_resize Rbd0 4096`
This command will resize the Rbd0 bdev to 4096 MiB.
# Compression Virtual Bdev Module {#bdev_config_compress}
The compression bdev module can be configured to provide compression/decompression
@ -114,7 +164,7 @@ a value of 1 tells the driver to use QAT and if not available then the creation
the vbdev should fail to create or load. A value of '2' as shown below tells the module
to use ISAL and if for some reason it is not available, the vbdev should fail to create or load.
`rpc.py compress_set_pmd -p 2`
`rpc.py set_compress_pmd -p 2`
To remove a compression vbdev, use the following command which will also delete the PMEM
file. If the logical volume is deleted the PMEM file will not be removed and the
@ -139,8 +189,8 @@ time the SPDK virtual bdev module supports cipher only as follows:
- AESN-NI Multi Buffer Crypto Poll Mode Driver: RTE_CRYPTO_CIPHER_AES128_CBC
- Intel(R) QuickAssist (QAT) Crypto Poll Mode Driver: RTE_CRYPTO_CIPHER_AES128_CBC
(Note: QAT is functional however is marked as experimental until the hardware has
been fully integrated with the SPDK CI system.)
(Note: QAT is functional however is marked as experimental until the hardware has
been fully integrated with the SPDK CI system.)
In order to support using the bdev block offset (LBA) as the initialization vector (IV),
the crypto module breaks up all I/O into crypto operations of a size equal to the block
@ -205,7 +255,7 @@ possibly multiple virtual bdevs.
## SPDK GPT partition table {#bdev_ug_gpt}
The SPDK partition type GUID is `7c5222bd-8f5d-4087-9c00-bf9843c7b58c`. Existing SPDK bdevs
can be exposed as Linux block devices via NBD and then can be partitioned with
can be exposed as Linux block devices via NBD and then ca be partitioned with
standard partitioning tools. After partitioning, the bdevs will need to be deleted and
attached again for the GPT bdev module to see any changes. The NBD kernel module must be
loaded first. To create an NBD bdev, use the `nbd_start_disk` RPC command.
@ -280,9 +330,9 @@ Example commands
This command will create `aio0` device from /dev/sda.
`rpc.py bdev_aio_create /tmp/file file 4096`
`rpc.py bdev_aio_create /tmp/file file 8192`
This command will create `file` device with block size 4096 from /tmp/file.
This command will create `file` device with block size 8192 from /tmp/file.
To delete an aio bdev use the bdev_aio_delete command.
@ -312,22 +362,16 @@ To remove `Cache1`:
During removal OCF-cache will be stopped and all cached data will be written to the core device.
Note that OCF has a per-device RAM requirement. More details can be found in the
[OCF documentation](https://open-cas.github.io/guide_system_requirements.html).
Note that OCF has a per-device RAM requirement
of about 56000 + _cache device size_ * 58 / _cache line size_ (in bytes).
To get more information on OCF
please visit [OCF documentation](https://open-cas.github.io/).
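The RAM formula quoted above can be worked through with a small Python sketch. The function name `ocf_ram_requirement` is made up for this example, and it assumes that the cache device size, the cache line size, and the result are all expressed in bytes; treat the output as an estimate only.

```python
def ocf_ram_requirement(cache_device_size, cache_line_size):
    """Approximate per-device RAM need: 56000 + size * 58 / cache line size.

    Both arguments and the result are assumed to be in bytes.
    """
    if cache_device_size < 0 or cache_line_size <= 0:
        raise ValueError("sizes must be positive")
    # Integer division keeps the estimate in whole bytes.
    return 56000 + cache_device_size * 58 // cache_line_size


if __name__ == "__main__":
    gib = 1024 ** 3
    need = ocf_ram_requirement(100 * gib, 4096)
    print(f"100 GiB cache, 4 KiB cache lines: about {need / 1024 ** 2:.0f} MiB of RAM")
```

The estimate grows linearly with the cache device size and shrinks as the cache line size increases, which is why larger cache lines reduce the memory footprint.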
# Malloc bdev {#bdev_config_malloc}
Malloc bdevs are ramdisks and are therefore volatile. They are created from hugepage memory given to the SPDK
application.
Example command for creating malloc bdev:
`rpc.py bdev_malloc_create -b Malloc0 64 512`
Example command for removing malloc bdev:
`rpc.py bdev_malloc_delete Malloc0`
# Null {#bdev_config_null}
The SPDK null bdev driver is a dummy block I/O target that discards all writes and returns undefined
@ -369,22 +413,15 @@ This command will remove NVMe bdev named Nvme0.
## NVMe bdev character device {#bdev_config_nvme_cuse}
This feature is considered as experimental. You must configure with --with-nvme-cuse
option to enable this RPC.
This feature is considered as experimental.
Example commands
`rpc.py bdev_nvme_cuse_register -n Nvme3`
`rpc.py bdev_nvme_cuse_register -n Nvme0 -p spdk/nvme0`
This command will register a character device under /dev/spdk associated with Nvme3
controller. If there are namespaces created on Nvme3 controller, a namespace
character device is also created for each namespace.
For example, the first controller registered will have a character device path of
/dev/spdk/nvmeX, where X is replaced with a unique integer to differentiate it from
other controllers. Note that this 'nvmeX' name here has no correlation to the name
associated with the controller in SPDK. Namespace character devices will have a path
of /dev/spdk/nvmeXnY, where Y is the namespace ID.
This command will register /dev/spdk/nvme0 character device associated with Nvme0
controller. If there are namespaces created on Nvme0 controller, for each namespace
device /dev/spdk/nvme0nX is created.
Cuse devices are removed from the system when the NVMe controller is detached or unregistered
with the command:
@ -417,6 +454,7 @@ User can get list of available lvol stores using `bdev_lvol_get_lvstores` RPC co
parameters available).
Example response
~~~
{
"uuid": "330a6ab2-f468-11e7-983e-001e67edf35d",
@ -448,6 +486,26 @@ Example commands
`rpc.py bdev_lvol_create lvol2 25 -u 330a6ab2-f468-11e7-983e-001e67edf35d`
# RAID {#bdev_ug_raid}
RAID virtual bdev module provides functionality to combine any SPDK bdevs into
one RAID bdev. Currently SPDK supports only RAID 0. RAID functionality does not
store on-disk metadata on the member disks, so user must recreate the RAID
volume when restarting application. User may specify member disks to create RAID
volume even if they do not exist yet - as the member disks are registered at
a later time, the RAID module will claim them and will surface the RAID volume
after all of the member disks are available. It is allowed to use disks of
different sizes - the smallest disk size will be the amount of space used on
each member disk.
Example commands
`rpc.py bdev_raid_create -n Raid0 -z 64 -r 0 -b "lvol0 lvol1 lvol2 lvol3"`
`rpc.py bdev_raid_get_bdevs`
`rpc.py bdev_raid_delete Raid0`
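The sizing rule described for RAID 0 (the smallest member determines how much of every member is used) can be sketched in a few lines of Python. The function name `raid0_capacity` is hypothetical, and the sketch ignores any rounding to the strip size that a real implementation may apply.

```python
def raid0_capacity(member_sizes):
    """Usable RAID 0 capacity when each member contributes only as much
    space as the smallest member provides (same unit as the inputs)."""
    if not member_sizes:
        raise ValueError("at least one member disk is required")
    per_member = min(member_sizes)
    return per_member * len(member_sizes)


if __name__ == "__main__":
    # Four members of unequal size, in GiB.
    sizes = [100, 120, 80, 100]
    print(f"usable capacity: {raid0_capacity(sizes)} GiB")
```

With members of 100, 120, 80 and 100 GiB, only 80 GiB of each disk is used, so the remaining space on the larger members is left idle.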
# Passthru {#bdev_config_passthru}
The SPDK Passthru virtual block device module serves as an example of how to write a
@ -497,65 +555,6 @@ To remove a block device representation use the bdev_pmem_delete command.
`rpc.py bdev_pmem_delete pmem`
# RAID {#bdev_ug_raid}
RAID virtual bdev module provides functionality to combine any SPDK bdevs into
one RAID bdev. Currently SPDK supports only RAID 0. RAID functionality does not
store on-disk metadata on the member disks, so user must recreate the RAID
volume when restarting application. User may specify member disks to create RAID
volume even if they do not exist yet - as the member disks are registered at
a later time, the RAID module will claim them and will surface the RAID volume
after all of the member disks are available. It is allowed to use disks of
different sizes - the smallest disk size will be the amount of space used on
each member disk.
Example commands
`rpc.py bdev_raid_create -n Raid0 -z 64 -r 0 -b "lvol0 lvol1 lvol2 lvol3"`
`rpc.py bdev_raid_get_bdevs`
`rpc.py bdev_raid_delete Raid0`
# Split {#bdev_ug_split}
The split block device module takes an underlying block device and splits it into
several smaller equal-sized virtual block devices. This serves as an example to create
more vbdevs on a given base bdev for user testing.
Example commands
To create four split bdevs with base bdev_b0 use the `bdev_split_create` command.
Each split bdev will be one fourth the size of the base bdev.
`rpc.py bdev_split_create bdev_b0 4`
The `split_size_mb`(-s) parameter restricts the size of each split bdev.
The total size of all split bdevs must not exceed the base bdev size.
`rpc.py bdev_split_create bdev_b0 4 -s 128`
To remove the split bdevs, use the `bdev_split_delete` command with the base bdev name.
`rpc.py bdev_split_delete bdev_b0`
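The sizing constraints for split bdevs can be illustrated with a short Python sketch. The function `plan_split` is hypothetical and works in whole megabytes; it only mirrors the two rules stated above (equal shares by default, and an explicit size whose total must fit in the base bdev).

```python
def plan_split(base_size_mb, split_count, split_size_mb=None):
    """Return the size in MB of each split bdev, or raise if it cannot fit."""
    if split_count <= 0:
        raise ValueError("split_count must be positive")
    if split_size_mb is None:
        # Default: divide the base bdev into equal parts.
        return base_size_mb // split_count
    if split_size_mb * split_count > base_size_mb:
        raise ValueError("total size of all split bdevs exceeds the base bdev size")
    return split_size_mb


if __name__ == "__main__":
    print(plan_split(1024, 4))        # equal shares of a 1024 MB base bdev
    print(plan_split(1024, 4, 128))   # explicit 128 MB per split
```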
# Uring {#bdev_ug_uring}
The uring bdev module issues I/O to kernel block devices using the io_uring Linux kernel API. This module requires liburing.
For more information on io_uring refer to kernel [IO_uring](https://kernel.dk/io_uring.pdf).
The user needs to configure SPDK to include io_uring support:
`configure --with-uring`
To create a uring bdev with given filename, bdev name and block size use the `bdev_uring_create` RPC.
`rpc.py bdev_uring_create /path/to/device bdev_u0 512`
To remove a uring bdev use the `bdev_uring_delete` RPC.
`rpc.py bdev_uring_delete bdev_u0`
# Virtio Block {#bdev_config_virtio_blk}
The Virtio-Block driver allows creating SPDK bdevs from Virtio-Block devices.


@ -18,7 +18,7 @@ how to write a module.
## Creating A New Module
Block device modules are located in subdirectories under module/bdev today. It is not
Block device modules are located in subdirectories under lib/bdev today. It is not
currently possible to place the code for a bdev module elsewhere, but updates
to the build system could be made to enable this in the future. To create a
module, add a new directory with a single C file and a Makefile. A great
@ -137,15 +137,6 @@ block device. Once the I/O request is completed, the module must call
spdk_bdev_io_complete(). The I/O does not have to finish within the calling
context of `submit_request`.
Integrating a new bdev module into the build system requires updates to various
files in the /mk directory.
## Creating Bdevs in an External Repository
A User can build their own bdev module and application on top of existing SPDK libraries. The example in
test/external_code serves as a template for creating, building and linking an external
bdev module. Refer to test/external_code/README.md and @ref so_linking for further information.
## Creating Virtual Bdevs
Block devices are considered virtual if they handle I/O requests by routing
@ -153,7 +144,7 @@ the I/O to other block devices. The canonical example would be a bdev module
that implements RAID. Virtual bdevs are created in the same way as regular
bdevs, but take one additional step. The module can look up the underlying
bdevs it wishes to route I/O to using spdk_bdev_get_by_name(), where the string
name is provided by the user via an RPC. The module
name is provided by the user in a configuration file or via an RPC. The module
then may proceed as normal by opening the bdev to obtain a descriptor, and
creating I/O channels for the bdev (probably in response to the
`get_io_channel` callback). The final step is to have the module use its open


@ -1,86 +0,0 @@
# Using bdevperf application {#bdevperf}
## Introduction
bdevperf is an SPDK application that is used for performance testing
of block devices (bdevs) exposed by the SPDK bdev layer. It is an
alternative to the SPDK bdev fio plugin for benchmarking SPDK bdevs.
In some cases, bdevperf can provide much lower overhead than the fio
plugin, resulting in much better performance for tests using a limited
number of CPU cores.
bdevperf exposes a command line interface that allows the user to specify
SPDK framework options as well as testing options.
Since SPDK 20.07, bdevperf supports a configuration file that is similar
to FIO's. It allows the user to create jobs parameterized by
filename, cpumask, blocksize, queuesize, etc.
## Config file
Bdevperf's config file is similar to FIO's config file format.
Below is an example config file that uses all available parameters:
~~~{.ini}
[global]
filename=Malloc0:Malloc1
bs=1024
iosize=256
rw=randrw
rwmixread=90
[A]
cpumask=0xff
[B]
cpumask=[0-128]
filename=Malloc1
[global]
filename=Malloc0
rw=write
[C]
bs=4096
iosize=128
offset=1000000
length=1000000
~~~
Jobs `[A]`, `[B]` and `[C]` inherit default values from the `[global]`
section residing above them. So in the example, job `[A]` inherits
`filename` value and uses both `Malloc0` and `Malloc1` bdevs as targets,
job `[B]` overrides its `filename` value and uses `Malloc1` and
job `[C]` inherits value `Malloc0` for its `filename`.
Interaction with CLI arguments, however, is not the same as in FIO.
If bdevperf receives a CLI argument, it overrides the value
of the corresponding parameter in all `[global]` sections of the config file.
So if the example config is used, specifying the `-q` argument
will make jobs `[A]` and `[B]` use its value.
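The inheritance rules described above can be modelled with a small Python sketch. This is not bdevperf's parser: the function `resolve_jobs` is hypothetical, CLI overrides are left out, and the sketch assumes that each new `[global]` section replaces the previous defaults entirely rather than merging with them, which the text does not state either way.

```python
EXAMPLE = """
[global]
filename=Malloc0:Malloc1
bs=1024
iosize=256
rw=randrw
rwmixread=90
[A]
cpumask=0xff
[B]
cpumask=[0-128]
filename=Malloc1
[global]
filename=Malloc0
rw=write
[C]
bs=4096
iosize=128
offset=1000000
length=1000000
"""


def resolve_jobs(text):
    """Return {job_name: effective_params} for a bdevperf-style config."""
    defaults = {}   # values of the most recent [global] section
    jobs = {}
    current = None  # dict that key=value lines are currently written to
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            if section == "global":
                defaults = {}  # assumption: a new [global] starts from scratch
                current = defaults
            else:
                # A job starts from the defaults of the [global] above it.
                jobs[section] = dict(defaults)
                current = jobs[section]
            continue
        if current is None:
            raise ValueError(f"parameter outside of any section: {line}")
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    return jobs


if __name__ == "__main__":
    for name, params in resolve_jobs(EXAMPLE).items():
        print(name, params["filename"])
```

Run against the example config, this yields `Malloc0:Malloc1` for job `[A]`, `Malloc1` for job `[B]` and `Malloc0` for job `[C]`, matching the description above.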
Below is a full list of supported parameters with descriptions.
Param | Default | Description
--------- | ----------------- | -----------
filename | | Bdevs to use, separated by ":"
cpumask | Maximum available | CPU mask. Format is defined at @ref cpu_mask
bs | | Block size (io size)
iodepth | | Queue depth
rwmixread | `50` | Percentage of a mixed workload that should be reads
offset | `0` | Start I/O at the provided offset on the bdev
length | 100% of bdev size | End I/O at `offset`+`length` on the bdev
rw | | Type of I/O pattern
Available rw types:
- read
- randread
- write
- randwrite
- verify
- reset
- unmap
- write_zeroes
- flush
- rw
- randrw


@ -35,27 +35,27 @@ NAND too.
## Theory of Operation {#blob_pg_theory}
### Abstractions
### Abstractions:
The Blobstore defines a hierarchy of storage abstractions as follows.
* **Logical Block**: Logical blocks are exposed by the disk itself, which are numbered from 0 to N, where N is the
number of blocks in the disk. A logical block is typically either 512B or 4KiB.
number of blocks in the disk. A logical block is typically either 512B or 4KiB.
* **Page**: A page is defined to be a fixed number of logical blocks defined at Blobstore creation time. The logical
blocks that compose a page are always contiguous. Pages are also numbered from the beginning of the disk such
that the first page worth of blocks is page 0, the second page is page 1, etc. A page is typically 4KiB in size,
so this is either 8 or 1 logical blocks in practice. The SSD must be able to perform atomic reads and writes of
at least the page size.
blocks that compose a page are always contiguous. Pages are also numbered from the beginning of the disk such
that the first page worth of blocks is page 0, the second page is page 1, etc. A page is typically 4KiB in size,
so this is either 8 or 1 logical blocks in practice. The SSD must be able to perform atomic reads and writes of
at least the page size.
* **Cluster**: A cluster is a fixed number of pages defined at Blobstore creation time. The pages that compose a cluster
are always contiguous. Clusters are also numbered from the beginning of the disk, where cluster 0 is the first cluster
worth of pages, cluster 1 is the second grouping of pages, etc. A cluster is typically 1MiB in size, or 256 pages.
are always contiguous. Clusters are also numbered from the beginning of the disk, where cluster 0 is the first cluster
worth of pages, cluster 1 is the second grouping of pages, etc. A cluster is typically 1MiB in size, or 256 pages.
* **Blob**: A blob is an ordered list of clusters. Blobs are manipulated (created, sized, deleted, etc.) by the application
and persist across power failures and reboots. Applications use a Blobstore provided identifier to access a particular blob.
Blobs are read and written in units of pages by specifying an offset from the start of the blob. Applications can also
store metadata in the form of key/value pairs with each blob which we'll refer to as xattrs (extended attributes).
and persist across power failures and reboots. Applications use a Blobstore provided identifier to access a particular blob.
Blobs are read and written in units of pages by specifying an offset from the start of the blob. Applications can also
store metadata in the form of key/value pairs with each blob which we'll refer to as xattrs (extended attributes).
* **Blobstore**: An SSD which has been initialized by a Blobstore-based application is referred to as "a Blobstore." A
Blobstore owns the entire underlying device which is made up of a private Blobstore metadata region and the collection of
blobs as managed by the application.
Blobstore owns the entire underlying device which is made up of a private Blobstore metadata region and the collection of
blobs as managed by the application.
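The relationship between logical blocks, pages and clusters can be checked with a little arithmetic. The Python sketch below uses the typical sizes quoted above (512 B or 4 KiB blocks, 4 KiB pages, 1 MiB clusters); the function names are invented for this example and are not part of the Blobstore API.

```python
def blocks_per_page(page_size, block_size):
    """Number of contiguous logical blocks that make up one page."""
    if page_size % block_size:
        raise ValueError("page size must be a multiple of the block size")
    return page_size // block_size


def pages_per_cluster(cluster_size, page_size):
    """Number of contiguous pages that make up one cluster."""
    if cluster_size % page_size:
        raise ValueError("cluster size must be a multiple of the page size")
    return cluster_size // page_size


def cluster_first_lba(cluster_index, cluster_size, block_size):
    """First logical block of a cluster, counting clusters from the start of the disk."""
    return cluster_index * (cluster_size // block_size)


if __name__ == "__main__":
    KIB, MIB = 1024, 1024 * 1024
    print(blocks_per_page(4 * KIB, 512))        # 512 B blocks per 4 KiB page
    print(blocks_per_page(4 * KIB, 4 * KIB))    # 4 KiB blocks per 4 KiB page
    print(pages_per_cluster(1 * MIB, 4 * KIB))  # pages per 1 MiB cluster
    print(cluster_first_lba(2, 1 * MIB, 512))   # first LBA of cluster 2
```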
@htmlonly
@ -87,6 +87,7 @@ The Blobstore defines a hierarchy of storage abstractions as follows.
35,
{ alignment: 'center', fill: 'white' });
for (var j = 0; j < 4; j++) {
let pageWidth = 100;
let pageHeight = canvasHeight;
@ -114,19 +115,19 @@ For all Blobstore operations regarding atomicity, there is a dependency on the u
operations of at least one page size. Atomicity here can refer to multiple operations:
* **Data Writes**: For the case of data writes, the unit of atomicity is one page. Therefore if a write operation of
greater than one page is underway and the system suffers a power failure, the data on media will be consistent at a page
size granularity (if a single page were in the middle of being updated when power was lost, the data at that page location
will be as it was prior to the start of the write operation following power restoration.)
greater than one page is underway and the system suffers a power failure, the data on media will be consistent at a page
size granularity (if a single page were in the middle of being updated when power was lost, the data at that page location
will be as it was prior to the start of the write operation following power restoration.)
* **Blob Metadata Updates**: Each blob has its own set of metadata (xattrs, size, etc). For performance reasons, a copy of
this metadata is kept in RAM and only synchronized with the on-disk version when the application makes an explicit call to
do so, or when the Blobstore is unloaded. Therefore, setting of an xattr, for example is not consistent until the call to
synchronize it (covered later) which is, however, performed atomically.
this metadata is kept in RAM and only synchronized with the on-disk version when the application makes an explicit call to
do so, or when the Blobstore is unloaded. Therefore, setting of an xattr, for example is not consistent until the call to
synchronize it (covered later) which is, however, performed atomically.
* **Blobstore Metadata Updates**: Blobstore itself has its own metadata which, like per blob metadata, has a copy in both
RAM and on-disk. Unlike the per blob metadata, however, the Blobstore metadata region is not made consistent via a blob
synchronization call, it is only synchronized when the Blobstore is properly unloaded via API. Therefore, if the Blobstore
metadata is updated (blob creation, deletion, resize, etc.) and not unloaded properly, it will need to perform some extra
steps the next time it is loaded which will take a bit more time than it would have if shutdown cleanly, but there will be
no inconsistencies.
RAM and on-disk. Unlike the per blob metadata, however, the Blobstore metadata region is not made consistent via a blob
synchronization call, it is only synchronized when the Blobstore is properly unloaded via API. Therefore, if the Blobstore
metadata is updated (blob creation, deletion, resize, etc.) and not unloaded properly, it will need to perform some extra
steps the next time it is loaded which will take a bit more time than it would have if shutdown cleanly, but there will be
no inconsistencies.
### Callbacks
@ -182,22 +183,22 @@ When the Blobstore is initialized, there are multiple configuration options to c
options and their defaults are:
* **Cluster Size**: By default, this value is 1MB. The cluster size is required to be a multiple of page size and should be
selected based on the applications usage model in terms of allocation. Recall that blobs are made up of clusters so when
a blob is allocated/deallocated or changes in size, disk LBAs will be manipulated in groups of cluster size. If the
application is expecting to deal with mainly very large (always multiple GB) blobs then it may make sense to change the
cluster size to 1GB for example.
selected based on the applications usage model in terms of allocation. Recall that blobs are made up of clusters so when
a blob is allocated/deallocated or changes in size, disk LBAs will be manipulated in groups of cluster size. If the
application is expecting to deal with mainly very large (always multiple GB) blobs then it may make sense to change the
cluster size to 1GB for example.
* **Number of Metadata Pages**: By default, Blobstore will assume there can be as many clusters as there are metadata pages
which is the worst case scenario in terms of metadata usage and can be overridden here however the space efficiency is
not significant.
which is the worst case scenario in terms of metadata usage and can be overridden here however the space efficiency is
not significant.
* **Maximum Simultaneous Metadata Operations**: Determines how many internally pre-allocated memory structures are set
aside for performing metadata operations. It is unlikely that changes to this value (default 32) would be desirable.
aside for performing metadata operations. It is unlikely that changes to this value (default 32) would be desirable.
* **Maximum Simultaneous Operations Per Channel**: Determines how many internally pre-allocated memory structures are set
aside for channel operations. Changes to this value would be application dependent and best determined by both a knowledge
of the typical usage model, an understanding of the types of SSDs being used and empirical data. The default is 512.
aside for channel operations. Changes to this value would be application dependent and best determined by both a knowledge
of the typical usage model, an understanding of the types of SSDs being used and empirical data. The default is 512.
* **Blobstore Type**: This field is a character array to be used by applications that need to identify whether the
Blobstore found here is appropriate to claim or not. The default is NULL and unless the application is being deployed in
an environment where multiple applications using the same disks are at risk of inadvertently using the wrong Blobstore, there
is no need to set this value. It can, however, be set to any valid set of characters.
Blobstore found here is appropriate to claim or not. The default is NULL and unless the application is being deployed in
an environment where multiple applications using the same disks are at risk of inadvertently using the wrong Blobstore, there
is no need to set this value. It can, however, be set to any valid set of characters.
### Sub-page Sized Operations
@ -209,11 +210,10 @@ requires finer granularity it will have to accommodate that itself.
As mentioned earlier, Blobstore can share a single thread with an application or the application
can define any number of threads, within resource constraints, that makes sense. The basic considerations that must be
followed are:
* Metadata operations (API with MD in the name) should be isolated from each other as there is no internal locking on the
memory structures affected by these API.
memory structures affected by these API.
* Metadata operations should be isolated from conflicting IO operations (an example of a conflicting IO would be one that is
reading/writing to an area of a blob that a metadata operation is deallocating).
reading/writing to an area of a blob that a metadata operation is deallocating).
* Asynchronous callbacks will always take place on the calling thread.
* No assumptions about IO ordering can be made regardless of how many or which threads were involved in the issuing.
@ -225,7 +225,7 @@ with SPDK API.
### Error Handling
Asynchronous Blobstore callbacks all include an error number that should be checked; non-zero values
indicate an error. Synchronous calls will typically return an error value if applicable.
indicate and error. Synchronous calls will typically return an error value if applicable.
### Asynchronous API
@ -267,18 +267,18 @@ relevant in understanding any kind of structure for what is on the Blobstore.
There are multiple examples of Blobstore usage in the [repo](https://github.com/spdk/spdk):
* **Hello World**: Actually named `hello_blob.c` this is a very basic example of a single threaded application that
does nothing more than demonstrate the very basic API. Although Blobstore is optimized for NVMe, this example uses
a RAM disk (malloc) back-end so that it can be executed easily in any development environment. The malloc back-end
is a `bdev` module thus this example uses not only the SPDK Framework but the `bdev` layer as well.
does nothing more than demonstrate the very basic API. Although Blobstore is optimized for NVMe, this example uses
a RAM disk (malloc) back-end so that it can be executed easily in any development environment. The malloc back-end
is a `bdev` module thus this example uses not only the SPDK Framework but the `bdev` layer as well.
* **CLI**: The `blobcli.c` example is a command line utility intended to not only serve as example code but as a test
and development tool for Blobstore itself. It is also a simple single threaded application that relies on both the
SPDK Framework and the `bdev` layer but offers multiple modes of operation to accomplish some real-world tasks. In
command mode, it accepts single-shot commands which can be a little time consuming if there are many commands to
get through as each one will take a few seconds waiting for DPDK initialization. It therefore has a shell mode that
allows the developer to get to a `blob>` prompt and then very quickly interact with Blobstore with simple commands
that include the ability to import/export blobs from/to regular files. Lastly there is a scripting mode to automate
a series of tasks, again, handy for development and/or test type activities.
and development tool for Blobstore itself. It is also a simple single threaded application that relies on both the
SPDK Framework and the `bdev` layer but offers multiple modes of operation to accomplish some real-world tasks. In
command mode, it accepts single-shot commands which can be a little time consuming if there are many commands to
get through as each one will take a few seconds waiting for DPDK initialization. It therefore has a shell mode that
allows the developer to get to a `blob>` prompt and then very quickly interact with Blobstore with simple commands
that include the ability to import/export blobs from/to regular files. Lastly there is a scripting mode to automate
a series of tasks, again, handy for development and/or test type activities.
## Configuration {#blob_pg_config}
@ -318,25 +318,6 @@ form a linked list. The first page in the list will be written in place on updat
be written to fresh locations. This requires the backing device to support an atomic write size greater than
or equal to the page size to guarantee that the operation is atomic. See the section on atomicity for details.
### Blob cluster layout {#blob_pg_cluster_layout}
Each blob is an ordered list of clusters, where starting LBA of a cluster is called extent. A blob can be
thin provisioned, resulting in no extent for some of the clusters. When first write operation occurs
to the unallocated cluster - new extent is chosen. This information is stored in RAM and on-disk.
There are two extent representations on-disk, dependent on `use_extent_table` (default:true) opts used
when creating a blob.
* **use_extent_table=true**: EXTENT_PAGE descriptor is not part of linked list of pages. It contains extents
that are not run-length encoded. Each extent page is referenced by EXTENT_TABLE descriptor, which is serialized
as part of linked list of pages. Extent table is run-length encoding all unallocated extent pages.
Every new cluster allocation updates a single extent page, in case when extent page was previously allocated.
Otherwise additionally incurs serializing whole linked list of pages for the blob.
* **use_extent_table=false**: EXTENT_RLE descriptor is serialized as part of linked list of pages.
Extents pointing to contiguous LBA are run-length encoded, including unallocated extents represented by 0.
Every new cluster allocation incurs serializing whole linked list of pages for the blob.
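To make the run-length encoding idea concrete, here is a minimal Python sketch. It is not the on-disk EXTENT_RLE format: it works on cluster indices instead of LBAs, the function name `rle_encode_extents` is hypothetical, and the value 0 stands for an unallocated cluster as described above.

```python
def rle_encode_extents(clusters):
    """Collapse a blob's cluster list into (start, length) runs.

    clusters: one entry per blob cluster; 0 means unallocated.
    Consecutive allocated clusters form one run when they are contiguous
    on disk; consecutive unallocated clusters form one run with start 0.
    """
    runs = []
    for cluster in clusters:
        if runs:
            start, length = runs[-1]
            both_unallocated = cluster == 0 and start == 0
            contiguous = cluster != 0 and start != 0 and cluster == start + length
            if both_unallocated or contiguous:
                runs[-1] = (start, length + 1)
                continue
        runs.append((cluster, 1))
    return runs


if __name__ == "__main__":
    # Three contiguous clusters, a two-cluster hole, then one more cluster.
    print(rle_encode_extents([5, 6, 7, 0, 0, 10]))
```

A thin-provisioned blob with long unallocated stretches therefore needs only a handful of runs, while a heavily fragmented blob approaches one run per cluster.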
### Sequences and Batches
Internally Blobstore uses the concepts of sequences and batches to submit IO to the underlying device in either
@ -394,6 +375,5 @@ example,
~~~
And for the most part the following conventions are followed throughout:
* functions beginning with an underscore are called internally only
* functions or variables with the letters `cpl` are related to set or callback completions


@ -14,30 +14,30 @@ make
~~~
Clone the RocksDB repository from the SPDK GitHub fork into a separate directory.
Make sure you check out the `6.15.fb` branch.
Make sure you check out the `spdk-v5.14.3` branch.
~~~{.sh}
cd ..
git clone -b 6.15.fb https://github.com/spdk/rocksdb.git
git clone -b spdk-v5.14.3 https://github.com/spdk/rocksdb.git
~~~
Build RocksDB. Only the `db_bench` benchmarking tool is integrated with BlobFS.
~~~{.sh}
cd rocksdb
make db_bench SPDK_DIR=relative_path/to/spdk
make db_bench SPDK_DIR=path/to/spdk
~~~
Or you can also add `DEBUG_LEVEL=0` for a release build (need to turn on `USE_RTTI`).
~~~{.sh}
export USE_RTTI=1 && make db_bench DEBUG_LEVEL=0 SPDK_DIR=relative_path/to/spdk
export USE_RTTI=1 && make db_bench DEBUG_LEVEL=0 SPDK_DIR=path/to/spdk
~~~
Create an NVMe section in the configuration file using SPDK's `gen_nvme.sh` script.
~~~{.sh}
scripts/gen_nvme.sh --json-with-subsystems > /usr/local/etc/spdk/rocksdb.json
scripts/gen_nvme.sh > /usr/local/etc/spdk/rocksdb.conf
~~~
Verify the configuration file has specified the correct NVMe SSD.
@ -54,7 +54,7 @@ HUGEMEM=5120 scripts/setup.sh
Create an empty SPDK blobfs for testing.
~~~{.sh}
test/blobfs/mkfs/mkfs /usr/local/etc/spdk/rocksdb.json Nvme0n1
test/blobfs/mkfs/mkfs /usr/local/etc/spdk/rocksdb.conf Nvme0n1
~~~
At this point, RocksDB is ready for testing with SPDK. Three `db_bench` parameters are used to configure SPDK:
@ -74,7 +74,7 @@ BlobFS provides a FUSE plug-in to mount an SPDK BlobFS as a kernel filesystem fo
The FUSE plug-in requires fuse3 and will be built automatically when fuse3 is detected on the system.
~~~{.sh}
test/blobfs/fuse/fuse /usr/local/etc/spdk/rocksdb.json Nvme0n1 /mnt/fuse
test/blobfs/fuse/fuse /usr/local/etc/spdk/rocksdb.conf Nvme0n1 /mnt/fuse
~~~
Note that the FUSE plug-in has some limitations - see the list below.


@ -1,6 +0,0 @@
# CI Tools {#ci_tools}
Section describing tools used by CI to verify integrity of the submitted
patches ([status](https://ci.spdk.io)).
- @subpage shfmt


@ -20,7 +20,7 @@ properties:
because you don't have to change the data model from the single-threaded
version. You add a lock around the data.
* You can write your program as a synchronous, imperative list of statements
that you read from top to bottom.
that you read from top to bottom.
* The scheduler can interrupt threads, allowing for efficient time-sharing
of CPU resources.
@ -117,10 +117,14 @@ framework for all of the example applications it shipped with, in the interest
of supporting the widest variety of frameworks possible. But the applications do
of course require something that implements an asynchronous event loop in order
to run, so enter the `event` framework located in `lib/event`. This framework
includes things like polling and scheduling the lightweight threads, installing
signal handlers to cleanly shutdown, and basic command line option parsing.
Only established applications should consider directly integrating the lower
level libraries.
includes things like spawning one thread per core, pinning each thread to a
unique core, polling and scheduling the lightweight threads, installing signal
handlers to cleanly shut down, and basic command line option parsing. When
started through spdk_app_start(), the library automatically spawns all of the
threads requested, pins them, and is ready for lightweight threads to be
created. This makes it much easier to implement a brand new SPDK application and
is the recommended method for those starting out. Only established applications
should consider directly integrating the lower level libraries.
# Limitations of the C Language

@ -1,91 +0,0 @@
# SPDK and Containers {#containers}
This is a living document as there are many ways to use containers with
SPDK. As new usages are identified and tested, they will be documented
here.
# In this document {#containers_toc}
* @ref kata_containers_with_spdk_vhost
* @ref spdk_in_docker
# Using SPDK vhost target to provide volume service to Kata Containers and Docker {#kata_containers_with_spdk_vhost}
[Kata Containers](https://katacontainers.io) can build a secure container
runtime with lightweight virtual machines that feel and perform like
containers, but provide stronger workload isolation using hardware
virtualization technology as a second layer of defense.
From Kata Containers [1.11.0](https://github.com/kata-containers/runtime/releases/tag/1.11.0),
vhost-user-blk support is enabled in `kata-containers/runtime`. That is to say
SPDK vhost target can be used to provide volume service to Kata Containers directly.
In addition, a container manager like Docker can easily be configured to launch
a Kata container with an SPDK vhost-user block device. For operating details, see the
Kata Containers use case [Setup to run SPDK vhost-user devices with Kata Containers and Docker](https://github.com/kata-containers/documentation/blob/master/use-cases/using-SPDK-vhostuser-and-kata.md#host-setup-for-vhost-user-devices).
# Containerizing an SPDK Application for Docker {#spdk_in_docker}
There are no SPDK-specific changes needed to run an SPDK-based application in
a Docker container. However, this quick start guide should help you as you
containerize your SPDK-based application.
1. Make sure you have all of your app dependencies identified and included in your Dockerfile
2. Make sure you have compiled your application for the target arch
3. Make sure your host has hugepages enabled
4. Make sure your host has bound your nvme device to your userspace driver
5. Write your Dockerfile. The following is a simple Dockerfile to containerize the nvme `hello_world`
example:
~~~{.sh}
# start with the latest Fedora
FROM fedora
# if you are behind a proxy, set that up now
ADD dnf.conf /etc/dnf/dnf.conf
# these are the min dependencies for the hello_world app
RUN dnf install libaio-devel -y
RUN dnf install numactl-devel -y
# set our working dir
WORKDIR /app
# add the hello_world binary
ADD hello_world hello_world
# run the app
CMD ./hello_world
~~~
6. Create your image
`sudo docker image build -t hello:1.0 .`
7. Your docker command line will need to include at least the following:
- the `--privileged` flag to enable sharing of hugepages
- use of the `-v` switch to map hugepages
`sudo docker run --privileged -v /dev/hugepages:/dev/hugepages hello:1.0`
or depending on the needs of your app you may need one or more of the following parameters:
- If you are using the SPDK app framework: `-v /dev/shm:/dev/shm`
- If you need to use RPCs from outside of the container: `-v /var/tmp:/var/tmp`
- If you need to use the host network (i.e. NVMF target application): `--network host`
Your output should look something like this:
~~~{.sh}
$ sudo docker run --privileged -v /dev/hugepages:/dev/hugepages hello:1.0
Starting SPDK v20.01-pre git sha1 80da95481 / DPDK 19.11.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk0 --proc-type=auto ]
EAL: No available hugepages reported in hugepages-1048576kB
Initializing NVMe Controllers
Attaching to 0000:06:00.0
Attached to 0000:06:00.0
Using controller INTEL SSDPEDMD400G4 (CVFT7203005M400LGN ) with 1 namespaces.
Namespace ID: 1 size: 400GB
Initialization complete.
INFO: using host memory buffer for IO
Hello world!
~~~
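The optional parameters listed in step 7 can be combined in a single command line. The following is a minimal sketch that only prints the resulting command rather than running it. The helper name `spdk_docker_cmd` and its keywords (`app`, `rpc`, `hostnet`) are hypothetical and exist only for this illustration.

```sh
# Hypothetical helper (not part of SPDK or Docker): prints a docker run
# command line for an SPDK image. The first argument is the image name,
# the remaining arguments select the optional parameters from step 7.
spdk_docker_cmd() {
    image=${1:?image name required}
    shift
    # Always required: privileged mode and the hugepages mapping.
    cmd="sudo docker run --privileged -v /dev/hugepages:/dev/hugepages"
    for opt in "$@"; do
        case "$opt" in
            app)     cmd="$cmd -v /dev/shm:/dev/shm" ;;   # SPDK app framework
            rpc)     cmd="$cmd -v /var/tmp:/var/tmp" ;;   # RPCs from outside the container
            hostnet) cmd="$cmd --network host" ;;         # host network, e.g. NVMF target
            *)       echo "unknown option: $opt" >&2; return 1 ;;
        esac
    done
    echo "$cmd $image"
}

spdk_docker_cmd hello:1.0 app rpc hostnet
```

Printing the command first makes it easy to review which host paths are shared with the container before actually running it.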

@ -2,6 +2,4 @@
- @subpage nvme
- @subpage ioat
- @subpage idxd
- @subpage virtio
- @subpage vmd

@ -1,9 +1,8 @@
# Flash Translation Layer {#ftl}
The Flash Translation Layer library provides block device access on top of devices
implementing bdev_zone interface.
It handles the logical to physical address mapping, responds to the asynchronous
media management events, and manages the defragmentation process.
The Flash Translation Layer library provides block device access on top of non-block SSDs
implementing Open Channel interface. It handles the logical to physical address mapping, responds to
the asynchronous media management events, and manages the defragmentation process.
# Terminology {#ftl_terminology}
@ -11,32 +10,32 @@ media management events, and manages the defragmentation process.
* Shorthand: L2P
Contains the mapping of the logical addresses (LBA) to their on-disk physical location. The LBAs
are contiguous and in range from 0 to the number of surfaced blocks (the number of spare blocks
Contains the mapping of the logical addresses (LBA) to their on-disk physical location (PPA). The
LBAs are contiguous and in range from 0 to the number of surfaced blocks (the number of spare blocks
are calculated during device formation and are subtracted from the available address space). The
spare blocks account for zones going offline throughout the lifespan of the device as well as
spare blocks account for chunks going offline throughout the lifespan of the device as well as
provide necessary buffer for data [defragmentation](#ftl_reloc).
## Band {#ftl_band}
A band describes a collection of zones, each belonging to a different parallel unit. All writes to
a band follow the same pattern - a batch of logical blocks is written to one zone, another batch
Band describes a collection of chunks, each belonging to a different parallel unit. All writes to
the band follow the same pattern - a batch of logical blocks is written to one chunk, another batch
to the next one and so on. This ensures the parallelism of the write operations, as they can be
executed independently on different zones. Each band keeps track of the LBAs it consists of, as
executed independently on different chunks. Each band keeps track of the LBAs it consists of, as
well as their validity, as some of the data will be invalidated by subsequent writes to the same
logical address. The L2P mapping can be restored from the SSD by reading this information in order
from the oldest band to the youngest.
+--------------+ +--------------+ +--------------+
band 1 | zone 1 +--------+ zone 1 +---- --- --- --- --- ---+ zone 1 |
band 1 | chunk 1 +--------+ chk 1 +---- --- --- --- --- ---+ chk 1 |
+--------------+ +--------------+ +--------------+
band 2 | zone 2 +--------+ zone 2 +---- --- --- --- --- ---+ zone 2 |
band 2 | chunk 2 +--------+ chk 2 +---- --- --- --- --- ---+ chk 2 |
+--------------+ +--------------+ +--------------+
band 3 | zone 3 +--------+ zone 3 +---- --- --- --- --- ---+ zone 3 |
band 3 | chunk 3 +--------+ chk 3 +---- --- --- --- --- ---+ chk 3 |
+--------------+ +--------------+ +--------------+
| ... | | ... | | ... |
+--------------+ +--------------+ +--------------+
band m | zone m +--------+ zone m +---- --- --- --- --- ---+ zone m |
band m | chunk m +--------+ chk m +---- --- --- --- --- ---+ chk m |
+--------------+ +--------------+ +--------------+
| ... | | ... | | ... |
+--------------+ +--------------+ +--------------+
@ -46,20 +45,21 @@ from the oldest band to the youngest.
The address map and valid map are, along with several other things (e.g. UUID of the device it's
part of, number of surfaced LBAs, band's sequence number, etc.), parts of the band's metadata. The
metadata is split in two parts:
head metadata band's data tail metadata
+-------------------+-------------------------------+------------------------+
|zone 1 |...|zone n |...|...|zone 1 |...| | ... |zone m-1 |zone m|
|block 1| |block 1| | |block x| | | |block y |block y|
+-------------------+-------------+-----------------+------------------------+
* the head part, containing information already known when opening the band (device's UUID, band's
sequence number, etc.), located at the beginning blocks of the band,
* the tail part, containing the address map and the valid map, located at the end of the band.
Bands are written sequentially (in a way that was described earlier). Before a band can be written
to, all of its zones need to be erased. During that time, the band is considered to be in a `PREP`
state. After that is done, the band transitions to the `OPENING` state, in which head metadata
head metadata band's data tail metadata
+-------------------+-------------------------------+----------------------+
|chk 1|...|chk n|...|...|chk 1|...| | ... |chk m-1 |chk m|
|lbk 1| |lbk 1| | |lbk x| | | |lblk y |lblk y|
+-------------------+-------------+-----------------+----------------------+
Bands are being written sequentially (in a way that was described earlier). Before a band can be
written to, all of its chunks need to be erased. During that time, the band is considered to be in a
`PREP` state. After that is done, the band transitions to the `OPENING` state, in which head metadata
is being written. Then the band moves to the `OPEN` state and actual user data can be written to the
band. Once the whole available space is filled, tail metadata is written and the band transitions to
`CLOSING` state. When that finishes the band becomes `CLOSED`.
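The band life cycle described above can be summarized as a small transition table. This is an illustrative sketch only: the function name `band_next_state` is hypothetical and is not an SPDK API. It just encodes the order of states given in the text.

```sh
# Hypothetical helper (not an SPDK API): prints the state that follows
# the given band state, in the order described in the text.
band_next_state() {
    case "$1" in
        PREP)    echo OPENING ;;  # erase finished, head metadata is written next
        OPENING) echo OPEN ;;     # head metadata written, user data can be written
        OPEN)    echo CLOSING ;;  # band is full, tail metadata is written next
        CLOSING) echo CLOSED ;;   # tail metadata written
        *)       echo "no transition from '$1'" >&2; return 1 ;;
    esac
}

# Walk a band through its whole life cycle, starting from PREP.
state=PREP
while next=$(band_next_state "$state" 2>/dev/null); do
    echo "$state -> $next"
    state=$next
done
```

The loop stops at `CLOSED` because the sketch defines no transition out of that state.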
@ -103,7 +103,7 @@ servicing read requests from the buffer.
Since a write to the same LBA invalidates its previous physical location, some of the blocks on a
band might contain old data that basically wastes space. As there is no way to overwrite an already
written block, this data will stay there until the whole zone is reset. This might create a
written block, this data will stay there until the whole chunk is reset. This might create a
situation in which all of the bands contain some valid data and no band can be erased, so no writes
can be executed anymore. Therefore a mechanism is needed to move valid data and invalidate whole
bands, so that they can be reused.
@ -123,13 +123,13 @@ long time ago) or due to read disturb (media characteristic, that causes corrupt
blocks during a read operation).
The module responsible for data relocation is called `reloc`. When a band is chosen for defragmentation
or a media management event is received, the appropriate blocks are marked as
or an ANM (asynchronous NAND management) event is received, the appropriate blocks are marked as
required to be moved. The `reloc` module takes a band that has some of such blocks marked, checks
their validity and, if they're still valid, copies them.
Choosing a band for defragmentation depends on several factors: its valid ratio (1) (proportion of
valid blocks to all user blocks), its age (2) (when it was written) and its write count / wear level
index of its zones (3) (how many times the band was written to). The lower the ratio (1), the
index of its chunks (3) (how many times the band was written to). The lower the ratio (1), the
higher its age (2) and the lower its write count (3), the higher the chance the band will be chosen
for defrag.
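To make the three factors concrete, here is a minimal sketch of one possible selection rule. It is an assumption made for illustration, not the library's actual algorithm: the text only says that each factor raises the chance of selection, while this sketch applies them as a strict ordering (lowest valid ratio first, then highest age, then lowest write count). The input format and the name `pick_defrag_band` are hypothetical.

```sh
# Hypothetical input format, one band per line: id,valid_ratio,age,write_count
# Sort by valid ratio ascending, then age descending, then write count
# ascending, and print the id of the first band.
pick_defrag_band() {
    LC_ALL=C sort -t, -k2,2n -k3,3nr -k4,4n | head -n 1 | cut -d, -f1
}

# Example data (made up for this sketch).
printf '%s\n' \
    'band0,0.90,10,5' \
    'band1,0.25,40,7' \
    'band2,0.25,60,9' \
    'band3,0.25,60,3' | pick_defrag_band
```

With this made-up data the sketch prints `band3`: it ties with `band1` and `band2` on valid ratio, is older than `band1`, and has a lower write count than `band2`.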
@ -137,45 +137,18 @@ for defrag.
## Prerequisites {#ftl_prereq}
In order to use the FTL module, a device that exposes a zoned interface is required, e.g. a `zone_block`
bdev or an OCSSD `nvme` bdev.
## FTL bdev creation {#ftl_create}
Similar to other bdevs, FTL bdevs can be created either from JSON config files or via RPC.
Both interfaces take the same arguments, which are described by the `--help` option of the
`bdev_ftl_create` RPC call:
- bdev's name
- base bdev's name (base bdev must implement bdev_zone API)
- UUID of the FTL device (if the FTL is to be restored from the SSD)
## FTL usage with OCSSD nvme bdev {#ftl_ocssd}
This option requires an Open Channel SSD, which can be emulated using QEMU.
The QEMU with the patches providing Open Channel support can be found on the SPDK's QEMU fork
on [spdk-3.0.0](https://github.com/spdk/qemu/tree/spdk-3.0.0) branch.
In order to use the FTL module, an Open Channel SSD is required. The easiest way to obtain one is to
emulate it using QEMU. The QEMU with the patches providing Open Channel support can be found on the
SPDK's QEMU fork on [spdk-3.0.0](https://github.com/spdk/qemu/tree/spdk-3.0.0) branch.
## Configuring QEMU {#ftl_qemu_config}
To emulate an Open Channel device, QEMU expects parameters describing the characteristics and
geometry of the SSD:
- `serial` - serial number,
- `lver` - version of the OCSSD standard (0 - disabled, 1 - "1.2", 2 - "2.0"), libftl only supports
2.0,
- `lba_index` - default LBA format. Possible values can be found in the table below (libftl only supports lba_index >= 3):
- `lnum_ch` - number of groups,
- `lnum_lun` - number of parallel units
- `lnum_pln` - number of planes (logical blocks from all planes constitute a chunk)
- `lpgs_per_blk` - number of pages (smallest programmable unit) per chunk
- `lsecs_per_pg` - number of sectors in a page
- `lblks_per_pln` - number of chunks in a parallel unit
- `laer_thread_sleep` - timeout in ms between asynchronous events requesting the host to relocate
the data based on media feedback
- `lmetadata` - metadata file
- `lba_index` - default LBA format. Possible values (libftl only supports lba_index >= 3):
|lba_index| data| metadata|
|---------|-----|---------|
| 0 | 512B| 0B |
@ -185,6 +158,15 @@ geometry of the SSD:
| 4 |4096B| 64B |
| 5 |4096B| 128B |
| 6 |4096B| 16B |
- `lnum_ch` - number of groups,
- `lnum_lun` - number of parallel units
- `lnum_pln` - number of planes (logical blocks from all planes constitute a chunk)
- `lpgs_per_blk` - number of pages (smallest programmable unit) per chunk
- `lsecs_per_pg` - number of sectors in a page
- `lblks_per_pln` - number of chunks in a parallel unit
- `laer_thread_sleep` - timeout in ms between asynchronous events requesting the host to relocate
the data based on media feedback
- `lmetadata` - metadata file
For a more detailed description of the available options, consult the `hw/block/nvme.c` file in
the QEMU repository.
@ -203,7 +185,7 @@ block being 4096B. Therefore the data file needs to be at least 384G (8 * 512 *
size and can be created with the following command:
```
fallocate -l 384G /path/to/data/file
$ fallocate -l 384G /path/to/data/file
```
## Configuring SPDK {#ftl_spdk_config}
@ -213,7 +195,7 @@ To verify that the drive is emulated correctly, one can check the output of the
device):
```
$ build/examples/identify
$ examples/nvme/identify/identify
=====================================================
NVMe Controller at 0000:00:0a.0 [1d1d:1f1f]
=====================================================
@ -241,49 +223,39 @@ Logical blks per chunk: 24576
```
In order to create an FTL bdev on top of an Open Channel SSD, the following steps are required:
Similarly to other bdevs, the FTL bdevs can be created either based on config files or via RPC. Both
interfaces require the same arguments which are described by the `--help` option of the
`bdev_ftl_create` RPC call, which are:
- bdev's name
- transport type of the device (e.g. PCIe)
- transport address of the device (e.g. `00:0a.0`)
- parallel unit range
- UUID of the FTL device (if the FTL is to be restored from the SSD)
1) Attach OCSSD NVMe controller
2) Create OCSSD bdev on the controller attached in step 1 (user could specify parallel unit range
and create multiple OCSSD bdevs on single OCSSD NVMe controller)
3) Create FTL bdev on top of bdev created in step 2
Example config:
Example:
```
$ scripts/rpc.py bdev_nvme_attach_controller -b nvme0 -a 00:0a.0 -t pcie
[Ftl]
TransportID "trtype:PCIe traddr:00:0a.0" nvme0 "0-3" 00000000-0000-0000-0000-000000000000
TransportID "trtype:PCIe traddr:00:0a.0" nvme1 "4-5" e9825835-b03c-49d7-bc3e-5827cbde8a88
```
$ scripts/rpc.py bdev_ocssd_create -c nvme0 -b nvme0n1
nvme0n1
The above will result in creation of two devices:
- `nvme0` on `00:0a.0` using parallel units 0-3, created from scratch
- `nvme1` on the same device using parallel units 4-5, restored from the SSD using the UUID
provided
$ scripts/rpc.py bdev_ftl_create -b ftl0 -d nvme0n1
The same can be achieved with the following two RPC calls:
```
$ scripts/rpc.py bdev_ftl_create -b nvme0 -l 0-3 -a 00:0a.0
{
"name": "ftl0",
"uuid": "3b469565-1fa5-4bfb-8341-747ec9fca9b9"
}
```
## FTL usage with zone block bdev {#ftl_zone_block}
Zone block bdev is a bdev adapter between regular `bdev` and `bdev_zone`. It emulates a zoned
interface on top of a regular block device.
In order to create FTL on top of a regular bdev:
1) Create regular bdev e.g. `bdev_nvme`, `bdev_null`, `bdev_malloc`
2) Create zone block bdev on top of a regular bdev created in step 1 (user could specify zone capacity
and optimal number of open zones)
3) Create FTL bdev on top of bdev created in step 2
Example:
```
$ scripts/rpc.py bdev_nvme_attach_controller -b nvme0 -a 00:05.0 -t pcie
nvme0n1
$ scripts/rpc.py bdev_zone_block_create -b zone1 -n nvme0n1 -z 4096 -o 32
zone1
$ scripts/rpc.py bdev_ftl_create -b ftl0 -d zone1
{
"name": "ftl0",
"uuid": "3b469565-1fa5-4bfb-8341-747ec9f3a9b9"
"name": "nvme0",
"uuid": "b4624a89-3174-476a-b9e5-5fd27d73e870"
}
$ scripts/rpc.py bdev_ftl_create -b nvme1 -l 4-5 -a 00:0a.0 -u e9825835-b03c-49d7-bc3e-5827cbde8a88
{
"name": "nvme1",
"uuid": "e9825835-b03c-49d7-bc3e-5827cbde8a88"
}
```

@ -196,7 +196,7 @@ Error occurred in Python command: No symbol table is loaded. Use the "file"
command.
~~~
# Macros available
# Macros available:
- spdk_load_macros: load the macros (use --reload in order to reload them)
- spdk_print_bdevs: information about bdevs
@ -205,7 +205,7 @@ command.
- spdk_print_nvmf_subsystems: information about nvmf subsystems
- spdk_print_threads: information about threads
# Adding New Macros
# Adding New Macros:
The list iteration macros are usually built from 3 layers:

@ -1,6 +1,5 @@
# General Information {#general}
- @subpage event
- @subpage scheduler
- @subpage logical_volumes
- @subpage accel_fw
- @subpage vpp_integration

@ -10,20 +10,13 @@ git submodule update --init
# Installing Prerequisites {#getting_started_prerequisites}
The `scripts/pkgdep.sh` script will automatically install the bare minimum
dependencies required to build SPDK.
Use `--help` to see information on installing dependencies for optional components.
The `scripts/pkgdep.sh` script will automatically install the full set of
dependencies required to build and develop SPDK.
~~~{.sh}
sudo scripts/pkgdep.sh
~~~
The `--all` option installs all dependencies needed by SPDK features.
~~~{.sh}
sudo scripts/pkgdep.sh --all
~~~
# Building {#getting_started_building}
Linux:
@ -110,7 +103,7 @@ with no arguments to see the help output. If your system has its IOMMU
enabled you can run the examples as your regular user. If it doesn't, you'll
need to run as a privileged user (root).
A good example to start with is `build/examples/identify`, which prints
A good example to start with is `examples/nvme/identify/identify`, which prints
out information about all of the NVMe devices on your system.
Larger, more fully functional applications are available in the `app`

@ -1,28 +0,0 @@
# IDXD Driver {#idxd}
# Public Interface {#idxd_interface}
- spdk/idxd.h
# Key Functions {#idxd_key_functions}
Function | Description
--------------------------------------- | -----------
spdk_idxd_probe() | @copybrief spdk_idxd_probe()
spdk_idxd_batch_get_max() | @copybrief spdk_idxd_batch_get_max()
spdk_idxd_batch_create() | @copybrief spdk_idxd_batch_create()
spdk_idxd_batch_prep_copy() | @copybrief spdk_idxd_batch_prep_copy()
spdk_idxd_batch_submit() | @copybrief spdk_idxd_batch_submit()
spdk_idxd_submit_copy() | @copybrief spdk_idxd_submit_copy()
spdk_idxd_submit_compare() | @copybrief spdk_idxd_submit_compare()
spdk_idxd_submit_crc32c() | @copybrief spdk_idxd_submit_crc32c()
spdk_idxd_submit_dualcast() | @copybrief spdk_idxd_submit_dualcast()
spdk_idxd_submit_fill() | @copybrief spdk_idxd_submit_fill()
# Pre-defined configurations {#idxd_configs}
The RPC `idxd_scan_accel_engine` is used to both enable IDXD and set its
configuration to one of two pre-defined configs:
Config #0: 4 groups, 1 work queue per group, 1 engine per group.
Config #1: 2 groups, 2 work queues per group, 2 engines per group.

@ -1,41 +1,28 @@
# Storage Performance Development Kit {#mainpage}
# Storage Performance Development Kit {#index}
# Introduction
@copydoc intro
# Concepts
@copydoc concepts
# User Guides
@copydoc user_guides
# Programmer Guides
@copydoc prog_guides
# General Information
@copydoc general
# Miscellaneous
@copydoc misc
# Driver Modules
@copydoc driver_modules
# Tools
@copydoc tools
# CI Tools
@copydoc ci_tools
# Performance Reports
@copydoc performance_reports

@ -4,5 +4,4 @@
- @subpage getting_started
- @subpage vagrant
- @subpage changelog
- @subpage deprecation
- [Source Code (GitHub)](https://github.com/spdk/spdk)

@ -10,7 +10,7 @@ This following section describes how to run iscsi from your cloned package.
This guide starts by assuming that you can already build the standard SPDK distribution on your
platform.
Once built, the binary will be in `build/bin`.
Once built, the binary will be in `app/iscsi_tgt`.
If you want to kill the application with a signal, make sure to use SIGTERM so that the application
releases all of its shared memory resources before exiting; SIGKILL will make the shared memory
@ -23,6 +23,24 @@ document.
![iSCSI structure](iscsi.svg)
## Configuring iSCSI Target via config file {#iscsi_config}
An `iscsi_tgt`-specific configuration file is used to configure the iSCSI target. A fully documented
example configuration file is located at `etc/spdk/iscsi.conf.in`.
The configuration file is used to configure the SPDK iSCSI target. This file defines the following:
TCP ports to use as iSCSI portals; general iSCSI parameters; initiator names and addresses to allow
access to iSCSI target nodes; number and types of storage backends to export over iSCSI LUNs; iSCSI
target node mappings between portal groups, initiator groups, and LUNs.
You should make a copy of the example configuration file, modify it to suit your environment, and
then run the iscsi_tgt application and pass it the configuration file using the -c option. Right now,
the target requires elevated privileges (root) to run.
~~~
app/iscsi_tgt/iscsi_tgt -c /path/to/iscsi.conf
~~~
### Assigning CPU Cores to the iSCSI Target {#iscsi_config_lcore}
SPDK uses the [DPDK Environment Abstraction Layer](http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html)
@ -39,9 +57,26 @@ command line option is used to configure the SPDK iSCSI target:
This is a hexadecimal bit mask of the CPU cores where the iSCSI target will start polling threads.
In this example, CPU cores 24, 25, 26 and 27 would be used.
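A core mask sets one bit per CPU core, with bit N standing for core N. The mask value itself is not shown in this excerpt, so the sketch below simply computes the mask for the cores named above. The helper name `cores_to_mask` is hypothetical and is not part of SPDK.

```sh
# Hypothetical helper: builds a hexadecimal core mask from a list of
# zero-based CPU core numbers by setting one bit per core.
cores_to_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%X\n' "$mask"
}

cores_to_mask 24 25 26 27   # prints 0xF000000
```

Bits 24 through 27 form a single hexadecimal digit `F` followed by six zeros, which is why four adjacent cores starting at a multiple of four produce such a compact mask.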
### Configuring a LUN in the iSCSI Target {#iscsi_lun}
Each LUN in an iSCSI target node is associated with an SPDK block device. See @ref bdev
for details on configuring SPDK block devices. The block device to LUN mappings are specified in the
configuration file as:
~~~~
[TargetNodeX]
LUN0 Malloc0
LUN1 Nvme0n1
~~~~
This exports a malloc'd target. The disk is a RAM disk, a chunk of memory allocated by iscsi in
user space. It will use an offload engine to do the copy job instead of memcpy if the system has enough DMA
channels.
## Configuring iSCSI Target via RPC method {#iscsi_rpc}
The iSCSI target is configured via JSON-RPC calls. See @ref jsonrpc for details.
In addition to the configuration file, the iSCSI target may also be configured via JSON-RPC calls. See
@ref jsonrpc for details.
### Portal groups
@ -183,7 +218,7 @@ echo "1024" > /sys/block/sdc/queue/nr_requests
### Example: Configure simple iSCSI Target with one portal and two LUNs
Assuming we have one iSCSI Target server with portal at 10.0.0.1:3260, two LUNs (Malloc0 and Malloc1),
Assuming we have one iSCSI Target server with portal at 10.0.0.1:3260, two LUNs (Malloc0 and Malloc),
and accepting initiators on 10.0.0.2/32, like on diagram below:
![Sample iSCSI configuration](iscsi_example.svg)
@ -192,33 +227,33 @@ Assuming we have one iSCSI Target server with portal at 10.0.0.1:3200, two LUNs
Start iscsi_tgt application:
```
./build/bin/iscsi_tgt
$ ./app/iscsi_tgt/iscsi_tgt
```
Construct two 64MB Malloc block devices with 512B sector size "Malloc0" and "Malloc1":
```
./scripts/rpc.py bdev_malloc_create -b Malloc0 64 512
./scripts/rpc.py bdev_malloc_create -b Malloc1 64 512
$ ./scripts/rpc.py bdev_malloc_create -b Malloc0 64 512
$ ./scripts/rpc.py bdev_malloc_create -b Malloc1 64 512
```
Create new portal group with id 1, and address 10.0.0.1:3260:
```
./scripts/rpc.py iscsi_create_portal_group 1 10.0.0.1:3260
$ ./scripts/rpc.py iscsi_create_portal_group 1 10.0.0.1:3260
```
Create one initiator group with id 2 to accept any connection from 10.0.0.2/32:
```
./scripts/rpc.py iscsi_create_initiator_group 2 ANY 10.0.0.2/32
$ ./scripts/rpc.py iscsi_create_initiator_group 2 ANY 10.0.0.2/32
```
Finally construct one target using previously created bdevs as LUN0 (Malloc0) and LUN1 (Malloc1)
with a name "disk1" and alias "Data Disk1" using portal group 1 and initiator group 2.
```
./scripts/rpc.py iscsi_create_target_node disk1 "Data Disk1" "Malloc0:0 Malloc1:1" 1:2 64 -d
$ ./scripts/rpc.py iscsi_create_target_node disk1 "Data Disk1" "Malloc0:0 Malloc1:1" 1:2 64 -d
```
#### Configure initiator
@ -233,7 +268,7 @@ $ iscsiadm -m discovery -t sendtargets -p 10.0.0.1
Connect to the target
~~~
iscsiadm -m node --login
$ iscsiadm -m node --login
~~~
At this point the iSCSI target should show up as SCSI disks.
@ -274,54 +309,26 @@ sde
At the iSCSI level, we provide the following support for Hotplug:
1. bdev/nvme:
At the bdev/nvme level, we start one hotplug monitor which calls
spdk_nvme_probe() periodically to get the hotplug events. We provide the
private attach_cb and remove_cb for spdk_nvme_probe(). In attach_cb,
we create the block device based on the attached NVMe device. In
remove_cb, we unregister the block device, which also notifies the
upper level stack (for the iSCSI target, the upper level stack is scsi/lun) to
handle the hot-remove event.
2. scsi/lun:
When the LUN receives the hot-remove notification from the block device layer,
the LUN is marked as removed, and all IOs after this point
return with check condition status. The LUN then starts one poller which
waits for all the commands already submitted to the block device to
complete; after all of them complete, the LUN is deleted.
## Known bugs and limitations {#iscsi_hotplug_bugs}
For write commands, testing hotplug with writes that cause an R2T, for example 1M IOs,
will crash the iSCSI target.
For read commands, testing hotplug with large reads, for example 1M
IOs, will probably crash the iSCSI target.
@sa spdk_nvme_probe
# iSCSI Login Redirection {#iscsi_login_redirection}
The SPDK iSCSI target application supports the iSCSI login redirection feature.
A portal refers to an IP address and TCP port number pair, and a portal group
contains a set of portals. Users for the SPDK iSCSI target application configure
portals through portal groups.
To support the login redirection feature, we use two types of portal groups:
public portal groups and private portal groups.
The SPDK iSCSI target application usually has a discovery portal. An initiator
connects to the discovery portal in a discovery session to get a list of targets,
as well as the list of portals on which these targets may be accessed.
Public portal groups have their portals returned by a discovery session. Private
portal groups do not have their portals returned by a discovery session. A public
portal group may optionally have a redirect portal for non-discovery logins for
each associated target. This redirect portal must be from a private portal group.
Initiators configure portals in public portal groups as target portals. When an
initiator logs in to a target through a portal in an associated public portal group,
the target sends a temporary redirection response with a redirect portal. Then the
initiator logs in to the target again through the redirect portal.
Users set a portal group to public or private at creation using the
`iscsi_create_portal_group` RPC, associate portal groups with a target using the
`iscsi_create_target_node` RPC or the `iscsi_target_node_add_pg_ig_maps` RPC,
specify an up-to-date redirect portal in a public portal group for a target using
the `iscsi_target_node_set_redirect` RPC, and terminate the corresponding connections
by an asynchronous logout request using the `iscsi_target_node_request_logout` RPC.
Typically users will use the login redirection feature in a scale-out iSCSI target
system, which runs multiple SPDK iSCSI target applications.

File diff suppressed because it is too large

@ -1,213 +0,0 @@
# SPDK Libraries {#libraries}
The SPDK repository is, first and foremost, a collection of high-performance
storage-centric software libraries. With this in mind, much care has been taken
to ensure that these libraries have consistent and robust naming and versioning
conventions. The libraries themselves are also divided across two directories
(`lib` and `module`) inside of the SPDK repository in a deliberate way to prevent
mixing of SPDK event framework dependent code and lower level libraries. This document
is aimed at explaining the structure, naming conventions, versioning scheme, and use cases
of the libraries contained in these two directories.
# Directory Structure {#structure}
The SPDK libraries are divided into two directories. The `lib` directory contains the base libraries that
compose SPDK. Some of these base libraries define plug-in systems. Instances of those plug-ins are called
modules and are located in the `module` directory. For example, the `spdk_sock` library is contained in the
`lib` directory while the implementations of socket abstractions, `sock_posix` and `sock_uring`
are contained in the `module` directory.
## lib {#lib}
The libraries in the `lib` directory can be readily divided into four categories:
- Utility Libraries: These libraries contain basic, commonly used functions that make more complex
libraries easier to implement. For example, `spdk_log` contains macro definitions that provide a
consistent logging paradigm and `spdk_json` is a general purpose JSON parsing library.
- Protocol Libraries: These libraries contain the building blocks for a specific service. For example,
`spdk_nvmf` and `spdk_vhost` each define the storage protocols after which they are named.
- Storage Service Libraries: These libraries provide a specific abstraction that can be mapped to somewhere
between the physical drive and the filesystem level of your typical storage stack. For example `spdk_bdev`
provides a general block device abstraction layer, `spdk_lvol` provides a logical volume abstraction,
`spdk_blobfs` provides a filesystem abstraction, and `spdk_ftl` provides a flash translation layer
abstraction.
- System Libraries: These libraries provide system level services such as a JSON based RPC service
(see `spdk_jsonrpc`) and thread abstractions (see `spdk_thread`). The most notable library in this category
is the `spdk_env_dpdk` library which provides a shim for the underlying Data Plane Development Kit (DPDK)
environment and provides services like memory management.
The one library in the `lib` directory that doesn't fit into the above classification is the `spdk_event` library.
This library defines a framework used by the applications contained in the `app` and `example` directories. Much
care has been taken to keep the SPDK libraries independent from this framework. The libraries in `lib` are engineered
to allow plugging directly into independent application frameworks such as Seastar or libuv with minimal effort.
Currently there are two exceptions in the `lib` directory which still rely on `spdk_event`: `spdk_vhost` and `spdk_iscsi`.
There are efforts underway to remove all remaining dependencies these libraries have on the `spdk_event` library.
Much like the `spdk_event` library, the `spdk_env_dpdk` library has been architected in such a way that it
can be readily replaced by an alternate environment shim. More information on replacing the `spdk_env_dpdk`
module and the underlying `dpdk` environment can be found in the [environment](#env_replacement) section.
## module {#module}
The component libraries in the `module` directory represent specific implementations of the base libraries in
the `lib` directory. As with the `lib` directory, much care has been taken to avoid dependencies on the
`spdk_event` framework except for those libraries which directly implement the `spdk_event` module plugin system.
There are seven sub-directories in the `module` directory which each hold a different class of libraries. These
sub-directories can be divided into two types.
- plug-in libraries: These libraries are explicitly tied to one of the libraries in the `lib` directory and
are registered with that library at runtime by way of a specific constructor function. The parent library in
the `lib` directory then manages the module directly. These types of libraries each implement a function table
defined by their parent library. The following table shows these directories and their corresponding parent
libraries:
<center>
| module directory | parent library | dependent on event library |
|------------------|----------------|----------------------------|
| module/accel | spdk_accel | no |
| module/bdev | spdk_bdev | no |
| module/event | spdk_event | yes |
| module/sock | spdk_sock | no |
</center>
- Free libraries: These libraries are highly dependent upon a library in the `lib` directory but are not
explicitly registered to that library via a constructor. The libraries in the `blob`, `blobfs`, and `env_dpdk`
directories fall into this category. None of the libraries in this category depend explicitly on the
`spdk_event` library.
# Library Conventions {#conventions}
The SPDK libraries follow strict conventions for naming functions, logging, versioning, and header files.
## Headers {#headers}
All public SPDK header files exist in the `include` directory of the SPDK repository. These headers
are divided into two sub-directories.
`include/spdk` contains headers intended to be used by consumers of the SPDK libraries. All of the
functions, variables, and types in these headers are intended for public consumption. Multiple headers
in this directory may depend upon the same underlying library and work together to expose different facets
of the library. The `spdk_bdev` library, for example, is exposed in three different headers. `bdev_module.h`
defines the interfaces a bdev module library would need to implement, `bdev.h` contains general block device
functions that would be used by an application consuming block devices exposed by SPDK, and `bdev_zone.h`
exposes zoned bdev specific functions. Many of the other libraries exhibit a similar behavior of splitting
headers between consumers of the library and those wishing to register a module with that library.
`include/spdk_internal`, as its name suggests, contains header files intended to be consumed only by other
libraries inside of the SPDK repository. These headers are typically used for sharing lower level functions
between two libraries that both require similar functions. For example, `spdk_internal/nvme_tcp.h` contains
low level TCP functions used by both the `spdk_nvme` and `spdk_nvmf` libraries. These headers are *NOT*
intended for general consumption.
Other header files contained directly in the `lib` and `module` directories are intended to be consumed *only*
by source files of their corresponding library. Any symbols intended to be used across libraries need to be
included in a header in the `include/spdk_internal` directory.
## Naming Conventions {#naming}
All public types and functions in SPDK libraries begin with the prefix `spdk_`. They are also typically
further namespaced using the spdk library name. The rest of the function or type name describes its purpose.
There are no internal library functions that begin with the `spdk_` prefix. This naming convention is
enforced by the SPDK continuous integration testing. Functions not intended for use outside of their home
library should be namespaced with the name of the library only.
## Map Files {#map}
SPDK libraries can be built as both static and shared object files. To facilitate building libraries as shared
objects, each one has a corresponding map file (e.g. `spdk_nvmf` relies on `spdk_nvmf.map`). SPDK libraries
not exporting any symbols rely on a blank map file located at `mk/spdk_blank.map`.
# SPDK Shared Objects {#shared_objects}
## Shared Object Versioning {#versioning}
SPDK shared objects follow a semantic versioning pattern with a major and minor version. Any changes which
break backwards compatibility (symbol removal or change) will cause a shared object major increment and
backwards compatible changes will cause a minor version increment; i.e. an application that relies on
`libspdk_nvmf.so.3.0` will be compatible with `libspdk_nvmf.so.3.1` but not with `libspdk_nvmf.so.4.0`.
Shared object versions are incremented only once between each release cycle. This means that at most, the
major version of each SPDK shared library will increment only once between each SPDK release.
There are currently no guarantees in SPDK of ABI compatibility between two major SPDK releases.
The point releases of an LTS release will be ABI compatible with the corresponding LTS major release.
Shared objects are versioned independently of one another. This means that `libspdk_nvme.so.3.0` and
`libspdk_bdev.so.3.0` do not necessarily belong to the same release. This also means that shared objects
with the same suffix are not necessarily compatible with each other. It is important to source all of your
SPDK libraries from the same repository and version to ensure inter-library compatibility.
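The major/minor rule above can be sketched as a small shell check. This is only an illustration, not part of SPDK: `so_compatible` is a hypothetical helper, and it assumes that an application built against a newer minor version should not be run against an older one, which the text above does not state explicitly.

```sh
# Hypothetical helper (not an SPDK tool): succeed if an application built
# against shared-object version $1 (major.minor) can use version $2.
# Assumption: same major version, and the available minor is not older.
so_compatible() {
    built_major=${1%%.*}
    built_minor=${1#*.}
    have_major=${2%%.*}
    have_minor=${2#*.}
    [ "$built_major" -eq "$have_major" ] && [ "$have_minor" -ge "$built_minor" ]
}

so_compatible 3.0 3.1 && echo "built against 3.0, found 3.1: compatible"
so_compatible 3.0 4.0 || echo "built against 3.0, found 4.0: incompatible"
```

Because shared objects are versioned independently, a check like this only makes sense per library; it says nothing about whether two different SPDK libraries belong to the same release.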
## Linking to Shared Objects {#so_linking}
Shared objects in SPDK are created on a per-library basis. There is a top level `libspdk.so` object
which is a linker script. It simply contains references to all of the other SPDK shared objects.
There are essentially two ways of linking to SPDK libraries.
1. An application can link to the top level shared object library as follows:
~~~{.sh}
gcc -o my_app ./my_app.c -lspdk -lspdk_env_dpdk -ldpdk
~~~
2. An application can link to only a subset of libraries by linking directly to the ones it relies on:
~~~{.sh}
gcc -o my_app ./my_app.c -lpassthru_external -lspdk_event_bdev -lspdk_bdev -lspdk_bdev_malloc \
	-lspdk_log -lspdk_thread -lspdk_util -lspdk_event -lspdk_env_dpdk -ldpdk
~~~
In the second instance, please note that applications need only link to the libraries upon which they
directly depend. All SPDK libraries have their dependencies specified at object compile time. This means
that when linking to `spdk_net`, one does not also have to specify `spdk_log`, `spdk_util`, `spdk_json`,
`spdk_jsonrpc`, and `spdk_rpc`. However, this dependency inclusion does not extend to the application
itself; i.e. if an application directly uses symbols from both `spdk_bdev` and `spdk_log`, both libraries
will need to be supplied to the linker when linking the application even though `spdk_log` is a dependency
of `spdk_bdev`.
Please also note that when linking to SPDK libraries, both the spdk_env shim library and the env library
itself need to be supplied to the linker. In the examples above, these are `spdk_env_dpdk` and `dpdk`
respectively. This was intentional and allows one to easily swap out both the environment and the
environment shim.
## Replacing the env abstraction {#env_replacement}
SPDK depends on an environment abstraction that provides crucial pinned memory management and PCIe
bus management operations. The interface for this environment abstraction is defined in the
`include/spdk/env.h` header file. The default implementation of this environment is located in `spdk_env_dpdk`.
This abstraction in turn relies upon the DPDK libraries. This two part implementation was deliberate
and allows for easily swapping out the dpdk version upon which the spdk libraries rely without making
modifications to the spdk source directly.
Any environment can replace the `spdk_env_dpdk` environment by implementing the `include/spdk/env.h` header
file. The environment can either be implemented wholesale in a single library or as a two-part
shim/implementation library system.
~~~{.sh}
# single library
gcc -o my_app ./my_app.c -lspdk -lcustom_env_implementation
# two libraries
gcc -o my_app ./my_app.c -lspdk -lcustom_env_shim -lcustom_env_implementation
~~~
# SPDK Static Objects {#static_objects}
SPDK static objects are compiled by default even when no parameters are supplied to the build system.
Unlike SPDK shared objects, the filename does not contain any versioning semantics. Linking against
static objects is similar to shared objects but will always require the use of `-Wl,--whole-archive`
as an argument. This is due to the use of constructor functions in SPDK such as those to register
NVMe transports.
Due to the lack of versioning semantics, it is not recommended to install static libraries system wide.
Instead, the path to these static libraries should be added as an argument at compile time using
`-L/path/to/static/libs`. The use of static objects instead of shared objects can also be forced
through `-Wl,-Bstatic`, otherwise some compilers might prefer to use the shared objects if both
are available.
~~~{.sh}
gcc -o my_app ./my_app.c -L/path/to/static/libs -Wl,--whole-archive -Wl,-Bstatic -lpassthru_external \
	-lspdk_event_bdev -lspdk_bdev -lspdk_bdev_malloc -lspdk_log -lspdk_thread -lspdk_util -lspdk_event \
	-lspdk_env_dpdk -Wl,--no-whole-archive -Wl,-Bdynamic -pthread -ldpdk
~~~


@ -1,4 +1,3 @@
# Miscellaneous {#misc}
- @subpage peer_2_peer
- @subpage containers


@ -1,5 +1,4 @@
# Notify library {#notify}
The notify library implements an event bus, allowing users to register, generate,
and listen for events. For example, the bdev library may register a new event type
for bdev creation. Any time a bdev is created, it "sends" the event. Consumers of

87
doc/nvme-cli.md Normal file

@ -0,0 +1,87 @@
# nvme-cli {#nvme-cli}
# nvme-cli with SPDK Getting Started Guide
nvme-cli can now use either the kernel driver or the SPDK user mode driver for most of its available commands and
Intel specific commands.
1. Clone the nvme-cli repository from the SPDK GitHub fork. Make sure you check out the spdk-1.6 branch.
~~~{.sh}
git clone -b spdk-1.6 https://github.com/spdk/nvme-cli.git
~~~
2. Clone the SPDK repository from https://github.com/spdk/spdk under the nvme-cli folder.
3. Refer to the "README.md" under SPDK folder to properly build SPDK.
4. Refer to the "README.md" under nvme-cli folder to properly build nvme-cli.
5. Execute "<spdk_folder>/scripts/setup.sh" with the "root" account.
6. Update the "spdk.conf" file under nvme-cli folder to properly configure the SPDK. The settings are as follows:
~~~{.sh}
spdk=1
Indicates whether or not to use spdk. Can be 0 (off) or 1 (on).
Defaults to 1 which assumes that you have run "<spdk_folder>/scripts/setup.sh", unbinding your drives from the kernel.
core_mask=0x1
A bitmask representing which core(s) to use for nvme-cli operations.
Defaults to core 0.
mem_size=512
The amount of reserved hugepage memory to use for nvme-cli (in MB).
Defaults to 512MB.
shm_id=0
Indicates the shared memory ID for the spdk application with which your NVMe drives are associated,
and should be adjusted accordingly.
Defaults to 0.
~~~
7. Run the "./nvme list" command to get the domain:bus:device.function for each found NVMe SSD.
8. Run the other nvme commands with domain:bus:device.function instead of "/dev/nvmeX" for the specified device.
~~~{.sh}
Example: ./nvme smart-log 0000:01:00.0
~~~
9. Run the "./nvme intel" commands for Intel specific commands against Intel NVMe SSD.
~~~{.sh}
Example: ./nvme intel internal-log 0000:08:00.0
~~~
10. Execute "<spdk_folder>/scripts/setup.sh reset" with the "root" account and update "spdk=0" in spdk.conf to
use the kernel driver if wanted.
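The settings from step 6 can be sketched as a plain key=value file. This is a minimal sketch that assumes "spdk.conf" contains one `key=value` pair per line, as the listing in step 6 suggests; the scratch file and the `conf_get` helper are hypothetical and only illustrate reading a value back.

```sh
# Write the four settings described in step 6 to a scratch file.
# Assumption: spdk.conf holds one key=value pair per line.
conf=$(mktemp)
cat > "$conf" <<'EOF'
spdk=1
core_mask=0x1
mem_size=512
shm_id=0
EOF

# Hypothetical helper: print the value assigned to key $1 in file $2.
conf_get() {
    sed -n "s/^$1=//p" "$2"
}

conf_get mem_size "$conf"   # prints 512
```

The values shown are the defaults listed in step 6; adjust `shm_id` and `mem_size` as described in the use scenarios that follow.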
## Use scenarios
### Run as the only SPDK application on the system
1. Set "spdk=1" in spdk.conf. If the system has fewer cores or less memory, update the spdk.conf accordingly.
### Run together with other running SPDK applications on shared NVMe SSDs
1. For the other running SPDK application, start with the parameter like "-i 1" to have the same "shm_id".
2. Use the default spdk.conf setting where "shm_id=1" to start the nvme-cli.
3. If other SPDK applications run with different shm_id parameter, update the "spdk.conf" accordingly.
### Run with other running SPDK applications on non-shared NVMe SSDs
1. Properly configure the other running SPDK applications.
~~~{.sh}
a. Only access the NVMe SSDs it wants.
b. Allocate a fixed amount of memory instead of all available memory.
~~~
2. Properly configure the spdk.conf setting for nvme-cli.
~~~{.sh}
a. Not access the NVMe SSDs from other SPDK applications.
b. Change the mem_size to a proper size.
~~~
## Note
1. To run the newly built nvme-cli, either run it explicitly as "./nvme" or add it to the $PATH to avoid
invoking another already installed version.
2. To run the newly built nvme-cli with SPDK support in arbitrary directory, copy "spdk.conf" to that
directory from the nvme cli folder and update the configuration as suggested.


@ -117,38 +117,6 @@ spdk_nvme_qpair_process_completions().
@sa spdk_nvme_ns_cmd_read, spdk_nvme_ns_cmd_write, spdk_nvme_ns_cmd_dataset_management,
spdk_nvme_ns_cmd_flush, spdk_nvme_qpair_process_completions
### Fused operations {#nvme_fuses}
To "fuse" two commands, the first command should have the SPDK_NVME_IO_FLAGS_FUSE_FIRST
io flag set, and the next one should have the SPDK_NVME_IO_FLAGS_FUSE_SECOND.
In addition, the following rules must be met to execute two commands as an atomic unit:
- The commands shall be inserted next to each other in the same submission queue.
- The LBA range should be the same for the two commands.
E.g. to send a fused compare and write operation, the user must call spdk_nvme_ns_cmd_compare
followed by spdk_nvme_ns_cmd_write and make sure no other operations are submitted
in between on the same queue, as in the example below:
~~~
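/* First half of the fused pair: compare, flagged as FUSE_FIRST. */
rc = spdk_nvme_ns_cmd_compare(ns, qpair, cmp_buf, 0, 1, nvme_fused_first_cpl_cb,
			      NULL, SPDK_NVME_IO_FLAGS_FUSE_FIRST);
if (rc != 0) {
	...
}
/* Second half: write to the same LBA range, submitted immediately after. */
rc = spdk_nvme_ns_cmd_write(ns, qpair, write_buf, 0, 1, nvme_fused_second_cpl_cb,
			    NULL, SPDK_NVME_IO_FLAGS_FUSE_SECOND);
if (rc != 0) {
	...
}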
~~~
The NVMe specification currently defines compare-and-write as a fused operation.
Support for compare-and-write is reported by the controller flag
SPDK_NVME_CTRLR_COMPARE_AND_WRITE_SUPPORTED.
### Scaling Performance {#nvme_scaling}
NVMe queue pairs (struct spdk_nvme_qpair) provide parallel submission paths for
@ -249,10 +217,9 @@ DPDK EAL allows different types of processes to be spawned, each with different
on the hugepage memory used by the applications.
There are two types of processes:
1. a primary process which initializes the shared memory and has full privileges and
2. a secondary process which can attach to the primary process by mapping its shared memory
regions and perform NVMe operations including creating queue pairs.
regions and perform NVMe operations including creating queue pairs.
This feature is enabled by default and is controlled by selecting a value for the shared
memory group ID. This ID is a positive integer and two applications with the same shared
@ -273,30 +240,31 @@ Example: identical shm_id and non-overlapping core masks
1. Two processes sharing memory may not share any cores in their core mask.
2. If a primary process exits while secondary processes are still running, those processes
will continue to run. However, a new primary process cannot be created.
will continue to run. However, a new primary process cannot be created.
3. Applications are responsible for coordinating access to logical blocks.
4. If a process exits unexpectedly, the allocated memory will be released when the last
process exits.
process exits.
@sa spdk_nvme_probe, spdk_nvme_ctrlr_process_admin_completions
# NVMe Hotplug {#nvme_hotplug}
At the NVMe driver level, we provide the following support for Hotplug:
1. Hotplug events detection:
The user of the NVMe library can call spdk_nvme_probe() periodically to detect
hotplug events. The probe_cb, followed by the attach_cb, will be called for each
new device detected. The user may optionally also provide a remove_cb that will be
called if a previously attached NVMe device is no longer present on the system.
All subsequent I/O to the removed device will return an error.
The user of the NVMe library can call spdk_nvme_probe() periodically to detect
hotplug events. The probe_cb, followed by the attach_cb, will be called for each
new device detected. The user may optionally also provide a remove_cb that will be
called if a previously attached NVMe device is no longer present on the system.
All subsequent I/O to the removed device will return an error.
2. Hot remove NVMe with IO loads:
When a device is hot removed while I/O is occurring, all access to the PCI BAR will
result in a SIGBUS error. The NVMe driver automatically handles this case by installing
a SIGBUS handler and remapping the PCI BAR to a new, placeholder memory location.
This means I/O in flight during a hot remove will complete with an appropriate error
code and will not crash the application.
When a device is hot removed while I/O is occurring, all access to the PCI BAR will
result in a SIGBUS error. The NVMe driver automatically handles this case by installing
a SIGBUS handler and remapping the PCI BAR to a new, placeholder memory location.
This means I/O in flight during a hot remove will complete with an appropriate error
code and will not crash the application.
@sa spdk_nvme_probe
@ -304,107 +272,27 @@ At the NVMe driver level, we provide the following support for Hotplug:
This feature is considered experimental.
## Design
![NVMe character devices processing diagram](nvme_cuse.svg)
For each controller as well as namespace, character devices are created in the
locations:
~~~{.sh}
/dev/spdk/nvmeX
/dev/spdk/nvmeXnY
/dev/'dev_path'
/dev/'dev_path'nY
...
~~~
Where X is the unique SPDK NVMe controller index and Y is the namespace ID.
Requests from CUSE are handled by pthreads when controller and namespaces are created.
Those pass the I/O or admin commands via a ring to a thread that processes them using
nvme_io_msg_process().
During the handling of CUSE requests, operations may be forwarded to an internal NVMe
queue pair. The user must poll this internal queue pair periodically by polling
the admin qpair for the associated NVMe controller.
Ioctls that request information obtained when attaching the NVMe controller receive an
immediate response, without passing them through the ring.
immediate response, without passing them to the internal queue pair.
This interface reserves one additional qpair for sending down the I/O for each controller.
This interface reserves one qpair for sending down the I/O for each controller.
## Usage
## Enabling cuse support for NVMe
### Enabling cuse support for NVMe
Cuse support is disabled by default. To enable support for NVMe-CUSE devices first
install required dependencies
~~~{.sh}
sudo scripts/pkgdep.sh --fuse
~~~
Then compile SPDK with "./configure --with-nvme-cuse".
### Creating NVMe-CUSE device
First make sure to prepare the environment (see @ref getting_started).
This includes loading CUSE kernel module.
Any NVMe controller attached to a running SPDK application can be
exposed via NVMe-CUSE interface. When closing SPDK application,
the NVMe-CUSE devices are unregistered.
~~~{.sh}
$ sudo scripts/setup.sh
$ sudo modprobe cuse
$ sudo build/bin/spdk_tgt
# Continue in another session
$ sudo scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t PCIe -a 0000:82:00.0
Nvme0n1
$ sudo scripts/rpc.py bdev_nvme_get_controllers
[
{
"name": "Nvme0",
"trid": {
"trtype": "PCIe",
"traddr": "0000:82:00.0"
}
}
]
$ sudo scripts/rpc.py bdev_nvme_cuse_register -n Nvme0
$ ls /dev/spdk/
nvme0 nvme0n1
~~~
### Example of using nvme-cli
Most nvme-cli commands can point to specific controller or namespace by providing a path to it.
This can be leveraged to issue commands to the SPDK NVMe-CUSE devices.
~~~{.sh}
sudo nvme id-ctrl /dev/spdk/nvme0
sudo nvme smart-log /dev/spdk/nvme0
sudo nvme id-ns /dev/spdk/nvme0n1
~~~
Note: `nvme list` command does not display SPDK NVMe-CUSE devices,
see nvme-cli [PR #773](https://github.com/linux-nvme/nvme-cli/pull/773).
### Examples of using smartctl
The smartctl tool recognizes the device type based on the device path. If none of the expected
patterns match, the SCSI translation layer is used to identify the device.
To use smartctl, the '-d nvme' parameter must be passed in addition to the full path to
the NVMe device.
~~~{.sh}
smartctl -d nvme -i /dev/spdk/nvme0
smartctl -d nvme -H /dev/spdk/nvme1
...
~~~
## Limitations
NVMe namespaces are created as character devices and their use may be limited for
tools expecting block devices.
Sysfs is not updated by SPDK.
SPDK NVMe CUSE creates nodes in "/dev/spdk/" directory to explicitly differentiate
from other devices. Tools that only search in the "/dev" directory might not work
with SPDK NVMe CUSE.
SCSI to NVMe Translation Layer is not implemented. Tools that are using this layer to
identify, manage or operate device might not work properly or their use may be limited.
Cuse support is disabled by default. To enable support for NVMe devices SPDK
must be compiled with "./configure --with-nvme-cuse".


@ -20,8 +20,8 @@ registers involved that are called doorbells.
An I/O is submitted to an NVMe device by constructing a 64 byte command, placing
it into the submission queue at the current location of the submission queue
tail index, and then writing the new index of the submission queue tail to the
submission queue tail doorbell register. It's actually valid to copy a whole set
head index, and then writing the new index of the submission queue head to the
submission queue head doorbell register. It's actually valid to copy a whole set
of commands into open slots in the ring and then write the doorbell just one
time to submit the whole batch.


@ -29,11 +29,16 @@ available [here](https://downloads.openfabrics.org/OFED/).
### Prerequisites {#nvmf_prereqs}
To build nvmf_tgt with the RDMA transport, there are some additional dependencies,
which can be installed using the pkgdep.sh script.
To build nvmf_tgt with the RDMA transport, there are some additional dependencies.
Fedora:
~~~{.sh}
sudo scripts/pkgdep.sh --rdma
dnf install libibverbs-devel librdmacm-devel
~~~
Ubuntu:
~~~{.sh}
apt-get install libibverbs-dev librdmacm-dev
~~~
Then build SPDK with RDMA enabled:
@ -43,7 +48,7 @@ Then build SPDK with RDMA enabled:
make
~~~
Once built, the binary will be in `build/bin`.
Once built, the binary will be in `app/nvmf_tgt`.
### Prerequisites for InfiniBand/RDMA Verbs {#nvmf_prereqs_verbs}
@ -106,26 +111,30 @@ using 1GB hugepages or by pre-reserving memory at application startup with `--me
option. All pre-reserved memory will be registered as a single region, but won't be returned to the
system until the SPDK application is terminated.
Another known issue occurs when using the E810 NICs in RoCE mode. Specifically, the NVMe-oF target
sometimes cannot destroy a qpair, because its posted work requests don't get flushed. This can prevent
the NVMe-oF target application from terminating cleanly.
## TCP transport support {#nvmf_tcp_transport}
The transport is built into the nvmf_tgt by default, and it does not need any special libraries.
## Configuring the SPDK NVMe over Fabrics Target {#nvmf_config}
An NVMe over Fabrics target can be configured using JSON RPCs.
The basic RPCs needed to configure the NVMe-oF subsystem are detailed below. More information about
working with NVMe over Fabrics specific RPCs can be found on the @ref jsonrpc_components_nvmf_tgt RPC page.
Using .ini style configuration files for configuration of the NVMe-oF target is deprecated and should
be replaced with JSON based RPCs. .ini style configuration files can be converted to json format by way
of the new script `scripts/config_converter.py`.
## FC transport support {#nvmf_fc_transport}
To build nvmf_tgt with the FC transport, there is an additional FC LLD (Low Level Driver) code dependency.
Please contact your FC vendor for instructions to obtain FC driver module.
### Broadcom FC LLD code
The FC LLD driver for Broadcom FC NVMe capable adapters can be obtained from
https://github.com/ecdufcdrvr/bcmufctdrvr.
### Fetch FC LLD module and then build SPDK with FC enabled
### Fetch FC LLD module and then build SPDK with FC enabled:
After the SPDK repo is cloned and its submodules are initialized, the FC LLD library is built so that it can be linked with
the FC transport.
@ -141,12 +150,6 @@ cd ../spdk
make
~~~
## Configuring the SPDK NVMe over Fabrics Target {#nvmf_config}
An NVMe over Fabrics target can be configured using JSON RPCs.
The basic RPCs needed to configure the NVMe-oF subsystem are detailed below. More information about
working with NVMe over Fabrics specific RPCs can be found on the @ref jsonrpc_components_nvmf_tgt RPC page.
### Using RPCs {#nvmf_config_rpc}
Start the nvmf_tgt application with elevated privileges. Once the target is started,
@ -157,9 +160,9 @@ and an in capsule data size of 0 bytes. The TCP transport is configured with an
16384 bytes, 8 max qpairs per controller, and an in capsule data size of 8192 bytes.
~~~{.sh}
build/bin/nvmf_tgt
scripts/rpc.py nvmf_create_transport -t RDMA -u 8192 -m 4 -c 0
scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -m 8 -c 8192
app/nvmf_tgt/nvmf_tgt
scripts/rpc.py nvmf_create_transport -t RDMA -u 8192 -p 4 -c 0
scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -p 8 -c 8192
~~~
Below is an example of creating a malloc bdev and assigning it to a subsystem. Adjust the bdevs,
@ -198,7 +201,6 @@ NVMe Domain NQN = "nqn.", year, '-', month, '.', reverse domain, ':', utf-8 stri
~~~
Please note that the following types from the definition above are defined elsewhere:
1. utf-8 string: Defined in [rfc 3629](https://tools.ietf.org/html/rfc3629).
2. reverse domain: Equivalent to domain name as defined in [rfc 1034](https://tools.ietf.org/html/rfc1034).
@ -231,7 +233,7 @@ The `-m` core mask option specifies a bit mask of the CPU cores that
SPDK is allowed to execute work items on.
For example, to allow SPDK to use cores 24, 25, 26 and 27:
~~~{.sh}
build/bin/nvmf_tgt -m 0xF000000
app/nvmf_tgt/nvmf_tgt -m 0xF000000
~~~
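The mask in the example above has one bit set per allowed core. As a minimal sketch, the value can be derived from a list of core numbers with shell arithmetic; `core_mask` is a hypothetical helper written for this illustration and is not an SPDK script.

```sh
# Hypothetical helper: build a hexadecimal core mask from a list of
# core numbers by setting bit N for each core N.
core_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%X\n' "$mask"
}

core_mask 24 25 26 27   # prints 0xF000000
```

The printed value can then be passed to the `-m` option as shown above.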
## Configuring the Linux NVMe over Fabrics Host {#nvmf_host}


@ -68,7 +68,7 @@ system. This is used for access control.
A user of the NVMe-oF target library begins by creating a target using
spdk_nvmf_tgt_create(), setting up a set of addresses on which to accept
connections by calling spdk_nvmf_tgt_listen_ext(), then creating a subsystem
connections by calling spdk_nvmf_tgt_listen(), then creating a subsystem
using spdk_nvmf_subsystem_create().
Subsystems begin in an inactive state and must be activated by calling
@ -78,13 +78,12 @@ calling spdk_nvmf_subsystem_pause() and resumed by calling
spdk_nvmf_subsystem_resume().
Namespaces may be added to the subsystem by calling
spdk_nvmf_subsystem_add_ns_ext() when the subsystem is inactive or paused.
spdk_nvmf_subsystem_add_ns() when the subsystem is inactive or paused.
Namespaces are bdevs. See @ref bdev for more information about the SPDK bdev
layer. A bdev may be obtained by calling spdk_bdev_get_by_name().
Once a subsystem exists and the target is listening on an address, new
connections will be automatically assigned to poll groups as they are
detected.
connections may be accepted by polling spdk_nvmf_tgt_accept().
All I/O to a subsystem is driven by a poll group, which polls for incoming
network I/O. Poll groups may be created by calling
@ -92,6 +91,14 @@ spdk_nvmf_poll_group_create(). They automatically request to begin polling
upon creation on the thread from which they were created. Most importantly, *a
poll group may only be accessed from the thread on which it was created.*
When spdk_nvmf_tgt_accept() detects a new connection, it will construct a new
struct spdk_nvmf_qpair object and call the user provided `new_qpair_fn`
callback for each new qpair. In response to this callback, the user must
assign the qpair to a poll group by calling spdk_nvmf_poll_group_add().
Remember, a poll group may only be accessed from the thread on which it was created,
so making a call to spdk_nvmf_poll_group_add() may require passing a message
to the appropriate thread.
## Access Control
Access control is performed at the subsystem level by adding allowed listen
@ -104,7 +111,9 @@ and hosts may only be added to inactive or paused subsystems.
A discovery subsystem, as defined by the NVMe-oF specification, is
automatically created for each NVMe-oF target constructed. Connections to the
discovery subsystem are handled in the same way as any other subsystem.
discovery subsystem are handled in the same way as any other subsystem - new
qpairs are created in response to spdk_nvmf_tgt_accept() and they must be
assigned to a poll group.
## Transports
@ -123,7 +132,15 @@ fabrics simultaneously.
The SPDK NVMe-oF target library does not strictly dictate threading model, but
poll groups do all of their polling and I/O processing on the thread they are
created on. Given that, it almost always makes sense to create one poll group
per thread used in the application.
per thread used in the application. New qpairs created in response to
spdk_nvmf_tgt_accept() can be handed out round-robin to the poll groups. This
is how the SPDK NVMe-oF target application currently functions.
More advanced algorithms for distributing qpairs to poll groups are possible.
For instance, a NUMA-aware algorithm would be an improvement over basic
round-robin, where NUMA-aware means assigning qpairs to poll groups running on
CPU cores that are on the same NUMA node as the network adapter and storage
device. Load-aware algorithms also may have benefits.
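The round-robin hand-out described above amounts to a counter that wraps at the number of poll groups. The following is only a sketch of that selection logic in shell, not the target application's code: `num_groups`, `next_group`, and `assign_qpair` are hypothetical names, and in a real application the chosen index would identify the poll group passed to spdk_nvmf_poll_group_add() on that group's thread.

```sh
# Hypothetical sketch of round-robin selection (not SPDK code).
num_groups=4      # assumed: one poll group per application thread
next_group=0      # index of the poll group that receives the next qpair

# Pick the poll group for one new qpair and advance the counter.
assign_qpair() {
    assigned_group=$next_group
    next_group=$(( (next_group + 1) % num_groups ))
}

for qpair in 0 1 2 3 4 5; do
    assign_qpair
    echo "qpair $qpair -> poll group $assigned_group"
done
```

A NUMA-aware or load-aware policy would replace only the body of `assign_qpair`; the rest of the flow stays the same.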
## Scaling Across CPU Cores


@ -16,14 +16,14 @@ the instrumentation of all the tracepoints group in an SPDK target application,
target with -e parameter set to 0xFFFF:
~~~
build/bin/nvmf_tgt -e 0xFFFF
app/nvmf_tgt/nvmf_tgt -e 0xFFFF
~~~
To enable the instrumentation of just the NVMe-oF RDMA tracepoints in an SPDK target
application, start the target with the -e parameter set to 0x10:
~~~
build/bin/nvmf_tgt -e 0x10
app/nvmf_tgt/nvmf_tgt -e 0x10
~~~
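The value passed to -e is a bitmask with one bit per tracepoint group. A minimal sketch of composing such a mask in shell, assuming group N maps to bit N (which is what 0x10 for bit 4 implies); `tpoint_mask` is a hypothetical helper and not part of SPDK:

~~~{.sh}
# Build a hexadecimal tracepoint group mask from a list of bit positions.
tpoint_mask() {
    mask=0
    for bit in "$@"; do
        mask=$((mask | (1 << bit)))
    done
    printf '0x%X\n' "$mask"
}
~~~

For example, `tpoint_mask 4` yields the 0x10 value used above, and passing bits 0 through 15 yields 0xFFFF.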
When the target starts, a message is logged with the information you need to view
@ -55,7 +55,7 @@ The spdk_trace program can be found in the app/trace directory. To analyze the
system running the NVMe-oF target, simply execute the command line shown in the log:
~~~{.sh}
build/bin/spdk_trace -s nvmf -p 24147
app/trace/spdk_trace -s nvmf -p 24147
~~~
To analyze the tracepoints on a different system, first prepare the tracepoint file for transfer. The
@ -70,7 +70,7 @@ After transferring the /tmp/trace.bz2 tracepoint file to a different system:
~~~{.sh}
bunzip2 /tmp/trace.bz2
build/bin/spdk_trace -f /tmp/trace
app/trace/spdk_trace -f /tmp/trace
~~~
The following is sample trace capture showing the cumulative time that each
@ -134,7 +134,7 @@ and store all entries into specified output file at its shutdown on SIGINT or SI
After SPDK nvmf target is launched, simply execute the command line shown in the log:
~~~{.sh}
build/bin/spdk_trace_record -q -s nvmf -p 24147 -f /tmp/spdk_nvmf_record.trace
app/trace_record/spdk_trace_record -q -s nvmf -p 24147 -f /tmp/spdk_nvmf_record.trace
~~~
Also send I/O to the SPDK target application for 10 minutes, using the previous perf example, to generate events.
@ -147,7 +147,7 @@ After the completion of perf example, shut down spdk_trace_record by signal SIGI
To analyze the tracepoints output file from spdk_trace_record, simply run spdk_trace program by:
~~~{.sh}
build/bin/spdk_trace -f /tmp/spdk_nvmf_record.trace
app/trace/spdk_trace -f /tmp/spdk_nvmf_record.trace
~~~
# Adding New Tracepoints {#add_tracepoints}
@ -202,4 +202,4 @@ record the current trace state of several tracepoints.
...
~~~
All the tracing functions are documented in the [Tracepoint library documentation](https://spdk.io/doc/trace_8h.html)
All the tracing functions are documented in the [Tracepoint library documentation](https://www.spdk.io/doc/trace_8h.html)

View File

@ -29,8 +29,8 @@ capabilities are given in the table below.
Key Functions | Description
------------------------------------------- | -----------
spdk_nvme_ctrlr_map_cmb() | @copybrief spdk_nvme_ctrlr_map_cmb()
spdk_nvme_ctrlr_unmap_cmb() | @copybrief spdk_nvme_ctrlr_unmap_cmb()
spdk_nvme_ctrlr_alloc_cmb_io_buffer() | @copybrief spdk_nvme_ctrlr_alloc_cmb_io_buffer()
spdk_nvme_ctrlr_free_cmb_io_buffer() | @copybrief spdk_nvme_ctrlr_free_cmb_io_buffer()
spdk_nvme_ctrlr_get_regs_cmbsz() | @copybrief spdk_nvme_ctrlr_get_regs_cmbsz()
# Determining device support {#p2p_support}
@ -39,7 +39,7 @@ SPDK's identify example application displays whether a device has a controller
memory buffer and which operations it supports. Run it as follows:
~~~{.sh}
./build/examples/identify -r traddr:<pci id of ssd>
./examples/nvme/identify/identify -r traddr:<pci id of ssd>
~~~
# cmb_copy: An example P2P Application {#p2p_cmb_copy}
@ -47,7 +47,7 @@ memory buffer and which operations it supports. Run it as follows:
Run the cmb_copy example application.
~~~{.sh}
./build/examples/cmb_copy -r <pci id of read ssd>-1-0-1 -w <pci id of write ssd>-1-0-1 -c <pci id of the ssd with cmb>
./examples/nvme/cmb_copy/cmb_copy -r <pci id of read ssd>-1-0-1 -w <pci id of write ssd>-1-0-1 -c <pci id of the ssd with cmb>
~~~
This should copy a single LBA (LBA 0) from namespace 1 on the read
NVMe SSD to LBA 0 on namespace 1 on the write SSD using the CMB as the

View File

@ -1,61 +1,9 @@
# Performance Reports {#performance_reports}
## Release 21.01
- [SPDK 21.01 NVMe Bdev Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvme_bdev_perf_report_2101.pdf)
- [SPDK 21.01 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2101.pdf)
- [SPDK 21.01 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_rdma_perf_report_2101.pdf)
- [SPDK 21.01 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_2101.pdf)
## Release 20.10
- [SPDK 20.10 NVMe Bdev Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvme_bdev_perf_report_2010.pdf)
- [SPDK 20.10 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2010.pdf)
- [SPDK 20.10 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_rdma_perf_report_2010.pdf)
- [SPDK 20.10 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_2010.pdf)
## Release 20.07
- [SPDK 20.07 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2007.pdf)
- [SPDK 20.07 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_rdma_perf_report_2007.pdf)
- [SPDK 20.07 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_2007.pdf)
## Release 20.04
- [SPDK 20.04 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2004.pdf)
- [SPDK 20.04 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_rdma_perf_report_2004.pdf)
- [SPDK 20.04 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_2004.pdf)
## Release 20.01
- [SPDK 20.01 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_2001.pdf)
- [SPDK 20.01 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_tcp_perf_report_2001.pdf)
- [SPDK 20.01 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_rdma_perf_report_2001.pdf)
## Release 19.10
- [SPDK 19.10 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_1910.pdf)
- [SPDK 19.10 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvmeof_tcp_perf_report_1910.pdf)
- [SPDK 19.10 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvmeof_rdma_perf_report_1910.pdf)
## Release 19.07
- [SPDK 19.07 Vhost Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_vhost_perf_report_19.07.pdf)
- [SPDK 19.07 NVMe-oF TCP Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvmeof_tcp_perf_report_19.07.pdf)
## Release 19.04
- [SPDK 19.04 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_19.04_NVMeOF_RDMA_benchmark_report.pdf)
## Release 19.01
- [SPDK 19.01.1 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvmeof_perf_report_19.01.1.pdf)
## Release 18.04
- [SPDK 18.04 NVMe BDEV Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvme_bdev_perf_report_18.04.pdf)
- [SPDK 18.04 NVMe-oF RDMA Performance Report](https://ci.spdk.io/download/performance-reports/SPDK_nvmeof_perf_report_18.04.pdf)
## Release 17.07
- [SPDK 17.07 vhost-scsi Performance Report](https://ci.spdk.io/download/performance-reports/SPDK17_07_vhost_scsi_performance_report.pdf)
- [SPDK 17.07 vhost-scsi Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK17_07_vhost_scsi_performance_report.pdf)
- [SPDK 18.04 NVMe BDEV Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK_nvme_bdev_perf_report_18.04.pdf)
- [SPDK 18.04 NVMe-oF Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK_nvmeof_perf_report_18.04.pdf)
- [SPDK 19.01.1 NVMe-oF Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK_nvmeof_perf_report_19.01.1.pdf)
- [SPDK 19.04 NVMe-oF RDMA Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK_19.04_NVMeOF_RDMA_benchmark_report.pdf)
- [SPDK 19.07 Vhost Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK_vhost_perf_report_19.07.pdf)
- [SPDK 19.07 NVMe-oF TCP Performance Report](https://dqtibwqq6s6ux.cloudfront.net/download/performance-reports/SPDK_nvmeof_tcp_perf_report_19.07.pdf)

View File

@ -1,56 +0,0 @@
# Linking SPDK applications with pkg-config {#pkgconfig}
The SPDK build system generates pkg-config files to facilitate linking
applications with the correct set of SPDK and DPDK libraries. Using pkg-config
in your build system will ensure you do not need to make modifications
when SPDK adds or modifies library dependencies.
If your application is using the SPDK nvme library, you would use the following
to get the list of required SPDK libraries:
~~~
PKG_CONFIG_PATH=/path/to/spdk/build/lib/pkgconfig pkg-config --libs spdk_nvme
~~~
To get the list of required SPDK and DPDK libraries to use the DPDK-based
environment layer:
~~~
PKG_CONFIG_PATH=/path/to/spdk/build/lib/pkgconfig pkg-config --libs spdk_env_dpdk
~~~
When linking with static libraries, the dependent system libraries must also be
specified. To get the list of required system libraries:
~~~
PKG_CONFIG_PATH=/path/to/spdk/build/lib/pkgconfig pkg-config --libs spdk_syslibs
~~~
Note that SPDK libraries use constructor functions liberally, so you must surround
the library list with extra linker options to ensure these functions are not dropped
from the resulting application binary. With shared libraries this is achieved through
the `-Wl,--no-as-needed` parameters while with static libraries `-Wl,--whole-archive`
is used. Here is an example Makefile snippet that shows how to use pkg-config to link
an application that uses the SPDK nvme shared library:
~~~
PKG_CONFIG_PATH = $(SPDK_DIR)/build/lib/pkgconfig
SPDK_LIB := $(shell PKG_CONFIG_PATH="$(PKG_CONFIG_PATH)" pkg-config --libs spdk_nvme)
DPDK_LIB := $(shell PKG_CONFIG_PATH="$(PKG_CONFIG_PATH)" pkg-config --libs spdk_env_dpdk)
app:
	$(CC) -o app app.o -pthread -Wl,--no-as-needed $(SPDK_LIB) $(DPDK_LIB) -Wl,--as-needed
~~~
If using the SPDK nvme static library:
~~~
PKG_CONFIG_PATH = $(SPDK_DIR)/build/lib/pkgconfig
SPDK_LIB := $(shell PKG_CONFIG_PATH="$(PKG_CONFIG_PATH)" pkg-config --libs spdk_nvme)
DPDK_LIB := $(shell PKG_CONFIG_PATH="$(PKG_CONFIG_PATH)" pkg-config --libs spdk_env_dpdk)
SYS_LIB := $(shell PKG_CONFIG_PATH="$(PKG_CONFIG_PATH)" pkg-config --libs --static spdk_syslibs)
app:
	$(CC) -o app app.o -pthread -Wl,--whole-archive $(SPDK_LIB) $(DPDK_LIB) -Wl,--no-whole-archive \
		$(SYS_LIB)
~~~
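Outside of make, the same wrapping rule can be expressed as a small shell helper. This is a sketch only: `wrap_spdk_libs` is a hypothetical name, and it simply surrounds whatever library list it is given with the linker options named above.

~~~{.sh}
# wrap_spdk_libs MODE LIBS...
# Surround a library list with the linker options that keep SPDK's
# constructor functions in the final binary.
#   shared -> -Wl,--no-as-needed ... -Wl,--as-needed
#   static -> -Wl,--whole-archive ... -Wl,--no-whole-archive
wrap_spdk_libs() {
    mode=$1
    shift
    case "$mode" in
        shared) printf '%s\n' "-Wl,--no-as-needed $* -Wl,--as-needed" ;;
        static) printf '%s\n' "-Wl,--whole-archive $* -Wl,--no-whole-archive" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
~~~

A possible use is `wrap_spdk_libs shared $(pkg-config --libs spdk_nvme spdk_env_dpdk)`, with PKG_CONFIG_PATH set as in the examples above.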

View File

@ -1,49 +0,0 @@
# RPMs {#rpms}
# In this document {#rpms_toc}
* @ref building_rpms
# Building SPDK RPMs {#building_rpms}
To build a basic set of RPM packages from the SPDK repo, simply run:
~~~{.sh}
# rpmbuild/rpm.sh
~~~
Additional configuration options can be passed directly as arguments:
~~~{.sh}
# rpmbuild/rpm.sh --with-shared --with-dpdk=/path/to/dpdk/build
~~~
There are several options that may be passed via environment as well:
- DEPS - Install all needed dependencies for building RPM packages.
Default: "yes"
- MAKEFLAGS - Flags passed to make
- RPM_RELEASE - Target release version of the RPM packages. Default: 1
- REQUIREMENTS - Extra set of RPM dependencies if deemed as needed
- SPDK_VERSION - SPDK version. Default: currently checked out tag
~~~{.sh}
# DEPS=no MAKEFLAGS="-d -j1" rpmbuild/rpm.sh --with-shared
~~~
By default, all RPM packages are created under the $HOME directory of the
target user:
~~~{.sh}
# printf '%s\n' /root/rpmbuild/RPMS/x86_64/*
/root/rpmbuild/RPMS/x86_64/spdk-devel-v21.01-1.x86_64.rpm
/root/rpmbuild/RPMS/x86_64/spdk-dpdk-libs-v21.01-1.x86_64.rpm
/root/rpmbuild/RPMS/x86_64/spdk-libs-v21.01-1.x86_64.rpm
/root/rpmbuild/RPMS/x86_64/spdk-v21.01-1.x86_64.rpm
#
~~~
- spdk - provides all the binaries, common tooling, etc.
- spdk-devel - provides development files
- spdk-libs - provides target lib, .pc files (--with-shared)
- spdk-dpdk-libs - provides dpdk lib files (--with-shared|--with-dpdk)

View File

@ -1,82 +0,0 @@
# Scheduler {#scheduler}
SPDK's event/application framework (`lib/event`) now supports scheduling of
lightweight threads. Schedulers are provided as plugins, called
implementations. A default implementation is provided, but users may wish to
write their own scheduler to integrate into broader code frameworks or meet
their performance needs.
This feature should be considered experimental and is disabled by default. When
enabled, the scheduler framework gathers data for each spdk thread and reactor
and passes it to a scheduler implementation to perform one of the following
actions.
## Actions
### Move a thread
`spdk_thread`s can be moved to another reactor. Schedulers can examine the
suggested cpu_mask value for each lightweight thread to see if the user has
requested specific reactors, or choose a reactor using whatever algorithm they
deem fit.
### Switch reactor mode
Reactors by default run in a mode that constantly polls for new actions for the
most efficient processing. Schedulers can switch a reactor into a mode that
instead waits for an event on a file descriptor. On Linux, this is implemented
using epoll. This results in reduced CPU usage but may be less responsive when
events occur. A reactor cannot enter this mode if any `spdk_thread`s are
currently scheduled to it. This limitation is expected to be lifted in the
future, allowing `spdk_thread`s to enter interrupt mode.
### Set frequency of CPU core
The frequency of CPU cores can be modified by the scheduler in response to
load. Only CPU cores that match the application cpu_mask may be modified. The
mechanism for controlling CPU frequency is pluggable and the default provided
implementation is called `dpdk_governor`, based on the `rte_power` library from
DPDK.
#### Known limitation
When SMT (Hyperthreading) is enabled, the two logical CPU cores sharing a single
physical CPU core must run at the same frequency. If one of the two logical
CPU cores is outside the application cpu_mask, the policy and frequency on that
core have to be managed by the administrator.
## Scheduler implementations
The scheduler in use may be controlled by JSON-RPC. Please use the
[framework_set_scheduler](jsonrpc.md/#rpc_framework_set_scheduler) RPC to
switch between schedulers or change their options.
[spdk_top](spdk_top.md#spdk_top) is a useful tool to observe the behavior of
schedulers in different scenarios and workloads.
### static [default]
The `static` scheduler is the default scheduler and does no dynamic scheduling.
Lightweight threads are distributed round-robin among reactors, respecting
their requested cpu_mask, and then they are never moved. This is equivalent to
the previous behavior of the SPDK event/application framework.
### dynamic
The `dynamic` scheduler is designed for power saving and reduction of CPU
utilization, especially in cases where workloads show large variations over
time.
Active threads are distributed equally among reactors, taking cpu_mask into
account. All idle threads are moved to the main core. Once an idle thread becomes
active, it is redistributed again.
When a reactor has no scheduled `spdk_thread`s it is switched into interrupt
mode and stops actively polling. After enough threads become active, the
reactor is switched back into poll mode and threads are assigned to it again.
The main core can contain active threads only when their execution time does
not exceed the sum of all idle threads. When no active threads are present on
the main core, the frequency of that CPU core will decrease as the load
decreases. All CPU cores corresponding to the other reactors remain at maximum
frequency.

View File

@ -1,146 +0,0 @@
# shfmt {#shfmt}
# In this document {#shfmt_toc}
* @ref shfmt_overview
* @ref shfmt_usage
* @ref shfmt_installation
* @ref shfmt_examples
# Overview {#shfmt_overview}
The majority of tests (and scripts overall) in the SPDK repo are written
in Bash (with a quite significant emphasis on "Bashism"), thus a style
formatter, shfmt, was introduced to help keep the .sh code consistent
across the entire repo. For more details on the tool itself, please see
[shfmt](https://github.com/mvdan/sh).
# Usage {#shfmt_usage}
On the CI pool, shfmt is run against all the updated .sh files that
have been committed but not merged yet. Additionally, shfmt will pick up
all .sh files present in the staging area when run locally from our pre-commit
hook (via check_format.sh). If any style errors are detected, a
patch with the needed changes is generated and either the build (CI)
or the commit is aborted. The patch can then be easily applied:
~~~{.sh}
# Run from the root of the SPDK repo
patch --merge -p0 <shfmt-3.1.0.patch
~~~
The name of the patch is derived from the version of shfmt that is
currently in use (3.1.0 is currently supported).
Please see ./scripts/check_format.sh for all the arguments shfmt
is run with. Additionally, @ref shfmt_examples has more details on how
each of the arguments behaves.
# Installation {#shfmt_installation}
shfmt can be easily installed via pkgdep.sh:
~~~{.sh}
./scripts/pkgdep.sh -d
~~~
This will install all the developer tools, including shfmt, on the
local system. The precompiled binary will be saved, by default, to
/opt/shfmt and then linked under /usr/bin. Both paths can be changed
by setting SHFMT_DIR and SHFMT_DIR_OUT in the environment. Example:
~~~{.sh}
SHFMT_DIR=/keep_the_binary_here \
SHFMT_DIR_OUT=/and_link_it_here \
./scripts/pkgdep.sh -d
~~~
# Examples {#shfmt_examples}
~~~{.sh}
#######################################
if foo=$(bar); then
  echo "$foo"
fi
exec "$foo" \
  --bar \
  --foo
# indent_style = tab
if foo=$(bar); then
	echo "$foo"
fi
exec foobar \
	--bar \
	--foo
######################################
if foo=$(bar); then
	echo "$foo" && \
	echo "$(bar)"
fi
# binary_next_line = true
if foo=$(bar); then
	echo "$foo" \
		&& echo "$(bar)"
fi
# Note that each break line is also being indented:
if [[ -v foo ]] \
&& [[ -v bar ]] \
&& [[ -v foobar ]]; then
	echo "This is foo"
fi
# ->
if [[ -v foo ]] \
	&& [[ -v bar ]] \
	&& [[ -v foobar ]]; then
	echo "This is foo"
fi
# Currently, newlines are being escaped even if syntax-wise
# they are not needed, thus watch for the following:
if [[ -v foo
	&& -v bar
	&& -v foobar ]]; then
	echo "This is foo"
fi
#->
if [[ -v foo && -v \
	bar && -v \
	foobar ]]; then
	echo "This is foo"
fi
# This, unfortunately, also breaks the -bn behavior.
# (see https://github.com/mvdan/sh/issues/565) for details.
######################################
case "$FOO" in
BAR)
	echo "$FOO" ;;
esac
# switch_case_indent = true
case "$FOO" in
	BAR)
		echo "$FOO"
		;;
esac
######################################
exec {foo}>bar
:>foo
exec {bar}<foo
# -sr
exec {foo}> bar
: > foo
exec {bar}< foo
######################################
# miscellaneous, enforced by shfmt
(( no_spacing_at_the_beginning & ~and_no_spacing_at_the_end ))
: $(( no_spacing_at_the_beginning & ~and_no_spacing_at_the_end ))
# ->
((no_spacing_at_the_beginning & ~and_no_spacing_at_the_end))
: $((no_spacing_at_the_beginning & ~and_no_spacing_at_the_end))
~~~

View File

@ -1,65 +0,0 @@
# spdk_top {#spdk_top}
The spdk_top application is designed to resemble the standard top in that it provides real-time insights into CPU core usage by SPDK lightweight threads and pollers. Have you ever wondered which CPU core is used most by your SPDK instance? Are you building your own bdev or library and want to know if your code is running efficiently? Are your new pollers busy most of the time? The spdk_top application uses RPC calls to collect performance metrics and displays them in a report that you can analyze to determine whether your code is running efficiently, so that you can tune your implementation and get more from SPDK.
Why doesn't the classic top utility work for SPDK? SPDK uses a polled-mode design; a reactor thread running on each CPU core assigned to an SPDK application schedules SPDK lightweight threads and pollers to run on the CPU core. Therefore, the standard Linux top utility is not effective for analyzing the CPU usage of polled-mode applications like SPDK because it just reports that they are using 100% of the CPU resources assigned to them. The spdk_top utility was developed to analyze and report the CPU cycles used to do real work vs. just polling for work. The utility relies on instrumentation added to pollers to track when they are doing work vs. polling for work. The spdk_top utility gets the fine-grained metrics from the pollers, then analyzes and reports them on a per-poller, per-thread, and per-core basis. This information enables users to identify CPU cores that are busy doing real work so that they can determine if the application needs more or fewer CPU resources.
# Run spdk_top
Before running spdk_top you need to run the SPDK application whose performance you want to analyze using spdk_top.
Run the spdk_top application
~~~{.sh}
./build/bin/spdk_top
~~~
# Bottom menu
The menu at the bottom of the spdk_top window shows options for changing the displayed data. Each menu item has a key associated with it in square brackets.
* Quit - quits the spdk_top application.
* TAB selection - switches between the THREADS/POLLERS/CORES tabs.
* Previous page/Next page - scrolls up/down to the next set of rows displayed. An indicator in the bottom-left corner shows the current page and the number of available pages.
* Columns - enables/disables chosen columns in a column pop-up window.
* Sorting - sorts the displayed data by a column chosen in a sorting pop-up.
* Refresh rate - takes user input from 0 to 255 and changes refresh rate to that value in seconds.
* Item details - displays details pop-up window for highlighted data row. Selection is changed by pressing UP and DOWN arrow keys.
* Total/Interval - changes displayed values in all tabs to either Total time (measured since start of SPDK application) or Interval time (measured since last refresh).
# Threads Tab
The threads tab displays a line item for each spdk thread. The information displayed shows:
* Thread name - name of SPDK thread.
* Core - core on which the thread is currently running.
* Active/Timed/Paused pollers - number of pollers grouped by type on this thread.
* Idle/Busy - how many microseconds the thread was idle/busy.
\n
Pressing the ENTER key opens a pop-up window showing the above information and a list of pollers running on the selected thread (with poller name, type, run count and period).
The pop-up can then be closed by pressing the ESC key.
To learn more about spdk threads see @ref concurrency.
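The Idle/Busy counters can be turned into a utilization figure. The sketch below assumes utilization is simply busy time divided by total time; `busy_percent` is a hypothetical helper, and spdk_top itself may present the numbers differently.

~~~{.sh}
# busy_percent BUSY_US IDLE_US
# Print the share of time spent doing real work, as an integer percentage.
busy_percent() {
    busy=$1
    idle=$2
    total=$((busy + idle))
    if [ "$total" -le 0 ]; then
        # No samples yet: report 0 rather than dividing by zero.
        echo 0
        return 0
    fi
    echo $((busy * 100 / total))
}
~~~

For example, a thread that was busy for 250 microseconds and idle for 750 microseconds spent a quarter of its time on real work.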
# Pollers Tab
The pollers tab displays a line item for each poller. The information displayed shows:
* Poller name - name of currently selected poller.
* Type - type of poller (Active/Paused/Timed).
* On thread - thread on which the poller is running.
* Run count - how many times poller was run.
* Period - poller period in microseconds. If period equals 0 then it is not displayed.
* Status - whether poller is currently Busy (red color) or Idle (blue color).
\n
The poller pop-up window can be displayed by pressing ENTER on a selected data row; it shows the above information.
The pop-up can be closed by pressing the ESC key.
# Cores Tab
The cores tab provides insights into how the application is using the CPU cores assigned to it. The information displayed for each core shows:
* Core - core number.
* Thread count - number of threads currently running on core.
* Poller count - total number of pollers running on core.
* Idle/Busy - how many microseconds core was idle (including time when core ran pollers but did not find any work) or doing actual work.
\n
Pressing the ENTER key opens a pop-up window showing the above information, along with a list of threads running on the selected core. The core details window lets you select a thread and display the thread details pop-up on top of it. To close both pop-ups, use the ESC key.

View File

@ -11,14 +11,13 @@ for the next SPDK release.
All dependencies should be handled by scripts/pkgdep.sh script.
Package dependencies at the moment include:
- configshell
### Run SPDK application instance
~~~{.sh}
./scripts/setup.sh
./build/bin/vhost -c vhost.json
./app/vhost/vhost -c vhost.conf
~~~
### Run SPDK CLI

View File

@ -15,121 +15,3 @@ the IOMMU or to set it into passthrough mode prior to running `scripts/setup.sh`
To disable the IOMMU or place it into passthrough mode, add `intel_iommu=off`
or `amd_iommu=off` or `intel_iommu=on iommu=pt` to the GRUB command line on
x86_64 system, or add `iommu.passthrough=1` on arm64 systems.
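To see which of these settings a kernel command line contains, a small classifier can help. This is a sketch under the assumption that only the parameters named above matter; `iommu_mode` is a hypothetical helper that inspects the string it is given.

~~~{.sh}
# iommu_mode CMDLINE
# Classify a kernel command line as "off", "passthrough", or "default"
# based on the IOMMU parameters it contains.
iommu_mode() {
    case " $1 " in
        *" intel_iommu=off "*|*" amd_iommu=off "*) echo off ;;
        *" iommu=pt "*|*" iommu.passthrough=1 "*) echo passthrough ;;
        *) echo default ;;
    esac
}
~~~

On a running Linux system it could be called as `iommu_mode "$(cat /proc/cmdline)"`.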
There are also some instances where a user may not want to use `uio_pci_generic` or the kernel
version they are using has a bug where `uio_pci_generic` [fails to bind to NVMe drives](https://github.com/spdk/spdk/issues/399).
In these cases, users can build the `igb_uio` kernel module which can be found in dpdk-kmods repository.
To ensure that the driver is properly bound, users should specify `DRIVER_OVERRIDE=/path/to/igb_uio.ko`.
# Running SPDK as non-privileged user {#system_configuration_nonroot}
One of the benefits of using the `VFIO` Linux kernel driver is the ability to
perform DMA operations with peripheral devices as an unprivileged user. The
permissions to access particular devices still need to be granted by the system
administrator, but only on a one-time basis. Note that this functionality
is supported with DPDK starting from version 18.11.
## Hugetlbfs access
Make sure the target user has RW access to at least one hugepage mount.
A good idea is to create a new mount specifically for SPDK:
~~~{.sh}
# mkdir /mnt/spdk_hugetlbfs
# mount -t hugetlbfs -o uid=spdk,size=<value> none /mnt/spdk_hugetlbfs
~~~
Then start SPDK applications with an additional parameter `--huge-dir /mnt/spdk_hugetlbfs`
Full guide on configuring hugepage mounts is available in the
[Linux Hugetlbpage Documentation](https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt)
## Device access {#system_configuration_nonroot_device_access}
`VFIO` device access is protected with sysfs file permissions and can be
configured with chown/chmod.
Please note that the VFIO device isolation is based around IOMMU groups and it's
only possible to change permissions of the entire group, which might possibly
consist of more than one device. (You could also apply a custom kernel patch to
further isolate those devices in the kernel, but it comes with potential risks
as described on
[Alex Williamson's VFIO blog](https://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html),
with the patch in question available here:
[[PATCH] pci: Enable overrides for missing ACS capabilities](https://lkml.org/lkml/2013/5/30/513))
Let's assume we want to use PCI device `0000:00:04.0`. First of all, verify
that it has an IOMMU group assigned:
~~~{.sh}
$ readlink "/sys/bus/pci/devices/0000:00:04.0/iommu_group"
~~~
The output should be e.g.
`../../../kernel/iommu_groups/5`
Which means that the device is a part of the IOMMU group 5. We can check if
there are any other devices in that group.
~~~{.sh}
$ ls /sys/kernel/iommu_groups/5/devices/
0000:00:04.0 0000:00:04.1 0000:00:04.2 0000:00:04.3 0000:00:04.4 0000:00:04.5 0000:00:04.6 0000:00:04.7
~~~
In this case `0000:00:04.0` is an I/OAT channel that shares its IOMMU group
with 7 other I/OAT channels.
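The two lookups above can be combined into one helper. This is a sketch: `iommu_group_devices` is a hypothetical function, and its optional second argument (an alternative sysfs root) exists only so it can be tried against a test directory.

~~~{.sh}
# iommu_group_devices BDF [SYSFS_ROOT]
# List every PCI device that shares an IOMMU group with BDF.
iommu_group_devices() {
    bdf=$1
    sysfs=${2:-/sys}
    group_link=$sysfs/bus/pci/devices/$bdf/iommu_group
    if [ ! -e "$group_link" ]; then
        echo "no IOMMU group for $bdf" >&2
        return 1
    fi
    # iommu_group is a symlink into kernel/iommu_groups/<N>.
    ls "$group_link/devices/"
}
~~~

Every device it prints would become accessible to the user once ownership of the group's /dev/vfio entry is changed.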
To give the user `spdk` full access to the VFIO IOMMU group 5 and all its
devices, use the following:
~~~{.sh}
# chown spdk /dev/vfio/5
~~~
## Memory constraints {#system_configuration_nonroot_memory_constraints}
As soon as the first device is attached to SPDK, all of SPDK memory will be
mapped to the IOMMU through the VFIO APIs. VFIO will try to mlock that memory,
which will likely exceed the user's ulimit on locked memory. Besides causing
various SPDK errors and failures, this would also pollute the syslog with
entries like the following:
`vfio_pin_pages: RLIMIT_MEMLOCK`
The limit can be checked by running the following command as target user:
(output in kilobytes)
~~~{.sh}
$ ulimit -l
~~~
On Ubuntu 18.04 this returns 16384 (16MB) by default, which is way below
what SPDK needs.
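A quick way to compare the reported limit against the amount of memory SPDK is expected to map is sketched below. `memlock_sufficient` is a hypothetical helper, and the required size is an input you must estimate yourself (hugepages plus any vhost client memory).

~~~{.sh}
# memlock_sufficient LIMIT_KB NEEDED_KB
# Succeed if the locked-memory limit (as printed by `ulimit -l`, in
# kilobytes or the word "unlimited") covers the needed amount.
memlock_sufficient() {
    limit=$1
    needed=$2
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ "$limit" -ge "$needed" ]
}
~~~

For example, `memlock_sufficient "$(ulimit -l)" $((2 * 1024 * 1024))` checks whether 2 GB can be locked by the current user.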
The limit can be increased with one of the methods below. Keep in mind SPDK will
try to map not only its reserved hugepages, but also all the memory that's
shared by its vhost clients as described in the
[Vhost processing guide](https://spdk.io/doc/vhost_processing.html#vhost_processing_init).
### Increasing the memlock limit permanently
Open the `/etc/security/limits.conf` file as root and append the following:
```
spdk hard memlock unlimited
spdk soft memlock unlimited
```
Then log out of the target user account. The changes will take effect after the next login.
### Increasing the memlock for a specific process
Linux offers a `prlimit` utility that can override limits of any given process.
On Ubuntu, it is a part of the `util-linux` package.
~~~{.sh}
# prlimit --pid <pid> --memlock=<soft>:<hard>
~~~
Note that the above needs to be executed before the first device is attached to
the SPDK application.

View File

@ -1,5 +1,4 @@
# Tools {#tools}
- @subpage spdkcli
- @subpage bdevperf
- @subpage spdk_top
- @subpage nvme-cli

View File

@ -1,8 +1,6 @@
# User Guides {#user_guides}
- @subpage system_configuration
- @subpage libraries
- @subpage pkgconfig
- @subpage app_overview
- @subpage iscsi
- @subpage nvmf

View File

@ -31,7 +31,6 @@ copy the vagrant configuration file (a.k.a. `Vagrantfile`) to it,
and run `vagrant up` with some settings defined by the script arguments.
By default, the VM created is configured with:
- 2 vCPUs
- 4G of RAM
- 2 NICs (1 x NAT - host access, 1 x private network)
@ -147,7 +146,7 @@ vagrant@vagrant:~/spdk_repo/spdk$ make
vagrant@vagrant:~/spdk_repo/spdk$ sudo ./scripts/setup.sh
0000:00:0e.0 (80ee 4e56): nvme -> uio_pci_generic
vagrant@vagrant:~/spdk_repo/spdk$ sudo build/examples/hello_world
vagrant@vagrant:~/spdk_repo/spdk$ sudo examples/nvme/hello_world/hello_world
Starting SPDK v18.10-pre / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --legacy-mem --file-prefix=spdk0 --base-virtaddr=0x200000000000 --proc-type=auto ]
EAL: Detected 4 lcore(s)

View File

@ -91,13 +91,13 @@ be restricted to run on a subset of these CPU cores. See @ref vhost_vdev_create
details.
~~~{.sh}
build/bin/vhost -S /var/tmp -m 0x3
app/vhost/vhost -S /var/tmp -m 0x3
~~~
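The -m argument is a hexadecimal CPU core mask in which bit N selects core N. As a sketch, the hypothetical helper below expands a mask into the list of cores it selects:

~~~{.sh}
# mask_to_cores MASK
# Print the space-separated core numbers selected by a core mask.
mask_to_cores() {
    mask=$(($1))
    core=0
    cores=
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            # Append the core, adding a separator after the first entry.
            cores="$cores${cores:+ }$core"
        fi
        mask=$((mask >> 1))
        core=$((core + 1))
    done
    printf '%s\n' "$cores"
}
~~~

With the mask 0x3 used above, it prints cores 0 and 1.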
To list all available vhost options use the following command.
~~~{.sh}
build/bin/vhost -h
app/vhost/vhost -h
~~~
# SPDK Configuration {#vhost_config}
@ -105,7 +105,7 @@ build/bin/vhost -h
## Create bdev (block device) {#vhost_bdev_create}
SPDK bdevs are block devices which will be exposed to the guest OS.
For vhost-scsi, bdevs are exposed as SCSI LUNs on SCSI devices attached to the
For vhost-scsi, bdevs are exposed as as SCSI LUNs on SCSI devices attached to the
vhost-scsi controller in the guest OS.
For vhost-blk, bdevs are exposed directly as block devices in the guest OS and are
not associated at all with SCSI.
@ -171,6 +171,26 @@ extra `-r` or `--readonly` parameter.
scripts/rpc.py vhost_create_blk_controller --cpumask 0x1 -r vhost.1 Malloc0
~~~
### Vhost-NVMe (experimental)
The following RPC will attach the Malloc0 bdev to the vhost.2 vhost-nvme
controller. Malloc0 will appear as Namespace 1 of the vhost.2 controller. Users
can use the `--cpumask` parameter to specify which cores should be used for this
controller. Users must specify the maximum number of I/O queues supported by
the controller. At least 1 Namespace is required for each controller.
~~~{.sh}
$rpc_py vhost_create_nvme_controller --cpumask 0x1 vhost.2 16
$rpc_py vhost_nvme_controller_add_ns vhost.2 Malloc0
~~~
Users can use the following command to remove the controller. All the block
devices attached to the controller's Namespaces will be removed automatically.
~~~{.sh}
$rpc_py vhost_delete_controller vhost.2
~~~
## QEMU {#vhost_qemu_config}
Now the virtual machine can be started with QEMU. The following command-line
@ -209,6 +229,13 @@ Finally, specify the SPDK vhost devices:
-device vhost-user-blk-pci,id=blk0,chardev=char1
~~~
### Vhost-NVMe (experimental)
~~~{.sh}
-chardev socket,id=char2,path=/var/tmp/vhost.2
-device vhost-user-nvme,id=nvme0,chardev=char2,num_io_queues=4
~~~
## Example output {#vhost_example}
This example uses an NVMe bdev alongside Mallocs. SPDK vhost application is started
@ -220,9 +247,9 @@ host:~# HUGEMEM=2048 ./scripts/setup.sh
~~~
~~~{.sh}
host:~# ./build/bin/vhost -S /var/tmp -s 1024 -m 0x3 &
host:~# ./app/vhost/vhost -S /var/tmp -s 1024 -m 0x3 &
Starting DPDK 17.11.0 initialization...
[ DPDK EAL parameters: vhost -c 3 -m 1024 --main-lcore=1 --file-prefix=spdk_pid156014 ]
[ DPDK EAL parameters: vhost -c 3 -m 1024 --master-lcore=1 --file-prefix=spdk_pid156014 ]
EAL: Detected 48 lcore(s)
EAL: Probing VFIO support...
EAL: VFIO support initialized
@ -310,6 +337,7 @@ vhost.c:1006:session_shutdown: *NOTICE*: Exiting
We can see that `sdb` and `sdc` are SPDK vhost-scsi LUNs, and `vda` is an SPDK
vhost-blk disk.
# Advanced Topics {#vhost_advanced_topics}
## Multi-Queue Block Layer (blk-mq) {#vhost_multiqueue}
@ -319,9 +347,9 @@ To enable it on Linux, it is required to modify kernel options inside the
virtual machine.
Instructions below for Ubuntu OS:
1. `vi /etc/default/grub`
2. Make sure mq is enabled: `GRUB_CMDLINE_LINUX="scsi_mod.use_blk_mq=1"`
2. Make sure mq is enabled:
`GRUB_CMDLINE_LINUX="scsi_mod.use_blk_mq=1"`
3. `sudo update-grub`
4. Reboot virtual machine
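After the reboot, it may help to confirm that the option actually reached the kernel command line. The check below is a minimal sketch; the function name `has_blk_mq_enabled` is hypothetical, and it only looks for the exact `scsi_mod.use_blk_mq=1` token used in the steps above.

```sh
# has_blk_mq_enabled <kernel command line>
# Succeeds if the command line contains the token scsi_mod.use_blk_mq=1.
has_blk_mq_enabled() {
    # Pad with spaces so the token is matched as a whole word.
    case " $1 " in
        *" scsi_mod.use_blk_mq=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the guest:
#   has_blk_mq_enabled "$(cat /proc/cmdline)" && echo "blk-mq requested"
```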
@ -346,7 +374,7 @@ be aborted - possibly flooding a VM with syslog warnings and errors.
### Hot-attach
Hot-attach is done by simply attaching a bdev to a vhost controller with a QEMU VM
Hot-attach is is done by simply attaching a bdev to a vhost controller with a QEMU VM
already started. No other extra action is necessary.
~~~{.sh}
@ -384,6 +412,5 @@ See the [bug report](https://bugzilla.redhat.com/show_bug.cgi?id=1411092) for
more information.
## QEMU vhost-user-blk
QEMU [vhost-user-blk](https://git.qemu.org/?p=qemu.git;a=commit;h=00343e4b54ba) is
supported from version 2.12.

View File

@ -89,7 +89,6 @@ device (SPDK) can access it directly. The memory can be fragmented into multiple
physically-discontiguous regions and Vhost-user specification puts a limit on
their number - currently 8. The driver sends a single message for each region with
the following data:
* file descriptor - for mmap
* user address - for memory translations in Vhost-user messages (e.g.
translating vring addresses)
@ -107,7 +106,6 @@ as they use common SCSI I/O to inquiry the underlying disk(s).
Afterwards, the driver requests the number of maximum supported queues and
starts sending virtqueue data, which consists of:
* unique virtqueue id
* index of the last processed vring descriptor
* vring addresses (from user address space)

View File

@ -6,9 +6,8 @@ SPDK Virtio driver is a C library that allows communicating with Virtio devices.
It allows any SPDK application to become an initiator for (SPDK) vhost targets.
The driver supports two different usage models:
* PCI - This is the standard mode of operation when used in a guest virtual
machine, where QEMU has presented the virtio controller as a virtual PCI device.
machine, where QEMU has presented the virtio controller as a virtual PCI device.
* vhost-user - Can be used to connect to a vhost socket directly on the same host.
The driver, just like the SPDK @ref vhost, is using pollers instead of standard

View File

@ -1,116 +0,0 @@
# VMD driver {#vmd}
# In this document {#vmd_toc}
* @ref vmd_intro
* @ref vmd_interface
* @ref vmd_key_functions
* @ref vmd_config
* @ref vmd_app_frame
* @ref vmd_app
* @ref vmd_led
# Introduction {#vmd_intro}
Intel Volume Management Device is a hardware logic inside processor's Root Complex
responsible for management of PCIe NVMe SSDs. It provides robust Hot Plug support
and Status LED management.
The driver is responsible for enumeration and hooking NVMe devices behind VMD
into SPDK PCIe subsystem. It also provides API for LED management and hot plug.
# Public Interface {#vmd_interface}
- spdk/vmd.h
# Key Functions {#vmd_key_functions}
Function | Description
--------------------------------------- | -----------
spdk_vmd_init() | @copybrief spdk_vmd_init()
spdk_vmd_pci_device_list() | @copybrief spdk_vmd_pci_device_list()
spdk_vmd_set_led_state() | @copybrief spdk_vmd_set_led_state()
spdk_vmd_get_led_state() | @copybrief spdk_vmd_get_led_state()
spdk_vmd_hotplug_monitor() | @copybrief spdk_vmd_hotplug_monitor()
# Configuration {#vmd_config}
To enable VMD driver enumeration, the following steps are required:
Check for available VMD devices (VMD needs to be properly set up in BIOS first).
Example:
```
$ lspci | grep 201d
$ 5d:05.5 RAID bus controller: Intel Corporation Device 201d (rev 04)
$ d7:05.5 RAID bus controller: Intel Corporation Device 201d (rev 04)
```
Run setup.sh script with VMD devices set in PCI_ALLOWED.
Example:
```
$ PCI_ALLOWED="0000:5d:05.5 0000:d7:05.5" scripts/setup.sh
```
Check for available devices behind the VMD with spdk_lspci.
Example:
```
$ ./build/bin/spdk_lspci
5d0505:01:00.0 (8086 a54) (NVMe disk behind VMD)
5d0505:03:00.0 (8086 a54) (NVMe disk behind VMD)
d70505:01:00.0 (8086 a54) (NVMe disk behind VMD)
d70505:03:00.0 (8086 a54) (NVMe disk behind VMD)
0000:5d:05.5 (8086 201d) (VMD)
0000:d7:05.5 (8086 201d) (VMD)
```
VMD NVMe BDF could be used as regular NVMe BDF.
Example:
```
$ ./scripts/rpc.py bdev_nvme_attach_controller -b NVMe1 -t PCIe -a 5d0505:01:00.0
```
# Application framework {#vmd_app_frame}
When application framework is used, VMD section needs to be added to the configuration file:
JSON config:
```
{
"subsystem": "vmd",
"config": [
{
"method": "enable_vmd",
"params": {}
}
]
}
```
or use RPC call before framework starts e.g.
```
$ ./build/bin/spdk_tgt --wait_for_rpc
$ ./scripts/rpc.py enable_vmd
$ ./scripts/rpc.py framework_start_init
```
# Applications w/o application framework {#vmd_app}
To enable VMD enumeration in SPDK application that are not using application framework
e.g nvme/perf, nvme/identify -V flag is required - please refer to app help if it supports VMD.
Applications need to call spdk_vmd_init() to enumerate NVMe devices behind the VMD prior to calling
spdk_nvme_(probe|connect).
To support hot plugs spdk_vmd_hotplug_monitor() needs to be called periodically.
# LED management {#vmd_led}
VMD LED utility in the [examples/vmd/led](https://github.com/spdk/spdk/tree/master/examples/vmd/led)
could be used to set LED states.
In order to verify that a platform is correctly configured to support LED management, ledctl(8) can
be utilized. For instructions on how to use it, consult the manual page of this utility.

232
doc/vpp_integration.md Normal file

@ -0,0 +1,232 @@
# Vector Packet Processing {#vpp_integration}
VPP (part of [Fast Data - Input/Output](https://fd.io/) project) is an extensible
userspace framework providing networking functionality. It is built around the concept of
packet processing graph (see [What is VPP?](https://wiki.fd.io/view/VPP/What_is_VPP?)).
Detailed instructions for the **simplified steps 1-3** below can be found in the
VPP [Quick Start Guide](https://wiki.fd.io/view/VPP).
*SPDK supports VPP version 19.04.2.*
# 1. Building VPP (optional) {#vpp_build}
*Please skip this step if using already built packages.*
Clone and checkout VPP
~~~
git clone https://gerrit.fd.io/r/vpp && cd vpp
git checkout v19.04.2
~~~
Install VPP build dependencies
~~~
make install-dep
~~~
Build and create .rpm packages
~~~
make pkg-rpm
~~~
Alternatively, build and create .deb packages
~~~
make bootstrap && make pkg-deb
~~~
Packages can be found in `vpp/build-root/` directory.
For more in-depth instructions, please see the Building section in the
[VPP documentation](https://wiki.fd.io/view/VPP/Pulling,_Building,_Running,_Hacking_and_Pushing_VPP_Code#Building)
# 2. Installing VPP {#vpp_install}
Packages can be installed from a distribution repository or built in previous step.
Minimal set of packages consists of `vpp`, `vpp-lib` and `vpp-devel`.
*Note: Please remove or modify /etc/sysctl.d/80-vpp.conf file with appropriate values
dependent on number of hugepages that will be used on system.*
# 3. Running VPP {#vpp_run}
VPP takes over any network interfaces that were bound to userspace driver,
for details please see DPDK guide on
[Binding and Unbinding Network Ports to/from the Kernel Modules](http://dpdk.org/doc/guides/linux_gsg/linux_drivers.html#binding-and-unbinding-network-ports-to-from-the-kernel-modules).
VPP is installed as service and disabled by default. To start VPP with default config:
~~~
sudo systemctl start vpp
~~~
Alternatively, use `vpp` binary directly
~~~
sudo vpp unix {cli-listen /run/vpp/cli.sock} session { evt_qs_memfd_seg } socksvr { socket-name /run/vpp-api.sock }
~~~
# 4. Configure VPP {#vpp_config}
VPP can be configured using a VPP startup file and the `vppctl` command. By default, the VPP startup file is `/etc/vpp/startup.conf`; however, you can pass any file with the `-c` vpp command argument.
## Startup configuration
Some key values from an iSCSI point of view include:
CPU section (`cpu`):
- `main-core <lcore>` -- logical CPU core used for main thread.
- `corelist-workers <lcore list>` -- logical CPU cores where worker threads are running.
DPDK section (`dpdk`):
- `num-rx-queues <num>` -- number of receive queues.
- `num-tx-queues <num>` -- number of transmit queues.
- `dev <PCI address>` -- whitelisted device.
Session section (`session`):
- `evt_qs_memfd_seg` -- uses a memfd segment for event queues. This is required for SPDK.
Socket server session (`socksvr`):
- `socket-name <path>` -- configure API socket filename (currently SPDK uses default path `/run/vpp-api.sock`).
Plugins section (`plugins`):
- `plugin <plugin name> { [enable|disable] }` -- enable or disable VPP plugin.
### Example:
~~~
unix {
nodaemon
cli-listen /run/vpp/cli.sock
}
cpu {
main-core 1
}
session {
evt_qs_memfd_seg
}
socksvr {
socket-name /run/vpp-api.sock
}
plugins {
plugin default { disable }
plugin dpdk_plugin.so { enable }
}
~~~
## vppctl command tool
The `vppctl` command tool allows users to control VPP at runtime via a command prompt
~~~
sudo vppctl
~~~
Alternatively, send a single command directly. For example, to display interfaces within VPP:
~~~
sudo vppctl show interface
~~~
Useful commands:
- `show interface` -- show interfaces settings, state and some basic statistics.
- `show interface address` -- show interfaces state and assigned addresses.
- `set interface ip address <VPP interface> <Address>` -- set interfaces IP address.
- `set interface state <VPP interface> [up|down]` -- bring interface up or down.
- `show errors` -- show error counts.
## Example: Configure two interfaces to be available via VPP
We want to configure two DPDK ports with PCI addresses 0000:09:00.1 and 0000:0b:00.1
to be used as portals 10.0.0.1/24 and 10.10.0.1/24.
In the VPP startup file (e.g. `/etc/vpp/startup.conf`), whitelist the interfaces
by specifying PCI addresses in section dpdk:
~~~
dev 0000:09:00.1
dev 0000:0b:00.1
~~~
Bind PCI NICs to UIO driver (`igb_uio` or `uio_pci_generic`).
Restart vpp and use vppctl tool to verify interfaces.
~~~
$ vppctl show interface
Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count
FortyGigabitEthernet9/0/1 1 down 9000/0/0/0
FortyGigabitEthernetb/0/1 2 down 9000/0/0/0
~~~
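The interface names in this output appear to follow from the PCI addresses: bus, slot and function are printed as hexadecimal numbers without leading zeros, so `0000:09:00.1` becomes `9/0/1`. The sketch below reproduces that mapping, assuming this naming holds for your VPP version; the function name `pci_to_vpp_suffix` is hypothetical.

```sh
# pci_to_vpp_suffix <domain:bus:slot.function>
# Prints the bus/slot/function suffix as it appears in the interface name.
pci_to_vpp_suffix() {
    bdf=${1#*:}        # drop the PCI domain, e.g. "09:00.1"
    bus=${bdf%%:*}     # "09"
    rest=${bdf#*:}     # "00.1"
    slot=${rest%%.*}   # "00"
    func=${rest#*.}    # "1"
    # The fields are hexadecimal; print them without leading zeros.
    printf '%x/%x/%x\n' "0x$bus" "0x$slot" "0x$func"
}
```

For the two ports in this example, `pci_to_vpp_suffix 0000:09:00.1` and `pci_to_vpp_suffix 0000:0b:00.1` give the suffixes seen in the `show interface` output above.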
Set appropriate addresses and bring interfaces up:
~~~
$ vppctl set interface ip address FortyGigabitEthernet9/0/1 10.0.0.1/24
$ vppctl set interface state FortyGigabitEthernet9/0/1 up
$ vppctl set interface ip address FortyGigabitEthernetb/0/1 10.10.0.1/24
$ vppctl set interface state FortyGigabitEthernetb/0/1 up
~~~
Verify configuration:
~~~
$ vppctl show interface address
FortyGigabitEthernet9/0/1 (up):
L3 10.0.0.1/24
FortyGigabitEthernetb/0/1 (up):
L3 10.10.0.1/24
~~~
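The same verification can be scripted. The following is a minimal sketch that reads `vppctl show interface address` output on standard input and succeeds only if the given interface is up and carries the given address. The function name `iface_has_addr` is hypothetical, and the parser assumes the output layout shown above.

```sh
# iface_has_addr <interface> <address/prefix>
# Reads `vppctl show interface address` output from stdin.
iface_has_addr() {
    awk -v ifc="$1" -v addr="$2" '
        # Header line such as "FortyGigabitEthernet9/0/1 (up):"
        $2 ~ /^\(.*\):$/ { in_if = ($1 == ifc && $2 == "(up):"); next }
        # Address line belonging to the interface we are looking for
        in_if && $1 == "L3" && $2 == addr { found = 1 }
        END { exit !found }
    '
}

# Usage:
#   vppctl show interface address | iface_has_addr FortyGigabitEthernet9/0/1 10.0.0.1/24
```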
Now, both interfaces are ready to use. To verify connectivity you can ping
10.0.0.1 and 10.10.0.1 addresses from another machine.
## Example: Tap interfaces on single host
For functional test purposes a virtual tap interface can be created,
so no additional network hardware is required.
This will allow network communication between SPDK iSCSI target using VPP end of tap
and kernel iSCSI initiator using the kernel part of tap. A single host is used in this scenario.
Create tap interface via VPP
~~~
vppctl tap connect tap0
vppctl set interface state tapcli-0 up
vppctl set interface ip address tapcli-0 10.0.0.1/24
vppctl show int addr
~~~
Assign address on kernel interface
~~~
sudo ip addr add 10.0.0.2/24 dev tap0
sudo ip link set tap0 up
~~~
To verify connectivity
~~~
ping 10.0.0.1
~~~
# 5. Building SPDK with VPP {#vpp_built_into_spdk}
Support for VPP can be built into SPDK by using a configuration option.
~~~
configure --with-vpp
~~~
Alternatively, point the option at a directory with built libraries;
it will be used for compilation instead of the installed packages.
~~~
configure --with-vpp=/path/to/vpp/repo/build-root/install-vpp-native/vpp
~~~
# 6. Running SPDK with VPP {#vpp_running_with_spdk}
VPP application has to be started before SPDK application, in order to enable
usage of network interfaces. For example, if you use SPDK iSCSI target or
NVMe-oF target, after the initialization finishes, interfaces configured within
VPP will be available to be configured as portal addresses.
Moreover, you do not need to specify which TCP sock implementation (e.g., posix,
VPP) is used through a configuration file or RPC call, because the SPDK program
automatically determines it from the configured portal addresses.
For example, you can specify a Listen address in the NVMe-oF subsystem
configuration such as "Listen TCP 10.0.0.1:4420". The SPDK program then tries to
listen on the provided portal with each available implementation (posix, and VPP
if compiled into the SPDK program), and only one implementation can
successfully listen on the provided portal.

2
dpdk

@ -1 +1 @@
Subproject commit 4f93dbc0c0ab3804abaa20123030ad7fccf78709
Subproject commit 0698cc38e0cb86bc01d82b6f1aef85fd983b213d

View File

@ -36,64 +36,102 @@ include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
.PHONY: all clean install uninstall
DPDK_OPTS = -Denable_docs=false
DPDK_FRAMEWORK = n
DPDK_OPTS =
DPDK_CFLAGS =
DPDK_KMODS = false
ifeq ($(OS),FreeBSD)
DPDK_KMODS = true
endif
DPDK_OPTS += -Denable_kmods=$(DPDK_KMODS)
ifeq ($(CONFIG_DEBUG),y)
DPDK_OPTS += --buildtype=debug
endif
# the drivers we use
DPDK_DRIVERS = bus bus/pci bus/vdev mempool/ring
# common crypto/reduce drivers
ifeq ($(findstring y,$(CONFIG_CRYPTO)$(CONFIG_REDUCE)),y)
DPDK_DRIVERS += crypto/qat compress/qat common/qat
ifeq ($(CONFIG_SHARED),y)
DPDK_OPTS += CONFIG_RTE_BUILD_SHARED_LIB=y
DPDK_LDFLAGS+= -rpath $(SPDK_ROOT_DIR)/dpdk/build/lib
else
DPDK_OPTS += CONFIG_RTE_BUILD_SHARED_LIB=n
endif
ifeq ($(CONFIG_CRYPTO),y)
# crypto/qat is just a stub, the compress/qat pmd is used instead
DPDK_DRIVERS += crypto crypto/aesni_mb
DPDK_FRAMEWORK = y
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_AESNI_MB=y
DPDK_OPTS += CONFIG_RTE_LIBRTE_REORDER=y
DPDK_CFLAGS += -I$(IPSEC_MB_DIR)
DPDK_LDFLAGS += -L$(IPSEC_MB_DIR)
else
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_AESNI_MB=n
DPDK_OPTS += CONFIG_RTE_LIBRTE_REORDER=n
endif
ifeq ($(CONFIG_REDUCE),y)
DPDK_DRIVERS += compress compress/isal
DPDK_FRAMEWORK = y
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_ISAL=y
DPDK_CFLAGS += -I$(ISAL_DIR)
DPDK_LDFLAGS += -L$(ISAL_DIR)/.libs -lisal
DPDK_LDFLAGS += -L$(ISAL_DIR)/.libs
else
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_ISAL=n
endif
DPDK_OPTS += -Dmachine=$(TARGET_ARCHITECTURE)
ifeq ($(CONFIG_VHOST),y)
DPDK_OPTS += CONFIG_RTE_LIBRTE_ETHER=y
DPDK_OPTS += CONFIG_RTE_LIBRTE_CMDLINE=y
DPDK_OPTS += CONFIG_RTE_LIBRTE_METER=y
DPDK_OPTS += CONFIG_RTE_LIBRTE_HASH=y
DPDK_OPTS += CONFIG_RTE_LIBRTE_VHOST=y
else
DPDK_OPTS += CONFIG_RTE_LIBRTE_ETHER=n
DPDK_OPTS += CONFIG_RTE_LIBRTE_CMDLINE=n
DPDK_OPTS += CONFIG_RTE_LIBRTE_METER=n
DPDK_OPTS += CONFIG_RTE_LIBRTE_HASH=n
DPDK_OPTS += CONFIG_RTE_LIBRTE_VHOST=n
endif
ifeq ($(DPDK_FRAMEWORK),y)
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_QAT=y
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_QAT_SYM=y
ifeq ($(CONFIG_IGB_UIO_DRIVER),y)
DPDK_OPTS += CONFIG_RTE_EAL_IGB_UIO=y
else
DPDK_OPTS += CONFIG_RTE_EAL_IGB_UIO=n
endif
else
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_QAT=n
DPDK_OPTS += CONFIG_RTE_LIBRTE_PMD_QAT_SYM=n
endif
ifeq ($(TARGET_MACHINE),aarch64)
DPDK_CONFIG := arm64-armv8a
else
DPDK_CONFIG := $(TARGET_MACHINE)-native
endif
ifneq ($(CONFIG_CROSS_PREFIX),)
ifeq ($(findstring mingw,$(CONFIG_CROSS_PREFIX)),mingw)
DPDK_OPTS += --cross-file $(SPDK_ROOT_DIR)/dpdk/config/x86/cross-mingw
else
$(error Automatic DPDK cross build is not supported. Please compile DPDK manually \
with e.g. `meson build --cross-file config/arm/arm64_armv8_linux_gcc`)
DPDK_OPTS += CROSS=$(CONFIG_CROSS_PREFIX)-
endif
ifeq ($(OS),Linux)
DPDK_CONFIG := $(DPDK_CONFIG)-linuxapp
NPROC := $(shell nproc)
else
ifeq ($(OS),FreeBSD)
DPDK_CONFIG := $(DPDK_CONFIG)-bsdapp
NPROC := $(shell sysctl hw.ncpu | awk '{print $$NF}')
endif
endif
ifeq ($(CC_TYPE),clang)
DPDK_CONFIG := $(DPDK_CONFIG)-clang
else
DPDK_CONFIG := $(DPDK_CONFIG)-gcc
endif
DPDK_CFLAGS += -fPIC
ifeq ($(CONFIG_DEBUG),y)
DPDK_CFLAGS += -O0 -g
endif
ifeq ($(CONFIG_WERROR),y)
DPDK_CFLAGS += -Werror
else
DPDK_CFLAGS += -Wno-error
endif
ifeq ($(CONFIG_CET),y)
DPDK_CFLAGS += -fcf-protection
DPDK_LDFLAGS += -fcf-protection
endif
ifdef EXTRA_DPDK_CFLAGS
$(warning EXTRA_DPDK_CFLAGS defined, possibly to work around an unsupported compiler version)
$(shell sleep 1)
@ -102,80 +140,17 @@ endif
# Allow users to specify EXTRA_DPDK_CFLAGS if they want to build DPDK using unsupported compiler versions
DPDK_CFLAGS += $(EXTRA_DPDK_CFLAGS)
ifeq ($(CC_TYPE),gcc)
GCC_MAJOR = $(shell echo __GNUC__ | $(CC) -E -x c - | tail -n 1)
ifeq ($(shell test $(GCC_MAJOR) -ge 10 && echo 1), 1)
#1. gcc 10 complains on operations with zero size arrays in rte_cryptodev.c, so
#disable this warning
#2. gcc 10 disables fcommon by default and complains on multiple definition of
#aesni_mb_logtype_driver symbol which is defined in header file and presented in several
#translation units
DPDK_CFLAGS += -Wno-stringop-overflow -fcommon
endif
endif
$(SPDK_ROOT_DIR)/dpdk/build: $(SPDK_ROOT_DIR)/mk/cc.mk $(SPDK_ROOT_DIR)/include/spdk/config.h
$(Q)rm -rf $(SPDK_ROOT_DIR)/dpdk/build
$(Q)$(MAKE) -C $(SPDK_ROOT_DIR)/dpdk config T=$(DPDK_CONFIG) $(DPDK_OPTS)
# Force-disable scan-build
SUB_CC = $(patsubst %ccc-analyzer,$(DEFAULT_CC),$(CC))
DPDK_ALL_DRIVER_DIRS = $(shell find $(SPDK_ROOT_DIR)/dpdk/drivers -mindepth 1 -type d)
DPDK_ALL_DRIVERS = $(DPDK_ALL_DRIVER_DIRS:$(SPDK_ROOT_DIR)/dpdk/drivers/%=%)
DPDK_DISABLED_DRVERS = $(filter-out $(DPDK_DRIVERS),$(DPDK_ALL_DRIVERS))
ifneq ($(OS),FreeBSD)
SED_INPLACE_FLAG = "-i"
MESON_PREFIX = $(SPDK_ROOT_DIR)/dpdk/build
else
SED_INPLACE_FLAG = "-i ''"
MESON_PREFIX = "/"
endif
# Some ninja versions come with a (broken?) jobserver which defaults to use
# only 1 thread for the build. We workaround this by specifying -j to ninja
# with the same value as top-makefile. This is OK as long as DPDK is not built
# in parallel with anything else, which is the case for now.
ifeq ($(MAKE_PID),)
MAKE_PID := $(shell echo $$PPID)
endif
MAKE_NUMJOBS := $(shell ps T | sed -nE 's/[[:space:]]*$(MAKE_PID)[[:space:]].* (-j|--jobs=)( *[0-9]+).*/\1\2/p')
all: $(SPDK_ROOT_DIR)/dpdk/build-tmp
$(Q)# DPDK doesn't handle nested make calls, so unset MAKEFLAGS
$(Q)env -u MAKEFLAGS ninja -C $(SPDK_ROOT_DIR)/dpdk/build-tmp $(MAKE_NUMJOBS)
$(Q) \
# Meson on FreeBSD sometimes appends --prefix value to the default DESTDIR (which is e.g. \
# /usr/local) instead of replacing it. --prefix needs to be an absolute path, so we set \
# it to / and then set DESTDIR directly, so libs and headers are copied to "DESTDIR//". \
# DPDK kernel modules are set to install in $DESTDIR/boot/modules, but we move them \
# to DESTDIR/kmod to be consistent with the makefile build. \
# \
# Also use meson install --only-changed instead of ninja install so that the shared \
# libraries don't get reinstalled when they haven't been rebuilt - this avoids all of \
# our applications getting relinked even when nothing has changed.
$(Q)if [ "$(OS)" = "FreeBSD" ]; then \
env -u MAKEFLAGS DESTDIR=$(SPDK_ROOT_DIR)/dpdk/build ninja -C $(SPDK_ROOT_DIR)/dpdk/build-tmp $(MAKE_NUMJOBS) install > /dev/null && \
mv $(SPDK_ROOT_DIR)/dpdk/build/boot/modules $(SPDK_ROOT_DIR)/dpdk/build/kmod; \
else \
env -u MAKEFLAGS meson install -C $(SPDK_ROOT_DIR)/dpdk/build-tmp --only-changed > /dev/null; \
fi
$(SPDK_ROOT_DIR)/dpdk/build-tmp: $(SPDK_ROOT_DIR)/mk/cc.mk $(SPDK_ROOT_DIR)/include/spdk/config.h
$(Q)rm -rf $(SPDK_ROOT_DIR)/dpdk/build $(SPDK_ROOT_DIR)/dpdk/build-tmp
$(Q)cd "$(SPDK_ROOT_DIR)/dpdk"; CC="$(SUB_CC)" meson --prefix="$(MESON_PREFIX)" --libdir lib -Dc_args="$(DPDK_CFLAGS)" -Dc_link_args="$(DPDK_LDFLAGS)" $(DPDK_OPTS) -Ddisable_drivers="$(shell echo $(DPDK_DISABLED_DRVERS) | sed -E "s/ +/,/g")" build-tmp
$(Q)sed $(SED_INPLACE_FLAG) 's/#define RTE_EAL_PMD_PATH .*/#define RTE_EAL_PMD_PATH ""/g' $(SPDK_ROOT_DIR)/dpdk/build-tmp/rte_build_config.h
$(Q) \
# TODO Meson build adds libbsd dependency when it's available. This means any app will be \
# forced to link with -lbsd, but only if it's available on the system. The clean way to \
# handle this would be to rely on DPDK's pkg-config file which will contain the -lbsd when \
# required. For now just remove the libbsd dependency. DPDK will fallback to its internal \
# functions.
$(Q)sed $(SED_INPLACE_FLAG) 's/#define RTE_USE_LIBBSD .*//g' $(SPDK_ROOT_DIR)/dpdk/build-tmp/rte_build_config.h
all: $(SPDK_ROOT_DIR)/dpdk/build
$(Q)$(MAKE) -C $(SPDK_ROOT_DIR)/dpdk/build EXTRA_CFLAGS="$(DPDK_CFLAGS)" EXTRA_LDFLAGS="$(DPDK_LDFLAGS)" MAKEFLAGS="T=$(DPDK_CONFIG) -j$(NPROC)" $(DPDK_OPTS)
clean:
$(Q)rm -rf $(SPDK_ROOT_DIR)/dpdk/build $(SPDK_ROOT_DIR)/dpdk/build-tmp
$(Q)rm -rf $(SPDK_ROOT_DIR)/dpdk/build
install:
@:
install: all
uninstall:
@:

230
etc/spdk/iscsi.conf.in Normal file
View File

@ -0,0 +1,230 @@
# iSCSI target configuration file
#
# Please write all parameters using ASCII.
# The parameter must be quoted if it includes whitespace.
#
# Configuration syntax:
# Leading whitespace is ignored.
# Lines starting with '#' are comments.
# Lines ending with '\' are concatenated with the next line.
# Bracketed ([]) names define sections
[Global]
# Shared Memory Group ID. SPDK applications with the same ID will share memory.
# Default: <the process PID>
#SharedMemoryID 0
# Disable PCI access. PCI is enabled by default. Setting this
# option will hide any PCI device from all SPDK modules, making
# SPDK act as if they don't exist.
#NoPci Yes
# Tracepoint group mask for spdk trace buffers
# Default: 0x0 (all tracepoint groups disabled)
# Set to 0xFFFF to enable all tracepoint groups.
#TpointGroupMask 0x0
# Users may activate entries in this section to override default values for
# global parameters in the block device (bdev) subsystem.
[Bdev]
# Number of spdk_bdev_io structures allocated in the global bdev subsystem pool.
#BdevIoPoolSize 65536
# Maximum number of spdk_bdev_io structures to cache per thread.
#BdevIoCacheSize 256
[iSCSI]
# Node name (not including the optional part)
# Users can optionally change this to fit their environment.
NodeBase "iqn.2016-06.io.spdk"
AuthFile /usr/local/etc/spdk/auth.conf
# Socket I/O timeout sec. (0 is infinite)
Timeout 30
# authentication information for discovery session
# Options:
# None, Auto, CHAP and Mutual. Note that Mutual infers CHAP.
DiscoveryAuthMethod Auto
#MaxSessions 128
#MaxConnectionsPerSession 2
# iSCSI initial parameters negotiate with initiators
# NOTE: incorrect values might crash
DefaultTime2Wait 2
DefaultTime2Retain 60
# Maximum amount in bytes of unsolicited data the iSCSI
# initiator may send to the target during the execution of
# a single SCSI command.
FirstBurstLength 8192
ImmediateData Yes
ErrorRecoveryLevel 0
# Users must change the PortalGroup section(s) to match the IP addresses
# for their environment.
# PortalGroup sections define which network portals the iSCSI target
# will use to listen for incoming connections. These are also used to
# determine which targets are accessible over each portal group.
# Up to 1024 portal directives are allowed. These define the network
# portals of the portal group. The user must specify an IP address
# for each network portal, and may optionally specify a port.
# If the port is omitted, 3260 will be used.
# Syntax:
# Portal <Name> <IP address>[:<port>]
[PortalGroup1]
Portal DA1 192.168.2.21:3260
Portal DA2 192.168.2.22:3260
# Users must change the InitiatorGroup section(s) to match the IP
# addresses and initiator configuration in their environment.
# Netmask can be used to specify a single IP address or a range of IP addresses
# Netmask 192.168.1.20 <== single IP address
# Netmask 192.168.1.0/24 <== IP range 192.168.1.*
[InitiatorGroup1]
InitiatorName ANY
Netmask 192.168.2.0/24
# NVMe configuration options
[Nvme]
# NVMe Device Whitelist
# Users may specify which NVMe devices to claim by their transport id.
# See spdk_nvme_transport_id_parse() in spdk/nvme.h for the correct format.
# The second argument is the assigned name, which can be referenced from
# other sections in the configuration file. For NVMe devices, a namespace
# is automatically appended to each name in the format <YourName>nY, where
# Y is the NSID (starts at 1).
TransportID "trtype:PCIe traddr:0000:00:00.0" Nvme0
TransportID "trtype:PCIe traddr:0000:01:00.0" Nvme1
# The number of attempts per I/O when an I/O fails. Do not include
# this key to get the default behavior.
RetryCount 4
# Timeout for each command, in microseconds. If 0, don't track timeouts.
TimeoutUsec 0
# Action to take on command timeout. Only valid when TimeoutUsec is greater
# than 0. This may be 'Reset' to reset the controller, 'Abort' to abort
# the command, or 'None' to just print a message but do nothing.
# Admin command timeouts will always result in a reset.
ActionOnTimeout None
# Set how often the admin queue is polled for asynchronous events.
# Units in microseconds.
AdminPollRate 100000
# Set how often I/O queues are polled for completions.
# Units in microseconds.
IOPollRate 0
# Handling of hotplug (runtime insert and remove) events is disabled by default;
# set this to Yes to enable it.
# Default: No
HotplugEnable No
# Set how often the hotplug is processed for insert and remove events.
# Units in microseconds.
HotplugPollRate 0
# Users may change this section to create a different number or size of
# malloc LUNs.
# If the system has hardware DMA engine, it can use an IOAT
# (i.e. Crystal Beach DMA) channel to do the copy instead of memcpy
# by specifying "Enable Yes" in [Ioat] section.
# Offload is disabled by default even if it is available.
[Malloc]
# Number of Malloc targets
NumberOfLuns 5
# Malloc targets are 128M
LunSizeInMB 128
# Block size. Default is 512 bytes.
BlockSize 4096
# Users can use offload by specifying "Enable Yes" in this section
# if it is available.
# Users may use the whitelist to initialize specified devices, IDS
# uses BUS:DEVICE.FUNCTION to identify each Ioat channel.
[Ioat]
Enable No
Whitelist 00:04.0
Whitelist 00:04.1
# Users must change this section to match the /dev/sdX devices to be
# exported as iSCSI LUNs. The devices are accessed using Linux AIO.
# The format is:
# AIO <file name> <bdev name> [<block size>]
# The file name is the backing device
# The bdev name can be referenced from elsewhere in the configuration file.
# Block size may be omitted to automatically detect the block size of a disk.
[AIO]
AIO /dev/sdb AIO0
AIO /dev/sdc AIO1
AIO /tmp/myfile AIO2 4096
# PMDK libpmemblk-based block device
[Pmem]
# Syntax:
# Blk <pmemblk pool file name> <bdev name>
Blk /path/to/pmem-pool Pmem0
# The Split virtual block device slices block devices into multiple smaller bdevs.
[Split]
# Syntax:
# Split <bdev> <count> [<size_in_megabytes>]
# Split Malloc1 into two equally-sized portions, Malloc1p0 and Malloc1p1
Split Malloc1 2
# Split Malloc2 into eight 1-megabyte portions, Malloc2p0 ... Malloc2p7,
# leaving the rest of the device inaccessible
Split Malloc2 8 1
# The RAID virtual block device based on pre-configured block device.
[RAID1]
# Unique name of this RAID device.
Name Raid0
# RAID level, only raid level 0 is supported.
RaidLevel 0
# Strip size in KB.
StripSize 64
# Number of pre-configured bdevs.
NumDevices 2
# Pre-configured bdevs name with Nvme.
#Devices Nvme0n1 Nvme1n1
# Pre-configured bdevs name with Malloc.
Devices Malloc3 Malloc4
# Pre-configured bdevs name with AIO.
#Devices AIO0 AIO1
# Users should change the TargetNode section(s) below to match the
# desired iSCSI target node configuration.
# TargetName, Mapping, LUN0 are minimum required
[TargetNode1]
TargetName disk1
TargetAlias "Data Disk1"
Mapping PortalGroup1 InitiatorGroup1
AuthMethod Auto
AuthGroup AuthGroup1
# Enable header and data digest
# UseDigest Header Data
UseDigest Auto
# Use the first malloc target
LUN0 Malloc0
# Using the first AIO target
LUN1 AIO0
# Using the second storage target
LUN2 AIO1
# Using the third storage target
LUN3 AIO2
QueueDepth 128
[TargetNode2]
TargetName disk2
TargetAlias "Data Disk2"
Mapping PortalGroup1 InitiatorGroup1
AuthMethod Auto
AuthGroup AuthGroup1
UseDigest Auto
LUN0 Nvme0n1
LUN1 Raid0
QueueDepth 32

288
etc/spdk/nvmf.conf.in Normal file
View File

@ -0,0 +1,288 @@
# NVMf Target Configuration File
#
# Please write all parameters using ASCII.
# The parameter must be quoted if it includes whitespace.
#
# Configuration syntax:
# Leading whitespace is ignored.
# Lines starting with '#' are comments.
# Lines ending with '\' are concatenated with the next line.
# Bracketed ([]) names define sections
[Global]
# Tracepoint group mask for spdk trace buffers
# Default: 0x0 (all tracepoint groups disabled)
# Set to 0xFFFF to enable all tracepoint groups.
#TpointGroupMask 0x0
# PciBlacklist and PciWhitelist cannot be used at the same time
#PciBlacklist 0000:01:00.0
#PciBlacklist 0000:02:00.0
#PciWhitelist 0000:03:00.0
#PciWhitelist 0000:04:00.0
# Users may activate entries in this section to override default values for
# global parameters in the block device (bdev) subsystem.
[Bdev]
# Number of spdk_bdev_io structures allocated in the global bdev subsystem pool.
#BdevIoPoolSize 65536
# Maximum number of spdk_bdev_io structures to cache per thread.
#BdevIoCacheSize 256
# Users may change this section to create a different number or size of
# malloc LUNs.
# This will generate 8 LUNs with a malloc-allocated backend.
# Each LUN will be size 64MB and these will be named
# Malloc0 through Malloc7. Not all LUNs defined here are necessarily
# used below.
[Malloc]
NumberOfLuns 8
LunSizeInMB 64
# Users must change this section to match the /dev/sdX devices to be
# exported as iSCSI LUNs. The devices are accessed using Linux AIO.
# The format is:
# AIO <file name> <bdev name> [<block size>]
# The file name is the backing device
# The bdev name can be referenced from elsewhere in the configuration file.
# Block size may be omitted to automatically detect the block size of a disk.
[AIO]
AIO /dev/sdb AIO0
AIO /dev/sdc AIO1
AIO /tmp/myfile AIO2 4096
# PMDK libpmemblk-based block device
[Pmem]
# Syntax:
# Blk <pmemblk pool file name> <bdev name>
Blk /path/to/pmem-pool Pmem0
# Define NVMf protocol global options
[Nvmf]
# Set how often the acceptor polls for incoming connections. The acceptor is also
# responsible for polling existing connections that have gone idle. 0 means continuously
# poll. Units in microseconds.
AcceptorPollRate 10000
# Set how connections are scheduled among multiple threads. Currently supported string values are
# "RoundRobin", "Host", "Transport".
# RoundRobin: Schedule the connection in a round-robin manner.
# Host: Schedule the connection according to host IP.
# Transport: Schedule the connection according to the transport characteristics.
# For example, for TCP transport, we can schedule the connection according to socket NAPI_ID info.
# The connection which has the same socket NAPI_ID info will be grouped in the same polling group.
ConnectionScheduler RoundRobin
# One valid transport type must be set in each [Transport].
# The first is the case of RDMA transport and the second is the case of TCP transport.
[Transport]
# Set RDMA transport type.
Type RDMA
# Set the maximum number of outstanding I/O per queue.
#MaxQueueDepth 128
# Set the maximum number of submission and completion queues per session.
# Setting this to '8', for example, allows for 8 submission and 8 completion queues
# per session.
#MaxQueuesPerSession 4
# Set the maximum in-capsule data size. Must be a multiple of 16.
# 0 is a valid choice.
#InCapsuleDataSize 4096
# Set the maximum I/O size. Must be a multiple of 4096.
#MaxIOSize 131072
# Set the I/O unit size, and this value should not be larger than MaxIOSize
#IOUnitSize 131072
# Set the maximum number of IO for admin queue
#MaxAQDepth 32
# Set the number of pooled data buffers available to the transport
# It is used to provide the read/write data buffers for the qpairs on this transport.
#NumSharedBuffers 512
# Set the number of shared buffers to be cached per poll group
#BufCacheSize 32
# Set the maximum number of outstanding I/O per shared receive queue. Relevant only for RDMA transport.
#MaxSRQDepth 4096
[Transport]
# Set TCP transport type.
Type TCP
# Set the maximum number of outstanding I/O per queue.
#MaxQueueDepth 128
# Set the maximum number of submission and completion queues per session.
# Setting this to '8', for example, allows for 8 submission and 8 completion queues
# per session.
#MaxQueuesPerSession 4
# Set the maximum in-capsule data size. Must be a multiple of 16.
# 0 is a valid choice.
#InCapsuleDataSize 4096
# Set the maximum I/O size. Must be a multiple of 4096.
#MaxIOSize 131072
# Set the I/O unit size, and this value should not be larger than MaxIOSize
#IOUnitSize 131072
# Set the maximum number of IO for admin queue
#MaxAQDepth 32
# Set the number of pooled data buffers available to the transport
# It is used to provide the read/write data buffers for the qpairs on this transport.
#NumSharedBuffers 512
# Set the number of shared buffers to be cached per poll group
#BufCacheSize 32
# Set whether to use the C2H Success optimization, only used for TCP transport.
# C2HSuccess true
# Define FC transport
#[Transport]
# Set FC transport type.
#Type FC
# Set the maximum number of submission and completion queues per session.
# Setting this to '8', for example, allows for 8 submission and 8 completion queues
# per session.
#MaxQueuesPerSession 5
# Set the maximum number of outstanding I/O per queue.
#MaxQueueDepth 128
# Set the maximum I/O size. Must be a multiple of 4096.
#MaxIOSize 65536
[Nvme]
# NVMe Device Whitelist
# Users may specify which NVMe devices to claim by their transport id.
# See spdk_nvme_transport_id_parse() in spdk/nvme.h for the correct format.
# The second argument is the assigned name, which can be referenced from
# other sections in the configuration file. For NVMe devices, a namespace
# is automatically appended to each name in the format <YourName>nY, where
# Y is the NSID (starts at 1).
TransportID "trtype:PCIe traddr:0000:00:00.0" Nvme0
TransportID "trtype:PCIe traddr:0000:01:00.0" Nvme1
TransportID "trtype:PCIe traddr:0000:02:00.0" Nvme2
TransportID "trtype:PCIe traddr:0000:03:00.0" Nvme3
TransportID "trtype:RDMA adrfam:IPv4 traddr:192.168.100.8 trsvcid:4420 hostaddr:192.168.100.9 subnqn:nqn.2016-06.io.spdk:cnode1" Nvme4
TransportID "trtype:TCP adrfam:IPv4 traddr:192.168.100.3 trsvcid:4420 hostaddr:192.168.100.4 subnqn:nqn.2016-06.io.spdk:cnode2" Nvme5
# The number of attempts per I/O when an I/O fails. Do not include
# this key to get the default behavior.
RetryCount 4
# Timeout for each command, in microseconds. If 0, don't track timeouts.
TimeoutUsec 0
# Action to take on command time out. Only valid when TimeoutUsec is greater
# than 0. This may be 'Reset' to reset the controller, 'Abort' to abort
# the command, or 'None' to just print a message but do nothing.
# Admin command timeouts will always result in a reset.
ActionOnTimeout None
# Set how often the admin queue is polled for asynchronous events.
# Units in microseconds.
AdminPollRate 100000
# Set how often I/O queues are polled for completions.
# Units in microseconds.
IOPollRate 0
# Handling of hotplug (runtime insert and remove) events is disabled by
# default. Set this to Yes to enable it.
# Default: No
HotplugEnable No
# The Split virtual block device slices block devices into multiple smaller bdevs.
[Split]
# Syntax:
# Split <bdev> <count> [<size_in_megabytes>]
# Split Malloc2 into two equally-sized portions, Malloc2p0 and Malloc2p1
Split Malloc2 2
# Split Malloc3 into eight 1-megabyte portions, Malloc3p0 ... Malloc3p7,
# leaving the rest of the device inaccessible
Split Malloc3 8 1
# The RAID virtual block device based on pre-configured block device.
[RAID1]
# Unique name of this RAID device.
Name Raid0
# RAID level, only raid level 0 is supported.
RaidLevel 0
# Strip size in KB.
StripSize 64
# Number of pre-configured bdevs.
NumDevices 2
# Names of the pre-configured NVMe bdevs.
Devices Nvme2n1 Nvme3n1
# Names of the pre-configured Malloc bdevs.
#Devices Malloc0 Malloc1
# Names of the pre-configured AIO bdevs.
#Devices AIO0 AIO1
# Define an NVMf Subsystem.
# - NQN is required and must be unique.
# - Between 1 and 255 Listen directives are allowed. This defines
# the addresses on which new connections may be accepted. The format
# is Listen <type> <address> where type can be RDMA, TCP or FC.
# - Between 0 and 255 Host directives are allowed. This defines the
# NQNs of allowed hosts. If no Host directive is specified, all hosts
# are allowed to connect.
# - Between 0 and 255 Namespace directives are allowed. These define the
# namespaces accessible from this subsystem.
# The user must specify MaxNamespaces to allow for adding namespaces
# during active connection. By default it is 0
# The user must specify a bdev name for each namespace, and may optionally
# specify a namespace ID. If nsid is omitted, the namespace will be
# assigned the next available NSID. The NSID must be unique within the
# subsystem. An optional namespace UUID may also be specified.
# Syntax:
# Namespace <bdev_name> [<nsid> [<uuid>]]
# Namespaces backed by physical NVMe devices
[Subsystem1]
NQN nqn.2016-06.io.spdk:cnode1
Listen TCP 15.15.15.2:4420
AllowAnyHost No
Host nqn.2016-06.io.spdk:init
SN SPDK00000000000001
MN SPDK_Controller1
MaxNamespaces 20
Namespace Nvme0n1 1
Namespace Nvme1n1 2
Namespace Raid0
# Multiple subsystems are allowed.
# Namespaces backed by non-NVMe devices
[Subsystem2]
NQN nqn.2016-06.io.spdk:cnode2
Listen RDMA 192.168.2.21:4420
AllowAnyHost No
Host nqn.2016-06.io.spdk:init
SN SPDK00000000000002
MN SPDK_Controller2
Namespace Malloc0
Namespace Malloc1
Namespace AIO0
Namespace AIO1
# Subsystem with FC listen address directive
# - Listen option allows subsystem access on specific FC ports identified
# by WWNN-WWPN. Each subsystem allows 0 - 255 listen directives.
# If no listen directive is provided, subsystem can be accessed on all
# available FC links permitted by FC zoning rules.
#
# [Subsystem3]
#NQN nqn.2016-06.io.spdk:cnode3
#Listen FC "nn-0x20000090fac7ca5c:pn-0x10000090fac7ca5c"
#AllowAnyHost Yes
#SN SPDK00000000000003
#Namespace Malloc4
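
The [Transport] comments in this file state three numeric rules: InCapsuleDataSize must be a multiple of 16, MaxIOSize must be a multiple of 4096, and IOUnitSize should not be larger than MaxIOSize. A minimal sketch of checking them before writing a config follows. `check_transport_options` is a hypothetical helper written for illustration, not part of SPDK, and it checks only the rules stated in the comments.

```python
def check_transport_options(opts):
    """Return a list of problems found in a dict of [Transport] options.

    Hypothetical helper, not an SPDK API. Only the rules stated in the
    sample config comments are checked; keys absent from opts are skipped.
    """
    problems = []

    icd = opts.get("InCapsuleDataSize")
    if icd is not None and icd % 16 != 0:
        problems.append("InCapsuleDataSize must be a multiple of 16")

    max_io = opts.get("MaxIOSize")
    if max_io is not None and max_io % 4096 != 0:
        problems.append("MaxIOSize must be a multiple of 4096")

    io_unit = opts.get("IOUnitSize")
    if io_unit is not None and max_io is not None and io_unit > max_io:
        problems.append("IOUnitSize should not be larger than MaxIOSize")

    return problems


# The commented-out values from the sample file pass all three checks.
sample = {"InCapsuleDataSize": 4096, "MaxIOSize": 131072, "IOUnitSize": 131072}
print(check_transport_options(sample))
```

Whether SPDK itself rejects or adjusts out-of-range values is not stated in the file, so the sketch only reports problems and leaves the decision to the caller.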

etc/spdk/vhost.conf.in Normal file

@ -0,0 +1,187 @@
# SPDK vhost configuration file
#
# Please write all parameters using ASCII.
# The parameter must be quoted if it includes whitespace.
# Configuration syntax:
# Leading whitespace is ignored.
# Lines starting with '#' are comments.
# Lines ending with '\' are concatenated with the next line.
# Bracketed ([]) names define sections
[Global]
# Instance ID for multi-process support
# Default: 0
#InstanceID 0
# Disable PCI access. PCI is enabled by default. Setting this
# option will hide any PCI device from all SPDK modules, making
# SPDK act as if they don't exist.
#NoPci Yes
# Tracepoint group mask for spdk trace buffers
# Default: 0x0 (all tracepoint groups disabled)
# Set to 0xFFFF to enable all tracepoint groups.
#TpointGroupMask 0x0
# Users may activate entries in this section to override default values for
# global parameters in the block device (bdev) subsystem.
[Bdev]
# Number of spdk_bdev_io structures allocated in the global bdev subsystem pool.
#BdevIoPoolSize 65536
# Maximum number of spdk_bdev_io structures to cache per thread.
#BdevIoCacheSize 256
# Users may not want to use offload even if it is available.
# Users can use offload by specifying "Enable Yes" in this section
# if it is available.
# Users may use the whitelist to initialize specified devices. IDs
# use BUS:DEVICE.FUNCTION to identify each Ioat channel.
[Ioat]
Enable No
#Whitelist 00:04.0
#Whitelist 00:04.1
# Users must change this section to match the /dev/sdX devices to be
# exported as vhost scsi drives. The devices are accessed using Linux AIO.
[AIO]
#AIO /dev/sdb AIO0
#AIO /dev/sdc AIO1
# PMDK libpmemblk-based block device
[Pmem]
# Syntax:
# Blk <pmemblk pool file name> <bdev name>
Blk /path/to/pmem-pool Pmem0
# Users may change this section to create a different number or size of
# malloc LUNs.
# If the system has a hardware DMA engine, it can use an IOAT
# (i.e. Crystal Beach DMA) channel to do the copy instead of memcpy
# by specifying "Enable Yes" in [Ioat] section.
# Offload is disabled by default even if it is available.
[Malloc]
# Number of Malloc targets
NumberOfLuns 3
# Malloc targets are 128M
LunSizeInMB 128
# Block size. Default is 512 bytes.
BlockSize 4096
# NVMe configuration options
[Nvme]
# NVMe Device Whitelist
# Users may specify which NVMe devices to claim by their transport id.
# See spdk_nvme_transport_id_parse() in spdk/nvme.h for the correct format.
# The second argument is the assigned name, which can be referenced from
# other sections in the configuration file. For NVMe devices, a namespace
# is automatically appended to each name in the format <YourName>nY, where
# Y is the NSID (starts at 1).
TransportID "trtype:PCIe traddr:0000:00:00.0" Nvme0
TransportID "trtype:PCIe traddr:0000:01:00.0" Nvme1
# The number of attempts per I/O when an I/O fails. Do not include
# this key to get the default behavior.
RetryCount 4
# Timeout for each command, in microseconds. If 0, don't track timeouts.
TimeoutUsec 0
# Action to take on command time out. Only valid when TimeoutUsec is greater
# than 0. This may be 'Reset' to reset the controller, 'Abort' to abort
# the command, or 'None' to just print a message but do nothing.
# Admin command timeouts will always result in a reset.
ActionOnTimeout None
# Set how often the admin queue is polled for asynchronous events.
# Units in microseconds.
AdminPollRate 100000
# Set how often I/O queues are polled for completions.
# Units in microseconds.
IOPollRate 0
# The Split virtual block device slices block devices into multiple smaller bdevs.
[Split]
# Syntax:
# Split <bdev> <count> [<size_in_megabytes>]
#
# Split Nvme1n1 into two equally-sized portions, Nvme1n1p0 and Nvme1n1p1
#Split Nvme1n1 2
# Split Malloc2 into eight 1-megabyte portions, Malloc2p0 ... Malloc2p7,
# leaving the rest of the device inaccessible
#Split Malloc2 8 1
# The RAID virtual block device based on pre-configured block device.
[RAID1]
# Unique name of this RAID device.
Name Raid0
# RAID level, only raid level 0 is supported.
RaidLevel 0
# Strip size in KB.
StripSize 64
# Number of pre-configured bdevs.
NumDevices 2
# Names of the pre-configured NVMe bdevs.
#Devices Nvme0n1 Nvme1n1
# Names of the pre-configured Malloc bdevs.
Devices Malloc1 Malloc2
# Names of the pre-configured AIO bdevs.
#Devices AIO0 AIO1
# Vhost scsi controller configuration
# Users should change the VhostScsi section(s) below to match the desired
# vhost configuration.
# Name is the minimum required parameter.
[VhostScsi0]
# Define name for controller
Name vhost.0
# Assign devices from backend
# Use the first malloc device
Target 0 Malloc0
# Use the first AIO device
#Target 1 AIO0
# Use the first Nvme device
#Target 2 Nvme0n1
# Use the third partition from second Nvme device
#Target 3 Nvme1n1p2
# Start the poller for this vhost controller on one of the cores in
# this cpumask. By default, if not specified, any core in the
# SPDK process will be used.
#Cpumask 0x1
#[VhostScsi1]
# Name vhost.1
# Target 0 AIO1
# Cpumask 0x1
#[VhostBlk0]
# Define name for controller
#Name vhost.2
# Use first partition from the third Malloc device
#Dev Malloc2p0
# Put controller in read-only mode
#ReadOnly no
# Start the poller for this vhost controller on one of the cores in
# this cpumask. By default, if not specified, any core in the
# SPDK process will be used.
#Cpumask 0x1
#[VhostBlk1]
# Define name for controller
#Name vhost.2
# Use the device named Raid0
#Dev Raid0
#[VhostNvme0]
# Define name for controller
#Name vhost.0
#NumberOfQueues 2
# Use first partition from the first NVMe device
#Namespace Nvme0n1p0
# Use second partition from the first NVMe device
#Namespace Nvme0n1p1
# Start the poller for this vhost controller on one of the cores in
# this cpumask. By default, if not specified, any core in the
# SPDK process will be used.
#Cpumask 0x1
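
The [Split] comments describe how `Split <bdev> <count> [<size_in_megabytes>]` names its pieces: the base name followed by `p0`, `p1`, and so on, with any leftover space inaccessible when an explicit size is given. A rough sketch of that naming and sizing rule is shown below. `split_bdev` is a hypothetical model for illustration, not SPDK code; the whole-megabyte division for the equal-size case and the error on oversized splits are assumptions, since the file does not say how SPDK rounds or validates.

```python
def split_bdev(base_name, base_size_mb, count, size_mb=None):
    """Model 'Split <bdev> <count> [<size_in_megabytes>]'.

    Returns (pieces, unused_mb), where pieces is a list of (name, size_mb).
    Hypothetical helper, not an SPDK API.
    """
    if size_mb is None:
        # Assumption: equal portions, rounded down to whole megabytes.
        size_mb = base_size_mb // count
    if count * size_mb > base_size_mb:
        # Assumption: treat splits that do not fit as an error.
        raise ValueError("splits do not fit in the base bdev")
    pieces = [(f"{base_name}p{i}", size_mb) for i in range(count)]
    unused_mb = base_size_mb - count * size_mb
    return pieces, unused_mb


# With the 128 MB Malloc LUNs from this file, "Split Malloc2 8 1" yields
# Malloc2p0 ... Malloc2p7 at 1 MB each and leaves 120 MB inaccessible.
pieces, unused = split_bdev("Malloc2", 128, 8, 1)
print([name for name, _ in pieces], unused)
```

The same naming explains why `Nvme1n1p2` in the VhostScsi section refers to the third partition of the second NVMe device: both the split index and the controller index start at 0, while the namespace ID starts at 1.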

@ -34,11 +34,7 @@
SPDK_ROOT_DIR := $(abspath $(CURDIR)/..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
DIRS-y += accel bdev blob ioat nvme sock vmd nvmf
ifeq ($(OS),Linux)
DIRS-$(CONFIG_VHOST) += interrupt_tgt
endif
DIRS-y += bdev blob ioat nvme sock vmd
.PHONY: all clean $(DIRS-y)

@ -1 +0,0 @@
accel_perf

@ -1,44 +0,0 @@
#
# BSD LICENSE
#
# Copyright (c) Intel Corporation.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
# * Neither the name of Intel Corporation nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
APP = accel_perf
C_SRCS := accel_perf.c
SPDK_LIB_LIST = $(ACCEL_MODULES_LIST) event_accel
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

File diff suppressed because it is too large

@ -36,10 +36,14 @@ SPDK_ROOT_DIR := $(abspath $(CURDIR)/../../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
FIO_PLUGIN := spdk_bdev
APP := fio_plugin
C_SRCS = fio_plugin.c
CFLAGS += -I$(CONFIG_FIO_SOURCE_DIR)
LDFLAGS += -shared -rdynamic -Wl,-z,nodelete
SPDK_LIB_LIST = $(ALL_MODULES_LIST) event_bdev
SPDK_LIB_LIST = $(ALL_MODULES_LIST)
SPDK_LIB_LIST += thread util bdev conf copy rpc jsonrpc json log sock trace notify
SPDK_LIB_LIST += event event_bdev event_copy event_vmd
include $(SPDK_ROOT_DIR)/mk/spdk.fio.mk
include $(SPDK_ROOT_DIR)/mk/spdk.app.mk

@ -1,9 +1,3 @@
# Introduction
This directory contains a plug-in module for fio to enable use
with SPDK. Fio is free software published under version 2 of
the GPL license.
# Compiling fio
Clone the fio source repository from https://github.com/axboe/fio
@ -45,11 +39,14 @@ To use the SPDK fio plugin with fio, specify the plugin binary using LD_PRELOAD
fio and set ioengine=spdk_bdev in the fio configuration file (see example_config.fio in the same
directory as this README).
LD_PRELOAD=<path to spdk repo>/build/fio/spdk_bdev fio
LD_PRELOAD=<path to spdk repo>/examples/bdev/fio_plugin/fio_plugin fio
The fio configuration file must contain one new parameter:
spdk_json_conf=./examples/bdev/fio_plugin/bdev.json
spdk_conf=./examples/bdev/fio_plugin/bdev.conf
This must point at an SPDK configuration file. There are a number of example configuration
files in the SPDK repository under etc/spdk.
You can specify which block device to run against by setting the filename parameter
to the block device name:
@ -60,10 +57,8 @@ Or for NVMe devices:
filename=Nvme0n1
fio by default forks a separate process for every job. It also supports just spawning a separate
thread in the same process for every job. The SPDK fio plugin is limited to this latter thread
usage model, so fio jobs must also specify thread=1 when using the SPDK fio plugin. The SPDK fio
plugin supports multiple threads - in this case, the "1" just means "use thread mode".
Currently the SPDK fio plugin is limited to the thread usage model, so fio jobs must also specify thread=1
when using the SPDK fio plugin.
fio also currently has a race condition on shutdown if dynamically loading the ioengine by specifying the
engine's full path via the ioengine parameter - LD_PRELOAD is recommended to avoid this race condition.
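
The settings this README calls out (ioengine=spdk_bdev, the SPDK configuration file parameter, thread=1, and filename set to a bdev name) can be gathered into one job file. A minimal sketch that generates such a file is shown below. `make_fio_job` is a hypothetical helper, the job section name `job0` is arbitrary, and the configuration key is left as a parameter because this diff shows both `spdk_conf` and `spdk_json_conf`; use whichever your SPDK version expects. This is not the example_config.fio shipped with SPDK, and workload options such as rw, bs, and iodepth are deliberately left out.

```python
def make_fio_job(bdev_name, conf_path, conf_key="spdk_conf"):
    """Build the text of a fio job file for the SPDK bdev plugin.

    Hypothetical helper for illustration. Only the options named in the
    README are emitted; add workload options (rw, bs, ...) as needed.
    """
    lines = [
        "[global]",
        "ioengine=spdk_bdev",
        f"{conf_key}={conf_path}",
        # The plugin supports only fio's thread model, so this is required.
        "thread=1",
        "",
        "[job0]",
        f"filename={bdev_name}",
    ]
    return "\n".join(lines) + "\n"


print(make_fio_job("Nvme0n1", "./examples/bdev/fio_plugin/bdev.conf"))
```

The resulting file would then be passed to fio with the plugin preloaded via LD_PRELOAD, as described above.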

Some files were not shown because too many files have changed in this diff