Always start bdev pollers on the calling core.
This removes the lcore concept from the bdev poller abstraction and
simplifies the job of spdk_bdev_initialize() callers providing their own
poller and event implementations.
All callers except the NVMe bdev hotplug poller already used the current
core as the parameter. The NVMe HotplugPollCore option was undocumented
and unused in any of the tests or example configuration files, so it
should be safe to remove.
Change-Id: I93b466e1e58901b8785c40cbe296fa46c157850f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/382857
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
At very high queue depths, bdev modules may not have enough
internal resources to track all of the incoming I/O. For example,
we allocate a finite number of nvme_request objects per allocated
queue pair. Currently if these resources are exhausted, the
bdev module will return failure (with no indication why) which
gets propagated all the way back to the application.
So instead, add SPDK_BDEV_IO_STATUS_NOMEM to allow bdev modules
to indicate this type of failure. Also add handling for this
status type in the generic bdev layer, involving queuing these
I/O for later retry after other I/O on the failing channel have
completed.
This does place an expectation on the bdev module that these
internal resources are allocated per io_channel. Otherwise we
cannot guarantee forward progress solely on reception of
completions. For example, without this guarantee, a bdev
module could theoretically return ENOMEM even if there were
no I/O oustanding for that io_channel. nvme, aio, rbd,
virtio and null drivers comply with this expectation already.
malloc only complies though when not using copy offload.
This patch will fix malloc w/ copy engine to at least
return ENOMEM when no copy descriptors are available. If the
condition above occurs, I/O waiting for resources will get
failed as part of a subsequent reset which matches the
behavior it has today.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iea7cd51a611af8abe882794d0b2361fdbb74e84e
Reviewed-on: https://review.gerrithub.io/378853
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This is required for upcoming UNMAP
implementation in bdev_virtio.
While here, also added documentation for
spdk_bdev_io_get_buf().
Change-Id: Ia769ee9b8b132f31208ae66598b29a1c9ed37312
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/379721
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Since all direct bdev_io types have the same layout,
there is no need to keep them differentiated.
Change-Id: If8bb85e43c9922c0ebfc39837e3a45006e508b56
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/377686
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The bdev modules now take all read, write, unmap, and flush requests in
terms of blocks rather than bytes.
The public bdev APIs still accept offset and length in bytes for now.
Change-Id: I57f0955d52272f57755f0ff4dbc56721fdc2ef51
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/376037
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
This matches the name to the behavior and prepares for addition of a new
log macro for "info" log level.
Change-Id: I94ccd49face4309d3368e399528776ab140748c4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/375833
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This makes it consistent it with other parts of this structure.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie36488813b53ce20663c50a5c9f049c4c9723d3a
Reviewed-on: https://review.gerrithub.io/375494
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I0e8c34237acc88fb51bac56c2f99df52ec199f9b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373835
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The construct_aio_bdev RPC still accepts "fname" for backwards
compatibility.
Change-Id: Ibf44f5f3667c6de4b827f7f3f8787aff0a6c4fc9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373834
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ac636c18d7e71e0184d4f67ec54a36217a11db0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373833
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Disk size in bytes is only used within create_aio_disk(), so we don't
need to save it in the struct file_disk context structure.
Change-Id: I63d230448a67c2b49c57eac2b4d44dce27c303cf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373832
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a file-backed AIO bdev to test it out.
Change-Id: Ifdf206bbdf6cae9379fdc02c80755e96a7198bce
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/373673
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change the poller to tolerate channel deletion within
an I/O completion callback. This won't happen today
because channel deletion is always deferred, but
prepare for that case.
It turns out this is simpler anyway.
Change-Id: Ibff23d84fe14247849e95cebc3c80369812bdd6c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/370525
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We still will sort the bdev_module list so that modules
with an examine() callback are initialized first. This ensures
they have a chance to initialize before later modules start
registering physical block devices.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I792cfb41b0abe030fe2486a2c872cbf329735932
Reviewed-on: https://review.gerrithub.io/369486
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Module initializaiton was made asynchronous recently
to support bdev modules like gpt which need to do
asynchronous I/O. But all modules now do any
asynchronous I/O in their examine() routines, and
init functions only do very basic operations to be
ready to handle examine() callbacks.
So simplify the bdev code and modules to go back to
a synchronous init procedure.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idb16156796ad7511d00f465d7a2db9acda6315b6
Reviewed-on: https://review.gerrithub.io/369485
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This includes file names, functions, #defines, etc.
There are still a few uses of "blockdev" outside of
include/ and lib/ - these can be handled later.
This preps for a future patch to consolidate vbdev
modules and bdev modules into just bdev modules.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I70e575709ae1b0a116b08515fd38ae793de05377
Reviewed-on: https://review.gerrithub.io/369325
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic3051b63942770e45be22af0ae03a78a7c543f81
Reviewed-on: https://review.gerrithub.io/368597
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Add a name parameter to the MODULE_REGISTER macros, and then
modify each bdev module to pass a string for its name.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If878617ce3c3eacfcf5df44ed6f194f11c66f78f
Reviewed-on: https://review.gerrithub.io/368596
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will enable asynchronous request handling in a future patch, and it
also removes the need for the RPC handlers to know about request id and
the JSON-RPC rules about notification-only requests.
Change-Id: I25aaa8e48bff8d5594ffcccecb61842b1e31ec3c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/368225
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Abstract these through the bdev API to break this
dependency on the event framework.
Change-Id: I108505bf27e94b2985f53d0a4dc0b847ae264d25
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/366340
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Dynamically allocate bdev names to remove the arbitrary 16-character
name length limit.
All of the existing product_names are constant strings, so those can
just use string literals instead of a copy per bdev.
Change-Id: I3280da67a4fcf2e4ec8ee8193362ca1b96a9c0cb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/363601
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I93684a004e2ae276734edbb4767b5ba1bac3dd48
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/362111
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This wasn't used anywhere and we currently believe there
are superior software-only techniques for controlling
quality of service.
Change-Id: Icdadd5870ed0629b338c307d2619bbc242c3e7a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362065
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is no longer used anywhere. For the places where we previously
used it, we've since found alternate solutions that do not
require it.
Change-Id: I738a80b95ef50348ce1c14969a3812b0a625b3fd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362064
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We plan to use these buffers for more than just reads.
Change-Id: I8fa6cb432a6cfe4406fbf240cd3aa2ae4ab5f3d5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The user can get there via the bdev, so this didn't
have a purpose.
Change-Id: I7f85bb71d5ee238d37ba3624d0ac68a161c95e49
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Instead, they register some internal structure of
their choosing.
Change-Id: Id1f8c563d0a2c6f1066d741f86b8aa6fe09b6319
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Some calls were passing bdev->ctxt, some calls just
bdev. In most of our implementations those are the
same pointer, but they aren't necessarily.
Change-Id: If2d19f9eef059aded10a917ffb270c1dc4a8dc41
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The user now must choose the name for each AIO bdev. This
provides consistency for names across restarts.
Change-Id: I13ced1d02bb28c51d314512d60f739499b0c7d8d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
SPDK_COUNTOF works like sizeof, except it returns the number of elements
in an array instead of the number of bytes.
Change-Id: I38ff4dd3485ed9b630cc5660ff84851d0031911f
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spdk_poller_register() function provides a way to pass an event to
call once the poller is registered, but it is always NULL in the current
code base.
Change-Id: I459bf40ae4d050589577d113b7984f1563aaa9cc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is consistent with the other internal-only API headers.
Change-Id: I2c4748977d38a6c173311d26197d6273c168da7d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The SPDK_TRACELOG macro depends on a CONFIG setting (DEBUG), so it
should not be part of the public API.
Create a new include/spdk_internal directory for headers that should
only be used within SPDK, not exported for public use.
Change-Id: I39b90ce57da3270e735ba32210c4b3a3468c460b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When creating a bdev via the RPC interface, there was no way to know
what name it was assigned (other than predicting it based on the
numbering scheme). Change all of the relevant RPC interfaces to return
an array of bdev names so they can be used to construct LUNs/subsystems
dynamically in scripts.
Change-Id: I8e03349bdc81afd3d69247396a20df5fcf050f40
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Only call spdk_bdev_io_complete() where IO error is seen.
Change-Id: I829e4c589dbcb47017e810035837a4c61c3428f9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handle only one vector.
Change-Id: Ie2c1f6853bfd54ebd8039df9a0305854ca3297b9
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Enforce exactly one trailing \n, and fix all of the existing cases.
Change-Id: I6218e4700e90aeb647eaee78089530c79993c8c8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch enables vector operation for bdev drivers aio, malloc and
nvme.
The rbd driver still handle only one vector.
Change-Id: I5f401527c2717011ecc21116363bbb722e804112
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
off_t is problematic for use as a file/block offset: it is signed, and
on 32-bit platforms, it can be 32 bits (depending on the settings of
_FILE_OFFSET_BITS and _LARGEFILE_SOURCE).
The blockdev layer already uses uint64_t to represent offsets; replace
the blockdev module uses of off_t in internal functions with uint64_t
to match.
Change-Id: I77a2e594572c56f1cd8a7a080f985ea5b27c35f3
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Some subsystems may wish to create unique I/O channels
which are not shared across all users of the same I/O
device on the same thread.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3ade3675d57338cf85b6a301285e6f392bd6cd2e
bdev and copy modules no longer have check_io functions
now - all polling is done via pollers registered when
I/O channels are created.
Other default resources are also removed - for example,
a qpair is no longer allocated and assigned per bdev
exposed by the nvme driver - the qpairs are only allocated
via I/O channels. Similar principle also applies to the
aio driver.
ioat channels are no longer allocated and assigned to
lcores - they are dynamically allocated and assigned
to I/O channels when needed. If no ioat channel is
available for an I/O channel, the copy engine framework
will revert to using memcpy/memset instead.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99435a75fe792a2b91ab08f25962dfd407d6402f