Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Icba20351c9ca76397393064b41013c527084853e
Reviewed-on: https://review.gerrithub.io/396385
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
For thin provisioned blobs we allocate DMA memory
required for copying a cluster from the backing device.
When the cluster size is too big, the DMA allocation may
fail silently (reported only as an I/O error).
Use PRIu32 to print out the cluster size, and while
here, fix two other places that were using %d to print
cluster size instead of PRIu32.
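
As an illustration of the PRIu32 change described above, here is a minimal
sketch. The helper name and the message text are hypothetical and are not
taken from the SPDK sources; only the PRIu32 macro from <inttypes.h> is
standard.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an SPDK function): format a diagnostic about an
 * oversized cluster. PRIu32 expands to the correct printf conversion for
 * uint32_t on every platform, whereas %d expects a signed int.
 */
static int
format_cluster_size_error(char *buf, size_t buf_len, uint32_t cluster_sz)
{
    assert(buf != NULL);
    return snprintf(buf, buf_len, "Cluster size %" PRIu32 " is too big",
                    cluster_sz);
}
```

With %d, a cluster size above INT_MAX would typically be printed as a
negative number; PRIu32 avoids that.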
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I098b1a58aee2f0d3f4ead7aa326ecdb63a5b53d8
Reviewed-on: https://review.gerrithub.io/397563
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I3c7d2096f549a88b4a9884c0026d15d3bcd8dc67
Reviewed-on: https://review.gerrithub.io/396387
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ibc9609ad36188006e9454e5c799bccd8a92d7991
Reviewed-on: https://review.gerrithub.io/391422
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be useful for backing thin provisioned
blobs in the future.
Change-Id: I78cf8cda39e8dff42da69b79ed460797d7494af1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/397043
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be needed for thin provisioning, since a write
I/O may result in needing to insert a cluster into the
blob and that write I/O may not have been performed
on the metadata thread.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I84b0cb6e7af87b1f9c6cab4e2c24fa26b12e2c06
Reviewed-on: https://review.gerrithub.io/396737
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This enables some code reuse for future patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I296a6c5c0915da4a77a1ab43e8f10a335b7d16d0
Reviewed-on: https://review.gerrithub.io/396736
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This will be used in the upcoming thin provisioning
patches. A thread that wishes to insert a newly
allocated cluster into the blob will send a message
to the metadata thread to call this function and,
if it succeeds, sync the blob's
metadata.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I26fcca235ea0d7a187b9fe559851290b6db13649
Reviewed-on: https://review.gerrithub.io/396711
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
For now, use this to add some assert() calls to ensure
per-blob metadata operations are only called from the
thread that initialized/loaded the blobstore.
Upcoming patches will utilize this for metadata updates
required due to cluster allocations on thin provisioned
blobs. In that case, the cluster allocations may not
always be done on the metadata thread - but we want
the metadata thread to actually do the metadata sync
operation to guard against races from allocations on
multiple threads in parallel.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa0adfe8b7e61ba770449d1e076126ecb9d7a556
Reviewed-on: https://review.gerrithub.io/396712
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Currently, there is no way to save a read-only blob to disk.
This patch modifies the behaviour so that read-only flags are applied after syncing the blob.
This is analogous to the resize, set xattr and remove xattr operations.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Iffed601c78cb83231bb20e7ef05b73847dc3c95a
Reviewed-on: https://review.gerrithub.io/394243
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The iovcnt value was set to 0 instead of being assigned from
the input argument.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I0f7e89871357f5db9a5a34c801176bc5c7870021
Reviewed-on: https://review.gerrithub.io/395959
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows a channel's request_set resources to be
used for queuing I/O requests. This is needed
for upcoming thin provisioning functionality,
where we must queue I/O requests that need to
allocate a cluster, if another cluster allocation
is in progress.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie8d3e799afc0b56bc95ba5ecab11253d8bc8608f
Reviewed-on: https://review.gerrithub.io/395037
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Now all operations (both single buffer and iov-based
payloads) which span a cluster boundary get split into
separate blob calls for each cluster.
This will simplify upcoming patches that need to do
special operations if a cluster needs to be allocated.
This code can now be added to just the single cluster
operations and not have to worry about splits. It
will also simplify the code that will eventually queue
requests which require a cluster allocation if an
allocation is already in progress.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I22a850e6e23e3b3be31183b28d01d57c163b25b1
Reviewed-on: https://review.gerrithub.io/395035
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
A future patch will queue an operation in
_spdk_blob_request_submit_op_single if the cluster is not allocated
and another allocation is already in progress. If the queueing fails
because of no channel resources, we want to fail the callback and
need to do this before creating the batch.
This causes a bit of code duplication, but in the end should make the
code easier to read.
While here, pass spdk_blob instead of spdk_blob_data to
_spdk_blob_request_submit_op_single. This simplifies a future patch
which will need the spdk_blob when queueing an operation. It also
incidentally makes it consistent with _spdk_blob_request_submit_op_split.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib0ec0e5138eac5bf208fcde6676cd2a77a1a663f
Reviewed-on: https://review.gerrithub.io/395196
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For I/O that does not span a cluster boundary, just issue
a single batch command to the underlying block device.
For I/O that does span a cluster boundary, issue a batch
command against the blob (not the block device)
for each cluster accessed by the I/O.
This is all in preparation for upcoming patches which
enable thin provisioning and hence cluster allocation
in the I/O path. It will simplify implementation of
the cluster allocation path since now that code only
needs to be concerned with a single allocation at once.
Splitting for readv/writev will be handled in a
later patch.
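
The splitting described above can be sketched roughly as follows. This is an
illustration only, not the blobstore code: the function name, the callback
type and the choice of pages as the unit are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: receives one piece that lies within a single cluster. */
typedef void (*piece_fn)(uint64_t offset, uint64_t length, void *ctx);

/*
 * Split the range [offset, offset + length) into pieces that never cross a
 * cluster boundary. All quantities are in pages. Returns the number of
 * pieces produced.
 */
static uint64_t
split_on_cluster_boundary(uint64_t offset, uint64_t length,
                          uint64_t pages_per_cluster, piece_fn fn, void *ctx)
{
    uint64_t pieces = 0;

    assert(pages_per_cluster > 0);

    while (length > 0) {
        /* Pages left before the end of the cluster containing 'offset'. */
        uint64_t room = pages_per_cluster - (offset % pages_per_cluster);
        uint64_t chunk = length < room ? length : room;

        if (fn != NULL) {
            fn(offset, chunk, ctx);
        }
        offset += chunk;
        length -= chunk;
        pieces++;
    }
    return pieces;
}
```

An I/O that fits in one cluster yields exactly one piece, which corresponds
to the single batch command case.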
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia2341abbda599dace3357c4eec06ab6602ef81a8
Reviewed-on: https://review.gerrithub.io/395027
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This breaks out the logic for building a batch for
non-iov operations to a separate function. Future
patches will do further modifications on this
new function - separating it out into two separate
functions, one for operations that span a cluster boundary
and one for those that do not.
No functional change here - this is just moving code
around to reduce size of upcoming patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id1e67e6305d7ba2317700e1477e12c749ebf664c
Reviewed-on: https://review.gerrithub.io/395026
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
These will be used to implement user request splitting
on cluster boundaries.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I29e00ae9555fdd8a149e92be3cf88a2e528f5c0e
Reviewed-on: https://review.gerrithub.io/395021
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This clarifies that the read/write/etc. operation is
being performed on the block device. This
clarification will be important in some upcoming
patches which will batch similar operations on a
blob (as part of splitting a user request into smaller
operations if it spans a cluster boundary).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic574804c15f769ee80c1b6f68ca3b77ec910f0f6
Reviewed-on: https://review.gerrithub.io/395017
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
These were unused in blobstore so remove them for now.
This reduces number of code changes in some upcoming
patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I02e04f7d84fff1d2a70f371d222425f373a5d4ff
Reviewed-on: https://review.gerrithub.io/395016
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prepares for some future blobstore.c code that will use these
bs_dev functions directly, making it possible to read from a different
device than the one in the current context.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I55b15544ea4b53475c4b72ee5c92ff1c0a2b8c88
Reviewed-on: https://review.gerrithub.io/393781
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I35779dc547e0c084086ec6d9bf44f86850cb7f05
Reviewed-on: https://review.gerrithub.io/393780
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iadb29a8ce8dcebfea68d4feeb5f3de1bb3124f16
Reviewed-on: https://review.gerrithub.io/392286
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ib3470fbac49e92308ed14e20ccde6655354f2580
Reviewed-on: https://review.gerrithub.io/389577
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch returns an error instead of hitting an
assert when the size is larger than the blobstore size.
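
A minimal sketch of the pattern, under stated assumptions: the struct, the
function name and the choice of -ENOSPC are hypothetical and are not taken
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a blobstore; only the field needed here. */
struct toy_blobstore {
    uint64_t total_clusters;
};

static int
toy_check_blob_size(const struct toy_blobstore *bs, uint64_t requested_clusters)
{
    /* A NULL blobstore is a programming error, so an assert is still fine. */
    assert(bs != NULL);

    /* An oversized request is a runtime condition: report it to the caller. */
    if (requested_clusters > bs->total_clusters) {
        return -ENOSPC; /* assumed error code */
    }
    return 0;
}
```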
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I16d12338e2b682c39bd33d507d57ea126501a0d7
Reviewed-on: https://review.gerrithub.io/392749
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Recovery code did not claim clusters taken by metadata.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If6726eddd22f4e1a3f9814b2348243155fb0fdb9
Reviewed-on: https://review.gerrithub.io/394173
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This only adds the option and metadata flags.
Actual functionality will be added in an upcoming commit.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I66015f48f34d4c7c64fce1831ebaed134098407c
Reviewed-on: https://review.gerrithub.io/390196
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch fixes an issue where the blobstore doesn't serialize flags
when there is also at least one extent or xattr.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I85d5031dc45df510cebe1acf4694ab62bca2e720
Reviewed-on: https://review.gerrithub.io/393770
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We need to make the number of channel operations configurable for
blobstore. Reason: in the iSCSI tests, if there is only one CPU core
there will be only one channel, so the read stress tests fail
because the blob channel needs more operations.
Select a value equal to the small buffer size (8192) of the
bdev layer, which resolves the iSCSI read issue. Since the bdev
layer currently allows only 8192 active bdev I/O requests for
reads, this value should be sufficient.
PS: The current solution is still not perfect. I think the precise
fix is to restrict sending I/Os to the blob when no channel
operations are available. The current code does retry I/O in
bdev, but it still fails the iSCSI high pressure test.
Change-Id: I211f7a89d144af2c96ad4cc1bd7ac8e94adc72e7
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/393115
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Only Makefiles for libraries that directly depend on DPDK (rather than
the SPDK env abstraction) should add $(ENV_CFLAGS).
Change-Id: Ifdf44d3ef8c42bbf7f20edd524b330d00658235b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/392818
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ibffb43e39b44e5f443d3dfbfa5b5d7dcac3243ef
Reviewed-on: https://review.gerrithub.io/391182
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ic2c23d16360b26359c2a32920b89f2f3a21a2a9a
Reviewed-on: https://review.gerrithub.io/391191
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This can be used for two purposes:
1) more quickly iterate the blob list, avoiding
metadata pages that are valid but not the first
page in the blob's metadata list
2) close races between delete and open operations -
now we can clear the bit in the blobid bit array
when the delete operation is in progress, ensuring
no one else can try to open the blob
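
Both uses can be illustrated with a toy bit array. This is a sketch only: the
64-entry mask, every name below and the -ENOENT error code are assumptions,
not the blobstore implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_BLOBS 64u

/* Toy stand-in for a used-blobid mask: bit i set means blob id i exists. */
struct toy_blobid_mask {
    uint64_t bits;
};

static void
toy_mask_set(struct toy_blobid_mask *m, uint32_t id)
{
    assert(id < TOY_MAX_BLOBS);
    m->bits |= UINT64_C(1) << id;
}

static void
toy_mask_clear(struct toy_blobid_mask *m, uint32_t id)
{
    assert(id < TOY_MAX_BLOBS);
    m->bits &= ~(UINT64_C(1) << id);
}

static bool
toy_mask_test(const struct toy_blobid_mask *m, uint32_t id)
{
    return id < TOY_MAX_BLOBS && ((m->bits >> id) & 1u) != 0;
}

/* Purpose 1: return the first used id >= start, or TOY_MAX_BLOBS if none. */
static uint32_t
toy_mask_next(const struct toy_blobid_mask *m, uint32_t start)
{
    for (uint32_t id = start; id < TOY_MAX_BLOBS; id++) {
        if (toy_mask_test(m, id)) {
            return id;
        }
    }
    return TOY_MAX_BLOBS;
}

/* Purpose 2: open is refused once delete has cleared the bit. */
static int
toy_open(const struct toy_blobid_mask *m, uint32_t id)
{
    return toy_mask_test(m, id) ? 0 : -ENOENT; /* assumed error code */
}
```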
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3904648fd6fa656cb98c9e17ea763ed5a84ef537
Reviewed-on: https://review.gerrithub.io/391695
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prepares for a future change where we need to use the
recovery path when loading pre-v3 on-disk formats, since the
older disk formats do not save a blobid mask.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia94d56450202f81373c3de94237eca2dfd96526c
Reviewed-on: https://review.gerrithub.io/391694
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This eliminates a bunch of code duplication. This also
fixes a couple of places where the ctx->bs was not being
freed in the load fail path.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie6b0a4a653b5c80edf14086801b75457852a4736
Reviewed-on: https://review.gerrithub.io/391693
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iba33c55f129c60fad2d58f5254dec5c54ed56805
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Reviewed-on: https://review.gerrithub.io/388217
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This ensures we do not end up with a racing close vs.
delete. If we decrement the ref up front, we could
start the close process (which may include persisting
metadata) and then also allow a delete operation to
start. It is safer to wait until the close operation
is done before decrementing the ref count, because then
it will eliminate this race condition (the delete op
would immediately fail).
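
The ordering can be sketched with a toy reference count. All names and the
-EBUSY error code below are assumptions made for illustration; the real close
path is asynchronous and persists metadata between the two steps.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_ref_blob {
    uint32_t open_ref;
};

/* Delete is refused while anyone still holds a reference. */
static int
toy_delete(const struct toy_ref_blob *b)
{
    assert(b != NULL);
    return b->open_ref > 0 ? -EBUSY : 0; /* assumed error code */
}

/* Start of close: metadata persistence begins, reference NOT yet dropped. */
static void
toy_close_start(struct toy_ref_blob *b)
{
    assert(b != NULL && b->open_ref > 0);
}

/* Completion of close: only now is the reference released. */
static void
toy_close_done(struct toy_ref_blob *b)
{
    assert(b != NULL && b->open_ref > 0);
    b->open_ref--;
}
```

A delete issued between toy_close_start() and toy_close_done() still sees a
non-zero reference count and fails immediately.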
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad7fd8320d2c9b56f3c4fce054bcb6271e19ad38
Reviewed-on: https://review.gerrithub.io/391493
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This ensures all blob-loading functionality goes through
the single spdk_bs_open_blob function which will simplify
some upcoming changes around managing global metadata
state from multiple threads.
This will also help prevent races where a delete operation
has started followed by an open on the blob that is
being deleted. Those specific changes will be in an
upcoming patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I17e995145ab23068b816b44c33483b0708f5f563
Reviewed-on: https://review.gerrithub.io/391486
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Found during unit testing for blobid_mask coming in a future
patch - the unit test will be added as part of that future
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iecdde6ba16c5af9caf59214f328ddc22aae71e94
Reviewed-on: https://review.gerrithub.io/391692
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Similar to previous change, the ** paradigm is a bit
problematic for asynchronous routines that could fail.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife7748280482356c4c51a796817b71cd7bc7e479
Reviewed-on: https://review.gerrithub.io/391483
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Using the ** paradigm is a bit problematic for asynchronous
routines that could fail. Until now we were inconsistent in
that some error paths would zero the pointer while others
did not. So make this just a plain pointer, which simplifies
the API and its implementation.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I67147931c6e8350896a4505022a6a314655de3d3
Reviewed-on: https://review.gerrithub.io/391482
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We do not need to separately call _spdk_blob_lookup() in
_spdk_bs_iter_cpl() - spdk_bs_open_blob() does this
already. This minimizes the number of entry points to
_spdk_blob_lookup() which will be important with some
upcoming changes around multiple threads performing
metadata operations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I032cd55c862267298cbe1674dd13d7a83eef7c0f
Reviewed-on: https://review.gerrithub.io/391474
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove the metadata channel, and instead use the same
channel for metadata and data operations on the metadata
thread.
This prepares for future changes which will allow
for metadata operations on any thread - not just the
thread where spdk_bs_load() or spdk_bs_init() was
called.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6b11a58fcb237a9a7603841d118b3729d83c6c98
Reviewed-on: https://review.gerrithub.io/391311
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
As part of clarifying the API and preparing for some
future changes, rename the following functions:
spdk_bs_md_create_blob => spdk_bs_create_blob
spdk_bs_md_open_blob => spdk_bs_open_blob
spdk_bs_md_delete_blob => spdk_bs_delete_blob
spdk_bs_md_iter_first => spdk_bs_iter_first
spdk_bs_md_iter_next => spdk_bs_iter_next
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I86bf792717b68379484a6108396bb891fe1c221e
Reviewed-on: https://review.gerrithub.io/391031
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This function was a nop and is not needed.
lvol was calling this function when an lvol bdev
gets a FLUSH I/O, but that is not needed either. So
lvol will now report it does not support flush.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I92df83243f7ebce81c69040a8874891dc2ffc961
Reviewed-on: https://review.gerrithub.io/391023
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The implementations in blobstore.c still remain for now, but
those will be removed after some upcoming changes which will
eliminate a global md thread and instead allow caller to
specify an I/O channel for each blobstore level API call.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idf800a7f061ffc9c42488951262e28e660871356
Reviewed-on: https://review.gerrithub.io/391020
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Moving forward, the spdk_blob returned to users will
actually be an I/O channel - not the blob structure
itself. So rename the existing spdk_blob to spdk_blob_data.
spdk_blob_data will continue to contain global state for
the blob. In the future spdk_blob will point to an
I/O channel for the blob - for now it effectively still
points to the spdk_blob_data, but by changing the
structure names here it will reduce the code churn in
future patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7d0cbc0553f68f96c24173c833091a80d058eb89
Reviewed-on: https://review.gerrithub.io/390900
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This keeps the top level functions as simple pass-throughs
which will simplify some future commits.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6578a1440d0404f600c055a5e37f28468b633d6f
Reviewed-on: https://review.gerrithub.io/390899
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Disambiguate the log components from the trace functionality
(include/spdk/trace.h).
The internal spdk_trace_flag structure and related functions will be
renamed in a later commit - this is just a find and replace on
SPDK_TRACE_* and SPDK_LOG_REGISTER_TRACE_FLAG().
Change-Id: I617bd5a9fbe35ffb44ae6020b292658c094a0ad6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/376421
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The function just needs to zero out metadata so that the
blobstore is effectively destroyed. If the user wants
to unmap the rest of the disk after the blobstore is
destroyed, they are free to do so. On future initializations
of blobstores the code will do the unmapping, so performance
is not impacted.
While here, implement the zeroing using the new
write_zeroes functionality instead of allocating a buffer
full of zeroes.
Change-Id: I7f18be0fd5e13a48b171ab3f4d5f5e12876023bc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/390307
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Unmaps are only used within blobstore to improve device performance,
never to zero blocks. Therefore, if the device does not support
unmap, just skip it instead of writing zeroes.
This is different than devices that elect to implement the write
zeroes command as an unmap because they will return 0 for
subsequent reads. That optimization is still in effect.
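
A minimal sketch of the skip-instead-of-zero behaviour, assuming a toy device
structure. The names are hypothetical and this is not the bs_dev interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_bs_dev {
    bool supports_unmap;
    uint64_t unmapped_blocks; /* blocks passed to the device's unmap */
    uint64_t zeroed_blocks;   /* blocks explicitly written with zeroes */
};

static void
toy_dev_unmap(struct toy_bs_dev *dev, uint64_t num_blocks)
{
    assert(dev != NULL);

    if (!dev->supports_unmap) {
        /* Unmap is only a performance hint here: skip it, write nothing. */
        return;
    }
    dev->unmapped_blocks += num_blocks;
}
```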
Change-Id: Ie1bf98fe86d73b4ac40b41c0d2804db325716500
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/390306
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Similar flags will be added at the blobstore level in a future
patch.
This allows backwards compatibility - i.e. it allows older blobstore
applications to open blobstores created by newer blobstore
applications with new features. Any blob using a new feature
should have an associated flag set in one of three new flag masks:
- invalid: if a bit is set in this mask that the application is not
           aware of, do not allow the blob to be opened
- data_ro: if a bit is set in this mask that the application is not
           aware of, allow the blob to be opened, but do not allow
           write I/O nor any operation that changes metadata
- md_ro:   if a bit is set in this mask that the application is not
           aware of, allow the blob to be opened for performing any
           kind of I/O, but do not allow any operation that changes
           metadata
While here, bump SPDK_BS_VERSION to 3. We intend this to be the
last change made to SPDK_BS_VERSION - future versioning will be
done via blobstore or per-blob feature flags instead.
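
The three masks could be evaluated along these lines. This is a sketch under
stated assumptions: the struct layout, the names and the -EINVAL error code
are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Used both for the masks read from disk and for the bits the app knows. */
struct toy_flag_masks {
    uint64_t invalid;
    uint64_t data_ro;
    uint64_t md_ro;
};

struct toy_open_state {
    bool data_ro;
    bool md_ro;
};

static int
toy_apply_flags(const struct toy_flag_masks *disk,
                const struct toy_flag_masks *known,
                struct toy_open_state *out)
{
    assert(disk != NULL && known != NULL && out != NULL);

    out->data_ro = false;
    out->md_ro = false;

    /* Unknown bit in 'invalid': refuse to open the blob at all. */
    if (disk->invalid & ~known->invalid) {
        return -EINVAL; /* assumed error code */
    }
    /* Unknown bit in 'data_ro': no write I/O and no metadata changes. */
    if (disk->data_ro & ~known->data_ro) {
        out->data_ro = true;
        out->md_ro = true;
    }
    /* Unknown bit in 'md_ro': any I/O is fine, metadata changes are not. */
    if (disk->md_ro & ~known->md_ro) {
        out->md_ro = true;
    }
    return 0;
}
```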
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If059e38bfffbeec25c849a7629a81193b12302c4
Reviewed-on: https://review.gerrithub.io/388703
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently there are a bunch of assert() calls on metadata
descriptors - change these to fail the blob parsing
instead.
While here also return -ENOMEM if any of the memory
allocations fail.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie37b73c57b304d05a45d10a8d33bcc1d47e7a1be
Reviewed-on: https://review.gerrithub.io/388702
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
data_ro means that write, write_zeroes and unmap operations are not
allowed.
md_ro means that resize, set_xattr and remove_xattr are not
allowed.
There is no code yet that can activate this - it is coming in a future
patch. Two usages are planned though:
1) a user explicitly marks a blob as read-only - this is persisted so that
future loads of the blob will ensure the blob cannot be modified - neither
metadata nor data
2) a future feature flag framework (how's that for alliteration) may allow
a blob to be opened, but not allow metadata modifications, if there are
feature flags set in the blob's or blobstore's metadata that the
application does not understand
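
The operation groups listed at the top of this message can be gated roughly
as below. The enum, the struct and the -EPERM error code are assumptions made
for illustration and are not the blobstore code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_op {
    TOY_OP_READ,
    TOY_OP_WRITE,
    TOY_OP_WRITE_ZEROES,
    TOY_OP_UNMAP,
    TOY_OP_RESIZE,
    TOY_OP_SET_XATTR,
    TOY_OP_REMOVE_XATTR
};

struct toy_ro_blob {
    bool data_ro;
    bool md_ro;
};

static int
toy_check_op(const struct toy_ro_blob *b, enum toy_op op)
{
    assert(b != NULL);

    switch (op) {
    case TOY_OP_WRITE:
    case TOY_OP_WRITE_ZEROES:
    case TOY_OP_UNMAP:
        /* Data-modifying operations are blocked by data_ro. */
        return b->data_ro ? -EPERM : 0;
    case TOY_OP_RESIZE:
    case TOY_OP_SET_XATTR:
    case TOY_OP_REMOVE_XATTR:
        /* Metadata-modifying operations are blocked by md_ro. */
        return b->md_ro ? -EPERM : 0;
    default:
        /* Reads are always allowed. */
        return 0;
    }
}
```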
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I247fd900430c56f7176edfb80dddd5a1a6c8dc87
Reviewed-on: https://review.gerrithub.io/388663
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Currently on blobstore creation we use write zeroes every time.
Some drives do not support write zeroes, but do support unmap.
We should do write zeroes only on metadata and try to unmap
data clusters.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iae36c1ccacc08340e79ad40c4c9a2c53dda920ba
Reviewed-on: https://review.gerrithub.io/387152
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If89275acfb1560982e332148a99ed3c83f8cb34f
Reviewed-on: https://review.gerrithub.io/387609
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Creating the channel can fail, for example when there is no memory.
This case needs to be handled properly.
Change-Id: I5d13d18037e6aa8f057769b1ef345f45597b22af
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/386016
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Until now, a user of the blobstore API had no way to know
how many clusters are available to them. total_clusters counts
the clusters used for both metadata and user data.
A new field, total_data_clusters, is added. It holds the number of
clusters that can be used to create blobs, meaning user data only.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I60555217644557410844f74628375a6b46fd2ac7
Reviewed-on: https://review.gerrithub.io/385633
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously, the user was not told when a specific bdev lacked unmap support.
This impacts all unmap operations, such as initialization of the blobstore.
It is useful for the user to know that these will take longer.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I89bf3bc0342558fda9a8964fb5cb1daa3a8ed79e
Reviewed-on: https://review.gerrithub.io/385999
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
During tasting, if a bdev is already claimed, we print errors to the screen.
This is expected behavior, so we should only emit debug logs.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ic5766cfa3aed88099415991998381de69ee8b8b6
Reviewed-on: https://review.gerrithub.io/384229
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
If the bdev doesn't support unmap, we should not send unmap I/O.
Instead, use spdk_bdev_write_zeroes(), which has a fallback in the bdev
layer for devices that don't natively support it.
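A minimal sketch of the selection logic, using a hypothetical device model rather than the real bdev structures. The counters only record which path was taken; in the actual code the write-zeroes branch is where spdk_bdev_write_zeroes() would be called.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model: only the capability bit matters here. */
struct demo_dev {
	bool supports_unmap;
	uint64_t unmap_calls;
	uint64_t write_zeroes_calls;
};

/* Clear [lba, lba + lba_count): unmap when supported, write zeroes otherwise. */
void
demo_dev_clear(struct demo_dev *dev, uint64_t lba, uint64_t lba_count)
{
	(void)lba;
	(void)lba_count;

	if (dev->supports_unmap) {
		dev->unmap_calls++;		/* would submit an unmap I/O */
	} else {
		dev->write_zeroes_calls++;	/* would take the write-zeroes path */
	}
}
```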
Change-Id: I1bd05d3518716f8e60501dbb4f9da0fee23cf7c2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/383491
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I59cbef4ce1bfe8af113c66c2c9cb9f208440c0aa
Reviewed-on: https://review.gerrithub.io/383887
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It is possible for a user to call spdk_bs_unload() while the list
of blobs is not empty. Instead of just asserting on that, the call
now fails with an appropriate error.
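A minimal sketch of the check, with hypothetical types standing in for the blobstore and its blob list. The -EBUSY value is an assumption made for illustration; the message above only says "appropriate error".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for an open blob and the blobstore that tracks it. */
struct demo_blob {
	struct demo_blob *next;
};

struct demo_bs {
	struct demo_blob *open_blobs;	/* NULL when no blob is open */
};

/* Refuse to unload while blobs are still open instead of asserting. */
int
demo_bs_unload(struct demo_bs *bs)
{
	if (bs->open_blobs != NULL) {
		return -EBUSY;	/* assumed error code */
	}

	/* ... persist metadata and release resources here ... */
	return 0;
}
```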
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I83818453d6c90ff9b5bf657c90e12b2f9d5ca013
Reviewed-on: https://review.gerrithub.io/383220
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Unmap does not guarantee that erased blocks will return all zeroes.
Using write_zeroes when unmapping metadata gives the
desired behavior for a blob.
Only metadata pages will be cleared with write_zeroes in this patch;
blob data clusters will still call unmap. This behavior may be made
configurable in a later patch (to allow the user to request zeroing of
clusters rather than just unmapping).
Change-Id: I1b210abac110867ce703bcfdeb634eb45aa9d5c9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/372004
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The currently exposed API allows loading, unloading, and
initializing a blobstore on a device.
A spdk_bs_destroy() call is added in order to reach
functional parity with spdk_bs_init(). Previously it was not
possible to remove a blobstore from a device from within
SPDK.
spdk_bs_destroy() takes a blobstore pointer as its argument
(instead of a bs_dev), because the blobstore has to be
loaded already in order to destroy it.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2c493a4407868fcf08fd1766a19fc8463f634ef5
Reviewed-on: https://review.gerrithub.io/382019
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We only sync the metadata and data while the blobstore is running, which
means the used md bitmap and used clusters bitmap are only updated in memory.
If the system crashes, we have no chance to sync the used md bitmap and
used clusters bitmap to disk, and the next time we try to load the
blobstore all the data will be lost. This patch adds the logic to recover the
valid data after a dirty shutdown. We go through all the metadata pages
to find all valid data and rebuild the bitmaps.
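A simplified sketch of the recovery walk, assuming a hypothetical metadata page that carries a validity flag and a short list of cluster numbers. The real on-disk format is different; the point is only that both bitmaps can be rebuilt from the pages alone.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_CLUSTERS_PER_PAGE 4

/* Hypothetical, heavily simplified metadata page. */
struct demo_recover_page {
	bool valid;		/* page holds metadata of a live blob */
	uint32_t num_clusters;
	uint32_t clusters[DEMO_MAX_CLUSTERS_PER_PAGE];
};

/*
 * Rebuild both bitmaps from the metadata pages. For clarity each
 * "bit" is stored in one byte.
 */
void
demo_recover(const struct demo_recover_page *pages, uint32_t num_pages,
	     uint8_t *used_md, uint8_t *used_clusters, uint32_t total_clusters)
{
	uint32_t i, j;

	memset(used_md, 0, num_pages);
	memset(used_clusters, 0, total_clusters);

	for (i = 0; i < num_pages; i++) {
		if (!pages[i].valid) {
			continue;
		}
		used_md[i] = 1;
		for (j = 0; j < pages[i].num_clusters &&
		     j < DEMO_MAX_CLUSTERS_PER_PAGE; j++) {
			uint32_t cluster = pages[i].clusters[j];

			/* Ignore out-of-range entries instead of trusting them. */
			if (cluster < total_clusters) {
				used_clusters[cluster] = 1;
			}
		}
	}
}
```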
Change-Id: Ieb7c5f932206b1b68fdde0cee35f2d2cb3a4f309
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/376470
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A recent patch did some refactoring on how the superblock is
written out - it introduced a bug where on load we would
write out INVALID for super_blob id and a null bstype when
clearing the clean bit.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1d6256e35030645b3e8fda83bfe0f74aeb635733
Reviewed-on: https://review.gerrithub.io/383129
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Introducing bstype as a way to identify and verify
blobstore type.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I50267b5408625be10fe0c146ae329016d5509b4a
Reviewed-on: https://review.gerrithub.io/380476
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch reduces duplicated data structures and makes some functions common
to both the bs load and unload processes for future use.
Change-Id: I40b2135e89a705aa5073c1ded4c7b28be4b32f6e
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/381912
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ia6b6e5352ce4da04784fb0a3ea1efd0552650067
Reviewed-on: https://review.gerrithub.io/381548
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fixes github issue #29.
Because of how we handle the blobid and pagenum in blobstore,
it was possible to have blobstore inadvertently open the wrong
blob if open is provided a blobid where the lower 32 bits match
an existing blob but the upper 32 are clear.
Patch does the following:
- removes the assert() that caught this on MD load and replaces it with
an error, given that this condition can be induced via the API
- cleanup of pagenum and blobid conversion/handling to make it
clearer how they're related and converted
- new UTs that would have failed w/o the new check in place
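A sketch of the failure mode and the check. The encoding used here (bit 32 set in every valid blob ID, page number in the low 32 bits) and the -ENOENT return value are assumptions made for illustration; all names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed encoding: valid blob IDs have bit 32 set, page number below it. */
#define DEMO_BLOBID_HIGH_BIT (1ULL << 32)

uint64_t
demo_page_to_blobid(uint32_t page)
{
	return DEMO_BLOBID_HIGH_BIT | page;
}

uint32_t
demo_blobid_to_page(uint64_t blobid)
{
	return (uint32_t)(blobid & 0xFFFFFFFFULL);
}

/*
 * Compare the requested ID with the ID stored in the metadata page that
 * the lookup landed on. Two IDs can map to the same page, so the page
 * number alone is not enough.
 */
int
demo_check_blobid(uint64_t requested, uint64_t stored_in_page)
{
	return requested == stored_in_page ? 0 : -ENOENT;
}
```

With this encoding, an ID of 7 and the valid ID for page 7 both resolve to page 7, which is why the full 64-bit comparison is needed.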
Change-Id: I2b49b237922b3b8cfc4df296f5bc20195e41dc41
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/380872
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is in preparation for enabling hot remove of logical volumes when
their underlying blobstore device is hot-removed.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I310a3f64f0de5d628609c20a1a3b4d38df0755aa
Reviewed-on: https://review.gerrithub.io/377041
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Three checks are added to options passed to spdk_bs_init,
with appropriate errors returned:
- whether any of the options is set to 0
- whether the device size is bigger than the cluster size
- whether the pages reserved for metadata exceed the total number of clusters
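The three checks can be sketched as below. The option struct, the 4096-byte page size, the conversion from metadata pages to clusters and the error codes are all assumptions made for illustration, not the values used by spdk_bs_init().

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed metadata page size */

/* Hypothetical option set, for illustration only. */
struct demo_bs_opts {
	uint32_t cluster_sz;
	uint32_t num_md_pages;
	uint32_t max_md_ops;
	uint32_t max_channel_ops;
};

/* Returns 0 if the options are usable for a device of dev_size bytes. */
int
demo_bs_opts_verify(const struct demo_bs_opts *opts, uint64_t dev_size)
{
	uint64_t total_clusters, md_bytes, md_clusters;

	/* 1) no option may be 0 */
	if (opts->cluster_sz == 0 || opts->num_md_pages == 0 ||
	    opts->max_md_ops == 0 || opts->max_channel_ops == 0) {
		return -EINVAL;
	}

	/* 2) the device has to be bigger than one cluster */
	if (dev_size <= opts->cluster_sz) {
		return -ENOSPC;
	}

	/* 3) metadata pages, rounded up to clusters, must fit on the device */
	total_clusters = dev_size / opts->cluster_sz;
	md_bytes = (uint64_t)opts->num_md_pages * DEMO_PAGE_SIZE;
	md_clusters = (md_bytes + opts->cluster_sz - 1) / opts->cluster_sz;
	if (md_clusters > total_clusters) {
		return -ENOSPC;
	}

	return 0;
}
```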
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Idee3c194b653e737ec7c7a768f1973ff72452c5b
Reviewed-on: https://review.gerrithub.io/379676
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Now bs_dev is destroyed only in two instances:
- within spdk_bs_init() on failure path
- vbdev_lvs_create() if spdk_lvs_init() errors out,
before even calling spdk_bs_init()
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7b8af39fbe83907b0c47797f0f55ca3b941729d9
Reviewed-on: https://review.gerrithub.io/379848
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Fixes a condition where blobstore was prematurely calling the
application callback on spdk_bs_unload(). If the application
tried to do something too quickly, bad things happened.
To avoid application changes in how the g_devlist_mutex is
held, it is no longer held while calling
_spdk_io_device_attempt_free(), because the app unload CB is
called from that function and may want to call
spdk_io_device_unregister() from its unload CB. So the lock
is now held and released strictly around the list it's
protecting, which allows the CB from _spdk_io_device_attempt_free()
to be called without issue.
Change-Id: Ib451cfe6b33ea0c3f9e66c86785316f9d88837c7
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/377872
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The io_devices and their channels are created after _spdk_bs_alloc finishes.
Until then, only free() is required on the allocated bs structure.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie00126cdaa2bb5cd77cad2dec89d670734367b49
Reviewed-on: https://review.gerrithub.io/379675
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch adds CRC support for metadata pages. We will also
add CRCs for the super block, used md and used clusters bitmask
pages in the following patches.
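A sketch of sealing and verifying a metadata page with a checksum. The message above does not name the CRC variant; CRC-32C (Castagnoli) is assumed here, and the page layout with a trailing crc field is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t
demo_crc32c(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++) {
			/* Shift out one bit; apply the polynomial if it was set. */
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
		}
	}
	return crc ^ 0xFFFFFFFFu;
}

#define DEMO_CRC_PAGE_SIZE 4096

/* Hypothetical page layout: payload followed by the checksum. */
struct demo_crc_page {
	uint8_t payload[DEMO_CRC_PAGE_SIZE - sizeof(uint32_t)];
	uint32_t crc;
};

/* Compute and store the checksum before the page is written out. */
void
demo_page_seal(struct demo_crc_page *page)
{
	page->crc = demo_crc32c(page->payload, sizeof(page->payload));
}

/* Returns 1 if the stored checksum matches the payload, 0 otherwise. */
int
demo_page_verify(const struct demo_crc_page *page)
{
	return page->crc == demo_crc32c(page->payload, sizeof(page->payload));
}
```

The checksum covers everything except the crc field itself, so a page can be verified on load without any other state.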
Change-Id: Ie36fcc16b39296d06721f1f8eb5689260194c558
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/377901
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The system could crash at any time, so we need to sync the "clean" flag
to disk as soon as we load the blobstore. Then, if the system crashes,
we will find that the blobstore was not shut down cleanly the next time
we load it, and we can run the recovery process.
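The flag handling can be sketched as follows, with a hypothetical one-field super block standing in for the on-disk state. Load reports whether recovery is needed and then clears the flag right away; only a clean unload sets it again.

```c
#include <stdbool.h>

/* Hypothetical on-disk super block reduced to the one flag of interest. */
struct demo_super_block {
	bool clean;
};

/* Returns true if the previous shutdown was dirty and recovery must run. */
bool
demo_bs_load(struct demo_super_block *on_disk)
{
	bool needs_recovery = !on_disk->clean;

	/* Clear the flag immediately so that a crash from now on is detectable. */
	on_disk->clean = false;
	return needs_recovery;
}

/* A clean unload is the only place where the flag is set again. */
void
demo_bs_unload(struct demo_super_block *on_disk)
{
	on_disk->clean = true;
}
```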
Change-Id: I6189678e970ffe979a224e02be6cede0ee44dde8
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/376276
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Change-Id: I34356444b68d8310f66d7130cbdf8132b5487a94
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/376258
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This matches the name to the behavior and prepares for addition of a new
log macro for "info" log level.
Change-Id: I94ccd49face4309d3368e399528776ab140748c4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/375833
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch delays destruction of bs_dev till after md_target io_device
is unregistered. Otherwise bs_dev would no longer exist when destroying
attached channels.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6e526e3f65f7f5bca0617888be06a5296422f8e0
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-on: https://review.gerrithub.io/371885
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Most of the work here revolves around having to split
an I/O that spans a cluster boundary. In this case
we need to allocate a separate iov array, and then
issue each sub-I/O serially, copying the relevant
subset of the original iov array.
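A sketch of the copy step for one sub-I/O: given the original iov array and a byte range inside the logical buffer, build the iov array that covers just that range. The function name and signature are hypothetical and the sketch leaves out the serial submission.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Fill 'out' with the pieces of 'iov' that cover bytes
 * [offset, offset + len) of the logical buffer.
 * Returns the number of entries written, or -1 if 'out' is too small
 * or the range extends past the end of 'iov'.
 */
int
demo_iov_slice(const struct iovec *iov, int iovcnt, size_t offset, size_t len,
	       struct iovec *out, int out_max)
{
	int i, n = 0;

	for (i = 0; i < iovcnt && len > 0; i++) {
		size_t avail = iov[i].iov_len;
		size_t take;

		if (offset >= avail) {
			offset -= avail;	/* element lies before the range */
			continue;
		}
		if (n == out_max) {
			return -1;
		}
		take = avail - offset;
		if (take > len) {
			take = len;
		}
		out[n].iov_base = (char *)iov[i].iov_base + offset;
		out[n].iov_len = take;
		n++;
		len -= take;
		offset = 0;	/* later elements are used from their start */
	}
	return len == 0 ? n : -1;
}
```

For an I/O that crosses one cluster boundary, the function would be called twice: once for the bytes up to the boundary and once for the remainder.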
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0d46b3f832245900d109ee6c78cc6d49cf96428b
Reviewed-on: https://review.gerrithub.io/374880
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We use the size of a md page struct in a lot of places; use a #define
instead.
Change-Id: I522897c883bfc8b241c6da9b726d92f58faedd63
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/375040
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Fail spdk_bs_init() if the dev being used has an
LBA size that is larger than a metadata page or that does not
evenly divide the size of a metadata page.
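A sketch of the block length check, reading the condition as "a metadata page must hold a whole number of device blocks". The 4096-byte page size, the function name and the -EINVAL return value are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MD_PAGE_SIZE 4096u	/* assumed metadata page size */

/* Accept a block length only if a metadata page is a whole number of blocks. */
int
demo_check_blocklen(uint32_t blocklen)
{
	if (blocklen == 0 || blocklen > DEMO_MD_PAGE_SIZE ||
	    DEMO_MD_PAGE_SIZE % blocklen != 0) {
		return -EINVAL;
	}
	return 0;
}
```

Under these assumptions 512 and 4096 byte blocks pass, while 520 and 8192 byte blocks are rejected.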
Change-Id: I0e0ca747ecd5b6039c20fb6a885382bde4527158
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/374182
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Completion routine for reading superblock was not updating
the bs struct with the superblob id thus it would have
failed to be persistent from the application's perspective.
Change-Id: I4aa51ebe73315e9be7e08f82340b03f0e3836df7
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/373406
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is far simpler, although it does limit the bdev
layer to unmapping just one range per command. In practice,
all of our code reports limits of just one range per command
anyway.
Change-Id: I99247ab349fe85b9925769e965833b06708d0d70
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/370382
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Patch afe860ae deferred freeing the io_device. However, for nvme, the
io_device context (spdk_nvme_ctrlr) is still being destructed before
io_channels are destroyed, causing segfaults on hotremove.
This patch defers io_device context destruction and fixes nvme
hotremove.
Fixes: afe860aeb1 ("channel: Correctly defer unregisters if channels exist")
Fixes: 5533c3d208 ("util: defer put_io_channel")
Change-Id: I7af699174cac0c6c6a6faa2cc65418c47347eb9a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370459
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This enables checking permissions - for example,
spdk_bdev_write will fail if the descriptor was not
created with write permissions.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I68b65a560f471f2e0f71a7f42cfa6689b911110f
Reviewed-on: https://review.gerrithub.io/369493
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Now spdk_bdev_create_bs_dev() opens the underlying spdk_bdev.
Because of that, the spdk_bdev should be closed when the bs_dev is destroyed.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I0805f29abfeb52ff1db067bad7b7e0f13fc39398
Reviewed-on: https://review.gerrithub.io/368351
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Retire the old claim/unclaim semantics in favor of
open/close. Clients must now open a bdev to get
an spdk_bdev_desc, then pass this desc to get an
I/O channel.
This allows multiple clients to open a bdev,
although only one may open a bdev with write
access.
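A sketch of the open/close rule described above: any number of read-only opens, at most one open with write access. The types, function names and the -EPERM return value are hypothetical and do not reproduce the bdev API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a bdev and a descriptor obtained by opening it. */
struct demo_bdev {
	int open_count;
	bool write_claimed;	/* true while a writable descriptor exists */
};

struct demo_desc {
	struct demo_bdev *bdev;
	bool write;
};

/* Open the bdev; only one descriptor may hold write access at a time. */
int
demo_bdev_open(struct demo_bdev *bdev, bool write, struct demo_desc *desc)
{
	if (write && bdev->write_claimed) {
		return -EPERM;	/* assumed error code */
	}
	if (write) {
		bdev->write_claimed = true;
	}
	bdev->open_count++;
	desc->bdev = bdev;
	desc->write = write;
	return 0;
}

/* Close a descriptor and release write access if it held it. */
void
demo_bdev_close(struct demo_desc *desc)
{
	if (desc->write) {
		desc->bdev->write_claimed = false;
	}
	desc->bdev->open_count--;
	desc->bdev = NULL;
}
```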
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4d319f1278170124169a8a75fd791e926b3f7171
Reviewed-on: https://review.gerrithub.io/367611
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1439b471b101b390fcbef558039f2a543f465acd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/367121
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
It is not actually useful to be immediately returned
a handle to the bdev_io. There isn't anything valid
that the user can do with it at that point. Instead,
return an integer error code.
Change-Id: Iffa9a8dc5b2eefab57e3cc1f68919985431d17d1
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/364137
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
- rename spdk_malloc_socket to spdk_dma_malloc_socket
- rename spdk_malloc to spdk_dma_malloc
- rename spdk_zmalloc to spdk_dma_zmalloc
- rename spdk_realloc to spdk_dma_realloc
- rename spdk_free to spdk_dma_free
Change-Id: I52a11b7a4243281f9c56f503e826fd7c4a1fd883
Signed-off-by: John Meneghini <johnm@netapp.com>
Reviewed-on: https://review.gerrithub.io/362604
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously, each freed cluster was unmapped individually.
Instead, coalesce unmaps for contiguous clusters to reduce
the volume of commands sent.
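A sketch of the coalescing step, assuming the freed clusters are available as a sorted array without duplicates. The range struct and function name are hypothetical; each resulting range would correspond to one unmap command.

```c
#include <stddef.h>
#include <stdint.h>

/* One unmap command: 'count' contiguous clusters starting at 'start'. */
struct demo_range {
	uint64_t start;
	uint64_t count;
};

/*
 * Merge runs of contiguous cluster numbers into ranges.
 * 'clusters' must be sorted ascending without duplicates and
 * 'out' must have room for n entries. Returns the number of ranges.
 */
size_t
demo_coalesce(const uint64_t *clusters, size_t n, struct demo_range *out)
{
	size_t i, nranges = 0;

	for (i = 0; i < n; i++) {
		if (nranges > 0 &&
		    out[nranges - 1].start + out[nranges - 1].count == clusters[i]) {
			out[nranges - 1].count++;	/* extends the current run */
		} else {
			out[nranges].start = clusters[i];
			out[nranges].count = 1;
			nranges++;
		}
	}
	return nranges;
}
```

For the clusters 3, 4, 5, 9, 10 and 20 this yields three ranges instead of six individual unmaps.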
Change-Id: I6ea1d2e1235e3c030cd2826c97e57aca571bd2ae
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362773
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This wasn't used anywhere and we currently believe there
are superior software-only techniques for controlling
quality of service.
Change-Id: Icdadd5870ed0629b338c307d2619bbc242c3e7a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362065
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The user should not see the bdev_io status directly; the NVMe and SCSI
error code wrappers provide the ability to translate to the desired
format regardless of what kind of error is stored inside the bdev_io.
Replace the spdk_bdev_io_completion_cb status parameter with a bool
simply indicating whether the I/O completed successfully.
Change-Id: Iad18c2dac4374112c41b7a656154ed3ae1a68569
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/362047
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is no longer used anywhere. For the places where we previously
used it, we've since found alternate solutions that do not
require it.
Change-Id: I738a80b95ef50348ce1c14969a3812b0a625b3fd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362064
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows us to allocate different size channels and
not require the unique flag.
Change-Id: I4b1ffd244b60e9e9ab06f9ab4da8161ab57e1169
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/361668
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The channel memory isn't allocated by these
libraries, so they can't free it.
Change-Id: I30909fa4e77bc5a41b45230f04ba5fe75b172dbf
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Separate the maximum metadata operations from the
maximum channel operations.
Change-Id: I1bbd440ab094a2a2e19c9a5b71724ac91ba88e42
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This will need to be configured globally for all channels.
Change-Id: I773252f220373617f8d09d1f24243db8095cf8a4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Fix up all existing spacing errors in comments and add an automated
check for patterns like /*comment*/.
Change-Id: I28f61c93612dc0f8aed66bd509da78e91ea9737e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The descriptor type must be 0 to break out of the loop,
so we need to initialize this.
Change-Id: I5fdb24dcfece01332c487364d5694c4fb8412e1b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Or rather, at least assert that the allocation failed.
This is not a recoverable error in general.
Change-Id: I9bc325066e829fc311ce84ce83536e9933ac5473
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is the initial commit for "blobfs", a lightweight
filesystem built on top of the SPDK blobstore.
Also included in this patch:
1) a shim for using SPDK bdevs as the backing store for
SPDK blobstore/blobfs
2) documentation for using blobfs as the storage engine
with RocksDB
3) scripts for running a set of workloads and collecting
profiling data with RocksDB and blobfs
See doc/blobfs/getting_started.md included in this commit
for more details on blobfs, including some of the current
limitations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2a6d3d4b87236730051228ed62c0c04e04c42c73
This is the initial commit for the "blobstore", a lightweight,
highly parallel, persistent, power-fail safe block allocator.
Documentation will be added in future patches.
Change-Id: I20a4daf899f1215d396f7931c3ec9a2e2bb269d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>