This change wasn't correctly rebased and needs to be updated to compile
against the current blobstore.
This reverts commit c1174e6895.
Change-Id: I529608bee7323cb626d8c36dff15adc9ba24ad26
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/402352
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Unit tests are implemented in the following patches.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ib18c9060f527bd22bfdbed74e96871a6e0551ead
Reviewed-on: https://review.gerrithub.io/396648
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
blobfs and lvol can now use this to automatically iterate
all existing blobs during spdk_bs_load. Changes to blobfs
and lvol will come in future patches.
This will also be used in some upcoming patches which need
to iterate through blobs during load to determine
snapshot/clone relationships.
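A minimal sketch of the iteration pattern this hook automates, shown here with the public iterator API after load completes (the load-time hook itself is internal to spdk_bs_load; -ENOENT as the end-of-iteration code is an assumption):

    #include "spdk/blob.h"

    /* Walk every blob once the blobstore has loaded. */
    static void
    iter_cb(void *cb_arg, struct spdk_blob *blob, int bserrno)
    {
        struct spdk_blob_store *bs = cb_arg;

        if (bserrno != 0) {
            /* -ENOENT means no more blobs; anything else is an error. */
            return;
        }

        /* Inspect the blob here - e.g. xattrs describing snapshot/clone links. */
        spdk_bs_iter_next(bs, blob, iter_cb, bs);
    }

    static void
    load_cb(void *cb_arg, struct spdk_blob_store *bs, int bserrno)
    {
        if (bserrno == 0) {
            spdk_bs_iter_first(bs, iter_cb, bs);
        }
    }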
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic7c5fac4535ceaa926217a105dda532517e3e251
Reviewed-on: https://review.gerrithub.io/400177
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Finish the sequence first, before calling _spdk_bs_free().
Otherwise, with synchronous bs_devs (like those used in the unit
tests), the sequence memory gets freed via
_spdk_bs_free() before we try to finish the sequence.
This eliminates the need for g_scheduler_delay and
_bs_flush_scheduler() in the blob unit tests. But don't
remove them - they will be useful in upcoming unit tests
for queued persist operations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I09aac3ae4d3a56ff8e04a5b822fcd6746f13afc3
Reviewed-on: https://review.gerrithub.io/401267
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
No functional change - this just separates out the
code that creates the persist ctx from the code that
actually performs the persist operation.
Part of a series to enable queuing persist operations -
this will be useful for starting a previously queued
persist operation.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie1966ff2a477f3075c36f90560010d036658f803
Reviewed-on: https://review.gerrithub.io/401255
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iaef32731b05a53ac0707524d78086eedc89d6af6
Reviewed-on: https://review.gerrithub.io/401254
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I096fb24dd2fe2fc4dd97d80c957c328d960fb867
Reviewed-on: https://review.gerrithub.io/401073
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_blob_close() and spdk_blob_sync_md() currently do their
own CLEAN state checks. To consolidate the state checking code,
have both functions rely on the check in _spdk_blob_persist()
instead.
This will reduce code but more importantly is needed for
some upcoming changes for queuing persist operations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I38118624b4fad6f18c4b7466d9ddfa0915c3fce0
Reviewed-on: https://review.gerrithub.io/401065
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
These are common functions that can be called from
any function that reads or modifies a blob's metadata to
perform necessary asserts.
This will also fix several places where blob metadata
functions were asserting the calling thread context,
but not the current state of the blob.
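A minimal sketch of what such a common check looks like (helper name, parameters, and the header location are assumptions; the real helpers are internal to blobstore.c):

    #include <assert.h>
    #include <stdbool.h>
    #include "spdk/thread.h"

    /* Assert both the calling thread and the blob's current state before any
     * metadata read or modification. */
    static void
    blob_verify_md_op(struct spdk_thread *md_thread, bool blob_is_loading)
    {
        assert(spdk_get_thread() == md_thread);
        assert(!blob_is_loading);
    }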
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9e16c082a27c439311f8ff214335adadfa715497
Reviewed-on: https://review.gerrithub.io/401053
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
All metadata operations are now done on the metadata
thread, so we no longer have to worry about one thread
updating in-memory metadata structures while another
thread is transferring the in-memory structures to
on-disk structures.
This does not protect against multiple sync operations
outstanding at once - that will be coming in an
upcoming patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibf33edf4d41d867c96a38df017737e9ceb87fa58
Reviewed-on: https://review.gerrithub.io/401056
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The md (metadata) thread is always the thread that
initialized/loaded the blobstore. Metadata operations may
only be performed from this thread. This patch adds some
more asserts in metadata functions that were previously
missed.
While here, also update some of the blobstore documentation
related to this.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5cafdb3ba402ceb6c3ccb6fdd9d36e7768f59f39
Reviewed-on: https://review.gerrithub.io/400885
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
These new names are much more clear and are aligned with other
functions such as spdk_blob_close.
Keep the old names around for now but deprecate them. We will
remove them in the next release.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idc60fd0b19fa2a8b0247a1f5835774d342e721f9
Reviewed-on: https://review.gerrithub.io/400884
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is needed for an upcoming change which will
prevent metadata functions from being called on
threads other than the metadata thread. Without
this change, there was no way for this function
to return an error if it was called from the wrong
thread.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I67e591140194ff6ad250878168f6b166a1ff2282
Reviewed-on: https://review.gerrithub.io/400883
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
There was some thinking that we would need to allocate
I/O channels on a per-blob basis to handle dynamic
resizing during I/O. Making spdk_blob an opaque handle,
with the existing spdk_blob structure renamed to
spdk_blob_data was a first step towards making that
happen. But more recent work on blobstore has
simplified the resizing approach, so this spdk_blob_data
is no longer needed. So revert it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I22e07008faceb70649ee560176ebe5e014d5f1a3
Reviewed-on: https://review.gerrithub.io/400881
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This reduces some code duplication and ensures all
successful load operations (whether or not they included
recovery after power failure) go through the same function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia463ce6ebe976ab2420525d86962e8fe54ad04d7
Reviewed-on: https://review.gerrithub.io/400164
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This patch adds internal versions of the xattr functions to allow
operations on internal xattrs, which are not visible to
upper layers.
When at least one internal xattr is set, the
SPDK_BLOB_INTERNAL_XATTR flag is also set in invalid_flags to prevent
previous SPDK versions from loading this blob.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Iec918ec858f069f7cd9f36d5e8f0495ffa4a42d8
Reviewed-on: https://review.gerrithub.io/395122
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I8b570802b05b8e03802d3c2b68a1e7644ea548ac
Reviewed-on: https://review.gerrithub.io/396572
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I277f7288427788e7a107b143331753fd5b23f16f
Reviewed-on: https://review.gerrithub.io/396571
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This change will allow reusing this structure for both internal
and external xattrs, as well as in functions that take an optional xattr
but lack the other options (i.e. the snapshot and clone functions implemented in later patches).
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ia6619a75efa0a100168a6f8317be274823af04ab
Reviewed-on: https://review.gerrithub.io/396417
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Icba20351c9ca76397393064b41013c527084853e
Reviewed-on: https://review.gerrithub.io/396385
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
For thin provisioned blobs we allocate DMA memory
required for copying a cluster from the backing device.
When the cluster size is too big, the DMA allocation may fail
silently (only an I/O error is reported).
Use PRIu32 to print out the cluster size, and while
here, fix two other places that were using %d to print
the cluster size instead of PRIu32.
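For reference, the PRIu32 pattern looks like this (the wrapper function is hypothetical and plain fprintf stands in for the blobstore's error-log macro; only the format usage is the point):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* cluster_sz is a uint32_t, so print it with PRIu32 rather than %d. */
    static void
    log_cluster_alloc_failure(uint32_t cluster_sz)
    {
        fprintf(stderr, "DMA allocation for cluster of size %" PRIu32 " failed\n",
                cluster_sz);
    }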
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I098b1a58aee2f0d3f4ead7aa326ecdb63a5b53d8
Reviewed-on: https://review.gerrithub.io/397563
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I3c7d2096f549a88b4a9884c0026d15d3bcd8dc67
Reviewed-on: https://review.gerrithub.io/396387
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ibc9609ad36188006e9454e5c799bccd8a92d7991
Reviewed-on: https://review.gerrithub.io/391422
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This will be needed for thin provisioning, since a write
I/O may result in needing to insert a cluster into the
blob and that write I/O may not have been performed
on the metadata thread.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I84b0cb6e7af87b1f9c6cab4e2c24fa26b12e2c06
Reviewed-on: https://review.gerrithub.io/396737
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This enables some code reuse for future patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I296a6c5c0915da4a77a1ab43e8f10a335b7d16d0
Reviewed-on: https://review.gerrithub.io/396736
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This will be used in the upcoming thin provisioning
patches. A thread that wishes to insert a newly
allocated cluster into the blob will send a message
to the metadata thread to call this
function and, if it succeeds, sync the blob's
metadata.
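A minimal sketch of the intended calling pattern (the context struct, callback, and the insert step itself are hypothetical; only spdk_thread_send_msg() and spdk_blob_sync_md() are public API):

    #include "spdk/blob.h"
    #include "spdk/thread.h"

    struct insert_cluster_ctx {
        struct spdk_blob *blob;
        uint32_t cluster_num;   /* logical cluster index within the blob */
        uint64_t cluster;       /* newly allocated cluster on the device */
    };

    static void
    insert_cluster_sync_cb(void *cb_arg, int bserrno)
    {
        /* Metadata for the new cluster is now persisted (or bserrno != 0). */
    }

    static void
    insert_cluster_on_md_thread(void *arg)
    {
        struct insert_cluster_ctx *ctx = arg;

        /* The internal insert of ctx->cluster at ctx->cluster_num happens
         * here (elided); on success, sync the blob's metadata. */
        spdk_blob_sync_md(ctx->blob, insert_cluster_sync_cb, ctx);
    }

    /* Called from an I/O thread that just allocated a cluster. */
    static void
    request_cluster_insert(struct spdk_thread *md_thread, struct insert_cluster_ctx *ctx)
    {
        spdk_thread_send_msg(md_thread, insert_cluster_on_md_thread, ctx);
    }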
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I26fcca235ea0d7a187b9fe559851290b6db13649
Reviewed-on: https://review.gerrithub.io/396711
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
For now, use this to add some assert() calls to ensure
per-blob metadata operations are only called from the
thread that initialized/loaded the blobstore.
Upcoming patches will utilize this for metadata updates
required due to cluster allocations on thin provisioned
blobs. In that case, the cluster allocations may not
always be done on the metadata thread - but we want
the metadata thread to actually do the metadata sync
operation to guard against races from allocations on
multiple threads in parallel.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa0adfe8b7e61ba770449d1e076126ecb9d7a556
Reviewed-on: https://review.gerrithub.io/396712
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Currently, there is no way to persist a read-only blob to disk.
This patch modifies the behaviour so that the read-only flags are applied after syncing the blob.
This is analogous to the resize, set xattr and remove xattr operations.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Iffed601c78cb83231bb20e7ef05b73847dc3c95a
Reviewed-on: https://review.gerrithub.io/394243
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This allows a channel's request_set resources to be
used for queuing I/O requests. This is needed
for upcoming thin provisioning functionality,
where we must queue I/O requests that need to
allocate a cluster, if another cluster allocation
is in progress.
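A minimal sketch of the data-structure change being described (struct and field names are hypothetical stand-ins for blobstore's internal channel/request types):

    #include "spdk/queue.h"

    struct bs_request_set {
        /* ...existing request state... */
        TAILQ_ENTRY(bs_request_set) link;   /* new: lets a set sit on a queue */
    };

    struct bs_channel {
        /* ...existing channel state... */
        TAILQ_HEAD(, bs_request_set) need_cluster_alloc; /* TAILQ_INIT() at channel creation */
    };

    /* Park an I/O that must wait for an in-progress cluster allocation. */
    static void
    bs_channel_queue_request(struct bs_channel *ch, struct bs_request_set *set)
    {
        TAILQ_INSERT_TAIL(&ch->need_cluster_alloc, set, link);
    }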
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie8d3e799afc0b56bc95ba5ecab11253d8bc8608f
Reviewed-on: https://review.gerrithub.io/395037
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Now all operations (both single buffer and iov-based
payloads) which span a cluster boundary get split into
separate blob calls for each cluster.
This will simplify upcoming patches that need to do
special operations if a cluster needs to be allocated.
This code can now be added to just the single cluster
operations and not have to worry about splits. It
will also simplify the code that will eventually queue
requests which require a cluster allocation if an
allocation is already in progress.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I22a850e6e23e3b3be31183b28d01d57c163b25b1
Reviewed-on: https://review.gerrithub.io/395035
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
A future patch will queue an operation in
_spdk_blob_request_submit_op_single if the cluster is not allocated
and another allocation is already in progress. If the queueing fails
because there are no channel resources, we want to complete the callback
with an error, and we need to do this before creating the batch.
This causes a bit of code duplication, but in the end should make the
code easier to read.
While here, pass spdk_blob instead of spdk_blob_data to
_spdk_blob_request_submit_op_single. This simplifies a future patch
which will need the spdk_blob when queueing an operation. It also
incidentally makes it consistent with _spdk_blob_request_submit_op_split.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib0ec0e5138eac5bf208fcde6676cd2a77a1a663f
Reviewed-on: https://review.gerrithub.io/395196
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For I/O that does not span a cluster boundary, just issue
a single batch command to the underlying block device.
For I/O that does span a cluster boundary, issue a batch
command against the blob (not the block device)
for each cluster accessed by the I/O.
This is all in preparation for upcoming patches which
enable thin provisioning and hence cluster allocation
in the I/O path. It will simplify implementation of
the cluster allocation path since now that code only
needs to be concerned with a single allocation at once.
Splitting for readv/writev will be handled in a
later patch.
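The split rule reduces to a small length calculation per sub-I/O - a sketch with a hypothetical helper name (spdk_min() is from spdk/util.h):

    #include <stdint.h>
    #include "spdk/util.h"

    /* For an I/O starting at 'offset' pages with 'length' pages remaining,
     * return how many pages belong to the current sub-operation so that no
     * sub-operation crosses a cluster boundary. */
    static uint64_t
    pages_to_cluster_boundary(uint64_t offset, uint64_t length, uint64_t pages_per_cluster)
    {
        uint64_t remaining_in_cluster = pages_per_cluster - (offset % pages_per_cluster);

        return spdk_min(length, remaining_in_cluster);
    }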
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia2341abbda599dace3357c4eec06ab6602ef81a8
Reviewed-on: https://review.gerrithub.io/395027
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This breaks out the logic for building a batch for
non-iov operations to a separate function. Future
patches will do further modifications on this
new function - separating it out into two separate
functions, one for operations that span a cluster boundary
and one for those that do not.
No functional change here - this is just moving code
around to reduce size of upcoming patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id1e67e6305d7ba2317700e1477e12c749ebf664c
Reviewed-on: https://review.gerrithub.io/395026
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This clarifies that the read/write/etc. operation is
being performed on the block device. This
clarification will be important in some upcoming
patches which will batch similar operations on a
blob (as part of splitting a user request into smaller
operations if it spans a cluster boundary).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic574804c15f769ee80c1b6f68ca3b77ec910f0f6
Reviewed-on: https://review.gerrithub.io/395017
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I35779dc547e0c084086ec6d9bf44f86850cb7f05
Reviewed-on: https://review.gerrithub.io/393780
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iadb29a8ce8dcebfea68d4feeb5f3de1bb3124f16
Reviewed-on: https://review.gerrithub.io/392286
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ib3470fbac49e92308ed14e20ccde6655354f2580
Reviewed-on: https://review.gerrithub.io/389577
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch provides logic for returning errors instead of
asserting when the size is larger than the blobstore size.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I16d12338e2b682c39bd33d507d57ea126501a0d7
Reviewed-on: https://review.gerrithub.io/392749
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Recovery code did not claim clusters taken by metadata.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If6726eddd22f4e1a3f9814b2348243155fb0fdb9
Reviewed-on: https://review.gerrithub.io/394173
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This only adds the option and metadata flags.
Actual functionality will be added in an upcoming commit.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I66015f48f34d4c7c64fce1831ebaed134098407c
Reviewed-on: https://review.gerrithub.io/390196
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch fixes an issue where the blobstore didn't serialize flags
when there was also at least one extent or xattr.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I85d5031dc45df510cebe1acf4694ab62bca2e720
Reviewed-on: https://review.gerrithub.io/393770
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We need to make the number of channel operations configurable for blob.
Reason: in the iSCSI tests, if there is only one CPU core there will be only
one channel, so read stress tests would
fail because we need more operations for the blob channel.
Select a value equal to the small buffer count (8192) of the
bdev layer, which solves the iSCSI read issue
correctly. Since bdev reads currently only
allow 8192 active bdev I/O requests, this solution should
work.
PS: The current solution is still not perfect. I think the more
precise fix is to restrict sending I/Os
to the blob when there are no channel operations left. With the
current code we have I/O retries in bdev, but it still fails
the iSCSI high-pressure test.
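A minimal sketch of how an application would select the larger value at init time, assuming spdk_bs_opts exposes a max_channel_ops field as described (one-argument spdk_bs_opts_init() of this era):

    #include "spdk/blob.h"

    static void
    init_bs_with_more_channel_ops(struct spdk_bs_dev *dev,
                                  spdk_bs_op_with_handle_complete cb_fn, void *cb_arg)
    {
        struct spdk_bs_opts opts;

        spdk_bs_opts_init(&opts);
        opts.max_channel_ops = 8192;    /* match the bdev layer's small buffer count */
        spdk_bs_init(dev, &opts, cb_fn, cb_arg);
    }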
Change-Id: I211f7a89d144af2c96ad4cc1bd7ac8e94adc72e7
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/393115
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ibffb43e39b44e5f443d3dfbfa5b5d7dcac3243ef
Reviewed-on: https://review.gerrithub.io/391182
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: Ic2c23d16360b26359c2a32920b89f2f3a21a2a9a
Reviewed-on: https://review.gerrithub.io/391191
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This can be used for two purposes:
1) more quickly iterate the blob list, avoiding
metadata pages that are valid but not the first
page in the blob's metadata list
2) close races between delete and open operations -
now we can clear the bit in the blobid bit array
when the delete operation is in progress, ensuring
no one else can try to open the blob
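A minimal sketch of how the bit array closes the race (array ownership and the page-index mapping are simplified; spdk_bit_array_* are the helpers from include/spdk/bit_array.h):

    #include <stdbool.h>
    #include "spdk/bit_array.h"

    /* Open path: the blobid is only valid if its bit is still set. */
    static bool
    blobid_is_openable(struct spdk_bit_array *used_blobids, uint32_t page_idx)
    {
        return spdk_bit_array_get(used_blobids, page_idx);
    }

    /* Delete path: clear the bit first, so concurrent opens see the id as gone. */
    static void
    blobid_mark_deleting(struct spdk_bit_array *used_blobids, uint32_t page_idx)
    {
        spdk_bit_array_clear(used_blobids, page_idx);
    }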
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3904648fd6fa656cb98c9e17ea763ed5a84ef537
Reviewed-on: https://review.gerrithub.io/391695
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prepares for a future change where we need to use the
recovery path when loading pre-v3 on-disk formats, since the
older disk formats do not save a blobid mask.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia94d56450202f81373c3de94237eca2dfd96526c
Reviewed-on: https://review.gerrithub.io/391694
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This eliminates a bunch of code duplication. This also
fixes a couple of places where the ctx->bs was not being
freed in the load fail path.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie6b0a4a653b5c80edf14086801b75457852a4736
Reviewed-on: https://review.gerrithub.io/391693
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iba33c55f129c60fad2d58f5254dec5c54ed56805
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Reviewed-on: https://review.gerrithub.io/388217
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This ensures we do not end up with a racing close v.
delete. If we decrement the ref up front, we could
start the close process (which may include persisting
metadata) and then also allow a delete operation to
start. It is safer to wait until the close operation
is done before decrementing the ref count, because that
eliminates this race condition (the delete op
would immediately fail).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad7fd8320d2c9b56f3c4fce054bcb6271e19ad38
Reviewed-on: https://review.gerrithub.io/391493
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This ensures all blob-loading functionality goes through
the single spdk_bs_open_blob function which will simplify
some upcoming changes around managing global metadata
state from multiple threads.
This will also help prevent races where a delete operation
has started followed by an open on the blob that is
being deleted. Those specific changes will be in an
upcoming patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I17e995145ab23068b816b44c33483b0708f5f563
Reviewed-on: https://review.gerrithub.io/391486
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Found during unit testing for blobid_mask coming in a future
patch - the unit test will be added as part of that future
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iecdde6ba16c5af9caf59214f328ddc22aae71e94
Reviewed-on: https://review.gerrithub.io/391692
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Similar to the previous change, the ** paradigm is a bit
problematic for asynchronous routines that could fail.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife7748280482356c4c51a796817b71cd7bc7e479
Reviewed-on: https://review.gerrithub.io/391483
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Using the ** paradigm is a bit problematic for asynchronous
routines that could fail. Currently we were inconsistent in
that some error paths would zero the pointer while others
did not. So make this just a plain pointer, which simplifies
the API and its implementation.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I67147931c6e8350896a4505022a6a314655de3d3
Reviewed-on: https://review.gerrithub.io/391482
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We do not need to separately call _spdk_blob_lookup() in
_spdk_bs_iter_cpl() - spdk_bs_open_blob() does this
already. This minimizes the number of entry points to
_spdk_blob_lookup() which will be important with some
upcoming changes around multiple threads performing
metadata operations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I032cd55c862267298cbe1674dd13d7a83eef7c0f
Reviewed-on: https://review.gerrithub.io/391474
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Remove the metadata channel, and instead use the same
channel for metadata and data operations on the metadata
thread.
This prepares for future changes which will allow
for metadata operations on any thread - not just the
thread where spdk_bs_load() or spdk_bs_init() was
called.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6b11a58fcb237a9a7603841d118b3729d83c6c98
Reviewed-on: https://review.gerrithub.io/391311
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
As part of clarifying the API and preparing for some
future changes, rename the following functions:
spdk_bs_md_create_blob => spdk_bs_create_blob
spdk_bs_md_open_blob => spdk_bs_open_blob
spdk_bs_md_delete_blob => spdk_bs_delete_blob
spdk_bs_md_iter_first => spdk_bs_iter_first
spdk_bs_md_iter_next => spdk_bs_iter_next
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I86bf792717b68379484a6108396bb891fe1c221e
Reviewed-on: https://review.gerrithub.io/391031
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This function was a nop and is not needed.
lvol was calling this function when an lvol bdev
gets a FLUSH I/O, but that is not needed either. So
lvol will now report it does not support flush.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I92df83243f7ebce81c69040a8874891dc2ffc961
Reviewed-on: https://review.gerrithub.io/391023
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The implementations in blobstore.c still remain for now, but
those will be removed after some upcoming changes which will
eliminate a global md thread and instead allow caller to
specify an I/O channel for each blobstore level API call.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idf800a7f061ffc9c42488951262e28e660871356
Reviewed-on: https://review.gerrithub.io/391020
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Moving forward, the spdk_blob returned to users will
actually be an I/O channel - not the blob structure
itself. So rename the existing spdk_blob to spdk_blob_data.
spdk_blob_data will continue to contain global state for
the blob. In the future spdk_blob will point to an
I/O channel for the blob - for now it effectively still
points to the spdk_blob_data, but by changing the
structure names here it will reduce the code churn in
future patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7d0cbc0553f68f96c24173c833091a80d058eb89
Reviewed-on: https://review.gerrithub.io/390900
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This keeps the top level functions as simple pass-throughs
which will simplify some future commits.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6578a1440d0404f600c055a5e37f28468b633d6f
Reviewed-on: https://review.gerrithub.io/390899
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Disambiguate the log components from the trace functionality
(include/spdk/trace.h).
The internal spdk_trace_flag structure and related functions will be
renamed in a later commit - this is just a find and replace on
SPDK_TRACE_* and SPDK_LOG_REGISTER_TRACE_FLAG().
Change-Id: I617bd5a9fbe35ffb44ae6020b292658c094a0ad6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/376421
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The function just needs to zero out metadata so that the
blobstore is effectively destroyed. If the user wants
to unmap the rest of the disk after the blobstore is
destroyed, they are free to do so. On future initializations
of blobstores the code will do the unmapping, so performance
is not impacted.
While here, implement the zeroing using the new
write_zeroes functionality instead of allocating a buffer
full of zeroes.
Change-Id: I7f18be0fd5e13a48b171ab3f4d5f5e12876023bc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/390307
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Similar flags will be added at the blobstore level in a future
patch.
This allows backwards compatibility - i.e. it allows older blobstore
applications to open blobstores created by newer blobstore
applications with new features. Any blob using a new feature
should have an associated flag set in one of three new flag masks:
- invalid: if a bit is set in this mask that the application is not
aware of, do not allow the blob to be opened
- data_ro: if a bit is set in this mask that the application is not
aware of, allow the blob to be opened, but do not allow
write I/O nor any operation that changes metadata
- md_ro: if a bit is set in this mask that the application is not
aware of, allow the blob to be opened for performing any
kind of I/O, but do not allow any operation that changes
metadata
While here, bump SPDK_BS_VERSION to 3. We intend this to be the
last change made to SPDK_BS_VERSION - future versioning will be
done via blobstore or per-blob feature flags instead.
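A minimal sketch of the intended check, with hypothetical mask constants standing in for whatever bits the application actually understands:

    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define APP_KNOWN_INVALID_FLAGS 0x0ULL
    #define APP_KNOWN_DATA_RO_FLAGS 0x0ULL
    #define APP_KNOWN_MD_RO_FLAGS   0x0ULL

    static int
    check_blob_flags(uint64_t invalid_flags, uint64_t data_ro_flags, uint64_t md_ro_flags,
                     bool *data_read_only, bool *md_read_only)
    {
        if (invalid_flags & ~APP_KNOWN_INVALID_FLAGS) {
            return -EINVAL;     /* unknown invalid-mask bit: refuse to open */
        }
        *data_read_only = (data_ro_flags & ~APP_KNOWN_DATA_RO_FLAGS) != 0;
        *md_read_only = (md_ro_flags & ~APP_KNOWN_MD_RO_FLAGS) != 0;
        return 0;
    }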
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If059e38bfffbeec25c849a7629a81193b12302c4
Reviewed-on: https://review.gerrithub.io/388703
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently there are a bunch of asserts() on metadata
descriptors - change these to fail the blob parsing
instead.
While here also return -ENOMEM if any of the memory
allocations fail.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie37b73c57b304d05a45d10a8d33bcc1d47e7a1be
Reviewed-on: https://review.gerrithub.io/388702
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
data_ro means that write, write_zeroes and unmap operations are not
allowed.
md_ro means that resize, set_xattr and remove_xattr are not
allowed.
There is no code yet that can activate this - it is coming in a future
patch. Two usages are planned though:
1) a user explicitly marks a blob as read-only - this is persisted so that
future loads of the blob will ensure the blob cannot be modified - neither
metadata nor data
2) a future feature flag framework (how's that for alliteration) may allow
a blob to be opened, but not allow metadata modifications, if there are
feature flags set in the blob's or blobstore's metadata that the
application does not understand
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I247fd900430c56f7176edfb80dddd5a1a6c8dc87
Reviewed-on: https://review.gerrithub.io/388663
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Currently, on blobstore creation we use write zeroes every time.
Some drives do not support write zeroes but do support unmap.
We should use write zeroes only on the metadata and try to unmap
the data clusters.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iae36c1ccacc08340e79ad40c4c9a2c53dda920ba
Reviewed-on: https://review.gerrithub.io/387152
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If89275acfb1560982e332148a99ed3c83f8cb34f
Reviewed-on: https://review.gerrithub.io/387609
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
There are cases, such as running out of memory, where creating
the channel can fail. Handle this properly.
Change-Id: I5d13d18037e6aa8f057769b1ef345f45597b22af
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/386016
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
At the moment there is no way for a user of the blobstore API to know
how many clusters are available. total_clusters describes the
number of clusters for both metadata and user data.
A new field, total_data_clusters, is added to keep the number of clusters
that can be used to create blobs - meaning just user data.
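A minimal usage sketch, assuming the accessor spdk_bs_total_data_cluster_count() added here alongside the existing spdk_bs_free_cluster_count():

    #include <inttypes.h>
    #include <stdio.h>
    #include "spdk/blob.h"

    static void
    report_capacity(struct spdk_blob_store *bs)
    {
        printf("data clusters: %" PRIu64 " total, %" PRIu64 " free\n",
               spdk_bs_total_data_cluster_count(bs),
               spdk_bs_free_cluster_count(bs));
    }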
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I60555217644557410844f74628375a6b46fd2ac7
Reviewed-on: https://review.gerrithub.io/385633
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I59cbef4ce1bfe8af113c66c2c9cb9f208440c0aa
Reviewed-on: https://review.gerrithub.io/383887
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It is possible that a user will call spdk_bs_unload() while the blob
list is not empty. Instead of just asserting that, the call now fails
with an appropriate error.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I83818453d6c90ff9b5bf657c90e12b2f9d5ca013
Reviewed-on: https://review.gerrithub.io/383220
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Unmap does not guarantee that erased blocks will return all zeroes.
Using write_zeroes when clearing metadata gives the
desired behavior for a blob.
Only metadata pages will be cleared with write_zeroes in this patch;
blob data clusters will still call unmap. This behavior may be made
configurable in a later patch (to allow the user to request zeroing of
clusters rather than just unmapping).
Change-Id: I1b210abac110867ce703bcfdeb634eb45aa9d5c9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/372004
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The currently exposed API allows loading/unloading and
initializing a blobstore on a device.
A spdk_bs_destroy() call is added in order to reach
functional parity with spdk_bs_init(). Previously it was not
possible to remove a blobstore from a device from within
SPDK.
spdk_bs_destroy() takes a blobstore pointer as its argument
(instead of a bs_dev), because the blobstore has to already be
loaded in order to destroy it.
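A minimal usage sketch - the blobstore must already be loaded, and the bs pointer is no longer valid once the completion callback runs:

    #include "spdk/blob.h"

    static void
    destroy_cb(void *cb_arg, int bserrno)
    {
        if (bserrno != 0) {
            /* destroy failed; the blobstore may still be intact */
        }
    }

    static void
    destroy_loaded_bs(struct spdk_blob_store *bs)
    {
        spdk_bs_destroy(bs, destroy_cb, NULL);
    }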
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I2c493a4407868fcf08fd1766a19fc8463f634ef5
Reviewed-on: https://review.gerrithub.io/382019
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
We only sync the metadata and data while the blobstore is running, which
means we only update the used md bitmap and used clusters bitmap in memory.
If the system crashes, we have no chance to sync the used md bitmap and
used clusters bitmap to disk, so the next time we try to load the
blobstore, all the data would be lost. This patch adds the logic to recover the
valid data from the last dirty shutdown. We go through all the metadata pages
to find all valid data and rebuild them.
Change-Id: Ieb7c5f932206b1b68fdde0cee35f2d2cb3a4f309
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/376470
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A recent patch did some refactoring on how the superblock is
written out - it introduced a bug where on load we would
write out INVALID for super_blob id and a null bstype when
clearing the clean bit.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1d6256e35030645b3e8fda83bfe0f74aeb635733
Reviewed-on: https://review.gerrithub.io/383129
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Introducing bstype as a way to identify and verify
blobstore type.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I50267b5408625be10fe0c146ae329016d5509b4a
Reviewed-on: https://review.gerrithub.io/380476
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch reduces duplicate data structures and makes some functions common
to both the bs load and unload processes in the future.
Change-Id: I40b2135e89a705aa5073c1ded4c7b28be4b32f6e
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/381912
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Fixes github issue #29.
Because of how we handle the blobid and pagenum in blobstore,
it was possible to have blobstore inadvertently open the wrong
blob if open is provided a blobid where the lower 32 bits match
an existing blob but the upper 32 are clear.
Patch does the following:
- removes the assert() that caught this on MD load and replaces it with
an error, given that this condition can be induced via the API
- cleans up the pagenum and blobid conversion/handling to make it
clearer how they're related and converted
- adds new UTs that would have failed w/o the new check in place
Change-Id: I2b49b237922b3b8cfc4df296f5bc20195e41dc41
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/380872
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Three checks are added to options passed to spdk_bs_init,
with appropriate errors returned:
- none of the options may be set to 0
- the device size has to be bigger than the cluster size
- the pages reserved for metadata must not exceed the total number of clusters
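A sketch of the shape of these checks (field names from spdk_bs_opts and spdk_bs_dev; the exact error codes and the clusters-vs-md-pages comparison are simplified):

    #include <errno.h>
    #include "spdk/blob.h"

    static int
    bs_opts_verify(struct spdk_bs_dev *dev, struct spdk_bs_opts *opts)
    {
        uint64_t dev_size = dev->blockcnt * dev->blocklen;

        if (opts->cluster_sz == 0 || opts->num_md_pages == 0 || opts->max_md_ops == 0) {
            return -EINVAL;
        }
        if (dev_size < opts->cluster_sz) {
            return -ENOSPC;
        }
        if (opts->num_md_pages > dev_size / opts->cluster_sz) {
            /* more md pages reserved than there are clusters */
            return -ENOSPC;
        }
        return 0;
    }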
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Idee3c194b653e737ec7c7a768f1973ff72452c5b
Reviewed-on: https://review.gerrithub.io/379676
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Now bs_dev is destroyed only in two instances:
- within spdk_bs_init() on failure path
- vbdev_lvs_create() if spdk_lvs_init() errors out,
before even calling spdk_bs_init()
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I7b8af39fbe83907b0c47797f0f55ca3b941729d9
Reviewed-on: https://review.gerrithub.io/379848
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Fixes a condition where blobstore was prematurely calling the
application callback on spdk_bs_unload(); if the application
tried to do something too quickly, bad things happened.
To avoid application changes with how the g_devlist_mutex is
held, it is no longer held while calling
_spdk_io_device_attempt_free() because the app unload CB is
called from that function and may want to call
spdk_io_device_unregister() from its unload CB. So the lock
is now held and released strictly around the list it is
protecting, which allows the CB from _spdk_io_device_attempt_free()
to be called without issue.
Change-Id: Ib451cfe6b33ea0c3f9e66c86785316f9d88837c7
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/377872
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The io_device and its channels are created after _spdk_bs_alloc() finishes.
Until they are, only free() is required on the allocated bs structure.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie00126cdaa2bb5cd77cad2dec89d670734367b49
Reviewed-on: https://review.gerrithub.io/379675
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch adds CRC support for metadata pages. We will also
add CRCs for the super block, used md, and used clusters bitmask
pages in the following patches.
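A minimal sketch of the per-page CRC, assuming the crc field is the trailing 4 bytes of the on-disk page and using SPDK's crc32c helper (the seed value is an assumption):

    #include <stddef.h>
    #include <stdint.h>
    #include "spdk/crc32.h"

    /* Checksum everything in the page except the trailing crc field itself. */
    static uint32_t
    md_page_calc_crc(const void *page, size_t page_size)
    {
        return spdk_crc32c_update(page, page_size - sizeof(uint32_t), 0);
    }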
Change-Id: Ie36fcc16b39296d06721f1f8eb5689260194c558
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/377901
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The system could crash at any time, so we need to sync the "clean" flag
to disk as soon as we load the blobstore. Then if the system crashes,
we will find that the blobstore was not shut down cleanly the next time we load
the blobstore, and we can run the recovery process then.
Change-Id: I6189678e970ffe979a224e02be6cede0ee44dde8
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/376276
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This matches the name to the behavior and prepares for addition of a new
log macro for "info" log level.
Change-Id: I94ccd49face4309d3368e399528776ab140748c4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/375833
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch delays destruction of bs_dev till after md_target io_device
is unregistered. Otherwise bs_dev would no longer exist when destroying
attached channels.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6e526e3f65f7f5bca0617888be06a5296422f8e0
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-on: https://review.gerrithub.io/371885
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Most of the work here revolves around having to split
an I/O that spans a cluster boundary. In this case
we need to allocate a separate iov array, and then
issue each sub-I/O serially, copying the relevant
subset of the original iov array.
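A sketch of the iov-subsetting step (hypothetical helper; buffer accounting only, no I/O is issued here):

    #include <stdint.h>
    #include <sys/uio.h>
    #include "spdk/util.h"

    /* Copy the portion of 'src' that covers [byte_offset, byte_offset + len)
     * into 'dst', so a sub-I/O that stays inside one cluster can be issued
     * with its own iov list. Returns the number of dst entries filled. */
    static int
    build_sub_iov(struct iovec *dst, int dst_max,
                  const struct iovec *src, int src_cnt,
                  uint64_t byte_offset, uint64_t len)
    {
        int i, out = 0;

        for (i = 0; i < src_cnt && len > 0 && out < dst_max; i++) {
            uint64_t iov_len = src[i].iov_len;

            if (byte_offset >= iov_len) {
                byte_offset -= iov_len;
                continue;
            }
            dst[out].iov_base = (uint8_t *)src[i].iov_base + byte_offset;
            dst[out].iov_len = spdk_min(iov_len - byte_offset, len);
            len -= dst[out].iov_len;
            byte_offset = 0;
            out++;
        }
        return out;
    }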
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0d46b3f832245900d109ee6c78cc6d49cf96428b
Reviewed-on: https://review.gerrithub.io/374880
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We use the size of a md page struct in a lot of places, use a #define
instead.
Change-Id: I522897c883bfc8b241c6da9b726d92f58faedd63
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/375040
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Fail spdk_bs_init() if the dev being used has an
LBA size that is larger than a metadata page or that does not
evenly divide the size of a metadata page.
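A sketch of the check, assuming SPDK_BS_PAGE_SIZE is the metadata page size exposed by spdk/blob.h:

    #include <errno.h>
    #include "spdk/blob.h"

    static int
    bs_check_dev_block_size(struct spdk_bs_dev *dev)
    {
        if (dev->blocklen > SPDK_BS_PAGE_SIZE ||
            SPDK_BS_PAGE_SIZE % dev->blocklen != 0) {
            return -EINVAL;
        }
        return 0;
    }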
Change-Id: I0e0ca747ecd5b6039c20fb6a885382bde4527158
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/374182
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The completion routine for reading the superblock was not updating
the bs struct with the super blob id, so it would have
failed to be persistent from the application's perspective.
Change-Id: I4aa51ebe73315e9be7e08f82340b03f0e3836df7
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/373406
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Patch afe860ae deferred freeing the io_device. However, for nvme, the
io_device context (spdk_nvme_ctrlr) is still being destructed before
io_channels are destroyed, causing segfaults on hotremove.
This patch defers io_device context destruction and fixes nvme
hotremove.
Fixes: afe860aeb1 ("channel: Correctly defer unregisters if channels exist")
Fixes: 5533c3d208 ("util: defer put_io_channel")
Change-Id: I7af699174cac0c6c6a6faa2cc65418c47347eb9a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/370459
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change-Id: I1439b471b101b390fcbef558039f2a543f465acd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/367121
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
- rename spdk_malloc_socket to spdk_dma_malloc_socket
- rename spdk_malloc to spdk_dma_malloc
- rename spdk_zmalloc to spdk_dma_zmalloc
- rename spdk_realloc to spdk_dma_realloc
- rename spdk_free to spdk_dma_free
Change-Id: I52a11b7a4243281f9c56f503e826fd7c4a1fd883
Signed-off-by: John Meneghini <johnm@netapp.com>
Reviewed-on: https://review.gerrithub.io/362604
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Previously, each freed cluster was unmapped individually.
Instead, coalesce unmaps for contiguous clusters to reduce
the volume of commands sent.
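A sketch of the coalescing idea, with issue_unmap() standing in for the batched unmap call on the underlying bs_dev (the input is assumed to be sorted by cluster number):

    #include <stddef.h>
    #include <stdint.h>

    /* Emit one unmap per contiguous run of clusters instead of one per cluster. */
    static void
    unmap_clusters(const uint64_t *clusters, size_t count,
                   void (*issue_unmap)(uint64_t start_cluster, uint64_t num_clusters))
    {
        size_t i;
        uint64_t run_start, run_len;

        if (count == 0) {
            return;
        }
        run_start = clusters[0];
        run_len = 1;
        for (i = 1; i < count; i++) {
            if (clusters[i] == run_start + run_len) {
                run_len++;
            } else {
                issue_unmap(run_start, run_len);
                run_start = clusters[i];
                run_len = 1;
            }
        }
        issue_unmap(run_start, run_len);
    }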
Change-Id: I6ea1d2e1235e3c030cd2826c97e57aca571bd2ae
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362773
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This wasn't used anywhere and we currently believe there
are superior software-only techniques for controlling
quality of service.
Change-Id: Icdadd5870ed0629b338c307d2619bbc242c3e7a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362065
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is no longer used anywhere. For the places where we previously
used it, we've since found alternate solutions that do not
require it.
Change-Id: I738a80b95ef50348ce1c14969a3812b0a625b3fd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/362064
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows us to allocate different size channels and
not require the unique flag.
Change-Id: I4b1ffd244b60e9e9ab06f9ab4da8161ab57e1169
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/361668
Tested-by: <sys_sgsw@intel.com>
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The channel memory isn't allocated by these
libraries, so they can't free it.
Change-Id: I30909fa4e77bc5a41b45230f04ba5fe75b172dbf
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Separate the maximum metadata operations from the
maximum channel operations.
Change-Id: I1bbd440ab094a2a2e19c9a5b71724ac91ba88e42
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This will need to be configured globally for all channels.
Change-Id: I773252f220373617f8d09d1f24243db8095cf8a4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Fix up all existing spacing errors in comments and add an automated
check for patterns like /*comment*/.
Change-Id: I28f61c93612dc0f8aed66bd509da78e91ea9737e
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The descriptor type must be 0 to break out of the loop,
so we need to initialize this.
Change-Id: I5fdb24dcfece01332c487364d5694c4fb8412e1b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Or rather, at least assert if the allocation fails.
This is not a recoverable error in general.
Change-Id: I9bc325066e829fc311ce84ce83536e9933ac5473
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is the initial commit for the "blobstore", a lightweight,
highly parallel, persistent, power-fail safe block allocator.
Documentation will be added in future patches.
Change-Id: I20a4daf899f1215d396f7931c3ec9a2e2bb269d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>