Commit Graph

5281 Commits

Author SHA1 Message Date
Changpeng Liu
601fbbf969 scsi: add persistent reservation out with reserve service action support
An application client creates a persistent reservation by issuing a
PERSISTENT RESERVE OUT command with RESERVE service action through
a registered I_T nexus with RESERVATION KEY and TYPE, and persistent
reservation has a scope of LUN, only one persistent reservation is
allowed at a time per LUN.

Change-Id: I052d0ddd439dd9994c0793218dd5c5d6ed176d2c
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436089
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-24 23:04:03 +00:00
Changpeng Liu
d0d19eb82e scsi: add persistent reservation out with register feature support
To establish a persistent reservation the application client shall first
register an I_T nexus with the device server. An application client registers
with a logical unit by issuing a PERSISTENT RESERVE OUT command with REGISTER
service action or REGISTER AND IGNORE EXISTING KEY service action.

Specify Initiator Ports (SPEC_I_PT) bit and All Target Ports (ALL_TG_PT) bit
are not supported for now, the registrants belong to the I_T nexus who sends
the command.

Also Activate Persist Through Power Loss (APTPL) bit will be supported in
following patches.

Change-Id: If057a764a4bfa73017f98048c94b697dbfa4b4a1
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/436088
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-24 23:04:03 +00:00
Jim Harris
59a1fbe937 iscsi: fix hang on high QD large reads
We throttle the number of data_in operations per
connection.  Currently after a read is completed,
we try to send more data_in operations since one
has just been completed.

But we are trying to send more too early.  The data_in_cnt
doesn't actually get decremented until after the PDU is
written on the socket.  So this results in a case
where data_in_cnt == 64, and all 64 read operations
complete before any of those 64 are actually transmitted
onto the TCP socket.  There are no more read operations
waiting, so we won't try to handle the data_in list
again, and if none of these 64 resulted in a SCSI
command completing, then the initiator may not send us
any more read I/O which would have also kicked the data_in
list.

So the solution is to kick the data_in list after the
PDU has been written - not after a read I/O is completed
back from the SCSI layer.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia01cf96e8eb6e08ddcaaeff449386e78de7c5bc5

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455454
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-24 04:19:03 +00:00
Jim Harris
462c12d9e9 iscsi: free pdus in hot remove path with no lun
The datain handling code path will set the LUN to
NULL if it finds a task's LUN has been hotremoved.
This could happen before the iscsi hotplug routine
actually gets a chance to run.  If this happens,
one of these tasks doesn't actually get freed, and
then will be freed after the lun is closed -
causing a segfault in the bdev layer since it may
have a bdev_io associated with it.

Found by running the iscsi_tgt/fio test after
applying the next patch in this series.

There's more work needed in this hot remove clean up
path - currently we are just freeing a lot of PDUs
rather than completing them with error status when
a LUN is hot removed.  But let's tackle that
separately.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8d27f0c7a79ae91cb6504e5ff6ffc8e346c9e54c

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455460
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-24 04:19:03 +00:00
Tianyu yang
151a357ca3 rte_vhost: fix deadlock on rte_vhost_driver_unregister()
when qemu connect to vhost, but don't send msg to vhost. We use
kill -15 to destroy vhost process. it will lead to deadlock.
(A)
* rte_vhost_driver_unregister()
 * pthread_mutex_lock hold vhost_user.mutex (1)
 * wait TAILQ_FIRST(&vsocket->conn_list) is NULL
(B)
* fdset_event_dispatch()
 * vhost_user_read_cb() start
   * vhost_user_msg_handler() start
   * dev->notify_ops is NULL because qemu just connect, no message recv.
   * vhost_driver_callback_get()
     * pthread_mutex_lock hold vhost_user.mutex (2)

(A) & (B) deadlock

To avoid this scenes, when qemu connect in vhost_new_device()
initialize dev->notify_ops

Change-Id: Iaf699da41dfa3088cfc0f09688b50fada6b2c8d6
Signed-off-by: Tianyu yang <yangtianyu2@huawei.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454832
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-24 00:53:42 +00:00
Vitaliy Mysak
4003ebf73c ocf: serialize base bdev names in OCF UUID
Use OCF per-volume UUID field to serialize bdev names.
This is going to be used during cache load to find out
  which bdevs we want to attach.

Note that there is in fact "user_metadata" buffor that is
  also stored in cache metadata, but we cannot use it
  because its size is limited to 64 bytes.
  UUID in OCF terms is not standarized
  and is meant to be used to store custom immutable data.

This change preparation for persistent metadata support.
Functionality is not changed. (everything works the same way)

Change-Id: Ia9204fae29106f5b816d93a6771425a223d6c028
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455414
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
1292ef24e7 ocf: stop cache immediately at shutdown
Change shutdown behavior such that now first bdev destruct() call
  stops OCF cache instance.
Previusly, cache was stopped when single vbdev was referencing it.

This patch is related to persistent metadata support.
Without this change, every time a SPDK application is stopped,
  only the last core is remembered in metadata,
  because vbdevs detach by 1 by 1, each time updating the metadata.

Change-Id: I57db3f77db525177c024ee85e660a85aff2f8c31
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455413
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
efe48d7ca9 ocf: fix vbdev_ocf_ctx_data_seek() function
Fix instance function for OCF ctx interface.
This function is used for persistent metadata only.

Change-Id: I9583ea8eb21f07a3e9072a9552bed1c077cb7114
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455412
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
cca2eab10a ocf: allow loading cache without setting cache mode
Make init_vbdev() accept NULL as cache_mode if loadq flag
  is set.
This is needed for load path because we don't know cache mode
  before the actual load (happens later).

This change is related to persistant metadata support.
This patch does not change current behavior because loadq flag
  is always false.

Change-Id: I42727eb841d87903c81bd5e6d51b6d4869ed9be3
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455411
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
2b1c1e7031 ocf: implement cache load path
Implement load path for OCF cache.
During load OCF will read metadata from cache disk and restore its state.

management_channel initialization is moved before cache load/attach part
  because ocf_mngt_cache_load() needs it to read metadata.

This patch is a preparation for persistant metadata support.
Load path is never used yet because loadq flag is always false.

Change-Id: Iff6c1c52eae7b9f52812a8bd3d5ae6d6facedd60
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455410
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
242564a82c ocf: add loadq flag to construct path
Add new loadq flag that indicates if vbdev
  should load cache instance from disk or start a new one.

This change is a preparation for persistent metadata support.
Functionality is not changed in this patch as the loadq flag is always false.

Change-Id: I1baac7f988e6eeadb4f365ba7bfab8019d55a753
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455409
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-23 19:01:38 +00:00
Piotr Pelplinski
fca402b0f2 ocf: add metadata callbacks
This patch adds callback for serializing spdk metadata
stored in ocf cache devices.

Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I856bac3d63c7172dd6c67361794159fa38812a78
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449300
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
40cac0ecad ocf: send flush during shutdown
Do management flush during OCF shutdown to write all dirty requests to cores.
Dirty requests are relevant to WriteBack mode only.

Change-Id: I778a73ed8ab5659921f192f638027d513c239814
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450683
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-23 19:01:38 +00:00
Piotr Pelplinski
a34f905a74 OCF: Update ocf submodule to version OCF v19.3.1
This patch updates submodule and appropiate functions.

Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I8cd7707058745cdc3e91f7d821bf215ff9287c86
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451463
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
80a2ff01f3 ocf: implement OCF cleaner
Implement cleaner that is used in WriteBack mode.

Cleaner is a background agent that does synchronization
 of data between cache and its cores.
 Cleaner usually runs every ~20 seconds to perform cleaning.
 The synchronization is a simmilar operation to OCF management flushes.
We need cleaner for WriteBack because only WriteBack mode
 produces dirty data that cleaner needs to deal with.
Cleaner requires adopting trylock() because in current version
 cleaner uses management lock when performs cleaning,
 which may lead to deadlocks if cleaner runs on
 the same thread as management operations.

WriteBack mode is fully functional after this change,
  but persistent metadata support is required to use it for production.

Cleaner will run on management thread for now.
  We plan to implement functionality of
  chosing a CPU core where cleaner should run.
Cleaning policy is not configurable yet.
  The default is ALRU with 20 sec interval.

Change-Id: I35aa7e00c44e0d7a77e64e60df1f66f20be03f55
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-23 19:01:38 +00:00
Vitaliy Mysak
eb58ad5379 ocf: create management channel
Create new management channel and handle sharing between vbdevs.
This channel is going to be used for management operations
 that produce IOs (such as flush) and also for cleaner.

Change-Id: Ieeed8454a7ab7459c86ac06ec6c0ece038bc928e
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455272
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-23 19:01:38 +00:00
Darek Stojaczyk
60d487877a util/cpuset: use dst & src parameter names
Declaractions for those functions already have dst
and src as param names, but definitions use set1 and
set2, which doesn't really tell which one is which.

My IDE actually gives me tooltips with param names
from definitions and it would be much easier to see
"dst" and "src" in there.

Change-Id: I45013cf27fc95f69cb2c776fb1b6701c5e60e373
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455163
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-23 07:00:34 +00:00
Vitaliy Mysak
54eeeac6e0 ocf: implement mngt_queue for asynchronous management
Implement management queue for asynchronous management that is provided
  in new OCF API. This is a neccessery step to be able to implement
  metadata support.

OCF submodule was updated to get management queue functionality
It has to be done in this patch because OCF module will not work
otherwise.

Change-Id: Id19c2e5bd6a5d26fee41752b62720e408dc082e8
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448619
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-22 17:57:31 +00:00
Vitaliy Mysak
244d6e3daa ocf: lock queue list during create/delete
OCF queue list needs to be managed synchronously.
This patch uses our own mutex to achieve that because we cannot rely on
  ocf_mngt_cache_lock() as it may produce deadlocks when using
  cleaner.

Alternative way would be to use trylock, but
  we need to register a poller for it and do locking concurently
  which doesn't seem to be possible in callbacks of io channel.

We agreed with OCF team that we are going to change this part
  when OCF will deliver safe ocf_mngt_cache_lock() function.

This patch fixes a very rare failure on our CI that looked like this:
```
04:33:02 vbdev_ocf.c: 134:stop_vbdev: *NOTICE*: Not stopping cache instance 'Malloc0' because it is referenced by other OCF bdev
04:33:03 MalCache1: Core core2 successfully removed
04:33:03 MalCache1: Stopping cache
04:33:03 MalCache1: Done saving cache state!
04:33:03 src/ocf/mngt/ocf_mngt_cache.c:1576:2: runtime error: pointer index expression with base 0x000000000000 overflowed to 0xffffffffffffffa8
04:33:03     #0 0x7f3c52d54c26 in _ocf_mngt_cache_stop src/ocf/mngt/ocf_mngt_cache.c:1576
04:33:03     #1 0x7f3c52d5579f in ocf_mngt_cache_stop src/ocf/mngt/ocf_mngt_cache.c:1657
04:33:03     #2 0x7f3c52cbe9f4 in stop_vbdev /var/jenkins/workspace/NVMe_tests/nvme_phy_autotest/spdk/lib/bdev/ocf/vbdev_ocf.c:147
04:33:03     #3 0x7f3c52cbf4a0 in vbdev_ocf_destruct /var/jenkins/workspace/NVMe_tests/nvme_phy_autotest/spdk/lib/bdev/ocf/vbdev_ocf.c:216
```

Change-Id: Id6fafb444958f3becdc480e44762074c6c081e1f
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450682
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-22 17:57:31 +00:00
Vitaliy Mysak
4e7fb25066 ocf: add shared cache instance context
Introduce cache context structure that is going to be used
  for sharing per cache info. For example: management queue,
  cleaner_channel, cleaner_thread.

Lifetime of this structure ends with last vbdev that gets unregistered
  and NOT when cache stops because cache does not have to be freed
  there.

Change-Id: I66252084d7efda92edd10fb737c8e9c8169b4f6d
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451403
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-22 17:57:31 +00:00
Vitaliy Mysak
f51d3b6dd7 ocf: adapt new asynchronous OCF API
Adapt to new async API which was changed recently
This is a neccessery step to be able to implement persistent metadata support

OCF submodule was updated to get new API

Change-Id: I6bf9941ab0557981235c5be27686594a1b8ac3a0
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448397
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-05-22 17:57:31 +00:00
Vitaliy Mysak
cac0e9d01c ocf: move register_finish() function up
Move last step of register chain to the top because it is arguably more readable.
This should not be done in previus patch because it spoils the gerrithub diff.

Change-Id: If51642a32c5cf2a757ca20485d2347410653ca2a
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-05-22 17:57:31 +00:00
Vitaliy Mysak
9000deb734 ocf: use trylock in vbdev_ocf.c
Use new trylock API to prevent lockups when using OCF cleaner.
Starting and stoping OCF bdev becomes asynchronous with this patch.

Using trylock means that we have to poll function that does locking each
  time we want to initiate some operation on cache instance.
  This is the reason why management functions become asynchronous.
Each management operation now has a _poll() operation associated with it.
  _poll() uses trylock and continues when cache is locked.

Change-Id: I83b9fbe87c27433e178583411b87a68b8efaf58e
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447888
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-05-22 17:57:31 +00:00
Vitaliy Mysak
c3d044cbd7 ocf: use vbdev_ocf_mngt_ interface in register path
This patch is a preparation for adopting ocf_cache_trylock() function.
Register OCF bdev path uses vbdev_ocf_mngt_ interface now
Register stays synchronous blocking operation in this patch,
 but with adoption of ocf_cache_trylock() function,
 register will be asynchronous, in which case we
 want to use mngt_ interface.

This series of OCF patches will enable WriteBack support for OCF bdev.
There is a lot of preparation before we will be able
 to actually implement WriteBack.
Dependencies look like this:
- WriteBack
  - Cleaner
    - trylock
  - Persistent metadata
    - asynchronous management API

Cleaner is a background agent that does synchronization
 of data between cache and its cores.
 Cleaner usually runs every ~30 seconds to perform cleaning.
 The synchronization is a simmilar operation to OCF management flushes.
We need cleaner for WriteBack because only WriteBack mode
 produces dirty data that cleaner needs to deal with.
Cleaner requires adopting trylock() because in current version
 cleaner uses management lock when performs cleaning,
 which may lead to deadlocks if cleaner runs on
 the same thread as management operations.

Peristent metadata is functionality of OCF
 that allows to restore cache state after shutdown
We need persistent metadata in context of shutdown
 with some data being dirty which may only happen
 when using WriteBack mode
Support for persistent metadata requieres
 asynchronous OCF API because in current version
 we will have a deadlock during metadata initialization
 because it is required to be done on different thread
 than cache initialization.

Change-Id: I45d84a5fa0c96581d522050dad186b06d489226e
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455225
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-22 17:57:31 +00:00
Konrad Sztyber
bdbd32b4c7 lib/ftl: minor ftl_rwb_fill cleanup
Change-Id: I69ee623d9e5646b7a793313dac0fb82ef04035f3
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453994
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-22 16:17:23 +00:00
Konrad Sztyber
36ff0eeb88 lib/ftl: mirror writes to non-volatile cache
Apart from writing the data to OCSSD, mirror the latest two bands of
data on the persistent write buffer cache. Currently the data is only
sent there, further patches will add metadata support, shutdown recovery
and L2P updates.

Change-Id: Ief05d0c23fa0e25bd6085e0ce3e1528d6736d174
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450266
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-22 16:17:23 +00:00
Konrad Sztyber
aba66c0644 lib/ftl: added ftl_io_shrink_iovec
This function can be used to reduce the number of iovecs on exisiting IO
structure. It's useful when only the maximum number of iovecs is known
at the ftl_io allocation time.

Change-Id: Ief06634446c65889401f5e0b07b2d1ce7912af34
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450630
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
2019-05-22 16:17:23 +00:00
Konrad Sztyber
c139a7d9ba lib/ftl: persistent cache initialization
Change-Id: I608d420bd115a70d60ee6995220dbfafb7a00731
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449242
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-22 16:17:23 +00:00
Tianyu yang
a8b8badb3c bdev/aio: An error code can be returned via the rpc interface.
When we create AIO using the RPC interface, we can use the error
message to return to determine what exceptions have occurred.

Change-Id: I537f1a7d1053dcc27960de004fd0b4aea49919e9
Signed-off-by: Tianyu yang <yangtianyu2@huawei.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-22 16:12:40 +00:00
paul luse
3d6305b9fd bdev/compress: fix issues with multi-thread
Submission logic was incorrect and completion logic was not
present yet.  This passes now with multi-thread bdevio test.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia8ed1c123be511240d93503a2c5e501ccad445bf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455018
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 15:36:48 +00:00
paul luse
4397e38782 bdev/compress: add UT for the poller
Adding these tests identified a small fix in the code under test
that is also included here.  The compressdev 'produced' field
is unsigned and reduce is expecting a negative errno in the
callback.

Change-Id: I28ab11ee3ef54768a9d6ccd26282cf7dd022be43
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454806
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 15:36:48 +00:00
paul luse
3cf1cd0b14 bdev/compress: use comp bdev assigned queue pair in poller
Previously hardcoded to 0 for iniital dev/test.

Change-Id: If3b98b0083ff52366eb4deaadb5ffb6a277f904c
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454681
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 15:36:48 +00:00
paul luse
96fe0fb199 bdev/compress: updates to the poller function
Two quick updates, support dequeueing more than 1 IO at a time which
is now more likely given that we can queue up compression operations
(others can come in qhile one is on the queue).

Dropped the max inflight value as it was in there as a palceholder
and now it makes sense to give it a sane value. Also updated the value
passed to reduce to signal an error.

UT for the poller will come in the next patch...

Change-Id: I14d6aeb98aa1e193c498e1549e37eca0e4c56c31
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454680
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 15:36:48 +00:00
Jim Harris
37184dd471 nvme: add nvme_free_request() variant that takes qpair
This avoids dereferencing the request to get the qpair
in cases where we already know the qpair.  Adding a new
variant instead of just modifying nvme_free_request()
since there are 72 calls to this function and I don't
want to change all of them.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifd6fd964e546bcd71ff180fd71d5bf5cbab79d4f

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455287
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 14:51:01 +00:00
Jim Harris
ef1f844395 nvme: add qpair parameter to nvme_complete_request
In some cases we have the qpair already when calling
this function.  So pass the qpair to avoid having
to get it from the request.  This shows about a 3%
performance improvement for high IOPs single core
tests.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I22fcca560492f4e7cf5ffedd252e41a027d0dd79

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455286
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 14:51:01 +00:00
Jim Harris
bf647c168a nvmf: increase default max num qps to 128
This matches the Linux kernel target.  Users can
still decrease this default when creating the
transport (i.e. -p option for nvmf_create_transport
in rpc.py).

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icad59350a2cd35cfc4ad76d06399345191680c05

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454820
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 14:50:05 +00:00
Changpeng Liu
c55c85f807 bdev: complete parent IO only when all the children IO are finished
When parent IO was splitted into several children requests, SPDK
may return parent completion callback with error status before
all the children requests are finished.

Change-Id: I63221a0ae1a5925a7fcd9744b4f5d8079c641252
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453611
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 08:18:40 +00:00
Darek Stojaczyk
75a6265de9 rpc: fix segfault on get_spdk_version request with extra params
We tried to send two responses for a single RPC request.

Fixes #798

Change-Id: I37eebd6a1212f6ab2c026b7587415e42c3c7417e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455220
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-22 04:36:47 +00:00
Ben Walker
e036215fe9 thread: Add a mechanism to exit a lightweight thread
Lightweight threads may now be exited by calling
spdk_thread_exit() within the thread. The framework
polling the thread can release the resources associated
with that lightweight thread by calling
spdk_thread_destroy().

Change-Id: I6586b9d22556b3874fb113ce5402c6b1f371786e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-05-22 04:23:17 +00:00
Ben Walker
f658a867f9 thread: Remove unknown_tsc tracking
This was actually always 0.

Change-Id: I33a14a9f213e8c7159f00f57c88f12cfdf1d3d60
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455318
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2019-05-22 04:23:17 +00:00
Darek Stojaczyk
3a0627f069 vhost: remove 2MB memory region size restriction
We no longer have any assumptions about vhost memory regions
size being a 2MB multiple, so we can get rid of the security
check preventing some vhost sessions from being initialized.

It will be necessary for virtio-vhost-user, whose memory comes
from PCI BARs and its size may not be a 2MB multiple.

Change-Id: I48f9bc20f4c61aefdddf39ade875867148f0ed75
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454879
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-22 00:53:08 +00:00
Darek Stojaczyk
ac498fa31c vhost: use DPDK APIs to split non-contiguous virtual memory buffers
Currently, we translate each 2MB chunk to manually check
if it's contiguous with the previous one, but there are
rte_vhost APIs that do it way more efficiently.

rte_vhost_va_from_guest_pa() was introduced in DPDK 18.02,
but was backported to 17.11 as well, so we don't even need
any RTE_VERSION ifdefs to use it now. This function
calculates the remaining region size instead of trying to
translate subsequent 2MB chunks over and over.

The previous rte_vhost_gpa_to_vva() was deprecated a long
time ago and after this patch we no longer make any use of
it.

DPDK usages of this new function check if the translated
memory region has 0 length, which seems very silly, but
let's just do it in SPDK as well.

Change-Id: Ifae8daa5f810b5a2ba1524958ad2399af700b532
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454878
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-22 00:53:08 +00:00
Darek Stojaczyk
ca0cdbf269 bdev: increase QoS granularity
Fixes #791

Change-Id: Iaba31bdbbff6fa7996ca3e89ac4c9e3658a2cc94
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454669
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-21 05:09:41 +00:00
paul luse
02064419e2 bdev/passthru: add error check in get buffer read callback
Was simply missing.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia8ff3825e5d78c8814c3c57b779608ebe6be34ac
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454467
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-20 17:02:27 +00:00
paul luse
634af37b46 bdev/compress: use the queue pair for the current device
Had 0 hardcoded early on, this just uses the queue pair that
was pre-assigned to the compression bdev to use when interacting
with the cryptodev API.

Multi-threaded testing is coming later.

Change-Id: Iab8ea4e59169980933e9c81715baf4d41803b572
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453921
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-20 16:59:43 +00:00
paul luse
70165cedeb bdev/compress: queue up compression operations if we can't submit
The compressdev enqueue call will return the number of ops that
it was able to accept.  By design this will always be 1 for our
implementation so check that to make sure.

If it is 0 then we queue the compression operation up and try
again after something is dequeued in the poller.

We will also queue if we can't get an mbuf or op from our pools.

Change-Id: I7285749b4a599d1ee265e3c3e16073f25c4d7469
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451686
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-20 16:59:43 +00:00
Darek Stojaczyk
122cc72cd9 Revert "scsi: Simplify removing LUN when no connection is for it"
The new, simplified scsi lun hotremove path doesn't call
lun->hotremove_cb if there's no io_channel allocated for
that lun. Vhost still depends on that callback and currently,
when the underlying bdev is removed, vhost is left completely
unnotified. It keeps a dangling pointer to a scsi lun and
will eventually crash. The vhost scsi controller also can't
be removed in this case.

This reverts commit 19182431c8.

Change-Id: I330330fdd7d6941db070d972192481f535f62977
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454836
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-20 16:56:06 +00:00
yidong0635
26683005dd lib/ftl: fix scanbuild error about divided by zero.
We should add assert here to avoid be divided by zero,
that resolve this scanbuild error.

Change-Id: I12fa11cdbf0f608995f91759140408303e243357
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454539
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-20 16:55:47 +00:00
Konrad Sztyber
275f0c341c blobfs: removed unaligned memory access
Replaced (uint64_t *) dereference of (char *) variable with memcpy to
avoid unaligned memory access. It fixes the following error reported by
ubsan:

spdk/lib/blobfs/blobfs.c:1492:2: runtime error: load of misaligned
address 0x00000047a136 for type 'uint64_t', which requires 8 byte
alignment

Change-Id: I5f3f28a52b8cbe3eb30394ae7ebd0e64b683db57
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454886
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-05-20 16:55:42 +00:00
Konrad Sztyber
2fac8bc76b lib/bdev: merge buffer allocation in get_buf/put_buf
These two paths do the same thing, only spdk_bdev_io_get_buf tries to
allocate the buffer directly, while spdk_bdev_io_put_buf reuses the
returned buffer for one of the waiting requests.

Change-Id: I341a011f3a16a99de399c8b56fa73ccd9c7fe4fa
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451466
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2019-05-20 16:54:13 +00:00