2473 Commits

Author SHA1 Message Date
Jim Harris
534d9c2002 bdev: add per-thread spdk_bdev_io cache
This mimics the per-lcore cache that the DPDK rte_mempool
implements.  But DPDK rte_mempool relies on the DPDK
lcore_ids which are not set for non-DPDK threads (such as
the fio bdev plugin).

So implement our own per-thread bdev_io cache instead.
This is quite simple since we already have a per-thread
bdev channel called spdk_bdev_mgmt_channel.

Previously, we passed 64 to spdk_mempool for the
per-core cache size.  This patch effectively changes it
to 256 and moves it from the spdk_mempool (which we now
specify with a per-core cache size of 0) to this internal
bdev cache.  We allocate 64K of these bdev_io, so putting
a few more in each thread's cache will not hurt anything.
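
The idea can be sketched as below. This is a simplified illustration and
not the code from this patch: struct io_obj, struct mgmt_channel,
pool_get() and pool_put() are hypothetical stand-ins (the shared pool is
plain malloc/free here rather than spdk_mempool), and only the cache
depth of 256 comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define IO_CACHE_SIZE 256 /* per-thread cache depth named in the message */

struct io_obj {
    struct io_obj *next;
};

/* Stand-in for the shared pool; counters make pool traffic observable. */
static size_t g_pool_gets, g_pool_puts;

static struct io_obj *pool_get(void)
{
    g_pool_gets++;
    return malloc(sizeof(struct io_obj));
}

static void pool_put(struct io_obj *io)
{
    g_pool_puts++;
    free(io);
}

/* Hypothetical per-thread channel: its free list is touched by one thread only. */
struct mgmt_channel {
    struct io_obj *cache_head;
    size_t cache_count;
};

/* Take from the thread-local cache first, fall back to the shared pool. */
static struct io_obj *io_get(struct mgmt_channel *ch)
{
    if (ch->cache_head != NULL) {
        struct io_obj *io = ch->cache_head;
        ch->cache_head = io->next;
        ch->cache_count--;
        return io;
    }
    return pool_get();
}

/* Return to the thread-local cache unless it is already full. */
static void io_put(struct mgmt_channel *ch, struct io_obj *io)
{
    assert(io != NULL);
    if (ch->cache_count < IO_CACHE_SIZE) {
        io->next = ch->cache_head;
        ch->cache_head = io;
        ch->cache_count++;
        return;
    }
    pool_put(io);
}

/* Drain the cache back to the shared pool when the channel goes away. */
static void channel_free_resources(struct mgmt_channel *ch)
{
    while (ch->cache_head != NULL) {
        struct io_obj *io = ch->cache_head;
        ch->cache_head = io->next;
        ch->cache_count--;
        pool_put(io);
    }
}
```

Because each channel belongs to a single thread, the cache needs no
locking and does not depend on DPDK lcore_ids.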

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5e715f8c69b99130c7b80347b47a881595d184ae

Reviewed-on: https://review.gerrithub.io/392531
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-21 16:22:23 -05:00
Shuhei Matsumoto
14797d839d vhost: Allow set cpumask more than active cores for vhost
The latest change for the portal group is applied to the vhost.
The following comment is quoted from it.

Currently the cpumask must be a subset of the reactor mask.

However, this is different from the sched_setaffinity() function
and the taskset command of FreeBSD and Linux.  The latter will
be familiar to more people. Hence the latter is adopted.

The following is quoted from the FreeBSD Man Page of taskset:

  The CPU affinity is represented as a bitmask, with the lowest
  order bit corresponding to the first logical CPU and the
  highest order bit corresponding to the last logical CPU.

  Not all CPUs may exist on a given system but a mask may specify
  more CPUs than are present.

  A retrieved mask will reflect only the bits that correspond to
  CPUs physically on the system.

  If an invalid mask is given (i.e., one that corresponds to no
  valid CPUs on the current system) an error is returned.

  The masks are typically given in hexadecimal.
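
The semantics quoted above can be sketched as follows. This is only an
illustration under assumptions: cpumask_resolve() is a hypothetical
helper, not an SPDK function, and the masks are limited to 64 CPUs for
brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: a requested mask may name more CPUs than exist.
 * Only the bits that match available CPUs are kept, and a mask that
 * matches no available CPU is rejected.
 */
static int cpumask_resolve(uint64_t requested, uint64_t available,
                           uint64_t *effective)
{
    assert(effective != NULL);

    uint64_t usable = requested & available;
    if (usable == 0) {
        return -EINVAL; /* no valid CPU in the mask */
    }
    *effective = usable;
    return 0;
}
```

For example, requesting 0xFF on a system whose available mask is 0x0F
yields 0x0F, while requesting 0xF0 on the same system is an error.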

Change-Id: Idcd72a12ef52e4ccec8476e7d54fab82867cf936
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392587
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-21 13:32:31 -05:00
Maciej Szwed
236f84dae7 lvol: don't return lvs_bdev if it's being destroyed
During hot removal of an lvol store, some lvols may already be in the
process of removal. We should not start another removal process for an
lvol that is already being removed.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ifc91e4cee11ee63af04eac3729d014d7c04ff98b
Reviewed-on: https://review.gerrithub.io/390217
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-21 13:30:44 -05:00
Daniel Verkamp
453f5ae9f6 bdev: unregister all bdevs in spdk_bdev_finish()
Instead of requiring each bdev module to track its own bdevs and clean
them up during its fini callback, we can walk the list of registered bdevs
during spdk_bdev_finish() and call spdk_bdev_unregister() on each one of
them before cleaning up the bdev modules.

Change-Id: I01816707c9100f66f542bfd73b90bcb0e0fb0c0c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/389878
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 13:29:29 -05:00
Changpeng Liu
381af5775f nvme: re-enable the separate metadata support in nvme driver
Commit ID "269910c0" removed support for separate metadata, so the
SPDK driver cannot be used with controllers that support this
feature. SPDK provides APIs such as
spdk_nvme_ctrlr_cmd_io_raw_with_md, spdk_nvme_ns_cmd_write_with_md and
spdk_nvme_ns_cmd_read_with_md, which can support separate metadata.
Re-enable this feature with this commit.

Change-Id: If77c21e9ac700c4b334548ebfa7e8e6286285a64
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/392440
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-21 13:21:55 -05:00
Jim Harris
40c911b957 blob: add used blobid bit array for valid blobids
This can be used for two purposes:

1) more quickly iterate the blob list, avoiding
   metadata pages that are valid but not the first
   page in the blob's metadata list
2) close races between delete and open operations -
   now we can clear the bit in the blobid bit array
   when the delete operation is in progress, ensuring
   no one else can try to open the blob
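
A minimal sketch of the two uses, assuming a small fixed-size mask. The
names blobid_mask, blobid_set(), blobid_clear(), blobid_valid() and
blobid_next() are hypothetical and are not the blobstore's own bit array
API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BLOBS 128 /* illustrative capacity, not a blobstore limit */

struct blobid_mask {
    uint64_t words[MAX_BLOBS / 64];
};

static void blobid_set(struct blobid_mask *m, uint32_t id)
{
    assert(id < MAX_BLOBS);
    m->words[id / 64] |= UINT64_C(1) << (id % 64);
}

/* Clearing the bit at the start of a delete keeps new opens out. */
static void blobid_clear(struct blobid_mask *m, uint32_t id)
{
    assert(id < MAX_BLOBS);
    m->words[id / 64] &= ~(UINT64_C(1) << (id % 64));
}

static bool blobid_valid(const struct blobid_mask *m, uint32_t id)
{
    return id < MAX_BLOBS && ((m->words[id / 64] >> (id % 64)) & 1) != 0;
}

/*
 * Iterate valid blob ids directly instead of scanning every metadata
 * page. Returns the first valid id >= start, or MAX_BLOBS if none.
 */
static uint32_t blobid_next(const struct blobid_mask *m, uint32_t start)
{
    for (uint32_t id = start; id < MAX_BLOBS; id++) {
        if (blobid_valid(m, id)) {
            return id;
        }
    }
    return MAX_BLOBS;
}
```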

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3904648fd6fa656cb98c9e17ea763ed5a84ef537

Reviewed-on: https://review.gerrithub.io/391695
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 12:28:21 -05:00
Ziye Yang
4547ce65a8 iscsi: make some functions into static
No need to export these functions since they are not
used by other files.

Change-Id: Iab5d44667cc0d57ec105e90a71d434cc4e07f4f5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392590
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 12:19:44 -05:00
Ziye Yang
50d0957b5e iscsi/conn: remove rte_config.h header
For g_num_connections, we should create an array of the proper
size. We cannot size it directly with
spdk_env_get_core_count(), because the
core mask can be non-contiguous, e.g., 0x1001. Thus,
for efficient access, we create a larger array of size
last_core + 1. This wastes some space,
but not much, and it keeps fast
array index access.
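
The sizing rule can be sketched as below, assuming a 64-bit core mask.
last_core_in_mask() and alloc_per_core_counters() are hypothetical
helpers written for this illustration, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Index of the highest set bit in a non-empty core mask. */
static uint32_t last_core_in_mask(uint64_t core_mask)
{
    assert(core_mask != 0);

    uint32_t last = 0;
    for (uint32_t i = 0; i < 64; i++) {
        if (core_mask & (UINT64_C(1) << i)) {
            last = i;
        }
    }
    return last;
}

/*
 * Allocate one zeroed counter per core id up to the last core, so that a
 * core id can be used directly as the array index even when the mask has
 * holes. The caller frees the returned array.
 */
static int *alloc_per_core_counters(uint64_t core_mask, uint32_t *len)
{
    *len = last_core_in_mask(core_mask) + 1;
    return calloc(*len, sizeof(int));
}
```

With the mask 0x1001 from the text, only cores 0 and 12 are active, so a
count-based array would have 2 entries and index 12 would be out of
bounds. Sizing by last_core + 1 gives 13 entries instead.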

Change-Id: I95e1fc34e0816ac2f8764880c0d0e629f43a5dc4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/391909
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 11:50:49 -05:00
Jim Harris
832f4e4df6 nvme: add quirks for Intel NVMe P4600 SSD
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iedfa8d3de8520836e184f7ef0925822fb705fc67

Reviewed-on: https://review.gerrithub.io/391672
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2017-12-21 10:45:05 -05:00
Jim Harris
8b0b3c350c bdev: add spdk_bdev_mgmt_channel_free_resources()
Call this function from spdk_bdev_mgmt_channel_destroy().
Currently there are no real resources to free, but that
will change in an upcoming patch which adds per-thread
bdev_io caches.

While here, also add a for_each_channel iterator to
call this function on each existing channel during bdev
finish code path.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9348e37053505c9fba7a6421e55ffc416668d24f

Reviewed-on: https://review.gerrithub.io/392530
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Jim Harris
f1f14e5583 bdev: pass mgmt_channel to spdk_bdev_get_io()
This prepares for some upcoming changes which will
add a per-thread bdev_io cache.

While here, remove spdk_bdev_get_io() from the
internal bdev API.  This function is not meant
to be called outside of bdev.c.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f764a88a079fac936931c46d615999454013732

Reviewed-on: https://review.gerrithub.io/392529
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Jim Harris
45697d33ff bdev/null: complete requests asynchronously
To better match bdev modules like nvme, complete requests
for the bdev/null driver asynchronously.  This will be
done by allocating IO channels that register a poller
and keep a TAILQ of bdev IO to be completed next time
the poller runs.

This is actually more efficient as well, since completing
I/O in submit_request context defers the completion using
an event.  A benchmark of bdevperf with split running on
top of null module shows this patch increases throughput
20%.
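
The deferred completion pattern can be sketched as follows. This is a
stand-alone illustration using the TAILQ macros from sys/queue.h; the
struct and function names are hypothetical and the SPDK poller and
channel APIs are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct null_io {
    int completed;
    TAILQ_ENTRY(null_io) link;
};

/* Hypothetical per-channel state: I/O waiting for the next poller run. */
struct null_channel {
    TAILQ_HEAD(, null_io) pending;
};

static void channel_init(struct null_channel *ch)
{
    TAILQ_INIT(&ch->pending);
}

/* Submission only queues the I/O; nothing completes in this context. */
static void submit_request(struct null_channel *ch, struct null_io *io)
{
    assert(io != NULL);
    io->completed = 0;
    TAILQ_INSERT_TAIL(&ch->pending, io, link);
}

/* Poller body: complete everything queued so far, return the count. */
static int channel_poll(struct null_channel *ch)
{
    int count = 0;
    struct null_io *io;

    while ((io = TAILQ_FIRST(&ch->pending)) != NULL) {
        TAILQ_REMOVE(&ch->pending, io, link);
        io->completed = 1;
        count++;
    }
    return count;
}
```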

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8c664234660c249fd8ec8d9244eed33502d4103e

Reviewed-on: https://review.gerrithub.io/392528
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Ben Walker
1545c8eb5e nvmf: Fix bug when resizing sgroups array
Change-Id: I366b941a60d1fb00951591e7f631a65e8a449904
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392566
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-20 21:38:25 -05:00
Ben Walker
fd0770fecb nvmf: Delete subsystems when target is destroyed
Change-Id: I102954505c2c53458aae30f6d15b46e008355501
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392565
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-20 21:38:25 -05:00
Shuhei Matsumoto
b65443be6e iscsi: Allow to set cpumask more than active cores for iSCSI connection
Currently the cpumask must be a subset of the reactor mask.

However, this is different from the sched_setaffinity() function
and the taskset command of FreeBSD and Linux.  The latter will
be familiar to more people. Hence the latter is adopted.

The following is quoted from the FreeBSD Man Page of taskset:

  The CPU affinity is represented as a bitmask, with the lowest
  order bit corresponding to the first logical CPU and the
  highest order bit corresponding to the last logical CPU.

  Not all CPUs may exist on a given system but a mask may specify
  more CPUs than are present.

  A retrieved mask will reflect only the bits that correspond to
  CPUs physically on the system.

  If an invalid mask is given (i.e., one that corresponds to no
  valid CPUs on the current system) an error is returned.

  The masks are typically given in hexadecimal.

Change-Id: I7e0d2e029569bfc986f7fcdf78048791ab389f72
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392446
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:19:04 -05:00
Shuhei Matsumoto
0241b85bc1 iscsi: Remove DPDK dependency and simplify load balancing
Load balancing for idle iSCSI connections uses the RTE EAL Launch
state and uses DPDK RTE EAL API.

But all SPDK reactors will exit simultaneously because each SPDK
reactor checks if the global state is RUNNING to exit.

Hence calling rte_eal_get_lcore_state() is not necessary.

When the reactor hot-plug function is supported, this implementation
will be reconsidered together.

Change-Id: I34eaf3e42b5b7deae6473d2bfaf0910aaa9da6de
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391339
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:17:15 -05:00
Shuhei Matsumoto
855d8e032a iscsi: change master_lcore to first_core for idle connection management
Currently idle iSCSI connections are managed by the master lcore,
but the master lcore is like the BSP of an OS and is meant for initialization.

To manage idle iSCSI connections it is important that the core is
consistent.

Hence the first core is better than the master lcore.

In this patch the following are changed together:
- Errors of kqueue() and epoll_create1() are not related to the master
  lcore. "master lcore" is removed and errno is added to the log.
- In spdk_iscsi_conn_allocate_reactor(), when cpumask is 0, 0 is
  selected as core number. 0 is not safe and first_core is used instead.

In spdk_iscsi_conn_allocate_reactor(), when first_core is used instead
of master_lcore, we may observe some contradiction in the following
code. But few changes are done in this patch.

In the current implementation we can assume the first lcore is
equal to the master lcore and the following code will be removed
in the subsequent patch.

/*
 * DPDK returns WAIT for the master lcore instead of RUNNING.
 * So we always treat the reactor on master core as RUNNING.
 */
if (i == master_lcore) {
    state = RUNNING;
} else {
    state = rte_eal_get_lcore_state(i);
}

Change-Id: I6cac06c27b289db5ea1f9452e33489286c64d2fa
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391338
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:17:15 -05:00
Shuhei Matsumoto
7767990829 iscsi: Unexpected usage of RTE EAL Launch in load balancing
Before removing the DPDK dependency from the iSCSI connection
load balancing, this should be done.

In spdk_iscsi_conn_allocate_reactor(cpumask)

- if any lcore[i]'s state is FINISHED, the caller calls
  rte_eal_wait_lcore(i). But the purpose of rte_eal_wait_lcore()
  is to check if the slave is in a WAIT state before calling
  rte_eal_remote_launch(). The meaning of this usage is not clear.

- If the state of lcore[i] is WAIT or FINISHED, the reactor does
  not run on lcore[i]. iSCSI connections are run as pollers, not
  as reactors. Hence selecting lcore[i] with the state WAIT or
  FINISHED does not look correct.

Change-Id: If8c420f2d16dc44e77f8963f5732faa52e3d829b
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391332
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
2017-12-20 15:17:15 -05:00
Changpeng Liu
9ca670ac8a util/crc16: add crc16 library support and unit tests
Change-Id: I7174f1799361b8337ff5590b90ad6a0564ca8e9b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391899
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:12:26 -05:00
Shuhei Matsumoto
8fd2882fb2 etc/spdk/iscsi.conf.in: How to use cpumask for the connection
Users can specify processor affinity for each iSCSI connection
by specifying cpumask in the configuration file.

However, the example iscsi.conf.in does not have any description
of this. Hence it is very difficult for users to use this.

The portal group section of the config dump file has the same
description. Hence it is also changed.

Change-Id: I6e7b3bb67e10e78f4a47165525f023555080f146
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391510
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-20 11:31:45 -05:00
GangCao
c22f12c8df event: update the poller's period_ticks calculation
The calculation overflows for large poller sleep times, so the
actual period may be set incorrectly. Update the calculation
here.
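
One common way to avoid this kind of overflow is to split the period
into whole seconds and a remainder before multiplying. The sketch below
shows that technique under assumptions; usec_to_ticks() is a
hypothetical helper and may differ from the calculation in this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a period in microseconds to timer ticks.
 * The naive form (ticks_hz * period_us) / 1000000 overflows 64 bits once
 * the product exceeds UINT64_MAX. Splitting the period keeps each
 * product small while giving the same truncated result.
 */
static uint64_t usec_to_ticks(uint64_t ticks_hz, uint64_t period_us)
{
    /* Guarantees that ticks_hz * remainder below cannot overflow. */
    assert(ticks_hz <= UINT64_MAX / 1000000);

    uint64_t quotient = period_us / 1000000;
    uint64_t remainder = period_us % 1000000;

    return ticks_hz * quotient + (ticks_hz * remainder) / 1000000;
}
```

With a 2 GHz tick rate and a period of 10^13 microseconds, the naive
product would be 2 * 10^22, which does not fit in 64 bits, while the
split form only computes 2 * 10^16.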

Change-Id: I14fe21d3f0e1abaa9d13d3d6254aff254d2dfcc3
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/392127
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 18:49:16 -05:00
Jim Harris
9c1d97a247 nvme: add checks for sq_head
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8519a4b68db44cb8fe6dd251a52bf0f1dca73c32

Reviewed-on: https://review.gerrithub.io/391890
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-19 16:45:05 -05:00
Shuhei Matsumoto
28ff7a0da7 iscsi: Set cpumask to all available CPUs when PG is created JSON-RPC
Currently the default setting of cpumask of network portal is
different between iSCSI.conf and JSON-RPC.

When a network portal is created by iSCSI.conf, its cpumask is
set to all available CPUs by default. However when it is created
by JSON-RPC, its cpumask is set to 0 by default.

The auto test 'test/iscsi_tgt/idle_migration' creates a network portal
by JSON-RPC. Hence the auto test cannot test the load balancing
function of the iSCSI target.

Change-Id: I2685172cb9259b643f6d18d4660a8425dcef3f5d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391898
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-19 15:22:05 -05:00
Changpeng Liu
4395093b97 vhost_blk: close the bdev in the hotplug callback
For normal exit logic, such as Ctrl+C, vhost blk will not
shut down the backend device, e.g., an NVMe controller.

Change-Id: I7fdf8687a2cfa6a8cc6a61428d722debfa9a2180
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391348
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 14:54:51 -05:00
Dariusz Stojaczyk
3099d7a5f9 virtio: remove pollers from virtqueues
Now that we have SCSI-specific virtio
device struct, we can keep our virtqueue
pollers in there.

Change-Id: If4b643f8c46e42d5d403532ad015c721c0429282
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/390114
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-19 13:35:52 -05:00
Dariusz Stojaczyk
451de7e171 virtio: switch to the new virtqueue API
The virtio_req struct has been removed.
The new lower-level API has been used
instead. This puts more responsibility on
the upper abstraction layers, but gives
much more descriptor layout flexibility.
It is required for e.g. upcoming
Virtio-SCSI eventq implementation

Change-Id: I9a310c0ba4451bf3a076bef4e90bb75c0046c70a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391028
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-19 13:35:52 -05:00
Shuhei Matsumoto
b37d1b60f1 scsi&test/iscsi_tgt: SCSI device == iSCSI name
In the iSCSI specification, the SCSI device name is defined to be
the iSCSI name of the node.

However, when g_spdk_iscsi.nodebase is used, the SCSI device name
is made of the device specific string (the part of IQN after the
colon).

The size of the temporary buffer fullname[MAX_TMPBUF] in
spdk_iscsi_tgt_node_construct() is 1024 and the size of
spdk_scsi_dev.name is 255. The former is larger than the latter.

However the max length of IQN, EUI, NAA are 223, 20, and 36,
respectively. All are less than 255.

Hence even if we use fullname as the SCSI device name, no overflow
will occur. Even if fullname is more than 255, strncpy() does not
write more than 255 in spdk_scsi_dev_construct().

It's possible to check the length of iSCSI name strictly, but I
will do the least in this patch.

Change-Id: Icc6655fcd846797720867c10e316d2951c664030
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/390360
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-19 13:34:03 -05:00
Cunyin Chang
6e82aa5ace nvme: Add support of hot remove vfio-attached devices in pcie layer.
Change-Id: Ia7d6ca2d6c0bec6345f05718f6a6328eccda2dcc
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/391329
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-19 13:07:20 -05:00
Cunyin Chang
2966839dd9 nvme: return specific value of register when the device hot removed.
This patch marks the controller as removed at the pcie level when a register
read returns the specific value 0xffffffff. The real value is also returned
to the upper level (nvme bdev), which helps the upper level handle
hotplug.
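
A minimal sketch of the check, assuming registers are exposed as a
memory-mapped array. struct fake_ctrlr and ctrlr_read_reg32() are
hypothetical names for this illustration and do not mirror the pcie
transport code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller handle with a memory-mapped register window. */
struct fake_ctrlr {
    volatile uint32_t *regs;
    bool is_removed;
};

/*
 * Read a 32-bit register. Reads from a device that has been pulled
 * typically come back as all ones, so that value is treated as a sign of
 * hot removal. The raw value is still returned so that upper layers can
 * react to it as well.
 */
static uint32_t ctrlr_read_reg32(struct fake_ctrlr *ctrlr, uint32_t index)
{
    assert(ctrlr != NULL);

    uint32_t value = ctrlr->regs[index];
    if (value == UINT32_MAX) {
        ctrlr->is_removed = true;
    }
    return value;
}
```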

Change-Id: Ifad45c760cccbce522506ffbf86495318a6b393b
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/391327
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-19 13:07:20 -05:00
Jim Harris
d8022e1357 blob: allow _spdk_bs_recover to operate as a sequence completion
This prepares for a future change where we need to use the
recovery path when loading pre-v3 on-disk formats, since the
older disk formats do not save a blobid mask.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia94d56450202f81373c3de94237eca2dfd96526c

Reviewed-on: https://review.gerrithub.io/391694
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2017-12-19 12:41:26 -05:00
Jim Harris
ac1aa04ba7 blob: add _spdk_bs_load_fail helper routine
This eliminates a bunch of code duplication.  This also
fixes a couple of places where the ctx->bs was not being
freed in the load fail path.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie6b0a4a653b5c80edf14086801b75457852a4736

Reviewed-on: https://review.gerrithub.io/391693
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-19 12:41:26 -05:00
Seth Howell
1cc15f10fc lib/event replace printf and fprintf with spdk_log
There are still several printf statements in app.c, but those occur
before the call to spdk_log_open.

Change-Id: If017c4d658ca45f34b97500bb1a3db5ab1f0675e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/391888
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 11:59:21 -05:00
Maciej Szwed
13ece6a735 blob: add spdk_bs_create_blob_ext
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iba33c55f129c60fad2d58f5254dec5c54ed56805
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Reviewed-on: https://review.gerrithub.io/388217
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 11:55:36 -05:00
Shuhei Matsumoto
81673d0c25 iscsi: Remove use of perror for malloc, strdup, and writev failure
All SPDK libraries should use the spdk/log.h family of functions
for logging.

Change-Id: I2b8ac30f2901b325784552f0016f1058ae2cd577
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391687
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 11:41:43 -05:00
Xiaodong Liu
f6465006d7 nbd: stop nbd if backing bdev is removed
Change-Id: I964b7a37fcb641a610d518a02841b25913c5be2e
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391733
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-18 12:51:24 -05:00
Xiaodong Liu
3a6ff82760 nbd: fix hidden danger caused by stop rpc
If there is occasionally unflushed data in the kernel,
the nbd disconnect ioctl will not return until that
data is flushed. spdk_nbd_disk should process this
data while it is being flushed.
But at present, spdk_nbd_disk runs on the same
reactor as rpc, which is a hidden danger of deadlock.

Change-Id: I2850105dff78f09f0e1b3c0a570dbbf7efdb469e
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391707
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2017-12-18 12:50:31 -05:00
Philipp Skadorov
6d98264552 nvmf/rdma: decrement r/w counter if ibv_post_send fails
The outstanding r/w requests counter is not decremented
if an IB r/w request fails.

As a result, the rdma qpair stops pumping requests
once the number of ibv_post_send failures reaches
the threshold for outstanding r/w requests for that qpair.

The patch decrements the qpair's r/w counter when
ibv_post_send returns an error.
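
The failure mode and the fix can be sketched without the verbs library.
Everything below is a hypothetical stand-in: struct rw_qpair, its field
names, and the post_send_fn callback (which plays the role of
ibv_post_send and returns 0 on success) are assumptions for this
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct rw_qpair {
    int cur_rw_depth; /* outstanding r/w requests */
    int max_rw_depth; /* threshold at which submission stops */
};

/* Stand-in for ibv_post_send(): returns 0 on success, non-zero on error. */
typedef int (*post_send_fn)(void *ctx);

static int post_ok(void *ctx)
{
    (void)ctx;
    return 0;
}

static int post_fail(void *ctx)
{
    (void)ctx;
    return EIO;
}

static int qpair_submit_rw(struct rw_qpair *q, post_send_fn post, void *ctx)
{
    assert(q != NULL && post != NULL);

    if (q->cur_rw_depth >= q->max_rw_depth) {
        return -EAGAIN; /* throttled: too many outstanding requests */
    }

    q->cur_rw_depth++;
    int rc = post(ctx);
    if (rc != 0) {
        /* Roll back, or failed posts would eventually stall the qpair. */
        q->cur_rw_depth--;
    }
    return rc;
}
```

Without the rollback, max_rw_depth failed posts would leave the counter
at the threshold and every later submission would be throttled.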

Change-Id: I8fa0f2905974a50037034962e4d2a001290a06a9
Signed-off-by: Philipp Skadorov <philipp.skadorov@wdc.com>
Reviewed-on: https://review.gerrithub.io/391799
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-15 16:42:51 -05:00
Ben Walker
2a0772e3b8 nvmf/rdma: Create one cq per thread instead of per connection
This greatly increases the efficiency when the target is scaled
to many connections. Now all connections being handled by a given
thread can be polled in O(1), whereas before it was O(n) where
n was the number of connections.

Change-Id: I9f695f68093d73e6538df416b0f1aabef07119ff
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/391491
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-15 16:26:33 -05:00
Cunyin Chang
bdcb0d709a nvmf: add support of hotplug for nvmf.
Change-Id: Iebd5b75e3525e77bf256f5b7f52aa2504d7a68c3
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/390549
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-15 16:14:02 -05:00
Cunyin Chang
7f5864be20 nvmf: Add public interface of remove ns from subsystem.
Change-Id: I9c2746dd54a13f3dae0ac2bab1d5fced931e8591
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/391699
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-15 16:14:02 -05:00
Shuhei Matsumoto
304d851a5d lib/event/app: Remove use of perror() for malloc() failure
All SPDK libraries should use the spdk/log.h family of functions
for logging.

Change-Id: I4b9f172102c6d7998d3a634573493db3fe25ff76
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391686
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-15 16:13:27 -05:00
Shuhei Matsumoto
f86f107579 conf: Remove use of perror() for strdup() failure
All SPDK libraries should use the spdk/log.h family of functions
for logging.

The cause of strdup failure is only out-of-memory for malloc().
Hence errno is omitted.

Change-Id: I682f11fbb6f12c9de8d57a025b704b4f050f7474
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391685
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-15 16:13:27 -05:00
Dariusz Stojaczyk
d2374e5e4c bdev/virtio: minimize virtio_req usages
The struct is about to be removed.
This is a preliminary patch towards
the removal.

Change-Id: I5a479678790b8758634bdb0bb2e6facfde514aa1
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391945
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-15 15:56:48 -05:00
Dariusz Stojaczyk
8d26e7e24a virtio: added low level virtqueue API
The old API is simply not sufficient.
We assumed that each request contains
a single request and a single response
descriptor. However that's not the case
for e.g. virtio scsi eventq, where each
event contains only a response.

This patch only introduces the new API,
keeping the old one intact. The old API
will be removed in subsequent patches.

Change-Id: I89e53d602165aa0c7ceb25d98237f87550f4eae7
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/390854
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-15 15:56:48 -05:00
Shuhei Matsumoto
a83c39e0b4 bdev_malloc: Remove use of perror() for spdk_dma_zmalloc failure
All SPDK libraries should use the spdk/log.h family of functions
for logging.

Change-Id: I088f62e6035aa1885ad0973a033ecee2f325ed30
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391684
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-15 15:48:32 -05:00
Shuhei Matsumoto
a79a69d226 bdev_aio: Remove use of perror() for open/close failure
All SPDK libraries should use the spdk/log.h family of functions
for logging.

Change-Id: I4c12388433f8c57291cea9f30566438a9d78e3d1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391683
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-15 15:22:53 -05:00
Jim Harris
feba13bbba blob: do not decrement ref on close until it is done
This ensures we do not end up with a racing close v.
delete.  If we decrement the ref up front, we could
start the close process (which may include persisting
metadata) and then also allow a delete operation to
start.  It is safer to wait until the close operation
is done before decrementing the ref count, because then
it will eliminate this race condition (the delete op
would immediately fail).
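
The ordering can be sketched as below. This is a simplified model under
assumptions, not the blobstore code: struct blob, its fields, and the
three functions are hypothetical, and the asynchronous metadata persist
is reduced to a start step and a done step.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct blob {
    int open_ref;
    bool close_in_progress;
    bool deleted;
};

/* Delete is refused while anyone still holds a reference. */
static int blob_delete(struct blob *b)
{
    assert(b != NULL);
    if (b->open_ref > 0) {
        return -EBUSY;
    }
    b->deleted = true;
    return 0;
}

/* Close begins (e.g. persisting metadata) but the ref is still held. */
static void blob_close_start(struct blob *b)
{
    b->close_in_progress = true;
}

/* Only once the close work is finished is the ref dropped. */
static void blob_close_done(struct blob *b)
{
    b->close_in_progress = false;
    b->open_ref--;
}
```

A delete issued between blob_close_start() and blob_close_done() sees a
non-zero reference count and fails immediately, instead of racing with
the metadata persist.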

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad7fd8320d2c9b56f3c4fce054bcb6271e19ad38

Reviewed-on: https://review.gerrithub.io/391493
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-15 12:28:44 -05:00
Jim Harris
751691d24c blob: use spdk_bs_open_blob from spdk_bs_delete_blob
This ensures all blob-loading functionality goes through
the single spdk_bs_open_blob function which will simplify
some upcoming changes around managing global metadata
state from multiple threads.

This will also help prevent races where a delete operation
has started followed by an open on the blob that is
being deleted.  Those specific changes will be in an
upcoming patch.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I17e995145ab23068b816b44c33483b0708f5f563

Reviewed-on: https://review.gerrithub.io/391486
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-15 12:28:44 -05:00
Jim Harris
2ba5607ee8 blob: handle FLAGS descriptor during dirty shutdown recovery
Found during unit testing for blobid_mask coming in a future
patch - the unit test will be added as part of that future
patch.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iecdde6ba16c5af9caf59214f328ddc22aae71e94

Reviewed-on: https://review.gerrithub.io/391692
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-15 12:28:44 -05:00
Jim Harris
ae5a01dd9f blob: change spdk_bs_iter_next parameter to spdk_blob *
Similar to previous change, the ** paradigm is a bit
problematic for asynchronous routines that could fail.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife7748280482356c4c51a796817b71cd7bc7e479

Reviewed-on: https://review.gerrithub.io/391483
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-15 12:28:44 -05:00