This mimics the per-lcore cache that the DPDK rte_mempool
implements. But DPDK rte_mempool relies on the DPDK
lcore_ids which are not set for non-DPDK threads (such as
the fio bdev plugin).
So implement our own per-thread bdev_io cache instead.
This is quite simple since we already have a per-thread
bdev channel called spdk_bdev_mgmt_channel.
Previously, we passed 64 to spdk_mempool for the
per-core cache size. This patch effectively changes it
to 256 and moves it from the spdk_mempool (which we now
specify with a per-core cache size of 0) to this internal
bdev cache. We allocate 64K of these bdev_io, so putting
a few more in each thread's cache will not hurt anything.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5e715f8c69b99130c7b80347b47a881595d184ae
Reviewed-on: https://review.gerrithub.io/392531
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The latest change for the portal group is applied to the vhost.
The following comment is quoted from it.
Currently the cpumask must be a subset of the reactor mask.
However, this is different from sched_setaffinity() function
and taskset command of FreeBSD and Linux. The latter will
be familier for more people. Hence the later is adopted.
The following is quoted from the FreeBSD Man Page of taskset:
The CPU affinity is represented as a bitmask, with the lowest
order bit corresponding to the first logical CPU and the
highest order bit corresponding to the last logical CPU.
Not all CPUs may exist on a given system but a mask may specify
more CPUs than are present.
A retrieved mask will reflect only the bits that correspond to
CPUs physically on the system.
If an invalid mask is given (i.e., one that corresponds to no
valid CPUs on the current system) an error is returned.
The masks are typically given in hexadecimal.
Change-Id: Idcd72a12ef52e4ccec8476e7d54fab82867cf936
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392587
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
During hot remove of lvol store some lvols can already be in a process
of removal. We should not start another removal process for lvol that
is already being removed.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ifc91e4cee11ee63af04eac3729d014d7c04ff98b
Reviewed-on: https://review.gerrithub.io/390217
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Instead of requiring each bdev module to track its own bdevs and clean
them up during its fini callback, we can walk the list of registered bdevs
during spdk_bdev_finish() and call spdk_bdev_unregister() on each one of
them before cleaning up the bdev modules.
Change-Id: I01816707c9100f66f542bfd73b90bcb0e0fb0c0c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/389878
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Commit ID "269910c0" removed the support of separate metadata,
for those controllers which can support this feature, SPDK driver
can't be used. SPDK provides APIs such as:
spdk_nvme_ctrlr_cmd_io_raw_with_md/spdk_nvme_ns_cmd_write_with_md/
spdk_nvme_ns_cmd_read_with_md, which can support separate metadata.
While here, re-enable this feature with this commit.
Change-Id: If77c21e9ac700c4b334548ebfa7e8e6286285a64
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/392440
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This can be used for two purposes:
1) more quickly iterate the blob list, avoiding
metadata pages that are valid but not the first
page in the blob's metadata list
2) close races between delete and open operations -
now we can clear the bit in the blobid bit array
when the delete operation is in progress, ensuring
no one else can try to open the blob
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3904648fd6fa656cb98c9e17ea763ed5a84ef537
Reviewed-on: https://review.gerrithub.io/391695
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
No need to export these functions since they are not
used by other files.
Change-Id: Iab5d44667cc0d57ec105e90a71d434cc4e07f4f5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392590
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For g_num_connections, we should create an proper
array size, we cannot directly create it by the size:
spdk_env_get_core_count(). The reason is that the
core mask can be non-continuous,e.g., 0x1001, thus
for effient access, we create a large array size with
last_core +1, although we will have some space waste,
but this will not be big, but still maintain the fast
array index acccess.
Change-Id: I95e1fc34e0816ac2f8764880c0d0e629f43a5dc4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/391909
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iedfa8d3de8520836e184f7ef0925822fb705fc67
Reviewed-on: https://review.gerrithub.io/391672
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Call this function from spdk_bdev_mgmt_channel_destroy().
Currently there are no real resources to free, but that
will change in an upcoming patch which adds per-thread
bdev_io caches.
While here, also add a for_each_channel iterator to
call this function on each existing channel during bdev
finish code path.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9348e37053505c9fba7a6421e55ffc416668d24f
Reviewed-on: https://review.gerrithub.io/392530
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prepares for some upcoming changes which will
add a per-thread bdev_io cache.
While here, remove spdk_bdev_get_io() from the
internal bdev API. This function is not meant
to be called outside of bdev.c.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f764a88a079fac936931c46d615999454013732
Reviewed-on: https://review.gerrithub.io/392529
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
To better match bdev modules like nvme, complete requests
for the bdev/null driver asynchronously. This will be
done by allocating IO channels that register a poller
and keep a TAILQ of bdev IO to be completed next time
the poller runs.
This is actually more efficient as well, since completing
I/O in submit_request context defers the completion using
an event. A benchmark of bdevperf with split running on
top of null module shows this patch increases throughput
20%.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8c664234660c249fd8ec8d9244eed33502d4103e
Reviewed-on: https://review.gerrithub.io/392528
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change-Id: I366b941a60d1fb00951591e7f631a65e8a449904
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392566
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Change-Id: I102954505c2c53458aae30f6d15b46e008355501
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392565
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Currently the cpumask must be a subset of the reactor mask.
However, this is different from sched_setaffinity() function
and taskset command of FreeBSD and Linux. The latter will
be familier for more people. Hence the later is adopted.
The following is quoted from the FreeBSD Man Page of taskset:
The CPU affinity is represented as a bitmask, with the lowest
order bit corresponding to the first logical CPU and the
highest order bit corresponding to the last logical CPU.
Not all CPUs may exist on a given system but a mask may specify
more CPUs than are present.
A retrieved mask will reflect only the bits that correspond to
CPUs physically on the system.
If an invalid mask is given (i.e., one that corresponds to no
valid CPUs on the current system) an error is returned.
The masks are typically given in hexadecimal.
Change-Id: I7e0d2e029569bfc986f7fcdf78048791ab389f72
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392446
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Load balancing for idle iSCSI connections uses the RTE EAL Launch
state and uses DPDK RTE EAL API.
But all SPDK reactors will exit simultaneously because each SPDK
reactor checks if the global state is RUNNING to exit.
Hence calling rte_eal_get_lcore_state() is not necessary.
When the reactor hot-plug function is supported, this implementation
will be reconsidered together.
Change-Id: I34eaf3e42b5b7deae6473d2bfaf0910aaa9da6de
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391339
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently idle iSCSI connections are managed by the master lcore,
but the master lcore is like BSP of OS and for initialization.
To manage idle iSCSI connections it is important that the core is
consistent.
Hence the first core is better than the master lcore.
In this patch the following are changed together:
- Errors of kqueue() and epoll_create1() are not related with master
lcore. "master lcore" is removed and errno is added into the log.
- In spdk_iscsi_conn_allocate_reactor(), when cpumask is 0, 0 is
selected as core number. 0 is not safe and first_core is used instead.
In spdk_iscsi_conn_allocate_reactor(), when first_core is used instead
of master_lcore, we may observe some contradiction in the following
code. But few changes are done in this patch.
In the current implementation we can assume the first lcore is
equal to the master lcore and the following code will be removed
in the subsequent patch.
/*
* DPDK returns WAIT for the master lcore instead of RUNNING.
* So we always treat the reactor on master core as RUNNING.
*/
if (i == master_lcore) {
state = RUNNING;
} else {
state = rte_eal_get_lcore_state(i);
}
Change-Id: I6cac06c27b289db5ea1f9452e33489286c64d2fa
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391338
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Before removing the DPDK dependency from the iSCSI connection
load balancing, this should be done.
In spdk_iscsi_conn_allocate_reactor(cpumask)
- if any lcore[i]'s state is FINISHED, the caller calls
rte_eal_wait_lcore(i). But the purpose of rte_eal_wait_lcore()
is to check if the slave is in a WAIT state before calling
rte_eal_remote_launch(). The meaning of this usage is not clear.
- If the state of lcore[i] is WAIT or FINISHED, the reactor does
not run on the lcore[i]. iSCSI connections consist of not reactor
but poller. Hence selecting lcore[i] with the state WAIT or
FINISHED does not look correct.
Change-Id: If8c420f2d16dc44e77f8963f5732faa52e3d829b
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391332
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Change-Id: I7174f1799361b8337ff5590b90ad6a0564ca8e9b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391899
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
User can specify processor affinity for each iSCSI connection
by specifying cpumask in the configuration file.
However the example of iscsi.conf.in does not have any description
about this. Hence it is very difficult for user to use this.
The portal group section of the config dump file has the same
description. Hence it is also changed.
Change-Id: I6e7b3bb67e10e78f4a47165525f023555080f146
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391510
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
There existing an overflow for the large value of sleeping time
for the poller and the actual time may be incorrect setting due
to this overflow. Update the calculation here.
Change-Id: I14fe21d3f0e1abaa9d13d3d6254aff254d2dfcc3
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/392127
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8519a4b68db44cb8fe6dd251a52bf0f1dca73c32
Reviewed-on: https://review.gerrithub.io/391890
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently the default setting of cpumask of network portal is
different between iSCSI.conf and JSON-RPC.
When a network portal is created by iSCSI.conf, its cpumask is
set to all available CPUs by default. However when it is created
by JSON-RPC, its cpumask is set to 0 by default.
Auto test 'test/iscsi_tgt/idle_migration creates a network portal
by JSON-RPC. Hence the auto test cannot test the load balancing
function of iSCSI target.
Change-Id: I2685172cb9259b643f6d18d4660a8425dcef3f5d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391898
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
For normal exit logic, such as Ctrl+C, vhost blk will not
shutdown the backend device, e.g: NVMe controller.
Change-Id: I7fdf8687a2cfa6a8cc6a61428d722debfa9a2180
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391348
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Now that we have SCSI-specific virtio
device struct, we can keep our virtqueue
pollers in there.
Change-Id: If4b643f8c46e42d5d403532ad015c721c0429282
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/390114
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The virtio_req struct has been removed.
The new lower-level API has been used
instead. This puts more responsibility on
the upper abstraction layers, but gives
much more descriptor layout flexibility.
It is required for e.g. upcoming
Virtio-SCSI eventq implementation
Change-Id: I9a310c0ba4451bf3a076bef4e90bb75c0046c70a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391028
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
In the iSCSI specification, the SCSI device name is defined to be
the iSCSI name of the node.
However, when g_spdk_iscsi.nodebase is used, the SCSI device name
is made of the device specific string (the part of IQN after the
colon).
The size of the temporary buffer fullname[MAX_TMPBUF] in
spdk_iscsi_tgt_node_construct() is 1024 and the size of
spdk_scsi_dev.name is 255. The former is larger than the later.
However the max length of IQN, EUI, NAA are 223, 20, and 36,
respectively. All are less than 255.
Hence even if we use fullname as the SCSI device name, no overflow
will occur. Even if fullname is more than 255, strncpy() does not
write more than 255 in spdk_scsi_dev_construct().
It's possible to check the length of iSCSI name strictly, but I
will do the least in this patch.
Change-Id: Icc6655fcd846797720867c10e316d2951c664030
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/390360
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch set the controller as removed in pcie level when the register
return specific value (0xffffffff), we also return the real value to the
upper level (nvme bdev), which will help the upper level do the work of
hotplug.
Change-Id: Ifad45c760cccbce522506ffbf86495318a6b393b
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Reviewed-on: https://review.gerrithub.io/391327
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prepares for a future change where we need to use the
recovery path when loading pre-v3 on-disk formats, since the
older disk formats do not save a blobid mask.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia94d56450202f81373c3de94237eca2dfd96526c
Reviewed-on: https://review.gerrithub.io/391694
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
This eliminates a bunch of code duplication. This also
fixes a couple of places where the ctx->bs was not being
freed in the load fail path.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie6b0a4a653b5c80edf14086801b75457852a4736
Reviewed-on: https://review.gerrithub.io/391693
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
there are still several printf statements in app.c but those occur
before the call to spdk_log_open
Change-Id: If017c4d658ca45f34b97500bb1a3db5ab1f0675e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/391888
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iba33c55f129c60fad2d58f5254dec5c54ed56805
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Reviewed-on: https://review.gerrithub.io/388217
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
All SPDK libraries should use the spdk/log.h family of functions
for logging.
Change-Id: I2b8ac30f2901b325784552f0016f1058ae2cd577
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391687
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I964b7a37fcb641a610d518a02841b25913c5be2e
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391733
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
If occasionally there is unflushed data in kernel,
nbd disconnect ioctl will not return, until these
data are flushed. spdk_nbd_disk should process these
data in flushing.
But at present, spdk_nbd_disk is running on a some
reactor with rpc. It is a hidden danger of deadlock.
Change-Id: I2850105dff78f09f0e1b3c0a570dbbf7efdb469e
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391707
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The outstanding r/w requests counter is not decremented
back if IB r/w request fails.
As the result, the rdma qpair stops pumping the requests
after the number of ibv_post_send failures reaches
the threshold for outstanding r/w requests for that qpair.
The patch decrements qpair's r/w counter back in case of
ibv_post_send returns an error.
Change-Id: I8fa0f2905974a50037034962e4d2a001290a06a9
Signed-off-by: Philipp Skadorov <philipp.skadorov@wdc.com>
Reviewed-on: https://review.gerrithub.io/391799
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This greatly increases the efficiency when the target is scaled
to many connections. Now all connections being handled by a given
thread can be polled in O(1), whereas before it was O(n) where
n was the number of connections.
Change-Id: I9f695f68093d73e6538df416b0f1aabef07119ff
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/391491
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
All SPDK libraries should use the spdk/log.h family of functions
for logging.
Change-Id: I4b9f172102c6d7998d3a634573493db3fe25ff76
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391686
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
All SPDK libraries should use the spdk/log.h family of functions
for logging.
The cause of strdup failure is only out-of-memory for malloc().
Hence errno is omitted.
Change-Id: I682f11fbb6f12c9de8d57a025b704b4f050f7474
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391685
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The struct is about to be removed.
This is a preliminary patch towards
the removal.
Change-Id: I5a479678790b8758634bdb0bb2e6facfde514aa1
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391945
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The old API is simply not sufficient.
We assumed that each request contains
a single request and a single response
descriptor. However that's not the case
for e.g. virtio scsi eventq, where each
event contains only a response.
This patch only introduces the new API,
keeping the old one intact. The old API
will be removed in subsequent patches.
Change-Id: I89e53d602165aa0c7ceb25d98237f87550f4eae7
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/390854
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
All SPDK libraries should use the spdk/log.h family of functions
for logging.
Change-Id: I088f62e6035aa1885ad0973a033ecee2f325ed30
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391684
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
All SPDK libraries should use the spdk/log.h family of functions
for logging.
Change-Id: I4c12388433f8c57291cea9f30566438a9d78e3d1
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391683
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This ensures we do not end up with a racing close v.
delete. If we decrement the ref up front, we could
start the close process (which may include persisting
metadata) and then also allow a delete operation to
start. It is safer to wait until the close operation
is done before decrementing the ref count, because then
it will eliminate this race condition (the delete op
would immediately fail).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iad7fd8320d2c9b56f3c4fce054bcb6271e19ad38
Reviewed-on: https://review.gerrithub.io/391493
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This ensures all blob-loading functionality goes through
the single spdk_bs_open_blob function which will simplify
some upcoming changes around managing global metadata
state from multiple threads.
This will also help prevent races where a delete operation
has started followed by an open on the blob that is
being deleted. Those specific changes will be in an
upcoming patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I17e995145ab23068b816b44c33483b0708f5f563
Reviewed-on: https://review.gerrithub.io/391486
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Found during unit testing for blobid_mask coming in a future
patch - the unit test will be added as part of that future
patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iecdde6ba16c5af9caf59214f328ddc22aae71e94
Reviewed-on: https://review.gerrithub.io/391692
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Similar to previous change, the ** paradigm is a bit
problematic for asynchronous routines that could fail.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife7748280482356c4c51a796817b71cd7bc7e479
Reviewed-on: https://review.gerrithub.io/391483
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>