Apply two iSCSI improvements by Ziye to vhost:
- iscsi/conn: remove rte_config.h header
- env: export spdk_env_get_last_core function
Change-Id: I8f067093d593c8d483c52587669f8b0b706f497f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392910
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The latest patch for the iSCSI connection is also applied to
vhost.
The for loop over RTE_MAX_LCORE is replaced by
SPDK_ENV_FOREACH_CORE().
When the cpumask is unexpectedly 0, the first core is returned
instead of core 0.
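A minimal sketch of the selection logic described above, not the actual vhost code. Cores are modeled as bits of a 64-bit mask; first_core(), pick_core() and the load array are hypothetical names, and the plain for loop stands in for SPDK_ENV_FOREACH_CORE().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CORES 64u /* assumption: the core mask fits in 64 bits */

/* Lowest core set in app_mask, or UINT32_MAX if the mask is empty. */
uint32_t
first_core(uint64_t app_mask)
{
	uint32_t i;

	for (i = 0; i < MAX_CORES; i++) {
		if (app_mask & (UINT64_C(1) << i)) {
			return i;
		}
	}
	return UINT32_MAX;
}

/*
 * Pick the least loaded core that belongs to the application (app_mask)
 * and is allowed by cpumask. If no core qualifies, e.g. because cpumask
 * is 0, fall back to the first application core rather than core 0.
 */
uint32_t
pick_core(uint64_t app_mask, uint64_t cpumask, const uint32_t *load)
{
	uint32_t i, selected = UINT32_MAX;

	assert(load != NULL);
	for (i = 0; i < MAX_CORES; i++) {
		uint64_t bit = UINT64_C(1) << i;

		if (!(app_mask & bit) || !(cpumask & bit)) {
			continue;
		}
		if (selected == UINT32_MAX || load[i] < load[selected]) {
			selected = i;
		}
	}
	return selected != UINT32_MAX ? selected : first_core(app_mask);
}
```

With an application mask of 0xA and a cpumask of 0, this sketch returns core 1, which is a core the application actually runs on.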
Change-Id: I39cfc2219a3532eccc8c0ce59712102b947a76d7
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392588
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Previously we used to manually set
vdev->max_queues and called virtio_dev_restart
to go through all virtio init states, negotiate
features and allocate virtqueues. This is,
however, insufficient for Virtio-Blk, where we
e.g. need to check against negotiated multiqueue
flag before deciding how many queues we can use
(reading num_queues field from device config is
forbidden unless VIRTIO_BLK_F_MQ is negotiated).
This patch refactors queue-num related code
and also removes various restrictions. If the device
supports fewer queues than requested, a warning
will be printed during initialization, but
the device will now continue to init normally.
The queue-num negotiation for virtio-user should
be eventually moved to upper layers, but that is
not necessary for now.
Change-Id: I418b56fa62c17b547243422ea077f0d76555bd13
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/393087
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If the host fails to send some events,
the next event it manages to send
may have the EVENTS_MISSED flag set.
Once virtio receives such an event, it will
schedule a full bus rescan to manually
detect any changes.
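A rough sketch of how such an event could be classified. The two constants follow the Virtio SCSI specification; classify_event() and the event_action enum are hypothetical names, and the sketch assumes that a full rescan makes handling the individual event unnecessary.

```c
#include <stdint.h>

/* Event type values as defined by the Virtio SCSI specification. */
#define VIRTIO_SCSI_T_TRANSPORT_RESET 1u
#define VIRTIO_SCSI_T_EVENTS_MISSED   0x80000000u

/* Hypothetical set of actions the driver can schedule. */
enum event_action {
	ACTION_NONE,
	ACTION_TARGET_EVENT, /* handle the single target named in the event */
	ACTION_FULL_RESCAN   /* some events were lost, rescan the whole bus */
};

enum event_action
classify_event(uint32_t event)
{
	/* The missed flag is OR-ed into the event type, so test it first. */
	if (event & VIRTIO_SCSI_T_EVENTS_MISSED) {
		return ACTION_FULL_RESCAN;
	}
	if (event == VIRTIO_SCSI_T_TRANSPORT_RESET) {
		return ACTION_TARGET_EVENT;
	}
	return ACTION_NONE;
}
```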
Change-Id: Ifa66536f8e2980ad31ee68769f042f08100da54e
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392780
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Don't allow starting a full device rescan
if such a scan is already in progress.
This patch also makes it possible to
start a full scan while only particular
targets are being rescanned.
Change-Id: I8677f640a4e5d9d8c486dfe1e9a58331e941a461
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392373
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Implemented eventq rescan message.
When the host hot-attaches a target
to the controller, it will now
automatically appear as a virtio bdev.
Change-Id: Ie81352b70808a0e078009f618cc1fcfed407d1ba
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392374
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A public API to remove a virtio
controller will be published
soon. Hence, we need to cover all
possible corner cases where calling
it could cause a segfault.
Change-Id: I804cc0bee4a60ab8fc159bb98b7712cc258117f3
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392271
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The upcoming rescan implementation will
do scan I/O on existing I/O virtqueues.
To handle scan cleanly, we need to queue
scan I/O if virtqueues are full.
The heuristics for resending I/O should
be sufficient for all real-world scenarios.
A scan I/O consists of either 2 or 3 iovs.
A raw I/O consists of at least 2
descriptors. Since most I/O requests
contain some additional payload, in 99%
of cases the scan I/O will be successfully
resent after polling a single I/O response.
To handle the remaining 1%, we try to
resend a scan I/O up to SCAN_REQUEST_RETRIES
(currently = 5) times.
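The retry heuristic can be sketched roughly as follows. This is a simplified model and not the bdev_virtio code: the virtqueue is reduced to a free-descriptor counter, and vq_model, scan_model, scan_send() and scan_on_io_response() are hypothetical names. Only SCAN_REQUEST_RETRIES and its value come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCAN_REQUEST_RETRIES 5

/* Hypothetical, heavily simplified virtqueue: only tracks free descriptors. */
struct vq_model {
	uint32_t free_descs;
};

struct scan_model {
	uint32_t needed_descs; /* 2 or 3 for a scan I/O */
	unsigned retries_left;
	bool pending;          /* queued because the virtqueue was full */
};

/* Try to send the scan I/O; queue it if the virtqueue is full. */
bool
scan_send(struct vq_model *vq, struct scan_model *scan)
{
	if (vq->free_descs < scan->needed_descs) {
		scan->pending = true;
		return false;
	}
	vq->free_descs -= scan->needed_descs;
	scan->retries_left = SCAN_REQUEST_RETRIES;
	scan->pending = false;
	return true;
}

/*
 * Called after polling one I/O response that released `freed` descriptors.
 * Returns 1 if the queued scan I/O was resent, 0 if no scan I/O is queued
 * or it is still waiting, and -1 once the retry budget is exhausted.
 */
int
scan_on_io_response(struct vq_model *vq, struct scan_model *scan, uint32_t freed)
{
	vq->free_descs += freed;
	if (!scan->pending) {
		return 0;
	}
	if (scan_send(vq, scan)) {
		return 1;
	}
	if (--scan->retries_left == 0) {
		scan->pending = false;
		return -1;
	}
	return 0;
}
```

Because a completed raw I/O usually releases more descriptors than a scan I/O needs, the first call to scan_on_io_response() is expected to succeed in the common case.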
Change-Id: I8c84ed1d109d9f403c9d7b8efabb904eb26183db
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392174
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
SCSI doesn't specify any payload associated
with TEST_UNIT_READY and START_STOP_UNIT
commands. Sending a payload hasn't caused
any trouble so far, but it is not SCSI
compliant and hence has been removed.
It is now assumed that iov_len of 0 means
that no payload should be sent.
Change-Id: Ibdfa7cd77af6049132278911d872cd52c10be6c7
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391868
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
scan_finish function will now unconditionally
finish the scan, while scan_next will continue
scanning on the next target.
Now that a target scan can be aborted from external
sources (e.g. device removal during a rescan), such a
scan_finish function is simply needed.
Change-Id: I49e0dd6ea7abdddc24c23af362313d657fe1dc66
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391866
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We used to defer registering bdevs when
a target scan was in progress, as bdev examine
(gpt) would do an I/O that could be polled by
the scan poller and parsed as a scan I/O response.
This limitation has been removed a long time
ago. Scan I/O can go alongside default I/O and
bdevs can now be registered at runtime.
Change-Id: I949c57acbfe23a7246de90e90ce58ea007be947e
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391865
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Eventq is a special virtqueue. The driver
(e.g. SPDK virtio) enqueues some fixed
amount of write-only requests. The host
device completes them only when it has
something to tell us - like a device
hotremove. After we receive the response,
containing full event details, we schedule
proper action and re-enqueue the event to
be used later by the host as another event.
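The eventq life cycle described above can be modeled in a few lines. This is an illustrative sketch only: eventq_model, event_buf, EVENTQ_DEPTH and the three functions are hypothetical, and a real eventq uses virtqueue descriptors rather than a flag per buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENTQ_DEPTH 4 /* hypothetical number of pre-posted event buffers */

struct event_buf {
	uint32_t event; /* written by the host when it completes the buffer */
	bool posted;    /* true while the buffer is available to the host */
};

struct eventq_model {
	struct event_buf bufs[EVENTQ_DEPTH];
	unsigned handled; /* total events processed by the driver */
};

/* Driver: hand every buffer to the host up front. */
void
eventq_init(struct eventq_model *q)
{
	size_t i;

	q->handled = 0;
	for (i = 0; i < EVENTQ_DEPTH; i++) {
		q->bufs[i].event = 0;
		q->bufs[i].posted = true;
	}
}

/* Host: complete one posted buffer to report an event; false if none is left. */
bool
host_send_event(struct eventq_model *q, uint32_t event)
{
	size_t i;

	for (i = 0; i < EVENTQ_DEPTH; i++) {
		if (q->bufs[i].posted) {
			q->bufs[i].event = event;
			q->bufs[i].posted = false;
			return true;
		}
	}
	return false;
}

/* Driver: process completed buffers and re-enqueue them right away. */
unsigned
eventq_poll(struct eventq_model *q)
{
	unsigned count = 0;
	size_t i;

	for (i = 0; i < EVENTQ_DEPTH; i++) {
		if (!q->bufs[i].posted) {
			/* A real driver would schedule an action based on the event here. */
			q->handled++;
			count++;
			q->bufs[i].event = 0;
			q->bufs[i].posted = true;
		}
	}
	return count;
}
```

When host_send_event() returns false in this model, the host has no buffer left to report the event.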
Change-Id: I0516b161d8d4a49490b909fa2a454c9c9fa517f2
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/390115
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The previous descriptor chain was being corrupted
by setting an invalid vq->req_end (virtio.c:538).
Change-Id: I4b27db02dc990e6af011a1b614e30e3050379e9f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392774
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Remove unused macro constants of iSCSI.
MAX_PORTAL, MAX_INITIATOR, MAX_NETMASK are still used to determine
buffer size for JSON-RPC and iSCSI.conf and are not removed in
this patch yet.
Change-Id: I3036dc472eca09eff7fa3f6ea7e8e28b0978358f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392912
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
A few previous changes replaced ALL by ANY for the initiator group
because ANY is the usual keyword for this case. ANY was tested enough
for the initiator name of the initiator group but not tested for
the netmask of the initiator group.
Hence the previous changes caused a regression.
This patch fixes the bug and adds unit test code for it.
Change-Id: Idf7642dd4c111a4788aca31a0105b3497631aecd
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392923
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Exported lib/bdev/rte_virtio as a separate
library not dependent on bdev.
Virtio is now accessible via spdk_internal/virtio.h.
The header is marked `internal`, as it's
not meant to be used by end users. Handling
all backend-specific (e.g. Virtio-SCSI) logic
in user code would not be practical.
For now the Virtio interface is publicly
exposed only via bdev_virtio module. We
might want to consider adding a separate,
public Virtio-SCSI library in the future.
Note: this patch doesn't make any changes
to the virtio code. Everything is
moved 1:1.
Change-Id: I805e5d12d265d82b0bc5784c89fbadb81abdb278
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/388166
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
At present, close(dev_fd) is always executed before the
return of ioctl(nbd->dev_fd, NBD_DO_IT). So when executing
these 2 ioctl commands, dev_fd is already closed.
Change-Id: I6fce73c440972af91f662f24c1fbca51a7b95d61
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391708
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
1. Some functions should be static.
2. Unused APIs are deleted for now. They can be added back when they
are needed.
Change-Id: I2c39e8bef96155fc71801c76534955d9cf6cb0bd
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392721
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
No need to export this function, since it is only
used in lib/iscsi.c
Change-Id: Ib5fc3d54e45d0b5696413788a6a98278e9c2dfef
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392716
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reason: it only handles the queued datain tasks. After the change,
the name is more accurate for anyone reading the code.
Change-Id: I87999f811810cadd4b58d99be1cdeba0a1a7503f
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392719
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Also use this function in iscsi/conn.c
Change-Id: I25f6da175eddb12c4ac2624d695c2c43c871d8e8
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392713
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We can now proudly say that rte_virtio
doesn't contain SCSI references anymore.
Change-Id: Id814de255d8715af03c68a237923432f1e4fe2e1
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392628
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
This mimics the per-lcore cache that the DPDK rte_mempool
implements. But DPDK rte_mempool relies on the DPDK
lcore_ids which are not set for non-DPDK threads (such as
the fio bdev plugin).
So implement our own per-thread bdev_io cache instead.
This is quite simple since we already have a per-thread
bdev channel called spdk_bdev_mgmt_channel.
Previously, we passed 64 to spdk_mempool for the
per-core cache size. This patch effectively changes it
to 256 and moves it from the spdk_mempool (which we now
specify with a per-core cache size of 0) to this internal
bdev cache. We allocate 64K of these bdev_io, so putting
a few more in each thread's cache will not hurt anything.
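A minimal sketch of the idea, assuming a simple singly linked free list. io_model, pool_model and mgmt_ch_model are hypothetical stand-ins for bdev_io, the shared spdk_mempool and spdk_bdev_mgmt_channel; only the cache size of 256 comes from the text above.

```c
#include <stddef.h>

#define IO_CACHE_SIZE 256 /* per-thread cache size named in the text */

struct io_model {
	struct io_model *next;
};

/* Stand-in for the shared pool; a real pool must be thread safe. */
struct pool_model {
	struct io_model *free_list;
};

/* Stand-in for the per-thread management channel that owns the cache. */
struct mgmt_ch_model {
	struct io_model *cache;
	size_t cache_count;
};

void
pool_put(struct pool_model *pool, struct io_model *io)
{
	io->next = pool->free_list;
	pool->free_list = io;
}

struct io_model *
pool_get(struct pool_model *pool)
{
	struct io_model *io = pool->free_list;

	if (io != NULL) {
		pool->free_list = io->next;
	}
	return io;
}

/* Allocate from the thread-local cache first, then from the shared pool. */
struct io_model *
io_get(struct mgmt_ch_model *ch, struct pool_model *pool)
{
	struct io_model *io = ch->cache;

	if (io != NULL) {
		ch->cache = io->next;
		ch->cache_count--;
		return io;
	}
	return pool_get(pool);
}

/* Return to the thread-local cache unless it is already full. */
void
io_put(struct mgmt_ch_model *ch, struct pool_model *pool, struct io_model *io)
{
	if (ch->cache_count < IO_CACHE_SIZE) {
		io->next = ch->cache;
		ch->cache = io;
		ch->cache_count++;
	} else {
		pool_put(pool, io);
	}
}
```

Since the cache is only touched by the thread that owns the channel, the fast path in this sketch needs no locking and does not depend on DPDK lcore IDs.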
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5e715f8c69b99130c7b80347b47a881595d184ae
Reviewed-on: https://review.gerrithub.io/392531
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
The latest change for the portal group is applied to the vhost.
The following comment is quoted from it.
Currently the cpumask must be a subset of the reactor mask.
However, this differs from the sched_setaffinity() function
and the taskset command of FreeBSD and Linux. The latter will
be familiar to more people. Hence the latter is adopted.
The following is quoted from the FreeBSD Man Page of taskset:
The CPU affinity is represented as a bitmask, with the lowest
order bit corresponding to the first logical CPU and the
highest order bit corresponding to the last logical CPU.
Not all CPUs may exist on a given system but a mask may specify
more CPUs than are present.
A retrieved mask will reflect only the bits that correspond to
CPUs physically on the system.
If an invalid mask is given (i.e., one that corresponds to no
valid CPUs on the current system) an error is returned.
The masks are typically given in hexadecimal.
Change-Id: Idcd72a12ef52e4ccec8476e7d54fab82867cf936
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392587
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
During hot remove of an lvol store some lvols may already be in the
process of removal. We should not start another removal process for an
lvol that is already being removed.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ifc91e4cee11ee63af04eac3729d014d7c04ff98b
Reviewed-on: https://review.gerrithub.io/390217
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Instead of requiring each bdev module to track its own bdevs and clean
them up during its fini callback, we can walk the list of registered bdevs
during spdk_bdev_finish() and call spdk_bdev_unregister() on each one of
them before cleaning up the bdev modules.
Change-Id: I01816707c9100f66f542bfd73b90bcb0e0fb0c0c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/389878
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Commit 269910c0 removed support for separate metadata, so the
SPDK driver can't be used with controllers that support this
feature. SPDK provides APIs such as
spdk_nvme_ctrlr_cmd_io_raw_with_md/spdk_nvme_ns_cmd_write_with_md/
spdk_nvme_ns_cmd_read_with_md, which can support separate metadata.
Re-enable this feature with this commit.
Change-Id: If77c21e9ac700c4b334548ebfa7e8e6286285a64
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/392440
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This can be used for two purposes:
1) more quickly iterate the blob list, avoiding
metadata pages that are valid but not the first
page in the blob's metadata list
2) close races between delete and open operations -
now we can clear the bit in the blobid bit array
when the delete operation is in progress, ensuring
no one else can try to open the blob
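Both uses can be illustrated with a toy bit array. This is a sketch under the assumption of at most 64 blobs; bs_model and the function names are hypothetical and do not mirror the blobstore internals.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BLOBS 64u /* assumption for the sketch: one 64-bit word of blob IDs */

struct bs_model {
	uint64_t used_blobids; /* bit i set: blob i exists and may be opened */
};

void
blob_created(struct bs_model *bs, uint32_t id)
{
	if (id < MAX_BLOBS) {
		bs->used_blobids |= UINT64_C(1) << id;
	}
}

/* Clear the bit as soon as the delete starts, so opens fail from then on. */
void
blob_delete_start(struct bs_model *bs, uint32_t id)
{
	if (id < MAX_BLOBS) {
		bs->used_blobids &= ~(UINT64_C(1) << id);
	}
}

bool
blob_open_allowed(const struct bs_model *bs, uint32_t id)
{
	return id < MAX_BLOBS && ((bs->used_blobids >> id) & 1u) != 0;
}

/* First valid blob ID at or after start, or -1; no metadata page is read. */
int
blob_iter_next(const struct bs_model *bs, uint32_t start)
{
	uint32_t id;

	for (id = start; id < MAX_BLOBS; id++) {
		if ((bs->used_blobids >> id) & 1u) {
			return (int)id;
		}
	}
	return -1;
}
```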
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3904648fd6fa656cb98c9e17ea763ed5a84ef537
Reviewed-on: https://review.gerrithub.io/391695
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
No need to export these functions since they are not
used by other files.
Change-Id: Iab5d44667cc0d57ec105e90a71d434cc4e07f4f5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392590
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
g_num_connections must be allocated with a proper
array size. It cannot be sized directly by
spdk_env_get_core_count(), because the core mask
can be non-contiguous, e.g. 0x1001. For efficient
access, we allocate an array of last_core + 1
entries. This wastes some space, but not much,
and it keeps the fast array index access.
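The sizing argument can be checked with a small sketch. The helpers below are hypothetical and operate on a plain 64-bit mask instead of the SPDK env API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Number of cores set in the mask. */
uint32_t
core_count(uint64_t mask)
{
	uint32_t count = 0;

	for (; mask != 0; mask >>= 1) {
		count += (uint32_t)(mask & 1u);
	}
	return count;
}

/* Index of the highest core set in the mask (0 for an empty mask). */
uint32_t
last_core(uint64_t mask)
{
	uint32_t i, last = 0;

	for (i = 0; i < 64; i++) {
		if (mask & (UINT64_C(1) << i)) {
			last = i;
		}
	}
	return last;
}

/*
 * Allocate one zeroed counter per possible core index so that the array
 * can be indexed directly by core number. Returns NULL for an empty mask
 * or on allocation failure; the caller frees the result.
 */
int *
alloc_per_core_counters(uint64_t mask, uint32_t *num_entries)
{
	if (mask == 0) {
		return NULL;
	}
	*num_entries = last_core(mask) + 1;
	return calloc(*num_entries, sizeof(int));
}
```

For the mask 0x1001 the core count is 2, but the cores are 0 and 12, so an array of 2 entries could not be indexed by core number while an array of 13 entries can.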
Change-Id: I95e1fc34e0816ac2f8764880c0d0e629f43a5dc4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/391909
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iedfa8d3de8520836e184f7ef0925822fb705fc67
Reviewed-on: https://review.gerrithub.io/391672
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Call this function from spdk_bdev_mgmt_channel_destroy().
Currently there are no real resources to free, but that
will change in an upcoming patch which adds per-thread
bdev_io caches.
While here, also add a for_each_channel iterator to
call this function on each existing channel during bdev
finish code path.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9348e37053505c9fba7a6421e55ffc416668d24f
Reviewed-on: https://review.gerrithub.io/392530
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prepares for some upcoming changes which will
add a per-thread bdev_io cache.
While here, remove spdk_bdev_get_io() from the
internal bdev API. This function is not meant
to be called outside of bdev.c.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f764a88a079fac936931c46d615999454013732
Reviewed-on: https://review.gerrithub.io/392529
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
To better match bdev modules like nvme, complete requests
for the bdev/null driver asynchronously. This will be
done by allocating IO channels that register a poller
and keep a TAILQ of bdev IO to be completed next time
the poller runs.
This is actually more efficient as well, since completing
I/O in submit_request context defers the completion using
an event. A benchmark of bdevperf with split running on
top of the null module shows this patch increases throughput
by 20%.
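The queue-and-poll pattern can be sketched with the TAILQ macros from sys/queue.h. null_io_model, null_channel_model and the functions below are hypothetical; the real module works with spdk_bdev_io and an SPDK poller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct null_io_model {
	bool completed;
	TAILQ_ENTRY(null_io_model) link;
};

/* Per-channel state: I/O waiting to be completed by the next poll. */
struct null_channel_model {
	TAILQ_HEAD(null_io_list, null_io_model) pending;
};

void
null_channel_init(struct null_channel_model *ch)
{
	TAILQ_INIT(&ch->pending);
}

/* Submit path: only queue the I/O, do not complete it here. */
void
null_submit(struct null_channel_model *ch, struct null_io_model *io)
{
	io->completed = false;
	TAILQ_INSERT_TAIL(&ch->pending, io, link);
}

/* Poller: complete everything that was queued; returns the number completed. */
int
null_poll(struct null_channel_model *ch)
{
	struct null_io_model *io;
	int count = 0;

	while ((io = TAILQ_FIRST(&ch->pending)) != NULL) {
		TAILQ_REMOVE(&ch->pending, io, link);
		io->completed = true;
		count++;
	}
	return count;
}
```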
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8c664234660c249fd8ec8d9244eed33502d4103e
Reviewed-on: https://review.gerrithub.io/392528
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Change-Id: I366b941a60d1fb00951591e7f631a65e8a449904
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392566
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Change-Id: I102954505c2c53458aae30f6d15b46e008355501
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392565
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Currently the cpumask must be a subset of the reactor mask.
However, this differs from the sched_setaffinity() function
and the taskset command of FreeBSD and Linux. The latter will
be familiar to more people. Hence the latter is adopted.
The following is quoted from the FreeBSD Man Page of taskset:
The CPU affinity is represented as a bitmask, with the lowest
order bit corresponding to the first logical CPU and the
highest order bit corresponding to the last logical CPU.
Not all CPUs may exist on a given system but a mask may specify
more CPUs than are present.
A retrieved mask will reflect only the bits that correspond to
CPUs physically on the system.
If an invalid mask is given (i.e., one that corresponds to no
valid CPUs on the current system) an error is returned.
The masks are typically given in hexadecimal.
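A sketch of the taskset-like semantics quoted above, assuming 64-bit masks. resolve_cpumask() is a hypothetical helper, and returning -EINVAL is an assumption about how the error would be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Reduce a requested cpumask to the cores that really exist in the
 * reactor mask. Extra bits are ignored; a mask that matches no reactor
 * core is rejected.
 */
int
resolve_cpumask(uint64_t requested, uint64_t reactor_mask, uint64_t *effective)
{
	uint64_t mask = requested & reactor_mask;

	assert(effective != NULL);
	if (mask == 0) {
		return -EINVAL;
	}
	*effective = mask;
	return 0;
}
```

With a reactor mask of 0x3, a requested mask of 0xF resolves to 0x3, while 0xC is rejected because it names no valid core.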
Change-Id: I7e0d2e029569bfc986f7fcdf78048791ab389f72
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392446
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Load balancing for idle iSCSI connections uses the RTE EAL launch
state through the DPDK RTE EAL API.
But all SPDK reactors exit simultaneously, because each SPDK
reactor checks whether the global state is RUNNING to decide when to exit.
Hence calling rte_eal_get_lcore_state() is not necessary.
When reactor hot-plug is supported, this implementation
will be reconsidered along with it.
Change-Id: I34eaf3e42b5b7deae6473d2bfaf0910aaa9da6de
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391339
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently idle iSCSI connections are managed by the master lcore,
but the master lcore is like the BSP of an OS and is meant for
initialization.
To manage idle iSCSI connections it is important that the core is
consistent.
Hence the first core is better than the master lcore.
In this patch the following are changed together:
- Errors of kqueue() and epoll_create1() are not related to the master
lcore. "master lcore" is removed and errno is added to the log.
- In spdk_iscsi_conn_allocate_reactor(), when cpumask is 0, 0 is
selected as the core number. 0 is not safe, so first_core is used instead.
In spdk_iscsi_conn_allocate_reactor(), when first_core is used instead
of master_lcore, the following code may look contradictory, but it is
left mostly unchanged in this patch.
In the current implementation we can assume the first lcore is
equal to the master lcore, and the following code will be removed
in a subsequent patch.
    /*
     * DPDK returns WAIT for the master lcore instead of RUNNING.
     * So we always treat the reactor on master core as RUNNING.
     */
    if (i == master_lcore) {
        state = RUNNING;
    } else {
        state = rte_eal_get_lcore_state(i);
    }
Change-Id: I6cac06c27b289db5ea1f9452e33489286c64d2fa
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391338
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Before removing the DPDK dependency from the iSCSI connection
load balancing, this should be done.
In spdk_iscsi_conn_allocate_reactor(cpumask)
- if any lcore[i]'s state is FINISHED, the caller calls
rte_eal_wait_lcore(i). But the purpose of rte_eal_wait_lcore()
is to check if the slave is in a WAIT state before calling
rte_eal_remote_launch(). The meaning of this usage is not clear.
- If the state of lcore[i] is WAIT or FINISHED, the reactor is
not running on lcore[i]. iSCSI connections are made up of pollers,
not reactors. Hence selecting lcore[i] with the state WAIT or
FINISHED does not look correct.
Change-Id: If8c420f2d16dc44e77f8963f5732faa52e3d829b
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391332
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Change-Id: I7174f1799361b8337ff5590b90ad6a0564ca8e9b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391899
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Users can specify processor affinity for each iSCSI connection
by specifying a cpumask in the configuration file.
However, the example iscsi.conf.in does not describe this at all,
which makes the feature very difficult to use.
The portal group section of the config dump file has the same
description. Hence it is also changed.
Change-Id: I6e7b3bb67e10e78f4a47165525f023555080f146
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391510
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
An overflow occurs for large values of the poller's sleep time,
so the actual time may be set incorrectly.
Update the calculation to avoid this overflow.
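One common way to avoid this kind of overflow is to split the period into whole seconds and a remainder before multiplying. The sketch below illustrates that technique with hypothetical function names; it assumes a microsecond period and a tick rate in Hz and is not necessarily identical to the change made here.

```c
#include <stdint.h>

#define USEC_PER_SEC UINT64_C(1000000)

/* Naive conversion: period_us * ticks_hz can wrap around 64 bits. */
uint64_t
usec_to_ticks_naive(uint64_t period_us, uint64_t ticks_hz)
{
	return period_us * ticks_hz / USEC_PER_SEC;
}

/*
 * Overflow-resistant conversion: whole seconds are multiplied directly,
 * and only the sub-second remainder (< 1000000) goes through the
 * multiply-then-divide step.
 */
uint64_t
usec_to_ticks(uint64_t period_us, uint64_t ticks_hz)
{
	uint64_t quotient = period_us / USEC_PER_SEC;
	uint64_t remainder = period_us % USEC_PER_SEC;

	return quotient * ticks_hz + remainder * ticks_hz / USEC_PER_SEC;
}
```

For a 2.4 GHz tick rate, a period of 10,000,000,000 microseconds makes the naive product exceed 2^64, while the split version yields the expected 24,000,000,000,000 ticks.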
Change-Id: I14fe21d3f0e1abaa9d13d3d6254aff254d2dfcc3
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/392127
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8519a4b68db44cb8fe6dd251a52bf0f1dca73c32
Reviewed-on: https://review.gerrithub.io/391890
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently the default setting of cpumask of network portal is
different between iSCSI.conf and JSON-RPC.
When a network portal is created by iSCSI.conf, its cpumask is
set to all available CPUs by default. However when it is created
by JSON-RPC, its cpumask is set to 0 by default.
The auto test 'test/iscsi_tgt/idle_migration' creates a network portal
by JSON-RPC. Hence the auto test cannot test the load balancing
function of the iSCSI target.
Change-Id: I2685172cb9259b643f6d18d4660a8425dcef3f5d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391898
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
On a normal exit, such as Ctrl+C, vhost blk does not
shut down the backend device, e.g. an NVMe controller.
Change-Id: I7fdf8687a2cfa6a8cc6a61428d722debfa9a2180
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391348
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>