2599 Commits

Author SHA1 Message Date
Shuhei Matsumoto
b29d208267 vhost: Replace RTE_MAX_LCORE by spdk_env_get_last_core()
Adopt two improvements for iSCSI by Ziye to VHOST.
- iscsi/conn: remove rte_config.h header
- env: export spdk_env_get_last_core function

Change-Id: I8f067093d593c8d483c52587669f8b0b706f497f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392910
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-28 12:03:40 -05:00
Shuhei Matsumoto
0686f0381b vhost: Remove DPDK dependency and simplify load balancing
The latest patch for the iSCSI connection is applied to the vhost
too.

RTE_MAX_LCORE in the for loop is removed and the for loop is
replaced by SPDK_ENV_FOREACH_CORE().

When the cpumask is unexpectedly 0, not 0 but the first core is
returned.

Change-Id: I39cfc2219a3532eccc8c0ce59712102b947a76d7
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392588
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
2017-12-28 12:03:40 -05:00
Dariusz Stojaczyk
7755bed389 virtio: split device initialization to separate functions
Previously we used to manually set
vdev->max_queues and called virtio_dev_restart
to go through all virtio init states, negotiate
features and allocate virtqueues. This is,
however, insufficient for Virtio-Blk, where we
e.g. need to check against negotiated multiqueue
flag before deciding how many queues we can use
(reading num_queues field from device config is
forbidden unless VIRTIO_BLK_F_MQ is negotiated).

This patch refactors queue-num related code
and also removes various restrictions. If device
supports less queues than requested, a warning
will be printed during initialization, but
the device will now continue to init normally.

The queue-num negotiation for virtio-user should
be eventually moved to upper layers, but that is
not necessary for now.

Change-Id: I418b56fa62c17b547243422ea077f0d76555bd13
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/393087
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
89bdcf77a1 bdev/virtio: add "scsi" keyword to filenames
Change-Id: I6bcdf502fc762aa2c7b7a9b0861e4d99df3f943e
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/393055
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
7f4ff3d922 bdev/virtio: handle missed eventq I/O
If the host fails to send some events,
the next event it succeeds to send
might have an EVENTS_MISSED flag set.
Once virtio receives such event, it will
schedule a full bus rescan to manually
detect any changes.

Change-Id: Ifa66536f8e2980ad31ee68769f042f08100da54e
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392780
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
166f6e9cb8 bdev/virtio: add rescan safety checks
Don't allow starting full device rescan
if such scan is already in progress.

This patch also makes it possible to
start a full scan while only particular
targets are being rescanned.

Change-Id: I8677f640a4e5d9d8c486dfe1e9a58331e941a461
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392373
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
5eea99fc57 bdev/virtio: add a helper function for starting tgt scan
Change-Id: If3655b26132a5df48d8c09283d0a3f76c76dacef
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392776
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
91a1d8ae93 bdev/virtio: add target hot-attach support
Implemented eventq rescan message.
When the host hot-attaches a target
to the controller, it will now
automatically appear as a virtio bdev.

Change-Id: Ie81352b70808a0e078009f618cc1fcfed407d1ba
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392374
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
31177e645c bdev/virtio: defer ctrlr removal if scan I/O is in progress
A public API to remove a virtio
controller is about to be published
soon. Hence, we need to cover all
possible corner-cases where calling
it could cause a segfault.

Change-Id: I804cc0bee4a60ab8fc159bb98b7712cc258117f3
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392271
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
8b0f049e85 bdev/virtio: resend scan I/O on full-queue scenario
The upcoming rescan implementation will
do scan I/O on existing I/O virtqueues.
To handle scan cleanly, we need to queue
scan I/O if virtqueues are full.

The heuristics for resending I/O should
be sufficient for all real case scenarios.
A scan I/O consists of either 2 or 3 iovs.
A raw I/O consists of at least 2
descriptors. Since most I/O requests
contain some additional payload, in 99%
of cases the scan I/O will be successfully
resent after polling a single I/O response.
To handle the remaining 1%, we try to
resend a scan I/O up to SCAN_REQUEST_RETRIES
(currently = 5) times.

Change-Id: I8c84ed1d109d9f403c9d7b8efabb904eb26183db
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392174
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
a27d7bc188 bdev/virtio: send scan I/O payload descriptor only when necessary
SCSI doesn't specify any payload associated
with TEST_UNIT_READY and START_STOP_UNIT
commands. Sending a payload didn't cause
any troubles so far, but still it's not SCSI
compliant and hence has been removed.

It is now assumed that iov_len of 0 means
that no payload should be sent.

Change-Id: Ibdfa7cd77af6049132278911d872cd52c10be6c7
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391868
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
36e34144f5 bdev/virtio: remove target param from functions sending scan I/O
Removed some duplicated code.

Change-Id: Ifcbf533cb6faef89adad2f25d5c652af9d0fba05
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391867
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
b212741eeb bdev/virtio: cleanup target scan
scan_finish function will now unconditionally
finish the scan, while scan_next will continue
scanning on the next target.

Now that target scan be aborted from external
sources (device removal during re-scan). There
is simply a need for such scan_finish function.

Change-Id: I49e0dd6ea7abdddc24c23af362313d657fe1dc66
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391866
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
f217aa4667 bdev/virtio: don't allow registering the same target twice
Change-Id: I0c0b6cfded7fa2450c652c477ab8c1d88fc6209f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392775
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
c3cee7efe1 bdev/virtio: register bdevs during target scan
We used to defer registering bdevs when
a target scan was in progress, as bdev examine
(gpt) would do an I/O that could be polled by
the scan poller and parsed as a scan I/O response.

This limitation has been removed a long time
ago. Scan I/O can go alongside default I/O and
bdevs can now be registered at runtime.

Change-Id: I949c57acbfe23a7246de90e90ce58ea007be947e
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/391865
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
9bfaa6b2a6 bdev/virtio: implement eventq / hot-detach
Eventq is a special virtqueue. The driver
(e.g. SPDK virtio) enqueues some fixed
amount of write-only requests. The host
device completes them only when it has
something to tell us - like a device
hotremove. After we receive the response,
containing full event details, we schedule
proper action and re-enqueue the event to
be used later by the host as another event.

Change-Id: I0516b161d8d4a49490b909fa2a454c9c9fa517f2
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/390115
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-27 15:13:03 -05:00
Dariusz Stojaczyk
a1ca7a12ff virtio: fix enqueuing zero-length descriptor chains
Previous descriptor chain was being corrupted
by setting invalid vq->req_end (virtio.c:538).

Change-Id: I4b27db02dc990e6af011a1b614e30e3050379e9f
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392774
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 15:13:03 -05:00
Shuhei Matsumoto
a58611d513 iscsi: Remove unused macro constants
Remove unused macro constants of iSCSI.
MAX_PORTAL, MAX_INITIATOR, MAX_NETMASK are still used to determine
buffer size for JSON-RPC and iSCSI.conf and are not removed in
this patch yet.

Change-Id: I3036dc472eca09eff7fa3f6ea7e8e28b0978358f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392912
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-27 12:49:32 -05:00
Shuhei Matsumoto
f53462b432 iscsi: ANY does not work as wild card netmask of ACL (degradation)
A few previous changes replaced ALL by ANY for the initiator group
because ANY was normal for this case. ANY was tested enough for
the initiator name of the initiator group but not tested for
the netmask of the initiator group.

Hence the previous changes caused degradation.
This is the bug fix and UT code is added together.

Change-Id: Idf7642dd4c111a4788aca31a0105b3497631aecd
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392923
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-26 17:21:50 -05:00
Dariusz Stojaczyk
9dde5e5c27 virtio: add new library virtio
Exported lib/bdev/rte_virtio as a separate
library not dependent on bdev.

Virtio is now accessible via spdk_internal/virtio.h.
The header is marked `internal`, as it's
not meant to be used by end users. It's
not handy to handle all backend-specific
(e.g. Virtio-SCSI) logic in a user code.

For now the Virtio interface is publicly
exposed only via bdev_virtio module. We
might want to consider adding a separate,
public Virtio-SCSI library in the future.

Note: this patch doesn't do any changes
to the virtio code. Everything is
moved 1:1.

Change-Id: I805e5d12d265d82b0bc5784c89fbadb81abdb278
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/388166
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-26 13:03:29 -05:00
Xiaodong Liu
998feedb17 nbd: place mop-up ioctl to _nbd_stop
At present, close(dev_fd) is always executed before the
return of ioctl(nbd->dev_fd, NBD_DO_IT). So when executing
these 2 ioctl commands, dev_fd is already closed.

Change-Id: I6fce73c440972af91f662f24c1fbca51a7b95d61
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391708
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-26 12:09:35 -05:00
Ziye Yang
82ac638d80 iscsi/portal_grp: make some functions into static and remove unused ones
1 Some functions should be static.
2 For the unused API, we delete them first. When it is used, we should add
them back.

Change-Id: I2c39e8bef96155fc71801c76534955d9cf6cb0bd
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392721
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-26 11:16:47 -05:00
Ziye Yang
40dbe37f63 iscsi: make spdk_del_connection_queued_task into static
No need to export this function, since it is only
used in lib/iscsi.c

Change-Id: Ib5fc3d54e45d0b5696413788a6a98278e9c2dfef
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392716
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-26 11:14:38 -05:00
Ziye Yang
57199e1e89 iscsi: change name of function spdk_iscsi_conn_handle_queued_tasks
Reason: It only handles the queue datain tasks. After the changing,
it would be more accurate for the code reading.

Change-Id: I87999f811810cadd4b58d99be1cdeba0a1a7503f
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392719
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
2017-12-26 11:13:47 -05:00
Ziye Yang
1b7ce80530 env: export spdk_env_get_last_core function.
Also use this function in iscsi/conn.c

Change-Id: I25f6da175eddb12c4ac2624d695c2c43c871d8e8
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392713
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-23 15:08:24 -05:00
Dariusz Stojaczyk
de7760df1e virtio: remove SCSI-specific Virtio features from the library code
We can now proudly say that rte_virtio
doesn't contain SCSI references anymore.

Change-Id: Id814de255d8715af03c68a237923432f1e4fe2e1
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/392628
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-21 17:24:18 -05:00
Jim Harris
534d9c2002 bdev: add per-thread spdk_bdev_io cache
This mimics the per-lcore cache that the DPDK rte_mempool
implements.  But DPDK rte_mempool relies on the DPDK
lcore_ids which are not set for non-DPDK threads (such as
the fio bdev plugin).

So implement our own per-thread bdev_io cache instead.
This is quite simple since we already have a per-thread
bdev channel called spdk_bdev_mgmt_channel.

Previously, we passed 64 to spdk_mempool for the
per-core cache size.  This patch effectively changes it
to 256 and moves it from the spdk_mempool (which we now
specify with a per-core cache size of 0) to this internal
bdev cache.  We allocate 64K of these bdev_io, so putting
a few more in each thread's cache will not hurt anything.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5e715f8c69b99130c7b80347b47a881595d184ae

Reviewed-on: https://review.gerrithub.io/392531
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-21 16:22:23 -05:00
Shuhei Matsumoto
14797d839d vhost: Allow set cpumask more than active cores for vhost
The latest change for the portal group is applied to the vhost.
The following comment is quoted from it.

Currently the cpumask must be a subset of the reactor mask.

However, this is different from sched_setaffinity() function
and taskset command of FreeBSD and Linux.  The latter will
be familier for more people. Hence the later is adopted.

The following is quoted from the FreeBSD Man Page of taskset:

  The CPU affinity is represented as a bitmask, with the lowest
  order bit corresponding to the first logical CPU and the
  highest order bit corresponding to the last logical CPU.

  Not all CPUs may exist on a given system but a mask may specify
  more CPUs than are present.

  A retrieved mask will reflect only the bits that correspond to
  CPUs physically on the system.

  If an invalid mask is given (i.e., one that corresponds to no
  valid CPUs on the current system) an error is returned.

  The masks are typically given in hexadecimal.

Change-Id: Idcd72a12ef52e4ccec8476e7d54fab82867cf936
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392587
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-21 13:32:31 -05:00
Maciej Szwed
236f84dae7 lvol: don't return lvs_bdev if it's being destroyed
During hot remove of lvol store some lvols can already be in a process
of removal. We should not start another removal process for lvol that
is already being removed.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ifc91e4cee11ee63af04eac3729d014d7c04ff98b
Reviewed-on: https://review.gerrithub.io/390217
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-21 13:30:44 -05:00
Daniel Verkamp
453f5ae9f6 bdev: unregister all bdevs in spdk_bdev_finish()
Instead of requiring each bdev module to track its own bdevs and clean
them up during its fini callback, we can walk the list of registered bdevs
during spdk_bdev_finish() and call spdk_bdev_unregister() on each one of
them before cleaning up the bdev modules.

Change-Id: I01816707c9100f66f542bfd73b90bcb0e0fb0c0c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/389878
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 13:29:29 -05:00
Changpeng Liu
381af5775f nvme: re-enable the separate metadata support in nvme driver
Commit ID "269910c0" removed the support of separate metadata,
for those controllers which can support this feature, SPDK driver
can't be used. SPDK provides APIs such as:
spdk_nvme_ctrlr_cmd_io_raw_with_md/spdk_nvme_ns_cmd_write_with_md/
spdk_nvme_ns_cmd_read_with_md, which can support separate metadata.
While here, re-enable this feature with this commit.

Change-Id: If77c21e9ac700c4b334548ebfa7e8e6286285a64
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/392440
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2017-12-21 13:21:55 -05:00
Jim Harris
40c911b957 blob: add used blobid bit array for valid blobids
This can be used for two purposes:

1) more quickly iterate the blob list, avoiding
   metadata pages that are valid but not the first
   page in the blob's metadata list
2) close races between delete and open operations -
   now we can clear the bit in the blobid bit array
   when the delete operation is in progress, ensuring
   no one else can try to open the blob

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3904648fd6fa656cb98c9e17ea763ed5a84ef537

Reviewed-on: https://review.gerrithub.io/391695
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 12:28:21 -05:00
Ziye Yang
4547ce65a8 iscsi: make some functions into static
No need to export these functions since they are not
used by other files.

Change-Id: Iab5d44667cc0d57ec105e90a71d434cc4e07f4f5
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/392590
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 12:19:44 -05:00
Ziye Yang
50d0957b5e iscsi/conn: remove rte_config.h header
For g_num_connections, we should create an proper
array size, we cannot directly create it by the size:
spdk_env_get_core_count(). The reason is that the
core mask can be non-continuous,e.g., 0x1001, thus
for effient access, we create a large array size with
last_core +1, although we will have some space waste,
but this will not be big, but still maintain the fast
array index acccess.

Change-Id: I95e1fc34e0816ac2f8764880c0d0e629f43a5dc4
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/391909
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-21 11:50:49 -05:00
Jim Harris
832f4e4df6 nvme: add quirks for Intel NVMe P4600 SSD
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iedfa8d3de8520836e184f7ef0925822fb705fc67

Reviewed-on: https://review.gerrithub.io/391672
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2017-12-21 10:45:05 -05:00
Jim Harris
8b0b3c350c bdev: add spdk_bdev_mgmt_channel_free_resources()
Call this function from spdk_bdev_mgmt_channel_destroy().
Currently there are no real resources to free, but that
will change in an upcoming patch which adds per-thread
bdev_io caches.

While here, also add a for_each_channel iterator to
call this function on each existing channel during bdev
finish code path.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9348e37053505c9fba7a6421e55ffc416668d24f

Reviewed-on: https://review.gerrithub.io/392530
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Jim Harris
f1f14e5583 bdev: pass mgmt_channel to spdk_bdev_get_io()
This prepares for some upcoming changes which will
add a per-thread bdev_io cache.

While here, remove spdk_bdev_get_io() from the
internal bdev API.  This function is not meant
to be called outside of bdev.c.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9f764a88a079fac936931c46d615999454013732

Reviewed-on: https://review.gerrithub.io/392529
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Jim Harris
45697d33ff bdev/null: complete requests asynchronously
To better match bdev modules like nvme, complete requests
for the bdev/null driver asynchronously.  This will be
done by allocating IO channels that register a poller
and keep a TAILQ of bdev IO to be completed next time
the poller runs.

This is actually more efficient as well, since completing
I/O in submit_request context defers the completion using
an event.  A benchmark of bdevperf with split running on
top of null module shows this patch increases throughput
20%.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8c664234660c249fd8ec8d9244eed33502d4103e

Reviewed-on: https://review.gerrithub.io/392528
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-21 10:42:47 -05:00
Ben Walker
1545c8eb5e nvmf: Fix bug when resizing sgroups array
Change-Id: I366b941a60d1fb00951591e7f631a65e8a449904
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392566
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-20 21:38:25 -05:00
Ben Walker
fd0770fecb nvmf: Delete subsystems when target is destroyed
Change-Id: I102954505c2c53458aae30f6d15b46e008355501
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/392565
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-20 21:38:25 -05:00
Shuhei Matsumoto
b65443be6e iscsi: Allow to set cpumask more than active cores for iSCSI connection
Currently the cpumask must be a subset of the reactor mask.

However, this is different from sched_setaffinity() function
and taskset command of FreeBSD and Linux.  The latter will
be familier for more people. Hence the later is adopted.

The following is quoted from the FreeBSD Man Page of taskset:

  The CPU affinity is represented as a bitmask, with the lowest
  order bit corresponding to the first logical CPU and the
  highest order bit corresponding to the last logical CPU.

  Not all CPUs may exist on a given system but a mask may specify
  more CPUs than are present.

  A retrieved mask will reflect only the bits that correspond to
  CPUs physically on the system.

  If an invalid mask is given (i.e., one that corresponds to no
  valid CPUs on the current system) an error is returned.

  The masks are typically given in hexadecimal.

Change-Id: I7e0d2e029569bfc986f7fcdf78048791ab389f72
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/392446
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:19:04 -05:00
Shuhei Matsumoto
0241b85bc1 iscsi: Remove DPDK dependency and simplify load balancing
Load balancing for idle iSCSI connections uses the RTE EAL Launch
state and uses DPDK RTE EAL API.

But all SPDK reactors will exit simultaneously because each SPDK
reactor checks if the global state is RUNNING to exit.

Hence calling rte_eal_get_lcore_state() is not necessary.

When the reactor hot-plug function is supported, this implementation
will be reconsidered together.

Change-Id: I34eaf3e42b5b7deae6473d2bfaf0910aaa9da6de
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391339
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:17:15 -05:00
Shuhei Matsumoto
855d8e032a iscsi: change master_lcore to first_core for idle connection management
Currently idle iSCSI connections are managed by the master lcore,
but the master lcore is like BSP of OS and for initialization.

To manage idle iSCSI connections it is important that the core is
consistent.

Hence the first core is better than the master lcore.

In this patch the following are changed together:
- Errors of kqueue() and epoll_create1() are not related with master
  lcore. "master lcore" is removed and errno is added into the log.
- In spdk_iscsi_conn_allocate_reactor(), when cpumask is 0, 0 is
  selected as core number. 0 is not safe and first_core is used instead.

In spdk_iscsi_conn_allocate_reactor(), when first_core is used instead
of master_lcore, we may observe some contradiction in the following
code. But few changes are done in this patch.

In the current implementation we can assume the first lcore is
equal to the master lcore and the following code will be removed
in the subsequent patch.

/*
 * DPDK returns WAIT for the master lcore instead of RUNNING.
 * So we always treat the reactor on master core as RUNNING.
 */
if (i == master_lcore) {
    state = RUNNING;
} else {
    state = rte_eal_get_lcore_state(i);
}

Change-Id: I6cac06c27b289db5ea1f9452e33489286c64d2fa
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391338
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:17:15 -05:00
Shuhei Matsumoto
7767990829 iscsi: Unexpected usage of RTE EAL Launch in load balancing
Before removing the DPDK dependency from the iSCSI connection
load balancing, this should be done.

In spdk_iscsi_conn_allocate_reactor(cpumask)

- if any lcore[i]'s state is FINISHED, the caller calls
  rte_eal_wait_lcore(i). But the purpose of rte_eal_wait_lcore()
  is to check if the slave is in a WAIT state before calling
  rte_eal_remote_launch(). The meaning of this usage is not clear.

- If the state of lcore[i] is WAIT or FINISHED, the reactor does
  not run on the lcore[i]. iSCSI connections consist of not reactor
  but poller. Hence selecting lcore[i] with the state WAIT or
  FINISHED does not look correct.

Change-Id: If8c420f2d16dc44e77f8963f5732faa52e3d829b
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391332
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
2017-12-20 15:17:15 -05:00
Changpeng Liu
9ca670ac8a util/crc16: add crc16 library support and unit tests
Change-Id: I7174f1799361b8337ff5590b90ad6a0564ca8e9b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391899
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-20 15:12:26 -05:00
Shuhei Matsumoto
8fd2882fb2 etc/spdk/iscsi.conf.in: How to use cpumask for the connection
User can specify processor affinity for each iSCSI connection
by specifying cpumask in the configuration file.

However the example of iscsi.conf.in does not have any description
about this. Hence it is very difficult for user to use this.

The portal group section of the config dump file has the same
description. Hence it is also changed.

Change-Id: I6e7b3bb67e10e78f4a47165525f023555080f146
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391510
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
2017-12-20 11:31:45 -05:00
GangCao
c22f12c8df event: update the poller's period_ticks calculation
There existing an overflow for the large value of sleeping time
for the poller and the actual time may be incorrect setting due
to this overflow. Update the calculation here.

Change-Id: I14fe21d3f0e1abaa9d13d3d6254aff254d2dfcc3
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/392127
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 18:49:16 -05:00
Jim Harris
9c1d97a247 nvme: add checks for sq_head
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8519a4b68db44cb8fe6dd251a52bf0f1dca73c32

Reviewed-on: https://review.gerrithub.io/391890
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-19 16:45:05 -05:00
Shuhei Matsumoto
28ff7a0da7 iscsi: Set cpumask to all available CPUs when PG is created JSON-RPC
Currently the default setting of cpumask of network portal is
different between iSCSI.conf and JSON-RPC.

When a network portal is created by iSCSI.conf, its cpumask is
set to all available CPUs by default. However when it is created
by JSON-RPC, its cpumask is set to 0 by default.

Auto test 'test/iscsi_tgt/idle_migration creates a network portal
by JSON-RPC. Hence the auto test cannot test the load balancing
function of iSCSI target.

Change-Id: I2685172cb9259b643f6d18d4660a8425dcef3f5d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/391898
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-12-19 15:22:05 -05:00
Changpeng Liu
4395093b97 vhost_blk: close the bdev in the hotplug callback
For normal exit logic, such as Ctrl+C, vhost blk will not
shutdown the backend device, e.g: NVMe controller.

Change-Id: I7fdf8687a2cfa6a8cc6a61428d722debfa9a2180
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/391348
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2017-12-19 14:54:51 -05:00