We used to rely on lcore >= 0 for sessions that are
started (have their pollers running) and in order to
prevent data races, that lcore field had to be set from
the same thread that runs the pollers, directly after
registering/unregistering them. The lcore was always
set to spdk_env_get_current_core(), but we won't be able
to use an equivalent get_current_poll_group() function
after we switch to poll groups. We will have a poll group
object only inside spdk_vhost_session_send_event() that's
called from the DPDK rte_vhost thread.
In order to change the lcore field (or a poll group one)
from spdk_vhost_session_send_event(), we'll need a separate
field to maintain the started/stopped status that's only
going to be modified from the session's thread.
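A minimal sketch of the idea, using hypothetical names rather than the real
struct spdk_vhost_session: the started flag is owned by the session's thread,
so lcore (or a poll group) can be assigned elsewhere without affecting it.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified session; not the real SPDK definition. */
struct example_session {
	int32_t lcore;	/* may be assigned by the event-sending thread */
	bool started;	/* modified only from the session's own thread */
};

/* Called on the session's thread right after its pollers are registered. */
static void
example_session_mark_started(struct example_session *s)
{
	assert(s != NULL);
	s->started = true;
}

/* Called on the session's thread right after its pollers are unregistered. */
static void
example_session_mark_stopped(struct example_session *s)
{
	assert(s != NULL);
	s->started = false;
}

/* Callers test this flag instead of checking lcore >= 0. */
static bool
example_session_is_started(const struct example_session *s)
{
	assert(s != NULL);
	return s->started;
}
```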
Change-Id: Idb09cae3c4715eebb20282aad203987b26be707b
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452394
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Prepare to switch to spdk_thread_send_msg() which
accepts only one context parameter.
Change-Id: Iea3e8d1e715957d9b3fea12e969f29084a2948dc
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452393
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The goal is to remove legacy event messages from vhost.
The new message passing API accepts thread objects instead
of lcore numbers, and poll groups are meant to simplify
the transition.
Eventually we'd like vhost to spawn its own threads and
do message passing only within those, but SPDK libraries
can't spawn their own threads just yet. As a stopgap, vhost
will now maintain a list of all available threads (in form
of "poll groups" to mimic nvmf) and will start pollers on
them using its own round robin scheduler.
This patch only adds the poll groups list; it doesn't
change any existing functionality.
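For illustration only, a round robin pick over such a list could look like
the sketch below. All names are hypothetical; this is not the code added by
this patch.
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical poll group: one per available thread. */
struct example_poll_group {
	int thread_id;
};

struct example_scheduler {
	struct example_poll_group *groups;
	size_t num_groups;
	size_t next;	/* index of the group to hand out next */
};

/* Return the next poll group in round robin order, or NULL if none exist. */
static struct example_poll_group *
example_get_next_poll_group(struct example_scheduler *sched)
{
	struct example_poll_group *pg;

	assert(sched != NULL);
	if (sched->num_groups == 0) {
		return NULL;
	}

	pg = &sched->groups[sched->next];
	sched->next = (sched->next + 1) % sched->num_groups;
	return pg;
}
```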
Change-Id: I89cc5da5df3612827c6fc9015f03c94b5f4a10ad
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452206
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Prepare vhost lib init to be asynchronous. We'll need
it for setting up the upcoming poll groups.
Change-Id: I3c66b3f17f8635d4b705dd988393431193938971
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452205
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Put all shutdown functions in a single place. This also
lets us remove one forward declaration.
Change-Id: I8c8c602e67e3dafd3cd5e80bc9dd90f23381711e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452392
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Switch to the new spdk_thread_send_msg() API instead.
Change-Id: I810465cc49d5c4ef23e04953aa29d369f48f68b1
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452391
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_dma_malloc() is not required here, as the device
object is neither DMA-able nor shared between processes.
The device structures used to be aligned to cache line
size, but that's just a leftover from before sessions
were introduced. The device object is just generic
device information that can be accessed from any thread
holding the proper mutex. The hot data used in the I/O
path sits in the session structure, which is now allocated
with posix_memalign() to ensure proper alignment.
Vhost NVMe is an exception, as the device struct is used
as hot I/O data for the one and only session it supports,
so it's also allocated with posix_memalign().
While here, also allocate various vhost buffers using
spdk_zmalloc() instead of spdk_dma_zmalloc(), as
spdk_dma_*malloc() is about to be deprecated.
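A rough sketch of a cache-line-aligned allocation with the POSIX
posix_memalign() call. The struct, the helper name and the 64-byte cache
line size are assumptions for illustration, not the actual SPDK code.
```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_CACHE_LINE_SIZE 64	/* assumed cache line size */

/* Hypothetical session holding hot I/O path data. */
struct example_session {
	uint64_t hot_counter;
	char pad[120];
};

/* Allocate a zeroed, cache-line-aligned session; release it with free(). */
static struct example_session *
example_session_alloc(void)
{
	void *mem = NULL;

	if (posix_memalign(&mem, EXAMPLE_CACHE_LINE_SIZE,
			   sizeof(struct example_session)) != 0) {
		return NULL;
	}

	assert(mem != NULL);
	memset(mem, 0, sizeof(struct example_session));
	return mem;
}
```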
Change-Id: Ic7f63185639b7b98dc1ef756166c826a0af87b44
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450551
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The previous patches described as optimizations also
fixed some issues. They seem sufficient to cover all
the error cases, but the real source of the problem
lies in foreach_session() initiated by the device backend,
which can use sessions that were never seen by the
backend.
The backends are only notified when a session is
*started*, but foreach_session() iterates through
all the sessions - even those that were never started.
Vhost SCSI, for example, in the foreach_session() callbacks
used to expect svsession->svdev to be always set, but
that field is only set when the session gets started.
A perfect solution would be to introduce a new backend
callback to be called on each new connection. Vhost SCSI
could set e.g. svsession->svdev inside. For now we go
with a much easier solution that prevents sessions from
being used in foreach_session() unless they were
started at least once (and e.g. got their ->svdev set).
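The filtering can be sketched as below. The struct fields and function
names are hypothetical and only illustrate skipping sessions that were
never started.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified session; field names are illustrative only. */
struct example_session {
	int id;
	bool initialized;	/* set the first time the session is started */
	void *backend_ctx;	/* stands in for e.g. svdev; valid once initialized */
};

typedef int (*example_session_fn)(struct example_session *s, void *arg);

/* Call fn on every session started at least once; return how many were visited. */
static int
example_foreach_session(struct example_session *sessions, size_t count,
			example_session_fn fn, void *arg)
{
	int visited = 0;

	assert(sessions != NULL && fn != NULL);
	for (size_t i = 0; i < count; i++) {
		if (!sessions[i].initialized) {
			continue;	/* never seen by the backend, skip it */
		}
		fn(&sessions[i], arg);
		visited++;
	}
	return visited;
}

/* Example callback: add up the ids of all visited sessions. */
static int
example_sum_ids_cb(struct example_session *s, void *arg)
{
	*(int *)arg += s->id;
	return 0;
}
```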
Change-Id: Ida30a1f27f99977360d08a71a64fc92931b25b75
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449394
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is currently a small window after we stop a
session's pollers and before we mark the session
as stopped (by setting vsession->lcore to -1). If
spdk_vhost_dev_foreach_session() is called within
this window, its callback could assume the session
is still running and, for example in the vhost scsi
target hotremove case, could destroy an io_channel
for the second time, as that was already done when the
session was stopped. That's a bug.
A similar case exists for session start.
We fix the above by setting vsession->lcore directly
after starting or stopping the session, hence
eliminating the possible window for data races.
This has a few implications:
* spdk_vhost_session_send_event() called before
session start can't operate on vsession->lcore,
so it needs to be provided with the lcore as
an additional parameter now.
* the vsession->lcore can't be accessed until
spdk_vhost_session_start_done() is called, so
its existing usages were replaced with
spdk_env_get_current_core()
* active_session_num is decremented right after
spdk_vhost_session_stop_done() is called and
before spdk_vhost_session_send_event() returns,
so some active_session_num == 1 checks meaning
"the last session gets stopped now" needed to be
changed to check against == 0, meaning "the last
session has just been stopped"
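The last implication can be sketched as follows. The names are hypothetical
and simplified; the point is only that the counter is already decremented by
the time the check runs.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with a counter of currently polled sessions. */
struct example_dev {
	unsigned int active_session_num;
};

static void
example_session_start_done(struct example_dev *dev)
{
	assert(dev != NULL);
	dev->active_session_num++;
}

static void
example_session_stop_done(struct example_dev *dev)
{
	assert(dev != NULL && dev->active_session_num > 0);
	dev->active_session_num--;
}

/* Evaluated after stop_done(): zero means the last session was just stopped. */
static bool
example_last_session_stopped(const struct example_dev *dev)
{
	assert(dev != NULL);
	return dev->active_session_num == 0;
}
```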
Change-Id: I5781bb0ce247425130c9672e0df27d06b6234317
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448229
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Split spdk_vhost_session_event_done() into two separate
functions. This is just a preparation for the next patch.
Change-Id: I05e046e4b963387f058d2b822d7493c761eebbbb
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448228
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In the next patch we will put much more responsibility
on spdk_vhost_session_event_done(), so here we make
sure it's always called under the global vhost mutex.
Specifically, spdk_vhost_session_event_done() will set
vsession->lcore, which any other thread might try to
concurrently access via spdk_vhost_dev_foreach_session().
Change-Id: I7a5fde4be4e8bdfdbbb24ac955af964f516bdb68
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448227
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We'll make use of it inside the vhost device backend
code. The function itself is generic enough to be put
in the public vhost.h header rather than vhost_internal.h.
Change-Id: I60602c61d8bba665dcf9c6d27af2e910c208a7be
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448226
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The context had to be previously carried around by
particular vhost backend code and now it's embedded
inside the generic vsession struct. This serves mostly
as a cleanup.
Change-Id: I7b6ac2c3cb5d60a035d56affbf42fe5d4697f0f6
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448223
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Nothing actually needs this to be asynchronous. If something
comes up, we can make it asynchronous again.
Change-Id: Icde3af3f8f9efebe75b08471b4afcce3a70da541
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447114
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Windows Virtio drivers use indirect descriptors without
negotiating their feature flag, which is explicitly
forbidden by the Virtio 1.0 spec. "(2.4.5.3.1 Driver
Requirements: Indirect Descriptors) The driver MUST NOT
set the VIRTQ_DESC_F_INDIRECT flag unless the
VIRTIO_F_INDIRECT_DESC feature was negotiated.".
Violating this rule doesn't cause any issues for SPDK
vhost, but triggers an assert, so we can only run Windows
VMs with non-debug SPDK builds.
This patch removes the assert and allows Windows VMs
to be run with debug versions of SPDK vhost.
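A sketch of the relaxed behavior: the descriptor is treated as indirect
based on its flag alone, and the negotiated feature bits are deliberately
not consulted. The struct and names are hypothetical; the flag value 4 is
the VIRTQ_DESC_F_INDIRECT value from the Virtio 1.0 spec.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_VIRTQ_DESC_F_INDIRECT 4	/* VIRTQ_DESC_F_INDIRECT in Virtio 1.0 */

/* Simplified descriptor layout, for illustration only. */
struct example_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* No assert on negotiated features: some drivers set the flag regardless. */
static bool
example_desc_is_indirect(const struct example_desc *desc)
{
	assert(desc != NULL);
	return (desc->flags & EXAMPLE_VIRTQ_DESC_F_INDIRECT) != 0;
}
```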
Fixes #650
Change-Id: I95f534c33c384a4e1126a8c343c21eb63ec7bcef
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
rte_vhost has rejected a patch with this feature, so
we implement it using the external rte_vhost msg handling
hooks directly in SPDK.
Change-Id: Ib072fc19b921fe0fa01c7f4892e60430232e3a1c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make the vdev initialization happen before calling
any vdev related functions. This is mostly needed
for an upcoming patch where an additional step is
required after initializing the vdev and before
starting rte vhost.
On the other hand, this patch also fixes a technically
possible scenario where rte vhost starts processing
vhost-user messages and calling our ops before the
related vdev was initialized.
Change-Id: I8fbc7e7bc0b364327cfcec60faa74d4f64d6fad8
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
rte_vhost requires all queues to be fully initialized
in order to start I/O processing. This behavior is not
compliant with the vhost-user specification and doesn't
work with QEMU 2.12+, which will only initialize 1 I/O
queue for the SeaBIOS boot. Theoretically, we should
start polling each virtqueue individually after
receiving its SET_VRING_KICK message, but rte_vhost is
not designed to poll individual queues. So we use
a workaround to detect when a vhost session could be
potentially at that SeaBIOS stage and we mark it to
start polling as soon as its first virtqueue gets
initialized. This doesn't hurt any non-QEMU vhost slaves
and allows QEMU 2.12+ to boot correctly. SET_FEATURES
could be sent at any time, but QEMU will send it at
least once on SeaBIOS initialization - whenever
powered-up or rebooted.
Vhost sessions are still mostly started/stopped from
within rte_vhost callbacks, but now there's an additional
concept of "forced" polling, in which SPDK starts
sessions manually, while rte_vhost still thinks the
sessions are stopped. This can potentially lead to cases
where a session is "started" twice, or gets destroyed
while it's still being polled (by force). Those cases
also need to be handled within this patch.
Change-Id: I70636d63e27914906ddece59cec34f1dd37ec5cd
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446086
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK 19.05+ gives us the ability to pre- or post-process
any single vhost-user message. The user can either perform
additional actions upon some generic events, or can
implement handling for brand new message types that
rte_vhost doesn't even know about.
In order to smoothly switch to the upstream rte_vhost
and drop our internal copy, we introduce an SPDK wrapper
function to register SPDK-specific message handlers. For
DPDK 19.05+ this will use the new rte_vhost API to
register those message handlers, and for older DPDKs
this function simply won't do anything, as we assume the
internal rte_vhost copy already contains all the necessary
changes and does not need any "external" hooks.
For now we use the message handlers to stop the vhost
device and wait for any pending DMA ops before letting
rte_vhost process the SET_MEM_TABLE message and unmap
the current shared memory.
Change-Id: Ic0fefa9174254627cb3fc0ed30ab1e54be4dd654
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446085
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It's disabled by default, so no functionality is changed yet.
The intention is to use the upstream rte_vhost from DPDK,
which - starting from DPDK 19.05 - is finally capable of
running with storage device backends.
SPDK still requires a lot of changes in order to support
that upstream version, but the most fundamental change is
dropping vhost-nvme support. It'll remain usable only with
the internal rte_vhost copy and with the upstream rte_vhost
it simply won't be compiled. This allows us at least to
compile with that upstream rte_vhost, where we can pursue
adding the full integration.
Change-Id: Ic8bc5497c4d77bfef77c57f3d5a1f8681ffb6d1f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446082
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Adapted our custom rte_vhost APIs to the upstream DPDK
version which has independently added similar APIs.
This will potentially allow us to remove our internal
rte_vhost copy.
rte_vhost_set_vhost_vring_last_idx() was renamed to
rte_vhost_set_vring_base() and the last vring indices
have to be acquired with a newly introduced rte_vhost_get_vring_base()
rather than rte_vhost_get_vhost_vring().
This is only a refactor, no functionality is changed.
Change-Id: I1ca2c1216635c117832c9d9c784d5661145c04cd
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/446081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Older versions of QEMU (<= 2.11) expose the VGA BIOS
hole (0xA0000-0xBFFFF) by specifying two separate memory
regions - one before and one after the hole. This results
in the "size" not being a 2MB multiple. But the underlying
memory is still mmaped at a 2MB multiple - so that's what
we should be checking to ensure the memory is hugepage backed.
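A minimal sketch of the check. The struct is a hypothetical stand-in for a
vhost memory region, and which exact field the patch inspects is an
assumption here; the idea is to test the mmap'ed size, not the region size.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_HUGEPAGE_SIZE (2ULL * 1024 * 1024)

/* Hypothetical, reduced view of a guest memory region. */
struct example_mem_region {
	uint64_t size;		/* may not be a 2MB multiple (VGA BIOS hole) */
	uint64_t mmap_size;	/* size of the underlying mapping */
};

/* Judge hugepage backing by the mapping size rather than the region size. */
static bool
example_region_is_hugepage_backed(const struct example_mem_region *region)
{
	assert(region != NULL);
	return (region->mmap_size % EXAMPLE_HUGEPAGE_SIZE) == 0;
}
```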
Fixes #673.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1644bb6d8a8fb1fd51a548ae7a17da061c18c669
Reviewed-on: https://review.gerrithub.io/c/445764
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Removed their various usages inside the core vhost code
together with the external events themselves. External
events were completely replaced by spdk_vhost_lock()
and spdk_vhost_dev_find().
Change-Id: I1f9d0268c27a06e2eecab9e7d179b1fd54d4223d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440379
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Vhost external events no longer do any asynchronous
calls, they only lock the vhost mutex and directly
call the provided function. The mutex encapsulation
isn't worth the additional complexity of splitting
each vdev-handling code into multiple functions, so
we expose low-level APIs that should eventually
replace external events entirely.
Instead of:
```
static int do_something_cb(struct spdk_vhost_dev *vdev, void *arg)
{
	struct my_data *ctx = arg;
	/* access the vdev and ctx */
	free(ctx);
	return 0;
}
struct my_data *ctx = calloc(...);
rc = spdk_vhost_call_external_event("my_vdev", do_something_cb, ctx);
if (rc != 0) { /* err handling */ }
```
We can now do just:
```
spdk_vhost_lock();
vdev = spdk_vhost_dev_find("my_vdev");
if (vdev == NULL) { /* err handling */ }
/* access the vdev and any context data */
spdk_vhost_unlock();
```
Change-Id: I06e1e149d6dd006720b021d3bef8d9b7bfaeceaa
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440377
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This ensures that SPDK will detect descriptor chains
that are too long.
The additional check in vhost block stands as an
optimization and makes us fail the corrupted I/O early.
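One way to bound a chain walk is sketched below. The descriptor layout is
reduced to the two fields the walk needs, and all names are hypothetical;
the flag value 1 is VIRTQ_DESC_F_NEXT from the Virtio spec.
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_VIRTQ_DESC_F_NEXT 1	/* VIRTQ_DESC_F_NEXT in the Virtio spec */

/* Reduced descriptor: only the fields needed to follow a chain. */
struct example_desc {
	uint16_t flags;
	uint16_t next;
};

/*
 * Return the chain length starting at head, or -1 if the chain is longer
 * than max_len or points outside the table (e.g. a corrupted or looped chain).
 */
static int
example_desc_chain_len(const struct example_desc *table, uint16_t table_size,
		       uint16_t head, uint16_t max_len)
{
	uint16_t idx = head;
	int len = 0;

	assert(table != NULL);
	for (;;) {
		if (idx >= table_size) {
			return -1;
		}
		if (++len > max_len) {
			return -1;
		}
		if (!(table[idx].flags & EXAMPLE_VIRTQ_DESC_F_NEXT)) {
			return len;
		}
		idx = table[idx].next;
	}
}
```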
Change-Id: Icceaa0dd938dca96a1872e5ee96bf6a151fdd9e7
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Signed-off-by: Dima Stepanov <dstepanov.src@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/433641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
SPDK doesn't provide sufficient runtime checks to properly
handle clients with memory sizes that aren't 2MB multiples
and could potentially segfault during I/O processing.
That's why we'll reject such clients now.
Change-Id: I34e85be5b5c6df863371d0ad688f228ed44107ff
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/433640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Although Vhost SCSI code is technically capable
of polling different sessions on different lcores,
the underlying SCSI API won't allow allocating
io_channels on more than one lcore.
That's why we will now let device backends assign
lcores by themselves.
The first Vhost SCSI session will now choose one
core from the available ones, and any subsequent
sessions will stick to the same one.
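The policy can be sketched as below, with hypothetical names; how the first
core is chosen from the available ones is left to the caller here.
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SCSI device; lcore stays -1 until the first session starts. */
struct example_scsi_dev {
	int32_t lcore;
};

/* The first session fixes the core; later sessions reuse the same one. */
static int32_t
example_scsi_pick_lcore(struct example_scsi_dev *dev, int32_t candidate)
{
	assert(dev != NULL);
	if (dev->lcore < 0) {
		dev->lcore = candidate;
	}
	return dev->lcore;
}
```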
Change-Id: I616cd195a919960dff68508473cea236abf8d6a3
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441581
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
With all the patches in place, we can finally
enable having more than one simultaneous session
to a single vhost device.
This patch adds a unique id to the session structure,
similar to the one in a vhost device and also fills in
the implementation holes in foreach_session().
Vhost-NVMe can support only one session per device
and now has an additional check that prevents it from
starting more than one at a time.
Vhost-SCSI also has the same check now since it needs
additional work on the lcore assignment policy. The
check will be removed once the required work is done.
Change-Id: I13a32c7a0eae808e9bec63a7b8c15ec0bc2e36ed
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Particular backends will now be responsible for sending
events to vsession->lcore. This was previously done by
the generic vhost layer, but since some backends will
need different lcore assignment policies soon, we need
to give them more power now.
Change-Id: I72cbbccb9d5a5b2358acca6d4b6bb882131937af
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441580
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
It's sessions that are tied with the lcores now.
This makes the vhost devices accessible by any
thread that only locks the global vhost mutex.
The mechanism used for external device events was
refactored to serve the foreach_session() API.
Additionally, since we don't want to handle cases
where the entire vhost device gets removed while
an asynchronous foreach_session chain is pending,
a new per-vdev counter of pending async operations
was added. We'll fail the device removal request
if there are any pending operations. Eventually
we would like the device removal to be asynchronous,
but that's a todo for later.
The external events are still there, although
they only lock the mutex and call the provided
function now.
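The removal guard can be sketched as follows. The struct, the field names
and the -EBUSY return value are assumptions for illustration.
```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device with a counter of pending async operations. */
struct example_vdev {
	uint32_t pending_async_op_num;
	bool removed;
};

/* Refuse to remove the device while any async foreach chain is pending. */
static int
example_vdev_remove(struct example_vdev *vdev)
{
	assert(vdev != NULL);
	if (vdev->pending_async_op_num > 0) {
		return -EBUSY;
	}
	vdev->removed = true;
	return 0;
}
```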
Change-Id: I20618f9420a9bc04270373469deaad8fb2049c7c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439323
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Before we implement the support for multiple sessions
per device, we still need to make a few intermediate
changes that will require a counter of currently polled
sessions. So here it is.
Change-Id: I0a1d928eafa75efa1b5c2e6670a5ceb282c87fa4
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441734
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Returning a negative value from a `foreach` callback
will now break the entire chain. This is required
for refactoring spdk_vhost_dev_foreach_session() to
use the same mechanism as external events. Before
we actually do the refactor, we add the only feature
that external events were missing.
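The early-exit behavior can be sketched like this. It iterates over plain
integers to stay self-contained; all names are hypothetical.
```c
#include <assert.h>
#include <stddef.h>

typedef int (*example_foreach_fn)(int item, void *arg);

/* Call fn on each item; stop and return its value as soon as it is negative. */
static int
example_foreach(const int *items, size_t count, example_foreach_fn fn, void *arg)
{
	assert(items != NULL && fn != NULL);
	for (size_t i = 0; i < count; i++) {
		int rc = fn(items[i], arg);

		if (rc < 0) {
			return rc;	/* break the entire chain */
		}
	}
	return 0;
}

/* Example callback: count visits and fail on the first negative item. */
static int
example_stop_on_negative(int item, void *arg)
{
	int *visited = arg;

	(*visited)++;
	return item < 0 ? -1 : 0;
}
```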
Change-Id: I70bda3df99748de51429e329a056c37a3bc7e348
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439444
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Sessions are allocated internally by the core vhost
library whenever DPDK accepts a new connection, so
the only reasonable way to store additional per-session
data is to tell the core vhost library how much extra
memory it needs to allocate. Hence, we add a new field
to the vhost device backend struct.
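A sketch of how such a size field could be used. The field name
session_ctx_size and the struct layouts are assumptions, not necessarily
what this patch adds.
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backend descriptor with the extra per-session size. */
struct example_backend {
	size_t session_ctx_size;
};

/* Generic session header followed by backend-private, max-aligned storage. */
struct example_session {
	int vid;
	max_align_t ctx[];
};

/* Allocate the generic session plus the zeroed backend context in one block. */
static struct example_session *
example_session_alloc(const struct example_backend *backend, int vid)
{
	struct example_session *s;

	assert(backend != NULL);
	s = calloc(1, sizeof(*s) + backend->session_ctx_size);
	if (s == NULL) {
		return NULL;
	}
	s->vid = vid;
	return s;
}

/* Backend code reaches its private data through this accessor. */
static void *
example_session_ctx(struct example_session *s)
{
	assert(s != NULL);
	return s->ctx;
}
```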
Change-Id: Id6c8285505b2e610e28e5d985aceb271ed232555
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437778
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Instead of calculating those settings once and storing
them in the device struct, we'll now recalculate them
whenever a device session is created. This lets us
remove 2 fields from the device struct.
Change-Id: I2cb2bdbc570a41ae78c0666490fb1462a00d0b6f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Makes the code slightly more readable.
Change-Id: Iebf8fb07bceacf433d4bdad0a30419a3faab7eee
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439370
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We use those values in various places in SPDK,
so let's define them in a single place now.
Change-Id: Iad9a5745d69166a6e6032370d4e5a0e604914e45
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Keep all coalescing variables inside the session struct.
Interrupt coalescing is still configured with the
device-specific APIs, but those will now transparently propagate
the change to all active connections.
This is the last piece that held struct spdk_vhost_dev
tied with the session's lcore. Now that device
settings aren't actively polled by any sessions, they
only need to be synchronized with the global vhost lock.
This will potentially let us get rid of the vhost external
events API, allowing the user to lock the mutex directly,
set coalescing params directly, and transparently let
the internal spdk_vhost_dev_foreach_session() do the
tricky synchronization.
Change-Id: Ifba96d241c736d33376861fa894c738e7d9b5b40
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437777
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
When a device is changed, e.g. the underlying bdev is
hotremoved, all sessions need to be notified. For
instance, Vhost-SCSI would send an additional hotremove
eventq message. That's why we introduce a helper
function to iterate through all active sessions.
Eventually, we may want to poll different sessions
from different lcores, so there will be some kind of
internal cross-lcore message management required
- just like there is one for spdk_vhost_call_external_event_foreach().
For now, though, we can get away with the dumbest
implementation.
We still want to keep this API internal for the time
being. The end-user (RPC) should only modify the
device, and the whole concept of sessions should be
completely encapsulated.
Change-Id: I2e142632c07a23daeac15cabea4cffecf984e455
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/418736
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The session struct will now be allocated inside the
`new_connection` rte_vhost callback. There can
still be only one connection per device, but this
change brings us one step towards supporting more.
Besides the obvious pointer changes, we'll now also
use the session pointer to check if the connection
actually exists. We used to set device vid to -1
when there was no connection but we no longer have
to do that.
Change-Id: I4d062c0b5f093fef132a6a2c9cc29458cbaad414
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/437776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This follows the same trend as the mem_map APIs.
Currently, most of the spdk_vtophys() callers manually
detect physically noncontiguous buffers to split them
into multiple physically contiguous chunks. This patch
is a first step towards encapsulating most of that logic
in a single place - in spdk_vtophys() itself.
This patch doesn't change any functionality on its own;
it only extends the API.
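To illustrate the intended semantics, here is a toy translation over a
made-up page table. It is not the SPDK implementation; the 4KB page size,
the table and all names are assumptions. The size argument is an in/out
parameter that gets clamped to the physically contiguous length.
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096ULL	/* toy page size for the example */
#define EXAMPLE_VTOPHYS_ERROR UINT64_MAX

/*
 * page_phys[i] holds the physical address of virtual page i. On return,
 * *size is reduced to the physically contiguous length starting at vaddr.
 */
static uint64_t
example_vtophys(const uint64_t *page_phys, size_t num_pages,
		uint64_t vaddr, uint64_t *size)
{
	uint64_t page = vaddr / EXAMPLE_PAGE_SIZE;
	uint64_t offset = vaddr % EXAMPLE_PAGE_SIZE;
	uint64_t contig = EXAMPLE_PAGE_SIZE - offset;
	uint64_t paddr;

	assert(page_phys != NULL && size != NULL);
	if (page >= num_pages) {
		return EXAMPLE_VTOPHYS_ERROR;
	}

	paddr = page_phys[page] + offset;
	/* Extend the range while the next page is physically adjacent. */
	while (contig < *size && page + 1 < num_pages &&
	       page_phys[page + 1] == page_phys[page] + EXAMPLE_PAGE_SIZE) {
		contig += EXAMPLE_PAGE_SIZE;
		page++;
	}
	if (contig < *size) {
		*size = contig;
	}
	return paddr;
}
```
A caller would loop, translating the remaining buffer each time, until the
whole length has been covered by contiguous chunks.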
Change-Id: I16faa9dea270c370f2a814cd399f59055b5ccc3d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438449
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Each connection is created by the `new_connection`
rte_vhost callback, which receives a unique vid. Storing
the vid inside the device struct was sufficient until
we wanted to have multiple connections per device.
Change-Id: Ic730d3377e1410499bdc163ce961863c530b880d
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/437775
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Grouped a few spdk_vhost_dev struct fields into a new
struct spdk_vhost_session. A session represents the
connection between an SPDK vhost device (vhost-user
slave) and QEMU (vhost-user master).
This essentially serves two purposes. The first is to
allow multiple simultaneous connections to a single
vhost device. Each connection (session) will have access
to the same storage, but will use separate virtqueues,
separate features and possibly different memory. For
Vhost-SCSI, this could be used together with the upcoming
SCSI reservations feature.
The other purpose is to untie devices from lcores and tie
sessions to them instead. This will potentially allow us to modify
the device struct from any thread, meaning we'll be able
to get rid of the external events API and simplify a lot
of the code that manages vhost - vhost RPC for instance.
Device backends themselves would be responsible for
propagating all device events to each session, but it could
be completely transparent to the upper layers.
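
As a rough illustration of the split, assuming hypothetical
`example_*` names and an illustrative choice of fields (the message
does not list which fields were moved), a device could keep a list
of sessions and a backend could propagate a device event to each of
them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state; the field choice is illustrative. */
struct example_vhost_session {
	int vid;                      /* rte_vhost connection id */
	int32_t lcore;                /* core this session is tied to */
	uint64_t negotiated_features; /* features are per connection */
	unsigned events_seen;         /* only used by this sketch */
	struct example_vhost_session *next;
};

/* Hypothetical device: shared storage state plus a list of sessions. */
struct example_vhost_dev {
	const char *name;
	struct example_vhost_session *sessions;
};

typedef void (*example_session_fn)(struct example_vhost_session *vsession,
				   void *ctx);

/* Propagate one device event to every session; returns sessions visited. */
static int
example_dev_foreach_session(struct example_vhost_dev *vdev,
			    example_session_fn fn, void *ctx)
{
	struct example_vhost_session *vsession;
	int count = 0;

	assert(vdev != NULL && fn != NULL);

	for (vsession = vdev->sessions; vsession != NULL;
	     vsession = vsession->next) {
		fn(vsession, ctx);
		count++;
	}
	return count;
}

/* Example per-session handler: just record that the event arrived. */
static void
example_session_on_event(struct example_vhost_session *vsession, void *ctx)
{
	(void)ctx;
	vsession->events_seen++;
}
```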
Change-Id: I39984cc0a3ae2e76e0817d48fdaa5f43d3339607
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/437774
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Some old Linux guest kernels do not enable the NVMe 1.3 shadow
doorbell buffer feature. To support them, create a dummy BAR region
inside the slave target. When the guest submits a new request, the
doorbell value is written to memory shared between the guest and the
vhost target, so the existing vhost target can support both new
Linux guest kernels (newer than 4.12) and old guest kernels.
The shared BAR space can also be used in the future to move ADMIN
queue processing into the SPDK vhost target. With that feature, the
QEMU driver will become very small and easy to upstream.
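
A simplified sketch of the doorbell path through shared memory.
The region layout, struct names and polling helper are assumptions
made for illustration, not the actual vhost-nvme code, and memory
barriers are deliberately left out:

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_QUEUES 4u

/* Hypothetical layout of the shared region: one 32-bit slot per doorbell,
 * SQ tail at index 2 * qid and CQ head at index 2 * qid + 1. */
struct example_doorbell_region {
	volatile uint32_t db[2 * EXAMPLE_MAX_QUEUES];
};

struct example_sq {
	uint16_t qid;
	uint16_t size;      /* number of entries in the queue */
	uint16_t last_tail; /* tail the target has already processed up to */
};

/* Guest side: publish a new submission queue tail. */
static void
example_guest_ring_sq(struct example_doorbell_region *region,
		      uint16_t qid, uint16_t tail)
{
	assert(qid < EXAMPLE_MAX_QUEUES);
	region->db[2u * qid] = tail;
}

/* Target side: return how many entries were submitted since the last poll.
 * A real implementation would need barriers around the shared read. */
static uint16_t
example_target_poll_sq(const struct example_doorbell_region *region,
		       struct example_sq *sq)
{
	uint16_t tail;
	uint16_t pending;

	assert(sq->qid < EXAMPLE_MAX_QUEUES && sq->size > 0);

	tail = (uint16_t)region->db[2u * sq->qid];
	if (tail >= sq->size) {
		return 0; /* ignore an out-of-range doorbell value */
	}

	/* The tail wraps around at the queue size. */
	pending = (uint16_t)((tail + sq->size - sq->last_tail) % sq->size);
	sq->last_tail = tail;
	return pending;
}
```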
Change-Id: I9463e9f13421368f43bfe4076facddd119f4552e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/419157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
This is in response to a Scan-build issue reported on Clang 6.
Change-Id: I2edc853145762998db818cbbe0e9ca0d9b8c123d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/424139
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7716f1748803872cac85e296d60747752ca046f4
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/422273
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
irq_delay must not be less than zero.
Change-Id: I22d8a7df453f07a44a32582d8e880949824bf868
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/421685
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This also saves the CWD on init so the socket path is no longer relative.
Change-Id: I303401fe0340f0bc2ea5e4ba468361f68ea84c3d
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/419067
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Pawel Kaminski <pawelx.kaminski@intel.com>
Reviewed-by: Paweł Niedźwiecki <pawelx.niedzwiecki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>