We'll start using the same code in even more places soon,
so put it in a function.
Change-Id: Iee2e091009b14e9d8b56ec8f0d4a86094f7c9727
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467229
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Threads were assigned to sessions inside
vhost_session_send_event() so far, even though the doxygen
comments say that sessions are assigned to the thread
which calls vhost_session_start_done(). Currently, Vhost
uses only vhost_session_send_event() to schedule starting
the session on some thread, so the code ends up working.
We're about to remove vhost_session_send_event(), so move
the thread (poll group) assignment to start_done().
While here, publish the vhost_poll_group struct definition
via vhost_internal.h. As a replacement for
vhost_session_send_event() we would like to use
spdk_thread_send_msg(), which requires a thread object -
one of the struct fields inside vhost_poll_group.
The code for starting a session could look as follows:
pg = vhost_get_poll_group(cpumask);
spdk_thread_send_msg(pg->thread, cb);
...
cb:
// start_pollers
vhost_session_start_done(0);
Change-Id: I563f61509674768c1dea0b03767e9f39a9fb0069
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467228
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
We used to call a DPDK function to do it, but using
a function for something that simple doesn't make sense.
The function also does its internal queue lookup by vid
and queue number, which could potentially fail, return an
error and technically require SPDK to handle it.
The function makes some sense for vhost-net applications,
which don't touch vrings directly but rely on rte_vhost's
API for enqueueing/dequeuing mbufs. SPDK touches DPDK's
rings directly for the entire I/O handling, so it might
just as well do the same for initialization.
This serves as cleanup.
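A hedged sketch of the direct approach, assuming the DPDK helper in
question was something like rte_vhost_enable_guest_notification()
(which matches the vid/queue-id lookup described above). The struct
and field names follow spdk_vhost_virtqueue but are not taken from
the patch itself:

/* Sketch only: toggle guest notifications by writing the used ring
 * flag directly on a vring SPDK already has mapped, instead of going
 * through a DPDK helper that re-does a vid/queue lookup. */
static void
vq_set_guest_notifications(struct spdk_vhost_virtqueue *virtqueue, bool enable)
{
	if (enable) {
		virtqueue->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY;
	} else {
		virtqueue->vring.used->flags |= VRING_USED_F_NO_NOTIFY;
	}
}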
Change-Id: Ifb44fa22ea5fc3633aa85f075aa1a5cd02f5423c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466745
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change the way we increase poll group reference counts
for round-robin scheduling.
So far we increased them whenever someone called
vhost_get_poll_group(), and this worked fine for Vhost-Block,
which picks a new poll group for each session. Vhost-SCSI,
however, picks only one poll group for all sessions on
a vhost device. This means that some threads will have
multiple Vhost-SCSI pollers but will still appear to the
vhost scheduler as if they had only one.
To fix it, increase poll group refcnt only when sessions
are really being started - in vhost_session_start_done().
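A minimal sketch of the idea; the field names (started, poll_group,
ref) are assumptions, not taken from the patch:

/* Sketch only: bump the poll group's refcount once the session has
 * actually started, so the round-robin scheduler sees one reference
 * per running session. */
void
vhost_session_start_done(struct spdk_vhost_session *vsession, int response)
{
	if (response == 0) {
		vsession->started = true;
		vsession->poll_group->ref++;
	}
	/* ...the existing completion/wakeup logic stays unchanged... */
}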
Change-Id: I60f0d2101239e5a91138a5afd30c51dc1ccf7c2e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466733
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Currently vhost_dev_foreach_session() accepts a single
callback function for both iterating through all active
sessions and for signaling the end of iteration (called
last time with vsession param == NULL). Now that the
final signal has completely different semantics and is
called on a specific thread, it makes sense to put it in
a separate function.
While here, remove the one-line description of the
spdk_vhost_session_fn typedef. It wasn't helpful anyway.
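The resulting interface could be sketched roughly like this; the
exact parameter lists are assumed:

/* Sketch of the split interface: one callback is invoked per active
 * session, the other once at the end of iteration, on a well-defined
 * thread. */
typedef int (*spdk_vhost_session_fn)(struct spdk_vhost_dev *vdev,
				     struct spdk_vhost_session *vsession,
				     void *arg);
typedef void (*spdk_vhost_dev_fn)(struct spdk_vhost_dev *vdev, void *arg);

void vhost_dev_foreach_session(struct spdk_vhost_dev *vdev,
			       spdk_vhost_session_fn fn,
			       spdk_vhost_dev_fn cpl_fn,
			       void *arg);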
Change-Id: I56b97180110874a813e666f964bb51c39a8ce6bb
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466732
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Historically the callbacks from vhost_dev_foreach_session()
could be called with vdev argument == NULL, which would
mean that device was removed after enqueuing the event
and before consuming it. Now we keep track of pending
asynchronous operations on each vhost device and don't
allow removing it if there are any unconsumed events,
so the vdev == NULL checks are redundant. Remove them.
Change-Id: I7aa3785080d20ed06e008c081d3f37a949228f5a
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466729
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Remove them all at once. The spdk_ prefix should only be
applied to publicly exported functions.
Change-Id: Ib6d2bd0954ec5cb7c8cf253d79b9d3cd8aa0eeef
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466728
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We currently don't have any way to differentiate sessions,
e.g. in error messages. Whenever there's an error
in some session, we just print the device name.
We now introduce vsession->name with the following format:
<device name>s<dpdk connection id>
Note that it's still impossible to know exactly which
QEMU process corresponds to which session in SPDK, but
there's not much we can do about that right now. In SPDK
we don't even have the accepted connection fd.
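For illustration, the name could be built along these lines using
spdk_sprintf_alloc(); treating vsession->vid as the DPDK connection
id is an assumption:

/* Illustration only: compose <device name>s<dpdk connection id>. */
vsession->name = spdk_sprintf_alloc("%ss%u", vdev->name, vsession->vid);
if (vsession->name == NULL) {
	SPDK_ERRLOG("vsession alloc failed\n");
	return -1;
}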
Change-Id: I666aa60c5e36bf3d56f68133042af2afc8cc5e85
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466039
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We've recently switched from manually calling eventfd_write()
to rte_vhost_vring_call(), which, besides writing to the
eventfd, always issues a full memory barrier in the upstream
rte_vhost lib. With upstream rte_vhost we're actually
issuing two memory barriers on I/O completion - one in
SPDK code, one inside rte_vhost_vring_call().
The SPDK barrier was only required for our internal rte_vhost
lib, whose rte_vhost_vring_call() implementation (which we
wrote) did not have such a barrier inside. So now we add
the barrier there and remove the same barrier
from SPDK code.
This doesn't change any code flow for the internal rte_vhost
lib, but optimizes I/O path for the upstream version.
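A hedged sketch of what the internal fork's rte_vhost_vring_call()
looks like after this change; the lookup helper and the virtqueue
layout are assumptions:

/* Sketch only, not the literal patch: the internal fork now issues
 * the full barrier itself before kicking the guest, so the SPDK I/O
 * path can drop its own barrier. */
int
rte_vhost_vring_call(int vid, uint16_t vring_idx)
{
	struct virtio_net *dev = get_device(vid);
	struct vhost_virtqueue *vq;

	if (dev == NULL || vring_idx >= dev->nr_vring) {
		return -1;
	}

	vq = dev->virtqueue[vring_idx];

	rte_smp_mb();		/* make completed descriptors visible */
	eventfd_write(vq->callfd, (eventfd_t)1);
	return 0;
}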
Change-Id: I68738d7feb9159f718b0e60ac7eed1fafd4836b9
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466037
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We used to allocate a ctx whenever a new event had to
be sent, but since all events in foreach_session are
always called in a chain, we can allocate one ctx
at the start and then re-initialize it before sending
each msg.
Change-Id: Ie5477b07242f0c6eb6dc2160055a829da8ba5d11
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459167
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
foreach_session() is used to e.g. close a bdev, where
for each session we close any io_channels and then,
on the final "finish" call, close the bdev descriptor.
The vhost init thread is the one that called
spdk_vhost_init() and also the same one that calls
all management APIs. One of those is for hotplugging
LUNs to vhost scsi targets, which practically results
in opening bdev descriptors.
By always scheduling that final foreach_session()
callback to the init thread, we end up with calling
spdk_bdev_close() always on the same thread which
called spdk_bdev_open(), which is actually a bdev
layer requirement.
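The scheduling itself could be as simple as a message to the saved
init thread. The names below (g_vhost_init_thread, the ctx layout,
the mutex, the callback) are illustrative, not the patch's:

/* Illustrative only: bounce the final "finish" callback to the vhost
 * init thread, so spdk_bdev_close() runs on the thread that opened
 * the descriptor. */
static void
foreach_session_finish_cb(void *arg)
{
	struct vhost_session_fn_ctx *ctx = arg;

	pthread_mutex_lock(&g_vhost_mutex);
	ctx->cpl_fn(ctx->vdev, ctx->user_ctx);
	pthread_mutex_unlock(&g_vhost_mutex);
	free(ctx);
}

/* ...at the end of iteration: */
spdk_thread_send_msg(g_vhost_init_thread, foreach_session_finish_cb, ctx);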
Change-Id: I2338e15c63f93ef37dd4412dd677dee40d272ec2
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459166
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We used to call the potentially-asynchronous foreach_session()
in the vdev initialization path and that was perfectly
fine, because at that time there were no sessions created
and foreach_session() always finished synchronously.
We're about to refactor it to be always asynchronous, and
for this coalescing case it could complicate the init
error path. Once an asynchronous thread message is sent, we
would need to wait for it to complete, and we just don't
want to do that. We want error handling to be simple.
Since we know there are no sessions at the time of vdev
creation, we simply add a new function that sets coalescing
params only for the vdev (and not for its sessions) and
use that function in the vdev init code.
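A simplified sketch of such a vdev-only setter; the validation is
reduced and the field names are assumptions:

/* Sketch only: store coalescing params on the vdev, with no
 * asynchronous foreach_session() round trip. */
int
vhost_dev_set_coalescing(struct spdk_vhost_dev *vdev, uint32_t delay_base_us,
			 uint32_t iops_threshold)
{
	if (delay_base_us > 0 && iops_threshold == 0) {
		return -EINVAL;
	}

	vdev->coalescing_delay_us = delay_base_us;
	vdev->coalescing_iops_threshold = iops_threshold;
	return 0;
}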
Change-Id: I44d204d03b5040525e4871693678d4b4a0204e63
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459196
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Put it next to other functions in this call chain.
Change-Id: Ic621855b028f9bd110cdcda86b3a182369ec5e90
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Put it next to other functions in this call chain.
Change-Id: Ieafd91c6cfefec134594aec8671eb4efdac15dfe
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459164
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The spdk_ prefix should only be used on public API functions.
Change-Id: I663b107bd6b1c92c2c6263f2ec7c763d9812e7fe
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459163
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Despite its name, this function is defined as static
and is only used in one place, so inline it.
Change-Id: I4e217b3baae9b735761f5497f06b681a118860e9
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459162
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The semaphore was a part of struct spdk_vhost_session_fn_ctx
so far, but since there's only one pthread waiting on that
semaphore and hence only one event using it, we can just
use a single global sem_t. The same goes for the response
code for those callbacks - only one is needed.
Going a step further, the function complete_session_event()
was removed - it would only operate on global variables now,
and its signature wouldn't make much sense after this
refactor, so it's been inlined.
This serves as cleanup.
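The resulting bookkeeping boils down to roughly this (a sketch; the
real variable and function names may differ):

/* Sketch: one global semaphore and one response code are enough,
 * because only a single external pthread ever waits for a session
 * event at a time. */
#include <semaphore.h>

static sem_t g_dpdk_sem;
static int g_dpdk_response;

static void
vhost_session_event_done(int response)
{
	g_dpdk_response = response;
	sem_post(&g_dpdk_sem);
}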
Change-Id: I63ef41d7e1564fff5e785de101d887bc1014aad9
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459160
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Enforce that spdk_vhost_fini() is called on the same
thread that called spdk_vhost_init(). We'll also use
the newly added g_vhost_init_thread for other purposes
later on.
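Conceptually it amounts to the following sketch; the surrounding
init/fini details and the exact prototypes are assumed:

/* Sketch: remember the thread that ran spdk_vhost_init() and assert
 * that spdk_vhost_fini() runs on the same one. */
static struct spdk_thread *g_vhost_init_thread = NULL;

int
spdk_vhost_init(void)
{
	g_vhost_init_thread = spdk_get_thread();
	assert(g_vhost_init_thread != NULL);
	/* ...the rest of the init sequence is unchanged... */
	return 0;
}

void
spdk_vhost_fini(spdk_vhost_fini_cb fini_cb)
{
	assert(spdk_get_thread() == g_vhost_init_thread);
	/* ...the shutdown sequence is unchanged... */
}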
Change-Id: I99aebeda2d8ddaf42554aa422c32ed935634595f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459159
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
With all the pieces in place we can finally remove
the legacy cross thread messages from vhost.
We replace spdk_vhost_allocate_reactor() with
spdk_vhost_get_poll_group(). The returned poll_group
has to be passed to spdk_vhost_session_send_event(),
where it will be assigned to the session. After the
session is started, that poll group will be used for
all the internal vhost cross-thread messaging.
Change-Id: I17f13d3cc6e2b64e4b614c3ceb1eddb31056669b
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452207
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
rte_vhost_vring_call() from upstream DPDK can read some
uninitialized memory and crash if it's called on invalid
queue ids. The implementation in our internal rte_vhost
fork ends up writing to a random descriptor number, which
doesn't cause any crashes but is a bug nevertheless.
To fix it, just check if the queue is initialized before
interrupting it during the session start. It's not a hot
I/O path and there's no performance impact.
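The check is essentially the following sketch; the vring.desc/size
field names follow spdk_vhost_virtqueue but are assumptions:

/* Sketch: only interrupt queues that are actually initialized when
 * starting the session. */
for (i = 0; i < vsession->max_queues; i++) {
	struct spdk_vhost_virtqueue *q = &vsession->virtqueue[i];

	if (q->vring.desc != NULL && q->vring.size > 0) {
		rte_vhost_vring_call(vsession->vid, q->vring_idx);
	}
}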
Change-Id: I830c1be98ef00d4ece9a6bd88cf79b9dfe29d2a9
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457247
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The memory API has been refactored. It is not possible anymore to
register a memory region more than once. This has been introduced in
this patch: https://review.gerrithub.io/426085
In case of vhost with vvu transport, it often happens that two
consecutive vhost memory regions are mapped to virtual addresses that
lie within the same 2MB address range. This means that the vhost memory
regions may not be 2MB-aligned in the process virtual address space. As
a result, the `FLOOR_2MB()` of those addresses gives the same address.
Thus, we end up trying to register the same 2MB memory range twice.
This issue does not appear in case of AF_UNIX transport. Vhost memory
regions in case of AF_UNIX transport are hugepage backed. Therefore, the
mmapped virtual addresses of those memory regions are always
2MB-aligned. On the contrary, in case of vvu transport, the vhost memory
regions are segments of the PCI memory address space of the
virtio-vhost-user PCI device. This MMIO space is mapped in its entirety
by the DPDK vfio interface along with the other PCI BARs. Ultimately,
the vhost memory regions correspond to offsets in this mmapped PCI
memory region and thus there is no guarantee that the mmapped virtual
addresses are 2MB-aligned.
This issue is fixed by skipping the already-registered 2MB memory
regions.
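A sketch of the idea (not the literal patch): track how far the
previous region's registration reached and clip the next one
accordingly. FLOOR_2MB()/CEIL_2MB() round to 2MB boundaries; the
exact code may differ.

/* Idea sketch only: avoid re-registering a 2MB chunk shared with the
 * previous vhost memory region. */
uint64_t prev_end = 0;
uint32_t i;

for (i = 0; i < vsession->mem->nregions; i++) {
	struct rte_vhost_mem_region *region = &vsession->mem->regions[i];
	uint64_t start = FLOOR_2MB((uintptr_t)region->mmap_addr);
	uint64_t end = CEIL_2MB((uintptr_t)region->mmap_addr + region->mmap_size);

	if (start < prev_end) {
		/* this 2MB chunk was already registered with the previous region */
		start = prev_end;
	}
	if (start < end) {
		spdk_mem_register((void *)(uintptr_t)start, end - start);
	}
	prev_end = end;
}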
Change-Id: I62c9c257e6f172c894cd3454d2cbeee1986e6189
Signed-off-by: Nikos Dragazis <ndragazis@arrikto.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/441057
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The vring notification mechanism is transport-specific. At present, vhost
dataplane code in `lib/vhost/vhost.c` triggers guest notifications with
`eventfd_write()` system call. But this is an AF_UNIX specific
notification mechanism. This patch replaces `eventfd_write()` with the
existing generic `rte_vhost_vring_call()` function that is part of
DPDK's librte_vhost public API.
`rte_vhost_vring_call()` takes a vring_idx as an argument to associate
the `struct spdk_vhost_virtqueue` instance with the relevant `struct
vhost_virtqueue` instance. We introduce a new `vring_idx` field in
`struct spdk_vhost_virtqueue` to enable this association. This field is
initialized in `start_device()`. In addition, a stub for
`rte_vhost_vring_call()` is added in the vhost unit test file.
SPDK's internal `rte_vhost` copy will not be updated in order to support
the virtio-vhost-user transport. However, an `rte_vhost_vring_call()`
function is introduced in SPDK's `rte_vhost` in order to have a solid
API. This function is just a wrapper around eventfd_write().
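The notification path thus reduces to roughly this sketch; the
surrounding helper name is assumed:

/* Sketch: transport-agnostic guest notification. vring_idx is the
 * new field recorded in start_device(). */
static void
vhost_vq_used_signal(struct spdk_vhost_session *vsession,
		     struct spdk_vhost_virtqueue *virtqueue)
{
	rte_vhost_vring_call(vsession->vid, virtqueue->vring_idx);
}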
Change-Id: Ic93e25cd3f06e92f04766521bc850f1ee80b8ec8
Signed-off-by: Nikos Dragazis <ndragazis@arrikto.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454373
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
We no longer have any assumptions about vhost memory regions
size being a 2MB multiple, so we can get rid of the security
check preventing some vhost sessions from being initialized.
It will be necessary for virtio-vhost-user, whose memory comes
from PCI BARs and whose size may not be a 2MB multiple.
Change-Id: I48f9bc20f4c61aefdddf39ade875867148f0ed75
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454879
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Currently, we translate each 2MB chunk to manually check
if it's contiguous with the previous one, but there are
rte_vhost APIs that do it way more efficiently.
rte_vhost_va_from_guest_pa() was introduced in DPDK 18.02,
but was backported to 17.11 as well, so we don't even need
any RTE_VERSION ifdefs to use it now. This function
calculates the remaining region size instead of trying to
translate subsequent 2MB chunks over and over.
The previous rte_vhost_gpa_to_vva() was deprecated a long
time ago and after this patch we no longer make any use of
it.
DPDK users of this new function check whether the translated
memory region has zero length, which seems very silly, but
let's just do the same in SPDK as well.
Change-Id: Ifae8daa5f810b5a2ba1524958ad2399af700b532
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454878
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Now that sessions have a separate flag to check if the
pollers are started, we can set the lcore field on any
thread we want. We currently assign it from within the
session thread to spdk_env_get_current_core(), but we
won't be able to use an equivalent get_current_poll_group()
function after we switch to poll groups. We will only
have a poll group object inside spdk_vhost_session_send_event(),
so that's where we move the lcore assignment for now.
Change-Id: Ib5fb37ec488de80e9d79432120c81500c297b608
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452395
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We used to rely on lcore >= 0 for sessions that are
started (have their pollers running) and in order to
prevent data races, that lcore field had to be set from
the same thread that runs the pollers, directly after
registering/unregistering them. The lcore was always
set to spdk_env_get_current_core(), but we won't be able
to use an equivalent get_current_poll_group() function
after we switch to poll groups. We will have a poll group
object only inside spdk_vhost_session_send_event() that's
called from the DPDK rte_vhost thread.
In order to change the lcore field (or a poll group one)
from spdk_vhost_session_send_event(), we'll need a separate
field to maintain the started/stopped status that's only
going to be modified from the session's thread.
Change-Id: Idb09cae3c4715eebb20282aad203987b26be707b
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452394
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Prepare to switch to spdk_thread_send_msg() which
accepts only one context parameter.
Change-Id: Iea3e8d1e715957d9b3fea12e969f29084a2948dc
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452393
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The goal is to remove legacy event messages from vhost.
The new message passing API accepts thread objects instead
of lcore numbers and poll groups are meant to simplify
the transition.
Eventually we'd like vhost to spawn its own threads and
do message passing only within those, but SPDK libraries
can't spawn their own threads just yet. As a stopgap, vhost
will now maintain a list of all available threads (in the
form of "poll groups", to mimic nvmf) and will start pollers on
them using its own round robin scheduler.
This patch only adds the poll groups list, it doesn't
change any existing functionality.
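The bookkeeping added here amounts to roughly the following sketch;
the field names are assumptions and the cpumask handling of the real
scheduler is omitted:

/* Sketch of the poll group list and a least-referenced pick. */
struct vhost_poll_group {
	struct spdk_thread *thread;
	unsigned ref;
	TAILQ_ENTRY(vhost_poll_group) tailq;
};

static TAILQ_HEAD(, vhost_poll_group) g_poll_groups =
	TAILQ_HEAD_INITIALIZER(g_poll_groups);

static struct vhost_poll_group *
vhost_get_poll_group(void)
{
	struct vhost_poll_group *pg, *selected = NULL;

	TAILQ_FOREACH(pg, &g_poll_groups, tailq) {
		if (selected == NULL || pg->ref < selected->ref) {
			selected = pg;
		}
	}
	return selected;
}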
Change-Id: I89cc5da5df3612827c6fc9015f03c94b5f4a10ad
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452206
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Prepare vhost lib init to be asynchronous. We'll need
it for setting up the upcoming poll groups.
Change-Id: I3c66b3f17f8635d4b705dd988393431193938971
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452205
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Put all shutdown functions in a single place. This also
lets us remove one forward declaration.
Change-Id: I8c8c602e67e3dafd3cd5e80bc9dd90f23381711e
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452392
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Switch to the new spdk_thread_send_msg() API instead.
Change-Id: I810465cc49d5c4ef23e04953aa29d369f48f68b1
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452391
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_dma_malloc() is not required here, as the device
object is neither DMA-able nor shared between processes.
The device structures used to be aligned to cache line
size, but that's just a leftover from before sessions
were introduced. The device object is just generic
device information that can be accessed from any thread
holding the proper mutex. The hot data used in the I/O
path sits in the session structure, which is now allocated
with posix_memalign() to ensure proper alignment.
Vhost NVMe is an exception, as the device struct is used
as hot I/O data for the one and only session it supports,
so it's also allocated with posix_memalign().
While here, also allocate various vhost buffers using
spdk_zmalloc() instead of spdk_dma_zmalloc(), as
spdk_dma_*malloc() is about to be deprecated.
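For the session allocation, the pattern is roughly the following
sketch; backend_ctx_size is a made-up name for the per-backend
session context size:

/* Sketch: cache-line-align only the hot session structure; plain
 * allocations are fine for everything else. */
struct spdk_vhost_session *vsession;
size_t size = sizeof(*vsession) + backend_ctx_size;

if (posix_memalign((void **)&vsession, SPDK_CACHE_LINE_SIZE, size) != 0) {
	SPDK_ERRLOG("vsession alloc failed\n");
	return -1;
}
memset(vsession, 0, size);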
Change-Id: Ic7f63185639b7b98dc1ef756166c826a0af87b44
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450551
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The previous patches described as optimizations also
fixed some issues. They seem sufficient to cover all
the error cases, but the real source of the problem
lies in foreach_session() initiated by the device backend,
which can use sessions that were never seen by the
backend.
The backends are only notified when a session is
*started*, but foreach_session() iterates through
all the sessions - even those that were never started.
Vhost SCSI, for example, in the foreach_session() callbacks
used to expect svsession->svdev to be always set, but
that field is only set when the session gets started.
A perfect solution would be to introduce a new backend
callback to be called on a new connection. Vhost SCSI
could set e.g. svsession->svdev inside it. For now we go
with a much easier solution that prevents sessions from
being used in foreach_session() unless they were
started at least once (...and e.g. got their ->svdev set).
Change-Id: Ida30a1f27f99977360d08a71a64fc92931b25b75
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449394
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There is currently a small window after we stop
session's pollers and before we mark the session
as stopped (by setting vsession->lcore to -1). If
spdk_vhost_dev_foreach_session() is called within
this window, its callback could assume the session
is still running and, for example in the vhost scsi
target hotremove case, could destroy an io_channel
for the second time - as it had already done when the
session was stopped. That's a bug.
A similar case exists for session start.
We fix the above by setting vsession->lcore directly
after starting or stopping the session, hence
eliminating the possible window for data races.
This has a few implications:
* spdk_vhost_session_send_event() called before
session start can't operate on vsession->lcore,
so it needs to be provided with the lcore as
an additional parameter now.
* the vsession->lcore can't be accessed until
spdk_vhost_session_start_done() is called, so
its existing usages were replaced with
spdk_env_get_current_core()
* active_session_num is decremented right after
spdk_vhost_session_stop_done() is called and
before spdk_vhost_session_send_event() returns,
so some active_session_num == 1 checks meaning
"the last session gets stopped now" needed to be
changed to check against == 0, as in "the last
session has just been stopped"
Change-Id: I5781bb0ce247425130c9672e0df27d06b6234317
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448229
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Split spdk_vhost_session_event_done() into two separate
functions. This is just a preparation for the next patch.
Change-Id: I05e046e4b963387f058d2b822d7493c761eebbbb
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448228
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In the next patch we will put much more responsibility
on spdk_vhost_session_event_done(), so here we make
sure it's always called under the global vhost mutex.
Specifically, spdk_vhost_session_event_done() will set
vsession->lcore, which any other thread might try to
concurrently access via spdk_vhost_dev_foreach_session().
Change-Id: I7a5fde4be4e8bdfdbbb24ac955af964f516bdb68
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448227
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We'll make use of it inside the vhost device backend
code. The function itself is generic enough to be put
in the public vhost.h header rather than vhost_internal.h.
Change-Id: I60602c61d8bba665dcf9c6d27af2e910c208a7be
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448226
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The context previously had to be carried around by the
particular vhost backend code; now it's embedded
inside the generic vsession struct. This serves mostly
as a cleanup.
Change-Id: I7b6ac2c3cb5d60a035d56affbf42fe5d4697f0f6
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448223
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Nothing actually needs this to be asynchronous. If something
comes up, we can make it asynchronous again.
Change-Id: Icde3af3f8f9efebe75b08471b4afcce3a70da541
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447114
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Windows Virtio drivers use indirect descriptors without
negotiating their feature flag, which is explicitly
forbidden by the Virtio 1.0 spec. "(2.4.5.3.1 Driver
Requirements: Indirect Descriptors) The driver MUST NOT
set the VIRTQ_DESC_F_INDIRECT flag unless the
VIRTIO_F_INDIRECT_DESC feature was negotiated.".
Violating this rule doesn't cause any issues for SPDK
vhost, but triggers an assert, so we can only run Windows
VMs with non-debug SPDK builds.
This patch removes the assert and allows Windows VMs
to be run with debug versions of SPDK vhost.
Fixes #650
Change-Id: I95f534c33c384a4e1126a8c343c21eb63ec7bcef
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
rte_vhost has rejected a patch with this feature, so
we implement it using the external rte_vhost msg handling
hooks directly in SPDK.
Change-Id: Ib072fc19b921fe0fa01c7f4892e60430232e3a1c
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Make the vdev initialization happen before calling
any vdev related functions. This is mostly needed
for an upcoming patch where an additional step is
required after initializing the vdev and before
starting rte vhost.
On the other hand, this patch also fixes a technically
possible scenario where rte vhost starts processing
vhost-user messages and calling our ops before the
related vdev was initialized.
Change-Id: I8fbc7e7bc0b364327cfcec60faa74d4f64d6fad8
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
rte_vhost requires all queues to be fully initialized
in order to start I/O processing. This behavior is not
compliant with the vhost-user specification and doesn't
work with QEMU 2.12+, which will only initialize 1 I/O
queue for the SeaBIOS boot. Theoretically, we should
start polling each virtqueue individually after
receiving its SET_VRING_KICK message, but rte_vhost is
not designed to poll individual queues. So we use
a workaround to detect when a vhost session could be
potentially at that SeaBIOS stage and we mark it to
start polling as soon as its first virtqueue gets
initialized. This doesn't hurt any non-QEMU vhost slaves
and allows QEMU 2.12+ to boot correctly. SET_FEATURES
could be sent at any time, but QEMU will send it at
least once on SeaBIOS initialization - whenever
powered-up or rebooted.
Vhost sessions are still mostly started/stopped from
within rte_vhost callbacks, but now there's additional
concept of "forced" polling, in which SPDK starts
sessions manually, while rte_vhost still thinks the
sessions are stopped. This can potentially lead to cases
where a session is "started" twice, or gets destroyed
while it's still being polled (by force). Those cases
also need to be handled within this patch.
Change-Id: I70636d63e27914906ddece59cec34f1dd37ec5cd
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446086
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
DPDK 19.05+ gives us an ability to pre or post-process
any single vhost-user message. The user can either perform
additional actions upon some generic events, or can
implement handling for brand new message types that
rte_vhost doesn't even know about.
In order to smoothly switch to the upstream rte_vhost
and drop our internal copy, we introduce an SPDK wrapper
function to register SPDK-specific message handlers. For
DPDK 19.05+ this will use the new rte_vhost API to
register those message handlers, and for older DPDKs
this function simply won't do anything - as we assume the
internal rte_vhost copy already contains all the necessary
changes and does not need any "external" hooks.
For now we use the message handlers to stop the vhost
device and wait for any pending DMA ops before letting
rte_vhost process the SET_MEM_TABLE message and unmap
the current shared memory.
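The wrapper described above could be sketched as follows; the wrapper
name and g_spdk_extern_vhost_ops are illustrative, while
rte_vhost_extern_callback_register() is the DPDK 19.05 API mentioned
above:

/* Hedged sketch of the SPDK wrapper: on DPDK 19.05+ hook our pre/post
 * message handlers, on older DPDKs (internal rte_vhost copy) compile
 * to a no-op. */
int
vhost_register_msg_handlers(int vid)
{
#if RTE_VERSION >= RTE_VERSION_NUM(19, 5, 0, 0)
	return rte_vhost_extern_callback_register(vid, &g_spdk_extern_vhost_ops, NULL);
#else
	return 0;
#endif
}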
Change-Id: Ic0fefa9174254627cb3fc0ed30ab1e54be4dd654
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446085
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It's disabled by default, so no functionality is changed yet.
The intention is to use the upstream rte_vhost from DPDK,
which - starting from DPDK 19.05 - is finally capable of
running with storage device backends.
SPDK still requires a lot of changes in order to support
that upstream version, but the most fundamental change is
dropping vhost-nvme support. It'll remain usable only with
the internal rte_vhost copy; with the upstream rte_vhost
it simply won't be compiled. This allows us at least to
compile with that upstream rte_vhost, where we can pursue
adding the full integration.
Change-Id: Ic8bc5497c4d77bfef77c57f3d5a1f8681ffb6d1f
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/446082
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Adapted our custom rte_vhost APIs to the upstream DPDK
version which has independently added similar APIs.
This will potentially allow us to remove our internal
rte_vhost copy.
rte_vhost_set_vhost_vring_last_idx() was renamed to
rte_vhost_set_vring_base() and the last vring indices
have to be acquired with a newly introduced rte_vhost_get_vring_base()
rather than rte_vhost_get_vhost_vring().
This is only a refactor, no functionality is changed.
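In practice the call sites change along these lines (a sketch; the
queue/field names are assumptions):

/* Sketch of the renamed calls: save the last indices when stopping a
 * queue and restore them before starting it again. */
rte_vhost_get_vring_base(vsession->vid, qid,
			 &q->last_avail_idx, &q->last_used_idx);

/* ...later, before the queue is started again: */
rte_vhost_set_vring_base(vsession->vid, qid,
			 q->last_avail_idx, q->last_used_idx);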
Change-Id: I1ca2c1216635c117832c9d9c784d5661145c04cd
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/446081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Older versions of QEMU (<= 2.11) expose the VGA BIOS
hole (0xA0000-0xBFFFF) by specifying two separate memory
regions - one before and one after the hole. This results
in the "size" not being a 2MB multiple. But the underlying
memory is still mmaped at a 2MB multiple - so that's what
we should be checking to ensure the memory is hugepage backed.
Fixes #673.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1644bb6d8a8fb1fd51a548ae7a17da061c18c669
Reviewed-on: https://review.gerrithub.io/c/445764
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>