Compare commits

...

19 Commits

Author SHA1 Message Date
Ben Walker
24d5087441 perf: Add option to create unused io queue pairs
For some testing, we need queue pairs to exist but not actually
be in use.

Change-Id: I2b17ff0172c9ec002692babcf7d4d612c3062eb4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/392977
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:09:13 -07:00
Ben Walker
8aa0e50f1a perf: Allow the user to specify the number of queues
Queues will be doled out to cores as available.

Change-Id: Ib6a0fe846a9d90b659754be1c11ae022abbe38a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/391876
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-03-27 13:09:06 -07:00
Ben Walker
90a18d469d nvme: Repack qpair structures
Try to group data members that are used often into the
same cache lines. We still need to find more space in the second
cache line of spdk_nvme_pcie_qpair so that the important
parts of spdk_nvme_qpair fit.

Change-Id: Ib936cb2b1acc722de7ec313d6faa3812aacde394
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447968
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
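As a note on the technique this commit applies: grouping the members that every I/O touches into the leading cache lines, and pushing rarely used bookkeeping toward the end, keeps the hot path's working set small. A generic, hedged illustration in C follows (the struct and field names are illustrative only, not SPDK's actual layout; the real repack is visible in the nvme_internal.h and nvme_pcie.c diffs further down):

#include <stdint.h>

/* Hot members first so they share the first cache line(s). */
struct example_qpair {
	uint16_t sq_tail;        /* touched on every submission */
	uint16_t cq_head;        /* touched on every completion */
	uint8_t  phase;          /* completion phase bit */

	/* Cold members last: setup/teardown state, rarely accessed. */
	uint16_t id;
	void    *ctrlr;
} __attribute__((aligned(64)));  /* align to a typical 64-byte cache line */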
Ben Walker
581e24004c nvme: Minimize memory accesses when checking if mmio required
Don't touch the shadow doorbells if it isn't necessary.

The flag could be combined into a bit mask with other
flags in a future patch.

Change-Id: I9ffd16468d29f0f0868cf849f7fece327eb6a294
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447967
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
Ben Walker
2c8ffe9e74 bdev/nvme: Add configuration parameter to slow down polling
For NVMe devices, in conjunction with the new batching options,
it can be advantageous to insert an artificial delay between polls
for completions. Add an option to slow this polling rate down.

Change-Id: I0fc92709ff45ead0beb388dda60694bf1ed8b258
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447716
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
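A minimal sketch of how the option takes effect, mirroring the bdev_nvme diff further down in this compare (ch and bdev_nvme_poll are the NVMe I/O channel and its poll function from that diff, assumed here; error handling omitted):

	/* 0 keeps the previous behavior: check for completions on every
	 * reactor iteration. A nonzero value, in microseconds, throttles
	 * the poller to at most one completion check per period. */
	uint64_t period_us = g_opts.nvme_ioq_poll_period_us;

	ch->poller = spdk_poller_register(bdev_nvme_poll, ch, period_us);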
Ben Walker
a4aea8ed5a bdev/nvme: Use delay_doorbell queue pair option
The bdev is polling for completions continually, so it
is safe to turn this on.

Change-Id: I8ac1c46c1f683463281c4bd8b0a0781f70a72297
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447713
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
Ben Walker
7385cc61f2 nvme/perf: Use delay_doorbell queue pair option
This tool continually polls for completions, so it
is safe to turn this on.

Change-Id: Ice1c68cdaff070f8edd428621e19a6fb44fb8c31
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447712
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
Ben Walker
bad9d062a1 fio/nvme: Enable delay_doorbell queue pair option
Since fio is continually polling for completions, this
option can be safely enabled.

Change-Id: I02ee3d2507d3b37f79e14d69fe90ee19c4b4eea2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447711
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
Ben Walker
64a50b6177 nvme: Add qpair option to batch command submissions
Avoid ringing the submission queue doorbell until the
call to spdk_nvme_qpair_process_completions().

Change-Id: I7b3cd952e5ec79109eaa1c3a50f6537d7aaea51a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447239
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 13:07:33 -07:00
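A minimal sketch of how an application opts into this batching, using the queue pair option introduced by this series (assumes ctrlr and ns were obtained during probe/attach, buf is a DMA-able buffer, and io_complete is the application's completion callback; error handling omitted for brevity):

#include "spdk/nvme.h"

static void
submit_batched_reads(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_ns *ns,
		     void *buf, uint64_t lba, uint32_t lba_count)
{
	struct spdk_nvme_io_qpair_opts opts;
	struct spdk_nvme_qpair *qpair;
	int i;

	spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
	opts.delay_pcie_doorbell = true;	/* defer SQ doorbell writes */

	qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));

	/* Queue several commands; no MMIO doorbell write happens yet.
	 * (The same buffer is reused here only to keep the sketch short.) */
	for (i = 0; i < 8; i++) {
		spdk_nvme_ns_cmd_read(ns, qpair, buf, lba + i * lba_count,
				      lba_count, io_complete, NULL, 0);
	}

	/* The deferred doorbell is written (at most once) from here. */
	spdk_nvme_qpair_process_completions(qpair, 0);
}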
Ben Walker
98c4101c51 nvme: Move sq doorbell ring to a function
This is going to get called from two places shortly.

Change-Id: I2c67e719c91887987e6e65c5c0c384bed0431409
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448311
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-03-27 13:07:25 -07:00
Ben Walker
b63ad2eec0 nvme: Don't do a write memory barrier if we don't ring the doorbell
Change-Id: I6766ae96c155e04bc0162aa8d2e21fd096be3221
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448310
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-03-27 12:47:42 -07:00
Ben Walker
4608e917de Update 18.07.1 Changelog
Change-Id: I527b30a852031a79a01a8ad73e63682a3076296a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424892
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2018-09-10 22:34:58 +00:00
Jim Harris
f55ffa8b57 Update DPDK submodule to 18.05.1 + SPDK patches.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6caea461d2b13239dee42f6f96c5b9bdde14c160

Reviewed-on: https://review.gerrithub.io/423157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-09-10 21:16:10 +00:00
Jim Harris
f3cedcc7fe bdev: set iovs on correct bdev_io in spdk_bdev_io_put_buf
spdk_bdev_io_put_buf() is responsible for reclaiming
bdev-allocated buffers from a bdev_io.  If there are
bdev_ios waiting for one of these buffers, it calls
spdk_bdev_io_set_buf() on the next bdev_io in the queue.
This will set the iov_base and iov_len on the bdev_io
to point to the bdev-allocated buffer.

But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf()
on the just completed bdev_io, not the next bdev_io in the
queue.  So fix that.

Fixes: 844aedf8 ("bdev: Simplify get/set/put buf functions")
Reported-by: Alan Tu
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638
Reviewed-on: https://review.gerrithub.io/424881
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-10 21:16:10 +00:00
Jim Harris
44a43939e8 nvme: add quirk for Intel SSDs without vendor-specific log pages
QEMU emulated NVMe SSDs report themselves with an Intel vendor ID,
but don't support the Intel vendor-specific log pages.  So add
a quirk to avoid confusing error messages.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic41476801ede94d43acb9972217ea7420ca53679
Reviewed-on: https://review.gerrithub.io/423422
Reviewed-on: https://review.gerrithub.io/423928
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-05 20:45:06 +00:00
Karol Latecki
9fca71f514 scripts/vagrant: change create_vbox.sh shebang
pushd and popd are bash builtins and are not available when the
script runs under plain /bin/sh.

Change-Id: I83e0bd1f87005e1c8542ac3db44b26f83eedf96c
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.gerrithub.io/421903
Reviewed-on: https://review.gerrithub.io/423925
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-05 20:45:06 +00:00
Seth Howell
bbb2989c26 bdev: increment io_time if queue depth > 0
This value is used to calculate the disk utilization of a given bdev.

Change-Id: I4bf101c524b92bdd21573941e17f61db59c5c6b8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/423017
Reviewed-on: https://review.gerrithub.io/423927
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: Ben Walker <benjamin.walker@intel.com>
2018-09-05 20:45:06 +00:00
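As a hedged reading of how this counter is consumed (the ratio below is an interpretation, not stated in the commit): io_time advances by the measurement period whenever the sampled queue depth is nonzero, so over a sampling window

	utilization ≈ Δio_time / Δwall_clock_time

For example, if the queue depth was nonzero during 800 ms worth of sampling periods in a 1-second window, io_time grows by roughly 800 ms and the bdev is about 80% utilized.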
Dariusz Stojaczyk
4baae265ca dpdk/pci: support DPDK 18.08 write combined PCI resources
We used to support it by default in our DPDK forks,
but starting with DPDK 18.08, a new PCI driver flag
RTE_PCI_DRV_WC_ACTIVATE is required.

We now enable it for NVMe and Virtio, but not for I/OAT,
as our I/OAT driver currently assumes strong memory
ordering, which prefetchable resources do not provide.

Change-Id: I1a13356e28535981153b3d3e52bfe9d66b6172ae
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/422588
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <wenqianx.zong@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: Jim Harris <james.r.harris@intel.com>
2018-08-21 14:31:09 +00:00
Dariusz Stojaczyk
ec611eb485 env/dpdk: link with rte_kvargs by default
Starting with DPDK 18.08, rte_kvargs is a dependency
of rte_eal.

Change-Id: I0cde78f632fc313cec745d41ee519fb8b37de81a
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/422587
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: Jim Harris <james.r.harris@intel.com>
2018-08-21 14:31:09 +00:00
24 changed files with 305 additions and 76 deletions

View File

@ -1,5 +1,30 @@
# Changelog
## v18.07.1:
### NVMe
Added a quirk to handle QEMU emulated NVMe SSDs, which report an Intel
vendor ID but don't support Intel vendor-specific log pages.
### Vagrant
Modified scripts/vagrant/create_vbox.sh to run as a bash script, since
it explicitly requires bash functionality.
### bdev
Fixed a bug that resulted in incorrect disk utilization reporting.
Fixed a crash when the bdev layer ran out of free bdev I/O request objects.
Fixed a race condition between closing the final bdev descriptor
and unregistering the bdev.
### DPDK
Updated the DPDK submodule to be based off of DPDK 18.05.1.
## v18.07:
### bdev

View File

@ -903,7 +903,8 @@ Name | Optional | Type | Description
action_on_timeout | Optional | string | Action to take on command time out: none, reset or abort
timeout_us | Optional | number | Timeout for each command, in microseconds. If 0, don't track timeouts
retry_count | Optional | number | The number of attempts per I/O before an I/O fails
nvme_adminq_poll_period_us | Optional | number | How often the admin queue is polled for asynchronous events in microsecond
nvme_adminq_poll_period_us | Optional | number | How often the admin queue is polled for asynchronous events in microseconds
nvme_ioq_poll_period_us | Optional | number | How often I/O queues are polled for completions, in microseconds. Default: 0 (as fast as possible).
### Example

dpdk

@ -1 +1 @@
Subproject commit b6ae5bcff6ca09a7e1536eaa449aa6f4e704a6d9
Subproject commit b20a027e88b5c3b54498a62c075865656efb86e5

View File

@ -118,6 +118,9 @@
# Set how often the admin queue is polled for asynchronous events.
# Units in microseconds.
AdminPollRate 100000
# Set how often I/O queues are polled from completions.
# Units in microseconds.
IOPollRate 0
# Disable handling of hotplug (runtime insert and remove) events,
# users can set to Yes if want to enable it.

View File

@ -109,6 +109,9 @@
# Set how often the admin queue is polled for asynchronous events.
# Units in microseconds.
AdminPollRate 100000
# Set how often I/O queues are polled from completions.
# Units in microseconds.
IOPollRate 0
# Disable handling of hotplug (runtime insert and remove) events,
# users can set to Yes if want to enable it.

View File

@ -99,6 +99,9 @@
# Set how often the admin queue is polled for asynchronous events.
# Units in microseconds.
AdminPollRate 100000
# Set how often I/O queues are polled from completions.
# Units in microseconds.
IOPollRate 0
# The Split virtual block device slices block devices into multiple smaller bdevs.
[Split]

View File

@ -222,6 +222,7 @@ attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
{
struct thread_data *td = cb_ctx;
struct spdk_fio_thread *fio_thread = td->io_ops_data;
struct spdk_nvme_io_qpair_opts qpopts;
struct spdk_fio_ctrlr *fio_ctrlr;
struct spdk_fio_qpair *fio_qpair;
struct spdk_nvme_ns *ns;
@ -287,7 +288,10 @@ attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
return;
}
fio_qpair->qpair = spdk_nvme_ctrlr_alloc_io_qpair(fio_ctrlr->ctrlr, NULL, 0);
spdk_nvme_ctrlr_get_default_io_qpair_opts(fio_ctrlr->ctrlr, &qpopts, sizeof(qpopts));
qpopts.delay_pcie_doorbell = true;
fio_qpair->qpair = spdk_nvme_ctrlr_alloc_io_qpair(fio_ctrlr->ctrlr, &qpopts, sizeof(qpopts));
if (!fio_qpair->qpair) {
SPDK_ERRLOG("Cannot allocate nvme io_qpair any more\n");
g_error = true;

View File

@ -51,6 +51,9 @@
struct ctrlr_entry {
struct spdk_nvme_ctrlr *ctrlr;
struct spdk_nvme_intel_rw_latency_page *latency_page;
struct spdk_nvme_qpair **unused_qpairs;
struct ctrlr_entry *next;
char name[1024];
};
@ -116,7 +119,9 @@ struct ns_worker_ctx {
union {
struct {
struct spdk_nvme_qpair *qpair;
int num_qpairs;
struct spdk_nvme_qpair **qpair;
int last_qpair;
} nvme;
#if HAVE_LIBAIO
@ -174,6 +179,8 @@ static uint32_t g_metacfg_prchk_flags;
static int g_rw_percentage;
static int g_is_random;
static int g_queue_depth;
static int g_nr_io_queues_per_ns = 1;
static int g_nr_unused_io_queues = 0;
static int g_time_in_sec;
static uint32_t g_max_completions;
static int g_dpdk_mem;
@ -359,6 +366,26 @@ register_ctrlr(struct spdk_nvme_ctrlr *ctrlr)
register_ns(ctrlr, ns);
}
if (g_nr_unused_io_queues) {
int i;
printf("Creating %u unused qpairs for controller %s\n", g_nr_unused_io_queues, entry->name);
entry->unused_qpairs = calloc(g_nr_unused_io_queues, sizeof(struct spdk_nvme_qpair *));
if (!entry->unused_qpairs) {
fprintf(stderr, "Unable to allocate memory for qpair array\n");
exit(1);
}
for (i = 0; i < g_nr_unused_io_queues; i++) {
entry->unused_qpairs[i] = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, NULL, 0);
if (!entry->unused_qpairs[i]) {
fprintf(stderr, "Unable to allocate unused qpair. Did you request too many?\n");
exit(1);
}
}
}
}
#if HAVE_LIBAIO
@ -617,6 +644,7 @@ submit_single_io(struct perf_task *task)
{
uint64_t offset_in_ios;
int rc;
int qp_num;
struct ns_worker_ctx *ns_ctx = task->ns_ctx;
struct ns_entry *entry = ns_ctx->entry;
@ -633,6 +661,12 @@ submit_single_io(struct perf_task *task)
task->submit_tsc = spdk_get_ticks();
task->lba = offset_in_ios * entry->io_size_blocks;
qp_num = ns_ctx->u.nvme.last_qpair;
ns_ctx->u.nvme.last_qpair++;
if (ns_ctx->u.nvme.last_qpair == ns_ctx->u.nvme.num_qpairs) {
ns_ctx->u.nvme.last_qpair = 0;
}
if ((g_rw_percentage == 100) ||
(g_rw_percentage != 0 && ((rand_r(&seed) % 100) < g_rw_percentage))) {
#if HAVE_LIBAIO
@ -646,7 +680,7 @@ submit_single_io(struct perf_task *task)
entry->io_size_blocks, false);
task->is_read = true;
rc = spdk_nvme_ns_cmd_read_with_md(entry->u.nvme.ns, ns_ctx->u.nvme.qpair,
rc = spdk_nvme_ns_cmd_read_with_md(entry->u.nvme.ns, ns_ctx->u.nvme.qpair[qp_num],
task->buf, NULL,
task->lba,
entry->io_size_blocks, io_complete,
@ -664,7 +698,7 @@ submit_single_io(struct perf_task *task)
task_extended_lba_setup_pi(entry, task, task->lba,
entry->io_size_blocks, true);
rc = spdk_nvme_ns_cmd_write_with_md(entry->u.nvme.ns, ns_ctx->u.nvme.qpair,
rc = spdk_nvme_ns_cmd_write_with_md(entry->u.nvme.ns, ns_ctx->u.nvme.qpair[qp_num],
task->buf, NULL,
task->lba,
entry->io_size_blocks, io_complete,
@ -735,13 +769,21 @@ io_complete(void *ctx, const struct spdk_nvme_cpl *completion)
static void
check_io(struct ns_worker_ctx *ns_ctx)
{
int i, rc;
#if HAVE_LIBAIO
if (ns_ctx->entry->type == ENTRY_TYPE_AIO_FILE) {
aio_check_io(ns_ctx);
} else
#endif
{
spdk_nvme_qpair_process_completions(ns_ctx->u.nvme.qpair, g_max_completions);
for (i = 0; i < ns_ctx->u.nvme.num_qpairs; i++) {
rc = spdk_nvme_qpair_process_completions(ns_ctx->u.nvme.qpair[i], g_max_completions);
if (rc < 0) {
fprintf(stderr, "NVMe io qpair process completion error\n");
exit(1);
}
}
}
}
@ -807,17 +849,27 @@ init_ns_worker_ctx(struct ns_worker_ctx *ns_ctx)
* For now, give each namespace/thread combination its own queue.
*/
struct spdk_nvme_io_qpair_opts opts;
int i;
ns_ctx->u.nvme.num_qpairs = g_nr_io_queues_per_ns;
ns_ctx->u.nvme.qpair = calloc(ns_ctx->u.nvme.num_qpairs, sizeof(struct spdk_nvme_qpair *));
if (!ns_ctx->u.nvme.qpair) {
return -1;
}
spdk_nvme_ctrlr_get_default_io_qpair_opts(ns_ctx->entry->u.nvme.ctrlr, &opts, sizeof(opts));
if (opts.io_queue_requests < ns_ctx->entry->num_io_requests) {
opts.io_queue_requests = ns_ctx->entry->num_io_requests;
}
opts.delay_pcie_doorbell = true;
ns_ctx->u.nvme.qpair = spdk_nvme_ctrlr_alloc_io_qpair(ns_ctx->entry->u.nvme.ctrlr, &opts,
sizeof(opts));
if (!ns_ctx->u.nvme.qpair) {
printf("ERROR: spdk_nvme_ctrlr_alloc_io_qpair failed\n");
return -1;
for (i = 0; i < ns_ctx->u.nvme.num_qpairs; i++) {
ns_ctx->u.nvme.qpair[i] = spdk_nvme_ctrlr_alloc_io_qpair(ns_ctx->entry->u.nvme.ctrlr, &opts,
sizeof(opts));
if (!ns_ctx->u.nvme.qpair[i]) {
printf("ERROR: spdk_nvme_ctrlr_alloc_io_qpair failed\n");
return -1;
}
}
}
@ -827,13 +879,19 @@ init_ns_worker_ctx(struct ns_worker_ctx *ns_ctx)
static void
cleanup_ns_worker_ctx(struct ns_worker_ctx *ns_ctx)
{
int i;
if (ns_ctx->entry->type == ENTRY_TYPE_AIO_FILE) {
#ifdef HAVE_LIBAIO
io_destroy(ns_ctx->u.aio.ctx);
free(ns_ctx->u.aio.events);
#endif
} else {
spdk_nvme_ctrlr_free_io_qpair(ns_ctx->u.nvme.qpair);
for (i = 0; i < ns_ctx->u.nvme.num_qpairs; i++) {
spdk_nvme_ctrlr_free_io_qpair(ns_ctx->u.nvme.qpair[i]);
}
free(ns_ctx->u.nvme.qpair);
}
}
@ -846,7 +904,7 @@ work_fn(void *arg)
printf("Starting thread on core %u\n", worker->lcore);
/* Allocate a queue pair for each namespace. */
/* Allocate queue pairs for each namespace. */
ns_ctx = worker->ns_ctx;
while (ns_ctx != NULL) {
if (init_ns_worker_ctx(ns_ctx) != 0) {
@ -901,6 +959,8 @@ static void usage(char *program_name)
printf("\n");
printf("\t[-q io depth]\n");
printf("\t[-s io size in bytes]\n");
printf("\t[-n number of io queues per namespace. default: 1]\n");
printf("\t[-U number of unused io queues per controller. default: 0]\n");
printf("\t[-w io pattern type, must be one of\n");
printf("\t\t(read, write, randread, randwrite, rw, randrw)]\n");
printf("\t[-M rwmixread (100 for reads, 0 for writes)]\n");
@ -1240,7 +1300,7 @@ parse_args(int argc, char **argv)
g_core_mask = NULL;
g_max_completions = 0;
while ((op = getopt(argc, argv, "c:d:e:i:lm:q:r:s:t:w:DLM:")) != -1) {
while ((op = getopt(argc, argv, "c:d:e:i:lm:n:q:r:s:t:w:DLM:U:")) != -1) {
switch (op) {
case 'c':
g_core_mask = optarg;
@ -1263,6 +1323,9 @@ parse_args(int argc, char **argv)
case 'm':
g_max_completions = atoi(optarg);
break;
case 'n':
g_nr_io_queues_per_ns = atoi(optarg);
break;
case 'q':
g_queue_depth = atoi(optarg);
break;
@ -1291,12 +1354,20 @@ parse_args(int argc, char **argv)
g_rw_percentage = atoi(optarg);
mix_specified = true;
break;
case 'U':
g_nr_unused_io_queues = atoi(optarg);
break;
default:
usage(argv[0]);
return 1;
}
}
if (!g_nr_io_queues_per_ns) {
usage(argv[0]);
return 1;
}
if (!g_queue_depth) {
usage(argv[0]);
return 1;
@ -1521,6 +1592,17 @@ unregister_controllers(void)
spdk_nvme_ctrlr_is_feature_supported(entry->ctrlr, SPDK_NVME_INTEL_FEAT_LATENCY_TRACKING)) {
set_latency_tracking_feature(entry->ctrlr, false);
}
if (g_nr_unused_io_queues) {
int i;
for (i = 0; i < g_nr_unused_io_queues; i++) {
spdk_nvme_ctrlr_free_io_qpair(entry->unused_qpairs[i]);
}
free(entry->unused_qpairs);
}
spdk_nvme_detach(entry->ctrlr);
free(entry);
entry = next;

View File

@ -728,6 +728,17 @@ struct spdk_nvme_io_qpair_opts {
* compatibility requirements, or driver-assisted striping.
*/
uint32_t io_queue_requests;
/**
* When submitting I/O via spdk_nvme_ns_read/write and similar functions,
* don't immediately write the submission queue doorbell. Instead, write
* to the doorbell as necessary inside spdk_nvme_qpair_process_completions().
*
* This results in better batching of I/O submission and consequently fewer
* MMIO writes to the doorbell, which may increase performance.
*
* This only applies to local PCIe devices. */
bool delay_pcie_doorbell;
};
/**

View File

@ -54,7 +54,7 @@
* Patch level is incremented on maintenance branch releases and reset to 0 for each
* new major.minor release.
*/
#define SPDK_VERSION_PATCH 0
#define SPDK_VERSION_PATCH 1
/**
* Version string suffix.

View File

@ -420,7 +420,7 @@ spdk_bdev_io_put_buf(struct spdk_bdev_io *bdev_io)
tmp = STAILQ_FIRST(stailq);
aligned_buf = (void *)(((uintptr_t)buf + 511) & ~511UL);
spdk_bdev_io_set_buf(bdev_io, aligned_buf, tmp->internal.buf_len);
spdk_bdev_io_set_buf(tmp, aligned_buf, tmp->internal.buf_len);
STAILQ_REMOVE_HEAD(stailq, internal.buf_link);
tmp->internal.buf = buf;
@ -1661,6 +1661,7 @@ _calculate_measured_qd_cpl(struct spdk_io_channel_iter *i, int status)
bdev->internal.measured_queue_depth = bdev->internal.temporary_queue_depth;
if (bdev->internal.measured_queue_depth) {
bdev->internal.io_time += bdev->internal.period;
bdev->internal.weighted_io_time += bdev->internal.period * bdev->internal.measured_queue_depth;
}
}

View File

@ -98,6 +98,7 @@ static struct spdk_bdev_nvme_opts g_opts = {
.timeout_us = 0,
.retry_count = SPDK_NVME_DEFAULT_RETRY_COUNT,
.nvme_adminq_poll_period_us = 1000000ULL,
.nvme_ioq_poll_period_us = 0,
};
#define NVME_HOTPLUG_POLL_PERIOD_MAX 10000000ULL
@ -276,8 +277,12 @@ _bdev_nvme_reset_create_qpair(struct spdk_io_channel_iter *i)
struct spdk_nvme_ctrlr *ctrlr = spdk_io_channel_iter_get_io_device(i);
struct spdk_io_channel *_ch = spdk_io_channel_iter_get_channel(i);
struct nvme_io_channel *nvme_ch = spdk_io_channel_get_ctx(_ch);
struct spdk_nvme_io_qpair_opts opts;
nvme_ch->qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, NULL, 0);
spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
opts.delay_pcie_doorbell = true;
nvme_ch->qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));
if (!nvme_ch->qpair) {
spdk_for_each_channel_continue(i, -1);
return;
@ -511,6 +516,7 @@ bdev_nvme_create_cb(void *io_device, void *ctx_buf)
{
struct spdk_nvme_ctrlr *ctrlr = io_device;
struct nvme_io_channel *ch = ctx_buf;
struct spdk_nvme_io_qpair_opts opts;
#ifdef SPDK_CONFIG_VTUNE
ch->collect_spin_stat = true;
@ -518,13 +524,16 @@ bdev_nvme_create_cb(void *io_device, void *ctx_buf)
ch->collect_spin_stat = false;
#endif
ch->qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, NULL, 0);
spdk_nvme_ctrlr_get_default_io_qpair_opts(ctrlr, &opts, sizeof(opts));
opts.delay_pcie_doorbell = true;
ch->qpair = spdk_nvme_ctrlr_alloc_io_qpair(ctrlr, &opts, sizeof(opts));
if (ch->qpair == NULL) {
return -1;
}
ch->poller = spdk_poller_register(bdev_nvme_poll, ch, 0);
ch->poller = spdk_poller_register(bdev_nvme_poll, ch, g_opts.nvme_ioq_poll_period_us);
return 0;
}
@ -1300,6 +1309,11 @@ bdev_nvme_library_init(void)
g_opts.nvme_adminq_poll_period_us = intval;
}
intval = spdk_conf_section_get_intval(sp, "IOPollRate");
if (intval > 0) {
g_opts.nvme_ioq_poll_period_us = intval;
}
if (spdk_process_is_primary()) {
hotplug_enabled = spdk_conf_section_get_boolval(sp, "HotplugEnable", false);
}
@ -1724,6 +1738,7 @@ bdev_nvme_get_spdk_running_config(FILE *fp)
"# Set how often the admin queue is polled for asynchronous events.\n"
"# Units in microseconds.\n");
fprintf(fp, "AdminPollRate %"PRIu64"\n", g_opts.nvme_adminq_poll_period_us);
fprintf(fp, "IOPollRate %" PRIu64"\n", g_opts.nvme_ioq_poll_period_us);
fprintf(fp, "\n"
"# Disable handling of hotplug (runtime insert and remove) events,\n"
"# users can set to Yes if want to enable it.\n"
@ -1765,6 +1780,7 @@ bdev_nvme_config_json(struct spdk_json_write_ctx *w)
spdk_json_write_named_uint64(w, "timeout_us", g_opts.timeout_us);
spdk_json_write_named_uint32(w, "retry_count", g_opts.retry_count);
spdk_json_write_named_uint64(w, "nvme_adminq_poll_period_us", g_opts.nvme_adminq_poll_period_us);
spdk_json_write_named_uint64(w, "nvme_ioq_poll_period_us", g_opts.nvme_ioq_poll_period_us);
spdk_json_write_object_end(w);
spdk_json_write_object_end(w);

View File

@ -53,6 +53,7 @@ struct spdk_bdev_nvme_opts {
uint64_t timeout_us;
uint32_t retry_count;
uint64_t nvme_adminq_poll_period_us;
uint64_t nvme_ioq_poll_period_us;
};
struct nvme_ctrlr {

View File

@ -73,6 +73,7 @@ static const struct spdk_json_object_decoder rpc_bdev_nvme_options_decoders[] =
{"timeout_us", offsetof(struct spdk_bdev_nvme_opts, timeout_us), spdk_json_decode_uint64, true},
{"retry_count", offsetof(struct spdk_bdev_nvme_opts, retry_count), spdk_json_decode_uint32, true},
{"nvme_adminq_poll_period_us", offsetof(struct spdk_bdev_nvme_opts, nvme_adminq_poll_period_us), spdk_json_decode_uint64, true},
{"nvme_ioq_poll_period_us", offsetof(struct spdk_bdev_nvme_opts, nvme_ioq_poll_period_us), spdk_json_decode_uint64, true},
};
static void

View File

@ -78,6 +78,10 @@ ifneq (, $(wildcard $(DPDK_ABS_DIR)/lib/librte_bus_pci.*))
DPDK_LIB_LIST += rte_bus_pci
endif
ifneq (, $(wildcard $(DPDK_ABS_DIR)/lib/librte_kvargs.*))
DPDK_LIB_LIST += rte_kvargs
endif
DPDK_LIB = $(DPDK_LIB_LIST:%=$(DPDK_ABS_DIR)/lib/lib%$(DPDK_LIB_EXT))
# SPDK memory registration requires experimental (deprecated) rte_memory API for DPDK 18.05

View File

@ -52,7 +52,11 @@ static struct rte_pci_id nvme_pci_driver_id[] = {
static struct spdk_pci_enum_ctx g_nvme_pci_drv = {
.driver = {
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING
#if RTE_VERSION >= RTE_VERSION_NUM(18, 8, 0, 0)
| RTE_PCI_DRV_WC_ACTIVATE
#endif
,
.id_table = nvme_pci_driver_id,
#if RTE_VERSION >= RTE_VERSION_NUM(16, 11, 0, 0)
.probe = spdk_pci_device_init,

View File

@ -43,7 +43,11 @@ static struct rte_pci_id virtio_pci_driver_id[] = {
static struct spdk_pci_enum_ctx g_virtio_pci_drv = {
.driver = {
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING
#if RTE_VERSION >= RTE_VERSION_NUM(18, 8, 0, 0)
| RTE_PCI_DRV_WC_ACTIVATE
#endif
,
.id_table = virtio_pci_driver_id,
#if RTE_VERSION >= RTE_VERSION_NUM(16, 11, 0, 0)
.probe = spdk_pci_device_init,

View File

@ -213,6 +213,10 @@ spdk_nvme_ctrlr_get_default_io_qpair_opts(struct spdk_nvme_ctrlr *ctrlr,
opts->io_queue_requests = ctrlr->opts.io_queue_requests;
}
if (FIELD_OK(delay_pcie_doorbell)) {
opts->delay_pcie_doorbell = false;
}
#undef FIELD_OK
}
@ -403,7 +407,7 @@ nvme_ctrlr_set_supported_log_pages(struct spdk_nvme_ctrlr *ctrlr)
if (ctrlr->cdata.lpa.celp) {
ctrlr->log_page_supported[SPDK_NVME_LOG_COMMAND_EFFECTS_LOG] = true;
}
if (ctrlr->cdata.vid == SPDK_PCI_VID_INTEL) {
if (ctrlr->cdata.vid == SPDK_PCI_VID_INTEL && !(ctrlr->quirks & NVME_INTEL_QUIRK_NO_LOG_PAGES)) {
nvme_ctrlr_set_intel_support_log_pages(ctrlr);
}
}

View File

@ -107,6 +107,13 @@ extern pid_t g_spdk_nvme_pid;
*/
#define NVME_QUIRK_OCSSD 0x80
/*
* The controller has an Intel vendor ID but does not support Intel vendor-specific
* log pages. This is primarily for QEMU emulated SSDs which report an Intel vendor
* ID but do not support these log pages.
*/
#define NVME_INTEL_QUIRK_NO_LOG_PAGES 0x100
#define NVME_MAX_ASYNC_EVENTS (8)
#define NVME_MIN_TIMEOUT_PERIOD (5)
@ -314,14 +321,7 @@ struct nvme_async_event_request {
};
struct spdk_nvme_qpair {
STAILQ_HEAD(, nvme_request) free_req;
STAILQ_HEAD(, nvme_request) queued_req;
/** Commands opcode in this list will return error */
TAILQ_HEAD(, nvme_error_cmd) err_cmd_head;
/** Requests in this list will return error */
STAILQ_HEAD(, nvme_request) err_req_head;
enum spdk_nvme_transport_type trtype;
struct spdk_nvme_ctrlr *ctrlr;
uint16_t id;
@ -341,7 +341,15 @@ struct spdk_nvme_qpair {
*/
uint8_t no_deletion_notification_needed: 1;
struct spdk_nvme_ctrlr *ctrlr;
enum spdk_nvme_transport_type trtype;
STAILQ_HEAD(, nvme_request) free_req;
STAILQ_HEAD(, nvme_request) queued_req;
/** Commands opcode in this list will return error */
TAILQ_HEAD(, nvme_error_cmd) err_cmd_head;
/** Requests in this list will return error */
STAILQ_HEAD(, nvme_request) err_req_head;
/* List entry for spdk_nvme_ctrlr::active_io_qpairs */
TAILQ_ENTRY(spdk_nvme_qpair) tailq;

View File

@ -144,18 +144,6 @@ struct nvme_pcie_qpair {
/* Completion queue head doorbell */
volatile uint32_t *cq_hdbl;
/* Submission queue shadow tail doorbell */
volatile uint32_t *sq_shadow_tdbl;
/* Completion queue shadow head doorbell */
volatile uint32_t *cq_shadow_hdbl;
/* Submission queue event index */
volatile uint32_t *sq_eventidx;
/* Completion queue event index */
volatile uint32_t *cq_eventidx;
/* Submission queue */
struct spdk_nvme_cmd *cmd;
@ -172,13 +160,17 @@ struct nvme_pcie_qpair {
uint16_t max_completions_cap;
uint16_t last_sq_tail;
uint16_t sq_tail;
uint16_t cq_head;
uint16_t sq_head;
uint8_t phase;
bool is_enabled;
struct {
uint8_t phase : 1;
uint8_t is_enabled : 1;
uint8_t delay_pcie_doorbell : 1;
uint8_t has_shadow_doorbell : 1;
} flags;
/*
* Base qpair structure.
@ -187,6 +179,20 @@ struct nvme_pcie_qpair {
*/
struct spdk_nvme_qpair qpair;
struct {
/* Submission queue shadow tail doorbell */
volatile uint32_t *sq_tdbl;
/* Completion queue shadow head doorbell */
volatile uint32_t *cq_hdbl;
/* Submission queue event index */
volatile uint32_t *sq_eventidx;
/* Completion queue event index */
volatile uint32_t *cq_eventidx;
} shadow_doorbell;
/*
* Fields below this point should not be touched on the normal I/O path.
*/
@ -674,6 +680,7 @@ nvme_pcie_ctrlr_construct_admin_qpair(struct spdk_nvme_ctrlr *ctrlr)
}
pqpair->num_entries = NVME_ADMIN_ENTRIES;
pqpair->flags.delay_pcie_doorbell = 0;
ctrlr->adminq = &pqpair->qpair;
@ -941,7 +948,7 @@ nvme_pcie_qpair_reset(struct spdk_nvme_qpair *qpair)
{
struct nvme_pcie_qpair *pqpair = nvme_pcie_qpair(qpair);
pqpair->sq_tail = pqpair->cq_head = 0;
pqpair->last_sq_tail = pqpair->sq_tail = pqpair->cq_head = 0;
/*
* First time through the completion queue, HW will set phase
@ -950,7 +957,7 @@ nvme_pcie_qpair_reset(struct spdk_nvme_qpair *qpair)
* we'll toggle the bit each time when the completion queue
* rolls over.
*/
pqpair->phase = 1;
pqpair->flags.phase = 1;
memset(pqpair->cmd, 0,
pqpair->num_entries * sizeof(struct spdk_nvme_cmd));
@ -1166,6 +1173,28 @@ nvme_pcie_qpair_update_mmio_required(struct spdk_nvme_qpair *qpair, uint16_t val
return true;
}
static inline void
nvme_pcie_qpair_ring_sq_doorbell(struct spdk_nvme_qpair *qpair)
{
struct nvme_pcie_qpair *pqpair = nvme_pcie_qpair(qpair);
struct nvme_pcie_ctrlr *pctrlr = nvme_pcie_ctrlr(qpair->ctrlr);
bool need_mmio = true;
if (spdk_unlikely(pqpair->flags.has_shadow_doorbell)) {
need_mmio = nvme_pcie_qpair_update_mmio_required(qpair,
pqpair->sq_tail,
pqpair->shadow_doorbell.sq_tdbl,
pqpair->shadow_doorbell.sq_eventidx);
}
if (spdk_likely(need_mmio)) {
spdk_wmb();
g_thread_mmio_ctrlr = pctrlr;
spdk_mmio_write_4(pqpair->sq_tdbl, pqpair->sq_tail);
g_thread_mmio_ctrlr = NULL;
}
}
static void
nvme_pcie_qpair_submit_tracker(struct spdk_nvme_qpair *qpair, struct nvme_tracker *tr)
{
@ -1195,14 +1224,8 @@ nvme_pcie_qpair_submit_tracker(struct spdk_nvme_qpair *qpair, struct nvme_tracke
SPDK_ERRLOG("sq_tail is passing sq_head!\n");
}
spdk_wmb();
if (spdk_likely(nvme_pcie_qpair_update_mmio_required(qpair,
pqpair->sq_tail,
pqpair->sq_shadow_tdbl,
pqpair->sq_eventidx))) {
g_thread_mmio_ctrlr = pctrlr;
spdk_mmio_write_4(pqpair->sq_tdbl, pqpair->sq_tail);
g_thread_mmio_ctrlr = NULL;
if (!pqpair->flags.delay_pcie_doorbell) {
nvme_pcie_qpair_ring_sq_doorbell(qpair);
}
}
@ -1374,7 +1397,7 @@ nvme_pcie_qpair_enable(struct spdk_nvme_qpair *qpair)
{
struct nvme_pcie_qpair *pqpair = nvme_pcie_qpair(qpair);
pqpair->is_enabled = true;
pqpair->flags.is_enabled = true;
if (nvme_qpair_is_io_queue(qpair)) {
nvme_pcie_io_qpair_enable(qpair);
} else {
@ -1400,7 +1423,7 @@ nvme_pcie_qpair_disable(struct spdk_nvme_qpair *qpair)
{
struct nvme_pcie_qpair *pqpair = nvme_pcie_qpair(qpair);
pqpair->is_enabled = false;
pqpair->flags.is_enabled = false;
if (nvme_qpair_is_io_queue(qpair)) {
nvme_pcie_io_qpair_disable(qpair);
} else {
@ -1553,10 +1576,17 @@ _nvme_pcie_ctrlr_create_io_qpair(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme
}
if (ctrlr->shadow_doorbell) {
pqpair->sq_shadow_tdbl = ctrlr->shadow_doorbell + (2 * qpair->id + 0) * pctrlr->doorbell_stride_u32;
pqpair->cq_shadow_hdbl = ctrlr->shadow_doorbell + (2 * qpair->id + 1) * pctrlr->doorbell_stride_u32;
pqpair->sq_eventidx = ctrlr->eventidx + (2 * qpair->id + 0) * pctrlr->doorbell_stride_u32;
pqpair->cq_eventidx = ctrlr->eventidx + (2 * qpair->id + 1) * pctrlr->doorbell_stride_u32;
pqpair->shadow_doorbell.sq_tdbl = ctrlr->shadow_doorbell + (2 * qpair->id + 0) *
pctrlr->doorbell_stride_u32;
pqpair->shadow_doorbell.cq_hdbl = ctrlr->shadow_doorbell + (2 * qpair->id + 1) *
pctrlr->doorbell_stride_u32;
pqpair->shadow_doorbell.sq_eventidx = ctrlr->eventidx + (2 * qpair->id + 0) *
pctrlr->doorbell_stride_u32;
pqpair->shadow_doorbell.cq_eventidx = ctrlr->eventidx + (2 * qpair->id + 1) *
pctrlr->doorbell_stride_u32;
pqpair->flags.has_shadow_doorbell = 1;
} else {
pqpair->flags.has_shadow_doorbell = 0;
}
nvme_pcie_qpair_reset(qpair);
@ -1579,6 +1609,7 @@ nvme_pcie_ctrlr_create_io_qpair(struct spdk_nvme_ctrlr *ctrlr, uint16_t qid,
}
pqpair->num_entries = opts->io_queue_size;
pqpair->flags.delay_pcie_doorbell = opts->delay_pcie_doorbell;
qpair = &pqpair->qpair;
@ -1914,11 +1945,11 @@ nvme_pcie_qpair_check_enabled(struct spdk_nvme_qpair *qpair)
{
struct nvme_pcie_qpair *pqpair = nvme_pcie_qpair(qpair);
if (!pqpair->is_enabled &&
if (!pqpair->flags.is_enabled &&
!qpair->ctrlr->is_resetting) {
nvme_qpair_enable(qpair);
}
return pqpair->is_enabled;
return pqpair->flags.is_enabled;
}
int
@ -1938,7 +1969,7 @@ nvme_pcie_qpair_submit_request(struct spdk_nvme_qpair *qpair, struct nvme_reques
tr = TAILQ_FIRST(&pqpair->free_tr);
if (tr == NULL || !pqpair->is_enabled) {
if (tr == NULL || !pqpair->flags.is_enabled) {
/*
* No tracker is available, or the qpair is disabled due to
* an in-progress controller-level reset.
@ -2073,7 +2104,7 @@ nvme_pcie_qpair_process_completions(struct spdk_nvme_qpair *qpair, uint32_t max_
while (1) {
cpl = &pqpair->cpl[pqpair->cq_head];
if (cpl->status.p != pqpair->phase) {
if (cpl->status.p != pqpair->flags.phase) {
break;
}
#ifdef __PPC64__
@ -2087,7 +2118,7 @@ nvme_pcie_qpair_process_completions(struct spdk_nvme_qpair *qpair, uint32_t max_
if (spdk_unlikely(++pqpair->cq_head == pqpair->num_entries)) {
pqpair->cq_head = 0;
pqpair->phase = !pqpair->phase;
pqpair->flags.phase = !pqpair->flags.phase;
}
tr = &pqpair->tr[cpl->cid];
@ -2107,15 +2138,29 @@ nvme_pcie_qpair_process_completions(struct spdk_nvme_qpair *qpair, uint32_t max_
}
if (num_completions > 0) {
if (spdk_likely(nvme_pcie_qpair_update_mmio_required(qpair, pqpair->cq_head,
pqpair->cq_shadow_hdbl,
pqpair->cq_eventidx))) {
bool need_mmio = true;
if (spdk_unlikely(pqpair->flags.has_shadow_doorbell)) {
need_mmio = nvme_pcie_qpair_update_mmio_required(qpair,
pqpair->cq_head,
pqpair->shadow_doorbell.cq_hdbl,
pqpair->shadow_doorbell.cq_eventidx);
}
if (spdk_likely(need_mmio)) {
g_thread_mmio_ctrlr = pctrlr;
spdk_mmio_write_4(pqpair->cq_hdbl, pqpair->cq_head);
g_thread_mmio_ctrlr = NULL;
}
}
if (pqpair->flags.delay_pcie_doorbell) {
if (pqpair->last_sq_tail != pqpair->sq_tail) {
nvme_pcie_qpair_ring_sq_doorbell(qpair);
pqpair->last_sq_tail = pqpair->sq_tail;
}
}
if (spdk_unlikely(ctrlr->timeout_enabled)) {
/*
* User registered for timeout callback

View File

@ -76,7 +76,8 @@ static const struct nvme_quirk nvme_quirks[] = {
NVME_QUIRK_DELAY_AFTER_QUEUE_ALLOC
},
{ {SPDK_PCI_VID_INTEL, 0x5845, SPDK_PCI_ANY_ID, SPDK_PCI_ANY_ID},
NVME_QUIRK_IDENTIFY_CNS
NVME_QUIRK_IDENTIFY_CNS |
NVME_INTEL_QUIRK_NO_LOG_PAGES
},
{ {SPDK_PCI_VID_CNEXLABS, 0x1f1f, SPDK_PCI_ANY_ID, SPDK_PCI_ANY_ID},
NVME_QUIRK_IDENTIFY_CNS |

View File

@ -224,7 +224,8 @@ if __name__ == "__main__":
action_on_timeout=args.action_on_timeout,
timeout_us=args.timeout_us,
retry_count=args.retry_count,
nvme_adminq_poll_period_us=args.nvme_adminq_poll_period_us)
nvme_adminq_poll_period_us=args.nvme_adminq_poll_period_us,
nvme_ioq_poll_period_us=args.nvme_ioq_poll_period_us)
p = subparsers.add_parser('set_bdev_nvme_options',
help='Set options for the bdev nvme type. This is startup command.')
@ -236,6 +237,8 @@ if __name__ == "__main__":
help='the number of attempts per I/O when an I/O fails', type=int)
p.add_argument('-p', '--nvme-adminq-poll-period-us',
help='How often the admin queue is polled for asynchronous events', type=int)
p.add_argument('-i', '--nvme-ioq-poll-period-us',
help='How often to poll I/O queues for completions', type=int)
p.set_defaults(func=set_bdev_nvme_options)
@call_cmd

View File

@ -147,14 +147,16 @@ def delete_aio_bdev(client, name):
return client.call('delete_aio_bdev', params)
def set_bdev_nvme_options(client, action_on_timeout=None, timeout_us=None, retry_count=None, nvme_adminq_poll_period_us=None):
def set_bdev_nvme_options(client, action_on_timeout=None, timeout_us=None, retry_count=None,
nvme_adminq_poll_period_us=None, nvme_ioq_poll_period_us=None):
"""Set options for the bdev nvme. This is startup command.
Args:
action_on_timeout: action to take on command time out. Valid values are: none, reset, abort (optional)
timeout_us: Timeout for each command, in microseconds. If 0, don't track timeouts (optional)
retry_count: The number of attempts per I/O when an I/O fails (optional)
nvme_adminq_poll_period_us: how often the admin queue is polled for asynchronous events in microsecon (optional)
nvme_adminq_poll_period_us: How often the admin queue is polled for asynchronous events in microseconds (optional)
nvme_ioq_poll_period_us: How often to poll I/O queues for completions in microseconds (optional)
"""
params = {}
@ -170,6 +172,9 @@ def set_bdev_nvme_options(client, action_on_timeout=None, timeout_us=None, retry
if nvme_adminq_poll_period_us:
params['nvme_adminq_poll_period_us'] = nvme_adminq_poll_period_us
if nvme_ioq_poll_period_us:
params['nvme_ioq_poll_period_us'] = nvme_ioq_poll_period_us
return client.call('set_bdev_nvme_options', params)

View File

@ -1,4 +1,4 @@
#!/bin/sh -e
#!/usr/bin/env bash
# create_vbox.sh
#