freebsd-nq

Author	SHA1	Message	Date
Alexander Motin	ee7f4d8187	Revert r292074 (by smh): Limit stripesize reported from nvd(4) to 4K I believe that this patch handled the problem from the wrong side. Instead of making ZFS properly handle large stripe sizes, it made an unrelated driver lie about its reported parameters to work around that. An alternative solution to this problem on the ZFS side was committed at r296615. Discussed with: smh	2016-03-10 17:13:10 +00:00
Jim Harris	361e1fb408	nvme: fix intx handler to not dereference ioq during initialization This was a regression from r293328, which deferred allocation of the controller's ioq array until after interrupts are enabled during boot. PR: 207432 Reported and tested by: Andy Carrel <wac@google.com> MFC after: 3 days Sponsored by: Intel	2016-02-24 00:01:10 +00:00
Justin Hibbits	43cd61606b	Replace several bus_alloc_resource() calls using default arguments with bus_alloc_resource_any() Since these calls only use default arguments, bus_alloc_resource_any() is the right call. Differential Revision: https://reviews.freebsd.org/D5306	2016-02-19 03:37:56 +00:00
Jim Harris	7b036d7790	nvme: avoid duplicate SET_NUM_QUEUES commands nvme(4) issues a SET_NUM_QUEUES command during device initialization to ensure enough I/O queues exist for each of the MSI-X vectors we have allocated. The SET_NUM_QUEUES command is then issued again during nvme_ctrlr_start(), to ensure that it is properly set after any controller reset. At least one NVMe drive exists which fails this second SET_NUM_QUEUES command during device initialization. So change nvme_ctrlr_start() to only issue its SET_NUM_QUEUES command when it is coming out of a reset - avoiding the duplicate SET_NUM_QUEUES during device initialization. Reported by: gallatin MFC after: 3 days Sponsored by: Intel	2016-02-11 17:32:41 +00:00
Warner Losh	038659e7dd	Implement power command to list all power modes, find out the power mode we're in and to set the power mode.	2016-01-30 22:48:06 +00:00
Jim Harris	9c6b5d40eb	nvme: replace NVME_CEILING macro with howmany() Suggested by: rpokala MFC after: 3 days	2016-01-07 20:35:26 +00:00
Jim Harris	50dea2da12	nvme: add hw.nvme.min_cpus_per_ioq tunable Due to FreeBSD system-wide limits on number of MSI-X vectors (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321), it may be desirable to allocate fewer than the maximum number of vectors for an NVMe device, in order to save vectors for other devices (usually Ethernet) that can take better advantage of them and may be probed after NVMe. This tunable is expressed in terms of minimum number of CPUs per I/O queue instead of max number of queues per controller, to allow for a more even distribution of CPUs per queue. This avoids cases where some number of CPUs have a dedicated queue, but other CPUs need to share queues. Ideally the PR referenced above will eventually be fixed and the mechanism implemented here becomes obsolete anyways. While here, fix a bug in the CPUs per I/O queue calculation to properly account for the admin queue's MSI-X vector. Reviewed by: gallatin MFC after: 3 days Sponsored by: Intel	2016-01-07 20:32:04 +00:00
Jim Harris	2b647da7a0	nvme: do not revert to a single I/O queue when per-CPU queues not possible Previously nvme(4) would revert to a single I/O queue if it could not allocate enough interrupt vectors or NVMe submission/completion queues to have one I/O queue per core. This patch determines how to utilize a smaller number of available interrupt vectors, and assigns (as closely as possible) an equal number of cores to each associated I/O queue. MFC after: 3 days Sponsored by: Intel	2016-01-07 16:18:32 +00:00
Jim Harris	d400f790b1	nvme: break out interrupt setup code into a separate function MFC after: 3 days Sponsored by: Intel	2016-01-07 16:12:42 +00:00
Jim Harris	e5af5854ff	nvme: do not pre-allocate MSI-X IRQ resources The issue referenced here was resolved by other changes in recent commits, so this code is no longer needed. MFC after: 3 days Sponsored by: Intel	2016-01-07 16:11:31 +00:00
Jim Harris	c75ad8ce5a	nvme: remove per_cpu_io_queues from struct nvme_controller Instead just use num_io_queues to make this determination. This prepares for some future changes enabling use of multiple queues when we do not have enough queues or MSI-X vectors for one queue per CPU. MFC after: 3 days Sponsored by: Intel	2016-01-07 16:09:56 +00:00
Jim Harris	d85f84abb8	nvme: simplify some of the nested ifs in interrupt setup code This prepares for some follow-up commits which do more work in this area. MFC after: 3 days Sponsored by: Intel	2016-01-07 16:08:04 +00:00
Steven Hartland	fdf16a68ab	Limit stripesize reported from nvd(4) to 4K Intel NVMe controllers have a slow path for I/Os that span a 128KB stripe boundary but ZFS limits ashift, which is derived from d_stripesize, to 13 (8KB) so we limit the stripesize reported to geom(8) to 4KB. This may result in a small number of additional I/Os requiring splitting in nvme(4), however the NVMe I/O path is very efficient so these additional I/Os will cause very minimal (if any) difference in performance or CPU utilisation. This can be controlled by the new sysctl kern.nvme.max_optimal_sectorsize. MFC after: 1 week Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D4446	2015-12-11 02:06:03 +00:00
Jim Harris	fdbd3d8068	nvd, nvme: report stripesize through GEOM disk layer MFC after: 3 days Sponsored by: Intel	2015-10-30 16:35:18 +00:00
Jim Harris	e7e7bad3d7	nvme: fix race condition in split bio completion path Fixes race condition observed under following circumstances: 1) I/O split on 128KB boundary with Intel NVMe controller. Current Intel controllers produce better latency when I/Os do not span a 128KB boundary - even if the I/O size itself is less than 128KB. 2) Per-CPU I/O queues are enabled. 3) Child I/Os are submitted on different submission queues. 4) Interrupts for child I/O completions occur almost simultaneously. 5) ithread for child I/O A increments bio_inbed, then immediately is preempted (rendezvous IPI, higher priority interrupt). 6) ithread for child I/O B increments bio_inbed, then completes parent bio since all children are now completed. 7) parent bio is freed, and immediately reallocated for a VFS or gpart bio (including setting bio_children to 1 and clearing bio_driver1). 8) ithread for child I/O A resumes processing. bio_children for what it thinks is the parent bio is set to 1, so it thinks it needs to complete the parent bio. Result is either calling a NULL callback function, or double freeing the bio to its uma zone. PR: 203746 Reported by: Drew Gallatin <gallatin@netflix.com>, Marc Goroff <mgoroff@quorum.net> Tested by: Drew Gallatin <gallatin@netflix.com> MFC after: 3 days Sponsored by: Intel	2015-10-30 16:06:34 +00:00
Jim Harris	0e1fd2dda3	nvme: do not notify a consumer about failures that occur during initialization MFC after: 3 days Sponsored by: Intel	2015-07-29 21:29:50 +00:00
Jeff Roberson	fade8dd714	Refactor unmapped buffer address handling. - Use pointer assignment rather than a combination of pointers and flags to switch buffers between unmapped and mapped. This eliminates multiple flags and generally simplifies the logic. - Eliminate b_saveaddr since it is only used with pager bufs which have their b_data re-initialized on each allocation. - Gather up some convenience routines in the buffer cache for manipulating buf space and buf malloc space. - Add an inline, buf_mapped(), to standardize checks around unmapped buffers. In collaboration with: mlaier Reviewed by: kib Tested by: pho (many small revisions ago) Sponsored by: EMC / Isilon Storage Division	2015-07-23 19:13:41 +00:00
Jim Harris	cbdec09c1c	nvme: ensure csts.rdy bit is cleared before returning from nvme_ctrlr_disable PR: 200458 MFC after: 3 days Sponsored by: Intel	2015-07-23 15:50:39 +00:00
Jim Harris	de9a58f4ee	nvme: properly handle case where pci_alloc_msix does not alloc all vectors Reported by: Sean Kelly <smkelly@smkelly.org> MFC after: 3 days Sponsored by: Intel	2015-07-23 15:35:08 +00:00
Jim Harris	3345ed9a55	nvme: use BUS_SPACE_MAXSIZE for bus_dma_tag_create maxsize parameter This fixes i386 PAE build fallout from r281281. Reported by: bz MFC after: 1 week	2015-04-09 00:37:55 +00:00
Jim Harris	36b0e4ee1f	nvme: remove CHATHAM related code Chatham was an internal NVMe prototype board used for early driver development. MFC after: 1 week Sponsored by: Intel	2015-04-08 21:52:06 +00:00
Jim Harris	eb4929fb41	nvme: add device strings for Intel DC series NVMe SSDs MFC after: 1 week Sponsored by: Intel	2015-04-08 21:50:45 +00:00
Jim Harris	a6e3096392	nvme: create separate DMA tag for non-payload DMA buffers Submission and completion queue memory need to use a separate DMA tag for mappings than payload buffers, to ensure mappings remain contiguous even with DMAR enabled. Submitted by: kib MFC after: 1 week Sponsored by: Intel	2015-04-08 21:49:45 +00:00
Jim Harris	e5ce537999	nvme: fall back to a smaller MSI-X vector allocation if necessary Previously, if per-CPU MSI-X vectors could not be allocated, nvme(4) would fall back to INTx with a single I/O queue pair. This change will still fall back to a single I/O queue pair, but allocate MSI-X vectors instead of reverting to INTx. MFC after: 1 week Sponsored by: Intel	2015-04-08 21:46:18 +00:00
Jim Harris	2efb5fb1ec	Use bitwise OR instead of logical OR when constructing value for SET_FEATURES/NUMBER_OF_QUEUES command. Sponsored by: Intel MFC after: 3 days	2014-06-10 21:40:43 +00:00
Jim Harris	f42ca756b9	nvme: Allocate all MSI resources up front so that we can fall back to INTx if necessary. Sponsored by: Intel MFC after: 3 days	2014-03-18 18:10:35 +00:00
Jim Harris	496a27520d	nvme: Close hole where nvd(4) would not be notified of all nvme(4) instances if modules loaded during boot. Sponsored by: Intel MFC after: 3 days	2014-03-18 18:09:08 +00:00
Jim Harris	1416ef361e	nvme: NVMe specification dictates 4-byte alignment for PRPs (not 8). Sponsored by: Intel MFC after: 3 days	2014-03-17 22:37:17 +00:00
Jim Harris	2b26030cbc	nvme: Remove the software progress marker SET_FEATURE command during controller initialization. The spec says OS drivers should send this command after controller initialization completes successfully, but other NVMe OS drivers are not sending this command. This change will therefore reduce differences between the FreeBSD and other OS drivers. Sponsored by: Intel MFC after: 3 days	2014-03-17 22:36:04 +00:00
Jim Harris	448cffc859	For IDENTIFY passthrough commands to Chatham prototype controllers, copy the spoofed identify data into the user buffer rather than issuing the command to the controller, since Chatham IDENTIFY data is always spoofed. While here, fix a bug in the spoofed data for Chatham submission and completion queue entry sizes. Sponsored by: Intel MFC after: 3 days	2014-01-06 23:51:26 +00:00
Jim Harris	d603c3d73b	Create a unique unit number for each controller and namespace cdev. Sponsored by: Intel MFC after: 3 days	2013-11-01 23:30:54 +00:00
Jim Harris	8a959ae073	Fix the LINT build. Approved by: re (implicit) MFC after: 1 week	2013-10-08 23:23:04 +00:00
Jim Harris	7aa27dbac5	Do not leak resources during attach if nvme_ctrlr_construct() or the initial controller resets fail. Sponsored by: Intel Reviewed by: carl Approved by: re (hrs) MFC after: 1 week	2013-10-08 16:01:43 +00:00
Jim Harris	bb2f67fd72	Log and then disable asynchronous notification of persistent events after they occur. This prevents repeated notifications of the same event. Status of these events may be viewed at any time by viewing the SMART/Health Info Page using nvmecontrol, whether or not asynchronous event notifications for those events are enabled. This log page can be viewed using: nvmecontrol logpage -p 2 <ctrlr id> Future enhancements may re-enable these notifications on a periodic basis so that if the notified condition persists, it will continue to be logged. Sponsored by: Intel Reviewed by: carl Approved by: re (hrs) MFC after: 1 week	2013-10-08 16:00:12 +00:00
Jim Harris	d5fc982133	Do not enable temperature threshold as an asynchronous event notification on NVMe controllers that do not support it. Sponsored by: Intel Reviewed by: carl Approved by: re (hrs) MFC after: 1 week	2013-10-08 15:49:14 +00:00
Jim Harris	992db80f1d	Extend some 32-bit fields and variables to 64-bit to prevent overflow when calculating stats in nvmecontrol perftest. Sponsored by: Intel Reported by: Joe Golio <joseph.golio@emc.com> Reviewed by: carl Approved by: re (hrs) MFC after: 1 week	2013-10-08 15:47:22 +00:00
Jim Harris	a40e72a695	Add driver-assisted striping for upcoming Intel NVMe controllers that can benefit from it. Sponsored by: Intel Reviewed by: kib (earlier version), carl Approved by: re (hrs) MFC after: 1 week	2013-10-08 15:44:04 +00:00
Kenneth D. Merry	ce625ec719	Change the way that unmapped I/O capability is advertised. The previous method was to set the D_UNMAPPED_IO flag in the cdevsw for the driver. The problem with this is that in many cases (e.g. sa(4)) there may be some instances of the driver that can handle unmapped I/O and some that can't. The isp(4) driver can handle unmapped I/O, but the esp(4) driver currently cannot. The cdevsw is shared among all driver instances. So instead of setting a flag on the cdevsw, set a flag on the cdev. This allows drivers to indicate support for unmapped I/O on a per-instance basis. sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it with an SI_UNMAPPED cdev flag. kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine whether or not a particular driver can handle unmapped I/O. geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs. Since GEOM will create a temporary mapping when needed, setting SI_UNMAPPED unconditionally will work. Remove the D_UNMAPPED_IO flag. nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here if NVME_UNMAPPED_BIO_SUPPORT is enabled. vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a cdev instead of the D_UNMAPPED_IO flag on the cdevsw. sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from setting the D_UNMAPPED_IO flag in the cdevsw to setting SI_UNMAPPED in the cdev. Reviewed by: kib, jimharris MFC after: 1 week Sponsored by: Spectra Logic	2013-08-15 22:52:39 +00:00
Jim Harris	086d23cfd3	If a controller fails to initialize, do not notify consumers (nvd) of its namespaces. Sponsored by: Intel Reviewed by: carl MFC after: 3 days	2013-08-13 21:49:32 +00:00
Jim Harris	56183abc2b	Send a shutdown notification in the driver unload path, to ensure notification gets sent in cases where system shuts down with driver unloaded. Sponsored by: Intel Reviewed by: carl MFC after: 3 days	2013-08-13 21:47:08 +00:00
Jim Harris	38441bd9a9	Add message when nvd disks are attached and detached. As part of this commit, add an nvme_strvis() function which borrows heavily from cam_strvis(). This will allow stripping of leading/trailing whitespace and also handle unprintable characters in model/serial numbers. This function goes into a new nvme_util.c file which is used by both the driver and nvmecontrol. Sponsored by: Intel Reviewed by: carl MFC after: 3 days	2013-07-19 21:40:57 +00:00
Jim Harris	2fb37e8f1a	Fix nvme(4) and nvd(4) to support non 512-byte sector sizes. Recent testing with QEMU that has variable sector size support for NVMe uncovered some of these issues. Chatham prototype boards supported only 512 byte sectors. Sponsored by: Intel Reviewed by: carl MFC after: 3 days	2013-07-19 21:33:24 +00:00
Jim Harris	8e0ac13f5a	Use pause() instead of DELAY() when polling for completion of admin commands during controller initialization. DELAY() does not work here during config_intrhook context - we need to explicitly relinquish the CPU for the admin command completion to get processed. Sponsored by: Intel Reported by: Adam Brooks <adam.j.brooks@intel.com> Reviewed by: carl MFC after: 3 days	2013-07-17 23:26:56 +00:00
Jim Harris	e8f25c6266	Define constants for the lengths of the serial number, model number and firmware revision in the controller's identify structure. Also modify consumers of these fields to ensure they only use the specified number of bytes for their respective fields. Sponsored by: Intel Reviewed by: carl MFC after: 3 days	2013-07-17 23:23:38 +00:00
Jim Harris	66619178b5	Fix a poorly worded comment in nvme(4). MFC after: 3 days	2013-07-11 15:02:38 +00:00
Jim Harris	bd6b0ac5be	Add comment explaining why CACHE_LINE_SIZE is defined in nvme_private.h if not already defined elsewhere. Requested by: attilio MFC after: 3 days	2013-07-09 21:24:19 +00:00
Jim Harris	e9efbc134f	Update copyright dates. MFC after: 3 days	2013-07-09 21:22:17 +00:00
Jim Harris	ec526ea90b	Do not retry failed async event requests. Sponsored by: Intel MFC after: 3 days	2013-07-09 21:03:39 +00:00
Jim Harris	eb32b874f6	Add pci_enable_busmaster() and pci_disable_busmaster() calls in nvme_attach() and nvme_detach() respectively. Sponsored by: Intel MFC after: 3 days	2013-07-09 21:02:45 +00:00
Jim Harris	49fac6101d	Add firmware replacement and activation support to nvmecontrol(8) through a new firmware command. NVMe controllers may support up to 7 firmware slots for storing of different firmware revisions. This new firmware command supports firmware replacement (i.e. firmware download) with or without immediate activation, or activation of a previously stored firmware image. It also supports selection of the firmware slot during replacement operations, using IDENTIFY information from the controller to check that the specified slot is valid. Newly activated firmware does not take effect until the next controller reset, either via a reboot or separate 'nvmecontrol reset' command to the same controller. Submitted by: Joe Golio <joseph.golio@emc.com> Obtained from: EMC / Isilon Storage Division MFC after: 3 days	2013-06-27 00:08:25 +00:00
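
Note on 2b647da7a0 and 50dea2da12: the messages describe spreading CPUs as evenly as possible over fewer I/O queues, with one MSI-X vector set aside for the admin queue and a floor on CPUs per queue. The following is only a rough sketch of that arithmetic, not the driver's code. The names plan_io_queues, struct ioq_plan and HOWMANY are made up for illustration (HOWMANY is a local stand-in for howmany() from sys/param.h), and the assumption that at least two vectors are available is mine.

```c
#include <assert.h>

/* Ceiling division; local stand-in for FreeBSD's howmany(). */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/* Hypothetical result type: how many I/O queues, and CPUs per queue. */
struct ioq_plan {
	int num_io_queues;
	int cpus_per_ioq;
};

/*
 * Hypothetical helper. vectors_available counts every MSI-X vector we
 * obtained, including the one used by the admin queue.
 */
static struct ioq_plan
plan_io_queues(int ncpus, int vectors_available, int min_cpus_per_ioq)
{
	struct ioq_plan plan;
	int max_queues;

	/* Assumed precondition: one admin vector plus at least one I/O vector. */
	assert(ncpus >= 1 && vectors_available >= 2);

	/* Clamp the tunable-style floor into the range [1, ncpus]. */
	if (min_cpus_per_ioq < 1)
		min_cpus_per_ioq = 1;
	if (min_cpus_per_ioq > ncpus)
		min_cpus_per_ioq = ncpus;

	/* The floor on CPUs per queue gives an upper bound on queues. */
	max_queues = HOWMANY(ncpus, min_cpus_per_ioq);

	/* One vector belongs to the admin queue, so it cannot back an I/O queue. */
	if (max_queues > vectors_available - 1)
		max_queues = vectors_available - 1;

	/* Round CPUs per queue up, then recompute the queue count to even it out. */
	plan.cpus_per_ioq = HOWMANY(ncpus, max_queues);
	plan.num_io_queues = HOWMANY(ncpus, plan.cpus_per_ioq);
	return (plan);
}
```

With 16 CPUs and 6 vectors (5 usable for I/O), this yields 4 queues of 4 CPUs each rather than 5 unevenly loaded queues, which is the kind of even split the commit messages describe.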

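Note on 2efb5fb1ec: the fix swaps a logical OR for a bitwise OR when building the Number of Queues value. A minimal sketch of why that matters, assuming the NVMe layout where command dword 11 carries the zero-based submission queue count in bits 15:0 and the zero-based completion queue count in bits 31:16. The function name num_queues_cdw11 is hypothetical and this is not the driver's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack the same zero-based queue count into both
 * halves of dword 11. num_queues is assumed to be in [1, 65536].
 */
static uint32_t
num_queues_cdw11(uint32_t num_queues)
{
	uint32_t zero_based = num_queues - 1;

	/*
	 * Bitwise OR merges the two 16-bit fields. A logical OR (||) here
	 * would collapse the whole expression to 0 or 1.
	 */
	return ((zero_based << 16) | zero_based);
}
```

For 8 queues the packed value is 0x00070007; with || in place of | the same expression would evaluate to 1.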

127 Commits