Commit Graph

102 Commits

Author SHA1 Message Date
Jim Harris
f42ca756b9 nvme: Allocate all MSI resources up front so that we can fall back to
INTx if necessary.

Sponsored by:	Intel
MFC after:	3 days
2014-03-18 18:10:35 +00:00
Jim Harris
496a27520d nvme: Close hole where nvd(4) would not be notified of all nvme(4)
instances if modules loaded during boot.

Sponsored by:	Intel
MFC after:	3 days
2014-03-18 18:09:08 +00:00
Jim Harris
1416ef361e nvme: NVMe specification dictates 4-byte alignment for PRPs (not 8).
Sponsored by:	Intel
MFC after:	3 days
2014-03-17 22:37:17 +00:00
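
The alignment rule in the commit above can be sketched as a simple mask test. This is a minimal illustration only: the helper name `sketch_prp_is_aligned` is hypothetical and is not taken from the nvme(4) source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of nvme(4): report whether a physical
 * address satisfies 4-byte (dword) alignment.
 */
static inline bool
sketch_prp_is_aligned(uint64_t phys_addr)
{
	/* 4-byte alignment means the two low-order bits are clear. */
	return ((phys_addr & 0x3) == 0);
}
```

An address such as 0x1004 passes this test even though it would fail an 8-byte alignment check (mask 0x7), which is the distinction the commit message draws.
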
Jim Harris
2b26030cbc nvme: Remove the software progress marker SET_FEATURE command during
controller initialization.

The spec says OS drivers should send this command after controller
initialization completes successfully, but other NVMe OS drivers are
not sending this command.  This change will therefore reduce differences
between the FreeBSD and other OS drivers.

Sponsored by:	Intel
MFC after:	3 days
2014-03-17 22:36:04 +00:00
Jim Harris
448cffc859 For IDENTIFY passthrough commands to Chatham prototype controllers, copy
the spoofed identify data into the user buffer rather than issuing the
command to the controller, since Chatham IDENTIFY data is always spoofed.

While here, fix a bug in the spoofed data for Chatham submission and
completion queue entry sizes.

Sponsored by:	Intel
MFC after:	3 days
2014-01-06 23:51:26 +00:00
Jim Harris
d603c3d73b Create a unique unit number for each controller and namespace cdev.
Sponsored by:	Intel
MFC after:	3 days
2013-11-01 23:30:54 +00:00
Jim Harris
8a959ae073 Fix the LINT build.
Approved by:	re (implicit)
MFC after:	1 week
2013-10-08 23:23:04 +00:00
Jim Harris
7aa27dbac5 Do not leak resources during attach if nvme_ctrlr_construct() or the initial
controller resets fail.

Sponsored by:	Intel
Reviewed by:	carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 16:01:43 +00:00
Jim Harris
bb2f67fd72 Log and then disable asynchronous notification of persistent events after
they occur.

This prevents repeated notifications of the same event.

Status of these events may be viewed at any time by viewing the
SMART/Health Info Page using nvmecontrol, whether or not asynchronous
events notifications for those events are enabled.  This log page can
be viewed using:

    nvmecontrol logpage -p 2 <ctrlr id>

Future enhancements may re-enable these notifications on a periodic basis
so that if the notified condition persists, it will continue to be logged.

Sponsored by:	Intel
Reviewed by:	carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 16:00:12 +00:00
Jim Harris
d5fc982133 Do not enable temperature threshold as an asynchronous event notification
on NVMe controllers that do not support it.

Sponsored by:	Intel
Reviewed by:	carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 15:49:14 +00:00
Jim Harris
992db80f1d Extend some 32-bit fields and variables to 64-bit to prevent overflow
when calculating stats in nvmecontrol perftest.

Sponsored by:	Intel
Reported by:	Joe Golio <joseph.golio@emc.com>
Reviewed by:	carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 15:47:22 +00:00
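
The kind of overflow this commit guards against can be reproduced in a few lines. The function names below are hypothetical; the actual perftest fields are not shown in the log, so treat this as a sketch of the general problem rather than the committed change.

```c
#include <stdint.h>

/*
 * Hypothetical names, not from nvmecontrol: total bytes moved, computed
 * entirely in 32-bit arithmetic. The product wraps modulo 2^32.
 */
uint32_t
sketch_total_bytes32(uint32_t io_count, uint32_t io_size)
{
	return (io_count * io_size);
}

/*
 * Same calculation with one operand widened first, so the multiply is
 * carried out in 64 bits and cannot wrap for 32-bit inputs.
 */
uint64_t
sketch_total_bytes64(uint32_t io_count, uint32_t io_size)
{
	return ((uint64_t)io_count * io_size);
}
```

With 2,000,000 I/Os of 4096 bytes each, the true total is 8,192,000,000 bytes, which exceeds the 32-bit maximum of 4,294,967,295. The 32-bit version silently wraps; the 64-bit version returns the correct total.
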
Jim Harris
a40e72a695 Add driver-assisted striping for upcoming Intel NVMe controllers that can
benefit from it.

Sponsored by:	Intel
Reviewed by:	kib (earlier version), carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 15:44:04 +00:00
Kenneth D. Merry
ce625ec719 Change the way that unmapped I/O capability is advertised.
The previous method was to set the D_UNMAPPED_IO flag in the cdevsw
for the driver.  The problem with this is that in many cases (e.g.
sa(4)) there may be some instances of the driver that can handle
unmapped I/O and some that can't.  The isp(4) driver can handle
unmapped I/O, but the esp(4) driver currently cannot.  The cdevsw
is shared among all driver instances.

So instead of setting a flag on the cdevsw, set a flag on the cdev.
This allows drivers to indicate support for unmapped I/O on a
per-instance basis.

sys/conf.h:	Remove the D_UNMAPPED_IO cdevsw flag and replace it
		with an SI_UNMAPPED cdev flag.

kern_physio.c:	Look at the cdev SI_UNMAPPED flag to determine
		whether or not a particular driver can handle
		unmapped I/O.

geom_dev.c:	Set the SI_UNMAPPED flag for all GEOM cdevs.
		Since GEOM will create a temporary mapping when
		needed, setting SI_UNMAPPED unconditionally will
		work.

		Remove the D_UNMAPPED_IO flag.

nvme_ns.c:	Set the SI_UNMAPPED flag on cdevs created here
		if NVME_UNMAPPED_BIO_SUPPORT is enabled.

vfs_aio.c:	In aio_qphysio(), check the SI_UNMAPPED flag on a
		cdev instead of the D_UNMAPPED_IO flag on the cdevsw.

sys/param.h:	Bump __FreeBSD_version to 1000045 for the switch from
		setting the D_UNMAPPED_IO flag in the cdevsw to setting
		SI_UNMAPPED in the cdev.

Reviewed by:	kib, jimharris
MFC after:	1 week
Sponsored by:	Spectra Logic
2013-08-15 22:52:39 +00:00
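
The difference between a per-driver flag and a per-instance flag can be sketched with two small structures. Everything here is illustrative: the `sketch_` types are simplified stand-ins for the kernel's cdevsw and cdev, and the flag value is made up for the example rather than copied from sys/conf.h.

```c
#include <stdbool.h>

/* Illustrative value only; not the kernel's definition of SI_UNMAPPED. */
#define SKETCH_SI_UNMAPPED	0x1u

/* Stand-in for the switch table shared by every instance of a driver. */
struct sketch_cdevsw {
	const char	*d_name;
};

/* Stand-in for one device instance, carrying its own flags. */
struct sketch_cdev {
	const struct sketch_cdevsw	*si_devsw;
	unsigned int			 si_flags;
};

/* The capability check consults the instance, not the shared switch. */
bool
sketch_can_do_unmapped(const struct sketch_cdev *dev)
{
	return ((dev->si_flags & SKETCH_SI_UNMAPPED) != 0);
}
```

Two instances can point at the same `sketch_cdevsw` while only one of them sets the flag, which is the case the commit message describes: some instances of a driver can handle unmapped I/O and others cannot.
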
Jim Harris
086d23cfd3 If a controller fails to initialize, do not notify consumers (nvd) of its
namespaces.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:49:32 +00:00
Jim Harris
56183abc2b Send a shutdown notification in the driver unload path, to ensure the
notification gets sent in cases where the system shuts down with the driver
unloaded.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:47:08 +00:00
Jim Harris
38441bd9a9 Add message when nvd disks are attached and detached.
As part of this commit, add an nvme_strvis() function which borrows
heavily from cam_strvis().  This will allow stripping of
leading/trailing whitespace and also handle unprintable characters
in model/serial numbers.  This function goes into a new nvme_util.c
file which is used by both the driver and nvmecontrol.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-07-19 21:40:57 +00:00
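
A rough sketch of the string cleanup described in this commit follows. It is not the committed nvme_strvis() and makes no claim about how cam_strvis() encodes unprintable bytes; the helper name and the choice of '?' as a replacement character are assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy a fixed-width, space-padded field (such as a
 * model or serial number) into dst as a NUL-terminated string, dropping
 * leading and trailing padding and replacing unprintable bytes with '?'.
 */
void
sketch_strvis(char *dst, size_t dstlen, const char *src, size_t srclen)
{
	size_t start = 0, end = srclen, out = 0, i;

	if (dstlen == 0)
		return;

	/* Skip leading blanks. */
	while (start < end && src[start] == ' ')
		start++;

	/* Drop trailing blanks and NUL padding. */
	while (end > start && (src[end - 1] == ' ' || src[end - 1] == '\0'))
		end--;

	/* Copy what remains, always leaving room for the terminator. */
	for (i = start; i < end && out + 1 < dstlen; i++) {
		unsigned char c = (unsigned char)src[i];

		dst[out++] = isprint(c) ? (char)c : '?';
	}
	dst[out] = '\0';
}
```

Because the source length is passed explicitly, the helper never reads past the end of a field that lacks a terminating NUL.
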
Jim Harris
2fb37e8f1a Fix nvme(4) and nvd(4) to support non-512-byte sector sizes.
Recent testing with QEMU that has variable sector size support for
NVMe uncovered some of these issues.  Chatham prototype boards supported
only 512 byte sectors.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-07-19 21:33:24 +00:00
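
The commit does not list the individual fixes, so the following only illustrates the general idea: derive LBAs from the sector size the namespace reports instead of assuming 512 bytes. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type for the sketch. */
struct sketch_lba_range {
	uint64_t	lba;		/* starting logical block */
	uint32_t	lba_count;	/* number of logical blocks */
};

/*
 * Convert a byte offset and length into an LBA range for a namespace
 * with the given sector size. Returns 0 on success, or -1 if the sector
 * size is zero or the request is not sector-aligned.
 */
int
sketch_bytes_to_lbas(uint64_t offset, uint32_t length, uint32_t sector_size,
    struct sketch_lba_range *out)
{
	if (sector_size == 0 ||
	    offset % sector_size != 0 || length % sector_size != 0)
		return (-1);

	out->lba = offset / sector_size;
	out->lba_count = length / sector_size;
	return (0);
}
```

The same 4096-byte request at byte offset 8192 maps to LBA 2 for one block on a 4096-byte-sector namespace, but to LBA 16 for eight blocks on a 512-byte-sector namespace.
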
Jim Harris
8e0ac13f5a Use pause() instead of DELAY() when polling for completion of admin
commands during controller initialization.

DELAY() does not work here during config_intrhook context - we need to
explicitly relinquish the CPU for the admin command completion to
get processed.

Sponsored by:	Intel
Reported by:	Adam Brooks <adam.j.brooks@intel.com>
Reviewed by:	carl
MFC after:	3 days
2013-07-17 23:26:56 +00:00
Jim Harris
e8f25c6266 Define constants for the lengths of the serial number, model number
and firmware revision in the controller's identify structure.

Also modify consumers of these fields to ensure they only use the
specified number of bytes for their respective fields.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-07-17 23:23:38 +00:00
Jim Harris
66619178b5 Fix a poorly worded comment in nvme(4).
MFC after:	3 days
2013-07-11 15:02:38 +00:00
Jim Harris
bd6b0ac5be Add comment explaining why CACHE_LINE_SIZE is defined in nvme_private.h
if not already defined elsewhere.

Requested by:	attilio
MFC after:	3 days
2013-07-09 21:24:19 +00:00
Jim Harris
e9efbc134f Update copyright dates.
MFC after:	3 days
2013-07-09 21:22:17 +00:00
Jim Harris
ec526ea90b Do not retry failed async event requests.
Sponsored by:	Intel
MFC after:	3 days
2013-07-09 21:03:39 +00:00
Jim Harris
eb32b874f6 Add pci_enable_busmaster() and pci_disable_busmaster() calls in
nvme_attach() and nvme_detach() respectively.

Sponsored by:	Intel
MFC after:	3 days
2013-07-09 21:02:45 +00:00
Jim Harris
49fac6101d Add firmware replacement and activation support to nvmecontrol(8) through
a new firmware command.

NVMe controllers may support up to 7 firmware slots for storing of
different firmware revisions.  This new firmware command supports
firmware replacement (i.e. firmware download) with or without immediate
activation, or activation of a previously stored firmware image.  It
also supports selection of the firmware slot during replacement
operations, using IDENTIFY information from the controller to
check that the specified slot is valid.

Newly activated firmware does not take effect until the next controller
reset, either via a reboot or separate 'nvmecontrol reset' command to the
same controller.

Submitted by:	Joe Golio <joseph.golio@emc.com>
Obtained from:	EMC / Isilon Storage Division
MFC after:	3 days
2013-06-27 00:08:25 +00:00
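
The slot validation mentioned in this commit can be sketched from the FRMW byte of the controller's IDENTIFY data, where bit 0 marks slot 1 as read-only and bits 3:1 give the number of supported slots. The helper below is hypothetical; in particular, rejecting replacement into a read-only slot 1 is an assumption about reasonable behavior, not a description of what nvmecontrol(8) actually does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a firmware slot number is usable.
 * frmw      - FRMW byte from IDENTIFY CONTROLLER data
 * slot      - slot requested by the user (slots are numbered from 1)
 * replacing - true for a download/replace, false for activate-only
 */
bool
sketch_fw_slot_valid(uint8_t frmw, int slot, bool replacing)
{
	int num_slots = (frmw >> 1) & 0x7;	/* bits 3:1 */
	bool slot1_ro = (frmw & 0x1) != 0;	/* bit 0 */

	if (slot < 1 || slot > num_slots)
		return (false);

	/* Assumption: a read-only slot 1 cannot be a replacement target. */
	if (replacing && slot == 1 && slot1_ro)
		return (false);

	return (true);
}
```

For a controller reporting three slots with slot 1 read-only (FRMW = 0x07), slot 2 is valid for replacement, slot 4 is out of range, and slot 1 is valid only for activation.
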
Jim Harris
bbd412dd05 Remove remaining uio-related code.
The nvme_physio() function was removed quite a while ago, which was the
only user of this uio-related code.

Sponsored by:	Intel
MFC after:	3 days
2013-06-26 23:37:11 +00:00
Jim Harris
7b68ae1e5e Fail any passthrough command whose transfer size exceeds the controller's
max transfer size.  This guards against rogue commands coming in from
userspace.

Also add KASSERTS for the virtual address and unmapped bio cases, if the
transfer size exceeds the controller's max transfer size.

Sponsored by:	Intel
MFC after:	3 days
2013-06-26 23:32:45 +00:00
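
A bounds check of the kind this commit describes might look like the following. The function name is hypothetical and the choice of EIO as the error code is an assumption; the log does not say which error the driver returns.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: reject a passthrough request whose data length
 * exceeds the controller's maximum transfer size. Returns 0 if the
 * request may proceed, or an errno value if it must be failed.
 */
int
sketch_check_passthru_len(uint32_t len, uint32_t max_xfer_size)
{
	if (len > max_xfer_size)
		return (EIO);	/* assumed error code */
	return (0);
}
```

Doing the check before any buffer is mapped means an oversized request from userspace is refused without ever reaching the controller.
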
Jim Harris
8d09e3c400 Use MAXPHYS to specify the maximum I/O size for nvme(4).
Also allow admin commands to transfer up to this maximum I/O size, rather
than the artificial limit previously imposed.  The larger I/O size is very
beneficial for upcoming firmware download support.  This has the added
benefit of simplifying the code since both admin and I/O commands now use
the same maximum I/O size.

Sponsored by:	Intel
MFC after:	3 days
2013-06-26 23:27:17 +00:00
Jim Harris
5076698e19 Remove the NVME_IDENTIFY_CONTROLLER and NVME_IDENTIFY_NAMESPACE IOCTLs and replace
them with the NVMe passthrough equivalent.

Sponsored by:	Intel
2013-04-12 17:56:47 +00:00
Jim Harris
7c3f19d7bb Add support for passthrough NVMe commands.
This includes a new IOCTL to support a generic method for nvmecontrol(8) to pass
IDENTIFY, GET_LOG_PAGE, GET_FEATURES and other commands to the controller, rather than
separate IOCTLs for each.

Sponsored by:	Intel
2013-04-12 17:52:17 +00:00
Jim Harris
ca269f32ef Move the busdma mapping functions to nvme_qpair.c.
This removes nvme_uio.c completely.

Sponsored by:	Intel
2013-04-12 17:48:45 +00:00
Jim Harris
611060cab5 Remove the NVMe-specific physio and associated routines.
These were added early on for benchmarking purposes to avoid the mapped I/O
penalties incurred in kern_physio.  Now that FreeBSD (including kern_physio)
supports unmapped I/O, the need for these NVMe-specific routines no longer exists.

Sponsored by:	Intel
2013-04-12 17:44:55 +00:00
Jim Harris
97fafe2580 Add a mutex to each namespace, for general locking operations on the namespace.
Sponsored by:	Intel
2013-04-12 17:41:24 +00:00
Jim Harris
a90b810492 Rename the controller's fail_req_lock, so that it can be used for other
locking operations on the controller.

Sponsored by:	Intel
2013-04-12 17:36:48 +00:00
Jim Harris
e2b9900498 Do not panic when a busdma mapping operation fails.
Instead, print an error message and fail the associated command with
DATA_TRANSFER_ERROR NVMe completion status.

Sponsored by:	Intel
2013-04-12 17:34:49 +00:00
Jim Harris
5fdf9c3c8e Add unmapped bio support to nvme(4) and nvd(4).
Sponsored by:	Intel
2013-04-01 16:23:34 +00:00
Jim Harris
1e526bc478 Add "type" to nvme_request, signifying if its payload is a VADDR, UIO, or
NULL. This simplifies decisions around if/how requests are routed through
busdma.  It also paves the way for supporting unmapped bios.

Sponsored by:	Intel
2013-03-29 20:34:28 +00:00
Jim Harris
64432b473b Remove obsolete comment. This code has now been tested with the QEMU
NVMe device emulator.
2013-03-28 16:57:48 +00:00
Jim Harris
bb852ae89b Delete extra IO qpairs allocated based on number of MSI-X vectors, but
later found to not be usable because the controller doesn't support the
same number of queues.

This is not the normal case, but does occur with the Chatham prototype
board.

Sponsored by:	Intel
2013-03-28 16:54:19 +00:00
Jim Harris
bdd1fd402c Fix printf format issue on i386.
Reported by:	bz
2013-03-27 00:37:00 +00:00
Jim Harris
547d523eb8 Clean up debug prints.
1) Consistently use device_printf.
2) Make dump_completion and dump_command into something more
    human-readable.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 22:17:10 +00:00
Jim Harris
dd433dd0fb Move common code from the different nvme_allocate_request functions into a
separate function.

Sponsored by:	Intel
Suggested by:	carl
Reviewed by:	carl
2013-03-26 22:13:07 +00:00
Jim Harris
237d2019e5 Change a number of malloc(9) calls to use M_WAITOK instead of
M_NOWAIT.

Sponsored by:	Intel
Suggested by:	carl
Reviewed by:	carl
2013-03-26 22:11:34 +00:00
Jim Harris
955910a916 Replace usages of mtx_pool_find used for admin commands with a polling
mechanism.

Now that all requests are timed, we are guaranteed to get a completion
notification, even if it is an abort status due to a timed out admin
command.

This has the effect of simplifying the controller and namespace setup
code, so that it reads straight through rather than broken up into
a bunch of different callback functions.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 22:09:51 +00:00
Jim Harris
43a3725688 Abort and do not retry any outstanding admin commands left over after
a controller reset.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 22:06:05 +00:00
Jim Harris
232e2edb6c Add the ability to internally mark a controller as failed, if it is unable to
start or reset.  Also add a notifier for NVMe consumers for controller fail
conditions and plumb this notifier for nvd(4) to destroy the associated
GEOM disks when a failure occurs.

This requires a bit of work to cover the races when a consumer is sending
I/O requests to a controller that is transitioning to the failed state.  To
help cover this condition, add a task to defer completion of I/Os submitted
to a failed controller, so that the consumer will still always receive its
completions in a different context than the submission.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:58:38 +00:00
Jim Harris
3d7eb41c1b Just disable the controller instead of deleting IO queues during detach.
This is just as effective, and removes the need for a bunch of admin commands
to a controller that's going to be disabled shortly anyways.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:48:41 +00:00
Jim Harris
74019d4b67 Set Pre-boot Software Load Count to 0 at the end of the controller
start process.

The spec indicates the OS driver should use Set Features (Software
Progress Marker) to set the pre-boot software load count to 0
after the OS driver has successfully been initialized.  This allows
pre-boot software to determine if there have been any issues with the
OS loading.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:42:53 +00:00
Jim Harris
be34f21609 Remove the is_started flag from struct nvme_controller.
This flag was originally added to communicate to the sysctl code
which oids should be built, but there are easier ways to do this.  This
needs to be cleaned up prior to adding new controller states - for example,
controller failure.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:19:26 +00:00
Jim Harris
02e3348484 Ensure the controller's MDTS is accounted for in max_xfer_size.
The controller's IDENTIFY data contains MDTS (Max Data Transfer Size) to
allow the controller to specify the maximum I/O data transfer size.  nvme(4)
already provides a default maximum, but make sure it does not exceed what
MDTS reports.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:16:53 +00:00
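
The clamping described in this commit can be sketched as follows. In the NVMe specification, MDTS is a power-of-two exponent in units of the controller's minimum memory page size, and a value of 0 means the controller reports no limit. The function and parameter names are hypothetical and do not come from the nvme(4) source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine the driver's default maximum transfer
 * size with the limit the controller reports through MDTS.
 * driver_default - the driver's own maximum, in bytes
 * mdts           - MDTS field from IDENTIFY CONTROLLER data
 * min_page_size  - controller's minimum memory page size, in bytes
 */
uint32_t
sketch_max_xfer_size(uint32_t driver_default, uint8_t mdts,
    uint32_t min_page_size)
{
	uint64_t ctrlr_limit;

	/* MDTS == 0 means the controller imposes no limit. */
	if (mdts == 0)
		return (driver_default);

	/* Guard the shift; an exponent this large cannot be the smaller bound. */
	if (mdts >= 32)
		return (driver_default);

	ctrlr_limit = (uint64_t)min_page_size << mdts;
	return (ctrlr_limit < driver_default ?
	    (uint32_t)ctrlr_limit : driver_default);
}
```

With a 4096-byte minimum page size and MDTS = 3, the controller limit is 32768 bytes, so a 131072-byte driver default is reduced to 32768. With MDTS = 0 the driver default is used unchanged.
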