Commit Graph

52 Commits

Author SHA1 Message Date
Warner Losh
0028abe633 Backout r329818, r329816 and r329815.
These aren't the commits I thought I was testing prior to
commit. Revert until I can sort out what happened and fix it.
2018-02-22 11:18:33 +00:00
Warner Losh
4d87e27125 Combine BIO_DELETE requests for nda devices
Now that we're queueing BIO_DELETE requests in the CAM I/O scheduler,
it make sense to try to combine as many as possible into a single
request to send down to hardware. Hopefully, lots of larger requests
like this are better than lots of individual transactions.

Note for future: need to limit based on total size of the trim
request. Should also collapse adjacent ranges where possible to
increase the size of the max payload.

Sponsored by: Netflix
2018-02-22 05:44:00 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Warner Losh
4e3b274457 Provide link speed data in XPT_GET_TRAN_SETTINGS. Provide full version
information for that and XPT_PATH_INQ. Provide macros to encode/decode
major/minor versions.  Read the link speed and lane count to compute
the base_transfer_speed for XPT_PATH_INQ.

Sponsored by: Netflix
2017-11-14 05:05:16 +00:00
Warner Losh
fa271a5d09 Closer examination shows that nvme and CAM both normally zero-fill
allocations (for req and ccb, which ultimately contain the
nvme_cmd). As such, we can micro-optimize these routines. Add a
comment to this effect, and bzero the ccb used to make the requests
for the nda dump rotuine so it more closely matches a ccb allocated
with xpt_get_ccb().

Sponsored by: Netflix
2017-10-15 23:53:55 +00:00
Warner Losh
fbed8df259 Explicitly set reserved fields and 'fuse' to 0. This prevents us from
acidentally sending bogus values in these fields, which some drives
may reject with an error or worse (undefined behavior).

This is especially needed for the ndadump routine which allocates the
cmd from stack garbage....

Sponsored by: Netflix
2017-10-15 16:17:59 +00:00
Warner Losh
c2005bba77 Fix a few overlooked spots where the coded uses 16-bit NSIDs. Chuck
Tuffli had submitted a more thorough patch that I was unaware of when
I did my work and this brings in the bits I missed from that patch.

PR: 220267
Submitted by: Chuck Tuffli
2017-08-29 15:46:34 +00:00
Warner Losh
030edcce02 Fill in reserved areas from NVMe spec in the IDENTIFY structure
(struct nvme_controller_data) as defined in the NVM Express
specification, revsion 1.3.

Sponsored by: Netflix
2017-08-25 21:38:43 +00:00
Warner Losh
223a9b93ac Add feature codes from NVMe 1.3 specification:
o Automomous Power State Transition
o Host Memory Buffer
o Timestamp
o Keep Alive Timer
o Host Controlled Thermal Management
o Non-Operational Power State Config

Also note that feature codes 0x78-0x7f are reserved for the NVMe
Management Interface.

Sponsored by: Netflix
2017-08-25 21:38:29 +00:00
Warner Losh
0012e436e3 Use _Static_assert
These files are compiled in userland too, so we can't use sys/systm.h
and rely on CTASSERT. Switch to using _Static_assert instead.

MFC After: 3 days
Sponsored by: Netflix
2017-08-25 04:33:06 +00:00
Warner Losh
0c26c1992f Sanity check sizes
Add compile time sanity checks to make sure that packed structures are
the proper size, typically as defined in the NVMe standard.
2017-08-25 04:05:53 +00:00
Warner Losh
8a5d94f94d Make nvd vs nda choice boot-time rather than build-time
Introduce hw.nvme.use_nvd tunable. This tunable allows both nvd and
nda to be installed in the kernel, while allowing only one of them to
create devices. This is an all-or-nothing setting, and you can't
change it after boot-time. However, it will allow easier A/B testing.

Differential Revision: https://reviews.freebsd.org/D11825
2017-08-04 03:40:01 +00:00
Warner Losh
594ffc03cd Add new definitions for namespaces.
Sponsored by: Netflix
Submitted by: Matt Williams (via D11330)
2017-06-27 20:24:39 +00:00
Warner Losh
05ee702af6 cwd10 takes the low 32-bits and cwd11 takes the upper 32-bits of the
lba. Rather than do a cast to uint64_t, which clang warns might be
unaligned, do the stores 32-bits at a time.

Sponsored by: Netflix
2017-03-07 23:02:59 +00:00
Warner Losh
0cf14228c4 Implement HGST Log page 0xc1, as documented in the HGST SN100 and
SN150 product manuals. Subpage 0x32 is documented, but not implemented.

Sponsored by: Netflix, Inc
2016-11-19 17:13:08 +00:00
Warner Losh
ab1dd0917b Print Intel's expanded Temperature log page.
Sponsored by: Netflix, Inc
2016-11-19 17:13:03 +00:00
Warner Losh
d01f26f590 Add log pages that Intel SSDs provide. It turns out that many of these
are widely implemented beyond just Intel drives.

Sponsored by: Netflix, Inc
2016-11-19 17:12:58 +00:00
Warner Losh
aea528795a Add log pages defined through NVM Express 1.2.1.
Sponsored by: Netflix, Inc
2016-11-19 17:12:53 +00:00
Warner Losh
dc58cdf95e Expand the SMART / Health Information Log Page (Page 02) printout
based on NVM Express 1.2.1 Standard.

Sponsored by: Netflix, Inc
2016-11-19 17:12:49 +00:00
Scott Long
a498975ef7 Implement crashdump support on NVME
MFC after:	3 days
Sponsored by:	Netflix, Inc.
2016-07-19 03:13:51 +00:00
Warner Losh
f24c011beb Commit the bits of nda that were missed. This should fix the build.
Approved by: re@
2016-06-10 06:04:53 +00:00
Alexander Motin
ee7f4d8187 Revert r292074 (by smh): Limit stripesize reported from nvd(4) to 4K
I believe that this patch handled the problem from the wrong side.
Instead of making ZFS properly handle large stripe sizes, it made
unrelated driver to lie in reported parameters to workaround that.

Alternative solution for this problem from ZFS side was committed at
r296615.

Discussed with:	smh
2016-03-10 17:13:10 +00:00
Warner Losh
038659e7dd Implement power command to list all power modes, find out the power
mode we're in and to set the power mode.
2016-01-30 22:48:06 +00:00
Steven Hartland
fdf16a68ab Limit stripesize reported from nvd(4) to 4K
Intel NVMe controllers have a slow path for I/Os that span a 128KB stripe boundary but ZFS limits ashift, which is derived from d_stripesize, to 13 (8KB) so we limit the stripesize reported to geom(8) to 4KB.

This may result in a small number of additional I/Os to require splitting in nvme(4), however the NVMe I/O path is very efficient so these additional I/Os will cause very minimal (if any) difference in performance or CPU utilisation.

This can be controller by the new sysctl kern.nvme.max_optimal_sectorsize.

MFC after:	1 week
Sponsored by:	Multiplay
Differential Revision:	https://reviews.freebsd.org/D4446
2015-12-11 02:06:03 +00:00
Jim Harris
fdbd3d8068 nvd, nvme: report stripesize through GEOM disk layer
MFC after:	3 days
Sponsored by:	Intel
2015-10-30 16:35:18 +00:00
Jim Harris
992db80f1d Extend some 32-bit fields and variables to 64-bit to prevent overflow
when calculating stats in nvmecontrol perftest.

Sponsored by:	Intel
Reported by:	Joe Golio <joseph.golio@emc.com>
Reviewed by:	carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 15:47:22 +00:00
Jim Harris
a40e72a695 Add driver-assisted striping for upcoming Intel NVMe controllers that can
benefit from it.

Sponsored by:	Intel
Reviewed by:	kib (earlier version), carl
Approved by:	re (hrs)
MFC after:	1 week
2013-10-08 15:44:04 +00:00
Jim Harris
56183abc2b Send a shutdown notification in the driver unload path, to ensure
notification gets sent in cases where system shuts down with driver
unloaded.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:47:08 +00:00
Jim Harris
38441bd9a9 Add message when nvd disks are attached and detached.
As part of this commit, add an nvme_strvis() function which borrows
heavily from cam_strvis().  This will allow stripping of
leading/trailing whitespace and also handle unprintable characters
in model/serial numbers.  This function goes into a new nvme_util.c
file which is used by both the driver and nvmecontrol.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-07-19 21:40:57 +00:00
Jim Harris
e8f25c6266 Define constants for the lengths of the serial number, model number
and firmware revision in the controller's identify structure.

Also modify consumers of these fields to ensure they only use the
specified number of bytes for their respective fields.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-07-17 23:23:38 +00:00
Jim Harris
66619178b5 Fix a poorly worded comment in nvme(4).
MFC after:	3 days
2013-07-11 15:02:38 +00:00
Jim Harris
e9efbc134f Update copyright dates.
MFC after:	3 days
2013-07-09 21:22:17 +00:00
Jim Harris
49fac6101d Add firmware replacement and activation support to nvmecontrol(8) through
a new firmware command.

NVMe controllers may support up to 7 firmware slots for storing of
different firmware revisions.  This new firmware command supports
firmware replacement (i.e. firmware download) with or without immediate
activation, or activation of a previously stored firmware image.  It
also supports selection of the firmware slot during replacement
operations, using IDENTIFY information from the controller to
check that the specified slot is valid.

Newly activated firmware does not take effect until the new controller
reset, either via a reboot or separate 'nvmecontrol reset' command to the
same controller.

Submitted by:	Joe Golio <joseph.golio@emc.com>
Obtained from:	EMC / Isilon Storage Division
MFC after:	3 days
2013-06-27 00:08:25 +00:00
Jim Harris
8d09e3c400 Use MAXPHYS to specify the maximum I/O size for nvme(4).
Also allow admin commands to transfer up to this maximum I/O size, rather
than the artificial limit previously imposed.  The larger I/O size is very
beneficial for upcoming firmware download support.  This has the added
benefit of simplifying the code since both admin and I/O commands now use
the same maximum I/O size.

Sponsored by:	Intel
MFC after:	3 days
2013-06-26 23:27:17 +00:00
Jim Harris
5076698e19 Remove the NVME_IDENTIFY_CONTROLLER and NVME_IDENTIFY_NAMESPACE IOCTLs and replace
them with the NVMe passthrough equivalent.

Sponsored by:	Intel
2013-04-12 17:56:47 +00:00
Jim Harris
7c3f19d7bb Add support for passthrough NVMe commands.
This includes a new IOCTL to support a generic method for nvmecontrol(8) to pass
IDENTIFY, GET_LOG_PAGE, GET_FEATURES and other commands to the controller, rather than
separate IOCTLs for each.

Sponsored by:	Intel
2013-04-12 17:52:17 +00:00
Jim Harris
5fdf9c3c8e Add unmapped bio support to nvme(4) and nvd(4).
Sponsored by:	Intel
2013-04-01 16:23:34 +00:00
Jim Harris
232e2edb6c Add the ability to internally mark a controller as failed, if it is unable to
start or reset.  Also add a notifier for NVMe consumers for controller fail
conditions and plumb this notifier for nvd(4) to destroy the associated
GEOM disks when a failure occurs.

This requires a bit of work to cover the races when a consumer is sending
I/O requests to a controller that is transitioning to the failed state.  To
help cover this condition, add a task to defer completion of I/Os submitted
to a failed controller, so that the consumer will still always receive its
completions in a different context than the submission.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:58:38 +00:00
Jim Harris
0d7e13ecb2 Pass associated log page data to async event consumers, if requested.
Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:08:32 +00:00
Jim Harris
0692579bf3 Add structure definitions and controller command function for firmware
log pages.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:03:03 +00:00
Jim Harris
0892778256 Add structure definitions and a controller command function for
error log pages.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:01:53 +00:00
Jim Harris
cf81529ce3 Create struct nvme_status.
NVMe error log entries include status, so breaking this out into
its own data structure allows it to be included in both the
nvme_completion data structure as well as error log entry data
structures.

While here, expose nvme_completion_is_error(), and change all of
the places that were explicitly looking at sc/sct bits to use this
macro instead.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 21:00:18 +00:00
Jim Harris
dbba74428b Add API for nvme consumers to access controller and namespace identify data.
Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 19:52:57 +00:00
Jim Harris
b846efd7ec Add controller reset capability to nvme(4) and ability to explicitly
invoke it from nvmecontrol(8).

Controller reset will be performed in cases where I/O are repeatedly
timing out, the controller reports an unrecoverable condition, or
when explicitly requested via IOCTL or an nvme consumer.  Since the
controller may be in such a state where it cannot even process queue
deletion requests, we will perform a controller reset without trying
to clean up anything on the controller first.

Sponsored by:	Intel
Reviewed by:	carl
2013-03-26 19:50:46 +00:00
Jim Harris
5f1e251de6 Create a generic nvme_ctrlr_cmd_get_log_page function, and change the
health information log page function to use it.

Sponsored by:	Intel
2013-03-26 18:43:53 +00:00
Jim Harris
99d99f7408 Expose the get/set features API to nvme consumers.
Sponsored by:	Intel
2013-03-26 18:42:05 +00:00
Jim Harris
038a5ee403 Add an interface for nvme shim drivers (i.e. nvd) to register for
notifications when new nvme controllers are added to the system.

Sponsored by:	Intel
2013-03-26 18:39:54 +00:00
Jim Harris
0a0b08cc30 Enable asynchronous event requests on non-Chatham devices.
Also add logic to clean up all outstanding asynchronous event requests
when resetting or shutting down the controller, since these requests
will not be explicitly completed by the controller itself.

Sponsored by:	Intel
2013-03-26 18:37:36 +00:00
Jim Harris
0f71ecf741 Add ability to queue nvme_request objects if no nvme_trackers are available.
This eliminates the need to manage queue depth at the nvd(4) level for
Chatham prototype board workarounds, and also adds the ability to
accept a number of requests on a single qpair that is much larger
than the number of trackers allocated.

Sponsored by:	Intel
2012-10-18 00:45:53 +00:00
Jim Harris
9eb93f2976 Add return codes to all functions used for submitting commands to I/O
queues.

Sponsored by:	Intel
2012-10-18 00:32:07 +00:00