Commit Graph

379 Commits

Author SHA1 Message Date
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
John Baldwin
081c22db85 nvme.h: Fix a comment typo in admin opcode enum
Sponsored by:	Chelsio Communications
2023-08-15 11:06:58 -07:00
Warner Losh
33469f1011 nvme: use mtx_padalign instead of mtx + alignment attribute
nvme driver predates, it seems, mtx_padalign. Modernize.

Sponsored by:		Netflix
2023-08-14 16:33:26 -06:00
Warner Losh
09c20a2932 nvme: Move bools to fill hole
The two bools in nvme_request create a 6 byte hole today. Move them to
after retries to fill the 4 byte hole there and add a spare[2] to make
nvme_request 8 bytes smaller. spare[2] isn't strictly necessary, but
documents how many bytes we have left in that hole, as the number of
booleans will increase shortly.

Suggested by:		chuck
Sponsored by:		Netflix
2023-08-08 11:44:51 -06:00
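The hole-filling described in 09c20a2932 can be illustrated with a small sketch. The two structs below are hypothetical stand-ins, not the real nvme_request layout; the field names other than spare[2] are assumptions, and the 8-byte saving only holds on an LP64 ABI with 8-byte pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, NOT the real struct nvme_request. */
struct req_before {
	void	*cb_arg;
	bool	 timeout;	/* two 1-byte bools ... */
	bool	 is_admin;	/* ... then padding up to pointer alignment */
	void	*payload;
	int32_t	 retries;	/* followed by tail padding on LP64 */
};

struct req_after {
	void	*cb_arg;
	void	*payload;
	int32_t	 retries;
	bool	 timeout;	/* bools now occupy the bytes after retries */
	bool	 is_admin;
	uint8_t	 spare[2];	/* documents the bytes still free in the hole */
};

static_assert(sizeof(struct req_after) <= sizeof(struct req_before),
    "reordering must not grow the struct");

/* Bytes saved by the reordering on the current ABI. */
static size_t
bytes_saved(void)
{
	return (sizeof(struct req_before) - sizeof(struct req_after));
}
```

On a typical LP64 target the first layout occupies 32 bytes and the second 24, which mirrors the "8 bytes smaller" claim in the message.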
Warner Losh
2ad9a815fd nvme: Directly lookup op code
Rather than have a table to walk through, use a sparse array.

Suggested by:		jhb
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D41353
2023-08-07 16:44:32 -06:00
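The sparse-array lookup mentioned in 2ad9a815fd might look roughly like the sketch below. The table name, the helper name, and the handful of opcodes shown are illustrative assumptions and do not reproduce the driver's actual tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sparse table indexed directly by opcode.  Entries that
 * are not listed are implicitly NULL.
 */
static const char *const opcode_names[256] = {
	[0x00] = "FLUSH",
	[0x01] = "WRITE",
	[0x02] = "READ",
	[0x08] = "WRITE ZEROES",
	[0x09] = "DATASET MANAGEMENT",
};

static_assert(sizeof(opcode_names) / sizeof(opcode_names[0]) == 256,
    "every uint8_t opcode must index in range");

/* O(1) lookup: index the table, fall back for unlisted opcodes. */
static const char *
opcode_string(uint8_t opc)
{
	const char *name = opcode_names[opc];

	return (name != NULL ? name : "UNKNOWN");
}
```

Sizing the table to 256 entries means any 8-bit opcode is a valid index, so no bounds check or table walk is needed.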
Warner Losh
63b0c00eb0 nvme: Update comment
Fix comment to note we should grab additional data from the error log
page, but don't currently (it's unclear if we should do that here
and other places in nvd that want it, or if we should let nvd / the
nda periph make the request).

Sponsored by:		Netflix
Reviewed by:		chuck, mav, jhb
Differential Revision:	https://reviews.freebsd.org/D41315
2023-08-07 16:44:31 -06:00
Warner Losh
95cd10f139 nvme: Add comments about other fields in status
When manually completing an I/O, we do so because we have no status back
from the card. Note M, CRD and P are all 0 because this is an artificial
event (and phase isn't checked when it's completed this way). There's no
MORE information in the error log page and there's no delayed retry
(CRD=0) and we don't currently request CRD to be set to anything other
than 0 and thus don't implement delayed retry.

Sponsored by:		Netflix
Reviewed by:		chuck, mav, jhb
Differential Revision:	https://reviews.freebsd.org/D41314
2023-08-07 16:44:31 -06:00
Warner Losh
a510dbc848 nvme: Be less verbose when cancelling I/O or admin commands
When we're resetting, and there's outstanding I/O that we're cancelling,
only report we're cancelling the I/O once rather than once per
I/O. Likewise when we reschedule the I/O. We don't need to say for each
one that we're cancelling/rescheduling something, and then report the
I/O that we're doing. Likewise with cancelling admin commands (we never
retry them here, so a similar change isn't needed).

Sponsored by:		Netflix
Reviewed by:		chuck, mav
Differential Revision:	https://reviews.freebsd.org/D41313
2023-08-07 16:44:31 -06:00
Warner Losh
ac8c866fda nvme: Add more NVME Base Spec 2.0 and NVME Command Set Spec 1.0a
Add admin commands capacity management, lockdown and fabrics commands.
Add I/O copy command.

Sponsored by:		Netflix
Reviewed by:		chuck, mav, jhb
Differential Revision:	https://reviews.freebsd.org/D41311
2023-08-07 16:44:31 -06:00
Warner Losh
edd23e4dc0 nvme: Eliminate redundant code
get_admin_opcode_string and get_io_opcode_string are identical, but
start with different tables. Use a helper routine that takes an argument
to implement these instead. A future commit will refine this further.

Sponsored by:		Netflix
Reviewed by:		chuck, mav, jhb
Differential Revision:	https://reviews.freebsd.org/D41310
2023-08-07 16:44:31 -06:00
Warner Losh
7be0b06885 nvme: Remove duplicate command printing routine
Both nvme_dump_command and nvme_qpair_print_command print nvme
commands. The latter is better. Recode the one call to
nvme_dump_command to use nvme_qpair_print_command and delete the
former. No sense having two nearly identical routines. A future commit
will convert to sbuf.

Sponsored by:		Netflix
Reviewed by:		chuck, mav, jhb
Differential Revision:	https://reviews.freebsd.org/D41309
2023-08-07 16:44:30 -06:00
Warner Losh
6f76d49386 nvme: Remove duplicate completion printing routine
Both nvme_dump_completion and nvme_qpair_print_completion print
completions. The latter is better. Recode the two instances of
nvme_dump_completion to use nvme_qpair_print_completion and delete the
former. No sense having two nearly identical routines. A future commit
will convert this to sbuf.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D41308
2023-08-07 16:44:30 -06:00
Vladimir Kondratyev
fc14525044 nvme(4): detect S3X NVMe controller in 2016-2017 MacBooks
Adds support for detection of the S3X NVMe controller found in the
13" MacBook Pro 2017 without Touch Bar (MacBook14,1).
It is known to be used in the following MacBooks:
- Retina MacBook 2016 (MacBook9,1)
- 13" MacBook Pro 2016 without Touch Bar (MacBook13,1)
- 13" MacBook Pro 2016 with Touch Bar (MacBook13,2)
2023-07-31 17:33:14 +03:00
John Baldwin
92103adbeb nvme: Use a memdesc for the request buffer instead of a bespoke union.
This avoids encoding CAM-specific knowledge in nvme_qpair.c.

Reviewed by:	chuck, imp, markj
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D41119
2023-07-24 10:32:58 -07:00
Warner Losh
774ab87cf2 cam: Add CAM_NVME_STATUS_ERROR error code
Add CAM_NVME_STATUS_ERROR error code. Flag all NVME commands that
completed with an error status as CAM_NVME_STATUS_ERROR (a new value)
instead of CAM_REQ_CMP_ERR. This indicates to the upper layers of CAM
that the 'cpl' field for nvmeio CCBs is valid and can be examined for
error recovery, if desired.

No functional change. nda will still see these as errors, call
ndaerror() to get the error recovery action, etc. cam_periph_error will
select the same case as before (even w/o the change, though the change
makes it explicit).

Sponsored by:		Netflix
Reviewed by:		chuck, mav, jhb
Differential Revision:	https://reviews.freebsd.org/D41085
2023-07-20 22:32:31 -06:00
John Baldwin
5ae4463498 nvme: Fix typo in "Command Aborted by Host" constant name.
Reviewed by:	chuck, imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D40763
2023-06-27 10:06:22 -07:00
John Baldwin
9c2203a691 nvme: Tidy up transfer rate settings in XPT_GET_TRAN_SETTINGS.
- Replace a magic number with CTS_NVME_VALID_SPEC.

- Set the transport and protocol versions the same as for XPT_PATH_INQ.

Probably we shouldn't bother with setting the version in the 'spec'
member of ccb_trans_settings_nvme at all and use the transport
and/or protocol version field instead.

Reviewed by:	chuck, imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D40616
2023-06-26 20:32:29 -07:00
Warner Losh
bdc81eeda0 nvme: Switch to nda by default
We already run nda by default on all the !x86 architectures. Switch the
default to nda. nda creates nvd compatibility links by default, so this
should be a nop. If this causes problems for your application, set
hw.nvme.use_nvd=1 in your loader.conf.

Sponsored by:		Netflix
2023-06-12 21:41:06 -06:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Alexander Motin
49ebbdb264 Add NAMESPACE MANAGEMENT into admin_opcode[].
MFC after: 1 week
2023-03-08 15:42:31 -05:00
Dag-Erling Smørgrav
9a5acf365d nvme: Clear the notify flag if the consumer rejects the controller.
While here, fix some type mismatch warnings.

Reviewed by:	imp
Sponsored by:	Netapp, Inc.
Sponsored by:	Klara, Inc.
MFC after:	1 week
2022-12-20 02:53:38 +01:00
Wanpeng Qian
8ab99dbea1 bhyve: abort and return FEATURE_NOT_SAVEABLE while set feature with a save flag for NVMe controller.
Currently bhyve's NVMe controller cannot save feature values across
reboots. It should return a FEATURE_NOT_SAVEABLE error when the command
specifies a save flag.

Quote from NVMe specification, page 205:

https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf

If the Feature Identifier specified in the Set Features command is not
saveable by the controller and the controller receives a Set Features
command with the Save bit set to one, then the command shall be aborted
with a status of Feature Identifier Not Saveable.

Reviewed by:		chuck (older version)
Approved by:		manu (mentor)
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D32767
2022-11-15 07:48:24 +01:00
Alexander Motin
2a31a06bf1 Add random VMware device IDs.
Just to make dmesg look nicer there.

MFC after:	1 week
2022-10-20 10:19:24 -04:00
Warner Losh
4982884b99 nvme: Always set deadline to max
When a transaction is on the outstanding list, it needs to have a valid
timeout value, so set it to infinity before placing it on the
list. Place before we put it on the list, even though the list is
protected by the qpair lock.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D36920
2022-10-11 12:51:32 -06:00
Alexander Motin
a69c096462 nvme: Print CRD, M and DNR status bits on errors.
It may help with debugging some issues.

MFC after:	1 week
2022-08-05 10:58:19 -04:00
Gordon Bergling
6e8ab6715d nvme(4): Fix a typo in a source code comment
- s/inaccessable/inaccessible/

MFC after:	3 days
2022-06-04 11:46:03 +02:00
John Baldwin
1093caa1bb nvme: Remove unused devclass arguments to DRIVER_MODULE.
2022-05-06 15:46:55 -07:00
John Baldwin
82496a256f nvme: Use devclass_find to lookup the nvme devclass.
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D34995
2022-04-21 10:29:14 -07:00
Warner Losh
0fd4cd405b nvme: Use controller's page size instead of PAGE_SIZE to create qpair
When constructing qpair, use the controller's notion of page size rather
than the host's PAGE_SIZE. Currently, these are both 4k, but the arm 16k
page size support requires decoupling.

There's a "hidden" PAGE_SIZE in btoc, so we must change btoc(x) to
howmany(x, ctrlr->page_size) to properly count the number of pages (in
the drive's world view) that are needed for various calculations.

With these changes, the nvme driver operates at production level load
for both host 4k and host 16k page size.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D34873
2022-04-15 14:46:19 -06:00
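The page-counting change in 0fd4cd405b can be sketched as a round-up division by the controller's page size. HOWMANY below is a local copy of the usual round-up idiom, and ctrlr_pages is a hypothetical helper rather than a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Local round-up division, standing in for howmany() from sys/param.h. */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/*
 * Number of controller-sized pages needed to cover `bytes`.  Assumes
 * bytes + ctrlr_page_size - 1 does not overflow uint32_t.
 */
static uint32_t
ctrlr_pages(uint32_t bytes, uint32_t ctrlr_page_size)
{
	assert(ctrlr_page_size != 0);
	return (HOWMANY(bytes, ctrlr_page_size));
}
```

The same 16 KiB transfer counts as one page for a 16 KiB host but four pages for a drive running with 4 KiB pages, which is why the divisor has to be the controller's page size.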
Warner Losh
c5ed67dc90 nvme: Prefer nvme_printf to printf when reporting formatting error
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D34872
2022-04-15 14:46:19 -06:00
Warner Losh
3740a8db13 nvme: Further refinements in Host Memory Buffer Sizing
Host Memory Buffer units are a mix. For those in the identify structure,
the size is in 4kiB chunks. For specifying the buffer description,
though, they are in terms of the drive's MPS. Add comments to this
effect and change PAGE_SIZE to ctrlr->page_size where needed, as well as
correct a mistaken use of NVME_HPS_UNITS in 214df80a9c as pointed out
by rpokala@ after the commit. No functional change is intended, as
page_size is still 4k which matches all current hosts' PAGE_SIZE, but to
support 16k pages on arm, we need to differentiate these two cases.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D34871
2022-04-15 14:46:19 -06:00
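The two unit systems that 3740a8db13 distinguishes can be sketched as follows. HMB_UNITS and both helper names are hypothetical stand-ins chosen for this example; only the 4 KiB unit and the "drive's MPS" unit come from the message.

```c
#include <assert.h>
#include <stdint.h>

#define HMB_UNITS	4096u	/* stand-in for the fixed 4 KiB unit */

/* Identify-structure HMB sizes are counted in fixed 4 KiB units. */
static uint64_t
hmb_units_to_bytes(uint32_t units)
{
	return ((uint64_t)units * HMB_UNITS);
}

/* Buffer descriptors count pages of the controller's MPS instead. */
static uint64_t
hmb_bytes_to_ctrlr_pages(uint64_t bytes, uint32_t ctrlr_page_size)
{
	assert(ctrlr_page_size != 0 && bytes % ctrlr_page_size == 0);
	return (bytes / ctrlr_page_size);
}
```

With a 4 KiB controller page size the two counts coincide, which is why mixing them up causes no functional change today; with 16 KiB pages they differ by a factor of four.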
Warner Losh
3086efe895 nvme: Remove NVME_MAX_XFER_SIZE, replace inline calculation
NVME_MAX_XFER_SIZE used to be a constant (back when MAXPHYS was a
constant) to denote the smaller of MAXPHYS or the largest PRP we could
encode with our preallocation scheme. However, it's no longer constant
since MAXPHYS varies at runtime. In addition, the actual maximum is now
based on the drive's currently in use page_size, which is also a runtime
expression. As such, remove the define and expand it inline in the one
place it's still used in the tree.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D34870
2022-04-15 14:46:18 -06:00
Warner Losh
3a468f2010 nvme: Use saved mps when initializing drive
Make sure we set the MPS we cached (currently the drive's minimum mps) in
CC (Controller Configuration) when reinitializing the drive. It must
match the page_size that we're going to use. Also retire less specific
NVME_PAGE_SHIFT since it's now unused.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D34869
2022-04-15 14:46:18 -06:00
Warner Losh
55412ef90a nvme: Rename min_page_size to page_size and save mps
The Memory Page Size sets the basic unit of operation for the drive. We
currently set this to the drive's minimum page size, but we could set it
to any page size the drive supports in the future. Replace min_page_size
(it's now unused for that purpose) with page_size to reflect this and
cache the MPS we want to use. Use NVME_MPS_SHIFT to compute page_size.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D34868
2022-04-15 14:46:18 -06:00
Warner Losh
6e3deec8ca nvme: Base maximum data transfer size directly on MPSMIN in cap_hi
Calculate the maximum transfer size based on the MPSMIN we have in our
cached copy of cap_hi rather than using min_page_size in the controller.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D34867
2022-04-15 14:46:18 -06:00
Warner Losh
a7218e7a6b nvme: Fix old intel alignment size
The intel raid stripe alignment parameter is based on CAP.MPSMIN, so use
that directly now that we have it available.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D34866
2022-04-15 14:46:18 -06:00
Warner Losh
e66c1b5185 nvme: Define NVME_MPS_SHIFT
The memory page size (MPS) is expressed as 2^(number + 12) bytes,
and other items in the system inherit this. Create a define rather than
sprinkling 12 everywhere.

Sponsored by:		Netflix
Reviewed by:		chuck
Differential Revision:	https://reviews.freebsd.org/D34865
2022-04-15 14:46:18 -06:00
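A minimal sketch of how a define like the one in e66c1b5185 is used to turn an MPS value into a byte count. The value 12 follows from the "2^(number + 12)" wording above; page_size_from_mps is a hypothetical helper, and the 4-bit range check is an assumption about the field width.

```c
#include <assert.h>
#include <stdint.h>

#define NVME_MPS_SHIFT	12	/* name from the commit; value per its text */

/* Page size in bytes for a given MPS field value. */
static uint32_t
page_size_from_mps(uint32_t mps)
{
	assert(mps <= 15);	/* assumed 4-bit field */
	return (UINT32_C(1) << (NVME_MPS_SHIFT + mps));
}
```

An MPS of 0 therefore means 4 KiB pages, 2 means 16 KiB, and 4 means 64 KiB.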
Gordon Bergling
dfa01f4f98 nvme(4): Fix a typo in a source code comment
- s/is is/is/

MFC after:	3 days
2022-04-09 09:24:34 +02:00
Warner Losh
214df80a9c nvme: new define for size of host memory buffer sizes
The nvme spec defines the various fields that specify sizes for host
memory buffers in terms of 4096-byte chunks. So, rather than use a bare 4096
here, use NVME_HMB_UNITS. This is explicitly not the host page size of
4096, nor the default memory page size (mps) of the NVMe drive, but its
own thing and needs its own define.

No functional change is intended, only the logical spelling of 4k.

Sponsored by:		Netflix
2022-04-08 23:05:25 -06:00
Warner Losh
161fcf7994 nvme: Publish the drive's capabilities
Add cap_lo and cap_hi sysctl to each nvme drive. This publishes the raw
capabilities of the drive. Until now, these could only be discovered with
bootverbose.

Sponsored by:		Netflix
2022-03-31 21:13:16 -06:00
Warner Losh
6af6a52ee4 nvme: Save cap_lo and cap_hi
Save the capabilities for the drive.

Sponsored by:		Netflix
2022-03-31 21:12:38 -06:00
Warner Losh
a70b5660f3 nvme: MPS is a power of two, not a size / 8k
Setting MPS in the CC should be a power of 2 number (it specifies the
page size of the host is 2^(12+MPS)), so adjust the calculation. There is
no functional change because we do not support any architectures != 4k
pages (yet). Other changes are needed for architectures with 16k or 64k
pages, especially when the underlying NVMe drive doesn't support that
page size (Most drives support a range that's small, and many only
support 4k), but let's at least do this calculation correctly. 12 - 12
is just as much 0 as 4096 >> 13 is :)

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D34707
2022-03-31 21:12:38 -06:00
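The difference between the two calculations in a70b5660f3 only shows up for larger page sizes, as this sketch illustrates. Both function names are hypothetical; the first implements the power-of-two encoding the message describes, the second the "size / 8k" form it replaces.

```c
#include <assert.h>
#include <stdint.h>

/* Encoding described in the commit: page size is 2^(12 + MPS). */
static uint32_t
mps_from_page_size(uint32_t page_size)
{
	uint32_t shift = 0;

	/* Only powers of two of at least 4 KiB are encodable. */
	assert(page_size >= 4096 && (page_size & (page_size - 1)) == 0);
	while ((page_size >> shift) > 1)
		shift++;		/* shift ends up as log2(page_size) */
	return (shift - 12);
}

/* The "size / 8k" form the commit replaces, kept here for comparison. */
static uint32_t
mps_old_style(uint32_t page_size)
{
	return (page_size >> 13);
}
```

For 4096 both functions return 0, matching the "12 - 12 is just as much 0 as 4096 >> 13" remark. For 65536 the power-of-two form gives 4, while the old form gives 8.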
Chuck Tuffli
c2318cf80a nvme: fix spelling of Namespace
Fix spelling of a macro definition.

Reviewed by:	mav, imp
Differential Revision:	https://reviews.freebsd.org/D34330
2022-02-21 10:34:46 -08:00
Chuck Tuffli
e71afa1202 nvme: Add OAES bit-field definitions
Create definitions for the Optional Asynchronous Events Supported (OAES)
values. Also adds a helper macro for the common use case of "mask and
shift". E.g.
    value = NVME_CTRLR_DATA_OAES_NS_ATTR_MASK << NVME_CTRLR_DATA_OAES_NS_ATTR_SHIFT;
becomes
    value = NVMEB(NVME_CTRLR_DATA_OAES_NS_ATTR);

Reviewed by:	mav, imp
Differential Revision:	https://reviews.freebsd.org/D34300
2022-02-21 10:34:14 -08:00
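One plausible shape for the "mask and shift" helper in e71afa1202 is a token-pasting macro. The definition below is a guess based on the before/after lines in the message, and the shift and mask values are assumptions for illustration; nvme.h holds the real ones. oaes_has_ns_attr is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; check nvme.h for the real ones. */
#define NVME_CTRLR_DATA_OAES_NS_ATTR_SHIFT	8
#define NVME_CTRLR_DATA_OAES_NS_ATTR_MASK	0x1u

/* Guess at the helper's shape: paste the suffixes onto a common prefix. */
#define NVMEB(name)	(name##_MASK << name##_SHIFT)

static_assert(NVMEB(NVME_CTRLR_DATA_OAES_NS_ATTR) ==
    (NVME_CTRLR_DATA_OAES_NS_ATTR_MASK << NVME_CTRLR_DATA_OAES_NS_ATTR_SHIFT),
    "macro must expand to the long-hand form");

/* Test whether the namespace-attribute bit is set in an OAES value. */
static int
oaes_has_ns_attr(uint32_t oaes)
{
	return ((oaes & NVMEB(NVME_CTRLR_DATA_OAES_NS_ATTR)) != 0);
}
```

Because the macro takes only the common prefix, the mask and shift names cannot be mismatched by accident at the call site.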
Alexander Motin
b3c9b6060f nvme: Do not rearm timeout for commands without one.
Admin queues almost always have several ASYNC_EVENT_REQUEST outstanding.
They have no timeouts, but their presence in qpair->outstanding_tr caused
useless timeout callout rearming twice a second.

While there, relax timeout callout period from 0.5s to 0.5-1s to improve
aggregation.  Command timeouts are measured in seconds, so we don't need
to be precise here.

Reviewed by:	imp
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33781
2022-01-07 12:59:16 -05:00
Warner Losh
8f07932272 nvme_sim: Only report PCI related stats when we can
For AHCI attached devices, we report the location and identification
information of the AHCI controller that we're attached to. We also
don't report link speed in that case, since we can't get to the PCIe
config space registers to find that out.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33287
2021-12-06 10:23:40 -07:00
Warner Losh
7cf8d63c88 nvme_ahci: Mark AHCI devices as such in the controller
Add a quirk to flag AHCI attachment to the controller. This is for any
of the strategies for attaching nvme devices as children of the AHCI
device for Intel's RAID devices. This also has a side effect of cleaning
up resource allocation from failed nvme_attach calls now.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33285
2021-12-06 10:23:40 -07:00
Warner Losh
053f8ed6eb nvme: Move to a quirk for the Intel alignment data
Prior to NVMe 1.3, Intel produced a series of drives that had
performance alignment data in the vendor specific space since no
standard had been defined. Move testing the versions to a quirk so the
NVMe NS code doesn't know about PCI device info.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D33284
2021-12-06 10:23:40 -07:00
Gordon Bergling
5f8ccf6515 nvme(4): Correct a typo in a sysctl description
- s/printting/printing/

MFC after:	3 days
2021-11-30 10:26:25 +01:00