20 Commits

Author SHA1 Message Date
Chuck Tuffli
31b67520d4 bhyve: update the NVMe CQ based on the status
Instead of skipping the NVMe Completion Queue update based on the
opcode, define a synthetic status value which indicates the completion
queue entry is invalid. This will also allow deferred completion queue
updates for other commands.

Also returns the correct status for unrecognized opcodes ("invalid
opcode").

Reviewed by:	imp, jhb, araujo
Approved by:	imp (mentor), jhb (maintainer)
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D20945
2019-07-17 03:19:30 +00:00
Chuck Tuffli
409a80e5a4 bhyve: Create EUI64 for NVMe namespaces
Accept an IEEE Extended Unique Identifier (EUI-64) from the command
line for each NVMe namespace. If one isn't provided, it will create one
based on the CRC16 of:
 - the FreeBSD IEEE OUI
 - PCI bus, device/slot, function values
 - Namespace ID

Reviewed by:	imp, araujo, jhb, rgrimes
Approved by:	imp (mentor), jhb (maintainer)
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D19905
2019-07-13 12:48:28 +00:00
Sean Chittenden
e47c192236 usr.sbin/bhyve: unconditionally initialize the NVMe completion status
Follow-up work to improve the handling of unsupported/invalid opcodes
is being developed by chuck@.

Coverity CID:	1398928
Reviewed by:	chuck
Approved by:	araujo, imp
Differential Revision:	https://reviews.freebsd.org/D20914
2019-07-12 05:53:13 +00:00
Chuck Tuffli
129f93c5a7 bhyve: Add PCIe Integrated Endpoint capability
The NVMe CAM driver reports the PCIe Link Capability and Status for
devices. For emulated bhyve NVMe devices, this looks like:

nda0: nvme version 1.3 x63 (max x63) lanes PCIe Gen15 (max Gen15) link

The driver outputs this because the emulated device doesn't include the
PCIe Capability structure. The NVMe specification requires these
registers, so the fix is to add this set of capability registers to the
emulated device.

Note that PCI Express devices that are integrated into the Root Complex
(i.e. Bus 0x0) do not have to support the Link Capability or Status
registers. Windows will fail to start (i.e. Code 10) devices that appear
to be part of the Root Complex but report being a PCI Express Endpoint.
So also add a check to pci_emul_add_pciecap() to check if the device is
integrated and change the device type.

Reviewed by:	imp, ken, araujo, jhb, rgrimes
Approved by:	imp (mentor), ken (mentor), jhb (maintainer)
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D19904
2019-06-07 17:09:49 +00:00
Chuck Tuffli
a1daa3ae5e bhyve: Fix NVMe data structure copy to guest
bhyve's NVMe emulation was transferring Identify data back to the guest
incorrectly causing memory corruptions. These corruptions resulted in
core dumps and other system level errors in the guest.

In their simplest form, NVMe Physical Region Page (PRP) values in
commands indicate which physical pages to use for data transfer. The
first PRP value is not required to be page aligned but does not cross a
page boundary. The second PRP value must be page aligned, does not cross
a page boundary, and need not be contiguous with PRP1.

The code was copying Identify data past the end of PRP1. This happens to
work if PRP1 and PRP2 are physically contiguous but will corrupt guest
memory in unpredictable ways if they are not.

Fix is to copy the Identify data back to the guest piecewise (i.e. for
each PRP entry). Also fix a similarly wrong problem when copying back
Log page data.

Reviewed by:	imp (mentor), araujo, jhb, rgrimes, bhyve
Approved by:	imp (mentor), bhyve (jhb)
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D19695
2019-04-05 16:54:20 +00:00
Chuck Tuffli
fe1b713e2c bhyve: Fix NVMe BAR size calculation
The NVMe specification defines bits 13:4 of BAR0 as Reserved (i.e. 0x0).
Most drivers do not enforce this, but the Windows NVMe driver does and
will refuse to start the device (i.e. error 10) if any of these bits are
set.

The current BAR size calculation tries to minimize the amount of memory
the device reserves by scaling the BAR size by the maximum number of
queues supported by the device. But unless the device supports a large
number of queue pairs (over 1536), it will reserve too little memory.

The fix is to allocate a minimum of 16K bytes for BAR0.

Tested on Windows Server 2016 and 2019

Reviewed by:	imp (mentor), araujo, jhb, bhyve
Approved by:	imp (mentor), bhyve (jhb)
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D19676
2019-04-05 16:54:16 +00:00
Chuck Tuffli
7bb1073842 Fix bhyve's NVMe Identify Namespace data
The NVMe Identify Namespace data structure's Number of LBA Formats
(NLBAF) field is a 0's based value (i.e. 0x0 means 1). Since the
emulation only supports a single format, set NLBAF to 0x0, not 1.

Reviewed by:	imp, araujo, rgrimes
Approved by:	imp (mentor)
MFC after:      1 week
Differential Revision: https://reviews.freebsd.org/D19579
2019-03-15 02:11:27 +00:00
Chuck Tuffli
fbac7e0ba2 Fix bhyve's NVMe Completion Queue entry values
The function which processes Admin commands was not returning the
Command Specific value in Completion Queue Entry, Dword 0 (CDW0). This
effects commands such as Set Features, Number of Queues which returns
the number of queues supported by the device in CDW0. In this case, the
host will only create 1 queue pair (Number of Queues is zero based).
This also masked a bug in the queue counting logic.

Reviewed by: imp, araujo
Approved by: imp (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D18703
2019-01-04 15:03:35 +00:00
Chuck Tuffli
76e47b9420 Fix bhyve's NVMe queue bookkeeping
Many size / length parameters in NVMe are "0's based", meaning, a value
of 0x0 represents 1, 0x1 represents 2, etc.. While this leads to an
efficient encoding, it can lead to subtle bugs. With respect to queues,
these parameters include:
 - Maximum number of queue entries
 - Maximum number of queues
 - Number of Completion Queues
 - Number of Submission Queues

To be consistent, convert all 0's based values from the host to 1's
based value internally. Likewise, covert internal 1's based values to
0's based values when returned to the host. This fixes an off-by-one bug
when creating IO queues and simplifies some of the code. Note that this
bug is masked by another bug.

While in the neighborhood,
 - fix an erroneous queue ID check (checking CQ count when deleting SQ)
 - check for queue ID of 0x0 in a few places where this is illegal
 - clean up the Set Features, Number of Queues command and check for
   illegal values

Reviewed by: imp, araujo
Approved by: imp (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D18702
2019-01-04 15:03:30 +00:00
Marcelo Araujo
0f6f91a8ce Comestic change to try to inline the memset with SSE/AVX instructions.
Also switch from int to size_t to keep portability.

Reviewed by:	brooks
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D17795
2018-11-07 06:29:01 +00:00
Chuck Tuffli
9544e6dcf1 Make NVMe compatible with the original API
The original NVMe API used bit-fields to represent fields in data
structures defined by the specification (e.g. the op-code in the command
data structure). The implementation targeted x86_64 processors and
defined the bit fields for little endian dwords (i.e. 32 bits).

This approach does not work as-is for big endian architectures and was
changed to use a combination of bit shifts and masks to support PowerPC.
Unfortunately, this changed the NVMe API and forces #ifdef's based on
the OS revision level in user space code.

This change reverts to something that looks like the original API, but
it uses bytes instead of bit-fields inside the packed command structure.
As a bonus, this works as-is for both big and little endian CPU
architectures.

Bump __FreeBSD_version to 1200081 due to API change

Reviewed by: imp, kbowling, smh, mav
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D16404
2018-08-22 04:29:24 +00:00
Marcelo Araujo
1465a1e1eb Fix resource leak when using strdup(3).
Reported by:	Coverity
CID:		1394929
Sponsored by:	iXsystems Inc.
2018-08-21 23:11:26 +00:00
Marcelo Araujo
6b2c20cd98 NVMe spec version 1.3c says that "serial number" field must be 7-bit ASCII,
with unused bytes padded by space characters. Same for firmware number and
namespace number.

Discussed with:	imp@
Sponsored by:	iXsystems Inc.
2018-08-20 04:56:37 +00:00
Marcelo Araujo
b018ea0174 Users must set the number of queues from 1 to maximum 16 queues.
Sponsored by:	iXsystems Inc.
2018-08-20 04:50:11 +00:00
Marcelo Araujo
df90fce298 Fix double mutex lock.
Reported by:	Coverity
CID:		1394833
Discussed with:	Leon Dang
Sponsored by:	iXsystems Inc.
2018-08-20 04:44:29 +00:00
Marcelo Araujo
ec89307fb1 Fix a resource leak when using strdup(3) and also fix few style(9).
Reported by:	Coverity
CID:		1394929
MFC after:	1 week
Sponsored by:	iXsystems Inc.
2018-08-16 06:38:01 +00:00
Marcelo Araujo
3955e1c03a Remove duplicated code.
Reported by:	Coverity
CID:		1394893
MFC after:	1 week
Sponsored by:	iXsystems Inc.
2018-08-16 06:35:44 +00:00
Marcelo Araujo
9e59a2e8ce Add a comment explaining how the PSN works and why there is no need for
a null terminator. Also mark CID 1394825 as intentional.

Reported by:	Coverity
CID:		1394825
MFC after:	1 week
Sponsored by:	iXsystems Inc.
2018-08-16 06:31:54 +00:00
Marcelo Araujo
e30993c2a6 Increase the mask from 15 to 255 or otherwise NVME_FEAT_SOFTWARE_PROGRESS
will never be reached.

Discussed with:	Leon Dang and Darius Mihai <dariusmihaim@gmail.com>
MFC after:	1 week.
Sponsored by:	iXsystems Inc.
2018-08-16 06:20:25 +00:00
Marcelo Araujo
c066c68c57 - Add bhyve NVMe device emulation.
The initial work on bhyve NVMe device emulation was done by the GSoC student
Shunsuke Mie and was heavily modified in performan, functionality and
guest support by Leon Dang.

bhyve:
	-s <n>,nvme,devpath,maxq=#,qsz=#,ioslots=#,sectsz=#,ser=A-Z

	accepted devpath:
		/dev/blockdev
		/path/to/image
		ram=size_in_MiB

Tested with guest OS: FreeBSD Head, Linux Fedora fc27, Ubuntu 18.04,
                      OpenSuse 15.0, Windows Server 2016 Datacenter.
Tested with all accepted device paths: Real nvme, zdev and also with ram.
Tested on: AMD Ryzen Threadripper 1950X 16-Core Processor and
           Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz.

Tests at: https://people.freebsd.org/~araujo/bhyve_nvme/nvme.txt

Submitted by:	Shunsuke Mie <sux2mfgj_gmail.com>,
		Leon Dang <leon_digitalmsx.com>
Reviewed by:	chuck (early version), grehan
Relnotes:	Yes
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D14022
2018-07-05 03:33:58 +00:00