freebsd-skq

Author	SHA1	Message	Date
jimharris	1dabbdc24c	Do not retry failed async event requests. Sponsored by: Intel MFC after: 3 days	2013-07-09 21:03:39 +00:00
jimharris	44e3ab8eb0	Add pci_enable_busmaster() and pci_disable_busmaster() calls in nvme_attach() and nvme_detach() respectively. Sponsored by: Intel MFC after: 3 days	2013-07-09 21:02:45 +00:00
jimharris	c15f698fb4	Add firmware replacement and activation support to nvmecontrol(8) through a new firmware command. NVMe controllers may support up to 7 firmware slots for storing different firmware revisions. This new firmware command supports firmware replacement (i.e. firmware download) with or without immediate activation, or activation of a previously stored firmware image. It also supports selection of the firmware slot during replacement operations, using IDENTIFY information from the controller to check that the specified slot is valid. Newly activated firmware does not take effect until the next controller reset, either via a reboot or a separate 'nvmecontrol reset' command to the same controller. Submitted by: Joe Golio <joseph.golio@emc.com> Obtained from: EMC / Isilon Storage Division MFC after: 3 days	2013-06-27 00:08:25 +00:00
jimharris	b86441f01b	Remove remaining uio-related code. The nvme_physio() function was removed quite a while ago, which was the only user of this uio-related code. Sponsored by: Intel MFC after: 3 days	2013-06-26 23:37:11 +00:00
jimharris	cd28dd275b	Fail any passthrough command whose transfer size exceeds the controller's max transfer size. This guards against rogue commands coming in from userspace. Also add KASSERTS for the virtual address and unmapped bio cases, if the transfer size exceeds the controller's max transfer size. Sponsored by: Intel MFC after: 3 days	2013-06-26 23:32:45 +00:00
jimharris	8579bd1923	Use MAXPHYS to specify the maximum I/O size for nvme(4). Also allow admin commands to transfer up to this maximum I/O size, rather than the artificial limit previously imposed. The larger I/O size is very beneficial for upcoming firmware download support. This has the added benefit of simplifying the code since both admin and I/O commands now use the same maximum I/O size. Sponsored by: Intel MFC after: 3 days	2013-06-26 23:27:17 +00:00
jimharris	c0e542217e	Remove the NVME_IDENTIFY_CONTROLLER and NVME_IDENTIFY_NAMESPACE IOCTLs and replace them with the NVMe passthrough equivalent. Sponsored by: Intel	2013-04-12 17:56:47 +00:00
jimharris	eee11f2f3d	Add support for passthrough NVMe commands. This includes a new IOCTL to support a generic method for nvmecontrol(8) to pass IDENTIFY, GET_LOG_PAGE, GET_FEATURES and other commands to the controller, rather than separate IOCTLs for each. Sponsored by: Intel	2013-04-12 17:52:17 +00:00
jimharris	72eb2cf9e3	Move the busdma mapping functions to nvme_qpair.c. This removes nvme_uio.c completely. Sponsored by: Intel	2013-04-12 17:48:45 +00:00
jimharris	3b35b1fc99	Remove the NVMe-specific physio and associated routines. These were added early on for benchmarking purposes to avoid the mapped I/O penalties incurred in kern_physio. Now that FreeBSD (including kern_physio) supports unmapped I/O, the need for these NVMe-specific routines no longer exists. Sponsored by: Intel	2013-04-12 17:44:55 +00:00
jimharris	f877ea431a	Add a mutex to each namespace, for general locking operations on the namespace. Sponsored by: Intel	2013-04-12 17:41:24 +00:00
jimharris	d3af7eb2bf	Rename the controller's fail_req_lock, so that it can be used for other locking operations on the controller. Sponsored by: Intel	2013-04-12 17:36:48 +00:00
jimharris	92ebbf5a66	Do not panic when a busdma mapping operation fails. Instead, print an error message and fail the associated command with DATA_TRANSFER_ERROR NVMe completion status. Sponsored by: Intel	2013-04-12 17:34:49 +00:00
jimharris	c4799f93b1	Add unmapped bio support to nvme(4) and nvd(4). Sponsored by: Intel	2013-04-01 16:23:34 +00:00
jimharris	2128397eaf	Add "type" to nvme_request, signifying if its payload is a VADDR, UIO, or NULL. This simplifies decisions around if/how requests are routed through busdma. It also paves the way for supporting unmapped bios. Sponsored by: Intel	2013-03-29 20:34:28 +00:00
jimharris	536305e233	Remove obsolete comment. This code has now been tested with the QEMU NVMe device emulator.	2013-03-28 16:57:48 +00:00
jimharris	86f7620634	Delete extra IO qpairs allocated based on number of MSI-X vectors, but later found to not be usable because the controller doesn't support the same number of queues. This is not the normal case, but does occur with the Chatham prototype board. Sponsored by: Intel	2013-03-28 16:54:19 +00:00
jimharris	2255407cf0	Fix printf format issue on i386. Reported by: bz	2013-03-27 00:37:00 +00:00
jimharris	52767ea66d	Clean up debug prints. 1) Consistently use device_printf. 2) Make dump_completion and dump_command into something more human-readable. Sponsored by: Intel Reviewed by: carl	2013-03-26 22:17:10 +00:00
jimharris	8f8689b1b6	Move common code from the different nvme_allocate_request functions into a separate function. Sponsored by: Intel Suggested by: carl Reviewed by: carl	2013-03-26 22:13:07 +00:00
jimharris	61a3cd77cc	Change a number of malloc(9) calls to use M_WAITOK instead of M_NOWAIT. Sponsored by: Intel Suggested by: carl Reviewed by: carl	2013-03-26 22:11:34 +00:00
jimharris	5242be57d3	Replace usages of mtx_pool_find used for admin commands with a polling mechanism. Now that all requests are timed, we are guaranteed to get a completion notification, even if it is an abort status due to a timed out admin command. This has the effect of simplifying the controller and namespace setup code, so that it reads straight through rather than broken up into a bunch of different callback functions. Sponsored by: Intel Reviewed by: carl	2013-03-26 22:09:51 +00:00
jimharris	ff567ee3e1	Abort and do not retry any outstanding admin commands left over after a controller reset. Sponsored by: Intel Reviewed by: carl	2013-03-26 22:06:05 +00:00
jimharris	69d2e13801	Add the ability to internally mark a controller as failed, if it is unable to start or reset. Also add a notifier for NVMe consumers for controller fail conditions and plumb this notifier for nvd(4) to destroy the associated GEOM disks when a failure occurs. This requires a bit of work to cover the races when a consumer is sending I/O requests to a controller that is transitioning to the failed state. To help cover this condition, add a task to defer completion of I/Os submitted to a failed controller, so that the consumer will still always receive its completions in a different context than the submission. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:58:38 +00:00
jimharris	de155eb698	Just disable the controller instead of deleting IO queues during detach. This is just as effective, and removes the need for a bunch of admin commands to a controller that's going to be disabled shortly anyways. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:48:41 +00:00
jimharris	89ce8fee13	Set Pre-boot Software Load Count to 0 at the end of the controller start process. The spec indicates the OS driver should use Set Features (Software Progress Marker) to set the pre-boot software load count to 0 after the OS driver has successfully been initialized. This allows pre-boot software to determine if there have been any issues with the OS loading. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:42:53 +00:00
jimharris	63beb43e5f	Remove the is_started flag from struct nvme_controller. This flag was originally added to communicate to the sysctl code which oids should be built, but there are easier ways to do this. This needs to be cleaned up prior to adding new controller states - for example, controller failure. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:19:26 +00:00
jimharris	d207d40160	Ensure the controller's MDTS is accounted for in max_xfer_size. The controller's IDENTIFY data contains MDTS (Max Data Transfer Size) to allow the controller to specify the maximum I/O data transfer size. nvme(4) already provides a default maximum, but make sure it does not exceed what MDTS reports. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:16:53 +00:00
jimharris	21ee92ac4f	Cap the number of retry attempts to a configurable number. This ensures that if a specific I/O repeatedly times out, we don't retry it indefinitely. The default number of retries is 4, but can be adjusted using hw.nvme.retry_count. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:14:51 +00:00
jimharris	18a3a60fb4	Pass associated log page data to async event consumers, if requested. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:08:32 +00:00
jimharris	894007a2dc	When an asynchronous event request is completed, automatically fetch the specified log page. This satisfies the spec condition that future async events of the same type will not be sent until the associated log page is fetched. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:05:15 +00:00
jimharris	79d7c4eec2	Add structure definitions and controller command function for firmware log pages. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:03:03 +00:00
jimharris	de4e1d0695	Add structure definitions and a controller command function for error log pages. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:01:53 +00:00
jimharris	3c0b8367a2	Create struct nvme_status. NVMe error log entries include status, so breaking this out into its own data structure allows it to be included in both the nvme_completion data structure as well as error log entry data structures. While here, expose nvme_completion_is_error(), and change all of the places that were explicitly looking at sc/sct bits to use this macro instead. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:00:18 +00:00
jimharris	d0a775e794	Make nvme_ctrlr_reset a nop if a reset is already in progress. This protects against cases where a controller crashes with multiple I/O outstanding, each timing out and requesting controller resets simultaneously. While here, remove a debugging printf from a previous commit, and add more logging around I/O that need to be resubmitted after a controller reset. Sponsored by: Intel Reviewed by: carl	2013-03-26 20:56:58 +00:00
jimharris	b7f7338cc5	By default, always escalate to controller reset when an I/O times out. While aborts are typically cleaner than a full controller reset, many times an I/O timeout indicates other controller-level issues where aborts may not work. NVMe drivers for other operating systems are also defaulting to controller reset rather than aborts for timed out I/O. Sponsored by: Intel Reviewed by: carl	2013-03-26 20:32:57 +00:00
jimharris	83032bc239	Add a tunable for the I/O timeout interval. Default is still 30 seconds, but can be adjusted between a min/max of 5 and 120 seconds. Sponsored by: Intel Reviewed by: carl	2013-03-26 20:02:35 +00:00
jimharris	711dabaf43	Add handling for controller fatal status (csts.cfs). On any I/O timeout, check for csts.cfs==1. If set, the controller is reporting fatal status and we reset the controller immediately, rather than trying to abort the timed out command. This changeset also includes deferring the controller start portion of the reset to a separate task. This ensures we are always performing a controller start operation from a consistent context. Sponsored by: Intel Reviewed by: carl	2013-03-26 19:58:17 +00:00
jimharris	cef3145004	Add API for nvme consumers to access controller and namespace identify data. Sponsored by: Intel Reviewed by: carl	2013-03-26 19:52:57 +00:00
jimharris	93fd264895	Add controller reset capability to nvme(4) and ability to explicitly invoke it from nvmecontrol(8). Controller reset will be performed in cases where I/O are repeatedly timing out, the controller reports an unrecoverable condition, or when explicitly requested via IOCTL or an nvme consumer. Since the controller may be in such a state where it cannot even process queue deletion requests, we will perform a controller reset without trying to clean up anything on the controller first. Sponsored by: Intel Reviewed by: carl	2013-03-26 19:50:46 +00:00
jimharris	5220c76da8	Keep a doubly-linked list of outstanding trackers. This enables in-order re-submission of I/O after a controller reset. Sponsored by: Intel	2013-03-26 18:45:16 +00:00
jimharris	a3af497c87	Create a generic nvme_ctrlr_cmd_get_log_page function, and change the health information log page function to use it. Sponsored by: Intel	2013-03-26 18:43:53 +00:00
jimharris	e3ff62c987	Expose the get/set features API to nvme consumers. Sponsored by: Intel	2013-03-26 18:42:05 +00:00
jimharris	3af2a639e2	Add an interface for nvme shim drivers (i.e. nvd) to register for notifications when new nvme controllers are added to the system. Sponsored by: Intel	2013-03-26 18:39:54 +00:00
jimharris	68cbcde2c3	Enable asynchronous event requests on non-Chatham devices. Also add logic to clean up all outstanding asynchronous event requests when resetting or shutting down the controller, since these requests will not be explicitly completed by the controller itself. Sponsored by: Intel	2013-03-26 18:37:36 +00:00
jimharris	7ad47d8780	Move controller destruction code from nvme_detach() to new nvme_ctrlr_destruct() function. Sponsored by: Intel	2013-03-26 18:34:19 +00:00
jimharris	6162f3ce10	Specify command timeout interval on a per-command type basis. This is primarily driven by the need to disable timeouts for asynchronous event requests, which by nature should not be timed out. Sponsored by: Intel	2013-03-26 18:31:46 +00:00
jimharris	b4217411fa	Explicitly abort a timed out command, if the ABORT command sent to the controller indicates the command was not found. Sponsored by: Intel	2013-03-26 18:29:04 +00:00
jimharris	17c9d83862	Break out the code for completing an nvme_tracker object into a separate function. This allows for completions outside the normal completion path, for example when an ABORT command fails due to the controller reporting the targeted command does not exist. This is mainly for protection against a faulty controller, but we need to clean up our internal request nonetheless. Sponsored by: Intel	2013-03-26 18:27:22 +00:00
jimharris	34e3d4c73e	Add support for ABORT commands, including issuing these commands when an I/O times out. Also ensure that we retry commands that are aborted due to a timeout. Sponsored by: Intel	2013-03-26 18:23:35 +00:00
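
Several of the commits above describe simple limit checks: the I/O timeout tunable (default 30 seconds, clamped to 5..120), the retry cap (default 4), and capping the driver's maximum transfer size by the controller's MDTS value. The following is a minimal sketch of that arithmetic in C, not the driver's actual code. All names prefixed with sketch_ or SKETCH_ are hypothetical, and the MDTS interpretation (a power-of-two exponent in units of the controller's minimum page size, with 0 meaning no limit) is taken from the NVMe specification rather than from these commit messages.

```c
#include <stdint.h>

/* Hypothetical names; the numeric values come from the commit messages. */
#define SKETCH_MIN_TIMEOUT_S       5
#define SKETCH_MAX_TIMEOUT_S     120
#define SKETCH_DEFAULT_TIMEOUT_S  30
#define SKETCH_DEFAULT_RETRIES     4

/* Clamp a requested I/O timeout (in seconds) into the allowed range. */
static uint32_t
sketch_clamp_timeout(uint32_t requested_s)
{
	if (requested_s < SKETCH_MIN_TIMEOUT_S)
		return (SKETCH_MIN_TIMEOUT_S);
	if (requested_s > SKETCH_MAX_TIMEOUT_S)
		return (SKETCH_MAX_TIMEOUT_S);
	return (requested_s);
}

/*
 * Return the smaller of the driver's default maximum transfer size and
 * the limit implied by MDTS. Assumption (NVMe spec): the MDTS limit is
 * min_page_size * 2^mdts bytes, and mdts == 0 means "no limit".
 */
static uint32_t
sketch_cap_xfer_size(uint32_t default_max, uint8_t mdts,
    uint32_t min_page_size)
{
	uint64_t limit;

	if (mdts == 0)
		return (default_max);
	/* Any page size shifted this far exceeds a 32-bit default. */
	if (mdts >= 32)
		return (default_max);
	limit = (uint64_t)min_page_size << mdts;
	return (limit < default_max ? (uint32_t)limit : default_max);
}

/* Non-zero while a timed-out I/O is still allowed another attempt. */
static int
sketch_should_retry(uint32_t retries_so_far, uint32_t retry_count)
{
	return (retries_so_far < retry_count);
}
```

For example, with a 128 KiB default maximum and a 4 KiB minimum page size, an MDTS of 4 lowers the limit to 64 KiB, while an MDTS of 0 or 5 leaves the default unchanged.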


80 Commits