freebsd-skq

Author	SHA1	Message	Date
ray	3625eb7f3d	MFC@r248830 Approved by: ed (project owner)	2013-03-28 20:27:01 +00:00
scottl	84ae5b84bb	Several fixes and improvements to sendfile() 1. If we wanted to send exactly as many bytes as the socket buffer is sized for, the inner loop of kern_sendfile() would see that the socket is full before seeing that it had no more bytes left to send. This would cause it to return EAGAIN to the caller instead of success. Fix by changing the order that these conditions are tested. 2. Simplify the calculation for the bytes to send in each iteration of the inner loop of kern_sendfile() 3. Fix some calls with bogus arguments to sf_buf_ext(). These would only trigger on mbuf allocation failure, but would be hilariously bad if they did trigger. Submitted by: gibbs(3), andre(2) Reviewed by: emax, andre Obtained from: Netflix MFC after: 1 week	2013-03-28 14:14:28 +00:00
sbruno	79d409e127	Restore DB_COMMAND capabilities of ciss(4) for debugging and diagnostics Obtained from: Yahoo! Inc. MFC after: 2 weeks	2013-03-28 12:44:43 +00:00
mav	674a0b97f5	Except one case mps(4) driver does not touch the data and works well with unmapped I/O. That one exception is access to INQUIRY VPD request result. Those requests are never unmapped now, but to be safe add respective check there and allow unmapped I/O for the SIM by setting PIM_UNMAPPED flag.	2013-03-28 11:24:30 +00:00
sbruno	e2bc9dfbfd	Fix compile of ciss(4) with CISS_DEBUG defined Obtained from: Yahoo! Inc. MFC after: 2 weeks	2013-03-28 11:00:41 +00:00
kib	7b210bf144	Release the v_writecount reference on the vnode in case of error, before the vnode is vput() in vm_mmap_vnode(). Error return means that there is no use reference on the vnode from the vm object reference, and failing to restore v_writecount breaks the invariant that v_writecount is less or equal to the usecount. The situation observed when nfs client returns ESTALE for VOP_GETATTR() after the open. In collaboration with: pho MFC after: 1 week	2013-03-28 06:39:27 +00:00
adrian	a1961a79d2	Fix the AR933x platform device start/stop code. This was ported from the AR724x code and I think that also doesn't quite work. I'll investigate that soon. With this in place the system reset path works, so 'reset' from kdb actually resets the SoC. Tested: * AP121 test board	2013-03-28 05:43:03 +00:00
jimharris	6ed2dc4d7c	deferal -> deferral	2013-03-27 23:07:43 +00:00
mav	368b37ab92	On SIM destruction free associated CCBs, preallocated inside xpt_get_ccb(). Before this change they were just leaked. Fortunately USB sticks now use only one CCB, and so leak was only 2KB per detach, while other bigger SIMs with much more allocated CCBs are rarely detached. MFC after: 2 weeks	2013-03-27 18:55:01 +00:00
jkim	225dc03864	Limit the amount of video memory we map for the driver to the maximum value. This basically restores the spirit of r203535, which was partially reverted in r205557, while we still map fixed amount to work around transient issues we experienced with r203535. Prodded by: avg Tested by: avg MFC after: 1 week	2013-03-27 18:06:28 +00:00
kib	c45e5da903	Fix a race with the vnode reclamation in the aio_qphysio(). Obtain the thread reference on the vp->v_rdev and use the returned struct cdev *dev instead of using vp->v_rdev. Call dev_strategy_csw() instead of dev_strategy(), since we now own the reference. Since the csw was already calculated, test d_flags to avoid mapping the buffer if the driver supports unmapped requests [*]. Suggested by: kan [*] Reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-03-27 11:47:52 +00:00
kib	448e7c1290	Add dev_strategy_csw() function, which is similar to dev_strategy() but assumes that a thread reference was already obtained on the passed device. Use the function from physio(), to avoid two extra dev_mtx lock and unlock. Note that physio() is always used as the cdevsw method, or is called from a cdevsw method, and the caller already owns the reference. dev_strategy() is left to keep KPI intact, but now it is implemented as a wrapper around dev_strategy_csw(). Do some style cleanup in physio(). Requested and reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-03-27 11:34:27 +00:00
kib	df3795022f	On i386, double the default size of the bio transient map. With the maxbcache size fixed, the auto-tuned transient map is too small for real-world load on i386. Tested by: David Wolfskill Sponsored by: The FreeBSD Foundation	2013-03-27 10:56:15 +00:00
kib	2c9ac98d06	Fix the VM_BCACHE_SIZE_MAX definition on i386 to match the maximal buffer map size, auto-tuned on the 4GB machine. Having the maxbcache bigger than the buffer map causes the transient bio map sizing logic to assume that there is enough KVA to use approximately 90MB (buffer map is sized to 110MB, and maxbcache is 200MB). The increase in the KVA usage caused other big KVA consumers, like nvidia.ko, to fail the initialization. Change the definition for both PAE and non-PAE cases, since PAE is even more KVA-starved. Reported and tested by: David Wolfskill Discussed with: alc Sponsored by: The FreeBSD Foundation	2013-03-27 10:52:18 +00:00
mav	e3a102cae6	Add Subsystem ID field to the quirk table. Use it to identify Mac Pro 1,1, which requires OVREF to be set to get proper playback volume, but which has all zeroes in HDA controller subdevice IDs on PCI. MFC after: 1 month Sponsored by:	2013-03-27 07:30:08 +00:00
adrian	938c374b23	Commit initial (unfinished!) support for the AR933x series of embedded CPUs. The AR933x is a mips24k based SoC with an AR9380 series SoC on board, two gigabit ethernet interfaces and an internal 10/100mbit ethernet switch. There's also the normal interfaces (USB, ethernet, uart, GPIO.) The downside? There's a non-ns8250 UART device. With a very basic UART driver (not in this commit) the SoC is initialised and boots up. I'll commit the UART code soon and then link it into the general setup path. This code is a re-implementation based from the Linux kernel / openwrt AR933x support. TODO: * UART (obviously) * All of the ethernet, USB and wifi SoC glue, including ethernet PLL programming.	2013-03-27 03:38:58 +00:00
adrian	a5e3a9bbda	Add the reference clock for each supported chip. Obtained from: Linux (openwrt)	2013-03-27 03:33:19 +00:00
jimharris	2255407cf0	Fix printf format issue on i386. Reported by: bz	2013-03-27 00:37:00 +00:00
adrian	77e2dba197	* Stop processing after HAL_EIO; this is what the reference driver does. * If we hit an empty queue condition (which I haven't yet root caused, grr.) .. make sure we release the lock before continuing.	2013-03-27 00:35:45 +00:00
jimharris	e86d6338eb	Panic should the SCI framework ever request a pointer into the ccb's data buffer for a ccb that is unmapped. This case is currently not possible, since the SCI framework only requests these pointers for doing SCSI/ATA translation of non- READ/WRITE commands. The panic is more to protect against the unlikely future scenario where additional commands could be unmapped. Sponsored by: Intel	2013-03-27 00:15:22 +00:00
jimharris	ea5eeccdf1	Report support for unmapped I/O by adding PIM_UNMAPPED flag. Submitted by: jhb, scottl	2013-03-26 23:04:06 +00:00
jimharris	52767ea66d	Clean up debug prints. 1) Consistently use device_printf. 2) Make dump_completion and dump_command into something more human-readable. Sponsored by: Intel Reviewed by: carl	2013-03-26 22:17:10 +00:00
jimharris	8f8689b1b6	Move common code from the different nvme_allocate_request functions into a separate function. Sponsored by: Intel Suggested by: carl Reviewed by: carl	2013-03-26 22:13:07 +00:00
jimharris	61a3cd77cc	Change a number of malloc(9) calls to use M_WAITOK instead of M_NOWAIT. Sponsored by: Intel Suggested by: carl Reviewed by: carl	2013-03-26 22:11:34 +00:00
jimharris	5242be57d3	Replace usages of mtx_pool_find used for admin commands with a polling mechanism. Now that all requests are timed, we are guaranteed to get a completion notification, even if it is an abort status due to a timed out admin command. This has the effect of simplifying the controller and namespace setup code, so that it reads straight through rather than broken up into a bunch of different callback functions. Sponsored by: Intel Reviewed by: carl	2013-03-26 22:09:51 +00:00
jimharris	ff567ee3e1	Abort and do not retry any outstanding admin commands left over after a controller reset. Sponsored by: Intel Reviewed by: carl	2013-03-26 22:06:05 +00:00
jimharris	69d2e13801	Add the ability to internally mark a controller as failed, if it is unable to start or reset. Also add a notifier for NVMe consumers for controller fail conditions and plumb this notifier for nvd(4) to destroy the associated GEOM disks when a failure occurs. This requires a bit of work to cover the races when a consumer is sending I/O requests to a controller that is transitioning to the failed state. To help cover this condition, add a task to defer completion of I/Os submitted to a failed controller, so that the consumer will still always receive its completions in a different context than the submission. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:58:38 +00:00
jimharris	de155eb698	Just disable the controller instead of deleting IO queues during detach. This is just as effective, and removes the need for a bunch of admin commands to a controller that's going to be disabled shortly anyways. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:48:41 +00:00
jimharris	aa210ff37b	Have nvd(4) register for controller notifications. Also have nvd maintain controller/namespace relationships internally. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:45:37 +00:00
jimharris	89ce8fee13	Set Pre-boot Software Load Count to 0 at the end of the controller start process. The spec indicates the OS driver should use Set Features (Software Progress Marker) to set the pre-boot software load count to 0 after the OS driver has successfully been initialized. This allows pre-boot software to determine if there have been any issues with the OS loading. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:42:53 +00:00
jimharris	63beb43e5f	Remove the is_started flag from struct nvme_controller. This flag was originally added to communicate to the sysctl code which oids should be built, but there are easier ways to do this. This needs to be cleaned up prior to adding new controller states - for example, controller failure. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:19:26 +00:00
jimharris	d207d40160	Ensure the controller's MDTS is accounted for in max_xfer_size. The controller's IDENTIFY data contains MDTS (Max Data Transfer Size) to allow the controller to specify the maximum I/O data transfer size. nvme(4) already provides a default maximum, but make sure it does not exceed what MDTS reports. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:16:53 +00:00
jimharris	21ee92ac4f	Cap the number of retry attempts to a configurable number. This ensures that if a specific I/O repeatedly times out, we don't retry it indefinitely. The default number of retries will be 4, but is adjusted using hw.nvme.retry_count. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:14:51 +00:00
jimharris	18a3a60fb4	Pass associated log page data to async event consumers, if requested. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:08:32 +00:00
jimharris	894007a2dc	When an asynchronous event request is completed, automatically fetch the specified log page. This satisfies the spec condition that future async events of the same type will not be sent until the associated log page is fetched. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:05:15 +00:00
jimharris	79d7c4eec2	Add structure definitions and controller command function for firmware log pages. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:03:03 +00:00
jimharris	de4e1d0695	Add structure definitions and a controller command function for error log pages. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:01:53 +00:00
jimharris	3c0b8367a2	Create struct nvme_status. NVMe error log entries include status, so breaking this out into its own data structure allows it to be included in both the nvme_completion data structure as well as error log entry data structures. While here, expose nvme_completion_is_error(), and change all of the places that were explicitly looking at sc/sct bits to use this macro instead. Sponsored by: Intel Reviewed by: carl	2013-03-26 21:00:18 +00:00
jimharris	d0a775e794	Make nvme_ctrlr_reset a nop if a reset is already in progress. This protects against cases where a controller crashes with multiple I/O outstanding, each timing out and requesting controller resets simultaneously. While here, remove a debugging printf from a previous commit, and add more logging around I/O that need to be resubmitted after a controller reset. Sponsored by: Intel Reviewed by: carl	2013-03-26 20:56:58 +00:00
jimharris	b7f7338cc5	By default, always escalate to controller reset when an I/O times out. While aborts are typically cleaner than a full controller reset, many times an I/O timeout indicates other controller-level issues where aborts may not work. NVMe drivers for other operating systems are also defaulting to controller reset rather than aborts for timed out I/O. Sponsored by: Intel Reviewed by: carl	2013-03-26 20:32:57 +00:00
pfg	996e9d6f74	Dtrace: dtrace.c erroneously checks for memory alignment on amd64. Merge change from illumos: 3511 dtrace.c erroneously checks for memory alignment on amd64 Illumos Revision: c93cc65 Reference: https://www.illumos.org/issues/3511 Obtained from: Illumos MFC after: 3 weeks	2013-03-26 20:17:08 +00:00
adrian	e84569c7e8	Implement the replacement EDMA FIFO code. (Yes, the previous code temporarily broke EDMA TX. I'm sorry; I should've actually setup ATH_BUF_FIFOEND on frames so txq->axq_fifo_depth was cleared!) This code implements a whole bunch of sorely needed EDMA TX improvements along with CABQ TX support. The specifics: * When filling/refilling the FIFO, use the new TXQ staging queue for FIFO frames * Tag frames with ATH_BUF_FIFOPTR and ATH_BUF_FIFOEND correctly. For now the non-CABQ transmit path pushes one frame into the TXQ staging queue without setting up the intermediary link pointers to chain them together, so draining frames from the txq staging queue to the FIFO queue occurs AMPDU / MPDU at a time. * In the CABQ case, manually tag the list with ATH_BUF_FIFOPTR and ATH_BUF_FIFOEND so a chain of frames is pushed into the FIFO at once. * Now that frames are in a FIFO pending queue, we can top up the FIFO after completing a single frame. This means we can keep it filled rather than waiting for it to drain and _then_ adding more frames. * The EDMA restart routine now walks the FIFO queue in the TXQ rather than the pending queue and re-initialises the FIFO with that. * When restarting EDMA, we may have partially completed sending a list. So stamp the first frame that we see in a list with ATH_BUF_FIFOPTR and push _that_ into the hardware. * When completing frames, only check those on the FIFO queue. We should never ever queue frames from the pending queue direct to the hardware, so there's no point in checking. * Until I figure out what's going on, make sure if the TXSTATUS for an empty queue pops up, complain loudly and continue. This will stop the panics that people are seeing. I'll add some code later which will assist in ensuring I'm populating each descriptor with the correct queue ID. * When considering whether to queue frames to the hardware queue directly or software queue frames, make sure the depth of the FIFO is taken into account now. * When completing frames, tag them with ATH_BUF_BUSY if they're not the final frame in a FIFO list. The same holding descriptor behaviour is required when handling descriptors linked together with a link pointer as the hardware will re-read the previous descriptor to refresh the link pointer before continuing. * .. and if we complete the FIFO list (ie, the buffer has ATH_BUF_FIFOEND set), then we don't need the holding buffer any longer. Thus, free it. Tested: * AR9380/AR9580, STA and hostap * AR9280, STA/hostap TODO: * I don't yet trust that the EDMA restart routine is totally correct in all circumstances. I'll continue to thrash this out under heavy multiple-TXQ traffic load and fix whatever pops up.	2013-03-26 20:04:45 +00:00
jimharris	83032bc239	Add a tunable for the I/O timeout interval. Default is still 30 seconds, but can be adjusted between a min/max of 5 and 120 seconds. Sponsored by: Intel Reviewed by: carl	2013-03-26 20:02:35 +00:00
jimharris	711dabaf43	Add handling for controller fatal status (csts.cfs). On any I/O timeout, check for csts.cfs==1. If set, the controller is reporting fatal status and we reset the controller immediately, rather than trying to abort the timed out command. This changeset also includes deferring the controller start portion of the reset to a separate task. This ensures we are always performing a controller start operation from a consistent context. Sponsored by: Intel Reviewed by: carl	2013-03-26 19:58:17 +00:00
jimharris	cef3145004	Add API for nvme consumers to access controller and namespace identify data. Sponsored by: Intel Reviewed by: carl	2013-03-26 19:52:57 +00:00
jimharris	93fd264895	Add controller reset capability to nvme(4) and ability to explicitly invoke it from nvmecontrol(8). Controller reset will be performed in cases where I/O are repeatedly timing out, the controller reports an unrecoverable condition, or when explicitly requested via IOCTL or an nvme consumer. Since the controller may be in such a state where it cannot even process queue deletion requests, we will perform a controller reset without trying to clean up anything on the controller first. Sponsored by: Intel Reviewed by: carl	2013-03-26 19:50:46 +00:00
adrian	bd33256583	Add per-TXQ EDMA FIFO staging queue support. Each set of frames pushed into a FIFO is represented by a list of ath_bufs - the first ath_buf in the FIFO list is marked with ATH_BUF_FIFOPTR; the last ath_buf in the FIFO list is marked with ATH_BUF_FIFOEND. Multiple lists of frames are just glued together in the TAILQ as per normal - except that at the end of a FIFO list, the descriptor link pointer will be NULL and it'll be tagged with ATH_BUF_FIFOEND. For non-EDMA chipsets this is a no-op - the ath_txq frame list (axq_q) stays the same and is treated the same. For EDMA chipsets the frames are pushed into axq_q and then when the FIFO is to be (re) filled, frames will be moved onto the FIFO queue and then pushed into the FIFO. So: * Add a new queue in each hardware TXQ (ath_txq) for staging FIFO frame lists. It's a TAILQ (like the normal hardware frame queue) rather than the ath9k list-of-lists to represent FIFO entries. * Add new ath_buf flags - ATH_TX_FIFOPTR and ATH_TX_FIFOEND. * When allocating ath_buf entries, clear out the flag value before returning it or it'll end up having stale flags. * When cloning ath_buf entries, only clone ATH_BUF_MGMT. Don't clone the FIFO related flags. * Extend ath_tx_draintxq() to first drain the FIFO staging queue, _then_ drain the normal hardware queue. Tested: * AR9280, hostap * AR9280, STA * AR9380/AR9580 - hostap TODO: * Test on other chipsets, just to be thorough.	2013-03-26 19:46:51 +00:00
jimharris	5220c76da8	Keep a doubly-linked list of outstanding trackers. This enables in-order re-submission of I/O after a controller reset. Sponsored by: Intel	2013-03-26 18:45:16 +00:00
jimharris	a3af497c87	Create a generic nvme_ctrlr_cmd_get_log_page function, and change the health information log page function to use it. Sponsored by: Intel	2013-03-26 18:43:53 +00:00
jimharris	e3ff62c987	Expose the get/set features API to nvme consumers. Sponsored by: Intel	2013-03-26 18:42:05 +00:00

96608 Commits