77 Commits

Author SHA1 Message Date
Conrad Meyer
f44abf1fe7 ioat(4): Start poll timer when descriptors are released to HW
Rather than when the software creates the descriptors.

Sponsored by:	Dell EMC Isilon
2016-09-11 20:15:41 +00:00
Conrad Meyer
dc46505973 ioat(4): De-spam ioat_process_events KTR logs
Sponsored by:	Dell EMC Isilon
2016-09-11 20:14:19 +00:00
Conrad Meyer
985efa77cc ioat(4): Despam relatively common hardware reset messages
Reported by:	ngie@
2016-09-01 23:56:02 +00:00
Conrad Meyer
ee5642bb35 ioat(4): Add additional CTR tracing during reset 2016-08-29 20:51:34 +00:00
Conrad Meyer
1bbc06b85b ioat(4): Don't "complete" DMA descriptors prematurely
In r304602, I mistakenly removed the ioat_process_events check that we weren't
processing events before the hardware had completed the descriptor
("last_seen").  Reinstate that logic.

Keep the defensive loop condition and additionally make sure we've actually
completed a descriptor before blindly chasing the ring around.

In reset, queue and finish the startup command before allowing any event
processing or submission to occur.  Avoid potential missed callouts by
requeueing the poll later.
2016-08-29 20:46:33 +00:00
Conrad Meyer
c114e74120 ioat(4): Allow callouts to be scheduled after hw reset
is_completion_pending governs whether or not a callout will be scheduled
when new work is queued on the IOAT device.  If true, a callout is
already scheduled, so we do not need a new one.  If false, we schedule
one and set it true.  Because resetting the hardware completed all
outstanding work but failed to clear is_completion_pending, no new
callout could be scheduled after a reset with pending work.

This resulted in a driver hang for polled-only work.
2016-08-22 14:51:09 +00:00
Conrad Meyer
0283c0f581 ioat(4): Don't process events past queue head
Fix a race where the completion routine could overrun the active ring
area in some situations.
2016-08-22 14:51:07 +00:00
Conrad Meyer
520f6023de ioat(4): Log channel number in CTR events 2016-08-05 02:56:31 +00:00
Conrad Meyer
1be25cc928 ioat(4): Check ring links at grow/shrink in INVARIANTS 2016-07-12 21:57:05 +00:00
Conrad Meyer
6d41e1ed15 ioat(4): Add KTR trace for ioat_reset_hw 2016-07-12 21:57:02 +00:00
Conrad Meyer
bd17dd691e ioat(4): Enhance KTR logging for descriptor completions 2016-07-12 21:57:00 +00:00
Conrad Meyer
3b188a672f ioat(4): Assert against ring underflow 2016-07-12 21:56:57 +00:00
Conrad Meyer
08a6b96aa5 ioat_reserve_space: Recheck quiescing flag after dropping submit lock
Fix a minor bound check error while here (ring can only hold 1 <<
MAX_ORDER - 1 entries).
2016-07-12 21:56:55 +00:00
Conrad Meyer
fe19e76ced ioat(4): Remove force_hw_error sysctl; it does not work reliably 2016-07-12 21:56:52 +00:00
Conrad Meyer
f8253f1a39 ioat(4): Export HW capabilities to consumers 2016-07-12 21:56:49 +00:00
Conrad Meyer
25ad958599 ioat(4): Submitters pick up a shovel if queue is too full
Before attempting to grow the ring.
2016-07-12 21:56:46 +00:00
Conrad Meyer
c62db3954c ioat(4): Don't shrink ring if active 2016-07-12 21:56:34 +00:00
Conrad Meyer
35b7a45ef2 ioat(4): Print some more useful information about the ring from ddb "show ioat" 2016-07-12 21:52:26 +00:00
Conrad Meyer
db16ab7ea0 ioat(4): Shrink using the correct timer
Fix a typo introduced in r302352.

Sponsored by:	EMC / Isilon Storage Division
2016-07-12 17:58:58 +00:00
Conrad Meyer
fe8712f84a ioat(4): Block asynchronous work during HW reset
Fix the race between ioat_reset_hw and ioat_process_events.

HW reset isn't protected by a lock because it can sleep for a long time
(40.1 ms).  This resulted in a race where we would process bogus parts
of the descriptor ring as if it had completed.  This looked like
duplicate completions on old events, if your ring had looped at least
once.

Block callout and interrupt work while reset runs so the completion end
of things does not observe indeterminate state and process invalid parts
of the ring.

Start the channel with a manually implemented ioat_null() to keep other
submitters quiesced while we wait for the channel to start (100 us).

r295605 may have made the race between ioat_reset_hw and
ioat_process_events wider, but I believe it already existed before that
revision.  ioat_process_events can be invoked by two asynchronous
sources: callout (softclock) and device interrupt.  Those could race
each other, to the same effect.

Reviewed by:	markj
Approved by:	re
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7097
2016-07-05 20:53:32 +00:00
Conrad Meyer
93f7f84af6 ioat(4): Serialize ioat_reset_hw invocations
Reviewed by:	markj
Approved by:	re
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7097
2016-07-05 20:52:35 +00:00
Conrad Meyer
5ac7796303 ioat(4): Split timer into poll and shrink functions
Poll should happen quickly, while shrink should happen infrequently.

Protect is_completion_pending with submit_lock.

Reviewed by:	markj
Approved by:	re
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7097
2016-07-05 20:51:52 +00:00
Conrad Meyer
0bb35deb07 ioat(4): Add ddb "show ioat <unit>" debugger command
Sponsored by:	EMC / Isilon Storage Division
2016-06-09 01:31:09 +00:00
Conrad Meyer
48fed014cb ioat(4): Always log capabilities on attach
Different, relatively recent Intel Xeon hardware support radically different
features.  E.g., BDX support CRC32 while BDX-DE does not.

Reviewed by:	rpokala (spiritually)
Sponsored by:	EMC / Isilon Storage Division
2016-06-04 04:14:06 +00:00
Conrad Meyer
f8f92e9180 ioat(4): Export the number of available channels
Sponsored by:	EMC / Isilon Storage Division
2016-06-04 03:54:30 +00:00
Conrad Meyer
df1928aa33 ioat(4): Make channel indices unsigned
Sponsored by:	EMC / Isilon Storage Division
2016-06-04 03:52:19 +00:00
Edward Tomasz Napierala
084d207584 Remove misc NULL checks after M_WAITOK allocations.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-05-10 10:26:07 +00:00
Enji Cooper
aa853cb461 Use DEVMETHOD_END ({ NULL, NULL }) instead of hardcoding { 0, 0 }
Sponsored by: EMC / Isilon Storage Division
2016-05-03 23:56:52 +00:00
Conrad Meyer
bec7ff798a ioat(4): Implement CRC and MOVECRC APIs
And document them in ioat.4.

Sponsored by:	EMC / Isilon Storage Division
2016-05-03 17:07:18 +00:00
Conrad Meyer
be3cbf60a5 ioat(4): Add CRC descriptor structure
Add CRC/MOVECRC operations, as well as the TEST and STORE variants.

With these operations, a CRC32C can be computed over one or more
descriptors' source data.  When the STORE operation is encountered, the
accumulated CRC32C is emitted to memory.  A TEST operations triggers an
IOAT channel error if the accumulated CRC32C does not match one in
memory.

These operations are not exposed through any API yet.

Sponsored by:	EMC / Isilon Storage Division
2016-05-03 17:06:33 +00:00
Conrad Meyer
519e8baab7 ioat(4): Limit descriptor allocation to low 40 bits
The IOAT engine can only address the low 40 bits (1 TB) of physmem via
the 'next descriptor' pointer.  Restrict acceptable range given to
bus_dma_tag_create to match.

Sponsored by:	EMC / Isilon Storage Division
2016-05-03 17:05:58 +00:00
Conrad Meyer
0ff814e854 ioat(4): ioat_get_dmaengine(): Add M_WAITOK mode
Sponsored by:	EMC / Isilon Storage Division
2016-04-09 13:15:34 +00:00
Conrad Meyer
374b05e1ff ioat(4): On error detected in ithread, defer HW reset to taskqueue
The I/OAT HW reset process may sleep, so it is invalid to perform a
channel reset from the software interrupt thread.

Sponsored by:	EMC / Isilon Storage Division
2016-02-13 22:51:25 +00:00
Conrad Meyer
d2c55e5ad0 ioat(4): Also check for errors if the channel is suspended
Sponsored by:	EMC / Isilon Storage Division
2016-02-13 22:51:17 +00:00
Conrad Meyer
564af7a654 ioat(4): Decode/define more capabilities, operations
These are defined in the Intel Haswell EDS volume 2 (registers) (507849
v2.1).

Sponsored by:	EMC / Isilon Storage Division
2016-02-13 19:01:56 +00:00
Conrad Meyer
007a703036 ioat(4): Recheck status register on zero-descriptor wakeups
Errors that halt the channel don't necessarily result in a completion
update, apparently.

Sponsored by:	EMC / Isilon Storage Division
2016-02-13 02:55:45 +00:00
Conrad Meyer
6ca07079af ioat(4): Add support for 'fence' bit with DMA_FENCE flag
Some classes of IOAT hardware prefetch reads.  DMA operations that
depend on the result of prior DMA operations must use the DMA_FENCE flag
to prevent stale reads.

(E.g., I've hit this personally on Broadwell-EP.  The Broadwell-DE has a
different IOAT unit that is documented to not pipeline DMA operations.)

Sponsored by:	EMC / Isilon Storage Division
2016-01-15 01:34:43 +00:00
Conrad Meyer
1502e36346 ioat(4): Add ioat_acquire_reserve() KPI
ioat_acquire_reserve() is an extended version of ioat_acquire().  It
allows users to reserve space in the channel for some number of
descriptors.  If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.

Sponsored by:	EMC / Isilon Storage Division
2016-01-07 23:02:15 +00:00
Conrad Meyer
bd81fe68ee ioat(4): Add ioat_get_max_io_size() KPI
Consumers need to know the permitted IO size to send maximally sized
chunks to the hardware.

Sponsored by:	EMC / Isilon Storage Division
2016-01-05 20:42:19 +00:00
Conrad Meyer
31bf2875ea ioat(4): Add an API to get HW revision
Different revisions support different operations.  Refer to Intel
External Design Specifications to figure out what your hardware
supports.

Sponsored by:	EMC / Isilon Storage Division
2015-12-17 23:21:37 +00:00
Conrad Meyer
d37872da34 ioatcontrol(8): Add support for interrupt coalescing
The new flag, -c <period>, sets the interrupt coalescing period in
microseconds through the new ioat(4) API ioat_set_interrupt_coalesce().

Also add a -z flag to zero ioat statistics before tests, to make it easy
to measure results.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:02:01 +00:00
Conrad Meyer
5ca9fc2a8d ioat(4): Add support for interrupt coalescing
In I/OAT, this is done through the INTRDELAY register.  On supported
platforms, this register can coalesce interrupts in a set period to
avoid excessive interrupt load for small descriptor workflows.  The
period is configurable anywhere from 1 microsecond to 16.38
milliseconds, in microsecond granularity.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:01:52 +00:00
Conrad Meyer
01fbbc8886 ioat(4): Gather and expose DMA statistics via sysctl
Organize the dev.ioat sysctl node into a tree while we're here.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:00:07 +00:00
Conrad Meyer
6a301ac85a ioat(4): Add ioatcontrol(8) testing for copy_8k
Add -E ("Eight k") and -m ("Memcpy") modes to the ioatcontrol(8) tool.

Prompted by:	rpokala
Sponsored by:	EMC / Isilon Storage Division
2015-12-10 02:05:35 +00:00
Conrad Meyer
5afc2508e1 ioat(4): Add Broadwell-EP PCI IDs
Sponsored by:	EMC / Isilon Storage Division
2015-12-09 22:46:00 +00:00
Conrad Meyer
9950fde08d ioat(4): Add ioat_copy_8k_aligned KPI
The hardware supports descriptors with two non-contiguous pages.  This
allows issuing one descriptor for an 8k copy from/to non-contiguous but
otherwise page-aligned memory.

Sponsored by:	EMC / Isilon Storage Division
2015-12-09 22:45:51 +00:00
Conrad Meyer
c2b69205ac ioat(4): Add MODULE_VERSION so MODULE_DEPEND works
Suggested by:	jhb
Review in progress:	cc
Sponsored by:	EMC / Isilon Storage Division
2015-12-04 23:31:32 +00:00
Conrad Meyer
faefad9c12 ioat: Handle channel-fatal HW errors safely
Certain invalid operations trigger hardware error conditions.  Error
conditions that only halt one channel can be detected and recovered by
resetting the channel.  Error conditions that halt the whole device are
generally not recoverable.

Add a sysctl to inject channel-fatal HW errors,
'dev.ioat.<N>.force_hw_error=1'.

When a halt due to a channel error is detected, ioat(4) blocks new
operations from being queued on the channel, completes any outstanding
operations with an error status, and resets the channel before allowing
new operations to be queued again.

Update ioat.4 to document error recovery;  document blockfill introduced
in r290021 while we are here;  document ioat_put_dmaengine() added in
r289907;  document DMA_NO_WAIT added in r289982.

Sponsored by:	EMC / Isilon Storage Division
2015-10-31 20:38:06 +00:00
Conrad Meyer
1ffae6e80a ioat_test: Handled forced hardware resets gracefully
Sponsored by:	EMC / Isilon Storage Division
2015-10-29 04:16:52 +00:00
Conrad Meyer
5f77bd3e24 ioat: Drain/quiesce the device less racily
On detach and during a forced HW reset.

Sponsored by:	EMC / Isilon Storage Division
2015-10-29 04:16:39 +00:00