Commit Graph

45 Commits

Author SHA1 Message Date
cem
81a1e6341d ioat(4): On error detected in ithread, defer HW reset to taskqueue
The I/OAT HW reset process may sleep, so it is invalid to perform a
channel reset from the software interrupt thread.

Sponsored by:	EMC / Isilon Storage Division
2016-02-13 22:51:25 +00:00
cem
0784d59f3a ioat(4): Also check for errors if the channel is suspended
Sponsored by:	EMC / Isilon Storage Division
2016-02-13 22:51:17 +00:00
cem
558491d7a7 ioat(4): Decode/define more capabilities, operations
These are defined in the Intel Haswell EDS volume 2 (registers) (507849
v2.1).

Sponsored by:	EMC / Isilon Storage Division
2016-02-13 19:01:56 +00:00
cem
0a4cceb161 ioat(4): Recheck status register on zero-descriptor wakeups
Errors that halt the channel don't necessarily result in a completion
update, apparently.

Sponsored by:	EMC / Isilon Storage Division
2016-02-13 02:55:45 +00:00
cem
532a8c64da ioat(4): Add support for 'fence' bit with DMA_FENCE flag
Some classes of IOAT hardware prefetch reads.  DMA operations that
depend on the result of prior DMA operations must use the DMA_FENCE flag
to prevent stale reads.

(E.g., I've hit this personally on Broadwell-EP.  The Broadwell-DE has a
different IOAT unit that is documented to not pipeline DMA operations.)

Sponsored by:	EMC / Isilon Storage Division
2016-01-15 01:34:43 +00:00
cem
66aa33e3b5 ioat(4): Add ioat_acquire_reserve() KPI
ioat_acquire_reserve() is an extended version of ioat_acquire().  It
allows users to reserve space in the channel for some number of
descriptors.  If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.

Sponsored by:	EMC / Isilon Storage Division
2016-01-07 23:02:15 +00:00
cem
a377302780 ioat(4): Add ioat_get_max_io_size() KPI
Consumers need to know the permitted IO size to send maximally sized
chunks to the hardware.

Sponsored by:	EMC / Isilon Storage Division
2016-01-05 20:42:19 +00:00
cem
a12e9d2b9f ioat(4): Add an API to get HW revision
Different revisions support different operations.  Refer to Intel
External Design Specifications to figure out what your hardware
supports.

Sponsored by:	EMC / Isilon Storage Division
2015-12-17 23:21:37 +00:00
cem
88b00b4658 ioatcontrol(8): Add support for interrupt coalescing
The new flag, -c <period>, sets the interrupt coalescing period in
microseconds through the new ioat(4) API ioat_set_interrupt_coalesce().

Also add a -z flag to zero ioat statistics before tests, to make it easy
to measure results.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:02:01 +00:00
cem
ae2a2410df ioat(4): Add support for interrupt coalescing
In I/OAT, this is done through the INTRDELAY register.  On supported
platforms, this register can coalesce interrupts in a set period to
avoid excessive interrupt load for small descriptor workflows.  The
period is configurable anywhere from 1 microsecond to 16.38
milliseconds, in microsecond granularity.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:01:52 +00:00
cem
88c152e1e4 ioat(4): Gather and expose DMA statistics via sysctl
Organize the dev.ioat sysctl node into a tree while we're here.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:00:07 +00:00
cem
0a85f054bd ioat(4): Add ioatcontrol(8) testing for copy_8k
Add -E ("Eight k") and -m ("Memcpy") modes to the ioatcontrol(8) tool.

Prompted by:	rpokala
Sponsored by:	EMC / Isilon Storage Division
2015-12-10 02:05:35 +00:00
cem
d7fb634354 ioat(4): Add Broadwell-EP PCI IDs
Sponsored by:	EMC / Isilon Storage Division
2015-12-09 22:46:00 +00:00
cem
4837490555 ioat(4): Add ioat_copy_8k_aligned KPI
The hardware supports descriptors with two non-contiguous pages.  This
allows issuing one descriptor for an 8k copy from/to non-contiguous but
otherwise page-aligned memory.

Sponsored by:	EMC / Isilon Storage Division
2015-12-09 22:45:51 +00:00
cem
80431a2924 ioat(4): Add MODULE_VERSION so MODULE_DEPEND works
Suggested by:	jhb
Review in progress:	cc
Sponsored by:	EMC / Isilon Storage Division
2015-12-04 23:31:32 +00:00
cem
39d22b9678 ioat: Handle channel-fatal HW errors safely
Certain invalid operations trigger hardware error conditions.  Error
conditions that only halt one channel can be detected and recovered by
resetting the channel.  Error conditions that halt the whole device are
generally not recoverable.

Add a sysctl to inject channel-fatal HW errors,
'dev.ioat.<N>.force_hw_error=1'.

When a halt due to a channel error is detected, ioat(4) blocks new
operations from being queued on the channel, completes any outstanding
operations with an error status, and resets the channel before allowing
new operations to be queued again.

Update ioat.4 to document error recovery;  document blockfill introduced
in r290021 while we are here;  document ioat_put_dmaengine() added in
r289907;  document DMA_NO_WAIT added in r289982.

Sponsored by:	EMC / Isilon Storage Division
2015-10-31 20:38:06 +00:00
cem
53fac20993 ioat_test: Handled forced hardware resets gracefully
Sponsored by:	EMC / Isilon Storage Division
2015-10-29 04:16:52 +00:00
cem
159d29dfc4 ioat: Drain/quiesce the device less racily
On detach and during a forced HW reset.

Sponsored by:	EMC / Isilon Storage Division
2015-10-29 04:16:39 +00:00
cem
15f623b49e ioatcontrol(8): Add and document "raw" testing mode
Allows DMA from/to arbitrary KVA or physical address.  /dev/ioat_test
must be enabled by root and is only R/W root, so this is approximately
as dangerous as /dev/mem and /dev/kmem.

Sponsored by:	EMC / Isilon Storage Division
2015-10-29 04:16:16 +00:00
cem
06b8e8f97e ioat: Define DMACAPABILITY bits
Check for BFILL capability before initiating blockfill operations.

Sponsored by:	EMC / Isilon Storage Division
2015-10-28 02:37:24 +00:00
cem
a4e559f000 ioat: Add support for Block Fill operations
The IOAT hardware supports writing a 64-bit pattern to some destination
buffer.  The same limitations on buffer length apply as for copy
operations.  Throughput is a bit higher (probably because fill does not
have to spend bandwidth reading from a source in memory).

Support for testing Block Fill has been added to ioatcontrol(8) and the
ioat_test device.  ioatcontrol(8) accepts the '-f' flag, which tests
Block Fill.  (If the flag is omitted, the tool tests copy by default.)
The '-V' flag, in conjunction with '-f', verifies that buffers are
filled in the expected pattern.

Tested on:	Broadwell DE (Xeon D-1500)
Sponsored by:	EMC / Isilon Storage Division
2015-10-26 19:34:12 +00:00
cem
94913a5078 ioat: Dedupe operation enqueue logic
Add generic hw descriptor struct and generic control flags struct, in
preparation for other kinds of IOAT operation.

Sponsored by:	EMC / Isilon Storage Division
2015-10-26 19:34:00 +00:00
cem
4bfb2bd974 ioat: Add %b format string for CHANERR codes
Sponsored by:	EMC / Isilon Storage Division
2015-10-26 03:30:50 +00:00
cem
97adedd3fc ioat: Allocate memory for ring resize sanely
Add a new flag for DMA operations, DMA_NO_WAIT.  It behaves much like
other NOWAIT flags -- if queueing an operation would sleep, abort and
return NULL instead.

When growing the internal descriptor ring, the memory allocation is
performed outside of all locks.  A lock-protected flag is used to avoid
duplicated work.  Threads that cannot sleep and attempt to queue
operations when the descriptor ring is full allocate a larger ring with
M_NOWAIT, or bail if that fails.

ioat_reserve_space() could become an external API if is important to
callers that they have room for a sequence of operations, or that those
operations succeed each other directly in the hardware ring.

This patch splits the internal head index (->head) from the hardware's
head-of-chain (DMACOUNT) register (->hw_head).  In the future, for
simplicity's sake, we could drop the 'ring' array entirely and just use
a linked list (with head and tail pointers rather than indices).

Suggested by:	Witness
Sponsored by:	EMC / Isilon Storage Division
2015-10-26 03:30:38 +00:00
cem
168f37473c ioat: Expose more softc members in sysctls
Kill some unused softc variables while we're here.

Sponsored by:	EMC / Isilon Storage Division
2015-10-26 02:21:32 +00:00
cem
c85ada939e ioat: Introduce KTR probes
Sponsored by:	EMC / Isilon Storage Division
2015-10-26 02:21:19 +00:00
cem
5b9e7b1858 ioat: Actually bring the hardware back online after reset
We need to reset the chancmp and chainaddr MMIO registers to bring the
device back to a working state.

Name the chanerr bits while we're here.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:46:32 +00:00
cem
eb584f4163 ioat: Use bus_alloc_resource_any(9)
Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:46:20 +00:00
cem
f5b9eb50fd ioat: Extract halted error-debugging to a function
Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:46:08 +00:00
cem
85e3a79564 ioat: Always re-arm interrupts in process_events
It doesn't hurt, even if there is nothing to do.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:45:56 +00:00
cem
090b851407 ioat: Add sysctl to force hw reset
To enable controlled testing.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:45:45 +00:00
cem
c7fdd24039 ioat: refcnt users so we can drain them at detach
We only need to borrow a mutex for the drain sleep and the 0->1
transition, so just reuse an existing one for now.

The wchan is arbitrary.  Using refcount itself would have required
__DEVOLATILE(), so use the lock's address instead.

Different uses are tagged by kind, although we only do anything with
that information in INVARIANTS builds.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:45:33 +00:00
cem
ab89cadd9b ioat: When queueing operations, assert the submit lock
Callers should have acquired this lock when they invoked ioat_acquire()
before issuing operations.  Assert it is held.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:45:21 +00:00
cem
740688f4b5 ioat: Don't use sleeping allocation in lock path
This is still the worst possible way to allocate memory if it will ever
be under pressure, but at least it won't deadlock.

Suggested by:	WITNESS
Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:45:10 +00:00
cem
1e45bf339a ioat: Pull out timer callout delay into a constant
Pull out the timer callout delay into IOAT_INTR_TIMO and shorten it
considerably (5s -> 100ms).  Single operations do not take 5-10 seconds
and when interrupts aren't working, waiting 100ms sucks a lot less than
5s.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:44:58 +00:00
cem
3011618906 ioat_test: Add a colon (':') for style
Missed in r289776.

Sponsored by:	EMC / Isilon Storage Division
2015-10-22 23:08:08 +00:00
cem
7aa8ef7cad ioat: Clean up logging
Replace custom Linux-like logging with a thin shim around
device_printf(), when the softc is available.

In ioat_test, shim around printf(9) instead.

Sponsored by:	EMC / Isilon Storage Division
2015-10-22 23:03:33 +00:00
cem
af16729151 ioat: Fix some attach/detach issues
Don't run the selftest until after we've enabled bus mastering, or the
DMA engine can't copy anything for our test.

Create the ioat_test device on attach, if so tuned.  Destroy the
ioat_test device on teardown.

Replace deprecated 'CALLOUT_MPSAFE' with correct '1' in callout_init().

Sponsored by:	EMC / Isilon Storage Division
2015-10-22 16:46:21 +00:00
cem
c8639a4933 Improve flexibility of ioat_test / ioatcontrol(8)
The test logic now preallocates memory before running the test.

The buffer size is now configurable.  Post-copy verification is
configurable.  The number of copies to chain into one transaction (one
interrupt) is configurable.

A 'duration' mode is added, which repeats the test until the duration
has elapsed, reporting the B/s and transactions completed.

ioatcontrol.8 has been updated to document the new arguments.

Initial limits (on this particular Broadwell-DE) (and when the
interrupts are working) seem to be: 256 interrupts/sec or ~6 GB/s,
whichever limit is more restrictive.

Unfortunately, it seems the interrupt-reset handling on Broadwell isn't
working as intended.  That will be fixed in a later commit.

Sponsored by:	EMC / Isilon Storage Division
2015-10-22 04:38:05 +00:00
cem
bf2670a3a7 ioat: Define IOAT_XFERCAP_VALID_MASK and use in ioat_read_xfercap
Instead of ANDing a magic constant later.

Sponsored by:	EMC / Isilon Storage Division
2015-10-22 04:33:05 +00:00
cem
abf381cdb5 ioat: Use correct macro, fix build on i386
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 19:46:12 +00:00
cem
a30daa64f0 ioat(4): pci_save/restore_state to persist MSI-X registers over BDXDE reset
Also for BWD devices, per jimharris@.

Reviewed by:	jhb
Approved by:	markj (mentor)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3552
2015-09-02 22:48:41 +00:00
cem
4cb92b8527 ioat: re-initialize interrupts after resetting hw on BDXDE
Resetting some generations of the I/OAT hardware (just BDXDE for now)
resets the corresponding MSI-X registers.  So, teardown and
re-initialize interrupts after resetting the hardware.

Reviewed by:	jimharris
Approved by:	markj (mentor)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3549
2015-09-02 16:48:03 +00:00
cem
80b01acf73 ioat(4): Minor style cleanups
Suggested by:	ngie
Reviewed by:	jimharris
Approved by:	markj (mentor)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3481
2015-08-25 17:39:03 +00:00
cem
06ccf4bc96 Import ioat(4) driver
I/OAT is also referred to as Crystal Beach DMA and is a Platform Storage
Extension (PSE) on some Intel server platforms.

This driver currently supports DMA descriptors only and is part of a
larger effort to upstream an interconnect between multiple systems using
the Non-Transparent Bridge (NTB) PSE.

For now, this driver is only built on AMD64 platforms.  It may be ported
to work on i386 later, if that is desired.  The hardware is exclusive to
x86.

Further documentation on ioat(4), including API documentation and usage,
can be found in the new manual page.

Bring in a test tool, ioatcontrol(8), in tools/tools/ioat.  The test
tool is not hooked up to the build and is not intended for end users.

Submitted by:	jimharris, Carl Delsey <carl.r.delsey@intel.com>
Reviewed by:	jimharris (reviewed my changes)
Approved by:	markj (mentor)
Relnotes:	yes
Sponsored by:	Intel
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3456
2015-08-24 19:32:03 +00:00