Commit Graph

18 Commits

Author SHA1 Message Date
Alexander Motin
7280125e81 Add ioat_get_domain() to ioat(4) KPI.
This allows NUMA-aware consumers to reduce inter-domain traffic.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-11-19 02:09:04 +00:00
Conrad Meyer
236b57fe36 ioat(4): Set __result_use_check on ioat_acquire_reserve
Even M_WAITOK callers must check for failure.  For example, if the device is
quiescing, either due to automatic error-recovery induced reset, or due to
administrative detach, the routine will return ENXIO and the acquire
reference will not be held.  So, there is no mode in which it is safe to
assume the routine succeeds without checking.

Sponsored by:	Dell EMC Isilon
2019-01-17 23:21:02 +00:00
Conrad Meyer
f8253f1a39 ioat(4): Export HW capabilities to consumers 2016-07-12 21:56:49 +00:00
Conrad Meyer
f8f92e9180 ioat(4): Export the number of available channels
Sponsored by:	EMC / Isilon Storage Division
2016-06-04 03:54:30 +00:00
Conrad Meyer
bec7ff798a ioat(4): Implement CRC and MOVECRC APIs
And document them in ioat.4.

Sponsored by:	EMC / Isilon Storage Division
2016-05-03 17:07:18 +00:00
Conrad Meyer
0ff814e854 ioat(4): ioat_get_dmaengine(): Add M_WAITOK mode
Sponsored by:	EMC / Isilon Storage Division
2016-04-09 13:15:34 +00:00
Conrad Meyer
6ca07079af ioat(4): Add support for 'fence' bit with DMA_FENCE flag
Some classes of IOAT hardware prefetch reads.  DMA operations that
depend on the result of prior DMA operations must use the DMA_FENCE flag
to prevent stale reads.

(E.g., I've hit this personally on Broadwell-EP.  The Broadwell-DE has a
different IOAT unit that is documented to not pipeline DMA operations.)

Sponsored by:	EMC / Isilon Storage Division
2016-01-15 01:34:43 +00:00
Conrad Meyer
1502e36346 ioat(4): Add ioat_acquire_reserve() KPI
ioat_acquire_reserve() is an extended version of ioat_acquire().  It
allows users to reserve space in the channel for some number of
descriptors.  If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.

Sponsored by:	EMC / Isilon Storage Division
2016-01-07 23:02:15 +00:00
Conrad Meyer
bd81fe68ee ioat(4): Add ioat_get_max_io_size() KPI
Consumers need to know the permitted IO size to send maximally sized
chunks to the hardware.

Sponsored by:	EMC / Isilon Storage Division
2016-01-05 20:42:19 +00:00
Conrad Meyer
31bf2875ea ioat(4): Add an API to get HW revision
Different revisions support different operations.  Refer to Intel
External Design Specifications to figure out what your hardware
supports.

Sponsored by:	EMC / Isilon Storage Division
2015-12-17 23:21:37 +00:00
Conrad Meyer
5ca9fc2a8d ioat(4): Add support for interrupt coalescing
In I/OAT, this is done through the INTRDELAY register.  On supported
platforms, this register can coalesce interrupts in a set period to
avoid excessive interrupt load for small descriptor workflows.  The
period is configurable anywhere from 1 microsecond to 16.38
milliseconds, in microsecond granularity.

Sponsored by:	EMC / Isilon Storage Division
2015-12-14 22:01:52 +00:00
Conrad Meyer
9950fde08d ioat(4): Add ioat_copy_8k_aligned KPI
The hardware supports descriptors with two non-contiguous pages.  This
allows issuing one descriptor for an 8k copy from/to non-contiguous but
otherwise page-aligned memory.

Sponsored by:	EMC / Isilon Storage Division
2015-12-09 22:45:51 +00:00
Conrad Meyer
faefad9c12 ioat: Handle channel-fatal HW errors safely
Certain invalid operations trigger hardware error conditions.  Error
conditions that only halt one channel can be detected and recovered by
resetting the channel.  Error conditions that halt the whole device are
generally not recoverable.

Add a sysctl to inject channel-fatal HW errors,
'dev.ioat.<N>.force_hw_error=1'.

When a halt due to a channel error is detected, ioat(4) blocks new
operations from being queued on the channel, completes any outstanding
operations with an error status, and resets the channel before allowing
new operations to be queued again.

Update ioat.4 to document error recovery;  document blockfill introduced
in r290021 while we are here;  document ioat_put_dmaengine() added in
r289907;  document DMA_NO_WAIT added in r289982.

Sponsored by:	EMC / Isilon Storage Division
2015-10-31 20:38:06 +00:00
Conrad Meyer
1693d27b71 ioat: Define DMACAPABILITY bits
Check for BFILL capability before initiating blockfill operations.

Sponsored by:	EMC / Isilon Storage Division
2015-10-28 02:37:24 +00:00
Conrad Meyer
2a4fd6b17a ioat: Add support for Block Fill operations
The IOAT hardware supports writing a 64-bit pattern to some destination
buffer.  The same limitations on buffer length apply as for copy
operations.  Throughput is a bit higher (probably because fill does not
have to spend bandwidth reading from a source in memory).

Support for testing Block Fill has been added to ioatcontrol(8) and the
ioat_test device.  ioatcontrol(8) accepts the '-f' flag, which tests
Block Fill.  (If the flag is omitted, the tool tests copy by default.)
The '-V' flag, in conjunction with '-f', verifies that buffers are
filled in the expected pattern.

Tested on:	Broadwell DE (Xeon D-1500)
Sponsored by:	EMC / Isilon Storage Division
2015-10-26 19:34:12 +00:00
Conrad Meyer
bf8553ea38 ioat: Allocate memory for ring resize sanely
Add a new flag for DMA operations, DMA_NO_WAIT.  It behaves much like
other NOWAIT flags -- if queueing an operation would sleep, abort and
return NULL instead.

When growing the internal descriptor ring, the memory allocation is
performed outside of all locks.  A lock-protected flag is used to avoid
duplicated work.  Threads that cannot sleep and attempt to queue
operations when the descriptor ring is full allocate a larger ring with
M_NOWAIT, or bail if that fails.

ioat_reserve_space() could become an external API if is important to
callers that they have room for a sequence of operations, or that those
operations succeed each other directly in the hardware ring.

This patch splits the internal head index (->head) from the hardware's
head-of-chain (DMACOUNT) register (->hw_head).  In the future, for
simplicity's sake, we could drop the 'ring' array entirely and just use
a linked list (with head and tail pointers rather than indices).

Suggested by:	Witness
Sponsored by:	EMC / Isilon Storage Division
2015-10-26 03:30:38 +00:00
Conrad Meyer
466b3540ff ioat: refcnt users so we can drain them at detach
We only need to borrow a mutex for the drain sleep and the 0->1
transition, so just reuse an existing one for now.

The wchan is arbitrary.  Using refcount itself would have required
__DEVOLATILE(), so use the lock's address instead.

Different uses are tagged by kind, although we only do anything with
that information in INVARIANTS builds.

Sponsored by:	EMC / Isilon Storage Division
2015-10-24 23:45:33 +00:00
Conrad Meyer
e974f91c38 Import ioat(4) driver
I/OAT is also referred to as Crystal Beach DMA and is a Platform Storage
Extension (PSE) on some Intel server platforms.

This driver currently supports DMA descriptors only and is part of a
larger effort to upstream an interconnect between multiple systems using
the Non-Transparent Bridge (NTB) PSE.

For now, this driver is only built on AMD64 platforms.  It may be ported
to work on i386 later, if that is desired.  The hardware is exclusive to
x86.

Further documentation on ioat(4), including API documentation and usage,
can be found in the new manual page.

Bring in a test tool, ioatcontrol(8), in tools/tools/ioat.  The test
tool is not hooked up to the build and is not intended for end users.

Submitted by:	jimharris, Carl Delsey <carl.r.delsey@intel.com>
Reviewed by:	jimharris (reviewed my changes)
Approved by:	markj (mentor)
Relnotes:	yes
Sponsored by:	Intel
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3456
2015-08-24 19:32:03 +00:00