From 6ca07079afcad5a0129c4bcf2662131fde11d823 Mon Sep 17 00:00:00 2001
From: Conrad Meyer
Date: Fri, 15 Jan 2016 01:34:43 +0000
Subject: [PATCH] ioat(4): Add support for 'fence' bit with DMA_FENCE flag

Some classes of IOAT hardware prefetch reads. DMA operations that
depend on the result of prior DMA operations must use the DMA_FENCE flag
to prevent stale reads.

(E.g., I've hit this personally on Broadwell-EP. The Broadwell-DE has a
different IOAT unit that is documented to not pipeline DMA operations.)

Sponsored by: EMC / Isilon Storage Division
---
 share/man/man4/ioat.4 | 15 +++++++++++++--
 sys/dev/ioat/ioat.c   |  2 ++
 sys/dev/ioat/ioat.h   |  8 +++++++-
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/share/man/man4/ioat.4 b/share/man/man4/ioat.4
index 10f2663ed8f0..e71c2e12b745 100644
--- a/share/man/man4/ioat.4
+++ b/share/man/man4/ioat.4
@@ -24,7 +24,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd January 7, 2016
+.Dd January 14, 2016
 .Dt IOAT 4
 .Os
 .Sh NAME
@@ -134,7 +134,7 @@ Null operations do nothing, but may be used to test the interrupt and callback
 mechanism.
 .Pp
 All operations can optionally trigger an interrupt at completion with the
-.Ar DMA_EN_INT
+.Ar DMA_INT_EN
 flag.
 For example, a user might submit multiple operations to the same channel and
 only enable an interrupt and callback for the last operation.
@@ -160,6 +160,17 @@ flag.
 .Ar DMA_NO_WAIT
 may return NULL.)
 .Pp
+Operations that depend on the result of prior operations should use
+.Ar DMA_FENCE .
+For example, such a scenario can happen when two related DMA operations are
+queued.
+First, a DMA copy to one location (A), followed directly by a DMA copy
+from A to B.
+In this scenario, some classes of I/OAT hardware may prefetch A for the second
+operation before it is written by the first operation.
+To avoid reading a stale value in sequences of dependent operations, use
+.Ar DMA_FENCE .
+.Pp
 All operations, as well as
 .Fn ioat_get_dmaengine ,
 can return NULL in special circumstances.
diff --git a/sys/dev/ioat/ioat.c b/sys/dev/ioat/ioat.c
index 7f14c3ff40ec..956e8d1463f4 100644
--- a/sys/dev/ioat/ioat.c
+++ b/sys/dev/ioat/ioat.c
@@ -852,6 +852,8 @@ ioat_op_generic(struct ioat_softc *ioat, uint8_t op,
 
 	if ((flags & DMA_INT_EN) != 0)
 		hw_desc->u.control_generic.int_enable = 1;
+	if ((flags & DMA_FENCE) != 0)
+		hw_desc->u.control_generic.fence = 1;
 
 	hw_desc->size = size;
 	hw_desc->src_addr = src;
diff --git a/sys/dev/ioat/ioat.h b/sys/dev/ioat/ioat.h
index 64f97830a2d6..3b6e0946ac1e 100644
--- a/sys/dev/ioat/ioat.h
+++ b/sys/dev/ioat/ioat.h
@@ -46,7 +46,13 @@ __FBSDID("$FreeBSD$");
  * descriptor without blocking.
  */
 #define DMA_NO_WAIT 0x2
-#define DMA_ALL_FLAGS (DMA_INT_EN | DMA_NO_WAIT)
+/*
+ * Disallow prefetching the source of the following operation. Ordinarily, DMA
+ * operations can be pipelined on some hardware. E.g., operation 2's source
+ * may be prefetched before operation 1 completes.
+ */
+#define DMA_FENCE 0x4
+#define DMA_ALL_FLAGS (DMA_INT_EN | DMA_NO_WAIT | DMA_FENCE)
 
 /*
  * Hardware revision number. Different hardware revisions support different
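
For readers who want to see the flag handling in isolation, here is a rough userspace sketch of how the ioat.c hunk maps request flags onto descriptor control bits. It is not the driver code: struct mock_desc and mock_fill_desc() are hypothetical stand-ins for the hardware descriptor and ioat_op_generic(), and the value 0x1 for DMA_INT_EN is an assumption (the patch shows only the values of DMA_NO_WAIT and DMA_FENCE).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* DMA_NO_WAIT and DMA_FENCE values come from the patch; DMA_INT_EN is assumed. */
#define DMA_INT_EN	0x1
#define DMA_NO_WAIT	0x2
#define DMA_FENCE	0x4
#define DMA_ALL_FLAGS	(DMA_INT_EN | DMA_NO_WAIT | DMA_FENCE)

/* Hypothetical, simplified stand-in for a hardware descriptor. */
struct mock_desc {
	uint32_t int_enable:1;	/* raise an interrupt on completion */
	uint32_t fence:1;	/* fence control bit, as set by the patch */
	uint32_t size;
	uint64_t src_addr;
	uint64_t dst_addr;
};

/*
 * Fill a descriptor from caller-supplied flags, mirroring the two
 * flag checks visible in the ioat.c hunk.  DMA_NO_WAIT affects only
 * descriptor allocation, so it sets no control bit here.
 */
void
mock_fill_desc(struct mock_desc *d, uint64_t dst, uint64_t src,
    uint32_t size, uint32_t flags)
{
	/* Reject flag bits outside the known set. */
	assert((flags & ~(uint32_t)DMA_ALL_FLAGS) == 0);

	memset(d, 0, sizeof(*d));
	if ((flags & DMA_INT_EN) != 0)
		d->int_enable = 1;
	if ((flags & DMA_FENCE) != 0)
		d->fence = 1;

	d->size = size;
	d->src_addr = src;
	d->dst_addr = dst;
}
```

Calling mock_fill_desc(&d, dst, src, len, DMA_FENCE | DMA_INT_EN) sets both control bits in the mock descriptor, while passing only DMA_NO_WAIT leaves both clear.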