freebsd-dev/sys
Josh Paetzel 9a625bd31c MFV 316870
7448 ZFS doesn't notice when disk vdevs have no write cache

illumos/illumos-gate@295438ba32

https://www.illumos.org/issues/7448
       I built a SmartOS image with all the NVMe commits including 7372
       (support NVMe volatile write cache) and repeated my dd testing:
       > #!/bin/bash
       > for i in `seq 1 1000`; do
       > dd if=/dev/zero of=file00 bs=1M count=102400 oflag=sync &
       > dd if=/dev/zero of=file01 bs=1M count=102400 oflag=sync &
       > wait
       > rm file00 file01
       > done
       >
       Previously each dd command took ~145 seconds to finish; now it takes
       ~400 seconds.
       Eventually I figured out that it is 7372 that causes unnecessary
       nvme_bd_sync() executions, which waste CPU cycles.
  If an NVMe device doesn't support a write cache, the nvme_bd_sync function will
  return ENOTSUP to indicate this to upper layers.
  It seems this return value is ignored by ZFS, and as such this bug is not
  really specific to NVMe. In vdev_disk_io_start() ZFS sends the flush to the
  disk driver (blkdev) with a callback to vdev_disk_ioctl_done(). As nvme filled
  in the bd_sync_cache function pointer, blkdev will not return ENOTSUP, as the
  nvme driver in general does support cache flush. Instead it will issue an
  asynchronous flush to nvme and immediately return 0, and hence ZFS will not set
  vdev_nowritecache here. The nvme driver will at some point process the cache
  flush command, and if there is no write cache on the device it will return
  ENOTSUP, which will be delivered to the vdev_disk_ioctl_done() callback. This
  function does not check the error code, so it never sets vdev_nowritecache.
  The right place to check the error code from the cache flush is in
  zio_vdev_io_assess(). This would catch both synchronous and asynchronous
  cache flushes, and it would be independent of the implementation detail that
  some drivers can return ENOTSUP immediately.
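
The flow described above can be modelled with a small, self-contained sketch.
This is not the ZFS code: every name here (toy_vdev, toy_zio, toy_flush and
friends) is hypothetical, and the "driver" delivers its completion status
inline rather than from an interrupt. It only illustrates why the check
belongs in the assess step: submission returns 0, ENOTSUP arrives later with
the completion, and the assess step is where both paths meet.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a vdev and a zio. */
struct toy_vdev {
	bool nowritecache;	/* set once a flush has reported ENOTSUP */
	int flushes_issued;	/* how many flushes reached the driver */
};

struct toy_zio {
	struct toy_vdev *vd;
	bool is_flush;		/* models a cache-flush ioctl */
	int error;		/* completion status of the I/O */
};

/*
 * Models an asynchronous driver: submission always succeeds (returns 0),
 * and the real status is only recorded as the completion status.
 */
static int
toy_driver_flush_async(struct toy_zio *zio, bool has_write_cache)
{
	zio->error = has_write_cache ? 0 : ENOTSUP;
	return (0);
}

/* Start step: skip the flush if the device is known to have no cache. */
static void
toy_io_start(struct toy_zio *zio, bool has_write_cache)
{
	assert(zio != NULL && zio->vd != NULL);
	if (zio->is_flush && zio->vd->nowritecache) {
		zio->error = ENOTSUP;	/* assumption: report, don't issue */
		return;
	}
	zio->vd->flushes_issued++;
	/* The submission result is 0 here, so it cannot be relied on. */
	(void) toy_driver_flush_async(zio, has_write_cache);
}

/* Assess step: sees the final status no matter how it was delivered. */
static void
toy_io_assess(struct toy_zio *zio)
{
	if (zio->is_flush && zio->vd != NULL && zio->error == ENOTSUP)
		zio->vd->nowritecache = true;
}

/* Run one flush through both steps and return its final status. */
int
toy_flush(struct toy_vdev *vd, bool has_write_cache)
{
	struct toy_zio zio = { .vd = vd, .is_flush = true, .error = 0 };

	toy_io_start(&zio, has_write_cache);
	toy_io_assess(&zio);
	return (zio.error);
}
```

In this model the first flush to a cache-less device reaches the driver once;
every later flush is short-circuited in the start step, which is the saving
the dd test above was missing.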

Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Obtained from:	Illumos
2017-04-21 00:17:54 +00:00
amd64 Use kmem_malloc() instead of malloc(9) for the native amd64 filter. 2017-04-17 22:02:09 +00:00
arm Use hwreset_get_by_ofw_idx() function instead, since there is 2017-04-19 05:59:00 +00:00
arm64 Restrict the arm64 supervisor all instructions to only allow a zero 2017-04-20 15:53:20 +00:00
boot loader: uboot disk ioctl should call disk_ioctl 2017-04-18 19:36:58 +00:00
bsm Merge OpenBSM 1.2-alpha5 from vendor branch to FreeBSD -CURRENT: 2017-03-26 21:14:49 +00:00
cam Reorder the minimum_cmd_size code to make it a little smaller and 2017-04-20 20:46:34 +00:00
cddl MFV 316870 2017-04-21 00:17:54 +00:00
compat Drop Giant before sleeping in linux_wait_for_{timeout_,}common(). 2017-04-19 16:12:02 +00:00
conf Replace the RC4 algorithm for generating in-kernel secure random 2017-04-16 09:11:02 +00:00
contrib Restore prototype accidentally removed by r316811. Also remove $NetBSD$ 2017-04-19 13:24:32 +00:00
crypto Replace the RC4 algorithm for generating in-kernel secure random 2017-04-16 09:11:02 +00:00
ddb Fix printing of negative offsets (typically from frame pointers) again. 2017-03-26 18:46:35 +00:00
dev Eliminate the ega renderer switch. It did nothing useful except hold 2017-04-20 17:22:03 +00:00
fs Fix the setting of atime for Linux client NFSv4 mounts. 2017-04-21 00:17:47 +00:00
gdb
geom Rename two gmirror state flags to make their meanings slightly clearer. 2017-04-14 17:13:57 +00:00
gnu Update our device tree files to a Linux 4.10 2017-03-07 13:56:49 +00:00
i386 Use kmem_malloc() instead of malloc(9) for the native amd64 filter. 2017-04-17 22:02:09 +00:00
isa Renumber copyright clause 4 2017-02-28 23:42:47 +00:00
kern - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter 2017-04-17 17:34:47 +00:00
kgssapi kgssapi: insignificant spelling fix. 2016-05-03 22:05:03 +00:00
libkern Replace the RC4 algorithm for generating in-kernel secure random 2017-04-16 09:11:02 +00:00
mips Switch BERI Programmable Interrupt Controller to INTRNG. 2017-04-18 17:20:03 +00:00
modules 3BSD-licensed implementation of the chacha20 stream cipher, intended for 2017-04-15 20:51:53 +00:00
net Use kmem_malloc() instead of malloc(9) for the native amd64 filter. 2017-04-17 22:02:09 +00:00
net80211 [net80211] refactor out "add slot" and "purge slot" for A-MPDU. 2017-04-11 07:05:55 +00:00
netgraph
netinet Syncookies can be used in combination with the syncache. If the cache 2017-04-20 19:19:33 +00:00
netinet6 pf: Fix possible incorrect IPv6 fragmentation 2017-04-20 09:05:53 +00:00
netipsec Add large replay window support to setkey(8) and libipsec. 2017-04-13 14:44:17 +00:00
netnatm
netpfil pf: Fix possible incorrect IPv6 fragmentation 2017-04-20 09:05:53 +00:00
netsmb
nfs Renumber copyright clause 4 2017-02-28 23:42:47 +00:00
nfsclient Add an NFSv4.1 mount option for "use one openowner". 2017-04-13 21:54:19 +00:00
nfsserver Renumber copyright clause 4 2017-02-28 23:42:47 +00:00
nlm
ofed All these files need sys/vmmeter.h, but now they got it implicitly 2017-04-17 17:07:00 +00:00
opencrypto Don't leak a session and lock if a GMAC key has an invalid length. 2017-04-05 01:46:41 +00:00
powerpc - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter 2017-04-17 17:34:47 +00:00
riscv Follow r317061 "Remove struct vmmeter from struct pcpu" 2017-04-19 17:06:32 +00:00
rpc Fix a crash during unmount of an NFSv4.1 mount. 2017-04-10 22:47:18 +00:00
security Break audit_bsm_klib.c into two files: one (audit_bsm_klib.c) 2017-04-03 10:15:58 +00:00
sparc64 - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter 2017-04-17 17:34:47 +00:00
sys Attempt to determine the modes in which 8-bit wide characters are actually 2017-04-20 13:46:55 +00:00
teken Oops, my fix for bright colors broke bright black some more (in cases 2017-03-27 10:48:28 +00:00
tests Style 9 changes. 2015-11-12 10:31:14 +00:00
tools [fdt] Make DTBs generated by make_dtb.sh overlay-ready 2017-03-10 22:45:07 +00:00
ufs All these files need sys/vmmeter.h, but now they got it implicitly 2017-04-17 17:07:00 +00:00
vm - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter 2017-04-17 17:34:47 +00:00
x86 - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter 2017-04-17 17:34:47 +00:00
xdr
xen xenstore: fix suspension when using the xenstore device 2017-03-07 09:17:48 +00:00
Makefile