freebsd-dev/sys/cddl/contrib/opensolaris/uts/common
Josh Paetzel 9a625bd31c MFV 316870
7448 ZFS doesn't notice when disk vdevs have no write cache

illumos/illumos-gate@295438ba32
295438ba32

https://www.illumos.org/issues/7448
       I built a SmartOS image with all the NVMe commits including 7372
       (support NVMe volatile write cache) and repeated my dd testing:
       > #!/bin/bash
       > for i in `seq 1 1000`; do
       > dd if=/dev/zero of=file00 bs=1M count=102400 oflag=sync &
       > dd if=/dev/zero of=file01 bs=1M count=102400 oflag=sync &
       > wait
       > rm file00 file01
       > done
       >
       Previously each dd command took ~145 seconds to finish, now it takes
       ~400 seconds.
       Eventually I figured out it is 7372 that causes unnecessary
       nvme_bd_sync() executions which wasted CPU cycles.
  If a NVMe device doesn't support a write cache, the nvme_bd_sync function will
  return ENOTSUP to indicate this to upper layers.
  It seems this returned value is ignored by ZFS, and as such this bug is not
  really specific to NVMe. In vdev_disk_io_start() ZFS sends the flush to the
  disk driver (blkdev) with a callback to vdev_disk_ioctl_done(). As nvme filled
  in the bd_sync_cache function pointer, blkdev will not return ENOTSUP, as the
  nvme driver in general does support cache flush. Instead it will issue an
  asynchronous flush to nvme and immediately return 0, and hence ZFS will not set
  vdev_nowritecache here. The nvme driver will at some point process the cache
  flush command, and if there is no write cache on the device it will return
  ENOTSUP, which will be delivered to the vdev_disk_ioctl_done() callback. This
  function will not check the error code and not set nowritecache.
  The right place to check the error code from the cache flush is in
  zio_vdev_io_assess(). This would catch both cases, synchronous and asynchronous
  cache flushes. This would also be independent of the implementation detail that
  some drivers can return ENOTSUP immediately.

Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Obtained from:	Illumos
2017-04-21 00:17:54 +00:00
..
ctf
dtrace Use pget() instead of pfind() in fasttrap_pid_{enable,disable}(). 2017-02-15 06:07:01 +00:00
fs MFV 316870 2017-04-21 00:17:54 +00:00
os Mechanically convert cddl sun #ifdef's to illumos 2015-01-17 14:44:59 +00:00
sys zfs: clean up unused files and definitions 2017-02-24 07:53:56 +00:00
zmod
Makefile.files Connect the SHA-512t256 and Skein hashing algorithms to ZFS 2016-05-31 04:12:14 +00:00