freebsd-dev/cmd
Brian Behlendorf 64bdf63f5c
ztest: split block reconstruction

Increase the default allowed number of reconstruction attempts.
There's not an exact right number for this setting.  It needs
to be set large enough to cover any realistic failure scenarios
and small enough to avoid stalling the IO pipeline and invoking
the deadman detection.

The current value of 256 was empirically determined to be too
low based on multi-day runs of ztest.  The fault injection code
would inject more damage than could be reconstructed given the
relatively small number of attempts.  However, in all observed
cases the block could be reconstructed using a slightly higher
limit.

Based on local testing, increasing the default value to 4096 was
determined to strike the best balance.  Checking all combinations
takes less than 10s in the worst case, and has so far eliminated
the vast majority of false positives detected by ztest.  This
delay is roughly on par with how long retries may be performed
to a misbehaving HDD and was deemed to be reasonable.  Better to
err on the side of a brief delay rather than fail to reconstruct
the data.
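
The trade-off described above can be sketched as follows. This is only an
illustration under stated assumptions, not the ZFS implementation: the
function name `plan_reconstruction`, the `copies_per_split` list, and the
idea that the number of combinations is the product of the candidate copies
per split are all hypothetical modelling choices. Only the limits 256 and
4096 come from the text.

```python
from math import prod

def plan_reconstruction(copies_per_split, max_attempts=4096):
    """Hypothetical model: decide how many combinations will be tried.

    copies_per_split -- assumed number of candidate copies for each split
    max_attempts     -- allowed number of reconstruction attempts

    Returns (total_combinations, attempts, exhaustive).
    """
    # Assumption: one candidate is picked per split, so the search
    # space is the product of the per-split candidate counts.
    total = prod(copies_per_split)
    # The search is exhaustive only if every combination fits in the limit.
    exhaustive = total <= max_attempts
    attempts = total if exhaustive else max_attempts
    return total, attempts, exhaustive

if __name__ == "__main__":
    # Compare the old and new limits for 10 splits with 2 candidates each.
    for limit in (256, 4096):
        print(limit, plan_reconstruction([2] * 10, max_attempts=limit))
```

Under this model, a block whose search space is larger than the limit is
only partially searched, which is one way a reconstructible block could
still be reported as damaged.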

Lastly, the -Y flag has been added to zdb to make it easy to try all
possible combinations when performing split block reconstruction.
Badly damaged blocks with 18 splits can be fully enumerated
within a few minutes.  This has been done to ensure permanent errors
are never incorrectly reported when ztest verifies the pool with zdb.
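
To illustrate what trying all possible combinations means, here is a minimal
sketch of an exhaustive search. It is not how zdb is implemented: the
function `reconstruct`, the representation of splits as lists of byte
strings, and the use of CRC-32 as a stand-in checksum are assumptions made
for the example.

```python
from itertools import product
import zlib

def reconstruct(splits, expected_crc):
    """Try every combination of candidate copies, one per split.

    splits       -- list of lists; each inner list holds the candidate
                    byte strings for one split (hypothetical layout)
    expected_crc -- checksum the reassembled block must match
                    (CRC-32 is a stand-in for the real block checksum)

    Returns the first matching block, or None if no combination matches.
    """
    for choice in product(*splits):
        candidate = b"".join(choice)
        if zlib.crc32(candidate) == expected_crc:
            return candidate
    return None

if __name__ == "__main__":
    good = b"gooddata"
    splits = [[b"bad0", b"good"], [b"data"]]
    print(reconstruct(splits, zlib.crc32(good)))
```

The loop visits every element of the Cartesian product, so its cost grows
multiplicatively with each additional split. That is why an unbounded
enumeration suits an offline tool better than the IO pipeline.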

Reviewed by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8271
2019-01-16 14:10:02 -08:00
arc_summary   pyzfs: python3 support (build system)              2019-01-06 10:39:41 -08:00
arcstat       pyzfs: python3 support (build system)              2019-01-06 10:39:41 -08:00
dbufstat      pyzfs: python3 support (build system)              2019-01-06 10:39:41 -08:00
fsck_zfs      Add /sbin/fsck.zfs helper                          2013-01-09 16:54:58 -08:00
mount_zfs     Add libzutil for libzfs or libzpool consumers      2018-11-05 11:22:33 -08:00
raidz_test    Support -fsanitize=address with --enable-asan      2018-01-10 10:49:27 -08:00
vdev_id       Add enclosure_symlinks option to vdev_id           2018-12-14 17:27:49 -08:00
zdb           ztest: split block reconstruction                  2019-01-16 14:10:02 -08:00
zed           zed: detect and offline physically removed devices 2018-11-09 11:17:24 -08:00
zfs           Disable 'zfs remap' command                        2019-01-15 15:46:58 -08:00
zgenhostid    Add zgenhostid utility script                      2017-07-25 13:22:03 -04:00
zhack         Add libzutil for libzfs or libzpool consumers      2018-11-05 11:22:33 -08:00
zinject       Add libzutil for libzfs or libzpool consumers      2018-11-05 11:22:33 -08:00
zpool         Add 'zpool status -i' option                       2019-01-07 11:03:18 -08:00
zstreamdump   zstreamdump dumps core printing truncated nvlist   2018-09-18 09:43:09 -07:00
ztest         ztest: split block reconstruction                  2019-01-16 14:10:02 -08:00
zvol_id       Fedora 28: Fix misc bounds check compiler warnings 2018-04-04 10:16:47 -07:00
Makefile.am   Retire legacy test infrastructure                  2017-08-15 17:26:38 -07:00