8166 zpool scrub thinks it repaired offline device
illumos/illumos-gate@2d2f193a21
https://www.illumos.org/issues/8166
If we do a scrub while a leaf device is offline (via "zpool offline"),
we will inadvertently clear the DTL (dirty time log) of the offline
device, even though it is still damaged. When the device comes back
online, we will incompletely resilver it, thinking that the scrub
repaired blocks written before the scrub was started. The incomplete
resilver can lead to data loss if there is a subsequent failure of a
different leaf device.
The fix is to never clear the DTL of offline devices. Note that if a
device is onlined while a scrub is in progress, the scrub will be
restarted.
The problem can be worked around by running "zpool scrub" after
"zpool online".
See also https://github.com/zfsonlinux/zfs/issues/5806
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
parent 9df2d6f729
commit f45d37d04e
Notes:
svn2git
2020-12-20 02:59:44 +00:00
svn path=/vendor-sys/illumos/dist/; revision=318942
@@ -1788,6 +1788,9 @@ vdev_dtl_should_excise(vdev_t *vd)
 	ASSERT0(scn->scn_phys.scn_errors);
 	ASSERT0(vd->vdev_children);
 
+	if (vd->vdev_state < VDEV_STATE_DEGRADED)
+		return (B_FALSE);
+
 	if (vd->vdev_resilver_txg == 0 ||
 	    range_tree_space(vd->vdev_dtl[DTL_MISSING]) == 0)
 		return (B_TRUE);
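
The effect of the added state check can be illustrated with a small standalone model. This is only a sketch, not the ZFS code: the toy_* types, fields and state values are hypothetical stand-ins for vdev_t, vdev_state, vdev_resilver_txg and the DTL_MISSING range tree, and the remainder of vdev_dtl_should_excise() (which is not shown in the hunk) is reduced to a plain "do not excise" fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified device states. The only property relied on
 * here is that an offline device sorts below DEGRADED, which is what the
 * comparison in the hunk depends on.
 */
typedef enum {
	TOY_STATE_OFFLINE,
	TOY_STATE_DEGRADED,
	TOY_STATE_HEALTHY
} toy_state_t;

/* Hypothetical stand-in for the fields of vdev_t used in the hunk. */
typedef struct {
	toy_state_t	state;			/* models vd->vdev_state */
	uint64_t	resilver_txg;		/* models vd->vdev_resilver_txg */
	uint64_t	dtl_missing_space;	/* models range_tree_space(DTL_MISSING) */
} toy_vdev_t;

/*
 * Model of the decision shown in the hunk: should a completed scan be
 * allowed to clear (excise) this leaf device's DTL?
 */
static bool
toy_dtl_should_excise(const toy_vdev_t *vd)
{
	assert(vd != NULL);

	/* Added check: a device below DEGRADED never has its DTL cleared. */
	if (vd->state < TOY_STATE_DEGRADED)
		return (false);

	/* Pre-existing check: not resilvering, or nothing is missing. */
	if (vd->resilver_txg == 0 || vd->dtl_missing_space == 0)
		return (true);

	/* Simplification: the real function has further logic here. */
	return (false);
}
```

In this model, an offline device with resilver_txg == 0 and a non-empty DTL is no longer reported as safe to excise, whereas without the first check the second condition alone would have returned true for it. That is the case described in the commit message.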