freebsd-nq

Matthew Ahrens 9e052db462 OpenZFS 9290 - device removal reduces redundancy of mirrors

Mirrors are supposed to provide redundancy in the face of whole-disk failure and silent damage (e.g. some data on disk is not right, but ZFS hasn't detected the whole device as being broken). However, the current device removal implementation bypasses some of the mirror's redundancy.

Note that in no case is incorrect data returned, but we might get a checksum error when we should have been able to find the right data.

There are two underlying problems:

1. When we remove a mirror device, we only read one side of the mirror. Since we can't verify the checksum, this side may be silently bad, but the good data is on the other side of the mirror (which we didn't read). This can cause the removal to "bake in" the busted data - all copies of the data in the new location are the same, busted version, while we left the good version behind.

The fix for this is to read and copy both sides of the mirror. If the old and new vdevs are mirrors, we will read both sides of the old mirror, and write each copy to the corresponding side of the new mirror. (If the old and new vdevs have a different number of children, we will do this as best as possible.) Even though we aren't verifying checksums, this ensures that as long as there's a good copy of the data, we'll have a good copy after the removal, even if there's silent damage to one side of the mirror. If we're removing a mirror that has some silent damage, we'll have exactly the same damage in the new location (assuming that the new location is also a mirror).

2. When we read from an indirect vdev that points to a mirror vdev, we only consider one copy of the data. This can lead to reduced effective redundancy, because we might read a bad copy of the data from one side of the mirror, and not retry the other, good side of the mirror.

Note that the problem is not with the removal process, but rather after the removal has completed (having copied correct data to both sides of the mirror), if one side of the new mirror is silently damaged, we encounter the problem when reading the relocated data via the indirect vdev. Also note that the problem doesn't occur when ZFS knows that one side of the mirror is bad, e.g. when a disk entirely fails or is offlined.

The impact is that reads (from indirect vdevs that point to mirrors) may return a checksum error even though the good data exists on one side of the mirror, and scrub doesn't repair all data on the mirror (if some of it is pointed to via an indirect vdev).

The fix for this is complicated by "split blocks" - one logical block may be split into two (or more) pieces with each piece moved to a different new location. In this case we need to read all versions of each split (one from each side of the mirror), and figure out which combination of versions results in the correct checksum, and then repair the incorrect versions.

This ensures that we supply the same redundancy whether you use device removal or not. For example, if a mirror has small silent errors on all of its children, we can still reconstruct the correct data, as long as those errors are at sufficiently-separated offsets (specifically, separated by the largest block size - default of 128KB, but up to 16MB).

Porting notes:

* A new indirect vdev check was moved from dsl_scan_needs_resilver_cb() to dsl_scan_needs_resilver(), which was added to ZoL as part of the sequential scrub work.
* Passed NULL for zfs_ereport_post_checksum()'s zbookmark_phys_t parameter. The extra parameter is unique to ZoL.
* When posting indirect checksum errors the ABD can be passed directly, zfs_ereport_post_checksum() is not yet ABD-aware in OpenZFS.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Tim Chase <tim@chase2k.com>
OpenZFS-issue: https://illumos.org/issues/9290
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/591
Closes #6900

2018-04-14 12:21:39 -07:00
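The split-block fix described in the commit message can be illustrated with a small, standalone sketch. This is not the code from the commit: `split_t`, `toy_checksum()` and `reconstruct()` are hypothetical names invented here, and the checksum is a toy Fletcher-style sum rather than any checksum ZFS actually uses. The sketch only shows the idea of trying each combination of per-split copies until the assembled block matches the expected checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	MAX_COPIES	4	/* illustrative limit, not a ZFS constant */

/* Hypothetical stand-in for one piece ("split") of a logical block. */
typedef struct split {
	size_t len;				/* bytes in this piece */
	int ncopies;				/* one copy per mirror child */
	const uint8_t *copy[MAX_COPIES];	/* candidate versions */
} split_t;

/* Toy Fletcher-style checksum; an assumption for illustration only. */
static uint32_t
toy_checksum(const uint8_t *buf, size_t len)
{
	uint32_t a = 0, b = 0;

	for (size_t i = 0; i < len; i++) {
		a = (a + buf[i]) % 65521;
		b = (b + a) % 65521;
	}
	return ((b << 16) | a);
}

/*
 * Try every combination of copies, one copy per split. On success the
 * assembled block is left in out[], choice[i] holds the index of the
 * copy used for split i, and 0 is returned. Returns -1 if no
 * combination produces the expected checksum. The caller must size
 * out[] to hold the sum of all split lengths.
 */
static int
reconstruct(const split_t *s, int nsplits, uint32_t expected,
    uint8_t *out, int *choice)
{
	assert(nsplits > 0);
	for (int i = 0; i < nsplits; i++) {
		assert(s[i].ncopies >= 1 && s[i].ncopies <= MAX_COPIES);
		choice[i] = 0;
	}

	for (;;) {
		size_t off = 0;

		/* Assemble the block from the currently chosen copies. */
		for (int i = 0; i < nsplits; i++) {
			memcpy(out + off, s[i].copy[choice[i]], s[i].len);
			off += s[i].len;
		}
		if (toy_checksum(out, off) == expected)
			return (0);

		/* Advance choice[] like an odometer to the next combination. */
		int i = 0;
		while (i < nsplits && ++choice[i] == s[i].ncopies) {
			choice[i] = 0;
			i++;
		}
		if (i == nsplits)
			return (-1);	/* every combination was tried */
	}
}
```

Once a matching combination is found, any copy of a split that differs from the chosen one is a candidate for repair, which corresponds to the "repair the incorrect versions" step in the message. The search is exponential in the number of splits, so a real implementation would likely need to bound it; how the commit itself handles that is not stated in the message above.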
crypto	OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R	2016-10-03 14:51:15 -07:00
fm	Extend deadman logic	2018-01-25 13:40:38 -08:00
fs	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
lua	Fix coverity defects: zfs channel programs	2018-02-20 11:19:42 -08:00
sysevent	OpenZFS 8959 - Add notifications when a scrub is paused or resumed	2018-01-17 10:31:00 -08:00
abd.h	OpenZFS 8416 - abd.h is not C++ friendly	2017-06-30 11:11:01 -07:00
arc_impl.h	Support re-prioritizing asynchronous prefetches	2017-12-21 09:13:06 -08:00
arc.h	Decryption error handling improvements	2018-03-31 11:12:51 -07:00
avl_impl.h
avl.h	Remove dead code from AVL tree	2017-10-05 19:28:00 -07:00
blkptr.h	OpenZFS 8067 - zdb should be able to dump literal embedded block pointer	2017-07-07 11:28:01 -07:00
bplist.h
bpobj.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
bptree.h	Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t	2014-08-06 14:48:41 -07:00
bqueue.h	Illumos 5960, 5925	2016-01-08 15:08:19 -08:00
dbuf.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
ddt.h	Incorrect maximum DVA value in DDE_GET_NDVAS()	2018-02-26 14:20:12 -08:00
dmu_impl.h	Fix race in dnode_check_slots_free()	2018-04-10 11:15:05 -07:00
dmu_objset.h	OpenZFS 9164 - assert: newds == os->os_dsl_dataset	2018-03-30 12:00:40 -07:00
dmu_send.h	Raw receive should change key atomically	2018-02-21 12:31:03 -08:00
dmu_traverse.h	Native Encryption for ZFS on Linux	2017-08-14 10:36:48 -07:00
dmu_tx.h	OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue	2018-01-26 20:19:46 -08:00
dmu_zfetch.h	OpenZFS 6322 - ZFS indirect block predictive prefetch	2016-08-30 14:26:55 -07:00
dmu.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dnode.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_bookmark.h	Illumos 4368, 4369.	2014-07-29 10:55:29 -07:00
dsl_crypt.h	Raw receive should change key atomically	2018-02-21 12:31:03 -08:00
dsl_dataset.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_deadlist.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_deleg.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_destroy.h	OpenZFS 7431 - ZFS Channel Programs	2018-02-08 15:28:18 -08:00
dsl_dir.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_pool.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_prop.h	Illumos 6171 - dsl_prop_unregister() slows down dataset eviction.	2016-01-12 10:53:12 -08:00
dsl_scan.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
dsl_synctask.h	Illumos 4951 - ZFS administrative commands should use reserved space	2015-05-04 09:41:10 -07:00
dsl_userhold.h	Illumos #3740	2013-11-04 11:17:48 -08:00
edonr.h	OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R	2016-10-03 14:51:15 -07:00
efi_partition.h	Fix spelling	2017-01-03 11:31:18 -06:00
frame.h	Suppress incorrect objtool warnings	2017-12-07 10:28:50 -08:00
hkdf.h	Encryption patch follow-up	2017-10-11 16:54:48 -04:00
Makefile.am	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
metaslab_impl.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
metaslab.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
mmp.h	Record skipped MMP writes in multihost_history	2018-03-06 15:15:15 -08:00
mntent.h	Make zfs mount according to relatime config in dataset	2016-04-05 18:55:59 -07:00
multilist.h	OpenZFS 7968 - multi-threaded spa_sync()	2017-03-20 18:36:00 -07:00
nvpair_impl.h
nvpair.h	Replace __va_list with va_list	2014-08-13 10:35:00 -07:00
pathname.h	Add pn_alloc()/pn_free() functions	2016-04-21 09:49:25 -07:00
policy.h	Add `zfs allow` and `zfs unallow` support	2016-06-07 09:16:52 -07:00
range_tree.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
refcount.h	OpenZFS 8081 - Compiler warnings in zdb	2017-10-27 12:46:35 -07:00
rrwlock.h	Illumos 5008 - lock contention (rrw_exit) while running a read only load	2015-07-06 09:34:13 -07:00
sa_impl.h	Implement large_dnode pool feature	2016-06-24 13:13:21 -07:00
sa.h	Project Quota on ZFS	2018-02-13 14:54:54 -08:00
sdt.h	Add line info and SET_ERROR() to ZFS debug log	2017-07-25 23:09:48 -07:00
sha2.h	OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R	2016-10-03 14:51:15 -07:00
skein.h	OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R	2016-10-03 14:51:15 -07:00
spa_boot.h
spa_checksum.h	Implementation of AVX2 optimized Fletcher-4	2016-06-02 14:30:51 -07:00
spa_impl.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
spa.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
space_map.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
space_reftree.h	Illumos #4101 , #4102 , #4103 , #4105 , #4106	2014-07-22 09:39:16 -07:00
sysevent.h	OpenZFS 6939 - add sysevents to zfs core for commands	2017-07-12 21:28:13 -07:00
trace_acl.h	Linux 4.16 compat: inode_set_iversion()	2018-02-08 21:25:19 -08:00
trace_arc.h	Support re-prioritizing asynchronous prefetches	2017-12-21 09:13:06 -08:00
trace_common.h	OpenZFS 6531 - Provide mechanism to artificially limit disk performance	2016-05-26 10:11:51 -07:00
trace_dbgmsg.h	Add line info and SET_ERROR() to ZFS debug log	2017-07-25 23:09:48 -07:00
trace_dbuf.h	Crash in dbuf_evict_one with DTRACE_PROBE	2017-08-09 11:04:41 -07:00
trace_dmu.h	tx_waited -> tx_dirty_delayed in trace_dmu.h	2018-01-31 16:13:26 -08:00
trace_dnode.h	Fix build-it compilation regression	2017-01-24 08:50:15 -08:00
trace_multilist.h	Fix build-it compilation regression	2017-01-24 08:50:15 -08:00
trace_txg.h	Fix build-it compilation regression	2017-01-24 08:50:15 -08:00
trace_vdev.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
trace_zil.h	OpenZFS 8585 - improve batching done in zil_commit()	2017-12-05 09:39:16 -08:00
trace_zio.h	Use cstyle -cpP in `make cstyle` check	2016-12-12 10:46:26 -08:00
trace_zrlock.h	Fix race in trace point in zrl_add_impl	2018-03-12 11:27:02 -07:00
trace.h	Remove duplicate typedefs from trace.h	2015-01-06 16:53:24 -08:00
txg_impl.h	Fix spelling	2017-01-03 11:31:18 -06:00
txg.h	OpenZFS 8063 - verify that we do not attempt to access inactive txg	2017-05-10 13:52:22 -04:00
u8_textprep_data.h
u8_textprep.h
uberblock_impl.h	OpenZFS 8491 - uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS	2017-07-24 13:47:51 -04:00
uberblock.h	Multi-modifier protection (MMP)	2017-07-13 13:54:00 -04:00
uio_impl.h
unique.h	Illumos #3742	2013-11-04 10:55:25 -08:00
uuid.h
vdev_disk.h	Remove custom root pool import code	2016-08-11 11:19:34 -07:00
vdev_file.h	Use a dedicated taskq for vdev_file	2016-12-21 10:47:15 -08:00
vdev_impl.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
vdev_indirect_births.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
vdev_raidz_impl.h	Revert raidz_map and _col structure types	2018-01-09 14:46:52 -08:00
vdev_raidz.h	Use cstyle -cpP in `make cstyle` check	2016-12-12 10:46:26 -08:00
vdev_removal.h	OpenZFS 9290 - device removal reduces redundancy of mirrors	2018-04-14 12:21:39 -07:00
vdev.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
xvattr.h	Project Quota on ZFS	2018-02-13 14:54:54 -08:00
zap_impl.h	OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space	2017-03-07 09:51:59 -08:00
zap_leaf.h	Revert "Handle zap_add() failures in mixed ... "	2018-04-09 14:24:46 -07:00
zap.h	OpenZFS 1300 - filename normalization doesn't work for removes	2017-02-02 14:13:41 -08:00
zcp_global.h	OpenZFS 7431 - ZFS Channel Programs	2018-02-08 15:28:18 -08:00
zcp_iter.h	OpenZFS 7431 - ZFS Channel Programs	2018-02-08 15:28:18 -08:00
zcp_prop.h	OpenZFS 7431 - ZFS Channel Programs	2018-02-08 15:28:18 -08:00
zcp.h	OpenZFS 8677 - Open-Context Channel Programs	2018-02-08 16:05:57 -08:00
zfeature.h	Revert "zhack: Add 'feature disable' command"	2016-05-17 11:52:07 -07:00
zfs_acl.h	Project Quota on ZFS	2018-02-13 14:54:54 -08:00
zfs_context.h	OpenZFS 7431 - ZFS Channel Programs	2018-02-08 15:28:18 -08:00
zfs_ctldir.h	Rename zfs_sb_t -> zfsvfs_t	2017-03-10 09:51:33 -08:00
zfs_debug.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
zfs_delay.h	cstyle: Resolve C style issues	2013-12-18 16:46:35 -08:00
zfs_dir.h	Rename zfs_sb_t -> zfsvfs_t	2017-03-10 09:51:33 -08:00
zfs_fuid.h	Rename zfs_sb_t -> zfsvfs_t	2017-03-10 09:51:33 -08:00
zfs_ioctl.h	OpenZFS 8604 - Simplify snapshots unmounting code	2018-02-08 15:29:44 -08:00
zfs_onexit.h
zfs_project.h	Project Quota on ZFS	2018-02-13 14:54:54 -08:00
zfs_ratelimit.h	Change checksum & IO delay ratelimit values	2018-03-04 17:34:51 -08:00
zfs_rlock.h	Rename zfs_sb_t -> zfsvfs_t	2017-03-10 09:51:33 -08:00
zfs_sa.h	Project Quota on ZFS	2018-02-13 14:54:54 -08:00
zfs_stat.h
zfs_vfsops.h	ZIL claiming should not start user accounting	2018-02-20 16:27:31 -08:00
zfs_vnops.h	Rename zfs_* functions	2017-03-10 09:51:35 -08:00
zfs_znode.h	Project Quota on ZFS	2018-02-13 14:54:54 -08:00
zil_impl.h	OpenZFS 8909 - 8585 can cause a use-after-free kernel panic	2017-12-28 10:18:04 -08:00
zil.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
zio_checksum.h	Remove dependency on linear ABD	2017-03-29 12:24:51 -07:00
zio_compress.h	DLPX-44812 integrate EP-220 large memory scalability	2016-11-29 14:34:27 -08:00
zio_crypt.h	Encryption Stability and On-Disk Format Fixes	2018-02-02 11:37:16 -08:00
zio_impl.h	Native Encryption for ZFS on Linux	2017-08-14 10:36:48 -07:00
zio_priority.h	OpenZFS 7614, 9064 - zfs device evacuation/removal	2018-04-14 12:16:17 -07:00
zio.h	OpenZFS 9290 - device removal reduces redundancy of mirrors	2018-04-14 12:21:39 -07:00
zpl.h	Use cstyle -cpP in `make cstyle` check	2016-12-12 10:46:26 -08:00
zrlock.h	OpenZFS 6328 - Fix cstyle errors in zfs codebase	2017-01-12 09:42:11 -08:00
zvol.h	Add port of FreeBSD 'volmode' property	2017-07-12 13:05:37 -07:00