freebsd-skq

Alexander Motin 2e4bc6ee5c 9075 Improve ZFS pool import/load process and corrupted pool recovery

illumos/illumos-gate@6f7938128a

Some work has been done lately to improve the debuggability of the ZFS pool load (and import) process. This includes:

https://www.illumos.org/issues/7638: Refactor spa_load_impl into several functions
https://www.illumos.org/issues/8961: SPA load/import should tell us why it failed
https://www.illumos.org/issues/7277: zdb should be able to print zfs_dbgmsg's

To iterate on top of that, a few changes were made to make the import process more resilient and crash-free.

One of the first tasks during the pool load process is to parse a config provided from userland that describes what devices the pool is composed of. A vdev tree is generated from that config, and then all the vdevs are opened.

The Meta Object Set (MOS) of the pool is accessed, and several metadata objects that are necessary to load the pool are read. The exact configuration of the pool is also stored inside the MOS. Since the configuration provided from userland is external and might not accurately describe the vdev tree of the pool at the txg that is being loaded, it cannot be relied upon to safely operate the pool. For that reason, the configuration in the MOS is read early on. In the past, the two configurations were compared and, if there was a mismatch, the load process was aborted and an error was returned.

That check was a good way to ensure a pool does not get corrupted, but it made the pool load process needlessly fragile in cases where the vdev configuration changed or the userland configuration was outdated. Since the MOS is stored in 3 copies, the configuration provided by userland doesn't have to be perfect in order to read its contents. Hence, a new approach has been adopted: the pool is first opened with the untrusted userland configuration just so that the real configuration can be read from the MOS. The trusted MOS configuration is then used to generate a new vdev tree and the pool is re-opened.

When the pool is opened with an untrusted configuration, writes are disabled to avoid accidentally damaging it. During reads, some sanity checks are performed on block pointers to see if each DVA points to a known vdev; when the configuration is untrusted, instead of panicking the system if those checks fail, we simply avoid issuing reads to the invalid DVAs.

This new two-step pool load process now allows rewinding pools across vdev tree changes such as device replacement, addition, etc. Loading a pool from an external config file in a clustering environment also becomes much safer, since the pool will import even if the config is outdated and didn't, for instance, register a recent device addition.

With this code in place, it became relatively easy to implement a long-sought-after feature: the ability to import a pool with missing top-level (i.e. non-redundant) devices. Note that since this almost guarantees some loss of data, this feature is for now restricted to a read-only import.

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

2018-02-22 02:21:03 +00:00
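The two-step load described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual SPA code: `ToyDisk`, `LoadedPool`, `read_block` and `load_pool` are hypothetical names invented here, vdevs are plain strings, and a Python exception stands in for the kernel panic that a failed check causes under a trusted configuration.

```python
from dataclasses import dataclass


@dataclass
class ToyDisk:
    """Hypothetical stand-in for on-disk pool state (not a real ZFS structure)."""
    mos_config: list   # vdev names recorded in the MOS: the trusted config
    mos_copies: list   # vdevs holding the (up to 3) copies of the MOS


@dataclass
class LoadedPool:
    vdevs: list
    config_trusted: bool
    writable: bool


def read_block(pool, vdev_name):
    """Check that a DVA points to a known vdev before issuing the read."""
    if vdev_name in pool.vdevs:
        return f"data from {vdev_name}"
    if pool.config_trusted:
        # Stand-in for the panic on a bad DVA under a trusted config.
        raise RuntimeError(f"DVA points to unknown vdev {vdev_name}")
    # Untrusted config: do not issue the read, let the caller try another copy.
    return None


def load_pool(disk, userland_config):
    # Step 1: open with the untrusted userland config, writes disabled.
    first = LoadedPool(list(userland_config), config_trusted=False, writable=False)

    # Read the real config from the MOS; any one readable copy is enough.
    if not any(read_block(first, v) is not None for v in disk.mos_copies):
        raise OSError("no MOS copy reachable with the provided config")
    trusted_config = list(disk.mos_config)

    # Step 2: rebuild the vdev tree from the trusted config and re-open.
    return LoadedPool(trusted_config, config_trusted=True, writable=True)


if __name__ == "__main__":
    disk = ToyDisk(mos_config=["disk0", "disk1", "disk2"],
                   mos_copies=["disk0", "disk1", "disk2"])
    # The userland config is outdated: it does not know about disk2.
    print(load_pool(disk, ["disk0", "disk1"]))
```

In this model an outdated userland config that is missing a recently added device still loads, because at least one MOS copy sits on a vdev the config does know about. The read-only import with missing top-level devices is not modeled here.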
dtrace	fix up r319744, add new files	2017-06-09 15:06:50 +00:00
lockstat	Import the lockstat(1) sources from OpenSolaris as of 20080410.	2009-06-18 17:25:38 +00:00
mdb/tools/common	Vendor import of the full userland contrib part of DTrace support from	2008-04-26 00:54:52 +00:00
plockstat	Import plockstat from OpenSolaris r12768.	2010-07-19 14:57:01 +00:00
pyzfs	Update vendor/opensolaris to last OpenSolaris state (13149:b23a4dab3d50)	2012-07-18 07:48:04 +00:00
sgs	Update vendor/opensolaris to last OpenSolaris state (13149:b23a4dab3d50)	2012-07-18 07:48:04 +00:00
stat/common	Update vendor/illumos/dist and vendor-sys/illumos/dist	2013-02-11 08:07:56 +00:00
zdb	8962 zdb should work on non-idle pools	2018-02-22 00:09:15 +00:00
zfs	7614 zfs device evacuation/removal	2018-02-18 01:21:52 +00:00
zhack	6314 buffer overflow in dsl_dataset_name	2016-07-12 12:01:54 +00:00
zinject	6531 Provide mechanism to artificially limit disk performance	2016-03-08 16:11:59 +00:00
zlook	Update vendor/opensolaris to last OpenSolaris state (13149:b23a4dab3d50)	2012-07-18 07:48:04 +00:00
zpool	9075 Improve ZFS pool import/load process and corrupted pool recovery	2018-02-22 02:21:03 +00:00
zstreamdump	7252 7628 compressed zfs send / receive	2017-04-14 18:07:43 +00:00
ztest	8809 libzpool should leverage work done in libfakekernel	2018-02-21 21:04:46 +00:00