sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c

When a da or ada device disappears, outstanding IOs fail with
	ENXIO, not EIO.  The check for EIO was probably copied from Illumos,
	where that is indeed the correct errno.

	Without this change, pulling a busy drive from a zpool would usually
	turn it into UNAVAIL, even though pulling an idle drive would turn
	it into REMOVED.  With this change, it is REMOVED every time.

	Also, vdev_geom_io_intr shouldn't do zfs_post_remove, because that
	results in devd getting two resource.fs.zfs.removed events.  The
	comment said that the event had to be sent directly instead of
	through the async removal thread because "the DE engine is using
	this information to discard previous I/O errors".  However, the fact
	that vdev_geom_io_intr was never actually sending the events until
	now, and that vdev_geom_orphan never sent them at all, and that
	vdev_geom_orphan usually gets called about 2 seconds after the
	actual removal, means that FreeBSD's userland can cope with a late
	event just fine.
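
	The check after this change can be pictured with a small standalone
	model.  This is only a sketch, not the kernel code: struct model_vdev
	and model_io_error() are hypothetical names invented for illustration,
	and the counter stands in for spa_async_request(..., SPA_ASYNC_REMOVE).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vdev state the handler touches. */
struct model_vdev {
	bool	remove_wanted;	/* models vd->vdev_remove_wanted */
	int	async_requests;	/* times an async removal was requested */
};

/*
 * Model of the post-change check: only ENXIO combined with a provider
 * error marks the vdev for removal, and the removal is left entirely
 * to the async path (no event is posted from the handler itself).
 * Returns true if a removal was requested by this call.
 */
static bool
model_io_error(struct model_vdev *vd, int io_error, int provider_error)
{
	if (io_error != ENXIO || vd->remove_wanted)
		return (false);
	if (provider_error == 0)
		return (false);
	vd->remove_wanted = true;
	vd->async_requests++;
	return (true);
}
```

	In this model an EIO failure never requests removal, the first ENXIO
	failure on a provider with its error set requests it once, and later
	failures on the same vdev request nothing further.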

Approved by:	ken (mentor)
Sponsored by:	Spectra Logic Corporation
MFC after:	4 weeks
Alan Somers 2013-12-12 00:27:22 +00:00
parent 1cf78c85c5
commit cd730bd6b2

@@ -770,20 +770,12 @@ vdev_geom_io_intr(struct bio *bp)
 		 */
 		vd->vdev_notrim = B_TRUE;
 	}
-	if (zio->io_error == EIO && !vd->vdev_remove_wanted) {
+	if (zio->io_error == ENXIO && !vd->vdev_remove_wanted) {
 		/*
 		 * If provider's error is set we assume it is being
 		 * removed.
 		 */
 		if (bp->bio_to->error != 0) {
-			/*
-			 * We post the resource as soon as possible, instead of
-			 * when the async removal actually happens, because the
-			 * DE is using this information to discard previous I/O
-			 * errors.
-			 */
-			/* XXX: zfs_post_remove() can sleep. */
-			zfs_post_remove(zio->io_spa, vd);
 			vd->vdev_remove_wanted = B_TRUE;
 			spa_async_request(zio->io_spa, SPA_ASYNC_REMOVE);
 		} else if (!vd->vdev_delayed_close) {