vdev_geom_close: close errored consumer even if vdev_reopening is set

If vdev_geom_close() does not close the consumer, the subsequent call
to vdev_geom_open() is just a NOP and always returns success.
Thus, at present vdev_reopen() always succeeds for vdev_geom devices
even if the underlying provider is in an error state.
The problem was introduced as a result of an optimization in rS308055.
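
The effect can be illustrated with a toy model.  This is a minimal sketch,
not the kernel code: the mock_* types, the MOCK_CF_ORPHAN value and the
function names are hypothetical stand-ins, and only the close condition in
close_new() mirrors the condition used by this change.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GEOM provider and consumer. */
struct mock_provider { int error; };
struct mock_consumer { int flags; struct mock_provider *provider; };
#define	MOCK_CF_ORPHAN	0x4	/* illustrative value, not from GEOM headers */

struct mock_vdev {
	bool reopening;
	struct mock_consumer *tsd;	/* attached consumer, NULL when closed */
};

/* Old behavior: never close the consumer while reopening. */
static void
close_old(struct mock_vdev *vd)
{
	if (vd->reopening)
		return;
	vd->tsd = NULL;
}

/* New behavior: also close an orphaned consumer or one with an errored provider. */
static void
close_new(struct mock_vdev *vd)
{
	struct mock_consumer *cp = vd->tsd;

	if (!vd->reopening ||
	    (cp != NULL && ((cp->flags & MOCK_CF_ORPHAN) != 0 ||
	    (cp->provider != NULL && cp->provider->error != 0))))
		vd->tsd = NULL;
}

/* Open is a NOP while a consumer is attached; otherwise attaching may fail. */
static int
mock_open(struct mock_vdev *vd, struct mock_consumer *candidate)
{
	if (vd->tsd != NULL)
		return (0);
	if (candidate->provider == NULL || candidate->provider->error != 0)
		return (ENXIO);
	vd->tsd = candidate;
	return (0);
}

/* Reopen is close followed by open, with the reopening flag set. */
static int
mock_reopen(struct mock_vdev *vd, struct mock_consumer *candidate,
    void (*close_fn)(struct mock_vdev *))
{
	int error;

	vd->reopening = true;
	close_fn(vd);
	error = mock_open(vd, candidate);
	vd->reopening = false;
	return (error);
}
```

In this model, mock_reopen() with close_old() returns 0 even when the
provider's error field is set, because the consumer stays attached and
mock_open() returns early.  With close_new() the consumer is dropped first,
so mock_open() has to attach again and reports the error.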

The most significant manifestation of the problem is that the
zio_vdev_io_done() --> vdev_probe() --> SPA_ASYNC_PROBE -->
spa_async_probe() --> vdev_reopen()
chain of calls and events becomes a NOP as well.
This chain is invoked when zio_vdev_io_done() detects an "unexpected"
error from the lower level I/O.
Additionally, that call path may race with the SPA_ASYNC_REMOVE path
because both are asynchronous.  As a result, SPA_ASYNC_PROBE may
erroneously mark a vdev as healthy after SPA_ASYNC_REMOVE has marked
it as removed.
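
The ordering problem can be sketched as follows.  This is a deliberately
simplified, single-threaded model: the toy_* names and states are
hypothetical, the real state handling in ZFS is more involved, and the two
reopen functions only stand in for "reopen always succeeds" and "reopen
reports the provider error".

```c
#include <errno.h>

/* Hypothetical vdev states, for illustration only. */
enum toy_state { TOY_HEALTHY, TOY_REMOVED };

typedef int (*toy_reopen_fn)(void);

/* Reopen as a NOP: succeeds regardless of the provider's state. */
static int
toy_reopen_nop(void)
{
	return (0);
}

/* Reopen that notices the errored provider. */
static int
toy_reopen_checked(void)
{
	return (ENXIO);
}

/* Stand-in for the removal path: mark the vdev as removed. */
static void
toy_async_remove(enum toy_state *st)
{
	*st = TOY_REMOVED;
}

/* Stand-in for the probe path: a successful reopen marks the vdev healthy. */
static void
toy_async_probe(enum toy_state *st, toy_reopen_fn reopen)
{
	if (reopen() == 0)
		*st = TOY_HEALTHY;
}
```

If the probe runs after the removal and uses toy_reopen_nop(), the state
ends up TOY_HEALTHY again; with toy_reopen_checked() it stays TOY_REMOVED.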

Reviewed by:	asomers, mav
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D12731
avg 2017-10-31 10:15:03 +00:00
parent 04e7093fe7
commit 29181cc4cb

@@ -934,13 +934,18 @@ vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *max_psize,
 static void
 vdev_geom_close(vdev_t *vd)
 {
+	struct g_consumer *cp;
 
-	if (vd->vdev_reopening)
-		return;
+	cp = vd->vdev_tsd;
 
 	DROP_GIANT();
 	g_topology_lock();
-	vdev_geom_close_locked(vd);
+
+	if (!vd->vdev_reopening ||
+	    (cp != NULL && ((cp->flags & G_CF_ORPHAN) != 0 ||
+	    (cp->provider != NULL && cp->provider->error != 0))))
+		vdev_geom_close_locked(vd);
+
 	g_topology_unlock();
 	PICKUP_GIANT();
 }