g_vfs_done: Report when we switch on ENXIO conversion

On the 0 -> 1 transition of sc_enxio_active, report that we are starting
to convert all errors to ENXIO. This is a rare but interesting event. Set
the field with an atomic compare-and-set to prevent a rare race:

    In CAM, when we invalidate a device, one thread (T1) starts the
    process in the error processing called from *dadone
    (cam_periph_error). This routine queues work to xpt_async_td
    (T2) and tells *dadone to call biodone(ENXIO) for the bio. T2
    wakes up and waits to acquire the periph lock, which it gets
    when T1 drops the lock just before T1's call to biodone. T2
    then calls biodone(ENXIO) on all pending bios. The two threads
    race, so without an atomic update we could lose the printf or
    get two in rare cases. Since we only touch sc_enxio_active in an
    infrequent error path, the extra atomic traffic will be rare
    but will ensure robustness.
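
The once-only report can be illustrated with a minimal userland sketch.
It assumes C11 <stdatomic.h> in place of the kernel's atomic_cmpset_int();
enxio_activate() and the device name "da0" are hypothetical and exist only
for illustration. The two calls are made sequentially so the result is
deterministic; the compare-and-swap is what would let the same guarantee
hold if T1 and T2 made them concurrently.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the softc field sc_enxio_active. */
static atomic_int enxio_active;

/*
 * Userland analog of the new error path: returns true only for the
 * caller that performs the 0 -> 1 transition, so only that caller
 * prints the report. Every later caller sees 1 and stays silent.
 */
static bool
enxio_activate(const char *name)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&enxio_active, &expected, 1)) {
		printf("g_vfs_done(): %s converting all errors to ENXIO\n",
		    name);
		return (true);
	}
	return (false);
}
```

Calling enxio_activate("da0") twice prints the message on the first call
only, whichever caller gets there first.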

Sponsored by:		Netflix
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D35037
Warner Losh 2022-04-24 13:54:20 -06:00
parent f58385f3da
commit e8827f4094

@@ -143,8 +143,11 @@ g_vfs_done(struct bio *bip)
 	cp = bip->bio_from;
 	sc = cp->geom->softc;
 	if (bip->bio_error != 0 && bip->bio_error != EOPNOTSUPP) {
-		if ((bp->b_xflags & BX_CVTENXIO) != 0)
-			sc->sc_enxio_active = 1;
+		if ((bp->b_xflags & BX_CVTENXIO) != 0) {
+			if (atomic_cmpset_int(&sc->sc_enxio_active, 0, 1))
+				printf("g_vfs_done(): %s converting all errors to ENXIO\n",
+				    bip->bio_to->name);
+		}
 		if (sc->sc_enxio_active)
 			bip->bio_error = ENXIO;
 		g_print_bio("g_vfs_done():", bip, "error = %d",