754d27df02
illumos/illumos-gate@f864f99efe
https://www.illumos.org/issues/8997
When dmu_tx_assign is called from zil_lwb_write_issue, it's possible
for either ERESTART or EIO to be returned.
If ERESTART is returned, this will cause an assertion to fail directly
in zil_lwb_write_issue, where the code assumes the return value is
EIO if dmu_tx_assign returns a non-zero value. This can occur if the
SPA is suspended when dmu_tx_assign is called, and most often occurs
when running zloop.
If EIO is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, zil_commit_waiter_timeout contains the
following logic:
    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);
In this case, if dmu_tx_assign returned EIO from within
zil_lwb_write_issue, the lwb variable passed in will not be issued
to disk. Thus, its lwb_state field will remain LWB_STATE_OPENED and
this assertion will fail. zil_commit_waiter_timeout assumes that after
it calls zil_lwb_write_issue, the lwb will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where dmu_tx_assign returns EIO.
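
The two failure modes described above can be sketched with a small
stand-alone model. This is an illustration only, not the ZFS code: the
types, the MODEL_ERESTART constant, and the functions
issue_path_assertion_holds, model_lwb_write_issue and
caller_assertion_holds are hypothetical stand-ins, and the model assumes
that a failed transaction assignment leaves the lwb untouched and
returns NULL.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in; the value is arbitrary and only needs to differ
 * from 0 and EIO. It is not the kernel's ERESTART definition. */
enum { MODEL_ERESTART = 85 };

/* Simplified stand-ins for the lwb state and structure. */
typedef enum {
	MODEL_LWB_CLOSED,
	MODEL_LWB_OPENED,
	MODEL_LWB_ISSUED
} model_lwb_state_t;

typedef struct {
	model_lwb_state_t lwb_state;
} model_lwb_t;

/*
 * Models the assumption made in the issue path: a non-zero result from
 * the transaction assignment is expected to be EIO and nothing else.
 * Returns 1 if that assumption holds for the given error, 0 otherwise.
 */
static int
issue_path_assertion_holds(int error)
{
	return (error == 0 || error == EIO);
}

/*
 * Models the issue path: if the (injected) assignment error is non-zero,
 * the lwb is not issued and keeps its OPENED state; otherwise it is
 * marked ISSUED and the next lwb is opened and returned.
 */
static model_lwb_t *
model_lwb_write_issue(model_lwb_t *lwb, model_lwb_t *next, int assign_error)
{
	if (assign_error != 0)
		return (NULL);	/* lwb->lwb_state is still OPENED */

	lwb->lwb_state = MODEL_LWB_ISSUED;
	next->lwb_state = MODEL_LWB_OPENED;
	return (next);
}

/*
 * Models the caller: it expects the lwb to have left the OPENED state
 * after the issue call. Returns 1 if that expectation holds, 0 otherwise.
 */
static int
caller_assertion_holds(int assign_error)
{
	model_lwb_t lwb = { MODEL_LWB_OPENED };
	model_lwb_t next = { MODEL_LWB_CLOSED };

	(void) model_lwb_write_issue(&lwb, &next, assign_error);
	return (lwb.lwb_state != MODEL_LWB_OPENED);
}
```

In this model, an injected MODEL_ERESTART violates the issue-path
assumption, and an injected EIO satisfies it but violates the caller's
expectation about lwb_state, which mirrors the two cases above.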
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
MFC after: 3 weeks