MFV r329715: 8997 ztest assertion failure in zil_lwb_write_issue

Description

illumos/illumos-gate@f864f99efe57685e1762590c1a880dd16bca6da9
https://github.com/illumos/illumos-gate/commit/f864f99efe57685e1762590c1a880dd16bca6da9

https://www.illumos.org/issues/8997

When dmu_tx_assign is called from zil_lwb_write_issue, it's possible
for either ERESTART or EIO to be returned.

If ERESTART is returned, this will cause an assertion to fail directly
in zil_lwb_write_issue, where the code assumes the return value is
EIO if dmu_tx_assign returns a non-zero value. This can occur if the
SPA is suspended when dmu_tx_assign is called, and most often occurs
when running zloop.

If EIO is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, zil_commit_waiter_timeout contains the
following logic:
  lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
  ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);
In this case, if dmu_tx_assign returned EIO from within
zil_lwb_write_issue, the lwb variable passed in will not be issued
to disk. Thus, its lwb_state field will remain LWB_STATE_OPENED and
this assertion will fail. zil_commit_waiter_timeout assumes that after
it calls zil_lwb_write_issue, the lwb will be issued to disk, and
doesn't handle the case where this is not true; that is, it doesn't
handle the case where dmu_tx_assign returns EIO.
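
The two failure modes described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every
sim_* name, the numeric error values, and the two-state lwb are
hypothetical stand-ins, not the real ZFS definitions, and the model
says nothing about how the actual fix is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the error codes; values are illustrative. */
enum sim_errno { SIM_OK = 0, SIM_EIO = 5, SIM_ERESTART = 85 };

/* Hypothetical, reduced lwb state machine (the real one has more states). */
enum sim_lwb_state { SIM_LWB_OPENED, SIM_LWB_ISSUED };

struct sim_lwb {
	enum sim_lwb_state state;
};

/*
 * Models the assumption inside zil_lwb_write_issue: a non-zero result
 * from dmu_tx_assign is expected to be EIO. Returns true when that
 * assumption would hold for the given result.
 */
static bool
sim_issue_assertion_holds(int assign_error)
{
	return (assign_error == SIM_OK || assign_error == SIM_EIO);
}

/*
 * Simplified model of issuing an lwb: the lwb only leaves the OPENED
 * state when the (simulated) dmu_tx_assign call succeeded.
 */
static void
sim_lwb_write_issue(struct sim_lwb *lwb, int assign_error)
{
	if (assign_error == SIM_OK)
		lwb->state = SIM_LWB_ISSUED;
}

/*
 * Models the check in zil_commit_waiter_timeout: after the issue call,
 * the lwb is assumed to no longer be in the OPENED state. Returns true
 * when that assumption would hold.
 */
static bool
sim_waiter_assertion_holds(int assign_error)
{
	struct sim_lwb lwb = { SIM_LWB_OPENED };

	sim_lwb_write_issue(&lwb, assign_error);
	return (lwb.state != SIM_LWB_OPENED);
}
```

In this model, sim_issue_assertion_holds(SIM_ERESTART) is false, which
corresponds to the first failure, and sim_waiter_assertion_holds(SIM_EIO)
is false, which corresponds to the second.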

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
MFC after: 3 weeks

Details

Provenance
avg authored
Parents
rS329716: lualoader: Use the key that interrupts autoboot as a menu choice