Differential D20319

stand/zfs: don't fail boot if the first label is broken
Abandoned · Public

Authored by p.bruenn_beckhoff.com on May 19 2019, 10:58 PM.

Details

Reviewers

allanjude
tsoome
kevans
imp

Summary

Booting from a vdev with a broken first label will fail, because
vdev_probe() never tries to read the next label. This revision
retries vdev_probe() until a good label is found or all labels have
turned out to be broken.

Diff Detail

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 24326
Build 23149: arc lint + arc unit

Event Timeline

p.bruenn_beckhoff.com created this revision. May 19 2019, 10:58 PM

Herald added a subscriber: delphij. May 19 2019, 10:58 PM

Harbormaster completed remote builds in B24326: Diff 57572. May 19 2019, 10:58 PM

p.bruenn_beckhoff.com added a reviewer: allanjude. May 19 2019, 11:06 PM

allanjude added reviewers: tsoome, kevans, imp. Jun 2 2019, 6:39 PM

Something is off there. The for loop in vdev_probe() does continue in case of failures, but is intended to walk through all 4 labels, so why do we fail?

Yes, but not for all failures. If a label is found in the loop but the associated data is later found to be invalid, vdev_probe() will fail even if there is another label with better data.
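
The failure mode described here can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual zfsimpl.c code: struct model_label and probe_first_readable() are hypothetical names invented for the illustration, and the two booleans stand in for "the label could be read" and "the data it refers to passed the later checks".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VDEV_LABELS 4	/* ZFS keeps four label copies per vdev */

/* Hypothetical stand-in for one on-disk label (not a real libsa type). */
struct model_label {
	bool readable;		/* label could be read and unpacked */
	bool data_valid;	/* associated data passes the later checks */
};

/*
 * Model of the problem: the loop skips unreadable labels, but once a
 * readable one is found, a later validation failure ends the probe
 * without trying the remaining labels.
 * Returns the index of the label used, or -1 on failure.
 */
static int
probe_first_readable(const struct model_label *labels)
{
	int i;

	assert(labels != NULL);
	for (i = 0; i < VDEV_LABELS; i++) {
		if (labels[i].readable)
			break;		/* unreadable labels are skipped */
	}
	if (i == VDEV_LABELS)
		return (-1);		/* no readable label at all */
	if (!labels[i].data_valid)
		return (-1);		/* later failure: other labels never tried */
	return (i);
}
```

In this model, a disk whose label 0 is readable but carries invalid data makes the probe fail even when labels 1 to 3 are fine, whereas a disk whose label 0 is simply unreadable falls through to label 1.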

In D20319#442667, @p.bruenn_beckhoff.com wrote:

Yes, but not for all failures. If a label is found in the loop but the associated data is later found to be invalid, vdev_probe() will fail even if there is another label with better data.

Right, I see. I suggest reworking this solution even further, into something like:

vdev_probe()
{
	for (i = 0; i < VDEV_LABELS; i++) {
		vdev_read_label(i, &ub, &nvl);
		check if the received ub and nvl are better than the current ones.
	}
	if we have no good ub and nvl
		fail.
}

And vdev_read_label() should read the specific label, perform the initial tests, and return good data.

This way we do not have to play games with those fancy constants or with inserting/deleting spa and vdev data.
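
As a rough, self-contained sketch of the loop proposed above (not the code that was eventually committed): every label is read, and the best good one is kept. All names here are hypothetical; model_read_label() stands in for the suggested vdev_read_label(), and "better" is assumed to mean a higher uberblock txg, which the comment above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VDEV_LABELS 4	/* ZFS keeps four label copies per vdev */

/* Hypothetical stand-in for the (ub, nvl) pair returned per label. */
struct model_label {
	bool     valid;	/* label passed all initial tests */
	uint64_t txg;	/* txg of its best uberblock (assumed ranking key) */
};

/* Hypothetical reader: returns 0 and fills *out only for a good label. */
static int
model_read_label(const struct model_label *disk, int i,
    struct model_label *out)
{
	if (!disk[i].valid)
		return (-1);
	*out = disk[i];
	return (0);
}

/*
 * Walk all labels, remember the best good one, and fail only if none
 * is good. Returns the index of the chosen label, or -1.
 */
static int
model_probe(const struct model_label *disk)
{
	struct model_label cur, best = { false, 0 };
	int i, best_idx = -1;

	assert(disk != NULL);
	for (i = 0; i < VDEV_LABELS; i++) {
		if (model_read_label(disk, i, &cur) != 0)
			continue;	/* broken label: try the next one */
		if (best_idx == -1 || cur.txg > best.txg) {
			best = cur;
			best_idx = i;
		}
	}
	return (best_idx);
}
```

Because nothing is registered until the loop has finished, a broken label costs only a skipped iteration in this model. Whether the real spa/vdev setup can be deferred the same way is exactly the open question raised in the replies.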

Agreed, a sane vdev_probe() would be nice. But I have absolutely no idea how you want to avoid cleaning up partially initialized spa/vdev objects without rewriting the whole file. Either some of the current error paths are dead code, or things like vdev_init_from_nvlist() have to be replaced.

In D20319#443365, @p.bruenn_beckhoff.com wrote:

Agreed, a sane vdev_probe() would be nice. But I have absolutely no idea how you want to avoid cleaning up partially initialized spa/vdev objects without rewriting the whole file. Either some of the current error paths are dead code, or things like vdev_init_from_nvlist() have to be replaced.

I do not think that rewriting the whole file is needed :) And true, we will probably need to release some configuration during the process.

This change was merged as commit 5f37265777ec9475a5382654cbf4ae27d927c41a

Revision Contents

Path: stand/libsa/zfs/zfsimpl.c
Size: 96 lines

Diff 57572
