HomeFreeBSD

Initialize metaslab range trees in metaslab_init

Description

Initialize metaslab range trees in metaslab_init

Motivation

We've noticed several zloop crashes within Delphix generated
due to the following sequence of events:

  • A device gets expanded and new metaslabas are allocated for it. These metaslabs go through metaslab_init() but haven't gone through metaslab_sync_done() yet. This meas that the only range tree that's actually set is the ms_allocatable. All the others are NULL.
  • A vdev_initialization is issues and vdev_initialize_thread starts processing one of these new metaslabs of the expanded vdev.
  • As part of vdev_initialize_calculate_progress() we call into metaslab_load() and metaslab_load_impl() which in turn tries to dereference the metaslabs trees that are still NULL and therefore we crash.

The same failure can come up from the vdev_trim code paths.

This Patch

We considered the following solutions to deal with this issue:

  • Add logic to vdev_initialize/trim to skip those new metaslabs. We decided against this as it would be good to avoid exposing this lower-level detail to higer-level operations.
  • Have metaslab_load_impl() return early for new metaslabs and thus never touch those range_trees that are NULL at that time. This seemed more of a work-around for the bug and not a clear-cut solution.
  • Refactor our logic so all metaslabs have their range_trees created at the time of their creatin in metaslab_init().

In this patch we decided to go with [C] because:

(1) It doesn't expose more metaslab details to higher level

    operations such as vdev initialize and trim.

(2) The current behavior of creating the range trees lazily

    in `metaslab_sync_done()` is unnecessarily complicated.

(3) Always initializing the metaslab range_trees makes other

    parts of the codebase cleaner. For example, we used to
    use `ms_freed` as the reference value for knowing whether
    all the range_trees have been initialized. Now we no
    longer need to do that check in most places (and in the
    few that we do we use the `ms_new` boolean field now
    which is more readable).

Side Changes

Probably due to a mismerge we set ms_loaded to B_TRUE twice
in metasloab_load_impl(). In this patch we remove the extraneous
assignment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #11737

Details

Provenance
Serapheim Dimitropoulos <serapheim@delphix.com>Authored on Mar 20 2021, 5:36 AM
GitHub <noreply@github.com>Committed on Mar 20 2021, 5:36 AM
Parents
rGffd6978ef59c: Linux 5.12 update: bio_max_segs() replaces BIO_MAX_PAGES
Branches
Unknown
Tags
Unknown

Event Timeline

GitHub <noreply@github.com> committed rG793c958f6f16: Initialize metaslab range trees in metaslab_init (authored by Serapheim Dimitropoulos <serapheim@delphix.com>).Mar 20 2021, 5:36 AM