libsa/zfs: refactor vdev tree for better resiliency against stale disks

Description

Before this change, vdev_insert() avoided inserting a duplicate vdev into
the list of children. However, the duplicate, although unlinked from its
parent, remained on the global list with an initialized v_guid. Such a
leaked duplicate could later be returned by vdev_find(). After
6dd0803ffd31, a leaked vdev may already be freed or may point to a freed
parent, which leads to a loader crash. Note that the leak itself predates
6dd0803ffd31.

First, in vdev_insert(), free the conflicting vdev and return the existing
one. Update callers accordingly. Only one caller can actually encounter
this condition.
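
A minimal sketch of the first step, not the actual libsa code. The struct
layout, the vdev_alloc() helper, and the use of v_id as the duplicate
criterion are assumptions made for illustration; only vdev_insert() and
v_guid are names taken from this change.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified vdev; the real structure has more fields. */
struct vdev {
	uint64_t	 v_guid;
	int		 v_id;		/* assumed: slot index under the parent */
	struct vdev	*v_parent;
	struct vdev	*v_children;	/* head of the child list */
	struct vdev	*v_next;	/* next sibling */
};

/* Hypothetical helper: allocate a zeroed vdev with the given identity. */
static struct vdev *
vdev_alloc(uint64_t guid, int id)
{
	struct vdev *v;

	v = calloc(1, sizeof(*v));
	if (v != NULL) {
		v->v_guid = guid;
		v->v_id = id;
	}
	return (v);
}

/*
 * Link 'vdev' under 'parent'.  If the parent already has a child in the
 * same slot, free the newcomer and return the existing child, so that no
 * unlinked vdev survives the call.  Callers must use the return value.
 */
static struct vdev *
vdev_insert(struct vdev *parent, struct vdev *vdev)
{
	struct vdev *v;

	assert(parent != NULL && vdev != NULL);
	for (v = parent->v_children; v != NULL; v = v->v_next) {
		if (v->v_id == vdev->v_id) {
			free(vdev);	/* drop the conflicting duplicate */
			return (v);
		}
	}
	vdev->v_parent = parent;
	vdev->v_next = parent->v_children;
	parent->v_children = vdev;
	return (vdev);
}
```

In this sketch a caller would write vdev = vdev_insert(parent, vdev) and
continue with the returned pointer, because the pointer it passed in may
have been freed.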

Second, eliminate the global list of vdevs and make vdev_find() work
recursively on a tree that the caller must provide. Admittedly, the chance
of a GUID collision between members of different pools is extremely low.
The main motivation is to increase code robustness, to fully isolate the
data structures of different pools being tasted by the loader, and to make
bugs like the one fixed here easier to debug.
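
A minimal sketch of the second step, again not the actual libsa code. The
struct layout and the exact signature are assumptions; only vdev_find()
and v_guid are names taken from this change. The search is confined to
the tree the caller passes in, so a vdev belonging to another pool can
never be returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified vdev; the real structure has more fields. */
struct vdev {
	uint64_t	 v_guid;
	struct vdev	*v_children;	/* head of the child list */
	struct vdev	*v_next;	/* next sibling */
};

/*
 * Depth-first search for 'guid' in the tree rooted at 'top'.  There is
 * no global list to consult: the caller must supply the tree.
 */
static struct vdev *
vdev_find(struct vdev *top, uint64_t guid)
{
	struct vdev *v, *found;

	assert(top != NULL);
	if (top->v_guid == guid)
		return (top);
	for (v = top->v_children; v != NULL; v = v->v_next) {
		found = vdev_find(v, guid);
		if (found != NULL)
			return (found);
	}
	return (NULL);
}
```

With one such tree per pool, a lookup made against one pool's root cannot
see vdevs that belong to another pool, even if their GUIDs collide.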

Reviewed by: mav, imp
Differential Revision: https://reviews.freebsd.org/D51912
Fixes: 6dd0803ffd31c60a84488d06928813353c6303d3

Details

Provenance
glebius authored on Aug 20 2025, 2:51 PM
Reviewer
mav
Differential Revision
D51912: libsa/zfs: refactor vdev tree for better resiliency against stale disks
Parents
rG8ef5016f73b9: libsa/zfs: simplify vdev_find_previous()