Improve resilver ETAs

Description

When resilvering, the estimated time remaining is calculated using
the average issue rate over the current pass, where the current
pass starts when a scan was started, or restarted if the pool
was exported/imported.

For dRAID pools in particular, this can result in wildly optimistic
estimates, since the issue rate is very high while non-degraded
regions of the pool are being scanned. Once repair I/O starts being
issued, performance drops to a realistic number, but the estimated
time remaining is still significantly skewed by the earlier rate.

To address this, we redefine a pass so that it starts after a
scanning phase completes, making the issue rate more reflective of
recent performance. Additionally, the zfs_scan_report_txgs
module option can be set to reset the pass statistics more often.
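
The effect of redefining the pass can be illustrated with a minimal
sketch. This is a simplified model, not the actual ZFS code: the
function name, its parameters, and all of the numbers are
hypothetical, chosen only to show how an average taken over the
whole scan inflates the rate compared with an average taken since
the scanning phase completed.

```python
# Simplified model of the ETA calculation described above.
# NOT the actual ZFS implementation; names and numbers are hypothetical.

def eta_seconds(total, issued, pass_issued, pass_elapsed):
    """Remaining work divided by the average issue rate over the pass."""
    rate = pass_issued / pass_elapsed
    return (total - issued) / rate

TOTAL = 1000.0  # GiB to process in this hypothetical resilver

# Phase 1: 800 GiB of non-degraded regions processed in 100 s.
# Phase 2: 100 GiB of repair I/O issued over the next 100 s.
issued = 800.0 + 100.0

# Pass covers the whole scan: the fast phase 1 inflates the average rate.
old_eta = eta_seconds(TOTAL, issued, pass_issued=900.0, pass_elapsed=200.0)

# Pass restarts after phase 1: the rate reflects only recent repair I/O.
new_eta = eta_seconds(TOTAL, issued, pass_issued=100.0, pass_elapsed=100.0)

print(f"whole-scan pass ETA: {old_eta:.1f} s")
print(f"post-scan pass ETA:  {new_eta:.1f} s")
```

With these made-up figures the whole-scan average predicts roughly
22 seconds remaining, while the pass that starts after the scanning
phase predicts 100 seconds, which matches the 1 GiB/s repair rate
assumed in the model.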

Reviewed-by: Akash B <akash-b@hpe.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14410

Details

Provenance
Brian Behlendorf <behlendorf1@llnl.gov> Authored on Jan 25 2023, 7:28 PM
Parents
rGa68dfdb88c88: Fix "Detach spare vdev in case if resilvering does not happen"

Event Timeline

Brian Behlendorf <behlendorf1@llnl.gov> committed rG9fe3da9364fe: Improve resilver ETAs (authored by Brian Behlendorf <behlendorf1@llnl.gov>). Apr 24 2023, 7:55 PM