ZFS: Set spa_ccw_fail_time=0 when expanding vdev.
Accepted, Public

Authored by cperciva on Oct 14 2023, 6:42 AM.
Tags
None

Details

Reviewers
allanjude
mav
Group Reviewers
ZFS
Summary

When a vdev is to be expanded -- either via zpool online -e or via
the autoexpand option -- a SPA_ASYNC_CONFIG_UPDATE request is queued
to be handled via an asynchronous worker thread (spa_async_thread).
This normally happens almost immediately, but it will be delayed by up
to zfs_ccw_retry_interval seconds (default 5 minutes) if an attempt to
write the zpool configuration cache has failed.
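
The gating described above can be modelled roughly as follows. This is a simplified userland sketch, not the kernel code: struct toy_spa and toy_async_tasks_pending() are hypothetical stand-ins, the bit value given to SPA_ASYNC_CONFIG_UPDATE is chosen for illustration, and the current time is passed in as a parameter rather than read from a clock.

```c
#include <stdbool.h>
#include <stdint.h>

#define NANOSEC 1000000000LL
#define SPA_ASYNC_CONFIG_UPDATE 0x01	/* bit value chosen for this sketch */

typedef int64_t hrtime_t;

/* Hypothetical miniature of the pool state; not the real spa_t. */
struct toy_spa {
	unsigned	async_tasks;	/* pending SPA_ASYNC_* request bits */
	hrtime_t	ccw_fail_time;	/* last failed cache write; 0 = none */
};

static int zfs_ccw_retry_interval = 300;	/* seconds (5 minutes) */

/*
 * Return true if the worker thread has something it may act on at time
 * `now`. A config update is held back while the retry interval since
 * the last failed cache write has not yet elapsed.
 */
static bool
toy_async_tasks_pending(const struct toy_spa *spa, hrtime_t now)
{
	unsigned other = spa->async_tasks & ~SPA_ASYNC_CONFIG_UPDATE;
	unsigned config = spa->async_tasks & SPA_ASYNC_CONFIG_UPDATE;
	bool suspended = false;

	if (spa->ccw_fail_time != 0) {
		suspended = (now - spa->ccw_fail_time) <
		    (hrtime_t)zfs_ccw_retry_interval * NANOSEC;
	}
	return (other != 0 || (config != 0 && !suspended));
}
```

In this model, a pending config update is the only request type that the failure timestamp can hold back; any other pending bit still wakes the worker.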

When FreeBSD boots ZFS-root VM images generated using makefs -t zfs,
the zpoolupgrade rc.d script runs zpool upgrade, which modifies the
pool configuration and triggers an attempt to write to the cache file.
This attempted write fails because the filesystem is still mounted
read-only at this point in the boot process, triggering a 5-minute
cooldown before SPA_ASYNC_CONFIG_UPDATE requests will be handled by
the asynchronous worker thread.

When expanding a vdev, reset the "when did a configuration cache
write last fail" value so that the SPA_ASYNC_CONFIG_UPDATE request
will be handled promptly. A cleaner but more intrusive option would
be to use separate SPA_ASYNC_ flags for "configuration changed" and
"try writing the configuration cache again", but with FreeBSD 14.0
coming very soon I'd prefer to leave such refactoring for a later
date.
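
As an illustration of the idea (not the committed diff), the reset could be modelled like this. Only spa_ccw_fail_time, SPA_ASYNC_CONFIG_UPDATE and zfs_ccw_retry_interval come from the summary; struct toy_spa, toy_request_expand() and toy_config_update_runnable() are hypothetical names, and the flag's bit value is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define NANOSEC 1000000000LL
#define SPA_ASYNC_CONFIG_UPDATE 0x01	/* bit value assumed for this sketch */

typedef int64_t hrtime_t;

/* Hypothetical miniature of the pool state; the real spa_t is far larger. */
struct toy_spa {
	unsigned	spa_async_tasks;
	hrtime_t	spa_ccw_fail_time;	/* 0 means "no failure recorded" */
};

static const int zfs_ccw_retry_interval = 300;	/* seconds */

/* Hypothetical helper: queue the config update for a vdev expansion. */
static void
toy_request_expand(struct toy_spa *spa)
{
	/* Forget the last cache-write failure so the request is not delayed. */
	spa->spa_ccw_fail_time = 0;
	spa->spa_async_tasks |= SPA_ASYNC_CONFIG_UPDATE;
}

/* Hypothetical helper: may the worker handle the config update at `now`? */
static bool
toy_config_update_runnable(const struct toy_spa *spa, hrtime_t now)
{
	if ((spa->spa_async_tasks & SPA_ASYNC_CONFIG_UPDATE) == 0)
		return (false);
	return (spa->spa_ccw_fail_time == 0 ||
	    now - spa->spa_ccw_fail_time >=
	    (hrtime_t)zfs_ccw_retry_interval * NANOSEC);
}
```

The trade-off is visible in the sketch: clearing the timestamp also drops the cooldown for the cache-file retry itself, which is why separate flags would be the cleaner design.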

releng/14.0 candidate.

MFC after: 3 days

Diff Detail

Repository
rG FreeBSD src repository
Lint
No Lint Coverage
Unit
No Test Coverage
Build Status
Buildable 53987
Build 50877: arc lint + arc unit

Event Timeline

allanjude added a subscriber: allanjude.

Reviewed-by: allanjude

If we are happy with this fix, we should open a pull request for it upstream as well. I think it is reasonable to reset the retry timer, especially since this is either a user-initiated action or provoked by a device change event from the system.

This revision is now accepted and ready to land. Oct 14 2023, 2:28 PM

@allanjude Right, I wasn't sure what the usual process was with OpenZFS. Should I commit this to FreeBSD first and then open a PR at https://github.com/openzfs/zfs/ ?

@cperciva Please open a PR on the OpenZFS GitHub first, so that more people can review it. After it is merged there, it will be merged into FreeBSD.

Since nobody seems to be responding to the github PR, should I just commit this to FreeBSD? We need it in ASAP for the release...

I have no objections to committing this directly for the release. For upstream, though, I would commit a bigger patch.