gmirror: treat ENXIO as disk disconnect, not media error

Description

In theory, any data access error means, at worst, that a member is out
of sync. In practice such errors were treated as more serious to avoid the
situation where a flaky disk is repeatedly disconnected, re-synchronized,
reconnected and then disconnected again.

ENXIO is a special error that means that the member disk disappeared,
so it should get the same handling as the GEOM orphaning event.
There is a better chance that when the disk is reconnected, it will be
a good member again.
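
As a rough illustration of the distinction drawn above, the sketch below
classifies an I/O error code. It is not the gmirror code: the enum and the
function classify_io_error() are hypothetical names made up for this
example, and only ENXIO itself comes from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome categories, for illustration only. */
enum member_fault {
	MEMBER_OK,		/* no error reported */
	MEMBER_DISCONNECTED,	/* disk disappeared: handle like an orphan event */
	MEMBER_MEDIA_ERROR	/* any other error: keep the stricter handling */
};

/* Map an errno-style I/O result to one of the categories above. */
static enum member_fault
classify_io_error(int error)
{
	assert(error >= 0);	/* errno values are non-negative */
	if (error == 0)
		return (MEMBER_OK);
	if (error == ENXIO)
		return (MEMBER_DISCONNECTED);
	return (MEMBER_MEDIA_ERROR);
}
```

In this model EIO still lands in the stricter category, while ENXIO is
routed to the same path as a disconnect.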

When ENXIO happens on a read we use the existing G_MIRROR_BUMP_SYNCID
mechanism, which means that the mirror's syncid is increased as soon
as there is a write to the mirror. That's because no data has gone out
of sync yet, but the problematic member is disconnected, so a future
write will make it stale.

When ENXIO happens on a write we use a new G_MIRROR_BUMP_SYNCID_NOW
mechanism, which means that we update the mirror metadata as soon as
possible because the problematic member is already behind.
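
The difference between the deferred and the immediate syncid bump can be
modelled in user space roughly as follows. This is a simplified sketch, not
the kernel implementation: struct mirror_model, the model_* functions and
the MODEL_* flag values are assumptions invented for the example, standing
in for G_MIRROR_BUMP_SYNCID and G_MIRROR_BUMP_SYNCID_NOW.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values; the real definitions may differ. */
#define	MODEL_BUMP_SYNCID	0x1	/* bump on the next write */
#define	MODEL_BUMP_SYNCID_NOW	0x2	/* bump as soon as possible */
#define	MODEL_BUMP_ANY		(MODEL_BUMP_SYNCID | MODEL_BUMP_SYNCID_NOW)

/* Hypothetical, minimal stand-in for the mirror state. */
struct mirror_model {
	unsigned int syncid;
	unsigned int bump_flags;
};

/* Record an I/O error; returns true only if it was ENXIO. */
static bool
model_io_error(struct mirror_model *m, int error, bool is_write)
{
	assert(m != NULL);
	if (error != ENXIO)
		return (false);
	m->bump_flags |= is_write ? MODEL_BUMP_SYNCID_NOW : MODEL_BUMP_SYNCID;
	return (true);
}

/* Next chance to update metadata: only the "now" request is honoured. */
static void
model_update_metadata(struct mirror_model *m)
{
	if (m->bump_flags & MODEL_BUMP_SYNCID_NOW) {
		m->syncid++;
		m->bump_flags &= ~MODEL_BUMP_ANY;
	}
}

/* A write is about to start: any pending bump request is honoured. */
static void
model_write_start(struct mirror_model *m)
{
	if (m->bump_flags & MODEL_BUMP_ANY) {
		m->syncid++;
		m->bump_flags &= ~MODEL_BUMP_ANY;
	}
}
```

In this model a failed read leaves the syncid untouched until
model_write_start() runs, whereas a failed write causes the bump at the next
model_update_metadata() call, with no further write required.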

Reviewed by: markj, imp
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D9463

Details

Provenance
Author
avg
Reviewer
markj
Differential Revision
D9463: gmirror: treat ENXIO as disk disconnect, not media error
Parents
rS323611: fastmatch.h: remove duplicate #defines