Don't deliver gmirror BIOs during provider teardown
Closed, Public

Authored by markj on Jun 21 2016, 6:47 PM.

Details

Reviewers

ngie
glebius
imp
mav

Commits

rS302091: Do not complete pending gmirror BIOs when tearing down the provider.

Summary

Delivering pending BIOs during teardown results in lock recursion if any
regular or sync BIOs are on sc_queue when the provider is destroyed, for
example when the last component of a mirror fails. g_mirror_done() and
g_mirror_sync_done() both acquire the queue lock, but we're already
holding it in order to drain the queue. Instead, just free the BIOs.
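
The recursion described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the committed change: `model_bio`, `model_softc`, `model_done()` and `model_drain()` are hypothetical stand-ins for the GEOM types and handlers, and the non-recursive queue lock is modeled as a flag that counts re-acquisition attempts instead of panicking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct bio and the gmirror softc. */
struct model_bio {
	struct model_bio *next;
};

struct model_softc {
	struct model_bio *head;	/* models sc_queue */
	bool queue_locked;	/* models a non-recursive queue lock */
	int recursions;		/* re-acquisition attempts seen so far */
};

/* Returns false, and records the attempt, if the lock is already held. */
static bool
model_queue_lock(struct model_softc *sc)
{
	if (sc->queue_locked) {
		sc->recursions++;
		return (false);
	}
	sc->queue_locked = true;
	return (true);
}

static void
model_queue_unlock(struct model_softc *sc)
{
	assert(sc->queue_locked);
	sc->queue_locked = false;
}

/* Models a completion handler that takes the queue lock to requeue the BIO. */
static bool
model_done(struct model_softc *sc, struct model_bio *bp)
{
	if (!model_queue_lock(sc))
		return (false);		/* recursion: caller still owns bp */
	bp->next = sc->head;
	sc->head = bp;
	model_queue_unlock(sc);
	return (true);
}

/* Allocates one BIO and places it on the queue; returns false on failure. */
static bool
model_enqueue(struct model_softc *sc)
{
	struct model_bio *bp = calloc(1, sizeof(*bp));

	if (bp == NULL)
		return (false);
	if (!model_done(sc, bp)) {
		free(bp);
		return (false);
	}
	return (true);
}

/*
 * Drains the queue with the lock held. With deliver set, each BIO is first
 * handed to the completion handler, which tries to take the same lock.
 * Either way the BIO is freed here. Returns the number of BIOs freed.
 */
static int
model_drain(struct model_softc *sc, bool deliver)
{
	struct model_bio *bp;
	int freed = 0;

	if (!model_queue_lock(sc))
		return (0);
	while ((bp = sc->head) != NULL) {
		sc->head = bp->next;
		if (deliver)
			(void)model_done(sc, bp);
		free(bp);
		freed++;
	}
	model_queue_unlock(sc);
	return (freed);
}
```

In this model, calling `model_drain(sc, true)` records one recursion attempt per queued BIO, while `model_drain(sc, false)` empties the queue without ever re-entering the lock.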

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj updated this revision to Diff 17737. Jun 21 2016, 6:47 PM

markj retitled this revision to Don't deliver gmirror BIOs during provider teardown.

markj edited the test plan for this revision.

markj updated this object.

Herald added a subscriber: imp. Jun 21 2016, 6:47 PM

markj added reviewers: mav, glebius. Jun 21 2016, 6:48 PM

imp accepted this revision. Jun 21 2016, 6:51 PM

imp added a reviewer: imp.

imp added inline comments.

sys/geom/mirror/g_mirror.c
2124 ↗	(On Diff #17737)	This warrants a comment about what you're doing. Why does that test make sense?

This revision is now accepted and ready to land. Jun 21 2016, 6:51 PM

Add a comment explaining the check.

This revision now requires review to proceed. Jun 21 2016, 8:04 PM

markj marked an inline comment as done. Jun 21 2016, 8:05 PM

markj added inline comments.

sys/geom/mirror/g_mirror.c
2124 ↗	(On Diff #17737)	Added, thanks.

Perfect. I always struggle between too much and too little information, and this strikes a good balance. Thanks.

This revision is now accepted and ready to land. Jun 21 2016, 8:15 PM

ngie accepted this revision. Jun 21 2016, 8:22 PM

ngie added a reviewer: ngie.

I don't remember gmirror well enough to properly review this, sorry. But speaking of lock recursion, it could probably be solved in a different way: grab the queue contents under the lock and then deliver errors after the lock is dropped. I don't much like the magic with the SYNC flag and free() in the proposed change; it looks dirty.

In D6908#145071, @mav wrote:

I don't remember gmirror well enough to properly review this, sorry. But speaking of lock recursion, it could probably be solved in a different way: grab the queue contents under the lock and then deliver errors after the lock is dropped. I don't much like the magic with the SYNC flag and free() in the proposed change; it looks dirty.

That would fix the lock recursion but still be undesirable: the bio_done handlers for the mirror BIOs just enqueue the BIO again, so we end up inserting into the queue we're trying to drain. And at this point, the mirror worker does not run the queue again, so these requeued BIOs would be leaked. I could modify the bio_done handlers in question to check for a flag in the mirror softc, but if they are not executed directly, the GEOM up thread will race with a free of the softc.
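
A minimal userspace sketch of the requeue problem described above, assuming only what the comment states: the completion handler takes the queue lock and puts the BIO back on the queue. All names here (`model_bio`, `model_softc`, `model_done()`, `model_splice_and_deliver()`) are hypothetical and do not correspond to real GEOM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct bio and the gmirror softc. */
struct model_bio {
	struct model_bio *next;
};

struct model_softc {
	struct model_bio *head;	/* models sc_queue */
	bool queue_locked;	/* models a non-recursive queue lock */
};

static void
model_queue_lock(struct model_softc *sc)
{
	assert(!sc->queue_locked);	/* recursion would be a bug here */
	sc->queue_locked = true;
}

static void
model_queue_unlock(struct model_softc *sc)
{
	assert(sc->queue_locked);
	sc->queue_locked = false;
}

/* Models a completion handler: lock the queue and requeue the BIO. */
static void
model_done(struct model_softc *sc, struct model_bio *bp)
{
	model_queue_lock(sc);
	bp->next = sc->head;
	sc->head = bp;
	model_queue_unlock(sc);
}

/*
 * The alternative under discussion: detach the queue contents under the
 * lock, drop the lock, then deliver each BIO. Returns how many BIOs are
 * sitting on the queue afterwards.
 */
static int
model_splice_and_deliver(struct model_softc *sc)
{
	struct model_bio *bp, *list;
	int left = 0;

	model_queue_lock(sc);
	list = sc->head;
	sc->head = NULL;
	model_queue_unlock(sc);

	while ((bp = list) != NULL) {
		list = bp->next;
		model_done(sc, bp);	/* no recursion, but bp is requeued */
	}
	for (bp = sc->head; bp != NULL; bp = bp->next)
		left++;
	return (left);
}

/* Frees whatever is still queued; returns the number of BIOs freed. */
static int
model_free_all(struct model_softc *sc)
{
	struct model_bio *bp;
	int freed = 0;

	while ((bp = sc->head) != NULL) {
		sc->head = bp->next;
		free(bp);
		freed++;
	}
	return (freed);
}
```

In this model no lock recursion occurs, but every delivered BIO ends up back on the queue that was being drained. If nothing processes the queue again, those BIOs are never released.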

Closed by commit rS302091: Do not complete pending gmirror BIOs when tearing down the provider. (authored by markj). Jun 22 2016, 9:00 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

head/sys/geom/mirror/g_mirror.c

Size

17 lines

Diff 17774
