Page MenuHomeFreeBSD

We can't release the refcount outside of the periph lock.
ClosedPublic

Authored by imp on May 21 2018, 11:06 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Nov 22, 9:31 PM
Unknown Object (File)
Sat, Nov 16, 1:50 PM
Unknown Object (File)
Oct 25 2024, 11:28 PM
Unknown Object (File)
Oct 23 2024, 9:09 AM
Unknown Object (File)
Oct 23 2024, 8:53 AM
Unknown Object (File)
Oct 10 2024, 11:13 AM
Unknown Object (File)
Oct 2 2024, 2:00 PM
Unknown Object (File)
Sep 28 2024, 12:36 PM
Subscribers

Details

Summary

We're dropping the periph lock then dropping the refcount. However,
that violates the locking protocol and is racy. This seems to be
the cause of weird occasional panics with a bogus assert.

Sponsored by: Netflix

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 16762
Build 16649: arc lint + arc unit

Event Timeline

Fix missed part of the patch

fix bogus mdastart -> ndastart

doh! compile with invariants

doh! no, with invariants

mav accepted this revision.EditedMay 22 2018, 6:11 PM

Looks good to me. The only rough edge I see is that outstanding_cmds is used only for the newly added assertion, that is why it was so broken before this and nobody cared.

This revision is now accepted and ready to land.May 22 2018, 6:11 PM

Running fine for me. Reduced amount of panics, it looks like. Haven't run it for very long though.

In D15517#327757, @mav wrote:

Looks good to me. The only rough edge I see is that outstanding_cmds is used only for the newly added assertion, that is why it was so broken before this and nobody cared.

Yes. I think ada has similar issue with its oustanding_cmds. It's where I got it from :)

However, you're right: this is pure bug detection code. I thought about putting something like
#ifdef INVARIANTS
#define I(x) x
#else
#define I(x)
#endif

then adding I( ) around this stuff, but I decided against that for the moment...

I've been running this patch since Tuseday evening without any issues. Without it I get a panic almost immediately when my disk is read, sometimes even before hitting multi user. Is it possible to get it committed, or what is the holdup?

As I have told, I have no objections. I still don't like the global counters, since they mean we can never remove respective lock or at least atomics, but for now I can live with that. It is better then broken code.

This revision was automatically updated to reflect the committed changes.