Page MenuHomeFreeBSD

Check the index hasn't changed after writing the cmp entry

Authored by andrew on Feb 21 2019, 8:55 PM.



If an interrupt fires while writing the cmp entry we may have a partial
entry. Work around this by using atomic_cmpset to set the new index.
If it fails we need to set the previous index value and try again as
the entry may be in an inconsistent state.

This fixes messages similar to the following from syzkaller:
bad comp 224 type 2163727253

Diff Detail

Lint OK
No Unit Test Coverage
Build Status
Buildable 22649
Build 21754: arc lint + arc unit

Event Timeline

This revision is now accepted and ready to land.Feb 21 2019, 10:02 PM
This revision was automatically updated to reflect the committed changes.
cem added inline comments.
258 ↗(On Diff #54347)

Is this backwards? Should it be index = buf[0];?

258 ↗(On Diff #54347)

No, if we get to this point buf[0] has changed since it was loaded in line 244. This can only happen due to an interrupt so should be discarded. As we don't know when the interrupt happened it may return inconsistent data. As such the best way to handle this is to try writing to the same index slot again, however for this to work we need a known value in buf[0] for the cmpset call in line 256.

TL;DR: Once we have the index we need to write to its slot to ensure we get consistent data.

markj added inline comments.
258 ↗(On Diff #54347)

I'm confused by this. Interrupt handlers shouldn't log any tracing data. So while this update may indeed be interrupted, the tracing buffer shouldn't be modified at all while the interrupt is being handled.

Hmm. I think the checks in get_kinfo() won't catch certain types of IPIs, at least on amd64. That might be the underlying problem here.

258 ↗(On Diff #54347)

Any method to signal we are in an interrupt or trap handler would either need to be performed in assembly, or the code that runs before the state is set and after it is cleared would need to be marked as no-coverage. As soon as the kernel runs a C function it will call into KCOV (if enabled).

258 ↗(On Diff #54347)

Thanks, I see now. I'm not sure how Linux solves this problem, they also seem to set a "in hard interrupt context" flag from C, but their kcov does not perform atomic updates of the buffer header. The C entry point for IRQs has a noinstr attribute, but it's not clear to me that that disables the coverage sanitizer, though I guess it probably does somehow.