Under certain tight race conditions, we found that the lack of a memory barrier in bhyve's virtio handling causes it to miss a NO_NOTIFY state transition on block devices, resulting in guest stall. The investigation is recorded in OS-7613. As part of the examination into bhyve's use of barriers, one other section was found to be problematic, but only on non-x86 ISAs with less strict memory ordering. That was addressed in this patch as well, although it was not at all a problem on x86.
Details
We have only observed this issue when guests ran a specific customer workload. Without the patch, they would encounter the race/stall within hours of running. With the patch in place, their workload was able to run for days without issue. PR https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231117 is going to test as well.
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
@kib, I assume we probably want to use 'atomic_thread_fence_*' variants instead of mb and wmb? I think 'atomic_fence_rel()' is sufficient for the first, but does the second need 'seq_cst' instead of 'acq_rel'?
I'm fine with @kib's suggested changes, with the caveat that we (Joyent) only positively tested the mfence version.
This may be a fix for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231117, the person is testing.
After adding #include <machine/atomic.h> it compiles fine. Virtual Machine that used to crash every few hours works stable for 3 days after switching to patched bhyve. Can we have this merged? Thanks for your work!
FWIW, it happened on AMD Epyc with 16 cores (32 threads).
I would like to see this in next weeks 11.3-PRERELEASE build, this looks to be a low risk barrier addition, @jhb does that seem to be a reasonable target, or would you like to see a longer ^head test cycle? Does anyone see any risk in this change?
usr.sbin/bhyve/virtio.c | ||
---|---|---|
43 | Usually machine/ comes after sys/, before userspace headers. |
usr.sbin/bhyve/virtio.c | ||
---|---|---|
43 | There was some inconsistency in the other bhyve examples for that which lead me astray. Will fix. |