bhyve: fix NVMe emulation missed interrupts
Closed, Public

Authored by chuck on Mar 15 2020, 11:49 PM.

Details

Summary

The bhyve NVMe emulation has a race in the logic which generates command
completion interrupts. On FreeBSD guests, this manifests as kernel log
messages similar to:

nvme0: Missing interrupt

The NVMe emulation code sets a per-submission queue "busy" flag while
processing the submission queue, and only generates an interrupt when
the submission queue is not busy.

Aside from being counter to the NVMe design (i.e. interrupt properties
are tied to the completion queue) and adding complexity (e.g. exceptions
to not generating an interrupt when "busy"), it causes a race condition
under the following conditions:

  • guest OS has no outstanding interrupts
  • guest OS submits a single NVMe IO command
  • bhyve emulation processes the SQ and sets the "busy" flag
  • bhyve emulation submits the asynchronous IO to the backing storage
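  • IO request to the backing storage completes before the SQ processing loop exits; no interrupt is generated because the SQ is still "busy"
  • bhyve emulation finishes processing the SQ and clears the "busy" flag

At this point the completion entry is sitting in the CQ, but nothing is left to generate its interrupt, so the guest only notices the completion when its command timeout fires.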
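
The sequence above can be replayed deterministically in a small model. This is only a sketch and not the bhyve source: the `model_sq` and `model_cq` structures and the `model_*` functions are hypothetical names invented for illustration, and the asynchronous completion is represented by a plain function call made while the flag is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the old logic; names are not from bhyve. */
struct model_sq {
	bool busy;		/* set while the SQ processing loop runs */
};

struct model_cq {
	uint16_t head;		/* consumer index, advanced by the guest */
	uint16_t tail;		/* producer index, advanced by the emulation */
	unsigned intr_count;	/* interrupts delivered to the guest */
};

/* IO completion path: post a CQ entry, interrupt only if the SQ is idle. */
static void
model_io_done_busy(struct model_sq *sq, struct model_cq *cq)
{
	cq->tail++;
	if (!sq->busy)
		cq->intr_count++;
}

/* Replays the race; returns the number of interrupts the guest receives. */
static unsigned
model_replay_race(void)
{
	struct model_sq sq = { .busy = false };
	struct model_cq cq = { .head = 0, .tail = 0, .intr_count = 0 };

	sq.busy = true;			/* SQ processing starts */
	model_io_done_busy(&sq, &cq);	/* backing IO completes early */
	sq.busy = false;		/* SQ processing ends */

	/* cq.tail is now 1 and cq.head is 0, yet no interrupt was sent. */
	return (cq.intr_count);
}
```

In this model `model_replay_race()` returns 0 even though one completion is pending, which corresponds to the "Missing interrupt" condition described above.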

Fix is to remove the "busy" flag and generate an interrupt when the CQ
head and tail pointers do not match.
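
A minimal sketch of that decision, assuming a simplified completion queue: the `sketch_cq` structure and both functions are hypothetical and do not reflect the actual bhyve data structures or the committed diff. The point is only that the interrupt condition depends on CQ state alone, with no reference to the submission queue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion queue state; names are not from bhyve. */
struct sketch_cq {
	uint16_t head;		/* consumer index, advanced by the guest */
	uint16_t tail;		/* producer index, advanced by the emulation */
	uint16_t size;		/* number of entries in the queue */
	unsigned intr_count;	/* interrupts delivered to the guest */
};

/* True when the guest has completions it has not yet consumed. */
static bool
sketch_cq_pending(const struct sketch_cq *cq)
{
	return (cq->head != cq->tail);
}

/* Post one completion and interrupt if the CQ is non-empty. */
static void
sketch_post_completion(struct sketch_cq *cq)
{
	cq->tail = (uint16_t)((cq->tail + 1) % cq->size);
	if (sketch_cq_pending(cq))
		cq->intr_count++;
}
```

With this shape, a completion that arrives while the SQ is still being processed raises an interrupt like any other, because no SQ state is consulted.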

Reported by khng300

Test Plan

Running

dd if=/dev/nvme0ns1 of=/dev/null bs=4m

in a loop causes the "Missing interrupt" message in a FreeBSD 12.1 guest. With this change, the message no longer appears.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This revision is now accepted and ready to land.
Mar 15 2020, 11:51 PM

I suspect one of the goals was to suppress sending too many duplicate interrupts? But given that MSI-X interrupts are generally edge-triggered anyway, that may not be a worthy goal.

This revision was automatically updated to reflect the committed changes.