Page MenuHomeFreeBSD

bhyve: add locks around NVMe queue accesses
ClosedPublic

Authored by chuck on Apr 7 2019, 7:13 PM.

Details

Summary

The NVMe code attempted to ensure thread safety through a combination of
using atomics and a "busy" flag. But this approach leads to unavoidable
race conditions.

Fix is to use per-queue mutex locks to ensure thread safety within the
queue processing code. While in the neighborhood, move all the queue
initialization code to a common function.

Test Plan

Ran fio and data validation tests in an SMP Linux guest

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

This looks OK to me, but my knowledge of what bhyve does to emulate PCI is weak.

usr.sbin/bhyve/pci_nvme.c
322 ↗(On Diff #55922)

this indentation looks wrong :)

329 ↗(On Diff #55922)

Ditto

This revision is now accepted and ready to land.Apr 12 2019, 6:47 PM

So I think splitting the queue init out of reset is probably a good change to do on its own as a separate commit. I think it will also make the locking part of the diff easier to see.

Can you describe why you want to change the locking in this way? The 'busy' approach seems to ensure that an I/O thread will just return and be able to do something else if the queue is already busy where as the pthread_mutex approach means the I/O thread will be forced to wait for the other threads to finish. It is perhaps racy in that you are relying on the other thread to always "notice" work that you queue, though that race can be handled.

usr.sbin/bhyve/pci_nvme.c
138 ↗(On Diff #55922)

Why did you move this earlier? Was there a padding hole?

143 ↗(On Diff #55922)

There looks to be a 16-bit hole between intr_vec and intr_en FWIW.

148 ↗(On Diff #55922)

It looks like this pthread mutex replaces a home-grown spin lock on the busy flag?

328 ↗(On Diff #55922)

Is the +1 for the Admin queue? If so, a comment to that effect might be helpful.

402 ↗(On Diff #55922)

So why is it ok in the new code to reinit the admin queues during reset? Was it actually ok in the old code?

1096 ↗(On Diff #55922)

You could maybe unlock the mutex before asserting the interrupt. Not sure it will matter in this case if the same thread will answer any doorbell that the interrupt handler in the guest would ring.

chuck retitled this revision from bhyve NVMe emulation locking to bhyve: add locks around NVMe queue accesses.May 18 2020, 1:47 PM
chuck edited the summary of this revision. (Show Details)
chuck edited the test plan for this revision. (Show Details)
chuck added a reviewer: jhb.

Mostly updated to reflect subsequent changes in NVMe emulation since original patch

This revision now requires review to proceed.May 18 2020, 1:48 PM
This revision was not accepted when it landed; it landed in state Needs Review.Jun 29 2020, 12:31 AM
This revision was automatically updated to reflect the committed changes.