Page MenuHomeFreeBSD

bhyve vioapic writes can deadlock instance
Needs ReviewPublic

Authored by on May 23 2019, 8:55 PM.


Group Reviewers

I found that writes to the vioapic are capable of deadlocking with certain other operations on the VM. Say that a multi-vCPU instance is running and a vCPU other than 0 writes to the vioapic redirection table. Assuming certain fields in the register are changed, a call to vm_smp_rendezvous will be made in order to update the trigger mode registers on all the vCPUs. The vioapic write, including the call to vm_smp_rendenzvous is all made in the context of that vCPU. Now say that another thread attempts one of the ioctls which requires locking all the vCPUs. It will do so, starting at vCPU 0 to VM_MAXCPU-1. If it is able to lock any of the vCPUs before the rendezvous begins, a deadlock will occur. This is because it won't be able to lock the vCPU handling the vioapic write, which is waiting for the rendezvous to finish. That rendezvous cannot finish, since any of the vCPUs locked by the other ioctl will never be released until they are all locked, the ioctl completes its work, and they are released.

SmartOS ticket with more detail (including test results): OS-7622

Diff Detail

rS FreeBSD src repository
Lint Skipped
Unit Tests Skipped

Event Timeline

Rebased to reflect upstream changes

rgrimes resigned from this revision.Jun 20 2019, 4:52 PM

Merged downstream in SmartOS as ec6f18e9

markj added inline comments.Aug 7 2019, 7:30 PM

Won't we generally be holding the vioapic spin lock here, now that it is no longer dropped when triggering an update of the TMRs? If so, this sleep is not permitted.

Thanks for pointing that out @markj. This wasn't a problem in SmartOS bhyve since all of the mutexes there are adaptive, rather than spinlocks. (Due to interface differences, we couldn't guarantee a lack of voluntary context switching inside the critical_enter/critical_exit section, and spinlocks came with their own host of deadlock problems.) Considering the constraints in play on the FreeBSD side, I'm not sure this possible without serious restructuring. The vioapic lock would need to be sleeping, so the EOI processing would require a full exit from the VMRUN critical section. Other consumers (hpet, etc) would probably require similar treatment.

This could be changed to drop the vioapic for the duration of TMR updates, perhaps with some extra synch primitive to prevent simultaneous writes from colliding. It wouldn't have the same correctness guarantee in the face of racing vioapic operations, but it may not be worse than what exists now? Considering how infrequent level-triggered interrupts are, it might be an acceptable trade-off.

After discussion with @markj and @jhb, I drafted up a change which involves using an sxlock to guard against racing vioapic writes while allowing the main vioapic mutex to be dropped during the sensitive portion of the TMR update process. I don't have test hardware I can dedicate to running FreeBSD bhyve, so its performance is speculative.