vmm: Fix a deadlock between vm_smp_rendezvous() and vcpu_lock_all()

Description

vm_smp_rendezvous() invokes a callback on all vCPUs, blocking the
initiator until all vCPUs have responded. vcpu_lock_all() blocks each
vCPU by waiting for it to go idle and setting the vCPU state to frozen.
These two operations can deadlock with each other, particularly when
booting a Windows guest: vcpu_lock_all() blocks waiting for a rendezvous
initiator to go idle, while the initiator blocks waiting for the vCPU
thread that called vcpu_lock_all() to invoke the rendezvous callback.

Implement vcpu_lock_all() in a way that avoids deadlocks with
vm_smp_rendezvous(). In particular, when traversing vCPUs, invoke the
rendezvous callback on the vCPU's behalf to help the initiator finish.
We can only safely do so when the vCPU is IDLE or we have already locked
it; otherwise we may be racing with the target vCPU thread. Thus:

  • Use an exclusive lock to serialize vcpu_lock_all() callers, which lets us lock vCPUs out of order without fear of deadlock with parallel vcpu_lock_all() callers.
  • If a rendezvous is pending, lock all idle vCPUs and invoke the callback on their behalf. If the vcpu_lock_all() caller is itself a vCPU thread, this will handle that thread.
  • Block waiting for all non-idle vCPUs to idle, or until one of them initiates a rendezvous, in which case we go back and invoke callbacks on behalf of already-locked vCPUs.
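
The interaction described above can be illustrated with a small userland
toy model. This is only a sketch, not the vmm code: it is single-threaded,
it folds the steps into one polling loop, and every name in it (struct vm,
run_callback(), step_vcpu(), model_lock_all(), the help flag) is
hypothetical. vCPU 2 plays a running vCPU that initiates a rendezvous while
the locker is freezing the others. The exclusive lock that serializes
callers is assumed to be held and is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland toy model; every name here is hypothetical, not the vmm API. */
#define	NVCPU		4
#define	ALL_VCPUS	((1u << NVCPU) - 1)

enum vcpu_state { VCPU_IDLE, VCPU_RUNNING, VCPU_FROZEN };

struct vm {
	enum vcpu_state state[NVCPU];
	bool initiated[NVCPU];	/* has this vCPU started its rendezvous? */
	bool pending;		/* a rendezvous is in progress */
	unsigned req_mask;	/* vCPUs that must run the callback */
	unsigned done_mask;	/* vCPUs whose callback has run */
};

/* vCPUs 0, 1 and 3 start idle; vCPU 2 is running guest code. */
static void
vm_init(struct vm *vm)
{
	*vm = (struct vm){ 0 };
	vm->state[2] = VCPU_RUNNING;
}

/* Run the rendezvous callback on behalf of vCPU i. */
static void
run_callback(struct vm *vm, int i)
{
	assert(i >= 0 && i < NVCPU);
	vm->done_mask |= vm->req_mask & (1u << i);
	if (vm->done_mask == vm->req_mask)
		vm->pending = false;
}

/* One scheduling step of a running vCPU thread. */
static void
step_vcpu(struct vm *vm, int i)
{
	if (!vm->initiated[i]) {
		/* Initiate a rendezvous and handle our own share of it. */
		vm->initiated[i] = true;
		vm->pending = true;
		vm->req_mask = ALL_VCPUS;
		vm->done_mask = 0;
		run_callback(vm, i);
	} else if (!vm->pending) {
		/* The rendezvous finished, so the initiator can go idle. */
		vm->state[i] = VCPU_IDLE;
	}
	/* Otherwise the initiator is still blocked waiting for the others. */
}

/*
 * Freeze every vCPU.  Returns 0 on success, or -1 if no progress is made
 * after many passes, which stands in for a deadlock in this model.
 */
static int
model_lock_all(struct vm *vm, bool help)
{
	for (int pass = 0; pass < 100; pass++) {
		int frozen = 0;

		for (int i = 0; i < NVCPU; i++) {
			if (vm->state[i] == VCPU_RUNNING)
				step_vcpu(vm, i);
			if (vm->state[i] == VCPU_IDLE)
				vm->state[i] = VCPU_FROZEN;
			/* Only help vCPUs we have locked: no race with them. */
			if (vm->state[i] == VCPU_FROZEN) {
				frozen++;
				if (help && vm->pending)
					run_callback(vm, i);
			}
		}
		if (frozen == NVCPU)
			return (0);
	}
	return (-1);
}
```

With help set to false the model mimics the old behavior: the frozen vCPUs
never run the callback, the initiator never goes idle, and
model_lock_all() gives up. With help set to true the locker runs the
callback for the vCPUs it has frozen, the rendezvous completes, and the
initiator can then be frozen as well.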

Note that !amd64 platforms have no rendezvous mechanism and need no
changes, so they get a separate vcpu_set_state_all() based on the
previous vcpu_lock_all(). The two will be merged together once vcpu
state handling is consolidated into sys/dev/vmm.

Reviewed by: corvink (previous version)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D52968

(cherry picked from commit f39768e52e513264da60add0ca2412bddda271ff)

Details

Provenance
markj authored on Oct 17 2025, 1:09 PM
Reviewer
corvink
Differential Revision
D52968: vmm: My attempt at fixing the rendezvous deadlock
Parents
rG129cedd20bc6: riscv/vmm: Remove a redundant maxcpu check in vm_alloc_vcpu()