Page MenuHomeFreeBSD

vmm: Fix AP startup compatibility
ClosedPublic

Authored by markj on Feb 8 2023, 7:51 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, May 28, 5:51 PM
Unknown Object (File)
May 9 2024, 7:46 AM
Unknown Object (File)
May 9 2024, 7:36 AM
Unknown Object (File)
May 9 2024, 7:36 AM
Unknown Object (File)
May 9 2024, 7:36 AM
Unknown Object (File)
May 8 2024, 8:40 PM
Unknown Object (File)
Dec 20 2023, 8:16 AM
Unknown Object (File)
Dec 12 2023, 1:11 PM
Subscribers

Details

Summary

On main, bhyve activates all vCPUs before running the BSP. In 13.1,
bhyve instead just runs the BSP and activates APs when
VM_EXITCODE_SPINUP_AP messages are received. Furthermore, vmm now
allocates vcpu structures lazily.

When an INIT or STARTUP IPI is raised, vlapic_calcdest() only returns
active vCPUs. So with old bhyve, it would return an empty set and
vm_handle_ipi() would do nothing. Fix this by modifying
vlapic_calcdest() to return "inactive" vCPUs in these special cases.

Then, vm_handle_ipi() handles INIT by reinitializing the vlapic on all
destination vCPUs. This does not work with lazy vcpu allocation.
Instead, just rendezvous with active vCPUs, since vm_alloc_vcpu() will
perform the same initialization later anyway.

Finally, fix encoding of the vcpuid in VM_EXITCODE_SPINUP_AP messages.

Also fix some other nits found along the way:

  • Collapse two identical cases in vlapic_icrlo_write_handler() into one.
  • Don't bother reinitializing "retu" before calling vm_handle_ipi(), since that function sets it unconditionally.
  • Handle the default case in vm_handle_ipi() by panicking rather than returning EPERM (1) to userspace.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj requested review of this revision.Feb 8 2023, 7:51 PM
sys/amd64/vmm/io/vlapic.c
1071

ALLESELF is allowed for INIT IPIs. We have to fix it's calculation too.

1180

That's not the correct behaviour. If you send an INIT to an already running cpu, it'll reset itself. So, keep the unconditional rendezvous.

sys/amd64/vmm/io/vlapic.c
1071

I don't see how this can be fixed. How can we find the "real" destination mask?

Do you know of any operating systems which start APs this way?

1180

The unconditional rendezvous doesn't work. At this point the destination vCPU might not yet be allocated, so calling vm_smp_rendezvous() will crash. The point of this change is to re-initialize already running vCPUs.

sys/amd64/vmm/io/vlapic.c
1071

struct vm already saves the cpu topology set by bhyve.

ALLESELF was broken for INIT on pre 14.0 because bhyve could handle INITonly for a single vcpu. So, if any operating systems starts APs this way, it wasn't supported by bhyve in the past.

1180

Sry, I've misread the code.

sys/amd64/vmm/io/vlapic.c
1071

Then I think it's better to leave this case alone: we only need this hack for compatibility with old bhyve executables, and it sounds like old bhyve wouldn't have been able to boot a guest using APIC_DEST_ALLESELF anyway. So there's no point in complicating the code to handle a situation that we can't easily test.

corvink added inline comments.
sys/amd64/vmm/io/vlapic.c
1071

Yes, it's not required for old bhyve binaries. So, I agree to the change.

However, without a fix, ALLESELF INIT is still broken on latest bhyve. Additionally, if I don't miss anything, a guest is able to start more vcpus than assigned to it (e.g. if you assign a single vcpu to the guest, it's still able to send INIT SIPI to a second vcpu to spin it up). So, creating a bitmask of valid vcpus might be a good idea anyways.

This revision is now accepted and ready to land.Feb 9 2023, 5:43 PM
sys/amd64/vmm/io/vlapic.c
837

This change is no-op I think? In the case that VM_EXITCODE_SPINUP_AP (old bhyve binaries) is used, then vlapic->ipi_exit is false, and we ignore the results of vlapic_calcdest() and instead generate a new ipimask from scratch that is copied to dmask. The code to set the ipimask explicitly for INIT/STARTUP below I had added specifically to avoid doing this special case in this function (which already seemed overly complicated to me).

1121

This is just collapsing these two cases because they are identical?

1131

This overwrites the results of vlapic_calcdest() for legacy binaries.

sys/amd64/vmm/io/vlapic.c
837

Ahh, I see. The problem is that vm_handle_ipi() sets the dmask of the VM_EXITCODE_SPINUP_AP to dmask, not ipimask.

So, this part of the change isn't a no-op, but it isn't needed if vm_handle_ipi() sets the dmask properly.

1131

Yeah, but below we do vmexit->u.ipi.dmask = dmask; and it should be vmexit->u.ipi.dmask = ipimask;.

sys/amd64/vmm/io/vlapic.c
837

Sorry, s/vm_handle_ipi/vlapic_icrlo_write_handler/ in my previous comment.

sys/amd64/vmm/io/vlapic.c
1131

Oh, yes, I think that's right. Before Corvin's most recent change I think the code set dmask here instead of ipimask.

markj marked an inline comment as done.

Drop changes to vlapic_calcdest(), fix the vcpu mask in the VM_EXITCODE_IPI
message instead.

This revision now requires review to proceed.Feb 9 2023, 6:17 PM
sys/amd64/vmm/io/vlapic.c
1121

Right.

jhb added inline comments.
sys/amd64/vmm/io/vlapic.c
1121

Maybe do that as a separate commit just so the fix itself is clearer? The previous commit is what made these two cases the same I think. Before the STARTUP case did more work.

1199

Nice

1205

Maybe use __assert_unreachable?

This revision is now accepted and ready to land.Feb 9 2023, 6:34 PM

(The log message probably needs revising as well now, but I assume you'll do that)

markj marked 2 inline comments as done.Feb 9 2023, 6:55 PM
In D38446#875384, @jhb wrote:

(The log message probably needs revising as well now, but I assume you'll do that)

Yeah, I'll adjust it before committing.

sys/amd64/vmm/io/vlapic.c
1121

Will do.