
bhyve: Support Hyper-V (base) and hyperv clock
Needs Review
Public

Authored by gusev.vitaliy_gmail.com on Jun 23 2022, 2:22 PM.

Details

Reviewers
jhb
markj
kib
howard0su_gmail.com
Group Reviewers
bhyve
Summary

This is version #1 of adding Hyper-V support to bhyve so that Windows guests,
and possibly other guest OSes, can use the Hyper-V clock.

A Windows guest uses the HPET and does not use the TSC as its clock source if the
hypervisor sets the CPUID2_HV bit but does not provide Hyper-V paravirtualization.
Reading VHPET values then adds overhead and causes a lot of vmexits.

Many hypervisors (KVM, VirtualBox, VMware) implement Hyper-V paravirtualization for
Windows guests. This patch introduces that feature in bhyve so that Windows guests use
the paravirtualized TSC clock.

Using this with Linux guests needs more changes, at least modifying the bhyve_id
vendor ID signature string.

Sponsored by vStack.

Test Plan

Run several versions of Windows and run performance tests.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/amd64/vmm/x86.c
610

Rather than supporting Hyper-V leaves under the bhyve ID, we should add support for reporting the Hyper-V leaves. Then a separate knob is which set of leaves to report at 0x40000000 (a VM can export multiple sets of leaves at a fixed stride from the base 0x40000000).

Probably the way to structure this is to have a handler per leaf set that takes the relative offset as an input parameter, and to have an array of pointers to these handlers in struct vmctx and the knob determines the order of the items in the array.
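
A minimal sketch of the per-leaf-set handler idea, not code from the patch. The names `leaf_table`, `leafset_handler_t`, `hyperv_leaves`, `bhyve_leaves` and `emulate_hv_cpuid` are hypothetical, the table stands in for whatever per-VM structure would hold it, and the stride of 0x100 is an assumption (the comment above only says "a fixed stride"). Each handler gets the base of its set and the offset relative to that base, and the order of the array decides which set appears at 0x40000000.

```c
#include <stdint.h>
#include <string.h>

#define HV_CPUID_BASE   0x40000000u
#define HV_CPUID_STRIDE 0x100u      /* assumed stride between leaf sets */
#define HV_MAX_SETS     4

struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* A handler sees the base of its set and the leaf offset relative to it.
 * It returns 1 if it filled in the registers, 0 otherwise. */
typedef int (*leafset_handler_t)(uint32_t base, uint32_t rel,
    struct cpuid_regs *r);

/* Hypothetical per-VM table; the knob would decide the order of sets[]. */
struct leaf_table {
    leafset_handler_t sets[HV_MAX_SETS];
    int nsets;
};

/* Copy a 12-byte vendor signature into EBX, ECX, EDX. */
static void set_sig(struct cpuid_regs *r, const char *sig)
{
    memcpy(&r->ebx, sig, 4);
    memcpy(&r->ecx, sig + 4, 4);
    memcpy(&r->edx, sig + 8, 4);
}

static int hyperv_leaves(uint32_t base, uint32_t rel, struct cpuid_regs *r)
{
    switch (rel) {
    case 0:                     /* vendor leaf */
        r->eax = base + 1;      /* highest leaf this sketch implements */
        set_sig(r, "Microsoft Hv");
        return 1;
    case 1:                     /* interface signature "Hv#1" */
        r->eax = 0x31237648u;
        r->ebx = r->ecx = r->edx = 0;
        return 1;
    }
    return 0;
}

static int bhyve_leaves(uint32_t base, uint32_t rel, struct cpuid_regs *r)
{
    if (rel != 0)
        return 0;
    r->eax = base;              /* only the vendor leaf exists */
    set_sig(r, "bhyve bhyve ");
    return 1;
}

/* Map an absolute leaf to (set index, relative offset) and dispatch. */
static int emulate_hv_cpuid(const struct leaf_table *t, uint32_t leaf,
    struct cpuid_regs *r)
{
    uint32_t idx, rel;

    if (leaf < HV_CPUID_BASE)
        return 0;
    idx = (leaf - HV_CPUID_BASE) / HV_CPUID_STRIDE;
    rel = (leaf - HV_CPUID_BASE) % HV_CPUID_STRIDE;
    if (idx >= (uint32_t)t->nsets)
        return 0;
    return t->sets[idx](HV_CPUID_BASE + idx * HV_CPUID_STRIDE, rel, r);
}
```

With `sets = { hyperv_leaves, bhyve_leaves }` a guest would see the Hyper-V signature at 0x40000000 and the bhyve signature at 0x40000100; swapping the two entries reverses that without touching either handler.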

pmooney_pfmooney.com added inline comments.
sys/amd64/vmm/intel/vmx_msr.c
427

Is there a reason why this is implemented solely for VMX, rather than in a central location where both AMD and Intel machines can take advantage of it?

437

I believe the 100ns unit is true for the TSC counter page, but not the Time Reference Count itself?

439

This should probably be based on the boot time of the VM, rather than the host TSC

usr.sbin/bhyve/xmsr.c
76

The state for these is being stored in VM-wide variables. Are they not actually per-CPU?

117

Same question regarding this logic being Intel-only.

sys/amd64/vmm/intel/vmx_msr.c
427

Thanks for pointing that out. This is version #1 of the implementation. I will move everything to a separate file, named hyperv.c for instance.

437

It should be counted at the same rate, according to

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/timers

and code from that page:

do
{
    StartSequence = ReferenceTscPage->TscSequence;
    if (StartSequence == 0)
    {
        // 0 means that the Reference TSC enlightenment is not available at
        // the moment, and the Reference Time can only be obtained from
        // reading the Reference Counter MSR.
        ReferenceTime = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
        return ReferenceTime;
    }

    Tsc = rdtsc();

    // Assigning Scale and Offset should neither happen before
    // setting StartSequence, nor after setting EndSequence.
    Scale = ReferenceTscPage->TscScale;
    Offset = ReferenceTscPage->TscOffset;

    EndSequence = ReferenceTscPage->TscSequence;
} while (EndSequence != StartSequence);

// The result of the multiplication is treated as a 128-bit value.
ReferenceTime = ((Tsc * Scale) >> 64) + Offset;
return ReferenceTime;
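
For illustration, the final multiply-and-shift step can be written in C as below. This is a sketch, not code from the patch: `hv_tsc_scale` and `hv_ref_time` are hypothetical names, `unsigned __int128` is a GCC/Clang extension, and the scale formula assumes the reference time is in 100 ns units (10,000,000 ticks per second) and that the TSC frequency is above 10 MHz so the scale fits in 64 bits. The sequence-number retry loop from the excerpt is left out.

```c
#include <stdint.h>

/*
 * Scale such that (tsc * scale) >> 64 is the elapsed time in 100 ns units.
 * Assumes tsc_hz > 10000000, otherwise the result does not fit in 64 bits.
 */
static uint64_t hv_tsc_scale(uint64_t tsc_hz)
{
    unsigned __int128 num = (unsigned __int128)10000000u << 64;

    return (uint64_t)(num / tsc_hz);
}

/* ReferenceTime = ((Tsc * Scale) >> 64) + Offset, with a 128-bit product. */
static uint64_t hv_ref_time(uint64_t tsc, uint64_t scale, int64_t offset)
{
    unsigned __int128 prod = (unsigned __int128)tsc * scale;

    /* The offset is signed; unsigned wraparound handles negative values. */
    return (uint64_t)(prod >> 64) + (uint64_t)offset;
}
```

Because the scale is truncated, the result for a frequency that does not divide evenly can be one tick below the exact value.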
439

The VM reads the same TSC value as the host until tsc_offset is set in the VMCS.

sys/amd64/vmm/x86.c
610

I think we can add a VM parameter that selects which paravirt interface to emulate. It could be KVM, Hyper-V, etc.

Also, I will move everything to a separate file, hyperv.c.

Then a separate knob is which set of leaves to report at 0x40000000 (a VM can export multiple sets of leaves at a fixed stride from the base 0x40000000).

Did you mean to reply to CPUID(0x40000000) several times with different values?

usr.sbin/bhyve/xmsr.c
76

They should be per "partition".

"Hyper-V supports isolation in terms of a partition. A partition is a logical unit of isolation, supported by the hypervisor, in which operating systems execute. The Microsoft hypervisor must have at least one parent, or root, partition, running Windows"

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
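
A sketch of what per-partition state could look like, assuming one such structure per VM. `struct hv_partition_state` and `hv_wrmsr` are hypothetical names and do not come from the patch; the MSR numbers are the ones defined by the Hyper-V TLFS. The handler only records values and omits the side effects a real implementation needs, such as mapping the hypercall or reference TSC page.

```c
#include <stdint.h>

#define HV_X64_MSR_GUEST_OS_ID    0x40000000u
#define HV_X64_MSR_HYPERCALL      0x40000001u
#define HV_X64_MSR_TIME_REF_COUNT 0x40000020u
#define HV_X64_MSR_REFERENCE_TSC  0x40000021u

/* Hypothetical: one instance per VM ("partition"), shared by all vCPUs. */
struct hv_partition_state {
    uint64_t guest_os_id;
    uint64_t hypercall;
    uint64_t reference_tsc;
};

/* Returns 0 if the write was accepted, -1 if the MSR is not writable here. */
static int hv_wrmsr(struct hv_partition_state *p, uint32_t msr, uint64_t val)
{
    switch (msr) {
    case HV_X64_MSR_GUEST_OS_ID:
        p->guest_os_id = val;
        return 0;
    case HV_X64_MSR_HYPERCALL:
        p->hypercall = val;
        return 0;
    case HV_X64_MSR_REFERENCE_TSC:
        p->reference_tsc = val;
        return 0;
    case HV_X64_MSR_TIME_REF_COUNT:
        return -1;              /* read-only counter */
    default:
        return -1;              /* not handled in this sketch */
    }
}
```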

117

I will take that into account. However, I don't have an AMD CPU, so I will probably need help testing on AMD CPUs.