GVT-d support for bhyve
Needs Review, Public

Authored by c.koehne_beckhoff.com on Aug 27 2020, 6:59 AM.

Details

Reviewers
grehan
Group Reviewers
bhyve
Summary

Description:
Adds support for Intel's GVT-d feature, which passes an integrated graphics device through to a bhyve guest.
No dedicated graphics card is necessary to use this feature. If no dedicated graphics card is installed, the FreeBSD host runs headless after the guest boots.

Hint:
This patch is based on other patches and I'm going to split this patch into smaller patches to simplify review. For easier testing, it includes all changes. As soon as some patches are merged, this patch will be rebased.
Depending patches:

Prerequisites:

  • Intel CPU with VT-d
  • BIOS option "PM Support" set to Enabled

How to enable GVT-d:
Just add your integrated graphics device to bhyve like any other PCI passthru device:

bhyve -c 2 -m 4G -A -H -S -w \
  -s 0,hostbridge \
  -s 2,passthru,0/2/0 \
  -s 4,virtio-blk,/root/win/win10.img \
  -s 5,virtio-net,tap10 \
  -s 31,lpc \
  -l com1,stdio \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  win10

Tested Scenarios:

OS            Status
Windows       Working
Debian        Working with "pci=nocrs" as kernel cmdline option
Ubuntu 20.04  Working
FreeBSD       Working (13.0-CURRENT - 20200903 snapshot)

Tested with Windows as a guest:

Architecture  Status
Sandy Bridge  Not working (Windows reports that the device works properly, but there is no correct display output)
Ivy Bridge    Working
Haswell       Working
Kaby Lake     Working

Installation Steps:

  1. Install your VM the usual way, with GVT-d disabled
  2. Boot into your VM with GVT-d enabled and install the graphics driver (not required for Ubuntu 20.04)
    • Debian:
      • Install the i915 driver
      • Add the "pci=nocrs" option as a kernel parameter:
        • add "pci=nocrs" to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub
        • run "sudo update-grub"
    • Windows:
      • Do not use an "fbuf" device in your bhyve command. It crashes your VM while the graphics driver is being installed. Use Remote Desktop instead.
      • Install the igfx driver (download here)
  3. Boot your VM with GVT-d enabled

Known Limitations:

  • bhyve-edk2 doesn't contain the Intel GOP driver. Therefore, there is no graphical output while booting. The first graphical output appears when the guest OS driver is loaded. This also means that there is no graphical output while installing a guest OS.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

grehan added a subscriber: grehan.

Thanks, looking at this now.

The Intel ACRN edk2 fork has a GVT GOP driver that may be suitable

https://github.com/projectacrn/acrn-edk2/tree/ovmf-acrn/OvmfPkg/GvtGopDxe

The ACRN edk2 needs another GOP driver for GVT-d (see here).
I did some research on adding a GOP driver to bhyve edk2, but I'm only getting a broken output:

lib/libvmmapi/vmmapi.c
127 ↗(On Diff #76277)

This will hurt 32-bit VMs. However, it is possible to change this at runtime, though it has to be done prior to PCI config.

sys/dev/pci/pcireg.h
1108

Is this the same as the value configured through the BIOS ("DVMT Pre-Allocated" from the intro)?

Is there any value in having this configurable?

1109

Does this value have to be fixed (i.e. for a 1:1 guest/host mapping)?

Updates:

  • Use PCI-MMIO-Space (0xC0000000 - 0xE0000000) for allocation of Opregion and Graphics Stolen Memory
  • Increase size of Opregion to 16 kB
  • Detect size of Graphics Stolen Memory
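
The size detection in the last bullet can be sketched as follows. This is not the code from the patch. It assumes the GMS field layout of the GMCH Graphics Control (GGC) register as decoded by Linux's early-quirks.c, and the names `gsm_size_from_ggc` and `enum igd_gen` are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define MB(x)	((uint64_t)(x) << 20)

/*
 * Assumed GMS field positions inside the 16-bit GGC value, following the
 * decoding used by Linux's early-quirks.c. Check them against the
 * datasheet of the actual device before relying on them.
 */
#define GEN6_GMS_SHIFT	3
#define GEN6_GMS_MASK	0x1f
#define GEN8_GMS_SHIFT	8
#define GEN8_GMS_MASK	0xff

/* Hypothetical generation selector, not part of bhyve. */
enum igd_gen { IGD_GEN6_7, IGD_GEN8, IGD_GEN9_PLUS };

/* Return the Graphics Stolen Memory size in bytes for a raw GGC value. */
static uint64_t
gsm_size_from_ggc(enum igd_gen gen, uint16_t ggc)
{
	uint16_t gms;

	switch (gen) {
	case IGD_GEN6_7:
		gms = (ggc >> GEN6_GMS_SHIFT) & GEN6_GMS_MASK;
		return (gms * MB(32));
	case IGD_GEN8:
		gms = (ggc >> GEN8_GMS_SHIFT) & GEN8_GMS_MASK;
		return (gms * MB(32));
	case IGD_GEN9_PLUS:
		gms = (ggc >> GEN8_GMS_SHIFT) & GEN8_GMS_MASK;
		/* 0x00..0xef: 32 MB steps; 0xf0 and up: 4 MB steps from 4 MB. */
		if (gms < 0xf0)
			return (gms * MB(32));
		return ((uint64_t)(gms - 0xf0) * MB(4) + MB(4));
	}
	assert(0 && "unknown IGD generation");
	return (0);
}
```

With this decoding, a Sandy Bridge style value of 0x0010 and a Kaby Lake style value of 0x0200 both yield 64 MB, which matches the "DVMT Pre-Allocated" setting discussed in this thread.
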
c.koehne_beckhoff.com added inline comments.
sys/dev/pci/pcireg.h
1108

Yes, that's the "DVMT Pre-Allocated" setting.

My updated revision detects this value.

1109

Normally this value is set to TOLUD - GSM_SIZE by BIOS. (TOLUD == Top of Low Usable DRAM)

So, I think for bhyve it should be lowmem_limit - GSM_SIZE.
This address is not in MMIO space, which means that vm_map_pptdev_mmio fails. I don't know how to create such a mapping.

The current implementation maps GPU_GSM_GPA into the PCI space (0xC0000000 - 0xE0000000) and works fine. I don't know if it's necessary to set it to lowmem_limit - GSM_SIZE.
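
To make the lowmem_limit - GSM_SIZE placement concrete, here is a minimal sketch of the arithmetic. The function name `gsm_gpa_below_lowmem` is hypothetical, and how bhyve obtains the low-memory limit is outside the scope of the sketch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Place the Graphics Stolen Memory directly below the guest's low-memory
 * limit, mirroring what the BIOS does with TOLUD on the host.
 * Both arguments are byte counts supplied by the caller.
 */
static uint64_t
gsm_gpa_below_lowmem(uint64_t lowmem_limit, uint64_t gsm_size)
{
	/* The GSM must be non-empty and fit below the limit. */
	assert(gsm_size != 0 && gsm_size <= lowmem_limit);
	return (lowmem_limit - gsm_size);
}
```

For example, assuming a low-memory limit of 0xC0000000 (the start of the PCI window mentioned above) and a 64 MB GSM, the GSM would start at 0xBC000000.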

I want to test this patch. Currently, I am using a Lenovo M920Q with i5 8400T, 16G RAM, and 2 SSDs.

What is PM support? I cannot find anything similar in BIOS settings.
With DVMT, I set the pre-allocated memory to 64MB (this seems to be no longer needed with the new update?).

I applied the patches to the latest 13.0-CURRENT source and rebuilt the kernel, world, and bhyve.
I tried to install a Win10 guest without passing through the GVT-d device, but the VM stops after power-on.
There seems to be some trouble with memory allocation.

May I have some instructions to apply this Patch? Thanks.

I want to test this patch. Currently, I am using a Lenovo M920Q with i5 8400T, 16G RAM, and 2 SSDs.

What is PM support? I cannot find anything similar in BIOS settings.

Maybe this option isn't visible in your BIOS. It should be under your Graphics Configuration. I assume it's enabled by default. I don't know if it's possible to check that.

With DVMT, I set the pre-allocated memory to 64MB (this seems to be no longer needed with the new update?).

Yes, it's no longer needed.

I applied the patches to the latest 13.0-CURRENT source and rebuilt the kernel, world, and bhyve.
I tried to install a Win10 guest without passing through the GVT-d device, but the VM stops after power-on.
There seems to be some trouble with memory allocation.

Could you share your bhyve config and output?

May I have some instructions to apply this Patch? Thanks.

I added some installation steps to the description.
First of all, you have to install the Win10 guest without GVT-d. No special steps are necessary. I've used this tutorial for that.

Thank you for your update.

Here is what I have done.

  1. Installed FreeBSD CURRENT from the latest snapshot ISO.
  2. Cloned the latest source from https://github.com/freebsd/freebsd.git
  3. Applied your patch.
  4. Copied vmm_dev.h and pcireg.h over the system files.
  5. Built vmm.ko and replaced the installed kernel module.
  6. Rebooted and loaded vmm.ko.
  7. Rebuilt and installed bhyve.
  8. Used devctl detach pci0:0:2:0 and devctl set driver pci0:0:2:0 ppt to attach pci0:0:2:0 to the ppt driver.

The output of pciconf -lv is as follows:

ppt0@pci0:0:2:0:        class=0x030000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x3e92 subvendor=0x17aa subdevice=0x3135
    vendor     = 'Intel Corporation'
    device     = 'UHD Graphics 630 (Desktop)'
    class      = display
    subclass   = VGA
  9. Windows 10 version 2004 was already installed by the normal procedure (I am using the vm-bhyve manager).
  10. Started the VM with the following command line (almost the same as yours):
bhyve -c 2 -m 4G -A -H -S -w \
  -s 0,hostbridge \
  -s 2,passthru,0/2/0,igd \
  -s 4,ahci-hd,/zroot/vm/win10/disk0.img \
  -s 5,virtio-net,tap10 \
  -s 30,lpc \
  -l com1,stdio \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  win10
  11. The output is:
Add igd-lpc at slot 0:1f.0 to enable GVT-d for igd
bhyve: unregister_bar_passthru: Inappropriate ioctl for device

Anything I am missing?

Thanks for your detailed description.

I've added a new ioctl (VM_UNMAP_PPTDEV_MMIO) to allow remapping of passthru BARs.
It looks like your system is missing that ioctl.

To add this ioctl I did the following steps (assuming that your freebsd sources are located at /usr/src):

  1. Rebuild include files
    • cd /usr/src/include
    • make && make install
  2. Rebuild libvmmapi
    • cd /usr/src/lib/libvmmapi
    • make && make install
  3. Rebuild kernel
    • cd /usr/src
    • make kernel
  4. Rebuild bhyve
    • cd /usr/src/usr.sbin/bhyve
    • make && make install

I guess it's not necessary to rebuild the whole kernel. Rebuilding vmm.ko should be enough.

I've noticed that my system will load an old vmm.ko if I rebuild and switch it. I don't know what I'm doing wrong.
Rebuilding the whole kernel worked for me.

Thank you!
I followed your instructions, and it works.

BTW: How can I make HDMI audio work? The VM has no audio device.

I made a video of the boot process here
https://youtu.be/TJ8WhTW7PdA

Glad to hear that it works on your system.

I haven't worked on enabling HDMI audio. I'm only focusing on how to pass graphics output through to a guest.

So far, I've tried to pass through the audio controller (located at 0:1f.3 on my system). It doesn't work for me.
If I add -s x,hda,play=/dev/dsp to my bhyve config, Windows shows an audio device.
My monitor has no audio output, so I can't check whether that solution works.

I back-ported this to my own khng300/freebsd@releng/12.1 and also had a relatively successful GVT-d passthrough trial! However, is it expected that the console stays blank with a Linux guest? I only saw output right after I started GDM (in Wayland), without any access to a text console.

Thanks for your feedback.

Yes, that's the expected behavior. As I mentioned in the Known Limitations section, bhyve-edk2 is missing a working GOP driver. As a result, graphical output only becomes visible after the i915 driver is loaded and initialized.
If I boot into a shell instead of a desktop environment like GDM, I get access to a text console. However, the login prompt is the first graphical output that I get.

OK, solved. It was a mistake on my side when backporting. Now I get the same behavior as this revision.

Notes on FreeBSD guests (may be updated):
This change also works with:

  • 13.0-CURRENT (20200903 snapshot) guest

Fixed Issues:

  1. The register_bar and unregister_bar functions don't handle the MSI-X BAR properly

Updates:

  • Add support for older generations of Intel processors

Fixed Issues:

  1. The register_bar and unregister_bar functions don't handle the MSI-X BAR properly

Yes, the new update solves my problem. With the previous patch, the host stalled at the BIOS logo on reboot, and I had to unplug the power cord.
Now it works flawlessly.

usr.sbin/bhyve/pci_igd_lpc.c
1 ↗(On Diff #76912)

This new file should have a standard copyright header e.g.

/*-
 * SPDX-License-Identifier: BSD-2-Clause-FreeBSD
 *
 * Copyright (c) 2020 <Beckhoff ...>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 * $FreeBSD$
 */

Add copyright header to new file "pci_igd_lpc.c"

Fixed Issues:

  1. Do not patch cfgread and cfgwrite of passthru devices:

The device emulation functions are shared by every passthru device.
If the igd emulation patches its cfgread and cfgwrite functions, they are changed for every other passthru device too.
To avoid this, add a passthru_type that decides which function to use.
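
A minimal sketch of that idea, not the code from the patch: the shared entry point stays untouched and dispatches on a per-device type. All names here (`struct passthru_dev`, `passthru_cfgread`, `IGD_EMULATED_OFF`, `guest_value`) are hypothetical stand-ins for the real bhyve structures:

```c
#include <assert.h>
#include <stdint.h>

#define CFG_WORDS		64	/* 256 bytes of config space */
#define IGD_EMULATED_OFF	0x5c	/* illustrative offset only */

enum passthru_type { PASSTHRU_DEFAULT, PASSTHRU_IGD };

/* Hypothetical per-device state; real bhyve keeps this elsewhere. */
struct passthru_dev {
	enum passthru_type type;
	uint32_t host_cfg[CFG_WORDS];	/* stand-in for the physical device */
	uint32_t guest_value;		/* what an IGD guest should see */
};

/* Plain passthru: hand back whatever the host device reports. */
static uint32_t
default_cfgread(const struct passthru_dev *dev, int off)
{
	return (dev->host_cfg[off / 4]);
}

/* IGD: emulate one register, fall back to the default path otherwise. */
static uint32_t
igd_cfgread(const struct passthru_dev *dev, int off)
{
	if (off == IGD_EMULATED_OFF)
		return (dev->guest_value);
	return (default_cfgread(dev, off));
}

/* Shared entry point: selects the handler per device, so no global patching. */
static uint32_t
passthru_cfgread(const struct passthru_dev *dev, int off)
{
	assert(off >= 0 && off < CFG_WORDS * 4 && (off & 3) == 0);

	switch (dev->type) {
	case PASSTHRU_IGD:
		return (igd_cfgread(dev, off));
	case PASSTHRU_DEFAULT:
	default:
		return (default_cfgread(dev, off));
	}
}
```

Because the decision is made from the device's own type field, a second passthru device (a NIC, for example) keeps reading its unmodified host values even while an IGD device is present.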

c.koehne_beckhoff.com edited the summary of this revision.

Updates:

  • Use a separate device emulation for GVT-d
  • Do not modify the current implementation of passthru devices

Would it be possible to split the MMIO remap handling out into a separate differential review? That might let the MMIO remap part be reviewed and merged sooner.

D24066 would cover the unmap code: this work could reuse that.

Updates:

Updates:

I think I need to make sure whether it is caused by D24066 or not:

Tracing pid 172 tid 100090 td 0xfffffe005ee35500
kdb_enter() at kdb_enter+0x37/frame 0xfffffe005f608fe0
vpanic() at vpanic+0x19e/frame 0xfffffe005f609030
panic() at panic+0x43/frame 0xfffffe005f609090
trap_fatal() at trap_fatal+0x387/frame 0xfffffe005f6090f0
trap_pfault() at trap_pfault+0x97/frame 0xfffffe005f609150
trap() at trap+0x2ab/frame 0xfffffe005f609260
calltrap() at calltrap+0x8/frame 0xfffffe005f609260
--- trap 0xc, rip = 0xffffffff82b8fc3b, rsp = 0xfffffe005f609330, rbp = 0xfffffe005f609370 ---
intel_uncore_init_mmio() at intel_uncore_init_mmio+0xdb/frame 0xfffffe005f609370
i915_driver_probe() at i915_driver_probe+0x638/frame 0xfffffe005f609410
i915_pci_probe() at i915_pci_probe+0x5d/frame 0xfffffe005f609460
linux_pci_attach_device() at linux_pci_attach_device+0x56f/frame 0xfffffe005f6094c0
device_attach() at device_attach+0x3ca/frame 0xfffffe005f609500
device_probe_and_attach() at device_probe_and_attach+0x70/frame 0xfffffe005f609530
bus_generic_driver_added() at bus_generic_driver_added+0x58/frame 0xfffffe005f609550
devclass_driver_added() at devclass_driver_added+0x39/frame 0xfffffe005f609590
devclass_add_driver() at devclass_add_driver+0x147/frame 0xfffffe005f6095d0
_linux_pci_register_driver() at _linux_pci_register_driver+0xcf/frame 0xfffffe005f609600
i915kms_evh() at i915kms_evh+0x39/frame 0xfffffe005f609610
module_register_init() at module_register_init+0xbd/frame 0xfffffe005f609640
linker_load_module() at linker_load_module+0xbf1/frame 0xfffffe005f609960
kern_kldload() at kern_kldload+0xe6/frame 0xfffffe005f6099a0
sys_kldload() at sys_kldload+0x5b/frame 0xfffffe005f6099d0
amd64_syscall() at amd64_syscall+0x135/frame 0xfffffe005f609af0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe005f609af0
--- syscall (304, FreeBSD ELF64, sys_kldload), rip = 0x800383eea, rsp = 0x7fffffffe6d8, rbp = 0x7fffffffec50 ---

EDIT: Oh my own mistake of KBI mismatch

D24066 should be committed soon so one less obstacle for this work.

Some code style issues should be fixed.

From style(9) -- kernel source file style guide

/* Most single-line comments look like this. */
usr.sbin/bhyve/pci_emul.h
314

Changing this function would enable GPU-Passthrough of dedicated graphics to a Windows VM:

static __inline int
is_passthru(struct pci_devinst *pi)
{
    return ((strcmp(pi->pi_d->pe_emu, "passthru") == 0) ||
            (strcmp(pi->pi_d->pe_emu, "gvt-d") == 0));
}
c.koehne_beckhoff.com edited the summary of this revision.

Updates:

  • Implement GVT-d emulation as passthru device.
  • Include D24066 for easier testing (will be removed as soon as it's merged)
c.koehne_beckhoff.com edited the summary of this revision.
  • Works with new OVMF (see D27230) now

Broken for FreeBSD 13.0-ALPHA2:

--- all_subdir_usr.sbin ---
        uint64_t cpu_maxphysaddr, pci_emul_memresv64;
                                  ^
/usr/jails/src/src_13/src/usr.sbin/bhyve/pci_emul.c:1256:8: error: unused variable 'regs' [-Werror,-Wunused-variable]
        u_int regs[4];
              ^
/usr/jails/src/src_13/src/usr.sbin/bhyve/pci_emul.c:1255:11: error: unused variable 'cpu_maxphysaddr' [-Werror,-Wunused-variable]
        uint64_t cpu_maxphysaddr, pci_emul_memresv64;
                 ^
3 errors generated.
  • fix compiling

Still doesn't build for 13.0-RCx:

/usr/src/sys/amd64/vmm/io/ppt.c:477:32: error: too few arguments to function call, expected 5, have 3
        ppt = ppt_find(bus, slot, func);

I suspect ppt_unmap_mmio needs:

error = ppt_find(vm, bus, slot, func, &ppt);
if (error)
        return(error);

?
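
The suspected fix follows the usual out-parameter pattern: the lookup returns an errno value and hands the device back through a pointer. The sketch below only illustrates that calling convention with mock types. `mock_vm`, `mock_pptdev`, `mock_ppt_find`, and `mock_ppt_unmap` are stand-ins, and the ENOENT/EBUSY behavior is an assumption about the real function, not a copy of the vmm code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock stand-ins for the kernel's VM and ppt device structures. */
struct mock_vm { int id; };
struct mock_pptdev {
	int bus, slot, func;
	struct mock_vm *owner;
};

static struct mock_vm vm_a, vm_b;
static struct mock_pptdev mock_devs[] = {
	{ 0, 2, 0, &vm_a },	/* owned by vm_a */
	{ 0, 31, 3, &vm_b },	/* owned by vm_b */
};

/* Five-argument lookup: errno-style return, device via out-parameter. */
static int
mock_ppt_find(struct mock_vm *vm, int bus, int slot, int func,
    struct mock_pptdev **pptp)
{
	size_t i;

	assert(pptp != NULL);
	for (i = 0; i < sizeof(mock_devs) / sizeof(mock_devs[0]); i++) {
		struct mock_pptdev *ppt = &mock_devs[i];

		if (ppt->bus != bus || ppt->slot != slot || ppt->func != func)
			continue;
		if (ppt->owner != vm)
			return (EBUSY);	/* device belongs to another VM */
		*pptp = ppt;
		return (0);
	}
	return (ENOENT);
}

/* Caller adapted to the new signature, as suggested in the comment above. */
static int
mock_ppt_unmap(struct mock_vm *vm, int bus, int slot, int func)
{
	struct mock_pptdev *ppt;
	int error;

	error = mock_ppt_find(vm, bus, slot, func, &ppt);
	if (error)
		return (error);
	(void)ppt;	/* the actual unmap work would use ppt here */
	return (0);
}
```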

Should be buildable by updating https://reviews.freebsd.org/D24066 .

You are right, I missed this patch, thanks.

I'm going to rebase this patch soon.

The change is upstreamed. You could integrate it with main without integrating your own D24066.

I've planned to rebase onto main. D26035 is also upstreamed. It will change this patch a bit too.

  • Do not allocate Graphics Stolen Memory inside the PCI region

    This avoids overlap between the GSM and the BARs. The GSM is now located at the Top of Low Usable RAM.
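
Why the move avoids collisions can be checked with a simple interval test. This is an illustrative sketch, not code from the patch. The PCI window bounds are the 0xC0000000 - 0xE0000000 range quoted earlier in this review, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit PCI MMIO window quoted earlier in this review. */
#define PCI_WIN_BASE	0xC0000000ULL
#define PCI_WIN_END	0xE0000000ULL	/* exclusive */

/* True if the half-open ranges [a, a + a_len) and [b, b + b_len) intersect. */
static bool
ranges_overlap(uint64_t a_base, uint64_t a_len, uint64_t b_base,
    uint64_t b_len)
{
	assert(a_len > 0 && b_len > 0);
	return (a_base < b_base + b_len && b_base < a_base + a_len);
}

/* True if a GSM placed at gsm_base could collide with BARs in the window. */
static bool
gsm_overlaps_pci_window(uint64_t gsm_base, uint64_t gsm_size)
{
	return (ranges_overlap(gsm_base, gsm_size, PCI_WIN_BASE,
	    PCI_WIN_END - PCI_WIN_BASE));
}
```

A 64 MB GSM ending exactly at 0xC0000000 does not touch the window, while the earlier placement inside the window necessarily shares address space with the BARs allocated there.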