
hyperv: enable PCIe pass-through (a.k.a. Discrete Device Assignment)
ClosedPublic

Authored by decui_microsoft.com on Oct 25 2016, 8:17 AM.

Details

Summary

This feature lets us pass physical PCIe devices through to a FreeBSD VM
running on Hyper-V (Windows Server 2016) to get near-native performance with
low CPU utilization.

The patch implements a PCI bridge driver to support the feature:

  1. the pcib driver talks to the host to discover device(s) and presents
     the device(s) to FreeBSD's pci driver via PCI configuration space (note:
     to access the configuration space, we don't use the standard I/O port
     0xCF8/CFC method; instead, we use an MMIO-based method supplied by Hyper-V,
     which is very similar to the 0xCF8/CFC method);

  2. the pcib driver allocates resources for the device(s) and initializes
     the related BARs when the device driver's attach method is invoked;

  3. the pcib driver talks to the host to create MSI/MSI-X interrupt
     remapping between the guest and the host;

  4. the pcib driver supports device hot add/remove.
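
For readers unfamiliar with the 0xCF8/CFC method mentioned in item 1, here is a minimal sketch of how the standard mechanism encodes a configuration address: the guest writes a 32-bit address to port 0xCF8 and then transfers data through port 0xCFC. The function name `pci_cf8_address` is made up for illustration and is not part of this patch. The sketch does not show Hyper-V's MMIO layout, which the summary only describes as "very similar".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only: build the 32-bit value written to I/O port 0xCF8 in
 * the legacy PCI configuration mechanism. The selected dword is then read
 * or written through port 0xCFC. The function name is hypothetical.
 */
uint32_t
pci_cf8_address(uint8_t bus, uint8_t slot, uint8_t func, uint8_t reg)
{
	assert(slot < 32);		/* 5-bit device (slot) number */
	assert(func < 8);		/* 3-bit function number */
	assert((reg & 3) == 0);		/* register offset is dword aligned */

	return ((UINT32_C(1) << 31) |		/* enable bit */
	    ((uint32_t)bus << 16) |		/* bits 23..16: bus */
	    ((uint32_t)slot << 11) |		/* bits 15..11: device */
	    ((uint32_t)func << 8) |		/* bits 10..8: function */
	    (uint32_t)reg);			/* bits 7..2: register */
}
```

An MMIO variant of the same idea would select the target with a store to a register in the mapped window instead of an `outl` to 0xCF8; the exact offsets are defined by Hyper-V and are not reproduced here.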

BTW, I also put the patch on GitHub:
https://github.com/dcui/freebsd/commits/decui/master/1025-all-in-one

Test Plan

I tested the patch on Windows Server 2016 with an Intel 82599 NIC and a
Mellanox ConnectX-3 NIC. I successfully passed both NICs through
to a FreeBSD VM (HEAD), and the NICs worked in the VM.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

decui_microsoft.com retitled this revision to hyperv: enable PCIe pass-through (a.k.a. Discrete Device Assignment).
decui_microsoft.com updated this object.
decui_microsoft.com edited the test plan for this revision.

This is an interesting approach for pass through. The only other hypervisor I'm familiar with (bhyve) does the config space emulation in the hypervisor itself so that guests don't need modifications for pass through (they just show up as PCI devices).

sys/dev/hyperv/pcib/pcib.c
1437

In new drivers I prefer to use 'bus_read_4(hbus->cfg_res, <reg>)' rather than 'bus_space_read_4(tag, handle, <reg>)'. It's shorter and requires less space in the softc.
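
To illustrate the difference between the two styles, here is a userland mock, not kernel code. The `struct resource`, `bus_space_*` types, and the two softc layouts below are simplified stand-ins invented for this sketch. The `bus_read_4` macro is written in the spirit of the wrapper in FreeBSD's `sys/bus.h`, which takes the tag and handle from the resource itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
typedef int bus_space_tag_t;
typedef uint8_t *bus_space_handle_t;
typedef size_t bus_size_t;

struct resource {
	bus_space_tag_t r_bustag;
	bus_space_handle_t r_bushandle;
};

/* Mock accessor: reads 4 bytes at 'off' from a plain memory buffer. */
uint32_t
bus_space_read_4(bus_space_tag_t tag, bus_space_handle_t h, bus_size_t off)
{
	uint32_t v;

	(void)tag;
	memcpy(&v, h + off, sizeof(v));
	return (v);
}

/* Resource-based wrapper: the tag and handle come from the resource. */
#define bus_read_4(r, o) \
	bus_space_read_4((r)->r_bustag, (r)->r_bushandle, (o))

/* Older style: the softc caches the tag and handle next to the resource. */
struct softc_old {
	struct resource *cfg_res;
	bus_space_tag_t cfg_tag;
	bus_space_handle_t cfg_handle;
};

/* Newer style: the resource pointer alone is enough. */
struct softc_new {
	struct resource *cfg_res;
};

uint32_t
cfg_read_old(struct softc_old *sc, bus_size_t reg)
{
	return (bus_space_read_4(sc->cfg_tag, sc->cfg_handle, reg));
}

uint32_t
cfg_read_new(struct softc_new *sc, bus_size_t reg)
{
	return (bus_read_4(sc->cfg_res, reg));
}
```

Both helpers return the same value for the same register; the second only needs `cfg_res` in the softc, which is the space saving the comment refers to.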

1540

This will device_delete_child() any child devices I assume?

sys/dev/hyperv/pcib/pcib.c
1437

Thanks! I'll change to the new style and I can remove some lines. :-)

1540

Yes.
hv_pci_devices_present() ==> pci_devices_present_work() -> hv_pci_delete_device() -> device_delete_child().

In D8332#175267, @jhb wrote:

This is an interesting approach for pass through. The only other hypervisor I'm familiar with (bhyve) does the config space emulation in the hypervisor itself so that guests don't need modifications for pass through (they just show up as PCI devices).

Yes, Hyper-V's "Para-virtualized emulation of config space" approach is special.
I guess the Hyper-V team thinks this approach requires the fewest changes to the hypervisor, guest firmware, device model, etc.

E.g., the motherboard chipset and PCI bus 0 emulated by Hyper-V are pretty old, so they can't support PCIe, and a Hyper-V UEFI VM doesn't have the legacy PCI bus 0 at all. And, to support hot add/remove in the standard way, Hyper-V might need to emulate ACPI GPE and SCI interrupts, etc.

Of course, the obvious drawback is: we have to introduce such a pcib driver into the guest...

decui_microsoft.com edited edge metadata.

This new version addressed jhb's comment: "use bus_read_4(hbus->cfg_res, xxx)".

Cleaned up some comments.
Cleaned up vmbus_pcib_detach() according to discussions with the Hyper-V team.

No functional change.

This revision is now accepted and ready to land. Nov 14 2016, 5:40 AM
This revision was automatically updated to reflect the committed changes.