
hyperv: enable PCIe pass-through (a.k.a. Discrete Device Assignment)
Closed Public

Authored by decui_microsoft.com on Oct 25 2016, 8:17 AM.

Details

Summary

The feature enables us to pass through physical PCIe devices to a FreeBSD VM
running on Hyper-V (Windows Server 2016) to get near-native performance with
low CPU utilization.

The patch implements a PCI bridge driver to support the feature:

  1. the pcib driver talks to the host to discover device(s) and presents
     the device(s) to FreeBSD's pci driver via PCI configuration space (note:
     to access the configuration space, we don't use the standard I/O port
     0xCF8/CFC method; instead, we use an MMIO-based method supplied by Hyper-V,
     which is very similar to the 0xCF8/CFC method);

  2. the pcib driver allocates resources for the device(s) and initializes
     the related BARs when the device driver's attach method is invoked;

  3. the pcib driver talks to the host to create MSI/MSI-X interrupt
     remapping between the guest and the host;

  4. the pcib driver supports device hot add/remove.
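
For readers unfamiliar with the legacy mechanism mentioned in the first item above: it selects a register by writing an encoded address to I/O port 0xCF8 and then moves the data through port 0xCFC. The sketch below shows that standard address encoding, followed by a toy model of a "select, then access" MMIO window. The window layout (`toy_cfg_window`, its `selector` field, and the per-function 4 KB pages) and the helper `toy_cfg_read_4()` are assumptions made purely for illustration; they are not the layout defined by Hyper-V or the code in this patch.

```c
#include <stdint.h>
#include <string.h>

/* Standard PCI configuration mechanism #1: the value written to port 0xCF8. */
static uint32_t
cf8_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t reg)
{
	return (0x80000000u |                   /* enable bit */
	    ((uint32_t)bus << 16) |             /* bus number */
	    ((uint32_t)(dev & 0x1f) << 11) |    /* device (slot) number */
	    ((uint32_t)(func & 0x07) << 8) |    /* function number */
	    (reg & 0xfcu));                     /* dword-aligned register */
}

/*
 * Hypothetical MMIO window (illustration only): a selector register picks
 * the function, and its config space is then read from a data page.
 */
struct toy_cfg_window {
	uint32_t selector;          /* which function is currently selected */
	uint8_t  data[2][4096];     /* toy backing store, one page per function */
};

static uint32_t
toy_cfg_read_4(struct toy_cfg_window *w, uint32_t func, uint32_t reg)
{
	uint32_t val;

	/* Step 1: select the function (plays the role of the 0xCF8 write). */
	w->selector = func;
	/* Step 2: read the dword (plays the role of the 0xCFC read). */
	memcpy(&val, &w->data[w->selector][reg & 0xffc], sizeof(val));
	return (val);
}
```

Both schemes share the same two-step shape, which is the sense in which an MMIO-based method can be "very similar" to the port-based one; only the transport (memory-mapped registers versus I/O ports) differs.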

BTW, I also put the patch on GitHub:
https://github.com/dcui/freebsd/commits/decui/master/1025-all-in-one

Test Plan

I tested the patch on Windows Server 2016 with an Intel 82599 NIC and a
Mellanox ConnectX-3 NIC. I successfully passed through the NICs
to a FreeBSD VM (HEAD), and the NICs worked in the VM.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

decui_microsoft.com retitled this revision to hyperv: enable PCIe pass-through (a.k.a. Discrete Device Assignment).
decui_microsoft.com updated this object.
decui_microsoft.com edited the test plan for this revision.

The changes to VMBus were moved to 2 separate patches:
https://reviews.freebsd.org/D8410, and
https://reviews.freebsd.org/D8409

jhb edited edge metadata. Nov 2 2016, 4:38 PM

This is an interesting approach for pass through. The only other hypervisor I'm familiar with (bhyve) does the config space emulation in the hypervisor itself so that guests don't need modifications for pass through (they just show up as PCI devices).

sys/dev/hyperv/pcib/pcib.c
1436 ↗(On Diff #21879)

In new drivers I prefer to use 'bus_read_4(hbus->cfg_res, <reg>)' rather than 'bus_space_read_4(tag, handle, <reg>)'. It's shorter and requires less space in the softc.
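
To make the space argument concrete, here is a small userland model of the two styles. Everything prefixed `toy_` is a simplified stand-in written for this illustration, not the real kernel definition; the only assumption carried over from FreeBSD is that the resource-based accessor obtains the bus tag and handle from the resource itself, so the softc does not need to cache them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userland stand-ins for the kernel types; NOT the real definitions. */
typedef int      toy_tag_t;
typedef uint8_t *toy_handle_t;

struct toy_resource {
	toy_tag_t    r_bustag;
	toy_handle_t r_bushandle;
};

/* Stand-in for the bus_space_read_4(tag, handle, offset) style. */
static uint32_t
toy_space_read_4(toy_tag_t tag, toy_handle_t handle, size_t off)
{
	uint32_t v;

	(void)tag;                      /* unused in this toy model */
	memcpy(&v, handle + off, sizeof(v));
	return (v);
}

/* Stand-in for the bus_read_4(res, offset) style: tag/handle come from res. */
static uint32_t
toy_read_4(struct toy_resource *res, size_t off)
{
	return (toy_space_read_4(res->r_bustag, res->r_bushandle, off));
}

/* Older style: the softc caches the tag and handle next to the resource. */
struct toy_softc_old {
	struct toy_resource *cfg_res;
	toy_tag_t            cfg_tag;
	toy_handle_t         cfg_handle;
};

/* Newer style: the resource pointer alone is enough. */
struct toy_softc_new {
	struct toy_resource *cfg_res;
};
```

With the second layout, a call site shrinks from three arguments to two, and the two cached fields disappear from the softc.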

1539 ↗(On Diff #21879)

This will device_delete_child() any child devices I assume?

sys/dev/hyperv/pcib/pcib.c
1436 ↗(On Diff #21879)

Thanks! I'll change to the new style and I can remove some lines. :-)

1539 ↗(On Diff #21879)

Yes.
hv_pci_devices_present() ==> pci_devices_present_work() -> hv_pci_delete_device() -> device_delete_child().

decui_microsoft.com added a comment. Edited Nov 3 2016, 11:43 AM
In D8332#175267, @jhb wrote:

This is an interesting approach for pass through. The only other hypervisor I'm familiar with (bhyve) does the config space emulation in the hypervisor itself so that guests don't need modifications for pass through (they just show up as PCI devices).

Yes, Hyper-V's "Para-virtualized emulation of config space" approach is special.
I guess the Hyper-V team thinks that this approach requires the fewest changes in the hypervisor, guest firmware, device model, etc.

E.g., the motherboard chipset and PCI bus 0 emulated by Hyper-V are pretty old, so they can't support PCIe, and a Hyper-V UEFI VM doesn't have the legacy PCI bus 0. And, to support hot add/remove in the standard way, Hyper-V may need to emulate ACPI GPE and SCI interrupts, etc.

Of course, the obvious drawback is: we have to introduce such a pcib driver into the guest...

decui_microsoft.com edited edge metadata.

This new version addresses jhb's comment: "use bus_read_4(hbus->cfg_res, xxx)".

Cleaned up some comments.
Cleaned up vmbus_pcib_detach() according to discussions with the Hyper-V team.

No functional change.

sepherosa_gmail.com edited edge metadata.
This revision is now accepted and ready to land. Nov 14 2016, 5:40 AM
This revision was automatically updated to reflect the committed changes.