
bhyve: ignore low bits of CFGADR
Closed, Public

Authored by c.koehne_beckhoff.com on Sep 3 2021, 10:13 AM.

Details

Summary

Description:
Bhyve could emulate the wrong PCI register. In the best case, the guest reads the wrong register and its device driver reports errors. In the worst case, the guest writes to the wrong PCI register and could brick hardware when PCI passthrough is in use.

According to Intel's specification, the low bits of CFGADR should be
ignored. Some operating systems, such as Linux, may rely on this.
Otherwise, bhyve could emulate the wrong PCI register.

E.g.
If Linux wants to read 2 bytes from offset 0x02, the following
happens.
linux:
    outl 0x80000002 at CFGADR
    inw at CFGDAT + 2
bhyve:
    cfgoff = 0x80000002 & 0xFF = 0x02
    coff = cfgoff + (port - CFGDAT) = 0x02 + 0x02 = 0x04
Bhyve would emulate the register at offset 0x04, not 0x02.
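
The arithmetic in the example can be checked with a small standalone sketch. It is not the actual patch: the function names and the CONF1_DATA_PORT macro are hypothetical, CFGDAT is assumed to be the conventional port 0xcfc, and the "masked" variant assumes that ignoring the low bits means clearing bits [1:0] of the register offset.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of CFGDAT (conventional PCI data port). */
#define CONF1_DATA_PORT 0x0cfc

/*
 * Behavior described in the example: the low two bits of CFGADR are
 * kept, so they are counted again when the data-port offset is added.
 * (Hypothetical helper, for illustration only.)
 */
static uint32_t
coff_unmasked(uint32_t cfgadr, uint16_t port)
{
	assert(port >= CONF1_DATA_PORT && port <= CONF1_DATA_PORT + 3);
	uint32_t cfgoff = cfgadr & 0xff;
	return (cfgoff + (port - CONF1_DATA_PORT));
}

/*
 * Assumed fix: bits [1:0] of CFGADR are ignored, so the byte within
 * the dword is selected only by the data port that is accessed.
 */
static uint32_t
coff_masked(uint32_t cfgadr, uint16_t port)
{
	assert(port >= CONF1_DATA_PORT && port <= CONF1_DATA_PORT + 3);
	uint32_t cfgoff = cfgadr & 0xfc;
	return (cfgoff + (port - CONF1_DATA_PORT));
}
```

With CFGADR = 0x80000002 and a 2-byte read at CFGDAT + 2, coff_unmasked() yields 0x04 while coff_masked() yields 0x02, which is the register the guest intended.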

More information

For more information see: http://www.csit-sun.pub.ro/~cpop/Documentatie_SMP/Intel_Microprocessor_Systems/Intel_ProcessorNew/Intel%20White%20Paper/Accessing%20PCI%20Express%20Configuration%20Registers%20Using%20Intel%20Chipsets.pdf

It is important to realize that all PCI configuration cycles are 4 byte (1
Dword) read/write routines. Since bits [1:0] are reserved, this
simplifies things a bit since the actual register address can be written
into the full 8 bits of [7:0] (data written to bits [1:0] has no affect on
the created PCI configuration cycle).
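
To make the bit layout in the quote concrete, here is a minimal sketch of decoding a CFGADR value. The struct and function names are hypothetical; the field positions follow the conventional PCI configuration mechanism #1 layout (enable in bit 31, bus in [23:16], device in [15:11], function in [10:8], register in [7:2]), which matches the shifts used in the Linux helper quoted on this page.

```c
#include <stdint.h>

/* Hypothetical container for the decoded CFGADR fields. */
struct cfgadr_fields {
	int	enable;	/* bit 31 */
	uint8_t	bus;	/* bits [23:16] */
	uint8_t	dev;	/* bits [15:11] */
	uint8_t	func;	/* bits [10:8] */
	uint8_t	reg;	/* bits [7:2], returned as a dword-aligned offset */
};

static struct cfgadr_fields
decode_cfgadr(uint32_t cfgadr)
{
	struct cfgadr_fields f;

	f.enable = (cfgadr >> 31) & 0x1;
	f.bus = (cfgadr >> 16) & 0xff;
	f.dev = (cfgadr >> 11) & 0x1f;
	f.func = (cfgadr >> 8) & 0x7;
	/* Bits [1:0] are reserved and dropped here. */
	f.reg = cfgadr & 0xfc;
	return (f);
}
```

Decoding 0x80000002 this way gives register offset 0x00 with the enable bit set, so the two low bits written by the guest do not select a different dword.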

Linux source code for PCI access:

u16 read_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset)
{
	u16 v;
	outl(0x80000000 | (bus<<16) | (slot<<11) | (func<<8) | offset, 0xcf8);
	v = inw(0xcfc + (offset&2));
	return v;
}
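
The helper above writes the unmasked offset to 0xcf8 and selects the word through the data port. A side-effect-free sketch of the same computation shows which values reach the hypervisor; the struct and function names are hypothetical and no port I/O is performed.

```c
#include <stdint.h>

/* Hypothetical record of what the guest would put on the I/O ports. */
struct conf1_access {
	uint32_t address;	/* dword written to 0xcf8 (CFGADR) */
	uint16_t data_port;	/* port read with inw (CFGDAT + 0 or + 2) */
};

/* Mirrors the address and port arithmetic of read_pci_config_16(). */
static struct conf1_access
conf1_access_16(uint8_t bus, uint8_t slot, uint8_t func, uint8_t offset)
{
	struct conf1_access a;

	a.address = 0x80000000u | ((uint32_t)bus << 16) |
	    ((uint32_t)slot << 11) | ((uint32_t)func << 8) | offset;
	a.data_port = (uint16_t)(0xcfc + (offset & 2));
	return (a);
}
```

For bus 0, slot 0, function 0 and offset 0x02 this produces address 0x80000002 and data port 0xcfe, the exact combination from the example in the summary.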

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

c.koehne_beckhoff.com edited the summary of this revision. (Show Details)
grehan added a subscriber: grehan.

In practice this hasn't been an issue since a) guests use ECAM for config access, or b) mask off the lower bits (e.g. Linux routines in arch/x86/pci/direct.c).

This revision is now accepted and ready to land. Sep 22 2021, 11:00 AM

In practice this hasn't been an issue since a) guests use ECAM for config access, or b) mask off the lower bits (e.g. Linux routines in arch/x86/pci/direct.c).

For your information: I've noticed that the i915 driver in a Linux guest is unable to detect the Data Stolen Memory size correctly because it uses the routines in arch/x86/pci/early.c.

This revision was automatically updated to reflect the committed changes.