Page MenuHomeFreeBSD

bhyve: Let the guest enable PCI BARs on arm64
Needs ReviewPublic

Authored by markj on Wed, May 1, 4:36 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, May 4, 6:04 PM
Unknown Object (File)
Thu, May 2, 6:43 AM
Unknown Object (File)
Thu, May 2, 6:43 AM
Unknown Object (File)
Thu, May 2, 12:15 AM

Details

Reviewers
corvink
jhb
andrew
grehan
Group Reviewers
bhyve
Summary

u-boot appears to ignore existing BAR mappings and creates its own based
on its view of the memory map (hard-coded in the bhyve-arm64 u-boot
config). This can cause problems when it creates mappings on top of the
ones created by bhyve, since the MMIO implementation in bhyve doesn't
expect to handle that scenario. See
https://lists.freebsd.org/archives/freebsd-virtualization/2024-April/002103.html
for a detailed example.

Rather than making bhyve more complex to handle this case, simply add a
mode which lets bhyve avoid creating mappings at all. This might be a
bit simplistic: we could conceivably want different behaviour for
passthrough devices (not supported on arm64) vs emulated devices, or
perhaps the default should depend on the firmware. However, this works
for now and doesn't introduce much complexity.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 57502
Build 54390: arc lint + arc unit

Event Timeline

markj requested review of this revision.Wed, May 1, 4:36 PM

I believe it's also legal to go and map two BARs to overlap so long as you don't try to access the overlapping range when that's the case? QEMU just maintains a list of PCI BARs and goes for the first one in the list that matches. Allocating the big chunk of MMIO memory statically at the start and just maintaining the list layered on top in PCI code as you mess with BARs seems like it would be a simple, more general fix, and should perform just fine? (Though passthrough may well be "fun" as you mention)

I believe it's also legal to go and map two BARs to overlap so long as you don't try to access the overlapping range when that's the case? QEMU just maintains a list of PCI BARs and goes for the first one in the list that matches. Allocating the big chunk of MMIO memory statically at the start and just maintaining the list layered on top in PCI code as you mess with BARs seems like it would be a simple, more general fix, and should perform just fine? (Though passthrough may well be "fun" as you mention)

Yes, we could also permit overlapping BARs since u-boot (empirically) won't try to access them in the window where there's some ambiguity. It's not too hard to implement and I thought about that, but this approach is simpler and is closer to what u-boot actually expects, which is that it needs to program BARs itself. It feels a bit questionable to make bhyve more complicated to support what u-boot's doing when we can instead make bhyve do less. If there are further reasons to support overlapping MMIO regions then it's easy to revert this patch.

My mental model of what is safe is that it's allowed to move BARs around so long as you disable decoding while you do so, and I'm pretty sure we already do that now to handle FreeBSD kernels (and other OS kernels) that rewrite BARs to size them during boot. We have to avoid trying to register/unregister them while rewriting, and I thought that was driven by if the I/O space was enabled. (See how update_bar_address makes the register_bar call conditional on encoding being enabled.) It sounds like u-boot is just buggy here in that it isn't disabling decoding while it messes with the BARs. This idea is ok though. I wonder if FreeBSD/amd64 boots with this set to true. :)

usr.sbin/bhyve/pci_emul.c
862

Hmm, I would expand this if further perhaps to not bother with the pci_get_cfgdata and pci_set_cfgdata. Maybe just return early in this case?

In D45049#1027191, @jhb wrote:

My mental model of what is safe is that it's allowed to move BARs around so long as you disable decoding while you do so, and I'm pretty sure we already do that now to handle FreeBSD kernels (and other OS kernels) that rewrite BARs to size them during boot. We have to avoid trying to register/unregister them while rewriting, and I thought that was driven by if the I/O space was enabled. (See how update_bar_address makes the register_bar call conditional on encoding being enabled.) It sounds like u-boot is just buggy here in that it isn't disabling decoding while it messes with the BARs. This idea is ok though. I wonder if FreeBSD/amd64 boots with this set to true. :)

Aha. So yes, U-Boot indeed isn't disabling decoding whilst it probes and allocates the BARs. It *is* only enabling decoding for the types of BARs it allocates once it's done, but it doesn't write the initial value back first. https://source.denx.de/u-boot/u-boot/-/blob/ff0de1f0557ed7d2dab47ba976a37347a1fdc432/drivers/pci/pci_auto.c#L42

But that still doesn't help you if you move the BAR for A to overlap one of B's before you've probed it? bhyve will still get in a mess in that scenario.

So for "bare metal" cases like booting with a bootrom (e.g. UEFI) we should arguably leave bars unregistered since the raw firmware probably does that. Only the bhyveload case probably wants to enable BARs on startup. I would be fine with not using a user-facing config option but basically making the test in this patch be some kind of "is there a bootrom or not" check. Would need to test it on amd64 with UEFI.

IIRC, the default value of the busmasteren, porten and memen bits is 0 according to PCI specification. So, as jhb already mentioned, it should be valid to unset these values when booting with a bootrom.

usr.sbin/bhyve/pci_emul.c
1155

We shouldn't set BUSMASTEREN as well.