Page MenuHomeFreeBSD

Fix bhyve PCIe capability emulation
ClosedPublic

Authored by chuck on Mar 14 2019, 9:55 AM.
Tags
None
Referenced Files
Unknown Object (File)
Dec 23 2023, 3:19 AM
Unknown Object (File)
Dec 12 2023, 3:13 AM
Unknown Object (File)
Nov 21 2023, 3:53 AM
Unknown Object (File)
Nov 20 2023, 1:20 AM
Unknown Object (File)
Nov 20 2023, 1:20 AM
Unknown Object (File)
Nov 20 2023, 1:20 AM
Unknown Object (File)
Nov 20 2023, 1:20 AM
Unknown Object (File)
Nov 20 2023, 1:20 AM
Subscribers

Details

Summary

PCIe devices starting with version 1.1 must set the Role-Based Error
Reporting bit.

And while we're in the neighborhood, generalize the code assigning the
device type.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 23081

Event Timeline

This revision is now accepted and ready to land.Mar 14 2019, 11:38 AM
imp added inline comments.
usr.sbin/bhyve/pci_emul.c
956 ↗(On Diff #55051)

this is a nop, but likely a good change none-the-less if you intended to do it.

usr.sbin/bhyve/pci_emul.c
956 ↗(On Diff #55051)

Yes, I have a WIP patch to add endpoint support which then would use type.

This revision was automatically updated to reflect the committed changes.

One of @jhb requests was a comment on why the RBER change was needed. Here is the thought process:

qemu in their PCIe emulation (hw/pci/pcie.c) has the comment

/*
 * Windows guests will report Code 10, device cannot start, if
 * a regular Endpoint type is exposed on a root complex.  These
 * should instead be Root Complex Integrated Endpoints.
 */

If true, and Windows is actually validating PCIe capability structures to this degree, then it seemed prudent to set other bits in this area as required by the PCIe spec.

As to where to go from here, I'm conflicted. On the one hand, the change doesn't seem to fix anything from my testing. On the other hand, given this comment in qemu and the recent experience of Windows validating the size of BAR0 in NVMe devices, I'm tempted to go back through the PCIe spec and make sure that bhyve's PCIe capability structure is compliant.

One of @jhb requests was a comment on why the RBER change was needed. Here is the thought process:

qemu in their PCIe emulation (hw/pci/pcie.c) has the comment

/*
 * Windows guests will report Code 10, device cannot start, if
 * a regular Endpoint type is exposed on a root complex.  These
 * should instead be Root Complex Integrated Endpoints.
 */

If true, and Windows is actually validating PCIe capability structures to this degree, then it seemed prudent to set other bits in this area as required by the PCIe spec.

As to where to go from here, I'm conflicted. On the one hand, the change doesn't seem to fix anything from my testing. On the other hand, given this comment in qemu and the recent experience of Windows validating the size of BAR0 in NVMe devices, I'm tempted to go back through the PCIe spec and make sure that bhyve's PCIe capability structure is compliant.

Hmm, due to what this bit actually does, I'd probably be inclined to not set it if we don't need it. When the role based error reporting is set, it means we also support at least one other register to configure the mask of fatal vs non-fatal errors IIRC which I'm pretty sure we don't do. I had assumed Windows was failing to boot without this bit set. If Windows is actually fine then I think I'd prefer we drop it until we find a need for it. A PCIe 1.0 device can have this bit clear and still be compliant. The other thing about being a special endpoint when directly hung off the RC would be required even for 1.0 compliance.

In D19580#422125, @jhb wrote:

Hmm, due to what this bit actually does, I'd probably be inclined to not set it if we don't need it. When the role based error reporting is set, it means we also support at least one other register to configure the mask of fatal vs non-fatal errors IIRC which I'm pretty sure we don't do. I had assumed Windows was failing to boot without this bit set. If Windows is actually fine then I think I'd prefer we drop it until we find a need for it. A PCIe 1.0 device can have this bit clear and still be compliant. The other thing about being a special endpoint when directly hung off the RC would be required even for 1.0 compliance.

I'm happy to revert this change but wanted to double check the "PCIe 1.0" comment. Even though the code advertises Gen1 speeds, the PCIe Capability Version field is set to 0x2 which I don't think a 1.0 device would do. Should this be changed to 0x1?

In D19580#422125, @jhb wrote:

Hmm, due to what this bit actually does, I'd probably be inclined to not set it if we don't need it. When the role based error reporting is set, it means we also support at least one other register to configure the mask of fatal vs non-fatal errors IIRC which I'm pretty sure we don't do. I had assumed Windows was failing to boot without this bit set. If Windows is actually fine then I think I'd prefer we drop it until we find a need for it. A PCIe 1.0 device can have this bit clear and still be compliant. The other thing about being a special endpoint when directly hung off the RC would be required even for 1.0 compliance.

I'm happy to revert this change but wanted to double check the "PCIe 1.0" comment. Even though the code advertises Gen1 speeds, the PCIe Capability Version field is set to 0x2 which I don't think a 1.0 device would do. Should this be changed to 0x1?

Hmm, that's a good point. We probably need to document very clearly which specific PCIe version the pseudo devices are corresponding to and then follow that version of the spec. I'll have to look at it more to see what version we claim to be.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 13 2019, 11:37 PM
Closed by commit rS346194: Revert r345171 pending review (authored by chuck). · Explain Why
This revision was automatically updated to reflect the committed changes.