Page MenuHomeFreeBSD

pcib: Assume a window where both the base and limit are 0 is uninitialized
Needs ReviewPublic

Authored by jhb on Feb 15 2024, 5:55 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Apr 27, 6:15 AM
Unknown Object (File)
Sat, Apr 27, 6:15 AM
Unknown Object (File)
Sat, Apr 27, 4:11 AM
Unknown Object (File)
Fri, Apr 26, 3:27 AM
Unknown Object (File)
Mar 6 2024, 5:02 PM
Unknown Object (File)
Feb 18 2024, 9:40 PM
Unknown Object (File)
Feb 18 2024, 5:56 PM
Unknown Object (File)
Feb 16 2024, 12:27 AM
Subscribers

Details

Reviewers
jrtc27
Summary

Since the low bits of a window's limit are hardwired to 1, this
configuration looks like a minimally sized window at address 0.
However, PCI resources are not generally at address 0 (see the
__PCI_VAR_ZERO_VALID macro that was only defined on sparc64), and some
PCI-PCI bridges report these register values after a reset. The
result today is a lot of spam in dmesg as the minimally-sized windows
fail to allocate. By ignoring these windows and treating them as
closed the end result is the same, but there is less spam during boot.

Reported by: jrtc27

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 56047
Build 52936: arc lint + arc unit

Event Timeline

jhb requested review of this revision.Feb 15 2024, 5:55 PM

s/VAR/BAR/ in the message.

Let me test this actually shuts it up...

Perhaps worth mentioning HiFive Unmatched somewhere in the description so there's a breadcrumb to follow for where this came from in the first place?

Also, once patched so the condition is hit, this doesn't actually work. It stops enumerating at pci(b)1:

pcib0: <SiFive FU740 PCIe Controller> mem 0xe00000000-0xeffffffff,0xdf0000000-0xdffffffff,0x100d0000-0x100d0fff irq 45,46,47,48,49,50,51,52,53 on simplebus0
pci0: <PCI bus> on pcib0
pcib1: <PCI-PCI bridge> at device 0.0 on pci0
pci1: <PCI bus> on pcib1

Enabling a verbose boot in loader weirdly has the side-effect of "fixing" it: https://termbin.com/lm1g

The plot thickens...

sys/dev/pci/pci_pci.c
462

Well, I isolated it down to a single bootverbose that makes it work, but this just tells me it's almost certainly just because of the delay:

diff --git a/sys/dev/pci/pci.c b/sys/dev/pci/pci.c
index 8236b8bde41a..3ed8481084ec 100644
--- a/sys/dev/pci/pci.c
+++ b/sys/dev/pci/pci.c
@@ -4480,7 +4480,7 @@ pci_attach_common(device_t dev)
                return (ENXIO);
        }
 #endif
-       if (bootverbose)
+       if (bootverbose || 1)
                device_printf(dev, "domain=%d, physical bus=%d\n",
                    domain, busno);
        sc->sc_dma_tag = bus_get_dma_tag(dev);
diff --git a/sys/riscv/sifive/fu740_pci_dw.c b/sys/riscv/sifive/fu740_pci_dw.c
index 13937e283042..d0490d6548f2 100644
--- a/sys/riscv/sifive/fu740_pci_dw.c
+++ b/sys/riscv/sifive/fu740_pci_dw.c
@@ -215,12 +215,6 @@ fupci_phy_init(struct fupci_softc *sc)
                return (error);
        }
 
-       /* Hold PERST for 100ms as per the PCIe spec */
-       DELAY(100);
-
-       /* Deassert PERST_N */
-       FUDW_MGMT_WRITE(sc, FUDW_MGMT_PERST_N, 1);
-
        /* Deassert core power-on reset (active low) */
        error = gpio_pin_set_active(sc->porst_pin, true);
        if (error != 0) {
@@ -280,6 +274,12 @@ fupci_phy_init(struct fupci_softc *sc)
        /* Put the controller in Root Complex mode */
        FUDW_MGMT_WRITE(sc, FUDW_MGMT_DEVICE_TYPE, FUDW_MGMT_DEVICE_TYPE_RC);
 
+       /* Hold PERST for 100ms as per the PCIe spec */
+       DELAY(100000);
+
+       /* Deassert PERST_N */
+       FUDW_MGMT_WRITE(sc, FUDW_MGMT_PERST_N, 1);
+
        return (0);
 }
 

This at least seems to make it boot without bootverbose, even if moving the code (100us was obviously my bug...) differs from the reset algorithm adopted by U-Boot and Linux, though I've always been suspicious of that code sequence, it's a very odd interleaving of operations. I don't know if it's the problem breaking your patch, but dodgy PERST timing sure seems like it could result in weird delay-dependent behaviour outside the RC.

What's weird about the original bridge issue here is that every PCI-PCI bridge in this system seemingly presents the same all-zeros window on reset, yet the bridges are from four different vendors: the root complex is Synopsys DesignWare with SiFive glue, the switch everything hangs off is ASMedia, there's a bridge on the Samsung NVMe SSD, and there are multiple bridges on the AMD GPU. Is it really plausible that all four of them have the same bug, compared with something weird going on with this specific controller?

Ok, I went and found a copy of the PCI-PCI bridge spec (v1.1), which has this to say:

3.2.5.4. Subordinate Bus Number Register
...
A bridge must implement this register as a read/write register and the default state after reset must be zero.

3.2.5.6. I/O Base Register and I/O Limit Register
...
If a bridge supports an I/O address range, then these registers must be initialized by configuration software so default states are not specified.

3.2.5.8. Memory Base Register and Memory Limit Register
...
These registers must be initialized by configuration software so default states are not specified.

3.2.5.9. Prefetchable Memory Base Register and Prefetchable Memory Limit Register
...
If a bridge supports a prefetchable memory address range, then these registers must be initialized by configuration software so default states are not specified.

Thus, since we can't rely on the firmware having initialised the PCIe controller for this board and we issue a reset in the driver regardless, we're hitting these cases. Manufacturers implementing them as "reset to 0" is the most obvious choice, and is what we're seeing here, but technically isn't the only option. It feels like perhaps we should be communicating whether or not we expect the bridges to have been already configured to the pci_pci unit from the root?

Ok, I went and found a copy of the PCI-PCI bridge spec (v1.1), which has this to say:

3.2.5.4. Subordinate Bus Number Register
...
A bridge must implement this register as a read/write register and the default state after reset must be zero.

3.2.5.6. I/O Base Register and I/O Limit Register
...
If a bridge supports an I/O address range, then these registers must be initialized by configuration software so default states are not specified.

3.2.5.8. Memory Base Register and Memory Limit Register
...
These registers must be initialized by configuration software so default states are not specified.

3.2.5.9. Prefetchable Memory Base Register and Prefetchable Memory Limit Register
...
If a bridge supports a prefetchable memory address range, then these registers must be initialized by configuration software so default states are not specified.

Thus, since we can't rely on the firmware having initialised the PCIe controller for this board and we issue a reset in the driver regardless, we're hitting these cases. Manufacturers implementing them as "reset to 0" is the most obvious choice, and is what we're seeing here, but technically isn't the only option. It feels like perhaps we should be communicating whether or not we expect the bridges to have been already configured to the pci_pci unit from the root?

Well, we normally do expect bridges to be configured by firmware. That said, there's already an easy way to "clear" a bridge (there's a tunable to force clearing for bridges I used for testing) so if we have a simple way during pcib_attach that we can ask our parent "should we pretend we are reset" we can call the clear function as if the tunable was set. We could perhaps check if all of the window registers are zero, and if so use that to infer that we need to reset the bridge?