Page MenuHomeFreeBSD

Add StarFive JH7110's PCIE controller driver
ClosedPublic

Authored by jsihv_gmx.com on Dec 4 2024, 6:14 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Oct 16, 4:26 AM
Unknown Object (File)
Mon, Oct 13, 10:22 AM
Unknown Object (File)
Sun, Oct 12, 12:43 PM
Unknown Object (File)
Sun, Oct 12, 6:20 AM
Unknown Object (File)
Thu, Oct 9, 3:09 PM
Unknown Object (File)
Sat, Oct 4, 10:35 AM
Unknown Object (File)
Fri, Oct 3, 5:13 AM
Unknown Object (File)
Fri, Oct 3, 3:58 AM

Details

Summary

This driver requires jh7110_gpio.c and jh7110_clk_stg.c

JH7110 has two PCIE controller devices. First one is used by board's integrated USB which has no driver. Switching PHY to USB mode is not currently implemented. This functionality could be added in a form of a separate PCIE PHY driver if needed. PHY is on by default and there's no need to switch it on.

Pre/post_ithread and post_filter methods are not used for interrupt masking since they are meant for level-triggered interrupts whereas JH7110's MSI interrupts are edge triggered (and INTx interrupts do not use this irqsrc scheme at all). Pre_ithread method is nevertheless used for MSI bottom acking.

The driver has been tested with Kingston SNV2S NVME SSD
The functionality of INTx and MSI interrupts (as opposed to default MSIx) has been tested by forcing NVME to use them

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Now adding this file to files.starfive

More about the memory allocation issue: It looks now that in pcib_probe_windows() a wrong value is read to "val" variable from register PCIR_PMBASEL_1. I don't suspect a bug in the PCI code anymore. One developer said it's probably related to U-Boot.

This is not a full review, just a couple of quick notes.

I pulled the change and will be testing it out :)

sys/riscv/starfive/files.starfive
7

You did not add device jh7110_pcie anywhere, and therefore did not enabled the driver.

Instead I suggest just making it optional on fdt pci, we don't need the new device IMO.

sys/riscv/starfive/jh7110_pcie.c
10–11

We don't include this header anymore.

258

Preferred style for this type of mask check.

999–1001

To follow recent bus API changes.

I agree with all the comments.

This update fixes small issues mentioned in previous comments and removes KVCO adjustments. I also report here about a U-Boot dependency.

I found out that I had done a careless mistake with KVCO adjustments. Registers used there were all plain wrong. It seems that KVCO adjustment should be done by using PCIE-PHY's registers (StarFive's own FDT is different in this respect, they have KVCO in their own kernel's pcie driver).

I'm not sure what the actual meaning of KVCO adjustments is (U-Boot doesn't have them) but if we want them, then we should implement a PCIE-PHY driver (or make some kind of hack here to reach another device's registers).

Otherwise PCIE-PHY driver could potentially contain just mode selection but FreeBSD's xhci driver seems not to use these modes and USB and NVME seem to work as one would expect without switching any modes, so it seems our PCIE-PHY driver would have nothing but this KVCO adjustment (two write calls). Would that be fine? It wouldn't take much if any time to implement that.

By building U-Boot without PCIE and/or USB, I found out that PCIB0 (the first PCIE device) is currently dependent on U-Boot's PCIE driver for JH7110. Despite of extensive debugging, I haven't yet found out the actual reason for this dependency. Currently it seems it's something in a function called starfive_pcie_init_port() in U-Boot. No matter how odd it may sound, while the second PCIe device works fine without U-Boot's PCIE driver, PCIB0's data link is not found.

Note that PCIE-PHY and USB-PHY drivers do not appear in U-Boot which uses xhci driver for USB like we do. I doesn't seem to me now that USB-PHY driver would do anything that we need.

More on the dependency issue:
I took my tests later to a point where I discovered that removing any other parts than resets or clocks from u-boot's pcie driver doesn't cause the PCIB0's datalink to disapper (while removing the whole driver causes it). Then removing either clocks or resets causes u-boot to halt after running through the driver code. These outcomes are pretty mysterious because clocks and resets are treated similarly in freebsd's pcie driver and u-boot's driver. Could they nevertheless cause the dependency? Next step might be to investigate if clock rates appear similar for both drivers. But I'm not sure if this is worth the time.

About the PCIE-PHY driver and KVCO adjustments... I think we're not going to have them, since they're probably just for cdns3 driver and we don't know their purpose in any case.

I tested now with a revised version of the GPIO driver (submitted to this site few days ago) and U-boot which doesn't use its PCIE driver and the problem (PCIB0 not getting the link) does not show up anymore. Probably a mistake in GPIO driver's IO methods had caused the issue which I had described as a U-boot dependency. This PCIE driver's IO methods were much more carefully tested.

Looks good to me, with some small nitpicks.

I am still no expert on PCI drivers, but I will test this out again.

sys/riscv/starfive/jh7110_pcie.c
14

Check sorting.

165–173

Can ilog2 from sys/libkern.h be used instead?

720

Missing mtx_destroy(&sc->msi_mtx).

855–859

Above we call gpio_pin_setflags() with GPIO_PIN_PRESET_HIGH. If I taught the jh7110_gpio to handle this flag, would this call still be needed?

Also, is there a reason this call appears after all this other setup?

This revision is now accepted and ready to land.Jul 30 2025, 3:19 PM
jsihv_gmx.com added inline comments.
sys/riscv/starfive/jh7110_pcie.c
165–173

Yes, I checked these functions give identical results. (ilog2() panics for argument 0 but if FDT has such strange data, then attach function very likely fails at some earlier point)

855–859

This part of the code is a remnant from the past which I was supposed to reconsider but forgot it. I don't think it has any purpose so I remove it.

This revision now requires review to proceed.Jul 31 2025, 2:21 PM

I've built and booted the driver.

Testing with NVMe, I got a pretty spectacular failure. I had root mounted from SD card (UFS). From the NVMe drive I imported the zpool, and successfully mounted several ZFS datasets.

When trying to write a large test file to the NVMe drive, I got repeated "pcib2: axi fetch error" messages:

root@vf2:/home/mhorne # dd if=/dev/random of=./testfile bs=1m count=512
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
512+0 records in
512+0 records out
536870912 bytes transferred in 19.761650 secs (27167312 bytes/sec)
root@vf2:/home/mhorne # pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error

... (messages repeat many times; I type 'shutdown -p now')

nvme0: Resetting controller due to a timeout.
nvme0: event="start"
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error

...

pcib2: axi fetch error
pcib2: axi fetch error
nvme0: Waiting for reset to complete
pcib2: axi fetch error
pcib2: axi fetch error

...

pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
pcib2: axi fetch error
nvme0: controller ready did not become 1 within 10500 ms
pcib2: axi fetch error
nvme0: event="timed_out"
nvme0: failing outstanding i/o
nvme0: WRITE sqid:2 cid:127 nsid:1 lba:248405776 len:256
nvme0: ABORTED_BY_REQUEST crd:0 m:0 dnr:1 p:0 sqid:2 cid:127 cdw0:0
nvme0: failing outstanding i/o
(nda0:nvme0:0:0:1): WRITE. NCB: opc=1 fuse=0 nsid=1 prp1=0 prp2=0 cdw=ece5f10 0 ff 0 0 0
nvme0: WRITE sqid:3 cid:122 nsid:1 lba:248405520 len:256
(nda0:nvme0:0:0:1): CAM status: NVME Status Error
nvme0: ABORTED_BY_REQUEST crd:0 m:0 dnr:1 p:0 sqid:3 cid:122 cdw0:0
(nda0:nvme0:0:0:1): NVMe status: ABORTED_BY_REQUEST DNR
(nda0:nvme0:0:0:1): Error 5, Retries exhausted
pass0 at nvme0 bus 0 scbus0 target 0 lun 1
pass0: <WDC WDS100T2B0C-00PXH0 211070WD 2042C0806040> s/n 2042C0806040 detached
nda0 at nvme0 bus 0 scbus0 target 0 lun 1
nda0: <WDC WDS100T2B0C-00PXH0 211070WD 2042C0806040> s/n 2042C0806040 detached
(nda0:nvme0:0:0:1): WRITE. NCB: opc=1 fuse=0 nsid=1 prp1=0 prp2=0 cdw=ece5e10 0 ff 0 0 0
(nda0:nvme0:0:0:1): CAM status: NVME Status Error
(nda0:nvme0:0:0:1): NVMe status: ABORTED_BY_REQUEST DNR
(nda0:nvme0:0:0:1): Error 6, Periph was invalidated
(pass0:nvme0:0:0:1): Periph destroyed
nvme0: Failed controller, stopping watchdog timeout.
nvme0: Failed controller, stopping watchdog timeout.
Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.

System shutdown time has arrived
Solaris: WARNING: Pool 'zroot' has encountered an uncorrectable I/O failure and has been suspended.
Stopping cron.
Stopping sshd.
Waiting for PIDS: 2171.

90 second watchdog timeout expired. Shutdown terminated.
Wed Aug  6 19:22:30 UTC 2025
Aug  6 19:22:30 vf2 init[1]: /etc/rc.shutdown terminated abnormally, going to single user mode
Aug  6 19:22:33 vf2 syslogd: exiting on signal 15
2025-08-06T19:22:50.621664+00:00 vf2.mhorne.lan init 1 - - some processes would not die; ps axl advised
This revision was not accepted when it landed; it landed in state Needs Review.Aug 7 2025, 11:15 PM
This revision was automatically updated to reflect the committed changes.

This is wrong. The triplet of pic_post_filter/pic_post_ithread/pic_pre_ithread are always *required* to be implemented by all INTRNG PICs. Omitting pic_post_filter and pic_post_ithread is a violation of the interface.

Initially implementing them as stubs would be okay, but not omitting them. I'm surprised this doesn't reliably panic due to an unimplemented hook being called. Contrary to the commit message, use of the other two functions is determined by the type of handler not the type of interrupt. In particular if a system has at least one interrupt filter and at least one interrupt thread, this is certain to malfunction.

Theoretically, every interrupt controller should implement all 3 of these functions.

But:
For edge-triggered interrupts, post_filter() is always empty because EOI was issued before the filter function was called, and masking is not necessary.

MSI handler (not to be confused with MSIX) is not root, but chained interrupt controller. This means that the interrupt is signaled to the root GIC using a dedicated interrupt line, which is common to all MSI interrupts. Moreover, MSI handler does not have an EOI 'facility' (by design). Because of this, we can leave pre/post_ithread() also empty, since masking is already done by the root GIC (on common line for all MSIs at once)
and MSI handler does not need EOI.

And IMHO, it is always better to implement empty methods to distinguish it from forgotten ones.

Sorry for late response.