Page MenuHomeFreeBSD

Add StarFive JH7110's Ethernet (eqos & mcommphy modifications, a new board specific file)
ClosedPublic

Authored by jsihv_gmx.com on Jun 15 2024, 9:29 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Apr 2, 7:49 PM
Unknown Object (File)
Wed, Apr 2, 5:00 PM
Unknown Object (File)
Tue, Apr 1, 2:43 PM
Unknown Object (File)
Thu, Mar 27, 5:08 PM
Unknown Object (File)
Thu, Mar 27, 5:08 PM
Unknown Object (File)
Fri, Mar 14, 2:08 PM
Unknown Object (File)
Mon, Mar 10, 6:24 PM
Unknown Object (File)
Feb 17 2025, 5:28 PM

Details

Summary

This commit includes what is needed to get JH7110's Ethernet working

CSR clock's frequency should have been given on datasheets but it isn't there. The value is based on its frequency on vendor OS's debug data.

Changing the network speed is tested by switching between 1000M & 100M cables

Witness warns about lock order reversal when setting CSR clock's frequency. These locks are located in softclock code and clk.c. I think it's a real LOR and not a false positive but it only appears during the boot and when connecting an Ethernet cable so maybe we can live with it? I doubt it may be laborous to come up with an alternative way to set the clock frequency.

iperf3 test gives 719 Mbits/sec transfer rate (and has some retrys) while with vendor's OS it's closer to 1000 Mbits/sec. On basis of what I've heard this can be attributed to shortcomings of the eqos driver.

Port switching issue remains unsolved. Ethernet device does not change automatically when switching a cable to another port (as seen in netstat -r and arp -a) and even by using manual commands I've only managed to switch from eqos0 to eqos1 (and not the other way around). No matter how much I've put time in investigating this issue, I've not yet found the ultimate cause. Hopefully the problem can be fixed in a near future.

(When using kernel debug configurations it may take a minute or two that the connection appears when booting without a cable and connecting it afterwards. Without debug options it's much faster)

the style check has been applied to if_eqos.c

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

manu requested changes to this revision.Jun 16 2024, 6:44 AM

Thanks for this, could you split the style(9) changes in a other review please ?
Also please re-upload the diff with context (git format-patch -U999 will do the job).

This revision now requires changes to proceed.Jun 16 2024, 6:44 AM
sys/riscv/starfive/starfive_if_eqos.c
95

Part of this function should probably be made generic with part of eqos_fdt_init, some clocks are the same (and have the same fdt property name).

Now with the full context and without style corrections for if_eqos.c

I investigated the LOR (lock order reversal) issue further by temporarily removing eqos locks entirely. Witness still keeps on talking about eqos lock, so it possibly rightly notices a connection between the eqos driver and callout API linked to its operations. Debugger's lock commands were pretty uninformative but by using printfs I became pretty sure the lock really is located in softclock (callout API) code (while the other lock involved in LOR is in the generic clock code, clk.c).

sys/riscv/starfive/starfive_if_eqos.c
95

Property name checking could written as a generic code though Rockchip & JH7110 have little if any overlap there. There's now one shared clock name. In any case, there clearly is grounds to separate a generic FDT code at some point if we get more support for boards using Eqos driver but it probably would be an overkill at this moment.

Some of the issues you described are very similar to what I've encountered a couple of months ago when trying to run FreeBSD on a Orange Pi 3B, sometimes the interface would restart which would lead to an inconsistent state on the TX/RX ring descriptors on if_eqos, you can try to run with my patch too.

I hardware bug happened as well, as I've described in here:

Some of the issues you described are very similar to what I've encountered a couple of months ago when trying to run FreeBSD on a Orange Pi 3B, sometimes the interface would restart which would lead to an inconsistent state on the TX/RX ring descriptors on if_eqos, you can try to run with my patch too.

Thanks. Unfortunately those patches didn't help. I suspect there may be a bug somewhere in kernel's network configuring code or in a program like route and dhclient.

mhorne added a subscriber: mhorne.

Okay, I looked through the code again in detail and things seem good to me.

I took this version and made some small tweaks, mostly style/cosmetic. I split the mcommphy changes into a separate commit. Of course I also enabled the files in the build.

I do think there is an opportunity for refactoring the driver for reduced duplication/better structuring. Equally, we mentioned the possibility of merging this with dwc(4), which is probably the better future. I'm not in a position to properly evaluate this, for the moment.

Unless someone objects, I plan to just merge this in a day or two. It is worth more imperfect in the tree than rotting here in review.

sys/dev/eqos/if_eqos.c
459

In my testing this change is not required. A value of 1000 was too small, so I will split the difference and bump it to 5000 (5s).

sys/dev/mii/mcommphy.c
138–142

I did not chase down the possibilities, but I believe it can happen that only one of these flags is changed at a time, so I tweaked the check to be more permissive. At worst, we will run the adjustment again redundantly. This is not a fast path in any case.

sys/riscv/starfive/starfive_if_eqos.c
81

Because this runs in a non-sleepable context (callout, eqos mutex), it generates this LOR:

lock order reversal: (sleepable after non-sleepable)
 1st 0xffffffc003976070 eqos lock (network driver, sleep mutex) @ /usr/home/mitchell/dev/freebsd/sys/kern/kern_mutex.c:213
 2nd 0xffffffc0008e3dc0 Clock topology lock (Clock topology lock, sx) @ /usr/home/mitchell/dev/freebsd/sys/dev/clk/clk.c:1208
lock order network driver -> Clock topology lock attempted at:
#0 0xffffffc00038275c at witness_checkorder+0xa02
#1 0xffffffc000325aba at _sx_xlock+0x58
#2 0xffffffc0000ec10c at clk_set_freq+0x44
#3 0xffffffc00060086e at if_eqos_starfive_set_speed+0x78
#4 0xffffffc0005fee94 at eqos_miibus_statchg+0x13e
#5 0xffffffc000113f4e at miibus_statchg+0x50
#6 0xffffffc000114c60 at mii_phy_update+0x60
#7 0xffffffc000112878 at mcommphy_service+0x222
#8 0xffffffc0001138c6 at mii_tick+0x32
#9 0xffffffc00060045e at eqos_tick+0x68
#10 0xffffffc00033706e at $x+0
#11 0xffffffc000338520 at softclock_thread+0xaa
#12 0xffffffc0002d7dd8 at fork_exit+0x68
#13 0xffffffc0005ea4aa at fork_trampoline+0xa

We could cheat by writing the register via SYSCON although this would leave the clknode uninformed that its cached value is outdated.

This revision was not accepted when it landed; it landed in state Needs Review.Wed, Apr 2, 4:57 PM
This revision was automatically updated to reflect the committed changes.

(I tried to comment earlier but I had some hassle)

I'm fine with those code changes.

I noted that the endian header and both gpio headers in if_eqos_starfive.c are actually redundant.

I thought that eqos will be added to jh7110 by moving eqos lines from sys/conf/files.arm64 to sys/conf/files. It looks it's done like this with some other multi-arch Ethernet drivers. But maybe this is not important.

In case this mcommphy.c's building issue with amd64 eats your precious time, you can pass it to me.