Page MenuHomeFreeBSD

powerpc64: Implement Radix MMU for POWER9 CPUs
ClosedPublic

Authored by jhibbits on Mar 9 2019, 3:45 AM.
Tags
None
Referenced Files
F106023858: D19516.id.diff
Tue, Dec 24, 2:00 AM
Unknown Object (File)
Sun, Dec 15, 10:08 AM
Unknown Object (File)
Sun, Dec 8, 3:26 PM
Unknown Object (File)
Sun, Dec 8, 9:54 AM
Unknown Object (File)
Nov 21 2024, 3:03 PM
Unknown Object (File)
Nov 19 2024, 3:42 AM
Unknown Object (File)
Nov 19 2024, 3:42 AM
Unknown Object (File)
Nov 19 2024, 3:42 AM

Details

Summary

POWER9 supports two MMU formats: traditional hashed page tables, and Radix
page tables, similar to what's presesnt on most other architectures. The
PowerISA also specifies a process table -- a table of page table pointers--
which on the POWER9 is only available with the Radix MMU, so we can take
advantage of it with the Radix MMU driver.

Written by Matt Macy.

There might be more room for diff reduction, too. The goal of this is to get
input on overall design approach.

Test Plan

Boot with radix MMU enabled on POWER9, and buildworld as a basic
stress test. Boot with radix disabled on POWER9, and buildworld. Boot on
non-radix CPU and buildworld.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 30995
Build 28702: arc lint + arc unit

Event Timeline

FWIW ZFS works fine on the POWER9 branch. However, there are a few things needed to make it play nice with HPT. Have you tracked them down?

Still looking into it, it's probably something in vmparam.h that needs to be switchable. The problem seen is init crashing with a bus error. I haven't debugged that further yet.

I'm really puzzled why ZFS isn't working, I'd expect some errors, but just get silence. No errors, no warnings, nothing. With this diff reverted and using HPT ZFS works just fine.

sys/powerpc/aim/mmu_oea.c
1112

x86 pmaps' page_is_mapped() return false for unmanaged pages.

sys/powerpc/aim/mmu_radix.c
563

These should go into cpufunc.h and provide functional definitions ?

594

Why radix needs DI ? Matt told me that ISAv3.0 has broadcast TLB invalidation instructions.

647

Style: space around binary op, there and below.

699

Indent looks wrong.

720

Why this invalidation ? For whatever reasons it done (probably to emulate IA32 behaviour of flushing TLB on fault),, would it be enough to flush only the faulting address ?

747

Don't you need to set referenced bit for write faults as well, if not set yet ?

767

If startpte == newpte, does it still make sense to store newpte into *pte, with the follow-up ptesync ?

Also, note that you do not flush TLB for this case.

2946

aaa

4790

Does TLB tolerate simultaneous l3 and l4 entries for the same address ?

sys/powerpc/include/param.h
129

Why ppc needs so large kernel stacks ? Does it cause KVA fragmentation issues ?

sys/powerpc/include/vmparam.h
202

The comment should be moved to #else part of the ifdef below, I think.

sys/vm/vm_page.c
793 ↗(On Diff #54866)

This chunk should be committed separately.

Hi @kib, thanks for your review. There's still a bit of work to do, but this definitely helps.

sys/powerpc/aim/mmu_oea.c
1112

This is literally just moved from a #define into a function here, no other change. If it needs fixed, that'll be separate.

sys/powerpc/aim/mmu_radix.c
563

We do already have powerpc_sync() and eieio(), but not tlbsync(). On top of that, these macros aren't even used anywhere in the code.

720

Yeah, I think it should be sufficient to just flush the faulting address.

sys/powerpc/include/param.h
129

No idea. I asked about this, and apparently recursion is much more likely with radix tables. I bumped it from 4 to 8 ~5 years ago because ZFS is a stack pig. Didn't think radix tables would also be a stack pig. @mmacy, any input on this?

sys/vm/vm_page.c
793 ↗(On Diff #54866)

The code that makes this necessary should really be fixed instead.

jhibbits marked an inline comment as done.

Fix HPT. ZFS has been confirmed working, when loaded as a module.

sys/vm/vm_page.c
793 ↗(On Diff #54957)

This is obviously unnecessary, and will be reverted. A case of over-reverting the previous change.

sys/powerpc/aim/mmu_radix.c
89

I think pmap_private.h is missing from the diff.

Add pmap_private.h to the diff, fix more style issues in mmu_radix.c.

Update diff track latest head changes.

So why is this not yet committed ? What input do you wait for ?

sys/powerpc/aim/mmu_radix.c
594

This is still unanswered.

In D19516#545217, @kib wrote:

So why is this not yet committed ? What input do you wait for ?

It's not committed yet because I felt it was too unstable. It still is too unstable for normal use, but as you can see, it kinda bitrotted over the last year, so it's taken a bit of work to update it. I think the current problem is TLB related, given the panics and errors we've seen (me, @bdragon, @pkubaj).

sys/powerpc/aim/mmu_radix.c
594

I'm not sure why. I know ISAv3.0 has broadcast TLB invalidation, which even dates back to the beginning. However, it may impact execution on other processors. Radix mode does have support for invalidating ranges, which we may want to use here instead. I'll need to look at this bit more.

sys/powerpc/include/spr.h
94

Probably should have a leading 0, and should probably be vaugely sorted.

135

SRR1_ISI_PP line is duplicated now that D23731 is in.

136

These should probably have a leading 0 for uniformity with the other sprs.

This revision was not accepted when it landed; it landed in state Needs Review.May 11 2020, 2:33 AM
This revision was automatically updated to reflect the committed changes.