Page MenuHomeFreeBSD

powerpc64: Implement Radix MMU for POWER9 CPUs
ClosedPublic

Authored by jhibbits on Mar 9 2019, 3:45 AM.
Tags
None
Referenced Files
F108595238: D19516.id55004.diff
Sun, Jan 26, 6:24 PM
Unknown Object (File)
Sat, Jan 25, 8:02 PM
Unknown Object (File)
Sat, Jan 25, 7:15 PM
Unknown Object (File)
Fri, Jan 24, 7:05 PM
Unknown Object (File)
Sat, Jan 18, 5:46 PM
Unknown Object (File)
Fri, Jan 17, 3:42 PM
Unknown Object (File)
Fri, Jan 17, 1:42 PM
Unknown Object (File)
Wed, Jan 1, 3:06 PM

Details

Summary

POWER9 supports two MMU formats: traditional hashed page tables, and Radix
page tables, similar to what's presesnt on most other architectures. The
PowerISA also specifies a process table -- a table of page table pointers--
which on the POWER9 is only available with the Radix MMU, so we can take
advantage of it with the Radix MMU driver.

Written by Matt Macy.

There might be more room for diff reduction, too. The goal of this is to get
input on overall design approach.

Test Plan

Boot with radix MMU enabled on POWER9, and buildworld as a basic
stress test. Boot with radix disabled on POWER9, and buildworld. Boot on
non-radix CPU and buildworld.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 22971
Build 22049: arc lint + arc unit

Event Timeline

FWIW ZFS works fine on the POWER9 branch. However, there are a few things needed to make it play nice with HPT. Have you tracked them down?

Still looking into it, it's probably something in vmparam.h that needs to be switchable. The problem seen is init crashing with a bus error. I haven't debugged that further yet.

I'm really puzzled why ZFS isn't working, I'd expect some errors, but just get silence. No errors, no warnings, nothing. With this diff reverted and using HPT ZFS works just fine.

sys/powerpc/aim/mmu_oea.c
1109

x86 pmaps' page_is_mapped() return false for unmanaged pages.

sys/powerpc/aim/mmu_radix.c
562

These should go into cpufunc.h and provide functional definitions ?

593

Why radix needs DI ? Matt told me that ISAv3.0 has broadcast TLB invalidation instructions.

646

Style: space around binary op, there and below.

698

Indent looks wrong.

719

Why this invalidation ? For whatever reasons it done (probably to emulate IA32 behaviour of flushing TLB on fault),, would it be enough to flush only the faulting address ?

746

Don't you need to set referenced bit for write faults as well, if not set yet ?

766

If startpte == newpte, does it still make sense to store newpte into *pte, with the follow-up ptesync ?

Also, note that you do not flush TLB for this case.

2945

aaa

4789

Does TLB tolerate simultaneous l3 and l4 entries for the same address ?

sys/powerpc/include/param.h
127

Why ppc needs so large kernel stacks ? Does it cause KVA fragmentation issues ?

sys/powerpc/include/vmparam.h
203

The comment should be moved to #else part of the ifdef below, I think.

sys/vm/vm_page.c
793

This chunk should be committed separately.

Hi @kib, thanks for your review. There's still a bit of work to do, but this definitely helps.

sys/powerpc/aim/mmu_oea.c
1109

This is literally just moved from a #define into a function here, no other change. If it needs fixed, that'll be separate.

sys/powerpc/aim/mmu_radix.c
562

We do already have powerpc_sync() and eieio(), but not tlbsync(). On top of that, these macros aren't even used anywhere in the code.

719

Yeah, I think it should be sufficient to just flush the faulting address.

sys/powerpc/include/param.h
127

No idea. I asked about this, and apparently recursion is much more likely with radix tables. I bumped it from 4 to 8 ~5 years ago because ZFS is a stack pig. Didn't think radix tables would also be a stack pig. @mmacy, any input on this?

sys/vm/vm_page.c
793

The code that makes this necessary should really be fixed instead.

jhibbits marked an inline comment as done.

Fix HPT. ZFS has been confirmed working, when loaded as a module.

sys/vm/vm_page.c
791

This is obviously unnecessary, and will be reverted. A case of over-reverting the previous change.

sys/powerpc/aim/mmu_radix.c
89

I think pmap_private.h is missing from the diff.

Add pmap_private.h to the diff, fix more style issues in mmu_radix.c.

Update diff track latest head changes.

So why is this not yet committed ? What input do you wait for ?

sys/powerpc/aim/mmu_radix.c
593

This is still unanswered.

In D19516#545217, @kib wrote:

So why is this not yet committed ? What input do you wait for ?

It's not committed yet because I felt it was too unstable. It still is too unstable for normal use, but as you can see, it kinda bitrotted over the last year, so it's taken a bit of work to update it. I think the current problem is TLB related, given the panics and errors we've seen (me, @bdragon, @pkubaj).

sys/powerpc/aim/mmu_radix.c
593

I'm not sure why. I know ISAv3.0 has broadcast TLB invalidation, which even dates back to the beginning. However, it may impact execution on other processors. Radix mode does have support for invalidating ranges, which we may want to use here instead. I'll need to look at this bit more.

sys/powerpc/include/spr.h
94

Probably should have a leading 0, and should probably be vaugely sorted.

124

SRR1_ISI_PP line is duplicated now that D23731 is in.

125

These should probably have a leading 0 for uniformity with the other sprs.

This revision was not accepted when it landed; it landed in state Needs Review.May 11 2020, 2:33 AM
This revision was automatically updated to reflect the committed changes.