Page MenuHomeFreeBSD

powerpc64: Implement Radix MMU for POWER9 CPUs
Needs ReviewPublic

Authored by jhibbits on Mar 9 2019, 3:45 AM.

Details

Summary

POWER9 supports two MMU formats: traditional hashed page tables, and Radix
page tables, similar to what's presesnt on most other architectures. The
PowerISA also specifies a process table -- a table of page table pointers--
which on the POWER9 is only available with the Radix MMU, so we can take
advantage of it with the Radix MMU driver.

Written by Matt Macy.

There might be more room for diff reduction, too. The goal of this is to get
input on overall design approach.

Test Plan

Boot with radix MMU enabled on POWER9, and buildworld as a basic
stress test. Boot with radix disabled on POWER9, and buildworld. Boot on
non-radix CPU and buildworld.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

jhibbits created this revision.Mar 9 2019, 3:45 AM
mmacy added a comment.Mar 9 2019, 4:47 AM

FWIW ZFS works fine on the POWER9 branch. However, there are a few things needed to make it play nice with HPT. Have you tracked them down?

Still looking into it, it's probably something in vmparam.h that needs to be switchable. The problem seen is init crashing with a bus error. I haven't debugged that further yet.

I'm really puzzled why ZFS isn't working, I'd expect some errors, but just get silence. No errors, no warnings, nothing. With this diff reverted and using HPT ZFS works just fine.

kib added inline comments.Mar 9 2019, 11:30 AM
sys/powerpc/aim/mmu_oea.c
1109

x86 pmaps' page_is_mapped() return false for unmanaged pages.

sys/powerpc/aim/mmu_radix.c
563

These should go into cpufunc.h and provide functional definitions ?

594

Why radix needs DI ? Matt told me that ISAv3.0 has broadcast TLB invalidation instructions.

647

Style: space around binary op, there and below.

699

Indent looks wrong.

720

Why this invalidation ? For whatever reasons it done (probably to emulate IA32 behaviour of flushing TLB on fault),, would it be enough to flush only the faulting address ?

747

Don't you need to set referenced bit for write faults as well, if not set yet ?

767

If startpte == newpte, does it still make sense to store newpte into *pte, with the follow-up ptesync ?

Also, note that you do not flush TLB for this case.

2946

aaa

4790

Does TLB tolerate simultaneous l3 and l4 entries for the same address ?

sys/powerpc/include/param.h
127

Why ppc needs so large kernel stacks ? Does it cause KVA fragmentation issues ?

sys/powerpc/include/vmparam.h
203

The comment should be moved to #else part of the ifdef below, I think.

sys/vm/vm_page.c
793 ↗(On Diff #54866)

This chunk should be committed separately.

kib added a reviewer: markj.Mar 9 2019, 11:31 AM
jhibbits marked 4 inline comments as done.Mar 9 2019, 7:48 PM

Hi @kib, thanks for your review. There's still a bit of work to do, but this definitely helps.

sys/powerpc/aim/mmu_oea.c
1109

This is literally just moved from a #define into a function here, no other change. If it needs fixed, that'll be separate.

sys/powerpc/aim/mmu_radix.c
563

We do already have powerpc_sync() and eieio(), but not tlbsync(). On top of that, these macros aren't even used anywhere in the code.

720

Yeah, I think it should be sufficient to just flush the faulting address.

sys/powerpc/include/param.h
127

No idea. I asked about this, and apparently recursion is much more likely with radix tables. I bumped it from 4 to 8 ~5 years ago because ZFS is a stack pig. Didn't think radix tables would also be a stack pig. @mmacy, any input on this?

sys/vm/vm_page.c
793 ↗(On Diff #54866)

The code that makes this necessary should really be fixed instead.

jhibbits edited the summary of this revision. (Show Details)Mar 12 2019, 3:40 AM
jhibbits marked an inline comment as done.Mar 12 2019, 3:43 AM
jhibbits updated this revision to Diff 54957.

Fix HPT. ZFS has been confirmed working, when loaded as a module.

jhibbits added inline comments.Mar 12 2019, 3:46 AM
sys/vm/vm_page.c
793 ↗(On Diff #54957)

This is obviously unnecessary, and will be reverted. A case of over-reverting the previous change.

bdragon added inline comments.Mar 12 2019, 10:57 PM
sys/powerpc/aim/mmu_radix.c
89

I think pmap_private.h is missing from the diff.

jhibbits updated this revision to Diff 55004.Mar 13 2019, 12:42 AM

Add pmap_private.h to the diff, fix more style issues in mmu_radix.c.