Page MenuHomeFreeBSD

Mali GPU pmap support
Needs ReviewPublic

Authored by br on Apr 12 2021, 5:20 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Apr 12, 2:03 PM
Unknown Object (File)
Feb 23 2024, 2:42 PM
Unknown Object (File)
Jan 20 2024, 10:34 AM
Unknown Object (File)
Jan 1 2024, 5:50 AM
Unknown Object (File)
Nov 23 2023, 7:30 AM
Unknown Object (File)
Nov 14 2023, 7:09 AM
Unknown Object (File)
Nov 8 2023, 9:07 PM
Unknown Object (File)
Nov 7 2023, 10:03 PM
Subscribers

Details

Reviewers
andrew
markj
manu
Group Reviewers
arm64
Summary

Add pmap routines for adding/removing Mali GPU page table entires.

This is needed for the Panfrost driver.

Mali GPU permissions are Stage2-ish, while it's the attributes that are Stage1-ish, but is really neither.

Test Plan

Tested with panfrost driver (wayland, firefox, glx apps etc).

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

br requested review of this revision.Apr 12 2021, 5:20 PM

no need to invalidate ptes just write-back

cast vm_offset_t to dcache_wb

Can you provide any pointers to docs or driver code that illustrates how the functions are used?

sys/arm64/arm64/pmap.c
3695

How generic is the implementation? Would it ever be used for anything besides the one GPU driver? I think it should have a more specific name if not, at least pmap_gpu_enter().

3708

Is it permitted to have pmap == kernel_pmap?

3710

We don't need the same for pa?

3776

Assert that va is page-aligned?

3788

How can this case happen?

Rename pmap_genter/gremove to pmap_gpu_enter/remove

Can you provide any pointers to docs or driver code that illustrates how the functions are used?

Usage is here:
https://github.com/CTSRD-CHERI/freebsd-morello/blob/drm_base/sys/dev/drm/panfrost/panfrost_mmu.c

sys/arm64/arm64/pmap.c
3695

This is Mali GPU only.

For SMMU we have senter/sremove but they differ a lot to gpu_enter/remove:
o Last level is L3_PAGE for SMMU, and L3_BLOCK for GPU
o Attributes different
o Cache invalidation is needed for GPU

3708

No, drm gem objects have their own pmaps

3710

We don't need this for VA or PA, so I replaced with kasserts

3776

Added!

3788

GEM objects has buffers that are mapped on demand (for tiling operations).
We don't track pages that are mapped, so when we need to remove all pages this case could happen.

sys/arm64/arm64/pmap.c
3695

I think we talked about this for the SMMU case as well, but to me this is a layering violation. If the conventions are specific to Mali, then this code should live in the Mali driver. This is how, e.g., the i915 GTT is implemented, though admittedly the page table format there is quite different.

Having driver-specific code here makes it difficult to review and refactor the pmap code. It effectively introduces a new flavour of pmap, but no pmap functions know about it, so we have no assertions to check for misuse.

I have not done much work on the pmap lately so I don't want to insist on a particular direction, especially since the changes in this diff are straightforward, but I would be interested in @andrew's thoughts on this.

3708

We should at least assert that this isn't the kernel pmap.

3788

Does the panfrost driver also permit processes to map regular pages into the GPU? I cannot see how from looking at the driver, but I remember that some DRM drivers have ioctls to do this.

3797

How do empty page table pages get freed?

Looking at the panfrost driver, I also don't see where the pmap is destroyed.

br marked an inline comment as done.Apr 14 2021, 4:09 PM
br added inline comments.
sys/arm64/arm64/pmap.c
3695

Happy to discuss and listen @andrew feedback, but moving these out of pmap.c will require to copy lots of macroses and functions from pmap or moving them to a header.
Also _pmap_alloc_l3() has to be non-static.

3708

Added

3788

No, only contig pages allocated from the driver or pages from imported objects (e.g. DRM CMA module) which also come from contig allocator

3797

the SMMU function for removing pages (pmap_sremove_pages()) work fine:
https://github.com/CTSRD-CHERI/freebsd-morello/commit/ab54e5a3b59dec3e59717ae918ede9ccb31026ae

I guess we need to rename it so it go both for SMMU and MALI

sys/arm64/arm64/pmap.c
3695

I'm not a fan of adding iommu functions to pmap.c. I have investigated supporting 16k pages, but the smmu code could make this difficult as the iommu and CPU page sizes may be different breaking the former.

It also means we are unable to use the iommu or Mali on 32-bit arm, however this may be less of an issue.

sys/arm64/arm64/pmap.c
3695

My unverified feeling is that it is really not a lot of work to implement driver-private routines to handle insertion and removal into GPU page tables. pmap_alloc_l3() is complex partly because it has to compute and maintain the pindex for page table pages, but in this case that is not required.

For instance, pmap_pti_* in the amd64 pmap implements most of the required machinery separate from the main pmap code, using several hundred lines of code.

br marked an inline comment as done.Apr 16 2021, 2:44 PM
br added inline comments.
sys/arm64/arm64/pmap.c
3695

@andrew advised just to copy functions needed to a new file (iommu_pmap.c)
I tried and it works fine:
https://github.com/CTSRD-CHERI/freebsd-morello/blob/drm_base/sys/arm64/iommu/iommu_pmap.c

I plan to clean it up a bit