
arm64: Implement pmap_map_io_transient
Accepted · Public

Authored by andrew on Tue, Nov 19, 5:02 PM.
Details

Reviewers
alc
markj
kib
manu
Group Reviewers
arm64
Summary

This needs to map the memory when it is not covered by the DMAP region.

Sponsored by: Arm Ltd
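
To make the summary concrete, here is a minimal sketch of the shape such an implementation takes, modeled on the existing amd64 pmap_map_io_transient(); it is not the committed diff, and the use of pmap_qenter() for the non-DMAP case is an assumption drawn from the discussion below.

    void
    pmap_map_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count,
        boolean_t can_fault)
    {
            vm_paddr_t paddr;
            boolean_t needs_mapping;
            int error, i;

            /*
             * Allocate any needed KVA in a separate loop so that
             * vmem_alloc() is never called while the thread is pinned.
             */
            needs_mapping = FALSE;
            for (i = 0; i < count; i++) {
                    paddr = VM_PAGE_TO_PHYS(page[i]);
                    if (__predict_false(!PHYS_IN_DMAP(paddr))) {
                            error = vmem_alloc(kernel_arena, PAGE_SIZE,
                                M_BESTFIT | M_WAITOK, &vaddr[i]);
                            KASSERT(error == 0,
                                ("vmem_alloc failed: %d", error));
                            needs_mapping = TRUE;
                    } else {
                            /* Already reachable through the DMAP. */
                            vaddr[i] = PHYS_TO_DMAP(paddr);
                    }
            }
            if (!needs_mapping)
                    return;

            /* Pin so the mapping is only ever used on this CPU. */
            if (!can_fault)
                    sched_pin();
            for (i = 0; i < count; i++) {
                    paddr = VM_PAGE_TO_PHYS(page[i]);
                    if (!PHYS_IN_DMAP(paddr))
                            pmap_qenter(vaddr[i], &page[i], 1);
            }
    }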

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

This revision is now accepted and ready to land. (Wed, Nov 20, 6:15 AM)
alc added inline comments.
sys/arm64/arm64/pmap.c
9301

As a reminder, please add an XXX comment here saying that the implementation is still incomplete: When !can_fault, we don't need to perform a broadcast TLB invalidation, only a local one.

sys/arm64/arm64/pmap.c
9301

... otherwise, the above pinning can be dropped. It's only needed for mappings that perform a local-only invalidation.
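
For illustration, the distinction being drawn is between the broadcast (inner-shareable, "is") and local forms of the AArch64 tlbi instruction; a hedged sketch, in which the TLBI_VA() VA-encoding helper is an assumption:

    /*
     * XXX: still incomplete: when !can_fault the mapping is only used
     * on the pinned CPU, so the broadcast invalidation done on teardown
     * could be replaced with a local-only one, roughly:
     */
    static __inline void
    pmap_invalidate_page_local(vm_offset_t va)
    {
            dsb(ishst);
            /* "vaae1" is the local form; "vaae1is" would broadcast. */
            __asm __volatile("tlbi vaae1, %0" : : "r" (TLBI_VA(va)));
            dsb(ish);
            isb();
    }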

sys/arm64/arm64/pmap.c
9301

But the implementation is formally correct after this patch, do you agree?

sys/arm64/arm64/pmap.c
9301

Yes, but the pinning is currently pointless.

sys/arm64/arm64/pmap.c
9301

Suppose pmap_qenter() adds a PTE, and then an unrelated pmap_enter() into the kernel map results in promotion of the PTE's L3 table. pmap_qremove() will just clear it, which seems wrong. Is there anything preventing this?
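
As a sketch of the interleaving being described (the timing and CPU assignment are illustrative):

    /*
     * CPU A: pmap_qenter(va, &m, 1)   -- installs an L3 PTE for va
     * CPU B: pmap_enter(kernel_pmap, va2, ...), va2 in the same 2 MiB
     *        region; the last invalid L3 entry becomes valid, so the
     *        pmap promotes the whole L3 table to a single L2 block
     * CPU A: pmap_qremove(va, 1)      -- expects an L3 PTE, but the
     *        lookup now finds an L2 block mapping
     */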

sys/arm64/arm64/pmap.c
9301

For the same situation on amd64, I tried to come up with an argument why it is impossible, but failed. Of course, it is highly unlikely, but still.

It does not seem possible to do a trick with some unused PTE bit, since there are no unused bits, nor unused combinations such as wired-but-not-managed.

It seems that the best way (just for transient I/O mappings, not for general qenter/qremove) is to allocate three pages of KVA for a single-page mapping, leaving the frames before and after it unmapped as guards.
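
A minimal sketch of that guard-page scheme, keeping the amd64-style vmem usage; the placement inside the allocation loop is an assumption:

    vm_offset_t kva;
    int error;

    /*
     * Allocate three pages of KVA and map only the middle one; at least
     * one unmapped guard always lands in the same superpage-sized region
     * as the mapped page, so that region can never become fully valid
     * and can therefore never be promoted.
     */
    error = vmem_alloc(kernel_arena, 3 * PAGE_SIZE, M_BESTFIT | M_WAITOK,
        &kva);
    KASSERT(error == 0, ("vmem_alloc failed: %d", error));
    vaddr[i] = kva + PAGE_SIZE;
    pmap_qenter(vaddr[i], &page[i], 1);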

sys/arm64/arm64/pmap.c
9301

Guard pages should work... or should we simply use pmap_enter() here? I'm not sure if it is safe to lock the kernel pmap in this context.

sys/arm64/arm64/pmap.c
9301

pmap_enter() would change the semantics. For instance, we would get spurious PGA_WRITEABLE tracking.

sys/arm64/arm64/pmap.c
9301

Yes. It stems from the way that we currently manage the kernel virtual address space. Ignoring the faultable submaps within the kernel virtual address space for the moment, the rest of the pmap_enter() calls within the kernel virtual address space happen on virtual addresses from subarenas that import superpage-sized and -aligned address ranges. So, the virtual address that we allocate here can't be close enough to one of those pmap_enter() calls for it to be caught up in a promotion. That said, rather than relying on the good behavior of the rest of the kernel, it would be better engineering to allocate here from a subarena, not the kernel_arena, that makes this guarantee explicit.
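
A sketch of what such a dedicated subarena could look like; the arena name, the import helper, and its placement are assumptions:

    static vmem_t *transient_io_arena;

    /* Import KVA from kernel_arena in superpage-aligned, -sized chunks. */
    static int
    transient_io_import(void *arena, vmem_size_t size, int flags,
        vmem_addr_t *addrp)
    {
            return (vmem_xalloc(arena, size, L2_SIZE, 0, 0, VMEM_ADDR_MIN,
                VMEM_ADDR_MAX, flags, addrp));
    }

    /* At initialization time: */
    transient_io_arena = vmem_create("transient io", 0, 0, PAGE_SIZE, 0,
        M_WAITOK);
    vmem_set_import(transient_io_arena, transient_io_import, NULL,
        kernel_arena, L2_SIZE);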

sys/arm64/arm64/pmap.c
9301

But then there must be a gap between the subarenas that use pmap_enter() and our vmem allocations that are served by pmap_qenter(). We do not provide such guards, I believe.

sys/arm64/arm64/pmap.c
9301

No: the imports into the subarenas are superpage-aligned and a multiple of the superpage size, so the pmap_qenter() here can't fall within a superpage-sized and -aligned address range on which we do a pmap_enter().

sys/arm64/arm64/pmap.c
9301

If we don't want the entries created by pmap_qenter() to be promoted, we could add ATTR_SW_NO_PROMOTE.
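
A sketch of that suggestion; the surrounding attribute computation in pmap_qenter() is elided, and it assumes the promotion path skips PTEs carrying the bit:

    /* In pmap_qenter(): mark the PTE so promotion leaves it alone. */
    attr |= ATTR_SW_NO_PROMOTE;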

markj added inline comments.
sys/arm64/arm64/pmap.c
9301

Ah, so it would be a bug to "optimize" KVA allocation by collapsing vm_dom[0].vmd_kernel_arena into kernel_arena on systems with only a single NUMA domain, which is a thought I've had once or twice.

sys/arm64/arm64/pmap.c
9301

While there would be no harm in having pmap_qenter() set ATTR_SW_NO_PROMOTE, this issue is a concern on most if not all architectures that implement superpage promotion, so ultimately I'd rather see it handled uniformly at the machine-independent layer. The pmap_pte_exists() that is eventually called by pmap_qremove() already has an assertion, similar to the one @kib added on amd64, that reports an unexpected superpage.
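
A condensed sketch of that existing check; the exact call site and surrounding code vary by branch:

    /*
     * pmap_kremove(), reached from pmap_qremove(), demands that the
     * lookup find a last-level (L3) entry; pmap_pte_exists() panics
     * (naming the caller) if it instead finds, e.g., an L2 block.
     */
    pte = pmap_pte_exists(kernel_pmap, va, 3, __func__);
    pmap_clear(pte);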

sys/arm64/arm64/pmap.c
9301

@markj, yes. That is why I think we would be wise to make this function allocate from a different arena, one that is itself set up to do superpage-sized and -aligned imports. Could we do this with the existing transient I/O arena?

sys/arm64/arm64/pmap.c
9301

A bit of modification would be needed in order to use transient_arena here:

  • we need to make sure that the KVA region imported into the arena is superpage-aligned and -sized;
  • we need to initialize bio_transient_maxcnt unconditionally; currently the arena is only created when unmapped I/O is enabled, but unmapped I/O can be disabled administratively (and we disable it by default in sanitizer kernels).

I think that it's probably still the right direction to go. I'll work on a patch to implement that.
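
A sketch of what those two bullets could look like in the KVA setup code; the rounding, the MAXPHYS-based sizing, and the placement are assumptions:

    /* Size and align the transient arena's KVA to superpage boundaries. */
    size = roundup2((vm_size_t)bio_transient_maxcnt * MAXPHYS, L2_SIZE);
    firstaddr = roundup2(firstaddr, L2_SIZE);
    /* Create the arena even when unmapped I/O is administratively off. */
    transient_arena = vmem_create("transient arena", firstaddr, size,
        PAGE_SIZE, 0, M_WAITOK);
    firstaddr += size;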