
Eliminate redundant TLB invalidations in the arm64 pmap
Closed, Public

Authored by alc on Oct 19 2017, 5:42 AM.

Details

Summary

pmap_remove_l3() already invalidates any existing TLB entry for the address being unmapped. There is no need to repeat the invalidation in the caller.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

alc created this revision. Oct 19 2017, 5:42 AM
kib added a comment. Oct 19 2017, 5:19 PM

I agree that the change is functionally correct, but perhaps the intent was to coalesce the invalidations? I.e., either pmap_remove_l3() should stop doing the invalidation, or a variant of it that does not invalidate should be added and used there.

I mean that pmap_invalidate_page() puts barriers around the tlbi instruction, and the dsb+isb pair probably amounts to a full CPU stall, so the existing code tried to avoid paying that cost for each page.

alc added a comment. Oct 19 2017, 5:41 PM
In D12725#264159, @kib wrote:

I agree that the change is functionally correct, but perhaps the intent was to coalesce the invalidations? I.e., either pmap_remove_l3() should stop doing the invalidation, or a variant of it that does not invalidate should be added and used there.

Then, for correctness, you have to introduce the delayed invalidation machinery from amd64. (Or perform a TLB range invalidation operation whenever you change PV list locks of the addresses whose PTEs were destroyed since the last PV list lock change.)

kib added a comment. Oct 19 2017, 6:30 PM
In D12725#264164, @alc wrote:
In D12725#264159, @kib wrote:

I agree that the change is functionally correct, but perhaps the intent was to coalesce the invalidations? I.e., either pmap_remove_l3() should stop doing the invalidation, or a variant of it that does not invalidate should be added and used there.

Then, for correctness, you have to introduce the delayed invalidation machinery from amd64. (Or perform a TLB range invalidation operation whenever you change PV list locks of the addresses whose PTEs were destroyed since the last PV list lock change.)

You are right, of course. But I think that this is worth it as well (invalidate ranges on PV list unlocks).

alc added a comment. Oct 19 2017, 6:44 PM
In D12725#264182, @kib wrote:
In D12725#264164, @alc wrote:
In D12725#264159, @kib wrote:

I agree that the change is functionally correct, but perhaps the intent was to coalesce the invalidations? I.e., either pmap_remove_l3() should stop doing the invalidation, or a variant of it that does not invalidate should be added and used there.

Then, for correctness, you have to introduce the delayed invalidation machinery from amd64. (Or perform a TLB range invalidation operation whenever you change PV list locks of the addresses whose PTEs were destroyed since the last PV list lock change.)

You are right, of course. But I think that this is worth it as well (invalidate ranges on PV list unlocks).

It will work well for 4KB mappings of physical pages that come from superpage reservations and those cases where the vm_phys buddy allocator just happens to give you contiguous physical pages.

In any case, the pmap_invalidate_range() calls would have to be removed from pmap_remove() and "pushed down" inside of pmap_remove_l3() where we perform the change lock.

I think the architecture allows us to issue the TLB invalidation and only later execute the dsb & isb instructions, as long as we don't try to access the old mapping; however, we will need to check with ARM. We should also just invalidate all entries in pmap_invalidate_range() if the length is too long, although in this case it may be better to defer invalidation until after we have removed all the page table entries.

kib added a comment. Oct 20 2017, 10:13 AM

I think the architecture allows us to issue the TLB invalidation and only later execute the dsb & isb instructions, as long as we don't try to access the old mapping; however, we will need to check with ARM. We should also just invalidate all entries in pmap_invalidate_range() if the length is too long, although in this case it may be better to defer invalidation until after we have removed all the page table entries.

This would not help. As Alan pointed out, we need to ensure that no other CPU cached the translations at the time pv list unlock happens.

alc added a comment. Oct 21 2017, 5:47 PM

Here are two related observations:

  1. pmap_protect() could easily (and correctly) use pmap_invalidate_range(). Because pmap_protect() does not destroy PV entries, it does not face the above-mentioned issues with range invalidations in pmap_remove().
  2. I can't imagine that pmap_remove_l2() needs to do a pmap_invalidate_range(). Given the break-before-make nonsense that ARM requires for the creation of promoted mappings, there shouldn't be anything besides a single superpage mapping in the TLB when pmap_remove_l2() executes.
alc added a reviewer: markj. Jul 27 2018, 5:43 PM
alc added inline comments. Jun 16 2019, 9:43 PM
arm64/arm64/pmap.c
Line 4166 (On Diff #34136)

I think that the removal of this particular redundant call to pmap_invalidate_page() in pmap_ts_referenced() is non-controversial, so I'm going to go ahead and eliminate it.

It actually gets executed with some frequency, because we are making up for the lack of a functioning access bit by removing mappings.

This revision was not accepted when it landed; it landed in state Needs Review. Jun 17 2019, 1:58 AM
This revision was automatically updated to reflect the committed changes.
alc reopened this revision. Jun 19 2019, 5:58 PM
alc updated this revision to Diff 58799. Jun 19 2019, 6:13 PM

Introduce pmap_remove_l3_range() and use it in two places: (1) pmap_remove(), where it eliminates redundant TLB invalidations by pmap_remove() and pmap_remove_l3(), and (2) pmap_enter_l2(), where it may optimize the TLB invalidations by batching them.

markj accepted this revision. Jun 19 2019, 7:48 PM
markj added inline comments.
arm64/arm64/pmap.c
Line 2526 (On Diff #58799)

Perhaps also assert that sva and eva belong to the same 2MB virtual page?

This revision is now accepted and ready to land. Jun 19 2019, 7:48 PM
alc updated this revision to Diff 58822. Jun 20 2019, 2:22 AM

Add the requested assertion.

This revision now requires review to proceed. Jun 20 2019, 2:22 AM
alc marked an inline comment as done. Jun 20 2019, 2:22 AM
markj accepted this revision. Jun 20 2019, 2:24 AM
This revision is now accepted and ready to land. Jun 20 2019, 2:24 AM
This revision was automatically updated to reflect the committed changes.