Page MenuHomeFreeBSD

amd64 pmap: batch chunk removal in pmap_remove_pages
ClosedPublic

Authored by mjg on Sep 28 2019, 6:40 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 24, 10:13 PM
Unknown Object (File)
Sun, Dec 8, 12:25 PM
Unknown Object (File)
Tue, Dec 3, 12:36 AM
Unknown Object (File)
Nov 21 2024, 6:01 AM
Unknown Object (File)
Nov 12 2024, 9:24 PM
Unknown Object (File)
Oct 19 2024, 9:16 AM
Unknown Object (File)
Oct 10 2024, 7:39 AM
Unknown Object (File)
Oct 4 2024, 12:20 AM
Subscribers

Details

Summary

pv list lock is the main bottleneck during poudriere -j 104 and pmap_remove_pages is the most impactful consumer. It frees chunks with the lock held even though it plays no role in correctness. Moreover chunks are often freed in groups, sample counts during buildkernel (0-sized frees removed):

value  ------------- Distribution ------------- count
      0 |                                         0
      1 |                                         8
      2 |@@@@@@@                                  19329
      4 |@@@@@@@@@@@@@@@@@@@@@@                   58517
      8 |                                         1085
     16 |                                         71
     32 |@@@@@@@@@@                               24919
     64 |                                         899
    128 |                                         7
    256 |                                         2
    512 |                                         0
  1. batch freeing
  2. move it past unlocking pv list

I did not bother benchmarking this change in isolation, see another review for details.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/amd64/amd64/pmap.c
1111 ↗(On Diff #62690)

It should be called pv_chunklist.

Missing space after the comma.

6923 ↗(On Diff #62690)

Call it free_chunks?

7095 ↗(On Diff #62690)

Why does the pmap lock need to be held over free_pv_chunk_batch?

markj added inline comments.
sys/amd64/amd64/pmap.c
7095 ↗(On Diff #62690)

(It's because the chunks are still in the global LRU.)

  • chunklist -> pv_chunklist
  • free_chunk -> free_chunks
This revision is now accepted and ready to land.Sep 28 2019, 7:33 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
4295 ↗(On Diff #62694)

_<anything> is the implementation reseverd namespace (for file-scoped definitions). In recent times, we try to not abuse reserved namespace in kernel.

sys/amd64/amd64/pmap.c
4295 ↗(On Diff #62694)

What should I rename this to? free_pv_chunk_dequeued?

7095 ↗(On Diff #62690)

I can split this into dequeue in the current func and actual free after unlocking pmap.

sys/amd64/amd64/pmap.c
4295 ↗(On Diff #62694)

I am fine with this name.

This revision was automatically updated to reflect the committed changes.
sys/amd64/amd64/pmap.c
7095 ↗(On Diff #62690)

I can split this into dequeue in the current func and actual free after unlocking pmap.

We're tearing down the address space. (Note the assertion at the start of this function that no other threads are active within this pmap.) So, moving the actual free until after unlocking the pmap isn't going to have any consequential effect.

sys/amd64/amd64/pmap.c
7095 ↗(On Diff #62690)

We're tearing down the address space. (Note the assertion at the start of this function that no other threads are active within this pmap.) So, moving the actual free until after unlocking the pmap isn't going to have any consequential effect.

I agree in principle, but note that pmap_remove_pages() may still block the page daemon when it calls pmap_ts_referenced() on pages mapped into the pmap. This has systemic effects since the page daemon may be trying to reclaim pages. It can even cause problems during the background active queue scan, since the page daemon holds a page lock while calling pmap_ts_referenced(). This actually appears in practice when a large process forks frequently, since pmap_copy() will hold the pmap lock for a long time. It is even worse on 11.x, where the page daemon is also holding the active queue lock during scans and thus blocks concurrent page faults.

In this case though I suspect that the gain from freeing the pv chunk pages outside of the pmap lock is probably not significant enough to be worth doing.