Page MenuHomeFreeBSD

vm: reduce work of reclamation search
ClosedPublic

Authored by dougm on May 11 2023, 6:46 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Apr 14, 12:04 PM
Unknown Object (File)
Sat, Apr 13, 1:10 PM
Unknown Object (File)
Sat, Apr 13, 2:43 AM
Unknown Object (File)
Mar 8 2024, 4:02 PM
Unknown Object (File)
Mar 8 2024, 4:02 PM
Unknown Object (File)
Mar 8 2024, 4:02 PM
Unknown Object (File)
Mar 8 2024, 4:02 PM
Unknown Object (File)
Mar 8 2024, 4:01 PM

Details

Summary

Calling vm_phys_scan_contig in a loop in vm_page_reclaim_contig_domain_ext duplicates the work of searching for the appropriate segment needlessly. Break up vm_phys_scan_contig into pieces and invoke them with an awareness of the march through segments, to avoid that extra computation.

Test Plan

Can build and run kernel.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

dougm requested review of this revision.May 11 2023, 6:46 AM
dougm created this revision.
dougm edited the test plan for this revision. (Show Details)

Fix compilation errors.

Steal a bit from a markj patch.

sys/vm/vm_page.c
3003
3040

max() and min() truncate their parameters to 32 bits. MAX()and MIN() do not.

3050

Does this function perhaps belong more naturally to the vm_phys layer, since it operates on vm_phys segments?

dougm marked 2 inline comments as done.Jun 4 2023, 5:09 PM
dougm added inline comments.
sys/vm/vm_page.c
3050

I assume this refers to vm_page_found_range.

Perhaps. But this vm_page.c already includes several references to vm_phys segments, so I assumed that using it here was okay. I will consider moving the vm_phys segments code to vm_phys.c.

Apply markj's easy suggestions.

Move some phys stuff back into vm_phys.c. Make the binary search for the segment containing an address more generally used.

Have vm_phys_found_range return -1 for failure, so that vm_page_reclaim_contig_domain_ext need not access vm_phys_nsegs.

Inline the binary search. Include _vm_phys.h in vm_phys.h to make that work.

sys/vm/vm_page.c
3104

A couple of these lines appear to be longer than 80 columns.

sys/vm/vm_phys.c
905

I believe this code is assuming that pa is either within the bounds of a segment, or past the end of the highest segment, but I don't see why that's true. In general the vm_phys segments of a given system will have holes in between each other.

sys/vm/vm_phys.h
45

Style: please include a newline after includes.

dougm marked 4 inline comments as done.

Fix style problems. Resolve a merge conflict.

sys/vm/vm_phys.c
905

If pa is before the start of the first segment, then segind is 0 and pa < seg->start, so we return NULL. If pa is between the first and second segments, then segind is 1, and pa < seg->start, so we return NULL. And so on.

sys/vm/vm_phys.c
1258

This is indented by one space too many.

1261

The segment can be rejected due to domain before the above arithmetic is performed.

1265

Otherwise, you are making the processor execute another conditional branch.

dougm marked 3 inline comments as done.

Fix whitespace. Tighten up loop in vm_phys_found_range.

Increment segend before second and subsequent calls to vm_phys_found_range.

Define vm_phys_paddr_lookup to do the binary search based on 'end', without promising to check 'start', and vm_phys_paddr_to_seg, which checks 'start' too and returns a vm_phys_seg*. Use the latter in vm_phys_paddr_to_vm_page, and also in _pa_to_pmdp in arm64/pmap.c.

reduce binary search code size by 16 bytes.

sys/vm/vm_phys.c
1242

We don't use past tense in related function names. "find" would be more in keeping with those names.

1255–1256

Check the domain first.

sys/vm/vm_phys.h
142

You can verify the segind before computing seg,

dougm marked 3 inline comments as done.

Rename function. Reorder statements, as suggested.

The arm64 code for the binary search is

dd4: 90000008      adrp    x8, 0x0 <vm_phys_paddr_to_vm_page+0x8>
dd8: 2a1f03e9      mov     w9, wzr
ddc: b940010a      ldr     w10, [x8]
de0: 90000008      adrp    x8, 0x0 <vm_phys_paddr_to_vm_page+0x14>
de4: 91000108      add     x8, x8, #0
de8: 3400018a      cbz     w10, 0xe18 <vm_phys_paddr_to_vm_page+0x4c>
dec: 5280070b      mov     w11, #56
df0: 2a0a03ec      mov     w12, w10
df4: 4b09018d      sub     w13, w12, w9
df8: 0b4d052d      add     w13, w9, w13, lsr #1
dfc: 9bab21ae      umaddl  x14, w13, w11, x8
e00: f94005ce      ldr     x14, [x14, #8]
e04: eb0001df      cmp     x14, x0
e08: 1a8d8529      csinc   w9, w9, w13, hi
e0c: 1a8c81ac      csel    w12, w13, w12, hi
e10: 6b09019f      cmp     w12, w9
e14: 54ffff01      b.ne    0xdf4 <vm_phys_paddr_to_vm_page+0x28>

The inner loop for the current code also consists of 9 instructions

df0: 9100e129      add     x9, x9, #56
df4: f1000508      subs    x8, x8, #1
df8: 540001c0      b.eq    0xe30 <vm_phys_paddr_to_vm_page+0x64>
dfc: f85f012a      ldur    x10, [x9, #-16]
e00: eb00015f      cmp     x10, x0
e04: 54ffff68      b.hi    0xdf0 <vm_phys_paddr_to_vm_page+0x24>
e08: f85f812b      ldur    x11, [x9, #-8]
e0c: eb00017f      cmp     x11, x0
e10: 54ffff09      b.ls    0xdf0 <vm_phys_paddr_to_vm_page+0x24>
sys/vm/vm_phys.h
83–84

Move this up, so as not to introduce another unsorted declaration.

115

I would suggest vm_phys_lookup_segind.

117

Use u_int.

121

The compiler generates the same code for this as it would for lo != hi.

dougm marked 4 inline comments as done.

Accept all suggestions.

Change from "== NULL" to "!= NULL" checks in a couple of functions to match what's already there and optimize the common case.

The change looks correct to me.

sys/vm/vm_phys.c
1248–1251

I would either permit domain == -1 to mean that the caller doesn't care to specify a particular NUMA domain, or assert that domain >= 0 && domain < vm_ndomain. Probably the latter is fine for now.

sys/vm/vm_phys.h
133

I have access to an Ampere Altra with 256GB of RAM. Its vm_phys_seg array looks like this: https://reviews.freebsd.org/P578

In particular, almost all of the physical memory belongs to segment 4. I wonder if it would be possible and profitable to precompute an initial "mid"-point closer to the likely result, so as to reduce the average number of iterations. That is, start by looking at the segment in the middle of the physical address space as determined by the number of pages rather than the number of segments.

This revision is now accepted and ready to land.Jun 15 2023, 8:52 PM
This revision was automatically updated to reflect the committed changes.