
vm_page_startup(): Save memory in more cases on VM_PHYSSEG_DENSE
Needs Review · Public

Authored by olce on Jan 23 2025, 5:16 PM.
Tags
None

Details

Reviewers
markj
dougm
alc
Summary

The main trick to save memory in the VM_PHYSSEG_DENSE case is to try to
allocate the VM page array (the 'struct vm_page' array representing all
VM pages, named 'vm_page_array') at one of the boundaries of the single
physical address span, so that the array does not have to contain
'struct vm_page' entries corresponding to its own storage. On amd64,
this saves at least 0.064% of the total amount of physical memory (e.g.,
~44.3MB on a 64GB machine). See the code comments for more details.
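
For illustration, a quick back-of-the-envelope check of that figure (a standalone sketch; the 104-byte sizeof(struct vm_page) and 4096-byte page size are assumptions typical of amd64, not values taken from the patch):

  #include <stdio.h>

  int
  main(void)
  {
      const double page_size = 4096.0;      /* assumed PAGE_SIZE */
      const double vm_page_sz = 104.0;      /* assumed sizeof(struct vm_page) */
      const double phys = 64.0 * 1024 * 1024 * 1024;   /* 64GB machine */

      /* Fraction of physical memory consumed by the vm_page array. */
      double array_frac = vm_page_sz / page_size;
      /* Fraction saved by not describing the array's own storage. */
      double saved_frac = array_frac * array_frac;

      printf("array overhead: %.3f%%\n", array_frac * 100.0);
      printf("saved: %.4f%% (~%.1f MB)\n", saved_frac * 100.0,
          saved_frac * phys / 1e6);
      return (0);
  }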

Prior to this commit, the trick was applied only if there was enough
memory available in the last chunk (highest addresses) and if
PMAP_HAS_PAGE_ARRAY was not defined. PMAP_HAS_PAGE_ARRAY is
a per-architecture knob that lets the pmap itself build the
'struct vm_page' array, with the aim of allocating the pages backing
this array in the same NUMA domain as the pages they describe. By
construction, if there are multiple domains (at least 3, or only 2 but
with the first and last physical chunks belonging to the same one), the
trick cannot be applied as is, because pages of the array are then
allocated in different physical chunks corresponding to the different
domains. The architectures currently defining/implementing
PMAP_HAS_PAGE_ARRAY are amd64 and powerpc, so in practice the
space-saving trick has not been applied on them since the introduction
of PMAP_HAS_PAGE_ARRAY.
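
To make the dichotomy concrete, a simplified standalone sketch (placeholder stubs, not the real vm_page_startup() code): with PMAP_HAS_PAGE_ARRAY defined the array comes from the pmap, and the change described below additionally lets the single-domain case fall back to the generic boundary allocation:

  #include <stddef.h>
  #include <stdio.h>

  #define PMAP_HAS_PAGE_ARRAY           /* as on amd64 and powerpc */

  static int vm_ndomains = 1;           /* placeholder for the kernel global */

  /* Placeholder: pmap-driven, per-NUMA-domain allocation of the array. */
  static void
  pmap_page_array_startup(size_t npages)
  {
      printf("pmap maps %zu vm_page entries per domain\n", npages);
  }

  /* Placeholder: generic allocation at a boundary of the physical span. */
  static void
  vm_page_array_alloc(size_t npages)
  {
      printf("generic code places %zu entries at a span boundary\n", npages);
  }

  int
  main(void)
  {
      size_t npages = (16UL << 30) / 4096;      /* e.g., 16GB of RAM */

  #ifdef PMAP_HAS_PAGE_ARRAY
      if (vm_ndomains > 1)
          pmap_page_array_startup(npages);
      else
  #endif
          vm_page_array_alloc(npages);
      return (0);
  }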

This commit introduces the following enhancements:

  1. Even if PMAP_HAS_PAGE_ARRAY is defined, the trick is applied when there is only one VM domain, in which case pmap_page_array_startup() is not called.
  2. Early memory allocations (those performed before the allocation of the VM page array) now systematically happen at the boundaries of the available physical memory, as this reduces the final span of physical memory and thus the size of the VM page array (a sizeof(struct vm_page)/PAGE_SIZE fraction of the memory removed from the single span by this change is saved; e.g., around 2.5% (104/4096) of it on amd64).
  3. The VM page array can be allocated at the start of the physical memory span instead of the end, increasing the chances of the trick working (although probably not on amd64, where the first chunk generally seems too small to contain the VM page array). As before, the VM page array is allocated at once, so it must fit entirely in either the first or the last chunk for the trick to apply (this could be improved); see the sketch after this list.
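
A minimal sketch of the placement policy in item 3 (toy 'chunk' representation and made-up addresses only; the real code operates on the physical segments known to the early allocator):

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  struct chunk {
      uint64_t start;      /* physical address range, in bytes */
      uint64_t end;
  };

  static bool
  fits(const struct chunk *c, uint64_t bytes)
  {
      return (c->end - c->start >= bytes);
  }

  int
  main(void)
  {
      /* Hypothetical first and last chunks of the physical span. */
      struct chunk first = { 0x1000, 0x9f000 };
      struct chunk last = { 0x100000000ULL, 0x1080000000ULL };
      /* Array size for ~32GB of RAM, assuming 104-byte vm_page entries. */
      uint64_t array_bytes = 104ULL * ((32ULL << 30) / 4096);

      if (fits(&first, array_bytes))
          printf("place vm_page_array at the start of the first chunk\n");
      else if (fits(&last, array_bytes))
          printf("place vm_page_array at the end of the last chunk\n");
      else
          printf("array fits in neither boundary chunk; no saving\n");
      return (0);
  }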

While here, simplify vm_page_startup() by moving all the VM page array
allocation logic into vm_page_array_alloc(), and by moving the
allocation of the dump pages' bitmap before that of the witness pages,
which allows witness pages to be registered for dumping right after they
have been allocated (and removes the need to keep some witness state in
function-scope variables).
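
The new ordering, roughly (placeholder functions standing in for the steps named above; only the sequencing is the point):

  #include <stdio.h>

  static void alloc_dump_bitmap(void)      { printf("dump bitmap allocated\n"); }
  static void alloc_witness_pages(void)    { printf("witness pages allocated\n"); }
  static void dump_add_witness_pages(void) { printf("witness pages registered for dump\n"); }
  static void vm_page_array_alloc(void)    { printf("vm_page array allocated\n"); }

  int
  main(void)
  {
      alloc_dump_bitmap();          /* now done before the witness allocation */
      alloc_witness_pages();
      dump_add_witness_pages();     /* can be registered immediately */
      vm_page_array_alloc();        /* all array allocation logic lives here */
      return (0);
  }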

While here, as pmap_page_array_startup() is now called only when there
are at least two domains (i.e., the machine really is NUMA), replace the
now-dead code in moea64_page_array_startup() with an assertion on the
number of domains.
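
The shape of that moea64 change (illustrative fragment only, not self-contained; the exact assertion message in the patch may differ):

      KASSERT(vm_ndomains > 1,
          ("%s: unexpectedly called with a single domain", __func__));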

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 61939
Build 58823: arc lint + arc unit

Event Timeline

olce requested review of this revision. Jan 23 2025, 5:16 PM

Update of the diff's base.

I'm not sure which commit causes problems, but when booting a KASAN kernel on your branch in bhyve, there is a page fault exception early during boot. Did you test KASAN/KMSAN kernels in your branch?

sys/vm/vm_page.c:615

Or just (void)vm_phys_early_alloc(PAGE_SIZE);?

> I'm not sure which commit causes problems, but when booting a KASAN kernel on your branch in bhyve, there is a page fault exception early during boot. Did you test KASAN/KMSAN kernels in your branch?

Ah, no, sorry. Going to test with KASAN and check that I can reproduce, and will take it from there.

Getting an infinite loop at boot in VirtualBox. I have an idea why this is the case, going to quickly check (and hopefully fix).

> I'm not sure which commit causes problems, but when booting a KASAN kernel on your branch in bhyve, there is a page fault exception early during boot. Did you test KASAN/KMSAN kernels in your branch?

Well, the problem isn't what I had first thought it could be. Bisection indicates it's this revision that is causing the problem (and, in particular, not the commit changing default and specific alignments).

Re-reading the change, I noticed that the interface "change" I made to pmap_page_array_startup() (it is no longer called when vm_ndomains is 1) isn't currently compatible with the fact that, on amd64, pmap_page_array_startup() also calls pmap_kmsan_page_array_startup() for KMSAN; that's a first problem.

Actually, disabling the optimization by forcibly making the new code here call pmap_page_array_startup() unconditionally, as before, fixes the KASAN kernel boot. (I removed the call to pmap_kmsan_page_array_startup() there, as I wasn't sure if there was some KASAN -> KMSAN dependency.) So there is clearly some kind of interaction between allocating PDEs to back the page array in pmap_page_array_startup() and KASAN. Before I delve into KASAN, would you have any hint on what could be going on?
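
To make the first problem concrete, a toy model (placeholders only, not the amd64 code): the sanitizer setup for the page array is reached only from within pmap_page_array_startup(), so a single-domain path that skips that function also skips the setup:

  #include <stdbool.h>
  #include <stdio.h>

  static bool page_array_shadowed;

  static void
  pmap_kmsan_page_array_startup(void)
  {
      page_array_shadowed = true;    /* shadow/metadata mapping for the array */
  }

  static void
  pmap_page_array_startup(void)
  {
      /* ... per-domain mapping of the vm_page array ... */
      pmap_kmsan_page_array_startup();
  }

  static void
  vm_page_array_alloc(void)
  {
      /* Generic boundary allocation: no sanitizer hook runs here. */
  }

  int
  main(void)
  {
      int vm_ndomains = 1;           /* the problematic single-domain case */

      if (vm_ndomains > 1)
          pmap_page_array_startup();
      else
          vm_page_array_alloc();

      printf("page array %s\n", page_array_shadowed ?
          "shadowed" : "not shadowed: sanitizer accesses would fault");
      return (0);
  }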

I'm not sure offhand. The one detail that comes to mind is the fact that the vm_page array is not shadowed in KASAN kernels. This is reflected in kasan_md_unsupported(), which effectively assumes that the vm_page array is mapped at the bottom of the kernel map, starting at VM_MIN_KERNEL_ADDRESS. I can't really see why we'd panic though.
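
An approximate sketch of the check being described (reconstructed for illustration, not a verbatim copy of the amd64 header; identifiers such as vm_page_array_size are assumed from memory): everything below the end of the vm_page array, which is expected to start at VM_MIN_KERNEL_ADDRESS, is treated as unsupported (not shadowed):

  static inline bool
  kasan_md_unsupported(vm_offset_t addr)
  {
      vm_offset_t kernmin;

      /*
       * The vm_page array is assumed to sit at the bottom of the kernel
       * map; accesses to it are not validated.
       */
      kernmin = vm_page_array == NULL ? VM_MIN_KERNEL_ADDRESS :
          (vm_offset_t)(vm_page_array + vm_page_array_size);
      return (addr < kernmin || addr >= VM_MAX_KERNEL_ADDRESS);
  }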