Apparently the SRAT may contain multiple contiguous segments as separate
entries. For example:
SRAT: Found memory domain 0 addr 0x0 len 0x80000000: enabled
SRAT: Found memory domain 0 addr 0x100000000 len 0x5f70000000: enabled
SRAT: Found memory domain 1 addr 0x6070000000 len 0x3000000000: enabled
SRAT: Found memory domain 1 addr 0x9070000000 len 0x2000000000: enabled
SRAT: Found memory domain 1 addr 0xb070000000 len 0x800000000: enabled
SRAT: Found memory domain 1 addr 0xb870000000 len 0x400000000: enabled
Currently this results in multiple contiguous entries in the affinity
table built from the SRAT, because the SRAT parser assumes that SRAT
entries are already coalesced where possible.
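For illustration only, the affinity table built from the SRAT output
above would contain entries like the following.  The struct is a
user-space stand-in modeled on struct mem_affinity, not the kernel
definition, and the exclusive end addresses are derived from the
base/length pairs in the log:

#include <stdint.h>

/* Stand-in for a memory affinity table entry: [start, end) in domain. */
struct affinity_entry {
    uint64_t start;
    uint64_t end;
    int domain;
};

/*
 * Affinity table corresponding to the SRAT output above.  The four
 * domain 1 entries abut one another and could be coalesced into a
 * single entry covering [0x6070000000, 0xbc70000000), but the parser
 * keeps them separate.
 */
static const struct affinity_entry affinity_table[] = {
    { 0x0,          0x80000000,   0 },
    { 0x100000000,  0x6070000000, 0 },
    { 0x6070000000, 0x9070000000, 1 },
    { 0x9070000000, 0xb070000000, 1 },
    { 0xb070000000, 0xb870000000, 1 },
    { 0xb870000000, 0xbc70000000, 1 },
};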
vm_phys_early_startup() uses the affinity table to carve up
phys_avail[] so that each entry is contained in a single domain.
However, when the affinity table contains multiple contiguous entries,
this carving also produces multiple contiguous phys_avail[] entries.
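A rough user-space sketch of the effect: clipping one large available
range against the (uncoalesced) domain 1 affinity entries yields four
abutting phys_avail[]-style entries instead of one.  The clipping loop
is illustrative and not the actual vm_phys_early_startup() code:

#include <stdint.h>
#include <stdio.h>

/* The four uncoalesced domain 1 affinity entries from the example. */
static const uint64_t dom1[][2] = {
    { 0x6070000000, 0x9070000000 },
    { 0x9070000000, 0xb070000000 },
    { 0xb070000000, 0xb870000000 },
    { 0xb870000000, 0xbc70000000 },
};

/*
 * Clip an available range against each affinity entry, mimicking how
 * phys_avail[] is carved so that every entry lies within a single
 * affinity entry.  One contiguous range comes out as four entries.
 */
static void
carve(uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < sizeof(dom1) / sizeof(dom1[0]); i++) {
        uint64_t s = start > dom1[i][0] ? start : dom1[i][0];
        uint64_t e = end < dom1[i][1] ? end : dom1[i][1];

        if (s < e)
            printf("phys_avail: %#jx-%#jx\n",
                (uintmax_t)s, (uintmax_t)e);
    }
}

int
main(void)
{
    /* One contiguous available range spanning all of domain 1. */
    carve(0x6070000000, 0xbc70000000);
    return (0);
}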
Then, vm_phys segments are created from phys_avail[] entries. Since
r338431 contiguous segments are coalesced, so we end up with
vm_phys_segs[] entries that span multiple phys_avail[] entries.
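The segment construction can be modeled as below: adjacent
phys_avail[]-style pairs that abut are merged into one segment, which
is the coalescing behaviour added in r338431.  The struct is a
stand-in for struct vm_phys_seg, and domains are ignored for brevity:

#include <stdint.h>
#include <stddef.h>

/* Stand-in for struct vm_phys_seg; the domain is omitted for brevity. */
struct seg {
    uint64_t start;
    uint64_t end;
};

/*
 * Build segments from (start, end) pairs, extending the previous
 * segment whenever the next pair abuts it.  Feeding in the four
 * contiguous domain 1 phys_avail entries from the example yields a
 * single segment covering 0x6070000000-0xbc70000000.
 */
static size_t
build_segs(const uint64_t (*pairs)[2], size_t npairs, struct seg *segs)
{
    size_t nsegs = 0;

    for (size_t i = 0; i < npairs; i++) {
        if (nsegs > 0 && segs[nsegs - 1].end == pairs[i][0]) {
            segs[nsegs - 1].end = pairs[i][1];  /* coalesce */
        } else {
            segs[nsegs].start = pairs[i][0];
            segs[nsegs].end = pairs[i][1];
            nsegs++;
        }
    }
    return (nsegs);
}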
Finally, at the end of vm_page_startup() we add vm_pages to the
vm_phys freelists. We add a range for each vm_phys_segs[] entry for
which there is a covering phys_avail[] entry. However, the
fragmentation of phys_avail[] described above means that some segments
are never added to the vm_phys allocator, and as a result the system
leaves large amounts of RAM unused.
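The last step can be modeled as a covering check: a segment's pages
are only enqueued when a single phys_avail[]-style entry spans the
whole segment, so a coalesced segment that spans several entries fails
the check for every one of them and is silently skipped.  This is a
simplified model of the logic at the end of vm_page_startup(), not the
actual code:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return true if some (start, end) pair fully covers
 * [seg_start, seg_end).
 */
static bool
seg_covered(uint64_t seg_start, uint64_t seg_end,
    const uint64_t (*pairs)[2], size_t npairs)
{
    for (size_t i = 0; i < npairs; i++) {
        if (pairs[i][0] <= seg_start && seg_end <= pairs[i][1])
            return (true);
    }
    return (false);
}

int
main(void)
{
    /* Fragmented phys_avail[]-style entries for domain 1. */
    const uint64_t pairs[][2] = {
        { 0x6070000000, 0x9070000000 },
        { 0x9070000000, 0xb070000000 },
        { 0xb070000000, 0xb870000000 },
        { 0xb870000000, 0xbc70000000 },
    };

    /* Prints "covered: no": the coalesced segment spans four pairs. */
    printf("covered: %s\n",
        seg_covered(0x6070000000, 0xbc70000000, pairs, 4) ? "yes" : "no");
    return (0);
}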
Fix the problem by ensuring that contiguous entries in the memory
affinity table are coalesced. I think we could instead change
vm_page_startup() to call vm_phys_enqueue_contig() on all subranges
covered by a phys_avail[] entry, and that would solve the problem too.
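The coalescing pass amounts to something like the loop below, merging
an entry into its predecessor when the domain matches and the ranges
abut.  This is an illustrative sketch of the approach rather than the
committed change:

#include <stdint.h>
#include <stddef.h>

/* Stand-in for a memory affinity table entry. */
struct affinity_entry {
    uint64_t start;
    uint64_t end;
    int domain;
};

/*
 * Merge physically contiguous entries that belong to the same domain.
 * The table is assumed to be sorted by start address.  Returns the new
 * entry count; applied to the example table, this collapses the four
 * domain 1 entries into one.
 */
static size_t
coalesce_affinity(struct affinity_entry *tab, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        if (out > 0 && tab[out - 1].domain == tab[i].domain &&
            tab[out - 1].end == tab[i].start)
            tab[out - 1].end = tab[i].end;  /* extend previous entry */
        else
            tab[out++] = tab[i];
    }
    return (out);
}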