
vm: Allow kstack pages to come from different domains
Needs Review (Public)

Authored by bnovkov on Sat, May 11, 10:34 AM.
Tags
None

Details

Reviewers
jhb
markj
alc
kib
Summary

vm_thread_stack_back currently requires that all kstack pages come from the same domain as the kstack KVA.
However, this can lead to an infinite loop if the kernel repeatedly tries to allocate pages from a depleted domain.

This patch reworks kstack page allocation so that kstack pages can be allocated from, and handled across, multiple NUMA domains.
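For illustration, here is a minimal sketch (not the actual diff) of the kind of per-page loop this implies: each page walks the allowed domains independently, so a single stack may end up backed by pages from several domains. The function name is hypothetical, and busying, zeroing, guard pages, and the mapping of the pages into the kstack KVA are all omitted.

```
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/domainset.h>

#include <vm/vm.h>
#include <vm/vm_param.h>
#include <vm/vm_domainset.h>
#include <vm/vm_page.h>

/* Hypothetical helper: back npages of a kstack, one domain query per page. */
static int
kstack_alloc_pages_sketch(vm_page_t ma[], int npages)
{
	struct vm_domainset_iter di;
	int domain, i, req;

	for (i = 0; i < npages; i++) {
		req = VM_ALLOC_WIRED;
		/* Walk the domains permitted by the current thread's policy. */
		vm_domainset_iter_page_init(&di, NULL, 0, &domain, &req);
		do {
			ma[i] = vm_page_alloc_noobj_domain(domain, req);
			if (ma[i] != NULL)
				break;
		} while (vm_domainset_iter_page(&di, NULL, &domain) == 0);
		if (ma[i] == NULL) {
			/* Every domain failed; undo partial progress. */
			while (--i >= 0) {
				vm_page_unwire_noq(ma[i]);
				vm_page_free(ma[i]);
			}
			return (ENOMEM);
		}
	}
	return (0);
}
```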

Test Plan

I've tested the fix using stress2 on a non-NUMA bhyve VM, with no issues so far.
I'm in the middle of reworking https://reviews.freebsd.org/D44567, so unfortunately I do not currently have a NUMA environment in which to test this patch.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

Simplify vm_thread_stack_back page allocation loop.

Remove redundant m == NULL check.

Why even support something like this?

Instead, if the pages can't all be allocated from one domain, the code could be patched to free whatever it did allocate and try a different one.

Per-CPU caching for thread stacks is mostly a waste of memory and could be reworked into something with significantly smaller granularity, which would also make such allocations less likely to fail. But then supporting mixed backing only complicates things.
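For comparison, here is a minimal sketch of the single-domain-at-a-time approach suggested above, assuming the same headers as the earlier sketch plus <vm/vm_phys.h> for vm_ndomains. Both helper names are hypothetical.

```
/* Hypothetical helper: try to back the whole stack from one given domain. */
static bool
kstack_try_domain_sketch(vm_page_t ma[], int npages, int domain)
{
	int i;

	for (i = 0; i < npages; i++) {
		ma[i] = vm_page_alloc_noobj_domain(domain, VM_ALLOC_WIRED);
		if (ma[i] == NULL) {
			/* This domain is depleted; release partial progress. */
			while (--i >= 0) {
				vm_page_unwire_noq(ma[i]);
				vm_page_free(ma[i]);
			}
			return (false);
		}
	}
	return (true);
}

/* Try each domain in turn until one can supply all of the stack's pages. */
static int
kstack_alloc_single_domain_sketch(vm_page_t ma[], int npages)
{
	int domain;

	for (domain = 0; domain < vm_ndomains; domain++)
		if (kstack_try_domain_sketch(ma, npages, domain))
			return (0);
	return (ENOMEM);
}
```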

In D45163#1030231, @mjg wrote:

Why even support something like this?

Instead, if the pages can't all be allocated from one domain, the code could be patched to free whatever it did allocate and try a different one.

Well, if there is sufficient pressure in the original domain, this code will do the same thing.
It will, however, try to minimize the number of pages allocated from different domains, which makes more sense imo.

The main goal here is to fall back to other domains when there's memory pressure.
Using DOMAINSET_PREF (which is how kstack pages were allocated before D38852 landed) will provide that fallback and preserve existing behavior when the system is not under memory pressure.
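To illustrate that fallback, here is a sketch of a DOMAINSET_PREF()-seeded iterator: the preferred domain (taken here from the kstack KVA) is tried first, and the remaining domains are only walked when it is depleted. It assumes the same headers as the first sketch plus <sys/malloc.h>; the function name and the kva_domain parameter are placeholders.

```
/* Hypothetical helper: prefer the kstack KVA's domain, fall back otherwise. */
static vm_page_t
kstack_page_alloc_pref_sketch(int kva_domain)
{
	struct vm_domainset_iter di;
	vm_page_t m;
	int domain, flags;

	flags = M_NOWAIT;
	vm_domainset_iter_policy_init(&di, DOMAINSET_PREF(kva_domain),
	    &domain, &flags);
	do {
		m = vm_page_alloc_noobj_domain(domain, VM_ALLOC_WIRED);
		if (m != NULL)
			break;
	} while (vm_domainset_iter_policy(&di, &domain) == 0);
	return (m);
}
```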

What if you reverse the order of allocation: first find a domain that can provide enough physical pages to back the stack, and then try to allocate the KVA from the corresponding arena?
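A minimal sketch of that reversed ordering, reusing the hypothetical kstack_try_domain_sketch() helper from the earlier sketch: secure all of the physical pages from some domain first, then carve the kstack KVA out of that domain's arena and map the pages. vmd_kernel_arena stands in for whichever per-domain arena the kstack code actually uses, guard pages are omitted, and the extra headers assumed are <sys/vmem.h>, <sys/malloc.h>, <vm/pmap.h>, and <vm/vm_pagequeue.h>.

```
/*
 * Hypothetical helper: allocate physical pages first, then KVA from the
 * arena of the same domain.  Returns 0 on failure.
 */
static vm_offset_t
kstack_alloc_reversed_sketch(vm_page_t ma[], int npages)
{
	vmem_addr_t addr;
	int domain, i;

	for (domain = 0; domain < vm_ndomains; domain++) {
		if (!kstack_try_domain_sketch(ma, npages, domain))
			continue;
		/* Pages secured; now try KVA from the matching arena. */
		if (vmem_alloc(VM_DOMAIN(domain)->vmd_kernel_arena,
		    ptoa(npages), M_BESTFIT | M_NOWAIT, &addr) == 0) {
			pmap_qenter((vm_offset_t)addr, ma, npages);
			return ((vm_offset_t)addr);
		}
		/* No KVA in this domain; release the pages and move on. */
		for (i = 0; i < npages; i++) {
			vm_page_unwire_noq(ma[i]);
			vm_page_free(ma[i]);
		}
	}
	return (0);
}
```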