Page MenuHomeFreeBSD

Avoid waiting on physical allocations that can't possibly be satisfied
ClosedPublic

Authored by jah on Nov 21 2023, 11:45 PM.
Tags
None
Referenced Files
F103206851: D42706.id130488.diff
Fri, Nov 22, 6:04 AM
F103206847: D42706.id130894.diff
Fri, Nov 22, 6:04 AM
F103206846: D42706.id130494.diff
Fri, Nov 22, 6:04 AM
Unknown Object (File)
Thu, Nov 21, 1:45 AM
Unknown Object (File)
Fri, Nov 15, 8:47 PM
Unknown Object (File)
Fri, Nov 15, 7:43 PM
Unknown Object (File)
Fri, Nov 15, 8:41 AM
Unknown Object (File)
Wed, Nov 13, 4:21 AM

Details

Summary
  • Change vm_page_reclaim_contig[_domain] to return an errno instead of a boolean. 0 indicates a successful reclaim, ENOMEM indicates lack of available memory to reclaim, with any other error (currently only ERANGE) indicating that reclamation is impossible for the specified address range. Change all callers to only follow up with vm_page_wait* in the ENOMEM case.
  • Introduce vm_domainset_iter_ignore(), which marks the specified domain as unavailable for further use by the iterator. Use this function to ignore domains that can't possibly satisfy a physical allocation request. Since WAITOK allocations run the iterators repeatedly, this avoids the possibility of infinitely spinning in domain iteration if no available domain can satisfy the allocation request.

PR: 274252
Reported by: kevans
Tested by: kevans

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jah requested review of this revision.Nov 21 2023, 11:45 PM
sys/vm/vm_kern.c
418

This is naive in the sense that vm_phys_alloc_contig() effectively does the equivalent of vm_phys_find_range(), but there's currently no means of indicating an out-of-range allocation request to the caller. That would require extra plumbing in the higher-level allocators, and I'm not sure if the added complexity would be worth the trouble given that we would probably expect most allocations to succeed before reaching this point. I therefore decided to start off with what seemed like the simplest implementation.

One minor thing: I would check explicitly for ERANGE, instead of != ENOMEM, and assert that we get the error status we understand, i.e. 0/ENOMEM/ERANGE.

BTW, if we have a request that could be satisfied by contig allocation from adjacent domains, currently we cannot do it?

sys/vm/vm_page.c
3091

Can we assign to m_runs only after this test is passed, so that failing this test can just 'return', rather than going to 'done' for cleanup?

Avoid allocation in the ERANGE case, assert that return status is ENOMEM if not 0/ERANGE.

sys/vm/vm_page.c
3065–3071

Nitpick - these lines don't need to be relocated.

3116

Perhaps we can avoid the initial call to vm_phys_find_range with:

prev_segind = start_segind;
while ((segind = vm_phys_find_range(... prev_segind ...) != -1) {

 while () {}
prev_segind = segind + 1;

}
if (prev_segind == start_segind) {

 ret = ERANGE;
goto done;

}

sys/vm/vm_page.c
3091

It's a bit of hair-splitting, but I actually just moved the computation of m_runs after the check of vm_phys_find_range(), but of course the diff tool doesn't know that. In any case I'll probably just revert this part per your suggestion below.

3116

Good idea, to be honest I don't know why I didn't do it this way to begin with.

Eliminate extraneous call to vm_phys_find_range()

This seems good to me, my comments are just nitpicking.

sys/compat/linuxkpi/common/src/linux_page.c
102

style(9) doesn't require all local variables to be declared at the beginning of this function anymore. This function in particular does not follow that convention, so it's a bit strange to declare err here.

sys/kern/uipc_shm.c
882

The indentation here looks wrong.

sys/vm/vm_domainset.c
225

I suggest annotating this with __predict_false().

329

Should we first assert that the domain is currently set?

sys/vm/vm_kern.c
236

There's no need for parentheses around the function call.

sys/vm/vm_page.c
2190

The duplication of this pattern suggests to me that we might want to introduce yet another domainset iterator, this time for physical memory allocations, which takes all of the constraints variables and handles checking for impossible allocations. Then there would be no need to have a public vm_domainset_iter_ignore(). I'm not sure this is a good way to go, though, especially if we need to add new fields to vm_domainset_iter or introduce a new "contig iterator" type.

More generally I wonder if it might be sensible to put all of these parameters into a struct and pass that around instead, but that's not really in the scope of this diff.

3119

Rather than initializing this value once at the beginning of the function, I'd initialize it to false right before the loop. It doesn't affect correctness, but there's no need to preserve phys_range_exists's value across iterations of the outer loop.

sys/vm/vm_domainset.c
329

Hmm, that certainly wouldn't break anything I'm doing here, but I'm not sure that calling vm_domainset_iter_ignore() on an already-cleared (or already-ignored) domain would inherently indicate a programming error either. Sort of like calling free(NULL).

sys/vm/vm_page.c
2190

I think for now I'd prefer to keep it the way I have it. I like the orthogonality of having a generic way to skip individual iterator elements, on the off chance that we may want to use vm_domainset_iter_ignore() for something different in the future.

sys/vm/vm_domainset.c
329

While the semantics for free(NULL) are sensible given how much free() is used in cleanup paths, this function is used only in quite specific situations where it IMHO makes more sense to be strict. If a specific need arises, the restriction can always be removed. But I do not insist.

sys/vm/vm_page.c
2190

Fair enough.

Apply code review feedback from markj

This revision is now accepted and ready to land.Dec 4 2023, 6:18 PM