Page MenuHomeFreeBSD

Simpllfy vm_reserv_reclaim_config

Authored by dougm on May 16 2019, 7:58 AM.



Simplify the search for free space in the reservation by finding the beginnings and ends of free ranges using xor. Use arithmetic, instead of iteration, to find addresses that satisfy alignment and boundary conditions.

The means of finding ranges of free pages was changed for vm_reserv_break in r348484, and there was found to improve performance minutely and reduce code size. This change applies a similar change to vm_reserv_reclaim_config, expecting similar benefits. This change also allows quick rejection of page ranges that are unsuitable on account of alignment or boundary issues, where those issues are processed a page at a time in the current implementation.

Tested by: pho

Test Plan

Peter Holm has reported that a 5-hour test of this patch revealed no problems.

I defined a sysctl to create reservations with every 17th page allocated, and then to check for the availability of 16 pages on a 16 page boundary. Ten samples without this change:

cycles/break: 2769630
cycles/break: 2586584
cycles/break: 2585703
cycles/break: 2587762
cycles/break: 2585641
cycles/break: 2587068
cycles/break: 2586093
cycles/break: 2585840
cycles/break: 2587580
cycles/break: 2586152

Ten samples after this change:
cycles/break: 88031
cycles/break: 88288
cycles/break: 88507
cycles/break: 88295
cycles/break: 88107
cycles/break: 88079
cycles/break: 88303
cycles/break: 88384
cycles/break: 88356

The patch that implements the sysctl is here.

Diff Detail

rS FreeBSD src repository - subversion
Lint Not Applicable
Tests Not Applicable

Event Timeline

dougm retitled this revision from Simplfy vm_reserv_reclaim_config to Simpllfy vm_reserv_reclaim_config.May 16 2019, 4:21 PM
dougm edited the summary of this revision. (Show Details)
dougm added reviewers: markj, kib, alc.
dougm added a subscriber: pho.

Add comments akin to those in vm_reserv_break.

With D20274.57978.diff I get

cc -c -O2 -pipe -fno-strict-aliasing  -g -nostdinc  -I. -I../../.. -I../../../contrib/ck/include -I../../../contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h    -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -MD  -MF.depend.vm_reserv.o -MTvm_reserv.o -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member  -mno-aes -mno-avx  -std=iso9899:1999 -Werror  ../../../vm/vm_reserv.c
../../../vm/vm_reserv.c:1288:59: error: operator '<<' has lower precedence than '-'; '-' will be evaluated first [-Werror,-Wshift-op-parentheses]
        changes = rv->popmap[i] | (1UL << (low_index % NBPOPMAP) - 1);
                                       ~~ ~~~~~~~~~~~~~~~~~~~~~~~^~~
../../../vm/vm_reserv.c:1288:59: note: place parentheses around the '-' expression to silence this warning
        changes = rv->popmap[i] | (1UL << (low_index % NBPOPMAP) - 1);
                                          (                         )
1 error generated.

I ran test for 5 hours on D20274.57979.diff.
The test included a buildworld.
No problems seen.

dougm edited the test plan for this revision. (Show Details)
dougm added reviewers: alc, kib, markj.
1278 ↗(On Diff #57979)

You can just use bool in new code.

1313 ↗(On Diff #57979)

I can't see anything that prevents lo from being adjusted beyond hi to satisfy alignment and boundary constraints. Then, when we check (hi - lo) * PAGE_SIZE >= size below, a negative LHS would be promoted to an unsigned paddr_t, in which case we'd return true.

dougm marked 2 inline comments as done.
dougm edited the test plan for this revision. (Show Details)

Address reviewer concerns. Update performance test results.

1323 ↗(On Diff #58205)

The return value here and below can use the C99 true and false.

downcase some boolean literals.

1313 ↗(On Diff #57979)

I think there's still a problem here in that lo is not guaranteed to be an index into the current reservation after an adjustment.

1313 ↗(On Diff #57979)

I can see lo==NBPOPMAP happening. For lo to get any higher, I think you'd need a value of alignment or boundary that's too big for reservations to support. In any case, I'd say that hi-1 is guaranteed to be an index into the reservation, and if lo < hi, we're okay. If not, we're not guiding anyone to use lo to index anything.

Unless you are worried about lo wrapping around to 0. Which I briefly considered, but decided was impossible. If I'm wrong, I could add checks to make sure that the changes to lo were not decreases.

1313 ↗(On Diff #57979)

I don't see anything preventing a request for, say, a 2MB allocation with 16MB alignment. Granted, that would be a bit odd, but vm_reserv_reclaim_contig() excludes requests only based on their size, not their alignment or boundary constraints.

1313 ↗(On Diff #57979)

So in that case, we compute pa rounded up to a multiple of 16MB, and shift that down by 12, so that we're rounding lo up to a multiple of 4k, and its definitely bigger than hi now. But it hasn't overflowed, and lo * PAGE_SIZE isn't going to overflow, and the test will fail, and we'll set lo to hi and keep iterating.

Other than that I haven't optimized the 16MB alignment case, I'm still not seeing what's wrong.

1313 ↗(On Diff #57979)

My concern is that we're computing VM_PAGE_TO_PHYS(&rv->pages[lo]) without having verified that lo is a valid index. If the reservation corresponds to a chunk near the end of physical memory, I think we could index beyond the end of the vm_page array and get a garbage pa.

dougm edited the test plan for this revision. (Show Details)

If an alignment adjustment leads to a 'lo' index value beyond the upper bounds of the pages array, abandon the reservation immediately.

Update performace results.

Abort the last iteration. I used the wrong constant.

dougm edited the test plan for this revision. (Show Details)

Give up on a reservation if we're about to look at pages[VM_LEVEL_0_NPAGES] or beyond.

1391 ↗(On Diff #58215)

We don't use low_index except as a parameter to vm_reserv_test_contig(). Perhaps lift its computation into the subroutine?

1391 ↗(On Diff #58215)

In case I misinterpret "lift" here, I'll consider both possible meanings.

If I want to calculate low_index in vm_reserv_test_contig(), then I have to pass additional arguments low and pa, which wouldn't be used for anything else.

If I want to avoid passing low_index to vm_reserv_test_contig, then I can pass 'i' and 'changes', or 'i' and something with value low_index%NBPOPMAP that is only used to compute an initial value of 'changes'.

Either way, it means an additional argument to vm_reserv_test_contig, and I don't see the value of it. But if you have a preference, tell me.

1391 ↗(On Diff #58215)

Maybe you want me to drop the variable low_index and just pass the value
(pa < low) ? ((low + PAGE_MASK - pa) >> PAGE_SHIFT) : 0
as a parameter to vm_reserv_test_contig?

This looks ok to me.

1391 ↗(On Diff #58215)

I meant calculating low_index in vm_reserv_test_contig(). pa could simply be re-fetched in vm_reserv_test_contig(); VM_PAGE_TO_PHYS(m) is just m->phys_addr. (And we could avoid accessing rv->pages[VM_LEVEL_0_NPAGES - 1] entirely by using VM_PAGE_TO_PHYS(&rv->pages[0]) + VM_LEVEL_0_SIZE instead.)

I still slightly prefer having low_index be calculated in vm_reserv_test_contig() since a low parameter has a more obvious meaning than low_index, but I don't insist on it.

This revision is now accepted and ready to land.Jun 4 2019, 4:02 AM
dougm marked an inline comment as done.

Make the arguments to vm_reserv_test_contig look as much as possible like the arguments to vm_reserv_reclaim_contig.

This revision now requires review to proceed.Jun 4 2019, 4:26 AM

Compute an upper bound less than NPOPMAP when high lies within the reservation, and use that instead of repeated comparing to high.

1344 ↗(On Diff #58220)

Is there a reason not to just write changes = rv->popmap[n] | (-1UL << bits_left)?

1344 ↗(On Diff #58220)

The case n==NPOPMAP.

This revision is now accepted and ready to land.Jun 4 2019, 9:20 PM
This revision was automatically updated to reflect the committed changes.