Differential D19337

Improve vmem tuning for platforms without a direct map.
Closed, Public

Authored by markj on Feb 24 2019, 11:03 PM.

Details

Reviewers

alc
kib
jeff

Commits

rS344550: Improve vmem tuning for platforms without a direct map.

Summary

r327899 added per-domain vmem arenas which import superpage-sized and
-aligned chunks from the global kernel_arena. On platforms without a
direct map, this increases the maximum number of boundary tags required
by vmem_bt_alloc(). In particular, BT_MAXALLOC is 4, so vmem_bt_alloc()
may require up to 8 tags since it allocates from a per-domain arena,
which may import from kernel_arena, which may import from kernel_map.

vmem reserves items in the boundary tag UMA zone to handle the recursion
in vmem_bt_alloc(), but with the above-mentioned change this reservation
may be insufficient. Consider a system with 2 CPUs; we will reserve 6
tags using the old calculation, which is insufficient. Increase the
reservation so that vmem_bt_alloc() is more likely to succeed.[*]
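
The arithmetic behind the 2-CPU example can be sketched as follows. Only BT_MAXALLOC = 4, the 8-tag worst case, and the 6-tag figure come from the summary; both reservation formulas are assumptions. The "old" one is picked because it reproduces the 6 tags quoted for 2 CPUs, and the "new" one simply covers the worst case on every CPU. All three function names are hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/* Value quoted in the summary. */
#define	BT_MAXALLOC	4

/*
 * ASSUMPTION: stand-in for the pre-change reservation formula. It is
 * chosen only because it yields the 6 tags quoted for a 2-CPU system.
 */
int
bt_reserve_old(int ncpus)
{
	assert(ncpus > 0);
	return (BT_MAXALLOC * (ncpus + 1) / 2);
}

/*
 * Worst case for a single vmem_bt_alloc() call as described in the
 * summary: BT_MAXALLOC tags for the per-domain arena allocation plus
 * BT_MAXALLOC more if that arena has to import from kernel_arena.
 */
int
bt_worst_case_per_alloc(void)
{
	return (2 * BT_MAXALLOC);
}

/*
 * ASSUMPTION: one possible enlarged reservation, sized so that every
 * CPU can hit the worst case at the same time.
 */
int
bt_reserve_new(int ncpus)
{
	assert(ncpus > 0);
	return (bt_worst_case_per_alloc() * ncpus);
}
```

With ncpus = 2, the assumed old formula reserves 6 tags, fewer than the 8 that a single worst-case allocation may need, while the assumed new formula reserves 16.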

Also reduce KVA_QUANTUM on systems where VM_NRESERVLEVEL is 0. On
systems with limited KVA, this import size is too large. On a 32-bit
powerpc system, I saw a case where per-domain kernel arena allocations
were failing with 30MB free in kernel_arena but no 4MB chunks available.
An import size of 1MB (assuming a 4KB page size) seems more reasonable.
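
As a rough check of the import sizes mentioned above, the quantum can be modelled as a power-of-two number of pages. The orders 10 and 8 used in the note below are assumptions, chosen so that a 4KB page (shift of 12) reproduces the 4MB and 1MB figures; kva_quantum_bytes() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/*
 * Size in bytes of an import quantum made of (1 << order) pages,
 * where each page is (1 << page_shift) bytes.
 */
unsigned long
kva_quantum_bytes(unsigned int order, unsigned int page_shift)
{
	/* Keep the shift within range for a 32-bit unsigned long. */
	assert(order + page_shift < 32);
	return (1UL << (order + page_shift));
}
```

Under these assumptions, kva_quantum_bytes(10, 12) gives 4MB and kva_quantum_bytes(8, 12) gives 1MB. A smaller quantum makes an import from a fragmented kernel_arena more likely to find a suitably sized and aligned free range.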

(*) The bug described in PR 235747 can cause reserved tags to be leaked,
causing subsequent failures to allocate KVA in vmem_bt_alloc(). I think
this bug is harder to solve, and I think the change in this diff is
required regardless.

Test Plan

Justin reported a number of hangs on a 64-bit powerpc system without
UMA_MD_SMALL_ALLOC that appear to be solved by this diff.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision. Feb 24 2019, 11:03 PM

Harbormaster completed remote builds in B22699: Diff 54320. Feb 24 2019, 11:03 PM

markj edited the summary of this revision. Feb 24 2019, 11:03 PM

markj edited the summary of this revision.

markj edited the test plan for this revision.

markj added reviewers: alc, kib, jeff.

markj added a subscriber: jhibbits.

We probably shouldn't enable NUMA on 32 bit platforms. It doesn't make a ton of sense.

I have no objections to this patch however.

In D19337#413816, @jeff wrote:

We probably shouldn't enable NUMA on 32 bit platforms. It doesn't make a ton of sense.

That on its own wouldn't fix the problem addressed by this patch. But, is there any reason we can't just have vm_dom[0].vmd_kernel_arena point to kernel_arena when vm_ndomains == 1?

@jeff this wasn't just on a 32-bit system; I experienced the same problem on a 64-bit PowerPC Book-E system, which currently does not support UMA_MD_SMALL_ALLOC (I tried turning it on with @markj's suggestion, and it failed quite spectacularly, so more work is needed on that front). So this helps more than just 32-bit platforms.

kib accepted this revision. Feb 25 2019, 7:21 AM

This revision is now accepted and ready to land. Feb 25 2019, 7:21 AM

markj edited the test plan for this revision. Feb 25 2019, 3:13 PM

alc added inline comments. Feb 25 2019, 4:26 PM

sys/kern/subr_vmem.c
Line 690 (on Diff #54320): Is there really any point to calling uma_prealloc() here?

markj added inline comments. Feb 25 2019, 4:39 PM

sys/kern/subr_vmem.c
Line 690 (on Diff #54320): I suspect it's necessary. Note that the prealloc is done before we set the keg's allocf to vmem_bt_alloc(). I believe this means that the allocation will be from UMA's reserve of boot pages, so boundary tags aren't needed in order to allocate the initial slab. That is, since vmem_bt_alloc() needs boundary tags in order to allocate boundary tags, something needs to bootstrap the initial allocation. I might be misunderstanding something here though.