
[uma-multipage 3/3] uma: grow slabs to enforce minimum memory efficiency
ClosedPublic

Authored by rlibby on Jan 17 2020, 5:53 PM.
Tags
None
Referenced Files
F103994345: D23239.id67797.diff
Mon, Dec 2, 3:54 AM

Details

Summary

Memory efficiency can be poor with awkward item sizes (e.g. slightly
more than half a page, or slightly more than one page). To enforce a
minimum memory efficiency, select a slab size with a potentially larger
number of pages when doing so wastes a smaller fraction of the slab.

This may mean using page_alloc instead of uma_small_alloc, which could
be more costly.

This may need some perf evaluation.
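
The size selection described in the summary can be illustrated with a small userland sketch. This is not the code from the diff: the page size, the 80% target, the 8-page cap, and the names slab_eff_pct and pick_slab_pages are all assumptions made for illustration, and slab header overhead is ignored.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES	4096u	/* assumed page size */
#define MIN_EFF_PCT	80u	/* hypothetical efficiency target */
#define MAX_SLAB_PAGES	8u	/* hypothetical cap on slab growth */

/* Percent of a slab of `pages` pages occupied by items of `size` bytes. */
static unsigned
slab_eff_pct(size_t size, unsigned pages)
{
	size_t slabsize = (size_t)pages * PAGE_SIZE_BYTES;
	size_t ipers = slabsize / size;		/* items per slab */

	return ((unsigned)(ipers * size * 100 / slabsize));
}

/*
 * Start from the fewest pages that hold one item and grow the slab
 * until the target is met or the cap is reached.  Returns the page
 * count with the best efficiency seen.
 */
static unsigned
pick_slab_pages(size_t size)
{
	unsigned pages, best, eff, best_eff;

	pages = (unsigned)((size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES);
	best = pages;
	best_eff = slab_eff_pct(size, pages);
	for (; pages <= MAX_SLAB_PAGES && best_eff < MIN_EFF_PCT; pages++) {
		eff = slab_eff_pct(size, pages);
		if (eff > best_eff) {
			best = pages;
			best_eff = eff;
		}
	}
	return (best);
}
```

Under these assumptions, an item of half a page plus 16 bytes (2064 bytes) is about 50% efficient in a one-page slab and reaches the target at three pages, while a 256-byte item stays at one page.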

Test Plan

kyua test -k /usr/tests/sys/audit/Kyuafile

vali% sysctl vm.uma | grep eff | sort -k 2 -n -r | tail
vm.uma.32_Bucket.keg.efficiency: 93
vm.uma.256.keg.efficiency: 93
vm.uma.pipe.keg.efficiency: 92
vm.uma.VMSPACE.keg.efficiency: 92
vm.uma.UMA_Zones.keg.efficiency: 92
vm.uma.g_bio.keg.efficiency: 91
vm.uma.Mountpoints.keg.efficiency: 91
vm.uma.AIOCB.keg.efficiency: 91
vm.uma.vmem.keg.efficiency: 90
vm.uma.mbuf_jumbo_9k.keg.efficiency: 75

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 28740
Build 26753: arc lint + arc unit

Event Timeline

sys/vm/uma_core.c
1913

Oops, I should mask off INTERNAL here, in case we get here with an INTERNAL format from CACHEONLY (even though I don't think we will).

For the benefit of other readers: Ryan and I have discussed attempting a contiguous allocation with DMAP KVA first and then falling back to a sparse allocation backed by regularly allocated KVA. We could do this in the UMA multi-page allocator or natively in kmem_*. This would have the advantage that pmap_kextract() is very cheap, and we use it to find the slab and also the domain for NUMA. It would have the disadvantage that it may increase memory fragmentation. If we take this approach, we may want to use only power-of-two allocation sizes for 'ppera' so that we don't negatively impact physical memory fragmentation.
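
As a rough illustration of the power-of-two restriction mentioned in the comment above, the helper below rounds a slab page count up to the next power of two. The name ppera_round_pow2 is hypothetical and does not appear in the diff; the sketch assumes a small, nonzero page count.

```c
/*
 * Round a page count up to the next power of two.
 * Assumes 1 <= n <= 2^31 so the shift cannot overflow.
 */
static unsigned
ppera_round_pow2(unsigned n)
{
	unsigned p = 1;

	while (p < n)
		p <<= 1;
	return (p);
}
```

The trade-off is that rounding can give back some of the efficiency gained by growing the slab: a three-page slab becomes four pages, and a five-page slab becomes eight.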

sys/vm/uma_core.c
1879

I would like to see if there is a better way to encapsulate this loop to make the logic easier to understand. I do not object to the computational cost of the iteration, although I think we could calculate without iteration in some cases. I just think there are a few too many local variables and conditions to easily parse the behavior here.

I am really happy to see these changes, as they will be especially helpful for vnodes, which waste many megabytes of space.
I concur with Jeff that the new code is difficult to follow, though I do not have any suggestions on how to simplify it.

Try to clean up size selection loop. Does this seem more clear, or
still obscure?

The logic is easier for me to follow.

Fix CTR/KASSERT after dropping some local variables

This revision was not accepted when it landed; it landed in state Needs Review.
Feb 4 2020, 10:40 PM
This revision was automatically updated to reflect the committed changes.