
[uma-multipage 3/3] uma: grow slabs to enforce minimum memory efficiency
ClosedPublic

Authored by rlibby on Jan 17 2020, 5:53 PM.
Tags
None

Details

Summary

Memory efficiency can be poor with awkward item sizes (e.g. slightly
more than 1/2 page or slightly more than 1 page). To achieve a minimum
memory efficiency, select a slab size with a potentially larger number
of pages if it yields a lower proportion of waste.
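
The selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in sys/vm/uma_core.c: the function name slab_pages_for_efficiency and its parameters (min_eff_pct, max_pages) are hypothetical, and it ignores slab headers, alignment, and any other per-keg overhead that the real allocator may account for.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (assumption, not the UMA implementation).
 * Return the smallest slab size, in pages, whose efficiency reaches
 * min_eff_pct, where efficiency is the percentage of slab bytes that
 * are occupied by whole items.  If no size up to max_pages reaches
 * the target, return the most efficient size seen.  Return 0 if not
 * even one item fits in max_pages.
 */
static unsigned int
slab_pages_for_efficiency(size_t item_size, size_t page_size,
    unsigned int min_eff_pct, unsigned int max_pages)
{
	unsigned int best_eff, best_pages, eff, pages;
	size_t items, slab_size;

	assert(item_size > 0 && page_size > 0 && max_pages > 0);

	best_eff = 0;
	best_pages = 0;
	for (pages = 1; pages <= max_pages; pages++) {
		slab_size = (size_t)pages * page_size;
		items = slab_size / item_size;
		if (items == 0)
			continue;	/* Slab cannot hold a single item. */
		/* Integer percentage, rounded down. */
		eff = (unsigned int)(100 * items * item_size / slab_size);
		if (eff > best_eff) {
			best_eff = eff;
			best_pages = pages;
		}
		if (eff >= min_eff_pct)
			break;		/* Smallest slab meeting the target. */
	}
	return (best_pages);
}
```

Under these assumptions, a 2049-byte item with 4096-byte pages is 50% efficient in a one-page slab and first reaches 90% with a five-page slab, while a 256-byte item stays at one page.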

This may mean using page_alloc instead of uma_small_alloc, which could
be more costly.

This may need some perf evaluation.

Test Plan

kyua test -k /usr/tests/sys/audit/Kyuafile

vali% sysctl vm.uma | grep eff | sort -k 2 -n -r | tail
vm.uma.32_Bucket.keg.efficiency: 93
vm.uma.256.keg.efficiency: 93
vm.uma.pipe.keg.efficiency: 92
vm.uma.VMSPACE.keg.efficiency: 92
vm.uma.UMA_Zones.keg.efficiency: 92
vm.uma.g_bio.keg.efficiency: 91
vm.uma.Mountpoints.keg.efficiency: 91
vm.uma.AIOCB.keg.efficiency: 91
vm.uma.vmem.keg.efficiency: 90
vm.uma.mbuf_jumbo_9k.keg.efficiency: 75

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/vm/uma_core.c
1913 ↗(On Diff #66924)

Oops, I should mask off INTERNAL here, in case we get here with an INTERNAL format from CACHEONLY (even though I don't think we will).

For the benefit of other readers: Ryan and I have discussed attempting a contiguous allocation with DMAP KVA first and then falling back to a sparse allocation backed by regularly allocated KVA. We could do this in the UMA multi-page allocator or natively in kmem_*. This would have the advantage that pmap_kextract() is very cheap, and we use it to find the slab and also the domain for NUMA. It would have the disadvantage that it may increase memory fragmentation. If we take this approach, we may want to use only power-of-two allocation sizes for 'ppera' so that we don't negatively impact physical fragmentation.
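
As a rough illustration of the power-of-two restriction mentioned above, a candidate page count could be rounded up before allocating. The helper name round_pages_pow2 is hypothetical and is not part of UMA or kmem_*; this is only a sketch of the rounding step, not of the contiguous-then-sparse fallback itself.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper (assumption, not an existing kernel function).
 * Round a slab page count up to the next power of two, so that
 * physically contiguous requests come only in power-of-two sizes.
 */
static unsigned int
round_pages_pow2(unsigned int pages)
{
	unsigned int p;

	/* Reject 0 and values whose rounded result would overflow. */
	assert(pages > 0 && pages <= UINT_MAX / 2 + 1);

	for (p = 1; p < pages; p <<= 1)
		continue;
	return (p);
}
```

For example, a five-page slab would become an eight-page request, so the efficiency of the rounded size would need to be rechecked.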

sys/vm/uma_core.c
1879 ↗(On Diff #66924)

I would like to see if there is a better way to encapsulate this loop to make the logic easier to understand. I do not object to the computational cost of the iteration, although I think we could calculate without iteration in some cases. I just think there are a few too many local variables and conditions to easily parse the behavior here.

I am really happy to see these changes, as they will be especially helpful for vnodes, which waste many megabytes of space.
I concur with Jeff that the new code is difficult to follow, though I do not have any suggestions on how to simplify it.

Try to clean up size selection loop. Does this seem more clear, or
still obscure?

The logic is easier for me to follow.

Fix CTR/KASSERT after dropping some local variables

This revision was not accepted when it landed; it landed in state Needs Review.
Feb 4 2020, 10:40 PM
This revision was automatically updated to reflect the committed changes.