Reclaim memory from UMA if the page daemon is struggling.

Authored by markj on Nov 15 2019, 11:46 PM.

Details

Reviewers

alc
dougm
jeff
kib

Commits

rS355004: Reclaim memory from UMA if the page daemon is struggling.

Summary

Use the UMA reclaim thread to asynchronously drain all caches if
there is a severe shortage in a domain. Otherwise we only trigger UMA
reclamation every 10s even when the system has completely run out of
memory.
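
The trigger described above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the code from the commit: `struct domain_model`, `lowmem_decide()` and the `enum reclaim_action` values are hypothetical names invented for illustration, and the "severe" threshold is assumed to sit below the "min" threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain snapshot; not the kernel's struct vm_domain. */
struct domain_model {
	unsigned free_count;	/* pages currently free */
	unsigned min_free;	/* "min" threshold */
	unsigned severe_free;	/* "severe" threshold, assumed <= min_free */
};

enum reclaim_action {
	RECLAIM_NONE,		/* nothing to do on this call */
	RECLAIM_PERIODIC,	/* rate-limited low-memory pass */
	RECLAIM_ASYNC		/* wake the asynchronous reclaim thread */
};

/* True if at least one domain is below its severe threshold. */
static bool
any_domain_severe(const struct domain_model *d, size_t n)
{
	assert(d != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (d[i].free_count < d[i].severe_free)
			return (true);
	}
	return (false);
}

/*
 * Decide what to do at time "now" (seconds).  The periodic pass runs at
 * most once per "period"; between passes, a severe shortage in any
 * domain requests an asynchronous reclaim instead of waiting.
 */
static enum reclaim_action
lowmem_decide(const struct domain_model *d, size_t n, long now,
    long *last_pass, long period)
{
	assert(last_pass != NULL && period > 0);
	if (now - *last_pass >= period) {
		*last_pass = now;
		return (RECLAIM_PERIODIC);
	}
	if (any_domain_severe(d, n))
		return (RECLAIM_ASYNC);
	return (RECLAIM_NONE);
}
```

In this model a domain that is merely below `min_free` does not cause any reclamation between periodic passes; only crossing `severe_free` does.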

Stop entirely draining the caches when one domain falls below its min
threshold. In some workloads it is "normal" for one NUMA domain to end
up being nearly depleted by kernel memory allocations, for example for
the ZFS ARC. The domainset iterators skip domains below the
vmd_min_free threshold on the first iteration, so we should allow that
mechanism to limit further depletion of the domain's free pages before
taking the extreme step of calling uma_reclaim(UMA_RECLAIM_DRAIN_CPU).
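
The iterator behaviour this paragraph relies on can be illustrated with a simplified two-pass selection. This is a rough userspace sketch, not the kernel's domainset iterator: `struct domain_model` and `pick_domain()` are hypothetical, and the real iterators also handle allocation policies and other details that are omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-domain snapshot; not the kernel's struct vm_domain. */
struct domain_model {
	unsigned free_count;	/* pages currently free */
	unsigned min_free;	/* "min" threshold */
};

/*
 * Return the index of the domain to allocate from, or -1 if every
 * domain is empty.  Domains are visited round-robin from "start".
 */
static int
pick_domain(const struct domain_model *d, size_t n, size_t start)
{
	assert(d != NULL || n == 0);

	/* First pass: skip domains that are below their min threshold. */
	for (size_t i = 0; i < n; i++) {
		size_t idx = (start + i) % n;

		if (d[idx].free_count >= d[idx].min_free)
			return ((int)idx);
	}

	/* Second pass: accept any domain that still has a free page. */
	for (size_t i = 0; i < n; i++) {
		size_t idx = (start + i) % n;

		if (d[idx].free_count > 0)
			return ((int)idx);
	}
	return (-1);
}
```

With two domains where the first is depleted and the second is healthy, the first pass steers the allocation to the second domain, so the depleted domain is not drained further unless no domain is above its min threshold.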

In my testing this allowed the system to stay quite responsive in
scenarios where I created artificial memory pressure. Previously it
would stutter and stall quite badly.

I do not claim this to be a final solution; now that UMA caches can
estimate their own WSS we can trim them proactively and more frequently.
However, this change allows us to quickly put pressure on both the
caches and the UMA bucket sizes, and I think that's a step in the right
direction.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision. Nov 15 2019, 11:46 PM

Harbormaster completed remote builds in B27554: Diff 64411. Nov 15 2019, 11:46 PM

markj added reviewers: alc, dougm, jeff, kib. Nov 15 2019, 11:49 PM

kib added inline comments. Nov 16 2019, 10:20 AM

sys/vm/vm_pageout.c
2013 (On Diff #64411): Why not do the wakeup in the loop above?

markj added inline comments. Nov 16 2019, 5:42 PM

sys/vm/vm_pageout.c
2013 (On Diff #64411): The loop only runs once every lowmem_period seconds, 10s by default.

I am OK with this, but I would definitely like to see a UMA daemon thread that handles timeouts and WSS trimming proactively and regularly. Reviewing the mechanism, it seems that we might need to do some work to limit the cost of more frequent processing, but running it regularly should make each run less impactful.

This revision was not accepted when it landed; it landed in state Needs Review. Nov 22 2019, 4:31 PM

Closed by commit rS355004: Reclaim memory from UMA if the page daemon is struggling. (authored by markj).

This revision was automatically updated to reflect the committed changes.

markj added a commit: rS355004: Reclaim memory from UMA if the page daemon is struggling.

Herald added a subscriber: imp. Nov 22 2019, 4:31 PM