Use the UMA reclaim thread to asynchronously drain all caches if
there is a severe shortage in a domain. Otherwise we only trigger UMA
reclamation every 10s even when the system has completely run out of
memory.

Stop draining the caches entirely when one domain falls below its min
threshold. In some workloads it is "normal" for one NUMA domain to end
up being nearly depleted by kernel memory allocations, for example for
the ZFS ARC. The domainset iterators skip domains below the
vmd_min_free threshold on the first iteration, so we should allow that
mechanism to limit further depletion of the domain's free pages before
taking the extreme step of calling uma_reclaim(UMA_RECLAIM_DRAIN_CPU).
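
As a rough illustration of the intended policy, here is a userland
sketch, not the kernel code. The struct, the severe_free threshold, the
enum, and the function names are hypothetical stand-ins, and treating
the periodic pass as a lighter "trim" is an assumption made for the
sketch. The first pass of domain selection skips domains below their
min threshold, and the full asynchronous drain is requested only when
some domain is severely short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain state; not the kernel's struct vm_domain. */
struct domain {
	unsigned free_count;	/* pages currently free */
	unsigned min_free;	/* plays the role of vmd_min_free */
	unsigned severe_free;	/* assumed lower "severe shortage" mark */
};

/* Hypothetical outcomes of the low-memory check. */
enum reclaim_action { RECLAIM_NONE, RECLAIM_TRIM, RECLAIM_DRAIN_ASYNC };

static bool
domain_below_min(const struct domain *d)
{
	return (d->free_count < d->min_free);
}

static bool
domain_severe(const struct domain *d)
{
	return (d->free_count < d->severe_free);
}

/*
 * Two-pass selection: pass 0 skips domains below their min threshold,
 * pass 1 accepts any domain that still has a free page.  Returns the
 * chosen domain index, or -1 if nothing is available.
 */
static int
pick_domain(const struct domain *doms, size_t n, size_t start)
{
	assert(doms != NULL || n == 0);
	for (int pass = 0; pass < 2; pass++) {
		for (size_t i = 0; i < n; i++) {
			size_t idx = (start + i) % n;

			if (pass == 0 && domain_below_min(&doms[idx]))
				continue;
			if (doms[idx].free_count > 0)
				return ((int)idx);
		}
	}
	return (-1);
}

/*
 * Request the expensive asynchronous drain only when some domain is
 * severely short.  A domain that is merely below min gets no more
 * than the periodic trim.
 */
static enum reclaim_action
choose_reclaim(const struct domain *doms, size_t n, bool periodic_due)
{
	for (size_t i = 0; i < n; i++) {
		if (domain_severe(&doms[i]))
			return (RECLAIM_DRAIN_ASYNC);
	}
	return (periodic_due ? RECLAIM_TRIM : RECLAIM_NONE);
}
```

With two domains where domain 1 sits below min_free but above
severe_free, pick_domain() steers allocations to domain 0 and
choose_reclaim() returns at most RECLAIM_TRIM. It asks for the
asynchronous drain only once a domain drops below severe_free.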

In my testing this allowed the system to stay quite responsive in
scenarios where I created artificial memory pressure. Previously it
would stutter and stall quite badly.

I do not claim this to be a final solution; now that UMA caches can
estimate their own working set size (WSS), we can trim them proactively
and more frequently.
However, this change allows us to quickly put pressure on both the
caches and the UMA bucket sizes, and I think that's a step in the right
direction.