
Further constrain the use of per-CPU page caches.
Closed, Public

Authored by markj on Oct 15 2019, 4:12 PM.

Details

Summary

The calculation that determines whether to use UMA cache zones is
wrong: it does not account for the fact that UMA maintains two buckets
per CPU, not one. In the worst case, both the alloc and free buckets
on a CPU may be full of unused pages.

Also increase the amount of RAM required per CPU to enable the per-CPU
caches. With this change, amd64 requires roughly 2.5GB or more of RAM
per CPU to enable the caches.

I have seen a couple of reports of unexpected OOM kills and memory
allocation failures which I believe are caused by a large number of
cached pages on systems with relatively small amounts of RAM. In
particular, uma_reclaim() does nothing to reclaim items from cache
zones, so cached pages are effectively unreclaimable right now. This
is a bug.

We could also use uma_zone_set_max() to set an upper limit on the number
of items cached in the zone, i.e., use a smaller maximum bucket size
instead of 256.

However, let's first adjust the threshold since that helps alleviate the
problem and is not too risky for 12.1.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

markj created this revision. Oct 15 2019, 4:12 PM
kib accepted this revision. Oct 16 2019, 12:40 PM
This revision is now accepted and ready to land. Oct 16 2019, 12:40 PM
markj added a comment. Oct 16 2019, 5:34 PM

I would like to get this into 12.1, so I will commit later today if there are no objections.

alc accepted this revision. Oct 16 2019, 5:40 PM
jeff added a comment. Oct 16 2019, 6:36 PM

On the other hand, this means you'd need 1TB of RAM to run with 256 processor threads, which is a possible amount for a two-socket system but still a somewhat unlikely amount of RAM. Netflix ran into this on some of their test systems and had to disable the check.

I think this should have a clause for large systems, something like && mp_ncpus < 16 or && vmd_page_count < XGB.

emaste added a subscriber: emaste. Oct 17 2019, 3:47 PM
markj added a comment. Oct 18 2019, 5:10 PM
In D22040#481861, @jeff wrote:

On the other hand, this means you'd need 1TB of RAM to run with 256 processor threads, which is a possible amount for a two-socket system but still a somewhat unlikely amount of RAM. Netflix ran into this on some of their test systems and had to disable the check.
I think this should have a clause for large systems, something like && mp_ncpus < 16 or && vmd_page_count < XGB.

I certainly agree that the check is too conservative; it is just a band-aid for the reported problems.

I believe the right solution is to stop setting UMA_ZONE_MAXBUCKET and allow a smaller cache size by setting uz_count_max based on the amount of memory in the domain, but I don't have time to get that into 12.1. I am worried that even on large systems it may be possible to trigger OOM conditions with a large number of cached pages. How about I add a tunable to enable the use of the caches for the time being? That would also be a temporary solution.

This revision was automatically updated to reflect the committed changes.