Change Details

ktls uses 16K physically contiguous crypto destination buffers as an optimization. These buffers are managed by UMA, using a cache zone whose import function allocates them with vm_page_alloc_noobj_contig_domain(). After a server has been up for a while serving a "Netflix" workload, physical memory becomes severely fragmented, and these contiguous allocations fail when UMA tries to expand the zone.

When I attempted to fix this by calling vm_page_reclaim_contig_domain() from the ktls alloc thread, I observed that the thread wound up consuming an entire core for hours. This is because vm_page_reclaim_contig_domain() scans all of physical memory looking for runs of relocatable pages, but then reclaims just one of those runs. On a large-memory machine (on the order of 100GB or more), each call can take several seconds to scan all of physical memory, so refilling the zone one run at a time repeats that full scan for every buffer.

The algorithm for vm_page_reclaim_contig_domain() is the way it is for good reasons (see https://reviews.freebsd.org/D28924#649775). So rather than modifying the core algorithm, I extended vm_page_reclaim_contig_domain() to take a "num_runs" argument that lets the caller request that it reclaim more than a single run. No functional change is intended for existing callers.

The first user of this interface is the ktls code (see the follow-on review, https://reviews.freebsd.org/D39421). By reclaiming multiple runs per call, ktls goes from consuming hours of CPU to refill its buffer zone to just tens of seconds.
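To illustrate why batching helps, here is a toy userland model of the cost, not the kernel code. It assumes each call pays for one full scan of memory regardless of how many runs it then reclaims, as described above. The names toy_reclaim() and toy_refill() and the array-of-chunks representation are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not the kernel implementation): memory is an
 * array of run-sized chunks, and relocatable[i] is true when chunk i
 * could be reclaimed as one contiguous run.
 *
 * One call always pays for a full scan, then reclaims at most num_runs
 * candidates.  Returns the number of runs reclaimed and adds the number
 * of chunks visited to *scanned.
 */
static int
toy_reclaim(bool *relocatable, size_t nchunks, int num_runs, size_t *scanned)
{
	int reclaimed = 0;
	size_t i;

	assert(num_runs > 0);

	/* The scan cost is the same whether we want 1 run or many. */
	*scanned += nchunks;

	/* Reclaim up to num_runs of the candidates the scan found. */
	for (i = 0; i < nchunks && reclaimed < num_runs; i++) {
		if (relocatable[i]) {
			relocatable[i] = false;
			reclaimed++;
		}
	}
	return (reclaimed);
}

/*
 * Refill "want" buffers, requesting at most "batch" runs per call.
 * Returns the total number of chunks scanned across all calls.
 */
static size_t
toy_refill(bool *relocatable, size_t nchunks, int want, int batch)
{
	size_t scanned = 0;
	int got = 0, n, ask;

	while (got < want) {
		ask = (want - got < batch) ? want - got : batch;
		n = toy_reclaim(relocatable, nchunks, ask, &scanned);
		if (n == 0)
			break;		/* nothing left to reclaim */
		got += n;
	}
	return (scanned);
}
```

In this model, refilling W buffers from N chunks costs about W * N chunk visits with a batch size of 1, but only about N when all W runs are requested in a single call. That is the same shape as the hours-to-seconds difference reported above, although the real savings depend on how the kernel scan behaves on fragmented memory.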