Update zfs_arc_free_target after r329882.
Closed, Public

Authored by markj on Apr 6 2018, 8:38 PM.

Details

Summary

With r329882, the page daemon reclaims pages whenever a domain's free page
count drops below its free target. In particular, it is now unlikely for
v_free_count to drop below zfs_arc_free_target under ordinary circumstances.
Update zfs_arc_free_target accordingly.
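
For context, a minimal sketch of the kind of adjustment being discussed, assuming the default is still wired up in arc_free_target_init() and that vm_cnt.v_free_target is the page daemon's free target (names follow the FreeBSD code of that era; this is not the committed diff):

/*
 * Illustrative only: tie the ARC's free-page threshold to the page
 * daemon's free target, so the ARC starts evicting at roughly the
 * same point where the page daemon starts reclaiming.
 */
static void
arc_free_target_init(void *unused __unused)
{
        /* Previously this was derived from the pageout wakeup threshold. */
        zfs_arc_free_target = vm_cnt.v_free_target;
}
SYSINIT(arc_free_target_init, SI_SUB_KTHREAD_PAGE, SI_ORDER_ANY,
    arc_free_target_init, NULL);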

Test Plan

Don tested this change and found that it addressed the ARC
backpressure issue which appeared after r329882. I don't
think this is a complete solution, but it's an improvement over
the current behaviour.

Diff Detail

Lint: No Lint Coverage
Unit: No Test Coverage
Build Status: Buildable 16026
Build 16003: arc lint + arc unit

Event Timeline

markj added reviewers: avg, mav, jeff, delphij.
markj edited subscribers, added: truckman; removed: delphij.

This patch works as well as the manual sysctl tuning experiment that I previously tried.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:392

There is an inherent problem when attempting to balance multiple competing caches: which one do we reclaim from? Ideally you'd want to keep the pages that are most likely to be re-used, regardless of where they live. Unfortunately that is impossible to determine. We have the same problem with UMA. Does it make more sense to page out clean file pages or to throw away cached kernel memory? One costs I/O, the other potentially a large amount of CPU due to lock contention.

What would a better solution be? From research papers I see little value in the ARC over our own page cache. It does loop detection and some other minor cache improvements, but in the age of SSDs these are increasingly irrelevant for high-performance systems. I haven't looked at it feature-wise, but I think placing ARC data in the inactive queue, to be recycled like the buf cache, would be preferable. You could do it like the VMIO integration in the buf cache, where you statically keep a small subset pinned and let the rest float. Since the ARC is physically organized, you could leave evicted blocks on the device vnodes (devvps) or construct pseudo-device VM objects for the purpose.

In the absence of some better solution we have two back-pressure mechanisms that need to coordinate: the inactive queue and the ARC. With the PID controller, the free page count is unlikely to dip below v_free_target very often unless consumption ramps up quickly or the inactive queue is exhausted. We probably want to keep some balance between inactive-queue processing and ARC eviction as sources of free pages.

Possibly this would be better served by an additional event handler. Rather than lowmem, which only fires when we are actually low on memory, this would be a hook invoked on every pageout pass. That would at least allow us to synchronize the two processes and let each give up a fraction.
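
A rough sketch of that idea, using FreeBSD's EVENTHANDLER(9) machinery; the vm_pageout_pass event and the arc_evict_pages() helper are invented here purely for illustration (only vm_lowmem exists today), and the page daemon would have to EVENTHANDLER_INVOKE() the new event on each scan:

#include <sys/param.h>
#include <sys/eventhandler.h>

/* Hypothetical event, fired by the page daemon on every pass. */
typedef void (*vm_pageout_pass_handler_t)(void *arg, int page_shortage);
EVENTHANDLER_DECLARE(vm_pageout_pass, vm_pageout_pass_handler_t);

/* Stand-in for an ARC eviction primitive; not a real function. */
static void arc_evict_pages(int npages);

/*
 * ARC-side handler: evict a fraction of the shortage on each pass so
 * ARC eviction stays in step with inactive-queue processing.
 */
static void
arc_pageout_pass(void *arg __unused, int page_shortage)
{
        arc_evict_pages(page_shortage / 2);
}

static eventhandler_tag arc_pageout_pass_tag;

/* Would be run from a SYSINIT alongside the existing vm_lowmem hook. */
static void
arc_pageout_pass_init(void *unused __unused)
{
        arc_pageout_pass_tag = EVENTHANDLER_REGISTER(vm_pageout_pass,
            arc_pageout_pass, NULL, EVENTHANDLER_PRI_FIRST);
}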

In the absence of that, and recognizing that we need a temporary fix, I would set arc_free_target a percent or so above free_target, because in reality the PID controller will raise the free target dynamically according to consumption.
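
Concretely (again only a sketch, with the same assumed names as in the earlier snippet), that suggestion would amount to something like:

        /* Start ARC back-pressure roughly 1% above the page daemon's free target. */
        zfs_arc_free_target = vm_cnt.v_free_target + vm_cnt.v_free_target / 100;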

This revision was not accepted when it landed; it landed in state "Needs Review". Apr 10 2018, 1:56 PM
This revision was automatically updated to reflect the committed changes.