Page MenuHomeFreeBSD

Avoid dequeuing the page found by a soft fault.
ClosedPublic

Authored by markj on Mar 8 2018, 4:30 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 11, 2:53 PM
Unknown Object (File)
Sun, Dec 29, 10:15 PM
Unknown Object (File)
Nov 27 2024, 3:49 AM
Unknown Object (File)
Nov 24 2024, 3:25 PM
Unknown Object (File)
Nov 23 2024, 6:11 PM
Unknown Object (File)
Nov 23 2024, 10:35 AM
Unknown Object (File)
Nov 20 2024, 1:14 AM
Unknown Object (File)
Nov 19 2024, 11:17 PM
Subscribers

Details

Summary

This dequeue dates back to the CSRG sources:
https://svnweb.freebsd.org/csrg/sys/vm/vm_fault.c?view=markup#l239

However, I don't believe it's necessary. If the page is in
PQ_INACTIVE/PQ_LAUNDRY, the busy lock will cause the page daemon to
ignore it. If the page is in PQ_ACTIVE, we can handle a concurrent scan
by the page daemon, which is only counting references.

If the fault is CoW, release_page() will call vm_page_deactivate() on
the page, which handles the dequeue. Otherwise we will call
vm_page_activate() on the page, which also dequeues as necessary. In
particular, if the page is already in PQ_ACTIVE, this change lets us
avoid a needless pair of page queue lock acquisitions.

Test Plan

Smoke testing didn't turn up any problems. If the patch is accepted,
I plan to ask Peter to test it.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj edited the test plan for this revision. (Show Details)
markj added reviewers: alc, kib, jeff.
alc accepted this revision.EditedMar 8 2018, 5:12 PM

For what it's worth, this goes back even further, to the 1980's at CMU. I also had this in my TODO list as a post-lazy wiring change.

This revision is now accepted and ready to land.Mar 8 2018, 5:12 PM
sys/vm/vm_fault.c
714 ↗(On Diff #40078)

If the page is being copied frequently (i.e., we deactivate it in release_page() after the copy), with this change we'll no longer preserve LRU. One solution might be to set PGA_REFERENCED instead. Or, with r330296 we can give the page a second chance through PQ_INACTIVE by setting PGA_REQUEUE without creating a batch queue entry.

sys/vm/vm_fault.c
714 ↗(On Diff #40078)

My recollection of the page daemon's inactive scan is that an unmapped inactive page with PGA_REFERENCED set is moved to the tail of the inactive queue. That should suffice.

A thing to keep in mind is that when we prefault a mapping for pages (that are likely to be sources for future COW faults) we do not change their page queue or set PGA_REFERENCED because we don't want speculative, non-demand mappings to appear as though an access has been performed. So, if we don't do something to the page, it could too easily be reclaimed.

This revision now requires review to proceed.Mar 8 2018, 5:47 PM
markj marked an inline comment as done.Mar 8 2018, 5:48 PM
markj added inline comments.
sys/vm/vm_fault.c
714 ↗(On Diff #40078)

Right, I forgot that we simply requeue an unmapped inactive page if references are found during a scan.

I have to prepare for my lecture. I'll think about this some more later.

This revision is now accepted and ready to land.Mar 8 2018, 9:02 PM

Dequeue the page before temporarily disassociating it from its object.

We need to keep the dequeue for optimized CoW faults: otherwise the page
daemon may encounter a queued page with no object, which it is not
prepared to handle.

This revision now requires review to proceed.Mar 11 2018, 4:06 PM

Have vm_page_replace() assert that the new page is dequeued.

Since mnew must not belong to an object when vm_page_replace() is
called, it must also be dequeued.

This revision is now accepted and ready to land.Mar 11 2018, 5:03 PM
sys/vm/vm_fault.c
669 ↗(On Diff #40165)

I'm having second thoughts about this approach. The reason being that the mapping check by the page daemon is based on the object's ref count. It accurately tells you when a page can't possibly be mapped but not vice versa. Backing objects in a shadow chain will pass the test, and so we will dubiously reactivate the source page of a COW fault at some later point.

Before talking about implementation, do we agree that the following describes the desired behavior for COW faults?

Suppose that the page was active. I think it's okay to deactivate the page, as we have previously done. If the page is actually mapped (read-only) and accessed by another process, the page daemon will reactivate it, because the object's ref count will be non-zero and pmap_ts_referenced() will return a non-zero value. In fact, with the exception of pmap_remove_pages(), because we set PGA_REFERENCED whenever we destroy a mapping with the accessed bit set in the PTE, the page will be reactivated even if it is not still mapped by this other process at the instant that the page daemon examines the page.

Suppose that the page was inactive. I think that requeueing the page suffices, and this requeueing can be deferred to the page daemon. As described in the active case, the page daemon may actually discover that the page is actually mapped and accessed by another process, and so it may actually reactivate the page.

In D14625#307878, @alc wrote:

Before talking about implementation, do we agree that the following describes the desired behavior for COW faults?

Yep, I agree that the behaviour you described seems reasonable.

In D14625#307878, @alc wrote:

Before talking about implementation, do we agree that the following describes the desired behavior for COW faults?

Yep, I agree that the behaviour you described seems reasonable.

Then, I'm going to retract my original support for the PGA_REFERENCED-based solution, because that will lead to page reactivation when we want the page to be requeued in the inactive queue. Do you have a plan for merging r330296 into HEAD?

In D14625#308095, @alc wrote:

Then, I'm going to retract my original support for the PGA_REFERENCED-based solution, because that will lead to page reactivation when we want the page to be requeued in the inactive queue. Do you have a plan for merging r330296 into HEAD?

Not quite yet. I still have some cleanup to do, and the change needs review and testing, assuming of course that we agree with the direction taken there. I'm not sure if you've looked much at r330296 but it's a fairly drastic change. I can describe it over the phone at some point if that would be more convenient.

In any case, it seems we want a variation of vm_page_deactivate() that requeues the page even if it's already in PQ_INACTIVE (and enqueues it there otherwise). As part of r330296 I'd like to make sure there is a clean separation between the higher-level page queue APIs (e.g., vm_page_activate(), vm_page_deactivate()) and the lower-level queue manipulation APIs (e.g., vm_page_requeue()). In particular, use of the latter should be confined to vm_page.c and vm_pageout.c. I don't like the

else if (m->queue != PQ_INACTIVE)
    vm_page_deactivate(m);
else
    vm_page_requeue(m);

in vfs_vmio_unwire(), for example.

Even before r330296, having a vm_page_deactivate_or_requeue() for use in release_page() would be an improvement since we'd only have to touch the page and page queue locks once instead of twice. With r330296, vm_page_deactivate_or_requeue() could be modified to use a lazy requeue, or another API could be added so that the caller can choose between strict LRU and second-chance. Does that sound like a reasonable interim approach? If so, any suggestions for a better name than vm_page_deactivate_or_requeue()? :)

... I don't like the

else if (m->queue != PQ_INACTIVE)
    vm_page_deactivate(m);
else
    vm_page_requeue(m);

in vfs_vmio_unwire(), for example.

Even before r330296, having a vm_page_deactivate_or_requeue() for use in release_page() would be an improvement since we'd only have to touch the page and page queue locks once instead of twice. With r330296, vm_page_deactivate_or_requeue() could be modified to use a lazy requeue, or another API could be added so that the caller can choose between strict LRU and second-chance. Does that sound like a reasonable interim approach? If so, any suggestions for a better name than vm_page_deactivate_or_requeue()? :)

Yes. No.

That said, there aren't that many calls to vm_page_deactivate() in the tree, and I suspect that none of them would be adversely affected by changing vm_page_deactivate() to requeue when the page is already inactive.

Make vm_page_deactivate() and vm_page_launder() requeue pages

These functions would previously do nothing if the page was already in
the corresponding queue. In many cases, the page is known not to be in
the queue, so this change has no effect. sendfile and the buffer cache
were already emulating this behaviour, so the change has no effect there
either.

In most other cases I believe the change to vm_page_deactivate() will
have little effect:

  • release_page() is invoked when the fault needs to be restarted, or after the copy in a CoW fault. Restarts are rare, and we want to treat the copy as an access for the source page.
  • dmu_read_pages(), in ZFS, deactivates the page after grabbing and copying it, so requeuing to provide LRU seems reasonable.
  • vm_fault_populate_cleanup() is only called in rare cases, where a fault races with a map entry update.
  • vm_fault_dontneed() is used to speed up reclamation of read-behind pages by deactivating them. The change avoids requeuing in this case.

vm_page_launder() is called on pages known not to be in PQ_LAUNDRY. The
only exception is vm_page_advise(), where we avoid requeuing a dirty
page after MADV_DONTNEED is applied.

This revision now requires review to proceed.Mar 16 2018, 3:22 PM
In D14625#308444, @alc wrote:

That said, there aren't that many calls to vm_page_deactivate() in the tree, and I suspect that none of them would be adversely affected by changing vm_page_deactivate() to requeue when the page is already inactive.

I went with this approach. Now we preserve LRU without using PGA_REFERENCED.

This revision is now accepted and ready to land.Mar 17 2018, 5:21 PM
This revision was automatically updated to reflect the committed changes.