
Allow default objects to have invalid pages so that fault restarts don't reallocate continuously.
ClosedPublic

Authored by jeff on Dec 3 2019, 9:31 PM.

Details

Summary

Imagine this scenario: several threads or processes share a shadow object that maps a vnode. They all simultaneously fault on the same address, whose page is currently busy. The first thread through will acquire the first_object lock, allocate a busy page, follow the shadow chain (unlocking the first object), discover the busy page, relock first_object, free first_m, and wait. Each subsequent thread will repeat this with minor variations depending on how the operations are ordered by the first_object lock. If they race in before the free, they will busy sleep on this page, which will be immediately freed, and they will immediately loop and repeat the process. Otherwise they arrive after the first thread's free and simply repeat the allocation process.

If we instead leave the page allocated, they will each skip the allocation and either wait for the first thread through the fault to validate the page, if it is paging in, or all reach the backing_object page and wait there, without each allocating and freeing a page.
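The churn described above can be illustrated with a small userland model. This is only a sketch under stated assumptions: `toy_object`, `toy_fault_attempt`, and `toy_count_allocs` are hypothetical names, not kernel code, and the threads are run one after another, which corresponds to the ordering where each thread arrives after the previous thread's free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the first object's slot at the faulting offset. */
struct toy_object {
	bool has_page;   /* an invalid placeholder page is resident */
	unsigned allocs; /* pages allocated so far */
	unsigned frees;  /* pages freed so far */
};

/* One fault attempt that finds the backing page busy and must restart. */
static void
toy_fault_attempt(struct toy_object *obj, bool keep_page_on_busy)
{
	assert(obj != NULL);

	/* Allocate a placeholder only if none is already resident. */
	if (!obj->has_page) {
		obj->has_page = true;
		obj->allocs++;
	}

	/* (Shadow chain walk elided: the backing page is found busy.) */

	/* Old behavior: free the placeholder before sleeping and restarting. */
	if (!keep_page_on_busy) {
		obj->has_page = false;
		obj->frees++;
	}
}

/* Total allocations when nthreads each restart `restarts` times. */
static unsigned
toy_count_allocs(unsigned nthreads, unsigned restarts, bool keep_page_on_busy)
{
	struct toy_object obj = { false, 0, 0 };

	for (unsigned t = 0; t < nthreads; t++)
		for (unsigned r = 0; r < restarts; r++)
			toy_fault_attempt(&obj, keep_page_on_busy);
	return obj.allocs;
}
```

In this model, freeing on every busy restart costs one allocation per attempt, while retaining the page costs a single allocation no matter how many threads pile up. The real interleavings under the object lock are more varied than this sequential approximation.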

This also has the advantage of not requiring the object lock to be re-acquired just to free the page. If we're low on memory, pageout is already capable of freeing this page. This limits the total number of additional invalid pages on the inactive queue to 2 * threads if all threads are stuck waiting in fault.

I identified two places in vm_object that did not expect invalid pages on default objects. I _think_ this is all of them.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 27920
Build 26089: arc lint + arc unit

Event Timeline

I agree with the idea, but I do not understand the implementation.

Assume that we have a shadow object over the vnode object, and we take a read fault on a resident vnode page. In this case we

  • allocate a new invalid page in the top object
  • walk the shadow chain to find the content page in the vnode object
  • pmap_enter() the vnode object's page
  • do fault_deallocate(), which leaves the invalid page unbusied on the shadow object queue.

You entered the content page into the PTE, so the system should still work, but read-only accesses would now always leave the invalid page in the queue if there is a shadow object in the chain.

You are right. When I first wrote this patch I only did it in the busy handler, and that passed stress2. However, I tried to make it broader before review. I need to explicitly remove first_m in that case, but it only needs to be handled just before pmap_enter().
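The leak raised in the review, and the effect of removing first_m just before pmap_enter(), can be sketched with a toy counter model. Everything here is an assumption made for illustration: `toy_shadow`, `toy_read_fault`, and `toy_leftover_after` are hypothetical userland names, and each fault is assumed to hit a distinct offset whose data page is already resident in the vnode object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters for a toy shadow object; not kernel structures. */
struct toy_shadow {
	unsigned invalid_resident; /* invalid placeholder pages left behind */
	unsigned mappings_entered; /* toy "pmap_enter" calls performed */
};

/* Model one read fault that is satisfied by the backing vnode object. */
static void
toy_read_fault(struct toy_shadow *top, bool free_first_m_before_enter)
{
	assert(top != NULL);

	/* Step 1: allocate an invalid placeholder (first_m) in the top object. */
	top->invalid_resident++;

	/* Step 2: walk the shadow chain; the vnode object's page is found. */

	/* Proposed fix: drop the placeholder just before entering the mapping. */
	if (free_first_m_before_enter)
		top->invalid_resident--;

	/* Step 3: map the vnode object's page. */
	top->mappings_entered++;

	/* Step 4: the deallocate step leaves any remaining placeholder alone. */
}

/* Number of invalid placeholders left after nfaults read faults. */
static unsigned
toy_leftover_after(unsigned nfaults, bool free_first_m_before_enter)
{
	struct toy_shadow top = { 0, 0 };

	for (unsigned i = 0; i < nfaults; i++)
		toy_read_fault(&top, free_first_m_before_enter);
	assert(top.mappings_entered == nfaults);
	return top.invalid_resident;
}
```

Without the explicit removal, the model leaves one invalid page per read fault; with it, none remain. The mapping count is identical in both cases, which matches the observation that the system still works either way.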

What if we wait for fs.m to become unbusy without unbusying fs.first_m?

sys/vm/vm_fault.c
199

I am testing stress2 with this part of the diff reversed. This means that only busy sleeps will retain the first_m. This is safe because it always restarts.

jeff edited the summary of this revision. (Show Details)

I added another wrapper around page free and more asserts around fs.m and
fs.first_m settings. This version explicitly frees default pages if they
are not first_m or if we restart for any reason other than busy.
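
The freeing rule stated above can be written as a single predicate. This is a minimal sketch, assuming a hypothetical two-valued restart reason; `toy_restart_reason` and `toy_should_free_default_page` are illustrative names and do not appear in the patch.

```c
#include <stdbool.h>

/* Hypothetical restart reasons; the real fault code distinguishes more cases. */
enum toy_restart_reason {
	TOY_RESTART_BUSY, /* restarting because a page was busy */
	TOY_RESTART_OTHER /* restarting for any other reason */
};

/*
 * Return true when an invalid page on a default object should be freed.
 * The page is retained only when it is first_m and the restart is a busy sleep.
 */
static bool
toy_should_free_default_page(bool is_first_m, enum toy_restart_reason why)
{
	return (!is_first_m || why != TOY_RESTART_BUSY);
}
```

Only one of the four combinations retains the page, which keeps the retained-page case confined to the path that is known to restart.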

I continue to run stress2 but this passes the couple of tests that failed
before.

sys/vm/vm_fault.c
1143

This was a page leak in my first version. It is the same case as if the pager did not contain the page, but we need to run it now for DEFAULT objects as well.

I'm a little surprised we don't do has_pages() before allocation on swap objects. We could skip a lot of code in that case.

Any objections to going forward with this commit? It has successfully passed numerous stress2 runs.

This revision is now accepted and ready to land. Dec 12 2019, 9:47 PM
sys/vm/vm_fault.c
858–859

This may actually wake up and touch an object that has been freed, but only to unlock it. Currently this is harmless. In a later patch I have a change that stops re-locking the object just to return and unlock, so the page busy sleep touches no state when it is done.