Page MenuHomeFreeBSD

Remove the page lock dependency on busy sleeps by using sleepq to avoid races.
AcceptedPublic

Authored by jeff on Tue, Aug 13, 8:08 PM.

Details

Reviewers
alc
kib
markj
Summary

This uses the sleepq lock itself to synchronize busy sleep/wakeup and avoids the need for the page lock. I need the lock test and conditional acquire/release for a later patch. It could be elided for now but it allows for simplifying some consumers already.

I didn't realize some diffs from a different patchset snuck in and I will correct that. I made a note where this occurred.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 25967
Build 24523: arc lint + arc unit

Event Timeline

jeff created this revision.Tue, Aug 13, 8:08 PM
jeff edited the summary of this revision. (Show Details)Tue, Aug 13, 8:16 PM
jeff added reviewers: alc, kib, markj.
markj added inline comments.Tue, Aug 13, 8:34 PM
sys/vm/vm_page.c
963

If the page lock isn't held and the object lock isn't held, what prevents m from being freed after the call to vm_page_busy_sleep() and before the dereference of m->object?

jeff added inline comments.Tue, Aug 13, 8:43 PM
sys/vm/vm_page.c
963

Nothing prevents it. I believe we handle the race gracefully.

m->object could become NULL at any point after we drop the object lock. However, this means the busy holder is removing the page and will unbusy at some point, waking us up, or avoiding the sleep below.

All vm_page_busy_sleep callers are basically doing what we call "WAITFAIL" elsewhere. They don't return with the busy lock held. They restart compound operations and look for the correct page again.

It is possible that the page could be freed, reallocated, and we sleep for the wrong thing. However, I believe this is harmless and rare. It would be possible to treat the object lock as an interlock here with a slightly more complicated and expensive sleepq operation. Cases that don't hold the object lock are protected by a wire count.

3955–3956

This is a merge bug and related to a later diff.

sys/vm/vm_pageout.c
378 ↗(On Diff #60758)

Merge bug for later diff.

markj added inline comments.Tue, Aug 13, 8:46 PM
sys/vm/vm_page.c
963

The wiring doesn't prevent the page from being removed from its object, in which case m->object will become NULL. See vm_object_terminate_pages() for instance.

jeff added inline comments.Tue, Aug 13, 8:48 PM
sys/vm/vm_page.c
963

Yes I understand but it does prevent the page from being reallocated to some other purpose. So there is no chance of a spurious sleep.

markj added inline comments.Tue, Aug 13, 8:52 PM
sys/vm/vm_page.c
963

We still need to check that obj is non-NULL before dereferencing it.

kib added inline comments.Tue, Aug 13, 8:53 PM
sys/vm/vm_page.c
927–929

I wonder if this needs atomc_cmpset_int_rel. Before, the vm_page_unlock() provided the release semantic, sunbusy acts as unlock. vm_page_xunbusy does use _rel.

958

Then assert that the object ref count is > 0 ?

I would prefer to convert the nonshared argument to flags, and add two flags, one for nonshared, another for obj_locked, instead of using VM_OBJECT_WOWNED().

jeff added inline comments.Wed, Aug 14, 10:50 PM
sys/vm/vm_page.c
958

I copied the comment from elsewhere. I'm not comfortable enough with all callers to add a new assert. This may be used in the object tear-down path for example. We rely on object type stability elsewhere.

I thought about the flag. It is a tradeoff. With a flag you are more explicit and so less likely to have bugs where the caller doesn't realize the object lock is dropped. However, you're then subject to bugs where the flag was just specified wrong. With the silent operation you won't have bugs with releasing/re-acquiring the lock but the caller may not anticipate losing synchronization.

If you follow all of my patch sets you will see that ultimately there are few cases that are called with any lock held so I settled on this method.

jeff updated this revision to Diff 60810.Wed, Aug 14, 11:11 PM

Address review feedback. Clean up patch. Switch to fcmpset and rel semantics.

jeff updated this revision to Diff 61023.Tue, Aug 20, 12:46 AM

Respond to review feedback. Eliminate the object locking changes for now.

markj accepted this revision.Tue, Aug 20, 1:21 AM
This revision is now accepted and ready to land.Tue, Aug 20, 1:21 AM
kib accepted this revision.Tue, Aug 20, 11:29 AM