Page MenuHomeFreeBSD

Don't hold the object lock across getpages.
ClosedPublic

Authored by jeff on Jan 4 2020, 8:26 PM.

Details

Summary

Only swap and the device pager want the object lock to perform getpages. vnodes do not. Device pagers only really need the object lock for legacy compat, the individual pagers could simply lock the object around vm_page_replace() calls if we wanted to push this down one more level for devices. Most cdev pagers drop the lock for the bulk of their operation. swap could possibly even use its own lock or a large map of locks if we wanted to further optimize and improve situations like unswapped().

This patch is really mechanical and does not push the synchronization change very far into consumers. I have a long series of commits that will improve the consumers one by one. The first is fault here: https://reviews.freebsd.org/D23034

This does now rely completely on exclusive busy and paging in progress to handle concurrency with other object operations.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jeff created this revision.Jan 4 2020, 8:26 PM
jeff edited the summary of this revision. (Show Details)Jan 4 2020, 8:38 PM
jeff added reviewers: alc, dougm, kib, markj.
jeff set the repository for this revision to rS FreeBSD src repository.
markj added inline comments.Jan 5 2020, 11:07 PM
sys/dev/md/md.c
1072 ↗(On Diff #66349)

Shouldn't we just drop the lock around the get_pages call? If we are reading cached data the lock will be dropped needlessly.

sys/vm/sg_pager.c
158 ↗(On Diff #66349)

Can we assert paging_in_progress > 0?

sys/vm/swap_pager.c
1201 ↗(On Diff #66349)

Just a nit: the assignment and lock can be reversed.

kib added inline comments.Jan 6 2020, 1:38 AM
sys/vm/vm_fault.c
1084 ↗(On Diff #66349)

Don' you need to re-evaluate behind/ahead after the relock, or at least clip ahead to potentially trimmed object size ?

jeff added inline comments.Jan 6 2020, 2:22 AM
sys/dev/md/md.c
1072 ↗(On Diff #66349)

We actually only need the object lock in the vm_page_free() case. I didn't realize this was so wonky in this patch. My follow-up is cleaner. I can tidy this up too.

sys/vm/sg_pager.c
158 ↗(On Diff #66349)

I can make sure it's part of the vm_pager_getpages asserts.

sys/vm/vm_fault.c
1084 ↗(On Diff #66349)

For vnode we were already relying on the vnode lock to protect against truncate.

I need to look at tmpfs to see if that is also true there. For anonymous objects it is not possible that it shrunk.

kib added inline comments.Jan 6 2020, 3:21 AM
sys/vm/vm_fault.c
1084 ↗(On Diff #66349)

For tmpfs we do not lock the vnode, since the backing object type is swap. The same is true for shmfd that can be truncated.

jeff added inline comments.Jan 6 2020, 3:25 AM
sys/vm/vm_fault.c
1084 ↗(On Diff #66349)

swap pager already drops the lock during the actual pagein which is after the clustering is determined. So this is not really a new race if it exists. I will still see if I can identify how it is handled.

jeff added inline comments.Jan 6 2020, 3:27 AM
sys/vm/vm_fault.c
1084 ↗(On Diff #66349)

Actually vm_fault_prefault() will simply fail to find the pages so this is harmless in any case.

jeff updated this revision to Diff 66660.Jan 12 2020, 9:03 PM

Add an assert and fix a number of cases missing pip. I do not believe they would have caused bugs before but they may after my collapse change.

markj added inline comments.Jan 13 2020, 2:43 PM
sys/vm/vm_swapout.c
577 ↗(On Diff #66660)

What ensures that the page isn't unswapped after the lock is dropped? swap_pager_unswapped() can be called without the page being busy, for example in the active queue scan.

jeff added inline comments.Jan 13 2020, 7:10 PM
sys/vm/vm_swapout.c
577 ↗(On Diff #66660)

We do valid checks and busy pages before calling get_pages. unswapped can clear dirty but not valid. We are guaranteed the validity won't change while xbusy or vm object locked. No one should be paging in a valid page.

jeff added inline comments.Jan 13 2020, 8:15 PM
sys/kern/kern_sendfile.c
273 ↗(On Diff #66660)

There is one missing pip_wakeup in an early exit case that I fixed on my branch.

kib accepted this revision.Jan 13 2020, 8:19 PM
This revision is now accepted and ready to land.Jan 13 2020, 8:19 PM
markj accepted this revision.Jan 13 2020, 8:33 PM
This revision was automatically updated to reflect the committed changes.