Page MenuHomeFreeBSD

Fix object locking races in swapoff.
ClosedPublic

Authored by markj on Thu, Feb 13, 5:23 PM.

Details

Summary

swap_pager_swapoff_object()'s goal is to allocate pages for all valid
swap blocks belonging to the object, for which there is no resident
page. If the page corresponding to a block is already resident and
valid, the block can simply be discarded.

The existing implementation tries to minimize the number of I/Os used.
For each cluster of swap blocks, it finds maximal runs of valid swap
blocks not resident in memory, and valid resident pages. During this
processing, the object lock may be dropped in several places: when
calling getpages, or when blocking on a busy page in
vm_page_grab_pages(). While the lock is dropped, another thread may
free swap blocks, causing getpages to page in stale data.

Fix the problem following a suggestion from Jeff: use getpages'
readahead capability to perform clustering rather than doing it
ourselves. The simplies the code a bit without reintroducing the old
behaviour of performing one I/O per page.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

markj created this revision.Thu, Feb 13, 5:23 PM
jeff added inline comments.Thu, Feb 13, 10:56 PM
sys/vm/swap_pager.c
1773 ↗(On Diff #68256)

vm_page_grab_valid() supports a prefetch count that is validated with has_pages. You could likely use it here and simplify this block.

You also can test the valid bit with only the object lock and optimize this further by looping with vm_page_next(). I would do a lookup and iteration with vm_page_lookup() and vm_page_next() and only use vm_page_grab_valid() when no page was present. If you don't do this I will do so to fix my kstack problem. I would drop a comment documenting that the valid test depends on the object lock as I have done elsewhere.

You will have to add SLEEPFAIL support to grab_valid but that is trivial. I just didn't have a use case before.

markj added inline comments.Fri, Feb 14, 3:53 PM
sys/vm/swap_pager.c
1773 ↗(On Diff #68256)

After thinking it over some more, I'm not sure that we can avoid busying valid pages here. I think that is the only way we synchronize with a concurrent putpages. Once putpages has allocated swap blocks and submitted I/O to a device, swapoff should wait for that to finish before proceeding.

GEOM does some tracking of in-flight requests and should wait for them to drain before destroying a swap device, so that may be sufficient after all, but I need to spend some time to verify that.

markj updated this revision to Diff 68328.Fri, Feb 14, 5:11 PM

If the object is dead, wait for pending paging operations
to complete before returning.

kib accepted this revision.Fri, Feb 14, 7:21 PM
kib added inline comments.
sys/vm/swap_pager.c
1725 ↗(On Diff #68328)

Isn't it better to set unswapped before xunbusy, even if it is not strictly needed now ?

This revision is now accepted and ready to land.Fri, Feb 14, 7:21 PM
markj updated this revision to Diff 68337.Fri, Feb 14, 8:53 PM

Avoid busying valid pages, to avoid a deadlock involving
kernel stack pages.

This revision now requires review to proceed.Fri, Feb 14, 8:53 PM
jeff accepted this revision.Fri, Feb 14, 9:29 PM
jeff added inline comments.
sys/vm/swap_pager.c
1794 ↗(On Diff #68337)

It may not be required by style but I really prefer { } here.

This revision is now accepted and ready to land.Fri, Feb 14, 9:29 PM
This revision was automatically updated to reflect the committed changes.