Page MenuHomeFreeBSD

Fix object locking races in swapoff.
ClosedPublic

Authored by markj on Feb 13 2020, 5:23 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 17, 10:44 AM
Unknown Object (File)
Tue, Dec 17, 3:23 AM
Unknown Object (File)
Wed, Dec 11, 4:59 AM
Unknown Object (File)
Nov 26 2024, 9:00 AM
Unknown Object (File)
Nov 11 2024, 4:54 AM
Unknown Object (File)
Sep 26 2024, 5:10 PM
Unknown Object (File)
Sep 26 2024, 5:10 PM
Unknown Object (File)
Sep 26 2024, 5:10 PM
Subscribers

Details

Summary

swap_pager_swapoff_object()'s goal is to allocate pages for all valid
swap blocks belonging to the object, for which there is no resident
page. If the page corresponding to a block is already resident and
valid, the block can simply be discarded.

The existing implementation tries to minimize the number of I/Os used.
For each cluster of swap blocks, it finds maximal runs of valid swap
blocks not resident in memory, and valid resident pages. During this
processing, the object lock may be dropped in several places: when
calling getpages, or when blocking on a busy page in
vm_page_grab_pages(). While the lock is dropped, another thread may
free swap blocks, causing getpages to page in stale data.

Fix the problem following a suggestion from Jeff: use getpages'
readahead capability to perform clustering rather than doing it
ourselves. The simplies the code a bit without reintroducing the old
behaviour of performing one I/O per page.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/vm/swap_pager.c
1773 ↗(On Diff #68256)

vm_page_grab_valid() supports a prefetch count that is validated with has_pages. You could likely use it here and simplify this block.

You also can test the valid bit with only the object lock and optimize this further by looping with vm_page_next(). I would do a lookup and iteration with vm_page_lookup() and vm_page_next() and only use vm_page_grab_valid() when no page was present. If you don't do this I will do so to fix my kstack problem. I would drop a comment documenting that the valid test depends on the object lock as I have done elsewhere.

You will have to add SLEEPFAIL support to grab_valid but that is trivial. I just didn't have a use case before.

sys/vm/swap_pager.c
1773 ↗(On Diff #68256)

After thinking it over some more, I'm not sure that we can avoid busying valid pages here. I think that is the only way we synchronize with a concurrent putpages. Once putpages has allocated swap blocks and submitted I/O to a device, swapoff should wait for that to finish before proceeding.

GEOM does some tracking of in-flight requests and should wait for them to drain before destroying a swap device, so that may be sufficient after all, but I need to spend some time to verify that.

If the object is dead, wait for pending paging operations
to complete before returning.

kib added inline comments.
sys/vm/swap_pager.c
1725 ↗(On Diff #68328)

Isn't it better to set unswapped before xunbusy, even if it is not strictly needed now ?

This revision is now accepted and ready to land.Feb 14 2020, 7:21 PM

Avoid busying valid pages, to avoid a deadlock involving
kernel stack pages.

This revision now requires review to proceed.Feb 14 2020, 8:53 PM
jeff added inline comments.
sys/vm/swap_pager.c
1794 ↗(On Diff #68337)

It may not be required by style but I really prefer { } here.

This revision is now accepted and ready to land.Feb 14 2020, 9:29 PM
This revision was automatically updated to reflect the committed changes.