Page MenuHomeFreeBSD

Make swapoff more robust, when using swap into file
ClosedPublic

Authored by kib on Nov 28 2021, 8:28 AM.

Details

Summary
shutdown: unmount filesystems after swapoff

Swap on file requires operational underlying mount, otherwise
swapoff_all() is guaranteed to panic due to the default strategy VOP for
reclaimed vnodes.
swapoff_one(): only check free pages count manually turning swap off

When swap is turned off due to system shutdown or reboot, ignore the
check.  Problem is that the check is not accurate by any means, free
page count can legitimately be low while system still able to page in
everything from the swap.  Then, we turn swap off if swapping on
real file or some non-standard geom provider, and typically panic
when system appears to actually need to unavailable page.

For syscall, it is better to be safe than sorry.

Reported and tested by: peterj

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kib requested review of this revision.Nov 28 2021, 8:28 AM
sys/kern/vfs_bio.c
1462 ↗(On Diff #99096)
1463 ↗(On Diff #99096)
sys/vm/swap_pager.c
2534

shutdown (with inverted sense) seems like a more descriptive name.

2557
2560
2564

Just a comment, I sometimes wish we could run swapoff -f to skip this check. Not sure how to implement it except with a new swapoff system call that can take flags. A sysctl could be used as well but that's quite ugly.

kib marked 4 inline comments as done.Nov 29 2021, 4:05 PM
kib added inline comments.
sys/vm/swap_pager.c
2564

I think we can use zero-started string as an escape hatch. Or even something better, I will post a patch later.

Edit grammar in comments. Rename check-controlling arg.

markj added inline comments.
sys/vm/swap_pager.c
472
2560
This revision is now accepted and ready to land.Nov 29 2021, 4:25 PM
kib marked 4 inline comments as done.Nov 29 2021, 4:32 PM

Individually, the updated comments make perfect sense from a purely localized standpoint, that is, they explain what is happening in that place. However, if I put myself in the place of a first-time reader of this code, the question that I would ask that isn't explained by either the old or updated comments is why any data is paged in at shutdown time.

In D33147#749591, @alc wrote:

Individually, the updated comments make perfect sense from a purely localized standpoint, that is, they explain what is happening in that place. However, if I put myself in the place of a first-time reader of this code, the question that I would ask that isn't explained by either the old or updated comments is why any data is paged in at shutdown time.

I wrote the following response to the similar question asked by Peter:

> Moving up a level, does it really matter if swapoff_one() is skipped?
> If it actually returned an error (eg if the free memory test failed),
> then that's what would happen.  By this point in the shutdown, there's
> no userland left (which makes me wonder why there's anything left in
> swap in any case) and only the final cleanups remain before the kernel
> shuts down.  What's really needed is a way to detect that the relevant
> swap I/O provider has gone away and return to swapoff_all() without
> panicing.
                                                                                
I think the intent there is to ensure that the system is in as much steady      
state as possible.  I can easily imagine some HBA doing weird things if         
some io is in flight when the reset comes in.  So we want to ensure that        
no io occurs.