
vm: require page to be xbusied to invalidate the content
Abandoned, Public

Authored by mjg on Mar 9 2023, 3:50 PM.

Details

Reviewers
kib
markj
Summary

This will facilitate sbusy for fault handling. Also, allowing invalidation while sbusied is weird, to say the least.

So far it has not blown up under swap testing; will ask pho later.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

mjg requested review of this revision. Mar 9 2023, 3:50 PM
mjg edited the summary of this revision.
This revision is now accepted and ready to land. Mar 9 2023, 4:14 PM
sys/kern/vfs_bio.c
3046

I suspect this might recurse into busy. Imagine that we sbusied the pages for pageout and do the writes through the buffer cache. If brelse decides to invalidate the buffer for whatever reason, then we would wait for the sbusy to go away while holding it ourselves.
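
To make the concern concrete, here is a minimal userspace sketch of a shared/exclusive busy state. It is a toy model under stated assumptions, not the kernel's vm_page busy implementation: every `toy_` name is hypothetical, and it uses non-blocking "try" operations so that the self-deadlock shows up as a failed acquire rather than a hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page busy state; NOT the kernel's vm_page busy API. */
struct toy_page {
	int	sbusy_count;	/* number of shared-busy holders */
	bool	xbusied;	/* true while exclusively busied */
};

/* Shared busy succeeds unless the page is exclusively busied. */
static bool
toy_try_sbusy(struct toy_page *m)
{
	if (m->xbusied)
		return (false);
	m->sbusy_count++;
	return (true);
}

static void
toy_sunbusy(struct toy_page *m)
{
	assert(m->sbusy_count > 0);
	m->sbusy_count--;
}

/* Exclusive busy requires no holders at all, the caller included. */
static bool
toy_try_xbusy(struct toy_page *m)
{
	if (m->xbusied || m->sbusy_count > 0)
		return (false);
	m->xbusied = true;
	return (true);
}

/*
 * Hypothetical pageout-like path: sbusy the page, then reach an
 * invalidation step that demands xbusy.  Returns true if the
 * invalidation step could proceed.
 */
static bool
toy_pageout_then_invalidate(struct toy_page *m)
{
	bool ok;

	if (!toy_try_sbusy(m))
		return (false);
	/* A sleeping acquire here would wait on our own sbusy forever. */
	ok = toy_try_xbusy(m);
	if (ok)
		m->xbusied = false;	/* drop the exclusive hold again */
	toy_sunbusy(m);
	return (ok);
}
```

In this model `toy_pageout_then_invalidate()` always returns false on an otherwise idle page, because the caller's own shared hold blocks the exclusive acquire. Whether the real pageout path can actually reach the invalidation with the pages still sbusied is exactly the open question in this thread.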

sys/kern/vfs_bio.c
3046

Can you elaborate on the exact codepath that ends up in this state?

sys/kern/vfs_bio.c
3046

Now I'm even more confused:

in brelse:

if ((bp->b_flags & B_VMIO) && (bp->b_flags & B_NOCACHE ||
    (bp->b_ioflags & BIO_ERROR && bp->b_iocmd == BIO_READ)) &&
    (v_mnt == NULL || (v_mnt->mnt_vfc->vfc_flags & VFCF_NETWORK) == 0 ||
    vn_isdisk(bp->b_vp) || (bp->b_flags & B_DELWRI) == 0)) {
        vfs_vmio_invalidate(bp);
        allocbuf(bp, 0);
}

This is the only caller of vfs_vmio_invalidate.

Thus, for this to happen, it has to be a *read* that fails.

How is a read *into the page* safe to do with the page only sbusied, though?

sys/kern/vfs_bio.c
3046

The I/O failure is a rare case. More commonly, a filesystem sets e.g. B_NOCACHE so that the buffer is not left constructed after the write, for instance to support direct I/O.
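
As a rough illustration of this point, the brelse condition quoted earlier can be restated as a standalone predicate. This is only a sketch: the `TOY_` flag values and the `toy_buf` fields are made up for the example and are not the kernel's definitions, and the mount checks are collapsed into two booleans (with "no mount point" treated the same as "not a network filesystem").

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag values for illustration only; NOT the kernel's definitions. */
#define	TOY_B_VMIO	0x1
#define	TOY_B_NOCACHE	0x2
#define	TOY_B_DELWRI	0x4

enum toy_iocmd { TOY_BIO_READ, TOY_BIO_WRITE };

/* Hypothetical stand-in for the fields the quoted condition inspects. */
struct toy_buf {
	int		b_flags;
	bool		io_error;	/* models BIO_ERROR in b_ioflags */
	enum toy_iocmd	b_iocmd;
	bool		on_network_fs;	/* models the VFCF_NETWORK check */
	bool		is_disk;	/* models vn_isdisk(bp->b_vp) */
};

/* Mirrors the shape of the quoted condition guarding the invalidation. */
static bool
toy_should_invalidate(const struct toy_buf *bp)
{
	bool vmio = (bp->b_flags & TOY_B_VMIO) != 0;
	bool nocache = (bp->b_flags & TOY_B_NOCACHE) != 0;
	bool failed_read = bp->io_error && bp->b_iocmd == TOY_BIO_READ;
	bool mount_ok = !bp->on_network_fs || bp->is_disk ||
	    (bp->b_flags & TOY_B_DELWRI) == 0;

	return (vmio && (nocache || failed_read) && mount_ok);
}

/* A successful write with the no-cache flag set still invalidates. */
static void
toy_demo(void)
{
	struct toy_buf wr = {
		.b_flags = TOY_B_VMIO | TOY_B_NOCACHE,
		.io_error = false,
		.b_iocmd = TOY_BIO_WRITE,
	};

	assert(toy_should_invalidate(&wr));
}
```

The two triggers are joined by OR, so a failed read is sufficient but not necessary: a successful write on a buffer with the no-cache flag set takes the same path.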