Page MenuHomeFreeBSD

Mark pages after EOF as clean after pageout.
ClosedPublic

Authored by kib on Jul 22 2017, 5:52 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, May 28, 12:21 AM
Unknown Object (File)
May 21 2024, 10:43 PM
Unknown Object (File)
May 21 2024, 7:49 AM
Unknown Object (File)
May 17 2024, 5:25 AM
Unknown Object (File)
Apr 30 2024, 7:15 AM
Unknown Object (File)
Feb 20 2024, 2:43 AM
Unknown Object (File)
Feb 13 2024, 3:56 PM
Unknown Object (File)
Feb 11 2024, 11:12 AM
Subscribers

Details

Summary

Suppose that a file on NFS has partial last page, and this page was marked dirty. NFS VOP_PAGEOUT() method only marks the the page clean up to the block of the last written byte, leaving other blocks dirty. Also any page which erronously exists in the vnode vm_object past EOF is also left marked as dirty.

With the introduction of the buf-cache coherent pager, each pass of syncer over the object with such page results in creation of B_DELWRI buffer due to VOP_WRITE() call. This buffer is noted on next syncer pass, which has e.g. a visible manifestation of shutdown never finishing vnode sync. Note that before buf-cache coherency commit, a dirty page might left never synced to server if a partial writes occur.

Fix this by clearing dirty bits after EOF. Only blocks of the partial page which are completely after EOF are marked clean, to avoid possible user data loss.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/vm/vnode_pager.c
1341 ↗(On Diff #31084)

Suppose that "pos" is 4000. Isn't "pos_devb" then going to be zero, and we will wind up clearing all the dirty bits in vm_page_clear_dirty()?

kib marked an inline comment as done.Jul 22 2017, 7:45 PM
kib added inline comments.
sys/vm/vnode_pager.c
1341 ↗(On Diff #31084)

I moved masking before roundup. I believe this clears the issue you pointed out.

But in this case we would clean all dirty bits anyway.

kib marked an inline comment as done.

Mask before rounding up.

I ran all of the NFS tests I have plus a new one that triggered problem on a pristine kernel.
Fix verified. No problems seen.

sys/fs/nfsclient/nfs_clbio.c
342 ↗(On Diff #31089)

Is it not possible for "np->n_size - offset" to overflow the "int" type that this parameter is declared as? Then, you could have spurious iterations of the new loop in vnode_pager_undirty_pages().

As an aside, I don't see why ncl_putpages() initializes rtvals[] while the n_mtx is held.

sys/vm/vnode_pager.c
1339 ↗(On Diff #31089)

I don't really understand this initialization of i. How do we know that OFF_TO_IDX(trunc_page(eof)) is a valid index into the page array?

sys/vm/vnode_pager.c
1339 ↗(On Diff #31089)

It may not be, but in those cases I believe that "pos >= lpos".

Change type of eof argument to off_t. Check the eof >= lpos before assigning to pos. These measures should avoid truncation.

Unrelated, move rtvals initialization in ncl_putpages() out of the region protected by nfs node mutex.

This revision is now accepted and ready to land.Jul 24 2017, 4:37 PM

I think that the code is correct.

sys/vm/vnode_pager.c
1318–1320 ↗(On Diff #31123)

I think that a block comment describing what this function does and especially what the new parameters represent would be helpful. For example, "eof" is not the length of the file as most people might expect.

1346–1347 ↗(On Diff #31123)

I would suggest a comment explaining these two lines as well.

kib edited edge metadata.

Add comments.

This revision now requires review to proceed.Jul 25 2017, 7:02 AM
markj added inline comments.
sys/vm/vnode_pager.c
1359 ↗(On Diff #31162)

"the whole page"

1363 ↗(On Diff #31162)

"the" can be removed here.

This revision is now accepted and ready to land.Jul 25 2017, 7:47 PM
sys/vm/vnode_pager.c
1322 ↗(On Diff #31162)

-> "... specifies the page ..."

1323 ↗(On Diff #31162)

-> "... bytes, and the ... bytes were actually ...

1324 ↗(On Diff #31162)

"lpos" -> "eof" here?

1325 ↗(On Diff #31162)

-> "... the vnode using the absolute file position of the first byte in the run as the base from which it is computed."?

1359 ↗(On Diff #31162)

-> "... mark the whole ..."

1360 ↗(On Diff #31162)

The wording here is a bit confusing in the sense that it would lead one to expect that the dirty field is being clear right here in the next snippet of code.

1363 ↗(On Diff #31162)

-> "... creating write ..." (i.e., drop the "the" here)

kib marked 9 inline comments as done.Jul 26 2017, 6:29 AM
kib edited edge metadata.

Edit comments.

This revision now requires review to proceed.Jul 26 2017, 6:31 AM
sys/vm/vnode_pager.c
1368–1369 ↗(On Diff #31209)

I think that one of my earlier comments still applies here: "The wording here is a bit confusing in the sense that it would lead one to expect that the dirty field is being cleared right here..." But, instead, rtvals[] is being updated.

Move the 'eof' comment to cover the whole branch, reformulate it slightly.

sys/vm/vnode_pager.c
1367 ↗(On Diff #31217)

According to the comment, this line should be "... == 0)", not "... == VM_PAGE_BITS_ALL)". Which is correct? :-)

Fix the condition.
Split comments.

This revision is now accepted and ready to land.Jul 26 2017, 7:54 PM
This revision was automatically updated to reflect the committed changes.