Page MenuHomeFreeBSD

Mark pages after EOF as clean after pageout.
ClosedPublic

Authored by kib on Jul 22 2017, 5:52 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Nov 23, 12:23 AM
Unknown Object (File)
Fri, Nov 22, 6:23 PM
Unknown Object (File)
Tue, Nov 19, 2:56 PM
Unknown Object (File)
Tue, Nov 19, 10:49 AM
Unknown Object (File)
Tue, Nov 19, 10:27 AM
Unknown Object (File)
Tue, Nov 19, 10:26 AM
Unknown Object (File)
Tue, Nov 19, 10:24 AM
Unknown Object (File)
Tue, Nov 19, 8:07 AM
Subscribers

Details

Summary

Suppose that a file on NFS has partial last page, and this page was marked dirty. NFS VOP_PAGEOUT() method only marks the the page clean up to the block of the last written byte, leaving other blocks dirty. Also any page which erronously exists in the vnode vm_object past EOF is also left marked as dirty.

With the introduction of the buf-cache coherent pager, each pass of syncer over the object with such page results in creation of B_DELWRI buffer due to VOP_WRITE() call. This buffer is noted on next syncer pass, which has e.g. a visible manifestation of shutdown never finishing vnode sync. Note that before buf-cache coherency commit, a dirty page might left never synced to server if a partial writes occur.

Fix this by clearing dirty bits after EOF. Only blocks of the partial page which are completely after EOF are marked clean, to avoid possible user data loss.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/vm/vnode_pager.c
1351

Suppose that "pos" is 4000. Isn't "pos_devb" then going to be zero, and we will wind up clearing all the dirty bits in vm_page_clear_dirty()?

kib marked an inline comment as done.Jul 22 2017, 7:45 PM
kib added inline comments.
sys/vm/vnode_pager.c
1351

I moved masking before roundup. I believe this clears the issue you pointed out.

But in this case we would clean all dirty bits anyway.

kib marked an inline comment as done.

Mask before rounding up.

I ran all of the NFS tests I have plus a new one that triggered problem on a pristine kernel.
Fix verified. No problems seen.

sys/fs/nfsclient/nfs_clbio.c
341

Is it not possible for "np->n_size - offset" to overflow the "int" type that this parameter is declared as? Then, you could have spurious iterations of the new loop in vnode_pager_undirty_pages().

As an aside, I don't see why ncl_putpages() initializes rtvals[] while the n_mtx is held.

sys/vm/vnode_pager.c
1349

I don't really understand this initialization of i. How do we know that OFF_TO_IDX(trunc_page(eof)) is a valid index into the page array?

sys/vm/vnode_pager.c
1349

It may not be, but in those cases I believe that "pos >= lpos".

Change type of eof argument to off_t. Check the eof >= lpos before assigning to pos. These measures should avoid truncation.

Unrelated, move rtvals initialization in ncl_putpages() out of the region protected by nfs node mutex.

This revision is now accepted and ready to land.Jul 24 2017, 4:37 PM

I think that the code is correct.

sys/vm/vnode_pager.c
1328โ€“1331

I think that a block comment describing what this function does and especially what the new parameters represent would be helpful. For example, "eof" is not the length of the file as most people might expect.

1356โ€“1357

I would suggest a comment explaining these two lines as well.

kib edited edge metadata.

Add comments.

This revision now requires review to proceed.Jul 25 2017, 7:02 AM
markj added inline comments.
sys/vm/vnode_pager.c
1360

"the whole page"

1364

"the" can be removed here.

This revision is now accepted and ready to land.Jul 25 2017, 7:47 PM
sys/vm/vnode_pager.c
1322

-> "... specifies the page ..."

1323

-> "... bytes, and the ... bytes were actually ...

1324

"lpos" -> "eof" here?

1325

-> "... the vnode using the absolute file position of the first byte in the run as the base from which it is computed."?

1360

-> "... mark the whole ..."

1361

The wording here is a bit confusing in the sense that it would lead one to expect that the dirty field is being clear right here in the next snippet of code.

1364

-> "... creating write ..." (i.e., drop the "the" here)

kib marked 9 inline comments as done.Jul 26 2017, 6:29 AM
kib edited edge metadata.

Edit comments.

This revision now requires review to proceed.Jul 26 2017, 6:31 AM
sys/vm/vnode_pager.c
1368โ€“1369

I think that one of my earlier comments still applies here: "The wording here is a bit confusing in the sense that it would lead one to expect that the dirty field is being cleared right here..." But, instead, rtvals[] is being updated.

Move the 'eof' comment to cover the whole branch, reformulate it slightly.

sys/vm/vnode_pager.c
1367

According to the comment, this line should be "... == 0)", not "... == VM_PAGE_BITS_ALL)". Which is correct? :-)

Fix the condition.
Split comments.

This revision is now accepted and ready to land.Jul 26 2017, 7:54 PM
This revision was automatically updated to reflect the committed changes.