Page MenuHomeFreeBSD

vm_object: Fix handling of wired map entries in vm_object_split()
AcceptedPublic

Authored by markj on Fri, Mar 21, 2:10 PM.

Details

Reviewers
alc
dougm
kib
Summary

Suppose a vnode is mapped with MAP_PROT and MAP_PRIVATE, mlock() is
called on the mapping, and then the vnode is truncated such that the
last page of the mapping becomes invalid. The now-invalid page will be
unmapped, but stays resident in the VM object to preserve the invariant
that a range of pages mapped by a wired map entry is always resident.
This invariant is checked by vm_object_unwire(), for example.

Then, suppose that the mapping is upgraded to PROT_READ|PROT_WRITE. We
will copy the invalid page into a new anonymous VM object. If the
process then forks, vm_object_split() may then be called on the object.
Upon encountering an invalid page, rather than moving it into the
destination object, it is removed. However, this is wrong when the
entry is wired, since the invalid page's wiring belongs to the map
entry; this behaviour also violates the invariant mentioned above.

Fix this by moving invalid pages into the destination object if the map
entry is wired. In this case we must not dirty the page, so add a flag
to vm_page_iter_rename() to control this.

Reported by: syzkaller

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 63057
Build 59941: arc lint + arc unit

Event Timeline

markj requested review of this revision.Fri, Mar 21, 2:10 PM
sys/vm/vm_object.c
1614

Stale comment.

sys/vm/vm_page.c
2040–2041

Stale comment.

Modulo the notes about comments by Doug.

This revision is now accepted and ready to land.Sat, Mar 22, 12:45 AM
markj marked 2 inline comments as done.

Rather than passing a flag to vm_page_iter_rename(), have it check whether
the page is valid and use that to decide whether to dirty the page.

Update comments.

This revision now requires review to proceed.Sat, Mar 22, 1:58 AM
This revision is now accepted and ready to land.Sat, Mar 22, 2:02 AM
sys/vm/vm_object.c
1599–1607
sys/vm/vm_page.c
2040–2041
2041–2045

I hope that the comment as I have rewritten it is true. I'm assuming the last sentences is true. If it's not, the patch may introduce a bug.

sys/vm/vm_object.c
1599–1607

I think it is backing pageR and not backing page which is truncated.

sys/vm/vm_page.c
2041–2045

Dirtyness of the page means that its content cannot be discarded, it must be saved to the backing store (swap space, since we move the page into the shadowing object which must be swap object). This is true because the page changes its identity by renaming, so its content cannot be already stored by swap under the new name.

sys/vm/vm_object.c
1599–1607

Right, "pager" there is deliberate. Basically, the store of pages (which I called a "pager" here but maybe there's a clearer term) which backs a VM object may be truncated such that it's smaller than the object. For vnodes, this is handled in vnode_pager_setsize(). There, vm_object_page_remove() will remove and free pages beyond the end of the new object size, but if the page is wired, we merely unmap it and mark it invalid. N.B., One surprising consequence of this behaviour is that it's possible for a VM object to have a resident page with m->pindex > m->object->size.

Removing a wired page from its object there would be incorrect in general: if the page is wired via mlock(2), then the mapping process would have no other means of retrieving the page in order to unwire it, and thus the page would be leaked forever.

I think "invalid map entry" isn't right either. Here "invalid" just refers to a page beyond the end of the VM object/pager.

sys/vm/vm_page.c
2041–2045

The two "notes" are somewhat confusing, as they refer to the caller's responsibilities, not those of vm_page_iter_rename(). I'll try to make the comments a bit clearer, I think they're a bit muddled.

I don't think there's a bug here. If the page is invalid, then it's the responsibility of the code which invalidated it to clear the copy on swap, if any.

This revision now requires review to proceed.Sun, Mar 23, 2:33 AM

Thanks for the clarifications. Looks okay to me.

This revision is now accepted and ready to land.Sun, Mar 23, 4:15 AM
kib added inline comments.
sys/vm/vm_object.c
1606

BTW why not say 'object'?

sys/vm/vm_object.c
1606

Mostly because there is no distinct truncation operation for VM objects, i.e., there's no vm_object_truncate() or so. Indeed, vnode_pager_setsize() sets the object size directly. So, "pager" sounds more natural to me.