
vm_page: Fix PG_ZERO handling in vm_page_grab_*()
Needs Review · Public

Authored by markj on Thu, Apr 16, 8:51 PM.
Details

Reviewers
alc
kib
Summary

vm_page_grab_pages() and vm_page_grab_pages_unlocked() were honouring
PG_ZERO for resident, invalid pages, but this isn't really sound:
PG_ZERO should only be considered at allocation time. Unconditionally
zero already-resident, invalid pages if VM_ALLOC_ZERO was specified.

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

markj requested review of this revision. Thu, Apr 16, 8:51 PM
sys/vm/vm_page.c:5235–5236

Just as a matter of style, for consistency, could you swap the order of the tests (vm_page_none_valid(m) and (allocflags & VM_ALLOC_ZERO) != 0) either here or in vm_page_grab_pages_unlocked()?

markj marked an inline comment as done.

Make tests more consistent.

Do we have any caller using vm_page_grab_pages(_unlocked)(VM_ALLOC_ZERO)? I did not find any.

I thought about what a use for such a call could be, and I really do not see how it can be useful. Both vnode and swap objects might have the page content on the volume, so an invalid page found in the queue cannot be safely zeroed without consulting the pager first. In the end, the decision to zero such a page belongs to the caller.

In D56432#1292906, @kib wrote:
> Do we have any caller using vm_page_grab_pages(_unlocked)(VM_ALLOC_ZERO)? I did not find any.

kern_kexec.c calls vm_page_grab_pages_unlocked(VM_ALLOC_ZERO). It looks like it could just use vm_page_alloc() though.

> I thought about what a use for such a call could be, and I really do not see how it can be useful. Both vnode and swap objects might have the page content on the volume, so an invalid page found in the queue cannot be safely zeroed without consulting the pager first. In the end, the decision to zero such a page belongs to the caller.

Effectively, you agree that we can drop the support and disable passing VM_ALLOC_ZERO to vm_page_grab_unlocked?

In D56432#1294556, @kib wrote:
> Effectively, you agree that we can drop the support and disable passing VM_ALLOC_ZERO to vm_page_grab_unlocked?

After kern_kexec is fixed, yes. But now I am trying to understand what that code is doing, and failing.

In D56432#1292906, @kib wrote:
> I thought about what a use for such a call could be, and I really do not see how it can be useful. Both vnode and swap objects might have the page content on the volume, so an invalid page found in the queue cannot be safely zeroed without consulting the pager first. In the end, the decision to zero such a page belongs to the caller.

For other object types, like OBJT_PHYS in kern_kexec.c, it is still somewhat useful, since we do not have a vm_page_alloc_pages(), so pages have to be allocated one-by-one. But yes, note also that vm_page_grab_valid() does not handle VM_ALLOC_ZERO.

> For other object types, like OBJT_PHYS in kern_kexec.c, it is still somewhat useful, since we do not have a vm_page_alloc_pages(), so pages have to be allocated one-by-one. But yes, note also that vm_page_grab_valid() does not handle VM_ALLOC_ZERO.

Even so, using the unlocked variant is utterly pointless, since there can't possibly be any existing pages in the object.

In D56432#1294614, @alc wrote:
> Even so, using the unlocked variant is utterly pointless, since there can't possibly be any existing pages in the object.

Yes, though I believe kib's question applies to the locked vm_page_grab_pages() as well.

> After kern_kexec is fixed, yes. But now I am trying to understand what that code is doing, and failing.

I had the same issue. It can happen that the allocation provides a page whose physical address already belongs to the final location of the segment. In that case, the page must not be copied through the trampoline; that would not work, since its content could be overwritten before the current data is copied to the final location. Instead, the code puts the page at the correct index of the containing object.

In D56432#1294689, @kib wrote:
> I had the same issue. It can happen that the allocation provides a page whose physical address already belongs to the final location of the segment. In that case, the page must not be copied through the trampoline; that would not work, since its content could be overwritten before the current data is copied to the final location. Instead, the code puts the page at the correct index of the containing object.

I think I understand. In any case, I think we can remove the use of the VM_ALLOC_ZERO flag there, but I would do that separately. My immediate goal is to avoid trusting PG_ZERO for managed page allocations, since that has led to several severe bugs and does not provide much benefit (unlike for noobj allocations; see my comments in D56234).

> I think I understand. In any case, I think we can remove the use of the VM_ALLOC_ZERO flag there, but I would do that separately. My immediate goal is to avoid trusting PG_ZERO for managed page allocations, since that has led to several severe bugs and does not provide much benefit (unlike for noobj allocations; see my comments in D56234).

In my opinion, removing VM_ALLOC_ZERO (formal) support from vm_page_grab* is the best step there.