Page MenuHomeFreeBSD

pmap: Avoid clearing the accessed bit for wired mappings
Needs ReviewPublic

Authored by markj on Mar 21 2025, 2:09 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Jul 9, 11:30 AM
Unknown Object (File)
Tue, Jul 1, 12:47 AM
Unknown Object (File)
Mon, Jun 30, 9:41 PM
Unknown Object (File)
Mon, Jun 30, 7:19 AM
Unknown Object (File)
Mon, Jun 30, 2:29 AM
Unknown Object (File)
Sun, Jun 29, 11:50 PM
Unknown Object (File)
Sun, Jun 29, 4:22 AM
Unknown Object (File)
Sat, Jun 28, 4:41 AM
Subscribers

Details

Reviewers
alc
kib
andrew
manu
Summary

Currently pmap_advise() does not handle wired mappings specially in the
4KB mapping case: it will clear the accessed bit even though wired pages
cannot be reclaimed. This is mostly harmless in principle, but it can
violate an invariant in pmap_demote_pde_locked(), which assumes that a
superpage mapping with the accessed bit clear must not be wired. When
it encounters such a superpage, the mapping is deleted instead of being
demoted, but this is not permitted for wired mappings.

Since pmap_advise() can reasonably avoid clearing the accessed bit for
wired mappings, let's do that instead of adding additional complexity to
the promotion and demotion paths. For instance, one alternate solution
would be to have pmap_promote_pde() always set PG_A on a wired superpage
mapping, even when constituent PTEs do not all have it set, but I'm not
sure that that's preferable.

I believe the same problem exists on arm64 and will update the diff once
there's consensus on the right solution.

Reported by: syzbot+4b9dad11826c30bb6745@syzkaller.appspotmail.com

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 65312
Build 62195: arc lint + arc unit

Event Timeline

markj requested review of this revision.Mar 21 2025, 2:09 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
9308

I suspect that the new code would be clearer if you slightly restructure it:

   			if ((*pte & (PG_MANAGED | PG_V)) != (PG_MANAGED | PG_V))
				goto maybe_invlrng;
			else if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) {
				if (advice == MADV_DONTNEED) {
					/*
					 * Future calls to pmap_is_modified()
					 * can be avoided by making the page
					 * dirty now.
					 */
					m = PHYS_TO_VM_PAGE(*pte & PG_FRAME);
					vm_page_dirty(m);
				}
			} else if ((*pte & PG_A) == 0)
				goto maybe_invlrng;
                        atomic_clear_long(pte, PG_M | ((*pte & PG_W) == 0 ? PG_A : 0));
This revision is now accepted and ready to land.Mar 21 2025, 11:17 PM

Simplify according to kib's suggestion.

This revision now requires review to proceed.Mar 23 2025, 2:40 AM
kib added inline comments.
sys/amd64/amd64/pmap.c
9299

IMO the comment should be moved right above the new atomic().

This revision is now accepted and ready to land.Mar 23 2025, 6:42 AM
markj marked 2 inline comments as done.

Move the comment.

This revision now requires review to proceed.Mar 23 2025, 9:40 AM
This revision is now accepted and ready to land.Mar 23 2025, 10:28 AM

@alc did you have any thoughts on this patch?

@alc did you have any thoughts on this patch?

Thinking...

A more direct approach would be to change pmap_demote_pde_locked() to handle wired mappings when the PDE was never accessed:

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 6d1c2d70d8c0..97ff9c67e8d5 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -6104,9 +6104,7 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,
         * Invalidate the 2MB page mapping and return "failure" if the
         * mapping was never accessed.
         */
-       if ((oldpde & PG_A) == 0) {
-               KASSERT((oldpde & PG_W) == 0,
-                   ("pmap_demote_pde: a wired mapping is missing PG_A"));
+       if ((oldpde & (PG_W | PG_A)) == 0) {
                pmap_demote_pde_abort(pmap, va, pde, oldpde, lockp);
                return (false);
        }
@@ -6164,7 +6162,7 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,
         * have PG_A set in every PTE, then fill it.  The new PTEs will all
         * have PG_A set.
         */
-       if (!vm_page_all_valid(mpte))
+       if (vm_page_all_valid(mpte) ^ (oldpde & PG_A) != 0)
                pmap_fill_ptp(firstpte, newpte);
 
        pmap_demote_pde_check(firstpte, newpte);

The comments would also need updating, particularly for the second part.

I'm really on the fence about this. I could make a respectable argument for either approach.

In D49442#1163641, @alc wrote:

A more direct approach would be to change pmap_demote_pde_locked() to handle wired mappings when the PDE was never accessed:

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 6d1c2d70d8c0..97ff9c67e8d5 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -6104,9 +6104,7 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,
         * Invalidate the 2MB page mapping and return "failure" if the
         * mapping was never accessed.
         */
-       if ((oldpde & PG_A) == 0) {
-               KASSERT((oldpde & PG_W) == 0,
-                   ("pmap_demote_pde: a wired mapping is missing PG_A"));
+       if ((oldpde & (PG_W | PG_A)) == 0) {

Just for my understanding, do you mean

if ((oldpde & (PG_W | PG_A)) == PG_W) {

?

        pmap_demote_pde_abort(pmap, va, pde, oldpde, lockp);
        return (false);
}

@@ -6164,7 +6162,7 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,

    • have PG_A set in every PTE, then fill it. The new PTEs will all
    • have PG_A set. */
  • if (!vm_page_all_valid(mpte))

+ if (vm_page_all_valid(mpte) ^ (oldpde & PG_A) != 0)

        pmap_fill_ptp(firstpte, newpte);
 
pmap_demote_pde_check(firstpte, newpte);
The comments would also need updating, particularly for the second part.

I'm really on the fence about this.  I could make a respectable argument for either approach.
In D49442#1163665, @kib wrote:
In D49442#1163641, @alc wrote:

A more direct approach would be to change pmap_demote_pde_locked() to handle wired mappings when the PDE was never accessed:

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 6d1c2d70d8c0..97ff9c67e8d5 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -6104,9 +6104,7 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,
         * Invalidate the 2MB page mapping and return "failure" if the
         * mapping was never accessed.
         */
-       if ((oldpde & PG_A) == 0) {
-               KASSERT((oldpde & PG_W) == 0,
-                   ("pmap_demote_pde: a wired mapping is missing PG_A"));
+       if ((oldpde & (PG_W | PG_A)) == 0) {

Just for my understanding, do you mean

if ((oldpde & (PG_W | PG_A)) == PG_W) {

?

No, it is correct as is. If the PDE has the wired bit set, then I don't want to execute the "then" clause, which destroys the mapping. Instead, I want to skip it, and reinstall a page table page with the 4KB granularity wired mappings.

In D49442#1163641, @alc wrote:

I'm really on the fence about this. I could make a respectable argument for either approach.

Perhaps it's reasonable to simply implement both approaches? The patch in review makes some sense on its own, independent of the bug which motivated it: I can't see any practical reason for madvise(DONTNEED or FREE) to have any side effects on wired mappings. On the other hand, it seems sensible to modify the demotion path to handle this case, since we have no rule which says that wired mappings must have PG_A set, so this case could arise again in the future.

Modify the arm64 pmap to avoid clearing ATTR_AF on wired PTEs as well.

This revision now requires review to proceed.Wed, Jul 9, 8:48 PM