Page MenuHomeFreeBSD

Simplify flow of pmap_demote_pde_locked() and add more comprehensive debugging checks.
ClosedPublic

Authored by kib on May 15 2019, 8:54 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Nov 18, 6:51 AM
Unknown Object (File)
Sun, Oct 27, 12:26 PM
Unknown Object (File)
Sun, Oct 27, 12:26 PM
Unknown Object (File)
Sun, Oct 27, 12:25 PM
Unknown Object (File)
Sun, Oct 27, 12:25 PM
Unknown Object (File)
Sun, Oct 27, 12:25 PM
Unknown Object (File)
Sun, Oct 27, 12:25 PM
Unknown Object (File)
Sun, Oct 27, 12:25 PM
Subscribers

Details

Summary

In particular,

  • Move the code to handle failure to allocate page table page into helper
  • After the previous item is done, it is possible to distinguish !PG_A case and case of missed page, in the control flow.
  • Make the variable to indicate that in-kernel mapping is demoted.
  • Assert that missed page table page can only happen for in-kernel mapping when demoting direct map.
  • If DIAGNOSTIC is enabled, and the page table page should be already filled, check all ptes instead of only first one.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/amd64/amd64/pmap.c
1025 ↗(On Diff #57420)

If you're using rounded-up *firstaddr for this loop, you should just move the loop to after the point where you've rounded up *firstaddr in line 1045.

sys/amd64/amd64/pmap.c
1025 ↗(On Diff #57420)

This operation is boot-time only, so micro-optimizing it is not too important. The current sequence of initialization actually makes some sense from the hierarchical view on the page tables, and I want to keep it as is.

What are the errors that Peter reported?

sys/amd64/amd64/pmap.c
1375 ↗(On Diff #57420)

The kernel page table pages that are dynamically allocated by pmap_growkernel() will have a wire count of 1.

In D20266#437675, @alc wrote:

What are the errors that Peter reported?

It started with assert *firstpte & PG_FRAME != newpte & PG_FRAME at old line 4179 for pipe_map. Then I added the comparision and dump of the whole page table page and saw some other cases.

sys/amd64/amd64/pmap.c
4219–4225 ↗(On Diff #57420)

When I wrote this block of code, the PG_PROMOTED flag did not yet exist. If, today, I were writing this block for the first time, I would test (oldpte & PG_PROMOTED) == 0 instead of mpte->wire_count == 1.

4232–4233 ↗(On Diff #57420)

I speculate that this block would no longer be needed if we test (oldpte & PG_PROMOTED) == 0 above. In particular, pmap_protect_pde() clears PG_PROMOTED if it removes access rights from the PDE.

Also, I never intended for promotion and demotion on the kernel pmap to create an inconsistency in how we maintain the wire count on kernel page table pages. Switching to testing (oldpte & PG_PROMOTED) == 0 would allow for eliminating that inconsistency.

Index: amd64/amd64/pmap.c
===================================================================
--- amd64/amd64/pmap.c  (revision 348078)
+++ amd64/amd64/pmap.c  (working copy)
@@ -4537,8 +4537,10 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pd
                            " in pmap %p", va, pmap);
                        return (FALSE);
                }
-               if (va < VM_MAXUSER_ADDRESS)
+               if (va < VM_MAXUSER_ADDRESS) {
+                       mpte->wire_count = NPTEPG;
                        pmap_resident_count_inc(pmap, 1);
+               }
        }
        mptepa = VM_PAGE_TO_PHYS(mpte);
        firstpte = (pt_entry_t *)PHYS_TO_DMAP(mptepa);
@@ -4553,10 +4555,8 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pd
        /*
         * If the page table page is new, initialize it.
         */
-       if (mpte->wire_count == 1) {
-               mpte->wire_count = NPTEPG;
+       if ((oldpde & PG_PROMOTED) == 0)
                pmap_fill_ptp(firstpte, newpte);
-       }
        KASSERT((*firstpte & PG_FRAME) == (newpte & PG_FRAME),
            ("pmap_demote_pde: firstpte and newpte map different physical"
            " addresses"));

Use PG_PROMOTED == 0 as a predicate for the pmap_pt_fill() call.

sys/amd64/amd64/pmap.c
4551 ↗(On Diff #57709)

I guess it's supposed to be MAXUSER_ADDRESS. Technically this is a functional change since before we were only allocating with interrupt priority for demotions in the direct map. I think it's fine, just contradicts the review description.

sys/amd64/amd64/pmap.c
1404 ↗(On Diff #57709)

Unless you want to set PG_PROMOTED on the PDEs below, this loop is pointless (and has been for quite some time).

If we actually want to support demotion on the virtual address range being mapped here, we should be inserting these page table pages into the pmap's radix tree.

sys/amd64/amd64/pmap.c
1404 ↗(On Diff #57709)

... and, in fact, pmap_init() is doing the insertion.

sys/amd64/amd64/pmap.c
1755 ↗(On Diff #57709)

I'm not convinced that KERNend is the correct value to use here. I think that it is too small.

Fix VM_MAXUSER_ADDRESS spelling.
Remember the last initialized page table page for insertion into the pmap radix.

sys/amd64/amd64/pmap.c
1334 ↗(On Diff #57713)

Before introducing a new variable, please take a look at how KERNend is initialized and used. I don't think that we need both KERNend and kern_promoted_end.

sys/amd64/amd64/pmap.c
4551 ↗(On Diff #57709)

The only situation in which this code should ever allocate a kernel page table page is when it is called upon to demote a direct map mapping, so this doesn't really represent a functional change.

sys/amd64/amd64/pmap.c
1334 ↗(On Diff #57713)

We probably can remove KERNend if going ahead with my change. I think you mean that KERNend is de-facto unused in the patched pmap, and kern_promoted_end can be int, i.e. 4 bytes shorter.

Can you separate the pmap_demote_pde_locked() changes from the create_pagetables()/pmap_init() changes, i.e., create a separate patch.

Leave only refactoring of the demotion in this review.

kib retitled this revision from Fixes for some corner cases of amd64 demotions. to Simplify flow of pmap_demote_pde_locked() and add more comprehensive debugging checks..May 24 2019, 5:34 PM
kib edited the summary of this revision. (Show Details)

i386 has similar code.

sys/amd64/amd64/pmap.c
4554 ↗(On Diff #57849)

_fail is a bit misleading here. Maybe _abort?

This revision is now accepted and ready to land.May 24 2019, 6:09 PM
sys/amd64/amd64/pmap.c
4541 ↗(On Diff #57849)

You've eliminated the only use of PG_G in this function.

sys/amd64/amd64/pmap.c
4566–4577 ↗(On Diff #57849)

The first part of this comment no longer belongs here.

kib marked 2 inline comments as done.May 24 2019, 10:11 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
4541 ↗(On Diff #57849)

It is used by PG_PTE_PROMOTED macro. I tried that in initial version.

Split large comment.
_fault() -> _abort().

This revision now requires review to proceed.May 24 2019, 10:12 PM
This revision is now accepted and ready to land.May 24 2019, 11:10 PM

I ran tests on D20266.57868.diff for 24 hours without seeing any problems.

In D20266#440649, @pho wrote:

I ran tests on D20266.57868.diff for 24 hours without seeing any problems.

Thank you. It would require a combined test with D20380 after both settle.

sys/amd64/amd64/pmap.c
4492 ↗(On Diff #57868)

You could simply add PAGE_SIZE to newpte at the end of each iteration.

4496 ↗(On Diff #57868)

The "XX" should be replaced.

4524–4525 ↗(On Diff #57868)

You should be able to unwrap the format string.

4571 ↗(On Diff #57868)

"is for a kernel address, ..."

4594 ↗(On Diff #57868)

"abort" could reasonably be interpreted as leave the mapping in place. Instead, I would explicitly say, "invalidate the 2MB page mapping".

Handle Alan' comments. Adjust the output of the diagnostic version of the checker even more.

This revision now requires review to proceed.May 27 2019, 3:43 PM
markj added inline comments.
sys/amd64/amd64/pmap.c
4678 ↗(On Diff #57957)

This format string may be unwrapped as well.

This revision is now accepted and ready to land.May 27 2019, 6:41 PM

Unwrap one more CTR format string.

This revision now requires review to proceed.May 27 2019, 7:13 PM
alc added inline comments.
sys/amd64/amd64/pmap.c
4504 ↗(On Diff #57963)

I suggest: "different pages: found %#lx, expected %#lx\n"

This revision is now accepted and ready to land.May 31 2019, 4:31 PM
This revision was automatically updated to reflect the committed changes.