Use pmap_pte_exists() instead of pmap_pte() when the caller expects a mapping to exist at a particular level. The caller benefits in two ways from using pmap_pte_exists(). First, because the level is specified to pmap_pte_exists() as a constant, rather than returned, the compiler can specialize the implementation of pmap_pte_exists() to the caller's exact needs, i.e., generate fewer instructions. Consequently, within a GENERIC-NODEBUG kernel, 704 bytes worth of instructions are eliminated from the inner loops of various pmap functions. Second, suppose that the mapping doesn't exist. Rather than requiring every caller to implement its own KASSERT()s to report missing mappings, the caller can optionally have pmap_pte_exists() provide the KASSERT().
Details
- Reviewers
manu andrew kib markj - Commits
- rGb7ec0d268b73: arm64: Introduce and use pmap_pte_exists()
Survived buildworld.
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
As an aside, while reviewing all of the callers to pmap_pte(), I noticed that pmap_is_prefaultable() is incorrectly implemented. It is going to return FALSE when it should return TRUE, and occasionally the opposite.
sys/arm64/arm64/pmap.c | ||
---|---|---|
618 | We assume that L1_TABLE and level == 1, i.e. supersuperpage mapping and request for smaller pte, are not possible. Would it make sense to assert this? Also, generally, this function defines 'pte' as the page table entry which translates accessed address, and not the paging structure, i.e. a leaf element of paging structure, am I right? I believe that the herald comment would be more clean if you mention this. | |
626 | Shouldn't level == 3 be an assert for desc == L3_PAGE? |
sys/arm64/arm64/pmap.c | ||
---|---|---|
626 | If desc != L3_PAGE, then execution falls through to the below KASSERT. And, if INVARIANTS is disabled, we return NULL. In other words, I believe that the current KASSERT suffices. |
sys/arm64/arm64/pmap.c | ||
---|---|---|
618 | I don’t understand the first question. The code does handle 1G pages. Remember that Arm64 numbers the levels in the opposite order as amd64. “Yes” to the second question. |
sys/arm64/arm64/pmap.c | ||
---|---|---|
618 | In regards to the first question, do you mean the following? Suppose that there is a valid L1_BLOCK, but the caller specifies level equal to 2 or 3. In that case, the function fails, possibly panicking if INVARIANTS are enabled. |
sys/arm64/arm64/pmap.c | ||
---|---|---|
618 | I intentionally designed pmap_pte_exists() to fail in this case. I wanted the caller to be able to assume that the mapping was at the specified level. |
For amd64, which function would it replace? We typically use the step-walk style functions, like pmap_pdpde_to_pde(), pmap_pde_to_pte(), in amd64 pmap, which both makes explicit level argument not useful (and IMO even quite inconvenient to use), and minimizes full-walk functions use (pmap_pte() as is is used twice) . Also we access resulting paging element right away, which means that we panic on NULL accesses without explicit asserts.
For example, in pmap_remove_write()'s second loop, it would replace 5 lines with 1 line:
pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_remove_write: found a 2mpage in page %p's pv list", m)); pte = pmap_pde_to_pte(pde, pv->pv_va);
In other cases, e.g., the call to pmap_pte() in pmap_page_test_mappings(), it would provide tighter assertions. The current code does not assert that we found a 4KB mapping. The current code would mistakenly handle a 2MB page mapping in the first loop over the 4KB page pv list.
In other words, on amd64, I don't think that pmap_pte_exists() would be an optimization. It would shorten a few callers and add some assertions.
I do not object, just doubt about it usefulness. If you see the possibility to shorten the code, sure.