Page MenuHomeFreeBSD

amd64 pmap: add comment explaining TLB invalidation modes.
ClosedPublic

Authored by kib on Jul 26 2020, 4:44 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Nov 19, 4:38 AM
Unknown Object (File)
Tue, Nov 19, 4:34 AM
Unknown Object (File)
Tue, Nov 19, 4:32 AM
Unknown Object (File)
Tue, Nov 19, 4:17 AM
Unknown Object (File)
Tue, Nov 19, 4:16 AM
Unknown Object (File)
Tue, Nov 19, 2:39 AM
Unknown Object (File)
Thu, Nov 14, 12:14 AM
Unknown Object (File)
Mon, Nov 11, 8:57 AM
Subscribers

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sys/amd64/amd64/pmap.c
2711

"The kernel page table maps the user and kernel address spaces in their entirety. It is identical to the per-process page table allocated in non-PTI ...

2712–2713

I would open the paragraph with this sentence.

kib marked 7 inline comments as done.Nov 8 2020, 8:57 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
2703

I said 'high operational impact' there. It is explained below, at this point intent is only to note to reader that PTI changes a lot. What specifically is changed comparing for !PTI, is described later.

2705–2706

Typo: I mean PTI vs. non-PTI.

kib marked 2 inline comments as done.

Edits.

sys/amd64/amd64/pmap.c
2708

-> "the Meltdown bug ..."

Remove the stray space between PTI and the subsequent comma.

2717

-> "... that the user space ..."

I would drop the "s" at the end of "kernel page tables".

2718

I would take it for granted that TLB coherence must be maintained. Is the real point that you're trying to make here that these TLB invalidations can't be deferred, unlike the user page table ones mentioned below? If so, I think that this paragraph would be clearer if you swapped the order of the sentences, that is, talk about the user page table first.

2719

"... when the CPU is in user mode. Consequently, some TLB invalidations ...

2722

"The existence of a user mode page table ..."

2723

"by a pm_ucr3 value that differs from ..., in which case pm_ucr3 contains

2724

"... for the user mode page table's root.

2726

"... which CPUs currently have the pmap active. A CPU's bit is set on ...

2728

"... For the kernel page table, the pm_active field

2731

"present in hardware" -> "in use by the hardware"

2734

"..., the pmap sends ...

2735

"... recorded as active in pm_active. Updates to and reads from pm_active are not synchronized, and so they may race with each other. Shootdown handlers ...

2741

"... to. This feature provides a limited namespace ..."

(The text that follows uses "PCID" to refer to a "Process ID" so I want to stop using it to refer to the feature.)

2742

"... 12 bits, supporting 4095 simultaneous IDs total.

2744

"Allocation of a PCID to a pmap ...

2745

"section 15.12, "Other TLB Consistency Algorithms", of Vahalia's book "Unix Internals".

2746

"A PCID cannot ... whole lifetime of a pmap in ...

2748

"when the CPU ...

2750

"switch that activates ...

2754

"On invalidation" -> "On TLB invalidation"

and

"is" -> "are"

2755

"... that the previously allocated ..."

2756–2758

"Consequently, if a pmap is inactive on a CPU, then a TLB shootdown for that pmap and CPU can be initiated by an ordinary memory access to reset the target CPU's generation count within the pmap. The CPU initiating the TLB shootdown does not need to send an IPI to the target CPU.

2761

For consistency, "user mode"

2762

"A user PCID ...

2763

"... bit, 11, to ...

2765

"Userspace" -> "User space"
and
"usermode" -> "user mode"

2766–2767

"... clearing bit 63 of the loaded ..."

2768

"usermode" -> "user mode"

Also, can you clarify whether this is a "total invalidation" for one pmap or all pmaps on that CPU.

2769

-> "... in the user ..."

2773

INVTLB?

2776

-> "If the INVPCID ...

2777

-> "from the kernel page table.

2779

-> "... The kernel reserves ...

2780

"usermode" -> "user mode"

2781

-> "... A context switch allocates a new ...

2782

-> "the recorded PCID is zero or the recorded generation does not match the CPU's

2783

-> "..., effectively flushing the TLB ...

2824

"usermode" -> "user mode"

2825

-> "... for the user page table.

kib marked 39 inline comments as done.

39 more edits.

That's all that I have.

sys/amd64/amd64/pmap.c
2700–2701

I would drop the last sentence of this paragraph, i.e., the sentence "For SMP, ...", and then merge this paragraph with the next one.

(The use of IPIs comes up below. Here, I think it is lower level than the rest of this new combined paragraph.)

2702

This line is missing an "*".

2703

"The kernel configuration option that has the greatest operational impact on TLB invalidation is PTI, which ...

2703–2706

PTI is the kernel configuration option that most affects

2704

"CPUs. The most impactful hardware ...

2715

"allocated" -> "used"

2719–2721

"... mode. In contrast, the user space part of the kernel page table is used for copyout(9), so TLB invalidations on this page table cannot be postponed."

2724

"... by a pm_ucr3

2737

For consistency, add another space after the period.

2755–2756

"... count, pm_gen, which is ...

2759

"... valid. Effectively, zeroing any of these counters triggers a TLB shootdown ...

2761

"of a new ...

2776

Drop the first "the": "clearing bit 63 ...

2781–2782

"modes. If the requested invalidation is for a specific address or the total invalidation of a currently active pmap, then the TLB is flushed using INVLPG for a kernel page table, and ..."

I feel like there is a missing qualifying phrase at the end of INVPCID(INVPCID_CTXGLOB)/invltlb_glob().

2799

I think that all of the entries of this form would be clearer if written as "remote user page, inactive pmap:" Otherwise, it reads like the page must be inactive.

kib marked 14 inline comments as done.Jan 6 2021, 7:10 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
2703–2706

Sorry, I cannot get sense of this comment.

Another batch of edits.
Move comment out of #ifdef SMP.

sys/amd64/amd64/pmap.c
2703–2706

I think that this was the start of a comment that I intended to delete because it was superseded by another comment.

sys/amd64/amd64/pmap.c
2703

"mainly" -> "first"

2719

Can shorten to: "... from kernel to user mode."

2721

-> "... be similarly postponed.

2729

-> "... switch to the pmap, and ..."

2733

Add comma, -> "..., e.g., ..."

2753

Add comma, -> "i.e., ..."

2777

-> "... TLB entries for ..."

2778

Can we change "If ucr3_load_mask is set, then local ..." to "In which case, local ..."?

2785

We rarely use the word "process" in this comment. Can we say, "user space", instead?

2801

Here is a technical question as opposed to one about wording: Given the IPI:INVLPG why is pm_gen zeroed? Is this because of aforementioned pm_active race?

2839

Are you planning to replace this?

kib marked 13 inline comments as done.Jan 9 2021, 9:29 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
2801

Yes, due to a possibility to lost the race and have the target pmap no longer active.

I believe that the recent fix where pm_gen zeroing was misplaced and this caused observable issues proves the point.

2839

Not in this change.

I plan to write about generation, emulation of A/D. Might be mention different format of EPT tables.

For SVM, something should be also noted.

kib marked 2 inline comments as done.

Next batch of edits.

sys/amd64/amd64/pmap.c
2801

Then, I would add a parenthetical comment on the next line: "(Both actions are required to handle the aforementioned pm_active races.)"

sys/amd64/amd64/pmap.c
2761

For consistency, "cpu" -> "CPU"

kib marked an inline comment as done.

Add note about INVLPG+pm_gen = 0

kib marked an inline comment as done.Jan 9 2021, 11:38 PM
This revision is now accepted and ready to land.Jan 9 2021, 11:56 PM