
X86 pmap_qenter needs to always invalidate
Abandoned · Public

Authored by bz on Mar 20 2017, 8:12 PM.

Details

Reviewers
alc
kib
Summary

Running FreeBSD X86(_64) under gem5 we have seen

"vm_fault: fault on nofault entry, addr: ..."

panics regularly (and over time in different places).

Analysing what happened we found that with the out-of-order CPU
model we would kick off the page table walker and cache a zero pte
entry in the walker cache.
The page table walker had no insight into the updated pte value in
the register file of the CPU at that time and this would happen
before the store from pmap_qenter changing the pte was committed
and hence visible on the memory side of the CPU.

The reason we do not seem to see this problem on (most) hardware
is that, according to [1], most CPUs seem to implement stronger
coherence guarantees than the specifications demand.

Intel's SDM [2] states (the end of 4.10.3.1 Caches for Paging Structures):

"The processor may create entries in paging-structure caches for
translations required for prefetches and for accesses that are a
result of speculative execution that would never actually occur
in the executed code path.
..
Because the processor may create the cache entries at the time of
translation and not update them following subsequent modifications
to the paging structures in memory, software should take care to
invalidate the cache entries appropriately when causing such modifications."

AMD's ArchPM Vol 2: System Programming [3] describes this case in:
7.3.1 Special Coherency Considerations.

In order to not rely on non-guaranteed behaviour, remove the
optimization to only invalidate the range if any of the previous
pte-s was a valid mapping. With this FreeBSD runs properly on
gem5 and likely also better on certain (AMD) CPUs.
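
To make the proposed change concrete, here is a minimal user-space sketch of the two invalidation policies. It is a toy model, not the kernel's pmap_qenter(): every `toy_*` name, the flat PTE array, and the invalidation counter are assumptions made for illustration, and a real invalidation would flush TLB entries rather than bump a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PG_V   0x1ULL   /* valid bit in this toy encoding (assumption) */
#define TOY_NPTES  8        /* size of the toy page table (assumption) */

typedef uint64_t toy_pte_t;

toy_pte_t toy_ptes[TOY_NPTES];  /* stand-in for the kernel page table */
unsigned toy_invalidations;     /* number of range invalidations issued */

/* Stand-in for a range invalidation; it only counts calls. */
void
toy_invalidate_range(size_t first, size_t count)
{
	(void)first;
	(void)count;
	toy_invalidations++;
}

/*
 * Optimized policy: invalidate only if at least one of the
 * overwritten PTEs was previously valid.
 */
void
toy_qenter_conditional(size_t first, const toy_pte_t *pa, size_t count)
{
	toy_pte_t oldpte = 0;

	assert(first + count <= TOY_NPTES);
	for (size_t i = 0; i < count; i++) {
		oldpte |= toy_ptes[first + i];
		toy_ptes[first + i] = pa[i] | TOY_PG_V;
	}
	if ((oldpte & TOY_PG_V) != 0)
		toy_invalidate_range(first, count);
}

/* Proposed policy: always invalidate after writing the PTEs. */
void
toy_qenter_always(size_t first, const toy_pte_t *pa, size_t count)
{
	assert(first + count <= TOY_NPTES);
	for (size_t i = 0; i < count; i++)
		toy_ptes[first + i] = pa[i] | TOY_PG_V;
	toy_invalidate_range(first, count);
}
```

In this model, mapping over zeroed entries with the conditional variant issues no invalidation at all, which is exactly the case the summary is concerned about; the unconditional variant issues one invalidation per call regardless of the old contents.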

[1] http://blog.stuffedcow.net/2015/08/pagewalk-coherence/#coherence
[2] https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf
[3] http://support.amd.com/TechDocs/24593.pdf

Event Timeline

Analysing what happened we found that with the out-of-order CPU model we would kick off the page table walker and cache a zero pte entry in the walker cache.

This is clearly a bug in the emulator. According to SDM 4.10.2.3 Details of TLB Use:

Because the TLBs cache entries only for linear addresses with translations, there can be a TLB entry for a page
number only if the P flag is 1 and the reserved bits are 0 in each of the paging-structure entries used to translate
that page number. In addition, the processor does not cache a translation for a page number unless the accessed
flag is 1 in each of the paging-structure entries used during translation; before caching a translation, the processor
sets any of these accessed flags that is not already 1.

In other words, zero PTEs (which are invalid) are not allowed to be cached by the architecture and do not need an invalidation.
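
The rule quoted from the SDM can be sketched as a small predicate. This is a single-level simplification (a real walk checks every paging-structure entry on the path), the `toy_*` names are hypothetical, and the reserved-bit mask is passed in by the caller because it depends on the CPU's physical address width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PG_P  0x001ULL  /* present (P) flag, bit 0 of an x86 PTE */
#define TOY_PG_A  0x020ULL  /* accessed (A) flag, bit 5 of an x86 PTE */

/*
 * Decide whether a translation through *pte may be cached, following
 * the quoted rule: P must be 1 and no reserved bit may be set.  Before
 * caching, the accessed flag is set if it is not already 1.
 * Returns true if a TLB entry may be created, false otherwise.
 */
bool
toy_walk_may_cache(uint64_t *pte, uint64_t rsvd_mask)
{
	assert(pte != NULL);
	if ((*pte & TOY_PG_P) == 0)
		return (false);     /* not present: nothing is cached */
	if ((*pte & rsvd_mask) != 0)
		return (false);     /* reserved bit set: nothing is cached */
	*pte |= TOY_PG_A;           /* set accessed flag before caching */
	return (true);
}
```

Under this model a zero PTE never yields a cached translation and is left unmodified, which is the point made in the comment above.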

@kib ok, this may be a secondary problem; change my sentence to "Analysing what happened we found that with the out-of-order CPU model we would kick off the page table walker and find the 0 pte entry." The problem remains that the store has not been committed yet and a speculative walk will only see the old (zero) pte.

In D10067#208210, @bz wrote:

we would kick off the page table walker and find the 0 pte entry." The problem remains that the store has not been committed yet and a speculative walk will only see the old (zero) pte.

Does this happen on the same CPU which did the pte_store()? If yes, this is again an emulator bug: the page walks must be coherent; in particular, they must be able to see the content of the store buffers on the local processor (AKA store forwarding).

If it is another CPU which sees a zero pte after pmap_qenter(), then it is legitimate machine behavior, but it just means that there is a race and the code would behave the same as if pmap_qenter() had not yet executed the pte_store() at all. There must be external facilities (like locks) which ensure that other threads do not access the mappings until our thread has finished setting them up. Can you provide the backtraces for the pmap_qenter() thread and the raced thread, if the issue is caused by a race?
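
The single-CPU question above comes down to whether the page walker observes a store that is still sitting in the local store buffer. A minimal sketch of the two behaviours follows; it assumes a one-entry store buffer and a single PTE, all `toy_*` names are hypothetical, and it takes no position on what real hardware or gem5 actually implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy CPU: one PTE in memory plus a one-entry store buffer. */
struct toy_cpu {
	uint64_t mem_pte;   /* PTE value visible on the memory side */
	bool     sb_valid;  /* a store to the PTE is still pending */
	uint64_t sb_value;  /* value waiting in the store buffer */
};

/* Execute a PTE store: it enters the store buffer, not memory. */
void
toy_pte_store(struct toy_cpu *c, uint64_t value)
{
	assert(c != NULL);
	c->sb_valid = true;
	c->sb_value = value;
}

/* Commit the pending store so that it becomes visible in memory. */
void
toy_commit(struct toy_cpu *c)
{
	assert(c != NULL);
	if (c->sb_valid) {
		c->mem_pte = c->sb_value;
		c->sb_valid = false;
	}
}

/*
 * Page walk.  With forwarding, the walker sees the pending store;
 * without it, the walker only sees what has reached memory.
 */
uint64_t
toy_walk(const struct toy_cpu *c, bool store_forwarding)
{
	assert(c != NULL);
	if (store_forwarding && c->sb_valid)
		return (c->sb_value);
	return (c->mem_pte);
}
```

With forwarding disabled, a walk issued between toy_pte_store() and toy_commit() returns the old zero PTE, which corresponds to the behaviour reported under gem5; with forwarding enabled, the same walk returns the new value.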

This is on a single CPU. Can you please give me a reference for "the page walks must be coherent", as my understanding from the cited pages and the referenced blog post is that they need not be.

In D10067#208217, @bz wrote:

This is on a single CPU. Can you please give me a reference for "the page walks must be coherent", as my understanding from the cited pages and the referenced blog post is that they need not be.

I am not sure which blog post you mean; could you please provide the URL? If you mean the situation explained e.g. in SDM 11.7 IMPLICIT CACHING, then it is not applicable because the previous pte entry is invalid.

For the coherence of the page walks, this is the generic rule that CPU accesses, unless documented to the contrary, must obey the caching policy specified for the given memory address. Store buffers are only visible in specific situations explicitly mentioned in the specification, all of which involve more than one CPU.

In D10067#208488, @bz wrote:

See reference [1]

This is 11.7 IMPLICIT CACHING.

Closing for now; this still needs to be tracked down, but ETIME currently.