Page MenuHomeFreeBSD

RISC-V superpage support, part 6/6.
ClosedPublic

Authored by markj on Jan 17 2019, 2:52 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Oct 18, 12:26 PM
Unknown Object (File)
Fri, Oct 18, 7:06 AM
Unknown Object (File)
Wed, Oct 16, 8:00 PM
Unknown Object (File)
Wed, Oct 16, 7:23 PM
Unknown Object (File)
Tue, Oct 15, 11:48 AM
Unknown Object (File)
Tue, Oct 15, 3:27 AM
Unknown Object (File)
Mon, Oct 14, 9:20 AM
Unknown Object (File)
Sat, Oct 12, 10:06 PM
Subscribers
None

Details

Summary

Superpage support for pmap_fault_fixup(). While here, make sure we
don't swallow the fault if the entry is valid but doesn't have PTE_U
set. Also, add a comment about spurious faults.

The RISC-V specification permits implementations where the TLB may cache
invalid translations. In particular, when pmap_enter() or
pmap_enter_quick() overwrites an invalid entry, there is no guarantee
that a subsequent access won't generate a fault unless the pmap layer
calls pmap_invalidate_*() for the corresponding VA. In general this is
rather expensive, so I would like to take the approach of having
pmap_enter() and pmap_enter_quick() issue only a local TLB shootdown,
and permit spurious faults on other CPUs. This will be handled in a
separate commit, but I wanted to explain here why we don't panic if a
page fault occurs when PTE_A and PTE_D are already set on the PTE.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 21998
Build 21232: arc lint + arc unit

Event Timeline

(This one I understand well enough to sign off on)

This revision is now accepted and ready to land.Jan 22 2019, 10:29 PM

Note that "spurious" faults may also occur upon simultaneous accesses.

This revision now requires review to proceed.Jan 23 2019, 9:18 PM
sys/riscv/riscv/pmap.c
2393

Small nit: you can use l2e here, otherwise it is unused.

markj marked an inline comment as done.

Cleanup based on Mitchell's comment.

sys/riscv/riscv/pmap.c
2393

Nothing has changed, looks like you uploaded the old diff.

sys/riscv/riscv/pmap.c
2417

Should this be cmpxchg ?

What if oldpte == newpte ? Is there a chance that the same fault would repeat again ? I mean, other CPU might already faulted on the same address and update PTE_A/D. But this CPU still needs to invalidate its TLB.

BTW, since we take the PMAP_LOCK() at the start of the function, I conclude that the interrupts are not disabled when pmap_fault_fixup() is called. Then I see nothing that would prevent the context switch to move the thread to other core, so sfence_vma() might occur on the core which does not have invalid translation cached.

In D18868#405977, @kib wrote:

BTW, since we take the PMAP_LOCK() at the start of the function, I conclude that the interrupts are not disabled when pmap_fault_fixup() is called. Then I see nothing that would prevent the context switch to move the thread to other core, so sfence_vma() might occur on the core which does not have invalid translation cached.

I believe you are right, but the only effect will be extra spurious faults.

sys/riscv/riscv/pmap.c
2417

I think we can use atomic_set instead.

You're right, we should unconditionally issue the sfence.vma.

markj marked 2 inline comments as done.

Address feedback.

Intel specifies that CPU always flushes TLB for the fault address, automatically. It might make sense to consider flushing TLB in the low level routine before enabling interrupts.

In D18868#406028, @kib wrote:

Intel specifies that CPU always flushes TLB for the fault address, automatically. It might make sense to consider flushing TLB in the low level routine before enabling interrupts.

You mean, change pmap_fault_fixup() to flush the TLB only after modifying the PTE, and unconditionally flush the TLB immediately after the trap to supervisor mode?

In D18868#406028, @kib wrote:

Intel specifies that CPU always flushes TLB for the fault address, automatically. It might make sense to consider flushing TLB in the low level routine before enabling interrupts.

You mean, change pmap_fault_fixup() to flush the TLB only after modifying the PTE, and unconditionally flush the TLB immediately after the trap to supervisor mode?

Yes, I mean flush TLB for the address after entering supervisor mode, before enabling interrupts. If done, do we need to flush TLB again in fixup() ? Perhaps formally yes, if the rules are that any address can be cached in TLB at any time, but practically even speculative execution only speculates over the existing code. While in the fault handler and trap()/fixup(), nothing should access the faulted address.

In D18868#406082, @kib wrote:
In D18868#406028, @kib wrote:

Intel specifies that CPU always flushes TLB for the fault address, automatically. It might make sense to consider flushing TLB in the low level routine before enabling interrupts.

You mean, change pmap_fault_fixup() to flush the TLB only after modifying the PTE, and unconditionally flush the TLB immediately after the trap to supervisor mode?

Yes, I mean flush TLB for the address after entering supervisor mode, before enabling interrupts. If done, do we need to flush TLB again in fixup() ? Perhaps formally yes, if the rules are that any address can be cached in TLB at any time, but practically even speculative execution only speculates over the existing code. While in the fault handler and trap()/fixup(), nothing should access the faulted address.

If the initial fault is not spurious (e.g., access/dirty bits are software-managed and PTE_A is not set on the PTE for the faulting address), then we must issue a sfence.vma after setting PTE_A. Thus it is not sufficient to flush the TLB upon entering supervisor mode.

Committed in r344106.

This revision is now accepted and ready to land.Feb 13 2019, 6:03 PM