
riscv/mp_machdep.c: Flush the TLB after releasing APs
Accepted, Public

Authored by bnovkov on Thu, May 14, 4:08 PM.

Details

Reviewers
markj
Group Reviewers
riscv
Summary

Spurious page faults caused by cached invalid entries may occur when
starting APs and potentially panic the kernel if we're running in
a non-sleepable context.

Fix this avoidable panic by flushing the TLB after the AP is released.

Test Plan

This patch fixes a boot panic encountered when booting a recent version of -CURRENT on a BananaPi F3:

panic: mtx_lock() by idle thread 0xffffffc05ce01140 on mutex 0xffffffc000bd8c80 @ /usr/src/sys/riscv/riscv/pmap.c:2942
cpuid = 2
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x36
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x134
panic() at panic+0x26
$x() at $x+0x6e
pmap_fault() at pmap_fault+0x52
page_fault_handler() at page_fault_handler+0x11e
do_trap_supervisor() at do_trap_supervisor+0x6c
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x74
--- exception 13, tval = 0xffffffc002719210
cpu_search_highest() at cpu_search_highest+0xdc
sched_ule_idletd() at sched_ule_idletd+0x154
fork_exit() at fork_exit+0x68
fork_trampoline() at fork_trampoline+0xa
Uptime: 1s
timeout stopping cpus
panic: mtx_lock() by idle thread 0xffffffc05ce

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/riscv/riscv/mp_machdep.c
202

Shouldn't this be done in pmap_activate_boot()?

sys/riscv/riscv/mp_machdep.c
202

Sorry, I realized that referencing pmap_activate_boot may be a bit misleading, since the intended effect here is to clear out any stale entries on the AP.
More specifically, it appears that this issue is somehow related to the "wfi" while loop above: issuing the sfence directly before the loop doesn't fix the boot panic, but issuing it directly after the AP is released does. I'm guessing that some entries get invalidated because the kernel pmap changes while the AP is waiting.

I'll hoist the sfence directly after the wfi loop to clear up the confusion.
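
The observation above (a fence before the wait loop does not help, a fence after release does) can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the kernel code from this revision: it assumes a TLB that may cache invalid lookups, and it assumes the AP picks up such an entry while it is parked in the wait loop. All names (toy_tlb, toy_access, toy_scenario, and so on) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define	NPAGES	4

/* Toy page table: true means the mapping is valid. */
static bool page_table[NPAGES];

/* Toy per-hart TLB that, by assumption, also caches invalid lookups. */
struct toy_tlb {
	bool cached[NPAGES];	/* an entry for this page is in the TLB */
	bool valid[NPAGES];	/* cached copy of the mapping's valid bit */
};

/* Models a global flush (sfence.vma with no operands): drop everything. */
static void
toy_tlb_flush(struct toy_tlb *tlb)
{
	for (int i = 0; i < NPAGES; i++)
		tlb->cached[i] = false;
}

/* Returns true if the access succeeds, false if it would fault. */
static bool
toy_access(struct toy_tlb *tlb, int page)
{
	if (!tlb->cached[page]) {
		/* TLB miss: walk the toy page table and cache the result. */
		tlb->cached[page] = true;
		tlb->valid[page] = page_table[page];
	}
	return (tlb->valid[page]);
}

/*
 * Replays the hypothesised sequence and reports whether the AP's first
 * access after release succeeds.
 */
static bool
toy_scenario(bool flush_before_wait, bool flush_after_release)
{
	struct toy_tlb ap_tlb = { 0 };
	const int page = 2;

	page_table[page] = false;	/* not mapped yet */
	if (flush_before_wait)
		toy_tlb_flush(&ap_tlb);
	/* Assumption: while waiting, the AP caches the invalid entry. */
	(void)toy_access(&ap_tlb, page);
	/* Meanwhile the boot CPU changes the kernel pmap. */
	page_table[page] = true;
	/* The AP is released here. */
	if (flush_after_release)
		toy_tlb_flush(&ap_tlb);
	return (toy_access(&ap_tlb, page));
}

/* Self-check of the three cases discussed in the review. */
static void
toy_selfcheck(void)
{
	assert(!toy_scenario(false, false));	/* no fence: stale fault */
	assert(!toy_scenario(true, false));	/* fence before wait: still faults */
	assert(toy_scenario(false, true));	/* fence after release: succeeds */
}
```

In this model the fence placed before the wait cannot help, because the stale entry is created after it. Whether that is what actually happens on the BananaPi F3 is not established in this thread; the model only shows that the reported behaviour is consistent with that guess.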

bnovkov retitled this revision from riscv/mp_machdep.c: Flush the TLB after pmap_activate_boot to riscv/mp_machdep.c: Flush the TLB after releasing APs.
bnovkov edited the summary of this revision. (Show Details)

Address @markj's comment.

This revision is now accepted and ready to land.
Thu, May 14, 10:34 PM