riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion
Closed · Public

Authored by jrtc27 on Jul 21 2021, 2:09 AM.

Details

Reviewers

markj
mhorne

Commits

rG2e3c6024a476: riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion
rG4a235049082e: riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion

Summary

This repeats amd64's cfcbf8c6fd3b (r180498) and i386's cf3508519c5e
(r202894) but for riscv; pmap_kextract must be lock-free and so it can
race with superpage promotion and demotion, thus the L2 entry must only
be loaded once to avoid using inconsistent state.

PR: 250866
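
The single-load requirement described in the summary can be illustrated with a small userspace sketch. This is not the committed change: the entry encoding, `make_leaf`, `make_table`, and `translate` are all hypothetical names invented for the example, and C11 atomics stand in for the kernel's own PTE load. The point is only that the entry is read once and every later decision uses that one snapshot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical entry encoding, for illustration only (not the riscv PTE format). */
#define ENTRY_LEAF	UINT64_C(0x1)		/* entry maps a 2 MiB superpage directly */
#define SUPER_OFFSET	UINT64_C(0x1fffff)	/* offset bits inside a 2 MiB superpage */
#define PAGE_OFFSET	UINT64_C(0xfff)		/* offset bits inside a 4 KiB page */
#define PAGE_SHIFT	12
#define L3_ENTRIES	512

typedef _Atomic uint64_t l2_slot_t;

/* Build a leaf entry for a 2 MiB-aligned physical base. */
static uint64_t
make_leaf(uint64_t pa_base)
{
	assert((pa_base & SUPER_OFFSET) == 0);
	return (pa_base | ENTRY_LEAF);
}

/* Build a non-leaf entry that refers to a third-level table. */
static uint64_t
make_table(const uint64_t *l3)
{
	assert(((uintptr_t)l3 & ENTRY_LEAF) == 0);
	return ((uint64_t)(uintptr_t)l3);
}

/*
 * Lock-free lookup: load the second-level entry exactly once, then derive
 * both the leaf test and the address from that snapshot, so a concurrent
 * promotion or demotion cannot make the two disagree.
 */
static uint64_t
translate(l2_slot_t *slot, uint64_t va)
{
	uint64_t e = atomic_load_explicit(slot, memory_order_acquire);
	const uint64_t *l3;

	if ((e & ENTRY_LEAF) != 0)
		return ((e & ~SUPER_OFFSET) | (va & SUPER_OFFSET));

	l3 = (const uint64_t *)(uintptr_t)e;
	return (l3[(va & SUPER_OFFSET) >> PAGE_SHIFT] | (va & PAGE_OFFSET));
}
```

In this model a lookup returns the same physical address whether the slot currently holds the superpage entry or a table whose entries cover the same range, which is the property a lock-free reader needs while the mapping is being promoted or demoted.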

Diff Detail

Repository

rG FreeBSD src repository

Event Timeline

jrtc27 created this revision. Jul 21 2021, 2:09 AM

Herald added a subscriber: imp. Jul 21 2021, 2:09 AM

jrtc27 requested review of this revision. Jul 21 2021, 2:09 AM

Harbormaster completed remote builds in B40590: Diff 92553. Jul 21 2021, 2:09 AM

Since updating my Unmatched tree with this patch 11 days ago I haven't heard of any more panics from zBeeble (dgilbert) who's been continuing to build various ports locally, as well as possibly even a full buildworld+buildkernel. I've asked for confirmation of the distinct lack of panics (compared with one every day or so before), but strongly believe this was indeed the issue, especially since the amd64 and i386 commits explicitly mention the bug causing panics with ZFS (though despite trawling the web I couldn't find any record of *what* the panics were), which is what was seen here.

EDIT: Lack of panics with this patch applied has been confirmed.

In D31253#703777, @jrtc27 wrote:

Since updating my Unmatched tree with this patch 11 days ago I haven't heard of any more panics from zBeeble (dgilbert) who's been continuing to build various ports locally, as well as possibly even a full buildworld+buildkernel. I've asked for confirmation of the distinct lack of panics (compared with one every day or so before), but strongly believe this was indeed the issue, especially since the amd64 and i386 commits explicitly mention the bug causing panics with ZFS (though despite trawling the web I couldn't find any record of *what* the panics were), which is what was seen here.

What is the panic in this case?

This revision is now accepted and ready to land. Jul 21 2021, 1:00 PM

In D31253#703891, @markj wrote:

What is the panic in this case?

For example:

panic: pmap_l2_to_l3: PA out of range, PA: 0x0
cpuid = 1
time = 1625512247
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x148
panic() at panic+0x2a
pmap_remove_write() at pmap_remove_write+0x56a
vm_object_page_collect_flush() at vm_object_page_collect_flush+0xf8
vm_object_page_clean() at vm_object_page_clean+0x144
vinactivef() at vinactivef+0x90
vput_final() at vput_final+0x2ea
vput() at vput+0x32
vn_close1() at vn_close1+0x13c
vn_closefile() at vn_closefile+0x44
_fdrop() at _fdrop+0x18
closef() at closef+0x1b8
closefp_impl() at closefp_impl+0x78
closefp() at closefp+0x52
kern_close() at kern_close+0x134
sys_close() at sys_close+0xe
do_trap_user() at do_trap_user+0x208
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0x6
KDB: enter: panic

In D31253#703905, @jrtc27 wrote:

For example:

panic: pmap_l2_to_l3: PA out of range, PA: 0x0

The change itself is easy enough to understand, but I don't see exactly how the issue correlates to the panic. Are you able to explain it?

From a quick look:
If pmap_kextract() races with demotion, then it's possible that the pa returned points to the l3 table, rather than the expected physical address corresponding to va. There aren't a ton of callers of pmap_kextract(), but one interesting one is pcpu_page_free(), which looks like it could inadvertently free the wrong vm page if the race happens as I described. Could this lead to the panics observed?
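
The interleaving described above can be modelled deterministically. The sketch below is an illustration under assumptions, not the pre-fix kernel code: the entry encoding and the function `superpage_pa_two_reads` are hypothetical, and the two reads of the entry are passed in as separate arguments so that a demotion "between" them can be simulated without threads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical entry encoding, for illustration only (not the riscv PTE format). */
#define ENTRY_LEAF	UINT64_C(0x1)		/* entry maps a 2 MiB superpage directly */
#define SUPER_OFFSET	UINT64_C(0x1fffff)	/* offset bits inside a 2 MiB superpage */

/*
 * Models a lookup that reads the second-level entry twice: `checked` is
 * the value seen by the leaf test, `used` is the value seen when the
 * physical address is computed.
 */
static uint64_t
superpage_pa_two_reads(uint64_t checked, uint64_t used, uint64_t va)
{
	assert((checked & ENTRY_LEAF) != 0);	/* the superpage branch was taken */
	return ((used & ~SUPER_OFFSET) | (va & SUPER_OFFSET));
}
```

With a leaf entry of 0x80200001 for both reads and va 0x1234, the result is 0x80201234. If the second read instead observes a demoted entry that refers to a table at the made-up address 0x80fff000, the result becomes 0x80e01234, an address derived from where the table lives rather than from the page that va maps.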

In D31253#703999, @mhorne wrote:

In D31253#703905, @jrtc27 wrote:

For example:

panic: pmap_l2_to_l3: PA out of range, PA: 0x0

The change itself is easy enough to understand, but I don't see exactly how the issue correlates to the panic. Are you able to explain it?

Not really, it was a shot in the dark that seemed relevant to ZFS, so the justification for why it fixes the bug is rather empirical.

From a quick look:
If pmap_kextract() races with demotion, then it's possible that the pa returned points to the l3 table, rather than the expected physical address corresponding to va. There aren't a ton of callers of pmap_kextract(), but one interesting one is pcpu_page_free(), which looks like it could inadvertently free the wrong vm page if the race happens as I described. Could this lead to the panics observed?

I didn't really chase it through. But yes, that kind of thing is what I was thinking might be happening: you end up manipulating the "wrong" page because pmap_kextract gave you back the wrong address, which corrupts the pmap, and you only discover it later when you come to do another operation on a now-corrupted part of the pmap.

mhorne accepted this revision. Jul 21 2021, 11:24 PM

Closed by commit rG4a235049082e: riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion (authored by jrtc27). Jul 22 2021, 7:05 PM

This revision was automatically updated to reflect the committed changes.

jrtc27 added a commit: rG4a235049082e: riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion.

jrtc27 added a commit: rG2e3c6024a476: riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion. Sep 7 2021, 12:10 PM