Page MenuHomeFreeBSD

arm64: Improve DDB backtrace support
ClosedPublic

Authored by jrtc27 on Jan 7 2021, 6:14 PM.

Details

Summary

The existing implementation relies on each trap handler saving a normal
stack frame record, which is a waste of time and space when we're
already saving a trapframe to the stack. It's also wrong as it currently
saves LR not ELR.

Instead of patching it up, rewrite it based on the RISC-V implementation
with inspiration from the amd64 implementation for how to handle
vectored traps to provide an improved implementation. This includes
compressing the information down to one line like other architectures
rather than the highly-verbose old form that repeats itself by printing
LR and FP in one frame only to print them as PC and SP in the next. It
also includes printing out actually useful information about the traps
that occurred, though FAR is not saved in the trapframe so we cannot
print it (in general it can be clobbered between when the trap happened
and now), only ESR.

The AAPCS also allows the stack frame record to be located anywhere in
the frame, not just the top, so the caller's SP is not at a fixed offset
from the callee's FP like on almost all other architectures in
existence. This means there is no way to derive the caller's SP in the
unwinder, and so we have to drop that bit of (unused) state everywhere.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jrtc27 requested review of this revision.Jan 7 2021, 6:14 PM
sys/arm64/arm64/db_trace.c
134

Wasn't sure whether to panic here (which is what amd64 does) or just print a big error message and continue given this will almost certainly lead to a panic loop as DDB keeps having to unwind (an ever growing stack with) the same problematic frame.

This includes compressing the information down to one line like other architectures rather than the highly-verbose old form that repeats itself my printing LR and FP in one frame only to print them as PC and SP in the next.

Thank you!

though FAR is not saved in the trapframe so we cannot print it

I think we at least print it before calling panic() when it makes sense to do so, so this doesn't seem like a problem.

sys/arm64/arm64/db_trace.c
150

I'd find the control flow a bit easier to follow if the continue above was removed and we had

} else if (strcmp(name, "fork_trampoline") == 0) {
    break;
} else if (!unwind_frame(td, frame)) {
    break;
}

but it's just a suggestion.

though FAR is not saved in the trapframe so we cannot print it

I think we at least print it before calling panic() when it makes sense to do so, so this doesn't seem like a problem.

The case where this is a bit annoying is if there's a non-fatal fault you want to know about (perhaps one that led to a different fatal fault or generic panic).

sys/arm64/arm64/db_trace.c
150

Hm, I can see the argument, though I do feel they're unrelated, especially the side-effecting unwind_frame condition shouldn't be lumped in with the others IMO. I did it this way because (a) it's how RISC-V does it (b) it can be thought of as a list of special cases.

Though putting the entire rest of the block inside an else rather than chaining else ifs should work for both of us and could be done for RISC-V too to remove the continue there (which is a little simpler due to the trap type being in scause rather than being based on which trap handler was called).

sys/arm64/arm64/db_trace.c
134

I think for modern kernel code we now have __assert_unreachable(). We didn't have that yet when the amd64 code was written. This definitely falls into that camp as you've handled all the values frame_type could possibly be set to. The compiler might not even require a default case if you instead made the frame types an enum.

150

I'm ambivalent. Exception frames certainly are kind of special in how they are treated, so

if (trapframe) {
  big blob of code;
  continue;
}

/* normal frames */

Seems ok to me. It's at least a bit less painful that what we do over in i386 where we have a return in the middle of db_nextframe() for "normal" frames.

sys/arm64/arm64/exception.S
42

So we stored x29 twice?

sys/arm64/arm64/exception.S
42

Yes, once to the trapframe and once to the normal stack frame.

sys/arm64/arm64/db_trace.c
134

Sadly this isn't C++'s enum class, there's very little benefit to using a real enum.

The change survives some basic testing on my part, and from a cursory review it LGTM. It would be nice to see it included in 13.0, if possible.

I think the next nice-to-have would be the syscall decoding that amd64 does.

@gnn You're currently a blocking reviewer on this

Rebase and address review comments

jrtc27 added inline comments.
sys/arm64/arm64/db_trace.c
150

Hopefully easier to follow now

Is the plan to get this into 13.0?

Is the plan to get this into 13.0?

Yes, once @gnn gets round to reviewing (or resigning from the revision).

Is the plan to get this into 13.0?

Yes, once @gnn gets round to reviewing (or resigning from the revision).

He updated the herald rule for DTrace reviews to not add a blocking review (https://reviews.freebsd.org/H36), but I guess that doesn't apply retroactively. I think you should go ahead and commit without waiting.

This revision is now accepted and ready to land.Mon, Feb 1, 2:14 PM

Is the plan to get this into 13.0?

Yes, once @gnn gets round to reviewing (or resigning from the revision).

He updated the herald rule for DTrace reviews to not add a blocking review (https://reviews.freebsd.org/H36), but I guess that doesn't apply retroactively. I think you should go ahead and commit without waiting.

Thanks, manually updated and will push momentarily.

This revision was automatically updated to reflect the committed changes.