Page MenuHomeFreeBSD

Various changes to DTrace FBT to avoid crashed on FreeBSD/arm64.
ClosedPublic

Authored by rwatson on Dec 24 2020, 5:30 PM.

Details

Summary

A number of changes to DTrace to improve robustness (i.e., not panic quickly) on FreeBSD/arm64:

  • Don't allow FBT to instrument the exception handlers used to implement FBT.
  • Don't allow FBT to instrument DDB, which makes debugging FBT even harder.
  • Don't allow FBT to instrument memcpy(), which may be used by the compiler to optimise parts of the DTrace implementation, as well as in exception handlers.
  • Implement a dtrace_getnanouptime(), modeled on the existing dtrace_getnanotime(), so that we don't have to block FBT instrumentation of getnanouptime().

(Note: I'm pretty sure that dtrace_gethrtime() should not use getnanouptime(), but I'll change its semantics in some later commit -- this change preserve current semantics.)

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sys/kern/kern_tc.c
1006

Should this and dtrace_getnanotime be under KDTRACE_HOOKS?

sys/kern/kern_tc.c
1006

It's an interesting question. I guess I simply maintained the status quo. I'm tempted to stick with that in this change, and then fix both at once, assuming no surprises arise?

jrtc27 added inline comments.
sys/cddl/dev/fbt/fbt.c
163
163–164

I feel this belongs in the aarch64 fbt_provide_module_function? At least, that's where x86 puts its architecture-specific exclusions for functions (well, just one in its case) on the synchronous trap path that FBT probes themselves trigger. In theory unwind_frame also belongs there too though one could argue that all architectures should follow the same clear model and then it'd be able to be MI.

rwatson edited the summary of this revision. (Show Details)

Various fixes to the DTrace FBT provider on FreeBSD/arm64:

Exclude kdb_ symbols from FBT.

Add bcopy(9) to FBT exclusions for arm64.

Re-sort exclusions with more clear comments as to why they are excluded.
  • Since non-aarch64 haven't required a memcpy() exclusion in the past, don't exclude it now.
  • Move arm64-specific exclusions from the generic FBT code to the arm64-specific code.

    Suggested by: @jrtc27
rwatson added inline comments.
sys/cddl/dev/fbt/fbt.c
163–164

Yes, sounds good. I've left unwind_frame() and moved the bcopy()/memcpy() exclusions, and also exception-handling exclusions. The memcpy() exclusion might be needed on other architectures too, but since it seems not to have been triggering unhappiness there, I'll move it and assume it was the lack of conventional aarch64 function prologue/epilogue that was causing issues for me previously.

rwatson marked an inline comment as done.

Updates to candidate commit message.

Candidate commit message (will update Reviewed by notes as needed):

A number of changes that improve DTrace FBT reliability on freebsd/arm64:
    
  - Implement a dtrace_getnanouptime(), matching the existing
    dtrace_getnanotime(), to avoid DTrace calling out to a potentially
    instrumentable function.
  
    (These should probably both be under KDTRACE_HOOKS.  Also, it's not clear
    to me that they are correct implementations for the DTrace thread time
    functions they are used in .. fixes for another commit.)
    
  - Don't allow FBT to instrument DDB and KDB functions, as that makes it
    rather harder to debug FBT problems.
    
  - Don't allow FBT to instrument functions involved in EL1 exception handling
    that are involved in FBT trap processing: handle_el1h_sync() and
    do_el1h_sync().
  
  Prior to these changes, use of FBT on FreeBSD/arm64 rapidly led to kernel
  panics due to a blend of memory corruption and recursion within DTrace.
    
  MFC after:      2 weeks
  Reviewed by:    jrtc27, kp, markj (earlier version)
  Differential revision:  https://reviews.freebsd.org/D27766

Still happy with this. I'll rebase the riscv fixes on top of this once it gets committed and push them as wel.

Trim exclusion of memcpy() and bcopy() as @andrew's in-progress patch to
avoid improper FBT instrumentation of functions without conventional stack
frames now solves that problem in a better way.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 11 2021, 3:45 PM
This revision was automatically updated to reflect the committed changes.