Page MenuHomeFreeBSD

Add stack_save_td_running(9).
ClosedPublic

Authored by markj on Jul 31 2015, 1:22 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Sep 8, 6:52 PM
Unknown Object (File)
Sun, Sep 8, 2:32 AM
Unknown Object (File)
Sat, Sep 7, 6:34 PM
Unknown Object (File)
Mon, Sep 2, 1:56 PM
Unknown Object (File)
Mon, Sep 2, 1:22 AM
Unknown Object (File)
Fri, Aug 30, 5:00 PM
Unknown Object (File)
Fri, Aug 16, 7:01 PM
Unknown Object (File)
Jul 20 2024, 2:39 PM
Subscribers

Details

Summary

This function allows the caller to capture the kernel stack of a running
thread. It is currently only implemented on i386 and amd64; on these
architectures, it is implemented by raising an NMI on the CPU on which
the target thread is currently running. Unlike stack_save_td(9), it may
fail, e.g. if the target thread is running in user mode.

Have kern.proc.kstack use this function for running threads.

Test Plan

procstat -kk shows stacks for running threads on amd64.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj retitled this revision from to Add stack_save_td_running(9)..
markj edited the test plan for this revision. (Show Details)
markj updated this object.
markj added reviewers: bdrewery, jhb, kib.
kib edited edge metadata.
kib added inline comments.
sys/amd64/amd64/trap.c
217 ↗(On Diff #7537)

The only minor note I have, would it be safer to process both pmc and stack callbacks, instead of ignoring stack if the pmc consumed the interrupt ? Hmm, this might be even required for pmc and stack coexistence.

This revision is now accepted and ready to land.Jul 31 2015, 3:35 AM
sys/amd64/amd64/trap.c
217 ↗(On Diff #7537)

From my reading of the PMC interrupt handlers, the NMI will not be consumed unless at least one counter is found to have raised an interrupt. That is, an NMI raised by the stack code may be consumed by pmc, but in this case a second NMI will be pending, so the stack handler should eventually run. I may be misunderstanding your concern though.

I'll do some testing with PMCs enabled.

sys/amd64/amd64/trap.c
217 ↗(On Diff #7537)

I think that you understood my concern correctly.

sys/amd64/amd64/trap.c
207 ↗(On Diff #7537)

I have talked with Scott Long about adding a better NMI handler and moving all NMI handling to one place (right now some NMIs are handled here, and others are in duplicate cases in the switch statement later in trap()). Just FYI.

jhb edited edge metadata.

I think this is probably fine. Originally I had figured that doing precise runtime stacks would be expensive and was going to ask for a runtime sysctl to enable/disable it. Now I think it may not be all that expensive.

bdrewery edited edge metadata.

Much better than the other routes.

sys/amd64/amd64/trap.c
207 ↗(On Diff #7537)

I've thought a bit about adding an x86 KPI that allows drivers/subsystems to register an NMI handler. When an NMI is received, each handler would get a chance to taste the NMI until one of them decides to consume it. (And if none of them do, panic.) I can try implementing that, but it'd be nice to have some more use-cases in mind.

sys/amd64/amd64/trap.c
207 ↗(On Diff #7537)

I talked with Scott about this a bit and the problem is that there is an explicit order in the handlers. It's probably a bit simpler to just hardcode that order in an NMI handler, but we can at least move the logic out of trap() and avoid duplicating it.

sys/amd64/amd64/trap.c
207 ↗(On Diff #7537)

That makes sense to me. Are you referring to the duplication between i386 and amd64? If so, how about moving it into x86/intr_machdep.c? Or maybe create x86/nmi.c.

sys/amd64/amd64/trap.c
207 ↗(On Diff #7537)

Well, there's duplication in trap() itself where the switch statement for both user and kernel cases has some of the NMI handling. It certainly might make sense to move the nmi handler to sys/x86 as well though. Maybe the cpu_machdep.c file kib added would make sense?

I do think as part of this that trap should start with a switch statement here for "global" traps including NMI and MCE. The call to kdb_reenter() does make this a bit messy though. I think MCE should not trigger a kdb reenter so should move up. We can make the NMI handler handle kdb_reenter explicitly if there are some NMIs that shouldn't interrupt the debugger. However T_RESERVED traps probably should not interrupt the debugger.

/* Handle traps for hardware faults. */
switch (type) {
case T_MCHK:
    mca_intr();
    goto out;
case T_NMI:
    nmi_intr();
    goto out;
}

/* kdb_reenter */
/* T_RESERVED */
/* PSL_I check and rest of trap */
This revision was automatically updated to reflect the committed changes.