Details

Reviewers

gnn
andrew
manu
domagoj.stolfa_gmail.com
avg

Group Reviewers

DTrace

Commits

rGddf0ed09bd8f: sdt: Implement SDT probes using hot-patching

Summary

The idea here is to avoid a memory access and conditional branch per
probe site.  Instead, the probe is represented by an "unreachable"
unconditional function call.  asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record.  Each SDT probe carries a list
of tracepoints.
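
As a rough userspace illustration of the probe-site shape described above (not the actual sdt.h macros), the sketch below uses asm goto to emit a single no-op and to name a label that the compiler must treat as reachable. The names fake_probe, probe_hits and traced_function are hypothetical, and the step that records the site and label addresses in a tracepoint record is omitted.

```c
/* Hypothetical stand-in for the kernel's probe entry point. */
static int probe_hits;

static void
fake_probe(int arg)
{
	probe_hits += arg;
}

/*
 * A probe site: the asm statement emits one no-op and lists "fire" as
 * a possible jump target.  Until something overwrites the no-op with a
 * jump, execution falls through and the code at "fire" never runs.
 */
static int
traced_function(int x)
{
	__asm__ goto("nop" : : : : fire);
	return (x + 1);
fire:
	/* Argument marshalling and the probe call live out of the hot path. */
	fake_probe(x);
	return (x + 1);
}
```

With the no-op left in place, calling traced_function(41) returns 42 and leaves probe_hits at zero, which is the "disabled" behaviour the summary is after: no load and no conditional branch at the probe site.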

When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label.  The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general.
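
To make the patching step concrete, here is a minimal sketch of how the replacement instruction could be computed on x86, assuming each site is a 5-byte no-op that is replaced by a `jmp rel32`. The sled size and the helper name make_jmp_rel32 are assumptions for illustration; the sketch only builds the bytes and does not model smp_rendezvous() or the actual write to kernel text.

```c
#include <assert.h>
#include <stdint.h>

#define	SLED_SIZE	5	/* assumed: one 5-byte no-op per probe site */

/*
 * Build "jmp rel32" (opcode 0xe9) that transfers control from the
 * instruction at "site" to "target".  The displacement is relative to
 * the end of the 5-byte jump, i.e. to site + SLED_SIZE.
 */
static void
make_jmp_rel32(uint8_t insn[SLED_SIZE], uint64_t site, uint64_t target)
{
	int64_t disp = (int64_t)(target - (site + SLED_SIZE));
	uint32_t rel;
	int i;

	/* The label must be within reach of a 32-bit displacement. */
	assert(disp >= INT32_MIN && disp <= INT32_MAX);
	rel = (uint32_t)(int32_t)disp;

	insn[0] = 0xe9;
	/* x86 immediates are little-endian; store byte by byte. */
	for (i = 0; i < 4; i++)
		insn[1 + i] = (uint8_t)(rel >> (8 * i));
}
```

Because the five bytes cannot in general be stored with a single atomic write, another CPU could otherwise fetch a half-written instruction, which is why the summary parks the other CPUs during the overwrite.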

I verified that LLVM 17 moves the argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.

Per gallatin@ in D43504, this approach has less overhead when probes are
disabled.  To make the implementation simpler, I removed support for
probes with 7 arguments; nothing makes use of this except a regression
test case.  I also didn't implement this for 32-bit powerpc since I
wasn't able to figure out how to boot it in QEMU.

I have a couple of follow-up patches which take this further:

1. We can now fill out the "function" field of SDT probe names
   automatically, since we know exactly where each tracepoint is
   located.

2. We can put additional code between the asm goto target label and the
   probe itself.  This lets us perform some probe-specific argument
   marshalling without any overhead when the probe is disabled.  For
   example:

```
if (SDT_PROBES_ENABLED()) {
        int reason = CLD_EXITED;

        if (WCOREDUMP(signo))
                reason = CLD_DUMPED;
        else if (WIFSIGNALED(signo))
                reason = CLD_KILLED;
        SDT_PROBE1(proc, , , exit, reason);
}
```

becomes

```
SDT_PROBE1_EXP(proc, , , exit, reason,
        int reason;

        reason = CLD_EXITED;
        if (WCOREDUMP(signo))
                reason = CLD_DUMPED;
        else if (WIFSIGNALED(signo))
                reason = CLD_KILLED;
);
```

In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision. · Mar 23 2024, 6:49 AM

Herald added a reviewer: gnn. · Mar 23 2024, 6:49 AM

Herald added subscribers: olce, imp.

markj requested review of this revision. · Mar 23 2024, 6:49 AM

Harbormaster completed remote builds in B56756: Diff 136124. · Mar 23 2024, 6:50 AM

markj mentioned this in D43504: netinet: add a probe point for IP stats counters. · Mar 23 2024, 7:16 AM

Provide a full implementation.

Herald added a reviewer: andrew. · Apr 10 2024, 12:07 AM

Herald added a reviewer: andrew.

Herald added a reviewer: manu.

Herald added subscribers: riscv, jhibbits, emaste.

Harbormaster completed remote builds in B57023: Diff 136829. · Apr 10 2024, 12:07 AM

markj edited the summary of this revision. · Apr 10 2024, 12:08 AM

markj added a reviewer: DTrace.

christos added a subscriber: christos. · Apr 11 2024, 12:39 AM

bnovkov added a subscriber: bnovkov. · Apr 11 2024, 7:47 AM

Any feedback from DTrace? I would like to commit this soon.

LGTM

This revision is now accepted and ready to land. · Apr 20 2024, 4:28 PM

I ran the tests with this applied on amd64, had a few kmods load/unload concurrently for a while and looked through the concurrency around the patching code. Everything seems to work fine on my end.

sys/cddl/dev/sdt/sdt.c
374	Might make sense for this to say: `... for %s:%s:%s:%s\n", ..., tp->probe->prov->name, tp->probe->mod, tp->probe->func, tp->probe->name);` to avoid confusion?

Rebase.
Remove license boilerplate, just keep SPDX and copyright lines.
Fix a problem with DTRACE_PROBE which uses function-static probe structure definitions. To reference them from inline asm, we need to make the structure an input operand since we don't know the symbol name.
Rename some constants to improve consistency.
Work around a clang bug/limitation on i386 wherein I can't use the "i" constraint with a global variable for some reason. This works perfectly well if I just reference the symbol directly, so I'm not sure why the backend is rejecting it. Happily, there is an MD constraint ("Ws") which empirically has the behaviour I want.

This revision now requires review to proceed. · May 1 2024, 12:17 PM

Harbormaster completed remote builds in B57496: Diff 137942. · May 1 2024, 12:17 PM

In D44483#1023820, @domagoj.stolfa_gmail.com wrote:

I ran the tests with this applied on amd64, had a few kmods load/unload concurrently for a while and looked through the concurrency around the patching code. Everything seems to work fine on my end.

Thanks! I'm more or less happy with the current version, so will commit in a day or two unless I hear some objection.

Apply Domagoj's suggestion

Harbormaster completed remote builds in B57499: Diff 137945. · May 1 2024, 12:22 PM

domagoj.stolfa_gmail.com accepted this revision. · May 2 2024, 1:37 PM

This revision is now accepted and ready to land. · May 2 2024, 1:37 PM

jhibbits added inline comments. · May 22 2024, 1:32 PM

sys/conf/files.powerpc
339	This machdep file looks fine to me for all powerpc, so I think it's fine to not gate on powerpc64. I don't see any changes in the MI files that would prevent powerpc, either (all atomics are ints, no 64-bit atomics).

The thing that's holding me back right now is uncertainty around compiler support. asm goto is a reasonably recent feature in LLVM so this would break one's ability to compile GENERIC with older toolchains, but I'm not sure yet whether that's likely to inconvenience anyone. I could provide a fallback implementation, but I'd rather not if it can be avoided, sdt.h is too messy as it is.

sys/conf/files.powerpc
339	Hmm, so the instructions written by that file have the same encoding on 32-bit powerpc as well? I still have not yet tried to test this on 32-bit powerpc, too busy with other stuff.

jhibbits added inline comments. · May 22 2024, 1:54 PM

sys/conf/files.powerpc
339	Yeah, encodings are the same for all instructions between 32 and 64-bit. 32-bit is a strict subset of 64-bit ISA, and instructions affect the entire GPR in both 32-bit and 64-bit mode.
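
To illustrate the point about shared encodings, the sketch below builds a PowerPC unconditional relative branch (`b`, primary opcode 18, AA=0, LK=0), which is encoded identically in 32-bit and 64-bit mode. That the patching code writes exactly this instruction over an `ori 0,0,0` no-op is an assumption made for the example, and ppc_make_branch is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define	PPC_NOP		0x60000000u	/* ori 0,0,0: the canonical no-op */
#define	PPC_B_OPCODE	0x48000000u	/* primary opcode 18, AA=0, LK=0 */
#define	PPC_B_LI_MASK	0x03fffffcu	/* 24-bit LI field, word-aligned */

/*
 * Encode "b target" for an instruction located at "site".  The offset
 * is relative to the address of the branch itself and must be a
 * multiple of 4 within +/- 32MB.
 */
static uint32_t
ppc_make_branch(uint64_t site, uint64_t target)
{
	int64_t disp = (int64_t)(target - site);

	assert((disp & 3) == 0);
	assert(disp >= -(1 << 25) && disp < (1 << 25));
	return (PPC_B_OPCODE | ((uint32_t)disp & PPC_B_LI_MASK));
}
```

Since both the no-op and the branch are single aligned 32-bit words, the same constants and helper would serve powerpc and powerpc64 alike.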

avg accepted this revision. · Jun 14 2024, 9:12 AM

In D44483#1033600, @markj wrote:

The thing that's holding me back right now is uncertainty around compiler support. asm goto is a reasonably recent feature in LLVM so this would break one's ability to compile GENERIC with older toolchains, but I'm not sure yet whether that's likely to inconvenience anyone. I could provide a fallback implementation, but I'd rather not if it can be avoided, sdt.h is too messy as it is.

It's been around since LLVM 9, which is longer than I had thought. So this isn't a blocker.

Closed by commit rGddf0ed09bd8f: sdt: Implement SDT probes using hot-patching (authored by markj). · Jun 19 2024, 8:59 PM

This revision was automatically updated to reflect the committed changes.

markj added a commit: rGddf0ed09bd8f: sdt: Implement SDT probes using hot-patching.

sdt: Prototype implementation of SDT probes using hot-patching
Closed, Public

Revision Contents
Changeset List

Diff 140007

cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/sdt/tst.sdtargs.d

sys/amd64/include/sdt_machdep.h

sys/arm/arm/sdt_machdep.c

sys/arm/include/sdt_machdep.h

sys/arm64/arm64/sdt_machdep.c

sys/arm64/include/sdt_machdep.h

sys/cddl/dev/dtrace/dtrace_test.c

sys/cddl/dev/sdt/sdt.c

sys/conf/files.arm

sys/conf/files.arm64

sys/conf/files.powerpc

sys/conf/files.riscv

sys/conf/files.x86

sys/i386/include/sdt_machdep.h

sys/kern/kern_sdt.c

sys/modules/dtrace/Makefile

sys/powerpc/include/sdt_machdep.h

sys/powerpc/powerpc/sdt_machdep.c

sys/riscv/include/sdt_machdep.h

sys/riscv/riscv/sdt_machdep.c

sys/sys/sdt.h

sys/x86/x86/sdt_machdep.c
