Page MenuHomeFreeBSD

Attempt to support multiple tracepoints with the same address.
ClosedPublic

Authored by markj on Aug 27 2018, 4:10 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Mar 27, 6:49 AM
Unknown Object (File)
Jan 20 2024, 5:03 AM
Unknown Object (File)
Dec 25 2023, 1:15 AM
Unknown Object (File)
Dec 20 2023, 4:33 AM
Unknown Object (File)
Dec 10 2023, 4:49 PM
Unknown Object (File)
Dec 10 2023, 3:23 PM
Unknown Object (File)
Nov 25 2023, 8:11 PM
Unknown Object (File)
Oct 21 2023, 2:34 PM

Details

Summary

With ifuncs, it's possible for multiple FBT probes to resolve to the
same address. For instance, fbt::copyout:entry and
fbt::copyout_nosmap:entry are the same on one of my systems: the former
is an ifunc and resolves to copyout_nosmap().

fbt_invop() isn't prepared to handle this possibility: it returns after
calling dtrace_probe() for the first matching tracepoint it finds. In
my case it always finds fbt::copyout_nosmap:entry first, so if I enable
fbt::copyout:entry, it never fires.

Add a hackish fix: flag tracepoints as having an associated enabling so
that we know we can skip over them in some cases. This has a few
caveats: we need an IPI to synchronize the flag with threads performing
FBT tracepoint hash table lookups, so disabling probes is more expensive
than before, and if one enables multiple probes sharing a tracepoint,
only one of those probes will ever fire. I will try to implement a
better fix, but this is better than nothing for 12.0.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

I'm a little worried that this may cause additional confusion if someone expects multiple probes to fire at that point. I wonder if a better workaround would be to just add a warning message whenever DTrace is called, or perhaps when someone calls dtrace -l, in the man page or something along the lines until we fix? Is there a reason we can't fix this in 13 and MFC the changes?

I'm a little worried that this may cause additional confusion if someone expects multiple probes to fire at that point. I wonder if a better workaround would be to just add a warning message whenever DTrace is called, or perhaps when someone calls dtrace -l, in the man page or something along the lines until we fix?

Why "additional"? The problem exists regardless since fbt_invop() returns after the first matching tracepoint. The problem was invisible before ifuncs since before that all tracepoints had distinct addresses. This patch just hides the problem in the common case.

Is there a reason we can't fix this in 13 and MFC the changes?

No reason except that I would like "dtrace -n fbt::copyout:entry" to do the right thing in 12.0.

Why "additional"? The problem exists regardless since fbt_invop() returns after the first matching tracepoint. The problem was invisible before ifuncs since before that all tracepoints had distinct addresses. This patch just hides the problem in the common case.

Ah, my bad I've misread your initial description. My understanding was that any tracepoint other than the first one would not even show in dtrace -l so it couldn't be enabled in the first place. Understanding it properly now, I tend to agree with this change.

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

Perhaps we could even bunch them into a bounded array (or a vector-like implementation) and check it at fbt load-time in order to avoid indirection? A linked list is unlikely to be too painful given all the membars and indirection happening in dynamic variable implementation -- but nonetheless would degrade performance :-).

Fix the problem a different way: link probes that share a tracepoint
together using a new linkage pointer in struct fbt_probe. When the
tracepoint is hit, make all probes on that tracepoint fire.

This complicates fbt_destroy() a bit.

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

Perhaps we could even bunch them into a bounded array (or a vector-like implementation) and check it at fbt load-time in order to avoid indirection? A linked list is unlikely to be too painful given all the membars and indirection happening in dynamic variable implementation -- but nonetheless would degrade performance :-).

I think it's acceptable for the time being since, as I pointed out, there are at most 2 probes for a given tracepoint currently. The FBT hash table is itself cache-unfriendly, and I'd rather fix the performance problems there holistically. Here's the distribution of hash chain lengths on a system with a fairly stripped-down kernel:

5395 0
9755 1
8748 2
5244 3
2369 4
 876 5
 280 6
  73 7
  24 8
   3 9
   1 10

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

Perhaps we could even bunch them into a bounded array (or a vector-like implementation) and check it at fbt load-time in order to avoid indirection? A linked list is unlikely to be too painful given all the membars and indirection happening in dynamic variable implementation -- but nonetheless would degrade performance :-).

I think it's acceptable for the time being since, as I pointed out, there are at most 2 probes for a given tracepoint currently. The FBT hash table is itself cache-unfriendly, and I'd rather fix the performance problems there holistically. Here's the distribution of hash chain lengths on a system with a fairly stripped-down kernel:

5395 0
9755 1
8748 2
5244 3
2369 4
 876 5
 280 6
  73 7
  24 8
   3 9
   1 10

The strategy sounds good to me. Out of curiosity -- which one has 10?!

  • Remove incorrect claim about cpucontrol(8)-based updates.
This revision was not accepted when it landed; it landed in state Needs Review.Aug 28 2018, 8:21 PM
This revision was automatically updated to reflect the committed changes.